• Title/Summary/Keyword: clustering techniques

A Comparison and Analysis on High-Dimensional Clustering Techniques for Data Mining (데이터 마이닝을 위한 고차원 클러스터링 기법에 관한 비교 분석 연구)

  • 김홍일;이혜명
    • Journal of the Korea Computer Industry Society / v.4 no.12 / pp.887-900 / 2003
  • Many applications require the clustering of large amounts of high-dimensional data. Many automated clustering techniques have been developed, but they do not work effectively or efficiently on high-dimensional (numerical) data because of the so-called “curse of dimensionality”. Moreover, high-dimensional data often contain a significant amount of noise, which further degrades the algorithms. It is therefore necessary to examine the structure and characteristics of high-dimensional data and to develop algorithms that support clustering adapted to applications of high-dimensional databases. In this paper, we survey and classify the existing high-dimensional clustering methods, analyzing and comparing the strengths and weaknesses of each method for specific applications. In particular, we compare the traditional algorithms with CLIP, which we developed, in terms of efficiency and effectiveness. This study will contribute to the development of algorithms more advanced than the current ones.

Hyper-ellipsoidal clustering algorithm using Linear Matrix Inequality (선형행렬 부등식을 이용한 타원형 클러스터링 알고리즘)

  • Lee, Han-Sung;Park, Joo-Young;Park, Dai-Hee
    • Journal of the Korean Institute of Intelligent Systems / v.12 no.4 / pp.300-305 / 2002
  • In this paper, we use a modified Gaussian kernel function as the clustering distance measure and recast the hyper-ellipsoidal clustering problem as an optimization problem that minimizes the volume of each hyper-ellipsoidal cluster. We solve this problem using the EVP (eigenvalue problem), which is one of the LMI (linear matrix inequality) techniques.

A study on finding influential twitter users by clustering and ranking techniques (클러스터링 및 랭킹 기법을 활용한 트위터 인플루엔셜 추출 연구)

  • Choi, Jun-Il;Chang, Joong-Hyuk
    • Journal of Korea Society of Industrial Information Systems / v.20 no.1 / pp.19-26 / 2015
  • Recently, the number of social network service (SNS) users has grown with the spread of SNS and the widespread adoption of smartphones. In this study, we apply clustering and ranking methods to find influential Twitter users. First, we propose five ranking elements: the number of follows, the number of retweets, IRP, IFP, and influ-score. These elements are used for the centroid points of the clustering methods. This study can help in finding novel approaches to identifying influential Twitter users.

A Study on the TICC(Time Interval Clustering Control) Algorithm which Using a Timing in MANET (MANET에서 Time Interval Clustering Control 기법에 관한 연구)

  • Kim, Young-Sam;Doo, Kyoung-Min;Kim, Sun-Guk;Lee, Kang-Whan;Chi, Sam-Hyeon
    • Proceedings of the IEEK Conference / 2008.06a / pp.629-630 / 2008
  • A MANET depends on properties such as the variable energy, high mobility, and location environment of its nodes. In this paper, we propose TICC (Time Interval Clustering Control), a clustering algorithm based on the energy value of each node, to solve the clustering problem. It improves cluster energy efficiency by managing nodes according to the order of their energy levels. We show that node energy efficiency and lifetime are improved in a MANET.

An Optimized Partner Searching System for B2B Marketplace Applying Clustering Techniques (군집화 기법을 이용한 B2B Marketplace상의 최적 파트너 검색 시스템)

  • Kim Shin-Young;Kim Soo-Young
    • Proceedings of the Korean Operations and Management Science Society Conference / 2003.05a / pp.572-579 / 2003
  • With the expansion of e-commerce, the e-marketplace has become one of the most discussed topics in recent years. However, little theoretical work has been done to optimize the practical use of e-marketplace systems. Other potential issues aside, this research focuses on the following problem: participants spend too much time, effort, and cost finding their best partner in a B2B marketplace. To solve this problem, this paper proposes a system that provides the user company with an automated and customized brokering service. The proposed system assesses the weights of the user company's priorities and runs a two-stage clustering algorithm that combines a self-organizing map with the K-means clustering technique. The system then shows the clustering result and a user guideline. This system makes transactions in a B2B marketplace more efficient by reducing the pool of partners to be searched.

Similarity measure for P2P processing of semantic data (시맨틱웹 데이터의 P2P 처리를 위한 유사도 측정)

  • Kim, Byung Gon;Kim, Youn Hee
    • Journal of Korea Society of Digital Industry and Information Management / v.6 no.4 / pp.11-20 / 2010
  • Ontology plays an important role in the semantic web for constructing and querying semantic data. Because of the dynamic nature of ontologies, a P2P environment is considered for ontology processing on the web. For efficient ontology processing in a P2P environment, clustering of peers should be considered. When a new peer is added to the network, allocating the new peer to a cluster is important for system efficiency. To cluster peers with similar characteristics, a method is needed that measures the similarity between the ontology of the added peer and the ontologies in the existing clusters. In this paper, we propose similarity measure techniques for ontologies for clustering peers. The similarity measure method in this paper considers the ontology's structural characteristics, such as schema, class, and property. Experimental results show that ontologies with similar topics, classes, and properties can be allocated to the same cluster.

Fusion of Background Subtraction and Clustering Techniques for Shadow Suppression in Video Sequences

  • Chowdhury, Anuva;Shin, Jung-Pil;Chong, Ui-Pil
    • Journal of the Institute of Convergence Signal Processing / v.14 no.4 / pp.231-234 / 2013
  • This paper introduces a combination of a background subtraction technique and the K-means clustering algorithm for removing shadows from video sequences. Lighting conditions cause problems for segmentation. The proposed method can successfully remove from the segmentation the artifacts associated with lighting changes, such as highlights and reflections, as well as the cast shadows of moving objects. The K-means clustering algorithm is applied to the foreground, which is initially segmented by the background subtraction technique. The estimated shadow region is then superimposed on the background to eliminate the effects that cause redundancy in object detection. Simulation results show that the proposed approach removes shadows and reflections from moving objects with an accuracy of more than 95% in every case considered.

Application of Genetic and Local Optimization Algorithms for Object Clustering Problem with Similarity Coefficients (유사성 계수를 이용한 군집화 문제에서 유전자와 국부 최적화 알고리듬의 적용)

  • Yim, Dong-Soon;Oh, Hyun-Seung
    • Journal of Korean Institute of Industrial Engineers / v.29 no.1 / pp.90-99 / 2003
  • Object clustering classifies a set of objects into groups such that objects in the same group have similar characteristics and objects in different groups have dissimilar characteristics. It has been used in diverse areas such as information retrieval, data mining, and group technology. In this study, an object clustering problem with similarity coefficients between objects is considered. First, an evaluation function for the optimization problem is defined. Then, a genetic algorithm and a local optimization technique based on a heuristic method are proposed and used to obtain near-optimal solutions. Solutions from the genetic algorithm are improved by local optimization techniques based on object relocation and cluster merging. The validity and effectiveness of the proposed algorithms are tested through extensive experiments.
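
The object-relocation step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the paper's evaluation function, so the score used here (the sum of similarity minus a threshold `theta` over all same-cluster pairs) is an assumption, as are the names `evaluate`, `relocate`, and `theta`. The genetic algorithm and the cluster-merging step are not shown.

```python
def evaluate(sim, labels, theta=0.5):
    """Assumed score: sum of (similarity - theta) over same-cluster pairs."""
    n = len(labels)
    return sum(
        sim[i][j] - theta
        for i in range(n)
        for j in range(i + 1, n)
        if labels[i] == labels[j]
    )


def relocate(sim, labels, theta=0.5):
    """Move single objects between clusters until no move raises the score.

    `labels` must be integers; `sim` is a symmetric similarity matrix.
    """
    labels = list(labels)
    n = len(labels)
    improved = True
    while improved:
        improved = False
        for i in range(n):
            # gain[c]: contribution of object i to the score if placed in c.
            # Start with the option of opening a new singleton cluster.
            gain = {max(labels) + 1: 0.0}
            for j in range(n):
                if j != i:
                    gain[labels[j]] = gain.get(labels[j], 0.0) + sim[i][j] - theta
            current = gain.get(labels[i], 0.0)
            best = max(gain, key=gain.get)
            # Only accept strict improvements so the loop always terminates.
            if gain[best] > current + 1e-12:
                labels[i] = best
                improved = True
    return labels
```

Because every accepted move strictly increases the assumed score and the number of partitions is finite, the loop stops at a local optimum. In a hybrid scheme of the kind the abstract describes, such a routine would typically be applied to candidate solutions produced by the genetic algorithm.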

Fuzzy Technique-based Identification of Close and Distant Clusters in Clustering

  • Lee, Kyung-Mi;Lee, Keon-Myung
    • International Journal of Fuzzy Logic and Intelligent Systems / v.11 no.3 / pp.165-170 / 2011
  • Due to advances in hardware performance, user-friendly interfaces are becoming one of the major concerns in information systems. Linguistic conversation is a very natural form of human communication. Fuzzy techniques have been employed to bridge the gap between qualitative linguistic terms and quantitative computerized data. This paper deals with linguistic queries on clustering results, which are intended to retrieve the close clusters or the distant clusters from those results. To support such queries, a fuzzy technique-based method is proposed. The method introduces distance membership functions, namely close and distant membership functions, which transform the metric distance between two objects into a degree of closeness or farness, respectively. To measure the degree of closeness or farness between two clusters, a cluster closeness measure and a cluster farness measure, which incorporate the distance membership functions and the cluster memberships, are considered. For flexibility of clustering, fuzzy clusters are assumed to be formed. This allows close or distant clusters to be queried linguistically by constructing a fuzzy relation based on the measures.
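
The close and distant membership functions described in this abstract can be sketched as follows. The abstract does not specify their shape or how the cluster-level measure combines them, so the linear ramp between two hypothetical thresholds `d_near` and `d_far`, and the membership-weighted average in `cluster_closeness`, are assumptions made only for illustration.

```python
import math


def close_membership(d, d_near, d_far):
    """Degree of closeness: 1 up to d_near, 0 from d_far, linear in between."""
    if d <= d_near:
        return 1.0
    if d >= d_far:
        return 0.0
    return (d_far - d) / (d_far - d_near)


def distant_membership(d, d_near, d_far):
    """Degree of farness, taken here as the complement of closeness."""
    return 1.0 - close_membership(d, d_near, d_far)


def cluster_closeness(points, u_a, u_b, d_near, d_far):
    """Assumed measure: closeness of all point pairs, weighted by the
    product of their fuzzy memberships in clusters A and B."""
    num = 0.0
    den = 0.0
    for i, p in enumerate(points):
        for j, q in enumerate(points):
            w = u_a[i] * u_b[j]
            num += w * close_membership(math.dist(p, q), d_near, d_far)
            den += w
    return num / den if den > 0 else 0.0
```

Here `u_a[i]` and `u_b[i]` are the membership degrees of point `i` in two fuzzy clusters. A farness measure can be built the same way by swapping in `distant_membership`, and a query such as "which clusters are close to A" then reduces to thresholding or ranking these values.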

Fast Search Algorithm for Determining the Optimal Number of Clusters using Cluster Validity Index (클러스터 타당성 평가기준을 이용한 최적의 클러스터 수 결정을 위한 고속 탐색 알고리즘)

  • Lee, Sang-Wook
    • The Journal of the Korea Contents Association / v.9 no.9 / pp.80-89 / 2009
  • A fast and efficient search algorithm for determining the optimal number of clusters in clustering algorithms is presented. The method is based on a cluster validity index, which is a measure of clustering optimality. As the clustering procedure progresses and reaches an optimal cluster configuration, the cluster validity index is expected to be minimized or maximized. In this paper, a fast non-exhaustive search method for finding the optimal number of clusters is designed and shown to work well in clustering. The proposed algorithm is implemented with the k-means++ algorithm as the underlying clustering technique, using CB and PBM as cluster validity indices. Experimental results show that the proposed method is computationally efficient without loss of accuracy on several artificial and real-life data sets.
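
The idea of a non-exhaustive search over the number of clusters can be sketched as below. The abstract does not state the paper's search rule, so the early-stopping scan in `search_num_clusters` (stop after `patience` consecutive values of k that fail to improve the index) is a stand-in, not the authors' algorithm. `pbm_index` follows the usual definition of the PBM index, where larger values are better. The clustering step itself (k-means++ in the abstract) is left to a caller-supplied `score` function.

```python
import math


def pbm_index(points, labels, centers):
    """PBM validity index for a hard clustering (larger is better).

    Assumes at least one point lies away from its cluster center,
    so that the within-cluster distance sum is nonzero.
    """
    k = len(centers)
    dim = len(points[0])
    grand = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    e1 = sum(math.dist(p, grand) for p in points)  # one-cluster scatter
    ek = sum(math.dist(p, centers[c]) for p, c in zip(points, labels))
    dk = max(math.dist(a, b) for a in centers for b in centers)
    return ((e1 / ek) * dk / k) ** 2


def search_num_clusters(score, k_min, k_max, patience=2):
    """Scan k upward; stop after `patience` consecutive non-improving k.

    `score(k)` should cluster the data into k groups and return an index
    to maximize, for example pbm_index on the resulting partition.
    """
    best_k, best_score, stale = None, -math.inf, 0
    for k in range(k_min, k_max + 1):
        s = score(k)
        if s > best_score:
            best_k, best_score, stale = k, s, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_k, best_score
```

The trade-off of any early-stopping rule like this one is that it saves clustering runs at large k but can miss a later peak of the index, which is why the `patience` parameter is exposed.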