• Title/Summary/Keyword: Clustering Problem


An Incremental Similarity Computation Method in Agglomerative Hierarchical Clustering

  • Jung, Sung-young;Kim, Taek-soo
    • Journal of the Korean Institute of Intelligent Systems / v.11 no.7 / pp.579-583 / 2001
  • In data clustering in high-dimensional space, one difficulty is the time-consuming computation of vector similarities. The cost grows for the agglomerative algorithm with the group-average link and mean centroid method, because cluster similarities must be recomputed whenever a cluster center moves after a merging step. To address this problem, we present an incremental method of similarity computation that replaces the time-consuming vector similarity calculation with a scalar calculation for several measures, including squared distance, inner product, cosine, and minimum variance. Experimental results show that the method significantly speeds up clustering of very high-dimensional data.

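
The idea of replacing a vector computation with a scalar one can be illustrated with the standard Lance-Williams update for squared Euclidean distance between centroids. This is a minimal sketch of that well-known identity, not necessarily the method of the paper above; the function names are hypothetical.

```python
def sq_dist(a, b):
    """Direct squared Euclidean distance: costs O(dimension)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def merge_centroids(c_i, n_i, c_j, n_j):
    """Size-weighted mean centroid of two merged clusters."""
    n = n_i + n_j
    return [(n_i * x + n_j * y) / n for x, y in zip(c_i, c_j)]


def merged_sq_dist(d_ki, d_kj, d_ij, n_i, n_j):
    """Squared distance from cluster k to the merged cluster (i, j).

    Uses only the three previously known scalar distances and the
    cluster sizes, so the cost is O(1) regardless of dimension.
    """
    n = n_i + n_j
    return (n_i / n) * d_ki + (n_j / n) * d_kj - (n_i * n_j / n ** 2) * d_ij


# Small check that the scalar update matches the direct vector computation.
c_i, n_i = [0.0, 0.0], 1
c_j, n_j = [2.0, 0.0], 1
c_k = [0.0, 3.0]

direct = sq_dist(c_k, merge_centroids(c_i, n_i, c_j, n_j))
incremental = merged_sq_dist(
    sq_dist(c_k, c_i), sq_dist(c_k, c_j), sq_dist(c_i, c_j), n_i, n_j
)
```

Both `direct` and `incremental` evaluate to 10.0 here; the incremental form never touches the vectors once the pairwise distances are cached.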

A new Ensemble Clustering Algorithm using a Reconstructed Mapping Coefficient

  • Cao, Tuoqia;Chang, Dongxia;Zhao, Yao
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.7 / pp.2957-2980 / 2020
  • Ensemble clustering commonly integrates multiple basic partitions to obtain a more accurate clustering result than a single partition. However, the transformation from the original space to the integrated space is inevitably incomplete. In this paper, a novel ensemble clustering algorithm using a newly reconstructed mapping coefficient (ECRMC) is proposed. In the algorithm, a reconstructed mapping coefficient between objects and micro-clusters is designed, based on the principle of increasing information entropy, to enhance effective information. This reduces the information loss in the transformation from micro-clusters to the original space. The correlation of the micro-clusters is then calculated with the Spearman coefficient. As a result, the revised co-association graph between objects can be built more accurately, because the supplementary information helps ensure the completeness of the whole conversion process. Experimental results demonstrate that the ECRMC clustering algorithm has high performance, effectiveness, and feasibility.
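
For readers unfamiliar with co-association, the sketch below builds the plain co-association matrix from several base partitions: each entry is the fraction of partitions in which two objects share a cluster. It illustrates only this standard building block, not the reconstructed mapping coefficient or the Spearman step of ECRMC; the function name and toy partitions are assumptions.

```python
def co_association(partitions):
    """Return an n x n matrix of pairwise co-clustering frequencies.

    `partitions` is a list of label lists, one per base clustering,
    all of the same length n.
    """
    n = len(partitions[0])
    m = len(partitions)
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    # Normalize counts by the number of base partitions.
    return [[c / m for c in row] for row in counts]


# Three toy base partitions of four objects (labels are arbitrary per partition).
partitions = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
ca = co_association(partitions)
```

Objects 0 and 1 are grouped together in every partition, so `ca[0][1]` is 1.0, while objects 0 and 3 never share a cluster, so `ca[0][3]` is 0.0.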

Combined Artificial Bee Colony for Data Clustering (융합 인공벌군집 데이터 클러스터링 방법)

  • Kang, Bum-Su;Kim, Sung-Soo
    • Journal of Korean Society of Industrial and Systems Engineering / v.40 no.4 / pp.203-210 / 2017
  • Data clustering is one of the most difficult and challenging problems and can be formally considered a particular kind of NP-hard grouping problem. The K-means algorithm is one of the most popular and widely used clustering methods because it is easy to implement and very efficient. However, it is prone to becoming trapped in local optima, and on large data sets its solutions vary widely with different initial values. Therefore, an efficient computational intelligence method is needed to find the globally optimal solution to the data clustering problem within limited computational time. The objective of this paper is to propose a combined artificial bee colony (CABC) method, which uses K-means for initialization and finalization, to find effective solutions to the data clustering optimization problem. The artificial bee colony (ABC) is an algorithm motivated by the intelligent behavior exhibited by honeybees when searching for food. The performance of ABC is better than or similar to that of other population-based algorithms, with the added advantage of employing fewer control parameters. Our proposed CABC method provides near-optimal solutions within reasonable time by balancing converged and diversified searches. The experiments and analysis of clustering problems demonstrate that CABC is competitive with previous partitioning approaches with respect to solution quality. We validate the performance of CABC on the Iris, Wine, Glass, Vowel, and Cloud datasets from the UCI machine learning repository, comparing against previous studies. In our simulations, the proposed KABCK (K-means+ABC+K-means) performs better than ABCK (ABC+K-means), KABC (K-means+ABC), ABC, and K-means.

A Fast K-means and Fuzzy-c-means Algorithms using Adaptively Initialization (적응적인 초기치 설정을 이용한 Fast K-means 및 Fuzzy-c-means 알고리즘)

  • 강지혜;김성수
    • Journal of KIISE:Software and Applications / v.31 no.4 / pp.516-524 / 2004
  • In this paper, the initial value problem in clustering with K-means or Fuzzy-c-means is considered in order to reduce the number of iterations. Conventionally, the initial values are chosen randomly, which sometimes causes the clustering process to converge to undesired center points. Clustering with K-means or Fuzzy-c-means is sensitive to the choice of initial values, and choosing them has been a well-known open problem. As an approach to this problem, the uniform partitioning method is employed to extract the optimal initial point for each cluster of data. Experimental results are presented to demonstrate the superiority of the proposed method, which reduces the number of iterations needed to find the center points of the clusters.
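
The abstract does not spell out the uniform partitioning method, so the following is only one possible reading, offered as a sketch: sort the points, split them into k equally sized groups, and use each group's mean as an initial center in place of a random draw. The function name and the sorting rule are assumptions.

```python
def uniform_partition_init(points, k):
    """Deterministic initial centers from k equal-size slices of sorted points.

    Assumes `points` is a list of equal-length tuples with len(points) >= k.
    Sorting is lexicographic, so the first coordinate dominates the order.
    """
    ordered = sorted(points)
    n = len(ordered)
    dim = len(ordered[0])
    centers = []
    for p in range(k):
        chunk = ordered[p * n // k:(p + 1) * n // k]
        # Mean of the slice, coordinate by coordinate.
        centers.append(
            tuple(sum(pt[d] for pt in chunk) / len(chunk) for d in range(dim))
        )
    return centers


points = [(9.0, 1.0), (1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (9.2, 1.2), (5.2, 4.8)]
initial_centers = uniform_partition_init(points, 3)
```

On this toy data each slice coincides with one visible group, so the initial centers already sit near (1.1, 0.9), (5.1, 4.9), and (9.1, 1.1); the same call always returns the same centers, unlike random initialization.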

An Energy-Efficient Sensor Network Clustering Using the Hybrid Setup (하이브리드 셋업을 이용한 에너지 효율적 센서 네트워크 클러스터링)

  • Min, Hong-Ki
    • Journal of the Institute of Convergence Signal Processing / v.12 no.1 / pp.38-43 / 2011
  • Cluster-based routing suffers from high energy consumption at cluster head nodes. A recent approach to resolving this problem is the dynamic clustering technique, which periodically re-selects cluster head nodes to distribute energy consumption across the sensor nodes. However, the dynamic clustering technique has the problem that repeatedly reconstructing clusters consumes additional energy. This paper proposes a solution to these problems from an energy efficiency perspective. The round-robin cluster header (RRCH) technique, which fixes the initially structured cluster and sequentially selects cluster head nodes, is suggested for solving the energy consumption problem caused by repetitive cluster construction. Simulation results were compared with the performance of two of the most widely used conventional techniques, the LEACH (Low Energy Adaptive Clustering Hierarchy) and HEED (Hybrid, Energy Efficient, Distributed Clustering) algorithms, based on energy consumption, remaining energy per node, and uniform distribution. The evaluation confirmed that, in terms of energy consumption, the proposed technique was 26.5% and 20% more efficient than LEACH and HEED, respectively.
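
The round-robin idea described above (keep the cluster membership fixed, rotate the head role) can be sketched in a few lines. This is an illustrative toy under assumed names, not the RRCH protocol itself: node identifiers, the round counter, and the function are hypothetical, and no radio or energy model is included.

```python
def head_for_round(cluster_members, round_index):
    """Pick the cluster head for a round by cycling through a fixed member list.

    The cluster is formed once; only the head role rotates, so no
    re-clustering messages are needed between rounds.
    """
    return cluster_members[round_index % len(cluster_members)]


cluster = ["n1", "n2", "n3"]  # fixed after the initial setup phase
schedule = [head_for_round(cluster, r) for r in range(5)]
```

Here `schedule` is `["n1", "n2", "n3", "n1", "n2"]`: each node serves as head once per cycle, which spreads the head's extra energy cost evenly across the cluster.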

A Certain Class of Root Clustering of Control Systems with Structured Uncertainty (구조적불확실성을 갖는 제어시스템의 Root Clustering 해석)

  • 조태신;김영철
    • Journal of the Korean Institute of Telematics and Electronics B / v.32B no.10 / pp.1259-1268 / 1995
  • This note presents the robust root clustering problem of interval systems whose characteristic equation may be given as either a family of interval polynomials or a family of polytopes. To correspond approximately to damping ratio and robustness margin, we consider a certain class of D-regions in the complex plane, such as the parabola, left-hyperbola, and ellipse. A simpler D-stability criterion using rational function mapping is then presented and proved. Without λ or ω sweeping calculations, the absolute criteria for robust D-stability can be determined.


Clustering based object feature matching for multi-camera system (멀티 카메라 연동을 위한 군집화 기반의 객체 특징 정합)

  • Kim, Hyun-Soo;Kim, Gyeong-Hwan
    • Proceedings of the IEEK Conference / 2008.06a / pp.915-916 / 2008
  • We propose a clustering-based object feature matching method for identifying the same object in a multi-camera system. The method focuses on ease of system initialization and extension. Clustering is used to estimate the parameters of Gaussian mixture models of objects. A similarity measure between models is determined by the Kullback-Leibler divergence. This method can be applied to the occlusion problem in tracking.

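
As background for the similarity measure mentioned above, the sketch below computes the closed-form Kullback-Leibler divergence between two univariate Gaussians and a symmetrized version. It is a simplification: the KL divergence between full Gaussian mixture models has no closed form in general, and the abstract does not say how the authors evaluate it. Function names are hypothetical.

```python
import math


def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """KL(P || Q) for P = N(mu_p, sigma_p^2) and Q = N(mu_q, sigma_q^2)."""
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
        - 0.5
    )


def symmetric_kl(mu_p, sigma_p, mu_q, sigma_q):
    """Symmetrized divergence, usable as a dissimilarity between two models."""
    return kl_gaussian(mu_p, sigma_p, mu_q, sigma_q) + kl_gaussian(
        mu_q, sigma_q, mu_p, sigma_p
    )


same = kl_gaussian(0.0, 1.0, 0.0, 1.0)
shifted = kl_gaussian(0.0, 1.0, 1.0, 1.0)
```

Identical distributions give `same == 0.0`, and shifting the mean by one standard deviation gives `shifted == 0.5`; smaller values indicate more similar appearance models.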

Analysis of Document Clustering Varing Cluster Centroid Decisions (클러스터 중심 결정 방법에 따른 문서 클러스터링 성능 분석)

  • 오형진;변동률;이신원;박순철;정성종;안동언
    • Proceedings of the IEEK Conference / 2002.06c / pp.99-102 / 2002
  • The K-means clustering algorithm is a very popular clustering technique used in the field of information retrieval. In this paper, we address the problem of how the K-means algorithm creates centroids, and we suggest a method that reflects document features and considers the context of each document when determining new centroids. For the experiments, we used an automatic document summarizer to summarize the Reuters-21578 newswire test dataset and achieved a 20% improvement in recall.


Multiple Peak Detection Using the Extended Fuzzy Clustering (확장된 퍼지 클러스터링 알고리즘을 이용한 다중 첨두 검출)

  • 김수환;조창호;강경진;이태원
    • Journal of the Korean Institute of Telematics and Electronics B / v.29B no.1 / pp.102-112 / 1992
  • In a previous paper, we proposed an extended fuzzy clustering algorithm that considers the importance of the data to be classified. In this paper, we suggest a new method based on the extended fuzzy clustering algorithm to solve the multiple peak detection problem, and we show experimentally that the algorithm can detect multiple peaks adaptively with respect to noise and peak shape.


Clustering Method of Weighted Preference Using K-means Algorithm and Bayesian Network for Recommender System (추천시스템을 위한 k-means 기법과 베이시안 네트워크를 이용한 가중치 선호도 군집 방법)

  • Park, Wha-Beum;Cho, Young-Sung;Ko, Hyung-Hwa
    • Journal of Information Technology Applications and Management / v.20 no.3_spc / pp.219-230 / 2013
  • Real-time accessibility and agility are required in ubiquitous commerce under a ubiquitous computing environment. Research has been actively conducted in e-commerce to improve the accuracy of recommendation. Existing collaborative filtering (CF) cannot reflect the contents of items, and it suffers from problems in selecting the neighborhood user group as well as from sparsity and scalability. Although systems addressing these defects have been put to practical use, they still do not reflect the attributes of items. In this paper, to solve this problem, we use an implicit method based on customer data and purchase history data. We propose a new clustering method of weighted customer preference using k-means clustering and a Bayesian network in order to improve the accuracy of recommendation. To verify the improved performance of the proposed system, we conduct experiments with a dataset collected from an online cosmetics shopping mall.