Clustering Method for Reduction of Cluster Center Distortion

  • Jeong, Hye-C. (Dept. of Electrical Engineering, Yeungnam University)
  • Seo, Suk-T. (Dept. of Electrical Engineering, Yeungnam University)
  • Lee, In-K. (Dept. of Electrical Engineering, Yeungnam University)
  • Kwon, Soon-H. (Dept. of Electrical Engineering, Yeungnam University)
  • Published: 2008.06.25

Abstract

Clustering organizes a given data set into several groups of data with similar properties. Many methods have been proposed and are widely used for this purpose, including K-Means, Fuzzy C-Means (FCM), and the Mountain Method (MM). However, the results of these conventional methods depend strongly on the initial values supplied to them. FCM in particular, the most widely used of these methods, is vulnerable to noisy data, and because it clusters by minimizing the within-cluster variance of the input data, its cluster centers become distorted. In this paper, we propose a clustering method that reduces cluster center distortion by proportionally merging the nearest data based on data weights, and that is not influenced by initial values. We confirm the effectiveness of the proposed method by applying it to various types of data sets and comparing the resulting cluster centers with those obtained by FCM.
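
The abstract does not spell out the merging rule, so the following is only a minimal sketch of one plausible reading of "merging the nearest data based on the data weight". It is not the authors' algorithm. It assumes every point starts with weight 1, that the two closest points are merged into their weight-proportional mean, and that merging stops at a requested number of clusters; under these assumptions it amounts to centroid-linkage agglomerative clustering. The function name `weighted_merge_clustering` and its parameters are hypothetical.

```python
import numpy as np


def weighted_merge_clustering(points, n_clusters):
    """Illustrative sketch only (assumed rule, not the paper's method).

    Repeatedly merges the two closest points into their weighted mean
    until n_clusters points remain. No initial centers are required,
    so the result does not depend on initial values.
    """
    centers = np.asarray(points, dtype=float).copy()
    if centers.ndim == 1:
        centers = centers[:, None]  # treat 1-D input as column vectors
    if not 1 <= n_clusters <= len(centers):
        raise ValueError("n_clusters must be between 1 and the number of points")
    weights = np.ones(len(centers))  # assumption: each datum starts with weight 1

    while len(centers) > n_clusters:
        # Pairwise Euclidean distances; the diagonal is masked out.
        diff = centers[:, None, :] - centers[None, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=-1))
        np.fill_diagonal(dist, np.inf)
        i, j = np.unravel_index(np.argmin(dist), dist.shape)

        # Merge the nearest pair in proportion to their weights.
        total = weights[i] + weights[j]
        merged = (weights[i] * centers[i] + weights[j] * centers[j]) / total

        keep = np.ones(len(centers), dtype=bool)
        keep[[i, j]] = False
        centers = np.vstack([centers[keep], merged])
        weights = np.append(weights[keep], total)

    return centers, weights
```

Calling `weighted_merge_clustering(data, 3)` returns the remaining centers together with the number of original points absorbed into each. In this sketch an isolated noisy point tends to survive as a low-weight singleton instead of pulling a neighboring center toward it, which is the kind of distortion the abstract attributes to FCM.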
