A Non-linear Variant of Global Clustering Using Kernel Methods

Korean title: 커널을 이용한 전역 클러스터링의 비선형화 (Non-linearization of Global Clustering Using Kernels)

  • Received : 2010.01.27
  • Accepted : 2010.03.11
  • Published : 2010.04.30

Abstract

Fuzzy c-means (FCM) is a simple but efficient clustering algorithm, based on the concept of fuzzy sets, that has proven useful in many areas. FCM has several well-known problems, however: sensitivity to initialization, sensitivity to outliers, and a limitation to convex clusters. In this paper, global fuzzy c-means (G-FCM) and kernel fuzzy c-means (K-FCM) are combined to form a non-linear variant of G-FCM, called kernel global fuzzy c-means (KG-FCM). G-FCM is a variant of FCM that selects seeds incrementally, which is effective in alleviating sensitivity to initialization. K-FCM is one of several approaches that reduce the influence of noise and accommodate non-convex clusters; it is used in this paper because it can easily be extended with different kernels. By combining G-FCM and K-FCM, KG-FCM can address the shortcomings listed above. The usefulness of the proposed method is demonstrated by experiments on artificial and real-world data sets.
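
To make the incremental seed selection attributed to G-FCM more concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the standard FCM update rules with squared Euclidean distance, and it assumes that each new seed is chosen by trying every data point as the extra initial center and keeping the run with the lowest objective, in the manner of global k-means. The function names (sq_dist, memberships, fcm, global_fcm) and the default parameters are hypothetical. The kernel part of KG-FCM is not shown; a kernel variant would replace sq_dist with a kernel-induced distance and adapt the center update accordingly.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def memberships(points, centers, m):
    """Standard FCM membership update for fuzzifier m > 1."""
    power = 1.0 / (m - 1.0)
    rows = []
    for x in points:
        d2 = [sq_dist(x, v) for v in centers]
        zeros = [i for i, d in enumerate(d2) if d == 0.0]
        if zeros:
            # A point that sits exactly on a center belongs fully to it.
            row = [1.0 / len(zeros) if i in zeros else 0.0
                   for i in range(len(centers))]
        else:
            inv = [(1.0 / d) ** power for d in d2]
            total = sum(inv)
            row = [w / total for w in inv]
        rows.append(row)
    return rows

def fcm(points, init_centers, m=2.0, max_iter=100, tol=1e-6):
    """Run FCM from the given centers; return (centers, memberships, objective)."""
    centers = [list(v) for v in init_centers]
    dim = len(points[0])
    for _ in range(max_iter):
        u = memberships(points, centers, m)
        updated = []
        for i, old in enumerate(centers):
            w = [row[i] ** m for row in u]
            total = sum(w)
            if total == 0.0:
                updated.append(old)  # degenerate case: keep the old center
                continue
            updated.append([sum(wk * x[d] for wk, x in zip(w, points)) / total
                            for d in range(dim)])
        shift = max(math.dist(a, b) for a, b in zip(centers, updated))
        centers = updated
        if shift < tol:
            break
    u = memberships(points, centers, m)
    objective = sum(row[i] ** m * sq_dist(x, v)
                    for x, row in zip(points, u)
                    for i, v in enumerate(centers))
    return centers, u, objective

def global_fcm(points, c, m=2.0):
    """Assumed incremental seeding: grow from 1 to c clusters, one seed at a time."""
    dim = len(points[0])
    mean = [sum(x[d] for x in points) / len(points) for d in range(dim)]
    best = fcm(points, [mean], m)
    for _ in range(1, c):
        previous = best[0]
        # Try every data point as the extra seed; keep the lowest objective.
        best = min((fcm(points, previous + [list(x)], m) for x in points),
                   key=lambda result: result[2])
    return best
```

Calling global_fcm(points, c) returns the final centers, the membership matrix, and the objective value. The sketch runs one full FCM per candidate seed, so its cost grows quickly with the number of points; it is meant to illustrate the idea rather than to be efficient.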
