The Similarity Plot for Comparing Clustering Methods

군집분석 방법들을 비교하기 위한 상사그림

  • Received : 2012.11.20
  • Accepted : 2013.04.11
  • Published : 2013.04.30


There is a wide variety of clustering algorithms, so we need a measure of similarity between two clustering methods. Such a measure can be used to compare how different clustering algorithms perform on a set of data. As the number of clustering algorithms under comparison grows, so does the number of pairwise similarity values between them. We therefore need a simple tool that presents these many similarity values together, so that many clustering methods can be compared at once. We suggest some graphical tools for comparing many clustering methods.

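There are a great many algorithms for cluster analysis. To compare how these clustering methods divide objects into several clusters, we need an agreement measure that indicates how closely the resulting clusters coincide. As the number of clustering methods under consideration grows, the number of agreement-measure values grows with it. We therefore need a tool that shows multiple clustering methods and their corresponding agreement values at a glance. In this paper, we propose graphical tools that do so.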
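To illustrate why the number of similarity values grows quickly, the following minimal sketch computes one pairwise value for every pair of clustering results. It assumes the plain Rand index as the similarity measure; the abstract does not name a specific measure, so this choice, the function names `rand_index` and `similarity_matrix`, and the label vectors are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of object pairs on which two partitions agree.

    A pair counts as an agreement when both partitions put the two
    objects in the same cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two objects")
    agree = 0
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:
            agree += 1
    return agree / len(pairs)


def similarity_matrix(partitions):
    """Nested dict holding the Rand index for every pair of methods."""
    names = list(partitions)
    return {
        a: {b: rand_index(partitions[a], partitions[b]) for b in names}
        for a in names
    }


# Hypothetical cluster labels for six objects from three methods.
partitions = {
    "method_A": [0, 0, 1, 1, 2, 2],
    "method_B": [1, 1, 0, 0, 2, 2],  # same grouping as A, labels permuted
    "method_C": [0, 0, 0, 1, 1, 1],
}

matrix = similarity_matrix(partitions)
n_pairs = len(list(combinations(partitions, 2)))  # k * (k - 1) / 2 values

for name, row in matrix.items():
    print(name, ["%.2f" % row[other] for other in partitions])
```

With k methods there are k(k-1)/2 distinct pairwise values, so a table of numbers becomes hard to read as k grows. A matrix like this one is the kind of input a graphical summary would take.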


