A symbiotic evolutionary algorithm for the clustering problems with an unknown number of clusters

  • Shin, Kyoung-Seok (Dept. of Industrial Engineering, Chonnam National University)
  • Kim, Jae-Yun (Dept. of Business Administration, Chonnam National University)
  • Received : 2011.01.07
  • Accepted : 2011.01.28
  • Published : 2011.03.31

Abstract

Clustering is a useful method for classifying objects into subsets that are meaningful in the context of a particular problem, and it has been applied in a variety of fields, including customer relationship management, data mining, pattern recognition, and biotechnology. This paper addresses clustering problems in which the number of clusters K is unknown and presents a new approach based on a coevolutionary algorithm to solve them. Coevolutionary algorithms are known to be very efficient, compared with classical algorithms, at solving integrated optimization problems with a high degree of complexity. The problem considered in this paper can be divided into two sub-problems: finding the number of clusters and classifying the data into these clusters. To apply a coevolutionary algorithm, an algorithmic framework and genetic elements suited to the sub-problems are proposed. In addition, a neighborhood-based evolutionary strategy is employed to maintain population diversity. To analyze the proposed algorithm, experiments are performed on various test-bed problems grouped into several classes. The experimental results confirm the effectiveness of the proposed algorithm.
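
The two-sub-problem decomposition described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not specify the fitness function, the genetic operators, or the neighborhood structure, so the code below assumes a Davies-Bouldin index as the fitness measure, uses simple mutation only, and omits the neighborhood-based strategy entirely. All names (`coevolve`, `pop_k`, `pop_c`, `kmax`) are hypothetical. One population holds candidate cluster counts, the other holds candidate center lists, and an individual is only evaluated together with a partner from the other population.

```python
import random

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: dist(pt, centers[j]))
            for pt in points]

def davies_bouldin(points, labels, k):
    """Davies-Bouldin index (lower is better); inf for degenerate partitions."""
    groups = [[pt for pt, lab in zip(points, labels) if lab == j] for j in range(k)]
    if any(not g for g in groups):
        return float("inf")
    cents = [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups]
    scatter = [sum(dist(pt, cents[j]) for pt in groups[j]) / len(groups[j])
               for j in range(k)]
    total = 0.0
    for i in range(k):
        worst = 0.0
        for j in range(k):
            if j != i:
                d = dist(cents[i], cents[j])
                if d == 0:
                    return float("inf")
                worst = max(worst, (scatter[i] + scatter[j]) / d)
        total += worst
    return total / k

def coevolve(points, kmax=5, pop=6, gens=150, seed=0):
    """Toy cooperative coevolution (assumed design, not the paper's).

    pop_k: candidate numbers of clusters (sub-problem 1).
    pop_c: candidate lists of kmax centers; only the first k are used
           when paired with a partner k (sub-problem 2).
    """
    rng = random.Random(seed)
    pop_k = [rng.randint(2, kmax) for _ in range(pop)]
    pop_c = [rng.sample(points, kmax) for _ in range(pop)]

    def fitness(k, cs):
        centers = cs[:k]
        return davies_bouldin(points, assign(points, centers), k)

    best = (float("inf"), None, None)
    for _ in range(gens):
        # Pick one partner from each population.
        i, j = rng.randrange(pop), rng.randrange(pop)
        # Mutate the cluster count; keep it if the pair does not get worse.
        new_k = min(kmax, max(2, pop_k[i] + rng.choice([-1, 1])))
        if fitness(new_k, pop_c[j]) <= fitness(pop_k[i], pop_c[j]):
            pop_k[i] = new_k
        # Mutate one center; keep it if the pair does not get worse.
        new_c = list(pop_c[j])
        new_c[rng.randrange(kmax)] = rng.choice(points)
        if fitness(pop_k[i], new_c) <= fitness(pop_k[i], pop_c[j]):
            pop_c[j] = new_c
        f = fitness(pop_k[i], pop_c[j])
        if f < best[0]:
            best = (f, pop_k[i], pop_c[j][:pop_k[i]])
    return best  # (score, number of clusters, centers)
```

Calling `coevolve(points)` on a list of distinct coordinate tuples (at least `kmax` of them) returns the best score found together with the chosen cluster count and centers. Pairing each individual with a partner from the other population is what makes the search cooperative: neither a cluster count nor a center list has a fitness on its own.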
