Selection of Cluster Topic Words in Hierarchical Clustering using K-Means Algorithm

  • Lee Shin Won (Dept of Computer Engineering, Chonbuk National University) ;
  • Yi Sang Seon (Dept of Computer Engineering, Chonbuk National University) ;
  • An Dong Un (Dept of Computer Engineering, Chonbuk National University) ;
  • Chung Sung Jong (Dept of Computer Engineering, Chonbuk National University)
  • Published : 2004.08.01

Abstract

Fast, high-quality document clustering algorithms play an important role in data exploration by organizing large amounts of information into a small number of meaningful clusters. Hierarchical clustering improves retrieval performance and makes the results easier for users to understand. To improve clustering quality, we implemented a hierarchical structure with variety and readability by carefully selecting cluster topic words and deciding the number of clusters dynamically. Selecting topic words is important because the hierarchical cluster structure summarizes the search results. We chose nouns as cluster topic words. The quality of the topic words increased by $33\%$ with the following approach: only nouns were extracted as topic words for the top-level clusters, and topic words that had already been used were not reused for the child clusters.
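
The topic-word rule described above (nouns only, and no reuse of a word further down the hierarchy) can be illustrated with a minimal sketch. It is not the authors' implementation. The `Cluster` class, the `assign_topic_words` function, the per-cluster term weights, and the precomputed noun set are all hypothetical names and inputs introduced here for illustration. The sketch also assumes that "not reused" means a word already chosen anywhere earlier in the top-down traversal is excluded, and that the highest-weighted remaining noun is chosen.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Cluster:
    # Hypothetical input: term -> weight within this cluster (e.g. summed TF-IDF).
    term_weights: Dict[str, float]
    children: List["Cluster"] = field(default_factory=list)
    topic_word: Optional[str] = None


def assign_topic_words(cluster: Cluster, nouns: Set[str],
                       used: Optional[Set[str]] = None) -> Cluster:
    """Label `cluster` and its descendants top-down with noun topic words."""
    used = set() if used is None else used
    # Assumption: candidates are nouns not yet chosen as a topic word.
    candidates = {t: w for t, w in cluster.term_weights.items()
                  if t in nouns and t not in used}
    if candidates:
        # Highest weight wins; ties are broken alphabetically for determinism.
        cluster.topic_word = min(candidates, key=lambda t: (-candidates[t], t))
        used.add(cluster.topic_word)
    for child in cluster.children:
        assign_topic_words(child, nouns, used)
    return cluster


# Toy data: "fast" has the largest weight but is not a noun, so it is skipped.
nouns = {"search", "engine", "index"}
root = Cluster(
    {"search": 5.0, "fast": 9.0, "engine": 3.0},
    children=[
        Cluster({"search": 4.0, "engine": 2.0}),
        Cluster({"search": 3.0, "index": 1.0}),
    ],
)
assign_topic_words(root, nouns)
labels = [root.topic_word] + [c.topic_word for c in root.children]
```

In this toy hierarchy the root is labeled with its heaviest noun, and each child falls back to its next-best noun because the root's word is already taken. In practice the noun set would come from a part-of-speech tagger, which is outside the scope of this sketch.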

Keywords