An Improved K-means Document Clustering using Concept Vectors

Shin, Yang-Kyu

Journal of the Korean Data and Information Science Society

Volume 14 Issue 4 / Pages 853-861 / 2003 / 1598-9402 (pISSN)

The Korean Data and Information Science Society (한국데이터정보과학회)


Published : 2003.11.30


Abstract

This paper presents an improved K-means document clustering method in which a concept vector is maintained for each cluster on the basis of the cosine similarity of text documents. The concept vectors are unit vectors, normalized to lie on the sphere in n-dimensional space. Because the standard K-means method is sensitive to its initial starting condition, our improvement focuses on the starting condition used for estimating the modes of a distribution. The improved K-means clustering algorithm was applied to a set of text documents called Classic3 to evaluate the efficiency and correctness of the clustering result, and it showed a 7% improvement in the worst case.
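
To make the idea of concept vectors concrete, the following is a minimal sketch of baseline K-means on the unit sphere, assuming documents are already given as non-negative term-weight vectors. It is not the improved method of the paper: the abstract does not describe how the starting condition is chosen, so the sketch simply accepts the starting documents through a hypothetical `init_indices` parameter and falls back to a random choice. The function names (`normalize`, `concept_vector`, `spherical_kmeans`) are illustrative and do not come from the paper.

```python
import math
import random


def normalize(v):
    """Scale v to unit Euclidean length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(u, v):
    """Dot product; for unit vectors this equals the cosine similarity."""
    return sum(a * b for a, b in zip(u, v))


def concept_vector(members):
    """Concept vector of a cluster: the normalized sum of its unit document vectors."""
    return normalize([sum(column) for column in zip(*members)])


def spherical_kmeans(docs, k, init_indices=None, iters=20, seed=0):
    """Cluster docs by cosine similarity to k concept vectors.

    init_indices (hypothetical parameter) selects the documents used as the
    starting concept vectors; if omitted, k documents are drawn at random.
    """
    docs = [normalize(d) for d in docs]
    if init_indices is None:
        init_indices = random.Random(seed).sample(range(len(docs)), k)
    concepts = [list(docs[i]) for i in init_indices]

    labels = None
    for _ in range(iters):
        # Assignment step: each document joins the most similar concept vector.
        new_labels = [max(range(k), key=lambda j: dot(d, concepts[j])) for d in docs]
        if new_labels == labels:
            break  # assignments are stable, so the concept vectors are too
        labels = new_labels
        # Update step: recompute each non-empty cluster's concept vector.
        for j in range(k):
            members = [d for d, label in zip(docs, labels) if label == j]
            if members:
                concepts[j] = concept_vector(members)
    return labels, concepts


if __name__ == "__main__":
    toy_docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]
    labels, concepts = spherical_kmeans(toy_docs, k=2, init_indices=[0, 2])
    print(labels)
```

Because the result depends on which documents seed the concept vectors, different values of `init_indices` can lead to different clusterings on real data, which is the sensitivity the abstract refers to.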
