An Incremental Web Document Clustering Based on the Transitive Closure Tree

Youn Sung-Dae;Ko Suc-Bum;

Journal of Korea Multimedia Society

Vol. 9, No. 1 / Pages 1-10 / 2006 / 1229-7771 (pISSN) / 2384-0102 (eISSN)

Korea Multimedia Society

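Youn Sung-Dae (Division of Electronic, Computer and Telecommunication Engineering, Pukyong National University);
Ko Suc-Bum (Department of Computer Science, Graduate School, Pukyong National University)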

Published: 2006.01.01

Abstract

Existing document clustering methods include fast ones such as the k-means algorithm and precise ones such as Hierarchical Agglomerative Clustering (HAC). Each has the other's strength as its weakness: k-means yields lower classification precision, and HAC has a slow processing time. Both methods also share a serious problem: document similarities must be recomputed whenever a new document is inserted. This matters for web resources, whose main property is that they accumulate information through the frequent addition of new documents. Therefore, we propose a transitive closure tree method, based on the precise HAC method, that improves the processing time of document clustering. We also propose an incremental clustering method that efficiently handles the insertion of a new document and the deletion of a document contained in a cluster. To verify its efficiency, the proposed method is compared with existing algorithms on the basis of precision, recall, F-measure, and processing time, and we present the experimental results.
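The abstract does not give the construction of the transitive closure tree, so the following is only a minimal sketch of the general idea it relies on, not the authors' algorithm. It assumes that two documents belong to the same cluster whenever they are linked by a chain of pairwise similarities at or above a threshold (the transitive closure of the "similar" relation). It also assumes Jaccard similarity over term sets. The class name `IncrementalClusters`, the threshold value, and the sample documents are all hypothetical. Inserting a document compares it only against stored documents and leaves existing pairwise similarities untouched.

```python
class IncrementalClusters:
    """Illustrative only: clusters are the connected components of the
    graph whose edges join documents with similarity >= threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.docs = {}    # doc_id -> set of terms
        self.parent = {}  # union-find parent pointers

    def _find(self, doc_id):
        # Follow parent pointers to the cluster representative,
        # shortening the path along the way (path halving).
        while self.parent[doc_id] != doc_id:
            self.parent[doc_id] = self.parent[self.parent[doc_id]]
            doc_id = self.parent[doc_id]
        return doc_id

    @staticmethod
    def _jaccard(a, b):
        # Assumed similarity measure; the abstract does not name one.
        union = a | b
        return len(a & b) / len(union) if union else 0.0

    def insert(self, doc_id, terms):
        if doc_id in self.docs:
            raise ValueError(f"duplicate document id: {doc_id!r}")
        terms = set(terms)
        self.docs[doc_id] = terms
        self.parent[doc_id] = doc_id
        # Only the new document is compared against the stored ones.
        for other_id, other_terms in self.docs.items():
            if other_id == doc_id:
                continue
            if self._jaccard(terms, other_terms) >= self.threshold:
                self.parent[self._find(other_id)] = self._find(doc_id)

    def clusters(self):
        groups = {}
        for doc_id in self.docs:
            groups.setdefault(self._find(doc_id), set()).add(doc_id)
        return sorted(sorted(group) for group in groups.values())


index = IncrementalClusters(threshold=0.5)
index.insert("a", ["web", "cluster", "tree"])
index.insert("b", ["web", "cluster", "graph"])    # similarity 0.5 with "a"
index.insert("c", ["music", "audio"])             # similar to nothing
index.insert("d", ["graph", "cluster", "node"])   # 0.5 with "b", 0.2 with "a"
print(index.clusters())  # [['a', 'b', 'd'], ['c']]
```

Document "d" joins the cluster of "a" only through "b", which is the transitive step. This sketch does not cover deletion, which the paper's method supports: plain union-find cannot split a component, so removing a document would require extra bookkeeping.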
