Web Document Clustering based on Graph using Hyperlinks

Lee, Joon; Kang, Jin-Beom; Choi, Joong-Min

Proceedings of the HCI Society of Korea Conference (한국HCI학회 학술대회논문집), 2009.02a, pp. 590-595, 2009

The HCI Society of Korea (한국HCI학회)


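Lee, Joon (Department of Computer Engineering, Hanyang University);
Kang, Jin-Beom (Department of Computer Engineering, Hanyang University);
Choi, Joong-Min (Department of Computer Engineering, Hanyang University)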

Published: 2009.02.09


Abstract

As the number of web documents on the internet grows exponentially, the performance and speed of web document clustering have become increasingly important for information retrieval. Web document clustering groups semantically related web documents into the same cluster, which makes retrieval faster and the information it returns more accurate. Clustering based on a mesh graph achieves high recall by computing the similarity between all pairs of documents, but it incurs a high computation cost. This paper proposes a clustering method that uses hyperlinks, a structural feature of web documents, to maintain the recall and precision of mesh-graph clustering while reducing its computation cost.
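The abstract does not specify the algorithm, so the following is only a minimal sketch of the general idea it describes: computing similarity only for hyperlinked document pairs instead of for every pair in a mesh graph. Everything concrete here is an assumption made for illustration, not the authors' method: the bag-of-words cosine similarity, the 0.3 threshold, the union-find grouping, and the toy documents and links.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster(docs, pairs, threshold=0.3):
    """Merge documents whose similarity over the given pairs meets the threshold.

    Returns the clusters and the number of similarity computations performed.
    The threshold is a hypothetical value chosen for this toy example.
    """
    vectors = {name: Counter(text.lower().split()) for name, text in docs.items()}
    parent = {name: name for name in docs}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    comparisons = 0
    for a, b in pairs:
        comparisons += 1
        if cosine(vectors[a], vectors[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in docs:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(g) for g in groups.values()), comparisons


# Hypothetical toy corpus and hyperlink list (not from the paper).
docs = {
    "a": "graph clustering of web documents",
    "b": "web documents clustering with hyperlinks",
    "c": "recipe for kimchi stew",
    "d": "kimchi stew recipe with pork",
}
links = [("a", "b"), ("c", "d"), ("b", "c")]

# Mesh graph: every pair of documents is compared.
mesh_clusters, mesh_cost = cluster(docs, combinations(docs, 2))
# Hyperlink graph: only linked pairs are compared.
link_clusters, link_cost = cluster(docs, links)

print(mesh_clusters, mesh_cost)
print(link_clusters, link_cost)
```

On this toy input both variants produce the same two clusters, while the hyperlink variant performs 3 similarity computations instead of 6. Whether the saving carries over to real collections without loss of recall depends on how well hyperlinks cover semantically related pages, which is the question the paper addresses.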
