Automatic Classification of Web Documents Using the Document Finger Print Method

  • Published : 2004.10.01

Abstract

As the number of documents on the web grows explosively with the rapid development of electronic documents, an efficient system for classifying documents automatically is required. This study proposes a new document classification method, called the Document Finger Print Method, to classify web documents automatically and efficiently. The performance of the proposed method is evaluated along with existing methods: a keyword-based method, a weighted keyword-based method, neural networks, and decision trees. The experiment is designed with 10 document categories and 59 randomly selected words. The results show that the proposed algorithm achieves superior classification performance compared with the other methods. The most important advantage of the proposed method is that it works well without any limit on the number of words in a document.

Keywords