Design of Keyword Extraction System Using TFIDF

TFIDF를 이용한 키워드 추출 시스템 설계

  • Published: 2002.03.01

Abstract

In this paper, a test was performed to determine whether the words in anchor text are appropriate as keywords. The test showed that some words had a high weighting factor and were suitable, while others did not even appear in the text and were therefore not appropriate as keywords. To resolve this problem, a new keyword extraction method is proposed. With the proposed method, inappropriate keywords are removed and new keywords are set, after which the keywords can be ranked using the TFIDF value as their weighting factor. The new method was verified to have higher accuracy than previous methods.
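
The filtering and ranking idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact weighting formula, so the sketch assumes the common form tf × log(N / df). The function names `tokenize` and `rank_anchor_keywords`, the whitespace tokenizer, and the toy corpus are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()


def rank_anchor_keywords(anchor_words, doc_index, corpus):
    """Drop anchor-text words absent from the target document and rank
    the remaining ones by an assumed TF-IDF weight, tf * log(N / df).

    corpus is a list of token lists; corpus[doc_index] is the target page.
    """
    tf = Counter(corpus[doc_index])
    n_docs = len(corpus)
    ranked = []
    for word in {w.lower() for w in anchor_words}:
        if tf[word] == 0:
            # The word never appears in the page text, so it is removed.
            continue
        # df >= 1 here because the target document itself contains the word.
        df = sum(1 for tokens in corpus if word in tokens)
        ranked.append((word, tf[word] * math.log(n_docs / df)))
    # Highest weight first; ties are broken alphabetically.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


corpus = [
    tokenize("tfidf keyword extraction ranks keyword candidates"),
    tokenize("search engine ranks pages"),
    tokenize("anchor text describes pages"),
]
anchor_words = ["keyword", "ranks", "homepage"]
result = rank_anchor_keywords(anchor_words, 0, corpus)
for word, weight in result:
    print(f"{word}: {weight:.3f}")
```

In this toy corpus, "homepage" is discarded because it does not occur in the target page, and "keyword" ranks above "ranks" because it is both more frequent in the page and rarer across the corpus.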
