Document ranking methods using term dependencies from a thesaurus

  • Published: 1993.12.01

Abstract

In recent years, various document ranking methods such as Relevance, R-Distance, and K-Distance have been developed for use in thesaurus-based Boolean retrieval systems. By using term dependence information from a thesaurus, they give high-quality document rankings in many cases. However, they suffer from several problems resulting from inefficient and ineffective evaluation of the Boolean operators AND, OR, and NOT. In this paper, we propose new thesaurus-based document ranking methods called KB-FSM and KB-EBM, obtained by extending the enhanced fuzzy set model and the extended Boolean model so that they can use term dependencies from a thesaurus effectively. The proposed methods overcome the problems of the previous methods. We also show through performance comparison that KB-FSM and KB-EBM rank documents more accurately, and so provide higher retrieval effectiveness, than Relevance, R-Distance, and K-Distance.
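
The abstract does not spell out how AND, OR, and NOT are evaluated. As background only, the following is a minimal sketch of two well-known baseline formulations: the classic fuzzy set model (min/max), which the enhanced fuzzy set model is understood to refine, and the p-norm extended Boolean model with unweighted query terms. It is not KB-FSM or KB-EBM and uses no thesaurus information. The function names and the sample weights are illustrative assumptions.

```python
from typing import Sequence


def negate(weight: float) -> float:
    """NOT under both models: complement of a document term weight in [0, 1]."""
    return 1.0 - weight


def fuzzy_and(weights: Sequence[float]) -> float:
    """Classic fuzzy set AND: the minimum of the operand weights."""
    return min(weights)


def fuzzy_or(weights: Sequence[float]) -> float:
    """Classic fuzzy set OR: the maximum of the operand weights."""
    return max(weights)


def pnorm_or(weights: Sequence[float], p: float = 2.0) -> float:
    """p-norm OR with unweighted query terms: normalized distance from (0, ..., 0)."""
    n = len(weights)
    return (sum(w ** p for w in weights) / n) ** (1.0 / p)


def pnorm_and(weights: Sequence[float], p: float = 2.0) -> float:
    """p-norm AND with unweighted query terms: 1 minus normalized distance from (1, ..., 1)."""
    n = len(weights)
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / n) ** (1.0 / p)


if __name__ == "__main__":
    # Hypothetical document: strong match on one query term, weak match on the other.
    doc = [0.9, 0.1]
    print("fuzzy AND :", fuzzy_and(doc))
    print("fuzzy OR  :", fuzzy_or(doc))
    print("p-norm AND:", pnorm_and(doc, p=2.0))
    print("p-norm OR :", pnorm_or(doc, p=2.0))
```

Under min/max, the score depends on a single operand, so documents that differ only in their other term weights receive the same rank. The p-norm operators let every operand contribute: with p = 1 both AND and OR reduce to the plain average, and as p grows they approach min and max respectively.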
