An experiment in automatic indexing with Korean texts: a comparison of syntactico-statistical and manual methods

Korean title (translated): A study on automatic indexing of Korean using syntactico-statistical techniques

  • Published : 1993.06.01

Abstract

This study was undertaken to develop practical automatic indexing techniques suitable for Korean natural language texts. It takes a modest step toward this goal by developing an automatic syntactico-statistical indexing method and evaluating it by comparing the results with manual indexing. For this experimental study, a Korean text database was constructed manually from 300 abstracts covering business subjects. The experimental results showed that the performance of the automatic syntactico-statistical indexing system was comparable to that reported in other studies that have compared automatic indexing with manual indexing.

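Korean abstract (translated): This paper implements an experimental syntactico-statistical automatic indexing system that automatically extracts index terms representative of the subject from Korean natural language texts. A syntactico-statistical automatic indexing system is one that selects single words and noun phrases together, using morphological analysis and word weighting techniques. To measure the system's performance, the single-word and compound index terms selected from 300 Korean abstracts of scholarly papers and theses were compared with manual indexing. Based on these experimental results, the paper presents basic data needed for developing automatic indexing of Korean, an area in which research is still underdeveloped.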
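The abstract does not say which weighting formula or evaluation measure the system used, so the following is only a minimal sketch of the general idea: weight candidate terms, keep the highest-weighted ones as index terms, and compare them with a manual indexer's terms. It assumes that morphological analysis has already produced a list of candidate terms (single words or noun phrases) for each abstract. The TF-IDF weighting, the `top_k` cutoff, the precision/recall comparison, and the function names `tfidf_index_terms` and `overlap` are all illustrative assumptions, not the paper's method.

```python
import math
from collections import Counter


def tfidf_index_terms(docs, top_k=3):
    """Pick the top_k candidate terms of each document by TF-IDF weight.

    docs: list of documents, each a list of candidate terms (strings).
    TF-IDF is an assumed stand-in for the unspecified weighting scheme.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for terms in docs:
        doc_freq.update(set(terms))

    selected = []
    for terms in docs:
        tf = Counter(terms)
        weights = {t: tf[t] * math.log(n_docs / doc_freq[t]) for t in tf}
        # Sort by descending weight; break ties alphabetically for determinism.
        ranked = sorted(weights, key=lambda t: (-weights[t], t))
        selected.append(ranked[:top_k])
    return selected


def overlap(auto_terms, manual_terms):
    """Return (precision, recall) of automatic terms against manual terms."""
    auto, manual = set(auto_terms), set(manual_terms)
    matched = len(auto & manual)
    precision = matched / len(auto) if auto else 0.0
    recall = matched / len(manual) if manual else 0.0
    return precision, recall


# Toy data (hypothetical, not from the study's 300 abstracts).
docs = [
    ["market", "market", "price", "the"],
    ["bank", "loan", "the", "loan"],
    ["price", "bank", "the"],
]
auto = tfidf_index_terms(docs, top_k=2)
print(auto[0], overlap(auto[0], ["market", "stock"]))
```

A term that appears in every document (here `the`) gets weight zero and drops to the bottom of the ranking, which is the usual motivation for combining within-document frequency with collection-wide rarity.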
