A Parser of Definitions in Korean Dictionary based on Probabilistic Grammar Rules

확률적 문법규칙에 기반한 국어사전의 뜻풀이말 구문분석기

  • Published : 2001.05.01

Abstract

Definitions in a Korean dictionary not only describe the meanings of headwords but also carry various kinds of semantic information, such as hypernymy/hyponymy, meronymy/holonymy, polysemy, homonymy, synonymy, antonymy, and semantic features. This paper aims to implement a parser as a basic tool for automatically acquiring this semantic information from dictionary definitions. To this end, we first constructed a part-of-speech tagged corpus and a tree-tagged corpus from the definitions. We then automatically extracted from these corpora, by statistical methods, the frequencies of words that are ambiguous in their part-of-speech tag, together with the grammar rules and their probabilities. The parser is a probabilistic chart parser that uses the extracted data. The word frequencies and rule probabilities are used to resolve the structural ambiguity of noun phrases during parsing. To reduce the number of nodes generated during parsing and to improve performance, the parser uses grammar factoring, best-first search, and Viterbi search. We ran experiments with grammar rule probabilities, left-to-right parsing, and left-first search. The parser was most accurate when it used grammar rule probabilities and left-first search together, achieving a recall of 51.74% and a precision of 87.47% on a raw corpus.
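As an illustration of the rule-extraction step described above, the following minimal sketch estimates grammar rule probabilities from a tree-tagged corpus by relative frequency. It is not the authors' implementation: the abstract does not specify the estimator, so relative-frequency estimation is an assumption, and the tuple tree encoding, the tags `NP` and `N`, the toy trees, and the function names are all hypothetical.

```python
from collections import Counter

# Assumed encoding (hypothetical): a tree is (label, child, child, ...),
# and a preterminal leaf is (tag, word).

def extract_rules(tree, counts):
    """Count every phrase-level rule LHS -> RHS that occurs in one tree."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return  # preterminal (tag, word): lexical rules are not counted here
    rhs = tuple(child[0] for child in children)
    counts[(label, rhs)] += 1
    for child in children:
        extract_rules(child, counts)

def estimate_rule_probabilities(trees):
    """Relative-frequency estimate: P(LHS -> RHS) = count(rule) / count(LHS)."""
    counts = Counter()
    for tree in trees:
        extract_rules(tree, counts)
    lhs_totals = Counter()
    for (lhs, _), c in counts.items():
        lhs_totals[lhs] += c
    return {rule: c / lhs_totals[rule[0]] for rule, c in counts.items()}

# Toy trees with placeholder words, for illustration only (not corpus data).
TOY_TREES = [
    ("NP", ("NP", ("N", "a"), ("N", "b")), ("N", "c")),
    ("NP", ("N", "a"), ("N", "b")),
]

RULE_PROBS = estimate_rule_probabilities(TOY_TREES)
for (lhs, rhs), p in sorted(RULE_PROBS.items()):
    print(f"{lhs} -> {' '.join(rhs)}: {p:.3f}")
```

In a probabilistic chart parser, probabilities of this kind would be multiplied along a derivation so that competing noun-phrase bracketings can be ranked, for example by a Viterbi search that keeps only the most probable analysis for each span.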
