Disambiguation of Homograph Suffixes Using the Lexical Semantic Network (U-WIN)

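  • Young-Jun Bae (Department of Information and Communication Engineering, University of Ulsan)
  • Cheol-Young Ock (School of Computer and Information Communication Engineering, University of Ulsan)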
  • Received : 2012.08.06
  • Accepted : 2012.09.10
  • Published : 2012.10.30

Abstract

To handle Korean suffix-derived nouns, most Korean language processing systems have registered as many of them as possible in a dictionary. This approach is limited: suffixes are highly productive, so not every suffix-derived noun can be listed. Unregistered suffix-derived nouns therefore need to be analyzed semantically. As one part of that semantic analysis, this paper proposes a method for disambiguating homograph suffixes using the Korean lexical semantic network U-WIN. The experiments used 33,104 suffix-derived nouns containing homograph suffixes, taken from the morphologically and semantically annotated Sejong Corpus. We first sense-tagged the homograph suffixes, extracted the root preceding each suffix, and mapped that root to a node in U-WIN. We then assigned distance weights to the U-WIN nodes that combine with each homograph suffix and used these weights to disambiguate the suffixes. Of the 49 homograph suffixes listed in a Korean dictionary, 35 occur in the Sejong Corpus; on these 35 suffixes the method achieved 91.01% accuracy.
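The abstract does not spell out how the distance weights are computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a toy hypernym tree in place of U-WIN, hypothetical sense labels (`person_sense`, `place_sense`), and an assumed weight of 1 / (1 + path distance) between the root's node and each node observed with a given suffix sense.

```python
from collections import defaultdict

# Toy hypernym hierarchy (child -> parent). A stand-in for U-WIN, not real U-WIN data.
PARENT = {
    "entity": None,
    "person": "entity",
    "artifact": "entity",
    "artist": "person",
    "scholar": "person",
    "building": "artifact",
    "school": "building",
}


def path_to_root(node):
    """Return the list of nodes from `node` up to the top of the hierarchy."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def distance(a, b):
    """Number of edges between two nodes via their lowest common ancestor."""
    depth_a = {n: i for i, n in enumerate(path_to_root(a))}
    for j, n in enumerate(path_to_root(b)):
        if n in depth_a:
            return depth_a[n] + j
    raise ValueError("nodes are not connected")


def train(tagged_roots):
    """Collect, for each suffix sense, the nodes of roots seen with that sense."""
    seen = defaultdict(list)
    for node, sense in tagged_roots:
        seen[sense].append(node)
    return seen


def disambiguate(root_node, seen):
    """Score each sense by summed distance weights (assumed: 1 / (1 + distance))."""
    scores = {
        sense: sum(1.0 / (1 + distance(root_node, n)) for n in nodes)
        for sense, nodes in seen.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical sense-tagged training pairs: (node of the root, sense of the suffix).
seen = train([
    ("artist", "person_sense"),
    ("scholar", "person_sense"),
    ("building", "place_sense"),
])

best_sense, scores = disambiguate("school", seen)
print(best_sense, scores)
```

In this toy setup a root mapped to `school` is one edge from `building` but five edges from `artist` and `scholar`, so `place_sense` receives the larger score. A real system would replace the tree, the training pairs, and the weighting function with U-WIN and corpus-derived values.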
