A Framework for WordNet-based Word Sense Disambiguation

  • Ren, Chulan (Department of Computer Engineering, MyongJi University)
  • Cho, Sehyeong (Department of Computer Engineering, MyongJi University)
  • Received : 2013.05.13
  • Accepted : 2013.06.18
  • Published : 2013.08.25

Abstract

This paper proposes a framework and method for word sense disambiguation and presents the results. In this work, WordNet is used for two different purposes: as a dictionary, and as an ontology whose hierarchical structure represents hypernym-hyponym relations. The advantage of this approach is twofold. First, it is a very simple method that is easy to implement. Second, because it is not a corpus-based statistical method, it does not suffer from the scarcity of large sense-tagged corpus data that such a method would require. The current work uses only the concept hierarchy of the WordNet ontology, that is, the hypernym-hyponym relation. In the future it can readily be extended to incorporate other relations, such as synonyms, antonyms, and meronyms, which is expected to improve performance.
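As an illustration only, the general idea of using a hypernym-hyponym hierarchy for disambiguation can be sketched as follows. The abstract does not specify the scoring procedure, so this sketch assumes a simple path-distance criterion: the sense of the target word that lies closest, in the hierarchy, to the senses of the context words is selected. The hierarchy, the sense inventory, and the function names (`ancestors`, `path_distance`, `disambiguate`) are hypothetical toy stand-ins, not WordNet data and not the authors' implementation.

```python
# Toy hypernym hierarchy: each sense maps to its direct hypernym.
# Hypothetical data for illustration; a real system would read WordNet.
HYPERNYM = {
    "bank.river": "slope",
    "slope": "geological_formation",
    "geological_formation": "physical_object",
    "river.stream": "stream",
    "stream": "body_of_water",
    "body_of_water": "physical_object",
    "physical_object": "entity",
    "bank.financial": "institution",
    "institution": "organization",
    "organization": "abstraction",
    "money.currency": "asset",
    "asset": "possession",
    "possession": "abstraction",
    "abstraction": "entity",
}

# Toy dictionary: each word maps to its candidate senses (assumed inventory).
SENSES = {
    "bank": ["bank.river", "bank.financial"],
    "river": ["river.stream"],
    "money": ["money.currency"],
}


def ancestors(sense):
    """Return the chain from a sense up to the root, starting with the sense."""
    chain = [sense]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def path_distance(a, b):
    """Count hypernym edges between two senses via their lowest common ancestor."""
    depth_in_b = {s: i for i, s in enumerate(ancestors(b))}
    for i, s in enumerate(ancestors(a)):
        if s in depth_in_b:
            return i + depth_in_b[s]
    return None  # the two senses share no ancestor


def disambiguate(target, context_words):
    """Pick the sense of `target` with the smallest total distance to the context."""
    best_sense, best_score = None, float("inf")
    for sense in SENSES[target]:
        score = 0
        for word in context_words:
            dists = [path_distance(sense, cs) for cs in SENSES.get(word, [])]
            dists = [d for d in dists if d is not None]
            if dists:
                score += min(dists)  # closest sense of this context word
        if score < best_score:
            best_sense, best_score = sense, score
    return best_sense


print(disambiguate("bank", ["river"]))
print(disambiguate("bank", ["money"]))
```

With this toy data, a context containing "river" selects the riverside sense of "bank" and a context containing "money" selects the financial sense, because in each case the chosen sense shares a nearer common hypernym with the context word. Context words missing from the toy dictionary are ignored.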
