Anaphoricity Determination of Zero Pronouns for Intra-sentential Zero Anaphora Resolution

문장 내 영 조응어 해석을 위한 영대명사의 조응성 결정

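  • Kye-Sung Kim (Department of Computer Engineering, Kyungpook National University)
  • Seong-Bae Park (School of Computer Science and Engineering, College of IT Engineering, Kyungpook National University)
  • Se-Young Park (School of Computer Science and Engineering, College of IT Engineering, Kyungpook National University)
  • Sang-Jo Lee (School of Computer Science and Engineering, College of IT Engineering, Kyungpook National University)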
  • Received : 2010.09.16
  • Accepted : 2010.10.14
  • Published : 2010.12.15

Abstract

Identifying the referents of omitted elements in a text is an important task for many natural language processing applications, such as machine translation and information extraction. These omitted elements are often called zero anaphors or zero pronouns, and are regarded as one of the most common forms of reference. However, since not all zero elements refer to explicit objects that occur in the same text, recent work on zero anaphora resolution has attempted to identify the anaphoricity of zero pronouns. This paper focuses on intra-sentential anaphoricity determination of subject zero pronouns, which occur frequently in Korean. Unlike previous studies based on pairwise comparisons between a zero pronoun and its antecedent candidates, this study determines the intra-sentential anaphoricity of zero pronouns by directly learning the structure of clauses in which either non-anaphoric or inter-sentential subject zero pronouns occur. In experiments, the proposed method outperforms the baseline methods, and anaphoricity determination of zero pronouns is expected to benefit zero anaphora resolution.

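Identifying what omitted elements in a document refer to is necessary for a variety of natural language processing applications, such as machine translation and information extraction. Elements omitted from a sentence are called zero anaphors or zero pronouns and are regarded as one type of reference, but not every zero element refers to a referent that is explicitly mentioned in the document. Recent studies have therefore tried to determine the anaphoricity of zero pronouns, and this paper focuses on determining the intra-sentential anaphoricity of subject zero pronouns, the most frequent kind of zero pronoun in Korean. Unlike previous work based on pairwise comparison between a given zero pronoun and its antecedent candidates, this paper determines the intra-sentential anaphoricity of a zero pronoun by directly learning the structure of the clauses in which non-anaphoric or inter-sententially resolvable zero pronouns occur. In experiments, the proposed method outperformed the baselines, and anaphoricity determination is expected to have a positive effect on zero anaphora resolution in future work.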
