Zero-anaphora resolution in Korean based on deep language representation model: BERT

  • Kim, Youngtae (Computer and Telecommunications Engineering Division, Yonsei University) ;
  • Ra, Dongyul (Computer and Telecommunications Engineering Division, Yonsei University) ;
  • Lim, Soojong (Language Intelligence Research Section, Electronics and Telecommunications Research Institute)
  • Received : 2019.09.23
  • Accepted : 2020.05.06
  • Published : 2021.04.15

Abstract

Achieving high performance in zero-anaphora resolution (ZAR) is necessary for fully understanding texts in Korean, Japanese, Chinese, and various other languages. Deep-learning-based models are being employed for building ZAR systems, owing to the success of deep learning in recent years. However, the objective of building a high-quality ZAR system is far from being achieved even with these models. To enhance current ZAR techniques, we fine-tuned a pretrained bidirectional encoder representations from transformers (BERT) model. Notably, BERT is a general language representation model that enables systems to utilize deep bidirectional contextual information in a natural language text; it extensively exploits the attention mechanism of the sequence-transduction model Transformer. In our model, classification is performed simultaneously for all words in the input word sequence to decide whether each word can be an antecedent. We seek end-to-end learning by disallowing any use of hand-crafted or dependency-parsing features. Experimental results show that, compared with other models, our approach significantly improves ZAR performance.
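
As a concrete illustration of the approach summarized in the abstract, the following is a minimal sketch of per-word antecedent classification on top of a fine-tuned BERT encoder. It is not the authors' implementation: the encoder checkpoint (bert-base-multilingual-cased), the single linear scoring head, the use of the Hugging Face transformers API, and the omission of an explicit marker for the target zero anaphor are all illustrative assumptions.

import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class ZarAntecedentScorer(nn.Module):
    """Scores every token in the input as a candidate antecedent (sketch only)."""

    def __init__(self, encoder_name: str = "bert-base-multilingual-cased"):
        super().__init__()
        # Pretrained BERT supplies deep bidirectional contextual representations.
        self.encoder = AutoModel.from_pretrained(encoder_name)
        # One score per token: "can this token be the antecedent?"
        self.head = nn.Linear(self.encoder.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask):
        # states: (batch, seq_len, hidden)
        states = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        # logits: (batch, seq_len); classification is done for all tokens at once
        return self.head(states).squeeze(-1)

if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
    model = ZarAntecedentScorer()
    # Toy Korean input; in practice the position of the zero anaphor (the
    # predicate with the dropped argument) would also be encoded in the input.
    batch = tokenizer(["철수는 밥을 먹고 학교에 갔다."], return_tensors="pt")
    with torch.no_grad():
        logits = model(batch["input_ids"], batch["attention_mask"])
    # The highest-scoring token is predicted as the antecedent; training would
    # apply a cross-entropy or binary loss over these per-token logits.
    print(logits.argmax(dim=-1))

Because BERT tokenizes text into WordPiece subword units, word-level decisions would in practice be aggregated from the positions of each word's subword tokens.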

Keywords

References

  1. R. Sasano, D. Kawahara, and S. Kurohashi, A fully-lexicalized probabilistic model for Japanese zero anaphora resolution, in Proc. 22nd Int. Conf. Comput. Linguistics (Manchester, UK), Aug. 2008, pp. 769-776.
  2. H. Nakaiwa and S. Shirai, Anaphora resolution of Japanese zero pronouns with deictic reference, in Proc. Int. Conf. Comput. Linguistics (Copenhagen, Denmark), 1996, pp. 812-817.
  3. R. Iida, K. Inui, and Y. Matsumoto, Zero anaphora resolution by learning rich syntactic pattern features, ACM Trans. Asian Language Inform. Process. 6 (2007), no. 4, 12:1-22.
  4. J. Devlin et al., BERT: Pre-training of deep bidirectional transformers for language understanding, in Proc. North American Chap. Assoc. Comput. Linguistics (Minneapolis, MN, USA), 2019, pp. 4171-4186.
  5. A. Vaswani et al., Attention is all you need, Advances in Neural Inform. Process. Syst. 30 (2017), 6000-6010.
  6. I. Tsochantaridis et al., Support vector machine learning for interdependent and structured output spaces, in Proc. 21st Int. Conf. Mach. Learn. (Banff, Canada), 2004, pp. 823-830.
  7. O. Vinyals, M. Fortunato, and N. Jaitly, Pointer networks, Advances in Neural Inform. Process. Syst. 28 (2015), 2674-2682.
  8. C. Park and C. Lee, Co-reference resolution for Korean pronouns using pointer networks, J. Korean Inst. Inform. Sci. Eng. 44 (2017), 496-502.
  9. C. Chen and V. Ng, Chinese zero pronoun resolution with deep neural networks, in Proc. Annu. Meet. Assoc. Comput. Linguistics, (Berlin, Germany), 2016, pp. 778-788.
  10. Q. Yin et al., Zero pronoun resolution with attention-based neural network, in Proc. Int. Conf. Comput. Linguistics (Santa Fe, NM, USA), 2018, pp. 13-23.
  11. R. Iida et al., Intra-sentential subject zero anaphora resolution using multi-column convolutional neural network, in Proc. Conf. Empirical Methods in Natural Language Process. (Austin, TX, USA), 2016, pp. 1244-1254.
  12. M. Okumura and K. Tamura, Zero pronoun resolution in Japanese discourse based on centering theory, in Proc. Int. Conf. Comput. Linguistics (Copenhagen, Denmark), 1996, pp. 871-887.
  13. M. Murata and M. Nagao, Resolution of verb ellipsis in Japanese sentences using surface expressions and examples, in Proc. Natural Language Process. Pacific Rim Symp. (Bangkok, Thailand), 1997, pp. 75-80.
  14. S. Nariyama, Grammar for ellipsis resolution in Japanese, in Proc. Int. Conf. Theoret. Methodol. Issues Mach. Transl. (Keihanna, Japan), 2002, pp. 135-145.
  15. K. Seki, A. Fujii, and T. Ishikawa, A probabilistic method for analyzing Japanese anaphora integrating zero pronoun detection and resolution, in Proc. Int. Conf. Comput. Linguistics (Taipei, Taiwan), 2002, pp. 911-917.
  16. S. Lim, C. Lee, and M. Jang, Restoring an elided entry word in a sentence for encyclopedia QA system, in Proc. Int. Joint Conf. Natural Language Process. (Jeju, Rep. of Korea), 2005, pp. 215-219.
  17. S. Zhao and H. T. Ng, Identification and resolution of Chinese zero pronouns: A machine learning approach, in Proc. Joint Conf. Empirical Methods Natural Language Process. Comput. Natural Language Learn. (Prague, Czech Republic), 2007, pp. 541-550.
  18. R. Iida and M. Poesio, A cross-lingual ILP solution to zero anaphora resolution, in Proc. Annu. Meet. Assoc. Comput. Linguistics (Portland, OR, USA), 2011, pp. 804-813.
  19. S. Jung and C. Lee, Deep neural architecture for recovering dropped pronouns in Korean, ETRI J. 40 (2018), 257-264. https://doi.org/10.4218/etrij.2017-0085
  20. Z. Lan et al., ALBERT: A lite BERT for self-supervised learning of language representations, in Proc. Int. Conf. Learn. Represent. (Addis Ababa, Ethiopia), 2020.
  21. Z. Yang et al., XLNet: Generalized autoregressive pretraining for language understanding, arXiv preprint, 2019, arXiv:1906.08237.
  22. Y. Liu et al., RoBERTa: A robustly optimized BERT pretraining approach, arXiv preprint, 2019, arXiv:1907.11692.
  23. Q. Yin et al., Chinese zero pronoun resolution: A collaborative filtering-based approach, ACM Trans. Asian Low-Resour. Lang. Inf. Process. 19 (2019), no. 1, 3:1-20.
  24. F. Kong, M. Zhang, and G. Zhou, Chinese zero pronoun resolution: A chain-to-chain approach, ACM Trans. Asian Low-Resour. Lang. Inf. Process. 19 (2019), no. 1, 2:1-21.
  25. I. Goodfellow, Y. Bengio, and A. Courville, Deep learning, MIT Press, Cambridge, MA, USA, 2016.
  26. S. Hochreiter and J. Schmidhuber, Long short-term memory, Neural Comput. 9 (1997), 1735-1780. https://doi.org/10.1162/neco.1997.9.8.1735
  27. J. Chung et al., Empirical evaluation of gated recurrent neural networks on sequence modeling, in Proc. Neural Inform. Process. Syst., Workshop Deep Learn., Dec. 2014.
  28. I. Sutskever, O. Vinyals, and Q. V. Le, Sequence to sequence learning with neural networks, in Proc. Int. Conf. Neural Inform. Process. Syst. (Montreal, Canada), 2014, pp. 3104-3112.
  29. D. Bahdanau, K. H. Cho, and Y. Bengio, Neural machine translation by jointly learning to align and translate, in Proc. Int. Conf. Learn. Represent. (San Diego, CA, USA), 2015.
  30. J. L. Ba, J. R. Kiros, and G. E. Hinton, Layer normalization, arXiv preprint, 2016, arXiv:1607.06450.
  31. Y. Wu et al., Google's neural machine translation system: Bridging the gap between human and machine translation, arXiv preprint, 2016, arXiv:1609.08144.
  32. M. Schuster and K. Nakajima, Japanese and Korean voice search, in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (Kyoto, Japan), 2012, pp. 5149-5152.
  33. R. Sennrich, B. Haddow, and A. Birch, Neural machine translation of rare words with subword units, in Proc. Annu. Meet. Assoc. Comput. Linguistics (Berlin, Germany), 2016, pp. 1715-1725.
  34. Google, TensorFlow code and pre-trained models for BERT, available at https://github.com/google-research/bert.
  35. ETRI, Public Artificial Intelligence Open API DATA, Korean BERT language model, available at http://aiopen.etri.re.kr/service_dataset.php (in Korean).
  36. Facebook, fastText: Library for text representation and classification, available at https://fasttext.cc/.