VS3-NET: Neural variational inference model for machine-reading comprehension

  • Park, Cheoneum (Computer Science Department, Kangwon National University) ;
  • Lee, Changki (Computer Science Department, Kangwon National University) ;
  • Song, Heejun (Artificial Intelligence Center, Samsung Electronics Co., Samsung Research)
  • Received : 2018.08.20
  • Accepted : 2019.02.25
  • Published : 2019.12.06

Abstract

We propose the VS3-NET model for machine-reading comprehension, the task of answering a question by finding an appropriate answer span in a given context. VS3-NET learns a latent variable for each question through variational inference, building on a simple recurrent unit (SRU)-based sentence encoder and a self-matching network. Because question types vary and the answer depends on the question type, we introduce neural question-type models that approximate the prior and posterior distributions of the latent variables, and we use these approximated distributions to optimize a reparameterized variational lower bound, enabling efficient inference and learning. The context in machine-reading comprehension usually comprises several sentences, and performance tends to degrade as the context grows longer; we therefore encode the context with a hierarchical sentence-level structure to mitigate this degradation. Experimental results show that the proposed VS3-NET model achieves an exact-match score of 76.8% and an F1 score of 84.5% on the SQuAD test set.
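The "reparameterized variational lower bound" mentioned above can be made concrete with a small sketch. The snippet below is a minimal, self-contained illustration, not the authors' implementation: it shows how a question-conditioned diagonal-Gaussian latent variable can be sampled with the reparameterization trick and how the evidence lower bound (ELBO) splits into a reconstruction term and a KL term between the approximate posterior and the prior. The encoders are stand-in linear maps, and names such as W_prior, W_post, and the placeholder reconstruction term are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def diag_gauss_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL( N(mu_q, diag(exp(logvar_q))) || N(mu_p, diag(exp(logvar_p))) )."""
    return 0.5 * np.sum(
        logvar_p - logvar_q
        + (np.exp(logvar_q) + (mu_q - mu_p) ** 2) / np.exp(logvar_p)
        - 1.0
    )

def reparameterize(mu, logvar):
    """Sample z = mu + sigma * eps, eps ~ N(0, I), so gradients flow through mu and logvar."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

# Toy "encoders": in the paper these would be neural question-type models,
# a prior conditioned on the question and a posterior conditioned on the
# question plus the evidence. Here they are random linear maps, purely for illustration.
dim_in, dim_z = 8, 4
W_prior = rng.standard_normal((2 * dim_z, dim_in))
W_post = rng.standard_normal((2 * dim_z, dim_in))

question_repr = rng.standard_normal(dim_in)    # stand-in for a question encoding
posterior_repr = rng.standard_normal(dim_in)   # stand-in for a question + context encoding

mu_p, logvar_p = np.split(W_prior @ question_repr, 2)
mu_q, logvar_q = np.split(W_post @ posterior_repr, 2)

z = reparameterize(mu_q, logvar_q)             # latent question-type variable

# Placeholder reconstruction term: in the real model this would be the
# log-likelihood of the correct answer span under the pointer-network output.
log_likelihood = -0.5 * np.sum((z - mu_p) ** 2)

elbo = log_likelihood - diag_gauss_kl(mu_q, logvar_q, mu_p, logvar_p)
print(f"ELBO (lower bound to maximize): {elbo:.4f}")
```

Because the noise eps is drawn outside the network, maximizing this bound with gradient methods updates the means and log-variances directly, which is what makes the variational training of the latent question variable tractable.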

