Document Summarization Model Based on General Context in RNN

  • Received : 2019.07.23
  • Accepted : 2019.11.05
  • Published : 2019.12.31

Abstract

In recent years, automatic document summarization has been widely studied in the field of natural language processing, thanks to remarkable advances in deep learning models. Existing models for abstractive summarization typically represent the context of a document as a weighted sum of the hidden states of the input words, recomputed at each decoding step. Because the weights change at every step, this context captures only the local context of the document, which makes it difficult to generate a summary that reflects the document's overall content. To address this problem, we introduce the notion of a general context and propose a summarization model based on it. The general context captures the overall context of the document and is independent of the decoding step. Experimental results on the CNN/Daily Mail dataset show that the proposed model outperforms existing models.
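The abstract contrasts a per-step attention context (local) with a context computed once per document (general). The sketch below, in PyTorch, illustrates that distinction only; the self-attentive pooling used for the general context, the concatenation scheme, and all names (AttentionDecoderStep, general_context) are illustrative assumptions, not the paper's exact formulation.

# Minimal sketch: per-step attention ("local context") vs. a
# decoding-step-independent "general context". Pooling choice is assumed.
import torch
import torch.nn as nn

class AttentionDecoderStep(nn.Module):
    def __init__(self, enc_dim, dec_dim):
        super().__init__()
        self.attn = nn.Linear(enc_dim + dec_dim, 1)           # Bahdanau-style score
        self.general_pool = nn.Linear(enc_dim, 1)             # step-independent pooling
        self.out = nn.Linear(dec_dim + 2 * enc_dim, dec_dim)  # combine both contexts

    def general_context(self, enc_states):
        # enc_states: (batch, src_len, enc_dim). Computed once per document;
        # it does not change across decoding steps.
        weights = torch.softmax(self.general_pool(enc_states), dim=1)
        return (weights * enc_states).sum(dim=1)              # (batch, enc_dim)

    def forward(self, dec_state, enc_states, gen_ctx):
        # dec_state: (batch, dec_dim). Attention is recomputed here at every
        # step, so this context is local to the current decoding step.
        src_len = enc_states.size(1)
        expanded = dec_state.unsqueeze(1).expand(-1, src_len, -1)
        scores = self.attn(torch.cat([enc_states, expanded], dim=-1))
        local_ctx = (torch.softmax(scores, dim=1) * enc_states).sum(dim=1)
        combined = torch.cat([dec_state, local_ctx, gen_ctx], dim=-1)
        return torch.tanh(self.out(combined))

# Usage: compute the general context once, then reuse it at every step.
enc = torch.randn(2, 7, 32)                  # (batch, src_len, enc_dim)
step = AttentionDecoderStep(enc_dim=32, dec_dim=16)
g = step.general_context(enc)                # fixed for the whole document
h = torch.randn(2, 16)                       # current decoder state
print(step(dec_state=h, enc_states=enc, gen_ctx=g).shape)   # torch.Size([2, 16])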
