Joint Hierarchical Semantic Clipping and Sentence Extraction for Document Summarization

  • Yan, Wanying (College of Information Engineering and Automation, Kunming University of Science and Technology) ;
  • Guo, Junjun (College of Information Engineering and Automation, Kunming University of Science and Technology) ;
  • Received : 2020.03.20
  • Accepted : 2020.05.24
  • Published : 2020.08.31

Abstract

Extractive document summarization aims to select a few sentences from a given document while preserving its main information, but current extractive methods do not address the problem of repeated sentence information, which is especially pronounced in news document summarization. Given both the importance and the redundancy of information in news text, we propose a neural extractive summarization approach with joint sentence semantic clipping and selection, which effectively reduces sentence repetition in news summaries. Specifically, a hierarchical selective encoding network is constructed to build both sentence-level and document-level representations and to retain the most informative content of the news text; a sentence extractor then performs joint scoring and redundant-information clipping. In this way, our model strikes a balance between extracting important information and filtering redundant information. Experimental results on the CNN/Daily Mail dataset and a Court Public Opinion News dataset we built demonstrate the effectiveness of the proposed approach in terms of ROUGE metrics, especially for redundant information filtering.
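The abstract describes a two-stage pipeline: a hierarchical selective encoder that builds word-, sentence-, and document-level representations, followed by an extractor that jointly scores sentences and clips redundant ones. Below is a minimal, illustrative PyTorch sketch of that idea; the module structure, layer sizes, the mean-pooled sentence vectors, and the cosine-similarity redundancy penalty (an MMR-style stand-in for "semantic clipping") are all assumptions made for exposition, not the authors' actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalSelectiveEncoder(nn.Module):
    """Word-level BiGRU -> selective gate -> sentence-level BiGRU (illustrative sizes)."""
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.word_rnn = nn.GRU(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        self.gate = nn.Linear(4 * hid_dim, 2 * hid_dim)   # selective gate over word states
        self.sent_rnn = nn.GRU(2 * hid_dim, hid_dim, batch_first=True, bidirectional=True)

    def forward(self, docs):                               # docs: (batch, n_sents, n_words)
        b, n, w = docs.size()
        words, _ = self.word_rnn(self.embed(docs.view(b * n, w)))
        sent_vec = words.mean(dim=1)                       # crude sentence summary vector
        # Selective gate: filter each word state by its sentence-level context.
        gate = torch.sigmoid(self.gate(torch.cat(
            [words, sent_vec.unsqueeze(1).expand_as(words)], dim=-1)))
        sent_repr = (gate * words).mean(dim=1).view(b, n, -1)
        doc_repr, _ = self.sent_rnn(sent_repr)             # document-level sentence states
        return doc_repr

def extract(doc_repr, scorer, k=3, redundancy_weight=0.5):
    """Greedy joint selection: pick high-scoring sentences while penalizing those
    too similar to sentences already selected (redundant-information clipping)."""
    scores = scorer(doc_repr).squeeze(-1)                  # (batch, n_sents)
    selected = []
    for _ in range(k):
        adjusted = scores.clone()
        if selected:
            chosen = doc_repr[:, selected, :]              # (batch, |S|, dim)
            sim = F.cosine_similarity(doc_repr.unsqueeze(2),
                                      chosen.unsqueeze(1), dim=-1)
            adjusted = adjusted - redundancy_weight * sim.max(dim=2).values
            adjusted[:, selected] = float('-inf')          # never re-pick a sentence
        selected.append(adjusted[0].argmax().item())       # assumes batch size 1
    return selected

# Illustrative usage with random token ids (hypothetical shapes and vocabulary):
encoder = HierarchicalSelectiveEncoder(vocab_size=30000)
scorer = nn.Linear(2 * 256, 1)                             # maps each sentence state to a score
doc_repr = encoder(torch.randint(0, 30000, (1, 12, 30)))   # 1 document, 12 sentences, 30 words each
print(extract(doc_repr, scorer, k=3))

The greedy loop above is only a simple surrogate for the paper's joint scoring-and-clipping strategy: at each step a sentence's salience score is traded off against its maximum similarity to the sentences already chosen, which is one way to balance important-information extraction against redundant-information filtering.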
