
Automatic Document Title Generation with RNN and Reinforcement Learning


  • Cho, Sung-Min (Graduate School of Computer Science, Kwangwoon University)
  • Kim, Wooseng (School of Software, Kwangwoon University)
  • Received : 2019.11.29
  • Accepted : 2020.02.15
  • Published : 2020.02.29

Abstract

Recently, large amounts of textual data have poured onto the Internet, and technology to refine them is needed. Most of these data are long documents, and many lack titles. In this paper, we therefore propose a technique that combines an RNN-based sequence-to-sequence model with the REINFORCE algorithm to automatically generate titles for long documents. In addition, because the sequence-to-sequence model loses information when given long inputs, the TextRank algorithm is first applied to extract a summary of the text and minimize that loss. Experiments show that the proposed technique outperforms existing ones.
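To make the extraction step concrete, below is a minimal sketch of TextRank-style extractive summarization in the spirit of Mihalcea and Tarau (2004), as the paper applies it before the sequence-to-sequence step. The word-overlap similarity, whitespace tokenization, and parameter values are illustrative choices, not the authors' exact setup.

```python
# Hypothetical sketch: TextRank-style sentence extraction.
import math

def sentence_similarity(a, b):
    """Overlap similarity from Mihalcea & Tarau (2004): shared words
    normalized by the log of the two sentence lengths."""
    wa, wb = a.lower().split(), b.lower().split()
    overlap = len(set(wa) & set(wb))
    if overlap == 0 or len(wa) < 2 or len(wb) < 2:
        return 0.0
    return overlap / (math.log(len(wa)) + math.log(len(wb)))

def textrank_summary(sentences, top_k=3, d=0.85, iters=50):
    n = len(sentences)
    # Build the weighted similarity graph over all sentence pairs.
    w = [[sentence_similarity(sentences[i], sentences[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    scores = [1.0] * n
    for _ in range(iters):
        # One PageRank-style update over the weighted graph.
        scores = [(1 - d) + d * sum(w[j][i] / max(sum(w[j]), 1e-8) * scores[j]
                                    for j in range(n))
                  for i in range(n)]
    # Return the top-k sentences, restored to their original order.
    top = sorted(sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_k])
    return [sentences[i] for i in top]
```

The extracted top-k sentences would then serve as the shortened input to the sequence-to-sequence encoder, keeping the source within a length the RNN can encode without severe information loss.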
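The training signal can likewise be sketched. The following is a hedged illustration of a REINFORCE update for title generation, in the spirit of Williams (1992) and the self-critical variant of Paulus et al. (2017); `model`, its `sample`/`greedy_decode` methods, and `rouge_reward` are hypothetical placeholders, since the paper's exact reward and architecture are not specified on this page.

```python
# Hypothetical sketch: REINFORCE with a greedy-decoding baseline (PyTorch).
import torch

def reinforce_loss(model, src, ref_title, rouge_reward):
    # Sample a title from the model's own distribution, keeping log-probs.
    sample_ids, sample_logp = model.sample(src)       # shape: (batch, T)
    with torch.no_grad():
        greedy_ids, _ = model.greedy_decode(src)      # baseline sequence
    # Sequence-level reward, e.g. ROUGE against the reference title.
    r_sample = rouge_reward(sample_ids, ref_title)    # shape: (batch,)
    r_baseline = rouge_reward(greedy_ids, ref_title)
    # Self-critical REINFORCE: reinforce samples that beat the baseline.
    advantage = r_sample - r_baseline
    return -(advantage * sample_logp.sum(dim=-1)).mean()
```

Subtracting the greedy baseline is a standard variance-reduction choice; plain REINFORCE would simply use the raw reward as the advantage.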


References

  1. An, I. S., Kim, H. W., and Kim, H. J., "A User Timeline Summarization Technique using TextRank Algorithm", Journal of KISS: Databases, Aug. 2012, pp. 238-245.
  2. Chopra, S., Auli, M., and Rush, A. M., "Abstractive sentence summarization with attentive recurrent neural networks", Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2016, pp. 93-98.
  3. Haveliwala, T. H., "Topic-sensitive pagerank", Proceedings of the 11th international conference on World Wide Web, ACM, 2002, pp. 517-526.
  4. Jeong, S. W., Kim, J. T., and Kim, H. S., "Document Summarization Using TextRank Based on Sentence Embedding", Journal of KIISE, Mar. 2019, pp. 285-289. https://doi.org/10.5626/jok.2019.46.3.285
  5. Lee, H. G., Lee, S. H., Kim, J. T., and Kim, H. S., "Generating End-to-End Document Title using Sequence to Sequence Model and Keyword", Korea Information Science Society, Dec. 2016, pp. 452-454.
  6. Mihalcea, R. and Tarau, P., "Textrank: Bringing order into text", Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 2004, pp. 404-411.
  7. Mikolov, T., Karafiát, M., Burget, L., Černocký, J., and Khudanpur, S., "Recurrent neural network based language model", Eleventh Annual Conference of the International Speech Communication Association, 2010.
  8. Paulus, R., Xiong, C., and Socher, R., "A deep reinforced model for abstractive summarization", arXiv preprint arXiv:1705.04304, 2017.
  9. Rush, A. M., Chopra, S., and Weston, J., "A neural attention model for abstractive sentence summarization", arXiv preprint arXiv:1509.00685, 2015.
  10. Salton, G. and McGill, M. J., "Introduction to modern information retrieval", Mcgraw-Hill, 1983.
  11. Shin, Y. M., Noh, Y. S., and Park, S. Y., "Abstractive Multi-Document Summarization via Self-Attention based Multi Document Encoder", The Korean Institute of Information Scientists and Engineers, Jun. 2019, pp. 527-529.
  12. Sundermeyer, M., Schlüter, R., and Ney, H., "LSTM neural networks for language modeling", Thirteenth Annual Conference of the International Speech Communication Association, 2012.
  13. Sutskever, I., Vinyals, O., and Le, Q. V., "Sequence to sequence learning with neural networks", Advances in Neural Information Processing Systems, 2014.
  14. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I., "Attention is all you need", Advances in Neural Information Processing Systems, 2017, pp. 5998-6008.
  15. Williams, R. J. and Zipser, D., "A learning algorithm for continually running fully recurrent neural networks", Neural Computation, Vol. 1, No. 2, 1989, pp. 270-280. https://doi.org/10.1162/neco.1989.1.2.270
  16. Williams, R. J., "Simple statistical gradient- following algorithms for connectionist reinforcement learning", Machine Learning, Vol. 8, No. 3-4, 1992, pp. 229-256. https://doi.org/10.1007/BF00992696

Cited by

  1. A Study on Fruit Quality Identification Using YOLO V2 Algorithm, Vol. 9, No. 1, 2021, https://doi.org/10.17703/ijact.2021.9.1.190