Automatic Quality Evaluation with Completeness and Succinctness for Text Summarization

  • Ko, Eunjung (Graduate School of Business IT, Kookmin University) ;
  • Kim, Namgyu (School of MIS, Kookmin University)
  • Received : 2018.05.29
  • Accepted : 2018.06.25
  • Published : 2018.06.30

Abstract

Recently, as the demand for big data analysis has grown, cases of analyzing unstructured data and using the results have also increased. Among the various types of unstructured data, text is used as a means of communicating information in almost all fields. Text also attracts many analysts because it is available in very large volumes and is relatively easy to collect compared with other unstructured and structured data. Among the various text analysis applications, the following have been actively studied: document classification, which assigns documents to predetermined categories; topic modeling, which extracts major topics from a large number of documents; sentiment analysis or opinion mining, which identifies emotions or opinions contained in texts; and text summarization, which summarizes the main content of one or several documents. In particular, text summarization is actively applied in business through news summary services, privacy policy summary services, etc. Much research has also been done in academia following the extraction approach, which selectively provides the main elements of a document, and the abstraction approach, which extracts elements of a document and composes new sentences by combining them. However, techniques for evaluating the quality of automatically summarized documents have not made as much progress as techniques for automatic text summarization. Most existing studies on the quality evaluation of summarization prepared manual summaries of documents, used them as reference documents, and measured the similarity between the automatic summary and the reference document. Specifically, an automatic summary is generated from the full text by various techniques, and its quality is measured by comparing it with the reference document, which is regarded as an ideal summary.
Reference documents are provided in two major ways. The most common is manual summarization, in which a person creates an ideal summary by hand. Because this method requires human intervention, writing the summary takes considerable time and cost, and the evaluation result may differ depending on the subjectivity of the summarizer. To overcome these limitations, attempts have been made to measure the quality of summary documents without human intervention. As a representative attempt, a method was recently devised that reduces the dimensionality of the full text and measures the similarity between the reduced full text and the automatic summary. In this method, the more often the frequent terms of the full text appear in the summary, the better the quality of the summary is judged to be. However, since summarization essentially means condensing a large amount of content while minimizing omissions, it is unreasonable to say that a "good summary" based only on frequency is always a "good summary" in the essential sense. To overcome this limitation of previous studies on summarization evaluation, this study proposes an automatic quality evaluation method for text summarization based on the essential meaning of summarization. Specifically, succinctness is defined as an element indicating how little content is duplicated among the sentences of the summary, and completeness is defined as an element indicating how little of the original content is missing from the summary. In this paper, we propose a method for automatic quality evaluation of text summarization based on the concepts of succinctness and completeness.
To evaluate the practical applicability of the proposed methodology, we extracted 29,671 sentences from TripAdvisor's hotel reviews, summarized the reviews for each hotel, and present the results of experiments that evaluate the quality of the summaries with the proposed methodology. We also provide a way to integrate completeness and succinctness, which are in a trade-off relationship, into an F-score, and propose a method for performing optimal summarization by varying the sentence-similarity threshold.
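The two measures and their F-score combination can be illustrated with a minimal sketch. The abstract does not specify the sentence similarity function or the exact formulas, so everything below is an assumption made for illustration: similarity is taken as Jaccard overlap of word sets, completeness as the share of source sentences matched by at least one summary sentence, succinctness as the share of summary sentences that do not repeat an earlier one, and the F-score as their harmonic mean. The function names and the `threshold` parameter are hypothetical.

```python
def tokens(sentence):
    """Lowercased word set; a stand-in for real text preprocessing (assumption)."""
    return set(sentence.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two sentences; an assumed similarity measure."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def completeness(source, summary, threshold):
    """Share of source sentences covered by at least one summary sentence."""
    if not source:
        return 0.0
    covered = sum(
        1 for s in source
        if any(jaccard(s, t) >= threshold for t in summary)
    )
    return covered / len(source)


def succinctness(summary, threshold):
    """Share of summary sentences that do not duplicate an earlier one."""
    if not summary:
        return 0.0
    unique = sum(
        1 for i, t in enumerate(summary)
        if not any(jaccard(t, prev) >= threshold for prev in summary[:i])
    )
    return unique / len(summary)


def f_score(c, s):
    """Harmonic mean of completeness and succinctness (assumed combination)."""
    return 0.0 if c + s == 0 else 2 * c * s / (c + s)


if __name__ == "__main__":
    source = [
        "the room was clean",
        "the room was very clean",
        "staff were friendly",
        "breakfast was poor",
    ]
    redundant = ["the room was clean", "the room was very clean"]
    varied = ["the room was clean", "staff were friendly"]
    for name, summary in [("redundant", redundant), ("varied", varied)]:
        c = completeness(source, summary, threshold=0.5)
        s = succinctness(summary, threshold=0.5)
        print(name, c, s, f_score(c, s))
```

With these toy reviews and a threshold of 0.5, the redundant summary scores 0.5 on both measures, while the varied summary reaches a completeness of 0.75 and a succinctness of 1.0. Raising or lowering the threshold changes what counts as "covered" and what counts as "duplicated", which is one way to read the trade-off the abstract describes.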

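With the growth of smart devices and related services, text data is increasing explosively, which makes it ever harder to extract only the needed information from vast collections of documents. Automatic text summarization, which condenses and delivers the key content of text data, has therefore attracted growing attention. Text summarization is already actively applied in practice through news summary services, privacy policy summary services, and the like, and academia has produced much research following two approaches: the extraction approach, which selects and presents the main elements of a document, and the abstraction approach, which excerpts elements of a document and combines them into new sentences. Compared with automatic summarization itself, however, techniques for evaluating the quality of automatically generated summaries have made relatively little progress. Most existing studies on summary quality evaluation have a person write a summary by hand, treat it as the reference document, and measure the similarity between the automatic summary and the reference document. This approach requires enormous time and cost to prepare the reference document, and the evaluation result can vary with the subjectivity of the summarizer. Some studies have tried to overcome these limitations; notably, a technique was recently devised that applies dimension reduction to the full text and measures the similarity between the reduced full text and the automatic summary. Under this method, the more often high-frequency terms of the original text appear in the summary, the higher the summary's quality is rated. However, since summarization essentially means expressing a large amount of content in reduced form while minimizing omissions, it is unreasonable to assume that a "good summary" based simply on frequency is always a "good summary" in the essential sense. To overcome this limitation of existing research, this study proposes an automatic quality evaluation method grounded in the essence of summarization. Specifically, succinctness is defined as an element indicating how little content is duplicated among the sentences of the summary, and completeness is defined as an element indicating how little of the original content is missing from the summary. We propose a methodology for automatic summary quality evaluation that applies the concepts of succinctness and completeness, and present experimental results from applying it to the summarization and evaluation of hotel reviews from the TripAdvisor site.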
