A Study on the Construction of the Automatic Extracts and Summaries - On the Basis of Scientific Journal Articles -

자동 발췌문/요약 시스템 구축에 관한 연구 - 학술지 논문기사를 중심으로 -

  • 이태영 (Department of Library and Information Science, College of Humanities, Chonbuk National University)
  • Published : 2005.09.01


Various corpus-based approaches, the rhetorical roles of discourse structure, and the unification of similar sentences were applied to construct automatic Ext/Sums (extracts and summaries). To make the Ext/Sums flexible, rhetorical roles of sentences such as objective, method, background, result, and conclusion were established, and an extraction engine was prepared for each role. A success rate of $90\%$ was achieved in extracting the important sentences of the sample articles. To rearrange the selected sentences, the system unified similar sentences using the cosine coefficient, deleted unnecessary modifying and inserted clauses, joined short sentences, and connected sentences that could be linked. The study suggests that methods applying the rhetorical roles of sentences, the meaning and signature of nouns and verbs in clauses, and cue words and location should be researched further to construct more effective Ext/Sums.
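
The unification of similar sentences by cosine coefficient mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenization, the raw term-frequency weighting, the function names `cosine_coefficient` and `unify_similar`, and the threshold of 0.8 are all assumptions made for illustration. Korean text would likely need morphological analysis rather than whitespace splitting.

```python
import math
from collections import Counter


def cosine_coefficient(sentence_a: str, sentence_b: str) -> float:
    """Cosine of the angle between two bag-of-words term-frequency vectors.

    Assumption: tokens are lowercase whitespace-separated words.
    """
    vec_a = Counter(sentence_a.lower().split())
    vec_b = Counter(sentence_b.lower().split())
    # Dot product over the terms the two sentences share.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence is treated as similar to nothing
    return dot / (norm_a * norm_b)


def unify_similar(sentences, threshold=0.8):
    """Keep a sentence only if it is below the (hypothetical) similarity
    threshold against every sentence already kept."""
    kept = []
    for sentence in sentences:
        if all(cosine_coefficient(sentence, k) < threshold for k in kept):
            kept.append(sentence)
    return kept
```

Under these assumptions, calling `unify_similar` on a list of extracted sentences drops any sentence that nearly duplicates an earlier one while preserving the original order of the rest.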


Automatic Summaries; Extraction Methods; Location; Rhetorical Roles; Web

