A Study on Extracting Ideas from Documents and Webpages in the Field of Idea Mining

아이디어 마이닝 분야에서 문헌과 웹페이지의 아이디어 발췌에 대한 연구

  • Lee, Tae-Young (Department of Library & Information Science, Chonbuk National University)
  • Received : 2011.12.30
  • Accepted : 2012.01.31
  • Published : 2012.03.30

Abstract

Ideas and quasi-ideas useful for human creation were extracted from documents and webpages using extraction methods from idea mining, opinion mining, and topic signal mining. The extraction methods comprised (1) decisive cue phrases, (2) cue figures and sounds, (3) contextual signals, and (4) discourse segmentation. They were tested on idea samples such as thoughts, plans, opinions, writings, figures, sounds, and formulas. The efficiency of the four methods was judged by the F measure, a combination of recall and precision; methods (1), (3), and (4) received largely positive evaluations. In particular, the decisive cue phrase method was effective for finding ideas, and the contextual signal method was effective for detecting quasi-ideas.
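The F measure mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes the standard weighted harmonic mean of precision and recall (balanced F1 when `beta` is 1). The function name `f_measure` and the sample sentence IDs are hypothetical and are not taken from the study.

```python
def f_measure(extracted, relevant, beta=1.0):
    """Return the F measure for a set of extracted items against a gold set.

    Assumption: the standard F-beta definition is used; the study's exact
    weighting of recall and precision is not stated in the abstract.
    """
    extracted, relevant = set(extracted), set(relevant)
    hits = len(extracted & relevant)          # correctly extracted items

    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return 0.0

    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical example: sentence IDs flagged by one extraction method
# versus sentence IDs that human judges marked as ideas.
flagged = {1, 2, 3, 4}
judged_ideas = {2, 3, 5}
score = f_measure(flagged, judged_ideas)
print(round(score, 4))
```

With these made-up sets, precision is 2/4 and recall is 2/3, so the balanced score is 4/7. A method that extracts many sentences tends to gain recall at the cost of precision, which is why a single combined score is convenient for comparing the four methods.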

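Idea mining techniques were applied to find ideas and quasi-ideas that support creation in general documents and webpages. Ideas contained in webpages, literature, and documents were extracted with the extraction techniques used in idea mining, opinion mining, and topic signal mining. The techniques were organized into (1) decisive cue phrases, (2) cue multimedia, (3) contextual signals, and (4) discourse segments, and were tested on seven idea types: thoughts, plans, opinions, writings, figures, sounds, and formulas. The efficiency of each technique was judged by the F measure, which combines recall and precision, and methods (1), (3), and (4) received largely positive evaluations. In particular, decisive cue phrases were judged effective for extracting ideas, and contextual signals for extracting quasi-ideas.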
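Method (1), decisive cue phrases, can be sketched roughly as follows. This is only an illustration of the general cue-phrase idea, not the study's procedure: the cue list `CUE_PHRASES`, the helper names, and the naive sentence splitter are all assumptions introduced here.

```python
import re

# Hypothetical cue phrases; the study's actual cue list is not given
# in the abstract.
CUE_PHRASES = ("we propose", "a new idea", "the idea is", "it is suggested that")


def split_sentences(text):
    """Naively split text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_candidates(text, cues=CUE_PHRASES):
    """Return the sentences that contain at least one cue phrase."""
    candidates = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        if any(cue in lowered for cue in cues):
            candidates.append(sentence)
    return candidates


sample = (
    "The weather was mild. We propose a solar-powered kettle. "
    "Sales fell last year. The idea is to reuse waste heat!"
)
for sentence in extract_candidates(sample):
    print(sentence)
```

Plain substring matching keeps the sketch short; a real system would likely need language-specific tokenization and a cue list built from the document collection.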
