Fusion Approach to Targeted Opinion Detection in Blogosphere

A Fusion-Based Opinion Detection Method for Finding Opinions on a Topic in the Blogosphere

  • Yang, Kiduk (Department of Library and Information Science, Kyungpook National University)
  • Received : 2014.11.20
  • Accepted : 2015.03.19
  • Published : 2015.03.30

Abstract

This paper presents a fusion approach to sentiment detection that combines multiple sources of evidence to retrieve blogs that contain opinions on a specific topic. The approach first applies traditional information retrieval methods to retrieve blogs on a given topic, then boosts the ranks of opinionated blogs using opinion scores computed by multiple sentiment detection methods. The central idea of the sentiment detection strategy is to rely on a variety of complementary evidence rather than to optimize the use of a single source of evidence. The strategy comprises five modules: the High Frequency module, which identifies opinions based on the frequency of opinion terms (i.e., terms that occur frequently in opinionated documents); the Low Frequency module, which makes use of uncommon or rare terms (e.g., "sooo good") that express strong sentiments; the IU module, which leverages n-grams with IU (I and you) anchor terms (e.g., "I believe", "You will love"); the Wilson's lexicon module, which uses a collection-independent opinion lexicon constructed from Wilson's subjectivity terms; and the Opinion Acronym module, which uses a small set of opinion acronyms (e.g., "imho"). The results of our study show that combining multiple sources of opinion evidence is an effective method for improving opinion detection performance.

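This paper introduces a fusion approach to opinion detection that combines multiple sources of evidence to find blogs containing opinions on a given topic. To find such blogs, the study first retrieves blogs on the topic with conventional IR methods and then adjusts the ranking of the search results using an opinion score that sums the outputs of several opinion detection methods. The main components of the opinion detection module are the High Frequency module, which uses words that occur frequently in opinionated blogs; the Low Frequency module, which uses rare terms that express strong sentiment (e.g., "sooo good"); the IU module, which uses n-grams anchored on "I" and "you" (e.g., I believe, You will love); the Wilson's lexicon module, which is based on Wilson's list of subjectivity terms; and the Opinion Acronym module, which uses a small set of opinion acronyms (e.g., imho). The results of this study show that fusing multiple methods is effective for improving opinion detection performance.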
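
The two-stage pipeline described in the abstract (topic retrieval followed by opinion-based rank boosting) can be illustrated with a minimal Python sketch. It is not the paper's implementation. The term lists, the regular expressions, the equal module weights, the `boost` parameter, and the additive reranking formula are all assumptions made for illustration. The Wilson's lexicon module is left out because it depends on an external lexicon that the abstract does not reproduce.

```python
import re

# Toy stand-ins for the lexicons (assumed; the real term lists are not in the abstract).
HIGH_FREQ_TERMS = {"good", "bad", "love", "hate", "great", "terrible"}
OPINION_ACRONYMS = {"imho", "imo"}
# "I"/"you" anchored n-grams, e.g. "I believe", "you will love" (assumed pattern).
IU_PATTERN = re.compile(r"\b(?:i|you)\s+(?:believe|think|feel|will\s+love)\b")
# Rare, emphatic spellings approximated by a letter repeated 3+ times, e.g. "sooo".
ELONGATED = re.compile(r"([a-z])\1{2,}")


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def high_frequency(text):
    """Fraction of tokens that are common opinion terms."""
    tokens = tokenize(text)
    return sum(t in HIGH_FREQ_TERMS for t in tokens) / len(tokens) if tokens else 0.0


def low_frequency(text):
    """Fraction of tokens with an elongated spelling."""
    tokens = tokenize(text)
    hits = sum(1 for t in tokens if ELONGATED.search(t))
    return hits / len(tokens) if tokens else 0.0


def iu(text):
    """1.0 if the text contains an I/you anchored opinion n-gram, else 0.0."""
    return 1.0 if IU_PATTERN.search(text.lower()) else 0.0


def acronym(text):
    """1.0 if the text contains an opinion acronym, else 0.0."""
    return 1.0 if any(t in OPINION_ACRONYMS for t in tokenize(text)) else 0.0


# Module -> weight. Equal weights are an assumption, not taken from the paper.
MODULES = {high_frequency: 1.0, low_frequency: 1.0, iu: 1.0, acronym: 1.0}


def opinion_score(text):
    """Weighted average of the module scores, in [0, 1]."""
    total = sum(MODULES.values())
    return sum(weight * module(text) for module, weight in MODULES.items()) / total


def rerank(results, boost=0.5):
    """Boost topic scores by opinion scores and sort in descending order.

    results: list of (doc_id, topic_score, text) from a topic retrieval run.
    """
    scored = [(doc_id, topic + boost * opinion_score(text))
              for doc_id, topic, text in results]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Two hypothetical posts with identical topic scores.
results = [
    ("factual", 0.80, "The phone has a 5 inch screen and two cameras."),
    ("opinion", 0.80, "IMHO the screen is sooo good, I believe you will love it."),
]
ranking = rerank(results)
```

In this sketch each module contributes an independent score, so adding or removing an evidence source only changes the `MODULES` table. This mirrors the abstract's emphasis on complementary evidence over tuning any single source.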
