Extracting Multiword Sentiment Expressions by Using a Domain-Specific Corpus and a Seed Lexicon

  • Lee, Kong-Joo (Department of Information & Communication Engineering, Chungnam National University)
  • Kim, Jee-Eun (Department of English Linguistics, Hankuk University of Foreign Studies)
  • Yun, Bo-Hyun (Department of Computer Education, Mokwon University)
  • Received: 2013.01.25
  • Accepted: 2013.03.21
  • Published: 2013.10.31

Abstract

This paper presents a novel approach to automatically generating Korean multiword sentiment expressions from a seed sentiment lexicon and a large-scale domain-specific corpus. A multiword sentiment expression consists of a seed sentiment word and the contextual words that occur adjacent to it. The study focuses on multiword sentiment expressions whose polarity differs from that of their seed sentiment word. The automatically extracted expressions show that 1) contextual words should be defined as part of a multiword sentiment expression together with their corresponding seed sentiment word, 2) the identified expressions contain various indicators of polarity shift that have rarely been recognized before, and 3) the newly recognized shifters contribute to assigning a more accurate polarity value. The empirical results show that the proposed approach improves the performance of a sentiment analysis system that uses an automatically generated lexicon.
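The general idea of pairing a seed word with an adjacent contextual word and keeping the pairs whose polarity departs from the seed's can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `extract_shifted_expressions`, the `min_count` threshold, the majority-vote rule, the English toy data, and the assumption that each corpus document carries a polarity label (for example, derived from a review rating) are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def extract_shifted_expressions(docs, seed_lexicon, min_count=2):
    """Return adjacent word pairs whose observed polarity differs from the seed's.

    docs: iterable of (tokens, label) pairs, label is +1 or -1 (assumed labels).
    seed_lexicon: dict mapping a seed word to its polarity, +1 or -1.
    """
    # votes[pair] = [occurrences in negative docs, occurrences in positive docs]
    votes = defaultdict(lambda: [0, 0])
    for tokens, label in docs:
        for left, right in zip(tokens, tokens[1:]):
            seeds = [w for w in (left, right) if w in seed_lexicon]
            if len(seeds) != 1:
                continue  # need exactly one seed word plus one contextual word
            votes[(left, right)][1 if label > 0 else 0] += 1

    shifted = {}
    for (left, right), (neg, pos) in votes.items():
        if neg + pos < min_count or neg == pos:
            continue  # too rare, or no clear majority
        observed = 1 if pos > neg else -1
        seed = left if left in seed_lexicon else right
        if observed != seed_lexicon[seed]:
            shifted[(left, right)] = observed
    return shifted


# Toy data, invented for this example only.
docs = [
    ("the battery is not good".split(), -1),
    ("sound is not good at all".split(), -1),
    ("the screen is good".split(), 1),
    ("very good value".split(), 1),
    ("no problem with the lens".split(), 1),
    ("no problem so far".split(), 1),
]
seed_lexicon = {"good": 1, "problem": -1}

shifted = extract_shifted_expressions(docs, seed_lexicon)
print(shifted)  # {('not', 'good'): -1, ('no', 'problem'): 1}
```

In this toy run, "not good" and "no problem" are kept because their majority label contradicts the polarity of the seed words "good" and "problem". Pairs seen only once are dropped by the `min_count` threshold.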

Cited by

  1. Automatic Generation of Issue Analysis Reports Based on Social Big Data Mining vol.3, pp.12, 2013, https://doi.org/10.3745/ktsde.2014.3.12.553
  2. Predicting the Unemployment Rate Using Social Media Analysis vol.14, pp.4, 2013, https://doi.org/10.3745/jips.04.0079