The Study of Developing Korean SentiWordNet for Big Data Analytics : Focusing on Anger Emotion

빅데이터 분석을 위한 한국어 SentiWordNet 개발 방안 연구 : 분노 감정을 중심으로

  • Received : 2014.07.04
  • Accepted : 2014.09.11
  • Published : 2014.11.30

Abstract

Efforts to extract the sentiment information contained in big data, and to identify how users perceive a given target, are actively under way. Such work analyzes statements posted on Internet bulletin boards or SNS about products, movies, and social issues in order to measure people's views and derive concrete preference scores. Obtaining the degree of emotion expressed in a sentence requires a sentiment lexicon that supplies both a list of emotional words and their degree values, so this study addresses how to find emotional vocabulary and how to determine those degree values. The basic approach is to take the basic emotional vocabulary and its degree values from previous studies and a direct survey, and to infer the extended vocabulary and its degree values from dictionary glosses (the explanatory text of each headword). The emotional vocabulary found falls into four lists: the basic emotional list, made up of prototypical emotional words; the extended stratum 1, level 1 list, made up of words used in the glosses of basic emotional words; the extended stratum 2, level 1 list, made up of non-emotional words whose glosses contain basic or extended emotional words; and the extended stratum 2, level 2 list, made up of words for which basic or extended emotional words appear in the glosses of their glosses. The degree values of the extended words were obtained by applying sentence-pattern weights and emphasis multipliers to the degree values of the basic emotional words. The experimental results indicate that AND and OR sentence patterns take the average of the degree values of the words they contain, and that the MULTIPLY pattern carries a weight of 1.2 to 1.5 depending on the type of degree adverb. The NOT pattern is estimated to lower the emotional degree of the original word by a certain amount and reverse it. The emphasis multiplier applied to extended words is expected to be 2 at level 1 and 3 at level 2.
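The sentence-pattern weights described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names, the adverb classes and their individual weights, the NOT reduction factor, and the sample degree values are all hypothetical. The abstract states only that AND and OR patterns average the degree values of the included words, that the MULTIPLY pattern applies a weight between 1.2 and 1.5 depending on the adverb type, and that the NOT pattern lowers and reverses the degree. The emphasis multiplier is left out because the abstract does not say how it enters the calculation.

```python
from statistics import mean

# Hypothetical adverb classes. The abstract gives only the 1.2-1.5 range,
# not which adverb receives which weight.
ADVERB_WEIGHT = {"mild_intensifier": 1.2, "strong_intensifier": 1.5}

# Hypothetical reduction factor. The abstract says NOT lowers the degree
# "by a certain amount" before reversing it, without giving the amount.
NOT_REDUCTION = 0.5


def combine_and_or(degrees):
    """AND / OR pattern: average the degree values of the included words."""
    return mean(degrees)


def apply_multiply(degree, adverb_class):
    """MULTIPLY pattern: scale the degree by the weight of the adverb class."""
    return degree * ADVERB_WEIGHT[adverb_class]


def apply_not(degree, reduction=NOT_REDUCTION):
    """NOT pattern: lower the degree, then reverse its sign."""
    return -(degree * reduction)


# Hypothetical degree values for two basic anger words (the scale is assumed).
basic_degree = {"annoyed": 4.0, "furious": 8.0}

both = combine_and_or(list(basic_degree.values()))
intensified = apply_multiply(basic_degree["annoyed"], "strong_intensifier")
negated = apply_not(basic_degree["furious"])
```

Keeping each pattern as its own small function makes it easy to swap in measured weights once they are known, without touching the rest of the calculation.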
