Sentiment Dictionary Construction Based on Reason-Sentiment Pattern Using Korean Syntax Analysis

  • Woo Hyun Kim (김우현; Department of Industrial Data Engineering, Hanyang University)
  • Heejung Lee (이희정; School of Interdisciplinary Industrial Studies, Hanyang University)
  • Received : 2023.12.01
  • Accepted : 2023.12.15
  • Published : 2023.12.31

Abstract

Sentiment analysis identifies the feelings, opinions, and attitudes expressed in text and is essential for evaluating consumer feedback and social media posts. However, building the sentiment dictionaries that this analysis depends on is complex and time-consuming, because people express emotions differently depending on context and domain. In this study, we propose a method that simplifies dictionary construction. We use syntax analysis of Korean to identify and extract sentiment words based on the Reason-Sentiment Pattern, which distinguishes words that express a feeling from words that explain why the feeling is expressed; this distinction makes the method applicable across contexts and domains. We define sentiment words as those with clear polarity even when used on their own, and we exclude words whose polarity varies with context and domain. This approach enables the extraction of explicit sentiment expressions, improving the accuracy of attribute-level sentiment analysis. We validate the methodology on cosmetics review datasets from Korean online shopping malls and show how a sentiment dictionary restricted to clear-polarity words can provide useful insights for product planners: knowing the polarity of specific attributes and the reasons behind it lets planners improve product weaknesses and emphasize strengths. The approach reduces dependence on extensive sentiment dictionaries while offering high accuracy and applicability across domains.
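The abstract does not spell out the extraction rules, so the following is only a minimal sketch of the general idea, under stated assumptions: a sentence arrives as a dependency parse, a small seed set of clear-polarity words is available, and a reason clause attaches to the sentiment word through a relation labelled `reason`. The `Token` type, the `CLEAR_POLARITY` dictionary, the relation labels, and the English toy parse are all hypothetical and are not taken from the paper or from any real parser.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int     # 1-based position in the sentence
    text: str
    head: int    # idx of the governing token, 0 for the root
    deprel: str  # hypothetical dependency label


# Hypothetical seed dictionary: words whose polarity is clear on their own.
CLEAR_POLARITY = {"disappointed": -1, "satisfied": 1}


def subtree(tokens, root_idx):
    """Return the tokens governed by root_idx (inclusive), in sentence order."""
    keep = {root_idx}
    changed = True
    while changed:
        changed = False
        for t in tokens:
            if t.head in keep and t.idx not in keep:
                keep.add(t.idx)
                changed = True
    return [t for t in tokens if t.idx in keep]


def extract_pairs(tokens):
    """Collect (reason phrase, sentiment word, polarity) triples."""
    pairs = []
    for sent in tokens:
        polarity = CLEAR_POLARITY.get(sent.text)
        if polarity is None:
            continue  # not a clear-polarity word, so it is skipped
        for dep in tokens:
            # Assumed pattern: the reason clause depends directly on the sentiment word.
            if dep.head == sent.idx and dep.deprel == "reason":
                reason = " ".join(t.text for t in subtree(tokens, dep.idx))
                pairs.append((reason, sent.text, polarity))
    return pairs


# Hand-written toy parse standing in for a review such as
# "the texture is sticky, so I was disappointed".
tokens = [
    Token(1, "texture", 2, "subj"),
    Token(2, "sticky", 3, "reason"),
    Token(3, "disappointed", 0, "root"),
]
pairs = extract_pairs(tokens)
```

In this toy case the word "sticky" never needs a dictionary entry: its negative reading for the attribute "texture" is inferred from the clear-polarity word it explains, which is the kind of separation between reason and sentiment that the abstract describes.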

Acknowledgement

This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2023S1A5A2A03083440).
