Development of Sentiment Detection combined with Deep Learning and Sentiment Dictionary

  • Bae-Ki Kim (The Graduate School of Interdisciplinary Information Studies, The Cyber University of Korea);
  • Kyung-Bae Jang (Dept. of Mechanical and Control Engineering, The Cyber University of Korea)
  • Received: 2023.04.22
  • Reviewed: 2023.06.07
  • Published: 2023.08.31

Abstract

Due to the spread of smart devices and social media and the growth of online purchasing, many companies are trying to understand consumers' consumption patterns and opinions. Accordingly, there is a growing need to understand consumer sentiment by collecting online reviews that contain opinions on products or services, and companies and research institutes in Korea and abroad are conducting related research. However, most studies still target English-language data, and many studies and results on sentiment analysis of English text using lexicon-based or machine learning approaches have been published. For Korean, by contrast, sentiment analysis with machine learning approaches is relatively less accurate because of the complexity of the Korean language and the shortage of labeled data for deep learning. To address these problems, this study used a hybrid system that combines the advantages of deep learning and sentiment dictionary techniques to improve the accuracy of sentiment analysis on Korean online reviews. We then examined how much indicators such as accuracy, precision, and recall improved. The results are expected to help companies automatically analyze and use large volumes of online reviews.
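The abstract does not specify how the deep learning output and the sentiment dictionary are combined, so the following Python sketch is only one possible illustration of a hybrid decision rule, together with the accuracy, precision, and recall indicators it mentions. The toy `LEXICON`, the linear blend, the `weight` parameter, and the 0.5 threshold are all hypothetical and are not taken from the paper. The reviews are assumed to be tokenized already; for Korean text this would normally require a morphological analyzer.

```python
from typing import Dict, List, Tuple

# Hypothetical toy sentiment dictionary: token -> polarity in [-1, 1].
LEXICON: Dict[str, float] = {"good": 1.0, "great": 1.0, "bad": -1.0, "slow": -0.5}


def lexicon_score(tokens: List[str]) -> float:
    """Average polarity of the tokens found in the dictionary (0.0 if none)."""
    hits = [LEXICON[t] for t in tokens if t in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def hybrid_label(model_prob_pos: float, tokens: List[str], weight: float = 0.5) -> int:
    """Blend a classifier's positive-class probability with the lexicon score.

    The linear blend and the 0.5 threshold are illustrative assumptions,
    not the method described in the paper.
    """
    lex_prob = (lexicon_score(tokens) + 1.0) / 2.0  # map [-1, 1] onto [0, 1]
    blended = weight * model_prob_pos + (1.0 - weight) * lex_prob
    return 1 if blended >= 0.5 else 0


def evaluate(y_true: List[int], y_pred: List[int]) -> Tuple[float, float, float]:
    """Return (accuracy, precision, recall) for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall
```

In this sketch the dictionary can overturn a weakly confident classifier: a review scored 0.4 by the model but containing a strongly positive dictionary word is labeled positive. Comparing `evaluate` on the model-only labels and on the hybrid labels is one way to measure how much each indicator changes.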
