Sentiment analysis of online food product review using ensemble technique

  • Received : 2019.02.08
  • Accepted : 2019.04.20
  • Published : 2019.04.28

Abstract

In online marketplaces, consumers encounter a wide range of products and freely express their opinions about them. Because product reviews strongly influence both other consumers and the success of the marketplace, online marketplaces need to accurately analyze consumer sentiment toward the products they sell. Text mining, a data analysis technique, makes it possible to analyze consumer reviews and thereby manage products efficiently. Although analysis accuracy varies with the data domain and data size, previous studies have been limited to specific domains and to fewer than 20,000 records. In addition, few studies have examined additional factors that could improve accuracy. This study applied ensemble techniques to 72,530 reviews from the food product domain, which previous studies have rarely covered. We also examined whether summary reviews improve accuracy. The results show that, unlike in previous studies, the boosting ensemble technique achieved the highest accuracy. In addition, including summary reviews contributed to improved accuracy.
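
The abstract does not state which features, base learners, or software the study used, so the following is only a minimal sketch of the general idea: a boosting ensemble (here AdaBoost over single-word decision stumps) trained once on the review body alone and once on the body combined with its summary. The reviews, the function names, and the choice of word-presence features are all assumptions made for illustration and are not taken from the paper.

```python
import math

# Toy, invented reviews (NOT from the study): (summary, body, label),
# where label +1 means positive sentiment and -1 means negative.
REVIEWS = [
    ("great snack", "my kids love these chips", 1),
    ("great coffee", "smooth and rich flavor", 1),
    ("awful taste", "stale and bland chips", -1),
    ("awful packaging", "the box arrived crushed", -1),
]

def tokenize(text):
    """Lowercase whitespace tokenization into a set of words (presence features)."""
    return set(text.lower().split())

def build_document(summary, body, use_summary):
    """Token set for the body alone, or for the summary plus the body."""
    return tokenize(f"{summary} {body}" if use_summary else body)

def stump_predict(word, polarity, doc):
    """Decision stump: vote `polarity` if `word` is present, else the opposite."""
    return polarity if word in doc else -polarity

def train_adaboost(docs, labels, rounds=5):
    """Return a list of (alpha, word, polarity) weighted stumps."""
    n = len(docs)
    weights = [1.0 / n] * n
    vocab = sorted(set().union(*docs))
    model = []
    for _ in range(rounds):
        # Pick the stump with the lowest weighted training error.
        best = None
        for word in vocab:
            for polarity in (1, -1):
                err = sum(w for w, d, y in zip(weights, docs, labels)
                          if stump_predict(word, polarity, d) != y)
                if best is None or err < best[0]:
                    best = (err, word, polarity)
        err, word, polarity = best
        if err >= 0.5:  # no stump is better than chance; stop early
            break
        err = max(err, 1e-10)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, word, polarity))
        # Increase the weight of misclassified reviews, then renormalize.
        weights = [w * math.exp(-alpha * y * stump_predict(word, polarity, d))
                   for w, d, y in zip(weights, docs, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def predict(model, doc):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(w, p, doc) for a, w, p in model)
    return 1 if score >= 0 else -1

def accuracy(model, docs, labels):
    return sum(predict(model, d) == y for d, y in zip(docs, labels)) / len(docs)

labels = [label for _, _, label in REVIEWS]
docs_body_only = [build_document(s, b, False) for s, b, _ in REVIEWS]
docs_with_summary = [build_document(s, b, True) for s, b, _ in REVIEWS]

model_body_only = train_adaboost(docs_body_only, labels)
model_with_summary = train_adaboost(docs_with_summary, labels)

print("training accuracy, body only:", accuracy(model_body_only, docs_body_only, labels))
print("training accuracy, body + summary:", accuracy(model_with_summary, docs_with_summary, labels))
```

Accuracy on four hand-written training examples says nothing about the study's findings. A realistic comparison would need a held-out test split and a much larger corpus, and would more likely rely on an established machine learning library than on hand-written stumps.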

Table 1. Studies of sentiment analysis using ensemble techniques

Table 2. Analysis results for normal reviews

Table 3. Analysis results for normal reviews combined with summary reviews
