Multi-Dimensional Analysis Method of Product Reviews for Market Insight

마켓 인사이트를 위한 상품 리뷰의 다차원 분석 방안

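  • 박정현 (School of Business Administration, Hanyang University)
  • 이서호 (School of Business Administration, Hanyang University)
  • 임규진 (School of Business Administration, Hanyang University)
  • 여운영 (Department of Business Informatics, Hanyang University)
  • 김종우 (School of Business Administration, College of Business, Hanyang University)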
  • Received : 2020.04.01
  • Accepted : 2020.06.02
  • Published : 2020.06.30

Abstract

With the development of the Internet, consumers can easily check product information through e-commerce. Product reviews, which are consulted during the purchasing process, are based on user experience, so consumers act as producers of information as well as readers of it. From the consumer's perspective, reviews can make purchasing decisions more efficient; from the seller's perspective, they can help with product development and strengthen competitiveness. However, it takes considerable time and effort for consumers to read the vast number of reviews on e-commerce sites and grasp both the overall assessment and the assessment dimensions they consider important for the products they want to compare. This is because product reviews are unstructured, so the sentiment and the assessment dimension of a review are hard to identify at a glance. For example, consumers who want to purchase a laptop would like to compare candidate products on each dimension, such as performance, weight, delivery, speed, and design. This paper therefore proposes a method that automatically generates multi-dimensional assessment scores from the reviews of the products to be compared. The method consists of two phases: a preparation phase and an individual product scoring phase. In the preparation phase, a dimension classification model and a sentiment analysis model are built from reviews of a large product category. The dimension classification model combines word embedding with association analysis, which compensates for a limitation of the word embedding methods used in existing studies to relate words to dimensions: they consider only the distance between words within a sentence. The sentiment analysis model is a CNN trained on data tagged as positive or negative at the phrase level, for accurate polarity detection. In the individual product scoring phase, the prepared models are applied to reviews split into phrases. Each phrase is assigned to the dimension it is judged to describe, and multi-dimensional assessment scores are obtained by aggregating the results for each assessment dimension according to the proportion of reviews grouped in this way. In the experiment, approximately 260,000 reviews of a large product category were collected to build the dimension classification model and the sentiment analysis model. In addition, reviews of laptops from companies S and L sold on an e-commerce site were collected and used as experimental data. The dimension classification model classified individual product reviews, broken down into phrases, into six assessment dimensions, combining the existing word embedding method with an association analysis that reflects the frequency between words and dimensions. Combining word embedding with association analysis increased the accuracy of the model by 13.7%. The sentiment analysis model analyzed assessments more closely when trained on phrases rather than sentences, and its accuracy was 29.4% higher than that of the sentence-based model. Because the method enables multi-dimensional product comparison, both sellers and consumers can expect more efficient decision making in purchasing and product development. In addition, text reviews, which are unstructured data, were transformed into objective values such as frequencies and morphemes and analyzed together using word embedding and association analysis, which improves the objectivity of the multi-dimensional analysis. The proposed approach should be an attractive analysis model, since it enables more effective service deployment in an evolving and fiercely competitive e-commerce market and can satisfy both sellers and customers.
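
The idea of combining word embedding with association analysis for dimension classification can be illustrated with a minimal sketch. The abstract does not give the actual combination formula, so everything here is an assumption made for illustration: the toy 3-dimensional vectors, the co-occurrence counts, the use of a confidence-style association measure (the share of a word's occurrences that fall under a dimension), and the weighted sum controlled by a hypothetical `alpha` parameter.

```python
import math

# Toy word vectors (assumed values); a real system would load trained embeddings.
EMBEDDINGS = {
    "performance": [0.9, 0.1, 0.0],
    "weight":      [0.1, 0.9, 0.0],
    "fast":        [0.8, 0.2, 0.1],
    "light":       [0.6, 0.4, 0.1],
}

# Hypothetical counts: how often a word occurred in phrases labeled with each dimension.
COOCCURRENCE = {
    "fast":  {"performance": 40, "weight": 2},
    "light": {"performance": 3, "weight": 45},
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def association(word, dimension):
    """Confidence-style score: share of the word's occurrences in this dimension."""
    counts = COOCCURRENCE.get(word, {})
    total = sum(counts.values())
    return counts.get(dimension, 0) / total if total else 0.0


def dimension_score(word, dimension, alpha=0.5):
    """Weighted sum of embedding similarity and association (assumed combination)."""
    similarity = cosine(EMBEDDINGS[word], EMBEDDINGS[dimension])
    return alpha * similarity + (1 - alpha) * association(word, dimension)


def classify(word, dimensions, alpha=0.5):
    """Return the dimension with the highest combined score."""
    return max(dimensions, key=lambda d: dimension_score(word, d, alpha))


if __name__ == "__main__":
    dims = ["performance", "weight"]
    print(classify("light", dims, alpha=1.0))  # embedding similarity only
    print(classify("light", dims, alpha=0.5))  # embedding plus association
```

With these toy values, embedding similarity alone places "light" closer to "performance", while adding the association term moves it to "weight". This mirrors the kind of correction the abstract attributes to association analysis, although the study's own data and weighting may differ.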

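With the development of the Internet, consumers easily check product information on e-commerce sites. The product reviews they consult are written from user experience, so they not only make purchasing decisions more efficient but also help with product development. However, identifying the details of the assessment dimensions of interest in a vast number of product reviews takes considerable time and effort. For example, consumers who want to buy a laptop wish to check how the products being compared are rated on each assessment dimension, such as performance, weight, and design. This paper therefore proposes a method for automatically generating multi-dimensional product assessment scores from product reviews. The proposed method consists of two phases, a preparation phase and an individual product assessment phase: a dimension classification model and a sentiment analysis model, built in advance from reviews of a large product category, analyze the reviews of individual products. The dimension classification model combines word embedding with association analysis, compensating for the limitation that the word embedding approach used in existing studies to find relationships between dimensions and words considers only the position of words within a sentence. The sentiment analysis model is a CNN built from training data tagged as positive or negative at the phrase level for accurate polarity judgment. In the individual product assessment phase, the prepared models are applied to phrase-level reviews and the results are aggregated by assessment dimension to obtain multi-dimensional assessment scores. In the experiment, the assessment models were built from approximately 260,000 reviews of a large product category, and 1,011 and 1,062 laptop reviews from companies S and L, respectively, were used as experimental data. The dimension classification model classified individual product reviews, broken down into phrases, into six assessment dimensions, and the model combined with association analysis was 13.7% more accurate than the existing word embedding approach. The sentiment analysis model trained at the phrase level analyzed the assessment dimensions more closely than the model trained on sentences, achieving 29.4% higher accuracy. Through this study, both sellers and consumers can compare products along multiple dimensions and can therefore expect efficient decision making in purchasing and product development.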
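
The final aggregation step, in which phrase-level results are combined per assessment dimension, can be sketched as follows. This is a minimal illustration under stated assumptions: each phrase is taken to arrive already labeled with a dimension and a polarity (the outputs of the two prepared models), and the score of a dimension is assumed to be the share of its phrases that are positive. The function name `dimension_scores` and the `"pos"`/`"neg"` labels are hypothetical; the paper's exact scoring rule is not stated in the abstract.

```python
from collections import defaultdict


def dimension_scores(phrases):
    """Aggregate (dimension, polarity) pairs into a positive share per dimension.

    `phrases` is an iterable of (dimension, polarity) tuples, where polarity
    is "pos" or "neg" (assumed labels from a sentiment model).
    """
    counts = defaultdict(lambda: {"pos": 0, "neg": 0})
    for dimension, polarity in phrases:
        counts[dimension][polarity] += 1
    # Every dimension present has at least one phrase, so the divisor is never zero.
    return {
        dimension: c["pos"] / (c["pos"] + c["neg"])
        for dimension, c in counts.items()
    }


if __name__ == "__main__":
    labeled = [
        ("performance", "pos"),
        ("performance", "pos"),
        ("performance", "neg"),
        ("weight", "neg"),
        ("design", "pos"),
    ]
    for dimension, score in dimension_scores(labeled).items():
        print(f"{dimension}: {score:.2f}")
```

Running the same function on the labeled phrases of two products would give per-dimension scores that can be compared side by side, which is the kind of comparison the abstract describes for the laptops of companies S and L.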
