How to improve the accuracy of recommendation systems: Combining ratings and review texts sentiment scores

A Study on Improving Recommendation Systems by Combining Ratings and Sentiment Analysis of Review Texts

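  • 현지연 (Department of Business Informatics, School of Business, Hanyang University) ;
  • 유상이 (Department of Business Administration, School of Business, Hanyang University) ;
  • 이상용 (School of Business, Hanyang University)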
  • Received : 2019.01.13
  • Accepted : 2019.03.29
  • Published : 2019.03.31

Abstract

As customized services for individuals grow in importance, research on personalized recommendation systems continues to expand. Collaborative filtering is one of the most widely used approaches in both academia and industry. Its limitation is that recommendations rely mostly on quantitative information such as users' ratings, which lowers accuracy. To address this, many studies have tried to improve recommendation performance by using information beyond the quantitative data, and sentiment analysis of customer review text is a representative example. Existing research, however, has not directly combined the results of sentiment analysis with quantitative rating scores in the recommendation system. This study therefore aims to reflect the sentiment expressed in reviews in the rating scores: we propose a new algorithm that converts a user's own review into quantitative information and feeds it directly into the recommendation system.

To do this, users' reviews, which are qualitative information, must be quantified. We calculated sentiment scores using sentiment analysis, a text mining technique. The data consisted of movie reviews, from which we constructed a domain-specific sentiment dictionary. The dictionary was built by regression analysis: positive and negative dictionaries were each constructed using Lasso regression, Ridge regression, and ElasticNet. The accuracy of each dictionary was verified with a confusion matrix. It was 70% for the Lasso-based dictionary, 79% for the Ridge-based dictionary, and 83% for the ElasticNet-based dictionary ($\alpha = 0.3$). The sentiment score of each review was therefore calculated with the ElasticNet dictionary and combined with the original rating to create a new rating.

We show that collaborative filtering that reflects the sentiment scores of user reviews outperforms the traditional method that considers only the existing ratings. The new ratings were tested with memory-based user-based collaborative filtering (UBCF) and item-based collaborative filtering (IBCF), and with the model-based matrix factorization methods SVD and SVD++. For each algorithm, the mean absolute error (MAE) and the root mean square error (RMSE) were calculated to compare the system using ratings combined with sentiment scores against the system using ratings alone. MAE improved by 0.059 for UBCF, 0.0862 for IBCF, 0.1012 for SVD, and 0.188 for SVD++. For RMSE, the corresponding improvements were 0.0431 for UBCF, 0.0882 for IBCF, 0.1103 for SVD, and 0.1756 for SVD++. The ratings that reflect sentiment scores thus predict better than the conventional ratings under both measures. A paired t-test was then conducted to verify the improvement, and it supported the conclusion that the proposed model is better.

To overcome the limitation of previous studies, which judged users' sentiment only from quantitative rating scores, this study quantified review text so that users' opinions are reflected in the recommendation system in a more refined way, improving accuracy. The findings have managerial implications for recommendation system developers who need to consider both quantitative and qualitative information, and the method of constructing the combined ratings could be applied directly by such developers.
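The dictionary-building step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy English reviews, the 1-10 rating scale, the penalty strength `lam`, the reading of $\alpha$ as a glmnet-style L1/L2 mixing weight, and the averaging rule in `sentiment_score` are all hypothetical. Words with positive coefficients go into the positive dictionary and words with negative coefficients into the negative one.

```python
# Toy sketch: elastic-net regression of ratings on word presence, used to
# build positive/negative sentiment dictionaries. All data is hypothetical.


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, lam=0.05, alpha=0.3, sweeps=200):
    """Coordinate descent for
    (1/2n)*||y - b - Xw||^2 + lam*(alpha*||w||_1 + (1-alpha)/2*||w||^2).
    alpha=1 is Lasso, alpha=0 is Ridge (assumed glmnet-style mixing)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = sum(y) / n
    residual = [yi - b for yi in y]  # equals y - b - Xw while w is all zeros
    for _ in range(sweeps):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(c * c for c in col) / n
            if z == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(c * (r + w[j] * c) for c, r in zip(col, residual)) / n
            new_wj = soft_threshold(rho, lam * alpha) / (z + lam * (1 - alpha))
            delta = new_wj - w[j]
            residual = [r - delta * c for c, r in zip(col, residual)]
            w[j] = new_wj
        shift = sum(residual) / n  # unpenalized intercept update
        b += shift
        residual = [r - shift for r in residual]
    return w, b


# Hypothetical (review text, rating on an assumed 1-10 scale) pairs.
reviews = [
    ("great story", 9), ("great acting", 10), ("great plot", 8), ("great ending", 9),
    ("boring story", 2), ("boring acting", 3), ("boring plot", 1), ("boring ending", 2),
]

vocab = sorted({tok for text, _ in reviews for tok in text.split()})
X = [[1.0 if tok in text.split() else 0.0 for tok in vocab] for text, _ in reviews]
y = [float(rating) for _, rating in reviews]

weights, intercept = fit_elastic_net(X, y)
lexicon = dict(zip(vocab, weights))
positive_words = {tok: c for tok, c in lexicon.items() if c > 0}
negative_words = {tok: c for tok, c in lexicon.items() if c < 0}


def sentiment_score(text):
    """Mean coefficient of the dictionary words found in text; 0.0 if none match."""
    hits = [lexicon[t] for t in text.split() if lexicon.get(t, 0.0) != 0.0]
    return sum(hits) / len(hits) if hits else 0.0
```

Setting `alpha=1.0` or `alpha=0.0` in `fit_elastic_net` gives the Lasso and Ridge variants that the abstract compares. Real Korean review text would need a morphological tokenizer in place of `str.split`.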

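As providing customized services to individuals has become important, research on personalized recommendation systems is being carried out continuously. Among recommendation systems, collaborative filtering is the most widely used in academia and industry. However, a problem has been raised that accuracy suffers because recommendations are limited to quantitative information such as users' ratings or usage records. To solve this problem, many studies to date have actively tried to improve recommendation performance by using information other than quantitative data. Sentiment analysis of reviews is a representative example, but existing studies have the limitation that the results of sentiment analysis are not reflected directly in the recommendation system. This study therefore aims to quantify the sentiment expressed in reviews and reflect it in ratings. That is, we propose a new algorithm that converts reviews written by users themselves into quantitative sentiment values that can be reflected directly in the recommendation system. Because this requires quantifying users' reviews, which are qualitative information, we computed sentiment values using sentiment analysis, a text mining technique. Using movie reviews as data, we built a domain-specific sentiment dictionary and computed the sentiment score of each review from it. We confirmed that collaborative filtering reflecting the sentiment values of user reviews achieves better accuracy than traditional collaborative filtering that considers ratings alone. We then ran a paired t-test to obtain evidence that the proposed model is an improvement, and concluded that the proposed model is superior. To overcome the limitation of earlier studies that judged users' sentiment from ratings alone, this study quantified reviews so that users' opinions are reflected in the recommendation system more precisely than under the existing rating system, thereby improving accuracy. Further analyses built on this work are expected to raise recommendation accuracy even more.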
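The abstract states that a review's sentiment score is combined with the original rating and that the result is evaluated with MAE and RMSE, but it does not give the combination rule. The sketch below is therefore only one plausible reading: the linear blend in `combine_rating`, its `weight` parameter, the [-1, 1] sentiment range, and the 1-10 rating scale are assumptions introduced for illustration. The two error measures follow their standard definitions.

```python
import math


def combine_rating(rating, sentiment, weight=0.5, low=1.0, high=10.0):
    """Blend a rating with a sentiment score (hypothetical rule, not the paper's).

    sentiment is assumed to lie in [-1, 1]; it is rescaled to the rating
    range [low, high] and averaged with the rating using `weight`.
    """
    sentiment = max(-1.0, min(1.0, sentiment))
    sentiment_as_rating = low + (sentiment + 1.0) / 2.0 * (high - low)
    blended = (1.0 - weight) * rating + weight * sentiment_as_rating
    return max(low, min(high, blended))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Illustrative values only: a high rating paired with a neutral review is
# pulled toward the middle of the scale.
new_rating = combine_rating(8.0, 0.0)
```

In an evaluation like the one reported, `mae` and `rmse` would be computed once for a recommender trained on the original ratings and once for one trained on the combined ratings, and the two results compared.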

Figure 1. Ratings and reviews for movies rated by user gmlx**** (image: JJSHBB_2019_v25n1_219_f0001.png)
Figure 2. User-based Collaborative Filtering (image: JJSHBB_2019_v25n1_219_f0002.png)
Figure 3. Item-based Collaborative Filtering (image: JJSHBB_2019_v25n1_219_f0003.png)
Figure 4. Overview of the proposed framework (image: JJSHBB_2019_v25n1_219_f0004.png)
Figure 5. Building sentiment dictionary (image: JJSHBB_2019_v25n1_219_f0005.png)
Figure 6. Dictionary Accuracy and Pos/Neg Word Frequency (image: JJSHBB_2019_v25n1_219_f0006.png)

Table 1. Confusion Matrix (image: JJSHBB_2019_v25n1_219_t0001.png)
Table 2. Original ratings and Proposed new ratings (Example) (image: JJSHBB_2019_v25n1_219_t0002.png)
Table 3. Experimental Results of Proposed Model with Test data (image: JJSHBB_2019_v25n1_219_t0003.png)
Table 4. Paired t-test (MAE) (image: JJSHBB_2019_v25n1_219_t0004.png)
Table 5. Paired t-test (RMSE) (image: JJSHBB_2019_v25n1_219_t0005.png)
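The paired t-tests on MAE and RMSE can be illustrated with a short sketch. It assumes that each model's error was measured on the same evaluation splits so that the measurements pair up; the per-split numbers below are made up for illustration and are not taken from the paper's tables. The function returns the standard paired t statistic and its degrees of freedom, and a p-value would be read from a t distribution with those degrees of freedom.

```python
import math
import statistics


def paired_t_statistic(errors_baseline, errors_proposed):
    """Paired t statistic for the mean difference (baseline - proposed).

    Returns (t, degrees_of_freedom). A positive t means the baseline
    error is larger on average, i.e. the proposed model did better.
    """
    if len(errors_baseline) != len(errors_proposed):
        raise ValueError("paired samples must have the same length")
    diffs = [b - p for b, p in zip(errors_baseline, errors_proposed)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0.0:
        raise ValueError("differences have zero variance; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical per-split MAE values for the same five evaluation splits.
baseline_mae = [0.82, 0.85, 0.80, 0.84, 0.83]
proposed_mae = [0.75, 0.79, 0.76, 0.77, 0.78]

t_value, dof = paired_t_statistic(baseline_mae, proposed_mae)
```

The test is paired because both models are scored on identical splits, which removes split-to-split variation from the comparison.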
