Text Mining and Sentiment Analysis for Predicting Box Office Success

  • Kim, Yoosin (Big Data Analytics Department, University of Seoul) ;
  • Kang, Mingon (Department of Computer Science, Kennesaw State University) ;
  • Jeong, Seung Ryul (Business IT Graduate School, Kookmin University)
  • Received : 2018.03.05
  • Accepted : 2018.04.12
  • Published : 2018.08.31

Abstract

Since the emergence of online communication, text mining and sentiment analysis have been frequently applied to analyzing electronic word-of-mouth. This study aims to develop a domain-specific sentiment lexicon for predicting box office success in the Korean film market and to validate the feasibility of that lexicon. Natural language processing, a machine learning algorithm, and a lexicon-based sentiment classification method are employed. To create the movie-domain sentiment lexicon, 233,631 reviews of 147 movies, together with their popularity ratings, were collected using an XML crawling package in R. The resulting Korean sentiment dictionary, comprising 706 negative words and 617 positive words, achieved 81.69% accuracy in sentiment classification. The results showed a stronger positive relationship between box office success and consumers' sentiment, as well as a significant positive effect in the linear regression used for the prediction model. In addition, they indicate that emotion expressed in user-generated content can be a more accurate clue for predicting business success.
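The lexicon-based classification and the regression step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tiny English word sets, the function names (`score_review`, `classify`, `positive_share`, `fit_line`), and the whitespace tokenizer are all assumptions made for illustration, whereas the study itself used a Korean dictionary of 617 positive and 706 negative words and collected its data in R.

```python
# Hypothetical toy lexicon standing in for the paper's Korean dictionary
# (617 positive and 706 negative words); these entries are illustrative only.
POSITIVE = {"great", "moving", "fun"}
NEGATIVE = {"boring", "awful", "waste"}


def score_review(text):
    """Count positive minus negative lexicon hits in a review."""
    tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    return pos - neg


def classify(text):
    """Label a review by the sign of its lexicon score."""
    score = score_review(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def positive_share(reviews):
    """Share of positive reviews among those with a non-neutral label."""
    labels = [classify(r) for r in reviews]
    rated = [label for label in labels if label != "neutral"]
    if not rated:
        return 0.0
    return rated.count("positive") / len(rated)


def fit_line(xs, ys):
    """Ordinary least squares for a single predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

In this sketch, `positive_share` would be computed per movie and passed to `fit_line` as the predictor, with a box office measure as the response. A positive slope would correspond to the positive relationship the abstract reports, although the abstract does not specify the actual regression variables.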

Cited by

  1. A Case Study on Improving Failure Response Efficiency through Unstructured Data Analysis of Equipment Inspection Logs vol.21, pp.1, 2018, https://doi.org/10.7472/jksii.2020.21.1.127
  2. Popularity prediction of movies: from statistical modeling to machine learning techniques vol.79, pp.47, 2018, https://doi.org/10.1007/s11042-019-08546-5
  3. Design and Application of Mobile Education Information System Based on Psychological Education vol.2021, 2018, https://doi.org/10.1155/2021/1789750
  4. Improving productivity in Hollywood with data science: Using emotional arcs of movies to drive product and service innovation in entertainment industries vol.72, pp.5, 2018, https://doi.org/10.1080/01605682.2019.1705194