A Study on the Performance Evaluation of Machine Learning for Predicting the Number of Movie Audiences

  • Jeong, Chan-Mi (Graduate School (Big Data Analytics), Ewha Womans University)
  • Min, Daiki (School of Business, Ewha Womans University)
  • Received : 2020.03.26
  • Accepted : 2020.04.20
  • Published : 2020.05.31

Abstract

Accurate early-stage prediction of box office performance is crucial for the film industry to make better managerial decisions. To improve prediction performance, this paper evaluates the use of machine learning methods. We tested both classification-based and regression-based methods, including k-NN, SVM, and Random Forest. We first evaluate the input variables and find that reputation-related information generated during the first two weeks after release is significant. Prediction test results show that regression-based methods yield lower prediction error, and that Random Forest in particular outperforms the other machine learning methods. Regression-based methods have better predictive power for films with small box office earnings, whereas classification-based methods work better for predicting large box office earnings.

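Film production requires enormous investment, yet audience demand is highly uncertain, so improved demand forecasting can serve as an important tool for decisions aimed at improving profitability. This study examines, in terms of prediction performance, the validity of applying machine learning methods to forecasting post-release demand for a film. The results are summarized as follows. First, statistical tests on the candidate variables confirmed that, along with basic film characteristics (director, actors), time-series data up to the second week after release, such as the number of screens, the number of screenings, the audience count, and the level of interest in the main actors, are significant for demand forecasting. Second, when prediction performance was evaluated using classification-based machine learning methods such as Random Forest Classifier and SVM (Support Vector Machine) and regression-based machine learning methods such as Random Forest Regressor and k-NN Regressor, the Random Forest methods showed superior results. Third, for films whose cumulative audience is below the first quantile, the regression-based methods showed low prediction accuracy, whereas the classification-based methods obtained the best results. That is, different machine learning methods need to be applied depending on the distributional characteristics of film demand.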
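The distinction between regression-based and classification-based prediction drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the two features (screen count and second-week audience), the quartile-based class boundaries, and the helper names `nearest_targets`, `knn_regress`, `knn_classify`, and `audience_class` are all assumptions made for illustration, and a plain k-NN stands in for the full set of methods the paper compares.

```python
import math
import statistics

# Hypothetical toy data (not from the paper): each film is described by
# (screens in week 2, week-2 audience in thousands); the target is the
# final cumulative audience in thousands.
train_X = [(100, 5), (120, 8), (400, 60), (450, 70),
           (900, 300), (950, 320), (1500, 800), (1600, 900)]
train_y = [10, 15, 120, 150, 600, 650, 1800, 2000]


def nearest_targets(X, y, query, k):
    """Return the targets of the k training films closest to `query`."""
    ranked = sorted(zip(X, y), key=lambda pair: math.dist(pair[0], query))
    return [target for _, target in ranked[:k]]


def knn_regress(X, y, query, k=2):
    """Regression framing: predict the audience as the neighbours' mean."""
    return statistics.mean(nearest_targets(X, y, query, k))


def audience_class(value, cut_points):
    """Map an audience count to a class index 0..len(cut_points)."""
    return sum(value > c for c in cut_points)


def knn_classify(X, labels, query, k=2):
    """Classification framing: predict the neighbours' most common class."""
    return statistics.mode(nearest_targets(X, labels, query, k))


# Assumed class boundaries: quartiles of the training targets.
cuts = statistics.quantiles(train_y, n=4)
train_labels = [audience_class(v, cuts) for v in train_y]

new_film = (420, 65)
predicted_audience = knn_regress(train_X, train_y, new_film)
predicted_class = knn_classify(train_X, train_labels, new_film)
print(predicted_audience, predicted_class)
```

The regression variant returns an audience estimate directly, while the classification variant only places the film in an audience band. Features are left unscaled here for brevity; in practice they would normally be standardized before computing distances.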
