Predicting Movie Success based on Machine Learning Using Twitter

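  • J. Yim (임준엽; Department of Computer Engineering, The Catholic University of Korea) ;
  • B.-Y. Hwang (황병연; School of Computer and Information Engineering, The Catholic University of Korea)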
  • Received : 2014.02.13
  • Accepted : 2014.05.26
  • Published : 2014.07.31

Abstract

This paper proposes a method for predicting the box-office success of a film. As the film market has grown, a variety of studies on forecasting market demand have been carried out. A film is a cultural product with a relatively short life cycle. To generate stable profits, it is therefore necessary to plan pre-release marketing costs and the number of screens after release, which in turn requires estimating demand for the product and the scale of its economic return beforehand. Existing studies mainly use attributes of the film itself or competitive factors in the market as predictor variables, and give relatively little weight to the potential audience, the people who actually purchase the product. To account for public awareness of a movie, this paper uses Twitter as a survey sample. The previously used variables and the information extracted from Twitter are defined as offline and online elements, respectively, and the two are combined as input to machine learning. Experiments validate the proposed prediction technique, which predicted whether a film would succeed with about 95% accuracy.
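
The idea of combining offline and online elements into a single input for machine learning can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the feature names (`opening_screens`, `is_sequel`, `tweets_before_release`), the toy records, the min-max scaling, and the 1-nearest-neighbor classifier, which merely stands in for whichever learner the authors actually used.

```python
import math

# Toy, made-up records purely for illustration (NOT data from the paper).
# offline: (opening_screens, is_sequel); online: (tweets_before_release,)
MOVIES = [
    {"offline": (900, 1), "online": (12000,), "hit": 1},
    {"offline": (850, 0), "online": (9500,), "hit": 1},
    {"offline": (700, 1), "online": (8000,), "hit": 1},
    {"offline": (300, 0), "online": (1500,), "hit": 0},
    {"offline": (250, 0), "online": (900,), "hit": 0},
    {"offline": (400, 1), "online": (2000,), "hit": 0},
]


def combine(movie):
    """Concatenate offline and online elements into one feature vector."""
    return list(movie["offline"]) + list(movie["online"])


def min_max_scale(rows):
    """Rescale each column to [0, 1] so no single feature dominates distances."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1 for c in cols]  # avoid division by zero
    return [[(v - lo) / sp for v, lo, sp in zip(row, lows, spans)] for row in rows]


def predict_1nn(train_x, train_y, x):
    """Return the label of the nearest training vector (Euclidean distance)."""
    nearest = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    return train_y[nearest]


def leave_one_out_accuracy(xs, ys):
    """Hold out each movie in turn, predict it from the rest, and average."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        correct += predict_1nn(train_x, train_y, xs[i]) == ys[i]
    return correct / len(xs)


features = min_max_scale([combine(m) for m in MOVIES])
labels = [m["hit"] for m in MOVIES]
accuracy = leave_one_out_accuracy(features, labels)
print(f"leave-one-out accuracy on toy data: {accuracy:.2f}")
```

The accuracy printed by this sketch describes only the toy records above and says nothing about the roughly 95% figure reported in the abstract; the point is the shape of the pipeline, in which offline and online elements are concatenated before a single classifier is trained and evaluated.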
