Development of New Variables Affecting Movie Success and Prediction of Weekly Box Office Using Them Based on Machine Learning

영화 흥행에 영향을 미치는 새로운 변수 개발과 이를 이용한 머신러닝 기반의 주간 박스오피스 예측

  • Song, Junga (Department of BigData Business, Hanbat National University) ;
  • Choi, Keunho (Department of Business & Accounting, Hanbat National University) ;
  • Kim, Gunwoo (Department of Business & Accounting, Hanbat National University)
  • 송정아 (한밭대학교 빅데이터비즈니스학과) ;
  • 최근호 (한밭대학교 경영회계학과) ;
  • 김건우 (한밭대학교 경영회계학과)
  • Received : 2018.08.02
  • Accepted : 2018.12.16
  • Published : 2018.12.31

Abstract

The Korean film industry grew significantly every year and finally surpassed 200 million cumulative admissions in 2013. From 2015, however, the industry entered a period of low growth, and in 2016 it recorded negative growth. To overcome this difficulty, stakeholders such as production companies, distributors, and multiplex chains have tried to maximize market returns by predicting market changes and responding to them immediately. Because a film is an experiential product, it is not easy to predict its box office record or its initial audience before release, and after release the audience fluctuates with a variety of factors. Production companies and distributors therefore try to secure a guaranteed number of screens from the multiplex chains at the opening of a new release. The multiplex chains, however, tend to open screening schedules for only one week at a time and then set the number of screenings for the following week based on the box office record and audience evaluations. Many previous studies have dealt with predicting the box office records of films. Early studies attempted to identify the factors affecting the box office record. More recent studies, instead of identifying new factors, have applied various analytic techniques to the previously identified factors in order to improve prediction accuracy and to explain the effect of each factor. Most previous studies, however, are limited in that they used the total audience from opening to the end of the run as the target variable, which makes it difficult to predict and respond to dynamically changing market demand.
Therefore, the purpose of this study is to predict the weekly audience of a newly released film so that stakeholders can respond flexibly to changes in its audience. To that end, we considered the factors affecting box office performance that were used in previous studies and developed new factors not used before, such as the opening order of movies and the dynamics of sales. With this comprehensive set of factors, we used machine learning methods, namely Random Forest, Multilayer Perceptron, Support Vector Machine, and Naive Bayes, to predict the cumulative number of visitors from the first week after a film's release to the third week. At the first and second weeks, we predicted the cumulative number of visitors through the following week. At the third week, we predicted the film's total number of visitors. In addition, we predicted the total number of visitors at the first and second weeks using the same factors. As a result, predicting the cumulative number of visitors through the following week was more accurate than predicting the total number of visitors at all three prediction points, and Random Forest achieved the highest accuracy among the machine learning methods we used. This study has two implications: 1) it comprehensively considers various factors that affect the box office record but were rarely addressed by previous research, such as the weekly audience rating, the weekly rank of the film, and the weekly sales share after release, and 2) it supports predicting and responding to dynamically changing market demand by proposing models that predict the weekly audience of newly released films, so that stakeholders can respond flexibly to changes in a film's audience.
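The post-release factors named in the abstract (weekly rank, weekly sales share, and how the rank moves from week to week) can be illustrated with a minimal sketch. The film names, the sales figures, the function name `weekly_features`, and the simplification that all films open in the same week are assumptions made for illustration only; the paper's actual variable definitions are in its tables and are not reproduced here.

```python
from collections import defaultdict

# Hypothetical weekly ticket sales per film (made-up values, arbitrary units).
# Assumption: all films opened in the same week, so index 0 is week 1 for each.
weekly_sales = {
    "film_a": [900, 600, 300],
    "film_b": [500, 700, 400],
    "film_c": [100, 200, 250],
}

def weekly_features(sales_by_film):
    """Return {film: [one feature dict per week]} with rank, sales share, rank change."""
    n_weeks = len(next(iter(sales_by_film.values())))
    features = defaultdict(list)
    for week in range(n_weeks):
        # Total sales of all films in this week, used for the sales share.
        total = sum(sales[week] for sales in sales_by_film.values())
        # Films ordered from highest to lowest sales in this week.
        ordered = sorted(sales_by_film,
                         key=lambda film: sales_by_film[film][week],
                         reverse=True)
        for rank, film in enumerate(ordered, start=1):
            # In the first week there is no previous rank, so the change is 0.
            prev_rank = features[film][-1]["rank"] if features[film] else rank
            features[film].append({
                "week": week + 1,
                "rank": rank,
                "sales_share": sales_by_film[film][week] / total,
                "rank_change": prev_rank - rank,  # positive means the film moved up
            })
    return dict(features)

features = weekly_features(weekly_sales)
for film, rows in features.items():
    print(film, rows)
```

Feature rows like these, taken up to the prediction week, could then be combined with pre-release factors and passed to any classifier; which classifier and which exact feature set to use is a separate choice that this sketch does not make.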

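The Korean film industry, which surpassed 200 million cumulative admissions in 2013, had recorded remarkable growth every year. From 2015, however, it entered an era of low growth, and in 2016 it recorded negative growth. The stakeholders that make up the film industry (production companies, distributors, theater owners, and others) try to maximize market returns by predicting the market's response to released films and establishing strategies to respond flexibly. Accordingly, this study aims to predict the audience by week so that stakeholders can respond flexibly to audience demand, which changes dynamically after release. For the analysis, in addition to the factors used in previous studies, we used as new factors data not used in previous studies, such as a film's box office rank, sales share, and range of change in box office rank, all of which change dynamically after release. Using machine learning techniques including Naive Bayes, Random Forest, Support Vector Machine, and Multilayer Perceptron, we predicted the cumulative audience through the following week at three points (after the opening day, one week after opening, and two weeks after opening) and predicted the total audience at three weeks after opening. We ran experiments with models that included the newly proposed variables and models that did not, and for comparison we also predicted the total audience at every prediction point using the same predictors. The results showed that, at the same point in time, predicting the cumulative audience through the following week was more accurate than predicting the total audience. Models that included the newly proposed variables were more accurate in most cases, and the difference was statistically significant, confirming that the new variables contributed to accuracy. Among the machine learning techniques, Random Forest showed the highest accuracy.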
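The two kinds of target variable compared in the abstract (the cumulative audience through the following week versus the total audience over the whole run) can be sketched as follows. The class boundaries in `CLASS_BOUNDS`, the weekly audience counts, and the function names are hypothetical; the paper defines its own audience classes in tables that are not reproduced here.

```python
from bisect import bisect_right

# Hypothetical class boundaries in audience counts (an assumption for
# illustration; the paper's real class definitions are in its tables).
CLASS_BOUNDS = [100_000, 500_000, 1_000_000, 5_000_000]

def audience_class(count, bounds=CLASS_BOUNDS):
    """Map an audience count to an ordinal class index (0 is the smallest class)."""
    return bisect_right(bounds, count)

def targets_at(weekly_audience, week):
    """Return both targets for a prediction made at the end of `week` (1-based)."""
    # Cumulative audience from opening through the week after the prediction point.
    cumulative_next = sum(weekly_audience[: week + 1])
    # Total audience over the entire run.
    total = sum(weekly_audience)
    return {
        "next_week_cumulative": audience_class(cumulative_next),
        "total": audience_class(total),
    }

# Made-up weekly audience counts for one film over a five-week run.
run = [400_000, 300_000, 200_000, 150_000, 50_000]
for week in (1, 2):
    print(week, targets_at(run, week))
```

The next-week target depends only on audience that arrives shortly after the prediction point, whereas the total target also depends on the rest of the run, which is one plausible reading of why the study found the former easier to predict.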

Figures and Tables

  • Figure 1. Prediction model
  • Figure 2. Comparison of prediction accuracy over time
  • Figure 3. Comparison accuracy by the target variable (Random Forest)
  • Figure 4. Confusion Matrix for Random Forest
  • Table 1. 2012-2017 Key statistical indicators of Korean film industry (Korean Film Council, 2017)
  • Table 2. Number of movies by class (Total Audience)
  • Table 3. Definition of variable
  • Table 4. Definition of target variable (Weekly Audience)
  • Table 5. Definition of target variable (Total Audience)
  • Table 6. Comparison of prediction accuracy

References

  1. Korean Film Council (2017). 2017 Korean film industry settlement, Korean Film, Available at http://www.kofic.or.kr/ (Downloaded 13 February, 2018)
  2. Chang, J. Y., "An Experimental Evaluation of Box office Revenue Prediction through Social Bigdata Analysis and Machine Learning," The Journal of the Institute of Internet, Broadcasting and Communication, Vol.17, No.3(2017), 167-173. https://doi.org/10.7236/JIIBC.2017.17.3.167
  3. Choi, S. H., "The Impact of Distributors in the Movie Exhibition Market : Focusing on Distributor Types," Review of Culture & Economy, Vol.20, No.1(2017), 105-128.
  4. Jeon, S. H., and Y. S. Son, "Prediction of box office using data mining," The Korea Journal of Applied Statistics, Vol.29, No.7(2016), 1257-1270. https://doi.org/10.5351/KJAS.2016.29.7.1257
  5. Kang, S. J., "Analysis Box Office Success of A Movie - Focused on Commercial Film Released in 2016," Journal of the Korea Entertainment Industry Association, Vol.11, No.5(2017), 1-15.
  6. Kim, B. S., "Comparison of Factors Predicting Theatrical Movie Success : Focusing on the Classification by the Release Type and the Length of Run," Korean Journal of Journalism & Communication Studies, Vol.53, No.1(2009), 257-287.
  7. Kim, S. Y., S. H. Im, and Y. S. Jung, "A Comparison Study of the Determinants of Performance of Motion Pictures: Art Film vs. Commercial Film," Journal of The Korea Contents Association, Vol.10, No.2(2010), 381-393. https://doi.org/10.5392/JKCA.2010.10.2.381
  8. Kwon, S. J., "The Impact of Critics on Movie Market Performances: Art versus Commercial," Review of Culture & Economy, Vol.17, No.3(2014), 3-21.
  9. Litman, B., "Predicting success of theatrical movies: An empirical study," The Journal of Popular Culture, Vol.16, No.4(1983), 159-175. https://doi.org/10.1111/j.0022-3840.1983.1604_159.x
  10. Park, S. H. and H. J. Song, "Word of Mouth and Box Office Performance: WOM's Impact on Weekly Box Office Revenues," Korean Journal of Journalism & Communication Studies, Vol.56, No.4(2012), 210-235.
  11. Quader, N., M. O. Gani, D. Chaki, and M. H. Ali, "A machine learning approach to predict movie box-office success," Computer and Information Technology (ICCIT), Vol.20(2017), 1-7.
  12. Rhee, T. G., and F. Zulkernine, "Predicting Movie Box Office Profitability: A Neural Network Approach," 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), 2016.
  13. Song, J. W. and S. J. Han, "Predicting gross box office revenue for domestic films," Communications for Statistical Applications and Methods, Vol.20(2013), 301-309. https://doi.org/10.5351/CSAM.2013.20.4.301
  14. Yim, J. Y. and B. Y. Hwang, "Predicting Movie Success based on Machine Learning Using Twitter," KIPS transactions on Software and Data Engineering, Vol.3, No.7(2014), 263-270. https://doi.org/10.3745/KTSDE.2014.3.7.263
  15. Yoo, H. J., "The Determinants of Motion Pictures Box Office Performances - For Movies Produced in Korea Between 1988 and 1999," Korean Journal of Journalism & Communication Studies, Vol.46, No.3(2002), 183-213.