A Win/Lose prediction model of Korean professional baseball using machine learning technique

  • Seo, Yeong-Jin (Technical Support Team, Hiball Inc.) ;
  • Moon, Hyung-Woo (Institute of Industrial Technology Research Center, Changwon National University) ;
  • Woo, Yong-Tae (Dept. of Computer Engineering, Changwon National University)
  • Received : 2019.01.31
  • Accepted : 2019.02.25
  • Published : 2019.02.28

Abstract

In this paper, we propose a new model for effectively predicting wins and losses in Korean professional baseball games using machine learning. As inputs, we used basic baseball data and Sabermetrics data that are highly correlated with team score, and we trained a deep neural network on them by supervised learning. The Drop-Out algorithm and the ReLU activation function were applied in the network. Using the trained neural network, each team's expected score and expected loss were predicted, and the expected winning percentage was calculated from these predictions. The team with the higher expected winning percentage was predicted as the winner. To verify the effectiveness of the proposed model, we compared the actual winning percentage, the Pythagorean expectation, and the winning percentage predicted by the proposed model.
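The abstract does not state which formula converts a team's expected score and expected loss into an expected winning percentage. As an illustration only, the sketch below assumes the classic Pythagorean expectation with exponent 2, which the paper uses as a comparison baseline. The function names, the dictionary keys, and the team values are hypothetical and are not taken from the paper.

```python
def pythagorean_expectation(runs_scored: float, runs_allowed: float,
                            exponent: float = 2.0) -> float:
    """Return RS^k / (RS^k + RA^k), the Pythagorean winning percentage.

    The exponent k = 2 is the classic choice; it is an assumption here,
    not a value confirmed by the paper.
    """
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("runs must be non-negative")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    if rs + ra == 0:
        # No information at all: treat the game as a coin flip.
        return 0.5
    return rs / (rs + ra)


def predict_winner(team_a: dict, team_b: dict) -> str:
    """Pick the team with the higher expected winning percentage.

    Each team is a dict with hypothetical keys:
    'name', 'expected_score', 'expected_loss'.
    Ties are resolved in favor of team_a.
    """
    p_a = pythagorean_expectation(team_a["expected_score"], team_a["expected_loss"])
    p_b = pythagorean_expectation(team_b["expected_score"], team_b["expected_loss"])
    return team_a["name"] if p_a >= p_b else team_b["name"]


# Made-up predictions for two teams, standing in for the network's outputs.
team_a = {"name": "Team A", "expected_score": 6.0, "expected_loss": 3.0}
team_b = {"name": "Team B", "expected_score": 4.0, "expected_loss": 5.0}
winner = predict_winner(team_a, team_b)
```

With these made-up inputs, Team A's expected winning percentage is 36 / (36 + 9) = 0.8, so it is chosen over Team B.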

Keywords

CPTSCQ_2019_v24n2_17_f0001.png image

Fig. 1. Deep learning neural network model

CPTSCQ_2019_v24n2_17_f0002.png image

Fig. 3. Comparison of the ReLU and sigmoid activation functions

CPTSCQ_2019_v24n2_17_f0003.png image

Fig. 4. Win/Loss prediction model

CPTSCQ_2019_v24n2_17_f0004.png image

Fig. 5. Deep network model for prediction of score

CPTSCQ_2019_v24n2_17_f0005.png image

Fig. 6. Deep network model for prediction of loss

CPTSCQ_2019_v24n2_17_f0007.png image

Fig. 7. Comparison of the actual winning percentage, the Pythagorean expectation, and the proposed model

CPTSCQ_2019_v24n2_17_f0008.png image

Fig. 2. Effect of the drop-out algorithm: (a) before applying drop-out, (b) after applying drop-out

Table 1. Correlation between Sabermetrics data and team score

CPTSCQ_2019_v24n2_17_t0001.png image

Table 2. Result of processing a team's data using a moving average

CPTSCQ_2019_v24n2_17_t0002.png image

Table 3. A team's ERA data converted to Z-scores

CPTSCQ_2019_v24n2_17_t0003.png image

Table 4. Predicted score and loss by team for the games of August 2, 2011

CPTSCQ_2019_v24n2_17_t0004.png image

Table 5. Win probability by team for the games of August 2, 2011

CPTSCQ_2019_v24n2_17_t0005.png image

Table 6. Comparison of the actual winning percentage, the Pythagorean expectation, and the proposed model

CPTSCQ_2019_v24n2_17_t0006.png image

Cited by

  1. League of Legends Win/Loss Prediction Using Bidirectional Recurrent Neural Network Embedding vol.9, pp.2, 2019, https://doi.org/10.3745/ktsde.2020.9.2.61