• Title/Summary/Keyword: baseball data analysis (야구 데이터 분석)

Multi-dimensional Visualization Tool for Baseball Statistical Data Using R (R을 활용한 야구 통계 데이터 다차원 시각화 도구)

  • Kim, Ju Hee; Choi, Yong Suk
    • Proceedings of the Korean Society of Computer Information Conference / 2016.01a / pp.143-146 / 2016
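  • In this study, we built a web page that visualizes large-volume baseball data using the R package googleVis and presented the data as bubble charts. On the web page, each visualized object is drawn as a bubble, and there are three kinds of objects: batters, pitchers, and teams. By mapping each object's attributes to bubble color, bubble size, X-Y coordinates, and year, the data can be visualized in five dimensions. The page's time-slip animation feature lets users observe at a glance how records change over time, and the player search feature makes it possible to select specific players for comparison and analysis.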


Pitching grade index in Korean pro-baseball (한국프로야구에서의 투수평가지표)

  • Lee, Jang Taek
    • Journal of the Korean Data and Information Science Society / v.25 no.3 / pp.485-492 / 2014
  • In baseball, the traditional measures of a pitcher are wins and ERA, but these statistics are influenced by luck and team strength. Sabermetricians have therefore proposed a number of indicators that predict future performance. We define a new measure, the pitching grade index (PGI), which efficiently summarizes a pitcher's performance on a numerical scale using principal component analysis. The PGI statistic can be useful for assessing a pitcher's individual contribution. In addition, the K-means clustering algorithm is used to segment players into groups.
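
The abstract above does not give the PGI formula, so the following is only a minimal sketch of the general idea of a principal-component index: standardize several pitching statistics and score each pitcher on the first principal component. The function names, the choice of columns, and the toy numbers are all hypothetical and are not taken from the paper; the K-means step is omitted.

```python
import math

def standardize(rows):
    """Column-wise z-scores (population standard deviation).
    Assumes no column is constant."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]

def first_component(z, iterations=200):
    """Leading eigenvector of the correlation matrix, found by power iteration."""
    n, d = len(z), len(z[0])
    corr = [[sum(r[i] * r[j] for r in z) / n for j in range(d)]
            for i in range(d)]
    # Asymmetric start vector so it is unlikely to be orthogonal
    # to the leading eigenvector.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def index_scores(rows):
    """Score each row on the first principal component of its z-scores."""
    z = standardize(rows)
    v = first_component(z)
    return [sum(a * b for a, b in zip(r, v)) for r in z]

# Hypothetical toy data, one row per pitcher:
# strikeouts per 9 innings, negated ERA, negated WHIP (so higher is better).
toy_pitchers = [
    [9.5, -2.8, -1.05],
    [7.0, -3.9, -1.30],
    [8.2, -3.2, -1.15],
    [6.1, -4.6, -1.45],
]
scores = index_scores(toy_pitchers)
```

Negating ERA and WHIP before standardizing is a design choice made here so that a larger score reads as a better pitcher; a real index would also need a rule for fixing the sign of the component.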

A Study on the Win-Loss Prediction Analysis of Korean Professional Baseball by Artificial Intelligence Model (인공지능 모델에 따른 한국 프로야구의 승패 예측 분석에 관한 연구)

  • Kim, Tae-Hun; Lim, Seong-Won; Koh, Jin-Gwang; Lee, Jae-Hak
    • The Journal of Bigdata / v.5 no.2 / pp.77-84 / 2020
  • In this study, we analyze win-loss prediction for Korean professional baseball using artificial intelligence models. Based on the models, we predicted the winner of each game as well as each team's final rank in the league, and we developed a website to help viewers understand the results. At the first, third, and fifth innings of each game, we select the model with the highest accuracy and the smallest error, and we generate the rankings from its results. The rankings were generated from predictions covering May 5, the opening day of the season, to August 30, 2020. For games in which the Kia Tigers did not play, however, we used the actual game results. KNN and AdaBoost were selected as the best-performing machine learning models. The ranking error of the predicted results tends to decrease as the season progresses. The deep learning model recorded a model accuracy of 89% and shows the same decreasing trend in ranking error as the machine learning models. We expect that the results of this study can be applied to future KBO predictions as well as to other fields. Posting the per-inning predicted winning percentage generated by the AI algorithm could enhance broadcasts and bring new interest to KBO fans. Furthermore, the inning-by-inning predictions would give teams insights for analyzing data and devising successful strategies.

The Effect of Discomfort Index on Outfielder's Game Record Data (불쾌지수가 외야수의 경기 기록 데이터에 미치는 영향)

  • Kim, Semin; Shin, Chwa-Cheol
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.8 / pp.978-984 / 2020
  • In this study, the correlation between sports records and weather data was analyzed using big data analysis methods. To this end, data were collected through an API and web crawling, then processed, statistically analyzed, and visualized. The subjects were outfielders who reached the regulation number of plate appearances in the 2019 KBO League. The meteorological data were divided into two groups: a discomfort index of 70 or above, and below 70. The results show that, for the various hitting indicators, which are records in which the pitcher is involved, the higher the discomfort index, the better the outfielders' records. Records such as walks, pitches, pitching success rates, pitches per plate appearance, and pitches per game also indicate that the outfielders made things difficult for the pitcher. This study is expected to contribute to the development of the sports data industry and to the performance of baseball players, teams, and coaching staff.

The estimation of winning rate in Korean professional baseball league (한국 프로야구의 승률 추정)

  • Kim, Soon-Kwi; Lee, Young-Hoon
    • Journal of the Korean Data and Information Science Society / v.27 no.3 / pp.653-661 / 2016
  • In this paper, we provide a suitable optimal exponent for the generalized Pythagorean theorem and propose using the logistic model and the probit model to estimate the winning rate in the Korean professional baseball league. Under the root-mean-square-error (RMSE) criterion, the efficiencies of the proposed models are compared with those of the Pythagorean theorem. We use the historical team win-loss records of the Korean professional baseball league from 1982 to the first half of 2015, and the proposed methods slightly outperform the generalized Pythagorean method under the RMSE criterion.
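
As a rough illustration of the baseline mentioned in the abstract above, the generalized Pythagorean estimate is commonly written as RS^x / (RS^x + RA^x), where RS and RA are runs scored and allowed and x is the exponent to be tuned. The sketch below picks x by a simple grid search on RMSE. The grid search, the function names, and the three-team data set are assumptions made for illustration; the abstract does not say how the authors fitted the exponent, and the logistic and probit models are not shown.

```python
import math

def pythagorean_win_rate(runs_scored, runs_allowed, exponent=2.0):
    """Generalized Pythagorean estimate: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )

def best_exponent(teams, candidates):
    """Return the candidate exponent with the smallest RMSE.
    teams: list of (runs_scored, runs_allowed, actual_win_rate)."""
    def score(x):
        preds = [pythagorean_win_rate(rs, ra, x) for rs, ra, _ in teams]
        return rmse(preds, [w for _, _, w in teams])
    return min(candidates, key=score)

# Synthetic toy seasons (not real KBO data).
toy_teams = [(700, 600, 0.570), (650, 650, 0.500), (600, 700, 0.430)]
grid = [1.0 + 0.01 * i for i in range(201)]  # exponents 1.00 to 3.00
x_hat = best_exponent(toy_teams, grid)
```

A grid search is used here only because it is easy to read; any one-dimensional minimizer would serve the same purpose.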

Batting index prediction model 2017 (2017년 한국프로야구 타자력 예측모형 개발)

  • Hong, Chong Sun; Shin, Dong Sik
    • Journal of the Korean Data and Information Science Society / v.28 no.3 / pp.635-645 / 2017
  • In this paper, we propose batting index prediction models for 2017. Because KBO pitcher data are insufficient, the 2016 batting index prediction models were developed from eight selected batting indices, using the past three years of MLB and KBO data. The prediction model fits both MLB and KBO well, and the KBO model fits better than the MLB model in some cases. Using these prediction models, we analyzed and compared the estimated 2016 batting index values for MLB and KBO. From the relationship between the predicted batting indices and batter age for MLB and KBO, we conclude that there is no relationship between the significant batting indices and age.

Rapid Detection of Important Events in Baseball Video Using multi-Modal Analysis (멀티 모달 분석을 통한 야구 동영상에서의 실시간 중요 이벤트 검출 알고리즘)

  • Lee, Jin-Ho; Kim, Hyoung-Gook
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2009.11a / pp.133-136 / 2009
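  • In this paper, we propose an algorithm that detects important event scenes in baseball video in real time. The proposed algorithm analyzes the video information to extract pitching scenes and close-up scenes and thereby detects play segments, and analyzes the audio information to detect audio event segments. To detect the pitching scene that starts each play segment, offline and online models are combined so that performance remains high across varied environments. Audio event segments, detected from intervals in which the announcer's intonation and the crowd's cheering intensify, are combined with the play segments obtained from video analysis to improve the accuracy of important event detection. Experiments show that the proposed algorithm needs 0.024 seconds to process one second of video data and achieves a recall of 0.89 and a precision of 0.975.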


A study on relationship between the performance of professional baseball players and annual salary (한국 프로야구 선수들의 경기력과 연봉의 관계 분석)

  • Seung, Hee-Bae; Kang, Kee-Hoon
    • Journal of the Korean Data and Information Science Society / v.23 no.2 / pp.285-298 / 2012
  • This research examines the relationship between the performance of Korean professional baseball players and their annual salaries. It is based on sabermetrics, which measures the performance of batters in a refined way. We collect the records of batters from eight professional baseball clubs for the 2009 and 2010 seasons and calculate every sabermetric index. Principal component analysis is used to examine the relationship between these indices and the following year's salary. In general, batters who perform better earn higher salaries. The results of this research can be useful for reaching a salary agreement between a player and his club.

Analysis of the Importance and Satisfaction of Viewing Quality Factors among Non-Audience in Professional Baseball According to Corona 19 (코로나 19에 따른 프로야구 무관중 시청품질요인의 중요도, 만족도 분석)

  • Baek, Seung-Heon; Kim, Gi-Tak
    • Journal of Korea Entertainment Industry Association / v.15 no.2 / pp.123-135 / 2021
  • This study focused on keywords related to 'Corona 19 and professional baseball' and 'Corona 19 and professional baseball no spectators', using text mining and social network analysis with the Textom program to identify problems and to set the viewing quality variables. For quantitative analysis, a questionnaire on viewing quality was constructed; of 270 survey respondents, 250 questionnaires were used in the final analysis. Exploratory factor analysis and reliability analysis were conducted to secure the validity and reliability of the questionnaire, and IPA (importance-satisfaction) analysis was then conducted on the validated questionnaire, from which results and strategies are presented. In the IPA analysis, the image-related factors (image composition, image coloration, image clarity, image enlargement and composition, high-quality image) fell in the first quadrant. The second quadrant contained game situation (game level of the supported team, game level of the supported player, star player discovery, competition with rival teams), game information (match schedule information, player information check, team and player performance, game information), interaction (consensus with the supporting team), and some other factors. The commentator factors (baseball-related knowledge, communication ability, pronunciation and voice, use of standard language, introduction of game-related information) and interaction factors (real-time communication with the front desk, sympathy with viewers, information exchange such as chatting) also appeared.

Analysis of the Relationship between a Batter's Performance and Discomfort Index using Big Data: focusing on the Number of Pitches and On Base Percentage (빅데이터를 활용한 타자의 출루 관련 경기력과 불쾌지수의 관계 분석 : 투구 수 유도와 출루율을 중심으로)

  • Kim, Semin; You, Kangsoo
    • Journal of Industrial Convergence / v.18 no.4 / pp.61-66 / 2020
  • Recently, attempts have been made to use data to manage games, seasons, and teams in professional baseball. In this study, we collected baseball game records and analyzed how on-base percentage and the number of pitches induced relate to non-game factors such as the discomfort index, which is derived from weather data; records that link in such outside data are defined as tertiary records. When the discomfort index was over 75, the number of pitches induced was high; when it was less than 69.9, the on-base percentage was high; and when it was 70 or more and less than 75, the batter's on-base performance was the lowest. From these results, it can be inferred that the discomfort index, the batter's on-base percentage, and the number of pitches induced are related, and that they are highly likely to be related to the pitcher's performance. This study suggests the possibility of defining cumulative and ratio records as primary records, sabermetric measures as secondary records, and records linking data from outside the game as tertiary records.
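
The abstract above does not state how the discomfort index was computed. A minimal sketch, assuming the widely used temperature-humidity form DI = 0.81 T + 0.01 RH (0.99 T - 14.3) + 46.3 (T in degrees Celsius, RH in percent), could group games into the three bands the abstract describes. The function names, the band labels, and the handling of a value of exactly 75 are assumptions.

```python
def discomfort_index(temp_c, rel_humidity_pct):
    """Temperature-humidity discomfort index.
    Assumed formula; the abstract does not say which one was used."""
    return (0.81 * temp_c
            + 0.01 * rel_humidity_pct * (0.99 * temp_c - 14.3)
            + 46.3)

def di_band(di):
    """Map a discomfort index to the three bands described in the abstract.
    Treating exactly 75 as 'high' is an assumption."""
    if di >= 75:
        return "high"    # more pitches induced, per the abstract
    if di >= 70:
        return "middle"  # lowest on-base performance, per the abstract
    return "low"         # higher on-base percentage, per the abstract

# Hypothetical game-time weather: 30 degrees Celsius, 70% relative humidity.
di = discomfort_index(30.0, 70.0)
band = di_band(di)
```

With per-game weather joined to box scores, each game could be tagged with `di_band` and the batting records then aggregated per band.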