• Title/Summary/Keyword: 야구 데이터 분석 (baseball data analysis)

Search Result 54, Processing Time 0.029 seconds

Analysis of Professional Baseball Data based on Big Data (빅데이터 기반 프로야구 데이터 분석)

  • Shin, Dong-Jin;Hwang, Seung-Yeon;Lee, Don-Hee;Moon, Jin-Yong;Kim, Jeong-Joon
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.4 / pp.177-185 / 2020
  • Professional baseball has been growing in popularity, and data related to it are available on various portal sites. Analyzing these data can produce results that make professional baseball more accessible and help raise its popularity further. This paper conducts three analyses using professional baseball data. First, it examines the trend in the number of articles about a professional baseball team retrieved from a specific site. Second, it analyzes the correlation between professional baseball scores and the number of spectators. Finally, it analyzes professional baseball batting average and on-base percentage in 2016 and 2017.
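
As an illustration of the score-versus-spectators analysis mentioned in this abstract, the following is a minimal sketch of a Pearson correlation computed in plain Python. The abstract does not say which correlation measure or tooling the authors used, so the choice of Pearson's r, the function name `pearson_r`, and the sample values are all assumptions made for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Made-up illustrative values, not data from the paper.
runs_scored = [2, 5, 3, 8, 6, 1]
attendance = [9800, 12100, 10400, 15000, 13200, 9100]
r = pearson_r(runs_scored, attendance)
print(f"r = {r:.3f}")
```

A value of r near +1 or -1 indicates a strong linear relationship; it says nothing about causation between scoring and attendance.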

The Correlation Of Weather And Hanhwa Eagles (날씨와 한화 이글스의 상관관계)

  • Heo, Tai-Sung;Kang, Ha-Ram
    • Proceedings of the Korean Society of Computer Information Conference / 2021.07a / pp.237-238 / 2021
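  • Baseball generates so much data in every game that it is called a data sport, and games are played on the basis of that data. This study analyzed the correlation between weather and both the winning percentage and the batter performance of the Hanwha Eagles, a Korean professional baseball club. To this end, the Hanwha Eagles' winning percentage and batter performance were collected from the official website of the Korea Baseball Organization (KBO) and from the baseball statistics site STATIZ, and weather data, in the form of a discomfort index that accounts for temperature and humidity, were collected from the Korea Meteorological Administration. Data preprocessing was carried out with Python's pandas library, and data analysis and visualization were then carried out with Python's matplotlib library. The analysis showed that the winning percentage was highest when the discomfort index was "moderate" and lowest when it was "high". In addition, an analysis of the batters' average performance showed that overall batting indicators were similar at the "moderate" and "very high" levels but slumped at the "high" level.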
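
The grouping step described in this abstract can be sketched as below. This is a plain-Python sketch rather than the authors' pandas code, and it rests on assumptions: the discomfort-index formula (a common temperature-humidity form), the level cut points 68/75/80, the function names, and the game log are not given in the abstract and are used here for illustration only.

```python
from collections import defaultdict

def discomfort_index(temp_c, rel_humidity_pct):
    """Temperature-humidity discomfort index (assumed formula, RH as a fraction)."""
    rh = rel_humidity_pct / 100.0
    return 1.8 * temp_c - 0.55 * (1 - rh) * (1.8 * temp_c - 26) + 32

def di_level(di):
    """Map an index value to a level; the cut points 68/75/80 are assumed."""
    if di >= 80:
        return "very high"
    if di >= 75:
        return "high"
    if di >= 68:
        return "moderate"
    return "low"

def win_rate_by_level(games):
    """games: iterable of (temp_c, rel_humidity_pct, won) tuples."""
    tally = defaultdict(lambda: [0, 0])  # level -> [wins, games played]
    for temp_c, rh_pct, won in games:
        level = di_level(discomfort_index(temp_c, rh_pct))
        tally[level][0] += int(won)
        tally[level][1] += 1
    return {level: wins / total for level, (wins, total) in tally.items()}

# Made-up game log for illustration only, not data from the paper.
games = [
    (20, 50, True),
    (22, 60, False),
    (28, 70, False),
    (30, 80, True),
    (33, 85, False),
]
rates = win_rate_by_level(games)
for level, rate in rates.items():
    print(level, rate)
```

With pandas, the same tally would typically be a `groupby` on the level column followed by a mean of the win flag.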

Korean Baseball League Q&A System Using BERT MRC (BERT MRC를 활용한 한국 프로야구 Q&A 시스템)

  • Seo, JungWoo;Kim, Changmin;Kim, HyoJin;Lee, Hyunah
    • Annual Conference on Human and Language Technology / 2020.10a / pp.459-461 / 2020
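  • The many professional baseball articles posted every day mix together varied information such as game results, records, and player injuries, which makes it very cumbersome for users to find the information they want. This paper proposes a Q&A system for the baseball domain that uses document retrieval and machine reading comprehension. Articles are morphologically analyzed, the articles best suited to the user's query are selected using document weights obtained with the BM25 algorithm, and answers are extracted by BERT-based machine reading comprehension trained on KorQuAD 1.0 and a self-built professional baseball question-answering dataset. Adding the baseball-specific dataset to training improved both F1 score and EM by around 15%.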
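
The retrieval stage in this abstract relies on BM25 document weights. A minimal sketch of Okapi BM25 scoring follows. The parameter values `k1=1.5` and `b=0.75`, the smoothed IDF variant, the function name `bm25_scores`, and the toy English documents are assumptions; the paper tokenizes Korean articles with a morphological analyzer, which is replaced here by whitespace splitting.

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            # Smoothed IDF that stays non-negative for common terms.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Toy, whitespace-tokenized "articles" for illustration only.
docs = [
    "pitcher injured shoulder misses series".split(),
    "team wins final game of series".split(),
    "shortstop sets stolen base record".split(),
]
query = "pitcher injured".split()
scores = bm25_scores(query, docs)
best = max(range(len(docs)), key=lambda i: scores[i])
print(best, scores)
```

In a pipeline like the one described, the top-ranked articles would then be passed to the reading-comprehension model for answer extraction.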

Baseball Player Scouting Model using Big Data Analysis and Game Theory (빅데이터 분석과 게임이론을 활용한 야구선수 영입 모델)

  • Kim, Yun-Hu;Kim, Sang-Heon;Choi, Hyung-June;Jung, Jai-Eun
    • Proceedings of the Korean Society of Computer Information Conference / 2018.07a / pp.321-322 / 2018
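  • Big data analysis is used in many areas of sports. In the baseball industry as well, sabermetrics is applied in various ways, such as tactical training and individual training. This paper runs a simulation that applies an existing study, a soccer player scouting model using big data analysis and game theory, to baseball, and proposes a rational decision-making model.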

A Study on the Excellent Operation of the "Korea Baseball Hall of Fame" Based on Baseball Records (야구기록 활용에 기반한 '한국야구명예의전당' 운영 방안 연구)

  • Choi, Tae-Suk;Yim, Jin-Hee
    • Journal of Korean Society of Archives and Records Management / v.16 no.3 / pp.157-177 / 2016
  • This study proposes an effective direction for operating the "Korea Baseball Hall of Fame" based on baseball records. To clarify the problems, it conducted a literature review with case analyses of the Halls of Fame of the United States and Japan, examined the classification and management of Korean baseball records, and consulted officials working in the baseball industry. In conclusion, the study suggests building an archive database to operate the "Korea Baseball Hall of Fame" effectively. First, archives will be collected and then managed and used at the "Korea Baseball Hall of Fame." Second, an oral archive will be constructed to preserve the memories of baseball heroes. Third, baseball records will be classified and stored in a database.

The Study of Selecting Pitcher using Data Mining on Professional Baseball Game Simulator (데이터마이닝을 이용한 프로야구 경기 시뮬레이터에서의 투수 선정 방법에 대한 연구)

  • 정지문;박혜원;최성
    • Proceedings of the KAIS Fall Conference / 2000.10a / pp.370-374 / 2000
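  • In baseball, several pitchers take the mound in a single game, and pitchers with different characteristics pitch depending on the situation. Selecting which pitcher takes the mound is the manager's own prerogative, and the manager draws on long experience to choose the best pitcher for winning. To learn from that experience, this paper uses data mining to analyze the record data generated in professional baseball games and then studies a way to predict in advance which pitcher will take the mound in upcoming games.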

An Analysis System Using Big Data based Real Time Monitoring of Vital Sign: Focused on Measuring Baseball Defense Ability (빅데이터 기반의 실시간 생체 신호 모니터링을 이용한 분석시스템: 야구 수비능력 측정을 중심으로)

  • Oh, Young-Hwan
    • The Journal of the Korea institute of electronic communication sciences / v.13 no.1 / pp.221-228 / 2018
  • Big data is an important keyword of the Fourth Industrial Revolution across the public and private sectors, alongside IoT (Internet of Things), AI (Artificial Intelligence), and cloud systems, in the fields of science, technology, industry, and society. Big data-based services are available in various fields such as transportation, weather, medical care, and marketing. In sports in particular, the emergence of wearable devices that measure vital signs has made it possible to collect and manage various types of bio-signals during training or rehabilitation in daily life rather than in a hospital or rehabilitation center. However, research on big data built from wearable-device vital signs for the training and rehabilitation of baseball players has not yet been active. This paper therefore proposes a big data-based system for baseball infielders and outfielders that can store and analyze vital signs measured during physical activity.

Analysis of the Korean Baseball League using a Markov Chain Model (마르코프 연쇄를 이용한 한국 프로야구 경기 분석)

  • Moon, Hyung Woo;Woo, Yong Tae;Shin, Yang Woo
    • The Korean Journal of Applied Statistics / v.26 no.4 / pp.649-659 / 2013
  • We use a Markov chain model to analyze the Korean Baseball League. We derive the distributions of the number of runs scored and the number of batters that complete their turn at bat in a baseball game using a time-inhomogeneous Markov chain. The model is tested with real data from the 2011 Korean Baseball League.
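
To make the Markov chain idea concrete, here is a heavily simplified sketch that computes the distribution of runs in one half-inning. It is not the paper's model. The state (outs, bases, runs), the event set (outs and hits only, with every runner advancing exactly as many bases as the batter), the function names, and the probabilities are all assumptions. The chain is time-inhomogeneous only in the sense that transition probabilities depend on which batter in the lineup is up.

```python
from collections import defaultdict

def advance(bases, n):
    """Move every runner and the batter n bases; return (new_bases, runs).

    bases is a 3-bit mask: bit 0 = first, bit 1 = second, bit 2 = third.
    """
    moved = (bases << n) | (1 << (n - 1))
    runs = bin(moved >> 3).count("1")  # anyone pushed past third has scored
    return moved & 0b111, runs

def half_inning_run_distribution(lineup, start=0, tol=1e-12, max_batters=200):
    """lineup: list of dicts with keys 'out', 1, 2, 3, 4 (bases gained on a hit)."""
    live = {(0, 0, 0): 1.0}        # (outs, bases, runs) -> probability
    finished = defaultdict(float)  # runs -> probability the half-inning ends there
    batter = start
    for _ in range(max_batters):
        if sum(live.values()) < tol:
            break
        probs = lineup[batter % len(lineup)]
        nxt = defaultdict(float)
        for (outs, bases, runs), p in live.items():
            if p == 0.0:
                continue
            if outs + 1 == 3:
                finished[runs] += p * probs["out"]  # third out ends the half-inning
            else:
                nxt[(outs + 1, bases, runs)] += p * probs["out"]
            for n in (1, 2, 3, 4):
                new_bases, scored = advance(bases, n)
                nxt[(outs, new_bases, runs + scored)] += p * probs[n]
        live = nxt
        batter += 1
    return dict(finished)

# Made-up probabilities for a lineup of identical batters (illustration only).
lineup = [{"out": 0.70, 1: 0.20, 2: 0.06, 3: 0.01, 4: 0.03}] * 9
dist = half_inning_run_distribution(lineup)
expected_runs = sum(r * p for r, p in dist.items())
print(f"P(0 runs) = {dist[0]:.3f}, expected runs = {expected_runs:.3f}")
```

Giving each lineup slot its own probability dictionary is what makes the transition matrix change from batter to batter; walks, errors, and realistic runner advancement would need additional events.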

An Estimation Model for Defence Ability Using Big Data Analysis in Korea Baseball

  • Ju-Han Heo;Yong-Tae Woo
    • Journal of the Korea Society of Computer and Information / v.28 no.8 / pp.119-126 / 2023
  • This paper presents a new model for objectively evaluating the defensive ability of fielders in Korean professional baseball. Using Korean professional baseball game data from 2016 to 2019, the proposed model selects a representative fielder for each team and defensive position and evaluates that fielder's defensive ability. To do so, it proposes a method that calculates the defensive range for each position and divides the calculated defensive area. The defensive range for each position was calculated with the Convex Hull algorithm, based on the points at which fielders at the same position made outs on the ball. The out conversion score and victory contribution score for both infielders and outfielders were calculated as basic scores using the defensive range for each position. In addition, double play points for infielders and extra base points for outfielders were calculated separately and added together.
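
The defensive-range step in this abstract uses a convex hull over the locations where fielders at a position made plays. The sketch below uses Andrew's monotone chain algorithm and the shoelace formula for the enclosed area. The abstract does not say which hull algorithm or coordinate system the authors used, so the algorithm choice, the function names, and the sample coordinates are assumptions.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Return hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Pop points that would make a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]

def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0

# Made-up field coordinates of plays by fielders at one position.
plays = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2)]
hull = convex_hull(plays)
print(hull, polygon_area(hull))
```

Interior points such as (2, 1) do not affect the hull, so the resulting area depends only on the most extreme plays at that position.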

Long term trends in the Korean professional baseball (한국프로야구 기록들의 장기추세)

  • Lee, Jang Taek
    • Journal of the Korean Data and Information Science Society / v.26 no.1 / pp.1-10 / 2015
  • This paper offers a long-term perspective on how some baseball statistics have changed in Korean professional baseball. The data used are league summaries by year over the period 1982-2013. Statistically significant positive correlations with year (p < 0.01) were found for doubles (2B), runs batted in (RBI), bases on balls (BB), strike outs (SO), grounded into double play (GIDP), hit by pitch (HBP), on base percentage (OBP), OPS, earned run average (ERA), wild pitches (WP) and walks plus hits divided by innings pitched (WHIP). There was a statistically significant decreasing trend with year (trend p < 0.01) for triples (3B), caught stealing (CS), errors (E), completed games (CG), shutouts (SHO) and balks (BK). The Box-Jenkins ARIMA model is applied to find a model for forecasting future baseball measures. Univariate time series results suggest that simple lag-1 models fit some baseball measures quite well. In conclusion, the single most important change in Korean professional baseball is the overall downward trend in completed games (CG). The decrease of strike outs (SO) is also very remarkable.
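
The "simple lag-1 models" mentioned in this abstract can be illustrated with an AR(1) model, y_t = c + phi * y_(t-1), fitted by ordinary least squares. This is a stand-in sketch, not the Box-Jenkins ARIMA procedure the paper applies. The function names and the series values are assumptions made for illustration.

```python
def fit_ar1(series):
    """Fit y_t = c + phi * y_{t-1} by ordinary least squares; return (c, phi)."""
    if len(series) < 3:
        raise ValueError("need at least 3 observations")
    x = series[:-1]  # lagged values
    y = series[1:]   # current values
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("lagged series is constant; phi is not identifiable")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi

def forecast(series, c, phi, steps):
    """Iterate the fitted recursion forward from the last observation."""
    out = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        out.append(last)
    return out

# Made-up yearly league totals for one statistic (illustration only).
completed_games = [10.0, 7.0, 5.5, 4.75, 4.375]
c, phi = fit_ar1(completed_games)
print(c, phi, forecast(completed_games, c, phi, steps=2))
```

A fitted phi between 0 and 1 means each year's value is pulled partway back toward a long-run level of c / (1 - phi); a full ARIMA fit would also consider differencing and moving-average terms.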