• Title/Summary/Keyword: 앙상블기법 (ensemble technique)

Predicting win-loss using game data and deriving the importance of subdivided variables (게임데이터를 이용한 승패예측 및 세분화된 변수 중요도 도출 기법)

  • Oh, Min-Ji;Choi, Eun-Seon;Oui, Som Akhamixay;Cho, Wan-Sup
    • The Journal of Bigdata / v.5 no.2 / pp.231-240 / 2020
  • With the development of the IT industry and the growth of the game industry, users' game data is recorded second by second across various plays and options, and this vast amount of game data can be analyzed with big data techniques. Combined with business, big data is used to discover new value for profit creation in various fields, but it remains underused in the game industry. In this study, considering the characteristics of each subdivided lane ("line"), we built a win-loss prediction model for each lane using League of Legends game data and derived the importance of variables. The results can help general game users plan strategies: by looking up information about team members in advance on record search sites, they can increase their win rate.

A Study On The Classification Of Driver's Sleep State While Driving Through BCG Signal Optimization (BCG 신호 최적화를 통한 주행중 운전자 수면 상태 분류에 관한 연구)

  • Park, Jin Su;Jeong, Ji Seong;Yang, Chul Seung;Lee, Jeong Gi
    • The Journal of the Convergence on Culture Technology / v.8 no.6 / pp.905-910 / 2022
  • Drowsy driving demands public attention because it raises the incidence of traffic accidents and leads to fatal ones. The number of accidents caused by drowsy driving is increasing every year. To address this problem, research on measuring various biosignals is being conducted worldwide. This paper focuses on non-contact biosignal analysis. A running vehicle generates various noises, such as engine, tire, and body vibrations. To measure the driver's heart rate and respiration rate in a moving vehicle with a piezoelectric sensor, we designed a sensor plate that cushions vehicle vibrations and reduced the noise generated by the vehicle. In addition, we developed a system that classifies whether the driver is asleep, using a model trained with a CNN-LSTM ensemble learning technique on the piezoelectric sensor signal. To learn the sleep state, the subject's biosignals were acquired every 30 seconds, and 797 data samples were comparatively analyzed.

Multi-scale Attention and Deep Ensemble-Based Animal Skin Lesions Classification (다중 스케일 어텐션과 심층 앙상블 기반 동물 피부 병변 분류 기법)

  • Kwak, Min Ho;Kim, Kyeong Tae;Choi, Jae Young
    • Journal of Korea Multimedia Society / v.25 no.8 / pp.1212-1223 / 2022
  • Skin lesions are common diseases that range from skin rashes to skin cancer, which can lead to death. Early diagnosis of skin diseases is important because it can considerably shorten the course of treatment and reduce the harmful effects of the disease. Recently, computer-aided diagnosis (CAD) systems based on artificial intelligence have been actively developed for the early diagnosis of skin diseases. In a typical CAD system, accurate classification of skin lesion types is of great importance for improving diagnostic performance. Motivated by this, we propose a novel deep ensemble classification method with multi-scale attention networks. The proposed deep ensemble networks are jointly trained using a single loss function in an end-to-end manner. In addition, the proposed deep ensemble network is equipped with a multi-scale attention mechanism and segmentation information of the original skin input image, which improves classification performance. To evaluate our method, we used the publicly available human skin disease dataset (HAM10000) and a private animal skin lesion dataset. Experimental results showed that the proposed method achieves 97.8% and 81% accuracy on the HAM10000 and animal skin lesion datasets, respectively. This work would be useful for developing a more reliable CAD system that helps doctors diagnose skin diseases early.

A Study on the Development of Traffic Volume Estimation Model Based on Mobile Communication Data Using Machine Learning (머신러닝을 이용한 이동통신 데이터 기반 교통량 추정 모형 개발)

  • Dong-seob Oh;So-sig Yoon;Choul-ki Lee;Yong-Sung CHO
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.22 no.4 / pp.1-13 / 2023
  • This study develops a National Highway traffic volume estimation model based on mobile communication data, using an ensemble-based machine learning algorithm. Using inputs such as mobile communication data and VDS data, LightGBM was selected as the best model for estimating traffic volume. Evaluated at 96 points where VDS was installed, the model's MAPE was 8.49% (accuracy 91.51%). On roads where VDS was not installed, traffic estimation accuracy was 92.6%.
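The reported figures are consistent with accuracy being defined as 100 minus MAPE (100 - 8.49 = 91.51). A minimal sketch of that calculation follows, assuming this definition; the function names and the sample traffic counts are hypothetical and are not taken from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Each term is the absolute error relative to the observed value.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def accuracy_from_mape(mape_percent):
    """Assumed definition: accuracy (%) = 100 - MAPE (%)."""
    return 100.0 - mape_percent


# Hypothetical observed (VDS) and estimated traffic volumes.
observed = [1000, 2000, 4000]
estimated = [900, 2100, 4400]

error = mape(observed, estimated)
print(f"MAPE = {error:.2f}%, accuracy = {accuracy_from_mape(error):.2f}%")
```

Note that MAPE is undefined when an observed value is zero, so points with no recorded traffic would need separate handling.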

A Study on Classification Models for Predicting Bankruptcy Based on XAI (XAI 기반 기업부도예측 분류모델 연구)

  • Jihong Kim;Nammee Moon
    • KIPS Transactions on Software and Data Engineering / v.12 no.8 / pp.333-340 / 2023
  • Efficient prediction of corporate bankruptcy helps financial institutions make appropriate lending decisions and reduce loan default rates. Many studies have used classification models based on artificial intelligence. In the financial industry, however, even a high-performing predictive model must be accompanied by an intuitive explanation of the basis for its results. The US, the EU, and South Korea have each recently introduced a right to request explanations of algorithms, so transparency in the use of AI in the financial sector must be secured. This paper proposes an interpretable AI-based classification model using publicly available corporate bankruptcy data. First, data preprocessing and 5-fold cross-validation were performed, and classification performance was compared after optimizing 10 supervised classification models, including logistic regression, SVM, XGBoost, and LightGBM. LightGBM performed best, and SHAP, an explainable AI technique, was applied to provide a post-hoc explanation of the bankruptcy prediction process.
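The 5-fold cross-validation step can be illustrated with a short index-splitting sketch. This is a generic illustration that uses only the standard library; the function name, the shuffling seed, and the sample count are assumptions, and the paper does not state how its folds were built (for example, whether they were stratified).

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Each sample is used for testing exactly once across the 5 folds.
for fold_no, (train_idx, test_idx) in enumerate(k_fold_indices(23, k=5), start=1):
    print(f"fold {fold_no}: {len(train_idx)} train, {len(test_idx)} test")
```

In a model comparison such as the one described, each candidate classifier would be fitted on the training indices and scored on the test indices of every fold, and the fold scores averaged.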

A Study on the Timing of Starting Pitcher Replacement Using Machine Learning (머신러닝을 활용한 선발 투수 교체시기에 관한 연구)

  • Noh, Seongjin;Noh, Mijin;Han, Mumoungcho;Um, Sunhyun;Kim, Yangsok
    • Smart Media Journal / v.11 no.2 / pp.9-17 / 2022
  • This study implements a predictive model that supports the decision to replace a starting pitcher before a crisis situation arises in a baseball game, using Major League Statcast data provided by Baseball Savant. First, the crisis situations that a starting pitcher faces in a game were derived through data exploration. Second, when the starting pitcher was replaced before the end of an inning, the training label was constructed as a replacement in the previous inning. Comparing the trained models, the model based on the ensemble method showed the highest predictive performance, with an F1-score of 65%. In practical terms, the proposed model can help increase the team's winning probability by replacing the starting pitcher before a crisis situation, and it gives the coach data-based support for strategic decisions during the game.
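For reference, the F1-score used to compare the models is the harmonic mean of precision and recall. The sketch below computes it for a binary "replace / do not replace" label; the function name and the label vectors are made up for illustration and do not come from the study.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = pitcher should be replaced, 0 = keep pitching.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
print(f"F1 = {f1_score(y_true, y_pred):.2f}")
```

F1 is a common choice when the positive class (here, a replacement) is rare, because plain accuracy can look high even for a model that never predicts it.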

Analysis on Climate Zone Shifts over Asia under Global Warming using CMIP6 Projections (CMIP6 기반 전지구 기온상승에 따른 아시아 지역 기후대 변화분석)

  • Kim, Jeong-Bae;Bae, Deg-Hyo
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.37-37 / 2021
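  • Asia is home to 60% of the world's population and contains a mix of diverse climate zones. Climate zones are commonly used to characterize a region's overall climate and available water resources. Global warming is intensifying regional climate variability, which is expected to cause rapid climate zone shifts. In this study, we analyzed how climate zones over Asia change with global warming, based on the AR6 climate change scenarios. Ensemble climate change scenarios were produced using CMIP6 GCMs and the Shared Socioeconomic Pathway scenarios (SSP1-2.6 and SSP5-8.5). Observations and scenario data were used to estimate future global warming levels (1.5℃ to 5.0℃) relative to the pre-industrial period. The climate change scenarios were downscaled with a statistical downscaling method, and climate zones were classified according to climate characteristics using the Köppen climate classification. We then analyzed the distribution of, and changes in, climate zones over Asia under each global warming level. Climate zone changes across Asia accelerated as global temperature rose, and this held under all SSP scenarios and GCMs. The magnitude of global warming was larger under SSP5-8.5 than under SSP1-2.6, and the same 1.5℃ and 2.0℃ warming levels were reached markedly earlier under SSP5-8.5. The area undergoing climate zone shifts increased with global warming, and the change was largest under the 5.0℃ (SSP5-8.5) warming level. Under the same warming level, however, the area and spatial pattern of climate zone change were similar regardless of the SSP scenario. As temperature rises, tropical and arid climate regions in Asia were projected to expand, whereas temperate, cold, and polar climate regions were projected to shrink. The future climate zone changes over Asia under each warming level derived in this study can serve as baseline data for regional climate change impact assessments.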
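To make the Köppen classification step concrete, the sketch below assigns one of the five major groups (A tropical, B arid, C temperate, D cold, E polar) from twelve monthly mean temperatures and precipitation totals. It is a simplified illustration: the thresholds follow one widely used formulation of the scheme, subtypes are ignored, and the function name and sample inputs are hypothetical. The abstract does not state which variant or thresholds the authors applied.

```python
def koppen_major_group(temps_c, precip_mm):
    """Return the Köppen major group (A-E) from 12 monthly values (Jan..Dec)."""
    if len(temps_c) != 12 or len(precip_mm) != 12:
        raise ValueError("expected 12 monthly values for each variable")
    mat = sum(temps_c) / 12.0           # mean annual temperature (deg C)
    annual_p = sum(precip_mm)           # annual precipitation (mm)
    t_hot, t_cold = max(temps_c), min(temps_c)

    # "Summer" is taken as the warmer of the two half-years.
    apr_sep = [3, 4, 5, 6, 7, 8]
    oct_mar = [9, 10, 11, 0, 1, 2]
    warm_half = apr_sep if (sum(temps_c[i] for i in apr_sep)
                            >= sum(temps_c[i] for i in oct_mar)) else oct_mar
    share = (sum(precip_mm[i] for i in warm_half) / annual_p) if annual_p > 0 else 0.5

    # Aridity threshold depends on the seasonality of precipitation.
    if share >= 0.7:
        p_threshold = 2 * mat + 28      # summer-dominated rainfall
    elif share <= 0.3:
        p_threshold = 2 * mat           # winter-dominated rainfall
    else:
        p_threshold = 2 * mat + 14

    if annual_p < 10 * p_threshold:     # arid is checked first in this sketch
        return "B"
    if t_cold >= 18:
        return "A"
    if t_hot < 10:
        return "E"
    return "C" if t_cold > 0 else "D"


# Hypothetical grid cell: warm all year with abundant rainfall.
print(koppen_major_group([26] * 12, [200] * 12))
```

Applying such a function to each grid cell for a baseline period and for each warming level, then comparing the resulting labels, is one way to count the cells whose climate zone shifts.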

Development of AI-based Smart Agriculture Early Warning System

  • Hyun Sim;Hyunwook Kim
    • Journal of the Korea Society of Computer and Information / v.28 no.12 / pp.67-77 / 2023
  • This study develops a deep learning-based disease and pest detection model for the smart farm environment and applies it to an Intelligent Internet of Things (IoT) platform, exploring new possibilities for implementing digital agricultural environments. At its core, the research integrates pseudo-labeling, recent ImageNet models such as RegNet and EfficientNet, and preprocessing methods to detect various diseases and pests in complex agricultural environments with high accuracy. Ensemble learning techniques were applied to maximize the accuracy and stability of the model, which was evaluated using performance indicators such as mean Average Precision (mAP), precision, recall, accuracy, and box loss. Additionally, the SHAP framework was used to gain a deeper understanding of the model's prediction criteria, making the decision-making process more transparent. This analysis provided significant insight into how the model weighs various variables when detecting diseases and pests.

Development of a Type 2 Diabetes Prediction Algorithm Based on Big Data (빅데이터 기반 2형 당뇨 예측 알고리즘 개발)

  • Hyun Sim;HyunWook Kim
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.5 / pp.999-1008 / 2023
  • Early prediction of chronic diseases such as diabetes is important, and improving the accuracy of diabetes prediction is especially so. Various machine learning and deep learning methods are being introduced for diabetes prediction, but they require large amounts of data to outperform other methods, and their training cost is high because of complex data models. In this study, we aim to verify the claim that a DNN using the Pima dataset and k-fold cross-validation reduces the efficiency of diabetes diagnosis models. Machine learning classification methods such as decision trees, SVM, random forests, logistic regression, KNN, and various ensemble techniques were used to determine which algorithm produces the best prediction results. After training and testing all classification models, the proposed system gave the best results with an XGBoost classifier combined with the ADASYN method, with an accuracy of 81%, an F1 score of 0.81, and an AUC of 0.84. Additionally, a domain adaptation method was implemented to demonstrate the versatility of the proposed system. An explainable AI approach using the LIME and SHAP frameworks was implemented to understand how the model predicts the final outcome.

Model Interpretation through LIME and SHAP Model Sharing (LIME과 SHAP 모델 공유에 의한 모델 해석)

  • Yong-Gil Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.2 / pp.177-184 / 2024
  • As data grows rapidly, all kinds of complex ensemble and deep learning algorithms are used to achieve the highest accuracy. How these models predict, classify, recognize, and track unknown data is sometimes questionable. Making this behavior interpretable has been, and will remain, a goal of intensive research and development in the data science community. A variety of factors, such as lack of data, imbalanced data, and biased data, can affect the decisions rendered by learning models. Many models for such interpretation are gaining traction. LIME and SHAP, two state-of-the-art open-source explainability techniques, are now commonly used. However, their outputs can differ. In this context, this study introduces a technique that couples LIME and SHAP, and demonstrates its potential for analyzing the decisions made by LightGBM and Keras models when classifying transactions as fraudulent on the IEEE CIS dataset.
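SHAP attributions are based on Shapley values from cooperative game theory. The sketch below computes exact Shapley values for a tiny hand-written model by enumerating all feature subsets. It is only a conceptual illustration and does not use the `shap` or `lime` libraries, and it is not the coupling technique proposed in the paper. The toy model, the instance, and the choice of an all-zero baseline to represent "absent" features are assumptions.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values; value_fn maps a frozenset of feature indices to a payoff."""
    n = n_features
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size that excludes feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def toy_model(x):
    """Hypothetical model with one interaction term."""
    return 2 * x[0] + 3 * x[1] + x[0] * x[2]


instance = (1.0, 2.0, 4.0)
baseline = (0.0, 0.0, 0.0)   # assumed stand-in for "feature absent"


def value_fn(subset):
    # Features in the subset take the instance value; the rest use the baseline.
    x = [instance[j] if j in subset else baseline[j] for j in range(3)]
    return toy_model(x)


phi = shapley_values(value_fn, 3)
print([round(v, 6) for v in phi])
```

The attributions sum to the difference between the model output for the instance and for the baseline, which is the additivity property that SHAP explanations rely on. Exact enumeration scales exponentially with the number of features, which is why practical tools approximate these values.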