• Title/Summary/Keyword: Ensemble prediction

Learning Wind Speed Forecast Model based on Numeric Prediction Algorithm (수치 예측 알고리즘 기반의 풍속 예보 모델 학습)

  • Kim, Se-Young;Kim, Jeong-Min;Ryu, Kwang-Ryel
    • Journal of the Korea Society of Computer and Information, v.20 no.3, pp.19-27, 2015
  • Wind power generation technologies have been developed over the past 20 years as part of the push for alternative energy. Wind power is environmentally friendly and economical because it uses naturally occurring wind as its energy resource. Operating wind power generation efficiently requires accurate prediction of wind speed, which changes from moment to moment. It is important not only to predict wind speed well on average but also to minimize the largest absolute error between the actual and predicted wind speed. For generation planning, minimizing the largest absolute error matters for building a flexible operating plan, because the difference between predicted and actual power causes economic loss. In this paper, we propose a wind speed prediction method that uses a forecast model based on a numeric prediction algorithm. The model analyzes the wind speed forecast issued by the Meteorological Administration, pattern values that capture the seasonal properties of wind speed, and the trend of past wind speeds. The Meteorological Administration forecast covers a comparatively wide area that includes the wind farm, yet it contributes considerably to prediction accuracy. The experimental results also demonstrate that the more finely the wind speed is analyzed, the greater the accuracy obtained.
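
The distinction the abstract draws between average accuracy and the largest absolute error can be illustrated with a minimal sketch. The wind speeds and both forecasts below are hypothetical values chosen for illustration, not data from the paper, and the function names are ours.

```python
def mean_absolute_error(actual, predicted):
    """Average size of the forecast error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def max_absolute_error(actual, predicted):
    """Largest single forecast error (the worst case)."""
    return max(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical wind speeds in m/s (illustrative only).
observed = [5.0, 6.0, 7.0, 8.0]
forecast_a = [5.5, 6.5, 7.5, 8.5]   # off by 0.5 at every step
forecast_b = [5.0, 6.0, 7.0, 10.0]  # exact except for one 2.0 miss

for name, forecast in [("A", forecast_a), ("B", forecast_b)]:
    mae = mean_absolute_error(observed, forecast)
    worst = max_absolute_error(observed, forecast)
    print(f"forecast {name}: mean abs error = {mae}, max abs error = {worst}")
```

Both forecasts have the same mean absolute error, but forecast B has a much larger worst-case error, which is the quantity the abstract argues an operating plan is sensitive to.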

A Study on the Timing of Spring Onset over the Republic of Korea Using Ensemble Empirical Mode Decomposition (앙상블 경험적 모드 분해법을 이용한 우리나라 봄 시작일에 관한 연구)

  • Kwon, Jaeil;Choi, Youngeun
    • Journal of the Korean Geographical Society, v.49 no.5, pp.675-689, 2014
  • This study applied Ensemble Empirical Mode Decomposition (EEMD), a new methodology, to define the timing of spring onset over the Republic of Korea and to examine its spatio-temporal change. The study also identified the relationship between spring onset timing and atmospheric variations, and determined the synoptic factors that affect the timing of spring onset. The average spring onset date for the period 1974-2011 was March 11 in the Republic of Korea. In general, spring onset was later at higher latitudes and altitudes, and later in inland regions than in coastal ones. A correlation analysis was carried out to find the factors that affect spring onset timing; global annual mean temperature, the Arctic Oscillation (AO), and the Siberian High were significantly correlated with it. A multiple regression analysis was conducted with these three indices, and the model explained 64.7% of the variation in spring onset timing. In the regression, annual mean temperature had the greatest effect, followed by the AO. A synoptic analysis was then carried out to find the synoptic factors affecting spring onset timing; it indicated that the intensity of the meridional circulation is the major factor.

A Comparative Analysis of Ensemble Learning-Based Classification Models for Explainable Term Deposit Subscription Forecasting (설명 가능한 정기예금 가입 여부 예측을 위한 앙상블 학습 기반 분류 모델들의 비교 분석)

  • Shin, Zian;Moon, Jihoon;Rho, Seungmin
    • The Journal of Society for e-Business Studies, v.26 no.3, pp.97-117, 2021
  • Predicting term deposit subscriptions is a representative financial marketing task in banks, and banks can build a prediction model using various customer information. Many studies based on machine learning techniques have been conducted to improve the classification accuracy for term deposit subscriptions. However, even if these models achieve satisfactory performance, using them in industry is not easy when their decision-making process is not adequately explained. To address this issue, this paper proposes an explainable scheme for term deposit subscription forecasting. We first construct several classification models using decision tree-based ensemble learning methods, which yield excellent performance on tabular data, such as random forest, gradient boosting machine (GBM), extreme gradient boosting (XGB), and light gradient boosting machine (LightGBM). We then analyze their classification performance in depth through 10-fold cross-validation. After that, we provide the rationale for interpreting the influence of customer information and the decision-making process by applying Shapley additive explanation (SHAP), an explainable artificial intelligence technique, to the best classification model. To verify the practicality and validity of our scheme, experiments were conducted with the bank marketing dataset provided by Kaggle; we applied SHAP to the GBM and LightGBM models, respectively, according to different dataset configurations and then performed their analysis and visualization for explainable term deposit subscriptions.
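
The idea behind Shapley additive explanation can be sketched with a brute-force computation of exact Shapley values. This is not the SHAP library and not the authors' pipeline: it enumerates every feature ordering, which is only feasible for a handful of features, whereas SHAP uses much faster algorithms for tree ensembles. The `toy_model` scoring function and the feature names are hypothetical.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: average each feature's marginal contribution
    over every possible ordering of the features."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition = set()
        for f in order:
            before = value_fn(frozenset(coalition))
            coalition.add(f)
            after = value_fn(frozenset(coalition))
            totals[f] += after - before
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_model(present):
    """Hypothetical score given the set of features that are 'known'."""
    score = 0.0
    if "balance" in present:
        score += 2.0
    if "age" in present:
        score += 1.0
    # Interaction: contact history only helps when the balance is known.
    if "balance" in present and "contact" in present:
        score += 1.0
    return score


phi = shapley_values(toy_model, ["age", "balance", "contact"])
print(phi)
# The attributions add up to the gap between the full and the empty model.
print(sum(phi.values()), toy_model(frozenset(phi)) - toy_model(frozenset()))
```

The interaction term is split evenly between `balance` and `contact`, and the attributions sum to the difference between the full-model score and the empty-model score, which is the additivity property that makes the values usable as an explanation.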

Development of Realtime Dam's Hydrologic Variables Prediction Model using Observed Data Assimilation and Reservoir Operation Techniques (관측자료 동화기법과 댐운영을 고려한 실시간 댐 수문량 예측모형 개발)

  • Lee, Byong Ju;Jung, Il-Won;Jung, Hyun-Sook;Bae, Deg Hyo
    • Journal of Korea Water Resources Association, v.46 no.7, pp.755-765, 2013
  • This study developed a real-time dam hydrologic variables prediction model (DHVPM) and evaluated its performance in simulating historical dam inflow and outflow in the Chungju dam basin. The DHVPM consists of the Sejong University River Forecast (SURF) model for hydrologic modeling and an auto reservoir operation method (Auto ROM) for dam operation. The SURF model is a continuous rainfall-runoff model with data assimilation using an ensemble Kalman filter technique. Four extreme events, comprising the maximum inflow of each year from 2006 to 2009, were selected to examine the performance of DHVPM. The statistical criteria (relative error in peak flow, root mean square error, and model efficiency) demonstrated that DHVPM with data assimilation simulates inflow closer to the observations than DHVPM without data assimilation at 1-hour lead time, except for the relative error in peak flow in 2007. In particular, DHVPM with data assimilation reduced the biases in the inflow forecast attributed to observed precipitation error up to a 10-hour lead time. In conclusion, DHVPM with data assimilation can be useful for improving the accuracy of inflow forecasts in basins where real-time observed inflow is available.
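
The ensemble Kalman filter step mentioned in the abstract can be sketched for a single scalar state. This is a generic stochastic (perturbed-observation) update, not the SURF model: it assumes the state is the inflow itself and is observed directly, and all numbers (ensemble size, inflow values, error variances) are hypothetical.

```python
import random
from statistics import mean, variance


def enkf_update(ensemble, observation, obs_var, rng):
    """One stochastic ensemble Kalman filter analysis step for a scalar
    state that is observed directly (observation operator = identity)."""
    prior_var = variance(ensemble)            # sample variance of the forecast
    gain = prior_var / (prior_var + obs_var)  # scalar Kalman gain
    obs_std = obs_var ** 0.5
    # Each member is nudged toward its own noisy copy of the observation.
    return [
        x + gain * (observation + rng.gauss(0.0, obs_std) - x)
        for x in ensemble
    ]


rng = random.Random(0)
# Hypothetical forecast ensemble of inflow (m^3/s), biased low.
prior = [rng.gauss(100.0, 20.0) for _ in range(200)]
posterior = enkf_update(prior, observation=140.0, obs_var=25.0, rng=rng)

print("prior mean / variance:    ", mean(prior), variance(prior))
print("posterior mean / variance:", mean(posterior), variance(posterior))
```

Because the forecast spread is much larger than the assumed observation error, the gain is close to 1 and the analysis ensemble moves most of the way toward the observed inflow while its spread shrinks.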

Development of daily spatio-temporal downscaling model with conditional Copula based bias-correction of GloSea5 monthly ensemble forecasts (조건부 Copula 함수 기반의 월단위 GloSea5 앙상블 예측정보 편의보정 기법과 연계한 일단위 시공간적 상세화 모델 개발)

  • Kim, Yong-Tak;Kim, Min Ji;Kwon, Hyun-Han
    • Journal of Korea Water Resources Association, v.54 no.12, pp.1317-1328, 2021
  • This study aims to provide a predictive model based on climate models for simulating continuous daily rainfall sequences by combining bias-correction and spatio-temporal downscaling approaches. To this end, the study proposes a combined modeling system that applies a conditional Copula and a Multisite Non-stationary Hidden Markov Model (MNHMM). The GloSea5 system releases the monthly rainfall prediction on the same day every week; however, there are noticeable differences between successive updated predictions. It was confirmed that the monthly rainfall forecasts are effectively updated with the Copula-based bias-correction approach. More specifically, the proposed bias-correction approach was validated for the period from 1991 to 2010 under the LOOCV scheme. Several rainfall statistics, such as rainfall amounts, consecutive rainfall frequency, consecutive zero rainfall frequency, and wet days, are well reproduced, so the output is expected to be highly effective as input data for a hydrological model. The difference in spatial coherence between the observed and simulated rainfall sequences across all weather stations was estimated to be in the range of -0.02 to 0.10, and the interdependence between rainfall stations in the watershed was effectively reproduced. Therefore, the hydrological response of the watershed is expected to be simulated more realistically when the output is used as input data for the hydrological model.

An Assessment of Applicability of Heat Waves Using Extreme Forecast Index in KMA Climate Prediction System (GloSea5) (기상청 현업 기후예측시스템(GloSea5)에서의 극한예측지수를 이용한 여름철 폭염 예측 성능 평가)

  • Heo, Sol-Ip;Hyun, Yu-Kyung;Ryu, Young;Kang, Hyun-Suk;Lim, Yoon-Jin;Kim, Yoonjae
    • Atmosphere, v.29 no.3, pp.257-267, 2019
  • This study assesses the applicability of the Extreme Forecast Index (EFI) algorithm of the ECMWF seasonal forecast system to the Global Seasonal Forecasting System version 5 (GloSea5), the operational seasonal forecast system of the Korea Meteorological Administration (KMA). The EFI is based on the difference between the Cumulative Distribution Function (CDF) curves of the model's climate data and of the current ensemble forecast distribution, which is essential for diagnosing predictability in extreme cases. To investigate its applicability, an experiment was conducted for heat-wave cases (1994 and 2003), comparing the EFI based on GloSea5 hindcast data with ERA-Interim anomaly data. The data were also used to determine quantitative estimates of Probability Of Detection (POD), False Alarm Ratio (FAR), and spatial pattern correlation. The results showed that the area where ERA-Interim indicated a temperature anomaly above 4°C corresponded to the area with an EFI of 0.8 and above. POD was high (0.7 and 0.9, respectively) when the ERA-Interim anomalies were at their highest (on Jul. 11, 1994 (> 5°C) and Aug. 8, 2003 (> 7°C), respectively). The spatial pattern showed a high correlation in the range of 0.5 to 0.9. However, the correlation decreased as the lead time increased. Furthermore, a case study of the 2018 Korean heat wave using GloSea5 forecast data, conducted to validate the EFI, showed successful prediction at lead times of two to three weeks. As a result, EFI forecasts can be used to predict the probability that an extreme weather event of interest might occur. Overall, we expect these results to be useful for extreme weather forecasting.
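
The two verification scores named in the abstract follow the standard contingency-table definitions: POD = hits / (hits + misses) and FAR = false alarms / (hits + false alarms). The sketch below applies them to a few grid points. The EFI and anomaly values are hypothetical; the 0.8 and 4-degree thresholds are taken from the abstract's description, and treating them as event thresholds here is our assumption.

```python
def pod_far(forecast_event, observed_event):
    """Probability Of Detection and False Alarm Ratio from paired
    yes/no forecasts and observations."""
    pairs = list(zip(forecast_event, observed_event))
    hits = sum(1 for f, o in pairs if f and o)
    misses = sum(1 for f, o in pairs if not f and o)
    false_alarms = sum(1 for f, o in pairs if f and not o)
    pod = hits / (hits + misses) if hits + misses else float("nan")
    far = (false_alarms / (hits + false_alarms)
           if hits + false_alarms else float("nan"))
    return pod, far


# Hypothetical values at six grid points (illustrative only).
efi = [0.90, 0.85, 0.30, 0.82, 0.10, 0.95]
anomaly = [5.2, 4.5, 4.1, 2.0, 0.5, 6.0]   # temperature anomaly, degrees C

forecast_event = [value >= 0.8 for value in efi]   # EFI flags an extreme
observed_event = [value > 4.0 for value in anomaly]  # reanalysis shows one

pod, far = pod_far(forecast_event, observed_event)
print(f"POD = {pod}, FAR = {far}")
```

In this toy case three of the four observed events are flagged (POD 0.75) and one of the four flagged points did not verify (FAR 0.25).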

A Development of Defeat Prediction Model Using Machine Learning in Polyurethane Foaming Process for Automotive Seat (머신러닝을 활용한 자동차 시트용 폴리우레탄 발포공정의 불량 예측 모델 개발)

  • Choi, Nak-Hun;Oh, Jong-Seok;Ahn, Jong-Rok;Kim, Key-Sun
    • Journal of the Korea Academia-Industrial cooperation Society, v.22 no.6, pp.36-42, 2021
  • With recent developments in the Fourth Industrial Revolution, the manufacturing industry has changed rapidly. Through super-connections and super-intelligence, key aspects of the Fourth Industrial Revolution, machine learning will be able to make fault predictions during the foam-making process. Polyol and isocyanate are the components of polyurethane foam. There has been considerable research on how the specific mixture ratio and temperature can affect the characteristics of the products. Based on these characteristics, this study collects data on each factor during the foam-making process and applies machine learning to them in order to predict faults. The machine learning algorithms used are the decision tree, kNN, and an ensemble algorithm, and these algorithms learn from 5,147 cases. On 1,000 pieces of validation data, the learning results show up to 98.5% accuracy using the ensemble algorithm. Therefore, faults in currently produced parts can be identified by collecting real-time data on each factor during the foam-making process. Furthermore, controlling each of the factors may improve the fault rate.

Study on Predicting the Designation of Administrative Issue in the KOSDAQ Market Based on Machine Learning Based on Financial Data (머신러닝 기반 KOSDAQ 시장의 관리종목 지정 예측 연구: 재무적 데이터를 중심으로)

  • Yoon, Yanghyun;Kim, Taekyung;Kim, Suyeong
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship, v.17 no.1, pp.229-249, 2022
  • This paper investigates machine learning models for predicting the designation of administrative issues in the KOSDAQ market through various techniques. When a company in the Korean stock market is designated as an administrative issue, the market treats the event itself as negative information, causing losses to the company and investors. The purpose of this study is to evaluate alternative methods for developing an artificial intelligence service that detects the possibility of designation as an administrative issue early from companies' financial ratios and helps investors manage portfolio risks. In this study, 21 financial ratios representing profitability, stability, activity, and growth are used as independent variables. Financial data of administrative issue and non-administrative issue companies are sampled from 2011 to 2020, the period in which K-IFRS was applied. Logistic regression analysis, decision tree, support vector machine, random forest, and LightGBM are used to predict the designation of administrative issues. According to the results, LightGBM is the best prediction model with 82.73% classification accuracy, and the decision tree has the lowest classification accuracy at 71.94%. An examination of the top three variables by importance in the decision tree-based learning models shows that the financial variables common to each model are ROE (Net profit) and Capital stock turnover ratio, which are relatively important variables in designating administrative issues. In general, the learning models using an ensemble had higher predictive performance than the single learning models.

Deep Learning-Based Box Office Prediction Using the Image Characteristics of Advertising Posters in Performing Arts (공연예술에서 광고포스터의 이미지 특성을 활용한 딥러닝 기반 관객예측)

  • Cho, Yujung;Kang, Kyungpyo;Kwon, Ohbyung
    • The Journal of Society for e-Business Studies, v.26 no.2, pp.19-43, 2021
  • Predicting box office performance is an important issue for the performing arts industry and its institutions. For this purpose, traditional prediction methods and data mining methods using standardized data such as cast members, performance venues, and ticket prices have been proposed. However, although audiences evidently tend to form their intentions based on the performance guide poster, few attempts have been made to predict box office performance by analyzing poster images. Hence, the purpose of this study is to propose a deep learning method that can predict box office success from performance-related poster images. Prediction was performed with deep learning algorithms such as Pure CNN, VGG-16, Inception-v3, and ResNet50, using poster images published on KOPIS as the training data set. In addition, an ensemble with a traditional regression analysis method was also attempted. As a result, the approach showed high discrimination performance, with box office prediction accuracy exceeding 85%. This study is the first attempt to predict box office success using image data in the performing arts field, and the proposed method can be applied to areas of poster-based advertising such as institutional promotions and corporate product advertisements.

Corporate Bankruptcy Prediction Model using Explainable AI-based Feature Selection (설명가능 AI 기반의 변수선정을 이용한 기업부실예측모형)

  • Gundoo Moon;Kyoung-jae Kim
    • Journal of Intelligence and Information Systems, v.29 no.2, pp.241-265, 2023
  • A corporate insolvency prediction model serves as a vital tool for objectively monitoring the financial condition of companies. It enables timely warnings, facilitates responsive actions, and supports the formulation of effective management strategies to mitigate bankruptcy risks and enhance performance. Investors and financial institutions utilize default prediction models to minimize financial losses. As the interest in utilizing artificial intelligence (AI) technology for corporate insolvency prediction grows, extensive research has been conducted in this domain. However, there is an increasing demand for explainable AI models in corporate insolvency prediction, emphasizing interpretability and reliability. The SHAP (SHapley Additive exPlanations) technique has gained significant popularity and has demonstrated strong performance in various applications. Nonetheless, it has limitations such as computational cost, processing time, and scalability concerns based on the number of variables. This study introduces a novel approach to variable selection that reduces the number of variables by averaging SHAP values from bootstrapped data subsets instead of using the entire dataset. This technique aims to improve computational efficiency while maintaining excellent predictive performance. To obtain classification results, we aim to train random forest, XGBoost, and C5.0 models using carefully selected variables with high interpretability. The classification accuracy of the ensemble model, generated through soft voting as the goal of high-performance model design, is compared with the individual models. The study leverages data from 1,698 Korean light industrial companies and employs bootstrapping to create distinct data groups. Logistic Regression is employed to calculate SHAP values for each data group, and their averages are computed to derive the final SHAP values. The proposed model enhances interpretability and aims to achieve superior predictive performance.
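
Soft voting, which the abstract uses to combine the individual classifiers, averages the class probabilities predicted by each model and picks the class with the highest average. The sketch below contrasts it with majority (hard) voting. The three probability vectors are hypothetical stand-ins for the outputs of the random forest, XGBoost, and C5.0 models, not results from the study.

```python
def soft_vote(prob_lists):
    """Average class probabilities across models; return (class, averages)."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    averages = [
        sum(probs[c] for probs in prob_lists) / n_models
        for c in range(n_classes)
    ]
    return averages.index(max(averages)), averages


def hard_vote(prob_lists):
    """Majority vote over each model's most probable class."""
    votes = [probs.index(max(probs)) for probs in prob_lists]
    return max(set(votes), key=votes.count)


# Hypothetical [P(healthy), P(insolvent)] for one company from three models.
model_outputs = [
    [0.55, 0.45],  # model 1: weakly 'healthy'
    [0.55, 0.45],  # model 2: weakly 'healthy'
    [0.10, 0.90],  # model 3: strongly 'insolvent'
]

soft_class, averages = soft_vote(model_outputs)
hard_class = hard_vote(model_outputs)
print("soft vote:", soft_class, averages)
print("hard vote:", hard_class)
```

Here two models lean weakly toward "healthy" and one is confident of "insolvent", so hard voting returns class 0 while soft voting returns class 1. Soft voting lets a confident model outweigh several uncertain ones, which is one reason it is often preferred when the base models output calibrated probabilities.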