• Title/Summary/Keyword: Ensemble prediction


Improving an Ensemble Model Using Instance Selection Method (사례 선택 기법을 활용한 앙상블 모형의 성능 개선)

  • Min, Sung-Hwan
    • Journal of Korean Society of Industrial and Systems Engineering / v.39 no.1 / pp.105-115 / 2016
  • Ensemble classification combines individually trained classifiers to yield more accurate predictions than individual models. Ensemble techniques are very useful for improving the generalization ability of classifiers. The random subspace ensemble technique is a simple but effective method for constructing ensemble classifiers; it involves randomly drawing a subset of the features for each classifier in the ensemble. The instance selection technique involves selecting critical instances while removing irrelevant and noisy instances from the original dataset. The instance selection and random subspace methods are both well known in the field of data mining and have proven to be very effective in many applications. However, few studies have focused on integrating the two methods. Therefore, this study proposes a new hybrid ensemble model that integrates instance selection and random subspace techniques using genetic algorithms (GAs) to improve the performance of a random subspace ensemble model. GAs are used to select optimal (or near-optimal) instances, which are used as input data for the random subspace ensemble model. The proposed model was applied to both Kaggle credit data and corporate credit data, and the results were compared with those of other models in terms of classification accuracy, levels of diversity, and average classification rates of base classifiers in the ensemble. The experimental results demonstrated that the proposed model outperformed the other models, including the single model, the instance selection model, and the original random subspace ensemble model.
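
The random subspace idea described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the nearest-centroid base classifier, the function names, and the `apply_instance_mask` helper are all assumptions, and the GA that would search for a good instance mask is not shown.

```python
import random
from collections import Counter

def apply_instance_mask(X, y, mask):
    """Keep only the instances flagged True (a GA would search for this mask; assumed here)."""
    kept = [(row, label) for row, label, keep in zip(X, y, mask) if keep]
    return [row for row, _ in kept], [label for _, label in kept]

def train_centroid(X, y, features):
    """Hypothetical base learner: per-class centroids over the chosen feature indices."""
    sums = {}
    counts = Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, f in enumerate(features):
            acc[j] += row[f]
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict_centroid(centroids, features, row):
    """Return the label whose centroid is closest in the member's feature subspace."""
    def squared_distance(centroid):
        return sum((row[f] - centroid[j]) ** 2 for j, f in enumerate(features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

def random_subspace_ensemble(X, y, n_members=15, subspace_size=2, seed=0):
    """Train each member on its own randomly drawn subset of the features."""
    rng = random.Random(seed)
    n_features = len(X[0])
    members = []
    for _ in range(n_members):
        features = rng.sample(range(n_features), subspace_size)
        members.append((features, train_centroid(X, y, features)))
    return members

def predict_ensemble(members, row):
    """Combine the members' predictions by majority vote."""
    votes = Counter(predict_centroid(c, f, row) for f, c in members)
    return votes.most_common(1)[0][0]
```

In the hybrid scheme the abstract describes, the filtered data returned by something like `apply_instance_mask` would be passed to `random_subspace_ensemble` in place of the full dataset.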

Seasonal Prediction of Tropical Cyclone Frequency in the Western North Pacific using GDAPS Ensemble Prediction System (GDAPS 앙상블 예보 시스템을 이용한 북서태평양에서의 태풍 발생 계절 예측)

  • Kim, Ji-Sun;Kwon, H. Joe
    • Atmosphere / v.17 no.3 / pp.269-279 / 2007
  • This study investigates the possibility of seasonal prediction of tropical cyclone activity in the western North Pacific using a dynamical modeling approach. We use data from the SMIP/HFP (Seasonal Prediction Model Inter-comparison Project/Historical Forecast Project) experiment with the Korea Meteorological Administration's GDAPS (Global Data Assimilation and Prediction System) T106 model, focusing our analysis on model-generated tropical cyclones. It is found that the prediction depends primarily on the tropical cyclone (TC) detection criteria. Additionally, a scaling factor and a different weight for each ensemble member are found to be essential for the best predictions of summertime TC activity. This approach shows a certain skill not only in the category forecast but also in standard verification measures such as the Brier score and relative operating characteristics (ROC).
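
Two of the ingredients named in this abstract, a weighted and scaled ensemble combination and the Brier score, can be sketched as below. The function names and the way the scaling factor is applied are assumptions for illustration; the abstract does not give the paper's actual weights or scaling.

```python
def scaled_weighted_mean(member_counts, weights, scale=1.0):
    """Weighted ensemble mean of member TC counts, multiplied by a scaling factor (assumed form)."""
    if len(member_counts) != len(weights):
        raise ValueError("one weight is needed per ensemble member")
    total_weight = sum(weights)
    weighted = sum(c * w for c, w in zip(member_counts, weights)) / total_weight
    return scale * weighted

def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probabilities and binary outcomes (0 or 1)."""
    if len(probabilities) != len(outcomes):
        raise ValueError("one outcome is needed per forecast")
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(probabilities)
```

A Brier score of 0 corresponds to perfect probability forecasts; lower is better.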

Impact of Ensemble Member Size on Confidence-based Selection in Bankruptcy Prediction (부도예측을 위한 확신 기반의 선택 접근법에서 앙상블 멤버 사이즈의 영향에 관한 연구)

  • Kim, Na-Ra;Shin, Kyung-Shik;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.55-71 / 2013
  • The prediction model is the main factor affecting the performance of a knowledge-based system for bankruptcy prediction. Earlier studies on prediction modeling have focused on the building of a single best model using statistical and artificial intelligence techniques. However, since the mid-1980s, integration of multiple techniques (hybrid techniques) and, by extension, combinations of the outputs of several models (ensemble techniques) have, according to the experimental results, generally outperformed individual models. An ensemble is a technique that constructs a set of multiple models, combines their outputs, and produces one final prediction. The way in which the outputs of ensemble members are combined is one of the important issues affecting prediction accuracy. A variety of combination schemes have been proposed in order to improve prediction performance in ensembles. Each combination scheme has advantages and limitations, and can be influenced by domain and circumstance. Accordingly, decisions on the most appropriate combination scheme in a given domain and contingency are very difficult. This paper proposes a confidence-based selection approach as part of an ensemble bankruptcy-prediction scheme that can measure unified confidence, even if ensemble members produce different types of continuous-valued outputs. The present experimental results show that when varying the number of models to combine, according to the creation type of ensemble members, the proposed combination method offers the best performance in the ensemble having the largest number of models, even when compared with the methods most often employed in bankruptcy prediction.
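
A confidence-based selection rule of the kind this abstract describes might look like the sketch below. The abstract does not say how the unified confidence is computed, so the measure used here (distance of a member's output from its decision threshold, scaled by that member's output range) is purely an assumption, as are the function names and the tuple layout.

```python
def confidence(score, low, high, threshold):
    """Hypothetical unified confidence in [0, 1]; assumes low < threshold < high."""
    span = (high - threshold) if score >= threshold else (threshold - low)
    return abs(score - threshold) / span

def select_by_confidence(outputs):
    """outputs: list of (score, low, high, threshold), one tuple per ensemble member.

    Returns the class (1 or 0) predicted by the most confident member.
    """
    score, _, _, threshold = max(outputs, key=lambda o: confidence(*o))
    return 1 if score >= threshold else 0
```

Scaling by each member's own output range is what lets members with different output types (for example, a probability in [0, 1] and a margin in [-1, 1]) be compared on one scale.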

Bankruptcy prediction using ensemble SVM model (앙상블 SVM 모형을 이용한 기업 부도 예측)

  • Choi, Ha Na;Lim, Dong Hoon
    • Journal of the Korean Data and Information Science Society / v.24 no.6 / pp.1113-1125 / 2013
  • Corporate bankruptcy prediction has long been an important topic in accounting and finance. Several data mining techniques have been used for bankruptcy prediction. However, a single model has many limitations when applied to real classification problems. This study proposes an ensemble SVM (support vector machine) model that combines SVM models with different kernel functions. The ensemble model is built and evaluated using a v-fold cross-validation approach. The k top-performing models are recruited into the ensemble, and classification is then carried out by majority vote of the ensemble. We investigate the performance of the ensemble SVM classifier in terms of accuracy, error rate, sensitivity, specificity, ROC curve, and AUC, and compare it with single SVM classifiers on a financial ratios dataset and a simulation dataset. The results confirm the advantages of our method: it is robust while providing good performance.
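
The selection and voting steps described in this abstract can be sketched as follows. The kernel names and cross-validation scores in the usage note are made up for illustration, and the function names are assumptions; training the SVMs themselves is not shown.

```python
from collections import Counter

def top_k_models(cv_scores, k):
    """Return the names of the k models with the highest cross-validation score."""
    return sorted(cv_scores, key=cv_scores.get, reverse=True)[:k]

def majority_vote(predictions):
    """Return the most common predicted label among the ensemble members."""
    return Counter(predictions).most_common(1)[0][0]

def ensemble_predict(cv_scores, member_predictions, k):
    """Vote only among the k best models; member_predictions maps model name to label."""
    chosen = top_k_models(cv_scores, k)
    return majority_vote([member_predictions[name] for name in chosen])
```

For example, with hypothetical scores `{"rbf": 0.88, "linear": 0.81, "poly": 0.79, "sigmoid": 0.70}` and `k = 3`, the sigmoid-kernel model is left out of the vote. Using an odd `k` avoids ties in a two-class problem.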

Ensemble techniques and hybrid intelligence algorithms for shear strength prediction of squat reinforced concrete walls

  • Mohammad Sadegh Barkhordari;Leonardo M. Massone
    • Advances in Computational Design / v.8 no.1 / pp.37-59 / 2023
  • Squat reinforced concrete (SRC) shear walls are a critical part of the structure for both office/residential buildings and nuclear structures due to their significant role in withstanding seismic loads. Despite this, empirical formulae in current design standards and published studies show considerable disparity in predicting SRC wall shear strength. The goal of this research is to develop and evaluate hybrid and ensemble artificial neural network (ANN) models. State-of-the-art population-based algorithms are used for the hybrid intelligence algorithms. Six models are developed: Honey Badger Algorithm (HBA) with ANN (HBA-ANN), Hunger Games Search with ANN (HGS-ANN), fitness-distance balance coyote optimization algorithm (FDB-COA) with ANN (FDB-COA-ANN), Averaging Ensemble (AE) neural network, Snapshot Ensemble (SE) neural network, and Stacked Generalization (SG) ensemble neural network. A total of 434 test results of SRC walls are used to train and assess the models. The results reveal that the SG model not only minimizes prediction variance but also produces predictions (with $R^2 = 0.99$) that are superior to those of the other models.

Performance Comparison Analysis of Artificial Intelligence Models for Estimating Remaining Capacity of Lithium-Ion Batteries

  • Kyu-Ha Kim;Byeong-Soo Jung;Sang-Hyun Lee
    • International Journal of Advanced Culture Technology / v.11 no.3 / pp.310-314 / 2023
  • The purpose of this study is to predict the remaining capacity of lithium-ion batteries and to compare the performance of five artificial intelligence models: linear regression, decision tree, random forest, neural network, and an ensemble model. Measured data (in Excel format) from the CS2 lithium-ion battery were used, and the prediction accuracy of each model was measured using mean square error, mean absolute error, coefficient of determination, and root mean square error. The root mean square error (RMSE) was 0.045 for the linear regression model, 0.038 for the decision tree model, 0.034 for the random forest model, 0.032 for the neural network model, and 0.030 for the ensemble model. The ensemble model had the best prediction performance, with the neural network model taking second place. The decision tree and random forest models also performed quite well, while the linear regression model showed relatively poor prediction performance compared to the other models. It was therefore concluded that ensemble and neural network models should be prioritized in order to improve the efficiency of battery management and energy systems.
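
The four evaluation indicators named in this abstract follow their standard definitions and can be computed as in the sketch below. The function name and the dictionary layout are assumptions; no data from the paper is reproduced.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, RMSE, and the coefficient of determination (R^2)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_total = sum((t - mean_true) ** 2 for t in y_true)  # must be non-zero for R^2
    ss_residual = mse * n
    return {
        "mse": mse,
        "mae": mae,
        "rmse": math.sqrt(mse),
        "r2": 1.0 - ss_residual / ss_total,
    }
```

Because RMSE is the square root of MSE, ranking models by either one gives the same order, which is why the abstract can report the comparison with RMSE alone.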

On successive machine learning process for predicting strength and displacement of rectangular reinforced concrete columns subjected to cyclic loading

  • Bu-seog Ju;Shinyoung Kwag;Sangwoo Lee
    • Computers and Concrete / v.32 no.5 / pp.513-525 / 2023
  • Recently, research on predicting the behavior of reinforced concrete (RC) columns using machine learning methods has been actively conducted. However, most studies have focused on predicting the ultimate strength of RC columns using a regression algorithm. Therefore, this study develops a successive machine learning process for predicting multiple nonlinear behaviors of rectangular RC columns. This process consists of three stages: single machine learning, bagging ensemble, and stacking ensemble. In the case of strength prediction, sufficient prediction accuracy is confirmed even in the first stage. In the case of displacement, although sufficient accuracy is not achieved in the first and second stages, the stacking ensemble model in the third stage performs better than the machine learning models in the first and second stages. In addition, the performance of the final prediction models is verified by comparing the backbone curves and hysteresis loops obtained from predicted outputs with actual experimental data.

Predictability Study of Snowfall Case over South Korea Using TIGGE Data on 28 December 2012 (TIGGE 자료를 이용한 2012년 12월 28일 한반도 강설사례 예측성 연구)

  • Lee, Sang-Min;Han, Sang-Un;Won, Hye Young;Ha, Jong-Chul;Lee, Jeong-Soon;Sim, Jae-Kwan;Lee, Yong Hee
    • Atmosphere / v.24 no.1 / pp.1-15 / 2014
  • This study compared ensemble mean and probability forecasts of snow depth associated with a winter storm over South Korea on 28 December 2012 from five operational forecast centers (CMA, ECMWF, NCEP, KMA, and UKMO). The causes of the differences in predicted snow depth among the Ensemble Prediction Systems (EPSs) were investigated using THe Observing system Research and Predictability EXperiment (THORPEX) Interactive Grand Global Ensemble (TIGGE) data. This snowfall event occurred due to a low pressure system passing through the South Sea of Korea. The 6-hr accumulated snow depth exceeded 10 cm over the southern region of South Korea. In this case study, ECMWF showed the best prediction skill for the spatio-temporal distribution of snow depth. First, the ECMWF EPS consistently strengthened the signal present in its ensemble mean snow depth forecasts from the 7-day lead time onward. Second, its ensemble probabilities of exceeding 2~5 cm per 6 hours coincided with the observed frequencies. This snowfall case could be predicted from a 5-day lead time by using the 10-day lagged ensemble mean distribution of 6-hr accumulated snow depth. In addition, the good performance of the ECMWF EPS in predicting snow depth was due to its outstanding ability to predict the formation of an inversion layer with temperatures below $0^{\circ}C$ at low levels (below 850 hPa) along $35^{\circ}N$ at 1-day lead time.
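
The two forecast products compared in this abstract, the ensemble mean and the probability of exceeding a threshold, can be illustrated for a single grid point as below. The function names are assumptions, and any member values passed in would be hypothetical rather than TIGGE data.

```python
def ensemble_mean(member_values):
    """Average of the member forecasts at one grid point."""
    return sum(member_values) / len(member_values)

def exceedance_probability(member_values, threshold):
    """Fraction of members forecasting a value strictly above the threshold."""
    hits = sum(1 for value in member_values if value > threshold)
    return hits / len(member_values)
```

With 6-hr snow depth forecasts in cm from each member, `exceedance_probability(members, 5.0)` gives the ensemble probability of more than 5 cm, which is the kind of quantity the abstract compares against observed frequencies.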

Typhoon Wukong (200610) Prediction Based on The Ensemble Kalman Filter and Ensemble Sensitivity Analysis (앙상블 칼만 필터를 이용한 태풍 우쿵 (200610) 예측과 앙상블 민감도 분석)

  • Park, Jong Im;Kim, Hyun Mee
    • Atmosphere / v.20 no.3 / pp.287-306 / 2010
  • An ensemble Kalman filter (EnKF) with the Weather Research and Forecasting (WRF) Model is applied to Typhoon Wukong (200610) to investigate how the performance of ensemble forecasts depends on the experimental configuration of the EnKF. In addition, ensemble sensitivity analysis is applied to the forecast and analysis ensembles generated by the EnKF to investigate the possibility of using it as guidance for adaptive observations. Various experimental configurations are tested by changing the model error, ensemble size, assimilation time window, covariance relaxation, and covariance localization in the EnKF. First, experiments using a different physical parameterization scheme for each ensemble member show smaller root mean square error than those using a single physics scheme for all forecast ensemble members, which implies that accounting for model error is beneficial for obtaining better forecasts. A larger ensemble size is also more beneficial than a smaller one. For the assimilation time window, the experiment using a less frequent window shows better results than that using a more frequent window, which is associated with the availability of observational data in this study. Therefore, incorporating model error, a larger ensemble size, and a less frequent assimilation window into the EnKF is beneficial for better prediction of Typhoon Wukong (200610). Covariance relaxation and localization are relatively less beneficial to the forecasts than the factors mentioned above. The ensemble sensitivity analysis shows that the sensitive regions for adaptive observations can be determined by the sensitivity of the forecast measure of interest to the initial ensembles. In addition, the sensitivities calculated by the ensemble sensitivity analysis can be explained by dynamical relationships established among wind, temperature, and pressure.
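
For readers unfamiliar with the EnKF analysis step, the sketch below shows it in its simplest possible setting: a single scalar state that is observed directly, using the perturbed-observation variant. This is an assumption-laden toy, not the WRF-based configuration of the paper, and it omits the covariance relaxation and localization that the abstract tests.

```python
import random
import statistics

def enkf_update(forecast_members, observation, obs_variance, seed=0):
    """Perturbed-observation EnKF update for a directly observed scalar state.

    The Kalman gain is P_f / (P_f + R), where P_f is the sample variance of the
    forecast ensemble (needs at least two members) and R is the observation
    error variance.
    """
    rng = random.Random(seed)
    forecast_variance = statistics.variance(forecast_members)
    gain = forecast_variance / (forecast_variance + obs_variance)
    obs_sd = obs_variance ** 0.5
    analysis = []
    for x in forecast_members:
        perturbed_obs = observation + rng.gauss(0.0, obs_sd)  # one draw per member
        analysis.append(x + gain * (perturbed_obs - x))
    return analysis
```

When the observation error variance is small relative to the ensemble variance, the gain approaches 1 and the members are pulled toward the observation; when it is large, the members stay close to the forecast.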

Comparison of Ensemble Perturbations using Lorenz-95 Model: Bred vectors, Orthogonal Bred vectors and Ensemble Transform Kalman Filter (ETKF) (로렌쯔-95 모델을 이용한 앙상블 섭동 비교: 브레드벡터, 직교 브레드벡터와 앙상블 칼만 필터)

  • Chung, Kwan-Young;Barker, Dale;Moon, Sun-Ok;Jeon, Eun-Hee;Lee, Hee-Sang
    • Atmosphere / v.17 no.3 / pp.217-230 / 2007
  • Using the Lorenz-95 simple model, which can simulate many atmospheric characteristics, we compare the performance of ensemble strategies: bred vectors, rotated bred vectors (made orthogonal to each bred member), and the Ensemble Transform Kalman Filter (ETKF). The performance metrics used are the RMSE of the ensemble mean, the ratio of the RMS error of the ensemble mean to the ensemble spread, rank histograms to see whether the ensemble members represent the true probability density function (pdf) well, and the distribution of eigenvalues of the forecast ensemble, which provides useful information on the independence of each member. The orthogonal bred vectors show considerable improvement over the bred vectors in all aspects: RMSE, spread, and independence of members. When the bred vectors are rotated for orthogonalization, the improvement rate for the ensemble spread is almost double that for the RMS error of the ensemble mean, compared with the non-rotated bred vectors on the simple model. This result appears consistent with a tentative test on the operational model at KMA. In conclusion, ETKF is superior to the other two methods on all of the assessment measures we used for ensemble prediction. However, we cannot decide which perturbation strategy is better with respect to the structure of the background error covariance. Further studies are needed on the best perturbation method for hybrid variational data assimilation that accounts for the error of the day (EOTD).
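
The Lorenz-95 model used in this study is commonly written as dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F on a cyclic domain. The sketch below integrates it with a fourth-order Runge-Kutta step and computes an RMSE-to-spread ratio. The forcing F = 8, the time step, and the exact spread definition are common choices assumed here; the abstract does not state the paper's settings, and the perturbation methods themselves are not implemented.

```python
def lorenz95_tendency(x, forcing=8.0):
    """Right-hand side of the Lorenz-95 equations with cyclic boundary conditions."""
    n = len(x)
    # Negative indices wrap around in Python, which handles x[i-1] and x[i-2].
    return [
        (x[(i + 1) % n] - x[i - 2]) * x[i - 1] - x[i] + forcing
        for i in range(n)
    ]

def rk4_step(x, dt=0.05, forcing=8.0):
    """Advance the state by one fourth-order Runge-Kutta step."""
    def advance(k, scale):
        return [xi + scale * ki for xi, ki in zip(x, k)]
    k1 = lorenz95_tendency(x, forcing)
    k2 = lorenz95_tendency(advance(k1, dt / 2), forcing)
    k3 = lorenz95_tendency(advance(k2, dt / 2), forcing)
    k4 = lorenz95_tendency(advance(k3, dt), forcing)
    return [
        xi + dt / 6 * (a + 2 * b + 2 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]

def rmse_to_spread_ratio(members, truth):
    """RMSE of the ensemble mean divided by the ensemble spread (assumed definition)."""
    m, n = len(members), len(truth)
    mean = [sum(member[i] for member in members) / m for i in range(n)]
    rmse = (sum((mean[i] - truth[i]) ** 2 for i in range(n)) / n) ** 0.5
    spread = (
        sum((member[i] - mean[i]) ** 2 for member in members for i in range(n))
        / ((m - 1) * n)
    ) ** 0.5
    return rmse / spread
```

A ratio near 1 is usually taken to mean that the ensemble spread is a fair estimate of the error of the ensemble mean; values well above 1 indicate an under-dispersive ensemble.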