• Title/Summary/Keyword: Ensemble model

Representation of Model Uncertainty in the Short-Range Ensemble Prediction for Typhoon Rusa (2002) (단기 앙상블 예보에서 모형의 불확실성 표현: 태풍 루사)

  • Kim, Sena;Lim, Gyu-Ho
    • Atmosphere / v.25 no.1 / pp.1-18 / 2015
  • The most objective way to overcome the limitations of a numerical weather prediction model is to represent forecast uncertainty through probabilistic forecasts. Uncertainty in a numerical weather prediction system arises from the parameterization of unresolved-scale motions and from energy losses in sub-scale physical processes. In this study, we focused on the growth of model errors and performed ensemble forecasts to represent model uncertainty. We simulated Typhoon Rusa (2002) with the multi-physics scheme (PHYS) and the stochastic kinetic energy backscatter scheme (SKEBS) and assessed the performance of the two schemes. Both schemes produced better ensemble mean track forecasts than the control run: PHYS improved the results by 28% and SKEBS by 7%. The ensemble mean errors of both schemes increased rapidly at a forecast time of 84 hours, while both ensemble spreads increased gradually during integration. SKEBS represented model errors very well up to a forecast time of 96 hours, after which it became under-dispersive. The PHYS simulation overestimated the ensemble mean error during integration and represented the real situation well at a forecast time of 120 hours. The displacement speed of the typhoon in PHYS was closest to the best track, especially after landfall. In the sensitivity tests of the model uncertainty of SKEBS, the ensemble mean forecast was sensitive to the physics parameterization. Adjusting the forcing parameter of SKEBS improved the ensemble spread, ensemble mean error, and moving speed relative to the default experiment.
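
The comparison of ensemble spread with ensemble mean error described in this abstract can be illustrated with a minimal sketch. It assumes storm positions are given as planar (x, y) coordinates in kilometres at a single forecast time; the function names and the ratio test are illustrative and are not taken from the paper.

```python
import math

def ensemble_mean(positions):
    """Average the member positions (x, y) at one forecast time."""
    n = len(positions)
    return (sum(p[0] for p in positions) / n, sum(p[1] for p in positions) / n)

def mean_track_error(positions, observed):
    """Distance between the ensemble mean position and the observed position."""
    mx, my = ensemble_mean(positions)
    return math.hypot(mx - observed[0], my - observed[1])

def ensemble_spread(positions):
    """Root-mean-square distance of the members from the ensemble mean."""
    mx, my = ensemble_mean(positions)
    total = sum((x - mx) ** 2 + (y - my) ** 2 for x, y in positions)
    return math.sqrt(total / len(positions))

def dispersion_ratio(positions, observed):
    """Spread divided by mean error; values below 1 suggest under-dispersion."""
    error = mean_track_error(positions, observed)
    return float("inf") if error == 0.0 else ensemble_spread(positions) / error

# Hypothetical member positions (km) and a hypothetical best-track position.
members = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
best_track = (1.0, 4.0)
print(dispersion_ratio(members, best_track))
```

A ratio near 1 corresponds to a spread that matches the error of the ensemble mean; a ratio well below 1 corresponds to the under-dispersive pattern mentioned for SKEBS after 96 hours.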

Ensemble Classification Method for Efficient Medical Diagnostic (효율적인 의료진단을 위한 앙상블 분류 기법)

  • Jung, Yong-Gyu;Heo, Go-Eun
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.10 no.3 / pp.97-102 / 2010
  • Medical data mining seeks efficient algorithms and techniques that increase the reliability of classification across various diseases. Previous studies used algorithms based on a single model, and multi-model ensemble techniques are now being applied to improve classification accuracy. This paper proposes I-ENSEMBLE, which extends the scope of existing ensemble techniques to improve the reliability of prediction on medical data. When the technique was applied experimentally to data for the diagnosis of hypothyroidism, the representative ensemble techniques Bagging, Boosting, and Stacking each showed significantly improved accuracy compared with the existing approach. In addition, the effect was more pronounced when multi-modeling was applied than with traditional single-model techniques and ensemble techniques.

Development and Evaluation of the High Resolution Limited Area Ensemble Prediction System in the Korea Meteorological Administration (기상청 고해상도 국지 앙상블 예측 시스템 구축 및 성능 검증)

  • Kim, SeHyun;Kim, Hyun Mee;Kay, Jun Kyung;Lee, Seung-Woo
    • Atmosphere / v.25 no.1 / pp.67-83 / 2015
  • Predicting the location and intensity of precipitation remains a major issue in numerical weather prediction (NWP). Resolution is a very important component of precipitation forecasts in NWP. Compared with a lower resolution model, a higher resolution model can predict small-scale (i.e., storm-scale) precipitation and depict convective structures more precisely. In addition, an ensemble technique can improve precipitation forecasts because it estimates the uncertainties associated with them. Therefore, NWP that uses both a higher resolution model and an ensemble technique is expected to better represent the inherent uncertainties of convective-scale motion and lead to improved forecasts. In this study, a limited area ensemble prediction system for the convective-scale (i.e., high resolution) operational Unified Model (UM) at the Korea Meteorological Administration (KMA) was developed and evaluated using ensemble forecasts for August 2012. The model domain covers a limited area over the Korean Peninsula. The system showed good skill in predicting precipitation, wind, and temperature at the surface as well as meteorological variables at 500 and 850 hPa. To investigate which combination of horizontal resolution and ensemble size is most skillful, the system was run with three horizontal resolutions (1.5, 2, and 3 km) and three ensemble sizes (8, 12, and 16 members), and the forecasts from the experiments were evaluated. To assess the quantitative precipitation forecast (QPF) skill of the system, the precipitation forecasts for two heavy rainfall cases during the study period were analyzed using the Fractions Skill Score (FSS) and the Probability Matching (PM) method. The PM method was effective in representing the intensity of precipitation, and the FSS was effective in verifying the precipitation forecasts of the high resolution limited area ensemble prediction system at KMA.
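
For readers unfamiliar with the Fractions Skill Score, the following is a minimal sketch of its usual definition: one minus the squared difference of forecast and observed neighbourhood fractions, normalised by the sum of their squares. The square neighbourhood, the zero treatment of points outside the domain, and the names `threshold` and `half_width` are assumptions made for illustration and do not describe the KMA verification setup.

```python
def neighborhood_fractions(field, threshold, half_width):
    """Fraction of grid points at or above threshold in a square window.

    Points outside the domain count as non-events (an assumption).
    """
    rows, cols = len(field), len(field[0])
    binary = [[1.0 if v >= threshold else 0.0 for v in row] for row in field]
    window_size = (2 * half_width + 1) ** 2
    fractions = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for di in range(-half_width, half_width + 1):
                for dj in range(-half_width, half_width + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < rows and 0 <= jj < cols:
                        total += binary[ii][jj]
            fractions[i][j] = total / window_size
    return fractions

def fractions_skill_score(forecast, observed, threshold, half_width):
    """FSS = 1 - sum((Pf - Po)^2) / (sum(Pf^2) + sum(Po^2))."""
    pf = neighborhood_fractions(forecast, threshold, half_width)
    po = neighborhood_fractions(observed, threshold, half_width)
    num = sum((f - o) ** 2 for rf, ro in zip(pf, po) for f, o in zip(rf, ro))
    den = sum(f ** 2 + o ** 2 for rf, ro in zip(pf, po) for f, o in zip(rf, ro))
    return float("nan") if den == 0.0 else 1.0 - num / den

# Hypothetical 3x3 rainfall fields (mm): the forecast is displaced by two columns.
forecast = [[0, 0, 0], [5, 0, 0], [0, 0, 0]]
observed = [[0, 0, 0], [0, 0, 5], [0, 0, 0]]
print(fractions_skill_score(forecast, observed, threshold=1, half_width=0))
print(fractions_skill_score(forecast, observed, threshold=1, half_width=1))
```

With a wider neighbourhood the displaced forecast is penalised less, which is why the score is commonly used for high resolution precipitation forecasts.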

Wind Prediction with a Short-range Multi-Model Ensemble System (단시간 다중모델 앙상블 바람 예측)

  • Yoon, Ji Won;Lee, Yong Hee;Lee, Hee Choon;Ha, Jong-Chul;Lee, Hee Sang;Chang, Dong-Eon
    • Atmosphere / v.17 no.4 / pp.327-337 / 2007
  • In this study, we examined a new ensemble training approach to reduce the systematic error and improve the prediction skill for wind using the Short-range Ensemble prediction system (SENSE), a mesoscale multi-model ensemble prediction system. SENSE has 16 ensemble members based on MM5, WRF ARW, and WRF NMM. We evaluated the skill of surface wind prediction against AWS (Automatic Weather Station) observations during the summer season (June to August 2006). In the first stage, the initial state of each member was corrected with respect to the observed values. The corrected members then went through a training stage to determine an adaptive weight function formulated in terms of the Root Mean Square Vector Error (RMSVE). Sensitivity experiments on the training interval showed that the optimal training period was 1 day. The resulting weighted ensemble average showed smaller errors in the spatial and temporal patterns of wind speed than the simple ensemble average.
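
A weighted ensemble average driven by RMSVE can be sketched as follows. The abstract does not give the form of the adaptive weight function, so the inverse-error weighting used here is an assumption for illustration only; winds are (u, v) pairs in m/s and all names are hypothetical.

```python
import math

def rmsve(forecast, observed):
    """Root mean square vector error over paired (u, v) wind samples."""
    squared = [(fu - ou) ** 2 + (fv - ov) ** 2
               for (fu, fv), (ou, ov) in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))

def inverse_error_weights(errors, eps=1e-9):
    """Normalised weights proportional to 1 / RMSVE (assumed weight function)."""
    inverse = [1.0 / (e + eps) for e in errors]
    total = sum(inverse)
    return [w / total for w in inverse]

def weighted_mean_wind(member_winds, weights):
    """Combine one (u, v) forecast per member into a weighted average."""
    u = sum(w * wind[0] for w, wind in zip(weights, member_winds))
    v = sum(w * wind[1] for w, wind in zip(weights, member_winds))
    return (u, v)

# Hypothetical training period: two members verified against three observations.
obs = [(1.0, 0.0), (2.0, 1.0), (3.0, 1.0)]
member_a = [(1.5, 0.0), (2.5, 1.0), (3.5, 1.0)]   # small systematic error
member_b = [(3.0, 0.0), (4.0, 1.0), (5.0, 1.0)]   # larger systematic error
weights = inverse_error_weights([rmsve(member_a, obs), rmsve(member_b, obs)])
print(weights)
print(weighted_mean_wind([(4.0, 2.0), (6.0, 2.0)], weights))
```

Members that verified better during the training period receive larger weights, in contrast to the simple ensemble average where every weight is 1/N.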

Simultaneous Optimization of KNN Ensemble Model for Bankruptcy Prediction (부도예측을 위한 KNN 앙상블 모형의 동시 최적화)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.139-157 / 2016
  • Bankruptcy involves considerable costs, so it can have significant effects on a country's economy. Thus, bankruptcy prediction is an important issue. Over the past several decades, many researchers have addressed topics associated with bankruptcy prediction. Early research on bankruptcy prediction employed conventional statistical methods such as univariate analysis, discriminant analysis, multiple regression, and logistic regression. Later on, many studies began utilizing artificial intelligence techniques such as inductive learning, neural networks, and case-based reasoning. Currently, ensemble models are being utilized to enhance the accuracy of bankruptcy prediction. Ensemble classification involves combining multiple classifiers to obtain more accurate predictions than those obtained using individual models. Ensemble learning techniques are known to be very useful for improving the generalization ability of the classifier. Base classifiers in the ensemble must be as accurate and diverse as possible in order to enhance the generalization ability of an ensemble model. Commonly used methods for constructing ensemble classifiers include bagging, boosting, and random subspace. The random subspace method selects a random feature subset for each classifier from the original feature space to diversify the base classifiers of an ensemble. Each ensemble member is trained by a randomly chosen feature subspace from the original feature set, and predictions from each ensemble member are combined by an aggregation method. The k-nearest neighbors (KNN) classifier is robust with respect to variations in the dataset but is very sensitive to changes in the feature space. For this reason, KNN is a good classifier for the random subspace method. The KNN random subspace ensemble model has been shown to be very effective for improving an individual KNN model. 
The k parameter of KNN base classifiers and the feature subsets selected for base classifiers play an important role in determining the performance of the KNN ensemble model. However, few studies have focused on optimizing the k parameter and feature subsets of base classifiers in the ensemble. This study proposed a new ensemble method that improves the performance of the KNN ensemble model by optimizing both the k parameters and the feature subsets of base classifiers. A genetic algorithm was used to optimize the KNN ensemble model and improve its prediction accuracy. The proposed model was applied to a bankruptcy prediction problem by using a real dataset from Korean companies. The research data included 1800 externally non-audited firms that filed for bankruptcy (900 cases) or non-bankruptcy (900 cases). Initially, the dataset consisted of 134 financial ratios. Prior to the experiments, 75 financial ratios were selected based on an independent sample t-test of each financial ratio as an input variable and bankruptcy or non-bankruptcy as an output variable. Of these, 24 financial ratios were selected by using a logistic regression backward feature selection method. The complete dataset was separated into two parts: training and validation. The training dataset was further divided into two portions: one for training the model and the other for avoiding overfitting. The prediction accuracy on the latter portion was used as the fitness value in order to avoid overfitting. The validation dataset was used to evaluate the effectiveness of the final model. A 10-fold cross-validation was implemented to compare the performance of the proposed model with that of other models in terms of classification accuracy. The Q-statistic values and average classification accuracies of base classifiers were also investigated.
The experimental results showed that the proposed model outperformed other models, such as the single model and random subspace ensemble model.
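
The KNN random subspace ensemble described in this abstract can be sketched in a few lines. This sketch omits the genetic algorithm: the per-member `k` values and feature subsets are plain inputs, which is where an optimizer such as the one in the paper would act. The function names and the toy data are hypothetical.

```python
import random
from collections import Counter

def knn_predict(train_x, train_y, query, k, features):
    """Classify query by majority vote of its k nearest rows,
    measuring squared Euclidean distance on the given feature indices only."""
    distances = sorted(
        (sum((row[f] - query[f]) ** 2 for f in features), label)
        for row, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

def build_random_subspaces(n_features, n_members, subspace_size, seed=0):
    """Draw one random feature subset per ensemble member."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_features), subspace_size))
            for _ in range(n_members)]

def ensemble_predict(train_x, train_y, query, subspaces, ks):
    """Aggregate the member predictions by simple majority vote."""
    votes = Counter(
        knn_predict(train_x, train_y, query, k, features)
        for features, k in zip(subspaces, ks)
    )
    return votes.most_common(1)[0][0]

# Toy data with four features: class 0 near the origin, class 1 near (10, 10, 10, 10).
train_x = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 0, 1, 1],
           [10, 9, 10, 9], [9, 10, 9, 10], [10, 10, 9, 9]]
train_y = [0, 0, 0, 1, 1, 1]
subspaces = build_random_subspaces(n_features=4, n_members=5, subspace_size=2, seed=1)
ks = [1, 3, 1, 3, 1]  # one k per member; the paper tunes these together with the subsets
print(ensemble_predict(train_x, train_y, [9, 9, 9, 9], subspaces, ks))
```

Because each member sees a different feature subset, the members disagree more than KNN models trained on all features would, which is the diversity the random subspace method relies on.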

Automatic Fruit Grading Using Stacking Ensemble Model Based on Visual and Physical Features (시각적 특징과 물리적 특징에 기반한 스태킹 앙상블 모델을 이용한 과일의 자동 선별)

  • Kim, Min-Ki
    • Journal of Korea Multimedia Society / v.25 no.10 / pp.1386-1394 / 2022
  • As consumption of high-quality fruits increases and sales and packaging units become smaller, the demand for automatic fruit grading systems is increasing. Compared to other crops, the quality of fruit is determined by visual characteristics such as shape, color, and scratches, rather than just physical size and weight. Accordingly, this study presents a CNN model that effectively extracts and classifies the visual features of fruits and a perceptron that classifies fruits using physical features, and proposes a stacking ensemble model that effectively combines the classification results of these two neural networks. Experiments with AI Hub public data show that the stacking ensemble model is effective for grading fruits. However, the ensemble model does not always improve grading performance for every fruit, so the model needs to be adapted to the kind of fruit.
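
The stacking idea in this abstract, a second-level model that combines the outputs of two base classifiers, can be sketched as follows. The sketch assumes a binary grade, takes each base model's class-1 probability as given, and uses a small logistic-regression meta-learner trained by gradient descent. The abstract does not describe the paper's combiner at this level, so every name, hyperparameter, and number here is hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_meta_learner(base_probs, labels, lr=0.5, steps=2000):
    """Fit logistic-regression weights on the base models' class-1 probabilities."""
    n_inputs = len(base_probs[0])
    weights = [0.0] * n_inputs
    bias = 0.0
    n = len(labels)
    for _ in range(steps):
        grad_w = [0.0] * n_inputs
        grad_b = 0.0
        for probs, y in zip(base_probs, labels):
            err = sigmoid(sum(w * p for w, p in zip(weights, probs)) + bias) - y
            for j, p in enumerate(probs):
                grad_w[j] += err * p
            grad_b += err
        # Batch gradient step on the mean log-loss.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def stacked_predict(weights, bias, probs):
    """Final class from the meta-learner applied to the base probabilities."""
    z = sum(w * p for w, p in zip(weights, probs)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0

# Hypothetical held-out outputs: (visual-model probability, physical-model probability).
base_probs = [(0.9, 0.8), (0.7, 0.9), (0.2, 0.1), (0.3, 0.2)]
labels = [1, 1, 0, 0]
weights, bias = train_meta_learner(base_probs, labels)
print(stacked_predict(weights, bias, (0.9, 0.8)))
```

In practice the meta-learner is trained on base-model outputs for data the base models did not see during their own training, so that it learns how far to trust each model.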

Optimization of Random Subspace Ensemble for Bankruptcy Prediction (재무부실화 예측을 위한 랜덤 서브스페이스 앙상블 모형의 최적화)

  • Min, Sung-Hwan
    • Journal of Information Technology Services / v.14 no.4 / pp.121-135 / 2015
  • Ensemble classification uses multiple classifiers instead of a single classifier. Ensemble classifiers have recently attracted much attention in the data mining community, and ensemble learning techniques have proved very useful for improving prediction accuracy. Bagging, boosting, and random subspace are the most popular ensemble methods. In random subspace, each base classifier is trained on a randomly chosen feature subspace of the original feature space, and the outputs of the base classifiers are usually aggregated by a simple majority vote. In this study, we applied the random subspace method to the bankruptcy problem and proposed a method for optimizing the random subspace ensemble. A genetic algorithm was used to optimize the classifier subset of the random subspace ensemble for bankruptcy prediction. The proposed genetic-algorithm-based random subspace ensemble model was applied to a bankruptcy prediction problem using a real data set and compared with other models. Experimental results showed that the proposed model outperformed the other models.

Predictability for Heavy Rainfall over the Korean Peninsula during the Summer using TIGGE Model (TIGGE 모델을 이용한 한반도 여름철 집중호우 예측 활용에 관한 연구)

  • Hwang, Yoon-Jeong;Kim, Yeon-Hee;Chung, Kwan-Young;Chang, Dong-Eon
    • Atmosphere / v.22 no.3 / pp.287-298 / 2012
  • The predictability of heavy precipitation over the Korean Peninsula is studied using THORPEX Interactive Grand Global Ensemble (TIGGE) data. The performance of six ensemble models is compared in terms of inconsistency (or jumpiness) and Root Mean Square Error (RMSE) for MSLP, T850, and H500. A Grand Ensemble (GE) is constructed from the three best ensemble models (ECMWF, UKMO, and CMA) with equal weights and without bias correction. The jumpiness calculated in this study indicates that the GE is more consistent than any single ensemble model. The Brier Score (BS) of precipitation also shows that the GE outperforms the single models. The GE is used for a case study of a heavy rainfall event over the Korean Peninsula on 9 July 2009. The probability forecast of precipitation using the 90 members of the GE and the percentage of the 90 members exceeding the 90th percentile of the climatological Probability Density Function (PDF) of observed precipitation are calculated. Because the GE shows excellent potential for detecting heavy rainfall, it is more skillful than a single ensemble model and can support heavy rainfall warnings in the medium range. If the performance of each single ensemble model improves, the GE can provide better performance as well.
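
The probability forecast and Brier Score mentioned in this abstract can be sketched as follows, assuming the forecast probability is simply the fraction of ensemble members exceeding a threshold. The threshold values and rainfall amounts are hypothetical, and the sketch does not reproduce the percentile-based climatological threshold used in the paper.

```python
def exceedance_probability(member_values, threshold):
    """Fraction of ensemble members whose value exceeds the threshold."""
    hits = sum(1 for value in member_values if value > threshold)
    return hits / len(member_values)

def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probability and outcome (0 or 1)."""
    pairs = list(zip(probabilities, outcomes))
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)

# Hypothetical 24 h rainfall (mm) from a four-member ensemble on two days.
day1 = [5.0, 20.0, 30.0, 1.0]
day2 = [40.0, 55.0, 35.0, 60.0]
probs = [exceedance_probability(day, threshold=10.0) for day in (day1, day2)]
observed_events = [1, 1]  # the threshold was exceeded on both days
print(probs, brier_score(probs, observed_events))
```

A lower Brier Score is better: 0 corresponds to a perfect set of probability forecasts, and a constant forecast of 0.5 scores 0.25.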

Ensemble Methods Applied to Classification Problem

  • Kim, ByungJoo
    • International Journal of Internet, Broadcasting and Communication / v.11 no.1 / pp.47-53 / 2019
  • The idea of ensemble learning is to train multiple models, each with the objective of predicting or classifying a set of results. Most errors in a model's learning come from three main factors: variance, noise, and bias. Ensemble methods increase the stability of the final model and reduce these errors. Combining many models reduces the variance, even when the individual models are not very accurate. In this paper, we propose an ensemble model and apply it to classification problems. On the iris, Pima Indian diabetes, and semiconductor fault detection problems, the proposed model classifies well compared with traditional single classifiers, namely logistic regression, SVM, and random forest.
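
The variance-reduction claim in this abstract can be made concrete under a simplifying assumption that the abstract itself does not state: the ensemble averages n model outputs f_i(x), each with the same variance σ² and the same pairwise correlation ρ. Under that assumption, the variance of the averaged prediction is:

```latex
\operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} f_i(x)\right)
  = \frac{1}{n^{2}}\left[\,n\,\sigma^{2} + n(n-1)\,\rho\,\sigma^{2}\right]
  = \rho\,\sigma^{2} + \frac{1-\rho}{n}\,\sigma^{2}
```

The second term shrinks as n grows, while the first term remains. This is why combining many models helps most when their errors are only weakly correlated.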

Assessment of the Prediction Performance of Ensemble Size-Related in GloSea5 Hindcast Data (기상청 기후예측시스템(GloSea5)의 과거기후장 앙상블 확대에 따른 예측성능 평가)

  • Park, Yeon-Hee;Hyun, Yu-Kyung;Heo, Sol-Ip;Ji, Hee-Sook
    • Atmosphere / v.31 no.5 / pp.511-523 / 2021
  • This study explores the optimal ensemble size to improve the prediction performance of the Korea Meteorological Administration's operational climate prediction system, global seasonal forecast system version 5 (GloSea5). GloSea5 produces an ensemble of hindcast data using the stochastic kinetic energy backscattering version 2 (SKEB2) scheme and a time-lagged ensemble. An experiment that increased the hindcast ensemble from 3 to 14 members for four initial dates was performed, and the resulting change in prediction performance was evaluated in terms of Root Mean Square Error (RMSE), Anomaly Correlation Coefficient (ACC), ensemble spread, and Ratio of Predictable Components (RPC). As the ensemble size increased, RMSE and ACC improved, more significantly in high-variability areas. The spread and RPC analyses also showed that the prediction accuracy of the system improved as the ensemble size increased. The closer the initial date, the better the predictive performance. The results show that increasing the ensemble to an appropriate size, considering the combination of initial times, is efficient.
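
Two of the scores named in this abstract, RMSE and ACC, can be sketched for an ensemble-mean forecast as follows. The sketch uses the uncentered form of the anomaly correlation (anomalies relative to a supplied climatology, without removing their mean), which is one common convention and is an assumption here. The function names and numbers are hypothetical and do not come from GloSea5.

```python
import math

def ensemble_mean(members):
    """Point-by-point average over a list of member forecasts."""
    n = len(members)
    return [sum(values) / n for values in zip(*members)]

def rmse(forecast, observed):
    """Root mean square error between two equally long sequences."""
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))

def anomaly_correlation(forecast, observed, climatology):
    """Uncentered ACC: correlation of forecast and observed anomalies."""
    fa = [f - c for f, c in zip(forecast, climatology)]
    oa = [o - c for o, c in zip(observed, climatology)]
    numerator = sum(a * b for a, b in zip(fa, oa))
    denominator = math.sqrt(sum(a * a for a in fa) * sum(b * b for b in oa))
    return numerator / denominator

# Hypothetical seasonal-mean temperatures at three grid points.
climatology = [10.0, 10.0, 10.0]
observed = [11.0, 9.0, 12.0]
members = [[12.5, 8.5, 13.0], [11.5, 7.5, 15.0]]
mean_forecast = ensemble_mean(members)  # [12.0, 8.0, 14.0]
print(rmse(mean_forecast, observed))
print(anomaly_correlation(mean_forecast, observed, climatology))
```

Repeating the calculation with ensemble means built from different numbers of members gives the kind of size-versus-skill comparison described in the abstract.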