• Title/Summary/Keyword: Ensemble system

Optimization of the Vertical Localization Scale for GPS-RO Data Assimilation within KIAPS-LETKF System (KIAPS 앙상블 자료동화 시스템을 이용한 GPS 차폐자료 연직 국지화 규모 최적화)

  • Jo, Youngsoon;Kang, Ji-Sun;Kwon, Hataek
    • Atmosphere / v.25 no.3 / pp.529-541 / 2015
  • The Korea Institute of Atmospheric Prediction System (KIAPS) has been developing a global numerical prediction model and data assimilation system. We have implemented an LETKF (Local Ensemble Transform Kalman Filter; Hunt et al., 2007) data assimilation system for NCAR CAM-SE (National Center for Atmospheric Research Community Atmosphere Model with Spectral Element dynamical core; Dennis et al., 2012), which uses a cubed-sphere grid, the same grid system as the KIAPS Integrated Model (KIM) now under development. In this study, we assimilated Global Positioning System Radio Occultation (GPS-RO) bending angle measurements in addition to conventional data within the ensemble-based data assimilation system. Before assimilating the bending angle data, we performed a vertical unit conversion: the vertical localization information for GPS-RO data is given in meters, whereas the vertical localization method in the LETKF system is based on pressure. After converting the vertical information accordingly, we conducted experiments to search for the best vertical localization scale for GPS-RO data under Observing System Simulation Experiments (OSSEs). As a result, we found the optimal vertical localization setting for GPS-RO bending angle data assimilation. We plan to apply the selected localization strategy to the LETKF system implemented for KIM, which is expected to give a better analysis from GPS-RO data assimilation owing to its much higher model top.
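The abstract does not say how the meter-to-pressure conversion was done. As a rough illustration only, the sketch below assumes an isothermal atmosphere with a constant scale height, so that a localization length in meters maps to a length in ln(p). The constants, the function names, and the Gaussian-shaped weight are all assumptions of this sketch, not the KIAPS-LETKF implementation.

```python
import math

P0_HPA = 1013.25          # assumed surface reference pressure (hPa)
SCALE_HEIGHT_M = 7000.0   # assumed constant scale height (m)

def height_to_pressure(z_m, p0=P0_HPA, h=SCALE_HEIGHT_M):
    """Pressure (hPa) at height z_m under the isothermal assumption p = p0 * exp(-z/H)."""
    return p0 * math.exp(-z_m / h)

def localization_m_to_lnp(scale_m, h=SCALE_HEIGHT_M):
    """Convert a vertical localization scale from meters to ln(p) units."""
    return scale_m / h

def gaussian_weight(z_obs_m, z_grid_m, scale_m):
    """Gaussian-shaped localization weight evaluated in ln(p) space (illustrative)."""
    d_lnp = math.log(height_to_pressure(z_grid_m) / height_to_pressure(z_obs_m))
    sigma = localization_m_to_lnp(scale_m)
    return math.exp(-0.5 * (d_lnp / sigma) ** 2)
```

Under these assumptions, an observation at 5 km with a 2 km localization scale gives a grid level at 7 km a weight of exp(-0.5), exactly what the same Gaussian would give in height coordinates. A real atmosphere is not isothermal, so the two coordinates diverge in practice.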

Bankruptcy prediction using an improved bagging ensemble (개선된 배깅 앙상블을 활용한 기업부도예측)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.121-139 / 2014
  • Predicting corporate failure has been an important topic in accounting and finance. The costs associated with bankruptcy are high, so the accuracy of bankruptcy prediction is of great importance to financial institutions. Many researchers have studied bankruptcy prediction over the past three decades. The current research uses ensemble models to improve the performance of bankruptcy prediction. Ensemble classification combines individually trained classifiers in order to obtain more accurate predictions than individual models. Ensemble techniques have been shown to be very useful for improving the generalization ability of a classifier. Bagging is one of the most commonly used methods for constructing ensemble classifiers. In bagging, different training data subsets are randomly drawn with replacement from the original training dataset, and base classifiers are trained on the different bootstrap samples. Instance selection chooses critical instances while removing irrelevant and harmful instances from the original set. Instance selection and bagging are both well known in data mining, but few studies have dealt with their integration. This study proposes an improved bagging ensemble based on instance selection using genetic algorithms (GA) for improving the performance of SVM. GA is an efficient optimization procedure based on the theory of natural selection and evolution. GA uses the idea of survival of the fittest by progressively accepting better solutions to the problem. GA searches by maintaining a population of solutions from which better solutions are created, rather than by making incremental changes to a single solution. The initial solution population is generated randomly and evolves into the next generation through genetic operators such as selection, crossover and mutation. The solutions, coded as strings, are evaluated by the fitness function.
The proposed model consists of two phases: GA-based instance selection and instance-based bagging. In the first phase, GA is used to select the optimal instance subset that serves as input data for the bagging model. The chromosome is encoded as a binary string representing the instance subset. In this phase, the population size was set to 100 and the maximum number of generations to 150. The crossover rate and mutation rate were set to 0.7 and 0.1, respectively. The prediction accuracy of the model was used as the fitness function of GA: an SVM model is trained on the training data set using the selected instance subset, and its prediction accuracy over the test data set is used as the fitness value in order to avoid overfitting. In the second phase, the optimal instance subset selected in the first phase was used as input data for the bagging model. An SVM model was used as the base classifier of the bagging ensemble, and majority voting was used as the combining method. This study applies the proposed model to the bankruptcy prediction problem using a real data set from Korean companies. The research data contain 1,832 externally non-audited firms, of which 916 filed for bankruptcy and 916 did not. Financial ratios categorized as stability, profitability, growth, activity and cash flow were investigated through literature review and basic statistical methods, and 8 financial ratios were selected as the final input variables. The whole data set was separated into three subsets: training, test and validation. The proposed model was compared with several other models, including a simple individual SVM model, a simple bagging model and an instance selection based SVM model. McNemar tests were used to examine whether the proposed model significantly outperforms the other models. The experimental results show that the proposed model outperforms the other models.
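The second phase described above (bootstrap replicates, one base classifier per replicate, majority voting) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the base learner is a one-dimensional threshold rule standing in for the SVM, the GA-selected instance subset is simply taken as the input `data`, and every function name is hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement (one bagging replicate)."""
    return [rng.choice(data) for _ in data]

def train_stump(sample):
    """Stand-in base learner (NOT an SVM): threshold halfway between class means."""
    pos = [x for x, y in sample if y == 1]
    neg = [x for x, y in sample if y == 0]
    if not pos or not neg:
        # Degenerate replicate with a single class: always predict that class.
        label = 1 if pos else 0
        return lambda x: label
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    threshold = (mean_pos + mean_neg) / 2
    sign = 1 if mean_pos > mean_neg else -1
    return lambda x: 1 if sign * (x - threshold) > 0 else 0

def majority_vote(votes):
    """Return the most frequent label among the base classifiers' votes."""
    return Counter(votes).most_common(1)[0][0]

def fit_bagging(data, n_models=11, seed=0):
    """Train n_models base learners, each on its own bootstrap replicate."""
    rng = random.Random(seed)
    return [train_stump(bootstrap_sample(data, rng)) for _ in range(n_models)]

def bagging_predict(models, x):
    """Combine the base learners' predictions by majority voting."""
    return majority_vote([m(x) for m in models])
```

An odd number of base learners is used here so that a binary majority vote cannot tie.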

Power Line Noise Reductions in ABR by Properly Chosen Iteration Numbers (ABR에서 반복회수 설정에 의한 전력선 잡음의 제거)

  • 안주현;김수찬;남기창;심윤주;김희남;송철규;김덕원
    • Journal of Biomedical Engineering Research / v.22 no.3 / pp.241-247 / 2001
  • The auditory brainstem response (ABR) is an audiometric test that objectively measures the hearing threshold level by acquiring electrical evoked potentials that arise in the auditory nervous system in response to auditory stimulation. However, the acquired potentials are strongly contaminated by power line noise and have an extremely low SNR, so an ensemble averaging algorithm is generally used. The purpose of this study was to investigate the effect of the number of iterations in ensemble averaging on the reduction of power line noise. The power line noise was modeled as a 60 Hz sinusoidal signal, and the energy of the modeled signal after averaging was calculated. Simulation verified that the energy has periodic zero points for each stimulation rate, and a 60 Hz signal induced by the power line was applied to the developed ABR system to confirm that the period of the zero-energy points matched that of the simulation. With a properly selected number of iterations, power line noise could be reduced and a more reliable ABR could be acquired.
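The periodic zero points can be reproduced with a short calculation. The sketch below assumes perfectly periodic stimuli and a stationary 60 Hz sinusoid, so the noise phase advances by a fixed step between sweeps and averaging N sweeps amounts to summing N equally spaced phasors. The stimulation rate of 13 per second and the function names are hypothetical, chosen only for illustration.

```python
import cmath
import math

def residual_amplitude(n_sweeps, stim_rate_hz, noise_hz=60.0):
    """Relative amplitude of a noise_hz sinusoid after averaging n_sweeps
    stimulus-locked sweeps (1.0 = no reduction, 0.0 = full cancellation)."""
    # Phase advance of the noise between consecutive stimuli.
    step = 2.0 * math.pi * noise_hz / stim_rate_hz
    total = sum(cmath.exp(1j * step * k) for k in range(n_sweeps))
    return abs(total) / n_sweeps

def zero_points(stim_rate_hz, max_sweeps, noise_hz=60.0, tol=1e-9):
    """Iteration numbers up to max_sweeps at which the averaged noise vanishes."""
    return [n for n in range(1, max_sweeps + 1)
            if residual_amplitude(n, stim_rate_hz, noise_hz) < tol]
```

With 13 stimuli per second, the averaged 60 Hz component vanishes whenever the iteration number is a multiple of 13, because the phasors then cover whole turns of the unit circle. The averaged energy is proportional to the square of this amplitude, so its zeros fall at the same iteration numbers.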

A Prediction of Precipitation Over East Asia for June Using Simultaneous and Lagged Teleconnection (원격상관을 이용한 동아시아 6월 강수의 예측)

  • Lee, Kang-Jin;Kwon, MinHo
    • Atmosphere / v.26 no.4 / pp.711-716 / 2016
  • Dynamical model forecasts using state-of-the-art general circulation models (GCMs) have some limitations in simulating the real climate system, since they do not depend on past history. One alternative method for correcting model errors is the canonical correlation analysis (CCA) correction method, in which model outputs are adjusted based on the CCA modes between the model forecasts and the observations. At present, CCA forecasts show better skill than dynamical model forecasts, especially over the midlatitudes. This study builds a canonical correlation prediction model for subseasonal (June) precipitation. The predictors are circulation fields over the western North Pacific from the Global Seasonal Forecasting System version 5 (GloSea5) and observed snow cover extent over the Eurasian continent from the Climate Data Record (CDR). The former is based on the simultaneous teleconnection between the western North Pacific and East Asia, and the latter on the lagged teleconnection between the Eurasian continent and East Asia. In addition, we suggest a technique for improving forecast skill by applying the ensemble canonical correlation (ECC) to individual canonical correlation predictions.

A multi-scale analysis of the interdecadal change in the Madden-Julian Oscillation (MJO의 다중스케일 분석을 통한 수십년 변동성)

  • Lee, Sang-Heon;Seo, Kyong-Hwan
    • Atmosphere / v.21 no.2 / pp.143-149 / 2011
  • A new multi-timescale analysis method, Ensemble Empirical Mode Decomposition (EEMD), is used to diagnose the variation of MJO activity determined from 850 hPa and 200 hPa zonal winds in the National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) Reanalysis data for the 56-yr period from 1950 to 2005. The results show that MJO activity can be decomposed into 9 quasi-periodic oscillations and a trend. Considering the contribution of each quasi-periodic oscillation, the bi-seasonal oscillation, the interannual oscillation and the trend of the MJO activity are the most prominent features. The trend increases almost linearly, so that MJO activity before around 1978 is lower than in the later part of the record. This may be related to the tropical sea surface temperature (SST). It is speculated that the interdecadal change in MJO activity that appeared around 1978 is related to warmer SST in the equatorial warm pool, especially over the Indian Ocean.

An Ensemble Model for Machine Failure Prediction (앙상블 모델 기반의 기계 고장 예측 방법)

  • Cheon, Kang Min;Yang, Jaekyung
    • Journal of Korean Society of Industrial and Systems Engineering / v.43 no.1 / pp.123-131 / 2020
  • Many studies have addressed methods for predicting machine failure, and recently much research and many applications have aimed to diagnose the physical condition of machines and their parts and to calculate remaining life through various methods. Survival models are also used to predict plant failures based on past anomaly cycles. In particular, special machines that reflect the fluid flow and process characteristics of chemical plants are connected to hundreds or thousands of sensors, so there are many factors that need to be considered, such as process and material data as well as the application of derived variables. In this paper, the data were preprocessed through time series anomaly detection based on unsupervised learning in order to predict abnormalities of these special machines. Next, clustering results reflecting the characteristics of the data were applied to produce additional variables, and a learning data set was created based on the history of past facility abnormalities. Finally, a prediction methodology based on a supervised learning algorithm was applied, and updating the model was confirmed to improve the accuracy of facility failure prediction. Predicting machine abnormalities and extracting key factors in this way is expected to improve the efficiency of facility operation by allowing maintenance timing and parts supply and demand to be adjusted flexibly.

Performance Assessment of Monthly Ensemble Prediction Data Based on Improvement of Climate Prediction System at KMA (기상청 기후예측시스템 개선에 따른 월별 앙상블 예측자료 성능평가)

  • Ham, Hyunjun;Lee, Sang-Min;Hyun, Yu-Kyug;Kim, Yoonjae
    • Atmosphere / v.29 no.2 / pp.149-164 / 2019
  • The purpose of this study is to introduce the improvement of the current operational climate prediction system of KMA and to compare the previous and improved systems. Whereas the previous system is based on GloSea5GA3, the improved one is built on GloSea5GC2. GloSea5GC2 is a fully coupled global climate model with atmosphere, ocean, sea-ice and land components linked through the coupler OASIS. It comprises the component configurations Global Atmosphere 6.0 (GA6.0), Global Land 6.0 (GL6.0), Global Ocean 5.0 (GO5.0) and Global Sea Ice 6.0 (GSI6.0). These configurations have improved sea-ice parameters relative to the previous model. The model resolution is N216L85 (~60 km in mid-latitudes) in the atmosphere and ORCA0.25L75 ($0.25^{\circ}$ on a tri-polar grid) in the ocean. In this research, the predictability of each system is evaluated using RMSE, correlation and MSSS, and the variables are 500 hPa geopotential height (h500), 850 hPa temperature (t850) and sea surface temperature (SST). The predictive performance of GloSea5GC2 is better than that of GloSea5GA3. For example, the RMSE of h500 for the 1-month forecast decreases from 23.89 gpm to 22.21 gpm in East Asia. For SST in the Nino3.4 area, the improvements in GloSea5GC2 result in a decrease in RMSE, which becomes apparent over time. It can be concluded that GloSea5GC2 performs well for seasonal prediction.
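Two of the verification scores named above can be written down compactly. The abstract does not define the MSSS reference forecast, so this sketch assumes the common choice of a climatological reference, taken here as the mean of the supplied observations. The function names are illustrative.

```python
import math

def mse(forecast, observed):
    """Mean squared error between paired forecast and observed values."""
    return sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(observed)

def rmse(forecast, observed):
    """Root mean squared error, in the units of the variable (e.g. gpm for h500)."""
    return math.sqrt(mse(forecast, observed))

def msss(forecast, observed):
    """MSSS = 1 - MSE_forecast / MSE_reference, with the mean of the
    observations as the reference forecast (an assumption of this sketch)."""
    clim = sum(observed) / len(observed)
    mse_ref = mse([clim] * len(observed), observed)
    return 1.0 - mse(forecast, observed) / mse_ref
```

An MSSS of 1 indicates a perfect forecast, 0 indicates no improvement over the reference, and negative values indicate a forecast worse than the reference.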

Korean Flood Vulnerability Assessment on Climate Change (기후변화에 따른 국내 홍수 취약성 평가)

  • Lee, Moon-Hwan;Jung, Il-Won;Bae, Deg-Hyo
    • Journal of Korea Water Resources Association / v.44 no.8 / pp.653-666 / 2011
  • The purposes of this study are to suggest a flood vulnerability assessment method for climate change, to evaluate the method over 5 river basins, and to present the uncertainty range of the assessment using multi-model ensemble scenarios. Data on past flood events were collected and a flood vulnerability index was calculated. The vulnerability assessment was also performed under the current climate system. For the future climate change scenarios, 39 climate scenarios were obtained from 3 emission scenarios and 13 GCMs provided by the IPCC DDC, and 312 hydrology scenarios were obtained by applying 3 hydrological models and 2 to 3 potential evapotranspiration computation methods to the climate scenarios. Finally, the spatial and temporal changes in flood vulnerability and the range of uncertainty were assessed for the future periods S1 (2010-2039), S2 (2040-2069) and S3 (2070-2099) relative to the reference period S0 (1971-2000). The results show that the vulnerable regions under the current climate system were the Han, Sumjin and Youngsan river basins. Considering the climate scenarios, the variability in the Nakdong, Gum and Han river basins is large, but the Sumjin river basin showed little variability owing to its low basic-stream ability to adapt.
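The scenario counts can be checked by enumeration: 3 emission scenarios times 13 GCMs gives 39 climate scenarios, and 312 = 39 x 8 implies 8 combinations of hydrological model and evapotranspiration method. The abstract does not say which model uses how many methods, so the 3/3/2 split below is one assumption consistent with "2 to 3" methods per model, and all labels are placeholders.

```python
from itertools import product

emission_scenarios = [f"emission_{i}" for i in range(1, 4)]   # 3 placeholder names
gcms = [f"gcm_{i:02d}" for i in range(1, 14)]                 # 13 placeholder names

# Assumed split: two models with 3 PET methods and one with 2 (8 pairs in total).
pet_methods_per_model = {"hydro_model_A": 3, "hydro_model_B": 3, "hydro_model_C": 2}

# Every (emission scenario, GCM) pair is one climate scenario.
climate_scenarios = list(product(emission_scenarios, gcms))

# Every (hydrological model, PET method) pair is one hydrological configuration.
hydro_configs = [(model, f"pet_{j}")
                 for model, n_methods in pet_methods_per_model.items()
                 for j in range(1, n_methods + 1)]

# Each climate scenario is run through each hydrological configuration.
hydrology_scenarios = list(product(climate_scenarios, hydro_configs))

print(len(climate_scenarios), len(hydro_configs), len(hydrology_scenarios))
```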

Prediction and Analysis of PM2.5 Concentration in Seoul Using Ensemble-based Model (앙상블 기반 모델을 이용한 서울시 PM2.5 농도 예측 및 분석)

  • Ryu, Minji;Son, Sanghun;Kim, Jinsoo
    • Korean Journal of Remote Sensing / v.38 no.6_1 / pp.1191-1205 / 2022
  • Particulate matter (PM), an air pollutant with complex and widespread causes, is classified according to particle size. PM2.5 is very small and can cause diseases of the respiratory tract or cardiovascular system if inhaled. To prepare for these risks, state-centered management and preventive monitoring and forecasting are important. This study attempted to predict PM2.5 in Seoul, where high concentrations of fine dust occur frequently, using two ensemble models, random forest (RF) and extreme gradient boosting (XGB), with 15 local data assimilation and prediction system (LDAPS) weather-related factors, aerosol optical depth (AOD) and 4 chemical factors as independent variables. Performance evaluation and factor importance evaluation were carried out for the two models, and a seasonal model analysis was also performed. In terms of prediction accuracy, RF achieved R² = 0.85 and XGB achieved R² = 0.91, confirming that XGB was the more suitable model for PM2.5 prediction. The seasonal model analysis indicated that prediction performance was good in spring, when observed concentrations were high. In this study, PM2.5 in Seoul was predicted using various factors, and an ensemble-based PM2.5 prediction model showing good performance was constructed.

Development of Impact-based Heat Health Warning System Based on Ensemble Forecasts of Perceived Temperature and its Evaluation using Heat-Related Patients in 2019 (인지온도 확률예보기반 폭염-건강영향예보 지원시스템 개발 및 2019년 온열질환자를 이용한 평가)

  • Kang, Misun;Belorid, Miloslav;Kim, Kyu Rang
    • Atmosphere / v.30 no.2 / pp.195-207 / 2020
  • This study introduces the structure of the impact-based heat health warning system for 165 counties in South Korea developed by the National Institute of Meteorological Sciences. The system was developed using the daily maximum perceived temperature (PTmax), a human physiology-based thermal comfort index, and the Local ENSemble prediction system for the probability forecasts. A risk matrix proposed by the World Meteorological Organization was employed for the impact-based forecasts, with its threshold values set separately for each region. The system issues the risk level as one of four levels (GREEN, YELLOW, ORANGE, RED) for the first, second and third forecast lead days (LD1, LD2 and LD3). The daily risk level issued by the system was evaluated against emergency heat-related patient data from six cities (Seoul, Incheon, Daejeon, Gwangju, Daegu and Busan) for LD1 to LD3. High risk levels occurred more consistently at shorter lead times (LD3 → LD1), and the performance (rs) increased from 0.42 (LD3) to 0.45 (LD1) across all cities. In particular, the system showed good performance (rs = 0.51) in July and August, when heat stress is highest in South Korea. From an impact-based forecasting perspective, PTmax is one of the most suitable temperature indicators for issuing heat-related health risk warnings in South Korea.
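The abstract reports performance as rs without defining it. Assuming rs denotes Spearman's rank correlation between the issued risk level and the daily number of heat-related patients, it can be computed as the Pearson correlation of average ranks. The sketch below is illustrative, and its function names are not taken from the paper.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

For such an evaluation the four risk levels could be encoded as 0 to 3 (GREEN to RED) and paired with daily patient counts. Because only four levels exist, ties are frequent, which is why the ranks are averaged.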