• Title/Summary/Keyword: Ensemble Technique

Improving SVM Classification by Constructing Ensemble (앙상블 구성을 이용한 SVM 분류성능의 향상)

  • 제홍모;방승양
    • Journal of KIISE: Software and Applications / v.30 no.3_4 / pp.251-258 / 2003
  • A support vector machine (SVM) is expected to provide good generalization performance, but the actual performance of an implemented SVM often falls far short of the theoretically expected level. This is largely because implementations rely on approximate algorithms to cope with the high time and space complexity. To overcome this limitation, we propose ensembles of SVMs built using Bagging (bootstrap aggregating) and Boosting. In Bagging, each individual SVM is trained independently on training samples chosen at random via the bootstrap technique. In Boosting, each individual SVM is trained on samples chosen according to a probability distribution over the training samples; the distribution is updated based on the errors of the individual classifiers, and the process is iterated. After training, the SVMs are aggregated to make a collective decision in one of several ways, such as majority voting, LSE (least squares estimation)-based weighting, and double-layer hierarchical combining. Simulation results for IRIS data classification, handwritten digit recognition, and face detection show that the proposed SVM ensembles greatly outperform a single SVM in terms of classification accuracy.
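
The Bagging stage and majority-vote aggregation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: to stay dependency-free, a nearest-centroid classifier stands in for the SVM base learner, and the function names, toy data, and ensemble size are all assumptions made for illustration.

```python
import random
from collections import Counter

def train_centroid(samples):
    """Stand-in base learner (NOT an SVM): store the mean vector of each class."""
    sums, counts = {}, Counter()
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] += 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict_centroid(model, x):
    """Predict the class whose centroid is closest to x (squared Euclidean distance)."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)))

def bagging_fit(samples, n_models=11, seed=0):
    """Train each ensemble member on a bootstrap resample drawn with replacement."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        boot = [rng.choice(samples) for _ in samples]
        models.append(train_centroid(boot))
    return models

def bagging_predict(models, x):
    """Aggregate the members' predictions by majority vote."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]

# Hypothetical two-class toy data: class 0 near the origin, class 1 near (5, 5).
data = [((0.0, 0.2), 0), ((0.3, -0.1), 0), ((-0.2, 0.1), 0), ((0.1, 0.4), 0),
        ((5.0, 5.1), 1), ((4.8, 5.3), 1), ((5.2, 4.9), 1), ((5.1, 5.0), 1)]
ensemble = bagging_fit(data)
print(bagging_predict(ensemble, (0.1, 0.0)), bagging_predict(ensemble, (4.9, 5.2)))
```

Swapping `train_centroid` and `predict_centroid` for an SVM trainer and predictor would give the Bagging variant the abstract describes; the Boosting stage and the LSE-based and hierarchical combiners are not sketched here.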

Stochastic Continuous Storage Function Model with Ensemble Kalman Filtering (I) : Model Development (앙상블 칼만필터를 연계한 추계학적 연속형 저류함수모형 (I) : - 모형 개발 -)

  • Bae, Deg-Hyo;Lee, Byong-Ju;Georgakakos, Konstantine P.
    • Journal of Korea Water Resources Association / v.42 no.11 / pp.953-961 / 2009
  • The objective of this study is to develop a stochastic continuous storage function model that enhances the event-oriented watershed and channel storage function models used as the official flood forecast model in Korea. To this end, a soil moisture accounting component is added to the original storage function model, and each hydrologic component, such as surface flow, subsurface flow, groundwater flow, and actual evapotranspiration, is simulated as a function of soil water content. In addition, the ensemble Kalman filtering technique is used for real-time assimilation of measured streamflow from various stream locations in the watershed. The enhanced model is therefore expected to simulate hydrologic components over long periods without additional estimation of model parameters and, because it assimilates measured streamflow data, to give more accurate and reliable results than the existing deterministic model.
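
As a rough illustration of what ensemble Kalman filtering does when it assimilates a measurement, the sketch below performs one analysis step of a stochastic (perturbed-observation) EnKF on a scalar state that is observed directly. The storage function model itself is not reproduced; the function name, the ensemble size, and the streamflow numbers are hypothetical.

```python
import random
import statistics

def enkf_update(ensemble, obs, obs_var, rng):
    """One stochastic EnKF analysis step for a scalar state observed directly (H = 1)."""
    forecast_var = statistics.variance(ensemble)      # sample variance of the forecast ensemble
    gain = forecast_var / (forecast_var + obs_var)    # Kalman gain in the scalar case
    obs_sd = obs_var ** 0.5
    # Each member assimilates its own randomly perturbed copy of the observation.
    return [x + gain * (obs + rng.gauss(0.0, obs_sd) - x) for x in ensemble]

rng = random.Random(42)
forecast = [rng.gauss(10.0, 2.0) for _ in range(200)]  # hypothetical streamflow forecast ensemble
analysis = enkf_update(forecast, obs=15.0, obs_var=1.0, rng=rng)
print(statistics.mean(forecast), statistics.mean(analysis))
```

The analysis ensemble is pulled toward the measured value in proportion to the gain, which grows as the forecast spread becomes large relative to the observation error variance.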

Cancer Diagnosis System using Genetic Algorithm and Multi-boosting Classifier (Genetic Algorithm과 다중부스팅 Classifier를 이용한 암진단 시스템)

  • Ohn, Syng-Yup;Chi, Seung-Do
    • Journal of the Korea Society for Simulation / v.20 no.2 / pp.77-85 / 2011
  • It is believed that anomalies or diseases of human organs can be identified by analyzing patterns. This paper proposes a new classification technique for identifying cancer using proteome patterns obtained from two-dimensional polyacrylamide gel electrophoresis (2-D PAGE). In the new method, three classification methods, namely support vector machine (SVM), multi-layer perceptron (MLP), and k-nearest neighbor (k-NN), are extended by the multi-boosting method into an array of subclassifiers, and the results of the subclassifiers are merged by an ensemble method. A genetic algorithm is applied to obtain the optimal feature set for each subclassifier. We applied the method to an empirical data set from cancer research, where it showed better accuracy and more stable performance than a single classifier.

An Ensemble Fingerprint Classification System Using Changes of Gradient of Ridge (융선 기울기의 변화량을 이용한 앙상블 지문분류 시스템)

  • Yoon, Kyung-Bae;Park, Chang-Hee
    • Journal of the Korean Institute of Intelligent Systems / v.13 no.5 / pp.545-551 / 2003
  • The Henry System, a traditional fingerprint classification model, is difficult to apply to a modern Automatic Fingerprint Identification System (AFIS). To tackle this problem, this study applies an algorithm for an Ensemble Fingerprint Classification System that uses changes in ridge gradient to improve the speed of precise matching against a large database. The existing Henry System works well for fingerprint images taken with paper and ink, in which the core point and delta point are captured. However, it is inapplicable to a modern AFIS because of problems such as the size of the input sensor and the input method. This study proposes an Ensemble Fingerprint Classification System that can classify the five basic patterns of the Henry System from images in which the delta is not captured, using changes in ridge gradient. The proposed fingerprint classification technique is expected to improve the speed of precise matching by reducing data volume.

Diabetes prediction mechanism using machine learning model based on patient IQR outlier and correlation coefficient (환자 IQR 이상치와 상관계수 기반의 머신러닝 모델을 이용한 당뇨병 예측 메커니즘)

  • Jung, Juho;Lee, Naeun;Kim, Sumin;Seo, Gaeun;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.10 / pp.1296-1301 / 2021
  • With the recent worldwide increase in diabetes incidence, research has been conducted on predicting diabetes with various machine learning and deep learning technologies. In this work, we present a model for predicting diabetes using machine learning techniques on data from Frankfurt Hospital, Germany. We handle outliers using the interquartile range (IQR) technique and Pearson correlation, and compare the diabetes prediction performance of Decision Tree, Random Forest, k-NN (k-nearest neighbor), SVM (support vector machine), and Bayesian Network models and of the ensemble techniques XGBoost, Voting, and Stacking. XGBoost showed the best performance across the various scenarios, with 97% accuracy. This study is therefore meaningful in that the model can be used to accurately predict and help prevent diabetes, which is prevalent in modern society.
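
The IQR-based outlier handling mentioned in this abstract can be sketched as follows. The abstract does not state the fence multiplier or the quantile method, so the conventional 1.5 × IQR fences and the default method of Python's `statistics.quantiles` are assumptions, as are the function names and the glucose readings.

```python
import statistics

def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles (default 'exclusive' method)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def drop_outliers(values, k=1.5):
    """Keep only the values that lie inside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]

# Hypothetical glucose readings with one extreme value.
glucose = [85, 90, 92, 95, 99, 101, 104, 110, 400]
print(drop_outliers(glucose))
```

For these readings the quartiles are 91 and 107, so the fences are 67 and 131 and only the value 400 is discarded.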

Ensemble Machine Learning Model Based YouTube Spam Comment Detection (앙상블 머신러닝 모델 기반 유튜브 스팸 댓글 탐지)

  • Jeong, Min Chul;Lee, Jihyeon;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.5 / pp.576-583 / 2020
  • This paper proposes a technique for identifying spam comments on YouTube, which have recently seen tremendous growth. Because YouTube content can be monetized through advertising, spammers promote their own channels or videos in the comments of popular videos or leave comments unrelated to the video. YouTube operates its own spam-blocking system, but it still fails to block such comments properly and efficiently. We therefore examined related studies on YouTube spam comment screening and conducted classification experiments with six machine learning techniques (Decision Tree, Logistic Regression, Bernoulli Naive Bayes, Random Forest, Support Vector Machine with a linear kernel, and Support Vector Machine with a Gaussian kernel) and an ensemble model combining them, using comment data from popular music videos by Psy, Katy Perry, LMFAO, Eminem, and Shakira.

Uncertainty Analysis based on LENS-GRM

  • Lee, Sang Hyup;Seong, Yeon Jeong;Park, KiDoo;Jung, Young Hun
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.208-208 / 2022
  • Recently, the frequency of abnormal weather caused by complex factors such as global warming has been increasing. Past rainfall records show that climate change is making rainfall patterns irregular. This makes rainfall difficult to predict and natural disasters difficult to prevent and cope with, causing human and property damage. Accurate estimation of rainfall and prediction of its timing could therefore help prevent and mitigate damage from floods and droughts. However, rainfall prediction carries considerable uncertainty, which needs to be understood and reduced. In addition, applying accurate rainfall predictions to a rainfall-runoff model can improve the accuracy of runoff prediction. This study therefore aims to increase the reliability of rainfall prediction by analyzing the uncertainty of Korean rainfall ensemble prediction data and of the runoff analysis model, using the Limited Area ENsemble (LENS) and the Grid based Rainfall-runoff Model (GRM). First, the possibility of improving rainfall prediction skill is examined using Quantile Mapping (QM), one of the bias correction techniques. Then, GRM parameter calibration was performed twice, and likelihood-parameter applicability evaluation and uncertainty analysis were performed using $R^2$, NSE, PBIAS, and Log-normal. The rainfall prediction data were applied to the rainfall-runoff model and evaluated before and after calibration. Reducing the uncertainty in the rainfall ensemble data applied to the runoff model, when selecting behavioral models for user uncertainty analysis, is expected to make flood prediction more reliable. The approach can also serve as a basis for flood prediction research that integrates other parameters such as geological characteristics and rainfall events.
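
The Quantile Mapping bias correction named in this abstract can be sketched in its simplest empirical form: a forecast value is replaced by the observed value that sits at the same quantile of the historical record. This is only an illustration under assumed inputs; the function name, the nearest-rank lookup, and the rainfall series are hypothetical and are not taken from the LENS-GRM study.

```python
import bisect

def quantile_map(value, model_hist, obs_hist):
    """Map a forecast value to the observed value at the same empirical quantile."""
    model_sorted = sorted(model_hist)
    obs_sorted = sorted(obs_hist)
    # Empirical non-exceedance probability of `value` in the model climatology.
    p = bisect.bisect_right(model_sorted, value) / len(model_sorted)
    # Read the observed distribution at the same probability (nearest rank, clamped).
    idx = min(len(obs_sorted) - 1, max(0, round(p * len(obs_sorted)) - 1))
    return obs_sorted[idx]

# Hypothetical historical series (mm): the model systematically underestimates rainfall.
model_hist = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
obs_hist = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
print(quantile_map(5.0, model_hist, obs_hist))
```

Here a forecast of 5.0 mm sits at the 50th percentile of the model record and is mapped to 10.0 mm, the 50th-percentile value of the observations.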

PIV measurement of roof corner vortices

  • Kim, Kyung Chun;Ji, Ho Seong;Seong, Seung Hak
    • Wind and Structures / v.4 no.5 / pp.441-454 / 2001
  • Conical vortices on the roof corners of a prismatic low-rise building were investigated using the PIV (Particle Image Velocimetry) technique. The Reynolds number based on the free stream velocity and model height was $5.3{\times}10^3$. Mean and instantaneous vector fields for velocity, vorticity, and turbulent kinetic energy were measured at two vertical planes and for two flow angles, $30^{\circ}$ and $45^{\circ}$. The measurements provided a clear view of the complex flow structures on roof corners, such as a pair of counter-rotating conical vortices, secondary vortices, and tertiary vortices. They also enabled accurate and easy measurement of the size of the vortices. Additionally, the centers of the vortices could easily be located from the ensemble-averaged velocity fields. It was observed that a flow angle of $30^{\circ}$ produces a higher level of vorticity and turbulent kinetic energy in one of the pair of vortices than does the $45^{\circ}$ flow angle.

Forecasting Day-ahead Electricity Price Using a Hybrid Improved Approach

  • Hu, Jian-Ming;Wang, Jian-Zhou
    • Journal of Electrical Engineering and Technology / v.12 no.6 / pp.2166-2176 / 2017
  • Electricity price prediction plays a crucial part in scheduling and risk management for participants in the competitive electricity market. However, it is a challenging task owing to the nonlinearity, non-stationarity, and uncertainty of the price series. This study proposes an improved hybrid strategy that combines a data preprocessing component and a forecasting engine component to enhance the accuracy of electricity price forecasts. In the developed forecasting procedure, the Seasonal Adjustment (SA) method and the Ensemble Empirical Mode Decomposition (EEMD) technique together form the data preprocessing component; the Coupled Simulated Annealing (CSA) optimization method and the Least Square Support Vector Regression (LSSVR) algorithm form the prediction engine. The proposed hybrid approach is verified with electricity price data sampled from the power market of New South Wales, Australia. The simulation results show that the proposed hybrid approach yields a noticeable improvement in forecasting accuracy over other approaches, which suggests that it offers better prediction ability and sufficient precision.

A Prediction of Precipitation Over East Asia for June Using Simultaneous and Lagged Teleconnection (원격상관을 이용한 동아시아 6월 강수의 예측)

  • Lee, Kang-Jin;Kwon, MinHo
    • Atmosphere / v.26 no.4 / pp.711-716 / 2016
  • Dynamical model forecasts using state-of-the-art general circulation models (GCMs) have limitations in simulating the real climate system because they do not depend on past history. One alternative for correcting model errors is the canonical correlation analysis (CCA) correction method. At present, CCA forecasts show better skill than dynamical model forecasts, especially over the midlatitudes. Model outputs are adjusted based on the CCA modes between the model forecasts and the observations. This study builds a canonical correlation prediction model for subseasonal (June) precipitation. The predictors are circulation fields over the western North Pacific from the Global Seasonal Forecasting System version 5 (GloSea5) and observed snow cover extent over the Eurasian continent from the Climate Data Record (CDR). The former is based on the simultaneous teleconnection between the western North Pacific and East Asia, and the latter on the lagged teleconnection between the Eurasian continent and East Asia. In addition, we suggest a technique for improving forecast skill by applying the ensemble canonical correlation (ECC) to individual canonical correlation predictions.