• Title/Summary/Keyword: ensemble technique (앙상블기법)

Search Result 300, Processing Time 0.028 seconds

Application Analysis of Short-term Rainfall Forecasting Model according to Bias Correlation in Rainfall Ensemble Data (강우앙상블자료 편의보정에 따른 단기강우예측모델의 적용성 분석)

  • Lee, Sanghyup;Seong, Yeon-Jeong;Bastola, Shiksha;Choo, InnKyo;Jung, Younghun
    • Proceedings of the Korea Water Resources Association Conference / 2019.05a / pp.119-119 / 2019
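  • Recently, under the influence of climate change and abnormal weather, disasters such as localized heavy rainfall, droughts, floods, and typhoons have been growing in scale and frequency. Preventing damage from such natural disasters and anomalies, and responding to them quickly, requires accurate rainfall estimation and temporal rainfall prediction. To address the uncertainty of rainfall, agencies such as the Korea Meteorological Administration apply ensemble prediction systems to forecasting. These systems run multiple models that differ in initial conditions, physical processes, boundary conditions, and so on to predict the future probabilistically, which compensates for the limits of the deterministic prediction of a single numerical forecast, and they provide information on forecast uncertainty alongside the existing numerical model information. However, incomplete physical understanding of diverse natural conditions and limits on computing capacity leave high uncertainty in these forecasts, so bias correction is needed to minimize it. The validity and reliability of the data must also be analyzed before it is applied to rainfall analysis. In this study, LENS (Local ENsemble prediction System) forecasts and hourly rainfall observations were accumulated over 3 hours, to match the short-term prediction model, and compared. October 2016, when heavy rainfall was concentrated, was chosen as the comparison period, and Jung-gu, Ulsan, as the study area. After LENS values were extracted separately as point values at the observation station and as areal values over the administrative district, the LENS output was reprocessed with the CF and QM techniques, which are used to minimize uncertainty, and the applicability of the LENS output under each bias correction technique was reviewed and evaluated by comparison with past observed rainfall.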
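The abstract does not describe how the QM technique was implemented. As a minimal sketch, assuming QM means empirical quantile mapping, the idea can be illustrated by replacing a forecast value with the observed value at the same empirical quantile. The function names and the sample values are hypothetical, not taken from the study.

```python
from bisect import bisect_right

def accumulate_3h(hourly):
    """Sum hourly rainfall into consecutive 3-hour totals (a trailing partial window is dropped)."""
    usable = len(hourly) - len(hourly) % 3
    return [sum(hourly[i:i + 3]) for i in range(0, usable, 3)]

def quantile_map(value, forecast_hist, observed_hist):
    """Map a forecast value to the observed value at the same empirical quantile."""
    f_sorted = sorted(forecast_hist)
    o_sorted = sorted(observed_hist)
    # Fraction of historical forecasts that do not exceed the value
    q = bisect_right(f_sorted, value) / len(f_sorted)
    # Position of that quantile in the sorted observations, clamped to a valid index
    idx = min(max(int(round(q * len(o_sorted))) - 1, 0), len(o_sorted) - 1)
    return o_sorted[idx]
```

With a forecast history of `[0, 1, 2, 3]` and an observed history of `[0, 2, 4, 6]`, a forecast of 2.0 sits at the 0.75 quantile and is mapped to 4. Practical implementations usually interpolate between quantiles and treat dry periods separately, which this sketch omits.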


A Study on the Prediction of Cabbage Price Using Ensemble Voting Techniques (앙상블 Voting 기법을 활용한 배추 가격 예측에 관한 연구)

  • Lee, Chang-Min;Song, Sung-Kwang;Chung, Sung-Wook
    • Journal of Convergence for Information Technology / v.12 no.3 / pp.1-10 / 2022
  • Vegetables such as cabbage are strongly affected by natural disasters, so events such as heavy rain and disease increase price fluctuations, which in turn affects the farm economy. Various efforts have been made to predict agricultural product prices to address this problem, but extreme price fluctuations remain difficult to predict. In this study, cabbage prices were analyzed using the ensemble Voting technique, which determines the final prediction by combining the outputs of multiple individual models. In addition, the results were compared with LSTM, a time series analysis method, and with the tree-based ensemble methods XGBoost and RandomForest. Daily price data were used, together with weather information and price indices that affect cabbage prices. The resulting RMSE, which measures the difference between actual and predicted values, is about 236. This study is expected to be useful for selecting models in other time series analysis research, such as agricultural product price prediction.
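The abstract does not say how the Voting ensemble combines its members. For a numeric target such as price, a common form of voting is a (weighted) average of the members' predictions, and that is what this minimal sketch assumes, together with the standard RMSE definition. The function names are illustrative.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def voting_average(predictions, weights=None):
    """Combine per-model prediction lists by (weighted) averaging, one value per time step."""
    weights = weights or [1.0] * len(predictions)
    total = sum(weights)
    n_steps = len(predictions[0])
    return [sum(w * p[i] for w, p in zip(weights, predictions)) / total
            for i in range(n_steps)]
```

For example, `voting_average([[1.0, 2.0], [3.0, 4.0]])` averages two models' forecasts step by step, and `rmse` can then score the combined forecast against the actual prices.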

Wild Bird Sound Classification Scheme using Focal Loss and Ensemble Learning (Focal Loss와 앙상블 학습을 이용한 야생조류 소리 분류 기법)

  • Jaeseung Lee;Jehyeok Rew
    • Journal of Korea Society of Industrial Information Systems / v.29 no.2 / pp.15-25 / 2024
  • For effective analysis of animal ecosystems, technology that can automatically identify the current status of animal habitats is crucial. Specifically, animal sound classification, which identifies species based on their sounds, is gaining great attention where video-based discrimination is impractical. Traditional studies have relied on a single deep learning model to classify animal sounds. However, sounds collected in outdoor settings often include substantial background noise, complicating the task for a single model. In addition, data imbalance among species may lead to biased model training. To address these challenges, in this paper, we propose an animal sound classification scheme that combines predictions from multiple models using Focal Loss, which adjusts penalties based on class data volume. Experiments on public datasets have demonstrated that our scheme can improve recall by up to 22.6% compared to an average of single models.
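For reference, the commonly used form of Focal Loss for a single sample is $-\alpha (1 - p_t)^{\gamma} \log p_t$, where $p_t$ is the predicted probability of the true class. The sketch below assumes that standard form; the default values of `gamma` and `alpha` are illustrative and not taken from the paper.

```python
import math

def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Focal loss for one sample, given the predicted probability of its true class.

    gamma down-weights easy (high-confidence) samples; alpha is a per-class weight
    that can be set larger for under-represented classes.
    """
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)
```

With `gamma=0` and `alpha=1` the expression reduces to ordinary cross-entropy, so the effect of the focusing term can be seen by comparing the two on a confidently classified sample.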

Improving SVM Classification by Constructing Ensemble (앙상블 구성을 이용한 SVM 분류성능의 향상)

  • 제홍모;방승양
    • Journal of KIISE:Software and Applications / v.30 no.3_4 / pp.251-258 / 2003
  • A support vector machine (SVM) is supposed to provide good generalization performance, but the actual performance of an implemented SVM is often far from the theoretically expected level. This is largely because the implementation is based on an approximated algorithm, owing to the high time and space complexity. To overcome this limitation, we propose ensembles of SVMs built with Bagging (bootstrap aggregating) and Boosting. In Bagging, each individual SVM is trained independently on training samples chosen randomly via a bootstrap technique. In Boosting, each individual SVM is trained on samples chosen according to a probability distribution, which is updated based on the errors of the classifiers; the process is then iterated. After the training stage, the SVMs are aggregated to make a collective decision in one of several ways, such as majority voting, LSE (least squares estimation)-based weighting, and double-layer hierarchical combining. The simulation results for IRIS data classification, hand-written digit recognition, and face detection show that the proposed SVM ensembles greatly outperform a single SVM in terms of classification accuracy.
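Two of the building blocks named in the abstract, bootstrap sampling and majority voting, can be sketched without any SVM library. This is an illustrative sketch only: the helper names are hypothetical, and training the base SVMs on each bootstrap sample is left out.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement (one bagging training set)."""
    return [rng.choice(data) for _ in data]

def majority_vote(labels):
    """Return the label predicted by the largest number of base classifiers."""
    return Counter(labels).most_common(1)[0][0]
```

Each base classifier would be trained on its own `bootstrap_sample(...)`, and `majority_vote` would then combine their predicted labels for a test sample. On a tie, `Counter.most_common` returns the label that was seen first.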

Face Recognition Network using gradCAM (gradCam을 사용한 얼굴인식 신경망)

  • Chan Hyung Baek;Kwon Jihun;Ho Yub Jung
    • Smart Media Journal / v.12 no.2 / pp.9-14 / 2023
  • In this paper, we propose a face recognition network that attempts to use more facial features while using a smaller number of training sets. When combining neural networks for face recognition, we want to use networks that rely on different parts of the face. However, network training determines at random where these facial features are obtained. On the other hand, the judgment basis of a network model can be expressed as a saliency map through gradCAM. Therefore, in this paper, we use gradCAM to visualize where the trained face recognition model makes its observations and recognition judgments, so that the network combination can be constructed based on the different facial features used. Using this approach, we trained a network for a small face recognition problem. In a simple toy face recognition example, the recognition network used in this paper improves accuracy by 1.79% and reduces the equal error rate (EER) by 0.01788 compared to the conventional approach.

The guideline for choosing the right-size of tree for boosting algorithm (부스팅 트리에서 적정 트리사이즈의 선택에 관한 연구)

  • Kim, Ah-Hyoun;Kim, Ji-Hyun;Kim, Hyun-Joong
    • Journal of the Korean Data and Information Science Society / v.23 no.5 / pp.949-959 / 2012
  • This article seeks the right size of decision trees for good performance of the boosting algorithm. First, we define the tree size D as the depth of a decision tree. We then compare the performance of the boosting algorithm with different tree sizes experimentally. Although it is usual practice to set the tree size in boosting to be small, we find that the choice of D has a significant influence on the performance of the boosting algorithm. Furthermore, we find that the tree size D needs to be sufficiently large for some datasets. The experimental results show that there exists an optimal D for each dataset and that choosing the right D is important for improving the performance of boosting. We also try to build a model for estimating the right size D for boosting, using variables that explain the nature of a given dataset. The suggested model reveals that the optimal tree size D for a given dataset can be estimated from the error rate of a stump tree, the number of classes, the depth of a single tree, and the Gini impurity.

Application of the Satellite Based Soil Moisture Data Assimilation Technique with Ensemble Kalman Filter in Korean Dam Basin (국내 주요 댐 유역에 대한 앙상블 칼만필터 기반 위성 토양수분 자료 동화 기법의 적용)

  • Lee, Jaehyeon;Kim, Dongkyun
    • Proceedings of the Korea Water Resources Association Conference / 2018.05a / pp.301-301 / 2018
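  • In this study, satellite-based soil moisture data were assimilated into a hydrological model to produce hydrometeorological variables at the grid scale, and their accuracy was evaluated. The Variable Infiltration Capacity (VIC) model was selected as the hydrological model and set up for eight major dam basins in Korea. Ten years of input data from 2008 onward were collected, and observed streamflow data for 2008-2012 were used to calibrate the model. For calibration, parameters were estimated with the Isolated-Speciation Particle Swarm Optimization (ISPSO) technique, and model performance was validated with observed streamflow data for 2013-2017. The soil moisture data assimilated into the VIC model were a merged product of AMSR2 satellite soil moisture data and ground-observed soil moisture data. The soil moisture data obtained by merging the satellite and ground data with a conditional merging technique estimated the soil moisture of each grid cell more accurately, and the simulation accuracy of the model tended to improve when they were assimilated. The results are expected to serve as basic data for a variety of hydrological analyses by improving the accuracy of hydrological models through assimilation of satellite soil moisture data corrected with ground observations, and by providing improved hydrometeorological information for ungauged basins.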
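The abstract does not give the configuration of the ensemble Kalman filter named in the title. As a minimal sketch of the analysis step, assuming a scalar soil moisture state that is observed directly and a stochastic (perturbed-observation) update, the ensemble can be adjusted as follows. The function name and the numeric values in the example are hypothetical.

```python
import random
from statistics import variance

def enkf_update(ensemble, obs, obs_var, rng):
    """Stochastic EnKF analysis step for a scalar state that is observed directly."""
    p_f = variance(ensemble)        # forecast error variance estimated from ensemble spread
    gain = p_f / (p_f + obs_var)    # Kalman gain for observation operator H = 1
    sigma = obs_var ** 0.5
    # Each member assimilates its own randomly perturbed copy of the observation
    return [x + gain * (obs + rng.gauss(0.0, sigma) - x) for x in ensemble]
```

For example, `enkf_update([0.20, 0.25, 0.30], 0.28, 0.0004, random.Random(1))` pulls each member toward the observation by an amount that depends on the ratio of ensemble spread to observation error. A full application would use a vector state, an observation operator, and the hydrological model to propagate the ensemble between updates.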


Development of Predictive Model for Length of Stay(LOS) in Acute Stroke Patients using Artificial Intelligence (인공지능을 이용한 급성 뇌졸중 환자의 재원일수 예측모형 개발)

  • Choi, Byung Kwan;Ham, Seung Woo;Kim, Chok Hwan;Seo, Jung Sook;Park, Myung Hwa;Kang, Sung-Hong
    • Journal of Digital Convergence / v.16 no.1 / pp.231-242 / 2018
  • Efficient management of the length of stay (LOS) is important in hospitals: it reduces medical costs for patients and increases profitability for hospitals. To manage LOS efficiently, it is necessary to develop an artificial intelligence-based prediction model that supports hospitals in benchmarking LOS and finding ways to reduce it. To develop a predictive model of LOS for acute stroke patients, acute stroke cases were extracted from the 2013 and 2014 discharge injury patient data. The data were split into 60% for training and 40% for evaluation. For model development, we used a traditional regression technique (multiple regression analysis), artificial intelligence techniques (an interactive decision tree and a neural network), and an ensemble technique that integrates all of them. Models were evaluated with the Root ASE (Absolute error) index, which was 23.7 for multiple regression, 23.7 for the interactive decision tree, 22.7 for the neural network, and 22.7 for the ensemble technique. In this evaluation, the neural network, an artificial intelligence technique, was found to be superior. This demonstrates the utility of artificial intelligence in developing LOS prediction models. Future research should continue to examine how to use artificial intelligence techniques more effectively in developing LOS prediction models.

Indoor positioning method using WiFi signal based on XGboost (XGboost 기반의 WiFi 신호를 이용한 실내 측위 기법)

  • Hwang, Chi-Gon;Yoon, Chang-Pyo;Kim, Dae-Jin
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.1 / pp.70-75 / 2022
  • Accurately measuring location is necessary to provide a variety of services. The data for indoor positioning are RSSI values from WiFi devices, measured through a smartphone application. The measured data serve as the raw data for machine learning: the features are the measured RSSI values, and the label is the name of the space where the measurement was taken. The aim is to study a machine learning technique that predicts the exact location from the WiFi signal alone by applying an efficient classification technique. Ensemble learning obtains more accurate predictions by combining several models rather than relying on one, and includes bagging and boosting. Among these, boosting adjusts the weights of a model based on modeling results on sampled data, and various boosting algorithms exist. This study uses XGBoost among these techniques and compares its performance with that of other ensemble techniques.

Bankruptcy prediction using an improved bagging ensemble (개선된 배깅 앙상블을 활용한 기업부도예측)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.121-139 / 2014
  • Predicting corporate failure has been an important topic in accounting and finance. The costs associated with bankruptcy are high, so the accuracy of bankruptcy prediction is of great importance to financial institutions. Many researchers have addressed bankruptcy prediction over the past three decades. The current research attempts to use ensemble models to improve the performance of bankruptcy prediction. Ensemble classification combines individually trained classifiers to obtain more accurate predictions than individual models. Ensemble techniques have been shown to be very useful for improving the generalization ability of a classifier. Bagging is one of the most commonly used methods for constructing ensemble classifiers. In bagging, different training data subsets are randomly drawn with replacement from the original training dataset, and base classifiers are trained on the different bootstrap samples. Instance selection selects critical instances while removing irrelevant and harmful instances from the original set. Instance selection and bagging are both well known in data mining. However, few studies have dealt with the integration of instance selection and bagging. This study proposes an improved bagging ensemble based on instance selection using genetic algorithms (GA) to improve the performance of SVM. GA is an efficient optimization procedure based on the theory of natural selection and evolution. GA uses the idea of survival of the fittest by progressively accepting better solutions to the problem. GA searches by maintaining a population of solutions from which better solutions are created, rather than by making incremental changes to a single solution. The initial solution population is generated randomly and evolves into the next generation through genetic operators such as selection, crossover, and mutation. The solutions, coded as strings, are evaluated by the fitness function.
The proposed model consists of two phases: GA-based Instance Selection and Instance-based Bagging. In the first phase, GA is used to select the optimal instance subset, which is used as the input data of the bagging model. In this study, the chromosome is encoded as a binary string representing the instance subset. In this phase, the population size was set to 100 and the maximum number of generations to 150. We set the crossover rate and mutation rate to 0.7 and 0.1, respectively. We used the prediction accuracy of the model as the fitness function of the GA. The SVM model is trained on the training data set using the selected instance subset. The prediction accuracy of the SVM model on the test data set is used as the fitness value in order to avoid overfitting. In the second phase, we used the optimal instance subset selected in the first phase as the input data of the bagging model. We used the SVM model as the base classifier for the bagging ensemble. Majority voting was used as the combining method in this study. This study applies the proposed model to the bankruptcy prediction problem using a real data set from Korean companies. The research data contain 1832 externally non-audited firms, of which 916 filed for bankruptcy and 916 did not. Financial ratios categorized as stability, profitability, growth, activity, and cash flow were investigated through literature review and basic statistical methods, and we selected 8 financial ratios as the final input variables. We separated the whole data set into three subsets: training, test, and validation data sets. We compared the proposed model with several comparative models, including the simple individual SVM model, the simple bagging model, and the instance selection based SVM model. McNemar tests were used to examine whether the proposed model significantly outperforms the other models. The experimental results show that the proposed model outperforms the other models.
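The binary-string encoding and the crossover and mutation rates described above can be sketched as follows. The rates 0.7 and 0.1 come from the abstract; the choice of one-point crossover, the per-bit interpretation of the mutation rate, and all function names are assumptions made for illustration. The SVM-based fitness evaluation is omitted.

```python
import random

CROSSOVER_RATE = 0.7  # from the abstract
MUTATION_RATE = 0.1   # from the abstract; applied per bit here, which is an assumption

def decode(chromosome, instances):
    """Keep the training instances whose bit in the chromosome is 1."""
    return [x for bit, x in zip(chromosome, instances) if bit == 1]

def crossover(parent_a, parent_b, rng):
    """One-point crossover (assumed operator), applied with probability CROSSOVER_RATE."""
    if rng.random() < CROSSOVER_RATE and len(parent_a) > 1:
        cut = rng.randrange(1, len(parent_a))
        return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]
    return parent_a[:], parent_b[:]

def mutate(chromosome, rng):
    """Flip each bit independently with probability MUTATION_RATE."""
    return [1 - bit if rng.random() < MUTATION_RATE else bit for bit in chromosome]
```

In a full run, a population of 100 such chromosomes would be evolved for up to 150 generations, with each chromosome scored by the accuracy of an SVM trained on `decode(chromosome, training_instances)`.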