• Title/Summary/Keyword: Ensemble Algorithm


Estimating Farmland Prices Using Distance Metrics and an Ensemble Technique (거리척도와 앙상블 기법을 활용한 지가 추정)

  • Lee, Chang-Ro;Park, Key-Ho
    • Journal of Cadastre & Land InformatiX / v.46 no.2 / pp.43-55 / 2016
  • This study estimated land prices using instance-based learning. Among the various instance-based learning methods, a k-nearest neighbor method was used, and 10 distance metrics, including Euclidean distance, were calculated for the k-nearest neighbor estimation. Normally, the single distance metric prediction with the best predictive performance would be chosen as the final estimate out of the 10 distance metric predictions. In contrast to this practice, this study applied an ensemble technique, which combines multiple predictions to obtain better performance. We applied the gradient boosting algorithm, a type of residual-fitting model, to our data to combine the predictions. Sales price data for farmland in Haenam-gun, Jeolla Province, were used to demonstrate the advantages of instance-based learning as well as of the ensemble technique. The results showed that the ensemble prediction was more accurate than the 10 individual distance metric predictions.
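The idea of producing one k-nearest neighbor prediction per distance metric and then combining them can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the three metrics (the abstract does not list its 10), the toy data, the names `knn_predict` and `ensemble_predict`, and above all the combiner, which is a plain average standing in for the gradient boosting combiner the abstract describes.

```python
import math

# Three illustrative distance metrics (assumed; the abstract's 10 are not listed).
METRICS = {
    "euclidean": lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))),
    "manhattan": lambda a, b: sum(abs(x - y) for x, y in zip(a, b)),
    "chebyshev": lambda a, b: max(abs(x - y) for x, y in zip(a, b)),
}

def knn_predict(train_X, train_y, query, metric, k=3):
    """Mean target value of the k training points closest to the query."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: metric(pair[0], query))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)

def ensemble_predict(train_X, train_y, query, k=3):
    """One kNN prediction per metric, combined by a simple average.

    The average is a stand-in: the abstract combines the predictions
    with gradient boosting, which is not reproduced here.
    """
    per_metric = {
        name: knn_predict(train_X, train_y, query, metric, k)
        for name, metric in METRICS.items()
    }
    combined = sum(per_metric.values()) / len(per_metric)
    return per_metric, combined

# Toy data (hypothetical): two features per parcel, price as the target.
train_X = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5)]
train_y = [10, 12, 11, 30, 32]
per_metric, combined = ensemble_predict(train_X, train_y, (0.5, 0.5))
print(per_metric, combined)
```

With real data the metrics generally disagree about which neighbors are nearest, and that disagreement is what a learned combiner can exploit.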

Identification of Individuals using Single-Lead Electrocardiogram Signal (단일 리드 심전도를 이용한 개인 식별)

  • Lim, Seohyun;Min, Kyeongran;Lee, Jongshill;Jang, Dongpyo;Kim, Inyoung
    • Journal of Biomedical Engineering Research / v.35 no.3 / pp.42-49 / 2014
  • We propose an individual identification method using a single-lead electrocardiogram signal. In this paper, the lead I ECG was measured from subjects in various physical and psychological states. As a preprocessing stage, we performed noise reduction on the lead I signal, and the resulting signal was used to acquire a representative beat waveform for each individual by ensemble averaging. Features for identifying individuals were extracted from the P-QRS-T waves: 19 using duration and amplitude information, and 16 from the QRS complex acquired by applying the Pan-Tompkins algorithm to the ensemble-averaged waveform. To analyze the effect of each feature and to improve efficiency while maintaining performance, the Relief-F algorithm was used to select features from the 35 extracted. Some or all of these 35 features were used in support vector machine (SVM) training and testing. The classification accuracy using the entire feature set was 98.34%. The experimental results show that it is possible to identify a person using features extracted from the limb lead I signal alone.

The Effect of Input Variables Clustering on the Characteristics of Ensemble Machine Learning Model for Water Quality Prediction (입력자료 군집화에 따른 앙상블 머신러닝 모형의 수질예측 특성 연구)

  • Park, Jungsu
    • Journal of Korean Society on Water Environment / v.37 no.5 / pp.335-343 / 2021
  • Water quality prediction is essential for the proper management of water supply systems. Increased suspended sediment concentration (SSC) affects water supply systems in various ways, such as increased treatment cost, and consequently there have been various efforts to develop a model for predicting SSC. However, SSC is affected by both the natural and the anthropogenic environment, which makes it challenging to predict. Recently, advanced machine learning models have increasingly been used for water quality prediction. This study developed an ensemble machine learning model to predict SSC using the XGBoost (XGB) algorithm. The discharge (Q) and SSC observed at two field monitoring stations were used to develop the model. The input variables were clustered into two groups with low and high ranges of Q using the k-means clustering algorithm. Each group of data was then used separately to optimize XGB (Model 1). The model performance was compared with that of the XGB model using the entire data (Model 2). The models were evaluated by the root mean squared error-observation standard deviation ratio (RSR) and the root mean squared error. The RSR values were 0.51 and 0.57 at the two monitoring stations for Model 2, while the model performance improved to RSR 0.46 and 0.55, respectively, for Model 1.
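The evaluation metric and the clustering step in this abstract can be sketched briefly. The code below assumes the commonly used definition of RSR (RMSE divided by the standard deviation of the observations) and uses a hand-written one-dimensional two-cluster k-means to split discharge values into low and high groups. The function names and the sample values are hypothetical, and the XGBoost fitting itself is not reproduced.

```python
import math

def rmse(obs, pred):
    """Root mean squared error between observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def rsr(obs, pred):
    """RMSE divided by the standard deviation of the observations."""
    mean_obs = sum(obs) / len(obs)
    sd_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / len(obs))
    return rmse(obs, pred) / sd_obs

def two_means_1d(values, iters=20):
    """Split 1-D values into a low and a high group with k-means (k = 2)."""
    lo, hi = min(values), max(values)      # initial cluster centers
    low, high = list(values), []
    for _ in range(iters):
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not high:                       # all values identical
            break
        lo, hi = sum(low) / len(low), sum(high) / len(high)
    return low, high

# Hypothetical discharge values; a separate model would be fitted per group.
low_q, high_q = two_means_1d([1, 2, 3, 100, 110, 120])
print(low_q, high_q)
```

Under this definition an RSR of 0 means a perfect fit, and a model that always predicts the observed mean scores 1, so lower values such as those reported for Model 1 indicate better performance.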

Ensemble of Fuzzy Decision Tree for Efficient Indoor Space Recognition

  • Kim, Kisang;Choi, Hyung-Il
    • Journal of the Korea Society of Computer and Information / v.22 no.4 / pp.33-39 / 2017
  • In this paper, we extend the classification process to an ensemble of fuzzy decision trees. For indoor space recognition, many studies use the Boosted Tree, which consists of AdaBoost and decision trees. The Boosted Tree extracts an optimal decision tree in stages. At each stage, it extracts a good decision tree by minimizing the weighted classification error. Such a decision tree makes hard decisions. In most cases, hard decisions produce errors when classifying samples near a dividing point. Therefore, we propose an ensemble of fuzzy decision trees, which adds flexibility to the Boosted Tree algorithm as well as high performance. In the experimental results, the accuracy of the proposed method improved by about 13% over the traditional one.
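The difference between a hard and a fuzzy decision near a dividing point can be shown with a single split. This is only a sketch under assumptions: the sigmoid membership function, the `softness` parameter, and the function names are chosen for illustration and are not taken from the paper, and the boosting stage is omitted.

```python
import math

def hard_split(x, threshold):
    """Hard decision: full membership on one side of the threshold."""
    return 1.0 if x > threshold else 0.0

def fuzzy_split(x, threshold, softness=1.0):
    """Fuzzy decision: sigmoid membership in the right-hand branch (assumed form)."""
    return 1.0 / (1.0 + math.exp(-(x - threshold) / softness))

def fuzzy_stump_predict(x, threshold, left_value, right_value, softness=1.0):
    """Blend the two leaf values by the degree of membership in each branch."""
    m = fuzzy_split(x, threshold, softness)
    return (1.0 - m) * left_value + m * right_value

# Near the dividing point the hard output flips, while the fuzzy output changes gradually.
for x in (4.9, 5.0, 5.1):
    print(x, hard_split(x, 5.0), round(fuzzy_stump_predict(x, 5.0, 0.0, 1.0), 3))
```

A small measurement error around the threshold therefore changes the fuzzy output only slightly, which is the flexibility the abstract refers to.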

Noise Correction of Remote Sensing Imageries: Application to KOMPSAT/OSMI Data

  • Kang, Y.Q.;Ahn, Y.H.
    • Proceedings of the KSRS Conference / 2003.11a / pp.694-696 / 2003
  • The KOMPSAT/OSMI remote sensing data, with an 800 km swath, are collected by the whisk broom method employing 96 charge-coupled devices (CCDs). The striping noise in the OSMI imagery, which arises mainly from the non-uniform sensitivities of the 96 CCDs, is the major hindrance to oceanographic applications of the OSMI data. The OSMI images are corrected by the 'Ensemble Smoothness' method, which is based on the assumption that the series of the averages and variances of the digital numbers in each line should vary smoothly. The data of each line are corrected by a linear regression model whose coefficients are obtained by the Ensemble Smoothness method. Our algorithm can be applied not only to OSMI data but also to other remote sensing data collected by whisk broom or push broom sensors.
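One way to read the smoothness assumption is as a per-line linear correction (gain and offset) that moves each line's mean and standard deviation onto smoothed versions of the line-to-line series. The sketch below follows that reading. It is an interpretation, not the authors' algorithm: the moving-average smoother, the window size, the function names, and the toy image are all assumptions.

```python
import statistics

def moving_average(series, window=3):
    """Centered moving average, with a shorter window at the edges."""
    half = window // 2
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def destripe(lines, window=3):
    """Apply v' = gain * v + offset per line so that line means and standard
    deviations follow smoothed series (an assumed reading of the method)."""
    means = [statistics.fmean(line) for line in lines]
    stds = [statistics.pstdev(line) for line in lines]
    target_means = moving_average(means, window)
    target_stds = moving_average(stds, window)
    corrected = []
    for line, m, s, tm, ts in zip(lines, means, stds, target_means, target_stds):
        gain = ts / s if s > 0 else 1.0    # leave constant lines unscaled
        offset = tm - gain * m
        corrected.append([gain * v + offset for v in line])
    return corrected

# Toy image (hypothetical): the middle line comes from an over-sensitive detector.
lines = [[10.0, 12.0, 14.0], [20.0, 24.0, 28.0], [10.0, 12.0, 14.0]]
corrected = destripe(lines)
print([round(statistics.fmean(line), 2) for line in corrected])
```

After correction the line means vary much less from line to line than before, which is the visible effect of removing stripes.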


An Ensemble Classifier using Two Dimensional LDA

  • Park, Cheong-Hee
    • Journal of Korea Multimedia Society / v.13 no.6 / pp.817-824 / 2010
  • Linear Discriminant Analysis (LDA) has been successfully applied for dimension reduction in face recognition. However, LDA requires the transformation of a face image into a one-dimensional vector, and this process can cause the correlation information among neighboring pixels to be disregarded. In contrast, 2D-LDA uses 2D images directly without a transformation process, and it has been shown to be superior to traditional LDA. Nevertheless, 2D-LDA has some problems. First, it is difficult to determine the optimal number of feature vectors in the reduced-dimensional space. Second, the size of the rectangular windows used in 2D-LDA strongly affects classification accuracy, but there is no reliable way to determine an optimal window size. In this paper, we propose a new algorithm to overcome these problems in 2D-LDA. We adopt an ensemble approach that combines several classifiers obtained by using various window sizes. A practical method to determine the number of feature vectors is also presented. Experimental results demonstrate that the proposed method can overcome the difficulties of choosing an optimal window size and number of feature vectors.

Minimization of Motion Artifact During Exercise in Impedance Cardiography (임피던스 심장기록법에서 운동으로 인한 Motion Artifact의 최소화)

  • Kim, Jung-Chan;Kim, Jeong-Yeol;Kim, Deok-Won;Youn, Dae-Hee
    • Proceedings of the KOSOMBE Conference / v.1989 no.05 / pp.71-73 / 1989
  • The origins of the motion artifact resulting from exercise in impedance cardiography were explained, and the ensemble average technique was applied to reduce the motion artifact, enabling the measurement of cardiac output during exercise. An algorithm for the ensemble average was developed and applied to actual impedance signals. It was found that the minimum number of samples was 20 and the sampling frequency was 500 Hz. Using the ensemble average technique, it was possible to measure cardiac output continuously during treadmill exercise. It is therefore hoped that this study may contribute to the areas of exercise physiology and sports medicine.
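Ensemble averaging reduces uncorrelated noise by averaging time-aligned repetitions of the same waveform point by point. The sketch below shows the effect on a synthetic signal. The sine template, the noise level, the choice of 20 repetitions, and the function names are illustrative assumptions, and beat detection and alignment are taken as already done.

```python
import math
import random

def ensemble_average(beats):
    """Point-by-point mean of equally long, time-aligned beats."""
    n = len(beats)
    return [sum(samples) / n for samples in zip(*beats)]

def rms_error(signal, reference):
    """Root mean square difference between a signal and a reference."""
    return math.sqrt(
        sum((s - r) ** 2 for s, r in zip(signal, reference)) / len(reference)
    )

# Synthetic example: one clean template plus independent noise on each repetition.
rng = random.Random(0)
template = [math.sin(2 * math.pi * i / 50) for i in range(50)]
beats = [[v + rng.gauss(0, 0.5) for v in template] for _ in range(20)]
averaged = ensemble_average(beats)

print("single beat error:", rms_error(beats[0], template))
print("ensemble error:   ", rms_error(averaged, template))
```

For independent noise the residual error falls roughly with the square root of the number of averaged repetitions, which is why a minimum count matters. Motion artifact that is synchronized with the heartbeat would not cancel in this way.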


Fail Prediction of DRAM Module Outgoing Quality Assurance Inspection using Ensemble Learning Algorithm (앙상블 학습을 이용한 DRAM 모듈 출하 품질보증 검사 불량 예측)

  • Kim, Min-Seok;Baek, Jun-Geol
    • IE interfaces / v.25 no.2 / pp.178-186 / 2012
  • The DRAM module is an important part of servers, workstations and personal computers. Its malfunction causes a lot of damage to the customer's system. Therefore, customers demand products of the highest quality. The company applies the DRAM module Outgoing Quality Assurance (OQA) inspection to secure the highest quality. It is the key process that decides the shipment of products through a sample inspection method with customer-oriented tests. A high fraction of defectives entering OQA inevitably causes high quality costs. This article proposes applying ensemble learning to classify lot status in order to minimize the ratio of wrong decisions in OQA, and it observes a potential for reducing wrong decisions.

Comparison of tree-based ensemble models for regression

  • Park, Sangho;Kim, Chanmin
    • Communications for Statistical Applications and Methods / v.29 no.5 / pp.561-589 / 2022
  • When multiple classification and regression trees are combined, tree-based ensemble models such as random forest (RF) and Bayesian additive regression trees (BART) are produced. In this study, we compare the model structures and performances of various ensemble models in regression settings. RF learns from bootstrapped samples and selects a splitting variable from a subset of predictors gathered at each node. The BART model is specified as a sum of trees and is fitted using the Bayesian backfitting algorithm. Through extensive simulation studies, the strengths and drawbacks of the two methods in the presence of missing data, high-dimensional data, or highly correlated data are investigated. In the presence of missing data, BART performs well in general, whereas RF provides adequate coverage. BART outperforms RF in high-dimensional, highly correlated data. However, in all of the scenarios considered, RF has a shorter computation time. The performance of the two methods is also compared using two real data sets that represent the aforementioned situations, and the same conclusion is reached.
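The two ingredients the abstract attributes to RF, bootstrapped samples and a random subset of candidate predictors at each node, can be sketched with one-split trees (stumps). This is a simplification for illustration, not the models compared in the paper: real random forests grow deeper trees, BART is not covered, and all names and the toy data are assumptions.

```python
import random
from statistics import fmean

def fit_stump(X, y, features):
    """Best single split by squared error over the given feature indices."""
    best = None
    for j in features:
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            ml, mr = fmean(left), fmean(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    if best is None:                        # no valid split: constant prediction
        m = fmean(y)
        return (features[0], float("inf"), m, m)
    return best[1:]

def predict_stump(stump, row):
    j, t, ml, mr = stump
    return ml if row[j] <= t else mr

def fit_forest(X, y, n_trees=25, mtry=1, seed=0):
    """Each stump sees a bootstrap sample and a random subset of predictors."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # sample rows with replacement
        feats = rng.sample(range(p), mtry)           # candidate splitting variables
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def predict_forest(forest, row):
    """Average the predictions of all stumps."""
    return fmean([predict_stump(s, row) for s in forest])

# Toy data (hypothetical): the target depends on feature 0 only; feature 1 is irrelevant.
X = [(i, (i * 7) % 3) for i in range(10)]
y = [0.0 if i < 5 else 10.0 for i in range(10)]
forest = fit_forest(X, y)
print(predict_forest(forest, (0, 0)), predict_forest(forest, (9, 0)))
```

Because each stump has a single node, drawing the predictor subset once per stump is the same as drawing it at each node. With deeper trees the subset would be redrawn at every split.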

Analysis on the Planar Bowtie Antenna for IMT-2000 Handset (IMT-2000 핸드셋용 평면형 Bowtie 안테나 해석)

  • Lee, Hee-Suk;Kim, Nam
    • The Journal of Korean Institute of Electromagnetic Engineering and Science / v.11 no.5 / pp.681-688 / 2000
  • In this paper, a small, light planar antenna is designed and analyzed as a handset antenna for IMT-2000. Using the Ensemble simulator, which is based on the method of moments (MoM), the design parameters that determine the resonant frequency are found. The antenna is then analyzed with the Ensemble simulation and the FDTD numerical method so that it resonates at the frequency allocated for IMT-2000, with fixed antenna dimensions and a wing angle of 21° as a design parameter. Although the FDTD results are very accurate, the analysis introduces errors due to the staircasing approximation along the slope of the bowtie. To reduce this error, the cells containing the perfect conductor/free space boundary are divided into four ranges. Each range is then calculated with a different equation, which modifies the H-field by adding the components of the area and length of the cell filled with free space. As a result, with the modified FDTD algorithm, the narrow return-loss bandwidth calculated with the standard FDTD algorithm can be extended to the desired range.
