• Title/Summary/Keyword: Voting Classifier


A New Incremental Learning Algorithm with Probabilistic Weights Using Extended Data Expression

  • Yang, Kwangmo;Kolesnikova, Anastasiya;Lee, Won Don
    • Journal of information and communication convergence engineering / v.11 no.4 / pp.258-267 / 2013
  • This paper presents a new incremental learning algorithm that uses extended data expression and is based on probabilistic compounding. An incremental learning algorithm generates an ensemble of weak classifiers and combines them into a strong classifier by weighted majority voting to improve classification performance. We introduce a new probabilistic weighted majority voting scheme founded on extended data expression, in which the class distribution of each classifier's output is used to combine the classifiers. UChoo, a decision tree classifier for extended data expression, serves as the base classifier because it produces an extended output expression that defines the class distribution of the output. Extended data expression and the UChoo classifier are powerful techniques for classification and rule refinement problems. In this paper, extended data expression is applied to obtain probabilistic results with probabilistic majority voting. To show its performance advantages, the new algorithm is compared with Learn++, an incremental ensemble-based algorithm.
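
The idea of combining classifiers through their output class distributions, rather than through hard votes, can be sketched as follows. This is a minimal illustration only: the function names, the example distributions, and the equal weights are assumptions, and the abstract does not specify how the paper actually derives its probabilistic weights.

```python
from collections import defaultdict


def probabilistic_weighted_vote(distributions, weights):
    """Combine per-classifier class distributions into one normalized distribution.

    distributions: list of dicts mapping class label -> probability
    weights: list of non-negative classifier weights, one per classifier
    """
    if len(distributions) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    combined = defaultdict(float)
    for dist, weight in zip(distributions, weights):
        for label, prob in dist.items():
            combined[label] += weight * prob
    total = sum(combined.values())
    if total == 0:
        raise ValueError("all weights or probabilities are zero")
    return {label: score / total for label, score in combined.items()}


def predict(distributions, weights):
    """Return the label with the largest combined probability."""
    combined = probabilistic_weighted_vote(distributions, weights)
    return max(combined, key=combined.get)


# Hypothetical outputs of three weak classifiers for one sample.
outputs = [
    {"a": 0.90, "b": 0.10},  # confident in class "a"
    {"a": 0.40, "b": 0.60},  # weakly prefers "b"
    {"a": 0.45, "b": 0.55},  # weakly prefers "b"
]
weights = [1.0, 1.0, 1.0]  # assumed equal weights, for illustration only

combined = probabilistic_weighted_vote(outputs, weights)
winner = predict(outputs, weights)
```

In this made-up example a hard majority vote would choose "b" (two votes to one), while combining the distributions chooses "a", because the single confident classifier outweighs the two marginal ones.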

Operation Plan of Big Data Prediction Model using Cut-off-Voting Classifier in Administrative Big Data Environment (행정 빅데이터 환경에서 컷오프-투표 분류기를 활용한 빅데이터 예측모형의 실험)

  • Woosik Lee
    • The Journal of the Convergence on Culture Technology / v.10 no.3 / pp.145-154 / 2024
  • To operate predictive models that use administrative big data, it is crucial to account for policy changes and the highly volatile nature of the data. For this scenario, this study proposes the Cut-off Voting Classifier (CVC) algorithm, which prevents a sharp decline in accuracy by using multiple weak classifiers. The study validates the proposed algorithm's performance through experiments. The performance evaluation shows that the algorithm maintains stable prediction rates even in situations where the accuracy of a predictive model declines sharply.

Cognitive Impairment Prediction Model Using AutoML and Lifelog

  • Hyunchul Choi;Chiho Yoon;Sae Bom Lee
    • Journal of the Korea Society of Computer and Information / v.28 no.11 / pp.53-63 / 2023
  • This study developed a cognitive impairment prediction model, intended as a screening test for preventing dementia in the elderly, using Automated Machine Learning (AutoML). We used the 'Wearable lifelog data for high-risk dementia patients' dataset from the National Information Society Agency and conducted the analysis with PyCaret 3.0.0 in the Google Colaboratory environment. The analysis proceeded in two steps: first, we selected five models that demonstrated excellent classification performance for model development and lifelog data analysis; next, we used ensemble learning to integrate these models and assessed their performance. The Voting Classifier, Gradient Boosting Classifier, Extreme Gradient Boosting, Light Gradient Boosting Machine, Extra Trees Classifier, and Random Forest Classifier models showed high predictive performance, in that order. The findings also identified 'Average respiration per minute during sleep' and 'Average heart rate per minute during sleep' as the most important feature variables for accurate prediction. Finally, the results suggest that machine learning and lifelog data could be used to manage and prevent cognitive impairment in the elderly more effectively.

A Feature Selection-based Ensemble Method for Arrhythmia Classification

  • Namsrai, Erdenetuya;Munkhdalai, Tsendsuren;Li, Meijing;Shin, Jung-Hoon;Namsrai, Oyun-Erdene;Ryu, Keun Ho
    • Journal of Information Processing Systems / v.9 no.1 / pp.31-40 / 2013
  • In this paper, a novel method is proposed for building an ensemble of classifiers by using a feature selection schema. The feature selection schema identifies the feature sets that most affect arrhythmia classification. First, a number of feature subsets are extracted by applying the feature selection schema to the original dataset. Then a classification model is built on each feature subset. Finally, we combine the classification models through a voting approach to form a classification ensemble. The voting approach uses both the classification error rate and the feature selection rate to calculate the score of each classifier in the ensemble. The feature selection rate depends on the order in which the feature subsets are extracted. In the experiment, we applied our method to an arrhythmia dataset and generated the top three disjoint feature sets. We then built three classifiers on these feature subsets and formed the classifier ensemble by using the voting approach. Our method can improve classification accuracy on high-dimensional datasets. Both the individual classifiers and their ensemble performed better than a classifier based on the whole feature space of the dataset. The proposed approach improved classification performance and yielded a more stable classification model.

Text-independent Speaker Identification by Bagging VQ Classifier

  • Kyung, Youn-Jeong;Park, Bong-Dae;Lee, Hwang-Soo
    • The Journal of the Acoustical Society of Korea / v.20 no.2E / pp.17-24 / 2001
  • In this paper, we propose a bootstrap aggregating (bagging) vector quantization (VQ) classifier to improve the performance of a text-independent speaker recognition system. This method generates multiple training data sets by resampling the original training data set, constructs the corresponding VQ classifiers, and then integrates the multiple VQ classifiers into a single classifier by voting. The bagging method has been proven to greatly improve the performance of unstable classifiers. Through two different experiments, this paper shows that the VQ classifier is unstable. In the first experiment, the bias and variance of a VQ classifier are computed with a waveform database, and the variance is compared with that of the classification and regression tree (CART) classifier[1]. The variance of the VQ classifier is shown to be as large as that of the CART classifier. The second experiment involves speaker recognition, where the recognition rates vary significantly with minor changes in the training data set. Closed-set, text-independent speaker identification experiments are performed with the TIMIT database to compare the bagging VQ classifier with the conventional VQ classifier. The bagging VQ classifier outperforms the conventional VQ classifier, including in problems with small training data sets.
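
The resample, train, and vote procedure described above can be sketched in a few lines. This is an assumption-laden illustration, not the paper's system: a nearest-centroid classifier (one codeword per class) stands in for a full VQ codebook, and the two-dimensional feature vectors and speaker labels are invented.

```python
import random
from collections import Counter


def train_centroids(samples):
    """Compute one centroid per label from a list of (feature_vector, label)."""
    sums, counts = {}, Counter()
    for x, y in samples:
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def classify(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))


def bagging_fit(samples, n_models, rng):
    """Train n_models classifiers, each on a bootstrap resample of samples."""
    models = []
    for _ in range(n_models):
        # Draw len(samples) items with replacement from the training set.
        boot = [rng.choice(samples) for _ in samples]
        models.append(train_centroids(boot))
    return models


def bagging_predict(models, x):
    """Aggregate the individual predictions by plurality vote."""
    votes = Counter(classify(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Toy data: two hypothetical speakers with well-separated features.
train = [
    ([0.0, 0.1], "spk1"), ([0.2, 0.0], "spk1"), ([0.1, 0.2], "spk1"),
    ([1.0, 0.9], "spk2"), ([0.8, 1.0], "spk2"), ([0.9, 1.1], "spk2"),
]
models = bagging_fit(train, n_models=11, rng=random.Random(0))
```

Calling `bagging_predict(models, [0.05, 0.1])` returns the plurality label across the eleven resampled classifiers. An odd number of models avoids ties in this two-class toy setting.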


Double-Bagging Ensemble Using WAVE

  • Kim, Ahhyoun;Kim, Minji;Kim, Hyunjoong
    • Communications for Statistical Applications and Methods / v.21 no.5 / pp.411-422 / 2014
  • A classification ensemble method aggregates different classifiers obtained from training data to classify new data points. Voting algorithms are the typical tools for summarizing the outputs of the classifiers in an ensemble. WAVE, proposed by Kim et al. (2011), is a weight-adjusted voting algorithm for ensembles of classifiers that uses an optimal weight vector. In this study, we applied the WAVE algorithm to the double-bagging method (Hothorn and Lausen, 2003) when constructing an ensemble, to see whether performance improves significantly. The results showed that double-bagging with the WAVE algorithm performs better than other ensemble methods that employ plurality voting. In addition, double-bagging with the WAVE algorithm is comparable to the random forest ensemble method when the ensemble size is large.

Systematic Approach for Detecting Text in Images Using Supervised Learning

  • Nguyen, Minh Hieu;Lee, GueeSang
    • International Journal of Contents / v.9 no.2 / pp.8-13 / 2013
  • Automatically locating text in images is a challenging task. In this approach, we build a three-stage system for text detection. The system uses tensor voting and the Completed Local Binary Pattern (CLBP) to classify text and non-text regions. Tensor voting generates text line information, which is very useful for localizing candidate text regions, while a Nearest Neighbor classifier trained on discriminative features obtained by the CLBP-based operator is used to refine the results. The whole algorithm is implemented in MATLAB and applied to all images of the ICDAR 2011 Robust Reading Competition data set. Experiments show promising performance for this method.

Text-independent Speaker Identification Using Soft Bag-of-Words Feature Representation

  • Jiang, Shuangshuang;Frigui, Hichem;Calhoun, Aaron W.
    • International Journal of Fuzzy Logic and Intelligent Systems / v.14 no.4 / pp.240-248 / 2014
  • We present a robust speaker identification algorithm that uses novel features based on a soft bag-of-words representation and a simple Naive Bayes classifier. The bag-of-words (BoW) histogram feature descriptor is typically constructed by summarizing and identifying representative prototypes from low-level spectral features extracted from training data. In this paper, we define a generalization of the standard BoW. In particular, we define three types of BoW that are based on crisp voting, fuzzy memberships, and possibilistic memberships. We analyze our mapping with three common classifiers: the Naive Bayes classifier (NB), the K-nearest neighbor classifier (KNN), and support vector machines (SVM). The proposed algorithms are evaluated using large datasets that simulate medical crises. We show that the proposed soft bag-of-words feature representation achieves a significant improvement over state-of-the-art methods.
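
The difference between crisp voting and fuzzy memberships when building a BoW histogram can be illustrated with a short sketch. The abstract does not give the paper's membership definitions, so the fuzzy variant below assumes the standard fuzzy c-means membership formula with fuzzifier `m`; the prototypes and feature vectors are invented, and the possibilistic variant is omitted.

```python
import math


def crisp_bow(features, prototypes):
    """Histogram where each feature casts one full vote for its nearest prototype."""
    hist = [0.0] * len(prototypes)
    for x in features:
        dists = [math.dist(x, p) for p in prototypes]
        hist[dists.index(min(dists))] += 1.0  # ties go to the first prototype
    return [h / len(features) for h in hist]


def fuzzy_bow(features, prototypes, m=2.0):
    """Histogram where each feature spreads its vote by fuzzy membership.

    Assumes the fuzzy c-means membership
        u_k = d_k^(-2/(m-1)) / sum_j d_j^(-2/(m-1)),
    where d_k is the distance from the feature to prototype k.
    """
    hist = [0.0] * len(prototypes)
    for x in features:
        dists = [math.dist(x, p) for p in prototypes]
        if 0.0 in dists:
            # The feature coincides with one or more prototypes: share the vote among them.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            memberships = [h / sum(hits) for h in hits]
        else:
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            memberships = [v / sum(inv) for v in inv]
        hist = [h + u for h, u in zip(hist, memberships)]
    return [h / len(features) for h in hist]


# Two hypothetical prototypes and three low-level feature vectors.
prototypes = [(0.0, 0.0), (2.0, 0.0)]
features = [(0.5, 0.0), (1.5, 0.0), (1.0, 0.0)]

crisp = crisp_bow(features, prototypes)
fuzzy = fuzzy_bow(features, prototypes)
```

The third feature lies exactly between the two prototypes. Crisp voting must assign it wholly to one bin, giving a histogram of roughly [0.67, 0.33], whereas the fuzzy histogram splits it evenly and stays balanced at about [0.5, 0.5].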

A Detailed Analysis of Classifier Ensembles for Intrusion Detection in Wireless Network

  • Tama, Bayu Adhi;Rhee, Kyung-Hyune
    • Journal of Information Processing Systems / v.13 no.5 / pp.1203-1212 / 2017
  • Intrusion detection systems (IDSs) are crucial given the overwhelming increase in attacks on computing infrastructure. They intelligently detect malicious activity and predict future attack patterns through classification analysis based on machine learning and data mining techniques. This paper is devoted to thoroughly evaluating classifier ensembles for IDSs in IEEE 802.11 wireless networks. Two ensemble techniques, voting and stacking, are employed to combine three base classifiers: decision tree (DT), random forest (RF), and support vector machine (SVM). We use the area under the ROC curve (AUC) as the performance metric. Finally, we conduct two statistical significance tests to evaluate the performance differences among the classifiers.
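
As a rough illustration of how a voting ensemble might be scored with AUC, the sketch below averages the scores of three base classifiers (soft voting) and computes AUC as the fraction of positive-negative pairs that are ranked correctly. All scores and labels are invented for the example and are not results from the paper; stacking and the significance tests are not shown.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def soft_vote(*score_lists):
    """Average the per-sample scores of several classifiers."""
    return [sum(column) / len(column) for column in zip(*score_lists)]


# Hypothetical attack (1) / normal (0) labels and per-classifier attack scores.
labels = [1, 1, 0, 0]
dt_scores = [0.90, 0.40, 0.60, 0.10]
rf_scores = [0.60, 0.70, 0.20, 0.65]
svm_scores = [0.80, 0.60, 0.50, 0.30]

voted = soft_vote(dt_scores, rf_scores, svm_scores)
auc_dt = roc_auc(labels, dt_scores)
auc_rf = roc_auc(labels, rf_scores)
auc_vote = roc_auc(labels, voted)
```

In this toy data the decision tree and random forest each misrank one positive-negative pair (AUC 0.75), while the averaged scores rank every pair correctly (AUC 1.0). The pairwise formulation is quadratic in the sample count, so a rank-based computation would be preferable for large test sets.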

Classification based Knee Bone Detection using Context Information (문맥 정보를 이용한 분류 기반 무릎 뼈 검출 기법)

  • Shin, Seungyeon;Park, Sanghyun;Yun, Il Dong;Lee, Sang Uk
    • Journal of Broadcast Engineering / v.18 no.3 / pp.401-408 / 2013
  • In this paper, we propose a method that automatically detects organs with similar appearances in medical images by learning both context and appearance features. Because most existing detection methods train the classifier on appearance features alone, detection errors occur when a medical image includes multiple organs with similar appearances. In the proposed method, starting from the probabilities produced by the appearance-based classifier, a new classifier that incorporates the context feature is created by iteratively learning the characteristics of the probability distribution around the voxel of interest. Furthermore, both efficiency and accuracy are improved through a 'region based voting scheme' in the test stage. To evaluate the proposed method, we detect the femur and tibia, which have similar appearances, in the SKI10 knee joint dataset. The proposed method outperformed a detection method that uses only the appearance feature in terms of overall detection performance.