• Title/Summary/Keyword: Ensemble Algorithm

Construction of Robust Bayesian Network Ensemble using a Speciated Evolutionary Algorithm (종 분화 진화 알고리즘을 이용한 안정된 베이지안 네트워크 앙상블 구축)

  • Yoo Ji-Oh;Kim Kyung-Joong;Cho Sung-Bae
    • Journal of KIISE:Software and Applications / v.31 no.12 / pp.1569-1580 / 2004
  • The Bayesian network, which represents the joint probability distribution of a domain, is a commonly used approach for dealing with uncertainty. There have been attempts to learn the structure of Bayesian networks automatically, and recently many researchers have designed Bayesian network structures using evolutionary algorithms. However, most of them use only the single fittest solution in the last generation. Because it is difficult to combine all the important factors into a single evaluation function, the best solution is often biased and less adaptive. In this paper, we present a method of generating diverse Bayesian network structures through fitness sharing and combining them by a Bayesian method for adaptive inference. To evaluate performance, we conduct experiments on learning Bayesian networks with artificially generated data from the ASIA and ALARM networks. According to experiments under diverse conditions, the proposed method provides better robustness and adaptation for handling uncertainty.

Modeling and Selecting Optimal Features for Machine Learning Based Detections of Android Malwares (머신러닝 기반 안드로이드 모바일 악성 앱의 최적 특징점 선정 및 모델링 방안 제안)

  • Lee, Kye Woong;Oh, Seung Taek;Yoon, Young
    • KIPS Transactions on Software and Data Engineering / v.8 no.11 / pp.427-432 / 2019
  • In this paper, we propose three approaches to modeling Android malware. The first relies on human security experts to meticulously select feature sets. In the second, we choose the 300 features with the highest importance among the top 99% of features in terms of occurrence rate. The third combines multiple models and identifies malware through weighted voting. In addition, we applied a novel method of eliminating permission information, which used to be regarded as a critical factor for distinguishing malware. With our carefully generated feature sets and weighted voting by the ensemble algorithm, we reached a highest malware detection accuracy of 97.8%. We also verified that discarding the permission information led to an improvement in false positive and false negative rates.
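
The weighted voting mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how the weights are chosen or which models are combined, so the function name `weighted_vote`, the label encoding (1 = malware, 0 = benign), and the example weights are all assumptions made for illustration.

```python
def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: one predicted label per model (here 1 = malware, 0 = benign).
    weights: one non-negative weight per model (hypothetical values).
    """
    totals = {}
    for label, weight in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + weight
    # The label whose supporting models carry the most combined weight wins.
    return max(totals, key=totals.get)


# One heavily weighted model can outvote two lightly weighted ones.
decision = weighted_vote([1, 0, 0], [0.6, 0.2, 0.1])
```

With equal weights the same function reduces to plain majority voting.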

Evaluation of Multi-classification Model Performance for Algal Bloom Prediction Using CatBoost (머신러닝 CatBoost 다중 분류 알고리즘을 이용한 조류 발생 예측 모형 성능 평가 연구)

  • Juneoh Kim;Jungsu Park
    • Journal of Korean Society on Water Environment / v.39 no.1 / pp.1-8 / 2023
  • Monitoring and prediction of water quality are essential for effective river pollution prevention and water quality management. In this study, a multi-classification model was developed to predict the chlorophyll-a (Chl-a) level in rivers. The model was developed using CatBoost, a novel ensemble machine learning algorithm, with hourly field monitoring data collected from January 1 to December 31, 2015. For model development, Chl-a was classified into class 1 (Chl-a≤10 ㎍/L), class 2 (10<Chl-a≤50 ㎍/L), and class 3 (Chl-a>50 ㎍/L), for which the numbers of data points used for model training were 27,192, 11,031, and 511, respectively. The macro averages of precision, recall, and F1-score for the three classes were 0.58, 0.58, and 0.58, respectively, while the weighted averages were 0.89, 0.90, and 0.89, respectively. The model showed relatively poor performance for class 3, where the number of observations was much smaller than in the other two classes. The imbalance of data distribution among the three classes was resolved using the synthetic minority over-sampling technique (SMOTE) algorithm, after which the training data were evenly distributed at 26,868 per class. After SMOTE application, the macro averages of precision, recall, and F1-score for the three classes were 0.58, 0.70, and 0.59, respectively, while the weighted averages were 0.88, 0.84, and 0.86.
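
The gap between the macro and weighted averages reported in this abstract follows from how the two are computed: the macro average treats every class equally, while the weighted average scales each class by its number of observations. The sketch below shows the arithmetic only. The per-class scores and class counts in it are hypothetical and are not taken from the paper.

```python
def macro_average(scores):
    """Unweighted mean of per-class scores: every class counts equally."""
    return sum(scores) / len(scores)


def weighted_average(scores, supports):
    """Mean of per-class scores weighted by the number of samples per class."""
    total = sum(supports)
    return sum(score * count for score, count in zip(scores, supports)) / total


# Hypothetical per-class F1-scores and class sizes for an imbalanced problem:
# the rare third class is predicted poorly.
per_class_f1 = [0.90, 0.80, 0.10]
supports = [90, 9, 1]

macro_f1 = macro_average(per_class_f1)                  # pulled down by the rare class
weighted_f1 = weighted_average(per_class_f1, supports)  # dominated by the large class
```

A poorly predicted minority class lowers the macro average sharply but barely moves the weighted average, which is the pattern the abstract describes for class 3.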

Incremental Ensemble Learning for The Combination of Multiple Models of Locally Weighted Regression Using Genetic Algorithm (유전 알고리즘을 이용한 국소가중회귀의 다중모델 결합을 위한 점진적 앙상블 학습)

  • Kim, Sang Hun;Chung, Byung Hee;Lee, Gun Ho
    • KIPS Transactions on Software and Data Engineering / v.7 no.9 / pp.351-360 / 2018
  • The LWR (Locally Weighted Regression) model, traditionally a lazy learning model, produces a prediction for a given input variable, the query point. It is a regression equation over a short interval, obtained by learning that gives higher weight to data closer to the query point. We study an incremental ensemble learning approach for LWR, a form of lazy, memory-based learning. The proposed method sequentially generates LWR models over time and integrates them using a genetic algorithm to obtain a solution at a specific query point. A weakness of existing LWR models is that multiple LWR models can be generated depending on the indicator function and the data sample selection, and the quality of the predictions varies with the model. However, no research has addressed the selection or combination of multiple LWR models. In this study, after generating the initial LWR model according to the indicator function and the sample data set, we iterate an evolutionary learning process to obtain a proper indicator function and assess the LWR models on the other sample data sets to overcome data set bias. We adopt an eager learning method to generate and store LWR models gradually as data is generated for all sections. To obtain a prediction at a specific point in time, an LWR model is generated from newly generated data within a predetermined interval and then combined with the existing LWR models in the section using a genetic algorithm. The proposed method shows better results than selecting among multiple LWR models using the simple average method. The results of this study are compared with predictions from multiple regression analysis on real data such as hourly traffic volume in a specific area and hourly sales at a highway rest area.
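
As background for this abstract, a single locally weighted regression prediction can be sketched as a weighted least-squares line fit centered on the query point. This is a generic one-dimensional sketch, not the paper's method: the Gaussian kernel, the `bandwidth` parameter, and the function name `lwr_predict` are assumptions, and the genetic-algorithm combination of multiple models is not shown.

```python
import math


def lwr_predict(xs, ys, query, bandwidth=1.0):
    """Predict y at `query` with a locally weighted straight-line fit.

    Each training point is weighted by a Gaussian kernel of its distance to
    the query point (assumed kernel), so nearby points dominate the fit.
    """
    weights = [math.exp(-((x - query) ** 2) / (2.0 * bandwidth ** 2)) for x in xs]
    total = sum(weights)
    # Weighted means of x and y.
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    # Weighted variance of x and weighted covariance of x and y.
    var_x = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    cov_xy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    slope = cov_xy / var_x if var_x > 0 else 0.0
    return mean_y + slope * (query - mean_x)


# Usage: data drawn from y = 2x + 1, queried between two training points.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x + 1.0 for x in xs]
prediction = lwr_predict(xs, ys, query=2.5, bandwidth=1.0)
```

Because the fit is recomputed for every query, different kernels, bandwidths, or data samples yield different models for the same query, which is the model-selection problem the abstract raises.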

Performance Improvement of Ensemble Speciated Neural Networks using Kullback-Leibler Entropy (Kullback-Leibler 엔트로피를 이용한 종분화 신경망 결합의 성능향상)

  • Kim, Kyung-Joong;Cho, Sung-Bae
    • The Transactions of the Korean Institute of Electrical Engineers D / v.51 no.4 / pp.152-159 / 2002
  • Fitness sharing, which shares fitness among individuals whose calculated distance is smaller than the sharing radius, is one of the representative speciation methods and can complement evolutionary algorithms, which tend to converge to one solution. Recently, there has been much research on designing neural network architectures using evolutionary algorithms, but most of it uses only the fittest solution in the last generation. In this paper, we describe generating diverse neural networks using fitness sharing and combining them to compute outputs, and then propose calculating the distance between individuals using a modified Kullback-Leibler entropy to improve fitness sharing performance. In experiments on the Australian credit card assessment, breast cancer, and diabetes data sets in the UCI database, the proposed method performs better than not only simple output averaging or Pearson correlation but also previously published methods.
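
The fitness sharing described in this abstract can be sketched in its common textbook form, where each individual's raw fitness is divided by a niche count accumulated over neighbors inside the sharing radius. The triangular sharing function, the `alpha` exponent, and the function name `shared_fitness` are assumptions. The distance matrix is supplied by the caller, and the paper's modified Kullback-Leibler distance is not reproduced here.

```python
def shared_fitness(fitness, distance, radius, alpha=1.0):
    """Divide each raw fitness by its niche count.

    fitness: raw fitness per individual.
    distance: square matrix of pairwise distances, with distance[i][i] == 0.
    radius: sharing radius; only individuals closer than this share fitness.
    """
    shared = []
    for i, raw in enumerate(fitness):
        niche_count = 0.0
        for d in distance[i]:
            if d < radius:
                # Closer neighbors contribute more to the niche count.
                niche_count += 1.0 - (d / radius) ** alpha
        # The individual itself contributes 1, so niche_count is at least 1.
        shared.append(raw / niche_count)
    return shared


# Usage: individuals 0 and 1 are identical; individual 2 is far from both.
distances = [
    [0.0, 0.0, 5.0],
    [0.0, 0.0, 5.0],
    [5.0, 5.0, 0.0],
]
result = shared_fitness([1.0, 1.0, 1.0], distances, radius=1.0)
```

Crowded individuals are penalized while isolated ones keep their full fitness, which encourages the population to spread over several solutions instead of converging on one.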

Introduction to Gene Prediction Using HMM Algorithm

  • Kim, Keon-Kyun;Park, Eun-Sik
    • Journal of the Korean Data and Information Science Society / v.18 no.2 / pp.489-506 / 2007
  • Gene structure prediction, which predicts the protein coding regions in a given nucleotide sequence, is the most important process in annotating genes and greatly affects gene analysis and genome annotation. Because eukaryotic genes have more complicated structures in DNA sequences than prokaryotic genes, analysis programs for eukaryotic gene structure prediction use more diverse and more complicated computational models. Gene prediction methods for eukaryotic genes include the ab initio method, the similarity-based method, and the ensemble method, and each method uses various algorithms. This paper introduces how to predict genes using the HMM (Hidden Markov Model) algorithm and presents the process of gene prediction with well-known gene prediction programs.

Implementation of EP waveform Estimator using DSP chip and Microcomputer (DSP chip과 Microcomputer를 이용한 뇌 유발전위 추정기의 구현)

  • Kim, J.W.;Yoo, S.K.;Min, B.G.;Kim, J.W.;Kim, S.H.
    • Proceedings of the KOSOMBE Conference / v.1993 no.11 / pp.151-155 / 1993
  • Evoked potentials (EP) measured with scalp electrodes are often described as a deterministic process corrupted by uncorrelated electrical activity occurring in the brain; this ongoing electrical activity (ongoing EEG) acts as noise in EP recording. The conventional method of determining the EP waveform requires a long recording time. Unfortunately, most of the algorithms developed are too complicated to implement in real time. Thus, commercial EP recording devices use ensemble averaging for real-time processing. This paper introduces EP recording hardware for running advanced algorithms in real time. The hardware is composed of a DSP chip (TMS320c25) and a microcomputer.
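
The ensemble averaging mentioned in this abstract can be sketched as a sample-by-sample mean over repeated, stimulus-aligned recordings: the deterministic response is preserved while uncorrelated background activity tends to cancel. The function name and the toy trial values below are illustrative assumptions, not data from the paper.

```python
def ensemble_average(trials):
    """Average equal-length, stimulus-aligned trials sample by sample."""
    count = len(trials)
    # zip(*trials) groups the k-th sample of every trial together.
    return [sum(samples) / count for samples in zip(*trials)]


# Usage: two toy trials whose noise components are equal and opposite.
trials = [
    [1.0, 2.0, 3.0],
    [3.0, 2.0, 1.0],
]
waveform = ensemble_average(trials)
```

For uncorrelated zero-mean noise, the residual noise amplitude falls roughly with the square root of the number of trials, which is why the conventional approach needs a long recording time.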

Performance Improvement of MSAGF-MMA Adaptive Blind Equalization Using Multiple Step-Size LMS (다중 스텝 크기 LMS를 이용한 MSAGF-MMA 적응 블라인드 등화의 성능 개선)

  • Jeong, Young-Hwa
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.13 no.4 / pp.83-89 / 2013
  • Adaptive blind equalization is a technique used to minimize the inter-symbol interference (ISI) that occurs on a communication channel during high-speed digital data transmission. In this paper, we propose a blind equalization algorithm that improves on the performance of the conventional MSAGF-MMA adaptive blind equalization algorithm by applying multiple step sizes. The algorithm applies to MSAGF-MMA an LMS algorithm with several step sizes, one for each region defined by the absolute value of the decision-directed error. Computer simulation confirms that the proposed algorithm performs considerably better than MMA and MSAGF-MMA in terms of convergence speed, residual ISI, residual error, and ensemble-averaged MSE in the steady state.

Multi-Cattle tracking with appearance and motion models in closed barns using deep learning

  • Han, Shujie;Fuentes, Alvaro;Yoon, Sook;Park, Jongbin;Park, Dong Sun
    • Smart Media Journal / v.11 no.8 / pp.84-92 / 2022
  • Precision livestock monitoring promises greater management efficiency for farmers and higher welfare standards for animals. Recent studies on video-based animal activity recognition and tracking have shown promising solutions for understanding animal behavior. To that end, surveillance cameras are installed diagonally above the barn in a typical cattle farm setup to monitor animals constantly. Under these circumstances, tracking individuals requires addressing challenges such as occlusion and visual appearance, which are the main reasons for track breakage and increased misidentification of animals. This paper presents a framework for multi-cattle tracking in closed barns with appearance and motion models. To overcome the above challenges, we modify the DeepSORT algorithm to achieve higher tracking accuracy through three contributions. First, we reduce the weight of appearance information. Second, we use an Ensemble Kalman Filter to predict the random motion information of cattle. Third, we propose a supplementary matching algorithm that compares the absolute cattle position in the barn to reassign lost tracks. The matching algorithm assumes that the number of cattle in the barn is fixed, so the edge of the barn is where new trajectories are most likely to emerge. Experiments are performed on our dataset collected on two cattle farms. Our algorithm achieves 70.37%, 77.39%, and 81.74% on HOTA, AssA, and IDF1, an improvement of 1.53%, 4.17%, and 0.96%, respectively, over the original method.

Ensemble Deep Network for Dense Vehicle Detection in Large Image

  • Yu, Jae-Hyoung;Han, Youngjoon;Kim, JongKuk;Hahn, Hernsoo
    • Journal of the Korea Society of Computer and Information / v.26 no.1 / pp.45-55 / 2021
  • This paper proposes an algorithm that efficiently detects dense small vehicles in a large image. It consists of two ensemble deep-learning network algorithms based on a coarse-to-fine method, so that the system can detect vehicles precisely on a selected sub-image. In the coarse step, a voting space is built from the result of each of various deep-learning networks individually. To select a sub-region, a voting map is made by combining the voting spaces. In the fine step, the sub-region selected in the coarse step is passed to the final deep-learning network. The sub-region can be defined using dynamic windows. In this paper, a pre-defined mapping table is used to define dynamic windows for perspective road images. The identity of a vehicle moving in each sub-region is determined by the closest bottom-center point of the detected vehicle's bounding box, and the vehicle is tracked across consecutive images using its box information. The proposed algorithm is evaluated for detection performance and real-time cost using day and night images captured by road CCTV.