• Title/Abstract/Keywords: bayesian classification

Performance Comparison of Naive Bayesian Learning and Centroid-Based Classification for e-Mail Classification

  • 김국표;권영식
    • 산업공학 / Vol. 18, No. 1 / pp. 10-21 / 2005
  • With the increasing proliferation of the World Wide Web, electronic mail systems have become very widely used communication tools. Research on e-mail classification is important because an e-mail classification system is a major engine of e-mail response management systems, which mine unstructured e-mail messages and automatically categorize them. In this research we compare the performance of Naive Bayesian learning and Centroid-Based Classification using different data sets from an on-line shopping mall and a credit card company. We analyze which method performs better under which conditions. We compare their classification accuracy as it depends on the structure and size of the training set and on an increasing number of classes. The experimental results indicate that Naive Bayesian learning performs better, while Centroid-Based Classification is more robust in terms of classification accuracy.
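
As a rough illustration of the two methods compared in the abstract above, the sketch below implements a multinomial naive Bayes classifier with add-one smoothing and a centroid classifier based on cosine similarity. The toy messages, labels, and function names are assumptions made for this example; they are not taken from the paper, whose exact features and term weighting are not stated here.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training set of (tokens, label); not data from the paper.
TRAIN = [
    ("refund order delivery late".split(), "shopping"),
    ("order delivery tracking refund".split(), "shopping"),
    ("card payment limit statement".split(), "card"),
    ("card statement payment fee".split(), "card"),
]


def train_naive_bayes(data):
    """Collect document counts per class, word counts per class, and the vocabulary."""
    class_counts = Counter(label for _, label in data)
    word_counts = defaultdict(Counter)
    for tokens, label in data:
        word_counts[label].update(tokens)
    vocab = {w for tokens, _ in data for w in tokens}
    return class_counts, word_counts, vocab


def predict_naive_bayes(model, tokens):
    """Return the class with the largest log posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for w in tokens:
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def train_centroids(data):
    """Average the term-frequency vectors of the documents in each class."""
    sums, counts = defaultdict(Counter), Counter()
    for tokens, label in data:
        sums[label].update(tokens)
        counts[label] += 1
    return {label: {w: c / counts[label] for w, c in vec.items()} for label, vec in sums.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_centroid(centroids, tokens):
    """Return the class whose centroid is most similar to the message."""
    vec = Counter(tokens)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


nb_model = train_naive_bayes(TRAIN)
centroids = train_centroids(TRAIN)
message = "refund delivery".split()
print(predict_naive_bayes(nb_model, message), predict_centroid(centroids, message))
```

Both classifiers share the same bag-of-words input, so they can be swapped on identical training and test splits when studying how accuracy changes with training-set size or the number of classes.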

Pattern Classification Methods for Keystroke Identification

  • 조태훈
    • 한국정보통신학회논문지 / Vol. 10, No. 5 / pp. 956-961 / 2006
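  • Keystroke time intervals can be a discriminative feature for verifying and identifying computer users. Using keystroke time intervals as features, this paper experimentally compares the user identification performance of a neural network trained with the backpropagation algorithm, a Bayesian classifier, and a k-NN classifier. In the experiments, the k-NN algorithm performed best when the number of samples per user was small, and the Bayesian classifier performed best when the number of samples per user was large. For web-based online user identification, it therefore appears desirable to choose between k-NN and the Bayesian classifier according to the number of keystroke samples available per user.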

Application of Bayesian Statistical Analysis to Multisource Data Integration

  • Hong, Sa-Hyun;Moon, Wooil-M.
    • 대한원격탐사학회:학술대회논문집 / 대한원격탐사학회 2002 Proceedings of International Symposium on Remote Sensing / pp. 394-399 / 2002
  • In this paper, multisource data classification methods based on Bayes' formula are considered. In this decision fusion scheme, the individual data sources are handled separately by statistical classification algorithms, and a Bayesian fusion method is then applied to integrate the available data sources. This method combines the decisions of the individual experts, where the weight of each expert represents the reliability of its source. In previous work, the reliability measure used in the statistical approach was common to all pixels. In this experiment, the weight factors are assigned different values for each pixel in order to improve the integrated classification accuracy. Although most implementations of Bayesian classification assume fixed a priori probabilities, we use adaptive a priori probabilities, iteratively calculating the local a priori probabilities so as to maximize the a posteriori probabilities. The effectiveness of the proposed method is first demonstrated on simulations with artificial data and then evaluated on real-world data sets. The results show that the Bayesian statistical fusion scheme performs well on multispectral data classification.
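
The weighted combination of expert decisions described above can be sketched as a logarithmic opinion pool, one common way to fuse per-source class posteriors. This is an assumed formulation and not necessarily the authors' exact rule. The function name, class names, and numbers are hypothetical, and the adaptive a priori probabilities from the abstract are not modeled.

```python
import math


def fuse_posteriors(source_posteriors, weights):
    """Fuse per-source class posteriors for one pixel.

    Assumed rule: fused(c) is proportional to the product over sources s of
    p_s(c) ** w_s, where w_s is the reliability weight of source s.
    All probabilities are assumed to be strictly positive.
    """
    classes = list(source_posteriors[0].keys())
    log_scores = {
        c: sum(w * math.log(p[c]) for p, w in zip(source_posteriors, weights))
        for c in classes
    }
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores.values())
    unnormalized = {c: math.exp(s - top) for c, s in log_scores.items()}
    total = sum(unnormalized.values())
    return {c: v / total for c, v in unnormalized.items()}


# Hypothetical posteriors from two sources for a single pixel.
optical = {"water": 0.7, "forest": 0.3}
radar = {"water": 0.4, "forest": 0.6}
fused = fuse_posteriors([optical, radar], weights=[1.0, 0.5])
print(max(fused, key=fused.get), fused)
```

Because the weights are passed per call, a per-pixel reliability scheme amounts to calling the function with a different weight list for each pixel.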

A review of tree-based Bayesian methods

  • Linero, Antonio R.
    • Communications for Statistical Applications and Methods / Vol. 24, No. 6 / pp. 543-559 / 2017
  • Tree-based regression and classification ensembles form a standard part of the data-science toolkit. Many commonly used methods take an algorithmic view, proposing greedy methods for constructing decision trees; examples include the classification and regression trees algorithm, boosted decision trees, and random forests. Recent history has seen a surge of interest in Bayesian techniques for constructing decision tree ensembles, with these methods frequently outperforming their algorithmic counterparts. The goal of this article is to survey the landscape surrounding Bayesian decision tree methods, and to discuss recent modeling and computational developments. We provide connections between Bayesian tree-based methods and existing machine learning techniques, and outline several recent theoretical developments establishing frequentist consistency and rates of convergence for the posterior distribution. The methodology we present is applicable for a wide variety of statistical tasks including regression, classification, modeling of count data, and many others. We illustrate the methodology on both simulated and real datasets.

Relation Based Bayesian Network for NBNN

  • Sun, Mingyang;Lee, YoonSeok;Yoon, Sung-eui
    • Journal of Computing Science and Engineering / Vol. 9, No. 4 / pp. 204-213 / 2015
  • Under the conditional independence assumption among local features, the Naive Bayes Nearest Neighbor (NBNN) classifier has recently been proposed; it performs classification without any training or quantization phases. While the original NBNN shows high classification accuracy without an explicit training phase, conditional independence among local features conflicts with the compositionality of objects, which implies that different but related parts of an object appear together. As a result, the conditional independence assumption weakens the accuracy of classification techniques based on NBNN. In this work, we look into this issue and propose a novel Bayesian network for NBNN-based classification that considers the conditional dependence among features. To achieve our goal, we extract a high-level feature and its corresponding multiple low-level features for each image patch. We then represent them with a simple, two-level layered Bayesian network and design a classification function for this network. To achieve a low memory requirement and fast query-time performance, we further optimize our representation and classification function, named the relation-based Bayesian network, by representing the relationship between a high-level feature and its low-level features as a compact relation vector whose dimensionality equals the number of low-level features, e.g., four elements in our tests. We demonstrate the benefits of our method over the original NBNN and its recent improvement, local NBNN, on two different benchmarks. Our method improves accuracy by up to 27% over the tested methods. This high accuracy is mainly due to the consideration of the conditional dependences between a high-level feature and its corresponding low-level features.
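
For reference, the decision rule of the original NBNN classifier (not the relation-based Bayesian network proposed in the paper) can be sketched as follows: each query descriptor contributes its squared distance to the nearest descriptor of a class, and the class with the smallest total wins. The 2-D descriptors, class names, and function names are toy assumptions made for this example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nbnn_classify(query_descriptors, class_descriptors):
    """Return (predicted label, per-class totals) under the NBNN rule.

    For each class, sum over all query descriptors the squared distance to
    that descriptor's nearest neighbor among the class's stored descriptors.
    """
    totals = {}
    for label, descriptors in class_descriptors.items():
        totals[label] = sum(
            min(squared_distance(q, d) for d in descriptors)
            for q in query_descriptors
        )
    return min(totals, key=totals.get), totals


# Hypothetical local descriptors; real systems use e.g. 128-D image features.
class_descriptors = {
    "cat": [(0.0, 0.0), (1.0, 0.0)],
    "dog": [(5.0, 5.0), (6.0, 5.0)],
}
query = [(0.5, 0.0), (1.0, 1.0)]
label, totals = nbnn_classify(query, class_descriptors)
print(label, totals)
```

Summing the per-descriptor distances independently is where the conditional independence assumption enters, which is the point the abstract takes issue with.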

Combining Geostatistical Indicator Kriging with Bayesian Approach for Supervised Classification

  • Park, No-Wook;Chi, Kwang-Hoon;Moon, Wooil-M.;Kwon, Byung-Doo
    • 대한원격탐사학회:학술대회논문집 / 대한원격탐사학회 2002 Proceedings of International Symposium on Remote Sensing / pp. 382-387 / 2002
  • In this paper, we propose a geostatistical approach incorporated into the Bayesian data fusion technique for supervised classification of multi-sensor remote sensing data. Traditional spectral-based classification cannot account for spatial information and may produce unrealistic classification results. To obtain accurate spatial and contextual information, indicator kriging, which allows one to estimate the probability of occurrence of classes on the basis of surrounding observations, is incorporated into the Bayesian framework. The merit of this approach is that it incorporates both spectral and spatial information and improves the confidence level in the final data fusion task. To illustrate the proposed scheme, supervised classification of a multi-sensor remote sensing test data set was carried out.

Software Quality Classification using Bayesian Classifier

  • 홍의석
    • 한국IT서비스학회지 / Vol. 11, No. 1 / pp. 211-221 / 2012
  • Many metric-based classification models have been proposed to predict the fault-proneness of software modules. This paper presents two prediction models using the Bayesian classifier, one of the most popular modern classification algorithms. A Bayesian model, based on Bayesian probability theory, can be a promising technique for software quality prediction because it can represent uncertainty using probabilities and can partly incorporate expert knowledge into the training data. The two models, Naïve Bayes (NB) and Bayesian Belief Network (BBN), are constructed, and dimensionality reduction of the training and test data is performed before model evaluation. The prediction accuracy of the models is evaluated using two prediction error measures, Type I error and Type II error, and compared with two well-known prediction models, a backpropagation neural network and a support vector machine. The results show that the prediction performance of the BBN model is slightly better than that of NB. For the data set with ambiguity, the BBN model's prediction accuracy is not as good as that of the compared models, but for the data set without ambiguity it achieves better performance than the compared models.
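
The two error measures mentioned above can be computed as in the sketch below. It assumes the common convention in fault-proneness studies that a Type I error flags a non-fault-prone module as fault-prone and a Type II error misses a fault-prone module, each reported as a rate within its true class; the paper may normalize differently. The function name and sample labels are hypothetical.

```python
def error_rates(actual, predicted):
    """Return (Type I rate, Type II rate) for boolean fault-prone labels.

    Assumed convention: True means fault-prone.
    Type I  = non-fault-prone modules predicted fault-prone / all non-fault-prone.
    Type II = fault-prone modules predicted non-fault-prone / all fault-prone.
    """
    false_positives = sum(1 for a, p in zip(actual, predicted) if not a and p)
    false_negatives = sum(1 for a, p in zip(actual, predicted) if a and not p)
    negatives = sum(1 for a in actual if not a)
    positives = sum(1 for a in actual if a)
    type1 = false_positives / negatives if negatives else 0.0
    type2 = false_negatives / positives if positives else 0.0
    return type1, type2


# Hypothetical labels for five modules.
actual = [True, True, False, False, False]
predicted = [True, False, True, False, False]
type1, type2 = error_rates(actual, predicted)
print(f"Type I: {type1:.3f}, Type II: {type2:.3f}")
```

Reporting the two rates separately matters in this setting because a missed fault-prone module and a false alarm usually carry different costs.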

A Classification Analysis using Bayesian Neural Network

  • 황진수;최성용;전홍석
    • Journal of the Korean Data and Information Science Society / Vol. 12, No. 2 / pp. 11-25 / 2001
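  • There are many statistical classification techniques that discover and model the relationships, patterns, and rules present in data. However, the knowledge we acquire comes not from a fixed set of classification rules but from training through observation and learning. Bayesian learning can be interpreted as expressing our degree of belief as probabilities that represent all forms of uncertainty, and it updates these probabilities using the laws of probability theory as definite outcomes become known. A neural network model, in turn, makes it possible to predict as yet unknown groups or characteristics on the basis of attributes that are already known. In this paper, we compare the misclassification rates of a Bayesian neural network, which combines these two approaches, with those of the existing CHAID, CART, and QUEST classification algorithms.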

Frequent Pattern Bayesian Classification for ECG Pattern Diagnosis

  • 노기용;김원식;이헌규;이상태;류근호
    • 정보처리학회논문지D / Vol. 11D, No. 5 / pp. 1031-1040 / 2004
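  • An electrocardiogram (ECG), which records the activity of the heart, provides valuable clinical information about the heart's condition. Although many studies have addressed algorithms for diagnosing heart disease from ECGs, the inaccuracy of their diagnostic results means that electrocardiographs still rely on diagnostic algorithms developed abroad. This paper proposes a technique for classifying heart disease patterns that covers ECG data collection, preprocessing, and data mining. The pattern classification technique is a frequent pattern Bayesian classifier, which integrates the conventional naive Bayesian classifier with frequent pattern mining. The frequent pattern Bayesian classifier uses the frequent patterns discovered in the training phase to construct a product approximation, thereby overcoming the drawback of naive Bayes, namely its class conditional independence assumption.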

The Predictive QSAR Model for hERG Inhibitors Using Bayesian and Random Forest Classification Method

  • Kim, Jun-Hyoung;Chae, Chong-Hak;Kang, Shin-Myung;Lee, Joo-Yon;Lee, Gil-Nam;Hwang, Soon-Hee;Kang, Nam-Sook
    • Bulletin of the Korean Chemical Society / Vol. 32, No. 4 / pp. 1237-1240 / 2011
  • In this study, we developed a ligand-based in-silico prediction model that classifies chemical structures as hERG blockers using Bayesian and random forest modeling methods. The models were built on patch clamp experimental results. The findings presented in this work indicate that Laplacian-modified naive Bayesian classification with diverse selection is useful for predicting hERG inhibitors when a large data set is not available.