• Title/Summary/Keyword: Naive Bayes classification

An Improvement of Accuracy for NaiveBayes by Using Large Word Sets (빈발단어집합을 이용한 NaiveBayes의 정확도 개선)

  • Lee Jae-Moon
    • Journal of Internet Computing and Services / v.7 no.3 / pp.169-178 / 2006
  • In this paper, we define large word sets, a novel variation of the large item sets used in association rule mining, and use them to improve the accuracy of NaiveBayes. To obtain them, a document is divided into paragraphs, and each paragraph is transformed into a transaction by extracting its words. The proposed method was implemented with the AI::Categorizer framework, and its accuracy was measured experimentally on the Reuters-21578 data set. The results show that the proposed method improves on the accuracy of conventional NaiveBayes.
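
The paragraph-to-transaction step described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes paragraphs are separated by blank lines, uses a naive whitespace tokenizer, and counts word pairs by brute force rather than with an Apriori-style search. The function names `paragraphs_to_transactions` and `large_word_sets` are hypothetical, and how the resulting word sets feed into the classifier is not stated in the abstract, so it is not shown.

```python
from collections import Counter
from itertools import combinations

def paragraphs_to_transactions(document):
    """Split on blank lines; each paragraph becomes a set of lowercase words."""
    transactions = []
    for paragraph in document.split("\n\n"):
        # Assumption: strip common punctuation and ignore case.
        words = {w.strip(".,;:!?()\"'").lower() for w in paragraph.split()}
        words.discard("")
        if words:
            transactions.append(words)
    return transactions

def large_word_sets(transactions, min_support, size=2):
    """Return word sets of the given size found in at least min_support transactions."""
    counts = Counter()
    for words in transactions:
        # Sorting makes each word set a canonical tuple.
        for combo in combinations(sorted(words), size):
            counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_support}

# Hypothetical three-paragraph document.
doc = "Oil prices rose.\n\nCrude oil prices fell.\n\nWheat exports rose."
transactions = paragraphs_to_transactions(doc)
frequent_pairs = large_word_sets(transactions, min_support=2)
print(frequent_pairs)
```

Here only the pair ("oil", "prices") reaches the support threshold, because it is the only pair that co-occurs in two paragraphs.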

Improving Accuracy of Multi-label Naive Bayes Classifier (다중 레이블 나이브 베이지안 분류기의 정확도 개선 연구)

  • Kim, Hae-Choen;Lee, Jae-Sung
    • Proceedings of the Korean Society of Computer Information Conference / 2018.01a / pp.147-148 / 2018
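  • The multi-label classification problem is the task of predicting the multiple labels associated with a given multi-label input. In this paper, we compute label dependencies and incorporate them into the output of the naive Bayes classifier, one of the techniques used for multi-label classification, and confirm that this improves multi-label classification performance.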

An Active Learning-based Method for Composing Training Document Set in Bayesian Text Classification Systems (베이지언 문서분류시스템을 위한 능동적 학습 기반의 학습문서집합 구성방법)

  • 김제욱;김한준;이상구
    • Journal of KIISE:Software and Applications / v.29 no.12 / pp.966-978 / 2002
  • There are two important problems in improving text classification systems based on the machine learning approach. The first, called the "selection problem", is how to select a minimum number of informative documents from a given document collection. The second, called the "composition problem", is how to reorganize the selected training documents so that they fit the adopted learning method. The former is addressed by "active learning" algorithms, and the latter by "boosting" algorithms. This paper proposes a new learning method, called AdaBUS, which proactively solves both problems in the context of Naive Bayes classification systems. The proposed method constructs a more accurate classification hypothesis by increasing the variance in the "weak" hypotheses that determine the final classification hypothesis. Consequently, the proposed algorithm yields a perturbation effect that makes the boosting algorithm work properly. Through an empirical experiment using the Reuters-21578 document collection, we show that the AdaBUS algorithm improves the Naive Bayes-based classification system more significantly than other conventional learning methods.

Accurate Intrusion Detection using n-Gram Augmented Naive Bayes (N-Gram 증강 나이브 베이스를 이용한 정확한 침입 탐지)

  • Kang, Dae-Ki
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2008.10a / pp.285-288 / 2008
  • In many intrusion detection applications, the n-gram approach has been widely applied. However, it has shown a few problems, including double counting of features. To address those problems, we applied n-gram augmented Naive Bayes directly to classify intrusive sequences and compared its performance with that of Naive Bayes and Support Vector Machines (SVM) with n-gram features in experiments on host-based intrusion detection benchmark data sets. Experimental results on the University of New Mexico (UNM) benchmark data sets show that the n-gram augmented method, which solves the independence violation that arises when n-gram features are applied directly to Naive Bayes (i.e., Naive Bayes with n-gram features), yields intrusion detectors with higher accuracy than Naive Bayes with n-gram features and accuracy comparable to SVM with n-gram features.
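
The double counting mentioned in this abstract can be seen in a small sketch of sliding-window n-gram extraction. The system-call trace and the helper name `ngrams` are hypothetical, and the n-gram augmented model itself is not reproduced here; the sketch only shows why overlapping windows are not independent features.

```python
from collections import Counter

def ngrams(sequence, n):
    """Slide a window of length n over the sequence; return each window as a tuple."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]

# Hypothetical system-call trace from one process.
trace = ["open", "read", "mmap", "read", "close"]
trigrams = ngrams(trace, 3)

# Count how many window slots each call occupies across all trigrams.
# Adjacent windows overlap, so one observed call contributes to several features.
appearances = Counter(call for gram in trigrams for call in gram)
print(trigrams)
print(appearances)
```

The single `mmap` call in the trace appears in all three trigrams, so a Naive Bayes model that treats the trigrams as independent features counts that one observation three times.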

A Study on Efficient Topography Classification of High Resolution Satelite Image (고해상도 위성영상의 효율적 지형분류기법 연구)

  • Lim, Hye-Young;Kim, Hwang-Soo;Choi, Joon-Seog;Song, Seung-Ho
    • Journal of Korean Society for Geospatial Information Science / v.13 no.3 s.33 / pp.33-40 / 2005
  • The aim of classifying remotely sensed data is to produce the most accurate map of the earth's surface by assigning each pixel to its appropriate real-world category. Classification of multi-spectral satellite image data has become a tool for generating ground cover maps. Many classification methods exist. In this study, the MLC (Maximum Likelihood Classification), ANN (Artificial Neural Network), SVM (Support Vector Machine), and Naive Bayes classifier algorithms are compared using an IKONOS image of part of Dalsung Gun in the Daegu area. Two preprocessing methods are applied: PCA (Principal Component Analysis) and ICA (Independent Component Analysis). Boosting algorithms are also applied. The best results were obtained by combining appropriate feature-selection preprocessing with a classifier.

A Study on Incremental Learning Model for Naive Bayes Text Classifier (Naive Bayes 문서 분류기를 위한 점진적 학습 모델 연구)

  • 김제욱;김한준;이상구
    • The Journal of Information Technology and Database / v.8 no.1 / pp.95-104 / 2001
  • In the text classification domain, labeling the training documents is an expensive process because it requires human expertise and is a tedious, time-consuming task. Therefore, it is important to reduce the manual labeling of training documents while improving the text classifier. Selective sampling, a form of active learning, reduces the number of training documents that need to be labeled by examining the unlabeled documents and selecting the most informative ones for manual labeling. We apply this methodology to Naive Bayes, a classifier known to be successful in text classification. One of the most important issues in selective sampling is the criterion used to select training documents from the large pool of unlabeled documents. In this paper, we propose two measures for this criterion: the Mean Absolute Deviation (MAD) and the entropy measure. The experimental results, using the Reuters-21578 corpus, show that the proposed learning method improves the Naive Bayes text classifier more than existing ones.
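
An entropy-based selection criterion of the kind described in this abstract can be sketched as below. This is an illustration under assumptions, not the paper's method: the posteriors are hypothetical values standing in for a Naive Bayes classifier's output, the names `entropy` and `select_most_uncertain` are invented for the example, and the MAD measure is omitted because the abstract does not define it.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a class posterior; higher means more uncertain."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def select_most_uncertain(posteriors, k):
    """Return the ids of the k documents whose posteriors have the highest entropy."""
    ranked = sorted(posteriors, key=lambda doc_id: entropy(posteriors[doc_id]), reverse=True)
    return ranked[:k]

# Hypothetical class posteriors for three unlabeled documents.
posteriors = {
    "doc_a": [0.98, 0.02],
    "doc_b": [0.55, 0.45],
    "doc_c": [0.80, 0.20],
}
to_label = select_most_uncertain(posteriors, k=2)
print(to_label)
```

Documents whose posterior is closest to uniform are sent for manual labeling first, since the current classifier is least certain about them.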

Development of Incident Detection Algorithm Using Naive Bayes Classification (나이브 베이즈 분류기를 이용한 돌발상황 검지 알고리즘 개발)

  • Kang, Sunggwan;Kwon, Bongkyung;Kwon, Cheolwoo;Park, Sangmin;Yun, Ilsoo
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.17 no.6 / pp.25-39 / 2018
  • The purpose of this study is to develop an efficient incident detection algorithm by applying machine learning, which is widely used in the transport sector. First, a network of the target site was constructed with a micro-simulation model. Second, data were collected under various incident scenarios produced by combining variables expected to affect the incident situation. Detection results from the McMaster algorithm, a well-known incident detection algorithm, and from the Naive Bayes algorithm developed in this study were then compared. In the comparison, the Naive Bayes algorithm showed less negative effect and a better detection rate (DR) than the McMaster algorithm. However, as DR increased, so did the false alarm rate (FAR). Also, while the McMaster algorithm needed four cycles to detect an incident, the Naive Bayes algorithm determined the situation in just one cycle, which increases DR but also seems to have increased FAR. Overall, the Naive Bayes algorithm was found to have great potential in traffic incident detection.
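
The two metrics named in this abstract can be computed as in the sketch below. The definitions are the ones commonly used in incident detection work and are an assumption here, since the abstract does not state which variants the study used. The counts are hypothetical and are not results from the paper.

```python
def detection_rate(detected_incidents, total_incidents):
    """DR (%): share of actual incidents that the algorithm detected."""
    return 100.0 * detected_incidents / total_incidents

def false_alarm_rate(false_alarms, total_decisions):
    """FAR (%): share of all algorithm decisions that were false alarms."""
    return 100.0 * false_alarms / total_decisions

# Hypothetical counts for one simulated scenario set.
dr = detection_rate(detected_incidents=18, total_incidents=20)
far = false_alarm_rate(false_alarms=6, total_decisions=1200)
print(dr, far)
```

An algorithm that raises an alarm after a single cycle has more chances to catch a real incident but also more chances to flag ordinary congestion, which is consistent with DR and FAR rising together.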

Naive Bayes Learner for Propositionalized Attribute Taxonomy (명제화된 어트리뷰트 택소노미를 이용하는 나이브 베이스 학습 알고리즘)

  • Kang, Dae-Ki
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2008.10a / pp.406-409 / 2008
  • We consider the problem of exploiting a taxonomy of propositionalized attributes in order to learn compact and robust classifiers. We introduce the Propositionalized Attribute Taxonomy guided Naive Bayes Learner (PAT-NBL), an inductive learning algorithm that exploits a taxonomy of propositionalized attributes as prior knowledge to generate compact and accurate classifiers. PAT-NBL uses top-down and bottom-up search to find a locally optimal cut that corresponds to the instance space from the propositionalized attribute taxonomy and data. Our experimental results on University of California-Irvine (UCI) repository data sets show that the proposed algorithm can generate classifiers that are sometimes comparable in compactness and accuracy to those produced by standard Naive Bayes learners.

Development of Visual Inspection Process Adapting Naive Bayes Classifiers (나이브 베이즈 분류기를 적용한 외관검사공정 개발)

  • Ryu, Sun-Joong
    • Journal of the Korean Institute of Gas / v.19 no.2 / pp.45-53 / 2015
  • To improve the performance of the visual inspection process, we developed a new process configuration that adds a Naive Bayes classifier to the existing automatic visual inspection machine and human inspectors. Applying the classifier improved defect leakage and the human inspectors' workload at the same time. A new classification method called AMPB was applied instead of conventional methods based on MAP classification. Experimental results on a filter product for camera modules confirmed that the process can be configured at a leakage ratio of 1.14% and a human-inspector workload ratio of 75.5%. The result is significant because it can be applied to a wide range of processes, such as gas leak detection, in which an inspection machine and human inspectors collaborate.
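
For reference, the conventional MAP decision rule that this abstract contrasts with AMPB can be sketched as follows. The AMPB method is not described in the abstract and is not shown. The class names, feature names, and probabilities are hypothetical, and `map_classify` is an illustrative helper rather than code from the paper.

```python
import math

def map_classify(features, priors, likelihoods):
    """Return the class maximizing log P(c) + sum_i log P(x_i | c), plus all scores."""
    scores = {}
    for label, prior in priors.items():
        score = math.log(prior)
        for name, value in features.items():
            # Naive Bayes assumption: features are independent given the class.
            score += math.log(likelihoods[label][name][value])
        scores[label] = score
    return max(scores, key=scores.get), scores

# Hypothetical priors and per-class feature likelihoods for an inspected part.
priors = {"good": 0.9, "defect": 0.1}
likelihoods = {
    "good": {"scratch": {True: 0.05, False: 0.95}, "stain": {True: 0.10, False: 0.90}},
    "defect": {"scratch": {True: 0.70, False: 0.30}, "stain": {True: 0.60, False: 0.40}},
}
label, scores = map_classify({"scratch": True, "stain": True}, priors, likelihoods)
print(label)
```

With both features present, the defect class wins despite its low prior, because 0.1 × 0.70 × 0.60 exceeds 0.9 × 0.05 × 0.10. Working in log space avoids numerical underflow when many features are multiplied.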

Propositionalized Attribute Taxonomy Guided Naive Bayes Learning Algorithm (명제화된 어트리뷰트 택소노미를 이용하는 나이브 베이스 학습 알고리즘)

  • Kang, Dae-Ki;Cha, Kyung-Hwan
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.12 / pp.2357-2364 / 2008
  • In this paper, we consider the problem of exploiting a taxonomy of propositionalized attributes in order to generate compact and robust classifiers. We introduce the Propositionalized Attribute Taxonomy guided Naive Bayes Learner (PAT-NBL), an inductive learning algorithm that exploits a taxonomy of propositionalized attributes as prior knowledge to generate compact and accurate classifiers. PAT-NBL uses top-down and bottom-up search to find a locally optimal cut that corresponds to the instance space from the propositionalized attribute taxonomy and data. Our experimental results on University of California-Irvine (UCI) repository data sets show that the proposed algorithm can generate classifiers that are sometimes comparable in compactness and accuracy to those produced by standard Naive Bayes learners.