Learning Distribution Graphs Using a Neuro-Fuzzy Network for Naive Bayesian Classifier

  • Tian, Xue-Wei (Dept. of Computer Engineering, Gachon University) ;
  • Lim, Joon S. (Dept. of Computer Engineering, Gachon University)
  • Received : 2013.10.02
  • Accepted : 2013.11.20
  • Published : 2013.11.28

Abstract

Naive Bayesian classifiers are a powerful and well-known type of classifier that can be easily induced from a dataset of sample cases. However, their strong conditional independence assumptions can sometimes lead to weak classification performance. Normally, naive Bayesian classifiers use Gaussian distributions to handle continuous attributes and to represent the likelihood of the features conditioned on the classes. The probability density of an attribute, however, is not always well fitted by a Gaussian distribution. Another prominent type of classifier is the neuro-fuzzy classifier, which can learn fuzzy rules and fuzzy sets through supervised learning. Because a neuro-fuzzy classifier and a naive Bayesian classifier share specific structural similarities, this study applies distribution graphs learned by a neuro-fuzzy network to naive Bayesian classifiers. We compare Gaussian distribution graphs with fuzzy distribution graphs for the naive Bayesian classifier by applying both to the classification of leukemia and colon DNA microarray data sets. The results demonstrate that a naive Bayesian classifier with fuzzy distribution graphs is more reliable than one with Gaussian distribution graphs.
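As a rough illustration of the Gaussian baseline described in the abstract, the following minimal sketch fits one Gaussian per class and per attribute and classifies by the largest log posterior. It is not the authors' implementation: the function names (`gaussian_pdf`, `fit_gaussian_nb`, `predict`), the variance floor, and the toy data are all assumptions made for this example. The `likelihood` argument is likewise a hypothetical hook that only marks where a different per-attribute distribution, such as one learned by a neuro-fuzzy network, could in principle be substituted; the fuzzy distribution graphs themselves are not specified here and are not reproduced.

```python
import math
from collections import defaultdict


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution N(mean, var) evaluated at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gaussian_nb(samples, labels, var_floor=1e-9):
    """Estimate class priors plus per-class, per-attribute mean and variance.

    var_floor is an assumed safeguard against zero variance on tiny datasets.
    """
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    n_total = len(samples)
    model = {}
    for y, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))  # one tuple of values per attribute
        means = [sum(col) / n for col in columns]
        variances = [
            max(sum((v - m) ** 2 for v in col) / n, var_floor)
            for col, m in zip(columns, means)
        ]
        model[y] = {"prior": n / n_total, "mean": means, "var": variances}
    return model


def predict(model, x, likelihood=gaussian_pdf):
    """Return the class with the largest log prior + summed log likelihoods.

    The sum over attributes encodes the conditional independence assumption.
    """
    best_label, best_score = None, -math.inf
    for y, params in model.items():
        score = math.log(params["prior"])
        for xi, m, v in zip(x, params["mean"], params["var"]):
            # Clamp to avoid log(0) when a density underflows.
            score += math.log(max(likelihood(xi, m, v), 1e-300))
        if score > best_score:
            best_label, best_score = y, score
    return best_label


# Toy two-attribute data, invented for illustration (not microarray data).
X = [[1.0, 2.1], [1.2, 1.9], [0.8, 2.0], [3.0, 0.5], [3.2, 0.4], [2.9, 0.7]]
y = ["A", "A", "A", "B", "B", "B"]
model = fit_gaussian_nb(X, y)
print(predict(model, [1.1, 2.0]))
print(predict(model, [3.1, 0.5]))
```

Keeping the per-attribute density behind a function argument separates the classifier's decision rule from the choice of distribution, which is the part the study varies.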
