
A Novel Feature Selection Method for Output Coding based Multiclass SVM


  • Received : 2013.03.30
  • Accepted : 2013.05.09
  • Published : 2013.07.31

Abstract

Recently, the support vector machine has been widely used in various application fields because its strong generalization ability gives it better classification performance than decision trees and neural networks. Since the support vector machine is fundamentally designed for binary classification, output coding methods, which combine the results of multiple binary classifiers, are commonly used to apply it to multiclass problems. However, previous feature selection methods for output coding based support vector machines select features that improve the overall classification accuracy rather than the accuracy of each individual binary classifier. In this paper, we propose a novel feature selection method that selects, for each binary classifier in an output coding based support vector machine, the features that maximize that classifier's classification accuracy. Experimental results show that the proposed method yields a statistically significant improvement in classification accuracy over the previous feature selection method.

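The following Python sketch illustrates the core idea described in the abstract: in an output coding based multiclass SVM, each binary classifier gets its own feature subset, chosen to maximize that classifier's accuracy, rather than one shared subset chosen for overall multiclass accuracy. This is only an illustrative reconstruction, not the authors' implementation; the one-vs-rest code matrix, linear SVM, 3-fold cross-validation, and greedy sequential forward search are all assumptions made for the example.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import LinearSVC


def select_features_for_binary_problem(X, y_bin, n_select):
    # Greedy sequential forward selection: repeatedly add the feature that
    # most improves this binary classifier's cross-validated accuracy.
    remaining = list(range(X.shape[1]))
    selected = []
    for _ in range(n_select):
        best_feature, best_accuracy = None, -1.0
        for f in remaining:
            candidate = selected + [f]
            accuracy = cross_val_score(
                LinearSVC(max_iter=10000), X[:, candidate], y_bin, cv=3
            ).mean()
            if accuracy > best_accuracy:
                best_feature, best_accuracy = f, accuracy
        selected.append(best_feature)
        remaining.remove(best_feature)
    return selected


def fit_output_coding_svm(X, y, n_select):
    # One-vs-rest code matrix: one binary problem per class, each with its
    # own feature subset and its own SVM.
    classes = np.unique(y)
    models = []
    for c in classes:
        y_bin = (y == c).astype(int)
        features = select_features_for_binary_problem(X, y_bin, n_select)
        clf = LinearSVC(max_iter=10000).fit(X[:, features], y_bin)
        models.append((features, clf))
    return classes, models


def predict_output_coding_svm(classes, models, X):
    # Decode by taking the class whose binary classifier returns the
    # largest decision value (soft one-vs-rest decoding).
    scores = np.column_stack(
        [clf.decision_function(X[:, features]) for features, clf in models]
    )
    return classes[np.argmax(scores, axis=1)]


if __name__ == "__main__":
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0
    )
    classes, models = fit_output_coding_svm(X_train, y_train, n_select=2)
    predictions = predict_output_coding_svm(classes, models, X_test)
    print("accuracy:", np.mean(predictions == y_test))

By contrast, the previous approach the paper compares against would run a single feature search that scores candidate subsets by the final multiclass accuracy and then shares the chosen subset across all binary classifiers.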


References

  1. C. Huang, L.S. Davis, and J.R.G. Townshend, "An Assessment of Support Vector Machines for Land Cover Classification," International Journal of Remote Sensing, Vol. 23, No. 4, pp. 725-749, 2002. https://doi.org/10.1080/01431160110040323
  2. Y. Bazi and F. Melgani, "Toward an Optimal SVM Classification System for Hyperspectral Remote Sensing Images," IEEE Transactions on Geoscience and Remote Sensing, Vol. 44, No. 11, pp. 3374-3385, 2006. https://doi.org/10.1109/TGRS.2006.880628
  3. K. Crammer and Y. Singer, "On the Learnability and Design of Output Codes for Multiclass Problems," Proc. of the 13th Annual Conference on Computational Learning Theory, pp. 35-46, 2000.
  4. J. Weston and C. Watkins, "Multi-Class Support Vector Machines," Proc. of European Symposium on Artificial Neural Networks, pp. 219-224, 1999.
  5. M. Bartlett, J. Movellan, and T. Sejnowski, "Face Recognition by Independent Component Analysis," IEEE Transactions on Neural Networks, Vol. 13, No. 6, pp. 1450-1464, 2002.
  6. J.C. Platt, N. Cristianini, and J. Shawe-Taylor, "Large Margin DAGs for Multiclass Classification," Advances in Neural Information Processing Systems, Vol. 12, pp. 547-553, MIT Press, 2000.
  7. S. Cheong, S.H. Oh, and S. Lee, "Support Vector Machines with Binary Tree Architecture for Multi-Class Classification," Neural Information Processing: Letters and Reviews, Vol. 2, No. 3, pp. 47-51, 2004.
  8. C. Young, C. Yen, Y. Pao, and M. Nagurka, "One-class-at-a-time Removal Sequence Planning Method for Multiclass Classification Problems," IEEE Transactions on Neural Networks, Vol. 17, No. 6, pp. 1544-1549, 2006. https://doi.org/10.1109/TNN.2006.879768
  9. C. Hsu and C. Lin, "A Comparison of Methods for Multiclass Support Vector Machines," IEEE Transactions on Neural Networks, Vol. 13, No. 2, pp. 415-425, 2002. https://doi.org/10.1109/72.991427
  10. J. Ghosh, "Multiclassifier Systems: Back to the Future," Proc. of the 3rd Int'l Workshop on Multiple Classifier Systems, Lecture Note in Computer Science, Vol. 2364, pp. 1-15, 2002.
  11. T. Hastie and R. Tibshirani, "Classification by Pairwise Coupling," The Annals of Statistics, Vol. 26, No. 2, pp. 451-471, 1998. https://doi.org/10.1214/aos/1028144844
  12. T. Dietterich and G. Bakiri, "Solving Multiclass Learning Problems Via Error-Correcting Output Codes," Journal of Artificial Intelligence Research, Vol. 2, pp. 263-286, 1995.
  13. I. Guyon and A. Elisseeff, "An Introduction to Variable and Feature Selection," Journal of Machine Learning Research, Vol. 3, pp. 1157-1182, 2003.
  14. B. Fei and J. Liu, "Binary Tree of SVM: A New Fast Multiclass Training and Classification Algorithm," IEEE Transactions on Neural Networks, Vol. 17, No. 3, pp. 696-704, 2006. https://doi.org/10.1109/TNN.2006.872343
  15. J.H. Friedman, Another Approach to Polychotomous Classification, Technical Report, Stanford University, 1996.
  16. A. Jain and D. Zongker, "Feature Selection: Evaluation, Application, and Small Sample Performance," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 2, pp. 153-158, 1997. https://doi.org/10.1109/34.574797
  17. 이정진, 김경원, and 이호, "A Semiconductor Defect Cause Diagnosis Method Using Block-based Clustering and Histogram Chi-square Distance," Journal of Korea Multimedia Society, Vol. 15, No. 9, pp. 1149-1155, 2012. https://doi.org/10.9717/kmms.2012.15.9.1149
  18. G.F. Hughes, "On the Mean Accuracy of Statistical Pattern Recognizers," IEEE Transactions on Information Theory, Vol. 14, No. 1, pp. 55-63, 1968.
  19. Y. Saeys, I. Inza, and P. Larranaga, "A Review of Feature Selection Techniques in Bioinformatics," Bioinformatics, Vol. 23, No. 19, pp. 2507-2517, 2007. https://doi.org/10.1093/bioinformatics/btm344
  20. A.L. Blum and R.L. Rivest, "Training a 3-Node Neural Network is NP-complete," Neural Networks, Vol. 5, No. 1, pp. 117-127, 1992.
  21. M. Garey and D. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman, New York, 1979.
  22. M. Kudo and J. Sklansky, "Comparison of Algorithms that Select Features for Pattern Classifiers," Pattern Recognition, Vol. 33, No. 1, pp. 25-41, 2000. https://doi.org/10.1016/S0031-3203(99)00041-2
  23. UCI Machine Learning Repository, http://archive.ics.uci.edu/ml/, 2013.