Prototype-Based Classification Using Class Hyperspheres

  • 이현종 (Department of Computer Science, Dankook University) ;
  • 황두성 (Department of Software, Dankook University)
  • Received : 2016.03.30
  • Accepted : 2016.06.03
  • Published : 2016.10.31

Abstract

This paper proposes a prototype-based classification method built on the nearest-neighbor rule. The nearest-neighbor rule is used to partition the class regions represented by the training data into hyperspheres, each of which covers only data from a single class. The radius of a hypersphere is set to the midpoint of two distances: the distance to the farthest same-class point and the distance to the nearest other-class point. Prototype selection is then cast as a set covering problem in order to find the smallest set of prototypes that covers all the training data. The proposed selection method is a greedy algorithm that selects prototypes class by class, and it can process a large-scale training set in parallel. Prediction uses the nearest-neighbor rule, with the selected prototypes serving as the new training set. In experiments, the proposed method shows better generalization performance than existing learning methods.
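The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`build_spheres`, `greedy_cover`, `predict`), the Euclidean metric, and the toy data are assumptions made for illustration. The sketch also assumes that "the farthest same-class point" means the farthest same-class point lying closer to the center than the nearest other-class point, so that each sphere covers one class only.

```python
import math

def dist(a, b):
    """Euclidean distance (assumed metric; the abstract does not name one)."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def build_spheres(X, y):
    """Center one sphere on every training point.

    Returns a list of (radius, covered_indices) pairs. Requires at least
    two classes, otherwise no other-class point exists.
    """
    spheres = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Distance to the nearest point of any other class.
        d_enemy = min(dist(xi, xj) for xj, yj in zip(X, y) if yj != yi)
        # Same-class points strictly closer than that nearest other-class point.
        friends = {j: dist(xi, X[j]) for j in range(len(X))
                   if y[j] == yi and dist(xi, X[j]) < d_enemy}
        friends[i] = 0.0  # a sphere always covers its own center
        d_friend = max(friends.values())
        # Radius is the midpoint of the two distances.
        spheres.append(((d_friend + d_enemy) / 2.0, set(friends)))
    return spheres

def greedy_cover(spheres):
    """Greedy set cover: repeatedly pick the sphere covering the most
    still-uncovered points. Returns the indices of the chosen prototypes."""
    uncovered = set(range(len(spheres)))
    chosen = []
    while uncovered:
        best = max(range(len(spheres)),
                   key=lambda i: len(spheres[i][1] & uncovered))
        chosen.append(best)
        uncovered -= spheres[best][1]
    return chosen

def predict(x, X, y, prototypes):
    """Nearest-neighbor rule restricted to the selected prototypes."""
    nearest = min(prototypes, key=lambda i: dist(x, X[i]))
    return y[nearest]

# Toy data (hypothetical): two well-separated classes in the plane.
X = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
y = [0, 0, 0, 1, 1, 1]
spheres = build_spheres(X, y)
prototypes = greedy_cover(spheres)
```

Because each sphere covers points of its own class only, the covering problem splits into independent per-class subproblems, which is consistent with the class-by-class selection and parallel processing mentioned in the abstract. The sketch runs a single greedy loop for brevity.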
