A Dominant Hyperrectangle Generation Technique for Classification Using Information Gain (IG) Partitioning

  • Received: 2013.12.29
  • Reviewed: 2014.01.17
  • Published: 2014.01.29

Abstract

The Nested Generalized Exemplar (NGE) method is a distance-based classification technique that uses a best-match rule. It improves tolerance to noisy data while reducing the size of the model. However, the crossing and overlapping hyperrectangles generated during NGE learning are known to degrade classification performance. In this paper, we propose the DHGen (Dominant Hyperrectangle Generation) algorithm. For hyperrectangles that cross or overlap, DHGen uses interval weights based on mutual information and splits off the interval with the largest mutual information to form new hyperrectangles. By processing the training set incrementally, DHGen improves classification performance and reduces the number of hyperrectangles. Experiments on benchmark data sets drawn from the UCI Machine Learning Repository show that DHGen achieves classification performance comparable to k-NN and better than the EACH system, which implements the NGE theory.
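The abstract names three ingredients without giving their details: distance from a query point to a hyperrectangle, detection of overlap between hyperrectangles, and a split chosen by mutual information. The Python sketch below illustrates generic textbook versions of each; it is not the DHGen algorithm itself. The function names (`rect_distance`, `rects_overlap`, `best_split`), the unweighted Euclidean distance, and the midpoint thresholds are all assumptions made for illustration. For a binary split of one numeric feature, the information gain computed here equals the mutual information between the split and the class label.

```python
import math
from collections import Counter


def rect_distance(point, lower, upper):
    """Euclidean distance from a point to an axis-aligned hyperrectangle.

    Returns 0.0 when the point lies inside the rectangle (assumed, unweighted form).
    """
    total = 0.0
    for x, lo, hi in zip(point, lower, upper):
        gap = max(lo - x, 0.0, x - hi)  # distance outside the interval on this axis
        total += gap * gap
    return math.sqrt(total)


def rects_overlap(lower_a, upper_a, lower_b, upper_b):
    """True when two axis-aligned hyperrectangles intersect on every axis."""
    return all(
        la <= ub and lb <= ua
        for la, ua, lb, ub in zip(lower_a, upper_a, lower_b, upper_b)
    )


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, gain) with the highest information gain for one feature.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, 0.0) when no split improves on the unsplit entropy.
    """
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - remainder
        if gain > best[1]:
            best = (threshold, gain)
    return best
```

For example, `best_split([1, 2, 3, 4], ["a", "a", "b", "b"])` selects the threshold 2.5, which separates the two classes completely. A method in the spirit of the abstract would apply such a split to the region where two hyperrectangles of different classes overlap, then rebuild hyperrectangles on each side.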

References

  1. Aha, D.W. et al., "Instance-Based Learning Algorithms," Machine Learning, Vol. 6, pp. 37-66, 1991.
  2. D. Zaharie, L. Perian, V. Negru, "A View Inside the Classification with Non-Nested Generalized Exemplars," IADIS European Conference on Data Mining, 24-26 July, Rome, Italy, pp. 19-26, 2011.
  3. Kyoung-jae Kim, "Prediction of KOSPI using Data Editing Techniques and Case-based Reasoning," Journal of the Korea Society of Computer and Information, v.12, no.6, pp. 287-295, 2007.
  4. Jeong-hoon Seu, "The Study for Traffic Signal Control Expert System using Case-based system and Rule-based system," Journal of the Korea Society of Computer and Information, v.11, no.2, pp. 121-129, 2006.
  5. Wettschereck, D. and Dietterich, T.G., "An Experimental Comparison of the Nearest-Neighbor and Nearest-Hyperrectangle Algorithms," Machine Learning, Vol. 19, pp. 1-25, 1995.
  6. Wettschereck, D., "A hybrid nearest-neighbor and nearest hyperrectangle algorithm," Proceedings of European Conference on Machine Learning, Springer Verlag NY, eds. F. Bergadano, L. De Raedt, pp. 323-335, 1994.
  7. P. Domingos, "Unifying instance-based and rule-based induction," Machine Learning, vol. 24, pp. 141-168, 1996.
  8. Lee-sang Jeong, Chang-seung Ha, "A Study on the Design and Implementation Human Resource Dispatch System of Using Case Based Reasoning," Journal of the Korea Society of Computer and Information, v.12, no.3, pp. 95-103, 2007.
  9. Xiang Y, Jin R, Fuhry D, Dragan F, "Summarizing transactional databases with overlapped hyperrectangles." Data Min Knowl Discov 23(2), pp.215-251, 2011. https://doi.org/10.1007/s10618-010-0203-9
  10. S Garcia et al, "A First Approach to Nearest Hyperrectangle Selection by Evolutionary Algorithms," Proc. of 9th Intern. Conf. on Intelligent System Design and Applications, Pisa, Italy, pp.517-522, 2009.
  11. A. Asuncion and D. Newman, "UCI machine learning repository," 2007. http://www.ics.uci.edu/~mlearn/MLRepository.html
  12. D. R. Wilson and T. R. Martinez, "Improved heterogeneous distance functions," Journal of Artificial Intelligence Research, vol. 6, pp. 1-34, 1997.