A Performance Study of Gaussian Radial Basis Function Model for the Monk's Problems


  • Shin, Mi-Young (School of Electrical Engineering and Computer Science, Kyungpook National University) ;
  • Park, Joon-Goo (School of Electrical Engineering and Computer Science, Kyungpook National University)
  • Published : 2006.11.25

Abstract

As an analytic method for uncovering interesting patterns hidden in large volumes of data, data mining has been actively studied in various fields. However, the current state of the art in data mining research faces several challenging problems, one of which is that it is too ad hoc: existing techniques are mostly designed for individual problems, and there is no unifying theory applicable to more general data mining problems. In this paper, we address classification, one of the significant data mining tasks. Specifically, our objective is to evaluate the radial basis function (RBF) model for classification tasks and investigate its usefulness. For the evaluation, we analyze the Monk's problems, which are well-known benchmark datasets in data mining research. We first develop RBF models using the representational capacity (RC) based learning algorithm, and then compare the results with those of models generated by existing techniques. A variety of experiments show empirically that the RBF model not only achieves superior performance on the Monk's problems but also has a modeling process that can be controlled in a systematic way, so the RBF model with the RC-based algorithm might be a good candidate for handling the current ad hoc problem.

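Data mining is an analytic technique for discovering hidden patterns in large volumes of data. Although it has been studied extensively, current data mining research still has important open issues, such as the ad hoc problem. That is, because mining techniques designed for individual problems are mostly used, research is needed on systematic mining techniques that can be applied to many problems in a unified way. In this paper, we examine the performance of the radial basis function (RBF) model as a method for classification modeling, one of the core data mining tasks, and investigate its usefulness. In particular, to analyze the Monk's problems, a representative benchmark dataset in data mining, we build RBF models using the RC (Representation Capacity) based algorithm and compare their classification performance with previously reported results. We thereby show not only that the RBF model is superior in classification performance but also that its modeling process can be suitably controlled in a systematic manner, and through this that the problem of the current ad hoc approach can be resolved to some extent.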
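
For reference, a Gaussian RBF model is commonly written in the form below. This is a minimal sketch of the standard formulation, not necessarily the exact parametrization used in the paper; the symbols m, w_j, \mu_j, and \sigma are assumptions introduced here for illustration and do not appear in the abstract.

```latex
f(\mathbf{x}) = \sum_{j=1}^{m} w_j \, \phi_j(\mathbf{x}),
\qquad
\phi_j(\mathbf{x}) = \exp\!\left( -\frac{\lVert \mathbf{x} - \boldsymbol{\mu}_j \rVert^{2}}{2\sigma^{2}} \right)
```

Here m is the number of basis functions, \mu_j are their centers, \sigma is a shared width, and w_j are linear output weights. For a two-class task such as the Monk's problems, a class label can be obtained by thresholding f(x).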
