Comparative Analysis on Error Back Propagation Learning and Layer By Layer Learning in Multi Layer Perceptrons

  • Young-Tae Kwak (Dept. of Computer Science, Iksan National College)
  • Published: 2003.10.01

Abstract

This paper surveys EBP (Error Back Propagation) learning, the Cross Entropy function, and LBL (Layer By Layer) learning, which are used for training MLPs (Multi Layer Perceptrons), and compares the merits and demerits of each method on handwritten digit recognition. Although EBP learning converges more slowly than the other methods in the initial stage of training, its generalization capability is the best. The Cross Entropy function, which compensates for this weakness of EBP learning, trains faster than EBP learning; however, because its output-layer error signal is linear in the target vector, its generalization capability is worse. LBL learning is the fastest of the three in the initial stage of training, but after a certain point it makes no further progress and therefore shows the lowest generalization capability. Based on these experimental results, this paper proposes a criterion for selecting a learning method when applying MLPs.
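
The claim that the Cross Entropy function learns the target vector linearly at the output layer follows from a standard derivation; the sketch below assumes sigmoid output units y_k = σ(a_k) and the usual per-pattern error functions (notation ours, not the paper's). For the squared-error function minimized by EBP learning, E = (1/2) Σ_k (t_k - y_k)², the output-layer error signal is

    δ_k = -∂E/∂a_k = (t_k - y_k) · y_k(1 - y_k),

which vanishes when y_k saturates near 0 or 1 even when the output is wrong, slowing initial learning. For the Cross Entropy function, E = -Σ_k [t_k ln y_k + (1 - t_k) ln(1 - y_k)], the sigmoid derivative y_k(1 - y_k) cancels, leaving

    δ_k = -∂E/∂a_k = t_k - y_k,

an error signal that is linear in the target vector, which accounts for the faster initial convergence described above. A minimal numeric check of the two signals (illustrative only; all names are ours, not the paper's):

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    a = np.array([6.0])   # pre-activation: output unit saturated near 1
    y = sigmoid(a)        # y ~ 0.9975
    t = np.array([0.0])   # target says this unit should be off

    # Squared error: the factor y(1 - y) crushes the learning signal.
    delta_mse = (t - y) * y * (1.0 - y)   # ~ -0.0025
    # Cross entropy: the sigmoid derivative cancels; full-strength signal.
    delta_ce = t - y                      # ~ -0.9975

    print(delta_mse, delta_ce)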
