Hierarchical Architecture of Multilayer Perceptrons for Performance Improvement

  • Sang-Hoon Oh (Dept. of Information and Communication Engineering, Mokwon University)
  • Received : 2010.03.02
  • Accepted : 2010.06.01
  • Published : 2010.06.28

Abstract

Based on the theoretical result that multi-layer feedforward neural networks with enough hidden nodes are universal approximators, three-layer MLPs (multi-layer perceptrons) consisting of input, hidden, and output layers are usually adopted for many application problems. However, this conventional three-layer MLP architecture shows poor generalization performance in some applications whose input vectors are complex, containing features of several different kinds. For performance improvement, this paper proposes a hierarchical architecture of MLPs for the case in which each part of the input carries its own specific information. That is, the input vector is divided into sub-vectors, and each sub-vector is presented to a separate MLP. These lower-level MLPs are connected to a higher-level MLP, which makes the final decision. The proposed method is verified through simulations of a protein disorder prediction problem.

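The hierarchy described in the abstract can be sketched as follows. This is an illustrative outline only, not the authors' implementation: the sub-vector sizes, hidden-layer widths, weight initialization, and logistic activation are all assumptions, and training is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class MLP:
    """A conventional three-layer perceptron: input -> hidden -> output."""
    def __init__(self, n_in, n_hid, n_out):
        self.W1 = rng.normal(0.0, 0.1, (n_in, n_hid))
        self.b1 = np.zeros(n_hid)
        self.W2 = rng.normal(0.0, 0.1, (n_hid, n_out))
        self.b2 = np.zeros(n_out)

    def forward(self, x):
        h = sigmoid(x @ self.W1 + self.b1)
        return sigmoid(h @ self.W2 + self.b2)

class HierarchicalMLP:
    """Each lower-level MLP sees one sub-vector of the input; a
    higher-level MLP combines their outputs into the final decision."""
    def __init__(self, sub_sizes, n_hid, n_out):
        self.lower = [MLP(s, n_hid, n_out) for s in sub_sizes]
        # Indices at which to split the full input vector into sub-vectors.
        self.splits = np.cumsum(sub_sizes)[:-1]
        self.upper = MLP(len(sub_sizes) * n_out, n_hid, n_out)

    def forward(self, x):
        parts = np.split(x, self.splits)
        lower_out = np.concatenate(
            [m.forward(p) for m, p in zip(self.lower, parts)])
        return self.upper.forward(lower_out)

# Example: a 10-dimensional input split into sub-vectors of sizes 4, 3, 3,
# each carrying a different kind of feature.
net = HierarchicalMLP([4, 3, 3], n_hid=5, n_out=1)
y = net.forward(rng.normal(size=10))
```

With a sigmoid output, `y` is a value in (0, 1) that can be thresholded for a binary decision such as disorder/order prediction.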
References

  1. D. E. Rumelhart and J. L. McClelland, Parallel Distributed Processing: Explorations in the Microstructures of Cognition, The MIT Press, 1986.
  2. K. Hornik, M. Stinchcombe, and H. White, "Multilayer feedforward networks are universal approximators," Neural Networks, Vol.2, pp.359-366, 1989. https://doi.org/10.1016/0893-6080(89)90020-8
  3. M. Stevenson, R. Winter, and B. Widrow, "Sensitivity of feedforward neural networks to weight errors," IEEE Trans. Neural Networks, Vol.1, pp.71-80, 1990. https://doi.org/10.1109/72.80206
  4. Y. Xie and M. A. Jabri, "Analysis of the effects of quantization in multilayer neural networks using a statistical model," IEEE Trans. Neural Networks, Vol.3, pp.334-338, 1992. https://doi.org/10.1109/72.125876
  5. J. Y. Choi and C.-H. Choi, "Sensitivity analysis of multilayer perceptron with differentiable activation functions," IEEE Trans. Neural Networks, Vol.3, pp.101-107, 1992. https://doi.org/10.1109/72.105422
  6. Y. Lee and S.-H. Oh, "Input noise immunity of multilayer perceptrons," ETRI Journal, Vol.16, pp.35-43, 1994. https://doi.org/10.4218/etrij.94.0194.0013
  7. S.-H. Oh and Y. Lee, "Sensitivity analysis of single hidden-layer neural networks with threshold functions," IEEE Trans. Neural Networks, Vol.6, pp.1005-1007, 1995. https://doi.org/10.1109/72.392264
  8. R. P. Lippmann, "Pattern classification using neural networks," IEEE Communications Magazine, Vol.27, pp.47-64, 1989.
  9. J. B. Hampshire II and A. H. Waibel, "A novel objective function for improved phoneme recognition using time-delay neural networks," IEEE Trans. Neural Networks, Vol.1, pp.216-228, 1990. https://doi.org/10.1109/72.80233
  10. A. S. Weigend and N. A. Gershenfeld, Time Series Prediction: Forecasting the future and understanding the past, Addison-Wesley Publishing Co., 1994.
  11. Y.-M. Huang, C.-M. Hung, and H. C. Jiau, "Evaluation of neural networks and data mining methods on a credit assessment task for class imbalance problem," Nonlinear Analysis, Vol.7, pp.720-747, 2006. https://doi.org/10.1016/j.nonrwa.2005.04.006
  12. Z. R. Yang and R. Thomson, "Bio-basis function neural network for prediction of protease cleavage sites in proteins," IEEE Trans. Neural Networks, Vol.16, pp.263-274, 2005. https://doi.org/10.1109/TNN.2004.836196
  13. S.-H. Oh, "Improving the error back-propagation algorithm with a modified error function," IEEE Trans. Neural Networks, Vol.8, pp.799-803, 1997. https://doi.org/10.1109/72.572117
  14. J. B. Hampshire II and A. H. Waibel, "A novel objective function for improved phoneme recognition using time-delay neural networks," IEEE Trans. Neural Networks, Vol.1, pp.216-228, 1990. https://doi.org/10.1109/72.80233
  15. A. van Ooyen and B. Nienhuis, "Improving the convergence of the back-propagation algorithm," Neural Networks, Vol.5, pp.465-471, 1992. https://doi.org/10.1016/0893-6080(92)90008-7
  16. S.-H. Oh, "A Learning Method for Imbalanced Data with a Multilayer Perceptron," Journal of the Korea Contents Association, Vol.9, No.7, pp.141-148, 2009.
  17. S.-H. Oh, "Performance Improvement of Multilayer Perceptrons by Increasing the Number of Output Nodes," Journal of the Korea Contents Association, Vol.9, No.1, pp.123-130, 2009.
  18. P. Y. Simard, D. Steinkraus, and J. C. Platt, "Best practices for convolutional neural networks," Proc. Int. Conf. Document Analysis and Recognition (ICDAR), Washington DC, USA, pp.958-962, 2003.
  19. Z.-H. Zhou and X.-Y. Liu, "Training cost-sensitive neural networks with methods addressing the class imbalance problem," IEEE Trans. Knowledge and Data Eng., Vol.18, No.1, pp.63-77, 2006. https://doi.org/10.1109/TKDE.2006.17
  20. Y. Lee, S.-H. Oh, and M. W. Kim, "An analysis of premature saturation in back-propagation learning," Neural Networks, Vol.6, pp.719-728, 1993. https://doi.org/10.1016/S0893-6080(05)80116-9
  21. S.-H. Oh, "On the Design of Multilayer Perceptrons for Pattern Classifications," Proc. Int. Conf. on Convergence Content 2009, Hanoi, Vietnam, pp.59-62, Dec. 17-19, 2009.
  22. G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, Vol.313, pp.504-507, 2006. https://doi.org/10.1126/science.1127647
  23. F. J. Owens, G. H. Zheng, and D. A. Irvine, "A multi-output-layer perceptron," Neural Computing & Applications, Vol.4, pp.10-20, 1996. https://doi.org/10.1007/BF01413865