Design of Multilayer Perceptrons for Pattern Classifications

  • Sang-Hoon Oh (Department of Information and Communication Engineering, Mokwon University)
  • Received : 2010.03.24
  • Accepted : 2010.05.18
  • Published : 2010.05.28

Abstract

Multilayer perceptrons (MLPs), also called feed-forward neural networks, are widely applied in many areas on the strength of theoretical results showing that they can approximate arbitrary functions. Applying an MLP to a practical problem requires choosing a number of parameters and a training method. This paper discusses how to make those choices when MLPs are used for pattern classification. The discussion covers how to determine the number of nodes in each layer, how to initialize the weights, which of the various error functions to train with in order to improve performance, how to train on imbalanced data, and deep architectures.
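To make the design decisions named in the abstract concrete, the following is a minimal sketch in plain Python, not the paper's own procedure. It assumes sigmoid units throughout, small uniform random initial weights, and targets in {0, 1}. The function names (`build_mlp`, `forward`, `output_delta`) and the default `scale=0.1` are hypothetical choices made for illustration only.

```python
import math
import random


def build_mlp(layer_sizes, scale=0.1, seed=0):
    """Create one weight matrix per layer, e.g. layer_sizes=[4, 3, 2].

    Weights are drawn uniformly from [-scale, scale] (an assumed scheme).
    Each row has one extra entry that acts as the bias weight.
    """
    rng = random.Random(seed)
    return [
        [[rng.uniform(-scale, scale) for _ in range(n_in + 1)]
         for _ in range(n_out)]
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    ]


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def forward(layers, x):
    """Return the activations of every layer, input first, output last."""
    activations = [list(x)]
    for weights in layers:
        extended = activations[-1] + [1.0]  # constant 1.0 feeds the bias weight
        activations.append(
            [sigmoid(sum(w * v for w, v in zip(row, extended)))
             for row in weights]
        )
    return activations


def output_delta(y, t, error="mse"):
    """Error signal (-dE/da) at a sigmoid output node with output y, target t."""
    if error == "mse":            # E = 0.5 * (t - y)^2
        return (t - y) * y * (1.0 - y)
    if error == "cross_entropy":  # E = -[t ln y + (1 - t) ln(1 - y)]
        return t - y
    raise ValueError(f"unknown error function: {error}")


mlp = build_mlp([4, 3, 2])
outputs = forward(mlp, [0.1, 0.2, 0.3, 0.4])[-1]
print(outputs)
print(output_delta(0.99, 0.0, "mse"), output_delta(0.99, 0.0, "cross_entropy"))
```

The last line shows one reason the choice of error function matters: for an output node saturated at the wrong value (y near 1 with target 0), the mean-squared-error signal is scaled by y(1 - y) and becomes very small, while the cross-entropy signal stays close to -1.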
