Optimal Algorithm and Number of Neurons in Deep Learning

  • 장하영 (Department of Computer Engineering, Kunsan National University) ;
  • 유은경 (Air Force Aviation Software Support Center) ;
  • 김혁진 (Department of Computer Engineering, Chungwoon University)
  • Received : 2022.02.21
  • Accepted : 2022.04.20
  • Published : 2022.04.28

Abstract

Deep learning is based on the perceptron and is currently used in various fields such as image recognition, speech recognition, object detection, and drug development. Accordingly, a variety of learning algorithms have been proposed, and the number of neurons constituting a neural network varies greatly from researcher to researcher. This study analyzed the learning characteristics, according to the number of neurons, of the widely used SGD, momentum, AdaGrad, RMSProp, and Adam methods. To this end, a neural network was constructed with one input layer, three hidden layers, and one output layer. ReLU was applied as the activation function, cross-entropy error (CEE) was applied as the loss function, and MNIST was used as the experimental dataset. As a result, it was concluded that a neuron count of 100-300, the Adam algorithm, and 200 training iterations would be the most efficient for deep learning training. This study will provide implications for the choice of algorithm and a reference value for the number of neurons when new training data are given in the future.
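As a rough illustration only (this is not the paper's code), the sketch below builds the setup described in the abstract: a three-hidden-layer ReLU network on MNIST with cross-entropy error, trained for 200 iterations with each of the compared optimizers. It assumes PyTorch and torchvision are available; the learning rates and the batch size of 100 are assumed values, not taken from the paper.

    # Rough illustration (not the paper's code): a 3-hidden-layer MLP on MNIST with
    # ReLU activations and cross-entropy error (CEE), trained for 200 iterations
    # with each of the optimizers compared in the study. Learning rates and batch
    # size are assumed values for illustration only.
    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader
    from torchvision import datasets, transforms

    def make_model(hidden=100):
        # one input layer (784 pixels), three hidden layers, one output layer (10 classes)
        return nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 10),
        )

    optimizers = {
        "SGD":      lambda p: torch.optim.SGD(p, lr=0.01),
        "Momentum": lambda p: torch.optim.SGD(p, lr=0.01, momentum=0.9),
        "AdaGrad":  lambda p: torch.optim.Adagrad(p, lr=0.01),
        "RMSProp":  lambda p: torch.optim.RMSprop(p, lr=0.001),
        "Adam":     lambda p: torch.optim.Adam(p, lr=0.001),
    }

    train_set = datasets.MNIST("data", train=True, download=True,
                               transform=transforms.ToTensor())
    loader = DataLoader(train_set, batch_size=100, shuffle=True)
    loss_fn = nn.CrossEntropyLoss()  # cross-entropy error

    for name, make_opt in optimizers.items():
        model = make_model(hidden=100)   # vary 100-300 to reproduce the neuron-count comparison
        opt = make_opt(model.parameters())
        batches = iter(loader)
        for step in range(200):          # 200 training iterations
            x, y = next(batches)
            loss = loss_fn(model(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
        print(f"{name}: training loss after 200 iterations = {loss.item():.4f}")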

Acknowledgement

This work was supported by the Chungwoon University Research Year in 2021.
