An efficient machine learning method for digital data using a cost function and parameters

  • Ji, Sangmin (Department of Mathematics, Chungnam National University)
  • Park, Jieun (Seongsan Liberal Arts College, Daegu University)
  • Received : 2021.07.08
  • Accepted : 2021.10.20
  • Published : 2021.10.28

Abstract

Machine learning is the process of constructing a cost function from the training data and an artificial neural network that predicts the data, and then finding the parameters that minimize this cost function. The parameters are updated with gradient-based methods applied to the cost function. The more complex the digital signal and the problem to be learned, the more complex and deeper the structure of the artificial neural network becomes, and such complex, deep network structures can cause over-fitting. To avoid over-fitting, weight-decay regularization of the parameters is used. We additionally use the value of the cost function in this regularization. In this way the accuracy of machine learning is improved, and its superiority is confirmed through numerical experiments. As a result, artificial intelligence trained in this way yields accurate values over a wide range of data.
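
To make the idea concrete, the sketch below trains a toy least-squares model with plain gradient descent and L2 weight decay, and compares it against a variant in which the decay term is additionally scaled by the current value of the cost function. This is only one plausible reading of the abstract, not the paper's exact regularizer; the toy data, the hyperparameters lam and lr, and the cost_scaled flag are illustrative assumptions.

import numpy as np

# Minimal sketch: gradient descent with L2 weight decay on a toy
# least-squares problem. The cost_scaled variant multiplies the decay
# term by the current data-fit cost; this is a hypothetical reading of
# the abstract, not the paper's exact regularizer.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))               # toy inputs
w_true = rng.normal(size=10)                 # ground-truth parameters
y = X @ w_true + 0.1 * rng.normal(size=200)  # noisy targets

def data_cost(w):
    """Mean-squared-error part of the cost function."""
    r = X @ w - y
    return 0.5 * np.mean(r ** 2)

def gradient(w, lam, cost_scaled):
    """Gradient of MSE plus weight decay, optionally scaled by the cost value."""
    r = X @ w - y
    g = X.T @ r / len(y)                          # gradient of the MSE term
    scale = data_cost(w) if cost_scaled else 1.0  # hypothetical modification
    return g + lam * scale * w                    # weight-decay term

def train(lam=1e-2, lr=0.1, steps=500, cost_scaled=False):
    w = np.zeros(10)
    for _ in range(steps):
        w -= lr * gradient(w, lam, cost_scaled)   # plain gradient descent step
    return w

w_fixed = train(cost_scaled=False)
w_scaled = train(cost_scaled=True)
print("parameter error, fixed decay      :", np.linalg.norm(w_fixed - w_true))
print("parameter error, cost-scaled decay:", np.linalg.norm(w_scaled - w_true))

In the standard case the update is w ← w − η(∇L(w) + λw); the cost-scaled variant replaces λ with λ·L(w), so the penalty is strong while the fit is poor and fades as the cost decreases.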

Keywords

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF-2017R1E1A1A03070311).
