Hyper Parameter Tuning Method based on Sampling for Optimal LSTM Model

  • Kim, Hyemee (Department of Industrial Engineering, Pusan National University) ;
  • Jeong, Ryeji (Department of Industrial Engineering, Pusan National University) ;
  • Bae, Hyerim (Department of Industrial Engineering, Pusan National University)
  • Received : 2018.12.04
  • Accepted : 2018.12.07
  • Published : 2019.01.31

Abstract

As computing performance improves, deep learning, which previously faced technical limitations, is being applied in increasingly diverse ways. In many fields it has contributed to creating added value, and as its applications diversify it is built on ever larger amounts of data. Obtaining a better-performing model therefore takes longer than before, so it becomes necessary to find the optimal model more quickly. In artificial neural network modeling, a tuning process that changes various elements of the network is used to improve model performance. Apart from Grid Search and Manual Search, which are the most widely used tuning methods, most methodologies have been developed around heuristic algorithms. A heuristic algorithm can produce results in a short time, but those results are likely to be only a local optimum. Finding the global optimum eliminates the possibility of settling for a local optimum. Although the Brute Force method is commonly used to find a global optimum, it is not applicable here because the number of hyperparameter combinations is effectively infinite. In this paper, we use a statistical technique to reduce the number of candidate combinations so that the global optimal solution can be found.
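The abstract describes the approach only at a high level, so the following is a minimal Python sketch of a sampling-based pruning loop of the kind outlined (sample some hyperparameter combinations, derive a criterion from the sampled RMSEs, discard hyperparameter values that are unlikely to beat the criterion, then search the reduced grid exhaustively). The grid values, the `evaluate` placeholder, the choice of the sample mean as the criterion, and the `keep_prob` threshold are illustrative assumptions rather than the paper's exact settings; in practice `evaluate` would train an LSTM and return its validation RMSE.

```python
import itertools
import random
from statistics import mean, stdev, NormalDist

# Hypothetical hyperparameter grid; the paper's actual search space (Table 1) may differ.
GRID = {
    "hidden_units": [16, 32, 64, 128],
    "window_size": [5, 10, 20, 40],
    "learning_rate": [1e-2, 1e-3, 1e-4],
}

def evaluate(params):
    """Placeholder: train an LSTM with `params` and return its validation RMSE.
    A deterministic pseudo-RMSE stands in here so the sketch runs on its own."""
    rng = random.Random(hash(tuple(sorted(params.items()))))
    return rng.uniform(0.5, 2.0)

def sample_then_filter(grid, n_samples=20, keep_prob=0.5):
    """Sampling-based pruning: estimate, for each hyperparameter value, the chance of
    beating a criterion (here the mean sampled RMSE), drop values that rarely do,
    then brute-force only the surviving combinations."""
    combos = [dict(zip(grid, values)) for values in itertools.product(*grid.values())]
    sampled = random.sample(combos, min(n_samples, len(combos)))
    results = [(c, evaluate(c)) for c in sampled]
    criterion = mean(r for _, r in results)           # criterion assumed to be the sample mean

    reduced = {}
    for name, values in grid.items():
        kept = []
        for v in values:
            rmses = [r for c, r in results if c[name] == v]
            if len(rmses) < 2:
                kept.append(v)                        # too few samples to judge: keep the value
                continue
            dist = NormalDist(mean(rmses), stdev(rmses) or 1e-9)
            if dist.cdf(criterion) >= keep_prob:      # P(RMSE < criterion) is high enough
                kept.append(v)
        reduced[name] = kept or values                # never empty a dimension completely
    survivors = [dict(zip(reduced, values)) for values in itertools.product(*reduced.values())]
    return min(survivors, key=evaluate)               # exhaustive search on the reduced grid only

if __name__ == "__main__":
    print(sample_then_filter(GRID))
```

The key design point is that the exhaustive (Brute Force) search is applied only after low-probability hyperparameter values have been filtered out, which keeps the global search tractable while still covering every surviving combination.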

Keywords

Fig. 1. Basic RNN construction

Fig. 2. Various structures of RNN

Fig. 3. Algorithm flow chart

Fig. 4. Distribution of sample with criterion

Fig. 5. Example of filtering combinations

Fig. 6. Example of setting a criterion

Table 1. Condition of experiments (a = the number of input variables, b = any integer)

Table 2. Mean RMSE from the first loop of the experiment using gold price data

Table 3. Probability that the distribution exceeds the criterion, from the first loop of the experiment using gold price data

Table 4. Experiment results – performance (RMSE)

Table 5. Experiment results – the number of experiments
