Comparative Analysis of RNN Architectures and Activation Functions with Attention Mechanisms for Mars Weather Prediction

  • Jaehyeok Jo (Dept. of AI and Big Data, Soonchunhyang University) ;
  • Yunho Sin (Asan Middle School) ;
  • Bo-Young Kim (Asan Middle School) ;
  • Jihoon Moon (Dept. of AI and Big Data, Soonchunhyang University)
  • Received : 2024.09.27
  • Accepted : 2024.10.18
  • Published : 2024.10.31

Abstract

In this paper, we present a comparative analysis of how activation functions and attention mechanisms affect the performance of time-series models for Mars meteorological data. Mars meteorological data are nonlinear and irregular because of low atmospheric density, rapid temperature variations, and complex terrain. We evaluate these effects on four recurrent architectures: long short-term memory (LSTM), bidirectional LSTM (BiLSTM), gated recurrent unit (GRU), and bidirectional GRU (BiGRU). The activation functions tested are rectified linear unit (ReLU), leaky ReLU, exponential linear unit (ELU), Gaussian error linear unit (GELU), Swish, and scaled ELU (SELU); model performance is measured using mean absolute error (MAE) and root mean square error (RMSE). Our results show that integrating attention mechanisms improves both MAE and RMSE, with Swish and ReLU achieving the best performance for minimum temperature prediction. Conversely, GELU and ELU were less effective for pressure prediction. These results highlight the importance of selecting appropriate activation functions and attention mechanisms to improve model accuracy in complex time-series forecasting.
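For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch of the six activation functions and the two error metrics in their standard textbook forms. It is not the authors' implementation: the default slopes (`alpha`, `beta`), the exact (erf-based) form of GELU, and the sample temperature values are all assumptions made for illustration.

```python
import math

# Standard SELU constants (Klambauer et al., 2017).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def _sigmoid(x):
    # Numerically stable logistic function for scalar inputs.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def relu(x):
    return max(0.0, x)


def leaky_relu(x, alpha=0.01):
    # alpha is an assumed default slope for negative inputs.
    return x if x > 0 else alpha * x


def elu(x, alpha=1.0):
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def gelu(x):
    # Exact form x * Phi(x); many libraries also offer a tanh approximation.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def swish(x, beta=1.0):
    # With beta = 1 this is also known as SiLU.
    return x * _sigmoid(beta * x)


def selu(x):
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * (math.exp(x) - 1.0))


def mae(y_true, y_pred):
    # Mean absolute error.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    # Root mean square error; penalizes large errors more than MAE does.
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


# Hypothetical minimum-temperature values in degrees Celsius (not from the paper).
observed = [-75.0, -80.0, -78.0]
predicted = [-74.0, -82.0, -78.0]
sample_mae = mae(observed, predicted)
sample_rmse = rmse(observed, predicted)
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same predictions, which is why reporting both gives a sense of how unevenly the errors are distributed.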

Acknowledgement

This study was supported by the MSIT (Ministry of Science and ICT), Korea, under the National Program for Excellence in SW, supervised by the IITP (Institute of Information & Communications Technology Planning & Evaluation) in 2024 (2021-0-01399).
