PM2.5 Prediction Model Performance and Variable Impact Analysis Using SHAP

  • Yong-Jin Jung (Department of Electrical, Electronics and Communication Engineering, Korea University of Technology and Education (KOREATECH)) ;
  • Chang-Heon Oh (Department of Electrical, Electronics and Communication Engineering, Korea University of Technology and Education (KOREATECH))
  • Received : 2024.10.07
  • Accepted : 2024.10.29
  • Published : 2024.10.31

Abstract

Machine learning and deep learning are studied across many fields and applied in practice, so it is important both to design reliable models and to understand how they arrive at their results. This paper uses SHAP to analyze how input variables affect predicted values. PM2.5 prediction models were built with DNN and LSTM algorithms. The training and test data consisted of weather and air pollutant variables selected through correlation analysis. Both models were evaluated by RMSE and by accuracy over AQI categories, and the LSTM model performed slightly better. SHAP was then used to determine each variable's contribution to the predictions of both models. In both models, air pollutant variables contributed strongly to the PM2.5 prediction and, among the weather variables, temperature had a high contribution. In both models, high values of temperature, wind speed, and sea level pressure decreased the predicted values, while low values increased them. For NO2, PM10, and SO2, high values pushed the LSTM model's predictions in both directions, unlike in the DNN model.
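
To make the idea of a variable's "contribution" concrete, the sketch below computes exact Shapley values, the quantity that SHAP approximates, for a single prediction. It is only an illustration under stated assumptions: the linear `predict` function, its coefficients, and the `baseline` and `sample` values are all hypothetical and are not the paper's DNN or LSTM models or data. Features left out of a coalition are set to their baseline value, which is one common way to define the attribution, not necessarily the one used in the paper.

```python
from itertools import permutations

# Hypothetical toy model: NOT the paper's DNN or LSTM.
# The coefficients below are made up purely for illustration.
WEIGHTS = {"pm10": 0.5, "no2": 0.25, "temperature": -0.75}
BIAS = 10.0


def predict(x):
    """Toy linear PM2.5 predictor over a dict of feature values."""
    return BIAS + sum(WEIGHTS[name] * x[name] for name in WEIGHTS)


def shapley_values(predict_fn, x, baseline):
    """Exact Shapley values of predict_fn at x relative to a baseline input.

    Features not yet added to the coalition keep their baseline value.
    Averaging each feature's marginal contribution over every ordering
    of the features gives its Shapley value.
    """
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)      # start from the baseline input
        previous = predict_fn(current)
        for name in order:
            current[name] = x[name]   # switch this feature to its real value
            updated = predict_fn(current)
            totals[name] += updated - previous
            previous = updated
    return {name: totals[name] / len(orderings) for name in names}


# Hypothetical reference input and sample to explain (assumed values).
baseline = {"pm10": 40.0, "no2": 20.0, "temperature": 12.0}
sample = {"pm10": 60.0, "no2": 16.0, "temperature": 20.0}

phi = shapley_values(predict, sample, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.2f}")
```

In this toy setting the above-baseline temperature receives a negative contribution, while the above-baseline PM10 receives a positive one, and the contributions sum to the difference between the prediction for `sample` and the prediction for `baseline`. Enumerating every ordering is only feasible for a handful of features; for real neural networks, approximate explainers are typically used instead.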

Acknowledgement

This paper was supported by the Education and Research Promotion Program of KOREATECH in 2023.
