A Study on Detecting Abnormal Air Quality Data Related to Vehicle Emissions Using a Deep Learning Model

  • Jungmu Choi (Dept. of Computer Eng., Inha Univ.) ;
  • Jangwoo Kwon (Dept. of Computer Eng., Inha Univ.) ;
  • Junpyo Lee (Dept. of Computer Eng., Inha Univ.) ;
  • Sunwoo Lee (Dept. of Computer Eng., Inha Univ.) ;
  • Park Jung Min (Climate and Air Quality Research Department, National Institute of Environmental Research) ;
  • Shin Hye Jung (Climate and Air Quality Research Department, National Institute of Environmental Research) ;
  • An Chan Jung (Climate and Air Quality Research Department, National Institute of Environmental Research) ;
  • Kang Soyoung (Climate and Air Quality Research Department, National Institute of Environmental Research)
  • Received : 2024.08.26
  • Accepted : 2024.09.28
  • Published : 2024.10.31

Abstract

Automobiles are a major source of air pollution. Analyzing data on air pollutants for which vehicles are the primary source can help clarify how factors such as electric vehicles and traffic volume relate to actual air pollution, and such analyses depend on reliable pollutant data. This paper proposes a method for detecting sections of data that exhibit 'baseline anomalies' in measurements from air pollutant monitoring stations across the country, combining a deep learning model with algorithms such as dynamic time warping and change point detection. Previous studies define anomalies as data showing previously unseen patterns, an approach that is not suitable for detecting baseline anomalies. In this study, we adapt the U-Net model, typically used for image segmentation, to time-series data and apply dynamic time warping and change point detection to compare each station with nearby monitoring stations, thereby minimizing false detections.
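The neighbor-comparison idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it omits the U-Net stage entirely, and the function names (`dtw_distance`, `best_mean_shift`, `flag_baseline_anomaly`), the single mean-shift change point model, and the `min_shift` threshold are all assumptions chosen for illustration. The sketch flags a station when its series shows a level shift that a nearby station does not share and the dynamic time warping distance to that neighbor grows after the shift.

```python
from statistics import mean


def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    inf = float("inf")
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, len(a) + 1):
        curr = [inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, match
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


def best_mean_shift(x, min_size=2):
    """Single change point: the split minimising within-segment squared error.

    Returns (split_index, mean_after - mean_before).
    """
    if len(x) < 2 * min_size:
        raise ValueError("series too short for the requested min_size")
    best_k, best_cost = None, float("inf")
    for k in range(min_size, len(x) - min_size + 1):
        left, right = x[:k], x[k:]
        ml, mr = mean(left), mean(right)
        cost = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, mean(x[best_k:]) - mean(x[:best_k])


def flag_baseline_anomaly(target, neighbor, min_shift=5.0):
    """Hypothetical rule: flag a level shift in `target` that `neighbor` lacks.

    `min_shift` is an assumed threshold in the same units as the data.
    """
    k, shift = best_mean_shift(target)
    neighbor_shift = mean(neighbor[k:]) - mean(neighbor[:k])
    before = dtw_distance(target[:k], neighbor[:k])
    after = dtw_distance(target[k:], neighbor[k:])
    suspicious = (
        abs(shift) >= min_shift
        and abs(neighbor_shift) < min_shift
        and after > before
    )
    return k, suspicious


# Synthetic example: the target's baseline jumps by about 10 at index 5,
# while the nearby station stays at the same level.
neighbor = [10, 12, 11, 13, 12, 11, 12, 13, 11, 12]
target = [11, 12, 12, 13, 12, 21, 22, 23, 21, 22]
split, is_anomalous = flag_baseline_anomaly(target, neighbor)
```

Comparing against a neighbor is what keeps a genuine regional change (for example, a pollution episode seen at both stations) from being flagged: if both series shift together, the rule above does not fire.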

Acknowledgement

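This research was supported by the National Institute of Environmental Research (NIER-2024-04-02-008) and was carried out as a result of the 2024 SW-Centered University Program of the Ministry of Science and ICT and the Institute of Information & Communications Technology Planning & Evaluation (2022-0-01127).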
