Prediction of extreme PM2.5 concentrations via extreme quantile regression

  • Lee, SangHyuk (Department of Statistics, Chung-Ang University)
  • Park, Seoncheol (Department of Information Statistics, Chungbuk National University)
  • Lim, Yaeji (Department of Statistics, Chung-Ang University)
  • Received : 2021.10.07
  • Accepted : 2021.11.22
  • Published : 2022.05.31

Abstract

In this paper, we develop a new statistical model to forecast the PM2.5 level in Seoul, South Korea. The proposed model is based on extreme quantile regression with a lasso penalty. Various meteorological and air pollution variables are considered as predictors, and the lasso quantile regression performs variable selection and addresses the multicollinearity problem. The final prediction model is obtained by combining several extreme lasso quantile regression estimators, and we construct a binary classifier based on this model. Prediction performance is evaluated using standard performance measures for binary classification. We observe that the proposed method outperforms the other classification methods considered and predicts 'very bad' cases of the PM2.5 level well.
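The ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it only shows the check (pinball) loss with an L1 penalty that defines a lasso quantile regression objective, a thresholding rule that turns a predicted high quantile into a binary label, and the usual binary classification measures. The names `tau` (quantile level), `lam` (penalty weight), and `threshold` (the cutoff for a 'very bad' day) are hypothetical parameters chosen for illustration, and the rule for combining several estimators is not reproduced because the abstract does not specify it.

```python
def pinball_loss(residual, tau):
    """Check loss rho_tau(u) = u * (tau - 1[u < 0]) for one residual."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def lasso_quantile_objective(X, y, beta0, beta, tau, lam):
    """Mean check loss plus an L1 penalty on the slopes.

    Assumption: the intercept beta0 is left unpenalized.
    """
    total = 0.0
    for xi, yi in zip(X, y):
        fitted = beta0 + sum(b * x for b, x in zip(beta, xi))
        total += pinball_loss(yi - fitted, tau)
    return total / len(y) + lam * sum(abs(b) for b in beta)


def classify(predicted_quantiles, threshold):
    """Label a case 1 when its predicted quantile reaches the threshold."""
    return [1 if q >= threshold else 0 for q in predicted_quantiles]


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, precision and F1 from 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 instead of failing when a class is absent.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }
```

With a high `tau` such as 0.9, under-prediction is penalized nine times as heavily as over-prediction, which is what pushes the fitted line toward the upper tail. Because extreme cases are rare, measures such as sensitivity and F1 are more informative here than plain accuracy.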

Acknowledgement

This research was supported by the Chung-Ang University Graduate Research Scholarship in 2020 and the National Research Foundation of Korea (NRF) funded by the Korean government (NRF-2021R1A2B5B01001790, NRF-2021R1F1A1064096).
