Modified Transformation and Evaluation for High Concentration Ozone Predictions

An Improved Transformation Technique and Prediction Performance Evaluation for High-Concentration Ozone Prediction

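  • 천성표 (School of Electronic, Electrical and Communication Engineering, Pusan National University) ;
  • 김성신 (School of Electronic, Electrical and Communication Engineering, Pusan National University) ;
  • 이종범 (Department of Environmental Science, Kangwon National University)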
  • Published : 2007.08.25

Abstract

To reduce the damage caused by high-concentration ozone in the air, we studied how to predict high-concentration ozone before it occurs. High-concentration ozone is a rare event, and its formation mechanism is nonlinear and complex. In this paper, we applied several data-processing methods to address these characteristics. We clustered the data using the fuzzy c-means method and used rejection sampling to handle missing and abnormal data. Next, we calculated the correlation between each input component and the output ozone concentration, and applied a modified log transformation to the inputs to strengthen that correlation. We then built the prediction models using Dynamic Polynomial Neural Networks and selected the optimal model with a minimum bias criterion. Finally, to evaluate the proposed approach, we compared two models: one trained and tested on the transformed data, and the other on the untransformed data. We concluded that the modified transformation improved performance in some evaluations, in particular for data related to seasonal characteristics or their variation trends.

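To reduce the damage caused by high-concentration ozone in the atmosphere, research has been conducted on predicting ozone concentration before high-concentration ozone occurs. However, high-concentration ozone occurs very rarely, and the atmospheric ozone formation process is highly nonlinear and complex. To overcome these characteristics and develop a more accurate prediction model, this paper introduces a variety of data-processing techniques. In the preprocessing stage, we clustered the data by ozone concentration using the FCM (Fuzzy C-mean) method, used rejection sampling to handle missing or abnormal data, and applied a log transformation to improve the correlation between the model inputs and output. DPNN (Dynamical Polynomial Neural Networks) were used as the modeling technique for ozone prediction, and the optimized model was selected with the Minimum Bias Criterion. Finally, to show the effect of the log transformation on the prediction model, we divided the input data into two sets and evaluated the prediction results in several ways. As a result, we showed that the log transformation can improve model performance for ozone-related data that follow a particular distribution because of seasonal effects.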
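
The clustering step mentioned in the abstract can be illustrated with a minimal sketch of the standard fuzzy c-means update rules applied to one-dimensional concentration values. This is not the authors' implementation: the function name `fuzzy_c_means_1d`, the sample values in `ozone_ppb`, the choice of two clusters, and the fuzzifier `m = 2` are all assumptions made for illustration only.

```python
import random


def fuzzy_c_means_1d(values, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on 1-D data; returns (centers, memberships)."""
    rng = random.Random(seed)
    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in values:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([r / total for r in row])

    centers = None
    for _ in range(n_iter):
        # Center update: mean of the data weighted by u_ik ** m.
        new_centers = []
        for k in range(n_clusters):
            weights = [u[i][k] ** m for i in range(len(values))]
            weighted_sum = sum(w * x for w, x in zip(weights, values))
            new_centers.append(weighted_sum / sum(weights))

        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij) ** (2 / (m - 1)).
        exponent = 2.0 / (m - 1.0)
        for i, x in enumerate(values):
            # A small offset avoids division by zero when x equals a center.
            d = [abs(x - c) + 1e-12 for c in new_centers]
            u[i] = [
                1.0 / sum((d[k] / d[j]) ** exponent for j in range(n_clusters))
                for k in range(n_clusters)
            ]

        # Stop once the centers have effectively stopped moving.
        if centers is not None:
            shift = max(abs(a - b) for a, b in zip(new_centers, centers))
            if shift < tol:
                centers = new_centers
                break
        centers = new_centers

    return centers, u


# Hypothetical ozone readings in ppb (illustrative values, not from the paper).
ozone_ppb = [22, 25, 31, 28, 35, 40, 38, 95, 110, 102, 30, 27]
centers, memberships = fuzzy_c_means_1d(ozone_ppb)

for value, row in zip(ozone_ppb, memberships):
    print(value, [round(p, 3) for p in row])
print("centers:", [round(c, 1) for c in sorted(centers)])
```

Unlike hard clustering, each reading receives a degree of membership in every cluster, so borderline concentrations are not forced into a single group. That property is one plausible reason to prefer a fuzzy method when high-concentration samples are rare.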
