A Study on Fog Forecasting Method through Data Mining Techniques in Jeju

  • Received : 2016.03.21
  • Reviewed : 2016.04.04
  • Published : 2016.04.30


Fog can have a significant impact on road conditions. To improve fog predictability in Jeju, we trained machine learning models using six data mining techniques: tree models, conditional inference tree, random forest, multinomial logistic regression, neural network, and support vector machine. To validate the models, the simulation results were compared with fog data observed at Jeju (ASOS site 184) and Gosan (ASOS site 185). The prediction rates of all six data mining methods are above 92% at both sites. Additionally, we validated the performance of the machine learning models using meteorological outputs from the WRF (Weather Research and Forecasting) model. We found that this performance is still not good enough for operational fog forecasting. According to the model assessment based on metrics from the confusion matrix, fog prediction using the neural network is the most effective method.
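The abstract does not state which confusion-matrix metrics were used, so the following is only a minimal sketch of how such an assessment could be computed for a binary fog / no-fog forecast. The metric set (accuracy, probability of detection, false alarm ratio, critical success index), the function names, and the sample labels are all assumptions chosen for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    """2x2 counts for a binary fog / no-fog forecast (hypothetical layout)."""
    hits: int = 0               # fog predicted, fog observed
    false_alarms: int = 0       # fog predicted, no fog observed
    misses: int = 0             # no fog predicted, fog observed
    correct_negatives: int = 0  # no fog predicted, no fog observed


def confusion_matrix(observed, predicted):
    """Count the four outcomes from paired 0/1 (or bool) labels."""
    cm = ConfusionMatrix()
    for obs, pred in zip(observed, predicted):
        if pred and obs:
            cm.hits += 1
        elif pred and not obs:
            cm.false_alarms += 1
        elif obs:
            cm.misses += 1
        else:
            cm.correct_negatives += 1
    return cm


def _ratio(numerator, denominator):
    # Return NaN instead of raising when a metric is undefined.
    return numerator / denominator if denominator else float("nan")


def metrics(cm):
    """Common verification scores; the choice of scores is an assumption."""
    total = cm.hits + cm.false_alarms + cm.misses + cm.correct_negatives
    return {
        "accuracy": _ratio(cm.hits + cm.correct_negatives, total),
        "pod": _ratio(cm.hits, cm.hits + cm.misses),          # probability of detection
        "far": _ratio(cm.false_alarms, cm.hits + cm.false_alarms),  # false alarm ratio
        "csi": _ratio(cm.hits, cm.hits + cm.misses + cm.false_alarms),  # critical success index
    }


if __name__ == "__main__":
    # Made-up labels: 1 = fog, 0 = no fog.
    observed = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    predicted = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
    for name, value in metrics(confusion_matrix(observed, predicted)).items():
        print(f"{name}: {value:.3f}")
```

Because fog is a comparatively rare event, accuracy can stay high even when many fog cases are missed, which is why scores such as POD and CSI are often reported alongside it.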


Fog prediction;Data mining;R;Tree models;Conditional inference tree;Random forest;Multinomial logistic regression;Neural network;Support vector machine;Confusion matrix


Research project supervising institution : Ministry of Trade, Industry and Energy

