Feature Selection to Predict Very Short-term Heavy Rainfall Based on Differential Evolution

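  • Jae-Hyun Seo (Department of Computer Science, Kwangwoon University)
  • Yong Hee Lee (Forecast Research Division, National Institute of Meteorological Research)
  • Yong-Hyuk Kim (Department of Computer Science, Kwangwoon University)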
  • Received : 2012.07.27
  • Accepted : 2012.11.30
  • Published : 2012.12.25

Abstract

For very short-term heavy rainfall prediction, we used four years of recent weather records provided by the National Institute of Meteorological Research of the Korea Meteorological Administration. We divided the dataset into three parts: training, validation, and test sets. Because the solution space grows exponentially with the dimensionality of the data, which slows experiments considerably, we used feature selection to keep only the important features among the 72 available. To select a more accurate feature subset, we used a differential evolution algorithm with two classifiers as the fitness function of the evolutionary computation: a Support Vector Machine (SVM), which generally shows high performance, and k-Nearest Neighbor (k-NN), which is generally fast. We also preprocessed the weather data using undersampling and normalization. In our experiments, the test results with SVM were generally better than those with k-NN. The test results of our differential evolution algorithm were about five times better than those using all features and about 1.36 times better than those using a genetic algorithm, the best known approach. Running times with the genetic algorithm were about twenty times longer than those with the differential evolution algorithm.
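The wrapper approach described in the abstract, in which differential evolution searches over feature subsets and a classifier scores each candidate, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the data are synthetic with 8 features instead of the 72 weather features, the fitness is plain k-NN accuracy on a validation set, the DE variant is rand/1/bin with a 0.5 threshold on each gene, and all function names and parameter values (`F`, `CR`, population size, generations) are hypothetical. Undersampling, normalization, and the SVM classifier are omitted.

```python
import random

def knn_predict(train_X, train_y, x, mask, k=3):
    """Majority vote of the k nearest training points, using selected features only."""
    idx = [j for j, m in enumerate(mask) if m]
    dists = sorted(
        (sum((row[j] - x[j]) ** 2 for j in idx), label)
        for row, label in zip(train_X, train_y)
    )
    votes = sum(label for _, label in dists[:k])
    return 1 if 2 * votes > k else 0

def fitness(vec, train, valid, k=3):
    """Validation accuracy of k-NN on the features whose gene exceeds 0.5 (assumed metric)."""
    mask = [v > 0.5 for v in vec]
    if not any(mask):
        return 0.0  # an empty feature subset cannot classify anything
    (tX, ty), (vX, vy) = train, valid
    hits = sum(knn_predict(tX, ty, x, mask, k) == y for x, y in zip(vX, vy))
    return hits / len(vy)

def de_feature_selection(train, valid, n_features,
                         pop_size=12, gens=15, F=0.5, CR=0.9, seed=0):
    """DE/rand/1/bin over real vectors in [0, 1]; returns (feature mask, fitness)."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    fit = [fitness(p, train, valid) for p in pop]
    for _ in range(gens):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n_features)  # guarantees one mutated gene
            trial = []
            for j in range(n_features):
                if rng.random() < CR or j == j_rand:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    trial.append(min(1.0, max(0.0, v)))  # clip to [0, 1]
                else:
                    trial.append(pop[i][j])
            f = fitness(trial, train, valid)
            if f >= fit[i]:  # greedy selection: keep the trial if it is no worse
                pop[i], fit[i] = trial, f
    best = max(range(pop_size), key=fit.__getitem__)
    return [v > 0.5 for v in pop[best]], fit[best]

def make_data(n, n_features, rng):
    """Synthetic binary data in which only feature 0 carries the class signal."""
    X, y = [], []
    for _ in range(n):
        label = rng.randrange(2)
        row = [rng.random() for _ in range(n_features)]
        row[0] = 0.8 * label + 0.2 * rng.random()
        X.append(row)
        y.append(label)
    return X, y

rng = random.Random(42)
train = make_data(60, 8, rng)
valid = make_data(30, 8, rng)
mask, score = de_feature_selection(train, valid, n_features=8)
print("selected features:", [j for j, m in enumerate(mask) if m])
print("validation accuracy:", score)
```

In a setup like the one the abstract describes, the final feature subset would then be evaluated once on the held-out test set, so that the validation data used during the search do not inflate the reported result.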

Acknowledgement

Grant : Forecast Technology Support and Application Research

Supported by : National Institute of Meteorological Research
