Rainfall Recognition from Road Surveillance Videos Using TSN

A Method for Recognizing Rainfall from Road Surveillance Camera Videos Using TSN

  • Li, Zhun (School of Computing, Korea Advanced Institute of Science and Technology (KAIST))
  • Hyeon, Jonghwan (School of Computing, Korea Advanced Institute of Science and Technology (KAIST))
  • Choi, Ho-Jin (School of Computing, Korea Advanced Institute of Science and Technology (KAIST))
  • Received : 2018.09.19
  • Accepted : 2018.10.22
  • Published : 2018.10.31


Rainfall depth is important meteorological information, and rainfall data with high spatial resolution, such as road-level data, are generally more useful. However, installing enough Automatic Weather Systems to collect road-level rainfall data is expensive. In this paper, we propose using deep learning to recognize rainfall depth from road surveillance videos. To this end, we collect a new video dataset and propose a procedure for calculating refined rainfall depth from the original meteorological data. We also propose using the differential frame, in addition to the optical flow image, to improve recognition of rainfall depth. Experiments under the Temporal Segment Networks (TSN) framework show that combining the video frame with the differential frame is a superior solution for rainfall depth recognition. The final model achieves high performance on the single-location, low-sensitivity classification task and reasonable accuracy on the higher-sensitivity classification task in both the single-location and multi-location cases.
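
As a rough illustration of two ideas named in the abstract, the sketch below computes differential frames and picks one frame per temporal segment. It is not the authors' implementation. It assumes a differential frame is the absolute pixel-wise difference between consecutive frames, and it uses deterministic center sampling per segment as a simplified stand-in for TSN-style sparse sampling. The function names `differential_frames` and `segment_center_indices` are hypothetical.

```python
from typing import List

import numpy as np


def differential_frames(frames: np.ndarray) -> np.ndarray:
    """Absolute difference between consecutive frames (assumed definition).

    frames: uint8 array of shape (T, H, W) or (T, H, W, C).
    Returns a uint8 array with T - 1 frames.
    """
    # Widen to a signed type first so the subtraction cannot wrap around.
    signed = frames.astype(np.int16)
    return np.abs(signed[1:] - signed[:-1]).astype(np.uint8)


def segment_center_indices(num_frames: int, num_segments: int) -> List[int]:
    """Split a video into equal segments and return each segment's center index.

    This is a simplified, deterministic variant of segment-based sampling.
    """
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    edges = np.linspace(0, num_frames, num_segments + 1)
    return [int((edges[i] + edges[i + 1]) // 2) for i in range(num_segments)]


if __name__ == "__main__":
    # Synthetic 9-frame grayscale clip, used only to exercise the functions.
    clip = np.random.randint(0, 256, size=(9, 4, 4), dtype=np.uint8)
    diffs = differential_frames(clip)
    picks = segment_center_indices(len(diffs), 3)
    print(diffs.shape, picks)
```

Under these assumptions, the selected video frames and the matching differential frames could each be fed to their own network stream, which is one way to read "the combination of the video frame and the differential frame".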


Grant : Meteorological and Earthquake See-At Technology Development Research

Supported by : Korea Meteorological Administration

