Violent crowd flow detection from surveillance cameras using deep transfer learning-gated recurrent unit

  • Elly Matul Imah (Data Science Department, Universitas Negeri Surabaya)
  • Riskyana Dewi Intan Puspitasari (Data Science Department, Universitas Negeri Surabaya)
  • Received : 2023.06.05
  • Accepted : 2024.01.08
  • Published : 2024.08.20

Abstract

Violence can occur anywhere, including crowded places, so monitoring human activity is necessary for public safety. Surveillance cameras can observe their surroundings, but they rely on human operators to watch every incident continuously. Automatic violence detection is therefore needed for early warning and fast response. Such automation remains challenging because of low video resolution and blind spots. This paper uses ResNet50V2 and the gated recurrent unit (GRU) to detect violence in the Movies, Hockey, and Crowd video datasets. Spatial features are extracted from the frame sequence of each video using a pretrained ResNet50V2 model, and the resulting feature sequences are classified by the best-performing trained GRU model. The experimental results are compared with wavelet-based feature extraction and with other classification models, such as the convolutional neural network and long short-term memory. The results show that the proposed combination of ResNet50V2 and GRU is robust and delivers the best performance among the compared methods in terms of accuracy, recall, precision, and F1-score, and that using ResNet50V2 for feature extraction can improve model performance.
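The recurrent stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each frame has already been reduced to a feature vector (standing in for ResNet50V2 output), and the parameter names (`Wz`, `Uz`, `bz`, and so on), the helper `zero_params`, and the single-unit sigmoid output layer are hypothetical choices made only for illustration. It uses one common GRU convention in which the update gate weights the candidate state.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def gru_step(x, h, p):
    """One GRU update for frame feature x and previous hidden state h."""
    # Update gate z and reset gate r, each computed from x and h.
    z = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(p["Wz"], x), matvec(p["Uz"], h), p["bz"])]
    r = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(p["Wr"], x), matvec(p["Ur"], h), p["br"])]
    # Candidate state is computed from x and the reset-gated previous state.
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b + c)
            for a, b, c in zip(matvec(p["Wh"], x), matvec(p["Uh"], rh), p["bh"])]
    # Blend old state and candidate (assumed convention: z weights the candidate).
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def classify_clip(frame_features, p, w_out, b_out):
    """Run the GRU over a clip's frame features and return a score in (0, 1)."""
    h = [0.0] * len(p["bz"])
    for x in frame_features:
        h = gru_step(x, h, p)
    # Hypothetical output layer: one sigmoid unit on the final hidden state.
    return sigmoid(sum(w * hi for w, hi in zip(w_out, h)) + b_out)


def zero_params(input_dim, hidden_dim):
    """Hypothetical helper: all-zero GRU parameters, useful for sanity checks."""
    p = {}
    for gate in ("z", "r", "h"):
        p["W" + gate] = [[0.0] * input_dim for _ in range(hidden_dim)]
        p["U" + gate] = [[0.0] * hidden_dim for _ in range(hidden_dim)]
        p["b" + gate] = [0.0] * hidden_dim
    return p
```

In practice the parameters would be learned from labelled clips, and a deep learning framework would replace these hand-written loops; the sketch only shows how a sequence of per-frame features is folded into a single clip-level score.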

Acknowledgement

This research was supported by a Prototype Output Grant (Approval No. B/65354/UN38.III.1/LK.04.00/2023, No. 205/E5/PG.02.00.PM/2023), funded by the Ministry of Education, Culture, Research, and Technology of the Republic of Indonesia.
