AONet: Attention network with optional activation for unsupervised video anomaly detection

  • Akhrorjon Akhmadjon Ugli Rakhmonov (Department of Computer Science and Engineering, Kyungpook National University) ;
  • Barathi Subramanian (Department of Computer Science and Engineering, Kyungpook National University) ;
  • Bahar Amirian Varnousefaderani (Department of Computer Science and Engineering, Kyungpook National University) ;
  • Jeonghong Kim (Department of Computer Science and Engineering, Kyungpook National University)
  • Received: 2024.03.15
  • Accepted: 2024.08.21
  • Published: 2024.10.10

Abstract

Anomaly detection in video surveillance is crucial but challenging due to the rarity of irregular events and the ambiguity of defining anomalies. We propose a method called AONet that utilizes a spatiotemporal module to extract spatiotemporal features efficiently, as well as a residual autoencoder equipped with an attention network for effective future frame prediction in video anomaly detection. AONet utilizes a novel activation function called OptAF that combines the strengths of the ReLU, leaky ReLU, and sigmoid functions. Furthermore, the proposed method employs a combination of robust loss functions to address various aspects of prediction errors and enhance training effectiveness. The performance of the proposed method is evaluated on three widely used benchmark datasets. The results indicate that the proposed method outperforms or performs comparably to existing state-of-the-art methods, achieving area under the curve values of 97.0%, 86.9%, and 73.8% on the UCSD Ped2, CUHK Avenue, and ShanghaiTech Campus datasets, respectively. Additionally, the high speed of the proposed method enables its application to real-time tasks.
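For reference, the three standard activations that OptAF draws on can be sketched as below. This shows only the well-known building blocks, not the OptAF combination itself, which is defined in the paper.

```python
import numpy as np

def relu(x):
    # ReLU: passes positive inputs unchanged, zeroes out negatives
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU: like ReLU, but keeps a small slope (alpha)
    # for negative inputs, avoiding dead units
    return np.where(x > 0, x, alpha * x)

def sigmoid(x):
    # Sigmoid: squashes inputs into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-x))
```

OptAF is described as combining the strengths of these three functions; the exact combination is not reproduced here.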

Keywords

Funding

This study was supported by the BK21 FOUR project (AI-driven Convergence Software Education Research Program) funded by the Ministry of Education, Department of Computer Science and Engineering, Kyungpook National University, Daegu, Republic of Korea (4120240214871), and the Basic Science Research Program of the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Republic of Korea (2021R1I1A3043970).

References

  1. E. M. Imah and R. D. I. Puspitasari, Violent crowd flow detection from surveillance cameras using deep transfer learning-gated recurrent unit, ETRI J. 46 (2024), no. 4, 671-682. https://doi.org/10.4218/etrij.2023-0222
  2. A. Mukherjee, V. Hassija, and V. Chamola, QuARCS: quantum anomaly recognition and caption scoring framework for surveillance videos, IEEE Trans. Consumer Electron. (2024). https://doi.org/10.1109/TCE.2024.3440520
  3. X. Wei, Y. Zhang, X. Zhang, Q. Ge, and B. Yin, Real-time passenger flow anomaly detection in metro system, IET Intell. Transport Syst. 17 (2023), no. 10, 2020-2033.
  4. H. Zhu, P. Wei, and Z. Xu, A spatio-temporal enhanced graph-transformer autoencoder embedded pose for anomaly detection, IET Comput. Vis. 18 (2023), 405-419.
  5. A. A. U. Rakhmonov, B. Subramanian, B. Olimov, and J. Kim, Extensive knowledge distillation model: an end-to-end effective anomaly detection model for real-time industrial applications, IEEE Access 11 (2023), 69750-69761. https://doi.org/10.1109/ACCESS.2023.3293108
  6. B. A. Ugli Olimov, K. C. Veluvolu, A. Paul, and J. Kim, UzADL: anomaly detection and localization using graph Laplacian matrix-based unsupervised learning method, Comput. Ind. Eng. 171 (2022), 108313. https://doi.org/10.1016/j.cie.2022.108313
  7. A. Hussain, W. Ullah, N. Khan, Z. A. Khan, M. J. Kim, and S. W. Baik, TDS-Net: transformer enhanced dual-stream network for video anomaly detection, Expert Syst. Applicat. (2024), 124846.
  8. N. Li and F. Chang, Video anomaly detection and localization via multivariate gaussian fully convolution adversarial auto-encoder, Neurocomputing 369 (2019), 92-105.
  9. N. Li, F. Chang, and C. Liu, Spatial-temporal cascade auto-encoder for video anomaly detection in crowded scenes, IEEE Trans. Multimedia 23 (2020), 203-215.
  10. Y. Lu, K. M. Kumar, S. Shahabeddin Nabavi, and Y. Wang, Future frame prediction using convolutional VRNN for anomaly detection (16th IEEE Int. Conf. Adv. Video Signal Based Surveillance (AVSS), Taipei, Taiwan), 2019, pp. 1-8.
  11. J. T. Zhou, J. Du, H. Zhu, X. Peng, Y. Liu, and R. S. M. Goh, AnomalyNet: an anomaly detection network for video surveillance, IEEE Trans. Inf. Forensics Sec. 14 (2019), no. 10, 2537-2550.
  12. X. Shi, Z. Chen, H. Wang, D.-Y. Yeung, W.-K. Wong, and W. Woo, Convolutional LSTM network: a machine learning approach for precipitation nowcasting, Adv. Neural Inf. Process. Syst. 28 (2015), 802.
  13. Y. Wu, F. He, D. Zhang, and X. Li, Service-oriented feature-based data exchange for cloud-based design and manufacturing, IEEE Trans. Services Comput. 11 (2015), no. 2, 341-353.
  14. M. Hasan, J. Choi, J. Neumann, A. K. Roy-Chowdhury, and L. S. Davis, Learning temporal regularity in video sequences (Proc. IEEE Conf. Comput. Vision Pattern Recognit., Las Vegas, NV, USA) IEEE, Piscataway, New Jersey, USA 2016, pp. 733-742.
  15. W. Liu, W. Luo, D. Lian, and S. Gao, Future frame prediction for anomaly detection-a new baseline (Proc. IEEE Conf. Comput. Vision Pattern Recognit., Salt Lake City, UT, USA), IEEE, Piscataway, New Jersey, USA 2018, pp. 6536-6545.
  16. W. Luo, W. Liu, D. Lian, and S. Gao, Future frame prediction network for video anomaly detection, IEEE Trans. Pattern Anal. Mach. Intell. 44 (2021), no. 11, 7505-7520.
  17. W. W. Y. Ng, G. Zeng, J. Zhang, D. S. Yeung, and W. Pedrycz, Dual autoencoders features for imbalance classification problem, Pattern Recognit. 60 (2016), 875-889.
  18. L. Vu and Q. U. Nguyen, An ensemble of activation functions in autoencoder applied to IoT anomaly detection (6th NAFOSTED Conf. Inf. Comput. Sci., Hanoi, Vietnam), IEEE, Piscataway, New Jersey, USA 2019. https://doi.org/10.1109/NICS48868.2019.9023860
  19. Y. Liu, H. Shen, T. Wang, and G. Bai, Vehicle counting in drone images: an adaptive method with spatial attention and multiscale receptive fields, ETRI J. (2024). https://doi.org/10.4218/etrij.2023-0426
  20. J. T. Zhou, L. Zhang, Z. Fang, J. Du, X. Peng, and Y. Xiao, Attention-driven loss for anomaly detection in video surveillance, IEEE Trans. Circuits Syst. Video Technol. 30 (2019), no. 12, 4639-4647.
  21. Y. Liu, J. Liu, K. Yang, B. Ju, S. Liu, Y. Wang, D. Yang, P. Sun, and L. Song, AMP-Net: appearance-motion prototype network assisted automatic video anomaly detection system, IEEE Trans. Ind. Inf. 20 (2024), no. 2, 2843-2855. https://doi.org/10.1109/TII.2023.3298476
  22. Y. Lu, F. Yu, M. K. K. Reddy, and Y. Wang, Few-shot scene-adaptive anomaly detection (Computer Vision-ECCV 2020: 16th European Conference, Glasgow, UK), 2020, pp. 125-141.
  23. V. Nair and G. E. Hinton, Rectified linear units improve restricted Boltzmann machines (Proc. 27th Int. Conf. Mach. Learn. (ICML-10), Haifa, Israel), Omnipress, Madison, Wisconsin, USA 2010, pp. 807-814.
  24. A. L. Maas, A. Y. Hannun, and A. Y. Ng, Rectifier nonlinearities improve neural network acoustic models (Proc. ICML, Vol. 30, Atlanta, GA, USA), JMLR.org, New York, NY 2013, p. 3.
  25. C. M. Bishop and N. M. Nasrabadi, Pattern recognition and machine learning, Vol. 4, Springer, New York, USA 2006.
  26. H. Wei, K. Li, H. Li, Y. Lyu, and X. Hu, Detecting video anomaly with a stacked convolutional LSTM framework (Int. Conf. Comput. Vision Syst., Thessaloniki, Greece), Springer, Berlin, Germany, 2019, pp. 330-342.
  27. R. Nawaratne, D. Alahakoon, D. De Silva, and X. Yu, Spatio-temporal anomaly detection using deep learning for real-time video surveillance, IEEE Trans. Ind. Inf. 16 (2019), no. 1, 393-402.
  28. Z. Fang, J. T. Zhou, Y. Xiao, Y. Li, and F. Yang, Multi-encoder towards effective anomaly detection in videos, IEEE Trans. Multimedia 23 (2020), 4106-4116.
  29. Y. Hao, J. Li, N. Wang, X. Wang, and X. Gao, Spatiotemporal consistency-enhanced network for video anomaly detection, Pattern Recognit. 121 (2022), 108232.
  30. D. Abati, A. Porrello, S. Calderara, and R. Cucchiara, Latent space autoregression for novelty detection (Proc. IEEE/CVF Conf. Comput. Vision Pattern Recognit., Long Beach, CA, USA), IEEE, Piscataway, New Jersey, USA, 2019, pp. 481-490.
  31. D. Gong, L. Liu, V. Le, B. Saha, M. R. Mansour, S. Venkatesh, and A. Hengel, Memorizing normality to detect anomaly: memory-augmented deep autoencoder for unsupervised anomaly detection (Proc. IEEE/CVF Int. Conf. Comput. Vision, Seoul, Rep. of Korea), 2019, pp. 1705-1714.
  32. B. Li, Z. Li, and Z. Yin, Video anomaly detection via improved future frame prediction (Fourth Int. Conf. Comput. Vision Data Mining (ICCVDM 2023), Vol. 13063, Changchun, China), SPIE, Bellingham, Washington, USA 2024, pp. 121-129.
  33. W. Luo, W. Liu, D. Lian, J. Tang, L. Duan, X. Peng, and S. Gao, Video anomaly detection with sparse coding inspired deep neural networks, IEEE Trans. Pattern Anal. Mach. Intell. 43 (2019), no. 3, 1070-1084.
  34. Y. Li, Y. Cai, J. Liu, S. Lang, and X. Zhang, Spatio-temporal unity networking for video anomaly detection, IEEE Access 7 (2019), 172425-172432.
  35. Y. Chang, Z. Tu, W. Xie, B. Luo, S. Zhang, H. Sui, and J. Yuan, Video anomaly detection with spatio-temporal dissociation, Pattern Recognit. 122 (2022), 108213.
  36. Y. Tang, L. Zhao, S. Zhang, C. Gong, G. Li, and J. Yang, Integrating prediction and reconstruction for anomaly detection, Pattern Recognit. Lett. 129 (2020), 123-130.
  37. R. Morais, V. Le, T. Tran, B. Saha, M. Mansour, and S. Venkatesh, Learning regularity in skeleton trajectories for anomaly detection in videos (Proc. IEEE/CVF Conf. Comput. Vision Pattern Recognit., Long Beach, CA, USA), IEEE, Piscataway, New Jersey, USA 2019, pp. 11996-12004.
  38. W. Luo, W. Liu, and S. Gao, A revisit of sparse coding based anomaly detection in stacked RNN framework (Proc. IEEE Int. Conf. Comput. Vision., Venice, Italy), 2017, pp. 341-349.
  39. Z. Wu, C. Shen, and A. Van Den Hengel, Wider or deeper: Revisiting the ResNet model for visual recognition, Pattern Recognit. 90 (2019), 119-133.
  40. J. Lin, C. Gan, and S. Han, TSM: temporal shift module for efficient video understanding (Proc. IEEE/CVF Int. Conf. Comput. Vision, Seoul, Rep. of Korea), IEEE, Piscataway, New Jersey, USA 2019, pp. 7083-7093.
  41. D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri, Learning spatiotemporal features with 3D convolutional networks (Proc. IEEE Int. Conf. Comput. Vision., Santiago, Chile), IEEE, Piscataway, New Jersey, USA 2015, pp. 4489-4497.
  42. S. Woo, J. Park, J.-Y. Lee, and I. S. Kweon, CBAM: convolutional block attention module (Proc. Eur. Conf. Comput. Vision (ECCV), Munich, Germany), 2018, pp. 3-19.
  43. Y. Zhang, K. Li, K. Li, L. Wang, B. Zhong, and Y. Fu, Image super-resolution using very deep residual channel attention networks (Proc. Eur. Conf. Comput. Vision (ECCV), Munich, Germany), Springer, Berlin, Germany 2018, pp. 286-301.
  44. V. Mahadevan, W. Li, V. Bhalodia, and N. Vasconcelos, Anomaly detection in crowded scenes (IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., Beijing, China), 2010, pp. 1975-1981.
  45. C. Lu, J. Shi, and J. Jia, Abnormal event detection at 150 FPS in MATLAB (IEEE Int. Conf. Comput. Vis., Sydney, Australia), 2013, pp. 2720-2727.
  46. K. Doshi and Y. Yilmaz, Online anomaly detection in surveillance videos with asymptotic bound on false alarm rate, Pattern Recognit. 114 (2021), 107865.
  47. V.-T. Le and Y.-G. Kim, Attention-based residual autoencoder for video anomaly detection, Appl. Intell. 53 (2023), no. 3, 3240-3254.
  48. Y. Yang, D. Zhan, F. Yang, X.-D. Zhou, Y. Yan, and Y. Wang, Improving video anomaly detection performance with patch-level loss and segmentation map (IEEE 6th Int. Conf. Comput. Commun. (ICCC), Chengdu, China), 2020, pp. 1832-1839.
  49. K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition (Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Las Vegas, NV, USA), 2016, pp. 770-778.
  50. J. Hu, L. Shen, and G. Sun, Squeeze-and-excitation networks (Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Salt Lake City, UT, USA), 2018, pp. 7132-7141.