Multi-channel Video Analysis Based on Deep Learning for Video Surveillance

  • Received: 2018.11.16
  • Reviewed: 2018.12.15
  • Published: 2018.12.31

Abstract

This paper proposes a video analysis technique for video surveillance that couples deep learning object detection with a probabilistic data association filter (PDAF) for multi-object tracking, and presents a GPU-based implementation. The technique performs object detection and object tracking sequentially: a ResNet-based deep network detects objects, and the PDAF tracks multiple objects across frames. It can be applied to detecting people who unlawfully enter a restricted area or to counting the people who enter or leave a specified space. Simulations and experiments show that 48 video channels can be analyzed at about 27 fps, and that real-time video analysis is possible over RTSP.
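The two applications named in the abstract, intrusion detection and people counting, can be illustrated with a minimal sketch that operates on tracked object positions. The paper's own decision rules are not given here, so everything below is an assumption: the names `inside_polygon` and `count_entries`, the polygonal restricted area, and the horizontal counting line are all hypothetical choices for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def inside_polygon(p: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: is point p inside the polygon (hypothetical restricted area)?

    Points exactly on the boundary are not treated specially.
    """
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal ray through p can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_entries(track: List[Point], line_y: float) -> int:
    """Count how often one track crosses the line y = line_y in the +y direction."""
    entries = 0
    for (_, y_prev), (_, y_cur) in zip(track, track[1:]):
        if y_prev < line_y <= y_cur:
            entries += 1
    return entries
```

In such a sketch, a track would raise an intrusion alarm as soon as one of its positions tests inside the restricted polygon, and the per-track entry counts would be summed to obtain the number of people entering the monitored space.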

Keywords

Fig. 1. Basic structure of general learning and residual learning
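The difference between the two structures named in Fig. 1 can be sketched in a few lines. This is a toy illustration under stated assumptions, not the network used in the paper: the layers are small fully connected layers on plain Python lists, and the names `plain_block` and `residual_block` are hypothetical. The only fact relied on is the standard residual formulation, in which the block learns F(x) and a shortcut connection adds the input x back to it.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def relu(v: Vector) -> Vector:
    return [max(0.0, x) for x in v]


def matvec(w: Matrix, v: Vector) -> Vector:
    return [sum(wi * xi for wi, xi in zip(row, v)) for row in w]


def plain_block(x: Vector, w1: Matrix, w2: Matrix) -> Vector:
    """General learning: the stacked layers must fit the mapping H(x) directly."""
    return relu(matvec(w2, relu(matvec(w1, x))))


def residual_block(x: Vector, w1: Matrix, w2: Matrix) -> Vector:
    """Residual learning: the layers fit F(x); the shortcut adds x, giving F(x) + x."""
    f = matvec(w2, relu(matvec(w1, x)))
    return relu([fi + xi for fi, xi in zip(f, x)])
```

With all weights at zero, `residual_block` passes a non-negative input through unchanged while `plain_block` outputs zeros, which illustrates why an identity mapping is easy for a residual block to represent.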

Fig. 2. Structure of PDAF
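The measurement update at the core of a PDAF can be sketched for a one-dimensional state. The sketch follows the standard parametric PDAF formulation (gating, association probabilities, combined innovation, covariance with a spread-of-innovations term) and is not taken from the paper's implementation. The function name `pdaf_update`, the scalar state model with a direct position measurement, and all default parameter values (detection probability, gate probability, clutter density, gate threshold) are assumptions chosen for illustration.

```python
import math
from typing import List, Tuple


def pdaf_update(
    x_pred: float,
    p_pred: float,
    measurements: List[float],
    r: float,
    p_detect: float = 0.9,         # assumed detection probability
    p_gate: float = 0.997,         # assumed gate probability
    clutter_density: float = 0.1,  # assumed spatial density of false alarms
    gate: float = 9.0,             # assumed gate threshold on nu^2 / S
) -> Tuple[float, float]:
    """One scalar PDAF measurement update; returns (state, variance)."""
    s = p_pred + r    # innovation variance (measurement matrix H = 1)
    k = p_pred / s    # Kalman gain

    # Validation gate: keep only measurements close to the prediction.
    innovations = [z - x_pred for z in measurements
                   if (z - x_pred) ** 2 / s <= gate]

    # Likelihood ratio of each validated measurement against clutter.
    likelihoods = [
        p_detect * math.exp(-0.5 * nu * nu / s)
        / math.sqrt(2.0 * math.pi * s) / clutter_density
        for nu in innovations
    ]
    denom = 1.0 - p_detect * p_gate + sum(likelihoods)
    beta0 = (1.0 - p_detect * p_gate) / denom   # none of them is correct
    betas = [lik / denom for lik in likelihoods]

    # Combined innovation and state update.
    nu_comb = sum(b * nu for b, nu in zip(betas, innovations))
    x_new = x_pred + k * nu_comb

    # Covariance: mix of "no update" and "standard update", plus spread term.
    p_c = p_pred - k * s * k
    spread = sum(b * nu * nu for b, nu in zip(betas, innovations)) - nu_comb ** 2
    p_new = beta0 * p_pred + (1.0 - beta0) * p_c + k * spread * k
    return x_new, p_new
```

When no measurement falls inside the gate, the sketch returns the prediction unchanged. With a single validated measurement, the state moves toward it by less than a standard Kalman update would, because some probability remains that the measurement is clutter.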

Fig. 3. Configuration of the proposed video surveillance system

Fig. 4. Architecture of the ResNet-10 model

Fig. 5. Object tracking result with PDAF

Fig. 6. Video analysis result (including 4-channel RTSP video transmission)

Fig. 7. Real-time video analysis result (4-channel RTSP video transmission)

Table 1. Average video analysis speed by number of channels
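To put per-channel speeds in perspective, the aggregate load can be worked out with simple arithmetic. The sketch below uses the figure from the abstract (48 channels at about 27 fps) and assumes, purely for illustration, that frames are processed one at a time on a single device. The source does not state this, and a real system would likely batch frames. The function names are hypothetical.

```python
def aggregate_fps(channels: int, fps_per_channel: float) -> float:
    """Total frames per second the analyzer must sustain across all channels."""
    return channels * fps_per_channel


def frame_budget_ms(channels: int, fps_per_channel: float) -> float:
    """Average time available per frame, assuming strictly sequential processing."""
    return 1000.0 / aggregate_fps(channels, fps_per_channel)


total = aggregate_fps(48, 27.0)     # 1296 frames per second in total
budget = frame_budget_ms(48, 27.0)  # roughly 0.77 ms per frame
```

Under that sequential assumption, detection and tracking together would have well under a millisecond per frame, which suggests why a GPU implementation is needed at this channel count.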
