Deep Learning-Based Real-Time Pedestrian Detection on Embedded GPUs

  • Vien, An Gia (Pukyong National University, Department of Computer Engineering)
  • Lee, Chul (Pukyong National University, Department of Computer Engineering)
  • Received : 2019.02.20
  • Accepted : 2019.03.08
  • Published : 2019.03.30

Abstract

We propose an efficient single convolutional neural network (CNN) for real-time pedestrian detection on embedded GPUs. We first determine the optimal number of convolutional layers and the hyper-parameters of a lightweight CNN through a statistical analysis of pedestrian sizes in images. We then employ multi-scale CNN training to make the network robust to variations in pedestrian size. Experimental results demonstrate that the proposed algorithm operates in real time on an embedded GPU while providing, on average, higher detection accuracy than conventional algorithms.
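The two design steps named in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the layer count is derived from pedestrian sizes, nor how the multi-scale training is scheduled, so everything here is an assumption: the layer stack, the helper names `receptive_field` and `sample_input_size`, and the YOLOv2-style choice of random input resolutions in multiples of 32 are hypothetical and not taken from the paper.

```python
import random


def receptive_field(layers):
    """Return the receptive field, in input pixels, of a stack of layers.

    `layers` is a list of (kernel_size, stride) pairs and may mix
    convolution and pooling layers.
    """
    rf, jump = 1, 1  # jump = spacing of adjacent outputs measured in input pixels
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


def sample_input_size(rng, min_size=320, max_size=608, step=32):
    """Pick a square training resolution that is a multiple of `step` (assumed scheme)."""
    return rng.randrange(min_size, max_size + step, step)


# Hypothetical building block: a 3x3 convolution (stride 1) followed by
# a 2x2 max-pooling layer (stride 2).
block = [(3, 1), (2, 2)]

# The receptive field grows with depth; one would stop adding blocks once it
# covers the typical pedestrian height observed in the dataset.
for n_blocks in range(1, 6):
    print(n_blocks, receptive_field(block * n_blocks))

# Multi-scale training: draw a new input resolution every few iterations.
rng = random.Random(0)
print([sample_input_size(rng) for _ in range(5)])
```

Under these assumptions, comparing the printed receptive fields against the measured distribution of pedestrian heights gives one plausible way to bound the depth of a lightweight network.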

Table 1. Summary of the parameters of the proposed network layers.

Table 2. Comparison of detection performance in terms of recall and IoU on the Caltech test dataset.
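For readers unfamiliar with the two metrics in Table 2, the following sketch shows how intersection over union (IoU) and recall are commonly computed for bounding boxes. The box format `(x1, y1, x2, y2)`, the greedy one-to-one matching, and the IoU threshold of 0.5 are assumptions; the exact evaluation protocol used for the table is not stated here.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def recall(ground_truth, detections, iou_threshold=0.5):
    """Fraction of ground-truth boxes matched by a detection.

    Each detection may match at most one ground-truth box (greedy matching).
    The threshold of 0.5 is an assumed, commonly used value.
    """
    unused = list(detections)
    matched = 0
    for gt in ground_truth:
        best = max(unused, key=lambda det: iou(gt, det), default=None)
        if best is not None and iou(gt, best) >= iou_threshold:
            matched += 1
            unused.remove(best)
    return matched / len(ground_truth) if ground_truth else 0.0


# Made-up boxes for illustration only.
gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(1, 1, 10, 10)]
print(iou(gts[0], dets[0]), recall(gts, dets))
```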

Table 3. Computation speed, in frames per second (fps), of YOLOv2, tiny YOLO, and the proposed algorithm.
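A throughput figure such as the fps values in Table 3 is typically obtained by timing repeated inference over a set of frames. The sketch below is an illustration only: `measure_fps` and the stand-in `run_inference` callable are hypothetical, and the measurement procedure used for the table is not described here.

```python
import time


def measure_fps(run_inference, frames, warmup=5):
    """Return frames per second for `run_inference` over `frames`.

    A few warm-up calls are made first so that one-time setup costs
    do not distort the measurement.
    """
    for frame in frames[:warmup]:
        run_inference(frame)
    start = time.perf_counter()
    for frame in frames:
        run_inference(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")


# Stand-in workload; a real measurement would call the detector here.
def fake_detector(frame):
    return sum(range(1000))


print(measure_fps(fake_detector, list(range(50))))
```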

Table 4. Comparison of model size and number of network parameters.
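The relation between parameter count and model size compared in Table 4 can be sketched as follows. The layer configuration is hypothetical and is not the proposed network; the sketch also assumes 32-bit floating-point weights and ignores batch-normalization parameters.

```python
def conv_params(kernel, in_channels, out_channels, bias=True):
    """Number of learnable parameters in a square 2-D convolution layer."""
    weights = kernel * kernel * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def model_size_mb(num_params, bytes_per_param=4):
    """Approximate storage in MiB, assuming 4 bytes (float32) per parameter."""
    return num_params * bytes_per_param / (1024 ** 2)


# Hypothetical stack given as (kernel, in_channels, out_channels).
layers = [(3, 3, 16), (3, 16, 32), (3, 32, 64), (1, 64, 30)]
total = sum(conv_params(k, c_in, c_out) for k, c_in, c_out in layers)
print(total, round(model_size_mb(total), 3))
```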

References

  1. P. Dollar, C. Wojek, B. Schiele, and P. Perona, "Pedestrian Detection: An Evaluation of the State of the Art," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 34, No. 4, pp. 743-761, April 2012. https://doi.org/10.1109/TPAMI.2011.155
  2. P. Sermanet, K. Kavukcuoglu, S. Chintala, and Y. LeCun, "Pedestrian Detection with Unsupervised Multi-Stage Feature Learning," Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 3626-3633, June 2013.
  3. P. Dollar, R. Appel, S. Belongie, and P. Perona, "Fast Feature Pyramids for Object Detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 36, No. 8, pp. 1532-1545, August 2014. https://doi.org/10.1109/TPAMI.2014.2300479
  4. X. Wang, T. X. Han, and S. Yan, "An HOG-LBP Human Detector with Partial Occlusion Handling," Proceedings of IEEE International Conference on Computer Vision, pp. 32-39, September 2009.
  5. S. Zhang, C. Bauckhage, and A. B. Cremers, "Informed Haar-Like Features Improve Pedestrian Detection," Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 947-954, June 2014.
  6. K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," Proceedings of International Conference on Learning Representations, May 2015.
  7. C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going Deeper with Convolutions," Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-9, June 2015.
  8. K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, June 2016.
  9. J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 779-788, June 2016.
  10. J. Redmon and A. Farhadi, "YOLO9000: Better, Faster, Stronger," Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 7263-7271, July 2017.
  11. S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," Proceedings of International Conference on Neural Information Processing Systems, pp. 91-99, December 2015.
  12. S. Ioffe and C. Szegedy, "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift," Proceedings of International Conference on Machine Learning, pp. 448-456, July 2015.
  13. A. L. Maas, A. Y. Hannun, and A. Y. Ng, "Rectifier Nonlinearities Improve Neural Network Acoustic Models," International Conference on Machine Learning Workshop on Deep Learning for Audio, Speech, and Language Processing, 2013.
  14. J. Redmon, "Darknet: Open Source Neural Networks in C," http://pjreddie.com/darknet/, 2013-2016.