Human Tracking Technology Using Convolutional Neural Network in Visual Surveillance

  • Kang, Sung-Kwan (HCI Lab., Department of Computer and Information Engineering, Inha University)
  • Chun, Sang-Hun (Department of Information and Technology, Incheon JEI University)
  • Received: 2016.12.30
  • Reviewed: 2017.02.20
  • Published: 2017.02.28

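Abstract

This paper treats human tracking as a learning problem: estimating the position and scale of an object given its previous position and scale as well as the current and previous image frames. We propose a human detection method based on a convolutional neural network (CNN) classifier. The proposed method improves detection accuracy by regularizing the network and by automatically optimizing the feature representation for the detection task. When real-time video arrives from the surveillance system, the method first trains a CNN to perform position estimation. Unlike other existing learning methods, the CNN jointly learns both spatial and temporal features from two consecutive image frames. The accuracy of an SVM classifier that uses the features learned by the CNN matches the accuracy of the CNN itself, which confirms the importance of automatically optimized features. However, the computation time for classifying a person with the CNN is only about 1/40 of that of the SVM, regardless of the type of features used.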

In this paper, we treat tracking as a learning problem: estimating the position and scale of a person given the previous position and scale together with the current and previous image frames. Unlike other learning methods, the CNN jointly learns both temporal and spatial features from two consecutive frames. We introduce multiple pathways in the CNN to better fuse local and global information. A shift-variant CNN architecture is designed to alleviate the drift problem that arises when distracting objects resemble the target in a cluttered environment. Furthermore, we employ CNNs to estimate the scale through the accurate localization of a few key points. These techniques are object-independent, so the proposed method can also be applied to track other types of objects. The tracker's ability to handle complex situations is demonstrated on many test sequences. The accuracy of an SVM classifier using the features learned by the CNN is equivalent to the accuracy of the CNN itself, which confirms the importance of automatically optimized features. However, the computation time for classifying a person with the CNN classifier is approximately 1/40 of the SVM computation time, regardless of the type of features used.
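The idea of learning spatial and temporal features jointly from two consecutive frames can be pictured with a toy example. The sketch below is not the authors' architecture: it is a single hand-weighted, two-channel convolution written in plain Python, and every name in it (`conv2d_pair`, `make_frame`, `argmax2d`, the kernel values, the 6x6 synthetic frames) is an assumption made for illustration. A trained CNN would learn such weights rather than have them set by hand.

```python
# Minimal sketch (assumption): one two-channel convolution over a pair of
# consecutive frames. All names and kernel values are illustrative only.

def conv2d_pair(prev_frame, curr_frame, kernel_prev, kernel_curr):
    """Valid 2-D correlation over a two-frame stack; returns one feature map."""
    h, w = len(curr_frame), len(curr_frame[0])
    kh, kw = len(kernel_curr), len(kernel_curr[0])
    out = []
    for i in range(h - kh + 1):
        row = []
        for j in range(w - kw + 1):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    # Each output value mixes a spatial window (di, dj)
                    # from both time steps (previous and current frame).
                    acc += kernel_prev[di][dj] * prev_frame[i + di][j + dj]
                    acc += kernel_curr[di][dj] * curr_frame[i + di][j + dj]
            row.append(acc)
        out.append(row)
    return out


def make_frame(size, top, left, box=2):
    """Synthetic frame: zeros with a bright box-by-box blob as the target."""
    frame = [[0.0] * size for _ in range(size)]
    for r in range(top, top + box):
        for c in range(left, left + box):
            frame[r][c] = 1.0
    return frame


def argmax2d(fmap):
    """Return (row, col) of the largest response in a feature map."""
    best = max((v, i, j) for i, row in enumerate(fmap) for j, v in enumerate(row))
    return best[1], best[2]


# The blob moves from (1, 1) to (3, 2) between the two frames.
prev_frame = make_frame(6, 1, 1)
curr_frame = make_frame(6, 3, 2)

# Hand-set weights standing in for learned ones: positive on the current
# frame, negative on the previous frame, so the response peaks where the
# target is now and is suppressed where it used to be.
kernel_curr = [[1.0, 1.0], [1.0, 1.0]]
kernel_prev = [[-1.0, -1.0], [-1.0, -1.0]]

feature_map = conv2d_pair(prev_frame, curr_frame, kernel_prev, kernel_curr)
estimated_position = argmax2d(feature_map)
```

With these synthetic frames, `estimated_position` is `(3, 2)`, the blob's top-left corner in the current frame, while the response at the old location `(1, 1)` is strongly negative. The point of the design is that one filter sees both frames at once, so appearance and motion cues are combined in a single feature rather than computed separately.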
