A Robust Object Detection and Tracking Method using RGB-D Model

  • Park, Seohee (Department of Computer Science, Kyonggi University)
  • Chun, Junchul (Department of Computer Science, Kyonggi University)
  • Received : 2017.06.01
  • Accepted : 2017.07.18
  • Published : 2017.08.31

Abstract

Recently, CCTV has been combined with fields such as big data, artificial intelligence, and image analysis to detect various abnormal behaviors and to detect and analyze the overall situation of objects such as people, and research on image analysis for such intelligent video surveillance is actively progressing. However, CCTV images based only on 2D information generally suffer from limitations such as object misrecognition caused by the lack of topological information. This problem can be addressed by augmenting the image with depth information of the object obtained from two cameras. In this paper, we perform background modeling using the Mixture of Gaussians technique and detect moving objects by segmenting the foreground from the modeled background. To perform depth-based segmentation on top of the RGB-based segmentation results, a stereo depth map is generated from the two cameras. The RGB-segmented region is then set as the domain for extracting depth information, and depth-based segmentation is performed within that domain. Finally, to detect the center point of the robustly segmented object and to track its moving direction, the movement of the object is tracked by applying the CAMShift technique, one of the most basic object tracking methods. Experiments demonstrate the effectiveness of the proposed object detection and tracking method based on the RGB-D model.
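
To make the processing pipeline concrete, the following is a minimal sketch built from off-the-shelf OpenCV (cv2) components in Python; it is an illustration under assumptions, not the authors' implementation. The camera indices (0 and 1), the MOG2, StereoSGBM, and morphology parameters, the disparity tolerance of 4 levels, and the use of a pre-rectified stereo pair are all assumptions made for this example.

```python
# Minimal sketch of the RGB-D detection and tracking pipeline (OpenCV 4.x, Python).
# All parameter values below are illustrative assumptions, not from the paper.

import cv2
import numpy as np

# 1. Background modeling with a Mixture of Gaussians (MOG2 variant).
bg_model = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                              detectShadows=True)

# 2. Stereo matcher used to build the disparity (depth) map from the two cameras.
stereo = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=9)

cap_left = cv2.VideoCapture(0)   # assumed index of the left camera
cap_right = cv2.VideoCapture(1)  # assumed index of the right camera

term_criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
track_window = None  # (x, y, w, h) window tracked by CAMShift

while True:
    ok_l, frame_l = cap_left.read()
    ok_r, frame_r = cap_right.read()
    if not (ok_l and ok_r):
        break

    # RGB-based segmentation: foreground mask from the modeled background
    # (shadow pixels reported as 127 are discarded, small noise is removed).
    fg_mask = bg_model.apply(frame_l)
    _, fg_mask = cv2.threshold(fg_mask, 200, 255, cv2.THRESH_BINARY)
    fg_mask = cv2.morphologyEx(fg_mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(fg_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    if contours:
        # Largest foreground region = domain for depth-based segmentation.
        largest = max(contours, key=cv2.contourArea)
        x, y, w, h = cv2.boundingRect(largest)

        # Disparity map from the (assumed rectified) gray-scale stereo pair.
        gray_l = cv2.cvtColor(frame_l, cv2.COLOR_BGR2GRAY)
        gray_r = cv2.cvtColor(frame_r, cv2.COLOR_BGR2GRAY)
        disparity = stereo.compute(gray_l, gray_r).astype(np.float32) / 16.0

        # Depth-based segmentation restricted to the RGB-segmented domain:
        # keep pixels whose disparity lies near the median disparity of the region.
        roi_disp = disparity[y:y + h, x:x + w]
        valid = roi_disp[roi_disp > 0]
        if valid.size > 0:
            median_disp = np.median(valid)
            depth_mask = np.zeros_like(fg_mask)
            depth_mask[y:y + h, x:x + w] = np.where(
                (roi_disp > 0) & (np.abs(roi_disp - median_disp) < 4.0), 255, 0
            ).astype(np.uint8)

            # 3. CAMShift tracking of the depth-refined object region:
            # the rotated rectangle it returns gives the object's center and orientation.
            if track_window is None:
                track_window = (x, y, w, h)
            rot_rect, track_window = cv2.CamShift(depth_mask, track_window, term_criteria)
            center = tuple(int(c) for c in rot_rect[0])
            box = np.int32(cv2.boxPoints(rot_rect))
            cv2.polylines(frame_l, [box], True, (0, 255, 0), 2)
            cv2.circle(frame_l, center, 4, (0, 0, 255), -1)

    cv2.imshow("RGB-D tracking", frame_l)
    if cv2.waitKey(1) & 0xFF == 27:  # Esc to quit
        break

cap_left.release()
cap_right.release()
cv2.destroyAllWindows()
```

Restricting the disparity test to the bounding box of the RGB foreground contour mirrors the idea of using the RGB-segmented region as the domain for depth-based segmentation; in practice the two cameras would first need to be calibrated and rectified for StereoSGBM to produce meaningful disparities, and the CAMShift window would typically be re-initialized when the track is lost.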

