A Real-time Face Tracking Algorithm using Improved CamShift with Depth Information

  • Lee, Jun-Hwan (Dept. of Electronic Engineering, KwangWoon University) ;
  • Jung, Hyun-jo (Dept. of Electronic Engineering, KwangWoon University) ;
  • Yoo, Jisang (Dept. of Electronic Engineering, KwangWoon University)
  • Received : 2016.10.03
  • Accepted : 2017.05.16
  • Published : 2017.09.01


This paper proposes a new face tracking algorithm. The CamShift (Continuously adaptive mean SHIFT) algorithm tracks unstably when the background contains objects whose color is similar to that of the face. The proposed algorithm resolves this drawback by combining Kinect's per-pixel depth information with a skin detection method that extracts candidate skin regions in the HSV color space. In addition, feature point matching keeps the tracker robust when the target face disappears or is occluded. Experimental results show that the proposed algorithm achieves better tracking performance than the existing TLD (Tracking-Learning-Detection) algorithm and runs faster. It also overcomes the existing shortfalls of CamShift with a nearly comparable processing time.
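The depth-gating idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy maps, the 300 mm tolerance, and the fixed-size mean-shift window (CamShift proper also adapts the window size) are all assumptions made for illustration. The probability map stands in for the skin-color back-projection, and the depth map stands in for Kinect depth in millimetres.

```python
import math

def gate_by_depth(prob, depth, target_depth, tol):
    """Zero every probability whose depth is more than tol away from target_depth."""
    return [[p if abs(d - target_depth) <= tol else 0.0
             for p, d in zip(prow, drow)]
            for prow, drow in zip(prob, depth)]

def mean_shift(weights, window, max_iter=20):
    """Move a fixed-size window (x, y, w, h) to the weighted centroid until it settles."""
    x, y, w, h = window
    rows, cols = len(weights), len(weights[0])
    for _ in range(max_iter):
        m00 = m10 = m01 = 0.0
        for r in range(y, y + h):
            for c in range(x, x + w):
                v = weights[r][c]
                m00 += v          # zeroth moment: total weight in the window
                m10 += v * c      # first moment along columns
                m01 += v * r      # first moment along rows
        if m00 == 0.0:
            break                 # nothing to track inside the window
        cx, cy = m10 / m00, m01 / m00
        # Re-centre the window on the centroid, then clamp it to the image.
        nx = math.floor(cx - (w - 1) / 2 + 0.5)
        ny = math.floor(cy - (h - 1) / 2 + 0.5)
        nx = min(max(nx, 0), cols - w)
        ny = min(max(ny, 0), rows - h)
        if (nx, ny) == (x, y):
            break                 # converged
        x, y = nx, ny
    return (x, y, w, h)

# Toy scene (hypothetical data): a 3x3 "face" at 1000 mm next to a larger
# skin-colored background object at 3000 mm.
ROWS, COLS = 7, 14
prob = [[0.0] * COLS for _ in range(ROWS)]
depth = [[3000] * COLS for _ in range(ROWS)]
for r in range(2, 5):
    for c in range(2, 5):
        prob[r][c] = 1.0
        depth[r][c] = 1000
for r in range(1, 6):
    for c in range(6, 13):
        prob[r][c] = 1.0          # same color response, but depth stays 3000 mm

start = (1, 1, 7, 5)              # initial search window (x, y, w, h)
drifted = mean_shift(prob, start)
locked = mean_shift(gate_by_depth(prob, depth, target_depth=1000, tol=300), start)
```

In this toy scene the color-only window slides onto the larger background object, while the depth-gated window stays centred on the face, which is the failure mode and remedy the abstract describes.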


Face tracking;Face-TLD;Haar-Feature;CamShift;Kinect


Grant : Development of hybrid audio contents production and representation technology for supporting channel and object based audio

Supported by : Institute for Information & communications Technology Promotion (IITP)

