Spatial-temporal texture features for 3D human activity recognition using laser-based RGB-D videos

  • Ming, Yue (School of Electronic Engineering, Beijing University of Posts and Telecommunications)
  • Wang, Guangchao (School of Electronic Engineering, Beijing University of Posts and Telecommunications)
  • Hong, Xiaopeng (Department of Computer Science and Engineering, University of Oulu)
  • Received : 2015.09.24
  • Accepted : 2016.09.20
  • Published : 2017.03.31

Abstract

An IR camera paired with a laser-based IR projector provides an effective way to capture moving targets in RGB-D videos in real time. Unlike traditional RGB videos, the captured depth videos are not affected by illumination variation. In this paper, we propose a novel feature extraction framework that describes human activities on top of this optical capture method: spatial-temporal texture features for 3D human activity recognition. Spatial-temporal texture features with depth information are insensitive to illumination and occlusion, and describe fine motion efficiently. The proposed framework begins with video acquisition based on laser projection, continues with video preprocessing using visual background extraction, and then obtains spatial-temporal key images. Texture features encoded from these key images are then used to generate discriminative features for human activity recognition. Experimental results on several databases and in practical scenarios demonstrate the effectiveness of the proposed algorithm on large-scale data sets.
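The abstract names the stages of the framework but not their details, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a static background model with simple depth differencing in place of the visual background extraction step, a per-pixel foreground count as a stand-in for a spatial-temporal key image, and a basic 3x3 local binary pattern (LBP) histogram as the texture encoding. All function names, the threshold, and the toy data are hypothetical, and plain Python lists are used to keep the example dependency-free.

```python
# Hypothetical sketch: depth clip -> foreground masks -> key image -> LBP histogram.
# None of these choices are confirmed by the abstract; they are illustrative only.

def foreground_mask(frame, background, threshold=10):
    """Mark pixels whose depth differs from the background by more than threshold."""
    return [[1 if abs(p - b) > threshold else 0 for p, b in zip(f_row, b_row)]
            for f_row, b_row in zip(frame, background)]


def key_image(frames, background, threshold=10):
    """Collapse a clip into one image: per-pixel count of foreground frames."""
    height, width = len(background), len(background[0])
    acc = [[0] * width for _ in range(height)]
    for frame in frames:
        mask = foreground_mask(frame, background, threshold)
        for y in range(height):
            for x in range(width):
                acc[y][x] += mask[y][x]
    return acc


# Clockwise 8-neighbour offsets (dy, dx), starting at the top-left neighbour.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(image):
    """256-bin histogram of basic 3x3 LBP codes over the interior pixels."""
    height, width = len(image), len(image[0])
    hist = [0] * 256
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            centre = image[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(NEIGHBOURS):
                # A neighbour at least as large as the centre sets its bit.
                if image[y + dy][x + dx] >= centre:
                    code |= 1 << bit
            hist[code] += 1
    return hist


# Toy 5x5 depth clip: a flat background at depth 100 and a single closer
# pixel (depth 50) that moves one column to the right in each frame.
background = [[100] * 5 for _ in range(5)]
frames = []
for t in range(3):
    frame = [row[:] for row in background]
    frame[2][t + 1] = 50
    frames.append(frame)

key = key_image(frames, background)
hist = lbp_histogram(key)
print(key[2])
print({code: count for code, count in enumerate(hist) if count})
```

In a real pipeline the resulting histogram would serve as the feature vector passed to a classifier; the choice of classifier is left open here because the abstract does not state it.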
