
Bio-mimetic Recognition of Action Sequence using Unsupervised Learning


  • Kim, Jin Ok (Faculty of Mobile Contents, Daegu Haany University)
  • Received : 2014.03.04
  • Accepted : 2014.06.05
  • Published : 2014.08.30

Abstract

Making good predictions about the outcome of one's actions is essential in the context of social interaction and decision-making. This paper proposes a computational model for learning articulated motion patterns for action recognition, which mimics the biologically inspired visual perception processing of the human brain. The developed cortical architecture for the unsupervised learning of motion sequences builds upon neurophysiological knowledge about cortical sites such as IT, MT, and STS, and about the specific neuronal representations that contribute to articulated motion perception. Experiments show how the model automatically selects significant motion patterns, as well as meaningful static snapshot categories, from continuous video input. Such key poses correspond to articulated postures, which are utilized in probing the trained network to impose implied motion perception on static views. We also present how sequence-selective representations are learned in STS by fusing snapshot and motion input, and how learned feedback connections enable predictions about future input sequences. Network simulations demonstrate the computational capacity of the proposed model for motion recognition.

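The two-stage pipeline the abstract describes (unsupervised selection of static snapshot categories, then learning of sequence-selective, prediction-capable transitions between them) can be caricatured in a few lines of code. The sketch below is an illustrative simplification, not the paper's implementation: winner-take-all competitive learning stands in for the cortical snapshot-category learning, and a normalized asymmetric Hebbian transition matrix stands in for the STS feedback connections that predict the next pose. All function names and parameters are my own.

```python
import numpy as np

def learn_key_poses(frames, n_poses, lr=0.5, epochs=20):
    """Winner-take-all competitive learning: condense a stream of frame
    vectors into a few prototype 'snapshot' categories (key poses)."""
    # Deterministic initialization from the first frames -- a simplification
    # chosen for this sketch, not a claim about the original model.
    protos = np.array(frames[:n_poses], dtype=float)
    for _ in range(epochs):
        for x in frames:
            w = int(np.argmin(np.linalg.norm(protos - x, axis=1)))  # best match
            protos[w] += lr * (x - protos[w])  # move only the winner toward input
    return protos

def learn_transitions(frames, protos):
    """Asymmetric Hebbian counting of pose-to-pose transitions; the
    row-normalized matrix acts like feedback weights that predict the
    next snapshot category in a sequence."""
    labels = [int(np.argmin(np.linalg.norm(protos - x, axis=1))) for x in frames]
    W = np.zeros((len(protos), len(protos)))
    for a, b in zip(labels, labels[1:]):
        W[a, b] += 1.0  # pose a was followed by pose b
    row = W.sum(axis=1, keepdims=True)
    W = np.divide(W, row, out=np.zeros_like(W), where=row > 0)
    return W, labels

def predict_next(W, pose):
    """Read out the most likely successor of a pose from the learned weights."""
    return int(np.argmax(W[pose]))
```

Fed a cyclic sequence of a few repeated postures, this toy model recovers the postures as key poses and predicts the next posture in the cycle, mirroring, at a much smaller scale, the snapshot-selection and sequence-prediction behavior the abstract reports.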

References

  1. B. McCane, K. Novins, D. Crannitch, B. Galvin, "On benchmarking optical flow," Computer Vision and Image Understanding, vol. 84, no. 1, pp. 126-143, 2001.
  2. J. Hawkins, S. Blakeslee, "On Intelligence: How a New Understanding of the Brain Will Lead to the Creation of Truly Intelligent Machines," Owl Books, NY, 2005.
  3. H. Jhuang, T. Serre, L. Wolf, T. Poggio, "A biologically inspired system for action recognition," ICCV (Intl. Conf. on Computer Vision), 2007, pp. 1-8.
  4. R. J. Peters, L. Itti, "Beyond bottom-up: Incorporating task-dependent influences into a computational model of spatial attention," IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1-8.
  5. V. Navalpakkam, L. Itti, "An integrated model of top-down and bottom-up attention for optimal object detection," CVPR (Computer Vision and Pattern Recognition), 2006, pp. 2049-2056.
  6. M. A. Giese, T. Poggio. "Neural mechanisms for the recognition of biological movements." Nat. Rev. Neurosci, vol. 4, pp. 179-192, 2003. https://doi.org/10.1038/nrn1057
  7. A. Casile, M. A. Giese. "Roles of motion and form in biological motion recognition." Lecture Notes in Computer Science, vol. 2714, pp. 854-862, 2003.
  8. H. Jhuang, T. Serre, L. Wolf, T. Poggio, "A biologically inspired system for action recognition," IEEE Intl. Conf. on Computer Vision, 2007, pp. 1-8.
  9. K. Schindler, L. Van Gool. "Action snippets: How many frames does human action recognition require?" IEEE Conf. on Computer Vision and Pattern Recognition, 2008, pp. 22-24.
  10. M. J. Escobar, G. S. Masson, T. Vieville, P. Kornprobst, "Action recognition using a bio-inspired feedforward spiking network," Intl. Journal of Computer Vision, vol. 82, no. 3, pp. 284-301, 2009.
  11. C. M. Bishop. "Neural Networks for Pattern Recognition." Oxford University Press, 1995.
  12. T. Jellema, D. I. Perrett, "Cells in monkey STS responsive to articulated body motions and consequent static posture: a case of implied motion?" Neuropsychologia, vol. 41, pp. 1728-1737, 2003. https://doi.org/10.1016/S0028-3932(03)00175-1
  13. Z. Kourtzi, N. Kanwisher. "Activation of human MT/MST by static images with implied motion." Journal of Cognitive Neuroscience, vol. 12, pp.48-53, 2000. https://doi.org/10.1162/08989290051137594
  14. J. P. van Santen, G. Sperling, "Temporal covariance model of human motion perception," Journal of the Optical Society of America A, vol. 1, pp. 451-473, 1984. https://doi.org/10.1364/JOSAA.1.000451
  15. E. H. Adelson, J. R. Bergen, "Spatiotemporal energy models for the perception of motion," Journal of the Optical Society of America A, vol. 2, no. 2, pp. 284-299, 1985. https://doi.org/10.1364/JOSAA.2.000284
  16. R. A. Brooks. "A robot that walks; emergent behaviors from a carefully evolved network," Neural Computation, vol. 1, no. 2, pp. 253-262, 1989. https://doi.org/10.1162/neco.1989.1.2.253
  17. R. Sekuler, S. N. J. Watamaniuk, R. Blake. "Motion Perception," Steven's Handbook of Experimental Psychology, vol. 1, pp. 121-176, 1998.
  18. W. A. Fellenz, J. G. Taylor, "Establishing retinotopy by lateral-inhibition type homogeneous neural fields," Neurocomputing, vol. 48, pp. 313-322, 2002. https://doi.org/10.1016/S0925-2312(01)00652-X
  19. E. Mingolla. "Neural models of motion integration and segmentation," Neural Networks, vol. 16, pp. 939-945, 2003. https://doi.org/10.1016/S0893-6080(03)00099-6
  20. C. Pack, S. Grossberg, E. Mingolla, "A neural model of smooth pursuit control and motion perception by cortical area MST," Technical Report CAS/CNS-TR-99-023, Department of Cognitive and Neural Systems and Center for Adaptive Systems, Boston University, 2000.
  21. M. J. Escobar, G. S. Masson, T. Vieville, P. Kornprobst, "Action recognition using a bio-inspired feedforward spiking network," International Journal of Computer Vision, vol. 82, no. 3, pp. 284-301, 2009. https://doi.org/10.1007/s11263-008-0201-1
  22. D. H. Hubel, T. N. Wiesel, "Receptive fields, binocular interaction and functional architecture in the cat's visual cortex," Journal of Physiology, vol. 160, pp. 106-154, 1962. https://doi.org/10.1113/jphysiol.1962.sp006837
  23. A. M. Derrington, G. B. Henning, "Detecting and discriminating the direction of motion of luminance and colour gratings," Vision Research, vol. 33, pp. 799-811, 1993.
  24. P. Foldiak. "Learning invariances from transformation sequences." Neural Computation, vol. 3, pp. 194-200, 1991. https://doi.org/10.1162/neco.1991.3.2.194
  25. G. A. Carpenter, S. Grossberg, "Pattern Recognition by Self-Organizing Neural Networks," MIT Press, 1991.
  26. M. W. Oram, D. I. Perrett. "Integration of form and motion in the anterior superior temporal polysensory area (STPA) of the macaque monkey," Journal of Neurophysiology, vol. 76, no. 1, pp. 109-129, 1996. https://doi.org/10.1152/jn.1996.76.1.109
  27. J. M. Singer, D. L. Sheinberg, "Temporal cortex neurons encode articulated actions as slow sequences of integrated poses," Journal of Neuroscience, vol. 30, no. 8, pp. 3133-3145, 2010.
  28. Jinok Kim, "A Study on Visual Perception based Emotion Recognition using Body-Activity Posture," The KIPS Transactions, Part B, vol. 18, no. 5, pp. 305-314, 2011. https://doi.org/10.3745/KIPSTB.2011.18B.5.305
  29. Jinok Kim, "Agent's Activities based Intention Recognition Computing", Journal of Korean Internet Society, vol. 13, no. 2, pp. 87-98, 2012. https://doi.org/10.7472/jksii.2012.13.2.87
  30. Jinok Kim, "Effective Pose-based Approach with Pose Estimation for Emotional Action Recognition", The KIPS Transactions: Part B, vol. 2, no. 3, pp. 1-10, 2013.

Cited by

  1. Hand-Mouse Interface Using Virtual Monitor Concept for Natural Interaction vol.5, 2017, https://doi.org/10.1109/ACCESS.2017.2768405
  2. NUI/NUX of a virtual monitor concept using the user's physical features and an EEG concentration index, vol.16, pp.6, 2015, https://doi.org/10.7472/jksii.2015.16.6.11