Generating Augmented Lifting Player using Pose Tracking

  • Choi, Jong-In (Dept. of Digital Media, Seoul Women's University)
  • Kim, Jong-Hyun (Dept. of Software Application, Kangnam University)
  • Received : 2020.04.03
  • Accepted : 2020.04.27
  • Published : 2020.05.29

Abstract

This paper proposes a framework for creating acrobatic scenes, such as soccer ball lifting, from videos of various users. Given an ordinary video of a user recorded with a mobile phone, the proposed method generates the desired result within a few seconds. The framework consists of three main parts. The first analyzes the user's posture in the input video: a deep learning technique estimates the user's pose and tracks the movement of a selected body part. The second analyzes the trajectory of the selected body part to compute where and when it hits the object. The third generates the trajectory of the object from this hit information, producing a natural object-lifting scene synchronized with the input video. Physics-based optimization is used to make the object's motion realistic. The framework can be used to build a variety of augmented reality applications.

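The second and third parts of the framework can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not say how hits are detected, so the sketch assumes a hit occurs where the tracked joint's height reaches a local maximum, and it replaces the physics-based optimization with a closed-form ballistic arc between consecutive hits. The names `detect_hits`, `launch_velocity`, and `ball_position`, the metric units, and the gravity constant are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in metres, y pointing up (assumed units)
GRAVITY = 9.81               # m/s^2; pixel coordinates would need rescaling


def detect_hits(heights: List[float]) -> List[int]:
    """Return frame indices where the tracked joint height is a local maximum.

    Assumption: the joint strikes the ball at the top of its upward swing.
    """
    return [
        i
        for i in range(1, len(heights) - 1)
        if heights[i] > heights[i - 1] and heights[i] >= heights[i + 1]
    ]


def launch_velocity(p0: Point, p1: Point, duration: float,
                    g: float = GRAVITY) -> Point:
    """Velocity at p0 so that a ball under gravity reaches p1 after `duration` s."""
    vx = (p1[0] - p0[0]) / duration
    vy = (p1[1] - p0[1]) / duration + 0.5 * g * duration
    return (vx, vy)


def ball_position(p0: Point, v0: Point, t: float, g: float = GRAVITY) -> Point:
    """Ballistic position t seconds after leaving p0 with velocity v0."""
    return (p0[0] + v0[0] * t, p0[1] + v0[1] * t - 0.5 * g * t * t)


# Example: joint heights per frame, then one arc between two hit points.
heights = [0.1, 0.3, 0.5, 0.2, 0.1, 0.4, 0.6, 0.3]
hit_frames = detect_hits(heights)

start, end = (0.0, 0.5), (0.2, 0.5)
v0 = launch_velocity(start, end, duration=1.0)
landing = ball_position(start, v0, 1.0)
```

Because each arc is fixed by its two endpoints and the time between them, the ball arrives at every hit point on the frame where the hit was detected, which is what keeps the generated object synchronized with the input video.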
