Exploring the Effectiveness of GAN-based Approach and Reinforcement Learning in Character Boxing Task

  • Seoyoung Son (Department of Computer Science, Hanyang University Graduate School) ;
  • Taesoo Kwon (Department of Computer Science, Hanyang University Graduate School)
  • Received : 2023.06.28
  • Accepted : 2023.08.16
  • Published : 2023.09.01

Abstract

For decades, creating a desired locomotive motion in a goal-oriented manner has been a challenge in character animation. Data-driven methods using generative models have demonstrated efficient ways of predicting long sequences of motion without the need for explicit conditioning. While these methods produce high-quality long-term motions, they can be limited when synthesizing motion for challenging novel scenarios, such as punching a random target. A state-of-the-art solution to this limitation is to use a GAN discriminator to imitate motion data clips and to incorporate reinforcement learning to compose goal-oriented motions. In this paper, we aim to create characters performing combat sports such as boxing, using a novel reward design in conjunction with existing GAN-based approaches. We experimentally demonstrate that both the Adversarial Motion Priors [3] and Adversarial Skill Embeddings [4] methods are capable of generating viable motions for a character punching a random target, even in the absence of mocap data that specifically captures the transition between punching and locomotion. In addition, multiple task controllers can be constructed from a single learned policy through the TimeChamber framework.
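The combination of a GAN discriminator style reward with a goal-oriented task reward described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the least-squares form of the discriminator reward follows the formulation reported in AMP [3], and all function names and the 0.5/0.5 weights are hypothetical.

```python
def style_reward(d_out: float) -> float:
    """Style reward from the discriminator's output on a state
    transition (s, s'), in the least-squares GAN form used by AMP [3]:
    r_style = max(0, 1 - 0.25 * (D(s, s') - 1)^2)."""
    return max(0.0, 1.0 - 0.25 * (d_out - 1.0) ** 2)


def total_reward(r_task: float, d_out: float,
                 w_task: float = 0.5, w_style: float = 0.5) -> float:
    """Weighted sum of the goal-oriented task reward (e.g. reaching or
    punching a random target) and the imitation-based style reward."""
    return w_task * r_task + w_style * style_reward(d_out)
```

A discriminator output near 1 (the transition is judged mocap-like) yields a style reward near 1, while outputs far from 1 are clipped to 0, so the policy is rewarded only for motions that both resemble the reference clips and accomplish the task.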

Reproducing a desired trajectory for goal-directed locomotion has long been a difficult problem in character animation. Data-driven methods using generative models are an efficient way to predict long motion sequences without explicit conditioning. Although such methods produce high-quality results, they can be limited when synthesizing motion for harder scenarios, such as striking a randomly placed distant target. This can be addressed by combining a GAN discriminator that imitates motion data clips with reinforcement learning. This study aims to make characters box through GAN-based approaches and reward design. We experimentally compare the two recent methods used in this paper, Adversarial Motion Priors and Adversarial Skill Embeddings, and, to extend boxing into a competitive sport, we employ TimeChamber, a massively parallel large-scale self-play framework for multi-agent reinforcement learning.
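The TimeChamber-style self-play mentioned above trains the current policy against a pool of past policy snapshots, typically ranked by a rating such as Elo so that opponents stay competitive. A minimal sketch of that matchmaking idea follows; the function names, the K-factor, and the rating window are all hypothetical, not TimeChamber's actual API.

```python
import random


def update_elo(r_winner: float, r_loser: float, k: float = 32.0):
    """Standard Elo update after a match: the winner gains what the
    loser drops, scaled by how unexpected the result was."""
    expected = 1.0 / (1.0 + 10.0 ** ((r_loser - r_winner) / 400.0))
    delta = k * (1.0 - expected)
    return r_winner + delta, r_loser - delta


def sample_opponent(pool, ratings, current_rating, window=200.0):
    """Pick a past snapshot whose rating is close to the current
    agent's, so matches stay informative rather than one-sided."""
    near = [p for p in pool if abs(ratings[p] - current_rating) <= window]
    return random.choice(near or pool)
```

A training loop would repeatedly sample an opponent from the pool, play a boxing match, update both Elo ratings from the outcome, and periodically add a snapshot of the current policy to the pool.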

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (NRF-2020R1A2C1012847).

References

  1. L. Kovar, M. Gleicher, and F. Pighin, "Motion graphs," ACM SIGGRAPH 2008 classes, pp. 1-10, 2008.
  2. J. Ho, and S. Ermon, "Generative adversarial imitation learning," Advances in neural information processing systems, vol. 29, 2016.
  3. X. B. Peng et al., "AMP: Adversarial motion priors for stylized physics-based character control," ACM Transactions on Graphics (TOG), vol. 40, no. 4, pp. 1-20, 2021. https://doi.org/10.1145/3476576.3476723
  4. X. B. Peng et al., "ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters," arXiv preprint arXiv:2205.01906, 2022.
  5. D. Holden, T. Komura, and J. Saito, "Phase-functioned neural networks for character control," ACM Transactions on Graphics (TOG), vol. 36, no. 4, pp. 1-13, 2017. https://doi.org/10.1145/3072959.3073663
  6. H. Zhang et al., "Mode-adaptive neural networks for quadruped motion control," ACM Transactions on Graphics (TOG), vol. 37, no. 4, pp. 1-11, 2018. https://doi.org/10.1145/3197517.3201366
  7. D. Holden et al., "Learned motion matching," ACM Transactions on Graphics (TOG), vol. 39, no. 4, pp. 53:1-53:12, 2020. https://doi.org/10.1145/3386569.3392440
  8. X. B. Peng et al., "Deeploco: Dynamic locomotion skills using hierarchical deep reinforcement learning," ACM Transactions on Graphics (TOG), vol. 36, no. 4, pp. 1-13, 2017. https://doi.org/10.1145/3072959.3073602
  9. W. Yu, G. Turk, and C. K. Liu, "Learning symmetric and low-energy locomotion," ACM Transactions on Graphics (TOG), vol. 37, no. 4, pp. 1-12, 2018. https://doi.org/10.1145/3197517.3201397
  10. A. Elgammal, and C.-S. Lee, "The role of manifold learning in human motion analysis," Human Motion, Springer, pp. 25-56, 2008.
  11. D. Holden, J. Saito, and T. Komura, "A deep learning framework for character motion synthesis and editing," ACM Transactions on Graphics (TOG), vol. 35, no. 4, pp. 1-11, 2016. https://doi.org/10.1145/2897824.2925975
  12. H. Y. Ling et al., "Character controllers using motion VAEs," ACM Transactions on Graphics (TOG), vol. 39, no. 4, pp. 40:1-40:12, 2020. https://doi.org/10.1145/3386569.3392422
  13. J. Won, D. Gopinath, and J. Hodgins, "Physics-based character controllers using conditional VAEs," ACM Transactions on Graphics (TOG), vol. 41, no. 4, pp. 1-12, 2022. https://doi.org/10.1145/3528223.3530067
  14. H. Yao et al., "ControlVAE: Model-Based Learning of Generative Controllers for Physics-Based Characters," ACM Transactions on Graphics (TOG), vol. 41, no. 6, pp. 1-16, 2022. https://doi.org/10.1145/3550454.3555434
  15. G. E. Henter, S. Alexanderson, and J. Beskow, "Moglow: Probabilistic and controllable motion synthesis using normalising flows," ACM Transactions on Graphics (TOG), vol. 39, no. 6, pp. 1-14, 2020. https://doi.org/10.1145/3414685.3417836
  16. J. Juravsky et al., "PADL: Language-Directed Physics-Based Character Control," SIGGRAPH Asia 2022 Conference Papers, pp. 1-9, 2022.
  17. S. Agrawal, and M. van de Panne, "Task-based locomotion," ACM Transactions on Graphics (TOG), vol. 35, no. 4, pp. 1-11, 2016. https://doi.org/10.1145/2897824.2925893
  18. K. Lee, S. Lee, and J. Lee, "Interactive character animation by learning multi-objective control," ACM Transactions on Graphics (TOG), vol. 37, no. 6, pp. 1-10, 2018. https://doi.org/10.1145/3272127.3275071
  19. J. Merel et al., "Catch & carry: reusable neural controllers for vision-guided whole-body tasks," ACM Transactions on Graphics (TOG), vol. 39, no. 4, pp. 39:1-39:12, 2020.
  20. L. Fussell, K. Bergamin, and D. Holden, "Supertrack: Motion tracking for physically simulated characters using supervised learning," ACM Transactions on Graphics (TOG), vol. 40, no. 6, pp. 1-13, 2021. https://doi.org/10.1145/3478513.3480527
  21. T. Bansal et al., "Emergent complexity via multi-agent competition," arXiv preprint arXiv:1710.03748, 2017.
  22. B. Baker et al., "Emergent tool use from multi-agent autocurricula," arXiv preprint arXiv:1909.07528, 2019.
  23. J. Won, D. Gopinath, and J. Hodgins, "Control strategies for physically simulated characters performing two-player competitive sports," ACM Transactions on Graphics (TOG), vol. 40, no. 4, pp. 1-11, 2021. https://doi.org/10.1145/3450626.3459761
  24. Z. Huang, Y. Wu, and F. Sung, "TimeChamber: A Massively Parallel Large Scale Self-Play Framework," https://github.com/inspirai/TimeChamber.
  25. CMU, "CMU Graphics Lab Motion Capture Database," http://mocap.cs.cmu.edu, 2002.
  26. E. Coumans, "Bullet physics library," open source: http://bulletphysics.org, 2013.
  27. V. Makoviychuk et al., "Isaac gym: High performance gpu-based physics simulation for robot learning," arXiv preprint arXiv:2108.10470, 2021.
  28. G. Tevet et al., "Human motion diffusion model," arXiv preprint arXiv:2209.14916, 2022.
  29. M. Zhang et al., "Motiondiffuse: Text-driven human motion generation with diffusion model," arXiv preprint arXiv:2208.15001, 2022.
  30. Y. Shafir et al., "Human Motion Diffusion as a Generative Prior," arXiv preprint arXiv:2303.01418, 2023.
  31. Y. Yuan et al., "PhysDiff: Physics-Guided Human Motion Diffusion Model," arXiv preprint arXiv:2212.02500, 2022.