
Survey on Recent Advances in Multiagent Reinforcement Learning Focusing on Decentralized Training with Decentralized Execution Framework

  • Y.H. Shin (Information Strategy Division);
  • S.W. Seo (Information Strategy Division);
  • B.H. Yoo (Integrated Intelligence Research Section);
  • H.W. Kim (Integrated Intelligence Research Section);
  • H.J. Song (Integrated Intelligence Research Section);
  • S. Yi (Information Strategy Division)
  • Published: Aug. 1, 2023

Abstract

The importance of the decentralized training with decentralized execution (DTDE) framework is well known in the study of multiagent reinforcement learning. In many real-world environments, agents cannot share information, so they must be trained in a decentralized manner. However, the DTDE framework has been studied less than the centralized training with decentralized execution (CTDE) framework. One main reason is that many problems arise when agents are trained in a decentralized manner; for example, DTDE algorithms are often computationally demanding and can suffer from non-stationarity. Another reason is the lack of simulation environments that properly support the DTDE framework. We discuss current research trends in the DTDE framework.
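The setting described above can be illustrated with a minimal sketch (not taken from the article): two independent Q-learners in a repeated cooperative matrix game. Each agent observes only its own action and the shared reward and never communicates, so training and execution are both decentralized, and from each agent's viewpoint the other learner is part of a non-stationary environment. The payoff matrix, hyperparameters, and function names here are illustrative assumptions.

```python
import random

# Reward for joint action (a0, a1): coordinating on (0,0) or (1,1) pays 1.
PAYOFF = [[1.0, 0.0],
          [0.0, 1.0]]

def train(episodes=5000, alpha=0.1, eps=0.1, seed=0):
    """Independent (DTDE-style) Q-learning: no information is shared."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]      # one private Q-table per agent
    for _ in range(episodes):
        acts = []
        for i in range(2):            # decentralized execution:
            if rng.random() < eps:    # each agent acts on its own Q only
                acts.append(rng.randrange(2))
            else:
                acts.append(0 if q[i][0] >= q[i][1] else 1)
        r = PAYOFF[acts[0]][acts[1]]  # shared reward from the joint action
        for i in range(2):            # decentralized training:
            a = acts[i]               # each agent updates only its own table
            q[i][a] += alpha * (r - q[i][a])
    return q

q = train()
```

With enough episodes the two learners typically coordinate on one of the two optimal joint actions despite never communicating, but because each agent's effective reward distribution shifts as the other agent learns, convergence is not guaranteed in general, which is precisely the non-stationarity problem noted above.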

Keywords

Acknowledgments

This study was conducted as part of internal research projects of the Electronics and Telecommunications Research Institute (ETRI) [Research on exploration, communication, and learning strategies for multiagent reinforcement learning, 22YE1210; Research on core technologies for self-improving integrated artificial intelligence, 23ZS1100].

References

  1. L. Busoniu et al., "A comprehensive survey of multiagent reinforcement learning," IEEE Trans. Syst. Man, Cybern. C, Appl. Rev., vol. 38, no. 2, 2008, pp. 156-172.  https://doi.org/10.1109/TSMCC.2007.913919
  2. B.H. Yoo et al., "Research trends on multi-agent reinforcement learning," Electronics and Telecommunications Trends, vol. 35, no. 6, 2020, pp. 137-149.
  3. S. Gronauer and K. Diepold, "Multi-agent deep reinforcement learning: A survey," Artif. Intell. Rev., vol. 55, no. 2, 2022, pp. 895-943.  https://doi.org/10.1007/s10462-021-09996-w
  4. H. Nekoei et al., "Dealing with non-stationarity in decentralized cooperative multi-agent deep reinforcement learning via multi-timescale learning," arXiv preprint, CoRR, 2023, arXiv: 2302.02792. 
  5. G. Papoudakis et al., "Benchmarking multi-agent deep reinforcement learning algorithms in cooperative tasks," in Proc. Neural Inform. Process. Syst., (Virtual), 2021.
  6. J. Foerster et al., "Counterfactual multi-agent policy gradients," in Proc. AAAI Conf. Artif. Intell., vol. 32, no. 1, 2018.
  7. T. Rashid et al., "QMIX: Monotonic value function factorisation for deep multi-agent reinforcement learning," in Proc. Int. Conf. Mach. Learn., (Stockholm, Sweden), July 2018. 
  8. P. Hernandez-Leal et al., "A survey of learning in multiagent environments: Dealing with non-stationarity," arXiv preprint, CoRR, 2017, arXiv: 1707.09183.
  9. M. Tan, "Multi-agent reinforcement learning: Independent vs. cooperative agents," in Proc. Int. Conf. Mach. Learn., (Amherst, MA, USA), 1993, pp. 330-337. 
  10. V. Mnih et al., "Playing atari with deep reinforcement learning," arXiv preprint, CoRR, 2013, arXiv: 1312.5602. 
  11. T. Chu et al., "Multi-agent deep reinforcement learning for large-scale traffic signal control," IEEE Trans. Intell. Transp. Syst., vol. 21, no. 3, 2020, pp. 1086-1095.  https://doi.org/10.1109/TITS.2019.2901791
  12. C.S. de Witt et al., "Is independent learning all you need in the StarCraft multi-agent challenge?," arXiv preprint, CoRR, 2020, arXiv: 2011.09533.
  13. V. Mnih et al., "Asynchronous methods for deep reinforcement learning," in Proc. Int. Conf. Mach. Learn., (New York, NY, USA), June 2016.
  14. J. Schulman et al., "Proximal policy optimization algorithms," arXiv preprint, CoRR, 2017, arXiv: 1707.06347. 
  15. H. Nekoei et al., "Staged independent learning: Towards decentralized cooperative multi-agent reinforcement learning," in Proc. Int. Conf. Mach. Learn., (Virtual), Apr. 2022. 
  16. L. Matignon et al., "Hysteretic q-learning: An algorithm for decentralized reinforcement learning in cooperative multi-agent teams," in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., (San Diego, CA, USA), Oct. 2007, pp. 64-69. 
  17. G. Palmer et al., "Lenient multi-agent deep reinforcement learning," in Proc. Int. Conf. Auton. Agents MultiAgent Syst., (Stockholm, Sweden), July 2018, pp. 443-451. 
  18. J. Jiang et al., "I2Q: A fully decentralized Q-learning algorithm," in Proc. Neural Inform. Process. Syst., (New Orleans, LA, USA), Nov. 2022.
  19. S. Omidshafiei et al., "Deep decentralized multi-task multi-agent reinforcement learning under partial observability," in Proc. Int. Conf. Mach. Learn., (Sydney, Australia), Aug. 2017, pp. 2681-2690. 
  20. W. Li et al., "F2A2: Flexible fully-decentralized approximate actor-critic for cooperative multi-agent reinforcement learning," arXiv preprint, CoRR, 2020, arXiv: 2004.11145. 
  21. M. Samvelyan et al., "The StarCraft multi-agent challenge," arXiv preprint, CoRR, 2019, arXiv: 1902.04043.
  22. G. Brockman et al., "OpenAI gym," arXiv preprint, CoRR, 2016, arXiv: 1606.01540. 
  23. F. Christianos et al., "Shared experience actor-critic for multi-agent reinforcement learning," in Proc. Neural Inform. Process. Syst., (Vancouver, Canada), Dec. 2020, pp. 10707-10717.