Survey on Communication Algorithms for Multiagent Reinforcement Learning

  • S.W. Seo (Information Strategy Division);
  • Y.H. Shin (Information Strategy Division);
  • B.H. Yoo (Integrated Intelligence Research Section);
  • H.W. Kim (Integrated Intelligence Research Section);
  • H.J. Song (Integrated Intelligence Research Section);
  • S. Yi (Information Strategy Division)
  • Published: 2023.08.01

Abstract

Communication for multiagent reinforcement learning (MARL) has emerged as a way to promote understanding of the entire environment. Through communication, agents can cooperate by choosing the best action in light of not only their local surroundings but also the global environment and the other agents. Hence, MARL with communication may outperform conventional MARL. Many communication algorithms have been proposed to support MARL, but systematic analyses of them remain insufficient. This paper reviews existing communication algorithms for MARL according to various criteria, such as communication methods, contents, and restrictions. In addition, we consider several experimental environments that are primarily used to demonstrate the performance gains that communication brings to MARL.
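The core idea surveyed here — each agent conditions its action on messages aggregated from the other agents — can be sketched in a few lines of NumPy, in the spirit of CommNet [4]. All dimensions, parameter names, and the random linear layers below are illustrative assumptions for this sketch, not code from any surveyed paper:

```python
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS, OBS_DIM, MSG_DIM, N_ACTIONS = 3, 4, 8, 5

# Illustrative parameters, shared across agents as in CommNet [4].
W_enc = rng.normal(size=(OBS_DIM, MSG_DIM)) * 0.1    # observation -> hidden/message
W_comm = rng.normal(size=(MSG_DIM, MSG_DIM)) * 0.1   # transforms the averaged incoming message
W_act = rng.normal(size=(MSG_DIM, N_ACTIONS)) * 0.1  # hidden -> action logits

def step(observations: np.ndarray) -> np.ndarray:
    """One communication round: encode, average the other agents' messages, act."""
    h = np.tanh(observations @ W_enc)                # (N_AGENTS, MSG_DIM)
    # Each agent receives the mean of the OTHER agents' hidden states.
    totals = h.sum(axis=0, keepdims=True)
    incoming = (totals - h) / (N_AGENTS - 1)
    h = np.tanh(h + incoming @ W_comm)               # fuse own state with received messages
    logits = h @ W_act
    return logits.argmax(axis=1)                     # greedy action per agent

obs = rng.normal(size=(N_AGENTS, OBS_DIM))
actions = step(obs)
```

In a trained system the three weight matrices would be learned end to end by backpropagating the task reward through the (differentiable) communication channel; later algorithms in this survey replace the uniform mean with attention-weighted, gated, or graph-structured aggregation.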


Funding

This work was supported by internal research projects of the Electronics and Telecommunications Research Institute (ETRI) [Research on Exploration, Communication, and Learning Strategies for Multiagent Reinforcement Learning, 22YE1210; Fundamental Technology Research on Self-Growing Integrated Artificial Intelligence, 23ZS1100].

References

  1. C.V. Goldman and S. Zilberstein, "Decentralized control of cooperative systems: Categorization and complexity analysis," J. Artif. Intell. Res., vol. 22, 2004, pp. 143-174. https://doi.org/10.1613/jair.1427
  2. B.H. Yoo et al., "Research trends on multi-agent reinforcement learning," Electronics and Telecommunications Trends, vol. 35, no. 6, 2020, pp. 137-149. https://doi.org/10.22648/ETRI.2020.J.350614
  3. C. Zhu, M. Dastani, and S. Wang, "A survey of multi-agent reinforcement learning with communication," arXiv preprint, CoRR, 2022, arXiv: 2203.08975.
  4. S. Sukhbaatar, A. Szlam, and R. Fergus, "Learning multiagent communication with backpropagation," in Proc. Neural Inform. Process. Syst., (Barcelona, Spain), Dec. 2016, pp. 2244-2252.
  5. J.N. Foerster et al., "Learning to communicate with deep multi-agent reinforcement learning," in Proc. Neural Inform. Process. Syst., (Barcelona, Spain), Dec. 2016, pp. 2137-2145.
  6. P. Peng et al., "Multiagent bidirectionally-coordinated nets for learning to play starcraft combat games," arXiv preprint, CoRR, 2017, arXiv: 1703.10069.
  7. A. Das et al., "Tarmac: Targeted multi-agent communication," in Proc. Int. Conf. Mach. Learn., (Long Beach, CA, USA), June 2019, pp. 1538-1546.
  8. J. Jiang et al., "Graph convolutional reinforcement learning," in Proc. Int. Conf. Learn. Representations, (Addis Ababa, Ethiopia), Apr. 2020.
  9. T. Chu, S. Chinchali, and S. Katti, "Multi-agent reinforcement learning for networked system control," in Proc. Int. Conf. Learn. Representations, (Addis Ababa, Ethiopia), Apr. 2020.
  10. Z. Ding, T. Huang, and Z. Lu, "Learning individually inferred communication for multi-agent cooperation," in Proc. Neural Inform. Process. Syst., (Vancouver, Canada), Dec. 2020, pp. 22069-22079.
  11. O. Kilinc and G. Montana, "Multi-agent deep reinforcement learning with extremely noisy observations," arXiv preprint, CoRR, 2018, arXiv: 1812.00922.
  12. H. Mao et al., "Learning agent communication under limited bandwidth by message pruning," Proc. AAAI Conf. Artif. Intell., vol. 34, no. 4, 2020, pp. 5142-5149.
  13. R. Wang et al., "Learning efficient multi-agent communication: An information bottleneck approach," in Proc. Int. Conf. Mach. Learn., (Vienna, Austria), July 2020, pp. 9908-9918.
  14. C. Guan et al., "Efficient multi-agent communication via self-supervised information aggregation," in Proc. Neural Inform. Process. Syst., (New Orleans, LA, USA), Nov. 2022, pp. 1020-1033.
  15. S.Q. Zhang, Q. Zhang, and J. Lin, "Efficient communication in multi-agent reinforcement learning via variance based control," in Proc. Neural Inform. Process. Syst., (Vancouver, Canada), Dec. 2019, pp. 3230-3239.
  16. J. Jiang and Z. Lu, "Learning attentional communication for multiagent cooperation," in Proc. Neural Inform. Process. Syst., (Montreal, Canada), Dec. 2018, pp. 7265-7275.
  17. D. Kim et al., "Learning to schedule communication in multiagent reinforcement learning," in Proc. Int. Conf. Learn. Representations, (New Orleans, LA, USA), May 2019.
  18. Y. Liu et al., "Multi-agent game abstraction via graph attention neural network," Proc. AAAI Conf. Artif. Intell., 2020, vol. 34, no. 5, pp. 7211-7218.
  19. G. Hu et al., "Event-triggered multi-agent reinforcement learning with communication under limited-bandwidth constraint," arXiv preprint, CoRR, 2020, arXiv: 2010.04978.
  20. B. Freed et al., "Sparse discrete communication learning for multi-agent cooperation through backpropagation," in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., (Las Vegas, NV, USA), Jan. 2020, pp. 7993-7998.
  21. S.Q. Zhang, Q. Zhang, and J. Lin, "Succinct and robust multi-agent communication with temporal message control," in Proc. Neural Inform. Process. Syst., (Vancouver, Canada), Dec. 2020, pp. 17271-17282.
  22. T. Wang et al., "Learning nearly decomposable value functions via communication minimization," in Proc. Int. Conf. Learn. Representations, (Addis Ababa, Ethiopia), Apr. 2020.
  23. B. Freed et al., "Communication learning via backpropagation in discrete channels with unknown noise," Proc. AAAI Conf. Artif. Intell., vol. 34, no. 5, 2020, pp. 7160-7168.
  24. W. Xue et al., "Mis-spoke or mis-lead: Achieving robustness in multi-agent communicative reinforcement learning," in Proc. Int. Conf. Auton. Agents Multiagent Syst., (Richland, SC, USA), 2022, pp. 1418-1426.
  25. Y. Sun et al., "Certifiably robust policy learning against adversarial multi-agent communication," in Proc. Int. Conf. Learn. Representations, (Kigali, Rwanda), May 2023.
  26. A. Singh, T. Jain, and S. Sukhbaatar, "Learning when to communicate at scale in multiagent cooperative and competitive tasks," in Proc. Int. Conf. Learn. Representations, (New Orleans, LA, USA), May 2019.
  27. R. Lowe et al., "Multi-agent actor-critic for mixed cooperative-competitive environments," in Proc. Neural Inform. Process. Syst., (Long Beach, CA, USA), Dec. 2017.
  28. M. Samvelyan et al., "The StarCraft Multi-Agent Challenge," arXiv preprint, CoRR, 2019, arXiv: 1902.04043.
  29. B. Ellis et al., "SMACv2: An improved benchmark for cooperative multi-agent reinforcement learning," arXiv preprint, CoRR, 2022, arXiv: 2212.07489.
  30. F. Christianos, L. Schafer, and S. Albrecht, "Shared experience actor-critic for multi-agent reinforcement learning," in Proc. Neural Inform. Process. Syst., (Vancouver, Canada), Dec. 2020, pp. 10707-10717.