Acknowledgement
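This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) under the ICT and Broadcasting Technology Development Program, project "Development of task planning technology for individual robots and robot groups connected to the cloud" (No. 2020-0-00096).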