Proceedings of the Korean Institute of Intelligent Systems Conference (한국지능시스템학회:학술대회논문집)
- 2003.09a
- Pages 166-169
- 2003
Generating Cooperative Behavior by Multi-Agent Profit Sharing on the Soccer Game
- Miyazaki, Kazuteru (National Institution for Academic Degrees and University Evaluation) ;
- Terada, Takashi (Japan Overseas Cooperation Volunteers) ;
- Kobayashi, Hiroaki (Meiji University)
- Published : 2003.09.01
Abstract
Reinforcement learning is a kind of machine learning. It aims to adapt an agent to a given environment using rewards and penalties as clues. Q-learning [8], a representative reinforcement learning system, treats rewards and penalties at the same time, which raises the problem of how to choose appropriate reward and penalty values. The Penalty Avoiding Rational Policy Making algorithm (PARP) [4] and Penalty Avoiding Profit Sharing (PAPS) [2] are known as reinforcement learning systems that treat rewards and penalties independently. Although PAPS is a descendant of PARP, both PARP and PAPS tend to learn a locally optimal policy. To overcome this, in this paper we propose the Multi Best method (MB), which is PAPS combined with the multi-start method [5]. MB selects the best policy among several policies learned by PAPS agents. By applying Profit Sharing (PS), PAPS and MB to a soccer game environment based on SoccerBots [9], we show that MB is the best of the three methods for this environment.
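The selection step of MB, as the abstract describes it, can be illustrated with a minimal sketch: train several learners independently, score each resulting policy, and keep the best one. This is only an illustration under stated assumptions, not the authors' implementation. The names `multi_best`, `toy_train`, and `toy_evaluate`, the list-of-actions policy representation, and the scoring rule are all hypothetical; the toy learner is a random stand-in, not PAPS, and the toy score is not a SoccerBots match result.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical policy representation: one action index per state.
Policy = List[int]


def multi_best(train: Callable[[int], Policy],
               evaluate: Callable[[Policy], float],
               n_starts: int) -> Tuple[Policy, float]:
    """Multi-start selection: run `train` from n_starts different seeds
    and return the policy with the highest evaluation score."""
    if n_starts < 1:
        raise ValueError("n_starts must be at least 1")
    best_policy: Policy = []
    best_score = float("-inf")
    for seed in range(n_starts):
        policy = train(seed)      # one independent learner per start
        score = evaluate(policy)  # e.g. performance over test episodes
        if score > best_score:
            best_policy, best_score = policy, score
    return best_policy, best_score


# Toy stand-ins (assumptions for illustration only).
TARGET: Policy = [1, 0, 2, 1, 0]


def toy_train(seed: int) -> Policy:
    """Stand-in for a learner: returns a seed-dependent random policy."""
    rng = random.Random(seed)
    return [rng.randrange(3) for _ in TARGET]


def toy_evaluate(policy: Policy) -> float:
    """Stand-in for an evaluation: counts actions that match TARGET."""
    return float(sum(a == t for a, t in zip(policy, TARGET)))


policy, score = multi_best(toy_train, toy_evaluate, n_starts=10)
print(policy, score)
```

Because each start is independent, a learner that settles into a poor local optimum only costs one of the `n_starts` runs; the selection step simply discards it in favor of a better-scoring run.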
Keywords