Search | Korea Science

Cooperative Multi-agent Reinforcement Learning on Sparse Reward Battlefield Environment using QMIX and RND in Ray RLlib

Minkyoung Kim
- Journal of the Korea Society of Computer and Information, v.29 no.1, pp.11-19, 2024
Multi-agent systems can be utilized in various real-world cooperative environments such as battlefield engagements and unmanned transport vehicles. In battlefield engagements, where dense reward design is difficult due to limited domain knowledge, it is crucial to consider settings in which agents learn from explicit sparse rewards. This paper explores the collaborative potential among allied agents in a battlefield scenario. Using the Multi-Robot Warehouse Environment (RWARE) as a sparse reward environment, we define analogous problems and establish evaluation criteria. We construct a learning environment with the QMIX algorithm from the reinforcement learning library Ray RLlib, enhance the Agent Network of QMIX, and integrate Random Network Distillation (RND). This enables the extraction of patterns and temporal features from agents' partial observations, confirming the potential for improving the acquisition of sparse reward experiences through intrinsic rewards.
https://doi.org/10.9708/jksci.2024.29.01.011

Multagent Control Strategy Using Reinforcement Learning (강화학습을 이용한 다중 에이전트 제어 전략)

Lee, Hyong-Ill;Kim, Byung-Cheon
- The KIPS Transactions:PartB, v.10B no.3, pp.249-256, 2003
The most important problems in a multi-agent system are accomplishing a goal through the efficient coordination of several agents and preventing collisions with other agents. In this paper, we propose a new control strategy for efficiently achieving the goal of the prey pursuit problem. Our control method uses reinforcement learning to control the multi-agent system and considers the distance as well as the spatial relationship between the agents in the state space of the prey pursuit problem.
https://doi.org/10.3745/KIPSTB.2003.10B.3.249

Learning soccer robot using genetic programming

Wang, Xiaoshu;Sugisaka, Masanori
- Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집), 1999.10a, pp.292-297, 1999
Evolving an artificial agent is an extremely difficult problem, but also a challenging task. At present, studies are mainly centered on the single-agent learning problem. In our case, we use simulated soccer to investigate multi-agent cooperative learning. Considering the fundamental differences in learning mechanism, existing reinforcement learning algorithms can be roughly classified into two types: those based on evaluation functions and those that search the policy space directly. Genetic Programming, developed from Genetic Algorithms, is one of the best-known approaches belonging to the latter. In this paper, we first give a detailed algorithm description as well as the data construction necessary for learning single-agent strategies. In the following step, we will extend the developed methods to multiple-robot domains. We investigate and contrast two different methods, simple team learning and sub-group learning, and conclude the paper with some experimental results.

Study on Enhancing Training Efficiency of MARL for Swarm Using Transfer Learning (전이학습을 활용한 군집제어용 강화학습의 효율 향상 방안에 관한 연구)

Seulgi Yi;Kwon-Il Kim;Sukmin Yoon
- Journal of the Korea Institute of Military Science and Technology, v.26 no.4, pp.361-370, 2023
Swarms have recently become a critical component of offensive and defensive systems. Multi-agent reinforcement learning (MARL) empowers swarm systems to handle a wide range of scenarios. However, the main challenge lies in MARL's scalability issue: as the number of agents increases, learning performance decreases. In this study, transfer learning is applied to an advanced MARL algorithm to resolve the scalability issue. Validation results show that training efficiency improved significantly, reducing computational time by 31%.
https://doi.org/10.9766/KIMST.2023.26.4.361

Hybrid Multi-agent Learning Strategy (혼성 다중에이전트 학습 전략)

Kim, Byung-Chun;Lee, Chang-Hoon
- The Journal of the Institute of Internet, Broadcasting and Communication, v.13 no.6, pp.187-193, 2013
In multi-agent systems, how to coordinate the behaviors of the agents through learning is a very important problem. The most important problems in a multi-agent system are accomplishing a goal through the efficient coordination of several agents and preventing collisions with other agents. In this paper, we propose a novel approach using a hybrid learning strategy. The hybrid learning strategy controls the multi-agent system efficiently by using the spatial relationship among the agents. Experiments show that the proposed strategy approaches the goal faster than other strategies and avoids collisions among the agents.
https://doi.org/10.7236/JIIBC.2013.13.6.187

Multi Colony Intensification.Diversification Interaction Ant Reinforcement Learning Using Temporal Difference Learning (Temporal Difference 학습을 이용한 다중 집단 강화.다양화 상호작용 개미 강화학습)

Lee Seung-Gwan
- The Journal of the Korea Contents Association, v.5 no.5, pp.1-9, 2005
In this paper, we propose a multi-colony interaction ant reinforcement learning model. This method is a hybrid of multi-colony interaction by elite strategy and reinforcement learning that applies Temporal Difference (TD) learning to Ant-Q learning. The proposed model consists of several independent AS colonies, and interaction between the colonies guides the search according to an elite strategy (Intensification and Diversification strategies). The Intensification strategy selects good paths by using the heuristic information of other agent colonies. Through positive interaction between the colonies, this leads agents to select edges with a high visit frequency. The Diversification strategy uses the search information of other agent colonies in a negative interaction, so that agents avoid selecting edges with a high visit frequency. With these strategies, we found that the proposed reinforcement learning method converges to the optimal solution faster than the original ACS and Ant-Q.

The Application of Industrial Inspection of LED

Xi, Wang;Chong, Kil-To
- Proceedings of the IEEK Conference, 2009.05a, pp.91-93, 2009
In this paper, we present a Q-learning method for adaptive traffic signal control based on multi-agent technology. The structure is composed of six phase agents and one intersection agent. A wireless communication network makes cooperation among the agents possible. As one kind of reinforcement learning, Q-learning is adopted as the algorithm of the control mechanism, which can acquire optimal control strategies from delayed reward; furthermore, we adopt a dynamic learning method instead of a static method, which is more practical. Simulation results indicate that it is more effective than a traditional signal system.

The Automatic Coordination Model for Multi-Agent System Using Learning Method (학습기법을 이용한 멀티 에이전트 시스템 자동 조정 모델)

Lee, Mal-Rye;Kim, Sang-Geun
- The KIPS Transactions:PartB, v.8B no.6, pp.587-594, 2001
A multi-agent system is suited to distributed and open Internet environments. In a multi-agent system, agents must cooperate with each other through a coordination procedure when conflicts between agents arise. These conflicts arise because each agent acts toward its own purpose without coordination. However, previous research on coordination methods in multi-agent systems has the deficiency that it cannot correctly solve the cooperation problem between agents that have different goals in a dynamic environment. In this paper, we propose an automatic coordination model for multi-agent systems using neural networks and reinforcement learning in a dynamic environment. We conduct a competitive experiment between multiple agents in a complex environment with diverse activities, and we analyze and evaluate the effect of the agents' activity. The results show that the proposed method is appropriate.

A Navigation System for Mobile Robot

Zhang, Yuanliang;Chong, Kil-To
- Proceedings of the IEEK Conference, 2009.05a, pp.118-120, 2009
In this paper, we present a Q-learning method for adaptive traffic signal control based on multi-agent technology. The structure is composed of six phase agents and one intersection agent. A wireless communication network makes cooperation among the agents possible. As one kind of reinforcement learning, Q-learning is adopted as the algorithm of the control mechanism, which can acquire optimal control strategies from delayed reward; furthermore, we adopt a dynamic learning method instead of a static method, which is more practical. Simulation results indicate that it is more effective than a traditional signal system.

Multi-agent Q-learning based Admission Control Mechanism in Heterogeneous Wireless Networks for Multiple Services

Chen, Jiamei;Xu, Yubin;Ma, Lin;Wang, Yao
- KSII Transactions on Internet and Information Systems (TIIS), v.7 no.10, pp.2376-2394, 2013
To ensure both overall system capacity and users' QoS requirements in heterogeneous wireless networks, the admission control mechanism should be well designed. In this paper, a Multi-agent Q-learning based Admission Control Mechanism (MQACM) is proposed to handle new and handoff call access problems appropriately. MQACM obtains the optimal decision policy by using the Multi-agent Q-learning (MQ) method, an improved form of the single-agent Q-learning method. The MQ method is creatively introduced in this paper to solve the admission control problem in heterogeneous wireless networks. In addition, different priorities are allocated to multiple services so that MQACM performs well even in congested network scenarios. Both analysis and simulation results show that the proposed method not only outperforms existing schemes, with improved call blocking probability and handoff dropping probability performance, but also has better network universality and stability than other schemes.
https://doi.org/10.3837/tiis.2013.10.003
