• Title/Summary/Keyword: multi-agent learning

Recent Trends in Multi-Agent Technology and Communication Optimization Research for Swarm Flight of Drones (드론 군집 비행을 위한 다중 에이전트 최신 기술 분석 및 통신 최적화 기술 연구)

  • Kim Eunsu;Jang Yeonju;Bang Jongho
    • Journal of Korea Society of Digital Industry and Information Management / v.20 no.3 / pp.71-84 / 2024
  • Artificial intelligence is a key enabling technology for expanding drones' application fields, and drones combined with artificial intelligence are expected to gain operational capability from algorithms that can solve complex tasks through learning. The purpose of this study is to analyze recent research that applies deep reinforcement learning to drones to overcome the limitations of swarm flight, and to propose a new research direction that applies it to multi-agent communication optimization. The study investigates and analyzes methods for efficiently operating the control and communication technologies that successful swarm flight requires, and applies algorithms that exchange richer feedback between agents and require less training than conventional methods when learning deep reinforcement learning algorithms. This is expected to improve the efficiency and performance of learning communication protocols optimized for swarm flight, which in turn should make missions such as exploring or scouting large areas by swarm flight more efficient.

A Course Scheduling Multi-Agent System For Ubiquitous Web Learning Environment (유비쿼터스 웹 학습 환경을 위한 코스 스케줄링 멀티 에이전트 시스템)

  • Han, Seung-Hyun;Ryu, Dong-Yeop;Seo, Jeong-Man
    • Journal of the Korea Society of Computer and Information / v.10 no.4 s.36 / pp.365-373 / 2005
  • Ubiquitous learning environments need new e-learning models as web-based education systems are proposed. Learner demand for customized courseware has increased, and the need for efficient, automated education agents in web-based instruction is recognized. However, many recently studied education systems neither deliver the courses learners want smoothly nor give learners a way to study the weaknesses observed through continuous course feedback. In this paper we propose a learner-oriented multi-agent system for course scheduling that uses a weakness analysis algorithm with personalized ubiquitous environment factors. First, the proposed system analyzes the learner's evaluation results and calculates learning accomplishment. From this accomplishment, the multi-agent system schedules a suitable course for the learner. The learner achieves active and complete learning through repeated, suitable courses.

Intelligent Warehousing: Comparing Cooperative MARL Strategies

  • Yosua Setyawan Soekamto;Dae-Ki Kang
    • International Journal of Internet, Broadcasting and Communication / v.16 no.3 / pp.205-211 / 2024
  • Effective warehouse management requires advanced resource planning to optimize profits and space. Robots offer a promising solution, but their effectiveness relies on embedded artificial intelligence. Multi-agent reinforcement learning (MARL) enhances robot intelligence in these environments. This study explores various MARL algorithms using the Multi-Robot Warehouse Environment (RWARE) to determine their suitability for warehouse resource planning. Our findings show that cooperative MARL is essential for effective warehouse management. IA2C outperforms MAA2C and VDA2C on smaller maps, while VDA2C excels on larger maps. IA2C's decentralized approach, focusing on cooperation over collaboration, allows for higher reward collection in smaller environments. However, as map size increases, reward collection decreases due to the need for extensive exploration. This study highlights the importance of selecting the appropriate MARL algorithm based on the specific warehouse environment's requirements and scale.

Trends in quantum reinforcement learning: State-of-the-arts and the road ahead

  • Soohyun Park;Joongheon Kim
    • ETRI Journal / v.46 no.5 / pp.748-758 / 2024
  • This paper presents the basic quantum reinforcement learning theory and its applications to various engineering problems. With the advances in quantum computing and deep learning technologies, various research works have focused on quantum deep learning and quantum machine learning. In this paper, quantum neural network (QNN)-based reinforcement learning (RL) models are discussed and introduced. Moreover, the pros of the QNN-based RL algorithms and models, such as fast training, high scalability, and efficient learning parameter utilization, are presented along with various research results. In addition, one of the well-known multi-agent extensions of QNN-based RL models, the quantum centralized-critic and multiple-actor network, is also discussed and its applications to multi-agent cooperation and coordination are introduced. Finally, the applications and future research directions are introduced and discussed in terms of federated learning, split learning, autonomous control, and quantum deep learning software testing.

A Course Scheduling Multi-Agent System using Learning Evaluation Analysis (학습 평가 분석을 이용한 웹기반 코스 스케쥴링 멀티 에이전트 시스템)

  • Park, Jae-Pyo;Yoo, Kwang-Hyoung;Lee, Jong-Hee;Jeon, Moon-Seok
    • The Journal of Korean Association of Computer Education / v.7 no.1 / pp.97-106 / 2004
  • Recently, learner demand for customized courseware has increased, and the need for efficient, automated education agents in web-based instruction is therefore recognized. In this paper we propose a learner-oriented multi-agent system for course scheduling that uses a weakness analysis algorithm. First, the proposed system analyzes the learner's evaluation results and calculates learning accomplishment. From this accomplishment, the multi-agent system schedules a suitable course for the learner. The learner achieves active and complete learning through repeated, suitable courses.

Cooperative Multi-agent Reinforcement Learning on Sparse Reward Battlefield Environment using QMIX and RND in Ray RLlib

  • Minkyoung Kim
    • Journal of the Korea Society of Computer and Information / v.29 no.1 / pp.11-19 / 2024
  • Multi-agent systems can be utilized in various real-world cooperative environments such as battlefield engagements and unmanned transport vehicles. In the context of battlefield engagements, where dense reward design faces challenges due to limited domain knowledge, it is crucial to consider situations that are learned through explicit sparse rewards. This paper explores the collaborative potential among allied agents in a battlefield scenario. Utilizing the Multi-Robot Warehouse Environment (RWARE) as a sparse reward environment, we define analogous problems and establish evaluation criteria. Constructing a learning environment with the QMIX algorithm from the reinforcement learning library Ray RLlib, we enhance the Agent Network of QMIX and integrate Random Network Distillation (RND). This enables the extraction of patterns and temporal features from partial observations of agents, confirming the potential for improving the acquisition of sparse reward experiences through intrinsic rewards.
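
The intrinsic-reward idea behind Random Network Distillation can be illustrated with a minimal sketch. It is not the paper's Ray RLlib setup: the class names `LinearNet` and `RND`, the use of plain linear maps in place of neural networks, and the learning rate are all assumptions made for illustration. A fixed random target is imitated by a trained predictor, and the prediction error serves as a novelty bonus that shrinks for observations the agent has already seen.

```python
import random


class LinearNet:
    """Tiny linear map standing in for a neural network (a simplification)."""

    def __init__(self, n_in, n_out, rng):
        self.w = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]

    def forward(self, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in self.w]


class RND:
    """Hypothetical minimal RND: intrinsic reward = predictor error on the target."""

    def __init__(self, n_in, n_out=4, lr=0.05, seed=0):
        rng = random.Random(seed)
        self.target = LinearNet(n_in, n_out, rng)     # random, never trained
        self.predictor = LinearNet(n_in, n_out, rng)  # trained to imitate the target
        self.lr = lr

    def intrinsic_reward(self, obs):
        t = self.target.forward(obs)
        p = self.predictor.forward(obs)
        return sum((pi - ti) ** 2 for pi, ti in zip(p, t)) / len(t)

    def train(self, obs):
        # One gradient step on the squared error (constant factor folded into lr).
        t = self.target.forward(obs)
        p = self.predictor.forward(obs)
        for row, pi, ti in zip(self.predictor.w, p, t):
            err = pi - ti
            for j, xj in enumerate(obs):
                row[j] -= self.lr * err * xj


rnd = RND(n_in=3)
seen, novel = [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]
for _ in range(50):
    rnd.train(seen)
# The familiar observation now earns a much smaller bonus than the novel one.
print(rnd.intrinsic_reward(seen), rnd.intrinsic_reward(novel))
```

In a sparse-reward setting this bonus would be added to the environment reward, so agents are nudged toward observations they have rarely encountered.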

Dynamic Positioning of Robot Soccer Simulation Game Agents using Reinforcement learning

  • Kwon, Ki-Duk;Cho, Soo-Sin;Kim, In-Cheol
    • Proceedings of the Korea Intelligent Information System Society Conference / 2001.01a / pp.59-64 / 2001
  • The robot soccer simulation game is a dynamic multi-agent environment. In this paper we suggest a new reinforcement learning approach to each agent's dynamic positioning in such a dynamic environment. Reinforcement learning is a form of machine learning in which an agent learns, from indirect and delayed reward, an optimal policy for choosing sequences of actions that produce the greatest cumulative reward. Reinforcement learning therefore differs from supervised learning in that no input-output pairs are presented as training examples. Furthermore, model-free reinforcement learning algorithms like Q-learning do not require defining or learning any model of the surrounding environment. Nevertheless, such an algorithm can learn the optimal policy if the agent can visit every state-action pair infinitely often. However, the biggest problem of monolithic reinforcement learning is that its straightforward applications do not scale up to more complex environments because of the intractably large state space. To address this problem, we suggest Adaptive Mediation-based Modular Q-Learning (AMMQL) as an improvement of the existing Modular Q-Learning (MQL). While simple modular Q-learning combines the results from each learning module in a fixed way, AMMQL combines them more flexibly by assigning a different weight to each module according to its contribution to rewards. Therefore, in addition to effectively resolving the problem of the large state space, AMMQL can show higher adaptability to environmental changes than pure MQL. This paper introduces the concept of AMMQL and presents details of its application to dynamic positioning of robot soccer agents.
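
The contrast between fixed and weighted combination of modules can be sketched as follows. The abstract does not give AMMQL's mediation rule, so `reweight` is a hypothetical placeholder that simply moves weights toward each module's normalized credit. The names `Module`, `combine`, and `greedy_action` are likewise assumptions, and each module is an ordinary tabular Q-learner over its own slice of the state.

```python
from collections import defaultdict


class Module:
    """One tabular Q-learning module that sees only its own part of the state."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9):
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.alpha, self.gamma = alpha, gamma

    def update(self, s, a, r, s_next):
        # Standard Q-learning update toward the bootstrapped target.
        target = r + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (target - self.q[s][a])


def combine(modules, weights, states):
    """Weighted sum of module Q-values; equal weights give a fixed, MQL-style sum."""
    n_actions = len(modules[0].q[states[0]])
    totals = [0.0] * n_actions
    for m, w, s in zip(modules, weights, states):
        for a in range(n_actions):
            totals[a] += w * m.q[s][a]
    return totals


def greedy_action(modules, weights, states):
    totals = combine(modules, weights, states)
    return max(range(len(totals)), key=totals.__getitem__)


def reweight(weights, credits, rate=0.5):
    """Hypothetical mediation rule: shift weights toward normalized per-module credit."""
    total = sum(credits)
    if total <= 0:
        return list(weights)
    return [(1 - rate) * w + rate * c / total for w, c in zip(weights, credits)]
```

With equal weights the modules vote in a fixed way; after `reweight`, a module that has contributed more to reward can dominate the action choice, which is the kind of adaptability the abstract attributes to AMMQL.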

Generating Cooperative Behavior by Multi-Agent Profit Sharing on the Soccer Game

  • Miyazaki, Kazuteru;Terada, Takashi;Kobayashi, Hiroaki
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2003.09a / pp.166-169 / 2003
  • Reinforcement learning is a kind of machine learning. It aims to adapt an agent to a given environment using rewards and penalties as clues. Q-learning [8], a representative reinforcement learning system, treats a reward and a penalty at the same time, which raises the problem of how to decide appropriate reward and penalty values. The Penalty Avoiding Rational Policy Making algorithm (PARP) [4] and Penalty Avoiding Profit Sharing (PAPS) [2] are known as reinforcement learning systems that treat a reward and a penalty independently. Though PAPS is a descendant algorithm of PARP, both PARP and PAPS tend to learn a locally optimal policy. To overcome this, in this paper we propose the Multi Best method (MB), which is PAPS with the multi-start method [5]. MB selects the best policy among several policies learned by PAPS agents. By applying PS, PAPS and MB to a soccer game environment based on SoccerBots [9], we show that MB is the best solution for the soccer game environment.

Mean Field Game based Reinforcement Learning for Weapon-Target Assignment (평균 필드 게임 기반의 강화학습을 통한 무기-표적 할당)

  • Shin, Min Kyu;Park, Soon-Seo;Lee, Daniel;Choi, Han-Lim
    • Journal of the Korea Institute of Military Science and Technology / v.23 no.4 / pp.337-345 / 2020
  • The Weapon-Target Assignment (WTA) problem can be formulated as an optimization problem that minimizes the threat of targets. Existing methods consider the trade-off between optimality and execution time to meet the various mission objectives. We propose a multi-agent reinforcement learning algorithm for WTA based on mean field games to solve the problem in real time with nearly optimal accuracy. The mean field game is a recent method introduced to relieve the curse of dimensionality in multi-agent learning algorithms. In addition, previous reinforcement learning models for WTA generally do not consider weapon interference, which may be critical in real-world operations. Therefore, we modify the reward function to discourage the crossing of weapon trajectories. The feasibility of the proposed method was verified through simulation of a WTA problem with multiple targets in real time, and the proposed algorithm can assign the weapons to all targets without crossing weapon trajectories.
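
A reward that discourages crossing trajectories might be shaped along the following lines. This is a sketch under strong assumptions that are not taken from the paper: trajectories are straight 2D segments from weapon to target, each crossing pair costs a fixed `penalty`, and the function names are hypothetical.

```python
from itertools import combinations


def _orient(p, q, r):
    """Signed area test: positive if r lies to the left of the directed line p -> q."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_cross(a, b, c, d):
    """True if segments ab and cd properly intersect (touching or collinear is ignored)."""
    d1, d2 = _orient(a, b, c), _orient(a, b, d)
    d3, d4 = _orient(c, d, a), _orient(c, d, b)
    return d1 * d2 < 0 and d3 * d4 < 0


def count_crossings(weapons, targets, assignment):
    """Count pairs of weapon-to-target segments that cross each other."""
    segs = [(weapons[w], targets[t]) for w, t in assignment.items()]
    return sum(segments_cross(a, b, c, d) for (a, b), (c, d) in combinations(segs, 2))


def shaped_reward(base_reward, weapons, targets, assignment, penalty=1.0):
    """Base task reward minus a fixed penalty per crossing pair (assumed form)."""
    return base_reward - penalty * count_crossings(weapons, targets, assignment)


weapons = [(0.0, 0.0), (1.0, 0.0)]
targets = [(0.0, 1.0), (1.0, 1.0)]
straight = {0: 0, 1: 1}  # parallel paths, no interference
crossed = {0: 1, 1: 0}   # diagonal paths that intersect
print(shaped_reward(10.0, weapons, targets, straight),
      shaped_reward(10.0, weapons, targets, crossed, penalty=2.0))
```

Under this shaping, two assignments with the same base reward are ranked by how many trajectory pairs interfere, so a learner is steered toward the non-crossing one.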

Improving Dynamic Missile Defense Effectiveness Using Multi-Agent Deep Q-Network Model (멀티에이전트 기반 Deep Q-Network 모델을 이용한 동적 미사일 방어효과 개선)

  • Min Gook Kim;Dong Wook Hong;Bong Wan Choi;Ji Hoon Kyung
    • Journal of Korean Society of Industrial and Systems Engineering / v.47 no.2 / pp.74-83 / 2024
  • The threat of North Korea's long-range firepower is recognized as a typical asymmetric threat, and South Korea is prioritizing the development of a Korean-style missile defense system to defend against it. Previous research modeled North Korean long-range artillery attacks as a Markov Decision Process (MDP) and used Approximate Dynamic Programming as the missile defense algorithm, but because of its limitations there is interest in applying deep reinforcement learning techniques that incorporate deep learning. In this paper, we aim to develop a missile defense algorithm by applying a modified DQN with multi-agent deep reinforcement learning techniques. We examine whether an efficient missile defense system can be implemented given the attack patterns seen in recent wars, including how effectively it responds to enemy missile attacks, and show that the policy learned through deep reinforcement learning yields superior outcomes.