Integrated Search | Korea Science

Multi-Agent Deep Reinforcement Learning for Fighting Game: A Comparative Study of PPO and A2C

Yoshua Kaleb Purwanto;Dae-Ki Kang
- International Journal of Internet, Broadcasting and Communication / Vol. 16, No. 3 / pp. 192-198 / 2024
This paper investigates the application of multi-agent deep reinforcement learning in the fighting game Samurai Shodown using the Proximal Policy Optimization (PPO) and Advantage Actor-Critic (A2C) algorithms. Initially, agents are trained separately for 200,000 timesteps using Convolutional Neural Network (CNN) and Multi-Layer Perceptron (MLP) with LSTM networks. PPO demonstrates superior performance early on with stable policy updates, while A2C shows better adaptation and higher rewards over extended training periods, culminating in A2C outperforming PPO after 1,000,000 timesteps. These findings highlight PPO's effectiveness for short-term training and A2C's advantages in long-term learning scenarios, emphasizing the importance of algorithm selection based on training duration and task complexity. The code is available at https://github.com/Lexer04/Samurai-Shodown-with-Reinforcement-Learning-PPO.
https://doi.org/10.7236/IJIBC.2024.16.3.192

A Study on the Impact of Multi-Skilled Agents on the Service Quality of Call Centers

진도원;박찬규
- 한국IT서비스학회지 / Vol. 18, No. 3 / pp. 17-35 / 2019
Call centers no longer simply respond to customers' calls; they have developed into a core unit for maintaining competitiveness through services, marketing, and sales. Since the service quality of call centers heavily affects customer satisfaction, organizations have focused on enhancing it by reducing waiting time and increasing service level. One technique for improving the service quality of call centers is to employ multi-skilled agents who can handle more than one type of call. This study deals with three issues relevant to multi-skilled agents. First, we analyze how the way a specific group of agents is allocated to a set of skills affects the performance of call centers. Second, we investigate the relationship between the number of multi-skilled agents and the performance of call centers. Finally, we examine the impact of agent selection rules on the performance of call centers. Two selection rules are compared: the first assigns a call to any available agent at random, while the second assigns a call preferably to single-skilled agents over multi-skilled agents when applicable. Based on simulation experiments, we suggest three implications. First, as the length of cycles in the agent-skill configuration network becomes longer, call centers achieve a higher service level and shorter waiting time. Second, simulation results show that as the proportion of multi-skilled agents increases, the performance of call centers improves. However, most of the improvement is attained when the proportion of multi-skilled agents is relatively low. Finally, the agent selection rules do not significantly affect the call centers' performance, but the rule of preferring single-skilled agents tends to distribute the workload among agents more equally.
https://doi.org/10.9716/KITS.2019.18.3.017
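
The two agent selection rules compared in the abstract above can be illustrated with a minimal sketch. The agent representation (a dict with `name` and `skills`), the function name `select_agent`, and the rule labels `"random"` and `"prefer_single"` are assumptions made for illustration; they are not taken from the paper, whose simulation model is not described in this listing.

```python
import random

def select_agent(available_agents, call_type, rule, rng=random):
    """Pick an available agent for a call under one of two routing rules.

    rule == "random":        any eligible agent, chosen uniformly at random.
    rule == "prefer_single": a single-skilled eligible agent if one exists,
                             otherwise any eligible (multi-skilled) agent.
    Returns None when no available agent has the required skill.
    """
    eligible = [a for a in available_agents if call_type in a["skills"]]
    if not eligible:
        return None
    if rule == "prefer_single":
        single_skilled = [a for a in eligible if len(a["skills"]) == 1]
        if single_skilled:
            eligible = single_skilled
    return rng.choice(eligible)

# Hypothetical agents: one single-skilled, one multi-skilled.
agents = [
    {"name": "A", "skills": {"billing"}},
    {"name": "B", "skills": {"billing", "tech"}},
]
billing_pick = select_agent(agents, "billing", "prefer_single")
tech_pick = select_agent(agents, "tech", "prefer_single")
no_pick = select_agent(agents, "sales", "random")
```

Under the `"prefer_single"` rule the multi-skilled agent B stays free for `tech` calls whenever A can take a `billing` call, which is one way to read the abstract's remark that this rule spreads workload more evenly.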

The Architecture Supporting Agent's Coordination in Multi-Agent Systems

이승연;박수용
- 한국정보과학회:학술대회논문집 / 한국정보과학회 2000년도 가을 학술발표논문집 Vol.27 No.2 (1) / pp. 355-357 / 2000
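In an effort to solve the complex problems that arise in the real world, interest in building multi-agent systems is growing. To develop systems that solve various kinds of distributed artificial intelligence problems on the basis of an abstract unit called the agent and the interactions among agents, this study proposes an architecture that supports coordination, an important element in developing multi-agent-oriented software. We analyze the problem domain, identify the characteristics of multi-agent systems, and propose an architecture process that supports the coordination of system elements. We also apply it to an intelligent traffic information system.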

Development of an Integrated Design System Based on Multi-Agent

이재경;박성환;이종원;한승호;한형석
- 한국정밀공학회지 / Vol. 22, No. 1 / pp. 14-18 / 2005

Group Average-consensus and Group Formation-consensus for First-order Multi-agent Systems

김재만;박진배;최윤호
- 제어로봇시스템학회논문지 / Vol. 20, No. 12 / pp. 1225-1230 / 2014
This paper investigates the group average-consensus and group formation-consensus problems for first-order multi-agent systems. The control protocol for group consensus is designed by considering the positive adjacency elements. Because the positive adjacency elements between groups prevent each intra-group Laplacian matrix from satisfying the in-degree balance condition, we decompose the Laplacian matrix into an intra-group Laplacian matrix and an inter-group Laplacian matrix. Moreover, average matrices are used in the control protocol to analyze the stability of multi-agent systems with a fixed and undirected communication topology. Using graph theory and a Lyapunov functional, stability analysis is performed for group average-consensus and group formation-consensus, respectively. Finally, simulation results are presented to validate the effectiveness of the proposed control protocol for group consensus.
https://doi.org/10.5302/J.ICROS.2014.14.0087
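
As background for the abstract above, the sketch below simulates the standard single-group average-consensus protocol for first-order agents, x' = -Lx, on a fixed undirected graph. It is not the paper's group-consensus protocol (the intra-group/inter-group Laplacian decomposition is not reproduced here); the path-graph topology, the Euler step size, and the function names are assumptions chosen for illustration.

```python
def laplacian(adjacency):
    """Graph Laplacian L = D - A for a symmetric adjacency matrix."""
    n = len(adjacency)
    return [
        [sum(adjacency[i]) if i == j else -adjacency[i][j] for j in range(n)]
        for i in range(n)
    ]

def simulate_consensus(x0, adjacency, step=0.1, iterations=200):
    """Forward-Euler simulation of the first-order protocol x' = -L x."""
    lap = laplacian(adjacency)
    n = len(x0)
    x = list(x0)
    for _ in range(iterations):
        x = [
            x[i] - step * sum(lap[i][j] * x[j] for j in range(n))
            for i in range(n)
        ]
    return x

# Assumed topology: an undirected path graph on four agents (connected).
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
initial_states = [1.0, 3.0, 5.0, 7.0]
final_states = simulate_consensus(initial_states, adjacency)
```

Because the graph is connected and undirected, the Laplacian is symmetric with zero row sums, so the sum of the states is preserved at every step and all agents approach the average of the initial states (4.0 for the values above).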

Multi-agent Control for Wind Hybrid Power Systems

강승진;고희상;부창진;김호찬
- 한국산학기술학회논문지 / Vol. 15, No. 12 / pp. 7451-7458 / 2014
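For a stand-alone wind hybrid power system, this paper proposes a system model and a multi-agent-based control method for operating the system systematically under various conditions. The multi-agent control is a new type of hybrid control method covering a wind generator, a diesel generator, a battery, and a load, and the operation of the wind hybrid power system is divided into 14 modes according to the wind speed and the battery's state of charge. Simulation-based performance evaluation shows that the proposed algorithm can operate a stand-alone wind hybrid power system efficiently even when the wind speed varies widely.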
https://doi.org/10.5762/KAIS.2014.15.12.7451

A Course Scheduling Multi-Agent System using Learning Evaluation Analysis

박재표;이광형;이종희;전문석
- 컴퓨터교육학회논문지 / Vol. 7, No. 1 / pp. 97-106 / 2004
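Demand for courseware tailored to learners' needs has recently been increasing, and the need for efficient, automated educational agents in web-based education systems is accordingly being recognized. This paper proposes a learner-centered course scheduling multi-agent system that uses a weakness analysis algorithm. The proposed system first analyzes the learner's learning evaluation results and computes the learner's level of achievement. It then applies this achievement level to the agent's schedule to provide a course suited to the learner, who follows the course and carries out active mastery learning through repeated study matched to his or her ability.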

Conflict Management in Planning phase of Remodeling Project through Multi-Agent based on Fuzzy Inference

박지은;유정호
- 한국건축시공학회:학술대회논문집 / 한국건축시공학회 2015년도 춘계 학술논문 발표대회 / pp. 202-203 / 2015
To promote a remodeling project, it is important to obtain the apartment residents' consent. This consent is a significant variable in determining whether the project progresses smoothly from the planning stage, in which the association establishment committee is set up, to the stage in which the association is established. On average, the planning phase, which precedes the construction phase of remodeling, takes about 1 to 1.6 years. Obtaining residents' consent in the planning phase is therefore a very important issue. In this research, we focus on residents' opinions and propose a way to resolve conflict by gathering those opinions so that the remodeling project can proceed. In a particular remodeling scenario, the residents concerned are represented as agents that coordinate efficiently to reduce the total duration of decision making. We therefore propose a multi-agent approach based on fuzzy inference to simulate decision-making behavior in a remodeling project effectively. With this method, the optimal alternative is selected by considering each agent's attributes, which are represented by fuzzy sets. This research will be developed further toward a concrete fuzzy-inference-based multi-agent system that considers all stakeholders in a remodeling project.

Opportunistic Spectrum Access with Discrete Feedback in Unknown and Dynamic Environment: A Multi-agent Learning Approach

Gao, Zhan;Chen, Junhong;Xu, Yuhua
- KSII Transactions on Internet and Information Systems (TIIS) / Vol. 9, No. 10 / pp. 3867-3886 / 2015
This article investigates the problem of opportunistic spectrum access in a dynamic environment in which the signal-to-noise ratio (SNR) is time-varying. Unlike existing work on continuous feedback, we consider more practical scenarios in which the transmitter receives an Acknowledgment (ACK) if the received SNR is larger than the required threshold, and otherwise a Non-Acknowledgment (NACK). That is, the feedback is discrete. Several applications with different threshold values are also considered in this work. The channel selection problem is formulated as a non-cooperative game, which is then proved to be a potential game with at least one pure-strategy Nash equilibrium. Following this, a multi-agent Q-learning algorithm is proposed that converges to Nash equilibria of the game. Furthermore, opportunistic spectrum access with multiple discrete feedbacks is also investigated. Finally, the simulation results verify that the proposed multi-agent Q-learning algorithm is applicable both to situations with binary feedback and to those with multiple discrete feedbacks.
https://doi.org/10.3837/tiis.2015.10.006
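
To make the binary ACK/NACK feedback in the abstract above concrete, the sketch below runs a generic stateless, epsilon-greedy Q-learning loop for several users choosing channels. It is not the algorithm proposed in the paper. The channel model is an assumption: a user receives an ACK only if it is alone on its channel and a Bernoulli draw with that channel's success probability succeeds, standing in for "SNR above threshold". All names and parameter values are hypothetical.

```python
import random

def run_learning(num_users=2, success_prob=(0.9, 0.6, 0.3),
                 rounds=5000, alpha=0.05, epsilon=0.1, seed=0):
    """Stateless multi-user Q-learning driven by binary ACK/NACK rewards."""
    rng = random.Random(seed)
    num_channels = len(success_prob)
    # q[u][c] estimates user u's ACK rate on channel c.
    q = [[0.0] * num_channels for _ in range(num_users)]
    for _ in range(rounds):
        choices = []
        for u in range(num_users):
            if rng.random() < epsilon:
                # Explore: pick a channel uniformly at random.
                choices.append(rng.randrange(num_channels))
            else:
                # Exploit: pick the channel with the highest current estimate.
                choices.append(max(range(num_channels), key=lambda c: q[u][c]))
        for u, c in enumerate(choices):
            collided = choices.count(c) > 1
            ack = (not collided) and rng.random() < success_prob[c]
            reward = 1.0 if ack else 0.0  # ACK -> 1, NACK -> 0
            q[u][c] += alpha * (reward - q[u][c])
    return q

q_values = run_learning()
```

Each update moves the estimate a fraction `alpha` toward the observed 0/1 reward, so every entry of `q_values` stays in [0, 1] and can be read as a running estimate of the ACK rate that user sees on that channel.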

Explicit Dynamic Coordination Reinforcement Learning Based on Utility

Si, Huaiwei;Tan, Guozhen;Yuan, Yifu;Peng, Yanfei;Li, Jianping
- KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 3 / pp. 792-812 / 2022
Multi-agent systems often need coordination in order to learn a task more effectively. Although the introduction of deep learning has addressed the state space problem, multi-agent learning remains infeasible because of the joint action space. Large-scale joint action spaces can be made sparse through an implicit or explicit coordination structure, which ensures reasonable coordinated actions. In general, a multi-agent system is dynamic, which makes the relations among agents and the coordination structure dynamic as well. An explicit coordination structure can therefore better represent the coordinative relationships among agents and achieve better coordination between them. Inspired by the maximization of social group utility, we dynamically construct a factor graph as an explicit coordination structure that expresses the coordinative relationships according to the utility among agents, and we estimate the joint action values based on local utility transfer among factor graphs. We present the application of these techniques to multiple intelligent vehicle systems, where the state and action spaces are problematic and there are too many interactions among agents. The results on the multiple intelligent vehicle systems demonstrate the efficiency and effectiveness of our proposed methods.
https://doi.org/10.3837/tiis.2022.03.003

Search results: 1,001 items; processing time: 0.026 s
