Integrated Search | Korea Science

A Learning AI Algorithm for Poker with Embedded Opponent Modeling

Kim, Seong-Gon; Kim, Yong-Gi
- International Journal of Fuzzy Logic and Intelligent Systems, Vol. 10, No. 3, pp. 170-177, 2010
Poker is a game of imperfect information in which competing players must manage multiple risk factors stemming from unknown information while making the best decision to win, which makes it an interesting test-bed for artificial intelligence research. This paper introduces a new learning AI algorithm with embedded opponent modeling for such situations and applies it to a poker program. The AI is based on several graphs whose nodes represent inputs; the algorithm learns the optimal decision by updating the weights of the edges connecting these nodes and returns a probability for each action the graphs represent.
https://doi.org/10.5391/IJFIS.2010.10.3.170

Policy Modeling for Efficient Reinforcement Learning in Adversarial Multi-Agent Environments

권기덕; 김인철
- Journal of KIISE: Software and Applications (한국정보과학회논문지:소프트웨어및응용), Vol. 35, No. 3, pp. 179-188, 2008
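A key problem in multi-agent reinforcement learning is how an agent can learn its optimal behavior policy through trial-and-error interaction in a dynamic environment that contains other agents whose actions can affect its own task performance. Most existing work on multi-agent reinforcement learning either applies single-agent, MDP-based reinforcement learning techniques largely unchanged or, even when a separate model of the other agents is used, relies on information or assumptions about those agents that are unrealistic. This paper formalizes the basic concepts underlying multi-agent reinforcement learning and, on that basis, compares the characteristics and limitations of existing work. It then introduces a new behavior policy model and describes a reinforcement learning method that uses it. The proposed method learns a model of the opponent agent's behavior policy rather than the model of the opponent's Q evaluation function that earlier opponent-modeling work mainly attempted. It also improves learning efficiency by using a comparatively simple policy model instead of models such as finite state automata or Markov chains, which are expressive but demand considerable time and effort to learn. Finally, the paper introduces the Cat and Mouse game, a representative adversarial multi-agent environment, and uses it as a testbed for comparative experiments whose results are used to analyze the effectiveness of the proposed policy-model-based multi-agent reinforcement learning.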
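The abstract does not specify the form of the "comparatively simple" opponent policy model. As an illustration only, the sketch below assumes one common simple choice: per-state action counts with add-one smoothing. The class name `OpponentPolicyModel` and its methods are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class OpponentPolicyModel:
    """Hypothetical count-based model of an opponent's behavior policy.

    Assumption: the opponent's policy is estimated as the smoothed
    frequency of each action observed in each state. This is not the
    paper's model, only a simple stand-in for the idea.
    """

    def __init__(self, actions):
        self.actions = list(actions)
        # counts[state][action] = times the opponent took `action` in `state`
        self.counts = defaultdict(lambda: {a: 0 for a in self.actions})

    def observe(self, state, action):
        """Record one observed opponent action in the given state."""
        self.counts[state][action] += 1

    def prob(self, state, action):
        """Estimated probability of `action` in `state` (add-one smoothing)."""
        state_counts = self.counts[state]
        total = sum(state_counts.values())
        return (state_counts[action] + 1) / (total + len(self.actions))


model = OpponentPolicyModel(["left", "right"])
for _ in range(3):
    model.observe("near_wall", "left")
p_left = model.prob("near_wall", "left")
```

A learner could weight its own action values by these estimated probabilities; how the paper combines the policy model with reinforcement learning is not stated in the abstract.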

A Study on the Dynamic and Impact Analysis of Side Kick in Taekwondo

이중현; 한규현; 이현승; 이은엽; 이영신
- Transactions of the Korean Society of Mechanical Engineers A (대한기계학회논문집A), Vol. 32, No. 1, pp. 83-90, 2008
Taekwondo is a martial art and sport that uses the hands and feet for attack and defense. Its basic motions comprise breaking, competition, and poomsae motions. Among the competition motions, the side kick produces a larger impact force than other kinds of kicks. The side kick with the front foot is made in two steps. In the first step, the front foot is stretched forward from the back-stance free-fighting position. In the second step, the rear foot follows simultaneously. The kick is then executed while the entire body weight rests on the rear foot. In this paper, an impact analysis of a human model in the hitting posture is carried out. ADAMS/LifeMOD is used for hitting modeling and simulation. The simulation creates a human model that hits the opponent. As a result, the dynamic analysis of human muscle is presented.
https://doi.org/10.3795/KSME-A.2008.32.1.083

A Raid-Type War-Game Model Based on a Discrete Multi-Weapon Lanchester's Law

Baik, Seung-Won
- Management Science and Financial Engineering, Vol. 19, No. 2, pp. 31-36, 2013
We propose a war-game model appropriate for raid-type warfare in which, a priori, the maneuver of the attacker is relatively certain. The model is based on a multi-weapon extension of Lanchester's law. Instead of a continuous-time dynamic game with the differential equations from Lanchester's law, however, we adopt a multi-period model relying on a time-discretization of Lanchester's law. Despite the obvious limitation that the two players move only at discrete time epochs, the pragmatic model has a manifold justification. The existence of an equilibrium is readily established by its equivalence to a finite zero-sum game, whose equilibrium existence is, in turn, well known to be none other than LP duality. It follows that the war-game model dictates optimal strategies for both players under the assumption that any strategy choice of each player will be met by a best response from her opponent. The model, therefore, provides a sound basis for finding an efficient reinforcement of a defense system that guarantees peaceful equilibria.
https://doi.org/10.7737/MSFE.2013.19.2.031
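The abstract does not give the discretized equations. For orientation, the sketch below assumes the textbook Lanchester square law, generalized to several weapon types and advanced one period at a time; the function name `lanchester_step` and the attrition matrices `a` and `b` are hypothetical and may differ from the paper's formulation.

```python
def lanchester_step(x, y, a, b, dt=1.0):
    """Advance both forces by one period of a discretized square law.

    Assumed model (not taken from the paper):
      x[i]    strength of attacker weapon type i
      y[j]    strength of defender weapon type j
      a[j][i] rate at which defender type j destroys attacker type i
      b[i][j] rate at which attacker type i destroys defender type j
    Strengths are clamped at zero so they never go negative.
    """
    new_x = [
        max(0.0, x[i] - dt * sum(a[j][i] * y[j] for j in range(len(y))))
        for i in range(len(x))
    ]
    new_y = [
        max(0.0, y[j] - dt * sum(b[i][j] * x[i] for i in range(len(x))))
        for j in range(len(y))
    ]
    return new_x, new_y


# One attacker type against one defender type, for a single period.
attackers, defenders = lanchester_step([10.0], [8.0], a=[[0.1]], b=[[0.2]])
```

In a multi-period game of the kind the abstract describes, each player's decisions would enter through how strength is allocated across weapon types in each period; that allocation step is left out here.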

Modeling and Simulation on One-vs-One Air Combat with Deep Reinforcement Learning

문일철; 정민재; 김동준
- Journal of the Korea Society for Simulation (한국시뮬레이션학회논문지), Vol. 29, No. 1, pp. 39-46, 2020
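Applying artificial intelligence (AI) to combat situations has been a major interest in the defense field over the past ten years. Such applications require training AI combat agents, which in turn requires realistic simulation. This paper describes a case in which an AI agent was trained on an air combat model of an aerial weapon system with hardware-level realism. In particular, the AI was trained on how to pursue an enemy in a guns-only air combat situation. The paper builds a realistic air combat simulator and presents the results of training the agent's behavior with reinforcement learning. As a result of training, the agent successfully learned to use lead pursuit, achieving shorter engagement times and higher rewards.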
https://doi.org/10.9709/JKSS.2020.29.1.039

