A Strategy for improving Performance of Q-learning with Prediction Information

Lee, Choong-Hyeon; Um, Ky-Hyun; Cho, Kyung-Eun

Journal of Korea Game Society (한국게임학회 논문지)

Volume 7 Issue 4 / Pages 105-116 / 2007 / 1598-4540 (pISSN) / 2287-8211 (eISSN)

Korea Game Society (한국게임학회)

Lee, Choong-Hyeon (Wisdomain Co., Ltd.)
Um, Ky-Hyun (Dept. of Game & Multimedia Engineering, Dongguk University)
Cho, Kyung-Eun (Dept. of Game & Multimedia Engineering, Dongguk University)

Published : 2007.12.31

Abstract

Agent learning is increasingly useful in game environments, but it takes a long time for learning to produce satisfactory results in a game, so methods that shorten the learning time are needed. In this paper, we present a strategy that uses prediction information to improve the learning performance of Q-learning. The strategy refers to the action chosen in each state of the Q-table during Q-learning, stores the referenced value in the P-table of a prediction module, and then searches that table for the values that occur most frequently. These values are then used to update the secondary reward value from the Q-table. Our experiments show that, from the midpoint of the learning experiments onward, the proposed approach improves efficiency by 9% on average over the conventional method, and that the more actions are available for transitions within a state, the greater the performance gain.
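The abstract does not give the update rule, so the following is only a minimal sketch of one way the idea could look in tabular Q-learning, not the authors' implementation. It assumes that the P-table is a per-state count of chosen actions and that the "secondary reward" is the bootstrapped future-value term of the Q-learning target, which is blended with the Q-value of the most frequently chosen action in the next state. The names new_tables, most_frequent_action, q_update and the blending weight beta are hypothetical.

```python
from collections import defaultdict


def new_tables(n_actions):
    """Create an empty Q-table (values) and P-table (action counts)."""
    q_table = defaultdict(lambda: [0.0] * n_actions)
    p_table = defaultdict(lambda: [0] * n_actions)
    return q_table, p_table


def most_frequent_action(p_table, state):
    """Return the action chosen most often in `state`, or None if unseen."""
    counts = p_table[state]
    if sum(counts) == 0:
        return None
    return max(range(len(counts)), key=lambda a: counts[a])


def q_update(q_table, p_table, s, a, r, s_next,
             alpha=0.1, gamma=0.9, beta=0.5):
    """One Q-learning step with an assumed prediction-based future term.

    Assumption (not from the paper): the future term mixes the usual
    max over next-state Q-values with the Q-value of the action that
    the P-table predicts for the next state, weighted by `beta`.
    """
    # Record the chosen action in the P-table.
    p_table[s][a] += 1

    best_next = max(q_table[s_next])
    predicted = most_frequent_action(p_table, s_next)
    if predicted is None:
        # No prediction available yet: fall back to standard Q-learning.
        future = best_next
    else:
        future = (1 - beta) * best_next + beta * q_table[s_next][predicted]

    # Standard temporal-difference update toward the modified target.
    q_table[s][a] += alpha * (r + gamma * future - q_table[s][a])
    return q_table[s][a]
```

With beta set to 0 the sketch reduces to ordinary Q-learning, which makes it easy to compare the two variants on the same environment.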
