Reinforcement Learning with Clustering for Function Approximation and Rule Extraction

Journal of KIISE:Software and Applications (한국정보과학회논문지:소프트웨어및응용)

Volume 30, Issue 11 / Pages 1054-1061 / 2003 / 1229-6848 (pISSN)

Korean Institute of Information Scientists and Engineers (한국정보과학회)

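이영아 (Department of Computer Engineering, Kyung Hee University);
홍석미 (Department of Computer Engineering, Kyung Hee University);
정태충 (Department of Computer Engineering, Kyung Hee University)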

Published : 2003.12.01

Abstract

Q-Learning, a representative reinforcement learning algorithm, obtains an optimal policy by repeatedly experiencing the state space until the value estimates of all state-action pairs converge. When the state space has many features, or when the features are continuous, the state space grows exponentially. Visiting every state repeatedly and storing a Q-value for every state-action pair in a single table then becomes costly in both time and memory. We introduce Q-Map, a new function approximation method that yields classified policies. As the agent learns online, Q-Map clusters states that represent similar situations and repeatedly updates the clusters to adapt to new experience. States that need finer control than clustering provides are extracted as rules to supplement the clusters. In experiments on a maze environment and the mountain car problem, Q-Map produced classified knowledge that could also be converted easily into rules, an explicit form of knowledge.
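The abstract does not give Q-Map's actual algorithm, so the following is only a minimal sketch of the general idea it describes: Q-values are kept per cluster of similar states instead of per state, clusters adapt online, and each cluster can be read off as a rule. The class name `ClusterQ`, the distance threshold `radius`, the centre adaptation rate `center_rate`, and the nearest-centre assignment are all assumptions made for illustration, not details taken from the paper.

```python
class ClusterQ:
    """Q-learning over clusters of similar states (illustrative sketch, not the paper's Q-Map)."""

    def __init__(self, n_actions, radius, alpha=0.5, gamma=0.9, center_rate=0.1):
        self.n_actions = n_actions
        self.radius = radius            # assumed: max distance for joining an existing cluster
        self.alpha = alpha              # Q-learning step size
        self.gamma = gamma              # discount factor
        self.center_rate = center_rate  # assumed: how fast a centre follows new experience
        self.centers = []               # one centre (list of floats) per cluster
        self.q = []                     # one list of action values per cluster

    def cluster_of(self, state):
        """Return the index of the nearest cluster, creating one if none is close enough."""
        best, best_dist = None, float("inf")
        for i, center in enumerate(self.centers):
            dist = sum((s - c) ** 2 for s, c in zip(state, center)) ** 0.5
            if dist < best_dist:
                best, best_dist = i, dist
        if best is None or best_dist > self.radius:
            self.centers.append(list(state))
            self.q.append([0.0] * self.n_actions)
            return len(self.centers) - 1
        return best

    def update(self, state, action, reward, next_state, done):
        """One Q-learning backup applied to the cluster that contains `state`."""
        k = self.cluster_of(state)
        target = reward
        if not done:
            target += self.gamma * max(self.q[self.cluster_of(next_state)])
        self.q[k][action] += self.alpha * (target - self.q[k][action])
        # Move the cluster centre slightly toward the state just experienced.
        self.centers[k] = [c + self.center_rate * (s - c)
                           for c, s in zip(self.centers[k], state)]

    def rules(self):
        """Read each cluster as a rule: (rounded centre, greedy action)."""
        return [(tuple(round(x, 2) for x in center),
                 max(range(self.n_actions), key=values.__getitem__))
                for center, values in zip(self.centers, self.q)]
```

In use, an agent would call `update` after every environment step and pick actions from `q[cluster_of(state)]`; `rules()` then lists one "near this centre, take this action" pair per cluster. A fixed radius is the simplest possible grouping criterion and was chosen only to keep the sketch short.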

