Proceedings of the IEEK Conference (대한전자공학회 학술대회논문집)
- 2001.06c, pp. 13-16
Reinforcement learning Speedup method using Q-value Initialization
Q-value Initialization을 이용한 Reinforcement Learning Speedup Method
Abstract
In reinforcement learning, Q-learning converges quite slowly to a good policy, because searching for the goal state takes a very long time in a large stochastic domain. I therefore propose a speedup method for model-free reinforcement learning that uses Q-value initialization. The method learns a naive model of the domain and constructs boundaries around the goal state. Using these boundaries, it assigns initial Q-values to the state-action pairs and then runs Q-learning from those initial values. The initial Q-values guide the agent toward the goal state in the early stages of learning, so that Q-learning updates Q-values efficiently. The method therefore saves the exploration time spent searching for the goal state and performs better than plain Q-learning. I present the Speedup Q-learning algorithm to implement this method. The algorithm is evaluated in a grid-world domain and compared to Q-learning.
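The abstract does not give the exact boundary construction, so the following is only a minimal sketch of the idea on a deterministic grid world: the "naive model" is assumed to be a breadth-first distance map from the goal, and each state-action pair is initialized to a value that decays with the successor's distance from the goal (a hypothetical scheme; the constants, grid size, and initialization formula are all illustrative, not the paper's).

```python
import random
from collections import deque

random.seed(0)  # reproducible runs for this sketch

SIZE = 5
GOAL = (SIZE - 1, SIZE - 1)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
GAMMA, ALPHA, EPS = 0.9, 0.5, 0.1

def step(state, action):
    """Deterministic grid transition; reward 1 only on reaching the goal."""
    nx = min(max(state[0] + action[0], 0), SIZE - 1)
    ny = min(max(state[1] + action[1], 0), SIZE - 1)
    nxt = (nx, ny)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def distance_map():
    """Hypothetical 'naive model': BFS distances from the goal state."""
    dist, frontier = {GOAL: 0}, deque([GOAL])
    while frontier:
        s = frontier.popleft()
        for a in ACTIONS:
            n, _, _ = step(s, a)
            if n not in dist:
                dist[n] = dist[s] + 1
                frontier.append(n)
    return dist

def init_q(dist):
    """Initial Q-values that grow as the action's successor nears the goal."""
    return {(s, a): GAMMA ** dist[step(s, a)[0]]
            for s in dist for a in ACTIONS}

def q_learning(q, episodes=200, max_steps=100):
    """Standard epsilon-greedy Q-learning started from the given Q-table."""
    for _ in range(episodes):
        s = (0, 0)
        for _ in range(max_steps):
            a = (random.choice(ACTIONS) if random.random() < EPS
                 else max(ACTIONS, key=lambda x: q[(s, x)]))
            nxt, r, done = step(s, a)
            target = r + (0.0 if done else
                          GAMMA * max(q[(nxt, b)] for b in ACTIONS))
            q[(s, a)] += ALPHA * (target - q[(s, a)])
            s = nxt
            if done:
                break
    return q

q_table = q_learning(init_q(distance_map()))
```

Because the initial values already point roughly toward the goal, the agent reaches it from the very first episodes instead of wandering randomly, which is the claimed source of the speedup over Q-learning started from zero values.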