Labeling Q-learning with SOM

  • Published: 2002.10.01

Abstract

Reinforcement learning (RL) is a machine learning method in which an agent autonomously learns an action-selection policy through interaction with its environment. Early RL research was limited to problems whose environments were assumed to be Markov decision processes (MDPs). In practical problems, however, the agent suffers from incomplete perception: it observes the state of the environment, but these observations carry only incomplete information about that state. This problem is formally modeled as a partially observable MDP (POMDP). One possible approach to POMDPs is to use historical information to estimate states. The problem of these approaches is how t..

Keywords