Institute of Control, Robotics and Systems: Conference Proceedings
- 2001.10a / Pages 22.2-22 / 2001
Flexible Labeling Mechanism in LQ-learning for Maze Problems
- Lee, Haeyeon (Tohoku Univ.) ;
- Hiroyuki Kamaya (Tohoku Univ.) ;
- Kenichi Abe (Tohoku Univ.)
- Published : 2001.10.01
Abstract
Recently, Reinforcement Learning (RL) methods for MDPs have been extended and applied to POMDP problems, and hierarchical RL methods are now widely studied. However, these methods have the drawback that learning time and memory are consumed merely to maintain the hierarchical structure, even when it is not necessary. In contrast, our previously proposed "Labeling Q-learning" (LQ-learning) has no hierarchical structure, but adopts a characteristic internal memory mechanism. Namely, the LQ-learning agent perceives a state as a pair of an observation and its label, which lets the agent distinguish states that look identical but are in fact different. So to speak, at each step $t$, we define a new type of perception of the environment, $\tilde{o}_t = (o_t, \theta_t)$, where $o_t$ is the raw observation and $\theta_t$ is its label.
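The labeled-perception idea is easy to sketch in code. Below is a minimal illustration, assuming a tabular epsilon-greedy Q-learner and a naive revisit-count labeling rule; all names and hyperparameters (`ACTIONS`, `ALPHA`, `relabel`, `max_label`, etc.) are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import defaultdict

# Minimal sketch of Q-learning over labeled perceptions (obs, label).
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1     # illustrative hyperparameters
ACTIONS = ["up", "down", "left", "right"]  # typical maze action set

# Q-values are indexed by the labeled perception, so two aliased
# observations carrying different labels get separate table entries.
Q = defaultdict(float)

def relabel(obs, visit_counts, max_label=3):
    """Naive labeling rule (an assumption, not the paper's mechanism):
    the label counts how many times this observation has recurred
    within the episode, capped at max_label."""
    visit_counts[obs] += 1
    return min(visit_counts[obs] - 1, max_label)

def choose_action(obs, label):
    """Epsilon-greedy action selection over the labeled perception."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(obs, label, a)])

def update(obs, label, action, reward, next_obs, next_label):
    """One-step Q-learning backup, applied to labeled perceptions."""
    best_next = max(Q[(next_obs, next_label, a)] for a in ACTIONS)
    key = (obs, label, action)
    Q[key] += ALPHA * (reward + GAMMA * best_next - Q[key])

# Per-episode usage sketch:
#   visit_counts = defaultdict(int)   # reset label memory each episode
#   label = relabel(obs, visit_counts)
#   action = choose_action(obs, label)
```

In an episode loop, the agent would relabel each incoming observation, act on the labeled perception, and apply the update to the labeled transition; because aliased maze cells receive different labels on revisits, their Q-values can diverge, which is the point of the labeling mechanism.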