Search | Korea Science

Moon, Young-Joon;Lee, Jae-Hoon;Park, Joo-Young
- Journal of the Korean Institute of Intelligent Systems
- /
- v.19 no.4
- /
- pp.519-524
- /
- 2009
Recently, reinforcement learning methods have drawn much interests in the area of machine learning. Dominant approaches in researches for the reinforcement learning include the value-function approach, the policy search approach, and the actor-critic approach, among which pertinent to this paper are algorithms studied for problems with continuous states and continuous actions along the line of the actor-critic strategy. In particular, this paper focuses on presenting a method combining the so-called ACFRL(actor-critic fuzzy reinforcement learning), which is an actor-critic type reinforcement learning based on fuzzy theory, together with the RLS-NAC which is based on the RLS filters and natural actor-critic methods. The presented method is applied to a control problem for crawling robots, and some results are reported from comparison of learning performance.
https://doi.org/10.5391/JKIIS.2009.19.4.519 인용 PDF KSCI

Jeong, Gyu-Baek;Mun, Yeong-Jun;Park, Ju-Yeong
- Proceedings of the Korean Institute of Intelligent Systems Conference
- /
- 2007.11a
- /
- pp.163-166
- /
- 2007
최근에 국내외의 인공지능 분야에서는, 강화학습(reinforcement learning)에 관한 연구가 활발히 진행되고 있다. 본 논문에서는 능동형 현가장치(active-suspension)의 제어를 위하여 RLS 기반 NAC(natural actor-critic)을 활용한 강화학습 기법을 적용해보고, 그 성능을 시뮬레이션을 통해 확인해본다.
PDF

Chu B.;Kim D.;Hong D.;Park J.;Chung J.T.;Kim T.H.
- Proceedings of the Korean Society of Precision Engineering Conference
- /
- 2006.05a
- /
- pp.53-54
- /
- 2006
The main purpose of tunnel ventilation system is to maintain CO pollutant and VI (visibility index) under an adequate level to provide drivers with safe driving condition. Moreover, it is necessary to minimize power consumption used to operate ventilation system. To achieve the objectives, the control algorithm used in this research is reinforcement teaming (RL) method. RL is a goal-directed teaming of a mapping from situations to actions. The goal of RL is to maximize a reward which is an evaluative feedback from the environment. Constructing the reward of the tunnel ventilation system, two objectives listed above are included. RL algorithm based on actor-critic architecture and natural gradient method is adopted to the system. Also, the recursive least-squares (RLS) is employed to the learning process to improve the efficiency of the use of data. The simulation results performed with real data collected from existing tunnel are provided in this paper. It is confirmed that with the suggested controller, the pollutant level inside the tunnel was well maintained under allowable limit and the performance of energy consumption was improved compared to conventional control scheme.
PDF