• Title/Abstract/Keywords: Learning state

Search results: 1,597 items (processing time: 0.033 seconds)

Reinforcement Learning Control using Self-Organizing Map and Multi-layer Feed-Forward Neural Network

  • Lee, Jae-Kang; Kim, Il-Hwan
    • 제어로봇시스템학회:학술대회논문집 / 제어로봇시스템학회 2003년도 ICCAS / pp.142-145 / 2003
  • Many control applications using neural networks need a priori information about the target system, but exact information about a real-world system is impossible to obtain. Several control methods have been proposed to address this problem; reinforcement learning control using a neural network is one of them. Reinforcement learning control needs no a priori information about the target system: it uses the reinforcement signal arising from the interaction between the system and its environment, together with the system's observable states, as input data. However, many such methods take too much time to learn, so we focus on faster learning to make reinforcement learning control practical for real-world use. Two data types are used for reinforcement learning. The first is the reinforcement signal, which takes only two fixed scalar values, one assigned to success and one to failure. The second is the observable state data; a real-world system has infinitely many states, so the number of observable state data is also unbounded, which demands too much learning time. We therefore reduce the number of observable states by classifying states with a Self-Organizing Map, and we use neural dynamic programming for the controller design. An inverted pendulum on a cart is simulated. A failure signal, which occurs when the pendulum angle or cart position leaves the defined control range, serves as the reinforcement signal. The control objective is to keep the pole balanced and the cart centered. Four states (the position and velocity of the cart and the angle and angular velocity of the pole) are used as the state signal. The learning controller consists of a Self-Organizing Map connected in series with two multi-layer feed-forward neural networks.
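
As an illustration (not the paper's exact controller), here is a minimal Python sketch of the state-reduction idea: a Self-Organizing Map quantizes the continuous four-dimensional cart-pole state into a small set of discrete prototype states for a downstream learner. For brevity this is a winner-take-all variant; a full SOM would also update the winner's neighbors.

```python
import numpy as np

class SOMQuantizer:
    """Online vector quantizer in the spirit of a SOM (neighborhood update omitted)."""
    def __init__(self, n_nodes=25, dim=4, lr=0.5, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.uniform(-1.0, 1.0, size=(n_nodes, dim))  # prototype vectors
        self.lr = lr

    def winner(self, x):
        # index of the best-matching unit (closest prototype)
        return int(np.argmin(np.linalg.norm(self.w - x, axis=1)))

    def train_step(self, x):
        # pull the winning prototype toward the observation
        i = self.winner(x)
        self.w[i] += self.lr * (x - self.w[i])
        return i

# Each continuous observation maps to one of n_nodes discrete states,
# so a compact learner can be trained on the quantized state index.
quantizer = SOMQuantizer()
obs = np.array([0.1, -0.3, 0.02, 0.5])  # [cart pos, cart vel, pole angle, angular vel]
state_id = quantizer.train_step(obs)
```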

Function Approximation for Reinforcement Learning using Fuzzy Clustering

  • 이영아; 정경숙; 정태충
    • 정보처리학회논문지B / Vol. 10B, No. 6 / pp.587-592 / 2003
  • Many real-world control problems suitable for reinforcement learning have continuous states or actions. When the values are continuous, the state space becomes enormous, so learning every state-action pair raises memory and time problems. Solving this requires a function approximation method that generalizes to a new state from similar learned states. In this paper, we propose Fuzzy Q-Map, a function approximation for 1-step Q-learning based on fuzzy clustering. Fuzzy Q-Map groups similar states using each cluster's membership degree for the data, selects an action, and references the Q-value. The center and Q-value of the winning fuzzy cluster are updated using the membership degree and the TD (Temporal Difference) error. Applied to the mountain car problem, the proposed method showed fast convergence.
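
A minimal sketch of the Fuzzy Q-Map update described above, assuming Gaussian membership functions (a choice the abstract does not fix): each cluster holds a center and per-action Q-values, the Q-estimate for a state is a membership-weighted mix, and the winner's center and Q-value are adjusted by the membership-scaled TD error.

```python
import numpy as np

class FuzzyQMap:
    def __init__(self, centers, n_actions, alpha=0.1, gamma=0.95, sigma=0.5):
        self.centers = np.asarray(centers, dtype=float)  # (n_clusters, dim)
        self.q = np.zeros((len(self.centers), n_actions))
        self.alpha, self.gamma, self.sigma = alpha, gamma, sigma

    def membership(self, s):
        # normalized membership degree of state s in each cluster
        d = np.linalg.norm(self.centers - s, axis=1)
        m = np.exp(-(d / self.sigma) ** 2)
        return m / m.sum()

    def q_values(self, s):
        return self.membership(s) @ self.q  # membership-weighted Q estimate

    def update(self, s, a, r, s_next):
        m = self.membership(s)
        w = int(np.argmax(m))  # winning cluster
        td = r + self.gamma * self.q_values(s_next).max() - self.q_values(s)[a]
        self.q[w, a] += self.alpha * m[w] * td               # membership-scaled TD step
        self.centers[w] += self.alpha * m[w] * (s - self.centers[w])

# Mountain car states are (position, velocity); three actions (reverse/coast/forward).
fqm = FuzzyQMap(centers=[[-0.5, 0.0], [0.3, 0.0]], n_actions=3)
fqm.update(s=np.array([0.2, 0.01]), a=1, r=-1.0, s_next=np.array([0.25, 0.02]))
```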

Motivation based Behavior Sequence Learning for an Autonomous Agent in Virtual Reality

  • Song, Wei; Cho, Kyung-Eun; Um, Ky-Hyun
    • 한국멀티미디어학회논문지 / Vol. 12, No. 12 / pp.1819-1826 / 2009
  • To enhance the automatic performance of existing prediction and planning algorithms, which require a predefined probability of state transitions, this paper proposes a multiple-sequence generation system. When interacting with an unknown environment, a virtual agent needs to decide which action, or order of actions, can lead to a good state, and to determine the transition probability based on the current state and the action taken. We describe a sequential behavior generation method motivated by changes in the agent's state, intended to help the virtual agent learn how to adapt to unknown environments. In the sequence learning process, the sensed states are grouped by a set of proposed motivation filters to reduce the learning computation over the large state space. To accomplish a goal with a high payoff, the learning agent makes decisions based on the observed state transitions. The proposed multiple-sequence behavior generation system increases the behavioral complexity and improves the automatic planning of the virtual agent interacting with a dynamic, unknown environment. The model was tested in a virtual library to illustrate how the system operates.
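
A hedged sketch of the learning loop described above (the motivation filter here is a simple stand-in, not the paper's filters): sensed states are bucketed into groups, and transition probabilities are estimated from observed (group, action, next-group) counts rather than being predefined.

```python
from collections import defaultdict

class SequenceLearner:
    def __init__(self, motivation_filter):
        self.filter = motivation_filter                    # state -> group label
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, state, action, next_state):
        # tally one observed transition between state groups
        g, g_next = self.filter(state), self.filter(next_state)
        self.counts[(g, action)][g_next] += 1

    def transition_prob(self, state, action, next_state):
        # empirical probability estimated from observation counts
        row = self.counts[(self.filter(state), action)]
        total = sum(row.values())
        return row[self.filter(next_state)] / total if total else 0.0

# Toy usage with a hypothetical one-dimensional "need" signal as the filter.
learner = SequenceLearner(lambda s: "low" if s[0] < 0 else "high")
learner.observe((-1.0,), "eat", (1.0,))
print(learner.transition_prob((-1.0,), "eat", (1.0,)))  # 1.0
```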

Wearable Sensor based Gait Pattern Analysis for detection of ON/OFF State in Parkinson's Disease

  • Aich, Satyabrata; Park, Jinse; Joo, Moon-il; Sim, Jong Seong; Kim, Hee-Cheol
    • 한국정보통신학회:학술대회논문집 / 한국정보통신학회 2019년도 춘계학술대회 / pp.283-284 / 2019
  • In recent decades, the number of patients suffering from Parkinson's disease has been increasing rapidly, and it is predicted to grow even faster as the elderly population increases throughout the world. As wearable sensor-based approaches have reached new heights in performance and powerful machine learning techniques provide more accurate results, this combination has been widely used for the assessment of various neurological diseases. The ON state is the state in which the effect of medication is present; in the OFF state, the effect of medication is reduced or absent. Classifying the ON/OFF state in Parkinson's disease is important because patients could injure themselves due to freezing of gait and related gait problems in the OFF state. In this paper, a wearable sensor-based approach is used to collect data in the ON and OFF states, and machine learning techniques are used to automate the classification based on the gait pattern. The supervised machine learning techniques achieved 97.6% accuracy in classifying the ON/OFF state.
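
A minimal sketch of the classification stage, not the paper's pipeline (the feature names and model choice here are assumptions): windowed gait features from the wearable sensors are fed to a standard supervised classifier to label ON vs. OFF.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# X: one row per walking bout, e.g. [stride time, stride length, cadence, sway];
# y: 1 = ON (medication effective), 0 = OFF. Placeholder data stands in for
# the real wearable-sensor features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())  # cross-validated accuracy
```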

Online Evolution for Cooperative Behavior in Group Robot Systems

  • Lee, Dong-Wook; Seo, Sang-Wook; Sim, Kwee-Bo
    • International Journal of Control, Automation, and Systems / Vol. 6, No. 2 / pp.282-287 / 2008
  • In distributed mobile robot systems, autonomous robots accomplish complicated tasks through intelligent cooperation with each other. This paper presents behavior learning and online distributed evolution for cooperative behavior of a group of autonomous robots. Learning and evolution capabilities are essential for a group of autonomous robots to adapt to unstructured environments. Behavior learning finds an optimal state-action mapping of a robot for a given operating condition. In behavior learning, a Q-learning algorithm is modified to handle delayed rewards in the distributed robot systems. A group of robots implements cooperative behaviors through communication with other robots. Individual robots improve the state-action mapping through online evolution with the crossover operator based on the Q-values and their update frequencies. A cooperative material search problem demonstrated the effectiveness of the proposed behavior learning and online distributed evolution method for implementing cooperative behavior of a group of autonomous mobile robots.
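
A hedged sketch of the crossover step (the merge rule here is an assumption consistent with the abstract): robots exchange tabular Q-values, preferring entries backed by more updates.

```python
import numpy as np

def crossover_q(q_a, n_a, q_b, n_b):
    """Merge two robots' tabular Q-values, keeping for each (state, action)
    entry the value that has been updated more often (more trusted)."""
    take_b = n_b > n_a
    return np.where(take_b, q_b, q_a), np.maximum(n_a, n_b)

# Q-tables shaped (n_states, n_actions); n_* counts updates per entry.
rng = np.random.default_rng(1)
q1, n1 = rng.random((10, 4)), rng.integers(0, 5, (10, 4))
q2, n2 = rng.random((10, 4)), rng.integers(0, 5, (10, 4))
q_child, n_child = crossover_q(q1, n1, q2, n2)
```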

A Motivation-Based Action-Selection-Mechanism Involving Reinforcement Learning

  • Lee, Sang-Hoon; Suh, Il-Hong; Kwon, Woo-Young
    • International Journal of Control, Automation, and Systems / Vol. 6, No. 6 / pp.904-914 / 2008
  • An action-selection mechanism (ASM) has been proposed that works as a fully connected finite state machine to handle sequential behaviors and to allow a state in the task program to migrate to any other state in the task, so that a primitive node associated with a state and its transition conditions can be easily inserted or deleted. Such a primitive node can also be learned by a shortest-path-finding-based reinforcement learning technique. Specifically, we define a behavioral motivation, which has a state-dependent value, as a primitive node for action selection, and then sequentially construct a network of behavioral motivations in which the value of a parent node is allowed to flow into a child node by a releasing mechanism. A vertical path in the network represents a behavioral sequence. Such a tree for the proposed ASM can be newly generated and/or updated whenever a new behavior sequence is learned. To show the validity of the proposed ASM, experimental results are presented for a mobile robot performing a pushing-a-box-into-a-goal (PBIG) task.
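
A rough sketch, under assumed semantics, of the releasing mechanism: each behavioral motivation node carries a state-dependent value, and a parent's value flows into any child whose release condition holds, so a vertical chain of released nodes encodes a behavior sequence.

```python
class MotivationNode:
    def __init__(self, name, condition):
        self.name = name
        self.condition = condition   # state -> bool: release test for this node
        self.value = 0.0
        self.children = []

    def release(self, state):
        # releasing mechanism: parent value flows into satisfied children
        for child in self.children:
            if child.condition(state):
                child.value += self.value
                child.release(state)

# Toy two-level network for a box-pushing task (conditions are hypothetical).
root = MotivationNode("reach-goal", lambda s: True)
push = MotivationNode("push-box", lambda s: s["box_visible"])
root.children.append(push)
root.value = 1.0
root.release({"box_visible": True})
leaves = [n for n in (root, push) if not n.children]
print(max(leaves, key=lambda n: n.value).name)  # push-box
```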

Virtual World-Based Information Security Learning: Design and Evaluation

  • Ryoo, Jungwoo; Lee, Dongwon; Techatassanasoontorn, Angsana A.
    • Journal of Information Science Theory and Practice / Vol. 4, No. 3 / pp.6-27 / 2016
  • There has been growing interest and enthusiasm for the application of virtual worlds in learning and training. This research proposes a design framework for a virtual world-based learning environment that integrates two unique features of virtual world technology, immersion and interactivity, with an instructional strategy that promotes self-regulated learning. We demonstrate the usefulness and assess the effectiveness of our design in the context of information security learning. In particular, an information security learning module implemented in Second Life was incorporated into an Introduction to Information Security course. Data from pre- and post-learning surveys were used to evaluate the effectiveness of the learning module. Overall, the results strongly suggest that the virtual world-based learning environment enhances information security learning, supporting the effectiveness of the proposed design framework. Additional results suggest that learner traits have an important influence on learning outcomes through perceived enjoyment. The study offers useful design and implementation guidelines for organizations and universities developing a virtual world-based learning environment, and it represents an initial step toward design and explanation theories of virtual world-based learning environments.

A Method for Learning Macro-Actions for Virtual Characters Using Programming by Demonstration and Reinforcement Learning

  • Sung, Yun-Sick; Cho, Kyun-Geun
    • Journal of Information Processing Systems / Vol. 8, No. 3 / pp.409-420 / 2012
  • The decision-making by agents in games is commonly based on reinforcement learning. To improve the quality of agents, it is necessary to solve the problems of the time and state space required for learning. Such problems can be addressed by Macro-Actions, which are defined and executed as sequences of primitive actions. In this line of research, the learning time is reduced by cutting down the number of policy decisions the agent must make. Macro-Actions were originally defined as combinations of the same primitive action; following studies that showed Macro-Actions can be generated by learning, Macro-Actions are now taken to consist of diverse kinds of primitive actions. However, an enormous amount of learning time and state space is required to generate Macro-Actions. To resolve these issues, we apply insights from studies on task learning through Programming by Demonstration (PbD) to generate Macro-Actions that reduce the learning time and state space. In this paper, we propose a method to define and execute Macro-Actions: the Macro-Actions are learned from a human subject via PbD, and a policy is learned by reinforcement learning. In an experiment, the proposed method was applied to a car simulation to verify its scalability. Data were collected from the driving control of a human subject, the Macro-Actions required for running the car were generated, and the policy necessary for driving on a track was then learned. Acquiring Macro-Actions by PbD reduced the driving time by about 16% compared to the case in which Macro-Actions were defined directly by a human subject, and the learning time was also reduced through faster convergence to the optimum policies.
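
A minimal sketch of one way to obtain Macro-Actions from a demonstration log (the mining rule is an assumption, not the paper's exact procedure): recurring fixed-length action subsequences become macros, which the RL policy then treats as single decision steps.

```python
from collections import Counter

def mine_macros(demo_actions, length=3, min_count=2):
    """Collect fixed-length action subsequences that recur in the demonstration."""
    windows = [tuple(demo_actions[i:i + length])
               for i in range(len(demo_actions) - length + 1)]
    return [seq for seq, c in Counter(windows).items() if c >= min_count]

# Primitive actions from a (hypothetical) demonstrated driving session.
demo = ["accel", "accel", "left", "accel", "accel", "left", "brake"]
macros = mine_macros(demo, length=2)
print(macros)  # recurring pairs become single decision steps for the learner
```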

A Survey of Deep Learning in Agriculture: Techniques and Their Applications

  • Ren, Chengjuan; Kim, Dae-Kyoo; Jeong, Dongwon
    • Journal of Information Processing Systems / Vol. 16, No. 5 / pp.1015-1033 / 2020
  • With promising results and enormous capability, deep learning technology has attracted increasing attention in both theoretical research and applications for a variety of image processing and computer vision tasks. In this paper, we investigate 32 research contributions that apply deep learning techniques to the agriculture domain. Different types of deep neural network architectures used in agriculture are surveyed and the current state-of-the-art methods are summarized. The paper ends with a discussion of the advantages and disadvantages of deep learning and of future research topics. The survey shows that deep learning-based research achieves superior accuracy, going beyond today's standard machine learning techniques.

The Practice of Overcoming Stress During Distance Learning of Students - Future Teachers of Preschool Education Institutions

  • Oksana Dzhus; Oleksii Lystopad; Iryna Mardarova; Tetyana Kozak; Tetiana Zavgorodnia
    • International Journal of Computer Science & Network Security / Vol. 23, No. 4 / pp.151-155 / 2023
  • The main purpose of this article is to analyze the practice of overcoming stress during the distance learning of students who are future teachers at preschool education institutions. The key aspects of practical activities for countering stressful situations during such distance learning are identified. The research methodology includes a number of methods designed to analyze how students cope with stress during distance learning. The results of the study define the main elements of practical activities for counteracting stress and stressful situations of different scales in the distance learning of these students. Further research requires an analysis of international experience in dealing with stressful situations during distance learning.