• Title/Summary/Keyword: Learning state

Search Results: 1,597

Reinforcement Learning using Propagation of Goal-State-Value (목표상태 값 전파를 이용한 강화 학습)

  • Kim, Byeong-Cheon;Yun, Byeong-Ju
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.5
    • /
    • pp.1303-1311
    • /
    • 1999
  • In order to learn in dynamic environments, reinforcement learning algorithms such as Q-learning, TD(0)-learning, and TD(λ)-learning have been proposed. However, most of them learn very slowly because the reinforcement value is given only when the agent reaches the goal state. In this paper, we propose a reinforcement learning method that converges quickly to the goal state in maze environments. The proposed method separates learning into a global phase and a local phase. Global learning uses the replacing eligibility trace method to search for the goal state. Local learning propagates the goal-state value found by global learning to neighboring states and then searches for the goal state among those neighbors. Experiments show that the proposed method finds an optimal solution faster than other reinforcement learning methods such as Q-learning, TD(0)-learning, and TD(λ)-learning.
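
The two-phase scheme in this abstract can be made concrete with a short sketch. This is not the authors' code: the maze interface (`step_fn`, `neighbors_fn`), the constants, and the dictionary-based value table are all assumptions for a simple grid maze.

```python
# Minimal sketch of the two-phase method, under assumed details:
# global phase: TD(lambda) with *replacing* eligibility traces to find the goal;
# local phase: propagate the learned goal-state value to neighboring cells.

GAMMA, ALPHA, LAMBDA = 0.95, 0.1, 0.9

def td_lambda_episode(V, start, goal, step_fn):
    """One global-learning episode with replacing traces.

    step_fn(s) -> (next_state, reward) is an assumed maze interface."""
    e = {}                       # eligibility traces, default 0
    s = start
    while s != goal:
        s2, r = step_fn(s)
        delta = r + GAMMA * V.get(s2, 0.0) - V.get(s, 0.0)
        e[s] = 1.0               # replacing trace: reset to 1, never accumulated
        for state, trace in list(e.items()):
            V[state] = V.get(state, 0.0) + ALPHA * delta * trace
            e[state] = trace * GAMMA * LAMBDA
        s = s2
    return V

def propagate_goal_value(V, goal, neighbors_fn):
    """Local learning: push the goal-state value out to neighboring cells."""
    for n in neighbors_fn(goal):
        V[n] = max(V.get(n, 0.0), GAMMA * V.get(goal, 0.0))
    return V
```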

Exploring the Relationships Between Emotions and State Motivation in a Video-based Learning Environment

  • YU, Jihyun;SHIN, Yunmi;KIM, Dasom;JO, Il-Hyun
    • Educational Technology International
    • /
    • v.18 no.2
    • /
    • pp.101-129
    • /
    • 2017
  • This study attempted to collect learners' emotions and state motivation, analyze their inner states, and measure state motivation with a non-self-reported survey. Emotions were measured by learning segment in detailed learning situations and were used to indicate total state motivation with predictive power; emotion was also used to explain state motivation by learning segment. The purpose of this study was to overcome the limitations of video-based learning environments by verifying whether the emotions measured during individual learning segments can indicate the learner's state motivation. Sixty-eight students participated in a 90-minute session in which their emotions and state motivation were measured, and the emotions showed statistically significant relationships with both total state motivation and motivation by learning segment. Although the results are not conclusive because this was an exploratory study, it is meaningful that this study showed the possibility that emotions during different learning segments can indicate state motivation.

Dynamic Action Space Handling Method for Reinforcement Learning Models

  • Woo, Sangchul;Sung, Yunsick
    • Journal of Information Processing Systems
    • /
    • v.16 no.5
    • /
    • pp.1223-1230
    • /
    • 2020
  • Recently, extensive studies have been conducted on applying deep learning to reinforcement learning to solve the state-space problem. If the state-space problem were solved, reinforcement learning would become applicable in various fields. For example, users can utilize dance-tutorial systems to learn how to dance by watching and imitating a virtual instructor, and the instructor can perform the optimal dance to the music by applying reinforcement learning. In this study, we propose a reinforcement learning method in which the action space is dynamically adjusted. Because actions that are not performed or are unlikely to be optimal are not learned and no state space is allocated for them, the learning time can be shortened and the state space reduced. In an experiment, the proposed method achieves results similar to those of traditional Q-learning even when its state space is reduced to approximately 0.33% of that of Q-learning. Consequently, the proposed method reduces the cost and time required for learning: traditional Q-learning requires 6 million state spaces for 100,000 learning iterations, whereas the proposed method requires only 20,000. A higher winning rate can be achieved in a shorter time by searching 20,000 state spaces instead of 6 million.
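
The dynamic action-space idea lends itself to a lazily allocated Q-table. The sketch below is an assumed reconstruction, not the paper's implementation: Q-values and per-state action sets are created only for state-action pairs the agent actually tries, so unexplored parts of the space cost no memory.

```python
from collections import defaultdict
import random

class SparseQ:
    """Q-learning with a lazily grown table (hypothetical sketch)."""

    def __init__(self, alpha=0.1, gamma=0.9, eps=0.1):
        self.q = {}                      # (state, action) -> value, grown on demand
        self.actions = defaultdict(set)  # actions actually tried in each state
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def choose(self, state, candidate_actions):
        # Explore among all candidates, or exploit among allocated actions.
        if random.random() < self.eps or not self.actions[state]:
            return random.choice(candidate_actions)
        return max(self.actions[state], key=lambda a: self.q[(state, a)])

    def update(self, s, a, r, s2):
        self.actions[s].add(a)           # allocate only what is actually used
        best_next = max((self.q[(s2, a2)] for a2 in self.actions[s2]), default=0.0)
        old = self.q.get((s, a), 0.0)
        self.q[(s, a)] = old + self.alpha * (r + self.gamma * best_next - old)
```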

Region-based Q-learning For Autonomous Mobile Robot Navigation (자율 이동 로봇의 주행을 위한 영역 기반 Q-learning)

  • 차종환;공성학;서일홍
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • 2000.10a
    • /
    • pp.174-174
    • /
    • 2000
  • Q-learning, based on discrete state and action spaces, is one of the most widely used reinforcement learning methods. However, it requires a lot of memory and a long time to learn all actions of each state when applied to real mobile robot navigation, which involves continuous state and action spaces. Region-based Q-learning is a reinforcement learning method that estimates the action values of a real state by using a triangular action distribution model and the relationship with neighboring states that were defined and learned before. This paper proposes a new region-based Q-learning method that assigns a reward only when the agent reaches the target and escapes locally optimal paths by adjusting the random action rate. Applied to mobile robot navigation, it uses less memory, lets the robot move smoothly, and learns an optimal solution quickly. Computer simulations are presented to show the validity of our method.
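
The triangular weighting and the adjustable random-action rate can be sketched as follows. The hat function, the region centers, and the decay schedule are assumptions; the paper's exact distribution model is not reproduced in the abstract.

```python
# Hypothetical sketch: interpolate action values of a continuous state from
# neighboring region centers, and decay the random-action rate over episodes.

def triangular_weight(x, center, width):
    """Hat function: 1 at the region center, falling linearly to 0 at +/- width."""
    return max(0.0, 1.0 - abs(x - center) / width)

def interpolated_q(x, action, q_table, centers, width):
    """Blend the Q-values of neighboring regions for a continuous state x."""
    num = sum(triangular_weight(x, c, width) * q_table.get((c, action), 0.0)
              for c in centers)
    den = sum(triangular_weight(x, c, width) for c in centers)
    return num / den if den > 0 else 0.0

def random_action_rate(episode, start=0.5, decay=0.995):
    """Decaying exploration rate, used to escape locally optimal paths early on."""
    return start * decay ** episode
```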

Learning Effects According to the Level of Science State Curiosity and Science State Anxiety Evoked in Science Learning (과학 학습에서 유발되는 과학상태호기심 및 과학상태불안 수준에 따른 학습효과)

  • Kang, Jihoon;Kim, Jina
    • Journal of The Korean Association For Science Education
    • /
    • v.41 no.3
    • /
    • pp.221-235
    • /
    • 2021
  • The purpose of this study is to investigate the learning effects according to the level of Science State Curiosity (SSC) and Science State Anxiety (SSA) in science learning situations for 5th~6th grade elementary school students. To this end, we measured and analyzed SSC and SSA in each learning situation by dividing science learning into three situations: confronting a scientific task (I), checking the results (II), and learning science concepts (III). In order to identify the net effects of SSC and SSA on learning, science curiosity, need for cognition, science self-concept, science anxiety, and interest, which were expected to affect the learning effects, were controlled. SSC and SSA in the situation of confronting scientific tasks were defined as 'SSCI' and 'SSAI,' those in the situation of checking the results as 'SSCII' and 'SSAII,' and those in the situation of learning science concepts as 'SSCIII' and 'SSAIII.' In addition, the learning effects were divided into a post-learning effect and a delayed post-learning effect, and the degree of improvement in the post- or delayed post-test scores compared to the pre-test score was calculated and analyzed. The analysis showed that SSCI and SSCII had a positive effect on both the post- and delayed post-learning effects, SSAIII had a negative effect on both, and SSAI and SSAII had a negative effect on the post-learning effect. SSC had a greater effect on learning than SSA; SSCII had the most influence on the post-learning effect and SSCI the most on the delayed post-learning effect. As SSCIII increased, students tended to do additional voluntary learning. The results of this study are expected to broaden the understanding of students' emotional states in science learning and to provide a theoretical foundation for studies of state curiosity and state anxiety.

Reinforcement Learning Speedup Method Using Q-value Initialization (Q-value Initialization을 이용한 Reinforcement Learning Speedup Method)

  • 최정환
    • Proceedings of the IEEK Conference
    • /
    • 2001.06c
    • /
    • pp.13-16
    • /
    • 2001
  • In reinforcement learning, Q-learning converges quite slowly to a good policy because searching for the goal state takes a very long time in a large stochastic domain. So I propose a speedup method using Q-value initialization for model-free reinforcement learning. The speedup method learns a naive model of a domain and builds boundaries around the goal state. Using these boundaries, it assigns initial Q-values to the state-action pairs and performs Q-learning with those initial Q-values. The initial Q-values guide the agent toward the goal state in the early stages of learning, so that Q-learning updates Q-values efficiently. It therefore saves the exploration time spent searching for the goal state and performs better than plain Q-learning. I present the Speedup Q-learning algorithm to implement this method. The algorithm is evaluated in a grid-world domain and compared to Q-learning.
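
A minimal sketch of the boundary-based initialization, under assumed details (a grid world, Manhattan distance, and a geometric decay that mimics the discounted return):

```python
GAMMA, R_MAX = 0.9, 1.0

def manhattan(s, goal):
    return abs(s[0] - goal[0]) + abs(s[1] - goal[1])

def initial_q(state, goal, actions, radius=3):
    """Distance-scaled initial Q-values inside the goal boundary, 0 outside."""
    d = manhattan(state, goal)
    if d > radius:
        return {a: 0.0 for a in actions}
    # Value shrinks geometrically with distance, pulling the agent goalward.
    return {a: R_MAX * GAMMA ** d for a in actions}

# Usage: seed the Q-table before ordinary Q-learning starts.
ACTIONS = ["up", "down", "left", "right"]
Q = {(x, y): initial_q((x, y), goal=(9, 9), actions=ACTIONS)
     for x in range(10) for y in range(10)}
```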

How Does Cognitive Conflict Affect Conceptual Change Process in High School Physics Classrooms?

  • Lee, Gyoung-Ho;Kwon, Jae-Sool
    • Journal of The Korean Association For Science Education
    • /
    • v.24 no.1
    • /
    • pp.1-16
    • /
    • 2004
  • The purpose of this study was to examine the role of cognitive conflict in the conceptual change process. Ninety-seven high school students in Korea participated in this study. Before instruction, we conducted pretests to measure learning motivation and learning strategies. During instruction, we tested the students' preconceptions about Newton's 3rd Law and presented demonstrations. After this, we tested the students' cognitive conflict levels and provided learning sessions in which we explained the results of the demonstrations. After these learning sessions, we tested the students' state learning motivation and state learning strategy. Posttests and delayed posttests were conducted with individual interviews. The results show that cognitive conflict has direct and indirect effects on the conceptual change process. However, the effects of cognitive conflict are mediated by other variables in class, such as state learning motivation and state learning strategy. In addition, we found that there is an optimal level of cognitive conflict in the conceptual change process. We discuss the complex role of cognitive conflict in conceptual change and the educational implications of these findings.

Q-learning Using Influence Map (영향력 분포도를 이용한 Q-학습)

  • Sung Yun-Sick;Cho Kyung-Eun
    • Journal of Korea Multimedia Society
    • /
    • v.9 no.5
    • /
    • pp.649-657
    • /
    • 2006
  • Reinforcement learning is a computational approach to learning in which an agent, interacting with an uncertain environment, takes the action that maximizes the total amount of reward it receives among the actions possible in the current state. Q-learning, one of the most active algorithms in reinforcement learning, is based on the rewards obtained when an agent takes an action, but it has a problem with mapping the real world to discrete states. When state spaces are very large, Q-learning takes a long time to learn; conversely, when the state space is reduced, many real-world states map to a single state, the agent learns only a single action within many states, and it therefore acts monotonously. In this paper, to reduce learning time and complement such simple actions, we propose Q-learning using an influence map (QIM). By using the influence map and the learning results of adjacent state spaces, an agent can choose a proper action in uncertain states where it has not yet learned. Comparing simulation results of QIM and Q-learning, we show that QIM performs as well as Q-learning even though QIM uses 4.6% of Q-learning's state spaces. This is because QIM learns about 2.77 times faster than Q-learning, and the problem caused by the reduced state space is complemented by the influence map.
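
One plausible reading of the fallback mechanism is sketched below; the geometric decay, grid layout, and move set are assumptions rather than the paper's exact formulation.

```python
# Hypothetical sketch: the goal spreads influence over the grid with distance
# decay; in states the agent has not yet learned, it climbs the influence
# gradient instead of acting arbitrarily.

DECAY = 0.8

def build_influence_map(width, height, goal):
    """Influence falls off geometrically with Manhattan distance from the goal."""
    return {(x, y): DECAY ** (abs(x - goal[0]) + abs(y - goal[1]))
            for x in range(width) for y in range(height)}

def fallback_action(state, influence, moves):
    """In an unlearned state, step toward the neighbor with highest influence."""
    def next_cell(action):
        dx, dy = moves[action]
        return (state[0] + dx, state[1] + dy)
    return max(moves, key=lambda a: influence.get(next_cell(a), 0.0))

MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
imap = build_influence_map(10, 10, goal=(9, 9))
print(fallback_action((5, 5), imap, MOVES))  # moves toward the goal
```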

Factors influencing flow state of cooperative learning among nursing students: in convergence era (융복합시대 간호대학생의 협동학습수업 몰입상태에 영향을 미치는 요인)

  • Kim, Min-Suk;Yun, Soon-Young
    • Journal of Digital Convergence
    • /
    • v.13 no.10
    • /
    • pp.397-403
    • /
    • 2015
  • The purpose of this study is to identify factors affecting the flow state of cooperative learning among nursing students in the convergence era. Data were collected over a total of six weeks, from April to June 2015, from 93 freshman nursing students in two classes. Data were analyzed using descriptive statistics and Pearson's correlation coefficients, and factors affecting the flow state were identified with multiple regression, using the SPSS/WIN 18.0 program. The flow state of cooperative learning was correlated with major satisfaction and learning satisfaction. Major satisfaction and learning satisfaction were significant predictors, explaining 65.4% of the variance in the flow state of cooperative learning, and learning satisfaction was confirmed as the factor affecting the flow state. This provides a basis for applying teaching methods that maximize the flow state.
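
The original analysis was run in SPSS/WIN 18.0. For readers who want to reproduce the same kind of model programmatically, here is a hedged Python equivalent using hypothetical placeholder data; the real survey scores are not available here.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 93                                  # sample size reported in the abstract
major_sat = rng.normal(3.5, 0.6, n)     # hypothetical 5-point-scale scores
learn_sat = rng.normal(3.8, 0.5, n)
flow = 0.3 * major_sat + 0.9 * learn_sat + rng.normal(0.0, 0.4, n)

# Multiple regression: flow state on major and learning satisfaction.
X = sm.add_constant(np.column_stack([major_sat, learn_sat]))
model = sm.OLS(flow, X).fit()
print(model.rsquared)                   # share of variance explained
print(model.params)                     # intercept and two coefficients
```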

Comparison of learning performance of character controller based on deep reinforcement learning according to state representation (상태 표현 방식에 따른 심층 강화 학습 기반 캐릭터 제어기의 학습 성능 비교)

  • Sohn, Chaejun;Kwon, Taesoo;Lee, Yoonsang
    • Journal of the Korea Computer Graphics Society
    • /
    • v.27 no.5
    • /
    • pp.55-61
    • /
    • 2021
  • Research on physics-based character motion control using reinforcement learning continues to be actively carried out. To solve a problem with reinforcement learning, the network structure, hyperparameters, state, action, and reward must be set properly for the problem. In many studies, various combinations of states, actions, and rewards have been defined and successfully applied, and since many such combinations are possible, numerous studies analyze the effect of each element to find the optimal combination that improves learning performance. In this work, we analyze the effect of the state representation on reinforcement learning performance, which has not been examined so far. First, we define three coordinate systems: the root attached frame, the root aligned frame, and the projected aligned frame, and analyze how the state representation in each coordinate system affects reinforcement learning. Second, we analyze how learning performance is affected when various combinations of joint positions and angles are used for the state.
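
The three frames can be made concrete with a small sketch. The conventions below (y-up axes, yaw as rotation about y, row-vector transforms, NumPy arrays of world-space joint positions) are assumptions, since the paper's exact definitions are not reproduced in the abstract.

```python
import numpy as np

def yaw_rotation(yaw):
    """Rotation about the y (up) axis."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def root_attached(joints, root_pos, root_rot):
    """Joint positions in the root's full local frame (root_rot: 3x3 world rotation)."""
    return (joints - root_pos) @ root_rot   # right-multiplying applies the inverse rotation

def root_aligned(joints, root_pos, root_yaw):
    """Relative to the root, rotated only by the root's heading (yaw)."""
    return (joints - root_pos) @ yaw_rotation(root_yaw)

def projected_aligned(joints, root_pos, root_yaw):
    """Like root_aligned, but the origin is the root projected onto the ground plane."""
    ground = np.array([root_pos[0], 0.0, root_pos[2]])
    return (joints - ground) @ yaw_rotation(root_yaw)
```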