• Title/Summary/Keyword: reduced learning time


A Method for Learning Macro-Actions for Virtual Characters Using Programming by Demonstration and Reinforcement Learning

  • Sung, Yun-Sick; Cho, Kyun-Geun
    • Journal of Information Processing Systems, v.8 no.3, pp.409-420, 2012
  • The decision-making by agents in games is commonly based on reinforcement learning. To improve the quality of agents, it is necessary to solve the problems of the time and state space that are required for learning. Such problems can be solved by Macro-Actions, which are defined and executed as sequences of primitive actions. In this line of research, the learning time is reduced by cutting down the number of policy decisions made by agents. Macro-Actions were originally defined as combinations of the same primitive actions. Based on studies that showed that Macro-Actions can be generated by learning, Macro-Actions are now thought to consist of diverse kinds of primitive actions. However, an enormous amount of learning time and state space is required to generate such Macro-Actions. To resolve these issues, we can apply insights from studies on the learning of tasks through Programming by Demonstration (PbD) to generate Macro-Actions while reducing the learning time and state space. In this paper, we propose a method to define and execute Macro-Actions. Macro-Actions are learned from a human subject via PbD, and a policy is learned by reinforcement learning. In an experiment, the proposed method was applied to a car simulation to verify its scalability. Data were collected from the driving control of a human subject, and the Macro-Actions required for driving a car were then generated. Furthermore, the policy necessary for driving on a track was learned. The acquisition of Macro-Actions by PbD reduced the driving time by about 16% compared to the case in which Macro-Actions were directly defined by a human subject. In addition, the learning time was also reduced through faster convergence to the optimal policies.
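A minimal sketch of the general macro-action idea described in this abstract, assuming a tabular Q-learning setting; the `env` interface, the macro definitions, and all hyperparameters are illustrative assumptions, not the authors' implementation. Each macro-action is a fixed sequence of primitive actions (for example, one recorded from a human demonstration), and the agent makes one policy decision per macro rather than per primitive action:

```python
# Illustrative sketch: tabular Q-learning over macro-actions, where each
# macro is a fixed list of primitive actions. `env.reset()` and `env.step(a)`
# (returning state, reward, done) are assumed placeholder interfaces.
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

def run_macro(env, state, macro):
    """Execute one macro-action and return the resulting state, the
    discounted reward accumulated along the way, the remaining discount
    factor, and whether the episode terminated."""
    total, discount, done = 0.0, 1.0, False
    for a in macro:
        state, reward, done = env.step(a)
        total += discount * reward
        discount *= GAMMA
        if done:
            break
    return state, total, discount, done

def q_learning_with_macros(env, macros, episodes=500):
    # One Q-value per (state, macro index); unseen pairs default to 0.
    Q = defaultdict(float)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            if random.random() < EPSILON:
                m = random.randrange(len(macros))
            else:
                m = max(range(len(macros)), key=lambda i: Q[(s, i)])
            s2, r, disc, done = run_macro(env, s, macros[m])
            best_next = max(Q[(s2, i)] for i in range(len(macros)))
            # SMDP-style update: the bootstrap term is discounted by the
            # macro's actual length, so one update covers several steps.
            Q[(s, m)] += ALPHA * (r + disc * best_next - Q[(s, m)])
            s = s2
    return Q
```

Because each update covers several primitive steps, the number of policy decisions per episode drops, which is the source of the reduced learning time the abstract refers to.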

Dynamic Action Space Handling Method for Reinforcement Learning Models

  • Woo, Sangchul; Sung, Yunsick
    • Journal of Information Processing Systems, v.16 no.5, pp.1223-1230, 2020
  • Recently, extensive studies have been conducted on applying deep learning to reinforcement learning to solve the state-space problem. If the state-space problem were solved, reinforcement learning would become applicable in various fields. For example, users could utilize dance-tutorial systems to learn how to dance by watching and imitating a virtual instructor, where the instructor performs the optimal dance to the music through reinforcement learning. In this study, we propose a reinforcement learning method in which the action space is dynamically adjusted. Because actions that are not performed, or that are unlikely to be optimal, are neither learned nor allocated state space, the learning time can be shortened and the state space reduced. In an experiment, the proposed method shows results similar to those of traditional Q-learning even when its state space is reduced to approximately 0.33% of that of Q-learning. Consequently, the proposed method reduces the cost and time required for learning. Traditional Q-learning requires a state space of 6 million entries for 100,000 learning iterations, whereas the proposed method requires only 20,000. A higher winning rate can thus be achieved in a shorter time by searching 20,000 state-space entries instead of 6 million.
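A minimal sketch of dynamically handled action spaces in tabular Q-learning, in the spirit of the abstract above; the notion of per-state available actions and all names are illustrative assumptions, not the paper's code. Q-values are allocated lazily, only for state-action pairs that are actually offered, so unused actions consume no table space:

```python
# Illustrative sketch: a sparse Q-table grown on demand, so only actions that
# are actually made available in a state occupy memory.
import random

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
Q = {}  # state -> {action: value}, allocated lazily

def q_values(state, actions):
    """Allocate (if needed) and return the Q-entries for the given actions."""
    table = Q.setdefault(state, {})
    for a in actions:
        table.setdefault(a, 0.0)
    return table

def select_action(state, actions):
    """Epsilon-greedy choice restricted to the currently available actions."""
    table = q_values(state, actions)
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: table[a])

def update(state, action, reward, next_state, next_actions):
    """Standard Q-learning update over the dynamically allocated entries."""
    table = q_values(state, [action])
    nxt = q_values(next_state, next_actions)
    best_next = max(nxt.values()) if nxt else 0.0
    table[action] += ALPHA * (reward + GAMMA * best_next - table[action])
```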

Transition from Conventional to Reduced-Port Laparoscopic Gastrectomy to Treat Gastric Carcinoma: a Single Surgeon's Experience from a Small-Volume Center

  • Kim, Ho Goon; Kim, Dong Yi; Jeong, Oh
    • Journal of Gastric Cancer, v.18 no.2, pp.172-181, 2018
  • Purpose: This study aimed to evaluate the surgical outcomes and investigate the feasibility of reduced-port laparoscopic gastrectomy using learning curve analysis in a small-volume center. Materials and Methods: We reviewed 269 patients who underwent laparoscopic distal gastrectomy (LDG) for gastric carcinoma between 2012 and 2017. Among them, 159 patients underwent reduced-port laparoscopic gastrectomy. The cumulative sum technique was used for quantitative assessment of the learning curve. Results: There were no statistically significant differences in the baseline characteristics of patients who underwent conventional and reduced-port LDG, and the operative time did not significantly differ between the groups. However, the amount of intraoperative bleeding was significantly lower in the reduced-port laparoscopic gastrectomy group (56.3 vs. 48.2 mL; P<0.001). There were no significant differences between the groups in terms of the first flatus time or length of hospital stay. Neither the incidence nor the severity of the complications significantly differed between the groups. The slope of the cumulative sum curve indicates the trend of learning performance. After 33 operations, the slope gently stabilized, which was regarded as the breakpoint of the learning curve. Conclusions: The surgical outcomes of reduced-port laparoscopic gastrectomy were comparable to those of conventional laparoscopic gastrectomy, suggesting that transition from conventional to reduced-port laparoscopic gastrectomy is feasible and safe, with a relatively short learning curve, in a small-volume center.
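A minimal sketch of the cumulative sum (CUSUM) technique mentioned in the abstract, as it is commonly applied to learning-curve analysis; the operative times below are made-up illustrative numbers. Each point is the running sum of deviations from the overall mean, and a change in the slope of the resulting curve marks the breakpoint of the learning curve:

```python
# Illustrative CUSUM learning-curve sketch over consecutive case times.
def cusum(values):
    """Running sum of deviations from the overall mean."""
    mean = sum(values) / len(values)
    running, curve = 0.0, []
    for v in values:
        running += v - mean
        curve.append(running)
    return curve

# Example: operative times (minutes) for consecutive cases (made-up numbers);
# the point where the curve's slope flattens is read as the learning-curve breakpoint.
times = [240, 235, 250, 230, 220, 215, 210, 205, 200, 198]
print(cusum(times))
```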

Evolutionary Computation for the Real-Time Adaptive Learning Control(II) (실시간 적응 학습 제어를 위한 진화연산(II))

  • Chang, Sung-Ouk; Lee, Jin-Kul
    • Proceedings of the KSME Conference, 2001.06b, pp.730-734, 2001
  • In this study, the real-time adaptive learning control algorithm suggested in paper (I) is validated experimentally: whereas paper (I) relied on computer simulation, here the algorithm is applied to a hydraulic servo system, in which the fluid introduces very strong nonlinearity. The evolutionary strategy automatically adjusts its search region through natural competition among many individuals. The error generated by the dynamic system is applied to the mutation equation, and the number of competing individuals is reduced as the search region is automatically adjusted in accordance with the error. As described in paper (I), the numbers of parent and offspring individuals are reduced so that the evolutionary algorithm can be applied in real time. The feasibility of the new approach suggested by the computer simulations of paper (I) is demonstrated through real-time tests, and its behavior is examined in actual experiments.


Evolutionary Computation for the Real-Time Adaptive Learning Control(I) (실시간 적응 학습 제어를 위한 진화연산(I))

  • Chang, Sung-Ouk; Lee, Jin-Kul
    • Proceedings of the KSME Conference, 2001.06b, pp.724-729, 2001
  • This paper discusses the combination of reinforcement learning, which is applied to real-time learning, with the evolutionary strategy, which has proven its superiority in finding optimal solutions as an off-line learning method. The number of individuals is reduced so that the evolutionary strategy can be learned in real time, and a new method that guarantees the convergence of the evolutionary mutations is proposed. This makes it possible to control a plant whose characteristics vary over time. Because the learning process of estimation, selection, and mutation runs in real time, the evolutionary strategy is applied at every sampling time as the state of the controlled plant is generated. With these algorithms, people without knowledge of the technical tuning of dynamic systems can design the controller, including for problems in which the characteristics of the system dynamics vary slightly over time. In the future, studies are needed to prove the theory through experiments and to examine its robustness against outside disturbances.
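A minimal sketch of the error-driven evolutionary strategy described in this and the companion abstracts, written as one step per sampling period; `evaluate_error` (one control cycle returning a tracking error) and the population sizes are illustrative assumptions, not the authors' implementation:

```python
# Illustrative sketch: reduced-population evolutionary strategy whose mutation
# magnitude is driven by the current error of the dynamic system, so the
# search region shrinks automatically as the error decreases.
import random

def evolve_step(parents, evaluate_error, n_offspring=2):
    """One sampling-period ES step: mutate, evaluate, and select the best."""
    population = list(parents)
    for p in parents:
        err = evaluate_error(p)
        for _ in range(n_offspring):
            # Mutation scaled by the parent's error: large error -> wide
            # search region, small error -> fine local adjustment.
            child = [g + random.gauss(0.0, abs(err)) for g in p]
            population.append(child)
    # Keep only as many individuals as there were parents, so the population
    # stays small enough for real-time use.
    population.sort(key=evaluate_error)
    return population[:len(parents)]
```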


The Real-time Self-tuning Learning Control based on Evolutionary Computation (진화 연산을 이용한 실시간 자기동조 학습제어)

  • Chang, Sung-Ouk; Lee, Jin-Kul
    • Proceedings of the KSME Conference, 2001.06b, pp.105-109, 2001
  • This paper discusses real-time self-tuning learning control based on evolutionary computation, which has proven its superiority in finding optimal solutions as an off-line learning method. The number of individuals is reduced so that the evolutionary strategy can be learned in real time, and a new method that guarantees the convergence of the evolutionary mutations is proposed. This makes it possible to control a plant whose characteristics vary over time. Because the learning process of estimation, selection, and mutation runs in real time, the evolutionary strategy is applied at every sampling time as the state of the controlled plant is generated. With these algorithms, people without knowledge of the technical tuning of dynamic systems can design the controller, including for problems in which the characteristics of the system dynamics vary slightly over time.


The Self-tuning PID Control Based on Real-time Adaptive Learning Evolutionary Algorithm (실시간 적응 학습 진화 알고리듬을 이용한 자기 동조 PID 제어)

  • Chang, Sung-Ouk; Lee, Jin-Kul
    • Transactions of the Korean Society of Mechanical Engineers A, v.27 no.9, pp.1463-1468, 2003
  • This paper presents real-time self-tuning learning control based on evolutionary computation, which has proven its superiority in finding optimal solutions as an off-line learning method. The number of individuals in the population is reduced so that the evolutionary strategy can be learned in real time, and a new method that guarantees the convergence of the evolutionary mutations is proposed. This makes it possible to control a plant whose characteristics vary slightly over time. Because the learning process of estimation, selection, and mutation is done in real time, the evolutionary strategy is applied at every sampling time as the state of the controlled plant is generated. With these algorithms, people without knowledge of the technical tuning of dynamic systems can design the controller, including for problems in which the characteristics of the system dynamics vary slightly over time.
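A minimal sketch of the self-tuning PID idea, assuming each individual is a gain vector (Kp, Ki, Kd) that is mutated in proportion to the current tracking error and selected on a simple cost; the class and function names are illustrative, not the paper's implementation:

```python
# Illustrative sketch: PID gains tuned online by an error-scaled evolutionary step.
import random

class PID:
    def __init__(self, kp, ki, kd, dt=0.01):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral, self.prev_error = 0.0, 0.0

    def output(self, error):
        """Standard discrete PID control law."""
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def mutate_gains(gains, error, scale=0.1):
    """Mutate (Kp, Ki, Kd) with a step size proportional to the current error."""
    sigma = scale * abs(error)
    return tuple(max(0.0, g + random.gauss(0.0, sigma)) for g in gains)

def select(candidates, cost):
    """Keep the gain vector with the lowest cost (e.g., squared tracking error)."""
    return min(candidates, key=cost)
```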

A Study on Reducing Learning Time of Deep-Learning using Network Separation (망 분리를 이용한 딥러닝 학습시간 단축에 대한 연구)

  • Lee, Hee-Yeol; Lee, Seung-Ho
    • Journal of IKEEE, v.25 no.2, pp.273-279, 2021
  • In this paper, we propose an algorithm that shortens the learning time by partitioning the deep learning structure and training each part individually. The proposed algorithm consists of four processes: setting the starting point of the network division, extracting feature vectors, removing feature noise, and classifying classes. First, in the division starting-point setting process, the point at which the network structure is divided for effective feature vector extraction is set. Second, in the feature vector extraction process, feature vectors are extracted without additional learning, using the previously learned weights. Third, in the feature noise removal process, the extracted feature vectors are taken as input and the output value of each class is learned in order to remove noise from the data. Fourth, in the class classification process, the noise-removed feature vector is input to a multi-layer perceptron structure, and the result is output and learned. To evaluate the performance of the proposed algorithm, we experimented with the Extended Yale B face database. In the experiment, the proposed algorithm reduced the time required for one round of learning by 40.7% compared to the existing algorithm, and the number of training iterations needed to reach the target recognition rate was also reduced. These results confirm that both the single-round learning time and the total learning time are improved over the existing algorithm.
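A minimal PyTorch-style sketch of the partitioning idea in this abstract: the earlier layers are reused as a frozen feature extractor (no additional learning on their weights) and only a small classifier head is trained on the extracted feature vectors. The layer sizes, the 32x32 input, and the omission of the feature-noise-removal step are simplifying assumptions, not the paper's exact architecture:

```python
# Illustrative sketch: split the network, freeze the earlier part, and train
# only the classifier head on the extracted features.
import torch
import torch.nn as nn

feature_extractor = nn.Sequential(          # earlier layers; in practice these
    nn.Flatten(), nn.Linear(32 * 32, 256),  # would carry previously learned
    nn.ReLU()                               # weights (randomly initialized here)
)
feature_extractor.requires_grad_(False)     # reuse weights, no extra learning
feature_extractor.eval()

classifier = nn.Sequential(                 # small head trained separately
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 38)                      # e.g., 38 subjects as in Extended Yale B
)
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One training step: frozen feature extraction, then head update."""
    with torch.no_grad():                   # features come from frozen layers
        features = feature_extractor(images)
    logits = classifier(features)
    loss = loss_fn(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```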

Adaptive Learning Control of Electro-Hydraulic Servo System Using Real-Time Evolving Neural Network Algorithm (실시간 진화 신경망 알고리즘을 이용한 전기.유압 서보 시스템의 적응 학습제어)

  • Jang, Seong-Uk; Lee, Jin-Geol
    • Journal of Institute of Control, Robotics and Systems, v.8 no.7, pp.584-588, 2002
  • The real-time characteristics of the adaptive learning control algorithm are validated based on the results of applying it to a hydraulic servo system, which exhibits very strong nonlinearity. The evolutionary strategy automatically adjusts the search region through natural competition among many individuals. The error generated by the dynamic system is applied to the mutation equation, and the number of competing individuals is reduced as the search region is automatically adjusted in accordance with the error. In this paper, the numbers of parent and offspring individuals are reduced so that the evolutionary algorithm can be applied in real time. The feasibility of the newly proposed algorithm was demonstrated through real-time tests.

Improving Deep Learning Models Considering the Time Lags between Explanatory and Response Variables

  • Chaehyeon Kim; Ki Yong Lee
    • Journal of Information Processing Systems, v.20 no.3, pp.345-359, 2024
  • A regression model represents the relationship between explanatory and response variables. In real life, explanatory variables often affect a response variable with a certain time lag, rather than immediately. For example, the marriage rate affects the birth rate with a time lag of 1 to 2 years. Although deep learning models have been successfully used to model various relationships, most of them do not consider the time lags between explanatory and response variables. Therefore, in this paper, we propose an extension of deep learning models that automatically finds the time lags between explanatory and response variables. The proposed method finds which of the past values of the explanatory variables minimize the error of the model, and uses these values to determine the time lag between each explanatory variable and the response variable. After determining the time lags, the proposed method trains the deep learning model again with these time lags reflected. Through various experiments applying the proposed method to several deep learning models, we confirm that the proposed method can find a more accurate model, whose error is reduced by more than 60% compared to the original model.
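A minimal sketch of the lag-search idea described above, using an ordinary least-squares model as a stand-in for the deep learning models in the paper: for each explanatory variable, candidate backward shifts are tried, the lag that minimizes the model error is kept, and the model can then be retrained on the lag-aligned data. All names and the error metric are illustrative assumptions:

```python
# Illustrative sketch: per-variable time-lag search by error minimization.
import numpy as np

def shift_back(x, lag):
    """Shift a series back by `lag` steps, so x[t-lag] is aligned with y[t]."""
    return np.roll(x, lag)

def fit_and_error(X, y):
    """Least-squares fit; return mean squared error (stand-in for a DNN)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.mean((X @ coef - y) ** 2)

def find_lags(X, y, max_lag=5):
    """Pick, for each column of X, the lag that minimizes the model error."""
    n_vars = X.shape[1]
    lags = [0] * n_vars
    for j in range(n_vars):
        errors = []
        for lag in range(max_lag + 1):
            Xs = X.copy()
            Xs[:, j] = shift_back(X[:, j], lag)
            # Drop the first max_lag rows, which contain wrapped-around values.
            errors.append(fit_and_error(Xs[max_lag:], y[max_lag:]))
        lags[j] = int(np.argmin(errors))
    return lags
```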