• Title/Summary/Keyword: reduced learning time

Search results: 269 items (processing time: 0.021 seconds)

A Method for Learning Macro-Actions for Virtual Characters Using Programming by Demonstration and Reinforcement Learning

  • Sung, Yun-Sick;Cho, Kyun-Geun
    • Journal of Information Processing Systems
    • /
    • Vol. 8, No. 3
    • /
    • pp.409-420
    • /
    • 2012
  • The decision-making of agents in games is commonly based on reinforcement learning. To improve the quality of agents, the time and state space required for learning must be reduced. These problems can be addressed with Macro-Actions, which are defined and executed as sequences of primitive actions; in this line of research, learning time is reduced by cutting down the number of policy decisions an agent must make. Macro-Actions were originally defined as combinations of the same primitive action. Based on studies showing that Macro-Actions can be generated by learning, they are now understood to consist of diverse kinds of primitive actions. However, an enormous amount of learning time and state space is required to generate Macro-Actions. To resolve these issues, insights from studies on learning tasks through Programming by Demonstration (PbD) can be applied to generate Macro-Actions while reducing the learning time and state space. In this paper, we propose a method for defining and executing Macro-Actions: Macro-Actions are learned from a human subject via PbD, and a policy over them is learned by reinforcement learning. In an experiment, the proposed method was applied to a car simulation to verify its scalability. Data was collected from the driving control of a human subject, the Macro-Actions required for running a car were generated, and the policy necessary for driving on a track was learned. Acquiring Macro-Actions by PbD reduced the driving time by about 16% compared to the case in which Macro-Actions were defined directly by a human subject. In addition, the learning time was reduced through faster convergence to the optimal policies.
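
A minimal sketch of how the two stages might fit together follows, assuming a toy 1-D driving environment: macro-actions are extracted as repeated subsequences of a demonstrated primitive-action trace (the PbD stage), and a Q-learning policy then chooses among whole macro-actions, reducing the number of policy decisions per episode. The environment, reward, and extraction rule are illustrative assumptions, not the paper's implementation.

```python
import random
from collections import defaultdict

PRIMITIVES = ["accelerate", "brake", "steer_left", "steer_right"]

def extract_macros(demo, length=3):
    """Collect fixed-length subsequences demonstrated more than once."""
    counts = defaultdict(int)
    for i in range(len(demo) - length + 1):
        counts[tuple(demo[i:i + length])] += 1
    return [m for m, c in counts.items() if c > 1]

def env_step(state, action):
    """Trivial 1-D 'track': accelerating advances, goal at position 20."""
    nxt = state + (1 if action == "accelerate" else 0)
    return (1.0 if nxt >= 20 else -0.01), nxt, nxt >= 20

def q_learning(macros, episodes=200, alpha=0.1, gamma=0.95, eps=0.1):
    Q = defaultdict(float)  # Q[(state, macro)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy over macro-actions rather than primitives:
            if random.random() < eps:
                macro = random.choice(macros)
            else:
                macro = max(macros, key=lambda m: Q[(state, m)])
            # One macro = several primitive steps, so the agent makes far
            # fewer policy decisions per episode.
            reward, next_state = 0.0, state
            for k, a in enumerate(macro):
                r, next_state, done = env_step(next_state, a)
                reward += (gamma ** k) * r
                if done:
                    break
            best_next = max(Q[(next_state, m)] for m in macros)
            Q[(state, macro)] += alpha * (
                reward + gamma ** len(macro) * best_next - Q[(state, macro)])
            state = next_state
    return Q

demo = ["accelerate", "accelerate", "steer_left"] * 5  # stand-in for PbD data
macros = extract_macros(demo) or [(p,) for p in PRIMITIVES]
Q = q_learning(macros)
```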

Dynamic Action Space Handling Method for Reinforcement Learning Models

  • Woo, Sangchul;Sung, Yunsick
    • Journal of Information Processing Systems
    • /
    • Vol. 16, No. 5
    • /
    • pp.1223-1230
    • /
    • 2020
  • Recently, extensive studies have applied deep learning to reinforcement learning to solve the state-space problem. If the state-space problem were solved, reinforcement learning would become applicable in a wider range of fields. For example, users could utilize a dance-tutorial system to learn how to dance by watching and imitating a virtual instructor, where the instructor performs the optimal dance to the music using reinforcement learning. In this study, we propose a reinforcement learning method in which the action space is dynamically adjusted. Because actions that are never performed or are unlikely to be optimal are not learned, and no state space is allocated for them, the learning time can be shortened and the state space reduced. In an experiment, the proposed method shows results similar to those of traditional Q-learning even when its state space is reduced to approximately 0.33% of that of Q-learning. Consequently, the proposed method reduces the cost and time required for learning: traditional Q-learning requires 6 million state-space entries for 100,000 learning iterations, whereas the proposed method requires only 20,000. A higher winning rate can thus be achieved in a shorter period of time by searching 20,000 entries instead of 6 million.
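
The following sketch illustrates one plausible reading of dynamic action-space handling: Q-values are allocated lazily, only for state-action pairs that are actually visited, and actions whose learned value falls well below a state's best are pruned from that state's candidate set so no further entries are allocated for them. The pruning criterion and margin are assumptions, not the paper's exact rule.

```python
import random
from collections import defaultdict

class DynamicQLearner:
    """Q-learning with lazily allocated Q-entries and per-state action pruning."""

    def __init__(self, all_actions, alpha=0.1, gamma=0.9, eps=0.1, margin=1.0):
        self.alpha, self.gamma, self.eps, self.margin = alpha, gamma, eps, margin
        self.q = defaultdict(dict)                        # q[state][action]
        self.actions = defaultdict(lambda: list(all_actions))  # per-state set

    def act(self, state):
        if random.random() < self.eps:
            return random.choice(self.actions[state])
        qs = self.q[state]
        return max(self.actions[state], key=lambda a: qs.get(a, 0.0))

    def update(self, state, action, reward, next_state):
        best_next = max((self.q[next_state].get(a, 0.0)
                         for a in self.actions[next_state]), default=0.0)
        old = self.q[state].get(action, 0.0)
        self.q[state][action] = old + self.alpha * (
            reward + self.gamma * best_next - old)
        self._prune(state)

    def _prune(self, state):
        """Drop actions whose value lags the state's best by more than margin,
        so no further Q-entries are allocated for them."""
        qs = self.q[state]
        if len(qs) < 2:
            return
        best = max(qs.values())
        self.actions[state] = [a for a in self.actions[state]
                               if qs.get(a, best) >= best - self.margin]

learner = DynamicQLearner(all_actions=range(10))
a = learner.act("s0")
learner.update("s0", a, reward=1.0, next_state="s1")
```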

Transition from Conventional to Reduced-Port Laparoscopic Gastrectomy to Treat Gastric Carcinoma: a Single Surgeon's Experience from a Small-Volume Center

  • Kim, Ho Goon;Kim, Dong Yi;Jeong, Oh
    • Journal of Gastric Cancer
    • /
    • Vol. 18, No. 2
    • /
    • pp.172-181
    • /
    • 2018
  • Purpose: This study aimed to evaluate the surgical outcomes and investigate the feasibility of reduced-port laparoscopic gastrectomy using learning curve analysis in a small-volume center. Materials and Methods: We reviewed 269 patients who underwent laparoscopic distal gastrectomy (LDG) for gastric carcinoma between 2012 and 2017. Among them, 159 patients underwent reduced-port laparoscopic gastrectomy. The cumulative sum technique was used for quantitative assessment of the learning curve. Results: There were no statistically significant differences in the baseline characteristics of patients who underwent conventional and reduced-port LDG, and the operative time did not significantly differ between the groups. However, the amount of intraoperative bleeding was significantly lower in the reduced-port laparoscopic gastrectomy group (56.3 vs. 48.2 mL; P<0.001). There were no significant differences between the groups in terms of the first flatus time or length of hospital stay. Neither the incidence nor the severity of the complications significantly differed between the groups. The slope of the cumulative sum curve indicates the trend of learning performance. After 33 operations, the slope gently stabilized, which was regarded as the breakpoint of the learning curve. Conclusions: The surgical outcomes of reduced-port laparoscopic gastrectomy were comparable to those of conventional laparoscopic gastrectomy, suggesting that transition from conventional to reduced-port laparoscopic gastrectomy is feasible and safe, with a relatively short learning curve, in a small-volume center.
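
For readers unfamiliar with the cumulative sum (CUSUM) technique used here for learning-curve analysis, the following sketch computes a CUSUM curve over a synthetic series of operative times and locates its peak as the breakpoint where the slope stabilizes; the data and the breakpoint rule are illustrative only, not the study's dataset.

```python
import numpy as np

def cusum(values):
    """Cumulative sum of each case's deviation from the overall mean:
    a rising slope marks the learning phase, a flattening marks mastery."""
    v = np.asarray(values, dtype=float)
    return np.cumsum(v - v.mean())

rng = np.random.default_rng(0)
# Synthetic operative times: longer early in the series, stabilizing later.
op_times = np.concatenate([rng.normal(260, 20, 33), rng.normal(220, 15, 126)])
curve = cusum(op_times)
breakpoint_case = int(np.argmax(curve)) + 1  # case where the slope turns over
print(f"estimated learning-curve breakpoint: case {breakpoint_case}")
```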

Evolutionary Computation for Real-Time Adaptive Learning Control (II)

  • Jang, Sung-Wook;Lee, Jin-Kul
    • The Korean Society of Mechanical Engineers: Conference Proceedings
    • /
    • Proceedings of the KSME 2001 Spring Annual Meeting B
    • /
    • pp.730-734
    • /
    • 2001
  • In this study, the real-time adaptive learning control algorithm proposed in paper (I) is validated experimentally by applying it to a hydraulic servo system, whose fluid dynamics exhibit strong nonlinearity. The evolutionary strategy automatically adjusts its search region through natural competition among many individuals. The error generated by the dynamic system is fed into the mutation equation, and the number of competing individuals is reduced as the search region is automatically adjusted in accordance with the error. As described in paper (I), the numbers of parent and offspring individuals can be reduced so that the evolutionary algorithm runs in real time. The feasibility of the new algorithm suggested by the computer simulations of paper (I) is demonstrated through real-time tests, and its behavior is examined in actual experiments.

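A minimal sketch of the mechanism described in this entry, assuming a simple first-order plant in place of the hydraulic servo system: a small (mu+lambda) evolutionary strategy tunes a controller gain, with the mutation step scaled by the current plant error so that the search region contracts automatically as the error falls.

```python
import random

DT = 0.01  # assumed integration step

def plant_error(gain, setpoint=1.0, steps=50):
    """Accumulated tracking error of a first-order plant under P-control."""
    y, err = 0.0, 0.0
    for _ in range(steps):
        u = gain * (setpoint - y)
        y += DT * (-y + u)
        err += abs(setpoint - y) * DT
    return err

def realtime_es(generations=100, parents=2, offspring=4):
    """(mu+lambda)-ES with very few individuals, as needed for real-time use."""
    pop = [random.uniform(0.0, 10.0) for _ in range(parents)]
    for _ in range(generations):
        # The plant error drives the mutation equation: a large error gives a
        # wide search region, a small error contracts it automatically.
        sigma = 0.5 * min(plant_error(g) for g in pop)
        children = [random.choice(pop) + random.gauss(0.0, sigma)
                    for _ in range(offspring)]
        pop = sorted(pop + children, key=plant_error)[:parents]
    return pop[0]

print(f"tuned gain: {realtime_es():.3f}")
```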

Evolutionary Computation for Real-Time Adaptive Learning Control (I)

  • Jang, Sung-Wook;Lee, Jin-Kul
    • The Korean Society of Mechanical Engineers: Conference Proceedings
    • /
    • Proceedings of the KSME 2001 Spring Annual Meeting B
    • /
    • pp.724-729
    • /
    • 2001
  • This paper combines the theory of reinforcement learning, which supports real-time learning, with the evolutionary strategy, which has proven superior at finding optimal solutions in off-line learning. The number of individuals is reduced so that the evolutionary strategy can learn in real time, and a new method that guarantees the convergence of evolutionary mutations is proposed. This makes it possible to control a plant whose characteristics vary over time: each time a state value of the plant is generated, the evolutionary strategy is applied at that sampling instant, so the learning cycle of estimation, selection, and mutation runs in real time. With these algorithms, designers who lack expertise in the technical tuning of dynamic systems can still build controllers, including for systems whose dynamics vary slightly over time. Future work should verify the theory through experiments and examine robustness against external disturbances.

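The per-sampling-time learning cycle can be sketched as follows: at each sampling instant, one estimation-selection-mutation pass runs over a deliberately small population, so the whole pass fits within a single sampling period. The one-step-ahead internal model used for estimation and the monotonically decaying mutation step (one simple way to make mutations converge) are illustrative assumptions, not the paper's exact formulation.

```python
import random

DT = 0.01  # assumed sampling period

def predicted_error(gain, setpoint, y):
    """One-step-ahead tracking error under an assumed first-order plant model."""
    u = gain * (setpoint - y)
    return abs(setpoint - (y + DT * (-y + u)))

def control_cycle(n_samples=2000, parents=2, offspring=2,
                  sigma0=1.0, decay=0.998, setpoint=1.0):
    pop = [random.uniform(0.0, 5.0) for _ in range(parents)]
    sigma, y = sigma0, 0.0
    for _ in range(n_samples):
        # Estimation + selection: rank candidates by predicted error and keep
        # the best parents; the tiny population keeps this inside one period.
        pop = sorted(pop, key=lambda g: predicted_error(g, setpoint, y))[:parents]
        # Mutation: a few offspring with a decaying step size, which forces
        # the mutations to converge over time.
        pop += [random.choice(pop) + random.gauss(0.0, sigma)
                for _ in range(offspring)]
        sigma *= decay
        # Apply the best candidate's control to the (simulated) plant.
        y += DT * (-y + pop[0] * (setpoint - y))
    return pop[0]

print(f"gain after real-time tuning: {control_cycle():.3f}")
```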

Real-Time Self-Tuning Learning Control Based on Evolutionary Computation

  • Jang, Sung-Wook;Lee, Jin-Kul
    • The Korean Society of Mechanical Engineers: Conference Proceedings
    • /
    • Proceedings of the KSME 2001 Spring Annual Meeting B
    • /
    • pp.105-109
    • /
    • 2001
  • This paper discusses real-time self-tuning learning control based on evolutionary computation, which has proven superior at finding optimal solutions in off-line learning. The number of individuals is reduced so that the evolutionary strategy can learn in real time, and a new method that guarantees the convergence of evolutionary mutations is proposed. This makes it possible to control a plant whose characteristics vary over time: each time a state value of the plant is generated, the evolutionary strategy is applied at that sampling instant, so the learning cycle of estimation, selection, and mutation runs in real time. With these algorithms, designers who lack expertise in the technical tuning of dynamic systems can still build controllers, including for systems whose dynamics vary slightly over time.


Self-Tuning PID Control Based on a Real-Time Adaptive Learning Evolutionary Algorithm

  • Jang, Sung-Wook;Lee, Jin-Kul
    • Transactions of the Korean Society of Mechanical Engineers A
    • /
    • Vol. 27, No. 9
    • /
    • pp.1463-1468
    • /
    • 2003
  • This paper presents real-time self-tuning learning control based on evolutionary computation, which has proven superior at finding optimal solutions in off-line learning. The population is reduced so that the evolutionary strategy can learn in real time, and a new method that guarantees the convergence of evolutionary mutations is proposed. This makes it possible to control a plant whose characteristics vary slightly over time: each time a state value of the plant is generated, the evolutionary strategy is applied at that sampling instant, since the learning cycle of estimation, selection, and mutation is performed in real time. With these algorithms, designers who lack expertise in the technical tuning of dynamic systems can still design controllers, including for systems whose dynamics vary slightly over time.
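
Applied to PID control, the same idea might look like the following sketch: each individual is a (Kp, Ki, Kd) triple evaluated on a short simulated step response, and a small evolutionary strategy keeps and mutates the best triples. The second-order plant and all constants are assumptions, not the paper's test system.

```python
import random

DT = 0.01  # assumed sampling period

def step_response_cost(gains, steps=200):
    """Integrated absolute error of an assumed mass-spring-damper plant
    (y'' = -2y' - y + u) under PID control of a unit step."""
    kp, ki, kd = gains
    y, v, integ, prev_err, cost = 0.0, 0.0, 0.0, 1.0, 0.0
    for _ in range(steps):
        err = 1.0 - y
        integ += err * DT
        deriv = (err - prev_err) / DT
        prev_err = err
        u = kp * err + ki * integ + kd * deriv
        v += DT * (-2.0 * v - y + u)
        y += DT * v
        cost += abs(err) * DT
    return cost

def tune_pid(generations=150, parents=2, offspring=4, sigma=0.3):
    """Small (mu+lambda)-ES over (Kp, Ki, Kd) triples."""
    pop = [tuple(random.uniform(0.0, 5.0) for _ in range(3))
           for _ in range(parents)]
    for _ in range(generations):
        children = [tuple(g + random.gauss(0.0, sigma)
                          for g in random.choice(pop))
                    for _ in range(offspring)]
        pop = sorted(pop + children, key=step_response_cost)[:parents]
    return pop[0]

kp, ki, kd = tune_pid()
print(f"tuned gains: Kp={kp:.2f}, Ki={ki:.2f}, Kd={kd:.2f}")
```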

A Study on Reducing the Learning Time of Deep Learning Using Network Separation

  • Lee, Hee-Yeol;Lee, Seung-Ho
    • Journal of the Institute of Korean Electrical and Electronics Engineers
    • /
    • Vol. 25, No. 2
    • /
    • pp.273-279
    • /
    • 2021
  • In this paper, we propose an algorithm that shortens learning time by splitting a deep learning architecture and training the parts separately. The proposed algorithm consists of four stages: setting the network split point, extracting feature vectors, removing feature noise, and classifying classes. First, in the split-point setting stage, the split point of the network architecture is chosen for effective feature vector extraction. Second, in the feature extraction stage, feature vectors are extracted using previously trained weights, without additional training. Third, in the noise removal stage, the extracted feature vectors are taken as input and the output values of each class are learned to remove noise from the data. Fourth, in the classification stage, the denoised feature vectors are fed into a multilayer perceptron, which produces outputs and is trained. To evaluate the performance of the proposed algorithm, experiments were conducted with the Extended Yale B face database. The results show that the proposed algorithm shortens the time required for a single training pass by 40.7% compared with the existing algorithm, and that fewer training iterations are needed to reach the target recognition rate. These results confirm that the proposed algorithm improves on the existing one by reducing both the single-pass and total learning times.
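
A hedged sketch of the four-stage pipeline in a deep learning framework (PyTorch is assumed here): the network is split at a chosen point, the already-trained front half is frozen and reused as a feature extractor with no further training, and only a small denoising-plus-classifier head is trained on the extracted features, which is what shortens each training pass. The layer sizes are illustrative, and random tensors stand in for the Extended Yale B images.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stages 1-2: the front half of the network up to the split point, with
# weights frozen so feature extraction needs no additional training.
feature_extractor = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 256), nn.ReLU())
for p in feature_extractor.parameters():
    p.requires_grad = False

# Stage 3: a small denoising layer over the extracted feature vectors.
denoiser = nn.Sequential(nn.Linear(256, 256), nn.ReLU())
# Stage 4: a multilayer-perceptron classifier head (38 Extended Yale B classes).
classifier = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 38))

head = nn.Sequential(denoiser, classifier)
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Dummy stand-ins for face images and labels, for illustration only.
images = torch.randn(64, 1, 32, 32)
labels = torch.randint(0, 38, (64,))

with torch.no_grad():                      # features are computed once, cheaply
    features = feature_extractor(images)

for epoch in range(5):                     # only the small head is trained,
    optimizer.zero_grad()                  # which is what shortens each pass
    loss = loss_fn(head(features), labels)
    loss.backward()
    optimizer.step()
print(f"final loss: {loss.item():.3f}")
```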

Adaptive Learning Control of an Electro-Hydraulic Servo System Using a Real-Time Evolving Neural Network Algorithm

  • Jang, Sung-Wook;Lee, Jin-Kul
    • Journal of Institute of Control, Robotics and Systems
    • /
    • Vol. 8, No. 7
    • /
    • pp.584-588
    • /
    • 2002
  • The real-time capability of the adaptive learning control algorithm is validated by applying it to a hydraulic servo system, which exhibits very strong nonlinearity. The evolutionary strategy automatically adjusts its search region through natural competition among many individuals. The error generated by the dynamic system is fed into the mutation equation, and the number of competing individuals is reduced as the search region is automatically adjusted in accordance with the error. In this paper, the numbers of parent and offspring individuals are reduced so that the evolutionary algorithm runs in real time. The feasibility of the newly proposed algorithm is demonstrated through real-time tests.
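
One way such an evolving neural network controller could be structured is sketched below: a tiny feedforward network maps the tracking error and its rate to a control signal, and its weights are mutated by a small evolutionary strategy whose step size is proportional to the current tracking cost, so the search region contracts as performance improves. The 2-4-1 network layout and the first-order stand-in plant are assumptions, not the paper's electro-hydraulic servo model.

```python
import numpy as np

rng = np.random.default_rng(1)
DT = 0.01  # assumed sampling period

def controller(w, err, derr):
    """2-4-1 tanh network; w is a flat 17-element weight vector (assumed layout)."""
    w1, b1, w2, b2 = w[:8].reshape(4, 2), w[8:12], w[12:16], w[16]
    h = np.tanh(w1 @ np.array([err, derr]) + b1)
    return float(w2 @ h + b2)

def episode_cost(w, steps=300, setpoint=1.0):
    """Integrated tracking error on an assumed first-order plant."""
    y, prev_err, cost = 0.0, setpoint, 0.0
    for _ in range(steps):
        err = setpoint - y
        u = controller(w, err, (err - prev_err) / DT)
        prev_err = err
        y += DT * (-y + u)
        cost += abs(err) * DT
    return cost

def evolve(generations=80, parents=2, offspring=4):
    pop = [rng.normal(0.0, 0.5, 17) for _ in range(parents)]
    for _ in range(generations):
        # The tracking cost drives the mutation equation, as in the abstract.
        sigma = 0.2 * min(episode_cost(w) for w in pop)
        children = [pop[rng.integers(parents)] + rng.normal(0.0, sigma, 17)
                    for _ in range(offspring)]
        pop = sorted(pop + children, key=episode_cost)[:parents]
    return pop[0]

best = evolve()
print(f"best tracking cost: {episode_cost(best):.4f}")
```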

Improving Deep Learning Models Considering the Time Lags between Explanatory and Response Variables

  • Chaehyeon Kim;Ki Yong Lee
    • Journal of Information Processing Systems
    • /
    • Vol. 20, No. 3
    • /
    • pp.345-359
    • /
    • 2024
  • A regression model represents the relationship between explanatory and response variables. In real life, explanatory variables often affect a response variable with a certain time lag rather than immediately; for example, the marriage rate affects the birth rate with a lag of 1 to 2 years. Although deep learning models have been successfully used to model various relationships, most do not consider the time lags between explanatory and response variables. Therefore, in this paper, we propose an extension of deep learning models that automatically finds these time lags. The proposed method determines which past values of the explanatory variables minimize the model's error, and uses them to set the time lag between each explanatory variable and the response variable. After the time lags are determined, the proposed method retrains the deep learning model with these lags reflected. Through experiments applying the proposed method to several deep learning models, we confirm that it can find a more accurate model, whose error is reduced by more than 60% compared to the original model.
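
The lag-search idea can be sketched with a simple stand-in model: for each candidate lag, fit a model on the lag-shifted series and keep the lag that minimizes the error, then retrain the final model with that lag applied. Plain linear regression below stands in for the deep learning models of the paper; the data is synthetic with a known true lag of 2 steps.

```python
import numpy as np

rng = np.random.default_rng(42)

def fit_error(x, y):
    """Mean squared error of a 1-D least-squares fit of y on x."""
    A = np.stack([x, np.ones_like(x)], axis=1)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.mean((A @ coef - y) ** 2))

def best_lag(x, y, max_lag=5):
    """Return the lag of x (in steps) that best explains y."""
    errors = {}
    for lag in range(max_lag + 1):
        x_shift = x[: len(x) - lag] if lag else x
        errors[lag] = fit_error(x_shift, y[lag:])
    return min(errors, key=errors.get)

# Synthetic data: y responds to x with a true lag of 2 steps.
x = rng.normal(size=300)
y = np.roll(x, 2) * 1.5 + rng.normal(scale=0.1, size=300)
y[:2] = 0.0  # np.roll wraps around; zero out the invalid head

lag = best_lag(x, y)
print(f"estimated time lag: {lag}")  # expected: 2
```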