• Title/Abstract/Keywords: Stochastic Learning

142 results (processing time: 0.031 s)

Visual Object Manipulation Based on Exploration Guided by Demonstration

  • 김두준;조현준;송재복
    • The Journal of Korea Robotics Society / Vol. 17, No. 1 / pp. 40-47 / 2022
  • Manipulating objects through reinforcement learning requires a reward function suited to the task. However, designing such a reward function is difficult when sufficient information about the objects cannot be obtained. In this study, a demonstration-based object manipulation algorithm called stochastic exploration guided by demonstration (SEGD) is proposed to solve this reward design problem. SEGD is a reinforcement learning algorithm in which a sparse reward explorer (SRE) and an interpolated policy using demonstration (IPD) are added to soft actor-critic (SAC). SRE ensures that the SAC critic can be trained by collecting prior data, and IPD limits the exploration space by making SEGD's actions similar to the expert's actions. With these two components, SEGD can learn from the task's sparse reward alone, without a hand-designed reward function. To verify SEGD, experiments were conducted on three tasks, in which it achieved success rates of more than 96.5%.

Actor-Critic Reinforcement Learning System with Time-Varying Parameters

  • Obayashi, Masanao;Umesako, Kosuke;Oda, Tazusa;Kobayashi, Kunikazu;Kuremoto, Takashi
    • Institute of Control, Robotics and Systems (ICROS): Conference Proceedings / ICCAS 2003 / pp. 138-141 / 2003
  • Reinforcement learning has recently attracted the attention of many researchers because of its simple and flexible ability to learn in any environment. Many reinforcement learning methods have been proposed so far, such as Q-learning, actor-critic, and the stochastic gradient ascent method. A reinforcement learning system can adapt to changes in the environment because it interacts with it. However, when the environment changes periodically, the system cannot adapt well. In this paper, we propose a reinforcement learning system that adapts to periodic changes in the environment by introducing time-varying adjustable parameters. A simulation study of a maze problem with an aisle that opens and closes periodically shows that the proposed method works well, whereas the conventional method with constant adjustable parameters does not.

Exploring the Usage of the DEMATEL Method to Analyze the Causal Relations Between the Factors Facilitating Organizational Learning and Knowledge Creation in the Ministry of Education

  • Park, Sun Hyung;Kim, Il Soo;Lim, Seong Bum
    • International Journal of Contents / Vol. 12, No. 4 / pp. 31-44 / 2016
  • Knowledge creation and management are regarded as critical success factors for an organization's survival in the knowledge era. As a process of knowledge acquisition and sharing, organizational learning mechanisms (OLMs) guide the learning function of organizations, which is represented by their different learning activities. We examined a variety of learning processes that constitute OLMs. In this study, we aimed to capture the process and framework of OLMs and of knowledge sharing and acquisition. Factors facilitating OLMs were investigated at three levels: individual, group, and organizational. The concept of an OLM has received some attention in the field of organizational learning; however, the relationship among the factors generating OLMs has not been empirically tested. As part of the ongoing discussion, we attempted a systemic approach to OLMs. OLMs can be represented by factors that are inherent to the organization's system; therefore, prior to empirically testing the OLM-generating factors, their organizational integration must be evaluated to determine the effective treatment of each factor. Thus, we developed a framework to manage knowledge and proposed a method to numerically evaluate the factors influencing OLMs. Specifically, the composite importance (CI) of the Decision-Making Trial and Evaluation Laboratory (DEMATEL) method was applied to explore the interaction effects of these factors based on a systemic approach. The augmented matrix thus generated is expected to serve as the stochastic matrix of an absorbing Markov chain.

A survey on parallel training algorithms for deep neural networks

  • 육동석;이효원;유인철
    • The Journal of the Acoustical Society of Korea / Vol. 39, No. 6 / pp. 505-514 / 2020
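  • Training a deep neural network (DNN) model on a large amount of training data takes a long time, so parallel training methods are needed. DNNs are generally trained with stochastic gradient descent (SGD), but because SGD fundamentally requires sequential processing, various approximation methods are applied to parallelize it. This paper introduces existing parallel training algorithms for DNNs and analyzes their computational cost, communication cost, and approximation methods.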
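
As a rough illustration of how SGD can be parallelized, the sketch below shows one widely used scheme, synchronous data parallelism, in which each worker computes a gradient on its own shard of a mini-batch and the gradients are averaged before a single update. It is not taken from the survey; the scalar model, the function names (`grad`, `parallel_sgd_step`, `train`), and the hyperparameters are all assumptions chosen for brevity, and the workers are simulated sequentially.

```python
import random

def grad(w, batch):
    """Gradient of the mean of 0.5 * (w * x - y)^2 over a batch, for a scalar weight w."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)

def parallel_sgd_step(w, shards, lr):
    """One synchronous data-parallel step: per-shard gradients are averaged, then applied once."""
    # On real hardware each shard gradient would be computed by a separate worker.
    grads = [grad(w, shard) for shard in shards]
    avg_grad = sum(grads) / len(grads)
    return w - lr * avg_grad

def train(data, num_workers=4, lr=0.1, steps=200, seed=0):
    """Train a scalar linear model y = w * x with simulated data-parallel SGD."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        batch = rng.sample(data, num_workers * 2)           # mini-batch of 8 samples
        shards = [batch[i::num_workers] for i in range(num_workers)]  # 2 samples per worker
        w = parallel_sgd_step(w, shards, lr)
    return w

# Hypothetical noiseless data generated from y = 3x.
data = [(k / 10.0, 3.0 * k / 10.0) for k in range(1, 21)]
w_hat = train(data)
```

Because the shards have equal size, the averaged gradient equals the gradient of the whole mini-batch, so this scheme trades the strictly sequential per-sample updates for larger, less frequent ones.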

Fault Detection Algorithm of Photovoltaic Power Systems using Stochastic Decision Making Approach

  • 조현철;이관호
    • Journal of the Institute of Convergence Signal Processing / Vol. 12, No. 3 / pp. 212-216 / 2011
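  • Fault detection for photovoltaic (PV) power systems is attracting attention as an advanced technology for minimizing the technical and economic losses caused by faults. This paper proposes a new fault diagnosis algorithm for PV power systems that uses a Fourier neural network and a stochastic decision-making method. First, a neural network model is constructed through steepest-descent-based optimization to model the dynamics of the PV system, and a stochastic fault detection technique for the PV system is then proposed using the GLRT algorithm. To validate the proposed fault detection algorithm, a PV fault detection testbed was built and real-time experiments were conducted, in which the signals from the PV system were transmitted over DC power line communication.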
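
To make the GLRT step more concrete, here is a minimal sketch of a generalized likelihood ratio test on model residuals (measured output minus neural network prediction). The abstract does not give the paper's actual formulation, so this assumes the simplest textbook case: Gaussian residuals with known standard deviation `sigma`, zero mean when healthy, and an unknown nonzero mean under a fault. The function names and the sample values are hypothetical.

```python
def glrt_statistic(residuals, sigma):
    """2 * log of the generalized likelihood ratio for a Gaussian mean shift with known sigma.

    H0: residual mean is 0 (healthy). H1: residual mean is unknown and estimated
    by the sample mean. The statistic reduces to n * mean^2 / sigma^2.
    """
    n = len(residuals)
    mean = sum(residuals) / n
    return n * mean * mean / (sigma * sigma)

def detect_fault(residuals, sigma, threshold=3.841):
    """Flag a fault when the statistic exceeds the threshold.

    Under H0 the statistic follows a chi-square distribution with 1 degree of
    freedom; 3.841 is its 95% quantile, i.e. a 5% false-alarm rate.
    """
    return glrt_statistic(residuals, sigma) > threshold

# Hypothetical residual windows: one around zero, one shifted by a constant offset.
healthy = [0.1, -0.2, 0.05, -0.1, 0.15, -0.05, 0.0, 0.05]
faulty = [r + 1.0 for r in healthy]
```

With `sigma = 0.5`, `detect_fault(healthy, 0.5)` stays below the threshold while `detect_fault(faulty, 0.5)` exceeds it.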

A Design of Intelligent and Evolving Receiver Based on Stochastic Morphological Sampling Theorem

  • 박재현;이경록;송문호;김운경
    • Institute of Electronics Engineers of Korea (IEEK): Conference Proceedings / IEEK 1998 Summer Conference Proceedings / pp. 46-49 / 1998
  • In this paper, we introduce the notion of intelligent communication by presenting a novel intelligent receiver model. This receiver continually evolves, learning and improving in performance as it accumulates experience over time. In a digital communication context, in a typical training mode, it learns the concept of "1" as deteriorated by arbitrary (not necessarily additive, as is typically assumed) disturbance and/or modulation. After learning "1", in test mode, it classifies received signals as "1" or "0" almost perfectly. The intelligent receiver as implemented is grounded on the recently introduced Stochastic Morphological Sampling Theorem (SMST), a distribution-free result that gives theoretical bounds on the sample complexity (training size) needed for required performance parameters such as accuracy ($\varepsilon$) and confidence ($\delta$). Based on this theorem, we demonstrate that, almost irrespective of the channel and modulation model, the number of samples needed to learn the concept of "1" is not too "large", and that the resulting universal receiver structure, which corresponds to the classical Nearest Neighbor rule in Pattern Recognition Theory, is trivial. We check the surprising efficiency and validity of this model through some simple simulations.
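
The Nearest Neighbor rule mentioned in the abstract can be sketched as follows. This is only the generic 1-nearest-neighbor classifier, not the paper's receiver: the squared Euclidean distance, the three-sample waveforms, and the function name are assumptions made for illustration, and the SMST sample-complexity bound itself is not reproduced.

```python
def nearest_neighbor_classify(sample, training_set):
    """Return the label of the training waveform closest to `sample`.

    `training_set` is a list of (waveform, label) pairs; closeness is measured
    by squared Euclidean distance between equal-length waveforms.
    """
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(training_set, key=lambda pair: dist2(sample, pair[0]))
    return label

# Hypothetical received waveforms recorded during a training mode.
training = [
    ([1.0, 0.9, 1.1], 1),   # distorted "1"
    ([0.0, 0.1, -0.1], 0),  # distorted "0"
]
bit_a = nearest_neighbor_classify([0.8, 1.0, 0.7], training)
bit_b = nearest_neighbor_classify([0.2, 0.0, 0.1], training)
```

Here `bit_a` is decoded as 1 and `bit_b` as 0, since each test waveform lies closer to the corresponding training example.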

Stochastic independence of events in the middle and high school education course -Focusing on the connections between math concepts-

  • 김성래;서종진
    • Journal of the Korean School Mathematics Society / Vol. 15, No. 1 / pp. 199-214 / 2012
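  • The concept of independence of events plays an important and useful role in probability and statistics. In this paper, we investigated how independence of events is treated in secondary schools and examined how well students know the mathematical concepts related to it. The results show that students lack understanding of the sub-concepts related to independence of events and that the connections among the related concepts appear only partially, indicating that instruction is needed so that sub-concepts and higher-level concepts become well connected.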
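
For readers who want the definition at hand, two events A and B are independent when P(A ∩ B) = P(A) · P(B). The short sketch below checks this condition exactly for two fair dice; the example events and helper names are not from the paper and are chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
OUTCOMES = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event given as a predicate on an outcome."""
    return Fraction(sum(1 for o in OUTCOMES if event(o)), len(OUTCOMES))

def independent(a, b):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(lambda o: a(o) and b(o)) == prob(a) * prob(b)

def first_even(o):
    return o[0] % 2 == 0

def sum_is_7(o):
    return o[0] + o[1] == 7

def sum_is_8(o):
    return o[0] + o[1] == 8
```

"First die is even" and "sum is 7" are independent (1/12 = 1/2 · 1/6), whereas "first die is even" and "sum is 8" are not (1/12 ≠ 1/2 · 5/36), which shows that independence cannot be judged by intuition alone.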

A Study on the Development of DGA based on Deep Learning

  • 박재균;최은수;김병준;장범
    • Korean Journal of Artificial Intelligence / Vol. 5, No. 1 / pp. 18-28 / 2017
  • Many companies now use systems based on artificial intelligence. The accuracy of artificial intelligence depends on the amount of training data and on the choice of an appropriate algorithm. However, it is not easy to obtain training data with a large number of entities, and small data sets lead to large generalization errors due to overfitting. To minimize this generalization error, this study proposes DGA, which applies a machine-learning-based genetic algorithm to deep-learning-based dropout and can be expected to achieve relatively high accuracy even with a small data set. The idea of this paper is to determine the active state of the nodes. A new fitness function is defined using the gradient of the loss function. The proposed DGA compensates for the stochastic inconsistency of dropout. DGA also addresses the problems that genetic algorithms face from the complexity of the fitness function and the expressive range of the model. In experiments using MNIST data, the proposed algorithm achieved an accuracy of 75.3%, whereas using dropout alone achieved 41.4%. This shows that DGA performs better than dropout alone.
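
The randomness that DGA is said to compensate for comes from how dropout chooses active nodes. The sketch below shows standard inverted dropout only, not DGA itself: each node is dropped with probability `p_drop` by an independent random draw, so two forward passes over the same input generally use different masks. The function name and the parameter values are assumptions for illustration.

```python
import random

def dropout(activations, p_drop, rng):
    """Inverted dropout: zero each unit with probability p_drop and scale survivors.

    Returns the masked activations and the 0/1 mask. Scaling by 1 / (1 - p_drop)
    keeps the expected activation unchanged.
    """
    keep = 1.0 - p_drop
    mask = [1 if rng.random() < keep else 0 for _ in activations]
    masked = [a * m / keep for a, m in zip(activations, mask)]
    return masked, mask

rng = random.Random(0)
layer = [1.0] * 1000
out, mask = dropout(layer, 0.5, rng)
mean_out = sum(out) / len(out)  # close to 1.0 on average
```

In this sketch the active state of each node is random; the abstract describes DGA as determining that state instead, using a fitness function built from the gradient of the loss.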

Step-Size Control for Width Adaptation in Radial Basis Function Networks for Nonlinear Channel Equalization

  • Kim, Nam-Yong
    • Journal of Communications and Networks / Vol. 12, No. 6 / pp. 600-604 / 2010
  • A method of width adaptation in the radial basis function network (RBFN) using the stochastic gradient (SG) algorithm is introduced. Using the Taylor expansion of the error signal and differentiating the error with respect to the step-size, the optimal time-varying step-size for the width in the RBFN is derived. The proposed approach to adjusting widths in the RBFN achieves superior learning speed and steady-state mean square error (MSE) performance in nonlinear channel environments. The proposed method enhanced steady-state MSE performance by more than 3 dB in both nonlinear channel environments. The results confirm that controlling the step-size of the width in the RBFN with the proposed algorithm can be an effective approach to improving convergence speed and the steady-state value of the MSE.
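
The basic stochastic gradient update of an RBF width, on which the paper's step-size control builds, can be sketched as below. This is an assumption-laden simplification: a single one-dimensional Gaussian unit, a fixed step-size (the paper's optimal time-varying step-size is not reproduced), and hypothetical names and values throughout.

```python
import math

def rbf(x, center, width):
    """Gaussian radial basis function exp(-(x - c)^2 / (2 * width^2))."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

def sg_width_step(x, target, weight, center, width, step_size):
    """One stochastic gradient step on the width for the squared error 0.5 * e^2."""
    phi = rbf(x, center, width)
    error = target - weight * phi
    # d(output)/d(width) = weight * phi * (x - center)^2 / width^3
    d_out = weight * phi * (x - center) ** 2 / width ** 3
    # Gradient descent on 0.5 * e^2 moves the width along +error * d_out.
    return width + step_size * error * d_out

def fit_width(samples, weight=1.0, center=0.0, width=0.5, step_size=0.1, epochs=100):
    """Adapt only the width of one RBF unit, sample by sample."""
    for _ in range(epochs):
        for x, target in samples:
            width = sg_width_step(x, target, weight, center, width, step_size)
    return width

# Hypothetical targets generated by a unit whose true width is 1.0.
true_width = 1.0
samples = [(k / 2.0, rbf(k / 2.0, 0.0, true_width)) for k in range(-4, 5)]
learned = fit_width(samples)
```

With these settings the width moves from 0.5 toward the true value of 1.0; a fixed step-size is used here precisely because choosing it well is the problem the paper addresses.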

Dropout Genetic Algorithm Analysis for Deep Learning Generalization Error Minimization

  • Park, Jae-Gyun;Choi, Eun-Soo;Kang, Min-Soo;Jung, Yong-Gyu
    • International Journal of Advanced Culture Technology / Vol. 5, No. 2 / pp. 74-81 / 2017
  • Many companies now use systems based on artificial intelligence. The accuracy of artificial intelligence depends on the amount of training data and on the choice of an appropriate algorithm. However, it is not easy to obtain training data with a large number of entities, and small data sets lead to large generalization errors due to overfitting. To minimize this generalization error, this study proposes DGA (Dropout Genetic Algorithm), which applies a machine-learning-based genetic algorithm to deep-learning-based dropout and can be expected to achieve relatively high accuracy even with a small data set. The idea of this paper is to determine the active state of the nodes. A new fitness function is defined using the gradient of the loss function. The proposed DGA compensates for the stochastic inconsistency of dropout. DGA also addresses the problems that genetic algorithms face from the complexity of the fitness function and the expressive range of the model. In experiments using MNIST data, the proposed algorithm achieved an accuracy of 75.3%, whereas using dropout alone achieved 41.4%. This shows that DGA performs better than dropout alone.