• Title/Abstract/Keywords: learning function

Search results: 2,315 items; processing time: 0.037 seconds

Study on Derivation and Implementation of Quantized Gradient for Machine Learning

  • 석진욱
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • Vol. 15, No. 1
    • /
    • pp.1-8
    • /
    • 2020
  • This paper proposes a method for deriving a quantized gradient for machine learning on embedded systems. The proposed differentiation method derives the quantized gradient vector of an objective function and establishes its validity as a directional derivative. Moreover, mathematical analysis shows that the sequence generated by the learning equation based on the proposed quantization converges to the optimal point of the quantized objective function when the quantization parameter is sufficiently large. Simulation results show that an optimization solver based on the proposed quantized method achieves sufficient performance compared with the conventional floating-point method.
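
The convergence claim in this abstract can be illustrated with a toy experiment. The sketch below is not the paper's derivation: it assumes a plain rounding quantizer onto a grid of resolution 1/q_param, and the names `quantize`, `quantized_gradient_descent`, and the test objective f(x) = (x - 3)^2 are all hypothetical choices made for illustration.

```python
def quantize(value: float, q_param: int) -> float:
    """Round value to the nearest multiple of 1/q_param (a uniform grid).

    Assumption: a simple rounding quantizer, not the paper's quantization.
    """
    return round(value * q_param) / q_param


def quantized_gradient_descent(grad, x0, learning_rate, q_param, steps):
    """Gradient descent in which the update step and the iterate stay on the grid."""
    x = quantize(x0, q_param)
    for _ in range(steps):
        step = quantize(learning_rate * grad(x), q_param)
        x = quantize(x - step, q_param)
    return x


def grad_f(x: float) -> float:
    """Gradient of the hypothetical objective f(x) = (x - 3)^2."""
    return 2.0 * (x - 3.0)


# Coarse grid (resolution 1/4) versus fine grid (resolution 1/1024).
coarse = quantized_gradient_descent(grad_f, 0.0, 0.1, q_param=4, steps=200)
fine = quantized_gradient_descent(grad_f, 0.0, 0.1, q_param=1024, steps=200)
print(abs(coarse - 3.0), abs(fine - 3.0))
```

With the coarse grid the update rounds to zero before the iterate reaches the minimizer, so the search stalls early; the fine grid stalls much closer to x = 3, which is the qualitative behaviour the abstract describes for a sufficiently large quantization parameter.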

Labeling Q-Learning for Maze Problems with Partially Observable States

  • Lee, Hae-Yeon;Hiroyuki Kamaya;Kenich Abe
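    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • Institute of Control, Robotics and Systems, Proceedings of the 15th Conference, 2000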
    • /
    • pp.489-489
    • /
    • 2000
  • Recently, Reinforcement Learning (RL) methods have been used for learning problems in Partially Observable Markov Decision Process (POMDP) environments. Conventional RL methods, however, have limited applicability to POMDPs. To overcome the partial observability, several algorithms have been proposed [5], [7]. The aim of this paper is to extend our previous algorithm for POMDPs, called Labeling Q-learning (LQ-learning), which reinforces incomplete perceptual information with labeling. Namely, in LQ-learning the agent perceives the current state as a pair of an observation and its label, and can therefore distinguish more exactly between states that look the same. Labeling is carried out by a hash-like function, which we call the Labeling Function (LF). Numerous labeling functions can be considered, but in this paper we introduce several labeling functions based on only the 2 or 3 immediately preceding observations. We briefly introduce the basic idea of LQ-learning, apply it to maze problems, which are simple POMDP environments, and show its effectiveness with empirical results that look better than those of conventional RL algorithms.
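
The idea of perceiving a state as an (observation, label) pair can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the labeling functions, so the modular hash in `labeling_function`, the class `LQAgentSketch`, and all parameter values are hypothetical.

```python
from collections import defaultdict, deque


def labeling_function(history, n_labels=4):
    """Hypothetical hash-like LF: map the immediately preceding observations to a label."""
    label = 0
    for obs in history:
        label = (label * 31 + obs) % n_labels
    return label


class LQAgentSketch:
    """Tabular Q-learning over (observation, label) pairs instead of raw observations."""

    def __init__(self, n_actions, history_len=2, alpha=0.1, gamma=0.9):
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.history = deque(maxlen=history_len)  # last 2 (or 3) observations
        self.alpha = alpha
        self.gamma = gamma

    def perceive(self, obs):
        """Return the labeled state; the label depends only on earlier observations."""
        label = labeling_function(self.history)
        self.history.append(obs)
        return (obs, label)

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update applied to labeled states."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


# Two agents reach the same observation (5) through different pasts.
agent_a, agent_b = LQAgentSketch(n_actions=2), LQAgentSketch(n_actions=2)
agent_a.perceive(1); agent_a.perceive(2)
agent_b.perceive(2); agent_b.perceive(1)
state_a = agent_a.perceive(5)
state_b = agent_b.perceive(5)
print(state_a, state_b)
```

Because the label is computed from the recent observation history, the two occurrences of observation 5 map to different table entries, which is how labeling lets the agent separate states that look the same.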

Reward Shaping for a Reinforcement Learning Method-Based Navigation Framework

  • Roland, Cubahiro;Choi, Donggyu;Jang, Jongwook
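    • Korea Institute of Information and Communication Engineering: Conference Proceedings
    • /
    • Korea Institute of Information and Communication Engineering, 2022 Fall Conference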
    • /
    • pp.9-11
    • /
    • 2022
  • Applying Reinforcement Learning in everyday applications and varied environments has proved the potential of the field and revealed pitfalls along the way. In robotics, a learning agent gradually takes over control of a robot by abstracting the robot's navigation model with its inputs and outputs, thus reducing human intervention. The challenge for the agent is how to implement a feedback function that facilitates the learning process of an MDP problem in an environment while reducing the method's convergence time. In this paper, we implement a reward shaping system in a ROS environment that avoids sparse rewards, which give the learning agent less data. Reward shaping prioritizes behaviours that bring the robot closer to the goal by giving intermediate rewards, and helps the algorithm converge quickly. We use a pseudocode implementation to illustrate the method.
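
A minimal sketch of intermediate rewards that favour moving toward the goal is given below. It is not the paper's pseudocode: it assumes potential-based shaping with the negative Euclidean distance to the goal as the potential, and `potential`, `shaped_reward`, `goal_bonus`, and the 2-D positions are hypothetical names and values.

```python
import math


def potential(position, goal):
    """Hypothetical potential: closer to the goal means higher potential."""
    return -math.dist(position, goal)


def shaped_reward(prev_pos, new_pos, goal, reached_goal, gamma=0.99, goal_bonus=100.0):
    """Sparse goal reward plus a potential-based shaping term.

    The shaping term is gamma * phi(s') - phi(s), which is positive when the
    robot moves toward the goal and negative when it moves away.
    """
    sparse = goal_bonus if reached_goal else 0.0
    shaping = gamma * potential(new_pos, goal) - potential(prev_pos, goal)
    return sparse + shaping


goal = (5.0, 0.0)
reward_closer = shaped_reward((0.0, 0.0), (1.0, 0.0), goal, reached_goal=False)
reward_farther = shaped_reward((0.0, 0.0), (-1.0, 0.0), goal, reached_goal=False)
print(reward_closer, reward_farther)
```

With a purely sparse reward both steps would return 0, so the agent would receive no signal until it happened to reach the goal; the shaping term gives it feedback on every step.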

Servo-Writing Method using Feedback Error Learning Neural Networks for HDD

  • 김수환;정정주;심준석
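    • Korean Institute of Electrical Engineers: Conference Proceedings
    • /
    • Korean Institute of Electrical Engineers, 2004 Conference Proceedings, Information and Control Section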
    • /
    • pp.699-701
    • /
    • 2004
  • This paper proposes a servo-writing algorithm based on feedback error learning neural networks. The controller consists of a feedback controller using PID and a feedforward controller using a Gaussian radial basis function network (RBFN). Because the RBFNs are trained by an on-line rule, the controller has adaptation capability. The performance of the proposed controller is compared with that of a conventional PID controller. The proposed algorithm shows better performance than the PID controller.
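
The feedback error learning structure described here can be sketched in a few lines. The example makes several simplifying assumptions that are not from the paper: the feedback controller is proportional-only rather than full PID, the plant is a hypothetical first-order system y[k+1] = 0.9*y[k] + 0.1*u[k] rather than a disk drive, and the class `RBFFeedforward`, its centers, width, and learning rate are illustrative.

```python
import math


class RBFFeedforward:
    """Gaussian RBF network used as the feedforward controller."""

    def __init__(self, centers, width, learning_rate):
        self.centers = list(centers)
        self.width = width
        self.learning_rate = learning_rate
        self.weights = [0.0] * len(self.centers)

    def features(self, reference):
        return [math.exp(-((reference - c) ** 2) / (2.0 * self.width ** 2))
                for c in self.centers]

    def output(self, reference):
        return sum(w * phi for w, phi in zip(self.weights, self.features(reference)))

    def train(self, reference, feedback_output):
        # Feedback error learning: the feedback controller's output is the
        # error signal for the on-line weight update.
        for i, phi in enumerate(self.features(reference)):
            self.weights[i] += self.learning_rate * feedback_output * phi


def simulate(learn, steps=500, reference=1.0, kp=2.0):
    """Track a constant reference; return the final absolute tracking error."""
    net = RBFFeedforward(centers=[0.0, 1.0, 2.0], width=0.5, learning_rate=0.05)
    y = 0.0
    for _ in range(steps):
        u_fb = kp * (reference - y)      # feedback (proportional-only here)
        u_ff = net.output(reference)     # feedforward from the RBF network
        if learn:
            net.train(reference, u_fb)
        y = 0.9 * y + 0.1 * (u_fb + u_ff)  # hypothetical first-order plant
    return abs(reference - y)


error_feedback_only = simulate(learn=False)
error_with_learning = simulate(learn=True)
print(error_feedback_only, error_with_learning)
```

In this toy setting the proportional feedback alone leaves a steady-state error, while the on-line trained feedforward term gradually takes over the control effort and drives the feedback output toward zero.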

A Robust Learning Algorithm for System Identification

  • 한상현;윤중선
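    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • Institute of Control, Robotics and Systems, Proceedings of the 15th Conference, 2000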
    • /
    • pp.200-200
    • /
    • 2000
  • Highly nonlinear dynamical systems are easily identified using neural networks. When disturbances are included in the learning data set in system modeling, the modeling process performs poorly. Since the radial basis functions in the radial basis function network (RBFN) are centered at the points specified by the weights, RBF networks are robust for approximating processes that include narrow-band disturbances deviating significantly from the regular signals. To exclude (filter) these disturbances, a robust algorithm for system identification, based on the RBFN, is proposed. The performance of system identification excluding disturbances is investigated and compared with that including disturbances.

An Adaptive Control Technique for Time-Delayed Process Systems Using Fuzzy-Neural Networks

  • 최중락;곽동훈;이동익
    • Korean Society for Precision Engineering: Conference Proceedings
    • /
    • Korean Society for Precision Engineering, 1996 Fall Conference Proceedings
    • /
    • pp.994-998
    • /
    • 1996
  • We propose an approach to integrating fuzzy logic control with RBF (Radial Basis Function) networks and show how the integrated network can be applied to a multivariable self-organizing and self-learning fuzzy controller using a hybrid learning algorithm. To investigate its usefulness and performance, the controller is applied to a time-delayed process system. Simulation results show good control performance and fast convergence with the hybrid learning method.

A Study on the Method to Reinforce the Efficient Political Function for Lifelong Learning

  • 윤명희;이충렬;박종운;임현성
    • Journal of Fisheries and Marine Sciences Education
    • /
    • Vol. 22, No. 4
    • /
    • pp.576-588
    • /
    • 2010
  • The goal of this research was to survey ways to reinforce the efficient political function for lifelong learning as Busan metropolitan city promotes its lifelong policy. To achieve this goal, a survey on reinforcing the political function for lifelong education was carried out among persons in charge of lifelong education in Busan metropolitan city. The results are as follows. First, respondents showed a high degree of awareness of lifelong policy and business. Second, asked how much the tasks presented by the nation for promoting lifelong education are needed, they answered that a concrete agenda for an aging society is needed and that the institute necessary to invigorate it is a lifelong learning institute. Third, they answered that the policy that needs to be implemented as soon as possible is education for developing vocational competency, and a high proportion saw a need to hire education experts for lifelong education.

Efficient Learning Algorithm using Structural Hybrid of Multilayer Neural Networks and Gaussian Potential Function Networks

  • 박상봉;박래정;박철훈
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • Vol. 19, No. 12
    • /
    • pp.2418-2425
    • /
    • 1994
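  • This paper presents a new learning method that addresses the problems of the most widely used neural network learning method, Error Back Propagation (EBP), which is based on gradient descent: local minima, slow learning, the choice of network structure, and the initial interconnection weights. It does so by adding Gaussian Potential Function Networks (GPFN), which have a local learning capability, in parallel to a conventional multilayer neural network, thereby improving learning performance on problems that exhibit localized error patterns. Comparative experiments against the conventional EBP method on function approximation problems show that the proposed method has better generalization ability and faster learning, which demonstrates its efficiency.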

The Development of Janggi Board Game Using Backpropagation Neural Network and Q Learning Algorithm

  • 황상문;박인규;백덕수;진달복
    • Journal of the Institute of Electronics Engineers of Korea TE
    • /
    • Vol. 39, No. 1
    • /
    • pp.83-90
    • /
    • 2002
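  • This paper proposes a method for learning strategy from the information of a two-player board game, using a backpropagation neural network and the Q-learning algorithm. Learning takes place simply through games played against an opponent process. The system consists of a search component and a component that generates moves for the pieces. The move generator is updated according to the state of the board, and the search kernel, built on αβ search, combines the backpropagation neural network with Q-learning to learn a good evaluation function for the game. Unlike Temporal Difference learning, which only reduces the differences between adjacent evaluation values over a sequence of piece moves, the Q-learning algorithm used here learns the evaluation function from the updated evaluation values for each piece move, and can thereby derive an optimal strategy. In general, once extensive learning ensures the accuracy of the evaluation function, the win rate was found to be proportional to the amount of learning.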

An active learning method with difficulty learning mechanism for crack detection

  • Shu, Jiangpeng;Li, Jun;Zhang, Jiawei;Zhao, Weijian;Duan, Yuanfeng;Zhang, Zhicheng
    • Smart Structures and Systems
    • /
    • Vol. 29, No. 1
    • /
    • pp.195-206
    • /
    • 2022
  • Crack detection is essential for the inspection of existing structures, and crack segmentation based on deep learning is a significant solution. However, datasets are usually one of the key issues. When building a new dataset for deep learning, laborious and time-consuming annotation of a large number of crack images is an obstacle. The aim of this study is to develop an approach that can automatically select a small portion of the most informative crack images from a large pool for annotation, rather than labeling all crack images. An active learning method with a difficulty learning mechanism for crack segmentation tasks is proposed. Experiments are carried out on a crack image dataset of a steel box girder, which contains 500 images of 320×320 size for training, 100 for validation, and 190 for testing. In the active learning experiments, the 500 training images are treated as unlabeled images. The acquisition function in our method is compared with traditional acquisition functions, i.e., Query-By-Committee (QBC), Entropy, and Core-set. Further, comparisons are made on four common segmentation networks: U-Net, DeepLabV3, Feature Pyramid Network (FPN), and PSPNet. The results show that when trained with the 200 (40%) most informative crack images selected by our method, the four segmentation networks can achieve 92%-95% of the performance obtained when trained with all 500 (100%) crack images. The acquisition function in our method measures the informativeness of unlabeled crack images more accurately than the four traditional acquisition functions at most active learning stages. Our method can select the most informative images for annotation from many unlabeled crack images automatically and accurately. Additionally, the dataset built by selecting 40% of all crack images can support crack segmentation networks that achieve more than 92% of the performance obtained when all the images are used.
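
To make the role of an acquisition function concrete, the sketch below implements the Entropy baseline that the abstract names, not the paper's difficulty learning mechanism, which the abstract does not specify. It assumes each unlabeled image comes with a per-pixel crack probability map predicted by the current model; the function names and the tiny 1×2 "images" are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy of a per-pixel crack probability, clipped to avoid log(0)."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def image_entropy(prob_map):
    """Mean pixel entropy of one image's predicted probability map."""
    values = [binary_entropy(p) for row in prob_map for p in row]
    return sum(values) / len(values)


def select_most_informative(prob_maps, budget):
    """Return the names of the `budget` images with the highest mean entropy."""
    scores = {name: image_entropy(pm) for name, pm in prob_maps.items()}
    return sorted(scores, key=scores.get, reverse=True)[:budget]


# Hypothetical pool of unlabeled images with model predictions.
pool = {
    "confident": [[0.01, 0.99]],
    "uncertain": [[0.50, 0.50]],
    "in_between": [[0.20, 0.80]],
}
selected = select_most_informative(pool, budget=2)
print(selected)
```

In an active learning loop, the selected images would be sent for annotation, added to the labeled set, and the segmentation network retrained before the next round of scoring.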