• Title/Summary/Keyword: Stochastic Learning

Goal-Directed Reinforcement Learning System (목표지향적 강화학습 시스템)

  • Lee, Chang-Hoon
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.10 no.5
    • /
    • pp.265-270
    • /
    • 2010
  • Reinforcement learning learns through trial-and-error interaction with a dynamic environment. Consequently, in dynamic environments, reinforcement learning methods such as TD-learning and TD(λ)-learning learn faster than conventional stochastic learning methods. However, because many of the proposed reinforcement learning algorithms grant a reinforcement value only when the learning agent reaches its goal state, most of them converge to the optimal solution too slowly. In this paper, we present the GDRLS algorithm for finding the shortest path faster in a maze environment. GDRLS selects the candidate states that can guide the shortest path in the maze and learns only those candidate states. Experiments show that GDRLS finds the shortest path faster than TD-learning and TD(λ)-learning in maze environments.
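
The GDRLS procedure itself is only summarized above; as context, here is a minimal sketch of the tabular TD(0) value update the paper uses as its baseline. The maze size, reward, and step interface are illustrative assumptions, not the paper's code.

    # Tabular TD(0): V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))
    ALPHA, GAMMA = 0.1, 0.95           # learning rate and discount factor
    V = {s: 0.0 for s in range(25)}    # a 5x5 maze flattened to 25 states

    def td0_update(s, reward, s_next):
        V[s] += ALPHA * (reward + GAMMA * V[s_next] - V[s])

    # one illustrative transition: stepping from state 0 to state 1
    td0_update(0, reward=-1.0, s_next=1)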

The Effect of Noise Injection into Inputs in the Kohonen Learning (Kohonen 학습의 입력에 잡음 주입의 효과)

  • 정혁준;송근배;이행세
    • Proceedings of the IEEK Conference
    • /
    • 2001.06d
    • /
    • pp.265-268
    • /
    • 2001
  • This paper proposes a strategy of noise injection into the inputs of the Kohonen learning algorithm (KLA) to alleviate the KLA's local convergence problem. Noise strength is high at the beginning of learning and is gradually lowered as learning proceeds. This strategy is a kind of stochastic relaxation (SR) method, widely used in general optimization problems. It is convenient to implement and improves the convergence properties of the KLA with only moderately increased computing time. Experimental results for Gauss-Markov sources and real speech demonstrate that the proposed method consistently provides better codebooks than the KLA.
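
A minimal sketch of the annealed noise-injection idea, using a winner-take-all codebook update as a simplification of the KLA; the codebook size, noise schedule, and synthetic data are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    codebook = rng.normal(size=(16, 2))    # 16 code vectors in 2-D
    data = rng.normal(size=(1000, 2))      # stand-in training vectors

    T = len(data)
    for t, x in enumerate(data):
        sigma = 0.5 * (1.0 - t / T)        # noise strength decays toward zero
        x_noisy = x + rng.normal(scale=sigma, size=2)
        winner = np.argmin(np.linalg.norm(codebook - x_noisy, axis=1))
        lr = 0.1 * (1.0 - t / T)           # decaying learning rate
        codebook[winner] += lr * (x_noisy - codebook[winner])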

Robot Control via SGA-based Reinforcement Learning Algorithms (SGA 기반 강화학습 알고리즘을 이용한 로봇 제어)

  • 박주영;김종호;신호근
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2004.10a
    • /
    • pp.63-66
    • /
    • 2004
  • The SGA (stochastic gradient ascent) algorithm is one of the most important tools in reinforcement learning and has been applied to a wide range of practical problems. In particular, Kimura et al. [1] successfully applied this learning method to the control of a simple creeping robot with a finite number of control input choices. In this paper, we consider the application of the SGA algorithm to Kimura's robot control problem in the case where the control input is not confined to a finite set but can be chosen from an infinite subset of the real numbers. We also developed a MATLAB-based robot animation program, which vividly showed the effectiveness of the training algorithms.
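
The paper's MATLAB code is not reproduced here; below is a minimal Python sketch of an SGA-style policy-gradient update with a Gaussian policy, one standard way to handle real-valued control inputs. The one-step toy reward and hyper-parameters are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    mu, log_sigma, lr = 0.0, 0.0, 0.01     # Gaussian policy parameters

    for _ in range(2000):
        sigma = np.exp(log_sigma)
        a = rng.normal(mu, sigma)          # sample a real-valued action
        r = -(a - 1.5) ** 2                # toy reward, peaked at a = 1.5
        # likelihood-ratio (SGA) gradients of log pi(a | mu, sigma)
        g_mu = (a - mu) / sigma ** 2
        g_ls = (a - mu) ** 2 / sigma ** 2 - 1.0
        mu += lr * r * g_mu                # ascend the expected reward
        log_sigma += lr * r * g_ls

    print(round(mu, 2))                    # drifts toward 1.5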

Predicting Nonlinear Processes for Manufacturing Automation: Case Study through a Robotic Application

  • Kim, Steven H.;Oh, Heung-Sik
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.23 no.2
    • /
    • pp.249-260
    • /
    • 1997
  • The manufacturing environment is rife with nonlinear processes. In this context, an intelligent production controller should be able to predict the dynamic behavior of various subsystems as they react to transient environmental conditions, the varying internal condition of the manufacturing plant, and the changing demands of the production schedule. This level of adaptive capability may be achieved through a coherent methodology for a learning coordinator to predict nonlinear and stochastic processes. The system is to serve as a real-time, online supervisor for routine activities as well as exceptional conditions such as damage, failure, or other anomalies. The complexity inherent in a learning coordinator can be managed by a modular architecture incorporating case-based reasoning. In the interest of concreteness, the concepts are presented through a case study involving a knowledge-based robotic system.
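
As a concrete illustration of the case-based reasoning component such a learning coordinator could build on, the sketch below retrieves the stored case nearest to the current process state and reuses its recorded action; the case library and feature vectors are hypothetical.

    import math

    case_library = [
        # (process-state features, remedial action)
        ((0.9, 0.1), "reduce feed rate"),
        ((0.2, 0.8), "recalibrate joint sensor"),
        ((0.5, 0.5), "continue normal operation"),
    ]

    def retrieve(state):
        # nearest-neighbour case retrieval by Euclidean distance
        return min(case_library, key=lambda c: math.dist(state, c[0]))[1]

    print(retrieve((0.85, 0.2)))           # -> "reduce feed rate"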

An Adaptive Approach to Learning the Preferences of Users in a Social Network Using Weak Estimators

  • Oommen, B. John;Yazidi, Anis;Granmo, Ole-Christoffer
    • Journal of Information Processing Systems
    • /
    • v.8 no.2
    • /
    • pp.191-212
    • /
    • 2012
  • Since a social network is by definition diverse, estimating the preferences of its users is becoming increasingly essential for personalized applications, which range from service recommender systems to the targeted advertising of services. However, unlike traditional estimation problems where the underlying target distribution is stationary, estimating a user's interests typically involves non-stationary distributions. The consequent time-varying nature of the distribution to be tracked imposes stringent constraints on the "unlearning" capabilities of the estimator used. Therefore, resorting to strong estimators that converge with probability 1 is inefficient, since they rely on the assumption that the distribution of the user's preferences is stationary. In this vein, we propose to use a family of stochastic-learning-based weak estimators for learning and tracking a user's time-varying interests. Experimental results demonstrate that our proposed paradigm outperforms some of the traditional legacy approaches that represent the state of the art.
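
A minimal sketch of a weak-estimator update in the spirit the abstract describes: an exponential-forgetting rule that tracks a drifting preference probability instead of converging to a fixed value. The forgetting factor and the switching source are illustrative assumptions.

    import random

    LAM = 0.95            # forgetting factor, close to 1
    p_hat = 0.5           # running estimate of P(user likes the item)

    random.seed(0)
    for t in range(2000):
        p_true = 0.8 if t < 1000 else 0.2    # preference switches mid-stream
        x = 1 if random.random() < p_true else 0
        p_hat = LAM * p_hat + (1 - LAM) * x  # weak (non-convergent) update

    print(round(p_hat, 2))                   # tracks the new regime near 0.2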

Harvest Forecasting Improvement Using Federated Learning and Ensemble Model

  • Ohnmar Khin;Jin Gwang Koh;Sung Keun Lee
    • Smart Media Journal
    • /
    • v.12 no.10
    • /
    • pp.9-18
    • /
    • 2023
  • Harvest forecasting depends on multiple factors, such as temperature, rain, and the environment, and on their interrelations. Existing studies investigate climate conditions and help cultivators know the harvest yields before planting their farms. The proposed study uses federated learning. In addition, widespread techniques such as the bagging classifier, extra trees classifier, linear discriminant analysis classifier, quadratic discriminant analysis classifier, stochastic gradient boosting classifier, blending models, random forest regressor, and AdaBoost are utilized together. These nine algorithms achieved satisfactory accuracies. The proposed algorithms can produce accurate harvest forecasts. Finally, we compare our study with the results of earlier research.
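
The abstract's model list maps onto standard scikit-learn estimators; a minimal centralized sketch of such a blended ensemble on synthetic data is shown below. The federated-learning setup and the paper's crop data are not reproduced, and all parameters are illustrative.

    from sklearn.datasets import make_classification
    from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                               QuadraticDiscriminantAnalysis)
    from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                                  ExtraTreesClassifier,
                                  GradientBoostingClassifier, VotingClassifier)
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=500, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    blend = VotingClassifier([
        ("bag", BaggingClassifier()),
        ("ext", ExtraTreesClassifier()),
        ("lda", LinearDiscriminantAnalysis()),
        ("qda", QuadraticDiscriminantAnalysis()),
        ("sgb", GradientBoostingClassifier(subsample=0.8)),  # stochastic GB
        ("ada", AdaBoostClassifier()),
    ])
    print(blend.fit(X_tr, y_tr).score(X_te, y_te))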

An Efficient Training of Multilayer Neural Networks Using Stochastic Approximation and Conjugate Gradient Method (확률적 근사법과 공액기울기법을 이용한 다층신경망의 효율적인 학습)

  • 조용현
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.8 no.5
    • /
    • pp.98-106
    • /
    • 1998
  • This paper proposes an efficient learning algorithm for improving the training performance of neural networks. The proposed method improves training performance by applying to the backpropagation algorithm a global optimization method that hybridizes stochastic approximation and the conjugate gradient method. An approximate initial point for global optimization is first estimated by stochastic approximation, and then the conjugate gradient method, a fast gradient-descent method, is applied for high-speed optimization. The proposed method has been applied to parity checking and pattern classification, and the simulation results show that its performance is superior to that of conventional backpropagation and of a backpropagation algorithm that hybridizes stochastic approximation with steepest descent.
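
A minimal sketch of the two-stage hybrid the abstract describes: a stochastic-approximation-style global search with a decaying step size to pick an initial point, followed by conjugate-gradient refinement. The toy objective stands in for a network's error surface; all names and schedules are illustrative assumptions.

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)

    def loss(w):                        # toy multimodal "training error"
        return np.sum(w ** 2) + np.sin(5 * w).sum()

    # stage 1: stochastic approximation with a Robbins-Monro style schedule
    w = rng.normal(size=4)
    for k in range(1, 200):
        cand = w + (0.5 / k) * rng.normal(size=4)   # decaying perturbation
        if loss(cand) < loss(w):
            w = cand

    # stage 2: fast local refinement with the conjugate gradient method
    print(minimize(loss, w, method="CG").fun)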

Analysis of Reinforcement Learning Methods for BS Switching Operation (기지국 상태 조정을 위한 강화 학습 기법 분석)

  • Park, Hyebin;Lim, Yujin
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology
    • /
    • v.8 no.2
    • /
    • pp.351-358
    • /
    • 2018
  • Reinforcement learning is a machine learning method that aims to determine a policy yielding optimal actions in dynamic and stochastic environments. However, reinforcement learning has high computational complexity and needs a lot of time to reach a solution, so it is not easily applicable to uncertain and continuous environments. To tackle the complexity problem, the AC (actor-critic) method is used; it separates the action-value function into a value function and an action-decision policy. Also, in transfer learning, the knowledge constructed in one environment is adapted to another, which reduces the time a reinforcement learning method needs to learn. In this paper, we present the AC method and transfer learning as remedies for these problems of reinforcement learning. Finally, we analyze a case study in which transfer learning is used to solve the BS (base station) switching problem in wireless access networks.
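
A minimal sketch of the actor-critic split described above: the critic learns a value function from TD errors while the actor adjusts action preferences using the same TD error. The two-state toy problem is an illustrative assumption, not a BS-switching model.

    import numpy as np

    rng = np.random.default_rng(0)
    V = np.zeros(2)                        # critic: state values
    prefs = np.zeros((2, 2))               # actor: action preferences
    alpha, beta, gamma = 0.1, 0.1, 0.9

    def policy(s):                         # softmax over preferences
        p = np.exp(prefs[s] - prefs[s].max())
        return p / p.sum()

    s = 0
    for _ in range(5000):
        a = rng.choice(2, p=policy(s))
        s_next = s if a == 0 else 1 - s    # toy dynamics: action 1 flips state
        r = float(s_next == 1)             # reward for reaching state 1
        td_error = r + gamma * V[s_next] - V[s]
        V[s] += alpha * td_error           # critic update
        prefs[s, a] += beta * td_error     # actor update
        s = s_next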

Punching Motion Generation using Reinforcement Learning and Trajectory Search Method (경로 탐색 기법과 강화학습을 사용한 주먹 지르기동작 생성 기법)

  • Park, Hyun-Jun;Choi, WeDong;Jang, Seung-Ho;Hong, Jeong-Mo
    • Journal of Korea Multimedia Society
    • /
    • v.21 no.8
    • /
    • pp.969-981
    • /
    • 2018
  • Recent advances in machine learning approaches such as deep neural networks and reinforcement learning offer significant performance improvements in generating detailed and varied motions in physically simulated virtual environments. These optimization methods are highly attractive because they require less understanding of the underlying physics or mechanisms, even for high-dimensional, subtle control problems. In this paper, we propose an efficient learning method for stochastic policies represented as deep neural networks, so that an agent can generate various energetic motions adaptively to changes of tasks and states without losing interactivity and robustness. This strategy is realized by our novel trajectory search method, motivated by the trust region policy optimization method. Our value-based trajectory smoothing technique finds stably learnable trajectories without consulting neural network responses directly. This policy is set as a trust region of the artificial neural network, so that it can learn the desired motion quickly.
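
The trajectory search method itself is beyond this summary; as background, the sketch below illustrates the trust-region idea it is motivated by: shrink each policy update until the KL divergence from the previous policy stays inside a fixed radius. The Gaussian policy and toy reward are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma, delta = 0.0, 1.0, 0.01      # policy mean, std, KL radius

    def kl(mu0, mu1):                      # KL(N(mu0, sigma) || N(mu1, sigma))
        return (mu1 - mu0) ** 2 / (2 * sigma ** 2)

    for _ in range(500):
        a = rng.normal(mu, sigma)
        r = -(a - 2.0) ** 2                # toy reward, peaked at a = 2
        step = 0.05 * r * (a - mu) / sigma ** 2
        while kl(mu, mu + step) > delta:
            step *= 0.5                    # stay inside the trust region
        mu += step

    print(round(mu, 2))                    # moves toward 2.0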

Pragmatic Assessment of Optimizers in Deep Learning

  • Ajeet K. Jain;PVRD Prasad Rao ;K. Venkatesh Sharma
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.10
    • /
    • pp.115-128
    • /
    • 2023
  • Deep learning has incorporated various optimization techniques motivated by advances in pragmatic optimization algorithms, and their usage plays a central role in machine learning. Recently, new variants of various optimizers have been put into practice, and their suitability and applicability have been reported across various domains. The resurgence of novelty ranges from stochastic gradient descent to convex, non-convex, and derivative-free approaches. Across this landscape of optimizers, choosing a best-fit or appropriate optimizer is an important consideration in deep learning, as these workhorse engines determine the final performance of the model. Moreover, an increasing number of deep layers brings higher complexity in hyper-parameter tuning, and consequently the need to search for a befitting optimizer. We empirically examine the most popular and widely used optimizers on various data sets and networks, such as MNIST and GANs, among others. The pragmatic comparison focuses on their similarities, differences, and suitability for a given application. Additionally, recent optimizer variants are highlighted along with their subtleties. The article emphasizes their critical role and pinpoints options to weigh when choosing among them.
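
As a small worked companion to the comparison, the sketch below implements two of the covered update rules, plain SGD and Adam, on a toy quadratic; hyper-parameters follow common defaults and the objective is an illustrative assumption.

    import numpy as np

    def grad(w):                           # gradient of f(w) = 0.5 * ||w||^2
        return w

    # plain stochastic gradient descent
    w_sgd = np.array([5.0, -3.0])
    for _ in range(100):
        w_sgd = w_sgd - 0.1 * grad(w_sgd)

    # Adam: bias-corrected first- and second-moment estimates
    w_adam = np.array([5.0, -3.0])
    m, v = np.zeros(2), np.zeros(2)
    b1, b2, lr, eps = 0.9, 0.999, 0.1, 1e-8
    for t in range(1, 101):
        g = grad(w_adam)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g ** 2
        m_hat = m / (1 - b1 ** t)          # bias correction
        v_hat = v / (1 - b2 ** t)
        w_adam = w_adam - lr * m_hat / (np.sqrt(v_hat) + eps)

    print(np.linalg.norm(w_sgd), np.linalg.norm(w_adam))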