• Title/Summary/Keyword: Model based reinforcement learning

Research on Unmanned Aerial Vehicle Mobility Model based on Reinforcement Learning (강화학습 기반 무인항공기 이동성 모델에 관한 연구)

  • Kyoung Hun Kim;Min Kyu Cho;Chang Young Park;Jeongho Kim;Soo Hyun Kim;Young Ghyu Sun;Jin Young Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.23 no.6
    • /
    • pp.33-39
    • /
    • 2023
  • Recently, reinforcement learning has been used to improve the communication performance of flying ad-hoc networks (FANETs) and to design mobility models. The mobility model is a key factor for predicting and controlling the movement of unmanned aerial vehicles (UAVs). In this paper, we designed Q-learning with Fourier basis function approximation and Deep Q-Network (DQN) models for optimal path finding in a three-dimensional virtual environment where UAVs operate, and analyzed their performance. The experimental results show that the DQN model is more suitable for optimal path finding than the Q-learning model in a three-dimensional virtual environment.
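
The abstract does not give the state encoding or hyperparameters, but a minimal sketch of linear Q-learning with Fourier basis features over a normalized 3-D UAV state might look like the following (the basis order, six-action move set, and learning rates are illustrative assumptions):

```python
# Illustrative sketch only: linear Q-learning with a Fourier basis over a
# 3-D UAV state normalized to [0, 1]^3. Basis order, action set, and
# hyperparameters are assumptions, not the paper's published values.
import itertools
import numpy as np

ORDER, DIM, N_ACTIONS = 2, 3, 6          # basis order, state dims, moves (+/- x, y, z)
COEFFS = np.array(list(itertools.product(range(ORDER + 1), repeat=DIM)))

def features(state):
    """Fourier basis: phi_i(s) = cos(pi * c_i . s)."""
    return np.cos(np.pi * COEFFS @ state)

def q_values(w, state):
    return w @ features(state)           # w has shape (N_ACTIONS, n_features)

def q_update(w, s, a, r, s_next, alpha=0.01, gamma=0.99):
    """One TD(0) update on the linear weights for the taken action."""
    td_error = r + gamma * np.max(q_values(w, s_next)) - q_values(w, s)[a]
    w[a] += alpha * td_error * features(s)
    return w

w = np.zeros((N_ACTIONS, len(COEFFS)))   # initial weights
```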

Co-Operative Strategy for an Interactive Robot Soccer System by Reinforcement Learning Method

  • Kim, Hyoung-Rock;Hwang, Jung-Hoon;Kwon, Dong-Soo
    • International Journal of Control, Automation, and Systems
    • /
    • v.1 no.2
    • /
    • pp.236-242
    • /
    • 2003
  • This paper presents a cooperation strategy between a human operator and autonomous robots for an interactive robot soccer game. The interactive robot soccer game has been developed to allow humans to join the game dynamically and to reinforce its entertainment characteristics. In order to make these games more interesting, a cooperation strategy between humans and autonomous robots on a team is very important. Strategies can be pre-programmed or learned by the robots themselves with learning or evolving algorithms. Since the robot soccer system is hard to model and its environment changes dynamically, it is very difficult to pre-program cooperation strategies between robot agents. Q-learning, one of the most representative reinforcement learning methods, is shown to be effective for solving problems dynamically without explicit knowledge of the system. Therefore, in our research, a Q-learning based learning method has been utilized. Prior to utilizing Q-learning, state variables describing the game situation and the action sets of the robots were defined. After the learning process, the human operator could play the game more easily. To evaluate the usefulness of the proposed strategy, simulations and games were carried out.
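
For reference, a minimal sketch of the tabular Q-learning update this kind of strategy builds on, with placeholder state and action counts (the actual state variables and robot action sets are defined in the paper, not in the abstract):

```python
# Illustrative sketch only: tabular Q-learning with epsilon-greedy action
# selection. N_STATES, N_ACTIONS, and the reward are placeholders for the
# paper's real game-situation variables and robot action sets.
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS = 64, 5
Q = np.zeros((N_STATES, N_ACTIONS))

def select_action(state, eps=0.1):
    """Epsilon-greedy over the current Q estimates."""
    if rng.random() < eps:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(Q[state]))

def learn(s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard Q-learning update toward the observed reward."""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
```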

Machine Scheduling Models Based on Reinforcement Learning for Minimizing Due Date Violation and Setup Change (납기 위반 및 셋업 최소화를 위한 강화학습 기반의 설비 일정계획 모델)

  • Yoo, Woosik;Seo, Juhyeok;Kim, Dahee;Kim, Kwanho
    • The Journal of Society for e-Business Studies
    • /
    • v.24 no.3
    • /
    • pp.19-33
    • /
    • 2019
  • Recently, manufacturers have been struggling to use production equipment efficiently as their production methods become more sophisticated and complex. A typical factor hindering the efficiency of the manufacturing process is the setup cost incurred by job changes. Especially for expensive production equipment such as semiconductor/LCD processes, efficient use of equipment is very important. Balancing the tradeoff between meeting due dates and minimizing the setup cost incurred by changes of work type is a crucial planning task. In this study, we developed a reinforcement learning based scheduling model for parallel machines with due dates and setup costs, with the goal of minimizing due date violations and setup costs. The proposed model is a Deep Q-Network (DQN) scheduling model. To validate its effectiveness, we compared it against a heuristic model and a deep neural network (DNN) based model. It was confirmed that the proposed DQN method incurs fewer due date violations and lower setup costs than the benchmark methods.
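
A hedged sketch of how the reward described here could be encoded, penalizing tardiness and setup changes (the weights, job attributes, and DQN state features are assumptions; the abstract does not specify them):

```python
# Illustrative sketch only: a reward that penalizes due-date violations and
# setup changes, as the abstract describes. Weights and job attributes are
# assumed, not taken from the paper.
from dataclasses import dataclass

@dataclass
class Job:
    family: int      # work type; changing families triggers a setup
    due: float       # due date
    proc: float      # processing time

def reward(job: Job, machine_family: int, finish_time: float,
           w_tardy: float = 1.0, w_setup: float = 0.5) -> float:
    tardiness = max(0.0, finish_time - job.due)           # due-date violation
    setup = 1.0 if job.family != machine_family else 0.0  # setup change
    return -(w_tardy * tardiness + w_setup * setup)
```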

Development of Convolutional Network-based Denoising Technique using Deep Reinforcement Learning in Computed Tomography (심층강화학습을 이용한 Convolutional Network 기반 전산화단층영상 잡음 저감 기술 개발)

  • Cho, Jenonghyo;Yim, Dobin;Nam, Kibok;Lee, Dahye;Lee, Seungwan
    • Journal of the Korean Society of Radiology
    • /
    • v.14 no.7
    • /
    • pp.991-1001
    • /
    • 2020
  • Supervised deep learning technologies for improving the image quality of computed tomography (CT) need a lot of training data. When input images have characteristics different from the training images, these technologies cause structural distortion in the output images. In this study, an imaging model based on deep reinforcement learning (DRL) was developed to overcome the drawbacks of supervised deep learning technologies and reduce noise in CT images. The DRL model consisted of shared, value, and policy networks, and the networks included convolutional layers, rectified linear units (ReLU), dilation factors, and gated recurrent units (GRU) in order to extract noise features from CT images and improve the performance of the DRL model. Also, the quality of the CT images obtained using the DRL model was compared to that obtained using the supervised deep learning model. The results showed that the image accuracy for the DRL model was higher than that for the supervised deep learning model, and the image noise for the DRL model was lower than that for the supervised deep learning model. Also, the DRL model reduced the noise of CT images whose characteristics differed from the training images. Therefore, the DRL model is able to reduce image noise as well as maintain the structural information of CT images.
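
A rough PyTorch sketch of the kind of architecture the abstract describes, with a shared dilated-convolution trunk, a convolutional GRU, and separate policy/value heads (channel counts, dilations, and the pixel-wise action count are assumptions, not the paper's exact design):

```python
# Illustrative sketch only (PyTorch): shared dilated-conv trunk, a
# convolutional GRU, and separate policy/value heads. Sizes are assumed.
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    """Minimal convolutional GRU over feature maps."""
    def __init__(self, ch):
        super().__init__()
        self.zr = nn.Conv2d(2 * ch, 2 * ch, 3, padding=1)
        self.hn = nn.Conv2d(2 * ch, ch, 3, padding=1)
    def forward(self, x, h):
        z, r = torch.sigmoid(self.zr(torch.cat([x, h], 1))).chunk(2, dim=1)
        h_new = torch.tanh(self.hn(torch.cat([x, r * h], 1)))
        return (1 - z) * h + z * h_new

class ActorCriticDenoiser(nn.Module):
    def __init__(self, ch=64, n_actions=9):   # e.g. pixel-wise filter choices
        super().__init__()
        self.trunk = nn.Sequential(            # shared network
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=2, dilation=2), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=4, dilation=4), nn.ReLU())
        self.gru = ConvGRUCell(ch)
        self.policy = nn.Conv2d(ch, n_actions, 3, padding=1)  # policy network
        self.value = nn.Conv2d(ch, 1, 3, padding=1)           # value network
    def forward(self, x, h):
        h = self.gru(self.trunk(x), h)
        return torch.softmax(self.policy(h), 1), self.value(h), h

x, h = torch.randn(1, 1, 64, 64), torch.zeros(1, 64, 64, 64)  # noisy CT patch
probs, value, h = ActorCriticDenoiser()(x, h)
```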

Federated Deep Reinforcement Learning Based on Privacy Preserving for Industrial Internet of Things (산업용 사물 인터넷을 위한 프라이버시 보존 연합학습 기반 심층 강화학습 모델)

  • Chae-Rim Han;Sun-Jin Lee;Il-Gu Lee
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.33 no.6
    • /
    • pp.1055-1065
    • /
    • 2023
  • Recently, various studies using deep reinforcement learning (deep RL) technology have been conducted to solve complex problems using big data collected in the industrial internet of things (IIoT). Deep RL uses reinforcement learning's trial-and-error algorithms and cumulative reward functions to generate and learn from its own data and to quickly explore neural network structures and parameter decisions. However, studies so far have shown that the larger the training data, the higher the memory usage and search time, and the lower the accuracy. In this study, model-agnostic learning for efficient federated deep RL was utilized to mitigate privacy invasion; it increased robustness by 55.9%, achieved 97.8% accuracy (an improvement of 5.5% over comparable optimization-based meta-learning models), and reduced the delay time by 28.9% on average.
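
As a point of reference, plain federated averaging of locally trained weights, shown here only as a hedged stand-in for the paper's aggregation scheme (the actual model-agnostic learning and privacy mechanisms are not detailed in the abstract):

```python
# Illustrative sketch only: federated averaging of locally trained policy
# weights, so raw device data never leaves the IIoT node. This is a generic
# FedAvg stand-in, not the paper's method.
import numpy as np

def local_update(weights, rng):
    """Placeholder for on-device deep-RL training; returns updated weights."""
    return weights + 0.01 * rng.standard_normal(weights.shape)

def fed_avg(client_weights, client_sizes):
    """Average client models, weighted by local data size."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

rng = np.random.default_rng(0)
global_w = np.zeros(128)
for _ in range(10):                       # communication rounds
    updates = [local_update(global_w, rng) for _ in range(4)]
    global_w = fed_avg(updates, [100, 80, 120, 60])
```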

A Routing Algorithm based on Deep Reinforcement Learning in SDN (SDN에서 심층강화학습 기반 라우팅 알고리즘)

  • Lee, Sung-Keun
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.16 no.6
    • /
    • pp.1153-1160
    • /
    • 2021
  • This paper proposes a routing algorithm that determines the optimal path using deep reinforcement learning in software-defined networks (SDN). The deep reinforcement learning model is based on DQN; its inputs are the current network state and the source and destination nodes, and its output returns a list of routes from source to destination. The routing task is defined as a discrete control problem, and the quality-of-service parameters considered for routing are delay, bandwidth, and loss rate. The routing agent classifies the appropriate service class according to the user's quality-of-service profile and, from the current network state collected from the SDN, determines the service class that each link can provide. Based on this information, it learns to select a route that satisfies the required service level from the source to the destination. The simulation results indicated that after a certain number of episodes the proposed algorithm selects the correct path and learning is performed successfully.
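
A minimal sketch of a DQN whose input concatenates per-link QoS measurements with one-hot source/destination encodings and whose output scores candidate routes (layer sizes and the route-candidate encoding are assumptions; the abstract does not specify the architecture):

```python
# Illustrative sketch only (PyTorch): a DQN mapping per-link QoS
# measurements plus src/dst encodings to Q-values over candidate routes.
import torch
import torch.nn as nn

N_LINKS, N_NODES, N_ROUTES = 20, 10, 8
IN_DIM = 3 * N_LINKS + 2 * N_NODES        # delay/bandwidth/loss + src/dst

class RoutingDQN(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(IN_DIM, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, N_ROUTES))     # one Q-value per candidate route

    def forward(self, x):
        return self.net(x)

state = torch.randn(1, IN_DIM)            # measured network state + src/dst
best_route = RoutingDQN()(state).argmax(dim=1)   # greedy route choice
```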

Sustainable Smart City Building-energy Management Based on Reinforcement Learning and Sales of ESS Power

  • Dae-Kug Lee;Seok-Ho Yoon;Jae-Hyeok Kwak;Choong-Ho Cho;Dong-Hoon Lee
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.4
    • /
    • pp.1123-1146
    • /
    • 2023
  • In South Korea, there have been many studies on efficient building-energy management using renewable energy facilities in single zero-energy houses or buildings. However, such management has been limited by spatial and economic problems. To realize a smart zero-energy city, it is necessary to study efficient energy integration for the entire city, not just for a single house or building. Therefore, this study was conducted in the eco-friendly energy town of Chungbuk Innovation City, which realized energy independence by converging new and renewable energy facilities for the first time in South Korea. This study analyzes energy data collected every minute for a year from public buildings in that town. Based on the results, we propose a smart city building-energy management model that combines various renewable energy sources with grid power. Supervised learning could determine when it is best to sell surplus electricity, and unsupervised learning could be used if there were a particular pattern or rule to energy use; however, reinforcement learning is more appropriate for maximizing rewards in an environment with numerous variables that change every moment. Therefore, we propose a power distribution algorithm based on reinforcement learning that considers the sale of Energy Storage System (ESS) power from surplus renewable energy. Finally, we confirm through economic analysis that a 10% saving is possible with this approach.
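
A hedged sketch of the per-interval decision such an agent faces, splitting surplus renewable power between charging the ESS and selling to the grid (prices, ESS limits, and the action encoding are illustrative, not the paper's formulation):

```python
# Illustrative sketch only: reward for one decision interval. Prices and
# ESS behavior are assumed, not taken from the paper.
def step_reward(surplus_kwh, sell_fraction, soc, capacity_kwh,
                sell_price=0.10, shortage_price=0.15):
    sold = surplus_kwh * sell_fraction if surplus_kwh > 0 else 0.0
    stored = max(0.0, surplus_kwh - sold)          # charge ESS with the rest
    soc = min(capacity_kwh, soc + stored)
    bought = max(0.0, -surplus_kwh - soc)          # grid import if ESS empty
    soc = max(0.0, soc + min(0.0, surplus_kwh))    # discharge to cover deficit
    return sell_price * sold - shortage_price * bought, soc
```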

Model-free $H_{\infty}$ Control of Linear Discrete-time Systems using Q-learning and LMI Based on I/O Data (입출력 데이터 기반 Q-학습과 LMI를 이용한 선형 이산 시간 시스템의 모델-프리 $H_{\infty}$ 제어기 설계)

  • Kim, Jin-Hoon;Lewis, F.L.
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.58 no.7
    • /
    • pp.1411-1417
    • /
    • 2009
  • In this paper, we consider the design of $H_{\infty}$ control for linear discrete-time systems having no mathematical model. The basic approach is to use Q-learning, a reinforcement learning method based on the actor-critic structure. The model-free control design uses not a mathematical model of the system but information on its states and inputs. As a result, the derived iterative algorithm is expressed as linear matrix inequalities (LMIs) in data measured from the system states and inputs. It is shown that, for a sufficiently rich disturbance, this algorithm converges to the standard $H_{\infty}$ control solution obtained using the exact system model. A simple numerical example is given to show the usefulness of our result in practical applications.
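
For context, the quadratic Q-function parametrization commonly used in Q-learning for discrete-time $H_{\infty}$ (zero-sum game) control, which lets $H$ be estimated from measured data alone; the paper's exact notation may differ:

```latex
% Hedged sketch; notation may differ from the paper. Q_s and R weight the
% state and input, gamma is the attenuation level, z_k stacks measured data.
\begin{aligned}
Q(x_k, u_k, w_k)
  &= x_k^{\top} Q_s\, x_k + u_k^{\top} R\, u_k
     - \gamma^{2} w_k^{\top} w_k + V(x_{k+1}) \\
  &= z_k^{\top} H z_k,
\qquad
z_k = \begin{bmatrix} x_k \\ u_k \\ w_k \end{bmatrix}.
\end{aligned}
```

Estimating $H$ from measured $(x_k, u_k, w_k)$ sequences, e.g. by least squares or, presumably as here, through LMI conditions, yields the $H_{\infty}$ policies without knowing the system matrices.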

The Implication of Bandura's Vicarious Reinforcement in Observational Learning for Christian Education (관찰학습에서의 반두라 대리강화에 대한 기독교교육적 함의)

  • Lee, Jongmin
    • Journal of Christian Education in Korea
    • /
    • v.61
    • /
    • pp.81-107
    • /
    • 2020
  • This study reviews Bandura's vicarious reinforcement in the observational learning process and applies this concept to Christian education in terms of spiritual role modeling. The first part of this study answers three questions: "what is vicarious reinforcement?", "how does vicarious reinforcement take place in observational learning?", and "how does vicarious reinforcement affect the observer's behavior change?" Bandura conceptualizes the learning process as observational learning and imitative or non-imitative performance. Based on this concept, Bandura defines the roles of vicarious reinforcement in the four steps of the observational learning process: attention, retention, motor reproduction, and motivational processes. Also, the three effects of vicarious reinforcement are explained in the following categories: the observational learning effect, inhibitory or disinhibitory effects, and the eliciting effect. Adapting the structure of observational learning theory in terms of the effect of vicarious reinforcement and the function of role models, the second part of this study examines the biblical concept of the imitation of Christ and the modeling strategy of discipleship. In particular, Paul's spiritual role model serves as positive vicarious reinforcement for Christian believers to perform the desired behaviors, while Paul's condemnation serves as explicit negative vicarious reinforcement. The last part of this study covers the implications of these findings from observational learning and empirical studies for Christian education in terms of spiritual role modeling.

Visual servoing based on neuro-fuzzy model

  • Jun, Hyo-Byung;Sim, Kwee-Bo
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회 학술대회논문집)
    • /
    • 1997.10a
    • /
    • pp.712-715
    • /
    • 1997
  • In image-Jacobian-based visual servoing, the inverse Jacobian generally must be calculated through complicated coordinate transformations. This requires excessive computation, and the singularity of the image Jacobian must be considered. This paper presents a visual servoing method to control the pose of a robotic manipulator for tracking and grasping a 3-D moving object whose pose and motion parameters are unknown. Because the object is in motion, tracking and grasping must be done on-line, and the controller must have continuous learning ability. In order to estimate the parameters of the moving object, we use a Kalman filter. For tracking and grasping the moving object, we use a fuzzy inference based reinforcement learning algorithm with dynamic recurrent neural networks. Computer simulation results are presented to demonstrate the performance of this visual servoing method.
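
A minimal constant-velocity Kalman filter of the sort used here for motion-parameter estimation (the paper's state vector and noise covariances are not specified; 1-D motion is shown for brevity):

```python
# Illustrative sketch only: constant-velocity Kalman filter for a moving
# object's position and velocity. dt, Q, and R are assumed values.
import numpy as np

dt = 0.03                                  # frame period (assumed)
F = np.array([[1.0, dt], [0.0, 1.0]])      # state transition (pos, vel)
H = np.array([[1.0, 0.0]])                 # we observe position only
Q = 1e-4 * np.eye(2)                       # process noise (assumed)
R = np.array([[1e-2]])                     # measurement noise (assumed)

def kf_step(x, P, z):
    x, P = F @ x, F @ P @ F.T + Q                     # predict
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)      # Kalman gain
    x = x + K @ (z - H @ x)                           # update with measurement
    P = (np.eye(2) - K @ H) @ P
    return x, P

x, P = kf_step(np.zeros(2), np.eye(2), np.array([0.5]))
```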
