• Title/Summary/Keyword: Model based reinforcement learning

Reinforcement Learning based on Deep Deterministic Policy Gradient for Roll Control of Underwater Vehicle (수중운동체의 롤 제어를 위한 Deep Deterministic Policy Gradient 기반 강화학습)

  • Kim, Su Yong;Hwang, Yeon Geol;Moon, Sung Woong
    • Journal of the Korea Institute of Military Science and Technology / v.24 no.5 / pp.558-568 / 2021
  • Existing underwater vehicle controllers are designed by linearizing the nonlinear dynamics model around a specific motion regime. Because such linear controllers can show unstable control performance in the transient state, various studies have sought to overcome this problem. Recent studies have used reinforcement learning to improve control performance in the transient state. Reinforcement learning can be broadly divided into value-based and policy-based methods. In this paper, we propose a roll controller for an underwater vehicle based on Deep Deterministic Policy Gradient (DDPG), which learns the control policy and shows stable control performance across various situations and environments. The performance of the proposed DDPG-based roll controller was verified through simulation and compared with existing roll controllers based on PID and on DQN with Normalized Advantage Functions.
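
As a rough illustration of one ingredient of DDPG mentioned in this abstract, the sketch below shows the soft (Polyak) target-network update that DDPG uses to keep learning stable. It is not taken from the paper: the function name `soft_update`, the flat parameter lists, and the value of `tau` are all assumptions made for illustration.

```python
def soft_update(target_params, online_params, tau=0.005):
    """Blend online parameters into target parameters (Polyak averaging).

    DDPG keeps slowly moving target copies of its actor and critic.
    tau is a hypothetical value; the paper's setting is not given here.
    """
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]


# Toy parameters standing in for network weights (illustrative only).
target = [0.0, 1.0]
online = [1.0, 3.0]
target = soft_update(target, online, tau=0.1)
print(target)
```

A small `tau` makes the target network track the online network slowly, which is the usual reason this update is preferred over periodic hard copies in continuous-control settings.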

Multi-Agent Reinforcement Learning Model based on Fuzzy Inference (퍼지 추론 기반의 멀티에이전트 강화학습 모델)

  • Lee, Bong-Keun;Chung, Jae-Du;Ryu, Keun-Ho
    • The Journal of the Korea Contents Association / v.9 no.10 / pp.51-58 / 2009
  • Reinforcement learning is a subarea of machine learning concerned with how an agent ought to take actions in an environment so as to maximize some notion of long-term reward. In the multi-agent case especially, the state space and action space become far larger than in the single-agent case, so an effective action-selection strategy is needed for reinforcement learning to work well. This paper proposes a multi-agent reinforcement learning model based on a fuzzy inference system in order to improve learning speed and select effective actions in a multi-agent setting. The action-selection strategy is verified through evaluation tests on RoboCup Keepaway, a useful test bed for multi-agent systems. The proposed model can be applied to evaluate the efficiency of various intelligent multi-agent systems and to the strategy and tactics of robot soccer systems.

Machine Learning-Based Rapid Prediction Method of Failure Mode for Reinforced Concrete Column (기계학습 기반 철근콘크리트 기둥에 대한 신속 파괴유형 예측 모델 개발 연구)

  • Kim, Subin;Oh, Keunyeong;Shin, Jiuk
    • Journal of the Earthquake Engineering Society of Korea / v.28 no.2 / pp.113-119 / 2024
  • In existing reinforced concrete buildings with seismically deficient column details, the overall structural behavior depends on the failure type of the columns. This study aims to develop and validate a machine learning-based prediction model for column failure modes (shear, flexure-shear, and flexure). For this purpose, artificial neural network (ANN), K-nearest neighbor (KNN), decision tree (DT), and random forest (RF) models were trained on previously collected experimental data. Using these four machine learning methods, we developed classification models that predict the column failure mode from six input variables: concrete compressive strength, steel yield strength, axial load ratio, height-to-depth aspect ratio, longitudinal reinforcement ratio, and transverse reinforcement ratio. The performance of each model was compared and verified using accuracy, precision, recall, F1-score, and ROC. The RF model achieved the highest average value across these performance measures among the considered methods, and it can conservatively predict the shear failure mode. Thus, the RF model can rapidly predict column failure modes from simple column details.
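
To make the evaluation measures named in this abstract concrete, here is a minimal sketch of how accuracy and per-class precision, recall, and F1-score can be computed for the three failure modes. The label strings follow the abstract, but the helper names and the toy predictions are invented for illustration and are not the paper's data.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred, labels):
    """One-vs-rest precision, recall, and F1 for each label."""
    metrics = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


LABELS = ["shear", "flexure-shear", "flexure"]
# Hypothetical ground truth and predictions, not experimental data.
y_true = ["shear", "shear", "flexure-shear", "flexure", "flexure", "flexure-shear"]
y_pred = ["shear", "flexure-shear", "flexure-shear", "flexure", "flexure", "shear"]

scores = per_class_metrics(y_true, y_pred, LABELS)
print(accuracy(y_true, y_pred), scores["shear"])
```

For a safety-oriented classifier such as this one, recall on the shear class is typically the figure to watch, since a missed shear failure is the costly error.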

Deep reinforcement learning for base station switching scheme with federated LSTM-based traffic predictions

  • Hyebin Park;Seung Hyun Yoon
    • ETRI Journal / v.46 no.3 / pp.379-391 / 2024
  • To meet increasing traffic requirements in mobile networks, small base stations (SBSs) are densely deployed, overlapping existing network architecture and increasing system capacity. However, densely deployed SBSs increase energy consumption and interference. Although these problems already exist because of densely deployed SBSs, even more SBSs are needed to meet increasing traffic demands. Hence, base station (BS) switching operations have been used to minimize energy consumption while guaranteeing quality-of-service (QoS) for users. In this study, to optimize energy efficiency, we propose the use of deep reinforcement learning (DRL) to create a BS switching operation strategy with a traffic prediction model. First, a federated long short-term memory (LSTM) model is introduced to predict user traffic demands from user trajectory information. Next, the DRL-based BS switching operation scheme determines the switching operations for the SBSs using the predicted traffic demand. Experimental results confirm that the proposed scheme outperforms existing approaches in terms of energy efficiency, signal-to-interference-plus-noise ratio, handover metrics, and prediction performance.

A Development of Nurse Scheduling Model Based on Q-Learning Algorithm

  • JUNG, In-Chul;KIM, Yeun-Su;IM, Sae-Ran;IHM, Chun-Hwa
    • Korean Journal of Artificial Intelligence / v.9 no.1 / pp.1-7 / 2021
  • In this paper, we address the socially significant problem of creating nurse schedules. A nurse schedule should be prepared in consideration of three shifts, appropriate placement of experienced workers, fairness of work assignment, and legal work standards. Because of the complex structure of the nurse schedule, which must reflect various requirements, in most hospitals the nurse in charge writes it by hand with considerable time and effort. This study attempted to automatically create an optimized nurse schedule based on legal labor standards and fairness. We developed an I/O Q-Learning algorithm-based model, implemented in Python with a web application, for automatic nurse scheduling. The model was trained to converge to 100 on a Fairness Indicator Score (FIS) created to reflect the Labor Standards Act, work equity, and work preference. Manual nurse schedules and the model's schedules were compared using the FIS. This model showed a higher work equity index of 13.31 points, work preference index of 1.52 points, and FIS of 16.38 points. This study was thus able to automatically generate nurse schedules using reinforcement learning. In addition, when the model was used to create the nurse schedule of E hospital, the time required was reduced from 88 hours to 3 hours. A more optimized model could be developed by further refining the FIS and additionally applying techniques such as DQN, CNN, Monte Carlo simulation, and AlphaZero.
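
For readers unfamiliar with the Q-learning algorithm this abstract builds on, the following is a minimal sketch of the standard tabular Q-learning update. The state encoding (nurse and weekday), the shift names, the reward value, and the hyperparameters are hypothetical; the paper's actual state, action, and FIS-based reward design is not described in the abstract.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step and return the updated Q-value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical action set: three working shifts plus a day off.
SHIFTS = ["day", "evening", "night", "off"]
q_table = defaultdict(float)  # unseen (state, action) pairs start at 0.0

new_value = q_learning_update(
    q_table,
    state=("nurse_0", "mon"),
    action="day",
    reward=1.0,  # stand-in for a fairness-based reward signal
    next_state=("nurse_0", "tue"),
    actions=SHIFTS,
)
print(new_value)
```

In a scheduling setting the reward would presumably encode rule violations and fairness, so most of the modeling effort lies in that reward rather than in the update rule itself.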

Reinforcement Learning of Bipedal Walking with Musculoskeletal Models and Reference Motions (근골격 모델과 참조 모션을 이용한 이족보행 강화학습)

  • Jiwoong Jeon;Taesoo Kwon
    • Journal of the Korea Computer Graphics Society / v.29 no.1 / pp.23-29 / 2023
  • In this paper, we introduce a method for obtaining high-quality results at low cost when simulating bipedal walking of musculoskeletal characters, using reinforcement learning with reference motion data from motion capture. We adjust the reference motion data so that the character model can perform it, and then train the character to learn that motion through reinforcement learning. We combine imitation of the reference motion with minimization of the muscles' metabolic energy so that the musculoskeletal model learns to walk on two legs in a desired direction. In this way, the musculoskeletal model can learn at a lower cost than conventional manually designed controllers and perform high-quality bipedal walking.

Reinforcement Learning Model for Mass Casualty Triage Taking into Account the Medical Capability (의료능력을 고려한 대량전상자 환자분류 강화학습 모델)

  • Byeongho Park;Namsuk Cho
    • Journal of the Society of Disaster Information / v.19 no.1 / pp.44-59 / 2023
  • Purpose: In the event of mass casualties, triage must be performed promptly and accurately so that as many patients as possible can recover and return to the battlefield. However, medical personnel must handle many tasks with limited manpower, and the battlefield conditions under which patients are classified are highly complex and uncertain. Therefore, we studied an artificial intelligence model that can assist and replace medical personnel on the battlefield. Method: The triage model is built using reinforcement learning, a field of artificial intelligence. The model is trained to find a policy that allows as many patients as possible to be treated, taking into account the conditions of randomly generated patients and the medical capability of the military hospital. Result: Whether training of the reinforcement learning model progressed well was confirmed through statistical graphs such as cumulative reward values. In addition, the accuracy of the learned model's triage was confirmed through the number of survivors. Compared with the rule-based model, the reinforcement learning model rescued 10% more patients. Conclusion: This study found that a triage model using reinforcement learning can be used to assist and replace the triage decision-making of medical personnel in mass-casualty events.

GAN-based Color Palette Extraction System by Chroma Fine-tuning with Reinforcement Learning

  • Kim, Sanghyuk;Kang, Suk-Ju
    • Journal of Semiconductor Engineering / v.2 no.1 / pp.125-129 / 2021
  • With growing interest in deep learning, techniques for controlling image color in the image processing field are evolving as well. However, there is no clear standard for color, and it is not easy to find a way to represent only the color itself, as a color palette does. In this paper, we propose a novel color palette extraction system using chroma fine-tuning with reinforcement learning. It helps to recognize the color combination that represents an input image. First, we use RGBY images to create feature maps by transferring a backbone network with well-trained model weights verified on super-resolution convolutional neural networks. Second, the feature maps are fed to three fully connected layers for color palette generation with a generative adversarial network (GAN). Third, we use a reinforcement learning method that changes only the chroma information of the GAN output by slightly moving each Y component of the YCbCr color gamut of the pixel values up and down. The proposed method outperforms existing color palette extraction methods, achieving an accuracy of 0.9140.

Roll control of Underwater Vehicle based Reinforcement Learning using Advantage Actor-Critic (Advantage Actor-Critic 강화학습 기반 수중운동체의 롤 제어)

  • Lee, Byungjun
    • Journal of the Korea Institute of Military Science and Technology / v.24 no.1 / pp.123-132 / 2021
  • For an underwater vehicle to perform various tasks, it is important to control its depth, course, and roll. Designing such a controller requires constructing a dynamic model of the underwater vehicle and selecting appropriate hydrodynamic coefficients. Because the dynamic model is linearized under the assumption of a limited operating range, control performance in the steady state is satisfactory, but control performance in the transient state may be unstable. To overcome these problems of existing controller design, this paper proposes an A2C (Advantage Actor-Critic) based roll controller for an underwater vehicle. Among reinforcement learning methods, which learn through rewards for actions, A2C offers stable learning performance in a continuous space. The performance of the proposed A2C-based roll controller is verified through simulation and compared with PID and Dueling DDQN based roll controllers.
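
The core quantity behind the Advantage Actor-Critic method named in this abstract can be sketched in a few lines. The example below uses the common one-step advantage estimate and the usual actor and critic loss terms. The function names, the discount factor, and the sample numbers are assumptions for illustration and do not reflect the paper's controller or reward design.

```python
def one_step_advantage(reward, value, next_value, done, gamma=0.99):
    """One-step advantage estimate: A = r + gamma * V(s') - V(s).

    When the episode has ended there is no next state to bootstrap from.
    """
    bootstrap = 0.0 if done else gamma * next_value
    return reward + bootstrap - value


def a2c_losses(log_prob, advantage):
    """Per-sample actor and critic loss terms.

    The actor term treats the advantage as a constant weight on the
    log-probability of the action taken; the critic term is a squared error.
    """
    actor_loss = -log_prob * advantage
    critic_loss = advantage ** 2
    return actor_loss, critic_loss


# Hypothetical numbers for a single transition.
adv = one_step_advantage(reward=1.0, value=0.5, next_value=2.0,
                         done=False, gamma=0.9)
actor_loss, critic_loss = a2c_losses(log_prob=-0.5, advantage=adv)
print(adv, actor_loss, critic_loss)
```

A positive advantage raises the probability of the action taken and a negative one lowers it, which is how the critic's value estimate steers the actor.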

Visual Analysis of Deep Q-network

  • Seng, Dewen;Zhang, Jiaming;Shi, Xiaoying
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.3 / pp.853-873 / 2021
  • In recent years, deep reinforcement learning (DRL) models have attracted great interest owing to their success in a variety of challenging tasks. Deep Q-Network (DQN) is a widely used deep reinforcement learning model, which trains an intelligent agent that executes optimal actions while interacting with an environment. This model is well known for its ability to surpass skilled human players across many Atari 2600 games. Although DQN has achieved excellent performance in practice, a clear understanding of why the model works is still lacking. In this paper, we present a visual analytics system for understanding deep Q-networks in a non-blind manner. Based on the stored data generated from the training and testing process, four coordinated views are designed to expose the internal execution mechanism of DQN from different perspectives. We report the system performance and demonstrate its effectiveness through two case studies. By using our system, users can learn the relationship between states and Q-values, the function of convolutional layers, the strategies learned by DQN, and the rationality of decisions made by the agent.
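
The relationship between states and Q-values discussed in this abstract rests on two simple operations in DQN: choosing the greedy action from a vector of Q-values, and forming the temporal-difference target used in training. The sketch below shows both with plain Python lists. The function names, discount factor, and Q-values are hypothetical, and a real DQN would obtain the Q-values from a neural network.

```python
def greedy_action(q_values):
    """Index of the action with the largest Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def dqn_target(reward, next_q_values, done, gamma=0.99):
    """TD target y = r + gamma * max_a' Q_target(s', a').

    next_q_values are assumed to come from the target network; for a
    terminal transition the target is just the reward.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


# Hypothetical Q-values for a state with three available actions.
q_next = [0.2, 0.5, 0.1]
action = greedy_action(q_next)
target = dqn_target(reward=1.0, next_q_values=q_next, done=False, gamma=0.9)
print(action, target)
```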