• Title/Summary/Keyword: Model based reinforcement learning

A Study on DRL-based Efficient Asset Allocation Model for Economic Cycle-based Portfolio Optimization (심층강화학습 기반의 경기순환 주기별 효율적 자산 배분 모델 연구)

  • JUNG, NAK HYUN;Taeyeon Oh;Kim, Kang Hee
    • Journal of Korean Society for Quality Management / v.51 no.4 / pp.573-588 / 2023
  • Purpose: This study presents a research approach that utilizes deep reinforcement learning to construct optimal portfolios based on the business cycle for stocks and other assets. The objective is to develop effective investment strategies that adapt to the varying returns of assets in accordance with the business cycle. Methods: In this study, a diverse set of time series data, including stocks, is collected and utilized to train a deep reinforcement learning model. The proposed approach optimizes asset allocation based on the business cycle, particularly by gathering data for different states such as prosperity, recession, depression, and recovery and constructing portfolios optimized for each phase. Results: Experimental results confirm the effectiveness of the proposed deep reinforcement learning-based approach in constructing optimal portfolios tailored to the business cycle. The utility of optimizing portfolio investment strategies for each phase of the business cycle is demonstrated. Conclusion: This paper contributes to the construction of optimal portfolios based on the business cycle using a deep reinforcement learning approach, providing investors with effective investment strategies that simultaneously seek stability and profitability. As a result, investors can adopt stable and profitable investment strategies that adapt to business cycle volatility.

Transfer Learning Technique for Accelerating Learning of Reinforcement Learning-Based Horizontal Pod Autoscaling Policy (강화학습 기반 수평적 파드 오토스케일링 정책의 학습 가속화를 위한 전이학습 기법)

  • Jang, Yonghyeon;Yu, Heonchang;Kim, SungSuk
    • KIPS Transactions on Computer and Communication Systems / v.11 no.4 / pp.105-112 / 2022
  • Recently, many studies have applied reinforcement learning to autoscaling in order to build policies that adapt to changes in the environment and meet specific objectives. However, training a reinforcement learning-based Horizontal Pod Autoscaler (HPA) policy in a real environment requires a lot of money and time, and retraining the policy from scratch every time in a real environment is not practical. In this paper, we implement a reinforcement learning-based HPA in Kubernetes and propose a transfer learning technique that uses a queuing model-based simulation to accelerate the training of the HPA policy. Pre-training in simulation allowed the policy to be trained on simulated experience without consuming time and resources in the real environment, and the transfer learning technique reduced the cost by about 42.6% compared to training without it.
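
As a rough illustration of how a queuing model can stand in for a real cluster during pre-training, the sketch below estimates mean response time for a given pod count. It assumes an M/M/c queue (Poisson arrivals, exponential service times, one server per pod) and the Erlang C formula. The abstract does not say which queuing model the authors used, and the function name and parameters are hypothetical.

```python
import math

def mmc_response_time(arrival_rate, service_rate, pods):
    """Mean response time of an M/M/c queue with `pods` servers.

    Assumption: M/M/c is chosen here for illustration only.
    Returns infinity when the offered load meets or exceeds capacity.
    """
    if arrival_rate >= pods * service_rate:
        return math.inf
    offered = arrival_rate / service_rate      # offered load a = lambda / mu
    rho = offered / pods                       # per-pod utilization
    # Erlang C: probability that an arriving request has to wait.
    tail = offered ** pods / math.factorial(pods) / (1.0 - rho)
    head = sum(offered ** k / math.factorial(k) for k in range(pods))
    p_wait = tail / (head + tail)
    mean_wait = p_wait / (pods * service_rate - arrival_rate)
    return mean_wait + 1.0 / service_rate      # queueing delay + service time
```

A simulated environment could call a function like this after each scaling action and turn the result into a reward, for example by penalizing both response time and pod count.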

Reinforcement Learning-Based Intelligent Decision-Making for Communication Parameters

  • Xie, Xia;Dou, Zheng;Zhang, Yabin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.9 / pp.2942-2960 / 2022
  • The core problem of cognitive radio is intelligent decision-making for communication parameters, whose objective is to find the parameter configuration that optimizes transmission performance. Current algorithms suffer from high dependence on prior knowledge, a large amount of calculation, and high complexity. We propose a new decision-making model that makes full use of the interactivity of reinforcement learning (RL) and applies the Q-learning algorithm. By simplifying the decision-making process, we avoid large-scale RL, reduce complexity, and improve timeliness. The proposed model is able to find the optimal waveform parameter configuration for the communication system in complex channels without prior knowledge. Moreover, this model is more flexible than previous decision-making models. The simulation results demonstrate the effectiveness of our model. The model not only exhibits better decision-making performance in AWGN channels than the traditional method, but also makes reasonable decisions in fading channels.
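
To make the Q-learning step concrete, here is a minimal tabular sketch under toy assumptions. The three channel-quality levels, the three modulation choices, the reward function, and the cycling channel are all hypothetical and are not taken from the paper; only the update rule is the standard Q-learning rule.

```python
import random

# Hypothetical toy setup: 3 channel-quality levels (states) and 3 modulation
# choices (actions, in bits per symbol). None of these values come from the paper.
BITS = [1, 2, 4]
REQUIRED_LEVEL = {1: 0, 2: 1, 4: 2}  # minimum channel level each choice needs

def reward(state, bits):
    """Bits delivered per symbol: the full rate if the channel supports it, else 0."""
    return float(bits) if state >= REQUIRED_LEVEL[bits] else 0.0

def train(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {s: {b: 0.0 for b in BITS} for s in range(3)}
    state = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(BITS)                         # explore
        else:
            action = max(BITS, key=lambda b: q[state][b])     # exploit
        r = reward(state, action)
        next_state = (state + 1) % 3  # toy channel: quality level simply cycles
        # Q-learning update: bootstrap from the best action in the next state.
        target = r + gamma * max(q[next_state].values())
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q

def greedy_policy(q):
    """Pick the highest-valued modulation choice for each channel level."""
    return {s: max(q[s], key=q[s].get) for s in q}

q_table = train()
policy = greedy_policy(q_table)
```

In this toy setting the greedy policy selects, for each channel level, the highest rate that level supports. No prior knowledge of `REQUIRED_LEVEL` is given to the learner, which only observes rewards.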

UAV Path Planning based on Deep Reinforcement Learning using Cell Decomposition Algorithm (셀 분해 알고리즘을 활용한 심층 강화학습 기반 무인 항공기 경로 계획)

  • Kyoung-Hun Kim;Byungsun Hwang;Joonho Seon;Soo-Hyun Kim;Jin-Young Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.3 / pp.15-20 / 2024
  • Path planning for unmanned aerial vehicles (UAVs) is crucial for avoiding collisions in complex environments that include both static and dynamic obstacles. Path planning algorithms such as RRT and A* handle static obstacle avoidance effectively, but their computational complexity grows in high-dimensional environments. Reinforcement learning-based algorithms can accommodate complex environments, but like traditional path planning algorithms, they struggle with training complexity and convergence in higher-dimensional environments. In this paper, we propose a reinforcement learning model that utilizes a cell decomposition algorithm. The proposed model reduces the complexity of the environment by decomposing the learning environment into cells, and improves obstacle avoidance performance by establishing the valid actions of the agent. This addresses the exploration problem of reinforcement learning and improves the convergence of learning. Simulation results show that the proposed model improves learning speed and path planning efficiency compared to reinforcement learning models in general environments.
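
The idea of restricting the agent to valid actions can be sketched as follows, assuming the environment has already been decomposed into a uniform grid of cells where 1 marks an obstacle cell. The grid, the four-move action set, and the function name are hypothetical; the abstract does not specify the decomposition or the action space the authors used.

```python
# Hypothetical action set: moves between neighboring cells of a 2D grid.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def valid_actions(grid, cell):
    """Return the moves that keep the agent inside the grid and out of obstacle cells."""
    rows, cols = len(grid), len(grid[0])
    r, c = cell
    allowed = []
    for name, (dr, dc) in MOVES.items():
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
            allowed.append(name)
    return allowed

# 0 = free cell, 1 = obstacle cell
grid = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
corner = valid_actions(grid, (0, 0))     # ["down", "right"]
above_obstacle = valid_actions(grid, (0, 1))  # ["left", "right"]
```

An agent that samples only from `valid_actions` never spends exploration steps on moves that would collide or leave the map, which is one plausible way such masking could speed up convergence.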

Mapless Navigation with Distributional Reinforcement Learning (분포형 강화학습을 활용한 맵리스 네비게이션)

  • Van Manh Tran;Gon-Woo Kim
    • The Journal of Korea Robotics Society / v.19 no.1 / pp.92-97 / 2024
  • This paper studies the distributional perspective on reinforcement learning for mobile robot navigation. Mapless navigation algorithms based on deep reinforcement learning have shown promising performance and high applicability. Because real-life interactions are expensive, trial-and-error simulation in virtual environments is encouraged for implementing autonomous navigation. Nevertheless, applying a deep reinforcement learning model to real tasks is challenging because data collected in virtual simulation differs from data collected in the physical world, leading to high-risk behavior and a high collision rate. In this paper, we present a distributional reinforcement learning architecture for mapless navigation of a mobile robot that adapts to the uncertainty of environmental change. The experimental results indicate the superior performance of the distributional soft actor-critic compared to conventional methods.

Region-based Q-learning For Autonomous Mobile Robot Navigation (자율 이동 로봇의 주행을 위한 영역 기반 Q-learning)

  • 차종환;공성학;서일홍
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 2000.10a / pp.174-174 / 2000
  • Q-learning, which is based on discrete state and action spaces, is one of the most widely used reinforcement learning methods. However, it requires a lot of memory and much time to learn all actions of each state when it is applied to real mobile robot navigation with continuous state and action spaces. Region-based Q-learning is a reinforcement learning method that estimates the action values of a real state by using a triangular-type action distribution model and the relationship with neighboring states that were defined and learned before. This paper proposes a new Region-based Q-learning that uses a reward assigned only when the agent reaches the target, and that escapes locally optimal paths by adjusting the random action rate. When this method is applied to mobile robot navigation, less memory can be used, the robot can move smoothly, and the optimal solution can be learned quickly. To show the validity of our method, computer simulations are illustrated.

Policy Modeling for Efficient Reinforcement Learning in Adversarial Multi-Agent Environments (적대적 멀티 에이전트 환경에서 효율적인 강화 학습을 위한 정책 모델링)

  • Kwon, Ki-Duk;Kim, In-Cheol
    • Journal of KIISE:Software and Applications / v.35 no.3 / pp.179-188 / 2008
  • An important issue in multiagent reinforcement learning is how an agent should learn its optimal policy through trial-and-error interactions in a dynamic environment where other agents are able to influence its performance. Most previous works on multiagent reinforcement learning either apply single-agent reinforcement learning techniques without any extensions or rely on unrealistic assumptions even though they build and use explicit models of other agents. In this paper, the basic concepts that constitute the common foundation of multiagent reinforcement learning techniques are first formulated, and then, based on these concepts, previous works are compared in terms of their characteristics and limitations. After that, a policy model of the opponent agent and a new multiagent reinforcement learning method using this model are introduced. Unlike previous works, the proposed multiagent reinforcement learning method utilizes a policy model instead of a Q function model of the opponent agent. Moreover, this learning method can improve learning efficiency by using a simpler policy model than richer but time-consuming ones such as Finite State Machines (FSM) and Markov chains. In this paper, the Cat and Mouse game is introduced as an adversarial multiagent environment, and the effectiveness of the proposed multiagent reinforcement learning method is analyzed through experiments using this game as a testbed.

Application of Reinforcement Learning in Detecting Fraudulent Insurance Claims

  • Choi, Jung-Moon;Kim, Ji-Hyeok;Kim, Sung-Jun
    • International Journal of Computer Science & Network Security / v.21 no.9 / pp.125-131 / 2021
  • Detecting fraudulent insurance claims is difficult because the available data is small and unbalanced. Some research has been carried out to better cope with various types of fraudulent claims. Nowadays, technology for detecting fraudulent insurance claims is increasingly utilized in the insurance and technology fields, thanks to the use of artificial intelligence (AI) methods in addition to traditional statistical detection and rule-based methods. This study obtained meaningful results for a fraudulent insurance claim detection model based on machine learning (ML) and deep learning (DL) technologies, using fraudulent insurance claim data from previous research. In our search for a method to enhance the detection of fraudulent insurance claims, we investigated the reinforcement learning (RL) method and examined how it could be applied to the detection of fraudulent insurance claims. There are few previous cases of applying the RL method to this problem. Thus, we first had to define the essential RL elements based on previous research on anomaly detection. We applied the deep Q-network (DQN) and double deep Q-network (DDQN) to train the fraudulent insurance claim detection model. By doing so, we confirmed that our model demonstrated better performance than previous machine learning models.
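
The difference between the two algorithms named above lies in how the bootstrap target is computed. The sketch below shows the standard DQN and double DQN targets for a single transition, with plain Python lists standing in for network outputs. The two-action setup and the numbers are hypothetical and do not reflect the paper's state, action, or reward definitions.

```python
def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)

def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN: the online network selects the next action,
    and the target network evaluates that action."""
    if done:
        return reward
    best = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best]

# Hypothetical Q-values for two actions in the next state.
next_q_online = [1.0, 2.0]   # online network prefers action 1
next_q_target = [3.0, 0.5]   # target network overrates action 0

y_dqn = dqn_target(1.0, next_q_target, gamma=0.5)                        # 1 + 0.5 * 3.0 = 2.5
y_ddqn = double_dqn_target(1.0, next_q_online, next_q_target, gamma=0.5)  # 1 + 0.5 * 0.5 = 1.25
```

Decoupling selection from evaluation is what lets double DQN reduce the overestimation that the `max` in the plain DQN target tends to introduce.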

Implementation of Intelligent Virtual Character Based on Reinforcement Learning and Emotion Model (강화학습과 감정모델 기반의 지능적인 가상 캐릭터의 구현)

  • Woo Jong-Ha;Park Jung-Eun;Oh Kyung-Whan
    • Journal of the Korean Institute of Intelligent Systems / v.16 no.3 / pp.259-265 / 2006
  • Learning and emotion are very important for implementing intelligent robots. In this paper, we implement an intelligent virtual character based on reinforcement learning that interacts with the user and has an internal emotion model. The virtual character acts autonomously in a 3D virtual environment according to its internal state, and the user can teach the virtual character specific behaviors through repeated directions. Mouse gestures, recognized by an artificial neural network, are used to convey such directions. An Emotion-Mood-Personality model is proposed to express emotions. We examine the changes in emotion and learned behavior when the virtual character interacts with the user.

A Distributed Scheduling Algorithm based on Deep Reinforcement Learning for Device-to-Device communication networks (단말간 직접 통신 네트워크를 위한 심층 강화학습 기반 분산적 스케쥴링 알고리즘)

  • Jeong, Moo-Woong;Kim, Lyun Woo;Ban, Tae-Won
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.11 / pp.1500-1506 / 2020
  • In this paper, we study a scheduling problem based on reinforcement learning for overlay device-to-device (D2D) communication networks. Although various technologies for D2D communication networks using Q-learning, one of the reinforcement learning models, have been studied, the complexity of Q-learning grows tremendously as the number of states and actions increases. To solve this problem, D2D communication technologies based on the Deep Q Network (DQN) have been studied. In this paper, we design a DQN model that considers the characteristics of wireless communication systems, and propose a distributed scheduling scheme based on the DQN model that can reduce feedback and signaling overhead. The proposed model trains all parameters in a centralized manner and transfers the final trained parameters to all mobiles. All mobiles then individually determine their actions by using the transferred parameters. We analyze the performance of the proposed scheme by computer simulation and compare it with an optimal scheme, an opportunistic selection scheme, and a full transmission scheme.