• Title/Summary/Keyword: Deep reinforcement learning

Applying Deep Reinforcement Learning to Improve Throughput and Reduce Collision Rate in IEEE 802.11 Networks

  • Ke, Chih-Heng;Astuti, Lia
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.1 / pp.334-349 / 2022
  • The effectiveness of Wi-Fi networks is greatly influenced by the optimization of contention window (CW) parameters. Unfortunately, the conventional approach employed by IEEE 802.11 wireless networks is not scalable enough to sustain consistent performance as the number of stations increases. Yet it is still the default channel access method for single-user 802.11 transmissions. Recently, there has been a surge in attempts to enhance network performance using a machine learning (ML) technique known as reinforcement learning (RL). Its advantage is that it interacts with the surrounding environment and makes decisions based on its own experience. Deep RL (DRL) uses deep neural networks (DNNs) to deal with more complex environments (such as continuous state or action spaces) and to obtain optimal rewards. We therefore present a new CW control mechanism, termed the contention window threshold (CWThreshold). It uses the DRL principle to define the threshold value and learn optimal settings under various network scenarios. We call the proposed method the smart exponential-threshold-linear backoff algorithm with a deep Q-learning network (SETL-DQN). The simulation results show that the proposed SETL-DQN algorithm can effectively improve throughput and reduce collision rates.
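
The exponential-threshold-linear idea in the abstract above can be illustrated with a minimal sketch. The abstract does not give the update rules, so everything here is an assumption: the function name `next_contention_window`, the bounds `cw_min` and `cw_max`, the linear `step`, and the choice to double or halve below the threshold and add or subtract a constant above it. In the paper the threshold is chosen by a DQN; here it is simply passed in as an argument.

```python
def next_contention_window(cw, collided, threshold,
                           cw_min=16, cw_max=1024, step=16):
    """Hypothetical exponential-threshold-linear contention window update.

    All rules and constants are illustrative assumptions, not the
    published SETL-DQN algorithm.
    """
    if collided:
        # Collision: grow exponentially below the threshold, linearly otherwise.
        cw = cw * 2 if cw < threshold else cw + step
    else:
        # Success: shrink linearly above the threshold, exponentially otherwise.
        cw = cw - step if cw > threshold else cw // 2
    # Keep the window inside the assumed bounds.
    return max(cw_min, min(cw_max, cw))
```

For example, with a threshold of 128, a collision at `cw=16` doubles the window, while a collision at `cw=256` only adds `step`. A learning agent would adjust `threshold` between such calls.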

Aspect-based Sentiment Analysis of Product Reviews using Multi-agent Deep Reinforcement Learning

  • M. Sivakumar;Srinivasulu Reddy Uyyala
    • Asia Pacific Journal of Information Systems / v.32 no.2 / pp.226-248 / 2022
  • Existing models for sentiment analysis of product reviews learn from past data and label new data based on that training, but they never use the new data when making decisions. The proposed Aspect-based multi-agent Deep Reinforcement learning Sentiment Analysis (ADRSA) model learns from its very first data without the help of any training dataset and labels a sentence with an aspect category and a sentiment polarity. It keeps learning from new data and updates its knowledge to improve its intelligence, so the decisions of the proposed system change over time based on the new data. As a result, the accuracy of sentiment analysis using deep reinforcement learning improved over supervised and unsupervised learning methods. Hence, the sentiments of premium customers on a particular site can be conveyed to other customers effectively. A dynamic environment with a strong knowledge base helps the system remember sentences, and using the State-Action-Reward-State-Action (SARSA) algorithm with the Bidirectional Encoder Representations from Transformers (BERT) model improved the accuracy of the proposed system compared with state-of-the-art methods.
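
For readers unfamiliar with SARSA, the standard on-policy update can be sketched in tabular form. This is not the ADRSA model, which pairs SARSA with BERT and multiple agents. The function name `sarsa_update`, the string state and action labels, and the default `alpha` and `gamma` values are all illustrative assumptions.

```python
from collections import defaultdict


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """One tabular SARSA step:
    Q(s, a) += alpha * (r + gamma * Q(s', a') - Q(s, a)).
    """
    # The target uses the action actually taken next (on-policy).
    td_target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: unseen state-action pairs default to a value of 0.0.
q_table = defaultdict(float)
sarsa_update(q_table, "review_1", "label_positive", 1.0,
             "review_2", "label_negative")
```

Unlike Q-learning, which bootstraps from the best next action, SARSA bootstraps from the action the current policy actually takes.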

A Study on Deep Reinforcement Learning Framework for DME Pulse Design

  • Lee, Jungyeon;Kim, Euiho
    • Journal of Positioning, Navigation, and Timing / v.10 no.2 / pp.113-120 / 2021
  • Distance Measuring Equipment (DME) is a ground-based aircraft navigation system and is considered infrastructure that ensures resilient aircraft navigation capability during a Global Navigation Satellite System (GNSS) outage. The main problem of DME as a GNSS backup is its poor positioning accuracy, with errors often exceeding 100 m. In this paper, a novel approach of applying deep reinforcement learning to DME pulse design is introduced to improve DME distance measuring accuracy. The method is designed to develop multipath-resistant DME pulses that comply with current DME specifications. A Markov Decision Process (MDP) for DME pulse design is set up using pulse shape requirements and a timing error. Based on the designed MDP, we created an environment called PulseEnv, which allows the agent representing a DME pulse shape to explore a continuous space using the Soft Actor-Critic (SAC) reinforcement learning algorithm.

Visual Analysis of Deep Q-network

  • Seng, Dewen;Zhang, Jiaming;Shi, Xiaoying
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.3 / pp.853-873 / 2021
  • In recent years, deep reinforcement learning (DRL) models have attracted great interest owing to their success in a variety of challenging tasks. Deep Q-Network (DQN) is a widely used deep reinforcement learning model that trains an intelligent agent to execute optimal actions while interacting with an environment. The model is well known for its ability to surpass skilled human players across many Atari 2600 games. Although DQN has achieved excellent performance in practice, a clear understanding of why the model works is still lacking. In this paper, we present a visual analytics system for understanding the deep Q-network in a non-blind manner. Based on the stored data generated from the training and testing process, four coordinated views are designed to expose the internal execution mechanism of DQN from different perspectives. We report the system performance and demonstrate its effectiveness through two case studies. By using our system, users can learn the relationship between states and Q-values, the function of convolutional layers, the strategies learned by DQN, and the rationality of decisions made by the agent.

Reinforcement Learning based on Deep Deterministic Policy Gradient for Roll Control of Underwater Vehicle (수중운동체의 롤 제어를 위한 Deep Deterministic Policy Gradient 기반 강화학습)

  • Kim, Su Yong;Hwang, Yeon Geol;Moon, Sung Woong
    • Journal of the Korea Institute of Military Science and Technology / v.24 no.5 / pp.558-568 / 2021
  • Existing underwater vehicle controllers are designed by linearizing the nonlinear dynamics model around a specific motion section. Since such a linear controller has unstable control performance in a transient state, various studies have been conducted to overcome this problem. Recently, there have been studies to improve control performance in the transient state by using reinforcement learning. Reinforcement learning can be broadly divided into value-based and policy-based reinforcement learning. In this paper, we propose a roll controller for an underwater vehicle based on Deep Deterministic Policy Gradient (DDPG), which learns the control policy and can show stable control performance in various situations and environments. The performance of the proposed DDPG-based roll controller was verified through simulation and compared with existing roll controllers based on PID and on DQN with Normalized Advantage Functions.

A DASH System Using the A3C-based Deep Reinforcement Learning (A3C 기반의 강화학습을 사용한 DASH 시스템)

  • Choi, Minje;Lim, Kyungshik
    • IEMEK Journal of Embedded Systems and Applications / v.17 no.5 / pp.297-307 / 2022
  • The simple procedural segment selection algorithm commonly used in Dynamic Adaptive Streaming over HTTP (DASH) shows severe weaknesses in providing high-quality streaming services over integrated mobile networks composed of various wired and wireless links. A major issue is how to cope properly with dynamically changing underlying network conditions. The key to addressing it is to make the segment selection algorithm much more adaptive to fluctuations in network traffic. This paper presents a system architecture that replaces the existing procedural segment selection algorithm with a deep reinforcement learning algorithm based on the Asynchronous Advantage Actor-Critic (A3C). The distributed A3C-based deep learning server is designed and implemented to allow multiple clients in different network conditions to stream videos simultaneously, collect learning data quickly, and learn asynchronously, resulting in greatly improved learning speed as the number of video clients increases. The performance analysis shows that the proposed algorithm outperforms both the conventional DASH algorithm and the Deep Q-Network algorithm in terms of the user's quality of experience and the speed of deep learning.

Development of Semi-Active Control Algorithm Using Deep Q-Network (Deep Q-Network를 이용한 준능동 제어알고리즘 개발)

  • Kim, Hyun-Su;Kang, Joo-Won
    • Journal of Korean Association for Spatial Structures / v.21 no.1 / pp.79-86 / 2021
  • The control performance of a smart tuned mass damper (TMD) mainly depends on its control algorithm. Many control strategies have been proposed for semi-active control devices. Recently, machine learning has begun to be applied to the development of vibration control algorithms. In this study, reinforcement learning, one of several machine learning techniques, was employed to develop a semi-active control algorithm for a smart TMD. The smart TMD was composed of a magnetorheological (MR) damper. An 11-story building structure with a smart TMD was selected to construct the reinforcement learning environment. A time history analysis of the example structure subjected to earthquake excitation was conducted in the reinforcement learning procedure. Among various reinforcement learning algorithms, Deep Q-network (DQN) was used to build the learning agent. The command voltage sent to the MR damper is determined by the action produced by the DQN. Parametric studies on the hyper-parameters of the DQN were performed by numerical simulation. After an appropriate number of training iterations with proper hyper-parameters, a DQN model for controlling the seismic responses of the example structure with the smart TMD was developed. The developed DQN model can effectively control the smart TMD to reduce the seismic responses of the example structure.

Cloud Task Scheduling Based on Proximal Policy Optimization Algorithm for Lowering Energy Consumption of Data Center

  • Yang, Yongquan;He, Cuihua;Yin, Bo;Wei, Zhiqiang;Hong, Bowei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.6 / pp.1877-1891 / 2022
  • As a part of cloud computing technology, algorithms for cloud task scheduling have an important influence on cloud computing in data centers. In our earlier work, we proposed DeepEnergyJS, which was designed based on the original version of the policy gradient and reinforcement learning algorithm. We verified its effectiveness through simulation experiments. In this study, we used the Proximal Policy Optimization (PPO) algorithm to update DeepEnergyJS to DeepEnergyJSV2.0. First, we verify the convergence of the PPO algorithm on the Alibaba Cluster Data V2018 dataset. Then we compare it with the reinforcement learning algorithm in terms of convergence rate, converged value, and stability. The results indicate that PPO performed better on the training and test data sets than the reinforcement learning algorithm, as well as other general heuristic algorithms such as First Fit, Random, and Tetris. DeepEnergyJSV2.0 achieves about 7.814% better energy efficiency than DeepEnergyJS.
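
As background for the PPO algorithm named in the abstract above, the standard clipped surrogate objective can be sketched in a few lines. This is the textbook form of the objective, not the DeepEnergyJSV2.0 implementation. The function name `ppo_clipped_objective`, the plain-list inputs, and the default `clip_eps=0.2` are assumptions made for illustration.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages,
                          clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch of samples."""
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Clip the ratio so a single update cannot move the policy too far.
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The objective is maximized during training. The clipping is what distinguishes PPO from a plain policy gradient update, which places no bound on how far the ratio may move.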

Federated Deep Reinforcement Learning Based on Privacy Preserving for Industrial Internet of Things (산업용 사물 인터넷을 위한 프라이버시 보존 연합학습 기반 심층 강화학습 모델)

  • Chae-Rim Han;Sun-Jin Lee;Il-Gu Lee
    • Journal of the Korea Institute of Information Security & Cryptology / v.33 no.6 / pp.1055-1065 / 2023
  • Recently, various studies using deep reinforcement learning (deep RL) technology have been conducted to solve complex problems using big data collected in the industrial Internet of Things. Deep RL uses reinforcement learning's trial-and-error algorithms and cumulative reward functions to generate and learn from its own data and to quickly explore neural network structures and parameter decisions. However, studies so far have shown that the larger the learning data, the higher the memory usage and search time and the lower the accuracy. In this study, model-agnostic learning for efficient federated deep RL was used to address privacy invasion by increasing robustness by 55.9%, to achieve 97.8% accuracy, an improvement of 5.5% over the comparative optimization-based meta learning models, and to reduce the delay time by 28.9% on average.

ROV Manipulation from Observation and Exploration using Deep Reinforcement Learning

  • Jadhav, Yashashree Rajendra;Moon, Yong Seon
    • Journal of Advanced Research in Ocean Engineering / v.3 no.3 / pp.136-148 / 2017
  • The paper presents dual-arm ROV manipulation using deep reinforcement learning. The purpose of this underwater manipulator is to investigate and excavate natural resources in the ocean, find lost aircraft black boxes, and perform other extremely dangerous tasks without endangering humans. This research emphasizes a self-learning approach using Deep Reinforcement Learning (DRL). The DRL technique allows the ROV to learn the policy for performing a manipulation task directly from raw image data. Our proposed architecture maps visual inputs (images) to control actions (outputs) and receives a reward after each action, which allows an agent to learn manipulation skills through trial and error. We trained our network in simulation. The raw images and rewards are provided directly by our simple Lua simulator. Our simulator achieves accuracy by considering dynamic underwater environmental conditions. The major goal of this research is to provide a smart, self-learning way to achieve manipulation in a highly dynamic underwater environment. The results showed that a dual robotic arm trained for 3-DOF movement successfully achieved a target-reaching task in 2D space by considering real environmental factors.