• Title/Summary/Keyword: deep reinforcement learning

Reinforcement learning model for water distribution system design (상수도관망 설계에의 강화학습 적용방안 연구)

  • Jaehyun Kim;Donghwi Jung
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2023.05a
    • /
    • pp.229-229
    • /
    • 2023
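  • Reinforcement learning is a machine learning method in which an agent learns the optimal actions that maximize reward by changing its state through interaction with a given environment. It has recently been widely used not only in games such as AlphaGo but also in fields such as autonomous vehicles and robot control. In water distribution systems, it has been applied to problems such as pump operation, valve operation, and optimal sensor placement, but no study has applied reinforcement learning to design. In design, the search space of the algorithm grows with the size of the network, which limits the use of conventional optimization algorithms. This study therefore proposes a design methodology that uses reinforcement learning to account for the complex interactions between the components of a water distribution network and environmental factors. The agent of the model is built with deep reinforcement learning (DRL), which resolves the high-dimensionality problem arising from the large state and action spaces. In addition, the state and reward of the model take into account the pressure and demand at the nodes and the design cost, so that the model designs an economical network capable of supplying water in adequate quantity and at adequate pressure. The model's action is to decide, node by node in sequence, whether to connect each node to other nodes, as an engineer would in practice, and this determines the layout and pipe diameters of the network. The proposed methodology was validated by applying it to a large grid network, and it was able to design a network that met the objectives despite the large number of variables to be considered. During training, changes in the average episode length and the magnitude of the reward were compared to evaluate and improve the learning ability of the proposed model. In the future, reinforcement learning models are expected to enable designs that also consider system performance measures such as reliability and resilience.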

An Intelligent Video Streaming Mechanism based on a Deep Q-Network for QoE Enhancement (QoE 향상을 위한 Deep Q-Network 기반의 지능형 비디오 스트리밍 메커니즘)

  • Kim, ISeul;Hong, Seongjun;Jung, Sungwook;Lim, Kyungshik
    • Journal of Korea Multimedia Society
    • /
    • v.21 no.2
    • /
    • pp.188-198
    • /
    • 2018
  • With the recent development of high-speed wide-area wireless networks and the spread of high-performance wireless devices, the demand for seamless video streaming services in Long Term Evolution (LTE) network environments is ever increasing. To meet this demand and provide enhanced Quality of Experience (QoE) to mobile users, Dynamic Adaptive Streaming over HTTP (DASH) has been actively studied as a way to deliver QoE-enhanced video streaming in dynamic network environments. However, the existing DASH algorithm for selecting the quality of requested video segments is procedural, which limits its ability to adapt to dynamic network conditions. To overcome this limitation, this paper proposes a novel quality selection mechanism based on a Deep Q-Network (DQN) model, the DQN-based DASH ABR ($DQN_{ABR}$) mechanism. The $DQN_{ABR}$ mechanism replaces the existing DASH ABR algorithm with an intelligent deep learning model that optimizes service quality for mobile users through reinforcement learning. The experimental analysis shows that the proposed solution outperforms existing approaches in adapting to dynamic wireless network conditions and in improving the QoE of end users.

A Development of Intelligent Pumping Station Operation System Using Deep Reinforcement Learning (심층 강화학습을 이용한 지능형 빗물펌프장 운영 시스템 개발)

  • Kang, Seung-Ho;Park, Jung-Hyun;Joo, Jin-Gul
    • Convergence Security Journal
    • /
    • v.20 no.1
    • /
    • pp.33-40
    • /
    • 2020
  • A rainwater pumping station located near a river prevents river overflow and flood damage by operating several pumps on its reservoir according to appropriate rules. At present, almost all rainwater pumping stations employ pumping policies based on simple rules that depend only on the water level of the reservoir. The ongoing climate change caused by global warming makes it increasingly difficult to predict the amount of rainfall, so it is difficult to cope with changes in the reservoir water level using such a simple pumping policy. In this paper, we propose a pump operating method based on deep reinforcement learning that selects the appropriate number of operating pumps to keep the reservoir at the proper water level, using the amount of rainfall and the water volume and current water level of the reservoir. To evaluate the performance of the proposed method, simulations are performed using the Storm Water Management Model (SWMM), a dynamic rainfall-runoff-routing simulation model, and the performance of the method is compared with that of a pumping policy currently in use in the field.
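
As a rough illustration of the level-only baseline that this abstract contrasts with the learned policy, the sketch below switches on one more pump each time the reservoir level reaches a threshold. The function name `pumps_by_level` and the threshold values are hypothetical and are not taken from the paper.

```python
def pumps_by_level(level_m, thresholds_m=(1.0, 2.0, 3.0)):
    """Return the number of pumps to run for a given reservoir level.

    One additional pump is switched on for each threshold the level has reached.
    The thresholds are hypothetical placeholders, not values from the paper.
    """
    return sum(1 for t in thresholds_m if level_m >= t)


# The rule looks only at the water level; it ignores rainfall and stored volume.
for level in (0.5, 1.0, 2.5, 3.5):
    print(level, pumps_by_level(level))
```

A learned policy of the kind the abstract describes would instead map rainfall, stored volume, and current level to the pump count.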

Object Part Detection-based Manipulation with an Anthropomorphic Robot Hand Via Human Demonstration Augmented Deep Reinforcement Learning (행동 복제 강화학습 및 딥러닝 사물 부분 검출 기술에 기반한 사람형 로봇손의 사물 조작)

  • Oh, Ji Heon;Ryu, Ga Hyun;Park, Na Hyeon;Anazco, Edwin Valarezo;Lopez, Patricio Rivera;Won, Da Seul;Jeong, Jin Gyun;Chang, Yun Jung;Kim, Tae-Seong
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2020.11a
    • /
    • pp.854-857
    • /
    • 2020
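  • Research on behavior cloning with Deep Reinforcement Learning (DRL) is under way to develop object manipulation intelligence for anthropomorphic robot hands. To ease the difficulty of training an anthropomorphic robot hand with many degrees of freedom (DOF), Human Demonstration Augmented (DA) reinforcement learning through behavior cloning can be used to learn human-like object manipulation. For meaningful grasping, however, a method that recognizes and grasps a specific part of the object is essential. This study proposes a deep learning technique that recognizes a specific part of an object with the YOLO deep learning detector and grasps that part by applying DA-DRL, and validates it by recognizing and grasping the handles of two objects (a hammer and a knife). The proposed learning method should be useful in fields that require interacting with people or using tools according to their purpose.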

A Study on the Classification of Variables Affecting Smartphone Addiction in Decision Tree Environment Using Python Program

  • Kim, Seung-Jae
    • International journal of advanced smart convergence
    • /
    • v.11 no.4
    • /
    • pp.68-80
    • /
    • 2022
  • Since the advent of AI, technology development aimed at implementing complete and sophisticated AI functions has continued. In efforts to develop technologies for full automation, machine learning and deep learning techniques are mainly used. These techniques use supervised learning, unsupervised learning, and reinforcement learning as internal technical elements, and in turn rely on big data analysis to lay the foundation for decision-making. Established decisions are then improved through repeated revision of the decision-making criteria. In other words, big data analysis, which enables data classification and recognition, is important enough to be called a key technical element of AI. Big data analysis is therefore important in itself and requires sophisticated analysis. In this study, among the various tools that can analyze big data, we use a Python program to find out which variables affect smartphone addiction in a decision tree environment. We check whether decision tree classification in Python shows the same performance as other tools and whether it can lend reliability to decision-making about the addictiveness of smartphone use. The results of this study show that big data analysis can be performed without problems using any of the various statistical tools, such as Python and R.

Novel Reward Function for Autonomous Drone Navigating in Indoor Environment

  • Khuong G. T. Diep;Viet-Tuan Le;Tae-Seok Kim;Anh H. Vo;Yong-Guk Kim
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2023.11a
    • /
    • pp.624-627
    • /
    • 2023
  • Unmanned aerial vehicles are gaining in popularity with the development of science and technology, and are being used for a wide range of purposes, including surveillance, rescue, delivery of goods, and data collection. In particular, the ability to avoid obstacles during navigation without human oversight is one of the essential capabilities that a drone must possess. Many recent works have addressed this problem with deep reinforcement learning (DRL) models. The core of a DRL model is its reward function. Therefore, this paper proposes a new reward function with an appropriate action space and employs dueling double deep Q-networks to train a drone to navigate an indoor environment without collision.
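
For readers unfamiliar with the "dueling double deep Q-network" named in this abstract, the following is a minimal NumPy sketch of its two standard ingredients: the dueling aggregation Q(s, a) = V(s) + A(s, a) - mean_a A(s, a), and the double DQN target, where the online network selects the next action and the target network evaluates it. The function names, the toy numbers, and the discount factor are illustrative assumptions; the paper's reward function and network are not reproduced here.

```python
import numpy as np


def dueling_q(value, advantage):
    """Combine V(s) and A(s, a) into Q(s, a), subtracting the mean advantage."""
    value = np.asarray(value, dtype=float)
    advantage = np.asarray(advantage, dtype=float)
    return value[..., None] + advantage - advantage.mean(axis=-1, keepdims=True)


def double_dqn_target(reward, done, q_next_online, q_next_target, gamma=0.99):
    """Double DQN target: the online net picks the action, the target net scores it."""
    q_next_online = np.asarray(q_next_online, dtype=float)
    q_next_target = np.asarray(q_next_target, dtype=float)
    best = np.argmax(q_next_online, axis=-1)
    evaluated = np.take_along_axis(q_next_target, best[..., None], axis=-1)[..., 0]
    not_done = 1.0 - np.asarray(done, dtype=float)
    return np.asarray(reward, dtype=float) + gamma * not_done * evaluated


# Toy batch of two transitions with three actions each (made-up numbers).
q_online = dueling_q([1.0, 0.0], [[0.0, 2.0, 1.0], [3.0, 0.0, 0.0]])
q_target = dueling_q([0.5, 0.5], [[1.0, 0.0, 2.0], [0.0, 1.0, 2.0]])
targets = double_dqn_target([1.0, -1.0], [0.0, 1.0], q_online, q_target, gamma=0.9)
print(targets)
```

Decoupling action selection from evaluation is what distinguishes the double DQN target from the plain DQN target, which takes the maximum over the target network's own estimates.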

DQN Reinforcement Learning for Mountain-Car in OpenAI Gym Environment (OpenAI Gym 환경의 Mountain-Car에 대한 DQN 강화학습)

  • Myung-Ju Kang
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2024.01a
    • /
    • pp.375-377
    • /
    • 2024
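  • In this paper, DQN (Deep Q-Networks) reinforcement learning was carried out on Mountain-Car-v0, a game in the OpenAI Gym environment that can be controlled simply by a program. The DQN network used in this paper consists of one input layer, three hidden layers, and one output layer; ReLU is used as the activation function in the input and hidden layers, and a linear function in the output layer. In the experiment, we examined the reward obtained in each episode when DQN reinforcement learning was run on Mountain-Car-v0 and analyzed how many episodes fell into each reward interval. The results showed that 85 of the 100 episodes obtained a reward of 50 or more.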
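
The layer structure described in this abstract can be sketched roughly as follows, using plain NumPy rather than a deep learning framework. This reads the "input layer" as a first dense layer with ReLU, followed by three ReLU hidden layers and a linear output layer. The layer width of 32, the weight initialization, and the function names are assumptions, since the abstract gives only the layer counts and activations. The observation size (2: position and velocity) and the number of actions (3) are those of the MountainCar-v0 environment.

```python
import numpy as np

OBS_DIM = 2     # MountainCar-v0 observation: (position, velocity)
N_ACTIONS = 3   # push left, no push, push right
WIDTH = 32      # assumed layer width; not stated in the abstract

# One ReLU "input" dense layer, three ReLU hidden layers, one linear output layer.
LAYER_SIZES = (OBS_DIM, WIDTH, WIDTH, WIDTH, WIDTH, N_ACTIONS)


def init_params(rng, sizes):
    """Create a (weight, bias) pair for each pair of consecutive layer sizes."""
    return [(rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def q_values(params, obs):
    """Forward pass: ReLU on every layer except the final linear layer."""
    h = np.asarray(obs, dtype=float)
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)
    w, b = params[-1]
    return h @ w + b


def select_action(params, obs, epsilon, rng):
    """Epsilon-greedy action choice over the predicted Q-values."""
    if rng.random() < epsilon:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q_values(params, obs)))


rng = np.random.default_rng(0)
params = init_params(rng, LAYER_SIZES)
q = q_values(params, [-0.5, 0.0])
action = select_action(params, [-0.5, 0.0], epsilon=0.0, rng=rng)
print(q.shape, action)
```

Training (replay buffer, target network, gradient updates) is omitted; the sketch covers only the network shape and action selection.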

Computation Offloading with Resource Allocation Based on DDPG in MEC

  • Sungwon Moon;Yujin Lim
    • Journal of Information Processing Systems
    • /
    • v.20 no.2
    • /
    • pp.226-238
    • /
    • 2024
  • Recently, multi-access edge computing (MEC) has emerged as a promising technology to alleviate the computing burden of vehicular terminals and efficiently facilitate vehicular applications. Vehicles can improve the quality of experience of applications by offloading their tasks to MEC servers. However, channel conditions are time-varying due to channel interference among vehicles, and path loss is time-varying due to the mobility of vehicles. The task arrival of vehicles is also stochastic. Therefore, it is difficult to determine an optimal offloading and resource allocation decision in the dynamic MEC system because offloading is affected by wireless data transmission. In this paper, we study computation offloading with resource allocation in the dynamic MEC system. The objective is to minimize power consumption and maximize throughput while meeting the delay constraints of tasks. To this end, the method allocates resources for local execution and transmission power for offloading. We define the problem as a Markov decision process and propose an offloading method using a deep reinforcement learning algorithm, deep deterministic policy gradient. Simulation shows that the proposed method outperforms existing methods in terms of throughput and satisfaction of delay constraints.

DeNERT: Named Entity Recognition Model using DQN and BERT

  • Yang, Sung-Min;Jeong, Ok-Ran
    • Journal of the Korea Society of Computer and Information
    • /
    • v.25 no.4
    • /
    • pp.29-35
    • /
    • 2020
  • In this paper, we propose DeNERT, a new structured entity recognition model. Recently, natural language processing has been actively researched using language representation models pre-trained on large corpora. In particular, named entity recognition, one of the fields of natural language processing, typically uses supervised learning, which requires a large training dataset and a large amount of computation. Reinforcement learning learns through trial and error without initial data, is closer to the process of human learning than other machine learning methodologies, and has not yet been widely applied to natural language processing. It is often used in simulation environments such as Atari games and AlphaGo. BERT is a general-purpose language model developed by Google that is pre-trained with a large corpus and a large amount of computation. It has recently shown high performance in natural language processing research and high accuracy in many downstream tasks. In this paper, we propose DeNERT, a new named entity recognition model that uses two deep learning models, DQN and BERT. The proposed model is trained by building a reinforcement learning environment on top of the language representations that are the strength of the general-purpose language model. The DeNERT model trained in this way achieves faster inference and higher performance with a small training dataset. We also validate the named entity recognition performance of our model through experiments.

A Study on Automatic Classification of Characterized Ground Regions on Slopes by a Deep Learning based Image Segmentation (딥러닝 영상처리를 통한 비탈면의 지반 특성화 영역 자동 분류에 관한 연구)

  • Lee, Kyu Beom;Shin, Hyu-Soung;Kim, Seung Hyeon;Ha, Dae Mok;Choi, Isu
    • Tunnel and Underground Space
    • /
    • v.29 no.6
    • /
    • pp.508-522
    • /
    • 2019
  • Because slope failure can cause not only property damage but also casualties, slope stability analysis should be conducted to predict failure and reinforce the slope. This paper defines the ground regions in slope images that can be characterized in terms of slope failure, such as rock mass joint sets, rock mass faults, soil, leakage water, and crush zones. The results show that a deep learning instance segmentation network can recognize and automatically segment the precise shape of ground regions with different characteristics shown in the image. This demonstrates the possibility of supporting slope mapping work and automatically calculating the ground characteristic information of slopes that is needed for decisions such as slope reinforcement.