• Title/Summary/Keyword: Q-learning algorithm


A Study on the Improvement of Heat Energy Efficiency for Utilities of Heat Consumer Plants based on Reinforcement Learning (강화학습을 기반으로 하는 열사용자 기계실 설비의 열효율 향상에 대한 연구)

  • Kim, Young-Gon;Heo, Keol;You, Ga-Eun;Lim, Hyun-Seo;Choi, Jung-In;Ku, Ki-Dong;Eom, Jae-Sik;Jeon, Young-Shin
    • Journal of Energy Engineering
    • /
    • v.27 no.2
    • /
    • pp.26-31
    • /
    • 2018
  • This paper introduces a study on improving the thermal efficiency of district heating consumer facilities using reinforcement learning. As an example, it proposes a general method of constructing a deep Q-network (DQN) using deep Q-learning, a model-free reinforcement learning algorithm. It also introduces a big data platform and an integrated heat management system, specialized for the energy field, that process the huge volume of data produced by IoT sensors installed in many thermal energy control facilities.
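
The model-free deep Q-learning the abstract refers to rests on the standard DQN regression target, y = r + γ·max Q(s′, a′; θ⁻), computed with a separate target network. A minimal sketch of that target computation in NumPy (the toy batch values below are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def dqn_targets(rewards, next_q_target, dones, gamma=0.99):
    """Compute DQN regression targets y = r + gamma * max_a' Q_target(s', a').

    rewards:       (batch,) immediate rewards
    next_q_target: (batch, n_actions) target-network Q-values for next states
    dones:         (batch,) 1.0 where the episode ended, else 0.0
    """
    max_next = next_q_target.max(axis=1)        # max_a' Q_target(s', a')
    return rewards + gamma * (1.0 - dones) * max_next

# Toy batch of two transitions.
rewards = np.array([1.0, 0.5])
next_q = np.array([[0.2, 0.8],    # best next action worth 0.8
                   [0.4, 0.1]])   # best next action worth 0.4
dones = np.array([0.0, 1.0])      # second transition is terminal

y = dqn_targets(rewards, next_q, dones)
```

In a full DQN these targets would be regressed against the online network's Q(s, a); terminal transitions simply keep their immediate reward.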

A Study on Cooperative Traffic Signal Control at multi-intersection (다중 교차로에서 협력적 교통신호제어에 대한 연구)

  • Kim, Dae Ho;Jeong, Ok Ran
    • Journal of IKEEE
    • /
    • v.23 no.4
    • /
    • pp.1381-1386
    • /
    • 2019
  • As traffic congestion in cities becomes more serious, intelligent traffic control is being actively researched. Reinforcement learning is the most widely used approach to traffic signal control, and deep reinforcement learning has recently attracted the attention of researchers. Extended versions of deep reinforcement learning have emerged as the basic algorithm showed high performance in various fields. However, most existing traffic signal control studies were conducted in a single-intersection environment, and a method designed for a single intersection does not consider the traffic conditions of the entire city. In this paper, we propose cooperative traffic signal control for a multi-intersection environment. The control algorithm combines extended versions of deep reinforcement learning and takes the traffic conditions of adjacent intersections into account. In experiments, we compare the proposed algorithm with an existing deep reinforcement learning algorithm and demonstrate the high performance of our model with and without the cooperative method.

LoRa Network based Parking Dispatching System : Queuing Theory and Q-learning Approach (LoRa 망 기반의 주차 지명 시스템 : 큐잉 이론과 큐러닝 접근)

  • Cho, Youngho;Seo, Yeong Geon;Jeong, Dae-Yul
    • Journal of Digital Contents Society
    • /
    • v.18 no.7
    • /
    • pp.1443-1450
    • /
    • 2017
  • The purpose of this study is to develop an intelligent parking dispatching system based on LoRa network technology. During a local festival, many tourists arrive at the festival site simultaneously after sunset. To handle the traffic jams and parking dispatching, many traffic management staff are stationed along the main road to guide cars to available parking lots. Nevertheless, the traffic problems become more serious at the festival's peak time. Such parking dispatching problems are complex and depend on real-time traffic information. We used queuing theory to predict inbound traffic and to measure parking service performance, and the Q-learning algorithm to find the fastest routes and dispatch vehicles efficiently to the available parking lots.
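
The abstract does not state which queuing model the authors used, but the textbook starting point for predicting inbound traffic and service performance is the M/M/1 queue. A minimal sketch (the arrival and service rates are made-up numbers for illustration):

```python
def mm1_metrics(arrival_rate, service_rate):
    """Basic M/M/1 queue metrics: utilization, mean queue length, mean wait.

    arrival_rate (lambda) and service_rate (mu) are, e.g., vehicles per minute;
    the formulas require lambda < mu for a stable queue.
    """
    lam, mu = arrival_rate, service_rate
    if lam >= mu:
        raise ValueError("queue is unstable: arrival rate must be below service rate")
    rho = lam / mu                 # server utilization
    l_q = rho**2 / (1 - rho)       # mean number waiting, Lq
    w_q = l_q / lam                # mean wait in queue via Little's law, Wq = Lq/lambda
    return rho, l_q, w_q

# 3 cars/min arriving at a gate that processes 4 cars/min.
rho, l_q, w_q = mm1_metrics(arrival_rate=3.0, service_rate=4.0)
```

With multiple parking-lot gates an M/M/c model (Erlang C) would be the natural extension.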

Retained Message Delivery Scheme utilizing Reinforcement Learning in MQTT-based IoT Networks (MQTT 기반 IoT 네트워크에서 강화학습을 활용한 Retained 메시지 전송 방법)

  • Yeunwoong Kyung;Tae-Kook Kim;Youngjun Kim
    • Journal of Internet of Things and Convergence
    • /
    • v.10 no.2
    • /
    • pp.131-135
    • /
    • 2024
  • In the MQTT protocol, if the retained flag of a message published by a publisher is set, the message is stored in the broker as a retained message. When a new subscriber subscribes, the broker immediately sends the retained message, allowing the new subscriber to update its view of the current state without waiting for new messages from the publisher. However, sending retained messages can become a traffic overhead if new messages are published frequently, and this overhead grows when new subscribers subscribe frequently. Therefore, in this paper, we propose a retained message delivery scheme that considers the characteristics of the published messages. We model the deliver and wait actions toward new subscribers from the broker's perspective as a reinforcement learning problem and determine the optimal policy through the Q-learning algorithm. Performance analysis confirms that the proposed method outperforms existing methods.
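
The deliver-vs-wait decision described here can be sketched as tabular Q-learning over a toy MDP. Everything below (the two publish-rate states, the reward numbers, the random drift between states) is an illustrative assumption, not the paper's model:

```python
import random

# state 0 = publisher rarely updates, state 1 = publisher updates frequently.
# action 0 = deliver the retained message now, action 1 = wait for a fresh one.
REWARD = {(0, 0): 1.0, (0, 1): 0.0,   # low rate: delivering is useful
          (1, 0): -1.0, (1, 1): 1.0}  # high rate: a delivery is soon stale

def train(steps=5000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}
    s = 0
    for _ in range(steps):
        # epsilon-greedy action selection
        a = rng.choice((0, 1)) if rng.random() < eps else max((0, 1), key=lambda x: q[(s, x)])
        r = REWARD[(s, a)]
        s_next = rng.choice((0, 1))           # publish rate drifts randomly
        best_next = max(q[(s_next, 0)], q[(s_next, 1)])
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])  # Q-learning update
        s = s_next
    return q

q = train()
policy = {s: max((0, 1), key=lambda a: q[(s, a)]) for s in (0, 1)}
```

Under these assumed rewards the learned greedy policy delivers when updates are rare and waits when they are frequent.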

Model predictive control combined with iterative learning control for nonlinear batch processes

  • Lee, Kwang-Soon;Kim, Won-Cheol;Lee, Jay H.
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집)
    • /
    • 1996.10a
    • /
    • pp.299-302
    • /
    • 1996
  • A control algorithm is proposed for nonlinear multi-input multi-output (MIMO) batch processes by combining quadratic iterative learning control (Q-ILC) with model predictive control (MPC). Both controllers are designed based on output feedback, and a Kalman filter is incorporated for state estimation. The novelty of the proposed algorithm lies in the fact that, unlike feedback-only control, unknown sustained disturbances that repeat over batches can be completely rejected, and asymptotically perfect tracking is possible in the zero-random-disturbance case even with an uncertain process model.
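
The batch-to-batch rejection of repeating disturbances comes from the iterative learning principle: the input for batch k+1 is corrected with the tracking error of batch k. A minimal sketch of that idea on an assumed scalar linear batch plant (this is the generic ILC update, not the paper's Q-ILC + MPC design; the plant, gain, and trajectory are made up):

```python
import numpy as np

# Linear lifted batch plant over an N-step batch: y = G @ u + d, where d is an
# unknown disturbance that repeats identically every batch.
N = 20
G = np.tril(0.5 * np.ones((N, N)))        # lower-triangular (causal) plant map
d = 0.3 * np.ones(N)                      # repeating batch disturbance
y_ref = np.sin(np.linspace(0, np.pi, N))  # target trajectory for every batch

u = np.zeros(N)
errors = []
for batch in range(30):
    y = G @ u + d
    e = y_ref - y                         # tracking error of this batch
    errors.append(np.linalg.norm(e))
    # learning update u_{k+1} = u_k + L e_k with L = 0.5 * G^{-1},
    # so e_{k+1} = (I - G L) e_k = 0.5 e_k: the error halves every batch
    u = u + 0.5 * np.linalg.solve(G, e)
```

Even though d is never measured directly, the repeated error correction drives the tracking error toward zero over batches, which is exactly the property feedback-only control lacks.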


The Development of an Intelligent Home Energy Management System Integrated with a Vehicle-to-Home Unit using a Reinforcement Learning Approach

  • Ohoud Almughram;Sami Ben Slama;Bassam Zafar
    • International Journal of Computer Science & Network Security
    • /
    • v.24 no.4
    • /
    • pp.87-106
    • /
    • 2024
  • Vehicle-to-Home (V2H) and Home Centralized Photovoltaic (HCPV) systems can address various energy storage issues and enhance demand response programs. Renewable energy sources, such as solar panels and wind turbines, help address the energy gap. However, no energy management system is currently available to regulate the uncertainty of renewable energy sources, electric vehicles, and appliance consumption within a smart microgrid. Therefore, this study investigated the impact of solar photovoltaic (PV) panels, electric vehicles, and Micro-Grid (MG) storage on maximum solar radiation hours. Several Deep Learning (DL) algorithms were applied to account for the uncertainty. Moreover, a Reinforcement Learning HCPV (RL-HCPV) algorithm was created for efficient real-time energy scheduling decisions. The proposed algorithm managed the energy demand between PV solar energy generation and vehicle energy storage. RL-HCPV was modeled according to several constraints to meet household electricity demands in sunny and cloudy weather. Simulations demonstrated how the proposed RL-HCPV system can efficiently handle demand response and how V2H can help smooth the appliance load profile and reduce power consumption costs with sustainable power generation. The results demonstrated the advantages of utilizing RL and V2H as a potential storage technology for smart buildings.

Performance Analysis of Deep Reinforcement Learning for Crop Yield Prediction (작물 생산량 예측을 위한 심층강화학습 성능 분석)

  • Ohnmar Khin;Sung-Keun Lee
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.18 no.1
    • /
    • pp.99-106
    • /
    • 2023
  • Recently, many studies on crop yield prediction using deep learning have been conducted. These algorithms have difficulty constructing a linear map between input data sets and crop prediction results, and their performance depends heavily on the rate of acquired attributes. Deep reinforcement learning can overcome these limitations. This paper analyzes the performance of DQN, Double DQN, and Dueling DQN for improving crop yield prediction. The DQN algorithm suffers from the overestimation problem, whereas Double DQN reduces overestimation and leads to better results. The proposed models achieve this by reducing erroneous estimates and increasing prediction accuracy.
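
The overestimation fix the abstract credits to Double DQN is a one-line change to the target: the online network selects the next action and the target network evaluates it, instead of the target network doing both. A sketch with assumed toy Q-values (the numbers are not from the paper):

```python
import numpy as np

def dqn_target(r, q_target_next, gamma=0.99):
    # Standard DQN: the target network both selects and evaluates the action.
    return r + gamma * q_target_next.max()

def double_dqn_target(r, q_online_next, q_target_next, gamma=0.99):
    # Double DQN: the online network selects the action, the target network
    # evaluates it, which curbs the max-operator's overestimation bias.
    a = int(np.argmax(q_online_next))
    return r + gamma * q_target_next[a]

# Toy next-state Q-values for three actions.
q_online_next = np.array([0.9, 0.2, 0.1])   # online net prefers action 0
q_target_next = np.array([0.5, 1.4, 0.3])   # target net overestimates action 1

y_dqn = dqn_target(1.0, q_target_next)                        # built on the inflated 1.4
y_ddqn = double_dqn_target(1.0, q_online_next, q_target_next) # built on 0.5 instead
```

Whenever one network's noise inflates an action value, decoupling selection from evaluation keeps that inflated value out of the target, so y_ddqn ≤ y_dqn here.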

Development of reinforcement learning algorithm with continuous action selection for acrobot (Acrobot 제어를 위한 강화학습에서의 연속적인 행위 선택 알고리즘의 개발)

  • Seo, Sung-Hwan;Jang, Si-Young;Suh, Il-Hong
    • Proceedings of the KIEE Conference
    • /
    • 2003.07d
    • /
    • pp.2387-2389
    • /
    • 2003
  • The acrobot is a representative nonlinear, underactuated system, and its control objectives are swing-up control and balancing control. Much prior work has pursued these two objectives, but the existing methods switch between two independent controllers depending on the acrobot's state, which raises the difficulty of choosing the switching point and the problem of long overall learning time to achieve both objectives. To improve on this, we study a single controller that solves both control objectives of the acrobot simultaneously, based on our previously proposed region-based Q-learning [11], which can approximate a continuous state space. Experiments applying the proposed method to a physically built acrobot verify its usefulness.


Decision Support Method in Dynamic Car Navigation Systems by Q-Learning

  • Hong, Soo-Jung;Hong, Eon-Joo;Oh, Kyung-Whan
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2002.05a
    • /
    • pp.6-9
    • /
    • 2002
  • Humanity's long-standing drive to create better means of transportation has borne fruit in today's remarkable vehicles, and the car navigation system is one such result. By attaching a navigation system that can judge intelligently and process information, a car can evolve into a more advanced means of transport. A drawback of such navigation systems is that they must perform many tasks with only limited resources. Route planning, one of the main tasks of a navigation system, therefore requires an intelligent method that can find the optimal route even under limited resources. Two methods conventionally used for route planning are Dijkstra's algorithm and the A* algorithm. Both find the optimal route, but by the nature of the algorithms they must search a wide area and take a long running time, and the A* algorithm additionally requires a heuristic function as extra information to compute the route. In this paper, we implement an optimal route extraction method using Q-learning, a kind of reinforcement learning, that searches a small area, takes little time to extract the optimal route, and can moreover extract optimal routes in a dynamic traffic environment.
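
Route planning with Q-learning can be sketched as tabular learning on a road graph, with the reward set to the negative travel cost so that the greedy policy follows the cheapest route. The tiny four-node graph below is an illustrative assumption, not the paper's road network:

```python
import random

# edges[node] maps neighbor -> travel cost; node 3 is the destination.
EDGES = {0: {1: 1.0, 2: 4.0},
         1: {2: 1.0, 3: 5.0},
         2: {3: 1.0},
         3: {}}
GOAL = 3

def train(episodes=3000, alpha=0.2, gamma=0.95, eps=0.2, seed=1):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in EDGES for a in EDGES[s]}
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            nbrs = list(EDGES[s])
            # epsilon-greedy choice of the next road segment
            a = rng.choice(nbrs) if rng.random() < eps else max(nbrs, key=lambda n: q[(s, n)])
            r = -EDGES[s][a]                  # reward = negative travel cost
            best_next = max((q[(a, n)] for n in EDGES[a]), default=0.0)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = a
    return q

q = train()
path, s = [0], 0
while s != GOAL:                              # follow the learned greedy policy
    s = max(EDGES[s], key=lambda n: q[(s, n)])
    path.append(s)
```

Unlike Dijkstra's or A*, the learned Q-table can be updated incrementally as edge costs change, which is the property the abstract highlights for dynamic traffic.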


Developing Novel Algorithms to Reduce the Data Requirements of the Capture Matrix for a Wind Turbine Certification (풍력 발전기 평가를 위한 수집 행렬 데이터 절감 알고리즘 개발)

  • Lee, Jehyun;Choi, Jungchul
    • New & Renewable Energy
    • /
    • v.16 no.1
    • /
    • pp.15-24
    • /
    • 2020
  • For mechanical load testing of wind turbines, a capture matrix is constructed over various ranges of wind speed according to the international standard IEC 61400-13. The conventional method wastes a considerable amount of data through its invalid-data policy: segment the data into 10-minute blocks, then remove the invalid ones. We previously suggested an alternative way to reduce the total amount of data needed to build a capture matrix, but the efficient selection of data remained an open question. This paper introduces optimization algorithms that construct the capture matrix with less data. Heuristic algorithms (simple stacking and lowest-frequency-first), a population method (particle swarm optimization), and Q-learning with epsilon-greedy exploration are compared. All algorithms perform better than the conventional method, although the degree of improvement varies widely. The best performance was achieved by the heuristic lowest-frequency-first method, followed closely by particle swarm optimization: approximately 28% data reduction on average and more than 40% at maximum. Unexpectedly, the worst performance came from Q-learning, which had seemed a promising candidate at the outset. This study is helpful not only for wind turbine evaluation, particularly from the viewpoint of cost, but also for understanding the nature of wind speed data.
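
The epsilon-greedy exploration paired with Q-learning here is the standard rule: with probability ε take a random action, otherwise the action with the highest current Q-value. A self-contained sketch (the value table is made up for illustration):

```python
import random

def epsilon_greedy(q_values, eps, rng):
    """With probability eps pick a uniformly random action, else the greedy one."""
    if rng.random() < eps:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

# Rough check of the behavior on a fixed value table: with eps = 0.1 the
# greedy action (index 1) should be chosen about 90% + 10%/3 of the time.
rng = random.Random(0)
q = [0.1, 0.9, 0.3]
picks = [epsilon_greedy(q, eps=0.1, rng=rng) for _ in range(10_000)]
greedy_share = picks.count(1) / len(picks)
```

In practice ε is often decayed over training so early exploration gives way to exploitation of the learned values.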