• Title/Summary/Keyword: Deep Q learning

Search Result 85, Processing Time 0.031 seconds

Prediction Technique of Energy Consumption based on Reinforcement Learning in Microgrids (마이크로그리드에서 강화학습 기반 에너지 사용량 예측 기법)

  • Sun, Young-Ghyu;Lee, Jiyoung;Kim, Soo-Hyun;Kim, Soohwan;Lee, Heung-Jae;Kim, Jin-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.21 no.3
    • /
    • pp.175-181
    • /
    • 2021
  • This paper analyzes the artificial intelligence-based approach for short-term energy consumption prediction. In this paper, we employ the reinforcement learning algorithms to improve the limitation of the supervised learning algorithms which usually utilize to the short-term energy consumption prediction technologies. The supervised learning algorithm-based approaches have high complexity because the approaches require contextual information as well as energy consumption data for sufficient performance. We propose a deep reinforcement learning algorithm based on multi-agent to predict energy consumption only with energy consumption data for improving the complexity of data and learning models. The proposed scheme is simulated using public energy consumption data and confirmed the performance. The proposed scheme can predict a similar value to the actual value except for the outlier data.

Performance Comparison of Reinforcement Learning Algorithms for Futures Scalping (해외선물 스캘핑을 위한 강화학습 알고리즘의 성능비교)

  • Jung, Deuk-Kyo;Lee, Se-Hun;Kang, Jae-Mo
    • The Journal of the Convergence on Culture Technology
    • /
    • v.8 no.5
    • /
    • pp.697-703
    • /
    • 2022
  • Due to the recent economic downturn caused by Covid-19 and the unstable international situation, many investors are choosing the derivatives market as a means of investment. However, the derivatives market has a greater risk than the stock market, and research on the market of market participants is insufficient. Recently, with the development of artificial intelligence, machine learning has been widely used in the derivatives market. In this paper, reinforcement learning, one of the machine learning techniques, is applied to analyze the scalping technique that trades futures in minutes. The data set consists of 21 attributes using the closing price, moving average line, and Bollinger band indicators of 1 minute and 3 minute data for 6 months by selecting 4 products among futures products traded at trading firm. In the experiment, DNN artificial neural network model and three reinforcement learning algorithms, namely, DQN (Deep Q-Network), A2C (Advantage Actor Critic), and A3C (Asynchronous A2C) were used, and they were trained and verified through learning data set and test data set. For scalping, the agent chooses one of the actions of buying and selling, and the ratio of the portfolio value according to the action result is rewarded. Experiment results show that the energy sector products such as Heating Oil and Crude Oil yield relatively high cumulative returns compared to the index sector products such as Mini Russell 2000 and Hang Seng Index.

Development of Prediction Model of Chloride Diffusion Coefficient using Machine Learning (기계학습을 이용한 염화물 확산계수 예측모델 개발)

  • Kim, Hyun-Su
    • Journal of Korean Association for Spatial Structures
    • /
    • v.23 no.3
    • /
    • pp.87-94
    • /
    • 2023
  • Chloride is one of the most common threats to reinforced concrete (RC) durability. Alkaline environment of concrete makes a passive layer on the surface of reinforcement bars that prevents the bar from corrosion. However, when the chloride concentration amount at the reinforcement bar reaches a certain level, deterioration of the passive protection layer occurs, causing corrosion and ultimately reducing the structure's safety and durability. Therefore, understanding the chloride diffusion and its prediction are important to evaluate the safety and durability of RC structure. In this study, the chloride diffusion coefficient is predicted by machine learning techniques. Various machine learning techniques such as multiple linear regression, decision tree, random forest, support vector machine, artificial neural networks, extreme gradient boosting annd k-nearest neighbor were used and accuracy of there models were compared. In order to evaluate the accuracy, root mean square error (RMSE), mean square error (MSE), mean absolute error (MAE) and coefficient of determination (R2) were used as prediction performance indices. The k-fold cross-validation procedure was used to estimate the performance of machine learning models when making predictions on data not used during training. Grid search was applied to hyperparameter optimization. It has been shown from numerical simulation that ensemble learning methods such as random forest and extreme gradient boosting successfully predicted the chloride diffusion coefficient and artificial neural networks also provided accurate result.

Deep Learning-Based, Real-Time, False-Pick Filter for an Onsite Earthquake Early Warning (EEW) System (온사이트 지진조기경보를 위한 딥러닝 기반 실시간 오탐지 제거)

  • Seo, JeongBeom;Lee, JinKoo;Lee, Woodong;Lee, SeokTae;Lee, HoJun;Jeon, Inchan;Park, NamRyoul
    • Journal of the Earthquake Engineering Society of Korea
    • /
    • v.25 no.2
    • /
    • pp.71-81
    • /
    • 2021
  • This paper presents a real-time, false-pick filter based on deep learning to reduce false alarms of an onsite Earthquake Early Warning (EEW) system. Most onsite EEW systems use P-wave to predict S-wave. Therefore, it is essential to properly distinguish P-waves from noises or other seismic phases to avoid false alarms. To reduce false-picks causing false alarms, this study made the EEWNet Part 1 'False-Pick Filter' model based on Convolutional Neural Network (CNN). Specifically, it modified the Pick_FP (Lomax et al.) to generate input data such as the amplitude, velocity, and displacement of three components from 2 seconds ahead and 2 seconds after the P-wave arrival following one-second time steps. This model extracts log-mel power spectrum features from this input data, then classifies P-waves and others using these features. The dataset consisted of 3,189,583 samples: 81,394 samples from event data (727 events in the Korean Peninsula, 103 teleseismic events, and 1,734 events in Taiwan) and 3,108,189 samples from continuous data (recorded by seismic stations in South Korea for 27 months from 2018 to 2020). This model was trained with 1,826,357 samples through balancing, then tested on continuous data samples of the year 2019, filtering more than 99% of strong false-picks that could trigger false alarms. This model was developed as a module for USGS Earthworm and is written in C language to operate with minimal computing resources.

Resource Allocation Strategy of Internet of Vehicles Using Reinforcement Learning

  • Xi, Hongqi;Sun, Huijuan
    • Journal of Information Processing Systems
    • /
    • v.18 no.3
    • /
    • pp.443-456
    • /
    • 2022
  • An efficient and reasonable resource allocation strategy can greatly improve the service quality of Internet of Vehicles (IoV). However, most of the current allocation methods have overestimation problem, and it is difficult to provide high-performance IoV network services. To solve this problem, this paper proposes a network resource allocation strategy based on deep learning network model DDQN. Firstly, the method implements the refined modeling of IoV model, including communication model, user layer computing model, edge layer offloading model, mobile model, etc., similar to the actual complex IoV application scenario. Then, the DDQN network model is used to calculate and solve the mathematical model of resource allocation. By decoupling the selection of target Q value action and the calculation of target Q value, the phenomenon of overestimation is avoided. It can provide higher-quality network services and ensure superior computing and processing performance in actual complex scenarios. Finally, simulation results show that the proposed method can maintain the network delay within 65 ms and show excellent network performance in high concurrency and complex scenes with task data volume of 500 kbits.

Mapless Navigation Based on DQN Considering Moving Obstacles, and Training Time Reduction Algorithm (이동 장애물을 고려한 DQN 기반의 Mapless Navigation 및 학습 시간 단축 알고리즘)

  • Yoon, Beomjin;Yoo, Seungryeol
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.3
    • /
    • pp.377-383
    • /
    • 2021
  • Recently, in accordance with the 4th industrial revolution, The use of autonomous mobile robots for flexible logistics transfer is increasing in factories, the warehouses and the service areas, etc. In large factories, many manual work is required to use Simultaneous Localization and Mapping(SLAM), so the need for the improved mobile robot autonomous driving is emerging. Accordingly, in this paper, an algorithm for mapless navigation that travels in an optimal path avoiding fixed or moving obstacles is proposed. For mapless navigation, the robot is trained to avoid fixed or moving obstacles through Deep Q Network (DQN) and accuracy 90% and 93% are obtained for two types of obstacle avoidance, respectively. In addition, DQN requires a lot of learning time to meet the required performance before use. To shorten this, the target size change algorithm is proposed and confirmed the reduced learning time and performance of obstacle avoidance through simulation.

Novel Reward Function for Autonomous Drone Navigating in Indoor Environment

  • Khuong G. T. Diep;Viet-Tuan Le;Tae-Seok Kim;Anh H. Vo;Yong-Guk Kim
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2023.11a
    • /
    • pp.624-627
    • /
    • 2023
  • Unmanned aerial vehicles are gaining in popularity with the development of science and technology, and are being used for a wide range of purposes, including surveillance, rescue, delivery of goods, and data collection. In particular, the ability to avoid obstacles during navigation without human oversight is one of the essential capabilities that a drone must possess. Many works currently have solved this problem by implementing deep reinforcement learning (DRL) model. The essential core of a DRL model is reward function. Therefore, this paper proposes a new reward function with appropriate action space and employs dueling double deep Q-Networks to train a drone to navigate in indoor environment without collision.

Edge Caching Based on Reinforcement Learning Considering Edge Coverage Overlap in Vehicle Environment (차량 환경에서 엣지 커버리지 오버랩을 고려한 강화학습 기반의 엣지 캐싱)

  • Choi, Yoonjeong;Lim, Yujin
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2022.05a
    • /
    • pp.110-113
    • /
    • 2022
  • 인터넷을 통해 주위 사물과 연결된 차량은 사용자에게 편리성을 제공하기 위해 다양한 콘텐츠를 요구하는데 클라우드로부터 가져오는 시간이 비교적 오래 걸리기 때문에 차량과 물리적으로 가까운 위치에 캐싱하는 기법들이 등장하고 있다. 본 논문에서는 기반 시설이 밀집하게 설치된 도시 환경에서 maximum distance separable(MDS) 코딩을 사용해 road side unit(RSU)에 캐싱하는 방법에 대해 연구하였다. RSU의 중복된 서비스 커버리지 지역을 고려하여 차량의 콘텐츠 요구에 대한 RSU hit ratio를 높이기 위해 deep Q-learning(DQN)를 사용하였다. 실험 결과 비교 알고리즘보다 hit raito 측면에서 더 높은 성능을 보이는 것을 증명하였다.

A Reinforcement Learning Framework for Autonomous Cell Activation and Customized Energy-Efficient Resource Allocation in C-RANs

  • Sun, Guolin;Boateng, Gordon Owusu;Huang, Hu;Jiang, Wei
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.8
    • /
    • pp.3821-3841
    • /
    • 2019
  • Cloud radio access networks (C-RANs) have been regarded in recent times as a promising concept in future 5G technologies where all DSP processors are moved into a central base band unit (BBU) pool in the cloud, and distributed remote radio heads (RRHs) compress and forward received radio signals from mobile users to the BBUs through radio links. In such dynamic environment, automatic decision-making approaches, such as artificial intelligence based deep reinforcement learning (DRL), become imperative in designing new solutions. In this paper, we propose a generic framework of autonomous cell activation and customized physical resource allocation schemes for energy consumption and QoS optimization in wireless networks. We formulate the problem as fractional power control with bandwidth adaptation and full power control and bandwidth allocation models and set up a Q-learning model to satisfy the QoS requirements of users and to achieve low energy consumption with the minimum number of active RRHs under varying traffic demand and network densities. Extensive simulations are conducted to show the effectiveness of our proposed solution compared to existing schemes.

A DQN-based Two-Stage Scheduling Method for Real-Time Large-Scale EVs Charging Service

  • Tianyang Li;Yingnan Han;Xiaolong Li
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.3
    • /
    • pp.551-569
    • /
    • 2024
  • With the rapid development of electric vehicles (EVs) industry, EV charging service becomes more and more important. Especially, in the case of suddenly drop of air temperature or open holidays that large-scale EVs seeking for charging devices (CDs) in a short time. In such scenario, inefficient EV charging scheduling algorithm might lead to a bad service quality, for example, long queueing times for EVs and unreasonable idling time for charging devices. To deal with this issue, this paper propose a Deep-Q-Network (DQN) based two-stage scheduling method for the large-scale EVs charging service. Fine-grained states with two delicate neural networks are proposed to optimize the sequencing of EVs and charging station (CS) arrangement. Two efficient algorithms are presented to obtain the optimal EVs charging scheduling scheme for large-scale EVs charging demand. Three case studies show the superiority of our proposal, in terms of a high service quality (minimized average queuing time of EVs and maximized charging performance at both EV and CS sides) and achieve greater scheduling efficiency. The code and data are available at THE CODE AND DATA.