• Title/Summary/Keyword: Q-learning algorithm

A Research on Low-power Buffer Management Algorithm based on Deep Q-Learning approach for IoT Networks (IoT 네트워크에서의 심층 강화학습 기반 저전력 버퍼 관리 기법에 관한 연구)

  • Song, Taewon
    • Journal of Internet of Things and Convergence
    • /
    • v.8 no.4
    • /
    • pp.1-7
    • /
    • 2022
  • As the number of IoT devices increases, power management of the cluster head, which acts as a gateway between the cluster and the sink node in the IoT network, becomes crucial. Particularly when the cluster head is a mobile wireless terminal, the power consumption of the IoT network must be minimized over its lifetime. In addition, the transmission delay in the IoT network is one of the primary metrics for rapid information collection. In this paper, we propose a low-power buffer management algorithm that takes the information transmission delay into account. By forwarding or skipping received packets using deep Q-learning, a deep reinforcement learning method, the proposed scheme reduces power consumption while keeping the transmission delay low. The proposed approach is shown to reduce power consumption and to improve delay relative to the existing buffer management technique used as a baseline under the slotted ALOHA protocol.
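
The abstract gives no implementation, but the forward-or-skip decision it describes maps naturally onto a small deep Q-learning loop. The sketch below is a minimal Python/PyTorch illustration under assumed state features (queue level, packet age, residual energy), reward values, and network size; none of these details come from the paper.

```python
# Minimal deep Q-learning sketch for a forward/skip buffer decision.
# State features and reward values are illustrative assumptions, not the paper's design.
import random
import torch
import torch.nn as nn

ACTIONS = ["forward", "skip"]

q_net = nn.Sequential(nn.Linear(3, 32), nn.ReLU(), nn.Linear(32, len(ACTIONS)))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma, epsilon = 0.95, 0.1

def select_action(state):
    """Epsilon-greedy choice between forwarding and skipping a packet."""
    if random.random() < epsilon:
        return random.randrange(len(ACTIONS))
    with torch.no_grad():
        return int(q_net(state).argmax())

def td_update(state, action, reward, next_state):
    """One temporal-difference step toward r + gamma * max_a' Q(s', a')."""
    with torch.no_grad():
        target = reward + gamma * q_net(next_state).max()
    pred = q_net(state)[action]
    loss = nn.functional.mse_loss(pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# One interaction step with made-up values: forwarding costs energy,
# skipping adds delay (hypothetical reward shaping).
s = torch.tensor([0.4, 0.2, 0.8])        # queue level, packet age, residual energy
a = select_action(s)
r = torch.tensor(-0.1 if ACTIONS[a] == "forward" else -0.3)
td_update(s, a, r, torch.tensor([0.35, 0.1, 0.79]))
```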

Optimal Design of Semi-Active Mid-Story Isolation System using Supervised Learning and Reinforcement Learning (지도학습과 강화학습을 이용한 준능동 중간층면진시스템의 최적설계)

  • Kang, Joo-Won;Kim, Hyun-Su
    • Journal of Korean Association for Spatial Structures
    • /
    • v.21 no.4
    • /
    • pp.73-80
    • /
    • 2021
  • A mid-story isolation system was proposed for seismic response reduction of high-rise buildings and showed good control performance. The control performance of a mid-story isolation system can be enhanced by introducing semi-active control devices into the isolation system. The seismic response reduction capacity of a semi-active mid-story isolation system mainly depends on the control algorithm. In this study, an AI (Artificial Intelligence)-based control algorithm was developed for control of a semi-active mid-story isolation system. The Shiodome Sumitomo building in Japan, an actual structure with a mid-story isolation system, was used as the example structure, and an MR (magnetorheological) damper was used to make the isolation system semi-active in the example model. In the numerical simulation, a seismic response prediction model was generated with a supervised learning model, an RNN (Recurrent Neural Network). A Deep Q-Network (DQN), one of the reinforcement learning algorithms, was employed to develop the control algorithm. The numerical simulation results show that the DQN algorithm can effectively control a semi-active mid-story isolation system, successfully reducing seismic responses.
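
As a loose illustration of the two learning components this abstract mentions, the sketch below pairs an LSTM-type response predictor (standing in for the paper's RNN prediction model) with a small Q-network over discretized MR damper commands. Input features, shapes, and the number of command levels are assumptions, not the authors' configuration.

```python
# Sketch of an RNN-type response predictor plus a Q-network over discretized
# MR damper commands. Features, sizes, and levels are assumed for illustration.
import torch
import torch.nn as nn

N_FEATURES = 4        # e.g. ground accel., isolation-layer drift, velocity, previous command (assumed)
DAMPER_LEVELS = 5     # discretized MR damper input voltages (assumed)

class ResponsePredictor(nn.Module):
    """Supervised surrogate: sequence of measurements -> next-step peak response."""
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(N_FEATURES, 32, batch_first=True)
        self.head = nn.Linear(32, 1)
    def forward(self, seq):              # seq: (batch, time, N_FEATURES)
        out, _ = self.lstm(seq)
        return self.head(out[:, -1])

q_net = nn.Sequential(nn.Linear(N_FEATURES, 64), nn.ReLU(),
                      nn.Linear(64, DAMPER_LEVELS))

def damper_command(state):
    """Greedy DQN policy over the discretized damper voltage levels."""
    with torch.no_grad():
        return int(q_net(state).argmax())

predictor = ResponsePredictor()
state = torch.tensor([0.02, -0.01, 0.005, 0.4])      # one time-step of assumed measurements
next_cmd = damper_command(state)                     # index into the voltage levels
predicted_peak = predictor(state.repeat(1, 10, 1))   # dummy (batch=1, time=10) sequence
```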

A Study on Application of Reinforcement Learning Algorithm Using Pixel Data (픽셀 데이터를 이용한 강화 학습 알고리즘 적용에 관한 연구)

  • Moon, Saemaro;Choi, Yonglak
    • Journal of Information Technology Services
    • /
    • v.15 no.4
    • /
    • pp.85-95
    • /
    • 2016
  • Recently, deep learning and machine learning have attracted considerable attention, and many supporting frameworks have appeared. In the artificial intelligence field, a large body of research is underway to apply this knowledge to complex problem solving, which requires applying various learning algorithms and training methods to artificial intelligence systems. In addition, performance evaluation of decision-making agents is still scarce. The decision-making agent designed in this research finds optimal solutions with reinforcement learning methods: it collects raw pixel data observed from dynamic environments and makes decisions by itself based on that data. The agent uses convolutional neural networks to classify the situations it confronts, and the data observed from the environment is preprocessed before being used. This research describes how the convolutional neural networks and the decision-making agent are configured, analyzes learning performance with a value-based algorithm and a policy-based algorithm, Deep Q-Networks and Policy Gradient, sets forth their differences, and demonstrates how the convolutional neural networks affect overall learning performance when pixel data is used. This research is expected to contribute to the improvement of artificial intelligence systems that can efficiently find optimal solutions using features extracted from raw pixel data.
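
The pixel-based pipeline the abstract outlines (preprocessed frames fed to a convolutional Q-network) can be sketched roughly as follows. Frame size, stack depth, action count, and layer sizes are assumptions in the spirit of standard DQN setups, not the paper's exact configuration.

```python
# Rough sketch: raw frames are preprocessed (scaled, stacked) and a convolutional
# network maps them to action values (the DQN case). Sizes are assumptions.
import torch
import torch.nn as nn

N_ACTIONS, STACK, SIZE = 4, 4, 84    # assumed action count, frame stack, resolution

cnn_q_net = nn.Sequential(
    nn.Conv2d(STACK, 16, kernel_size=8, stride=4), nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
    nn.Flatten(),
    nn.Linear(32 * 9 * 9, 256), nn.ReLU(),
    nn.Linear(256, N_ACTIONS),
)

def preprocess(frames):
    """Assumed preprocessing: stack of grayscale frames scaled to [0, 1]."""
    return torch.as_tensor(frames, dtype=torch.float32).unsqueeze(0) / 255.0

frames = torch.randint(0, 256, (STACK, SIZE, SIZE))   # stand-in for observed pixels
q_values = cnn_q_net(preprocess(frames))
action = int(q_values.argmax())                       # greedy action from pixel input
```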

Dynamic Resource Allocation in Distributed Cloud Computing (분산 클라우드 컴퓨팅을 위한 동적 자원 할당 기법)

  • Ahn, TaeHyoung;Kim, Yena;Lee, SuKyoung
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.38B no.7
    • /
    • pp.512-518
    • /
    • 2013
  • A resource allocation algorithm has a high impact on user satisfaction as well as on the ability to accommodate and process services in distributed cloud computing. In other words, service rejections, which occur when datacenters do not have enough resources, degrade the level of user satisfaction. Therefore, in this paper, we propose a resource allocation algorithm that considers the cloud domain's remaining resources to minimize the number of service rejections. Based on Q-Learning, the resource allocation rate is increased toward the maximum allocation rate when the remaining resources are sufficient and is lowered otherwise, so that service rejections are avoided. To demonstrate this, we compare the proposed algorithm with two previous works and show that the proposed algorithm yields a smaller number of service rejections.
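
A minimal tabular Q-learning sketch in the spirit of this abstract is shown below: the state is a discretized remaining-resource level, the action is an allocation rate, and rejections are heavily penalized. The discretization, candidate rates, and reward values are illustrative assumptions only.

```python
# Tabular Q-learning over remaining-resource levels and allocation rates.
# The toy environment and rewards are assumptions, not the paper's model.
import numpy as np

resource_levels = 10                     # discretized remaining-resource states
alloc_rates = [0.25, 0.5, 0.75, 1.0]     # candidate allocation rates (actions)
Q = np.zeros((resource_levels, len(alloc_rates)))
alpha, gamma, epsilon = 0.1, 0.9, 0.1
rng = np.random.default_rng(0)

def step(state, action):
    """Toy environment: allocating more than what remains causes a rejection."""
    demand = alloc_rates[action]
    remaining = state / (resource_levels - 1)
    if demand > remaining:
        return state, -10.0              # service rejection: heavy penalty
    next_state = int(round((remaining - demand * 0.5) * (resource_levels - 1)))
    return next_state, demand            # reward: amount of demand served

for episode in range(2000):
    s = resource_levels - 1              # start each episode with full resources
    for _ in range(20):
        a = rng.integers(len(alloc_rates)) if rng.random() < epsilon else int(Q[s].argmax())
        s2, r = step(s, a)
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2
```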

Topic directed Web Spidering using Reinforcement Learning (강화학습을 이용한 주제별 웹 탐색)

  • Lim, Soo-Yeon
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.15 no.4
    • /
    • pp.395-399
    • /
    • 2005
  • In this paper, we present the HIGH-Q learning algorithm, based on reinforcement learning, for faster and more exact topic-directed web spidering. The purpose of reinforcement learning is to maximize the rewards received from the environment; a reinforcement learning agent learns by interacting with the external environment through trial and error. We performed experiments comparing the proposed reinforcement learning method with a breadth-first search method for searching web pages. As a result, the reinforcement learning method using discounted future rewards searched a smaller number of pages to find the result pages.
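
The core idea the abstract describes, scoring links with Q-values so that discounted future rewards steer the crawler toward on-topic pages instead of breadth-first expansion, can be sketched as follows. The tiny link graph and reward function are made up for the example; the paper's HIGH-Q algorithm itself is not reproduced here.

```python
# Q-values over (page, link) pairs; discounted future rewards propagate back
# along link paths, so links leading to on-topic pages are preferred.
from collections import defaultdict

graph = {"start": ["a", "b"], "a": ["topic"], "b": ["c"], "c": [], "topic": []}
on_topic = {"topic"}                     # toy relevance labels

Q = defaultdict(float)                   # Q[(page, link)]
alpha, gamma = 0.5, 0.8

def crawl_episode(page="start", depth=5):
    while depth and graph[page]:
        link = max(graph[page], key=lambda l: Q[(page, l)])   # greedy link choice
        reward = 1.0 if link in on_topic else 0.0
        future = max((Q[(link, l)] for l in graph[link]), default=0.0)
        Q[(page, link)] += alpha * (reward + gamma * future - Q[(page, link)])
        page, depth = link, depth - 1

for _ in range(10):
    crawl_episode()
# After a few episodes, the link toward the on-topic page scores higher:
# Q[("start", "a")] > Q[("start", "b")]
```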

Q-learning Using Influence Map (영향력 분포도를 이용한 Q-학습)

  • Sung Yun-Sick;Cho Kyung-Eun
    • Journal of Korea Multimedia Society
    • /
    • v.9 no.5
    • /
    • pp.649-657
    • /
    • 2006
  • Reinforcement learning is a computational approach to learning whereby an agent, interacting with an uncertain environment, takes the action that maximizes the total amount of reward it receives among the actions possible in the current state. Q-learning, one of the most widely used reinforcement learning algorithms, learns from the rewards obtained when the agent takes an action. However, it has a problem with mapping the real world to discrete states. When the state space is very large, Q-learning requires a long learning time. In contrast, when the state space is reduced, many real states map to a single discrete state; because the agent then learns only a single action over many states, its behavior becomes monotonous. In this paper, to reduce the learning time and to complement such simple behavior, we propose Q-learning using an influence map (QIM). By using the influence map and the learning results of adjacent states, the agent can choose a proper action even in an uncertain state it has not yet learned. Comparing simulation results of QIM and Q-learning, we show that QIM performs as well as Q-learning even though it uses only 4.6% of Q-learning's state space. This is because QIM learns about 2.77 times faster than Q-learning, and the problems caused by reducing the state space needed for learning are complemented by the influence map.
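
A small sketch of the QIM idea as described in the abstract follows: in a state the agent has not learned, it blends the Q-values of adjacent states, weighted by an influence map, instead of acting arbitrarily. The grid size, weights, and action set are illustrative assumptions.

```python
# Influence-map-weighted fallback for unlearned states in grid Q-learning.
import numpy as np

H = W = 8
ACTIONS = 4                              # up, down, left, right (assumed)
Q = np.zeros((H, W, ACTIONS))
influence = np.random.default_rng(1).random((H, W))   # stand-in influence map
visited = np.zeros((H, W), dtype=bool)

def choose_action(y, x):
    if visited[y, x]:
        return int(Q[y, x].argmax())
    # Unlearned state: blend neighbours' Q-values weighted by their influence.
    blended = np.zeros(ACTIONS)
    total = 0.0
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ny, nx = y + dy, x + dx
        if 0 <= ny < H and 0 <= nx < W:
            blended += influence[ny, nx] * Q[ny, nx]
            total += influence[ny, nx]
    return int((blended / total).argmax()) if total > 0 else np.random.randint(ACTIONS)

a = choose_action(3, 4)   # usable even before (3, 4) has ever been visited
```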

Reinforcement Learning Based Evolution and Learning Algorithm for Cooperative Behavior of Swarm Robot System (군집 로봇의 협조 행동을 위한 강화 학습 기반의 진화 및 학습 알고리즘)

  • Seo, Sang-Wook;Kim, Ho-Duck;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.17 no.5
    • /
    • pp.591-597
    • /
    • 2007
  • In swarm robot systems, each robot must behave by itself according to its state and environment and, if necessary, must cooperate with other robots in order to carry out a given task. Therefore, it is essential that each robot has both learning and evolution abilities to adapt to dynamic environments. In this paper, a new polygon-based Q-learning algorithm and a distributed genetic algorithm are proposed for behavior learning and evolution of collective autonomous mobile robots. With the distributed genetic algorithm, each robot can improve its behavior ability by exchanging, via communication, chromosomes acquired under different environments. In particular, to improve the performance of evolution, selective crossover using the characteristics of reinforcement learning is adopted in this paper. We verify the effectiveness of the proposed method by applying it to a cooperative search problem.
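
The "selective crossover" idea mentioned in the abstract might look roughly like the sketch below, where each gene is taken from the parent whose reinforcement-learning score is higher rather than from a random cut point. The bit encoding and per-gene scores are hypothetical; the paper's polygon-based Q-learning is not reproduced here.

```python
# Selective crossover biased by reinforcement-learning scores (illustrative).
import random

def selective_crossover(parent_a, parent_b, score_a, score_b):
    """Take each gene from the parent with the higher per-gene score."""
    return [ga if sa >= sb else gb
            for ga, gb, sa, sb in zip(parent_a, parent_b, score_a, score_b)]

def mutate(chromosome, rate=0.05):
    """Flip each bit with a small probability."""
    return [g if random.random() > rate else 1 - g for g in chromosome]

# Example exchange between two robots' bit-encoded behaviour chromosomes.
a, b = [1, 0, 1, 1, 0, 0], [0, 1, 1, 0, 1, 0]
qa, qb = [0.9, 0.1, 0.5, 0.7, 0.2, 0.4], [0.3, 0.8, 0.6, 0.1, 0.9, 0.2]
child = mutate(selective_crossover(a, b, qa, qb))
```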

Bi-directional Electricity Negotiation Scheme based on Deep Reinforcement Learning Algorithm in Smart Building Systems (스마트 빌딩 시스템을 위한 심층 강화학습 기반 양방향 전력거래 협상 기법)

  • Lee, Donggu;Lee, Jiyoung;Kyeong, Chanuk;Kim, Jin-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.21 no.5
    • /
    • pp.215-219
    • /
    • 2021
  • In this paper, we propose a deep reinforcement learning based bi-directional electricity negotiation scheme in which the smart building and the utility grid adjust and propose the prices at which they want to trade during negotiation. By employing a deep Q-network algorithm, a kind of deep reinforcement learning algorithm, the proposed scheme adjusts the price proposals of the smart building and the utility grid. From the simulation results, it can be verified that reaching consensus on the electricity price requires an average of 43.78 negotiation rounds. The negotiation process under the simulation settings and scenario can also be confirmed through the simulation results.
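
A toy sketch of the negotiation loop the abstract describes is given below; for brevity it uses tabular Q-values where the paper uses a deep Q-network, and the price grid, actions, and stopping rule are assumptions. Training updates are omitted.

```python
# Alternating price proposals; each side picks lower/hold/raise from its own
# Q-table until the buyer's and seller's prices cross (toy consensus rule).
import numpy as np

PRICES = np.round(np.arange(0.05, 0.21, 0.01), 2)     # candidate prices (assumed grid)
ACTIONS = [-1, 0, +1]                                  # lower / hold / raise own proposal
Q_building = np.zeros((len(PRICES), len(ACTIONS)))
Q_grid = np.zeros((len(PRICES), len(ACTIONS)))

def negotiate(buy_idx, sell_idx, max_rounds=50):
    """Alternate proposals until the buyer's price meets the seller's price."""
    for rounds in range(1, max_rounds + 1):
        buy_idx = int(np.clip(buy_idx + ACTIONS[Q_building[buy_idx].argmax()],
                              0, len(PRICES) - 1))
        sell_idx = int(np.clip(sell_idx + ACTIONS[Q_grid[sell_idx].argmax()],
                               0, len(PRICES) - 1))
        if PRICES[buy_idx] >= PRICES[sell_idx]:        # consensus reached
            return rounds, PRICES[sell_idx]
    return max_rounds, None

rounds, price = negotiate(buy_idx=2, sell_idx=12)      # untrained policies, toy run
```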

A Navigation System for Mobile Robot

  • Zhang, Yuanliang;Chong, Kil-To
    • Proceedings of the IEEK Conference
    • /
    • 2009.05a
    • /
    • pp.118-120
    • /
    • 2009
  • In this paper, we present a Q-learning method for adaptive traffic signal control based on multi-agent technology. The structure is composed of six phase agents and one intersection agent, and a wireless communication network provides the possibility of cooperation between agents. As a kind of reinforcement learning, Q-learning is adopted as the algorithm of the control mechanism, which can acquire optimal control strategies from delayed rewards; furthermore, we adopt a dynamic learning method instead of a static one, which is more practical. Simulation results indicate that it is more effective than a traditional signal system.
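
A minimal per-phase Q-learning loop in the spirit of this abstract is sketched below. The discretized queue-length state, the extend/end-green actions, and the toy reward are assumptions rather than the paper's exact scheme.

```python
# Each phase agent observes a discretized queue length, chooses whether to
# extend or end its green time, and is rewarded by the reduction in delay.
import numpy as np

QUEUE_BINS, ACTIONS = 8, 2               # actions: 0 = end green, 1 = extend green
alpha, gamma = 0.1, 0.9

class PhaseAgent:
    def __init__(self):
        self.Q = np.zeros((QUEUE_BINS, ACTIONS))
    def act(self, queue_bin, epsilon=0.1):
        if np.random.random() < epsilon:
            return np.random.randint(ACTIONS)
        return int(self.Q[queue_bin].argmax())
    def update(self, s, a, reward, s_next):
        self.Q[s, a] += alpha * (reward + gamma * self.Q[s_next].max() - self.Q[s, a])

# Six phase agents coordinated at one intersection, as the abstract outlines.
agents = [PhaseAgent() for _ in range(6)]
# Example: phase 0 sees queue bin 5, acts, and delay drops by 2 (toy reward).
agents[0].update(s=5, a=agents[0].act(5), reward=2.0, s_next=3)
```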

Reinforcement Learning Algorithm Using Domain Knowledge

  • Young, Jang-Si;Hong, Suh-Il;Hak, Kong-Sung;Rok, Oh-Sang
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • 2001.10a
    • /
    • pp.173.5-173
    • /
    • 2001
  • Q-Learning is one of the most widely used reinforcement learning methods, which addresses the question of how an autonomous agent can learn to choose optimal actions to achieve its goal for a given problem. Q-Learning can acquire optimal control strategies from delayed rewards, even when the agent has no prior knowledge of the effects of its actions on the environment. If the agent is able to use prior knowledge, it is expected that the agent can speed up learning while interacting with the environment. We present a novel reinforcement learning method using domain knowledge, which is represented by problem-independent features and their classifiers; here, neural networks are employed as the knowledge classifiers. To show that an agent using domain knowledge can achieve better performance than an agent with a standard Q-learner, computer simulations are ...
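
The sketch below illustrates the general idea of the abstract: problem-independent feature classifiers compress raw observations into the Q-learner's state, so domain knowledge narrows the state space before learning starts. The feature functions here are hypothetical placeholders, not the authors' neural-network classifiers.

```python
# Domain-knowledge features as the Q-learner's state (placeholder classifiers).
from collections import defaultdict

def feature_state(observation, classifiers):
    """Map a raw observation to a tuple of discrete feature labels."""
    return tuple(clf(observation) for clf in classifiers)

classifiers = [
    lambda obs: obs["distance_to_goal"] < 5,      # "near goal" feature (assumed)
    lambda obs: obs["obstacle_ahead"],            # "blocked" feature (assumed)
]

Q = defaultdict(float)                            # Q[(feature_state, action)]
alpha, gamma = 0.2, 0.9
actions = ["forward", "turn_left", "turn_right"]

def q_update(obs, action, reward, next_obs):
    s, s2 = feature_state(obs, classifiers), feature_state(next_obs, classifiers)
    best_next = max(Q[(s2, a)] for a in actions)
    Q[(s, action)] += alpha * (reward + gamma * best_next - Q[(s, action)])

obs = {"distance_to_goal": 3.0, "obstacle_ahead": False}
q_update(obs, "forward", 1.0, {"distance_to_goal": 2.0, "obstacle_ahead": False})
```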
