• Title/Summary/Keyword: Deep Q-Learning

Solving Survival Gridworld Problem Using Hybrid Policy Modified Q-Based Reinforcement

  • Montero, Vince Jebryl;Jung, Woo-Young;Jeong, Yong-Jin
    • Journal of IKEEE / v.23 no.4 / pp.1150-1156 / 2019
  • This paper explores a model-free, value-based approach to solving the survival gridworld problem. The survival gridworld problem poses a challenge that involves taking risks to gain better rewards. The classic value-based approach in model-free reinforcement learning assumes minimal-risk decisions. The proposed method applies hybrid on-policy and off-policy updates to experience roll-outs using a modified Q-based update equation that introduces a parametric linear rectifier and a motivational discount. The significance of this approach is that it allows model-free training of agents that take into account risk factors and motivated exploration to make better path decisions. Experiments suggest that the proposed method achieved better exploration and path selection, resulting in higher episode scores than classic off-policy and on-policy Q-based updates.
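
    For reference, the classic off-policy (Q-learning) and on-policy (SARSA) tabular updates that this abstract names as baselines can be sketched as below. This is only the textbook form: the paper's modified update with a parametric linear rectifier and motivational discount is not specified in the abstract and is not reproduced here. The function names, the nested-list Q-table, and the values of `alpha` and `gamma` are illustrative assumptions.

```python
def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the greedy action in the next state."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])
    return Q


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken next."""
    target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])
    return Q


# Hypothetical 3-state, 2-action Q-table (values chosen only for illustration).
Q = [[0.0, 0.0], [1.0, 2.0], [0.0, 0.0]]
q_learning_update(Q, s=0, a=0, r=1.0, s_next=1)        # uses max(Q[1]) = 2.0
sarsa_update(Q, s=0, a=1, r=1.0, s_next=1, a_next=0)   # uses Q[1][0] = 1.0
```

    The two updates differ only in the bootstrap term: Q-learning uses the best next action regardless of what the agent does, while SARSA uses the action the behavior policy actually selects.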

Optimal Design of Semi-Active Mid-Story Isolation System using Supervised Learning and Reinforcement Learning (지도학습과 강화학습을 이용한 준능동 중간층면진시스템의 최적설계)

  • Kang, Joo-Won;Kim, Hyun-Su
    • Journal of Korean Association for Spatial Structures / v.21 no.4 / pp.73-80 / 2021
  • A mid-story isolation system was proposed for seismic response reduction of high-rise buildings and showed good control performance. The control performance of a mid-story isolation system was enhanced by introducing semi-active control devices into isolation systems. The seismic response reduction capacity of a semi-active mid-story isolation system mainly depends on the effectiveness of the control algorithm. In this study, an AI (artificial intelligence)-based control algorithm was developed for a semi-active mid-story isolation system. For this research, the Shiodome Sumitomo building in Japan, an existing structure with a mid-story isolation system, was used as an example structure. An MR (magnetorheological) damper was used to make a semi-active mid-story isolation system in the example model. In the numerical simulation, a seismic response prediction model was generated using a supervised learning model, namely an RNN (Recurrent Neural Network). Among reinforcement learning algorithms, the Deep Q-network (DQN) was employed to develop the control algorithm. The numerical simulation results showed that the DQN algorithm can effectively control a semi-active mid-story isolation system, resulting in successful reduction of seismic responses.
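
    As background for the DQN-based entries in this list, the standard DQN temporal-difference target, y = r + γ · max Q_target(s', a'), can be sketched as below. The abstract does not give the network, reward, or hyper-parameters used in the paper, so the function name `dqn_targets`, the example numbers, and `gamma` are all hypothetical, and the next-state Q-values are passed in as plain lists rather than produced by a target network.

```python
def dqn_targets(rewards, next_q_values, dones, gamma=0.99):
    """Compute standard DQN TD targets for a batch of transitions.

    rewards:       list of immediate rewards r
    next_q_values: list of per-action Q-value lists for each next state s'
                   (in a real DQN these come from the target network)
    dones:         list of booleans, True if the episode ended at s'
    """
    targets = []
    for r, q_next, done in zip(rewards, next_q_values, dones):
        # No bootstrapping past a terminal state.
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(r + bootstrap)
    return targets


# Hypothetical batch of two transitions; the second one is terminal.
targets = dqn_targets(
    rewards=[1.0, -0.5],
    next_q_values=[[0.2, 0.8], [0.3, 0.1]],
    dones=[False, True],
    gamma=0.5,
)
```

    The online network is then regressed toward these targets for the actions that were actually taken.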

Reward Design of Reinforcement Learning for Development of Smart Control Algorithm (스마트 제어알고리즘 개발을 위한 강화학습 리워드 설계)

  • Kim, Hyun-Su;Yoon, Ki-Yong
    • Journal of Korean Association for Spatial Structures / v.22 no.2 / pp.39-46 / 2022
  • Machine learning has recently been widely used to solve optimization problems in various engineering fields. In this study, machine learning is applied to the development of a control algorithm for a smart control device that reduces seismic responses. For this purpose, among reinforcement learning algorithms, the Deep Q-network (DQN) was employed to develop the control algorithm. A single degree of freedom (SDOF) structure with a smart tuned mass damper (TMD) was used as an example structure. The smart TMD system was composed of an MR (magnetorheological) damper instead of a passive damper. The reward design of reinforcement learning strongly affects the control performance of the smart TMD. Various hyper-parameters were investigated to optimize the control performance of the DQN-based control algorithm. Usually, decreasing the time step of a numerical simulation is desirable because it increases the accuracy of the simulation results. However, the numerical simulation results showed that decreasing the time step for reward calculation might decrease the control performance of the DQN-based control algorithm. Therefore, a proper time step for reward calculation should be selected in the DQN training process.

The Effect of Segment Size on Quality Selection in DQN-based Video Streaming Services (DQN 기반 비디오 스트리밍 서비스에서 세그먼트 크기가 품질 선택에 미치는 영향)

  • Kim, ISeul;Lim, Kyungshik
    • Journal of Korea Multimedia Society / v.21 no.10 / pp.1182-1194 / 2018
  • Dynamic Adaptive Streaming over HTTP (DASH) is envisioned to evolve to meet an increasing demand for seamless video streaming services in the near future. DASH performance depends heavily on the client's adaptive quality selection algorithm, which is not included in the standard. Existing conventional algorithms are based on procedural logic that cannot easily capture and reflect all variations of dynamic network and traffic conditions across a variety of network environments. To solve this problem, this paper proposes a novel quality selection mechanism based on the Deep Q-Network (DQN) model, the DQN-based DASH Adaptive Bitrate (ABR) mechanism. The proposed mechanism adopts a new reward calculation method based on five major performance metrics to reflect the current conditions of networks and devices in real time. In addition, the size of the consecutive video segment to be downloaded is also considered as a major learning metric to reflect a variety of video encodings. Experimental results show that the proposed mechanism quickly selects a suitable video quality even in high error rate environments, significantly reducing the frequency of quality changes compared to the existing algorithm and simultaneously improving average video quality during video playback.

Deep Reinforcement Learning-Based Cooperative Robot Using Facial Feedback (표정 피드백을 이용한 딥강화학습 기반 협력로봇 개발)

  • Jeon, Haein;Kang, Jeonghun;Kang, Bo-Yeong
    • The Journal of Korea Robotics Society / v.17 no.3 / pp.264-272 / 2022
  • Human-robot cooperative tasks are increasingly required in our daily life with the development of robotics and artificial intelligence technology. Interactive reinforcement learning strategies suggest that robots learn tasks by receiving feedback from an experienced human trainer during the training process. However, most previous studies on interactive reinforcement learning have required an extra feedback input device, such as a mouse or keyboard, in addition to the robot itself, and the scenarios in which a robot can interactively learn a task with a human have also been limited to virtual environments. To address these limitations, this paper studies training strategies for robots that learn table balancing tasks interactively using deep reinforcement learning with feedback from human facial expressions. In the proposed system, the robot learns a cooperative table balancing task using Deep Q-Network (DQN), a deep reinforcement learning technique, with human facial emotion expression feedback. In the experiment, the proposed system achieved a high optimal policy convergence rate of up to 83.3% in training and a successful assumption rate of up to 91.6% in testing, showing improved performance compared to the model without human facial expression feedback.

Tidy-up Task Planner based on Q-learning (정리정돈을 위한 Q-learning 기반의 작업계획기)

  • Yang, Min-Gyu;Ahn, Kuk-Hyun;Song, Jae-Bok
    • The Journal of Korea Robotics Society / v.16 no.1 / pp.56-63 / 2021
  • As the use of robots in service areas increases, research has been conducted on replacing human tasks in daily life with robots. Within that line of work, this study focuses on the task of tidying up a desk using a robot arm. The order in which tidy-up motions are carried out has a great impact on the success rate of the task. Therefore, in this study, a neural network-based method for determining the priority of the tidy-up motions from the input image is proposed. Reinforcement learning, which shows good performance in sequential decision-making processes, is used to train such a task planner. The training process is conducted in a virtual tidy-up environment that is configured the same as the actual tidy-up environment. To transfer the learning results from the virtual environment to the actual environment, the input image is preprocessed into a segmented image. In addition, the use of a neural network that excludes unnecessary tidy-up motions from the priority during the tidy-up operation increases the success rate of the task planner. Experiments were conducted in the real world to verify the proposed task planning method.

A Study on Cooperative Traffic Signal Control at multi-intersection (다중 교차로에서 협력적 교통신호제어에 대한 연구)

  • Kim, Dae Ho;Jeong, Ok Ran
    • Journal of IKEEE / v.23 no.4 / pp.1381-1386 / 2019
  • As traffic congestion in cities becomes more serious, intelligent traffic control is being actively researched. Reinforcement learning is the most actively used algorithm for traffic signal control, and deep reinforcement learning has recently attracted the attention of researchers. Extended versions of deep reinforcement learning have emerged as deep reinforcement learning algorithms have shown high performance in various fields. However, most existing traffic signal control methods were studied in a single-intersection environment, and a method designed for a single intersection has the limitation that it does not consider the traffic conditions of the entire city. In this paper, we propose a cooperative traffic control method for a multi-intersection environment. The traffic signal control algorithm is based on a combination of extended versions of deep reinforcement learning, and we consider the traffic conditions of adjacent intersections. In the experiment, we compare the proposed algorithm with the existing deep reinforcement learning algorithm, and further demonstrate the high performance of our model with and without the cooperative method.

A Study on Automatic Comment Generation Using Deep Learning (딥 러닝을 이용한 자동 댓글 생성에 관한 연구)

  • Choi, Jae-yong;Sung, So-yun;Kim, Kyoung-chul
    • Journal of Korea Game Society / v.18 no.5 / pp.83-92 / 2018
  • Many studies in deep learning show results as good as human decisions in various fields. In addition, active online communities and SNS are growing in importance in the game industry, to the point that they can decide whether a game succeeds. The purpose of this study is to construct a system that can read texts and create comments according to a schedule in online communities and SNS using deep learning. Using recurrent neural networks, we constructed models that generate a comment and a schedule for writing comments, and built a program that chooses a news title and automatically uploads the comment to Twitter at the calculated time. This study can be applied to activating an online game community, a Q&A service, etc.

Methodology for Apartment Space Arrangement Based on Deep Reinforcement Learning

  • Cheng Yun Chi;Se Won Lee
    • Architectural research / v.26 no.1 / pp.1-12 / 2024
  • This study introduces a deep reinforcement learning (DRL)-based methodology for optimizing apartment space arrangements, addressing the limitations of human capability in evaluating all potential spatial configurations. Leveraging computational power, the methodology facilitates the autonomous exploration and evaluation of innovative layout options, considering architectural principles, legal standards, and client requirements. Through comprehensive simulation tests across various apartment types, the research demonstrates the DRL approach's effectiveness in generating efficient spatial arrangements that align with current design trends and meet predefined performance objectives. The comparative analysis of AI-generated layouts with those designed by professionals validates the methodology's applicability and potential in enhancing architectural design practices by offering novel, optimized spatial configuration solutions.

Recommendation System of University Major Subject based on Deep Reinforcement Learning (심층 강화학습 기반의 대학 전공과목 추천 시스템)

  • Ducsun Lim;Youn-A Min;Dongkyun Lim
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.23 no.4 / pp.9-15 / 2023
  • Existing simple statistics-based recommendation systems rely solely on students' course enrollment history data, making it difficult to identify classes that match students' preferences. To address this issue, this study proposes a personalized major subject recommendation system based on deep reinforcement learning (DRL). This system gauges the similarity between students based on structured data, such as the student's department, grade level, and course history. Based on this information, it recommends the most suitable major subjects by comprehensively considering information about each available major subject and evaluations of the student's courses. We confirmed that this DRL-based recommendation system provides useful insights for university students while selecting their major subjects, and our simulation results indicate that it outperforms conventional statistics-based recommendation systems by approximately 20%. In light of these results, we propose a new system that offers personalized subject recommendations by incorporating students' course evaluations. This system is expected to assist students significantly in finding major subjects that align with their preferences and academic goals.