• Title/Summary/Keyword: Path of Reinforcement

Search Results: 133

Path Planning of Unmanned Aerial Vehicle based Reinforcement Learning using Deep Q Network under Simulated Environment (시뮬레이션 환경에서의 DQN을 이용한 강화 학습 기반의 무인항공기 경로 계획)

  • Lee, Keun Hyoung;Kim, Shin Dug
    • Journal of the Semiconductor & Display Technology / v.16 no.3 / pp.127-130 / 2017
  • In this research, we present a path planning method for the autonomous flight of unmanned aerial vehicles (UAVs) based on reinforcement learning in a simulated environment. We design a simulator for reinforcement learning of UAVs and implement an interface that makes the simulator compatible with a Deep Q-Network (DQN). We then perform reinforcement learning through the simulator and the DQN using the Q-learning algorithm, a widely used reinforcement learning algorithm. Through experiments, we verify the performance of the DQN-simulator combination. Finally, we evaluate the learning results and suggest a path planning strategy based on reinforcement learning.
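
The paper's DQN-simulator interface is not reproduced here; as a minimal sketch of the Q-learning update it builds on, the following tabular toy (grid layout, rewards, and hyperparameters are all assumptions) shows the core loop that a DQN replaces with a neural network:

```python
import random

# Toy stand-in for the paper's UAV simulator: a 5x5 grid with one obstacle,
# start (0, 0), goal (4, 4). Layout, rewards, and hyperparameters are assumptions.
SIZE, GOAL, OBSTACLE = 5, (4, 4), (2, 2)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]   # right, left, down, up

def step(state, action):
    nxt = (min(max(state[0] + action[0], 0), SIZE - 1),
           min(max(state[1] + action[1], 0), SIZE - 1))
    if nxt == OBSTACLE:
        return state, -10.0        # collision: stay put, penalize
    return nxt, (100.0 if nxt == GOAL else -1.0)  # step cost favors short paths

Q = {((x, y), a): 0.0 for x in range(SIZE) for y in range(SIZE)
     for a in range(len(ACTIONS))}
alpha, gamma, eps = 0.1, 0.95, 0.1

for _ in range(2000):              # training episodes
    s = (0, 0)
    for _ in range(200):           # step cap per episode
        a = (random.randrange(len(ACTIONS)) if random.random() < eps
             else max(range(len(ACTIONS)), key=lambda i: Q[(s, i)]))
        s2, r = step(s, ACTIONS[a])
        best = max(Q[(s2, i)] for i in range(len(ACTIONS)))
        Q[(s, a)] += alpha * (r + gamma * best - Q[(s, a)])  # Q-learning update
        if s2 == GOAL:
            break
        s = s2
```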

Path Planning for a Robot Manipulator based on Probabilistic Roadmap and Reinforcement Learning

  • Park, Jung-Jun;Kim, Ji-Hun;Song, Jae-Bok
    • International Journal of Control, Automation, and Systems / v.5 no.6 / pp.674-680 / 2007
  • The probabilistic roadmap (PRM) method, a popular path planning scheme for manipulators, finds a collision-free path by connecting the start and goal poses through a roadmap constructed from random nodes drawn in the free configuration space. PRM exhibits robust performance in static environments, but its performance is poor in dynamic environments. Reinforcement learning, a behavior-based control technique, can on the other hand deal with uncertainties in the environment: the learning agent establishes a policy that maximizes the sum of rewards by selecting optimal actions in each state through iterative interaction with the environment. In this paper, we propose efficient real-time path planning that combines PRM and reinforcement learning to deal with uncertain dynamic environments and with environments similar to previously learned ones. A series of experiments demonstrates that the proposed hybrid path planner can generate a collision-free path even in dynamic environments where objects block the pre-planned global path. It is also shown that the hybrid planner adapts to similar, previously learned environments without significant additional learning.
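
As a rough sketch of the PRM half of such a hybrid planner, the following code samples random nodes in a 2D free space and connects nearby collision-free pairs into a roadmap (the obstacle model, sample count, and connection radius are assumptions; the reinforcement learning layer is not shown):

```python
import random, math

# Minimal 2D PRM sketch: sample free configurations, link nearby pairs whose
# straight-line segment avoids a single circular obstacle (assumed geometry).
OBST_C, OBST_R = (0.5, 0.5), 0.2

def collision_free(p, q, checks=20):
    for i in range(checks + 1):
        t = i / checks
        x = p[0] + t * (q[0] - p[0])
        y = p[1] + t * (q[1] - p[1])
        if math.hypot(x - OBST_C[0], y - OBST_C[1]) < OBST_R:
            return False
    return True

nodes = [(random.random(), random.random()) for _ in range(200)]
nodes = [n for n in nodes if collision_free(n, n)]   # drop samples inside obstacle
edges = {i: [] for i in range(len(nodes))}
for i in range(len(nodes)):
    for j in range(i + 1, len(nodes)):
        if (math.dist(nodes[i], nodes[j]) < 0.15     # connection radius (assumed)
                and collision_free(nodes[i], nodes[j])):
            edges[i].append(j)
            edges[j].append(i)
# A graph search (e.g. Dijkstra) over `edges` then yields the global path;
# the paper layers reinforcement learning on top to handle dynamic obstacles.
```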

RL-based Path Planning for SLAM Uncertainty Minimization in Urban Mapping (도시환경 매핑 시 SLAM 불확실성 최소화를 위한 강화 학습 기반 경로 계획법)

  • Cho, Younghun;Kim, Ayoung
    • The Journal of Korea Robotics Society / v.16 no.2 / pp.122-129 / 2021
  • In the Simultaneous Localization and Mapping (SLAM) problem, different paths yield different SLAM results, since SLAM follows the trail of its input data. Active SLAM, which determines where to sense next, can suggest a better path for a better SLAM result during data acquisition. In this paper, we use reinforcement learning to decide where to perceive: by assigning coverage of the entire target area as the goal and uncertainty as a negative reward, the reinforcement learning network finds an optimal path that minimizes trajectory uncertainty and maximizes map coverage. Most active SLAM research, however, has been performed in indoor or aerial environments where robots can move in any direction, whereas in urban environments vehicles can only move along the road structure and according to traffic rules. A graph can efficiently express the road environment, with crossroads and streets represented as nodes and edges, respectively. In this paper, we propose a novel method for finding an optimal SLAM path using this graph structure and reinforcement learning.
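
A minimal sketch of the idea, assuming an invented toy road graph and made-up per-street uncertainty values (the paper's network architecture and reward constants are not given in the abstract): Q-learning over graph nodes, with coverage as a bonus and uncertainty as a negative reward.

```python
import random

# Toy road network: nodes are crossroads, edges are streets (all assumed).
roads = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
uncertainty = {0: 0.1, 1: 0.5, 2: 0.2, 3: 0.3}  # assumed per-node SLAM uncertainty

def reward(node, visited):
    r = -uncertainty[node]           # uncertainty enters as a negative reward
    if node not in visited:
        r += 1.0                     # bonus for newly covered area
    return r

Q = {(n, n2): 0.0 for n in roads for n2 in roads[n]}
alpha, gamma, eps = 0.2, 0.9, 0.2
for _ in range(500):
    node, visited = 0, {0}
    for _ in range(10):              # fixed-length episode (assumption)
        nbrs = roads[node]
        nxt = (random.choice(nbrs) if random.random() < eps
               else max(nbrs, key=lambda n2: Q[(node, n2)]))
        r = reward(nxt, visited)
        visited.add(nxt)
        best = max(Q[(nxt, n2)] for n2 in roads[nxt])
        Q[(node, nxt)] += alpha * (r + gamma * best - Q[(node, nxt)])
        node = nxt
# Note: the visited set is not folded into the Q state here, a deliberate
# simplification to keep the sketch tabular and short.
```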

Reinforcement Learning Using State Space Compression (상태 공간 압축을 이용한 강화학습)

  • Kim, Byeong-Cheon;Yun, Byeong-Ju
    • The Transactions of the Korea Information Processing Society / v.6 no.3 / pp.633-640 / 1999
  • Reinforcement learning learns through trial-and-error interaction with a dynamic environment. In such environments, reinforcement learning methods such as Q-learning and TD (Temporal Difference) learning therefore learn faster than conventional stochastic learning methods. However, because many reinforcement learning algorithms deliver a reinforcement value only when the learning agent reaches its goal state, most of them converge to the optimal solution very slowly. In this paper, we present the COMREL (COMpressed REinforcement Learning) algorithm for quickly finding the shortest path in a maze environment: it selects, in a compressed maze environment, the candidate states that can guide the shortest path, and learns only those candidate states. Comparing COMREL with the existing Q-learning and Prioritized Sweeping algorithms shows that the learning time is greatly reduced.
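
The abstract does not give COMREL's compression rule; as an illustrative sketch of state-space compression only, the following collapses each 2x2 block of a toy maze into one coarse state, whose solution can then supply candidate states for full-resolution learning (block size and maze are assumptions):

```python
# State-compression sketch: collapse each 2x2 block of a maze into one
# compressed state; the coarse solution marks "candidate states" for the
# full-resolution search. Block size and maze layout are assumptions.
maze = [[0, 0, 1, 0],
        [1, 0, 1, 0],
        [0, 0, 0, 0],
        [0, 1, 1, 0]]          # 0 = free, 1 = wall

def compress(maze, block=2):
    n = len(maze) // block
    coarse = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            cells = [maze[i * block + di][j * block + dj]
                     for di in range(block) for dj in range(block)]
            # a coarse cell counts as free if any fine cell in its block is free
            coarse[i][j] = 0 if 0 in cells else 1
    return coarse

print(compress(maze))   # 2x2 compressed maze guiding candidate-state selection
```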

Goal-Directed Reinforcement Learning System (목표지향적 강화학습 시스템)

  • Lee, Chang-Hoon
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.10 no.5 / pp.265-270 / 2010
  • Reinforcement learning learns through trial-and-error interaction with a dynamic environment. In such environments, reinforcement learning methods such as TD learning and TD(λ) learning therefore learn faster than conventional stochastic learning methods. However, because many reinforcement learning algorithms deliver a reinforcement value only when the learning agent reaches its goal state, most of them converge to the optimal solution very slowly. In this paper, we present the GDRLS algorithm for finding the shortest path faster in a maze environment. GDRLS selects the candidate states that can guide the shortest path in the maze and learns only those candidate states. Experiments show that GDRLS finds the shortest path faster than TD learning and TD(λ) learning in maze environments.
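
GDRLS itself is only outlined in the abstract; for reference, here is a sketch of the TD(λ) baseline it is compared against, using accumulating eligibility traces on a toy random-walk chain (chain length, rewards, and parameters are assumptions):

```python
import random

# TD(lambda) state-value learning on a toy 6-state chain; reaching the
# rightmost state yields reward 1, every other transition yields 0.
N, GOAL = 6, 5
V = [0.0] * N
alpha, gamma, lam = 0.1, 0.95, 0.8

for _ in range(1000):
    s = 0
    e = [0.0] * N                    # eligibility traces
    while s != GOAL:
        s2 = max(0, min(N - 1, s + random.choice((-1, 1))))  # random-walk policy
        r = 1.0 if s2 == GOAL else 0.0
        delta = r + gamma * V[s2] - V[s]
        e[s] += 1.0                  # accumulating trace for the visited state
        for i in range(N):
            V[i] += alpha * delta * e[i]
            e[i] *= gamma * lam      # decay all traces each step
        s = s2
```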

Reinforcement Learning for Node-disjoint Path Problem in Wireless Ad-hoc Networks (무선 애드혹 네트워크에서 노드분리 경로문제를 위한 강화학습)

  • Jang, Kil-woong
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.8 / pp.1011-1017 / 2019
  • This paper proposes reinforcement learning to solve the node-disjoint path problem, which establishes multiple paths for reliable data transmission in wireless ad-hoc networks. The node-disjoint path problem is that of determining several paths between a source and a destination such that the intermediate nodes do not overlap. We propose an optimization method that considers transmission distance in a large-scale wireless ad-hoc network using Q-learning, a reinforcement learning technique. Solving the node-disjoint path problem in a large-scale network requires a large amount of computation, but the proposed reinforcement learning obtains appropriate results efficiently by learning the paths. Its performance is evaluated in terms of the transmission distance required to establish two node-disjoint paths; the evaluation shows better transmission distance than conventional simulated annealing.
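
One simple way to realize the two-path setup described here (a sketch under an assumed topology and assumed hyperparameters, not the paper's exact scheme) is to learn the first path with distance-penalizing Q-learning, prune its intermediate nodes, and learn the second path on the pruned network:

```python
import random

# Toy ad-hoc network as an adjacency dict with link distances (all assumed).
links = {'s': {'a': 1.0, 'b': 1.2}, 'a': {'s': 1.0, 'd': 1.1},
         'b': {'s': 1.2, 'd': 1.0}, 'd': {'a': 1.1, 'b': 1.0}}

def learn_path(links, src, dst, episodes=300):
    Q = {(u, v): 0.0 for u in links for v in links[u]}
    for _ in range(episodes):
        u = src
        for _ in range(10):
            if u == dst:
                break
            nbrs = list(links[u])
            v = (random.choice(nbrs) if random.random() < 0.2
                 else max(nbrs, key=lambda w: Q[(u, w)]))
            r = 10.0 if v == dst else -links[u][v]   # penalize transmission distance
            best = 0.0 if v == dst else max(Q[(v, w)] for w in links[v])
            Q[(u, v)] += 0.2 * (r + 0.9 * best - Q[(u, v)])
            u = v
    path, u = [src], src                             # greedy rollout
    while u != dst and len(path) < 10:
        u = max(links[u], key=lambda w: Q[(u, w)])
        path.append(u)
    return path

p1 = learn_path(links, 's', 'd')
inner = set(p1[1:-1])                                # remove first path's relays
pruned = {u: {v: d for v, d in nb.items() if v not in inner}
          for u, nb in links.items() if u not in inner}
p2 = learn_path(pruned, 's', 'd')                    # second, node-disjoint path
print(p1, p2)
```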

UAV Path Planning based on Deep Reinforcement Learning using Cell Decomposition Algorithm (셀 분해 알고리즘을 활용한 심층 강화학습 기반 무인 항공기 경로 계획)

  • Kyoung-Hun Kim;Byungsun Hwang;Joonho Seon;Soo-Hyun Kim;Jin-Young Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.3 / pp.15-20 / 2024
  • Path planning for unmanned aerial vehicles (UAVs) is crucial for avoiding collisions with obstacles in complex environments that include both static and dynamic obstacles. Path planning algorithms such as RRT and A* handle static obstacle avoidance effectively but suffer from increasing computational complexity in high-dimensional environments. Reinforcement-learning-based algorithms can accommodate complex environments, but, like traditional path planning algorithms, they struggle with training complexity and convergence in higher-dimensional environments. In this paper, we propose a reinforcement learning model that utilizes a cell decomposition algorithm. The proposed model reduces the complexity of the environment by decomposing the learning environment into cells and improves obstacle avoidance by restricting the agent to valid actions. This eases the exploration problem of reinforcement learning and improves learning convergence. Simulation results show that the proposed model improves learning speed and yields more efficient path planning compared with reinforcement learning models in general environments.
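
A minimal sketch of the cell decomposition step, assuming a 10 m square area, 1 m cells, and an invented obstacle list: the continuous space is mapped to cells, and the agent's action set is masked down to moves into free neighboring cells, in the spirit of the "valid action" restriction described above.

```python
# Cell decomposition sketch: a continuous 10m x 10m area is split into 1m
# cells, obstacle cells are marked, and each cell's action set is masked to
# moves into free neighbors only. Dimensions and obstacles are assumptions.
CELL, SIZE = 1.0, 10
obstacles = {(3, 3), (3, 4), (7, 2)}          # occupied cells (assumed)

def to_cell(x, y):
    return (int(x // CELL), int(y // CELL))

def valid_actions(cell):
    moves = {'E': (1, 0), 'W': (-1, 0), 'N': (0, 1), 'S': (0, -1)}
    ok = {}
    for name, (dx, dy) in moves.items():
        nxt = (cell[0] + dx, cell[1] + dy)
        if (0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE
                and nxt not in obstacles):
            ok[name] = nxt                    # only collision-free moves survive
    return ok

print(valid_actions(to_cell(3.5, 2.4)))       # neighbors of cell (3, 2), minus (3, 3)
```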

A Study of Unmanned Aerial Vehicle Path Planning using Reinforcement Learning

  • Kim, Cheong Ghil
    • Journal of the Semiconductor & Display Technology / v.17 no.1 / pp.88-92 / 2018
  • The drone industry has become one of the fastest growing markets, and the technology for unmanned aerial vehicles is expected to continue developing at a rapid rate. Small unmanned aerial vehicle systems in particular have been designed and utilized in various fields, each with its own specific purpose. In these fields, the path planning problem of finding the shortest path between two given points is important. In this paper, we introduce a path planning strategy for the autonomous flight of unmanned aerial vehicles through reinforcement learning combined with a self-positioning technique. We apply the Q-learning algorithm, a kind of reinforcement learning algorithm, while multiple sensors (an acceleration sensor, a gyro sensor, and a magnetic sensor) are used to estimate the position. For functional evaluation, the proposed method was simulated in a virtual UAV environment and the results were visualized. The flight history was based on a PX4-based drone system equipped with a smartphone.
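
The abstract does not detail the self-positioning technique; as a generic illustration of blending gyro and magnetometer readings (not the paper's method; the sample rate, blend factor, and data are all assumptions), a complementary filter for heading looks like this:

```python
# Generic complementary-filter sketch for heading estimation: integrate the
# gyro rate (smooth but drifting) and correct with magnetometer headings
# (noisy but absolute). All constants and data below are assumptions.
DT, ALPHA = 0.01, 0.98           # 100 Hz samples, gyro-trust factor

def fuse(gyro_rates, mag_headings, heading0=0.0):
    heading = heading0
    out = []
    for w, m in zip(gyro_rates, mag_headings):
        heading = ALPHA * (heading + w * DT) + (1 - ALPHA) * m
        out.append(heading)
    return out

# toy data: a constant 10 deg/s turn, magnetometer with a fixed offset error
gyro = [10.0] * 100
mag = [0.1 * i + 2.0 for i in range(100)]
print(fuse(gyro, mag)[-1])       # close to 10 degrees after 1 s
```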

Optimization of the Path of Inner Reinforcement for an Automobile Hood Using Design Sensitivity Analysis (설계민감도해석을 이용한 자동차후드 보강경로 최적설계)

  • Lee, Tae-Hui;Lee, Dong-Gi;Gu, Ja-Gyeom;Han, Seok-Yeong;Im, Jang-Geun
    • Transactions of the Korean Society of Mechanical Engineers A / v.24 no.1 s.173 / pp.62-68 / 2000
  • An optimization technique for finding the path of an inner reinforcement of an automobile hood is proposed using design sensitivity information. The strength and modal characteristics of the automobile hood are analyzed, and their design sensitivities with respect to thickness are computed using MSC/NASTRAN. Based on the design sensitivity analysis, the choice of design variables and response functions is discussed. Techniques for improving a design from design sensitivity information are suggested, and the double-layer method is newly proposed to optimize the path of a stiffener for a shell structure. Using the suggested method, we redesign the inner reinforcement of an automobile hood and compare its responses with those of the original design. The new design is confirmed to improve the frequency responses without increasing the weight.
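
As a toy illustration of a design sensitivity (not the MSC/NASTRAN analysis used in the paper), the following computes a central-difference sensitivity of a closed-form cantilever frequency with respect to thickness; the model and baseline thickness are assumptions:

```python
import math

# Finite-difference sketch of a design sensitivity: how a response (here the
# first bending frequency of a toy cantilever strip) changes with thickness t.
# This closed-form stand-in is not the hood FE model from the paper.
def frequency(t, L=1.0, rho=2700.0, E=7.0e10):
    # f = (beta1^2 / 2pi) * sqrt(E*I / (rho*A)) / L^2, with I/A = t^2/12
    return (1.875 ** 2 / (2 * math.pi)) * math.sqrt(E * t * t / 12.0 / rho) / L ** 2

def sensitivity(f, t, dt=1e-7):
    return (f(t + dt) - f(t - dt)) / (2 * dt)    # central difference df/dt

t0 = 0.002                                        # 2 mm baseline thickness (assumed)
print(frequency(t0), sensitivity(frequency, t0))  # response and its sensitivity
```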

Leveraging Reinforcement Learning for Generating Construction Workers' Moving Path: Opportunities and Challenges

  • Kim, Minguk;Kim, Tae Wan
    • International conference on construction engineering and project management / 2022.06a / pp.1085-1092 / 2022
  • Travel distance is a parameter commonly used in the objective function of Construction Site Layout Planning (CSLP) automation models. To obtain travel distance, common approaches such as linear distance, shortest-distance algorithms, visibility graphs, and access road paths concentrate only on identifying the shortest path. However, humans do not necessarily follow the single shortest path; within a reasonable range, they may choose a safer and more comfortable path according to their situation. Paths generated by these approaches may therefore differ from the actual paths of workers, which may reduce the reliability of the optimized construction site layout. To solve this problem, this paper adopts reinforcement learning (RL), inspired by concepts from cognitive science and behavioral psychology, to generate realistic paths that mimic the decision-making and wayfinding behavior of workers on the construction site. To this end, human wayfinding tendencies and the characteristics of the walking environment of construction sites are investigated, and the importance of taking these into account when simulating the actual paths of workers is emphasized. Furthermore, a simulation developed by mapping the identified tendencies into the reward design shows that the RL agent behaves like a real construction worker. Based on the research findings, opportunities and challenges are proposed. This study contributes to simulating the potential paths of workers based on deep RL, which can be used to calculate the travel distance in CSLP automation models, thereby providing more reliable solutions.
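
A sketch of how wayfinding tendencies might be mapped into a step reward, in the spirit described above; the three tendency terms, their weights, and the site annotations are illustrative assumptions, not the paper's actual reward design:

```python
# Reward-shaping sketch: penalize step length, hazard exposure, and
# congestion so the agent trades distance against safety and comfort,
# as human workers do. All terms and weights are assumptions.
def step_reward(cell, info, w_dist=1.0, w_safe=2.0, w_comfort=0.5):
    r = -w_dist * info[cell]['step_len']          # shorter paths preferred
    r -= w_safe * info[cell]['hazard']            # avoid hazardous zones
    r -= w_comfort * info[cell]['congestion']     # avoid crowded/narrow areas
    return r

site = {                                          # toy site annotations (assumed)
    (0, 0): {'step_len': 1.0, 'hazard': 0.0, 'congestion': 0.2},
    (0, 1): {'step_len': 1.0, 'hazard': 0.8, 'congestion': 0.0},
}
print(step_reward((0, 0), site), step_reward((0, 1), site))
```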
