• Title/Summary/Keyword: Path of Reinforcement


Region-based Q-learning for Autonomous Mobile Robot Navigation (자율 이동 로봇의 주행을 위한 영역 기반 Q-learning)

  • 차종환;공성학;서일홍
    • 제어로봇시스템학회:학술대회논문집 / 2000.10a / pp.174-174 / 2000
  • Q-learning, which operates on discrete state and action spaces, is one of the most widely used reinforcement learning methods. However, it requires a large amount of memory and a long time to learn all actions of each state when it is applied to real mobile robot navigation with continuous state and action spaces. Region-based Q-learning is a reinforcement learning method that estimates the action values of a real state by using a triangular-type action distribution model and the relationship with neighboring states that were defined and learned earlier. This paper proposes a new Region-based Q-learning that uses a reward assigned only when the agent reaches the target, and that escapes locally optimal paths by adjusting the random action rate. When applied to mobile robot navigation, it uses less memory, lets the robot move smoothly, and learns the optimal solution quickly. Computer simulations are presented to show the validity of the method.

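
As a rough illustration of two ideas mentioned in the abstract above (a reward given only at the target, and an adjustable random action rate), the following is a minimal sketch of plain tabular Q-learning on a toy grid. It is not the region-based method of the paper: the grid world, the `step` dynamics, and all parameter values (`alpha`, `gamma`, `eps_start`, `eps_decay`) are assumptions chosen only for this example.

```python
import random

MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up


def step(state, action, size):
    """Move on the grid, clamping at the walls (hypothetical toy dynamics)."""
    row = min(max(state[0] + MOVES[action][0], 0), size - 1)
    col = min(max(state[1] + MOVES[action][1], 0), size - 1)
    return (row, col)


def train_q_table(size=4, episodes=500, alpha=0.5, gamma=0.9,
                  eps_start=1.0, eps_decay=0.99, seed=0):
    """Plain tabular Q-learning; the only reward is 1.0 on reaching the goal."""
    rng = random.Random(seed)
    goal = (size - 1, size - 1)
    q = {(r, c): [0.0] * 4 for r in range(size) for c in range(size)}
    eps = eps_start  # random action rate, reduced after every episode
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(200):  # cap on steps per episode
            if rng.random() < eps:
                action = rng.randrange(4)  # explore
            else:
                action = max(range(4), key=lambda a: q[state][a])  # exploit
            nxt = step(state, action, size)
            # Sparse reward: 1.0 at the goal (terminal), otherwise bootstrap.
            target = 1.0 if nxt == goal else gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if state == goal:
                break
        eps *= eps_decay
    return q


def greedy_path(q, size=4, max_steps=50):
    """Follow the highest-valued action from the start cell toward the goal."""
    state, goal = (0, 0), (size - 1, size - 1)
    path = [state]
    while state != goal and len(path) <= max_steps:
        action = max(range(4), key=lambda a: q[state][a])
        state = step(state, action, size)
        path.append(state)
    return path
```

Calling `greedy_path(train_q_table())` returns the list of cells visited by the learned greedy policy. The table holds one entry per state-action pair, which is the memory cost that the abstract says grows quickly for continuous spaces.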

A Study about Additional Reinforcement in Local Updating and Global Updating for Efficient Path Search in Ant Colony System (Ant Colony System에서 효율적 경로 탐색을 위한 지역갱신과 전역갱신에서의 추가 강화에 관한 연구)

  • Lee, Seung-Gwan;Chung, Tae-Choong
    • The KIPS Transactions:PartB / v.10B no.3 / pp.237-242 / 2003
  • The Ant Colony System (ACS) algorithm is a new metaheuristic for hard combinatorial optimization problems. It is a population-based approach that exploits positive feedback as well as greedy search. It was first proposed for tackling the well-known Traveling Salesman Problem (TSP). In this paper, we introduce a new ACS method that adds a reinforcement value for each visited edge to the local and global updating rules. We report performance results under various conditions and compare the original ACS with the proposed method. The results show that the proposed method can compete with the original ACS in terms of solution quality and computation speed on these problems.
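
For readers unfamiliar with the two updating rules named in the abstract above, here is a minimal sketch of one common formulation of the baseline ACS local and global pheromone updates. The additional reinforcement term proposed in the paper is not specified in the abstract, so it is deliberately not shown. The function names, the dictionary layout of `tau`, and the parameter values (`rho`, `tau0`, `alpha`) are assumptions for illustration.

```python
def edge_key(i, j):
    """Undirected edge key so (i, j) and (j, i) share one pheromone value."""
    return (i, j) if i < j else (j, i)


def local_update(tau, i, j, rho=0.1, tau0=0.01):
    """Local rule, applied when an ant crosses edge (i, j):
    pull the pheromone toward the initial level tau0."""
    key = edge_key(i, j)
    tau[key] = (1.0 - rho) * tau[key] + rho * tau0


def global_update(tau, best_tour, best_length, alpha=0.1):
    """Global rule, applied once per iteration to the edges of the best tour:
    deposit pheromone proportional to 1 / best_length."""
    closed = best_tour + best_tour[:1]  # return to the starting city
    for i, j in zip(closed, closed[1:]):
        key = edge_key(i, j)
        tau[key] = (1.0 - alpha) * tau[key] + alpha / best_length


# Usage on a 3-city instance with every edge starting at pheromone 1.0.
tau = {edge_key(i, j): 1.0 for i in range(3) for j in range(i + 1, 3)}
local_update(tau, 1, 0)
global_update(tau, [0, 1, 2], best_length=10.0)
```

The local rule makes recently used edges slightly less attractive to later ants in the same iteration, while the global rule rewards only the edges of the best tour.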

Proximal Policy Optimization Reinforcement Learning based Optimal Path Planning Study of Surion Agent against Enemy Air Defense Threats (근접 정책 최적화 기반의 적 대공 방어 위협하 수리온 에이전트의 최적 기동경로 도출 연구)

  • Jae-Hwan Kim;Jong-Hwan Kim
    • Journal of the Korea Society for Simulation / v.33 no.2 / pp.37-44 / 2024
  • The Korean Helicopter Development Program has successfully introduced the Surion helicopter, a versatile multi-domain operational aircraft that replaces the aging UH-1 and 500MD helicopters. Specifically designed for maneuverability, the Surion plays a crucial role in low-altitude tactical maneuvers for personnel transportation and specific missions, emphasizing the helicopter's survivability. Despite the significance of its low-altitude tactical maneuver capability, there is a notable gap in research focusing on multi-mission tactical maneuvers that consider the risk factors associated with deploying the Surion in the presence of enemy air defenses. This study addresses this gap by exploring a method to enhance the Surion's low-altitude maneuvering paths, incorporating information about enemy air defenses. Leveraging the Proximal Policy Optimization (PPO) algorithm, a reinforcement learning-based approach, the research aims to optimize the helicopter's path planning. Visualized experiments were conducted using a Surion model implemented in the Unity environment and ML-Agents library. The proposed method resulted in a rapid and stable policy convergence for generating optimal maneuvering paths for the Surion. The experiments, based on two key criteria, "operation time" and "minimum damage," revealed distinct optimal paths. This divergence suggests the potential for effective tactical maneuvers in low-altitude situations, considering the risk factors associated with enemy air defenses. Importantly, the Surion's capability for remote control in all directions enhances its adaptability in complex operational environments.
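
The abstract above names PPO without describing it. As a hedged orientation aid, the sketch below computes the standard clipped surrogate objective from the PPO literature for given probability ratios and advantage estimates. It is not the Unity ML-Agents implementation used in the study, and the function names and the clip range `eps=0.2` are assumptions for this example.

```python
def clipped_term(ratio, advantage, eps=0.2):
    """One sample of the PPO clipped surrogate.

    ratio     : new_policy_prob / old_policy_prob for the taken action
    advantage : advantage estimate for that action
    """
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    # Taking the minimum removes the incentive to push the ratio
    # outside [1 - eps, 1 + eps] in the direction that raises the objective.
    return min(ratio * advantage, clipped_ratio * advantage)


def ppo_clip_objective(ratios, advantages, eps=0.2):
    """Batch average of the clipped surrogate (to be maximized)."""
    terms = [clipped_term(r, a, eps) for r, a in zip(ratios, advantages)]
    return sum(terms) / len(terms)
```

Limiting how far a single update can move the policy is a commonly cited reason for the stable convergence that PPO-based studies report.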

Assessment of design methods for punching through numerical experiments

  • Kotsovou, Gregoria M.;Kotsovos, Gerasimos M.;Vougioukas, Emmanuel
    • Computers and Concrete / v.17 no.3 / pp.305-322 / 2016
  • The work is intended to demonstrate that the loss of bond between concrete and flexural steel, which in recent years has led a number of flat-slab structures to punching collapse under service loading conditions, is also relevant to ultimate limit-state design. It is based on a comparative study of the results obtained from numerical experiments on flat slab-column sub-assemblages. The slabs were designed for punching either in compliance with the EC2 code requirements, which do not allow for such loss of bond, or in accordance with the compressive force-path method, which considers the loss of bond between concrete and the flexural reinforcement in tension as the primary cause of punching. The numerical experiments are carried out with a nonlinear finite element analysis package; although ample published evidence of its validity exists, additional proof of its suitability for the purposes of the present work is presented.

Capacities and Failure Modes of Transfer Girders in the Upper-Wall and Lower-Frame Structures having different Detailing (주상복합구조의 전이보 상세에 따른 성능과 파괴모드)

  • 이한선;김상연;고동우;권기혁;김민수
    • Proceedings of the Korea Concrete Institute Conference / 2000.10b / pp.845-850 / 2000
  • This paper presents the results of tests performed on transfer girders of the kind generally used between upper walls and lower frames in hybrid structures. The 8 specimens were designed using (1) the ACI method, (2) a strut-tie model, and (3) an X-type shear reinforcement cage. The capacities of the specimens are in general larger than the design values, except for the one designed according to the strut-tie model. This difference appears to result from the arbitrary allocation of the transferred shear force between the path of the direct compression strut and the path of the indirect strut and tie. The failure modes turn out to be (1) shear failure at the critical shear zone, (2) compressive concrete crushing in the diagonal strut in the shear zone of the transfer girder, and (3) compressive concrete crushing in the corner of the upper wall.

A Study on the Reinforcement of Reinforced Concrete using Evolutionary Structural Optimization (점진적 구조 최적화 기법을 응용한 철근콘크리트 부재의 배근)

  • 윤성수;이정재
    • Magazine of the Korean Society of Agricultural Engineers / v.44 no.2 / pp.127-135 / 2002
  • Because the design of a reinforced concrete structure changes with its shape and assigned load, total automation of the design system has not been achieved. For instance, since there is no general rule for setting the quantity and arrangement of reinforcing steel, it is not feasible to decide the reinforcement arrangement automatically. In this study, the ESO (evolutionary structural optimization) technique and its related issues are discussed. The ESO technique uses numerical analysis to determine a reasonable load path, that is, the route along which load travels between its inflow and outflow points in a concrete structure. The results were applied to the steel arrangement in reinforced concrete structures. The optimization algorithm, which determines the termination criteria during the ESO process, was updated using the obtained results, and the load path within the member was determined automatically.

Post-buckling analysis of imperfect nonlocal piezoelectric beams under magnetic field and thermal loading

  • Fenjan, Raad M.;Ahmed, Ridha A.;Faleh, Nadhim M.
    • Structural Engineering and Mechanics / v.78 no.1 / pp.15-22 / 2021
  • This article presents an investigation of the nonlinear thermal buckling behavior of a nano-sized beam constructed from intelligent materials called piezo-magnetic materials. The nano-sized beam geometry has been considered under two assumptions: an ideal straight beam and an imperfect beam. To incorporate nano-size effects, the nano-sized beam formulation has been presented according to nonlocal elasticity. After establishing the governing equations based on classic beam theory and nonlocal elasticity, the nonlinear buckling path has been obtained via Galerkin's method together with an analytical trend. The dependence of the buckling path on piezo-magnetic material composition, electro-magnetic fields, and geometric imperfection has been studied in detail.

Earthwork Planning via Reinforcement Learning with Heterogeneous Construction Equipment (강화학습을 이용한 이종 장비 토목 공정 계획)

  • Ji, Min-Gi;Park, Jun-Keon;Kim, Do-Hyeong;Jung, Yo-Han;Park, Jin-Kyoo;Moon, Il-Chul
    • Journal of the Korea Society for Simulation / v.27 no.1 / pp.1-13 / 2018
  • Earthwork planning is one of the critical issues in construction process management. Construction process management has been approached in different ways, such as optimizing construction with mathematical methodologies or with heuristics combined with simulations. This paper proposes a simulated earthwork scenario and an optimal path for the simulation using reinforcement learning. For reinforcement learning, we use two different Markov decision process (MDP) formulations with interacting excavator and truck agents: sequenced learning and independent learning. The simulation results show that both formulations can reach the optimal plan for a simulated earthwork scenario. This planning could serve as a basis for automatic construction management.

Implementation of Autonomous Mobile Wheeled Robot for Path Correction through Deep Learning Object Recognition (딥러닝 객체인식을 통한 경로보정 자율 주행 로봇의 구현)

  • Lee, Hyeong-il;Kim, Jin-myeong;Lee, Jai-weun
    • The Journal of the Korea Contents Association / v.19 no.12 / pp.164-172 / 2019
  • In this paper, we implement a wheeled mobile robot that uses computer vision to find the optimal route from a starting point to a destination accurately and autonomously in a complex indoor environment. Deep reinforcement learning yields a series of waypoints from the starting point that define the best route to the target. In autonomous driving, however, the robot often fails to reach its destination accurately because of external factors such as surface curvature and foreign objects. We therefore propose an algorithm that trains a deep learning model on the waypoints and destinations included in the planned route and then corrects the route through waypoint recognition while driving, so that the robot reaches the planned destination. We built an autonomous wheeled mobile robot controlled by Arduino and equipped with a Raspberry Pi and Pycamera, and tested the planned route in an indoor environment using the proposed algorithm through real-time linkage with a server in the OSX environment.

Hybrid Learning for Vision-and-Language Navigation Agents (시각-언어 이동 에이전트를 위한 복합 학습)

  • Oh, Suntaek;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering / v.9 no.9 / pp.281-290 / 2020
  • The Vision-and-Language Navigation (VLN) task is a complex intelligence problem that requires both visual and language comprehension skills. In this paper, we propose a new learning model for vision-and-language navigation agents. The model adopts hybrid learning that combines imitation learning based on demo data with reinforcement learning based on action rewards. This model can therefore address both the tendency of imitation learning to be biased toward the demo data and the relatively low data efficiency of reinforcement learning. In addition, the proposed model uses a novel path-based reward function designed to solve the problems of existing goal-based reward functions. We demonstrate the high performance of the proposed model through various experiments using the Matterport3D simulation environment and the R2R benchmark dataset.