• Title/Summary/Keyword: Deep Reinforcement Learning (심층 강화학습)

Search results: 108

Multi-Object Goal Visual Navigation Based on Multimodal Context Fusion (멀티모달 맥락정보 융합에 기초한 다중 물체 목표 시각적 탐색 이동)

  • Jeong Hyun Choi;In Cheol Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.9 / pp.407-418 / 2023
  • Multi-Object Goal Visual Navigation (MultiOn) is a visual navigation task in which an agent must visit multiple object goals in an unknown indoor environment in a given order. Existing models for the MultiOn task suffer from the limitation that they cannot exploit an integrated view of multimodal context because they use only a unimodal context map. To overcome this limitation, in this paper we propose a novel deep neural network-based agent model for the MultiOn task. The proposed model, MCFMO, uses a multimodal context map containing visual appearance features, semantic features of environmental objects, and goal object features. Moreover, the proposed model effectively fuses these three heterogeneous feature types into a global multimodal context map using a point-wise convolutional neural network module. Lastly, the proposed model adopts an auxiliary task learning module that predicts the observation status, goal direction, and goal distance, which guides the agent to learn the navigation policy efficiently. Through various quantitative and qualitative experiments using the Habitat-Matterport3D simulation environment and scene dataset, we demonstrate the superiority of the proposed model.
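
A minimal sketch of the point-wise (1x1) convolutional fusion idea named in the abstract: three heterogeneous context maps are concatenated channel-wise and mixed per map cell. All channel counts and names (VIS_C, SEM_C, GOAL_C, FUSED_C) are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

VIS_C, SEM_C, GOAL_C, FUSED_C = 32, 16, 8, 64  # assumed channel counts

class PointwiseFusion(nn.Module):
    def __init__(self):
        super().__init__()
        # A 1x1 convolution mixes channels at every map cell without
        # mixing spatial locations, yielding a global multimodal map.
        self.fuse = nn.Sequential(
            nn.Conv2d(VIS_C + SEM_C + GOAL_C, FUSED_C, kernel_size=1),
            nn.ReLU(),
        )

    def forward(self, visual, semantic, goal):
        x = torch.cat([visual, semantic, goal], dim=1)  # channel-wise concat
        return self.fuse(x)

# Usage: three egocentric maps of the same spatial size.
fusion = PointwiseFusion()
v = torch.randn(1, VIS_C, 64, 64)
s = torch.randn(1, SEM_C, 64, 64)
g = torch.randn(1, GOAL_C, 64, 64)
print(fusion(v, s, g).shape)  # torch.Size([1, 64, 64, 64])
```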

Development of an Actor-Critic Deep Reinforcement Learning Platform for Robotic Grasping in Real World (현실 세계에서의 로봇 파지 작업을 위한 정책/가치 심층 강화학습 플랫폼 개발)

  • Kim, Taewon;Park, Yeseong;Kim, Jong Bok;Park, Youngbin;Suh, Il Hong
    • The Journal of Korea Robotics Society / v.15 no.2 / pp.197-204 / 2020
  • In this paper, we present a learning platform for robotic grasping in the real world, in which actor-critic deep reinforcement learning is employed to learn the grasping skill directly from raw image pixels and rarely observed rewards. This is a challenging task because existing algorithms based on deep reinforcement learning require an extensive amount of training data or massive computational cost, making them unaffordable in real-world settings. To address these problems, the proposed learning platform consists of two training phases: a learning phase in a simulator and subsequent learning in the real world. The main processing blocks in the platform are the extraction of a latent vector based on state representation learning and disentanglement of a raw image, the generation of an adapted synthetic image using generative adversarial networks, and object detection and arm segmentation for the disentanglement. We demonstrate the effectiveness of this approach in a real environment.
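
A hedged sketch of a generic actor-critic update operating on image-derived latent vectors, in the spirit of the platform described above. The latent size, network widths, and reward handling are assumptions; the paper's GAN-based image adaptation and segmentation blocks are omitted.

```python
import torch
import torch.nn as nn

LATENT = 32     # assumed size of the state-representation latent
N_ACTIONS = 4   # assumed discrete action set

actor = nn.Sequential(nn.Linear(LATENT, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
critic = nn.Sequential(nn.Linear(LATENT, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=3e-4)

def update(z, action, reward, z_next, done, gamma=0.99):
    # TD target from the critic; rarely observed (sparse) rewards simply
    # show up here as mostly-zero `reward` values.
    with torch.no_grad():
        target = reward + gamma * (1.0 - done) * critic(z_next)
    value = critic(z)
    advantage = (target - value).detach()
    log_prob = torch.log_softmax(actor(z), dim=-1)[0, action]
    loss = -log_prob * advantage + (target - value).pow(2)
    opt.zero_grad(); loss.mean().backward(); opt.step()

# One toy transition between two latent vectors.
update(torch.randn(1, LATENT), 2, torch.tensor([[0.0]]),
       torch.randn(1, LATENT), torch.tensor([[0.0]]))
```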

Reinforcement Learning based on Deep Deterministic Policy Gradient for Roll Control of Underwater Vehicle (수중운동체의 롤 제어를 위한 Deep Deterministic Policy Gradient 기반 강화학습)

  • Kim, Su Yong;Hwang, Yeon Geol;Moon, Sung Woong
    • Journal of the Korea Institute of Military Science and Technology / v.24 no.5 / pp.558-568 / 2021
  • Existing underwater vehicle controllers are designed by linearizing the nonlinear dynamics model around a specific motion regime. Since such linear controllers exhibit unstable control performance in transient states, various studies have been conducted to overcome this problem. Recently, reinforcement learning has been used to improve control performance in transient states. Reinforcement learning can be broadly divided into value-based and policy-based reinforcement learning. In this paper, we propose a roll controller for an underwater vehicle based on the Deep Deterministic Policy Gradient (DDPG) algorithm, which learns the control policy and can show stable control performance in various situations and environments. The performance of the proposed DDPG-based roll controller was verified through simulation and compared with existing roll controllers based on PID control and on DQN with Normalized Advantage Functions.
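
A minimal DDPG sketch for a continuous roll command, under assumed dimensions (a 3-dimensional state, a single fin/roll action). Target networks and the replay buffer are reduced to bare essentials; none of this reflects the paper's actual network sizes.

```python
import copy
import torch
import torch.nn as nn

S_DIM, A_DIM = 3, 1  # assumed state/action sizes, not the paper's

actor = nn.Sequential(nn.Linear(S_DIM, 64), nn.ReLU(), nn.Linear(64, A_DIM), nn.Tanh())
critic = nn.Sequential(nn.Linear(S_DIM + A_DIM, 64), nn.ReLU(), nn.Linear(64, 1))
actor_t, critic_t = copy.deepcopy(actor), copy.deepcopy(critic)
a_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
c_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def ddpg_step(s, a, r, s2, gamma=0.99, tau=0.005):
    # Critic: regress Q(s,a) toward the bootstrapped target.
    with torch.no_grad():
        q_target = r + gamma * critic_t(torch.cat([s2, actor_t(s2)], dim=-1))
    q = critic(torch.cat([s, a], dim=-1))
    c_loss = (q - q_target).pow(2).mean()
    c_opt.zero_grad(); c_loss.backward(); c_opt.step()

    # Actor: deterministic policy gradient, maximize Q(s, pi(s)).
    a_loss = -critic(torch.cat([s, actor(s)], dim=-1)).mean()
    a_opt.zero_grad(); a_loss.backward(); a_opt.step()

    # Polyak-average the target networks.
    for net, net_t in ((actor, actor_t), (critic, critic_t)):
        for p, p_t in zip(net.parameters(), net_t.parameters()):
            p_t.data.mul_(1 - tau).add_(tau * p.data)

ddpg_step(torch.randn(8, S_DIM), torch.randn(8, A_DIM),
          torch.randn(8, 1), torch.randn(8, S_DIM))
```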

Digital Twin and Visual Object Tracking using Deep Reinforcement Learning (심층 강화학습을 이용한 디지털트윈 및 시각적 객체 추적)

  • Park, Jin Hyeok;Farkhodov, Khurshedjon;Choi, Piljoo;Lee, Suk-Hwan;Kwon, Ki-Ryong
    • Journal of Korea Multimedia Society / v.25 no.2 / pp.145-156 / 2022
  • Nowadays, object tracking has become an increasingly demanding task in hardware applications, as trackers must cope with various unpredictable environments using versatile algorithms. In this paper, we build a virtual city environment using AirSim (Aerial Informatics and Robotics Simulation) with its CityEnvironment scene, and apply the DQN (Deep Q-Network) model of deep reinforcement learning in this virtual environment. The proposed object tracking DQN observes the environment through a deep reinforcement learning model that receives continuous images captured by the virtual environment simulation system as input and controls the operation of a virtual drone. The deep reinforcement learning model is pre-trained using various existing continuous image sets. Since these existing image sets consist of image data from real environments and objects, the virtual environment and the moving objects to be tracked in it are implemented in 3D.
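
A hedged DQN sketch for image-based drone control: a small CNN maps a stack of simulator frames to Q-values over discrete drone commands. The frame size, stack depth, and action set are illustrative assumptions; the AirSim interface itself is not shown.

```python
import torch
import torch.nn as nn

N_ACTIONS = 6  # assumed commands, e.g. forward/back/left/right/up/down

q_net = nn.Sequential(
    nn.Conv2d(4, 16, kernel_size=8, stride=4), nn.ReLU(),   # 4 stacked frames
    nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
    nn.Flatten(),
    nn.Linear(32 * 9 * 9, 128), nn.ReLU(),
    nn.Linear(128, N_ACTIONS),
)

def act(frames, eps=0.1):
    # Epsilon-greedy action over Q-values.
    if torch.rand(1).item() < eps:
        return torch.randint(N_ACTIONS, (1,)).item()
    return q_net(frames).argmax(dim=-1).item()

frames = torch.randn(1, 4, 84, 84)  # assumed 84x84 grayscale frame stack
print(act(frames))
```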

Fast Motion Planning of Wheel-legged Robot for Crossing 3D Obstacles using Deep Reinforcement Learning (심층 강화학습을 이용한 휠-다리 로봇의 3차원 장애물극복 고속 모션 계획 방법)

  • Soonkyu Jeong;Mooncheol Won
    • The Journal of Korea Robotics Society / v.18 no.2 / pp.143-154 / 2023
  • In this study, a fast motion planning method for the swing motion of a 6x6 wheel-legged robot to traverse large obstacles and gaps is proposed. The motion planning method presented in our previous paper, which was based on trajectory optimization, took up to tens of seconds and was limited to two-dimensional, structured vertical obstacles and trenches. A deep neural network based on a one-dimensional Convolutional Neural Network (CNN) is introduced to generate keyframes, which are then used to represent smooth reference commands for the six leg angles along the robot's path. The network is initially trained with the behavioral cloning method on a dataset gathered from previous trajectory-optimization simulation results. Its performance is then improved through reinforcement learning using a one-step REINFORCE algorithm. The trained model increases the speed of motion planning by up to 820 times and improves the success rate of obstacle crossing under harsh conditions such as low friction and high roughness.
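
A sketch of the two-stage idea in the abstract: a 1-D CNN maps a terrain profile to keyframe leg angles, pre-trained by behavioral cloning and then refined with a one-step REINFORCE update. The scan length, the Gaussian policy head, and the reward are all assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

PROFILE_LEN, N_LEGS = 64, 6  # assumed scan length; six leg angles

net = nn.Sequential(
    nn.Conv1d(1, 8, kernel_size=5, padding=2), nn.ReLU(),
    nn.Flatten(),
    nn.Linear(8 * PROFILE_LEN, N_LEGS),  # mean keyframe angle per leg
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def bc_step(profile, expert_angles):
    # Behavioral cloning: regress toward trajectory-optimization labels.
    loss = (net(profile) - expert_angles).pow(2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

def reinforce_step(profile, reward_fn, sigma=0.05):
    # One-step REINFORCE: sample keyframes, weight log-prob by reward.
    mean = net(profile)
    dist = torch.distributions.Normal(mean, sigma)
    sample = dist.sample()
    loss = -(dist.log_prob(sample).sum(-1) * reward_fn(sample)).mean()
    opt.zero_grad(); loss.backward(); opt.step()

profile = torch.randn(1, 1, PROFILE_LEN)
bc_step(profile, torch.zeros(1, N_LEGS))
reinforce_step(profile, lambda a: torch.ones(a.shape[0]))  # dummy reward
```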

Collective Navigation Through a Narrow Gap for a Swarm of UAVs Using Curriculum-Based Deep Reinforcement Learning (커리큘럼 기반 심층 강화학습을 이용한 좁은 틈을 통과하는 무인기 군집 내비게이션)

  • Myong-Yol Choi;Woojae Shin;Minwoo Kim;Hwi-Sung Park;Youngbin You;Min Lee;Hyondong Oh
    • The Journal of Korea Robotics Society / v.19 no.1 / pp.117-129 / 2024
  • This paper introduces collective navigation through a narrow gap using a curriculum-based deep reinforcement learning algorithm for a swarm of unmanned aerial vehicles (UAVs). Collective navigation in complex environments is essential for various applications such as search and rescue, environmental monitoring, and military operations. Conventional methods, which are easily interpretable from an engineering perspective, divide the navigation task into mapping, planning, and control; however, they struggle with increased latency and unmodeled environmental factors. Recently, learning-based methods have addressed these problems by employing an end-to-end framework with neural networks. Nonetheless, most existing learning-based approaches face challenges in complex scenarios, particularly when navigating through a narrow gap or when a leader or informed UAV is unavailable. Our approach uses information from a fixed number of nearest neighboring UAVs and incorporates a task-specific curriculum to reduce learning time and train a robust model. The effectiveness of the proposed algorithm is verified through an ablation study and quantitative metrics. Simulation results demonstrate that our approach outperforms existing methods.
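
A hedged sketch of a task-specific curriculum of the kind the abstract describes: episodes start with a wide gap, and the gap narrows as the swarm's success rate rises. The thresholds, widths, and schedule are illustrative assumptions, not the paper's actual curriculum.

```python
from dataclasses import dataclass

@dataclass
class GapCurriculum:
    widths: tuple = (8.0, 5.0, 3.0, 1.5)  # assumed gap widths in meters
    promote_at: float = 0.8               # assumed success-rate threshold
    stage: int = 0

    def current_width(self) -> float:
        return self.widths[self.stage]

    def update(self, success_rate: float) -> None:
        # Move to a harder stage once the agents are reliable enough.
        if success_rate >= self.promote_at and self.stage < len(self.widths) - 1:
            self.stage += 1

curriculum = GapCurriculum()
for success_rate in (0.5, 0.85, 0.9, 0.6, 0.95):  # e.g. evaluation results
    curriculum.update(success_rate)
    print(f"train next batch with gap width {curriculum.current_width()} m")
```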

Obstacle Avoidance System for Autonomous CTVs in Offshore Wind Farms Based on Deep Reinforcement Learning (심층 강화학습 기반 자율운항 CTV의 해상풍력발전단지 내 장애물 회피 시스템)

  • Jingyun Kim;Haemyung Chon;Jackyou Noh
    • IEMEK Journal of Embedded Systems and Applications / v.19 no.3 / pp.131-139 / 2024
  • Crew Transfer Vessels (CTVs) are primarily used for the maintenance of offshore wind farms. Despite being manually operated by professional captains and crews, collisions with other ships and marine structures still occur. To prevent this, autonomous navigation systems need to be introduced to CTVs. In this study, the obstacle avoidance system of an autonomous navigation system for CTVs was investigated. In particular, obstacle avoidance simulations for CTVs using deep reinforcement learning were carried out, taking into account the currents and wind loads in offshore wind farms. For this purpose, three-degree-of-freedom ship maneuvering modeling of CTVs considering these currents and wind loads was performed, and a simulation environment for offshore wind farms was implemented to train and test the deep reinforcement learning agent. Specifically, obstacle avoidance maneuvers were studied using the MATD3 deep reinforcement learning algorithm, and as a result it was confirmed that a model trained over 10,000 episodes could successfully avoid both static and moving obstacles. This confirms that the proposed methods can successfully facilitate obstacle avoidance for autonomous CTVs within offshore wind farms.
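
A minimal 3-DOF (surge, sway, yaw) kinematic update for a CTV-like vessel, with the current treated as an additive earth-frame drift, reflecting the simulation setup sketched in the abstract. The dynamics coefficients and the wind-load model are omitted; all numbers are assumptions.

```python
import math

def step(x, y, psi, u, v, r, dt=0.1, current=(0.2, 0.0)):
    """Advance position (x, y) and heading psi by one time step.

    u, v, r: surge/sway velocities and yaw rate in the body frame.
    current: assumed earth-frame current velocity (m/s).
    """
    x += (u * math.cos(psi) - v * math.sin(psi) + current[0]) * dt
    y += (u * math.sin(psi) + v * math.cos(psi) + current[1]) * dt
    psi += r * dt
    return x, y, psi

x, y, psi = 0.0, 0.0, 0.0
for _ in range(100):  # 10 s of simulated time
    x, y, psi = step(x, y, psi, u=3.0, v=0.0, r=0.02)
print(f"position=({x:.1f}, {y:.1f}) m, heading={math.degrees(psi):.1f} deg")
```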

Deep Q-Learning Network Model for Container Ship Master Stowage Plan (컨테이너 선박 마스터 적하계획을 위한 심층강화학습 모형)

  • Shin, Jae-Young;Ryu, Hyun-Seung
    • Journal of the Korean Society of Industry Convergence / v.24 no.1 / pp.19-29 / 2021
  • In the port logistics system, container stowage planning is an important issue for cost-effective efficiency improvements. At present, planners mainly carry out stowage planning manually or semi-automatically. However, as the trend toward ultra-large container ships continues, it is difficult to calculate an efficient stowage plan by hand. With the recent rapid development of artificial intelligence technologies, many studies have applied reinforcement learning to optimization problems. Accordingly, in this paper, we develop and present a Deep Q-Learning Network model for the master stowage planning of container ships.
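
A hedged sketch of how one stowage decision can be cast for a deep Q-network: the state summarizes the bay plan, actions are candidate slot assignments for the next container, and infeasible slots are masked out. The state encoding and all sizes are illustrative assumptions only, not the paper's model.

```python
import torch
import torch.nn as nn

STATE_DIM, N_SLOTS = 40, 20  # assumed bay-plan encoding and slot count

q_net = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(), nn.Linear(128, N_SLOTS))

def pick_slot(state, feasible_mask):
    # Mask infeasible slots (occupied, weight-limit violations, ...)
    # before taking the greedy action.
    q = q_net(state)
    q = q.masked_fill(~feasible_mask, float("-inf"))
    return q.argmax(dim=-1).item()

state = torch.randn(1, STATE_DIM)
mask = torch.ones(1, N_SLOTS, dtype=torch.bool)
mask[0, :5] = False  # e.g. first five slots already occupied
print(pick_slot(state, mask))
```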

Qualitative Analysis of Chinese University Students' Online Learning Experience in Korea During the Covid-19 Pandemic (코로나19 시기 재한 중국인 유학생들의 온라인 수업경험에 대한 질적 분석)

  • Kim, Joo-yeong;Koo, Yesung;Bai, Chunai;Park, Junghwan
    • Journal of the Korea Academia-Industrial cooperation Society / v.22 no.3 / pp.633-642 / 2021
  • This study explores the online learning experiences of Chinese international students in Korea using the CQR process and method. To gather data, the researchers conducted in-depth online interviews with 15 Chinese university students in Korea who were enrolled in the spring and fall semesters of 2020. After compiling the research, the data were segmented into four domains and 13 categories, with 36 subcategories identified among the students' online learning experiences. The results show that the Chinese students perceived the convenience of online classes and personalized learning as strengths, but considered lowered motivation and lack of concentration as weaknesses. They also experienced an increase in the amount of learning, spending more time studying online, using personal learning strategies, and getting help from friends and the university's online learning system. Moreover, they experienced difficulties related to class notifications, guidance, and interactions with instructors. International students in Korea need their instructors' facilitation to understand and participate in online classes, support in strengthening their self-directed learning ability, and appropriate guidance regarding the online class environment.

Study on Improving the Navigational Safety Evaluation Methodology based on Autonomous Operation Technology (자율운항기술 기반의 선박 통항 안전성 평가 방법론 개선 연구)

  • Jun-Mo Park
    • Journal of the Korean Society of Marine Environment & Safety / v.30 no.1 / pp.74-81 / 2024
  • In the near future, autonomous ships, ships controlled by shore-based remote control centers, and ships operated by navigators will coexist and operate at sea together. In such a situation, a method is required to evaluate the safety of the maritime traffic environment. Therefore, in this study, a plan was proposed to evaluate navigational safety through ship-handling simulation in a maritime environment where ships directly controlled by navigators and autonomous ships coexist, using autonomous operation technology. The own ship was designed to have autonomous operational functions by learning an MMG model based on six-DOF motion with the PPO algorithm, a deep reinforcement learning technique. For the target ships, maritime traffic modeling data were constructed from the maritime traffic data of the sea area to be evaluated, and autonomous operational functions were designed for implementation in the simulation space. A numerical model was established by collecting data on tide, waves, currents, and wind from the maritime meteorological database; a maritime meteorology model was created on this basis and designed to reproduce maritime weather in the simulator. Finally, for the safety evaluation, a system was proposed that enables assessment of collision risk through vessel traffic flow simulation in ship-handling simulation while maintaining the existing evaluation method.
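
A minimal PPO clipped-objective step, the deep reinforcement learning technique named in the abstract for the own-ship controller. The state/action sizes and the advantage values are stand-in assumptions; the MMG ship dynamics are not modeled here.

```python
import torch
import torch.nn as nn

S_DIM, A_DIM = 6, 2  # assumed: motion-derived state, rudder/throttle actions

policy = nn.Sequential(nn.Linear(S_DIM, 64), nn.Tanh(), nn.Linear(64, A_DIM))
log_std = nn.Parameter(torch.zeros(A_DIM))
opt = torch.optim.Adam(list(policy.parameters()) + [log_std], lr=3e-4)

def ppo_step(states, actions, old_log_probs, advantages, clip=0.2):
    dist = torch.distributions.Normal(policy(states), log_std.exp())
    log_probs = dist.log_prob(actions).sum(-1)
    ratio = (log_probs - old_log_probs).exp()
    # Clipped surrogate: take the pessimistic of the two terms.
    surrogate = torch.min(ratio * advantages,
                          ratio.clamp(1 - clip, 1 + clip) * advantages)
    loss = -surrogate.mean()
    opt.zero_grad(); loss.backward(); opt.step()

ppo_step(torch.randn(16, S_DIM), torch.randn(16, A_DIM),
         torch.randn(16), torch.randn(16))
```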