

Proximal Policy Optimization Reinforcement Learning based Optimal Path Planning Study of Surion Agent against Enemy Air Defense Threats

  • Jae-Hwan Kim (Republic of Korea Army)
  • Jong-Hwan Kim (Mechanical & Systems Engineering Department, Korea Military Academy)
  • Received: 2024.04.02
  • Accepted: 2024.06.14
  • Published: 2024.06.30

Abstract


The Korean Helicopter Development Program has successfully produced the Surion, a versatile utility helicopter that replaces the aging UH-1 and 500MD fleets. Designed for high maneuverability, the Surion is expected to perform diverse missions such as personnel transport and special operations on future battlefields, which demands strong low-altitude tactical maneuvering capability for survivability. Despite the importance of this capability, research on optimal low-altitude tactical maneuvers that account for the risks posed by enemy air defenses remains limited. This study addresses that gap by proposing a method for computing optimized low-altitude maneuvering paths that guide the Surion to an operational target area while incorporating enemy air-defense threat information. Leveraging the Proximal Policy Optimization (PPO) algorithm, a reinforcement learning approach, approximately 2×10⁷ training iterations were conducted on a visualized Surion model implemented in the Unity environment with the ML-Agents library. The proposed method yielded rapid and stable policy convergence for generating optimal maneuvering paths. Experiments based on two criteria, "operation time" and "minimum damage," each produced a distinct optimal path, suggesting that effective low-altitude tactical maneuvers are achievable when the risk factors of enemy air defenses are considered. These results are expected to support estimates of mobility, mission success rate, and survivability when planning operations that employ the Surion or its unmanned derivatives.
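The PPO algorithm at the core of the proposed method maximizes a clipped surrogate objective, which bounds how far a single policy update can move the action probabilities. The NumPy sketch below illustrates that objective only; it is not the authors' Unity/ML-Agents training code, and the function name, signature, and the clip range ε = 0.2 (the common default from the PPO paper) are illustrative assumptions.

```python
import numpy as np

def ppo_clip_objective(logp_new, logp_old, advantages, eps=0.2):
    """Clipped surrogate objective of PPO (illustrative sketch).

    logp_new / logp_old: log-probabilities of the taken actions under the
    current and behavior policies; advantages: advantage estimates A_t.
    Returns the mean clipped surrogate, which the optimizer maximizes.
    """
    ratio = np.exp(logp_new - logp_old)                    # r_t(theta)
    unclipped = ratio * advantages                         # r_t * A_t
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    # Taking the elementwise minimum removes the incentive to push the
    # probability ratio outside [1 - eps, 1 + eps] in a single update.
    return np.mean(np.minimum(unclipped, clipped))
```

When the new and old log-probabilities coincide, the ratio is 1 and the objective reduces to the mean advantage; once the ratio drifts outside [1 − ε, 1 + ε], clipping caps the gradient incentive, which is the mechanism behind the stable policy convergence PPO is known for.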


Acknowledgments

This study was supported by the 2023 Agency for Defense Development project "Brigade Standard Task Analysis" (UE231106ID).
