Simulation Based Reinforcement Learning for the Intelligence Behavior of Autonomous Weapon System

(Korean title: Simulation-Based Reinforcement Learning for Making Autonomous Weapon Systems Intelligent)

  • Received : 2022.12.14
  • Accepted : 2023.06.12
  • Published : 2023.06.30

Abstract

Despite the strong military impact of Autonomous Weapon Systems (AWS), empirical research on making these systems intelligent is lacking. This study discusses, from an engineering perspective, the concepts needed for intelligent drone reconnaissance missions, and it concretely proposes and empirically demonstrates the technologies required to implement them. Specifically, the study covers a Simulation-Based Learning framework, the integration of Supervised Learning and Reinforcement Learning in simulation environments, and Vision-Based Learning using aerial image data. The results confirm the importance and potential of Simulation-Based Learning for the intelligent behavior of AWS. This work contributes foundational research toward advanced AI technology in the military.

Acknowledgement

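This paper includes part of the study "Use of Personal Portable Combat Drones (PCD) for Reconnaissance Missions Using Reinforcement Learning," which was conducted with the support of the Poongsan-KAIST Future Technology Research Center.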
