Fig. 1. Configuration objects
Fig. 2. Learning cycle
Fig. 3. Simulation trajectories
Fig. 4. Wind-applied trajectory
Fig. 5. Test result graph
Fig. 6. Wind-applied test result graph
Table 1. Hyperparameters of the Unity ML-Agents trainer
Table 2. Training Statistics
Table 3. Simulation results
Table 4. Wind-applied simulation results