Control for Manipulator of an Underwater Robot Using Meta Reinforcement Learning

  • Received : 2021.01.22
  • Reviewed : 2021.02.17
  • Published : 2021.02.28

Abstract

This paper proposes model-based meta reinforcement learning for controlling the manipulator of an underwater construction robot. The method quickly updates a dynamics model from recent experience gathered in the real application and passes the updated model to a model predictive controller, which computes the manipulator control inputs needed to reach the target position. A simulation environment for model-based meta reinforcement learning was built with MuJoCo and Gazebo, and the proposed method was validated in a real manipulator control environment for the underwater construction robot, accounting for model uncertainties.
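The control scheme described above has two parts: a dynamics model adapted online from recent experience and a sampling-based model predictive controller that uses the adapted model to compute control inputs toward the target position. The Python sketch below illustrates that adapt-then-plan loop under assumed state and action dimensions; the linear stand-in dynamics model, the random-shooting MPC, and every name and hyperparameter are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch (not the authors' code) of the control loop described in the
# abstract: a dynamics model is adapted online from recent experience and
# handed to a sampling-based MPC that computes manipulator control inputs
# toward a target position. All names, dimensions, and hyperparameters are
# illustrative assumptions.

import numpy as np

STATE_DIM, ACTION_DIM = 6, 3      # assumed joint-space dimensions
HORIZON, N_SAMPLES = 10, 128      # MPC planning horizon and rollout samples


class DynamicsModel:
    """Tiny linear-in-features dynamics model; stands in for a neural network."""

    def __init__(self, lr=1e-2):
        self.W = np.zeros((STATE_DIM + ACTION_DIM, STATE_DIM))
        self.lr = lr

    def predict(self, state, action):
        x = np.concatenate([state, action])
        return state + x @ self.W                      # predicted next state

    def adapt(self, transitions, steps=5):
        """Fast inner-loop update from recent real experience (meta-adaptation)."""
        for _ in range(steps):
            for s, a, s_next in transitions:
                x = np.concatenate([s, a])
                err = self.predict(s, a) - s_next
                self.W -= self.lr * np.outer(x, err)   # gradient step on squared error


def mpc_action(model, state, target, rng):
    """Random-shooting MPC: roll out sampled action sequences through the model
    and return the first action of the lowest-cost sequence."""
    best_cost, best_action = np.inf, np.zeros(ACTION_DIM)
    for _ in range(N_SAMPLES):
        actions = rng.normal(0.0, 0.5, size=(HORIZON, ACTION_DIM))
        s, cost = state.copy(), 0.0
        for a in actions:
            s = model.predict(s, a)
            cost += np.sum((s - target) ** 2)          # distance-to-target cost
        if cost < best_cost:
            best_cost, best_action = cost, actions[0]
    return best_action


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    model = DynamicsModel()
    state, target = np.zeros(STATE_DIM), np.ones(STATE_DIM)
    recent = []                                        # sliding window of transitions
    for t in range(50):
        model.adapt(recent[-20:])                      # adapt on the last few steps
        action = mpc_action(model, state, target, rng)
        # A real system would apply `action` to the manipulator here; the
        # environment response is faked so the sketch stays self-contained.
        next_state = state + 0.1 * np.pad(action, (0, STATE_DIM - ACTION_DIM))
        recent.append((state, action, next_state))
        state = next_state
```

A full implementation would replace the linear model with a neural-network dynamics model trained by meta-learning and the random-shooting planner with a more sample-efficient MPC variant, but the structure of the adapt-then-plan loop remains the same.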

Keywords
