Research Trends on Inverse Reinforcement Learning

  • 이상광 (Intelligent Knowledge Content Research Section)
  • 김대욱 (Intelligent Knowledge Content Research Section)
  • 장시환 (Intelligent Knowledge Content Research Section)
  • 양성일 (Intelligent Knowledge Content Research Section)
  • Published: 2019.12.01

Abstract

Recently, reinforcement learning (RL) has expanded from research in virtual simulation environments to a wide range of applications, such as autonomous driving, natural language processing, recommendation systems, and disease diagnosis. However, RL remains difficult to apply in complex real-world environments, largely because a suitable reward function is hard to specify by hand. In contrast, inverse reinforcement learning (IRL) infers the reward function from expert demonstration data, allowing optimal policies to be obtained in a variety of situations where no explicit reward is available. In particular, IRL is expected to be a key technology for artificial general intelligence research aimed at successfully performing human intellectual tasks. In this report, we briefly summarize various IRL techniques and research directions.
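To make the idea concrete, the sketch below is a minimal, self-contained Python implementation of tabular maximum-entropy IRL in the spirit of Ziebart et al. [6], not code from this report. It alternates between computing a soft-optimal policy under the current reward estimate and nudging the reward weights so that the policy's expected feature counts match the expert's. All names (`maxent_irl`, `P`, `features`) and the toy finite-horizon, linear-reward setting are illustrative assumptions.

```python
import numpy as np

def maxent_irl(P, features, expert_trajs, gamma=0.9, epochs=100, lr=0.1, horizon=20):
    """Tabular maximum-entropy IRL (after Ziebart et al. [6]).

    P:            (A, S, S) transition probabilities, P[a, s, s'].
    features:     (S, F) state feature matrix; reward is features @ theta.
    expert_trajs: list of state-index sequences of length `horizon`.
    """
    A, S, _ = P.shape
    theta = np.zeros(features.shape[1])

    # Empirical expert feature expectations (mean feature sum per trajectory).
    mu_expert = np.mean([features[traj].sum(axis=0) for traj in expert_trajs], axis=0)

    # Empirical initial-state distribution.
    p0 = np.zeros(S)
    for traj in expert_trajs:
        p0[traj[0]] += 1.0 / len(expert_trajs)

    for _ in range(epochs):
        r = features @ theta

        # Soft value iteration: v(s) = logsumexp_a q(s, a), yielding the
        # stochastic MaxEnt-optimal policy pi(a|s) = exp(q - v).
        v = np.zeros(S)
        for _ in range(100):
            q = r[:, None] + gamma * np.einsum('asn,n->sa', P, v)
            q_max = q.max(axis=1)
            v = q_max + np.log(np.exp(q - q_max[:, None]).sum(axis=1))
        pi = np.exp(q - v[:, None])

        # Expected state visitation frequencies under the current policy.
        d, svf = p0.copy(), p0.copy()
        for _ in range(horizon - 1):
            d = np.einsum('s,sa,asn->n', d, pi, P)
            svf += d

        # MaxEnt log-likelihood gradient: expert minus policy feature counts.
        theta += lr * (mu_expert - features.T @ svf)

    return features @ theta  # recovered per-state reward
```

The update direction, expert feature expectations minus those of the current policy, is the defining gradient of the MaxEnt formulation [6]; later methods in the reference list (deep MaxEnt IRL [8], guided cost learning [9], GAIL [2]) can be read as replacing the linear reward and exact soft value iteration with neural networks and sampled rollouts.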


Funding Information

Research Project: Development of an Intelligent Game Service Platform Based on Meta-Play Recognition

Funded by: Korea Creative Content Agency (KOCCA)

References

  1. A. Attia et al., "Global overview of imitation learning," arXiv:1801.06503, 2018.
  2. J. Ho et al., "Generative adversarial imitation learning," Advances in Neural Information Processing Systems, 2016.
  3. S. Ross et al., "Efficient reductions for imitation learning," Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010.
  4. S. Ross et al., "A reduction of imitation learning and structured prediction to no-regret online learning," Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011.
  5. P. Abbeel et al., "Apprenticeship learning via inverse reinforcement learning," Proceedings of the Twenty-First International Conference on Machine Learning, 2004.
  6. B. Ziebart et al., "Maximum entropy inverse reinforcement learning," Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008.
  7. S. Levine et al., "Nonlinear inverse reinforcement learning with Gaussian processes," Advances in Neural Information Processing Systems, 2011.
  8. M. Wulfmeier et al., "Maximum entropy deep inverse reinforcement learning," arXiv:1507.04888, 2015.
  9. C. Finn et al., "Guided cost learning: Deep inverse optimal control via policy optimization," Proceedings of the Thirty-Third International Conference on Machine Learning, 2016.
  10. http://rll.berkeley.edu/gcl
  11. I. Goodfellow et al., "Generative adversarial nets," Advances in Neural Information Processing Systems, 2014.
  12. J. Schulman et al., "Trust region policy optimization," Proceedings of the Thirty-Second International Conference on Machine Learning, 2015.
  13. X. Peng et al., "Variational discriminator bottleneck: Improving imitation learning, inverse RL, and GANs by constraining information flow," Proceedings of the International Conference on Learning Representations, 2019.
  14. Y. Li et al., "InfoGAIL: Interpretable imitation learning from visual demonstrations," Advances in Neural Information Processing Systems, 2017.
  15. X. Chen et al., "InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets," Advances in Neural Information Processing Systems, 2016.
  16. M. Arjovsky et al., "Wasserstein generative adversarial networks," Proceedings of the Thirty-Fourth International Conference on Machine Learning, 2017.
  17. http://torcs.sourceforge.net/