Funding
This work was supported by an IITP grant funded by the Korean government (MSIT) (No. 2018-0-00622).