A Survey on Deep Reinforcement Learning Libraries

  • Published: 2019.12.01

Abstract

Reinforcement learning is a machine learning paradigm in which agents repeat an observation-action-reward cycle to estimate and predict the values of possible future action sequences. This allows the agents to incrementally reinforce the desired behavior for a given observation. Thanks to recent advances in deep learning, reinforcement learning has evolved into deep reinforcement learning, which has shown promising results in various control and optimization domains such as games, robotics, autonomous vehicles, computing, and industrial control. Alongside this trend, a number of programming libraries have been developed for applying deep reinforcement learning to a variety of applications. In this article, we briefly review and summarize 10 representative deep reinforcement learning libraries and compare them from a development project perspective.
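
The observation-action-reward cycle described above can be sketched with tabular Q-learning on a toy problem. This is a minimal illustration only and is not taken from any of the surveyed libraries: the five-state corridor environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are assumptions made for this example, and it uses a lookup table where deep reinforcement learning would use a neural network.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, state 4 is terminal
ACTIONS = (-1, +1)    # move left or move right


def step(state, action):
    """Toy environment: reward 1.0 only when the last state is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run the observe-act-reward loop and return the learned value table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Observation -> action: explore at random with probability
            # epsilon (or on a tie), otherwise take the higher-valued action.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            # Action -> reward and next observation.
            next_state, reward, done = step(state, ACTIONS[a])
            # Reinforce: move the estimate toward reward plus discounted
            # value of the best action available in the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action for each non-terminal state under the learned values."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

With these settings the learned policy should settle on moving right in every state, since that is the only way to reach the rewarded terminal state.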

Keywords

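Project Information

Research project number: Research and Development of Core Technologies for Hyper-Connected Intelligent Infrastructure

Lead research institution: Institute for Information & Communications Technology Promotion (IITP)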
