References
- C. J. C. H. Watkins and P. Dayan, "Q-learning," Machine Learning, vol. 8, no. 3, pp. 279-292, 1992. http://dx.doi.org/10.1023/A:1022676722315
- D. Ernst, P. Geurts, and L. Wehenkel, "Tree-based batch mode reinforcement learning," Journal of Machine Learning Research, vol. 6, pp. 503-556, 2005.
- V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller, "Playing Atari with deep reinforcement learning," arXiv:1312.5602, 2013. Available: https://arxiv.org/abs/1312.5602
- Y. Tkachenko, "Autonomous CRM control via CLV approximation with deep reinforcement learning in discrete and continuous action space," arXiv:1504.01840, 2015. Available: https://arxiv.org/abs/1504.01840
- M. Riedmiller, "Neural fitted Q iteration: first experiences with a data efficient neural reinforcement learning method," in Machine Learning: ECML 2005, J. Gama, R. Camacho, P. B. Brazdil, A. M. Jorge, and L. Torgo, Eds. Berlin: Springer Berlin Heidelberg, 2005, pp. 317-328. http://dx.doi.org/10.1007/11564096_32
- T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, "Continuous control with deep reinforcement learning," arXiv:1509.02971, 2015. Available: https://arxiv.org/abs/1509.02971
- R. S. Sutton, "Generalization in reinforcement learning: successful examples using sparse coarse coding," Advances in Neural Information Processing Systems, vol. 8, pp. 1038-1044, 1996.
- H. van Hasselt, A. Guez, and D. Silver, "Deep reinforcement learning with double Q-learning," arXiv:1509.06461, 2015. Available: https://arxiv.org/abs/1509.06461
- V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, et al., "Human-level control through deep reinforcement learning," Nature, vol. 518, no. 7540, pp. 529-533, 2015. http://dx.doi.org/10.1038/nature14236
- D. Silver, G. Lever, N. Heess, T. Degris, D. Wierstra, and M. Riedmiller, "Deterministic policy gradient algorithms," in Proceedings of the 31st International Conference on Machine Learning (ICML-14), Beijing, China, 2014, pp. 387-395.
- R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour, "Policy gradient methods for reinforcement learning with function approximation," Advances in Neural Information Processing Systems, vol. 12, pp. 1057-1063, 2000.
- L. A. Celiberto, C. H. C. Ribeiro, A. H. R. Costa, and R. A. C. Bianchi, "Heuristic reinforcement learning applied to RoboCup simulation agents," in RoboCup 2007: Robot Soccer World Cup XI, U. Visser, F. Ribeiro, T. Ohashi, and F. Dellaert, Eds. Berlin: Springer Berlin Heidelberg, 2008, pp. 220-227. http://dx.doi.org/10.1007/978-3-540-68847-1_19
- R. A. C. Bianchi, M. F. Martins, C. H. C. Ribeiro, and A. H. R. Costa, "Heuristically-accelerated multiagent reinforcement learning," IEEE Transactions on Cybernetics, vol. 44, no. 2, pp. 252-265, 2014. http://dx.doi.org/10.1109/TCYB.2013.2253094