References
- Electric Power Systems Research v.63 no.1 A reinforcement learning approach to automatic generation control T. P. I. Ahamed;P. S. N. Rao;P. S. Sastry https://doi.org/10.1016/S0378-7796(02)00088-3
- Journal of Dynamic Systems, Measurement and Control Data storage in the cerebellar model articulation controller J. S. Albus
- Journal of Dynamic Systems, Measurement and Control A new approach to manipulator control: The cerebellar model articulation controller (CMAC) J. S. Albus
- IEEE Control Systems Magazine v.9 no.3 Learning to control an inverted pendulum using neural networks C. W. Anderson https://doi.org/10.1109/37.24809
- Artificial Intelligence in Engineering v.11 no.4 Synthesis of reinforcement learning, neural networks and PI control applied to a simulated heating coil C. W. Anderson;D. C. Hittle;A. D. Katz;R. M. Kretchmar https://doi.org/10.1016/S0954-1810(97)00004-6
- Machine Learning v.23 Purposive behavior acquisition for a real robot by vision-based reinforcement learning M. Asada;S. Noda;S. Tawaratsumida;K. Hosoda
- Comp. & Maths. with Appls. v.12A Dual control of an integrator with unknown gain K. J. Astrom;A. Helmersson
- Proc. of the Fourteenth International Conference on Machine Learning Robot learning from demonstration C. G. Atkeson;S. Schaal
- Proc. of the International Conference on Machine Learning Residual algorithms: Reinforcement learning with function approximation L. Baird III
- Artificial Intelligence v.72 no.1 Learning to act using real-time dynamic programming A. G. Barto;S. J. Bradtke;S. P. Singh https://doi.org/10.1016/0004-3702(94)00011-O
- IEEE Trans. on Systems, Man, and Cybernetics v.13 no.5 Neuronlike adaptive elements that can solve difficult learning control problems A. G. Barto;R. S. Sutton;C. W. Anderson
- Dynamic Programming R. E. Bellman
- Dynamic Programming and Optimal Control (2nd edition) D. P. Bertsekas
- Proc. of Sixth International Conference on Chemical Process Control Neuro-dynamic programming: An overview D. P. Bertsekas;J. B. Rawlings(ed.);B. A. Ogunnaike(ed.);J. W. Eaton(ed.)
- IEEE Trans. on Automatic Control v.34 no.6 Adaptive aggregation for infinite horizon dynamic programming D. P. Bertsekas;D. A. Castanon https://doi.org/10.1109/9.24227
- Data Networks (2nd edition) D. P. Bertsekas;R. G. Gallager
- Parallel and Distributed Computation: Numerical Methods D. P. Bertsekas;J. N. Tsitsiklis
- Neuro-Dynamic Programming D. P. Bertsekas;J. N. Tsitsiklis
- Probability Theory and Related Fields v.78 A convex analytic approach to Markov decision processes V. Borkar https://doi.org/10.1007/BF00353877
- Advances in Neural Information Processing Systems v.7 Generalization in reinforcement learning: Safely approximating the value function J. A. Boyan;A. W. Moore;G. Tesauro(ed.);D. Touretzky(ed.)
- Advances in Neural Information Processing Systems v.8 Improving elevator performance using reinforcement learning R. Crites;A. G. Barto;D. S. Touretzky(ed.);M. C. Mozer(ed.);M. E. Hasselmo(ed.)
- Machine Learning v.33 Elevator group control using multiple reinforcement learning agents R. Crites;A. G. Barto https://doi.org/10.1023/A:1007518724497
- Advances in Neural Information Processing Systems v.5 Reinforcement learning applied to linear quadratic regulation S. J. Bradtke;S. J. Hanson(ed.);J. Cowan(ed.);C. L. Giles(ed.)
- Machine Learning v.8 The convergence of $TD({\lambda})$ for general ${\lambda}$ P. Dayan
- Operations Research v.51 no.6 The linear programming approach to approximate dynamic programming D. P. de Farias;B. Van Roy https://doi.org/10.1287/opre.51.6.850.24925
- Management Science v.16 On linear programming in a Markov decision problem E. V. Denardo
- Proc. of the International Conference on Robotics and Automation A comparison of direct and model-based reinforcement learning C. G. Atkeson;J. Santamaria
- Proc. of the Twelfth International Conference on Machine Learning Stable function approximation in dynamic programming G. J. Gordon
- The Elements of Statistical Learning: Data Mining, Inference, and Prediction T. Hastie;R. Tibshirani;J. Friedman
- Management Science v.25 Linear programming and Markov decision chains A. Hordijk;L. C. M. Kallenberg https://doi.org/10.1287/mnsc.25.4.352
- Computers & Chemical Engineering v.16 no.4 Process control via artificial neural networks and reinforcement learning J. C. Hoskins;D. M. Himmelblau https://doi.org/10.1016/0098-1354(92)80045-B
- Dynamic Programming and Markov Processes R. A. Howard
- Neural Computation v.6 no.6 On the convergence of stochastic iterative dynamic programming algorithms T. Jaakkola;M. I. Jordan;S. P. Singh https://doi.org/10.1162/neco.1994.6.6.1185
- Journal of Artificial Intelligence Research v.4 Reinforcement learning: A survey L. P. Kaelbling;M. L. Littman;A. W. Moore
- International Journal of Robust and Nonlinear Control v.13 no.3;4 Simulation based strategy for nonlinear optimal control: Application to a microbial cell reactor N. S. Kaisare;J. M. Lee;J. H. Lee
- Proc. of the Eleventh National Conference on Artificial Intelligence Complexity analysis of real-time reinforcement learning S. Koenig;R. G. Simmons
- Advances in Neural Information Processing Systems v.12 Actor-critic algorithms V. R. Konda;J. N. Tsitsiklis;S. A. Solla(ed.);T. K. Leen(ed.);K.-R. Muller(ed.)
- Proc. of the 1992 IEEE/RSJ International Conference on Intelligent Robots and Systems Adaptive state space quantisation for reinforcement learning of collision-free navigation B. J. A. Krose;J. W. M. van Dam
- Stochastic Systems: Estimation, Identification and Adaptive Control P. R. Kumar;P. P. Varaiya
- AIChE Annual Meeting Simulation-based dynamic programming strategy for improvement of control policies J. M. Lee;N. S. Kaisare;J. H. Lee
- AIChE Annual Meeting Neuro-dynamic programming approach to dual control problem J. M. Lee;J. H. Lee
- Automatica Approximate dynamic programming based approaches for input-output data-driven control of nonlinear processes J. M. Lee;J. H. Lee
- Korean J. Chem. Eng. v.21 no.2 Simulation-based learning of cost-to-go for control of nonlinear processes J. M. Lee;J. H. Lee https://doi.org/10.1007/BF02705417
- Computers & Chemical Engineering v.16 A neural network architecture that computes its own reliability J. A. Leonard;M. A. Kramer;L. H. Ungar https://doi.org/10.1016/0098-1354(92)80035-8
- Machine Learning v.8 Self-improving reactive agents based on reinforcement learning, planning and teaching L.-J. Lin
- Machine Learning v.55 no.2;3 Automatic programming of behavior-based robots using reinforcement learning S. Mahadevan;J. Connell
- Proc. of 14th International Conference on Machine Learning Self-improving factory simulation using continuous-time average-reward reinforcement learning S. Mahadevan;N. Marchalleck;T. K. Das;A. Gosavi
- Management Science v.6 no.3 Linear programming and sequential decisions A. S. Manne https://doi.org/10.1287/mnsc.6.3.259
- IEEE Trans. on Automatic Control v.46 no.2 Simulation-based optimization of Markov reward processes P. Marbach;J. N. Tsitsiklis https://doi.org/10.1109/9.905687
- Computers & Chemical Engineering v.24 Batch process modeling for optimization using reinforcement learning E. C. Martinez https://doi.org/10.1016/S0098-1354(00)00354-9
- Applications of Artificial Neural Networks Temporal difference learning: A chemical process control application S. Miller;R. J. Williams;A. F. Murray(ed.)
- Machine Learning v.21 no.3 The parti-game algorithm for variable resolution reinforcement learning in multidimensional state spaces A. Moore;C. Atkeson
- PhD thesis, Cambridge University Efficient Memory Based Robot Learning A. W. Moore
- Machine Learning v.13 Prioritized sweeping: Reinforcement learning with less data and less time A. W. Moore;C. G. Atkeson
- Computers & Chemical Engineering v.23 Model predictive control: Past, present and future M. Morari;J. H. Lee https://doi.org/10.1016/S0098-1354(98)00301-9
- Proc. of the International Joint Conference on Artificial Intelligence A convergent reinforcement learning algorithm in the continuous case based on a finite difference method R. Munos
- Machine Learning Journal v.40 A study of reinforcement learning in the continuous case by means of viscosity solutions R. Munos https://doi.org/10.1023/A:1007686309208
- Advances in Neural Information Processing Systems v.10 Enhancing Q-learning for optimal asset allocation R. Neuneier;M. Jordan(ed.);M. Kearns(ed.);S. Solla(ed.)
- IEEE Trans. on Automatic Control v.47 no.10 Kernel-based reinforcement learning in average-cost problems D. Ormoneit;P. W. Glynn https://doi.org/10.1109/TAC.2002.803530
- Machine Learning v.49 Kernel-based reinforcement learning D. Ormoneit;S. Sen https://doi.org/10.1023/A:1017928328829
- Ann. Math. Statist. v.33 On estimation of a probability density function and mode E. Parzen https://doi.org/10.1214/aoms/1177704472
- PhD thesis, Northeastern University Efficient Dynamic Programming-Based Learning for Control J. Peng
- Adaptive Behavior v.1 no.4 Efficient learning and planning within the Dyna framework J. Peng;R. J. Williams https://doi.org/10.1177/105971239300100403
- IEEE Trans. on Neural Networks v.8 no.5 Adaptive critic designs D. V. Prokhorov;D. C. Wunsch II https://doi.org/10.1109/72.623201
- Markov Decision Processes M. L. Puterman
- Control Engineering Practice v.11 no.7 A survey of industrial model predictive control technology S. J. Qin;T. A. Badgwell https://doi.org/10.1016/S0967-0661(02)00186-7
- Mathematical and Computational Techniques for Multilevel Adaptive Methods U. Rude
- Technical Report CUED/F-INFENG/TR 166, Engineering Department, Cambridge University On-line Q-learning using connectionist systems G. A. Rummery;M. Niranjan
- Proc. of the Fourth Connectionist Models Summer School Approximating Q-values with basis function representations P. Sabes
- IBM J. Res. Develop. Some studies in machine learning using the game of checkers A. L. Samuel
- IBM J. Res. Develop. Some studies in machine learning using the game of checkers II - recent progress A. L. Samuel
- Adaptive Behavior v.6 no.2 Experiments with reinforcement learning in problems with continuous state and action spaces J. C. Santamaria;R. S. Sutton;A. Ram https://doi.org/10.1177/105971239700600201
- Advances in Neural Information Processing Systems v.9 Learning from demonstration S. Schaal;M. C. Mozer(ed.);M. Jordan(ed.);T. Petsche(ed.)
- IEEE Control Systems v.14 no.1 Robot juggling: an implementation of memory-based learning S. Schaal;C. Atkeson https://doi.org/10.1109/37.257895
- Advances in Neural Information Processing Systems v.6 Temporal difference learning of position evaluation in the game of Go N. N. Schraudolph;P. Dayan;T. J. Sejnowski;J. D. Cowan(ed.);G. Tesauro(ed.);J. Alspector(ed.)
- Advances in Neural Information Processing Systems v.9 Reinforcement learning for dynamic channel allocation in cellular telephone systems S. Singh;D. Bertsekas;M. C. Mozer(ed.);M. I. Jordan(ed.);T. Petsche(ed.)
- Machine Learning v.22 Reinforcement learning with replacing eligibility traces S. P. Singh;R. S. Sutton
- Proc. 17th International Conf. on Machine Learning Practical reinforcement learning in continuous spaces W. D. Smart;L. P. Kaelbling
- Advances in Neural Information Processing Systems v.12 Policy gradient methods for reinforcement learning with function approximation R. Sutton;D. McAllester;S. Singh;Y. Mansour;S. A. Solla(ed.);T. K. Leen(ed.);K.-R. Muller(ed.)
- PhD thesis, University of Massachusetts Temporal Credit Assignment in Reinforcement Learning R. S. Sutton
- Machine Learning v.3 no.1 Learning to predict by the method of temporal differences R. S. Sutton
- Proc. of the Seventh International Conference on Machine Learning Integrated architectures for learning, planning, and reacting based on approximating dynamic programming R. S. Sutton
- Proc. of the Eighth International Workshop on Machine Learning Planning by incremental dynamic programming R. S. Sutton
- Advances in Neural Information Processing Systems v.8 Generalization in reinforcement learning: Successful examples using sparse coarse coding R. S. Sutton;D. S. Touretzky(ed.);M. C. Mozer(ed.);M. E. Hasselmo(ed.)
- Psychol. Rev. v.88 no.2 Toward a modern theory of adaptive networks: Expectation and prediction R. S. Sutton;A. G. Barto https://doi.org/10.1037/0033-295X.88.2.135
- Reinforcement Learning: An Introduction R. S. Sutton;A. G. Barto
- Advanced Robotics v.14 no.5 Enhanced continuous valued Q-learning for real autonomous robots M. Takeda;T. Nakamura;M. Imai;T. Ogasawara;M. Asada https://doi.org/10.1163/156855300741852
- Machine Learning v.8 Practical issues in temporal difference learning G. Tesauro
- Neural Computation v.6 no.2 TD-Gammon, a self-teaching backgammon program, achieves master-level play G. Tesauro https://doi.org/10.1162/neco.1994.6.2.215
- Communications of the ACM v.38 no.3 Temporal difference learning and TD-Gammon G. Tesauro https://doi.org/10.1145/203330.203343
- Advances in Neural Information Processing Systems v.7 Learning to play the game of chess S. Thrun;G. Tesauro(ed.);D. S. Touretzky(ed.);T. K. Leen(ed.)
- Proc. of the Fourth Connectionist Models Summer School Issues in using function approximation for reinforcement learning S. Thrun;A. Schwartz
- Machine Learning v.16 Asynchronous stochastic approximation and Q-learning J. N. Tsitsiklis
- IEEE Trans. on Automatic Control v.42 no.5 An analysis of temporal-difference learning with function approximation J. N. Tsitsiklis;B. Van Roy https://doi.org/10.1109/9.580874
- Handbook of Markov Decision Processes: Methods and Applications Neuro-dynamic programming: Overview and recent trends B. Van Roy;E. Feinberg(ed.);A. Shwartz(ed.)
- PhD thesis, University of Cambridge Learning from Delayed Rewards C. J. C. H. Watkins
- Machine Learning v.8 Q-learning C. J. C. H. Watkins;P. Dayan
- General Systems Yearbook v.22 Advanced forecasting methods for global crisis warning and models of intelligence P. J. Werbos
- Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, Van Nostrand Reinhold Approximate dynamic programming for real-time control and neural modeling P. J. Werbos;A. White(ed.);D. A. Sofge(ed.)
- Proc. of the Eighth International Workshop on Machine Learning Complexity and cooperation in Q-learning S. D. Whitehead
- Technical Report NU-CCS-93-14, Northeastern University, College of Computer Science Analysis of some incremental variants of policy iteration: First steps toward understanding actor-critic learning systems R. J. Williams;L. C. Baird III
- Computers & Chemical Engineering v.21S Neuro-fuzzy modeling and control of a batch process involving simultaneous reaction and distillation J. A. Wilson;E. C. Martinez
- Stochastic Problems in Control Stochastic control problems M. Wonham;B. Friedland(ed.)
- PhD thesis, Oregon State University;Technical Report CS-96-30-1 Reinforcement Learning for Job-Shop Scheduling W. Zhang
- Proc. of the Twelfth International Conference on Machine Learning A reinforcement learning approach to job-shop scheduling W. Zhang;T. G. Dietterich
- Advances in Neural Information Processing Systems v.8 High-performance job-shop scheduling with a time-delay $TD({\lambda})$ network W. Zhang;T. G. Dietterich;D. S. Touretzky(ed.);M. C. Mozer(ed.);M. E. Hasselmo(ed.)