References
- Ministry of Commerce, Industry and Energy, 2003, Total energy consumption report, pp. 1-80
- Virk, G. S. and Loveday, D. L., 1992, A comparison of predictive, PID, and on/off techniques for energy management and control, Proceedings of ASHRAE, pp. 3-10
- Hang, C. C., Astrom, K. J. and Ho, W. K., 1991, Refinements of the Ziegler-Nichols tuning formula, IEE Proceedings D: Control Theory and Applications, Vol. 138, No. 2, pp. 111-118
- Watkins, C. J. C. H. and Dayan, P., 1992, Technical note: Q-learning, Machine Learning, Vol. 8, pp. 279-292
- Anderson, C. W., Hittle, D. C., Katz, A. D. and Kretchmar, R. M., 1997, Synthesis of reinforcement learning, neural networks, and PI control applied to a simulated heating coil, Artificial Intelligence in Engineering, Vol. 11, No. 4, pp. 421-429, https://doi.org/10.1016/S0954-1810(97)00004-6
- Anderson, C. W., 1993, Q-learning with hidden-unit restarting, Advances in Neural Information Processing Systems, Vol. 5, Hanson, S. J., Cowan, J. D. and Giles, C. L., eds., Morgan Kaufmann Publishers, San Mateo, CA, pp. 81-88
- Barto, A. G., Bradtke, S. J. and Singh, S. P., 1995, Learning to act using real-time dynamic programming, Artificial Intelligence, Special Volume: Computational Research on Interaction and Agency, Vol. 72, No. 1-2, pp. 81-138
- Sutton, R. S., 1988, Learning to predict by the methods of temporal differences, Machine Learning, Vol. 3, pp. 9-44
- Sutton, R. S. and Barto, A. G., 1998, Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA, pp. 51-85
- So, J. H., Cho, S. H., Song, M. H. and Park, M. S., 2001, Experimental study on control performance of reinforcement learning method, Proceedings of the SAREK, pp. 697-701