
Reinforcement learning-based control with application to the once-through steam generator system

  • Cheng Li (Naval University of Engineering)
  • Ren Yu (Naval University of Engineering)
  • Wenmin Yu (Naval University of Engineering)
  • Tianshu Wang (Naval University of Engineering)
  • Received : 2022.05.18
  • Accepted : 2023.06.01
  • Published : 2023.10.25

Abstract

A reinforcement learning framework is proposed in this paper for controlling the outlet steam pressure of the once-through steam generator (OTSG). A double-layer controller based on the Proximal Policy Optimization (PPO) algorithm is embedded in the OTSG control structure. PPO trains the controller's neural networks continuously through interaction with the environment, so the trained controller achieves better regulation of the OTSG. However, reinforcement learning is difficult to apply directly to real-world plants: the optimal action at each step must be found through trial and error, which makes training very costly. To address this, the paper proposes a pretraining method in which an LSTM model serves as the training environment, saving training time and improving efficiency. Experimental results show that the method self-adjusts the control parameters under various working conditions, and the resulting control performance exhibits small overshoot, fast stabilization, and strong adaptability.
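The abstract outlines two ingredients: a PPO agent acting as the pressure controller, and an LSTM surrogate of the OTSG that stands in for the real plant during pretraining. The sketch below is not the authors' code; it only illustrates, under stated assumptions, how such pieces could fit together using PyTorch, Gymnasium, and Stable-Baselines3. The state, action, reward, setpoint, and horizon definitions are illustrative assumptions, and in practice the LSTM would first be fitted to plant data before the PPO agent is pretrained against it.

```python
# Minimal sketch (not the paper's implementation): an LSTM surrogate of the OTSG
# is wrapped as a gym-style environment, and an off-the-shelf PPO agent is
# pretrained on it instead of on the real plant.
import numpy as np
import torch
import torch.nn as nn
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO


class OTSGSurrogate(nn.Module):
    """LSTM that predicts the next outlet steam pressure from past (state, action)."""
    def __init__(self, in_dim=3, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, seq):                      # seq: (batch, time, in_dim)
        out, _ = self.lstm(seq)
        return self.head(out[:, -1])             # predicted next pressure


class SurrogateOTSGEnv(gym.Env):
    """Pretraining environment driven by the LSTM surrogate instead of the real OTSG."""
    def __init__(self, surrogate, setpoint=1.0, horizon=200):
        super().__init__()
        self.surrogate, self.setpoint, self.horizon = surrogate, setpoint, horizon
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(2,), dtype=np.float32)
        self.action_space = spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.t = 0
        self.pressure = 0.8 + 0.2 * self.np_random.random()   # assumed normalized pressure
        self.history = []
        return self._obs(), {}

    def _obs(self):
        # Assumed observation: current pressure and tracking error.
        return np.array([self.pressure, self.setpoint - self.pressure], dtype=np.float32)

    def step(self, action):
        self.history.append([self.pressure, self.setpoint, float(action[0])])
        seq = torch.tensor([self.history[-20:]], dtype=torch.float32)
        with torch.no_grad():
            self.pressure = float(self.surrogate(seq))          # surrogate plays the plant
        self.t += 1
        reward = -abs(self.setpoint - self.pressure)            # penalize pressure error
        return self._obs(), reward, False, self.t >= self.horizon, {}


if __name__ == "__main__":
    # In the paper's scheme the surrogate would first be fitted to plant data;
    # here it is left untrained purely to keep the sketch self-contained.
    env = SurrogateOTSGEnv(OTSGSurrogate())
    agent = PPO("MlpPolicy", env, verbose=0)
    agent.learn(total_timesteps=2_000)           # pretraining on the surrogate
```

After pretraining on the surrogate, the policy would be transferred to the real or high-fidelity simulated OTSG, which is where the savings in training time claimed in the abstract would come from.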

