
A reinforcement learning approach to long-horizon operations, health, and maintenance supervisory control of advanced energy systems

Journal Article · Engineering Applications of Artificial Intelligence

In this work, we develop a Reinforcement Learning (RL) approach to the supervisory control problem for advanced energy systems, such as novel nuclear reactors and other demand-driven, mission-critical, and component-health-sensitive energy plants. The problem landscape considered captures the stochastic confluence of plant performance, component health evolution, power demand from the grid, diverse maintenance actions, and operator-defined goals and constraints, all considered over sufficiently long reasoning horizons. Key aspects of the proposed approach are a receding-horizon-control-inspired technique dictating time- or event-triggered supervisory policy (re-)constructions, together with additional capability-enabling contributions such as timescale compression, to handle long reasoning horizons and uncertainty in parts of the problem, and practical yet demonstrably effective handling of hybrid action spaces with continuous and discrete decision variables. The resulting algorithm consists of a simulation-based RL agent that constructs stochastic supervisory control policies over nontrivial action spaces and long horizons, applies the learned policy to the system for a much shorter interval, and then repeats the process to construct the next long-horizon policy. That next policy is again applied only for a short interval, yet events that were originally far in time move progressively closer, their associated uncertainty decreases, and new events and aspects enter the reasoning horizon. The proposed methodology bridges fundamental receding horizon concepts with the stronger and more scalable reasoning of contemporary RL. Numerical examples using Soft Actor-Critic Deep RL illustrate the operation and efficacy of the proposed technique for a power plant tasked with health-aware load-following missions in a dynamic electricity market landscape.
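The receding-horizon loop described above can be sketched in a few lines of Python. This is a minimal structural illustration only, assuming a toy plant surrogate and stub policy trainer; the class and function names (PlantSimulator, train_long_horizon_policy), the horizon lengths, and the action structure are hypothetical and not taken from the paper, and a Soft Actor-Critic learner would replace the stub trainer in a real implementation.

    # Sketch of the receding-horizon RL supervisory loop: train a long-horizon
    # policy in simulation, apply it for a much shorter interval, then re-plan.
    import random
    from dataclasses import dataclass

    @dataclass
    class PlantState:
        load_setpoint: float      # current electrical output fraction in [0, 1]
        component_health: float   # degradation index, 1.0 = fully healthy

    class PlantSimulator:
        """Toy surrogate of plant, market, and health dynamics (assumed, for illustration)."""
        def __init__(self, state: PlantState):
            self.state = state

        def step(self, setpoint: float, do_maintenance: bool) -> PlantState:
            # Continuous decision: track the commanded setpoint.
            # Discrete decision: maintenance restores health at the cost of lost output.
            health = 1.0 if do_maintenance else max(0.0, self.state.component_health - 0.01 * setpoint)
            output = 0.0 if do_maintenance else setpoint
            self.state = PlantState(load_setpoint=output, component_health=health)
            return self.state

    def train_long_horizon_policy(sim: PlantSimulator, horizon_steps: int):
        """Stub for the simulation-based RL stage (a SAC agent would go here).
        Returns a policy mapping state -> (continuous setpoint, discrete maintenance flag)."""
        def policy(state: PlantState):
            setpoint = min(1.0, max(0.2, random.gauss(0.8, 0.05)))  # placeholder continuous action
            do_maintenance = state.component_health < 0.3            # placeholder discrete action
            return setpoint, do_maintenance
        return policy

    LONG_HORIZON = 720   # assumed planning horizon, e.g. hours
    EXEC_INTERVAL = 24   # assumed execution interval before the next re-planning
    plant = PlantSimulator(PlantState(load_setpoint=0.8, component_health=1.0))

    for replan in range(10):
        policy = train_long_horizon_policy(plant, LONG_HORIZON)
        for _ in range(EXEC_INTERVAL):
            setpoint, do_maintenance = policy(plant.state)
            plant.step(setpoint, do_maintenance)
        print(f"re-plan {replan}: health={plant.state.component_health:.2f}, "
              f"output={plant.state.load_setpoint:.2f}")

The outer loop corresponds to the time- or event-triggered policy (re-)constructions, while the tuple returned by the policy illustrates one simple way to carry a hybrid continuous/discrete action; the paper's actual treatment of hybrid action spaces and timescale compression is more involved.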

Research Organization:
Idaho National Laboratory (INL), Idaho Falls, ID (United States)
Sponsoring Organization:
USDOE Office of Nuclear Energy (NE)
Grant/Contract Number:
AC07-05ID14517
OSTI ID:
1959166
Report Number(s):
INL/JOU-22-67790-Rev000
Journal Information:
Engineering Applications of Artificial Intelligence, Vol. 116; ISSN 0952-1976
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
