Optimizing thermodynamic trajectories using evolutionary reinforcement learning
- Univ. of Ontario Inst. of Technology, Oshawa, ON (Canada)
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Univ. of Ontario Inst. of Technology, Oshawa, ON (Canada); Univ. of Ottawa, Ottawa, ON (Canada); National Research Council of Canada, Ottawa, ON (Canada)
Using a model heat engine, we show that neural-network-based reinforcement learning can identify thermodynamic trajectories of maximal efficiency. We use an evolutionary learning algorithm to evolve a population of neural networks, subject to a directive to maximize the efficiency of a trajectory composed of a set of elementary thermodynamic processes; the resulting networks learn to carry out the maximally efficient Carnot, Stirling, or Otto cycles. Given additional irreversible processes, this evolutionary scheme learns a hitherto unknown thermodynamic cycle. Our results show how the reinforcement learning strategies developed for game playing can be applied to solve physical problems conditioned upon path-extensive order parameters.
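The evolutionary scheme described in the abstract (a population of neural networks, repeatedly ranked by a score and mutated) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the record: the network size, the Gaussian mutation, the keep-the-best selection rule, and in particular `toy_score`, which is a hypothetical stand-in objective and not a thermodynamic efficiency. In the setting the abstract describes, the score would instead be the efficiency of the trajectory the network produces.

```python
import numpy as np

def init_network(rng, n_in=1, n_hidden=4, n_out=1):
    """Random weights for a one-hidden-layer network (sizes are assumptions)."""
    return {
        "w1": rng.normal(0.0, 1.0, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "w2": rng.normal(0.0, 1.0, (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(net, x):
    """Map inputs of shape (n, n_in) to outputs of shape (n, n_out)."""
    hidden = np.tanh(x @ net["w1"] + net["b1"])
    return hidden @ net["w2"] + net["b2"]

def mutate(net, rng, sigma=0.1):
    """Return a copy of the network with Gaussian noise added to every weight."""
    return {k: v + rng.normal(0.0, sigma, v.shape) for k, v in net.items()}

def toy_score(net):
    """Hypothetical stand-in objective: negative mean squared error against a
    fixed target curve. This is NOT a thermodynamic efficiency."""
    t = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
    target = np.sin(2.0 * np.pi * t)
    return -float(np.mean((forward(net, t) - target) ** 2))

def evolve(score, generations=50, pop_size=20, n_keep=5, seed=0):
    """Mutation-only evolution: keep the n_keep best networks unchanged and
    refill the population with mutated copies of them."""
    rng = np.random.default_rng(seed)
    population = [init_network(rng) for _ in range(pop_size)]
    history = []  # best score found in each generation
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        parents = ranked[:n_keep]
        history.append(score(parents[0]))
        children = [
            mutate(parents[rng.integers(n_keep)], rng)
            for _ in range(pop_size - n_keep)
        ]
        population = parents + children
    best = max(population, key=score)
    return best, history

if __name__ == "__main__":
    best, history = evolve(toy_score)
    print("best score, first generation:", history[0])
    print("best score, last generation: ", history[-1])
```

Because the best networks are carried over unchanged, the best score in this sketch can never decrease from one generation to the next. That property comes from the selection rule assumed here, not from anything stated in the record.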
- Research Organization: Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Organization: USDOE Office of Science (SC)
- DOE Contract Number: AC02-05CH11231
- OSTI ID: 1601197
- Journal Information: arXiv.org Repository, Vol. 2019; ISSN 9999-0017
- Publisher: Cornell University
- Country of Publication: United States
- Language: English