Virtual to Physical: Reinforcement Learning to Optimize SNS Particle Accelerator Controls

Kasparian, Armen; Zhukov, Alexander P.; Cathey, Brandon; Elliott, Carrie; Colen, Jonathan; Rajput, Kishansingh; Schram, Malachi; Thompson, Trent; Blokland, Willem

doi:10.2172/2573756

Conference · Wed Apr 09 04:00:00 EDT 2025

DOI: https://doi.org/10.2172/2573756 · OSTI ID: 2573756

Kasparian, Armen ^[1]; Zhukov, Alexander P. ^[2]; Cathey, Brandon ^[3]; Elliott, Carrie ^[2]; Colen, Jonathan ^[4]; Rajput, Kishansingh ^[1]; Schram, Malachi ^[1]; Thompson, Trent ^[2]; Blokland, Willem ^[2]

[1] Thomas Jefferson National Accelerator Facility (TJNAF), Newport News, VA (United States)
[2] Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
[3] Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Spallation Neutron Source (SNS)
[4] Old Dominion Univ., Norfolk, VA (United States)

Complex accelerators need control systems that can handle dynamic, nonlinear environments. Traditional control methods are often poorly suited to this task because they struggle to adapt to such uncertainties. Reinforcement learning (RL) algorithms, which are adaptable and generalizable, are a natural fit. We present a reinforcement learning pipeline that can effectively handle the dynamics of a complex accelerator. We test and demonstrate the pipeline's capabilities on multiple environments, including the Spallation Neutron Source (SNS) and the Beam Test Facility (BTF) at Oak Ridge National Laboratory (ORNL). Because little time is available to train an online algorithm such as reinforcement learning on a real accelerator, we use a virtual twin accelerator (VIRAC) developed by ORNL to pretrain the policy and show that it converges in the virtual environment. We then test the adaptability of the pretrained RL model by applying it to the real accelerator and comparing the results. Using our Scientific Optimization and Controls Toolkit (SOCT) and open-source standards such as Gymnasium, we formulate and solve a Medium Energy Beam Transport (MEBT) orbit correction problem in the SNS and an emittance maximization problem in the BTF. We show how Twin Delayed Deep Deterministic Policy Gradient (TD3) can solve this optimization problem in the virtual accelerator and how the resulting policy can be transferred to the real accelerator for inference and model retraining. Finally, we show how reinforcement learning can serve as a control system for complex accelerators and provide a model pipeline showing how such an implementation performs and how it can be adapted to new accelerator control problems.
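
As a rough illustration of the workflow the abstract describes, the sketch below frames a toy orbit-correction task behind a Gymnasium-style `reset`/`step` interface and evaluates one fixed policy first on a "virtual" environment and then on a slightly different "real" one. Everything in it is an assumption made for illustration: `ToyOrbitEnv`, the 2x2 linear response matrices, the reward (negative RMS orbit), and the proportional controller that stands in for a trained TD3 actor are hypothetical and are not taken from SOCT, VIRAC, or the SNS MEBT model. The class only mirrors the Gymnasium five-tuple convention so that it runs with the standard library alone; a real environment would more likely subclass `gymnasium.Env`.

```python
import math
import random


class ToyOrbitEnv:
    """Toy orbit-correction task with a Gymnasium-style reset/step interface.

    Hypothetical stand-in: the linear response, limits, and reward are
    illustrative assumptions, not the SNS model or the SOCT API.
    """

    def __init__(self, response, max_kick=2.0, max_steps=20, tol=1e-3):
        self.response = response            # rows: BPMs, columns: correctors
        self.n_bpm = len(response)
        self.n_cor = len(response[0])
        self.max_kick = max_kick            # corrector strength limit
        self.max_steps = max_steps          # episode length cap
        self.tol = tol                      # RMS orbit counted as "corrected"
        self.rng = random.Random()

    def _orbit(self):
        # BPM reading = uncorrected offset + linear response to corrector kicks.
        return [
            self.offset[i]
            + sum(self.response[i][j] * self.kicks[j] for j in range(self.n_cor))
            for i in range(self.n_bpm)
        ]

    def reset(self, seed=None):
        if seed is not None:
            self.rng.seed(seed)
        self.offset = [self.rng.uniform(-1.0, 1.0) for _ in range(self.n_bpm)]
        self.kicks = [0.0] * self.n_cor
        self.steps = 0
        return self._orbit(), {}

    def step(self, action):
        # The action is a change in each corrector setting, clipped to limits.
        for j in range(self.n_cor):
            kick = self.kicks[j] + action[j]
            self.kicks[j] = max(-self.max_kick, min(self.max_kick, kick))
        self.steps += 1
        orbit = self._orbit()
        rms = math.sqrt(sum(x * x for x in orbit) / self.n_bpm)
        terminated = rms < self.tol
        truncated = self.steps >= self.max_steps
        return orbit, -rms, terminated, truncated, {"rms": rms}


def proportional_policy(obs, gain=0.5):
    """Fixed feedback rule standing in for a trained actor (not TD3)."""
    return [-gain * x for x in obs]


def rollout(env, policy, seed=0):
    """Run one episode and return (total reward, final RMS orbit)."""
    obs, info = env.reset(seed=seed)
    total = 0.0
    while True:
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total += reward
        if terminated or truncated:
            return total, info["rms"]


# Same policy, two environments: a "virtual twin" and a perturbed "real" machine.
virtual_env = ToyOrbitEnv([[1.0, 0.2], [0.1, 1.0]])
real_env = ToyOrbitEnv([[1.1, 0.25], [0.05, 0.9]])
for name, env in (("virtual", virtual_env), ("real", real_env)):
    ret, rms = rollout(env, proportional_policy, seed=1)
    print(f"{name}: return={ret:.4f}, final rms={rms:.2e}")
```

Keeping the policy behind a plain `policy(obs)` callable means the same rollout code could evaluate an actor pretrained in simulation and then the same actor on hardware, which is the comparison the abstract reports. The retraining step on the real machine is not shown here.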

Research Organization: Thomas Jefferson National Accelerator Facility (TJNAF)

Sponsoring Organization: USDOE Office of Science (SC), Nuclear Physics (NP)

DOE Contract Number: AC05-06OR23177

OSTI ID: 2573756

Report Number(s): DOE/OR/23177-7935; JLAB-CST-25-4290

Country of Publication: United States

Language: English
