Streaming Data from Experimental Facilities to Supercomputers for Real-Time Data Processing
In this paper, we demonstrate direct data streaming from instruments and detectors at a large-scale experimental facility to a supercomputer for real-time data processing and feedback. Streaming data to supercomputers opens the door to novel scientific applications and workflow models, including real-time feedback from very large datasets during an experiment and the integration of real-time machine learning (ML) training and inference at scale. We report a successful demonstration of real-time processing of data from the Advanced Photon Source (APS) on the Polaris supercomputer using an EPICS-based streaming framework. We describe the capabilities of the streaming framework and outline the architecture that allows us to process experimentally derived data on a supercomputer without file-based data transfers. We present throughput measurements indicating that the system can sustain the facility's expected data production rates, and we discuss outstanding challenges and future directions.
- Research Organization: Argonne National Laboratory (ANL), Argonne, IL (United States)
- Sponsoring Organization: USDOE Office of Science (SC), Basic Energy Sciences (BES)
- DOE Contract Number: AC02-06CH11357
- OSTI ID: 2246622
- Resource Relation: Conference: 5th Annual Workshop on Large-scale Experiment-in-the-Loop Computing (11/12/23, Denver, CO, US), in conjunction with SC23: The International Conference for High Performance Computing, Networking, Storage and Analysis (Denver, CO, US, 11/12/23-11/17/23)
- Country of Publication: United States
- Language: English