U.S. Department of Energy
Office of Scientific and Technical Information

Real‐time XFEL data analysis at SLAC and NERSC: A trial run of nascent exascale experimental data analysis

Journal Article · Concurrency and Computation. Practice and Experience
DOI: https://doi.org/10.1002/cpe.8019 · OSTI ID: 2322459
 [1];  [2];  [2];  [2];  [2];  [2];  [3];  [3];  [4];  [4]
  1. Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC); SLAC
  2. Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
  3. SLAC National Accelerator Laboratory (SLAC), Menlo Park, CA (United States)
  4. Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)
X‐ray scattering experiments using free electron lasers (XFELs) are a powerful tool to determine the molecular structure and function of unknown samples (such as COVID‐19 viral proteins). XFEL experiments are a challenge to computing in two ways: (i) due to the high cost of running XFELs, a fast turnaround time from data acquisition to data analysis is essential to make informed decisions on experimental protocols; (ii) data‐collection rates are growing exponentially, requiring new scalable algorithms. Here we report our experiences analyzing data from two experiments at the Linac Coherent Light Source (LCLS) during September 2020. Raw data were analyzed on NERSC's Cori XC40 system, using the Superfacility paradigm: our workflow automatically moves raw data between LCLS and NERSC, where they are analyzed using the software package CCTBX. We achieved real‐time data analysis with a turnaround time from data acquisition to full molecular reconstruction in as little as 10 min, sufficient time for the experiment's operators to make informed decisions. By hosting the data analysis on Cori, and by automating LCLS‐NERSC interoperability, we achieved a data analysis rate that matches the data acquisition rate. Completing data analysis within 10 min is a first for XFEL experiments and an important milestone if we are to keep up with data‐collection trends.
Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States); SLAC National Accelerator Laboratory (SLAC), Menlo Park, CA (United States)
Sponsoring Organization:
Exascale Computing Project; National Institutes of Health (NIH); USDOE National Nuclear Security Administration (NNSA); USDOE Office of Science (SC); USDOE Office of Science (SC), Basic Energy Sciences (BES). Scientific User Facilities (SUF)
Grant/Contract Number:
AC02-05CH11231; AC02-76SF00515
OSTI ID:
2322459
Alternate ID(s):
OSTI ID: 2332965
Journal Information:
Concurrency and Computation. Practice and Experience, Vol. 36, Issue 12; ISSN 1532-0626
Publisher:
Wiley
Country of Publication:
United States
Language:
English

