Efficient Data Management in Neutron Scattering Data Reduction Workflows at ORNL
- ORNL
Oak Ridge National Laboratory (ORNL) experimental neutron science facilities produce 1.2 TB a day of raw event-based data, which is stored using the standard metadata-rich NeXus schema built on top of the HDF5 file format. The performance of several data reduction workflows is largely determined by the time spent in the loading and processing algorithms of Mantid, an open-source data analysis framework used across several neutron science facilities around the world. The present work introduces new data management algorithms to address identified input/output (I/O) bottlenecks in Mantid. First, we introduce an in-memory binary-tree metadata index that resembles NeXus data access patterns to provide a scalable search and extraction mechanism. Second, data encapsulation in Mantid algorithms is optimally redesigned to reduce the total compute and memory runtime footprint associated with metadata I/O reconstruction tasks. Results from this work show speedups in wall-clock time on ORNL data reduction workflows, ranging from 11% to 30% depending on the complexity of the targeted instrument-specific data. Nevertheless, we highlight the need for more research to address reduction challenges as experimental data volumes increase.
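The idea of an in-memory metadata index can be illustrated with a minimal Python sketch. It is not Mantid's implementation: the `MetadataIndex` class, its methods, and the sample entry paths are all hypothetical, and a sorted list searched by bisection stands in for the binary tree mentioned in the abstract. The assumption is that entry paths and their NeXus class names are collected once, in a single pass over the file, so that later lookups no longer touch the HDF5 file.

```python
from bisect import bisect_left
from collections import defaultdict


class MetadataIndex:
    """Hypothetical ordered index: NeXus class name -> sorted entry paths."""

    def __init__(self):
        self._paths_by_class = defaultdict(list)

    def add(self, nx_class, path):
        """Insert a path under its class, keeping the list sorted and unique."""
        paths = self._paths_by_class[nx_class]
        pos = bisect_left(paths, path)
        if pos == len(paths) or paths[pos] != path:
            paths.insert(pos, path)

    def find(self, nx_class, prefix=""):
        """Return all paths of a class that start with the given prefix."""
        paths = self._paths_by_class.get(nx_class, [])
        # In a sorted list, all strings sharing a prefix are contiguous,
        # so bisection finds the first candidate in O(log n).
        start = bisect_left(paths, prefix)
        matches = []
        for path in paths[start:]:
            if not path.startswith(prefix):
                break
            matches.append(path)
        return matches


# Illustrative entries only; real files would supply these during one scan.
sample_entries = [
    ("/entry", "NXentry"),
    ("/entry/bank2_events", "NXevent_data"),
    ("/entry/bank1_events", "NXevent_data"),
    ("/entry/DASlogs/proton_charge", "NXlog"),
    ("/entry/DASlogs/frequency", "NXlog"),
    ("/entry/instrument", "NXinstrument"),
]

index = MetadataIndex()
for entry_path, entry_class in sample_entries:
    index.add(entry_class, entry_path)

event_banks = index.find("NXevent_data")
proton_logs = index.find("NXlog", "/entry/DASlogs/p")
```

Grouping paths by class mirrors a common access pattern in which a loader asks for every entry of one kind (for example, all event banks) rather than walking the file hierarchy again for each request.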
- Research Organization:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization:
- USDOE; USDOE Office of Science (SC), Basic Energy Sciences (BES) (SC-22)
- DOE Contract Number:
- AC05-00OR22725
- OSTI ID:
- 1772865
- Country of Publication:
- United States
- Language:
- English
Similar Records
Efficient loading of reduced data ensembles produced at ORNL SNS/HFIR neutron time-of-flight facilities
Conference · Tue Nov 30 23:00:00 EST 2021 · OSTI ID: 1841479
DaYu: Optimizing Distributed Scientific Workflows by Decoding Dataflow Semantics and Dynamics
Conference · Wed Nov 06 23:00:00 EST 2024 · OSTI ID: 2562111
Performance Improvements on SNS and HFIR Instrument Data Reduction Workflows Using Mantid
Conference · Mon Nov 30 23:00:00 EST 2020 · OSTI ID: 1755304