OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Computational reproducibility of scientific workflows at extreme scales

Journal Article · International Journal of High Performance Computing Applications
[1]; [2]; [3]; [1]; [3]; [3]; [4]; [1]
  1. Brookhaven National Lab. (BNL), Upton, NY (United States)
  2. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
  3. Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
  4. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

We propose an approach for improved reproducibility that includes capturing and relating provenance characteristics and performance metrics. We discuss two use cases: scientific reproducibility of results in the Energy Exascale Earth System Model (E3SM – previously ACME), and performance reproducibility in molecular dynamics workflows on HPC platforms. To capture and persist the provenance and performance data of these workflows, we have designed and developed the Chimbuko and ProvEn frameworks. Chimbuko captures provenance and enables detailed performance analysis of a single workflow. ProvEn is a hybrid, queryable system for storing and analyzing the provenance and performance metrics of multiple runs in workflow performance analysis campaigns. Workflow provenance and performance data output from Chimbuko can be explored in a dynamic, multi-level visualization that provides an overview and zoom-in capabilities for areas of interest. Provenance and related performance data ingested into ProvEn are queryable and can be used to reproduce runs. In conclusion, our provenance-based approach highlights challenges in extracting information and gaps in the information collected. It is agnostic to the type of provenance data it captures, so both the reproducibility of scientific results and that of performance can be explored with our tools.
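The idea of relating provenance characteristics to performance metrics across multiple runs can be illustrated with a small sketch. This is not the Chimbuko or ProvEn API; the `RunRecord` schema, the `RunStore` class, and the attribute names (`code_version`, `compiler`, `nodes`, `wall_time_s`) are all hypothetical, chosen only to show how runs with matching provenance might be queried and their performance variability summarized.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class RunRecord:
    """One workflow run (hypothetical schema, not a ProvEn data model)."""
    run_id: str
    provenance: tuple  # sorted (key, value) pairs describing how the run was produced
    metrics: tuple     # sorted (name, value) pairs measured during the run


def make_record(run_id, provenance, metrics):
    """Build an immutable record from plain dictionaries."""
    return RunRecord(run_id,
                     tuple(sorted(provenance.items())),
                     tuple(sorted(metrics.items())))


class RunStore:
    """In-memory stand-in for a queryable provenance store (illustrative only)."""

    def __init__(self):
        self._runs = []

    def ingest(self, record):
        self._runs.append(record)

    def query(self, **provenance_filter):
        """Return runs whose provenance matches every given key/value pair."""
        wanted = set(provenance_filter.items())
        return [r for r in self._runs if wanted <= set(r.provenance)]

    def metric_spread(self, metric, **provenance_filter):
        """Summarize one metric over all runs matching the provenance filter."""
        values = [dict(r.metrics)[metric]
                  for r in self.query(**provenance_filter)
                  if metric in dict(r.metrics)]
        if not values:
            return None
        m = mean(values)
        return {"n": len(values),
                "mean": m,
                "rel_std": pstdev(values) / m if m else 0.0}


# Made-up example data: two runs share a configuration, one differs.
store = RunStore()
store.ingest(make_record("run-1",
                         {"code_version": "v1.0", "compiler": "gcc-7", "nodes": 64},
                         {"wall_time_s": 100.0}))
store.ingest(make_record("run-2",
                         {"code_version": "v1.0", "compiler": "gcc-7", "nodes": 64},
                         {"wall_time_s": 110.0}))
store.ingest(make_record("run-3",
                         {"code_version": "v1.0", "compiler": "gcc-8", "nodes": 64},
                         {"wall_time_s": 95.0}))

# Performance reproducibility question: how much does wall time vary
# among runs that share the same recorded provenance?
same_config = store.metric_spread("wall_time_s", compiler="gcc-7", nodes=64)
```

Keeping provenance as plain key/value pairs mirrors the abstract's point that the approach is agnostic to the type of provenance captured: the same query could filter on an input-data hash for scientific reproducibility or on a node count for performance reproducibility.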

Research Organization:
Brookhaven National Laboratory (BNL), Upton, NY (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (SC-21)
Grant/Contract Number:
SC0012704
OSTI ID:
1542776
Report Number(s):
BNL-211854-2019-JAAM
Journal Information:
International Journal of High Performance Computing Applications, Vol. 33, Issue 5; ISSN 1094-3420
Publisher:
SAGE
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 6 works
Citation information provided by
Web of Science

Similar Records

Computational Reproducibility of Scientific Workflows at Extreme Scales
Journal Article · September 1, 2019 · International Journal of High Performance Computing Applications · OSTI ID:1542776

Capturing provenance as a diagnostic tool for workflow performance evaluation and optimization
Conference · August 6, 2017 · OSTI ID:1542776

Capturing provenance as a diagnostic tool for workflow performance evaluation and optimization
Conference · August 30, 2017 · Proceedings · OSTI ID:1542776