Computational reproducibility of scientific workflows at extreme scales
- Brookhaven National Lab. (BNL), Upton, NY (United States)
- Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
We propose an approach for improved reproducibility that includes capturing and relating provenance characteristics and performance metrics. We discuss two use cases: scientific reproducibility of results in the Energy Exascale Earth System Model (E3SM – previously ACME), and performance reproducibility in molecular dynamics workflows on HPC platforms. To capture and persist the provenance and performance data of these workflows, we have designed and developed the Chimbuko and ProvEn frameworks. Chimbuko captures provenance and enables detailed performance analysis of a single workflow. ProvEn is a hybrid, queryable system for storing and analyzing the provenance and performance metrics of multiple runs in workflow performance analysis campaigns. Workflow provenance and performance data output from Chimbuko can be explored in a dynamic, multi-level visualization that provides overview and zoom-in capabilities for areas of interest. Provenance and related performance data ingested into ProvEn can be queried and used to reproduce runs. In conclusion, our provenance-based approach highlights challenges in extracting information and gaps in the information collected. The approach is agnostic to the type of provenance data it captures, so both the reproducibility of scientific results and that of performance can be explored with our tools.
- Research Organization:
- Brookhaven National Laboratory (BNL), Upton, NY (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Advanced Scientific Computing Research (SC-21)
- Grant/Contract Number:
- SC0012704
- OSTI ID:
- 1542776
- Report Number(s):
- BNL-211854-2019-JAAM
- Journal Information:
- International Journal of High Performance Computing Applications, Vol. 33, Issue 5; ISSN 1094-3420
- Publisher:
- SAGE
- Country of Publication:
- United States
- Language:
- English
Similar Records
Capturing provenance as a diagnostic tool for workflow performance evaluation and optimization