U.S. Department of Energy
Office of Scientific and Technical Information

Science Capsule: Towards Sharing and Reproducibility of Scientific Workflows

Conference · Workshop on Workflows in Support of Large-Scale Science.
Workflows increasingly process large volumes of data from scientific instruments, experiments, and sensors. These workflows often consist of complex data processing and analysis steps that draw on a diverse ecosystem of tools and frequently involve human-in-the-loop steps. Sharing and reproducing these workflows with collaborators and the larger community is critical, but it is hard to do without the entire context of the workflow, including user notes and the execution environment. In this paper, we describe Science Capsule, a framework to capture, share, and reproduce scientific workflows. Science Capsule captures, manages, and represents both the computational and human elements of a workflow. It automatically captures and processes events associated with the execution and data life cycle of workflows, and lets users add other types and forms of scientific artifacts. Science Capsule also allows users to create "workflow snapshots" that keep track of the different versions of a workflow and their lineage, allowing scientists to incrementally share and extend workflows with other users. Our results show that Science Capsule can process and organize events in near real-time for high-throughput experimental and data analysis workflows without incurring significant performance overhead.
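The snapshot-and-lineage idea in the abstract can be illustrated with a minimal sketch. This is not Science Capsule's actual API or data model, which this record does not describe; the `Snapshot` and `SnapshotStore` names, their fields, and the event strings are all hypothetical. The sketch assumes only that each snapshot records the events and user notes captured so far and points to the snapshot it was derived from, so that lineage can be recovered by following parent links.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class Snapshot:
    """Hypothetical record of one workflow version (illustrative only)."""
    snapshot_id: str
    parent_id: Optional[str] = None                   # snapshot this one extends
    events: List[str] = field(default_factory=list)   # captured execution/data events
    notes: List[str] = field(default_factory=list)    # human-added artifacts


class SnapshotStore:
    """Hypothetical in-memory store that tracks snapshots and their lineage."""

    def __init__(self) -> None:
        self._snapshots: Dict[str, Snapshot] = {}

    def create(
        self,
        snapshot_id: str,
        events: Iterable[str],
        notes: Iterable[str] = (),
        parent_id: Optional[str] = None,
    ) -> Snapshot:
        # Reject duplicate ids and dangling parents so lineage stays a valid chain.
        if snapshot_id in self._snapshots:
            raise ValueError(f"snapshot {snapshot_id!r} already exists")
        if parent_id is not None and parent_id not in self._snapshots:
            raise KeyError(f"unknown parent snapshot {parent_id!r}")
        snap = Snapshot(snapshot_id, parent_id, list(events), list(notes))
        self._snapshots[snapshot_id] = snap
        return snap

    def lineage(self, snapshot_id: str) -> List[str]:
        """Return snapshot ids from the oldest ancestor down to snapshot_id."""
        chain: List[str] = []
        current: Optional[str] = snapshot_id
        while current is not None:
            snap = self._snapshots[current]
            chain.append(snap.snapshot_id)
            current = snap.parent_id
        return list(reversed(chain))


# Usage: a second snapshot extends the first and adds a user note.
store = SnapshotStore()
store.create("v1", events=["file_created: raw.h5"])
store.create(
    "v2",
    events=["process_started: analyze.py"],
    notes=["tuned threshold"],
    parent_id="v1",
)
lineage = store.lineage("v2")
```

Storing only a parent pointer keeps each snapshot small and lets a collaborator extend a shared snapshot without copying its history; a real system would also need persistent storage and a way to package the execution environment, which this sketch leaves out.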
Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
DOE Contract Number:
AC02-05CH11231
OSTI ID:
1833998
Conference Information:
Journal Name: Workshop on Workflows in Support of Large-Scale Science. Journal Volume: 2021
Country of Publication:
United States
Language:
English
