ExaWorks: Workflows for Exascale
Exascale computers will offer transformative capabilities to combine data-driven and learning-based approaches with traditional simulation applications to accelerate scientific discovery and insight. These software combinations and integrations, however, are difficult to achieve because heterogeneous software components are hard to coordinate and deploy on diverse and massive platforms. We present the ExaWorks project, which can address many of these challenges: ExaWorks is leading a co-design process to create a workflow Software Development Toolkit (SDK) consisting of a wide range of workflow management tools that can be composed and that interoperate through common interfaces. We describe the initial set of tools and interfaces supported by the SDK, efforts to make them easier to apply to complex science challenges, and examples of their application to exemplar cases. Furthermore, we discuss how our project is working with the workflows community, large computing facilities, and HPC platform vendors to sustainably address the requirements of workflows at the exascale.
- Research Organization: Brookhaven National Laboratory (BNL), Upton, NY (United States)
- Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (SC-21)
- DOE Contract Number: SC0012704
- OSTI ID: 1880770
- Report Number(s): BNL-223247-2022-CPPJ
- Resource Relation: Conference: WORKS21: 16th Workshop on Workflows in Support of Large-Scale Science held in conjunction with SC21: The International Conference for High Performance Computing, Networking, Storage and Analysis, St. Louis, MO, 11/15/2021 - 11/15/2021
- Country of Publication: United States
- Language: English