U.S. Department of Energy
Office of Scientific and Technical Information

Towards a Standard Process Management Infrastructure for Workflows Using Python

Conference

Orchestrating the execution of ensembles of processes lies at the core of scientific workflow engines on large-scale parallel platforms. This is usually handled using platform-specific command-line tools, with limited process management control and potential strain on system resources. The PMIx standard provides a uniform interface to system resources. The low-level C implementation of PMIx has hampered its use in workflow engines, leading to the development of Python bindings that have yet to gain traction. In this paper, we present our work to harden the PMIx Python client, demonstrating its usability with a prototype Python driver that orchestrates the execution of an ensemble of processes. We present experimental results using the prototype on the Summit supercomputer at Oak Ridge National Laboratory. This work lays the foundation for wider adoption of PMIx in workflow engines, and encourages broader support for PMIx functionality in vendor-provided system software stacks.

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1969826
Resource Relation:
Journal Volume: 13798; Conference: The 23rd International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT’22), Sendai, Japan, December 7-9, 2022
Country of Publication:
United States
Language:
English
