Towards a Standard Process Management Infrastructure for Workflows Using Python
- ORNL
Orchestrating the execution of ensembles of processes lies at the core of scientific workflow engines on large-scale parallel platforms. This orchestration is usually handled using platform-specific command-line tools, which offer limited process management control and can strain system resources. The PMIx standard provides a uniform interface to system resources. However, the low-level C implementation of PMIx has hampered its use in workflow engines, leading to the development of Python bindings that have yet to gain traction. In this paper, we present our work to harden the PMIx Python client, and we demonstrate its usability with a prototype Python driver that orchestrates the execution of an ensemble of processes. We present experimental results obtained with the prototype on the Summit supercomputer at Oak Ridge National Laboratory. This work lays the foundation for wider adoption of PMIx in workflow engines and encourages broader support for PMIx functionality in vendor-provided system software stacks.
- Research Organization:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
- DOE Contract Number:
- AC05-00OR22725
- OSTI ID:
- 1969826
- Resource Relation:
- Journal Volume: 13798; Conference: The 23rd International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT’22) - Sendai, Japan - 12/7/2022-12/9/2022
- Country of Publication:
- United States
- Language:
- English
Similar Records
RADICAL-Pilot and PMIx/PRRTE: Executing Heterogeneous Workloads at Large Scale on Partitioned HPC Resources