MPI Session: External Network Transport Implementation (V.1.0)
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
- Univ. of Tennessee, Chattanooga, TN (United States)
The MPI Sessions extensions to the MPI standard have been accepted by the MPI Forum and will be included in the upcoming MPI 4.0 version of the standard. MPI Sessions has the potential to address several limitations of MPI’s current specification: MPI cannot be initialized within an MPI process by different application components without a priori knowledge or coordination; MPI cannot be initialized more than once; and MPI cannot be reinitialized after MPI finalization. MPI Sessions also offers individual components of an application more flexible ways to express the capabilities they require from MPI, at a finer granularity than is presently possible. A prototype of MPI Sessions, based on the Open MPI implementation of the MPI standard, was developed to facilitate acceptance of the Sessions proposal by the Forum. The initial implementation had some limitations, one of the more significant being a limited ability to fully exploit modern network APIs, such as OFI libfabric and OpenUCX, and the underlying network hardware. This report presents enhancements to the prototype implementation of MPI Sessions that remove this restriction for the networks to be used in the next generation of DOE exascale systems. Open MPI was used as the implementation vehicle, but the results here are also relevant to other middleware stacks.
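The Sessions initialization path described above can be sketched in C against the MPI 4.0 API. This is a minimal illustration, not code from the report: it assumes an MPI 4.0-capable implementation (e.g., a recent Open MPI), uses the standard built-in process set "mpi://WORLD", and the string tag "example.component-a" is an arbitrary placeholder chosen for this sketch.

```c
/* Minimal sketch of the MPI Sessions model (MPI 4.0): each application
 * component creates its own session instead of calling MPI_Init, so
 * components need no a priori coordination and MPI use is not tied to
 * a single process-wide initialization. */
#include <mpi.h>
#include <stdio.h>

int main(void)
{
    MPI_Session session = MPI_SESSION_NULL;
    MPI_Group group = MPI_GROUP_NULL;
    MPI_Comm comm = MPI_COMM_NULL;

    /* Create a session local to this component; MPI_Init is never called. */
    MPI_Session_init(MPI_INFO_NULL, MPI_ERRORS_ARE_FATAL, &session);

    /* Derive an MPI group from the built-in "mpi://WORLD" process set. */
    MPI_Group_from_session_pset(session, "mpi://WORLD", &group);

    /* Build a communicator from the group; the string tag must match on
     * all processes participating in this creation. */
    MPI_Comm_create_from_group(group, "example.component-a",
                               MPI_INFO_NULL, MPI_ERRORS_ARE_FATAL, &comm);

    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    printf("rank %d of %d via the Sessions path\n", rank, size);

    MPI_Comm_free(&comm);
    MPI_Group_free(&group);
    MPI_Session_finalize(&session);
    return 0;
}
```

Because communicators are built from session-derived groups rather than from MPI_COMM_WORLD, it is this per-component path that the network transport layer (OFI libfabric, UCX) must support without a global initialization step.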
- Research Organization:
- Los Alamos National Laboratory (LANL), Los Alamos, NM (United States); Univ. of Tennessee, Chattanooga, TN (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC); USDOE National Nuclear Security Administration (NNSA); National Science Foundation (NSF)
- DOE Contract Number:
- 89233218CNA000001; CCF-1822191; CCF-1821431; CCF-1562306
- OSTI ID:
- 1669081
- Report Number(s):
- LA-UR-20-27636; STPR17/OMPIX/17-37
- Country of Publication:
- United States
- Language:
- English