Title: Open MPI Update - Using New Network APIS

Authors: Pritchard, Howard Porter (Los Alamos National Laboratory)
Publication Date: 2016-04-15
Research Org.: Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
Resource Relation: Conference: Open Fabrics Alliance Workshop; 2016-04-04 - 2016-04-04; Monterey, California, United States
Country of Publication: United States
Subject: Computer Science; MPI; libfabric; UCX; InfiniBand; iWARP; RoCE

Citation Formats

Pritchard, Howard Porter. Open MPI Update - Using New Network APIS. United States: N. p., 2016. Web.
Pritchard, Howard Porter. Open MPI Update - Using New Network APIS. United States.
Pritchard, Howard Porter. 2016. "Open MPI Update - Using New Network APIS". United States.
title = {Open MPI Update - Using New Network APIS},
author = {Pritchard, Howard Porter},
abstractNote = {},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2016},
month = {4}


  • High Performance Computing (HPC) systems are rapidly growing in size and complexity. As a result, transient and persistent network failures can occur on the time scale of application run times, reducing the productive utilization of these systems. The ubiquitous network protocol used to deal with such failures is TCP/IP; however, available implementations of this protocol provide unacceptable performance for HPC system users and do not provide the high-bandwidth, low-latency communications of modern interconnects. This paper describes methods used to provide protection against several network errors, such as dropped packets, corrupt packets, and loss of network interfaces, while maintaining high-performance communications. Micro-benchmark experiments using vendor-supplied TCP/IP and O/S bypass low-level communications stacks over InfiniBand and Myrinet are used to demonstrate the high-performance characteristics of our protocol. The NAS Parallel Benchmarks are used to demonstrate the scalability and the minimal performance impact of this protocol. The micro-benchmarks show that providing higher data reliability decreases performance by up to 30% relative to unprotected communications, but provides performance improvements of a factor of four over TCP/IP running over InfiniBand DDR. The NAS Parallel Benchmarks show virtually no impact of the data reliability protocol on overall run-time.
  • InfiniBand (IB) is a popular network technology for modern high-performance computing systems. MPI implementations traditionally support IB using a reliable, connection-oriented (RC) transport. However, per-process resource usage that grows linearly with the number of processes makes this approach prohibitive for large-scale systems. IB provides an alternative in the form of a connectionless unreliable datagram (UD) transport, which allows for near-constant resource usage and initialization overhead as the process count increases. This paper describes a UD-based implementation for IB in Open MPI as a scalable alternative to existing RC-based schemes. We use the software reliability capabilities of Open MPI to provide the guaranteed delivery semantics required by MPI. Results show that UD not only requires fewer resources at scale, but also allows for shorter MPI startup times. A connectionless model also improves performance for applications that tend to send small messages to many different processes.