
Preparing MPICH for exascale

Journal Article · International Journal of High Performance Computing Applications
Author affiliations:
  1. Argonne National Laboratory (ANL), Argonne, IL (United States)
  2. Argonne National Laboratory (ANL), Argonne, IL (United States); Meta, Palo Alto, CA (United States)
  3. Argonne National Laboratory (ANL), Argonne, IL (United States); Cerebras Systems, Sunnyvale, CA (United States)
  4. Argonne National Laboratory (ANL), Argonne, IL (United States); Klaytn Foundation (Singapore)
  5. Argonne National Laboratory (ANL), Argonne, IL (United States); NVIDIA Corporation, Santa Clara, CA (United States)
  6. Argonne National Laboratory (ANL), Argonne, IL (United States); FernUniversität in Hagen (Germany)
  7. NVIDIA Corporation, Santa Clara, CA (United States); Univ. of California, Irvine, CA (United States)
  8. NVIDIA Corporation, Santa Clara, CA (United States); Univ. of California, Riverside, CA (United States)
  9. Intel Corporation, Santa Clara, CA (United States)
  10. Meta, Palo Alto, CA (United States); Intel Corporation, Santa Clara, CA (United States)
  11. Intel Corporation, Santa Clara, CA (United States); Microsoft Corporation, Redmond, WA (United States)
  12. NVIDIA Corporation, Santa Clara, CA (United States); Intel Corporation, Santa Clara, CA (United States)
  13. Intel Corporation, Santa Clara, CA (United States); Fastly, San Francisco, CA (United States)
  14. Hewlett Packard Enterprise, Palo Alto, CA (United States)
  15. Univ. of Illinois at Urbana-Champaign, IL (United States)
The advent of exascale supercomputers heralds a new era of scientific discovery, yet it also introduces significant architectural challenges that MPI applications must overcome to exploit these systems fully. Among these challenges is the adoption of heterogeneous architectures, particularly the integration of GPUs to accelerate computation. The complexity of multithreaded programming models has likewise become a critical factor in achieving performance at scale. Efficient use of the hardware-accelerated communication provided by modern NICs is also essential for achieving low-latency, high-throughput communication on such complex systems. In response to these challenges, the MPICH library, a high-performance and widely used Message Passing Interface (MPI) implementation, has undergone significant enhancements. This paper presents four major contributions that prepare MPICH for the exascale transition. First, we describe a lightweight communication stack that leverages the advanced features of modern NICs to maximize hardware acceleration. Second, we showcase a highly scalable multithreaded communication model that addresses the complexities of concurrent environments. Third, we introduce GPU-aware communication capabilities that optimize data movement on GPU-integrated systems. Finally, we present a new datatype engine that accelerates the use of MPI derived datatypes on GPUs. These improvements to the MPICH library not only address the immediate needs of exascale computing architectures but also lay a foundation for exploiting future innovations in high-performance computing. By adopting these new designs, MPICH-derived libraries from HPE Cray and Intel achieved exascale performance on OLCF Frontier and ALCF Aurora, respectively.
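
To make the abstract's themes concrete, the following is a minimal illustrative sketch, not taken from the paper: it requests full thread support, builds a strided MPI derived datatype, and passes a GPU-resident buffer directly to point-to-point calls, as a GPU-aware MPICH build permits. The buffer size and strided layout are hypothetical and chosen only for illustration; the sketch assumes an MPICH build with CUDA-based GPU support and at least two ranks.

    /* Illustrative sketch (not from the paper): GPU-aware point-to-point
     * communication with a derived datatype. Assumes a GPU-aware MPICH build
     * with CUDA support; buffer size and strided layout are hypothetical. */
    #include <mpi.h>
    #include <cuda_runtime.h>

    int main(int argc, char **argv)
    {
        int provided, rank;

        /* Request full thread support, as a multithreaded communication
         * model requires. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Allocate the message buffer in GPU memory; a GPU-aware MPI accepts
         * this device pointer directly, with no manual host staging copy. */
        double *dbuf = NULL;
        cudaMalloc((void **)&dbuf, (size_t)1 << 20);   /* 1 MiB = 131072 doubles */

        /* Derived datatype: every 4th double of the buffer, i.e. a strided
         * layout that a GPU datatype engine can pack/unpack on the device. */
        MPI_Datatype strided;
        MPI_Type_vector(32768, 1, 4, MPI_DOUBLE, &strided);
        MPI_Type_commit(&strided);

        if (rank == 0)
            MPI_Send(dbuf, 1, strided, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(dbuf, 1, strided, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        MPI_Type_free(&strided);
        cudaFree(dbuf);
        MPI_Finalize();
        return 0;
    }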
Research Organization:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
USDOE; USDOE National Nuclear Security Administration (NNSA); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); USDOE Office of Science (SC), Basic Energy Sciences (BES), Scientific User Facilities (SUF)
Grant/Contract Number:
AC02-06CH11357; AC05-00OR22725
OSTI ID:
2506860
Alternate ID(s):
OSTI ID: 3005770
Journal Information:
International Journal of High Performance Computing Applications, Vol. 39, Issue 2; ISSN 1094-3420; ISSN 1741-2846
Publisher:
SAGE
Country of Publication:
United States
Language:
English

Similar Records

Designing and prototyping extensions to the Message Passing Interface in MPICH
Journal Article · August 2024 · International Journal of High Performance Computing Applications · OSTI ID: 2571429

MPICH-G2: a grid-enabled implementation of the message passing interface
Journal Article · May 2003 · J. Parallel Distrib. Comput. · OSTI ID: 949654
