Preparing MPICH for exascale
Journal Article · International Journal of High Performance Computing Applications
- Argonne National Laboratory (ANL), Argonne, IL (United States)
- Argonne National Laboratory (ANL), Argonne, IL (United States); Meta, Palo Alto, CA (United States)
- Argonne National Laboratory (ANL), Argonne, IL (United States); Cerebras Systems, Sunnyvale, CA (United States)
- Argonne National Laboratory (ANL), Argonne, IL (United States); Klaytn Foundation (Singapore)
- Argonne National Laboratory (ANL), Argonne, IL (United States); NVIDIA Corporation, Santa Clara, CA (United States)
- Argonne National Laboratory (ANL), Argonne, IL (United States); FernUniversität in Hagen (Germany)
- NVIDIA Corporation, Santa Clara, CA (United States); Univ. of California, Irvine, CA (United States)
- NVIDIA Corporation, Santa Clara, CA (United States); Univ. of California, Riverside, CA (United States)
- Intel Corporation, Santa Clara, CA (United States)
- Meta, Palo Alto, CA (United States); Intel Corporation, Santa Clara, CA (United States)
- Intel Corporation, Santa Clara, CA (United States); Microsoft Corporation, Redmond, WA (United States)
- NVIDIA Corporation, Santa Clara, CA (United States); Intel Corporation, Santa Clara, CA (United States)
- Intel Corporation, Santa Clara, CA (United States); Fastly, San Francisco, CA (United States)
- Hewlett Packard Enterprise, Palo Alto, CA (United States)
- Univ. of Illinois at Urbana-Champaign, IL (United States)
The advent of exascale supercomputers heralds a new era of scientific discovery, yet it introduces significant architectural challenges that must be overcome for MPI applications to fully exploit their potential. Among these challenges is the adoption of heterogeneous architectures, particularly the integration of GPUs to accelerate computation. The complexity of multithreaded programming models has also become a critical factor in achieving performance at scale. In addition, efficient use of the hardware acceleration for communication provided by modern NICs is essential for achieving low-latency, high-throughput communication in such complex systems. In response to these challenges, the MPICH library, a high-performance and widely used Message Passing Interface (MPI) implementation, has undergone significant enhancements. This paper presents four major contributions that prepare MPICH for the exascale transition. First, we describe a lightweight communication stack that leverages the advanced features of modern NICs to maximize hardware acceleration. Second, we present a highly scalable multithreaded communication model that addresses the complexities of concurrent environments. Third, we introduce GPU-aware communication capabilities that optimize data movement in GPU-integrated systems. Finally, we present a new datatype engine aimed at accelerating the use of MPI derived datatypes on GPUs. These improvements in the MPICH library not only address the immediate needs of exascale computing architectures but also lay a foundation for exploiting future innovations in high-performance computing. By embracing these new designs and approaches, MPICH-derived libraries from HPE Cray and Intel were able to achieve real exascale performance on OLCF Frontier and ALCF Aurora, respectively.
- Research Organization:
- Argonne National Laboratory (ANL), Argonne, IL (United States)
- Sponsoring Organization:
- USDOE; USDOE National Nuclear Security Administration (NNSA); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); USDOE Office of Science (SC), Basic Energy Sciences (BES). Scientific User Facilities (SUF)
- Grant/Contract Number:
- AC02-06CH11357; AC05-00OR22725
- OSTI ID:
- 2506860
- Alternate ID(s):
- OSTI ID: 3005770
- Journal Information:
- Journal Name: International Journal of High Performance Computing Applications; Journal Volume: 39; Journal Issue: 2; ISSN 1094-3420; ISSN 1741-2846
- Publisher:
- SAGE
- Country of Publication:
- United States
- Language:
- English
Similar Records
Designing and prototyping extensions to the Message Passing Interface in MPICH
Journal Article · Sun Aug 18 20:00:00 EDT 2024 · International Journal of High Performance Computing Applications · OSTI ID:2571429
MPICH-G2: a grid-enabled implementation of the message passing interface.
Journal Article · Thu May 01 00:00:00 EDT 2003 · J. Parallel Distrib. Comput. · OSTI ID:949654