Optimizing Communication in 2D Grid-Based MPI Applications at Exascale
- ORNL
The new reality of exascale computing faces many challenges in achieving optimal performance on large numbers of nodes. A key challenge is the efficient utilization of the message-passing interface (MPI), a critical component for process communication. This paper explores communication optimization strategies to harness the GPU-accelerated architectures of these supercomputers. We focus on MPI applications where processors form a two-dimensional process grid, a common arrangement in applications involving dense matrix operations. This configuration offers a unique opportunity to implement innovative strategies to improve performance and maintain effective load distribution. We study two applications— Dist-FW (Apsp:all-pair-shortest-path) and HPL-MxP (LU factorization with Mixed precision)—on two accelerated systems: Summit (IBM Power and NVIDIA V100) and Frontier (AMD EPYC and MI250X). These supercomputers are operated by the Oak Ridge Leadership Computing Facility (OLCF) and are currently ranked #1 and #5 on the Top500 list. We show how to scale up both applications to exascale levels and tackle the MPI challenges related to implementation, synchronization, and performance. We also compare the performance of several communication strategies at an unprecedented scale. Accurately predicting application performance becomes crucial for cost reduction as the computational scale grows. To address this, we suggest a hyperbolic model as a better alternative to the traditional one-sided asymptotic model for predicting future application performance at such large scales.
- Research Organization:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization:
- USDOE; USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
- DOE Contract Number:
- AC05-00OR22725
- OSTI ID:
- 2438993
- Country of Publication:
- United States
- Language:
- English
Similar Records
Preparing MPICH for exascale
Ready for the Frontier: Preparing Applications for the World’s First Exascale System
Journal Article
·
Wed Jan 08 19:00:00 EST 2025
· International Journal of High Performance Computing Applications
·
OSTI ID:2506860
Ready for the Frontier: Preparing Applications for the World’s First Exascale System
Conference
·
Mon May 01 00:00:00 EDT 2023
·
OSTI ID:1975345