MPI on millions of cores

BALAJI, PAVAN; BUNTINAS, DARIUS; GOODELL, DAVID; GROPP, WILLIAM; HOEFLER, TORSTEN; KUMAR, SAMEER; LUSK, EWING; THAKUR, RAJEEV; TRÄFF, JESPER LARSSON

doi:10.1142/S0129626411000060

Journal Article · March 1, 2011 · Parallel Processing Letters

DOI: https://doi.org/10.1142/S0129626411000060 · OSTI ID: 1395013

Petascale parallel computers with more than a million processing cores are expected to be available in a couple of years. Although MPI is the dominant programming interface today for large-scale systems, which at the highest end already have close to 300,000 processors, a challenging question for both researchers and users is whether MPI will scale to processor and core counts in the millions. In this paper, we examine the scalability of MPI to very large systems. We first examine the MPI specification itself and discuss areas with scalability concerns and how they can be overcome. We then investigate issues that an MPI implementation must address in order to be scalable. To illustrate the issues, we ran a number of simple experiments to measure MPI memory consumption at scales of up to 131,072 processes, or 80% of the IBM Blue Gene/P system at Argonne National Laboratory. Based on the results, we identified nonscalable aspects of the MPI implementation and found ways to tune it to reduce its memory footprint. We also briefly discuss issues in application scalability to large process counts and features of MPI that enable the use of other techniques to alleviate scalability limitations in applications.

Research Organization: Argonne National Laboratory (ANL)

Sponsoring Organization: USDOE Office of Science - Office of Advanced Scientific Computing Research; National Science Foundation (NSF)

DOE Contract Number: AC02-06CH11357

OSTI ID: 1395013

Journal Information: Parallel Processing Letters, Vol. 21, Issue 01; ISSN 0129-6264

Publisher: World Scientific

Country of Publication: United States

Language: English

Related Subjects

97 MATHEMATICS AND COMPUTING
MPI
scalability
