U.S. Department of Energy
Office of Scientific and Technical Information

Improved MPI collectives for MPI processes in shared address spaces

Journal Article · Cluster Computing
As the number of cores per node keeps increasing, it becomes increasingly important for MPI to leverage shared memory for intranode communication. This paper investigates the design and optimization of MPI collectives for clusters of NUMA nodes. We develop performance models for collective communication using shared memory, and we demonstrate several algorithms for various collectives. Experiments are conducted on both Xeon X5650 and Opteron 6100 InfiniBand clusters. The measurements agree with the model and indicate that different algorithms dominate for short vectors and for long vectors. We compare our shared-memory allreduce with several MPI implementations (Open MPI, MPICH2, and MVAPICH2) that utilize system shared memory to facilitate interprocess communication. On a 16-node Xeon cluster and an 8-node Opteron cluster, our implementation achieves a geometric-mean speedup of 2.3X and 2.1X, respectively, over the best of these MPI implementations. Our techniques enable efficient implementations of collective operations on future multi- and manycore systems.
Research Organization:
Argonne National Laboratory (ANL)
Sponsoring Organization:
USDOE Office of Science
DOE Contract Number:
AC02-06CH11357
OSTI ID:
1392899
Journal Information:
Cluster Computing, Vol. 17, Issue 4; ISSN 1386-7857
Country of Publication:
United States
Language:
English


Similar Records

Optimizing Blocking and Nonblocking Reduction Operations for Multicore Systems: Hierarchical Design and Implementation
Conference · 2012 · OSTI ID: 1095156

Design and evaluation of Nemesis, a scalable, low-latency, message-passing communication subsystem.
Technical Report · 2005 · OSTI ID: 881588

Hot-Spot Avoidance With Multi-Pathing Over Infiniband: An MPI Perspective
Conference · 2007 · OSTI ID: 908380