U.S. Department of Energy
Office of Scientific and Technical Information

Optimizing point‐to‐point communication between adaptive MPI endpoints in shared memory

Journal Article · Concurrency and Computation: Practice and Experience
DOI: https://doi.org/10.1002/cpe.4467 · OSTI ID: 1582085
Authors: [1]; [1]
  1. Department of Computer Science, University of Illinois at Urbana-Champaign, IL 61801-2302, USA
Summary

Adaptive MPI (AMPI) is an implementation of the MPI standard that virtualizes ranks as user-level threads rather than OS processes. In this work, we optimize AMPI's communication performance based on the locality of the communicating endpoints within a cluster of SMP nodes. We differentiate between point-to-point messages whose endpoints are co-located on the same execution unit and those whose endpoints reside in the same process but on different execution units. We demonstrate how the messaging semantics of Charm++ both enable and constrain AMPI's implementation, and we motivate extensions to Charm++ that address these limitations. Using the OSU micro-benchmark suite, we show that our locality-aware design offers lower latency, higher bandwidth, and a reduced memory footprint for applications.
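
As a rough illustration of the locality-aware design described in the summary, the C++ sketch below dispatches a point-to-point send along one of three paths depending on whether the two endpoints share an execution unit, share only an OS process, or sit in different processes. All names here (peOf, processOf, the deliver* stubs) are hypothetical stand-ins for illustration, not AMPI or Charm++ APIs, and the placement arithmetic is an assumed example layout.

```cpp
// Hypothetical locality-aware dispatch for point-to-point sends between
// virtualized ranks (a sketch, not the paper's implementation).
#include <cstddef>
#include <cstdio>

enum class Locality { SamePE, SameProcess, Remote };

// Placement lookups; a real runtime would consult its location tables.
// Purely for illustration, assume 4 ranks per execution unit (PE) and
// 16 ranks per OS process.
int peOf(int rank)      { return rank / 4; }
int processOf(int rank) { return rank / 16; }

Locality classify(int src, int dst) {
  if (peOf(src) == peOf(dst))           return Locality::SamePE;      // same execution unit
  if (processOf(src) == processOf(dst)) return Locality::SameProcess; // same process, different PE
  return Locality::Remote;                                            // different process or node
}

// Stub delivery paths. A real implementation would, respectively, hand the
// buffer to the co-scheduled receiver directly, perform a single copy within
// the shared address space, or fall back to the network layer.
void deliverWithinPE(int dst, const void*, std::size_t len)      { std::printf("same-PE delivery to %d (%zu bytes)\n", dst, len); }
void deliverWithinProcess(int dst, const void*, std::size_t len) { std::printf("intra-process delivery to %d (%zu bytes)\n", dst, len); }
void deliverRemote(int dst, const void*, std::size_t len)        { std::printf("remote delivery to %d (%zu bytes)\n", dst, len); }

void send(int src, int dst, const void* buf, std::size_t len) {
  switch (classify(src, dst)) {
    case Locality::SamePE:      deliverWithinPE(dst, buf, len);      break;
    case Locality::SameProcess: deliverWithinProcess(dst, buf, len); break;
    case Locality::Remote:      deliverRemote(dst, buf, len);        break;
  }
}

int main() {
  char msg[64] = {};
  send(0, 1, msg, sizeof msg);   // endpoints on the same PE
  send(0, 5, msg, sizeof msg);   // same process, different PE
  send(0, 40, msg, sizeof msg);  // different process
}
```

The intent of such a dispatch is that both intra-node cases bypass the network path entirely, and the same-PE case can additionally avoid inter-thread synchronization because both endpoints are driven by the same execution unit; this is the kind of distinction the locality-aware design exploits.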

Sponsoring Organization:
USDOE
Grant/Contract Number:
NA0002374
OSTI ID:
1582085
Journal Information:
Concurrency and Computation: Practice and Experience, Vol. 32, Issue 3; ISSN 1532-0626
Publisher:
Wiley Blackwell (John Wiley & Sons)
Country of Publication:
United Kingdom
Language:
English

