Optimizing point‐to‐point communication between adaptive MPI endpoints in shared memory
- Department of Computer Science, University of Illinois at Urbana-Champaign, IL 61801-2302, USA
Adaptive MPI (AMPI) is an implementation of the MPI standard that supports the virtualization of ranks as user-level threads rather than OS processes. In this work, we optimize AMPI's communication performance based on the locality of the communicating endpoints within a cluster of SMP nodes. We differentiate between point-to-point messages whose endpoints are co-located on the same execution unit and point-to-point messages whose endpoints reside in the same process but on different execution units. We demonstrate how the messaging semantics of Charm++ both enable and hinder AMPI's implementation in different ways, and we motivate extensions to Charm++ that address the limitations. Using the OSU micro-benchmark suite, we show that our locality-aware design offers lower latency, higher bandwidth, and a reduced memory footprint for applications.
- Sponsoring Organization:
- USDOE
- Grant/Contract Number:
- NA0002374
- OSTI ID:
- 1582085
- Journal Information:
- Concurrency and Computation: Practice and Experience, Vol. 32, Issue 3; ISSN 1532-0626
- Publisher:
- Wiley Blackwell (John Wiley & Sons)
- Country of Publication:
- United Kingdom
- Language:
- English