OSTI.GOV | U.S. Department of Energy
Office of Scientific and Technical Information

Title: Hardware MPI message matching: Insights into MPI matching behavior to inform design

Abstract

This paper explores key differences in MPI match-list behavior across several important United States Department of Energy (DOE) applications and proxy applications. Understanding this behavior is critical to choosing the most promising hardware matching design for a given high-speed network. We present the results of MPI match-list studies for the two major open-source MPI implementations, MPICH and Open MPI, using a modified version of the LogGOPSim MPI simulator that reports match-list statistics. These results are discussed in the context of several potential design approaches to MPI matching-capable hardware, and the data illustrate the performance and memory-capacity requirements that different hardware designs impose. Finally, this paper's contributions are the collection and analysis of data that help inform hardware designers of common MPI requirements, and a demonstration of the difficulty of determining these requirements by examining only a single MPI implementation.
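To make the measured quantity concrete, the sketch below is a hypothetical, minimal model (not code from the paper) of the posted-receive match list that MPI implementations search linearly on arrival of each message. The `ANY_SOURCE`/`ANY_TAG` names mirror the MPI wildcards `MPI_ANY_SOURCE`/`MPI_ANY_TAG`; the recorded search depth is the kind of match-list statistic the authors instrumented LogGOPSim to collect.

```python
# Hypothetical sketch of MPI posted-receive matching (illustration only).
# An incoming message is compared against posted receives in FIFO order;
# the traversal depth is the statistic that match-list studies measure.

ANY_SOURCE = -1
ANY_TAG = -1

class MatchList:
    def __init__(self):
        self.posted = []          # posted receives as (source, tag), FIFO order
        self.max_depth = 0        # deepest linear search seen so far

    def post_recv(self, source, tag):
        self.posted.append((source, tag))

    def match(self, source, tag):
        """Match an incoming (source, tag) message against posted receives."""
        for depth, (src, tg) in enumerate(self.posted, start=1):
            if src in (source, ANY_SOURCE) and tg in (tag, ANY_TAG):
                self.max_depth = max(self.max_depth, depth)
                del self.posted[depth - 1]
                return depth      # entries traversed before the match
        return None               # unexpected message (goes to a second queue)

ml = MatchList()
for rank in range(4):
    ml.post_recv(rank, tag=7)     # receives posted for ranks 0..3
ml.post_recv(ANY_SOURCE, ANY_TAG) # wildcard receive at the tail

print(ml.match(3, 7))             # traverses 4 entries before matching rank 3
print(ml.match(9, 1))             # only the tail wildcard matches
print(ml.max_depth)
```

Long match lists make this linear search expensive in software, which is why the memory capacity and search rate of hardware matching engines, the subject of this paper, depend directly on the list-length distributions observed in real applications.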

Authors:
 Ferreira, Kurt [1]; Grant, Ryan E. [1]; Levenhagen, Michael J. [1]; Levy, Scott [1]; Groves, Taylor [2]
  1. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
  2. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Publication Date:
February 2019
Research Org.:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1501630
Report Number(s):
SAND-2019-0943J
Journal ID: ISSN 1532-0626; 671923
Grant/Contract Number:
AC04-94AL85000; NA0003525; AC02-05CH11231
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
Concurrency and Computation: Practice and Experience
Additional Journal Information:
Journal Volume: 32; Journal Issue: 3; Journal ID: ISSN 1532-0626
Publisher:
Wiley
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; hardware matching; MPI; MPI matching

Citation Formats

Ferreira, Kurt, Grant, Ryan E., Levenhagen, Michael J., Levy, Scott, and Groves, Taylor. Hardware MPI message matching: Insights into MPI matching behavior to inform design. United States: N. p., 2019. Web. doi:10.1002/cpe.5150.
Ferreira, Kurt, Grant, Ryan E., Levenhagen, Michael J., Levy, Scott, & Groves, Taylor. Hardware MPI message matching: Insights into MPI matching behavior to inform design. United States. doi:10.1002/cpe.5150.
Ferreira, Kurt, Grant, Ryan E., Levenhagen, Michael J., Levy, Scott, and Groves, Taylor. 2019. "Hardware MPI message matching: Insights into MPI matching behavior to inform design". United States. doi:10.1002/cpe.5150. https://www.osti.gov/servlets/purl/1501630.
@article{osti_1501630,
title = {Hardware MPI message matching: Insights into MPI matching behavior to inform design},
author = {Ferreira, Kurt and Grant, Ryan E. and Levenhagen, Michael J. and Levy, Scott and Groves, Taylor},
abstractNote = {Here, this paper explores key differences of MPI match lists for several important United States Department of Energy (DOE) applications and proxy applications. This understanding is critical in determining the most promising hardware matching design for any given high-speed network. The results of MPI match list studies for the major open-source MPI implementations, MPICH and Open MPI, are presented, and we modify an MPI simulator, LogGOPSim, to provide match list statistics. These results are discussed in the context of several different potential design approaches to MPI matching–capable hardware. The data illustrate the requirements for different hardware designs in terms of performance and memory capacity. Finally, this paper's contributions are the collection and analysis of data to help inform hardware designers of common MPI requirements and highlight the difficulties in determining these requirements by only examining a single MPI implementation.},
doi = {10.1002/cpe.5150},
journal = {Concurrency and Computation: Practice and Experience},
issn = {1532-0626},
number = 3,
volume = 32,
place = {United States},
year = {2019},
month = {2}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 1 work
Citation information provided by
Web of Science


Works referenced in this record:

An architecture to perform NIC based MPI matching
conference, September 2007

  • Hemmert, K. Scott; Underwood, Keith D.; Rodrigues, Arun
  • 2007 IEEE International Conference on Cluster Computing (CLUSTER)
  • DOI: 10.1109/CLUSTR.2007.4629234

A Dedicated Message Matching Mechanism for Collective Communications
conference, January 2018

  • Ghazimirsaeed, S. Mahdieh; Grant, Ryan E.; Afsahi, Ahmad
  • Proceedings of the 47th International Conference on Parallel Processing Companion - ICPP '18
  • DOI: 10.1145/3229710.3229712

Improving MPI Multi-threaded RMA Communication Performance
conference, January 2018

  • Hjelm, Nathan; Dosanjh, Matthew G. F.; Grant, Ryan E.
  • Proceedings of the 47th International Conference on Parallel Processing - ICPP 2018
  • DOI: 10.1145/3225058.3225114

The Case for Semi-Permanent Cache Occupancy: Understanding the Impact of Data Locality on Network Processing
conference, January 2018

  • Dosanjh, Matthew G. F.; Ghazimirsaeed, S. Mahdieh; Grant, Ryan E.
  • Proceedings of the 47th International Conference on Parallel Processing - ICPP 2018
  • DOI: 10.1145/3225058.3225130

Early Experiences Co-Scheduling Work and Communication Tasks for Hybrid MPI+X Applications
conference, November 2014

  • Stark, Dylan T.; Barrett, Richard F.; Grant, Ryan E.
  • 2014 Workshop on Exascale MPI at Supercomputing Conference (ExaMPI)
  • DOI: 10.1109/ExaMPI.2014.6

An evaluation of MPI message rate on hybrid-core processors
journal, November 2014

  • Barrett, Brian W.; Brightwell, Ron; Grant, Ryan
  • The International Journal of High Performance Computing Applications, Vol. 28, Issue 4
  • DOI: 10.1177/1094342014552085

Re-evaluating Network Onload vs. Offload for the Many-Core Era
conference, September 2015

  • Dosanjh, Matthew G. F.; Grant, Ryan E.; Bridges, Patrick G.
  • 2015 IEEE International Conference on Cluster Computing (CLUSTER)
  • DOI: 10.1109/CLUSTER.2015.55

Myrinet: a gigabit-per-second local area network
journal, January 1995

  • Boden, N. J.; Cohen, D.; Felderman, R. E.
  • IEEE Micro, Vol. 15, Issue 1
  • DOI: 10.1109/40.342015

Performance of particle in cell methods on highly concurrent computational architectures
journal, July 2007


Eliminating contention bottlenecks in multithreaded MPI
journal, November 2017


Enabling communication concurrency through flexible MPI endpoints
journal, September 2014

  • Dinan, James; Grant, Ryan E.; Balaji, Pavan
  • The International Journal of High Performance Computing Applications, Vol. 28, Issue 4
  • DOI: 10.1177/1094342014548772

Understanding Performance Interference in Next-Generation HPC Systems
conference, November 2016

  • Mondragon, Oscar H.; Bridges, Patrick G.; Levy, Scott
  • SC16: International Conference for High Performance Computing, Networking, Storage and Analysis
  • DOI: 10.1109/SC.2016.32

LogGOPSim: simulating large-scale applications in the LogGOPS model
conference, January 2010

  • Hoefler, Torsten; Schneider, Timo; Lumsdaine, Andrew
  • Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing - HPDC '10
  • DOI: 10.1145/1851476.1851564

Fast Parallel Algorithms for Short-Range Molecular Dynamics
journal, March 1995


Characterizing MPI matching via trace-based simulation
journal, September 2018


A high-performance, portable implementation of the MPI message passing interface standard
journal, September 1996


Why is MPI so slow?: analyzing the fundamental limits in implementing MPI-3.1
conference, January 2017

  • Raffenetti, Ken; Blocksome, Michael; Si, Min
  • Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis on - SC '17
  • DOI: 10.1145/3126908.3126963

Characterizing the Influence of System Noise on Large-Scale Applications by Simulation
conference, November 2010

  • Hoefler, Torsten; Schneider, Timo; Lumsdaine, Andrew
  • SC '10: 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
  • DOI: 10.1109/SC.2010.12

The BXI Interconnect Architecture
conference, August 2015

  • Derradji, Said; Palfer-Sollier, Thibaut; Panziera, Jean-Pierre
  • 2015 IEEE 23rd Annual Symposium on High-Performance Interconnects (HOTI)
  • DOI: 10.1109/HOTI.2015.15

Instrumentation and Analysis of MPI Queue Times on the SeaStar High-Performance Network
conference, August 2008

  • Brightwell, R.; Pedretti, K.; Ferreira, K.
  • 2008 Proceedings of the 17th International Conference on Computer Communications and Networks (ICCCN)
  • DOI: 10.1109/ICCCN.2008.ECP.116

Preparing for exascale: modeling MPI for many-core systems using fine-grain queues
conference, January 2015

  • Bridges, Patrick G.; Dosanjh, Matthew G. F.; Grant, Ryan
  • Proceedings of the 3rd Workshop on Exascale MPI - ExaMPI '15
  • DOI: 10.1145/2831129.2831134

The impact of MPI queue usage on message latency
conference, January 2004

  • Underwood, K. D.; Brightwell, R.
  • International Conference on Parallel Processing, 2004. ICPP 2004.
  • DOI: 10.1109/ICPP.2004.1327915

SeaStar Interconnect: Balanced Bandwidth for Scalable Performance
journal, May 2006

  • Brightwell, R.; Pedretti, K. T.; Underwood, K. D.
  • IEEE Micro, Vol. 26, Issue 3
  • DOI: 10.1109/MM.2006.65

How I Learned to Stop Worrying and Love In Situ Analytics: Leveraging Latent Synchronization in MPI Collective Algorithms
conference, January 2016

  • Levy, Scott; Ferreira, Kurt B.; Widener, Patrick
  • Proceedings of the 23rd European MPI Users' Group Meeting on - EuroMPI 2016
  • DOI: 10.1145/2966884.2966920

The Quadrics network: high-performance clustering technology
journal, January 2002


A fast and resource-conscious MPI message queue mechanism for large-scale jobs
journal, January 2014


Protocols for Fully Offloaded Collective Operations on Accelerated Network Adapters
conference, October 2013

  • Schneider, Timo; Hoefler, Torsten; Grant, Ryan E.
  • 2013 42nd International Conference on Parallel Processing (ICPP)
  • DOI: 10.1109/ICPP.2013.73

Toward an evolutionary task parallel integrated MPI + X programming model
conference, January 2015

  • Barrett, Richard F.; Stark, Dylan T.; Vaughan, Courtenay T.
  • Proceedings of the Sixth International Workshop on Programming Models and Applications for Multicores and Manycores - PMAM '15
  • DOI: 10.1145/2712386.2712388