Title: Characterizing MPI matching via trace-based simulation

Abstract

With the increased scale expected on future leadership-class systems, detailed information about the resource usage and performance of MPI message matching provides important insights into how to maintain application performance on next-generation systems. However, obtaining MPI message matching performance data is often not possible without significant effort. A common approach is to instrument an MPI implementation to collect relevant statistics. While this approach can provide important data, collecting matching data at runtime perturbs the application's execution, including its matching performance, and is highly dependent on the MPI library's matchlist implementation. In this paper, we introduce a trace-based simulation approach to obtain detailed MPI message matching performance data for MPI applications without perturbing their execution. Using a number of key parallel workloads, we demonstrate that this simulator approach can rapidly and accurately characterize matching behavior. Specifically, we use our simulator to collect several important statistics about the operation of the MPI posted and unexpected queues. For example, we present data about search lengths and the duration that messages spend in the queues waiting to be matched. Here, data gathered using this simulation-based approach have significant potential to aid hardware designers in determining resource allocation for MPI matching functions and provide application and middleware developers with insight into the scalability issues associated with MPI message matching.
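
The kind of statistics the abstract mentions (search lengths and the time entries spend in the posted and unexpected queues) can be illustrated with a minimal sketch. It assumes a simple list-based matching model for a single receiving process and a single communicator. The `Event` record, the `simulate` function, the `ANY` wildcard constant, and the example trace are all hypothetical names and data chosen for illustration; they are not taken from the authors' simulator or from any MPI library.

```python
from collections import namedtuple

ANY = -1  # hypothetical stand-in for the MPI_ANY_SOURCE / MPI_ANY_TAG wildcards

# kind is "post" (a receive is posted) or "arrive" (a message arrives)
Event = namedtuple("Event", "time kind source tag")


def _matches(recv, msg):
    """A receive matches a message if source and tag agree, or the receive uses a wildcard."""
    return recv.source in (ANY, msg.source) and recv.tag in (ANY, msg.tag)


def _search(queue, predicate):
    """Linear search from the head; returns (index or None, entries examined)."""
    for i, entry in enumerate(queue):
        if predicate(entry):
            return i, i + 1
    return None, len(queue)


def simulate(trace):
    """Replay a trace and record search lengths and queue residence times."""
    posted, unexpected = [], []
    stats = {
        "posted_search": [],      # entries examined per arriving message
        "unexpected_search": [],  # entries examined per posted receive
        "posted_wait": [],        # time a receive sat in the posted queue
        "unexpected_wait": [],    # time a message sat in the unexpected queue
    }
    for ev in sorted(trace, key=lambda e: e.time):  # stable sort keeps trace order on ties
        if ev.kind == "arrive":
            idx, examined = _search(posted, lambda r: _matches(r, ev))
            stats["posted_search"].append(examined)
            if idx is None:
                unexpected.append(ev)  # no receive posted yet: message is unexpected
            else:
                stats["posted_wait"].append(ev.time - posted.pop(idx).time)
        elif ev.kind == "post":
            idx, examined = _search(unexpected, lambda m: _matches(ev, m))
            stats["unexpected_search"].append(examined)
            if idx is None:
                posted.append(ev)  # nothing has arrived yet: receive waits in posted queue
            else:
                stats["unexpected_wait"].append(ev.time - unexpected.pop(idx).time)
        else:
            raise ValueError(f"unknown event kind: {ev.kind!r}")
    return stats


# Made-up trace: two posted receives, one matching arrival, one unexpected arrival.
example_trace = [
    Event(0.0, "post", 1, 7),
    Event(1.0, "post", ANY, 9),
    Event(2.0, "arrive", 2, 9),
    Event(3.0, "arrive", 3, 5),
    Event(4.5, "post", 3, 5),
]
stats = simulate(example_trace)
print(stats)
```

Because the replay runs offline on recorded events, it does not perturb the application that produced the trace. A real matching engine may use hashed or partitioned match structures rather than a single linear list, so search lengths from this sketch reflect only the assumed list-based model.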

Authors:
 Ferreira, Kurt Brian [1]; Levy, Scott Larson Nicoll [1]; Pedretti, Kevin [1]; Grant, Ryan E. [1]
  1. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Publication Date:
Research Org.:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1444084
Alternate Identifier(s):
OSTI ID: 1457519
Report Number(s):
SAND-2018-5449J; SAND-2018-6407J
Journal ID: ISSN 0167-8191; 663297
Grant/Contract Number:
AC04-94AL85000
Resource Type:
Accepted Manuscript
Journal Name:
Parallel Computing
Additional Journal Information:
Journal Name: Parallel Computing; Journal ID: ISSN 0167-8191
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; message passing; MPI; simulation; message matching

Citation Formats

Ferreira, Kurt Brian, Levy, Scott Larson Nicoll, Pedretti, Kevin, and Grant, Ryan E. Characterizing MPI matching via trace-based simulation. United States: N. p., 2017. Web. doi:10.1145/3127024.3127040.
Ferreira, Kurt Brian, Levy, Scott Larson Nicoll, Pedretti, Kevin, & Grant, Ryan E. Characterizing MPI matching via trace-based simulation. United States. doi:10.1145/3127024.3127040.
Ferreira, Kurt Brian, Levy, Scott Larson Nicoll, Pedretti, Kevin, and Grant, Ryan E. 2017. "Characterizing MPI matching via trace-based simulation". United States. doi:10.1145/3127024.3127040. https://www.osti.gov/servlets/purl/1444084.
@article{osti_1444084,
title = {Characterizing MPI matching via trace-based simulation},
author = {Ferreira, Kurt Brian and Levy, Scott Larson Nicoll and Pedretti, Kevin and Grant, Ryan E.},
abstractNote = {With the increased scale expected on future leadership-class systems, detailed information about the resource usage and performance of MPI message matching provides important insights into how to maintain application performance on next-generation systems. However, obtaining MPI message matching performance data is often not possible without significant effort. A common approach is to instrument an MPI implementation to collect relevant statistics. While this approach can provide important data, collecting matching data at runtime perturbs the application's execution, including its matching performance, and is highly dependent on the MPI library's matchlist implementation. In this paper, we introduce a trace-based simulation approach to obtain detailed MPI message matching performance data for MPI applications without perturbing their execution. Using a number of key parallel workloads, we demonstrate that this simulator approach can rapidly and accurately characterize matching behavior. Specifically, we use our simulator to collect several important statistics about the operation of the MPI posted and unexpected queues. For example, we present data about search lengths and the duration that messages spend in the queues waiting to be matched. Here, data gathered using this simulation-based approach have significant potential to aid hardware designers in determining resource allocation for MPI matching functions and provide application and middleware developers with insight into the scalability issues associated with MPI message matching.},
doi = {10.1145/3127024.3127040},
journal = {Parallel Computing},
number = {},
volume = {},
place = {United States},
year = {2017},
month = {9}
}
