OSTI.GOV · U.S. Department of Energy
Office of Scientific and Technical Information

Title: Evaluating MPI resource usage summary statistics

Journal Article · Parallel Computing

The Message Passing Interface (MPI) remains the dominant programming model for scientific applications running on today's high-performance computing (HPC) systems. This dominance stems from MPI's powerful semantics for inter-process communication, which have enabled scientists to write applications that simulate important physical phenomena. MPI does not, however, specify how messages and synchronization should be carried out; those details typically depend on low-level architecture details and the message characteristics of the application. Therefore, analyzing an application's MPI resource usage is critical to tuning MPI's performance on a particular platform. The result of this analysis is typically a discussion of the mean message sizes, queue search lengths, and message arrival times for a workload or set of workloads. While the arithmetic mean might be the most intuitive summary statistic for MPI resource usage, it is not always the most accurate representation of the underlying data. In this paper, we analyze MPI resource usage for a number of key MPI workloads using an existing MPI trace collector and discrete-event simulator. Our analysis demonstrates that the average, while easy and efficient to calculate, is a useful metric for characterizing latency and bandwidth measurements, but may not be a good representation of application message sizes, match list search depths, or MPI inter-operation times. Additionally, we show that the median and mode are superior choices in many cases. We also observe that the arithmetic mean is not the best representation of central tendency for data drawn from multi-modal or heavy-tailed distributions. Finally, the results and analysis of our work provide valuable guidance on how we, as a community, should discuss and analyze MPI resource usage data for scientific applications.
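The abstract's central point — that the arithmetic mean can misrepresent multi-modal data such as MPI message sizes — can be illustrated with a minimal sketch. The sample below is hypothetical (not data from the paper): a bimodal mix of many small control messages and a few large bulk transfers, a shape the abstract says arises in real MPI workloads.

```python
import statistics

# Hypothetical bimodal "message size" sample, in bytes: 90 small 8-byte
# control messages plus 10 large 1 MiB bulk transfers. The specific values
# are illustrative assumptions, not measurements from the paper.
sizes = [8] * 90 + [1_048_576] * 10

mean = statistics.fmean(sizes)     # pulled far from both clusters
median = statistics.median(sizes)  # lands on the dominant small-message mode
mode = statistics.mode(sizes)      # the most frequent size

print(f"mean={mean:.1f}  median={median}  mode={mode}")
# The mean (~102 KiB) describes no message that actually occurs, while the
# median and mode both report the typical 8-byte message.
```

This is exactly the failure mode the paper highlights: for distributions that are multi-modal or heavy-tailed, the median and mode track the "typical" observation far better than the mean.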

Research Organization:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA)
Grant/Contract Number:
NA0003525
OSTI ID:
1822241
Alternate ID(s):
OSTI ID: 1868722
Report Number(s):
SAND-2021-10260J; 699637
Journal Information:
Parallel Computing, Vol. 108; ISSN 0167-8191
Publisher:
Elsevier
Country of Publication:
United States
Language:
English

References (7)

CTH: A Software Family for Multi-Dimensional Shock Physics Analysis book January 1995
Communication Requirements and Interconnect Optimization for High-End Scientific Applications journal February 2010
Towards Performance Portability in a Compressible CFD Code conference June 2017
Characterizing MPI matching via trace-based simulation journal September 2018
Fast Parallel Algorithms for Short-Range Molecular Dynamics journal March 1995
A large-scale study of MPI usage in open-source HPC applications conference November 2019
  • Laguna, Ignacio; Marshall, Ryan; Mohror, Kathryn
  • SC '19: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. https://doi.org/10.1145/3295500.3356176
CTH: A three-dimensional shock wave physics code journal January 1990

Similar Records

Characterizing MPI matching via trace-based simulation
Journal Article · September 2017 · Parallel Computing

Using Simulation to Examine the Effect of MPI Message Matching Costs on Application Performance
Journal Article · February 2019 · Parallel Computing

On the memory attribution problem: A solution and case study using MPI
Journal Article · February 2019 · Concurrency and Computation: Practice and Experience