Title: Characterizing MPI matching via trace-based simulation

With the increased scale expected on future leadership-class systems, detailed information about the resource usage and performance of MPI message matching provides important insights into how to maintain application performance on next-generation systems. However, obtaining MPI message matching performance data is often not possible without significant effort. A common approach is to instrument an MPI implementation to collect relevant statistics. While this approach can provide important data, collecting matching data at runtime perturbs the application's execution, including its matching performance, and is highly dependent on the MPI library's matchlist implementation. In this paper, we introduce a trace-based simulation approach to obtain detailed MPI message matching performance data for MPI applications without perturbing their execution. Using a number of key parallel workloads, we demonstrate that this simulator approach can rapidly and accurately characterize matching behavior. Specifically, we use our simulator to collect several important statistics about the operation of the MPI posted and unexpected queues. For example, we present data about search lengths and the duration that messages spend in the queues waiting to be matched. Here, data gathered using this simulation-based approach have significant potential to aid hardware designers in determining resource allocation for MPI matching functions and provide application and middleware developers with insight into the scalability issues associated with MPI message matching.
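As a rough illustration of the statistics the abstract mentions (search lengths and time spent in the posted and unexpected queues), the following Python sketch replays a small event trace through two linear match lists. It is not the authors' simulator. The trace format, the `simulate` function, the `ANY` wildcard value, and the statistic names are all assumptions made for this example, and communicator matching is omitted for brevity.

```python
from collections import namedtuple

ANY = -1  # hypothetical wildcard standing in for MPI_ANY_SOURCE / MPI_ANY_TAG

# One queue entry: the time it was enqueued plus its matching fields.
Entry = namedtuple("Entry", "time src tag")


def matches(recv, msg):
    """A posted receive matches a message if each field is equal or wildcarded."""
    return recv.src in (ANY, msg.src) and recv.tag in (ANY, msg.tag)


def search(queue, pred):
    """Linear search in queue order.

    Returns (index of first match or None, number of entries inspected).
    """
    for i, entry in enumerate(queue):
        if pred(entry):
            return i, i + 1
    return None, len(queue)


def simulate(trace):
    """Replay (kind, time, src, tag) events and collect matching statistics.

    kind is "post" (a receive is posted) or "arrive" (a message arrives).
    """
    posted, unexpected = [], []
    stats = {
        "posted_search": [],    # entries inspected per posted-queue search
        "unexpected_search": [],  # entries inspected per unexpected-queue search
        "posted_wait": [],      # time matched receives spent in the posted queue
        "unexpected_wait": [],  # time matched messages spent in the unexpected queue
    }
    for kind, time, src, tag in trace:
        cur = Entry(time, src, tag)
        if kind == "arrive":
            # An arriving message searches the posted receive queue first.
            idx, length = search(posted, lambda r: matches(r, cur))
            stats["posted_search"].append(length)
            if idx is None:
                unexpected.append(cur)
            else:
                stats["posted_wait"].append(time - posted.pop(idx).time)
        elif kind == "post":
            # A newly posted receive searches the unexpected message queue first.
            idx, length = search(unexpected, lambda m: matches(cur, m))
            stats["unexpected_search"].append(length)
            if idx is None:
                posted.append(cur)
            else:
                stats["unexpected_wait"].append(time - unexpected.pop(idx).time)
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return stats


if __name__ == "__main__":
    example_trace = [
        ("post", 0.0, 1, 7),
        ("post", 1.0, ANY, 9),
        ("arrive", 2.0, 3, 9),
        ("arrive", 3.0, 2, 5),
        ("post", 4.0, 2, 5),
    ]
    for name, values in simulate(example_trace).items():
        print(name, values)
```

Because the sketch works purely from a recorded trace, the statistics are gathered offline and do not disturb the traced application. A real match list may use a different data structure than the plain lists assumed here.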
Authors:
 [1] ;  [1] ;  [1] ;  [1]
  1. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Publication Date:
Report Number(s):
SAND-2018-5449J; SAND-2018-6407J
663297
Grant/Contract Number:
AC04-94AL85000
Type:
Accepted Manuscript
Journal Name:
Parallel Computing
Research Org:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Org:
USDOE National Nuclear Security Administration (NNSA)
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; message passing; MPI; simulation; message matching
OSTI Identifier:
1457519
Alternate Identifier(s):
OSTI ID: 1444084