DOE PAGES

Title: On noise and the performance benefit of nonblocking collectives

Relaxed synchronization offers the potential of maintaining application scalability by allowing many processes to make independent progress when some processes suffer delays. Yet, the benefits of this approach in important parallel workloads have not been investigated in detail. In this paper, we use a validated simulation approach to explore the noise mitigation effects of idealized nonblocking collectives in workloads where these collectives are a major contributor to total execution time. In conclusion, although nonblocking collectives are unlikely to provide significant noise mitigation to applications in the low-OS-noise environments expected in next-generation HPC systems, we show that they can potentially improve application runtime with respect to other noise types.
Authors:
 [1] ;  [2] ;  [1] ;  [3]
  1. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
  2. Univ. of New Mexico, Albuquerque, NM (United States)
  3. ETH Zurich (Switzerland)
Publication Date:
OSTI Identifier:
1257977
Report Number(s):
SAND--2014-19529J
Journal ID: ISSN 1094-3420; 641904
Grant/Contract Number:
AC04-94AL85000
Type:
Accepted Manuscript
Journal Name:
International Journal of High Performance Computing Applications
Additional Journal Information:
Journal Volume: 30; Journal Issue: 1; Journal ID: ISSN 1094-3420
Publisher:
SAGE
Research Org:
Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States)
Sponsoring Org:
USDOE National Nuclear Security Administration (NNSA)
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; HPC; collectives; nonblocking; resilience; checkpointing; simulation