On noise and the performance benefit of nonblocking collectives
- Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
- Univ. of New Mexico, Albuquerque, NM (United States)
- ETH Zurich (Switzerland)
Relaxed synchronization offers the potential of maintaining application scalability by allowing many processes to make independent progress when some processes suffer delays. Yet, the benefits of this approach in important parallel workloads have not been investigated in detail. In this paper, we use a validated simulation approach to explore the noise mitigation effects of idealized nonblocking collectives in workloads where these collectives are a major contributor to total execution time. In conclusion, although nonblocking collectives are unlikely to provide significant noise mitigation to applications in the low-OS-noise environments expected in next-generation HPC systems, we show that they can potentially improve application runtime with respect to other noise types.
- Research Organization:
- Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
- Sponsoring Organization:
- USDOE National Nuclear Security Administration (NNSA)
- Grant/Contract Number:
- AC04-94AL85000
- OSTI ID:
- 1257977
- Report Number(s):
- SAND-2014-19529J; 641904
- Journal Information:
- International Journal of High Performance Computing Applications, Vol. 30, Issue 1; ISSN 1094-3420
- Publisher:
- SAGE
- Country of Publication:
- United States
- Language:
- English
The unexpected virtue of almost: Exploiting MPI collective operations to approximately coordinate checkpoints | journal | September 2018
Similar Records
Mini-Ckpts: Surviving OS Failures in Persistent Memory
A Fault Oblivious Extreme-Scale Execution Environment