Extreme-scale Algorithms and Solver Resilience
- Univ. of Tennessee, Knoxville, TN (United States); University of Tennessee
A widening gap exists between the peak performance of high-performance computers and the performance achieved by complex applications running on these platforms. Over the next decade, extreme-scale systems will present major new challenges to algorithm development that could amplify this mismatch to the point of preventing the productive use of future DOE Leadership computers. These challenges are: extreme levels of parallelism due to multicore processors; an increase in system fault rates, requiring algorithms to be resilient beyond just checkpoint/restart; complex memory hierarchies and costly data movement, in both energy and performance; heterogeneous system architectures (mixing CPUs, GPUs, etc.); and conflicting goals of performance, resilience, and power requirements.
- Research Organization: Univ. of Tennessee, Knoxville, TN (United States)
- Sponsoring Organization: USDOE Office of Science (SC)
- DOE Contract Number: SC0010042
- OSTI ID: 1334619
- Report Number(s): DOE-UTK--10042; ER26137
- Country of Publication: United States
- Language: English
Similar Records
- Resiliency in numerical algorithm design for extreme scale simulations
  Journal Article · Thu Dec 09 19:00:00 EST 2021 · International Journal of High Performance Computing Applications · OSTI ID: 1855669
- Implementing Software Resiliency in HPX for Extreme Scale Computing
  Technical Report · Wed Apr 15 00:00:00 EDT 2020 · OSTI ID: 1614897
- On the performance and energy efficiency of sparse linear algebra on GPUs
  Journal Article · Tue Oct 04 20:00:00 EDT 2016 · International Journal of High Performance Computing Applications · OSTI ID: 1437692