Title: Investigating the Interplay between Energy Efficiency and Resilience in High Performance Computing

Conference

Energy efficiency and resilience are two crucial challenges that HPC systems must address to reach exascale. Although each has been studied extensively on its own, little has been done to understand the interplay between them in HPC systems. Decreasing the supply voltage associated with a given operating frequency (undervolting) for processors and other CMOS-based components can significantly reduce power consumption. However, undervolting often raises system failure rates and consequently increases application execution time. In this work, we present an energy-saving undervolting approach that leverages mainstream resilience techniques to tolerate the increased failures caused by undervolting.
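
The trade-off described in the abstract can be illustrated with a toy model. The sketch below is not the paper's model. It assumes the textbook CMOS dynamic power relation (P proportional to C·V²·f), a hypothetical exponential growth of failure rate as voltage drops below nominal, and checkpoint/restart with Young's first-order checkpoint interval standing in for the resilience technique. All function names and parameter values are illustrative assumptions.

```python
import math


def dynamic_power(voltage, frequency, capacitance=1.0):
    """Textbook CMOS dynamic power, P = C * V^2 * f (arbitrary units)."""
    return capacitance * voltage ** 2 * frequency


def failure_rate(voltage, nominal_voltage, base_rate, sensitivity):
    """ASSUMED model: failures per second grow exponentially with the
    relative voltage reduction. Not taken from the paper."""
    relative_drop = (nominal_voltage - voltage) / nominal_voltage
    return base_rate * math.exp(sensitivity * relative_drop)


def expected_runtime(work_time, rate, checkpoint_cost):
    """First-order checkpoint/restart estimate.

    Uses Young's interval tau = sqrt(2 * C / lambda). The overhead is the
    fraction of time spent writing checkpoints (C / tau) plus the expected
    rework after failures (lambda * tau / 2).
    """
    interval = math.sqrt(2.0 * checkpoint_cost / rate)
    overhead = checkpoint_cost / interval + rate * interval / 2.0
    return work_time * (1.0 + overhead)


def energy_to_solution(voltage, nominal_voltage=1.0, frequency=1.0,
                       work_time=36000.0, base_rate=1e-5,
                       sensitivity=20.0, checkpoint_cost=60.0):
    """Energy = power * expected runtime, with hypothetical defaults."""
    rate = failure_rate(voltage, nominal_voltage, base_rate, sensitivity)
    runtime = expected_runtime(work_time, rate, checkpoint_cost)
    return dynamic_power(voltage, frequency) * runtime, runtime


if __name__ == "__main__":
    for v in (1.0, 0.95, 0.9):
        e, t = energy_to_solution(v)
        print(f"V={v:.2f}  runtime={t:.0f} s  energy={e:.0f} (arb. units)")
```

Under these assumed parameters, lowering the voltage lengthens the run because failures are more frequent, yet total energy still falls because power scales with the square of the voltage. With a steeper failure-rate curve or costlier checkpoints the balance can tip the other way, which is the interplay the abstract refers to.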

Research Organization:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
1213014
Report Number(s):
PNNL-SA-109093; KJ0402000
Resource Relation:
Conference: IEEE International Parallel and Distributed Processing Symposium (IPDPS 2015), May 25-29, 2015, Hyderabad, India, 786-796
Country of Publication:
United States
Language:
English

Similar Records

Scalable Energy Efficiency with Resilience for High Performance Computing Systems: A Quantitative Methodology
Conference · January 18, 2016 · OSTI ID:1213014

Scalable Energy Efficiency with Resilience for High Performance Computing Systems: A Quantitative Methodology
Journal Article · November 16, 2015 · ACM Transactions on Architecture and Code Optimization · OSTI ID:1213014

Rolex: Resilience-oriented language extensions for extreme-scale systems
Journal Article · May 26, 2016 · Journal of Supercomputing · OSTI ID:1213014
