skip to main content
OSTI.GOV title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: Characterization of the Impact of Soft Errors on Iterative Methods

Conference ·
OSTI ID:1512780

Soft errors caused by transient bit flips have the potential to significantly impact an application’s behavior. This has motivated the design of an array of techniques to detect, isolate, and correct soft errors using microarchitectural, architectural, compilation-based, or application-level techniques to minimize their impact on the executing application. The first step towards the design of good error detection/correction techniques involves an understanding of an application’s vulnerability to soft errors. In this paper, we present the first comprehensive characterization of the impact of soft errors on the convergence characteristics of six iterative methods using application-level fault injection. In particular, we consider the use of iterative methods to incrementally solve a linear systems of equations, which constitutes the core kernel in many scientific applications. We analyze the impact of soft errors in terms of the type of error (single- vs multi-bit), the distribution and location of bits affected, the data structure and the statement impacted, and variation with time. In addition to understanding the vulnerability of iterative solvers to soft errors, this characterization can aid the design of fault injection campaigns that ensure systematic coverage.

Research Organization:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
1512780
Report Number(s):
PNNL-SA-138072
Resource Relation:
Conference: 25TH IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, DATA, AND ANALYTICS (HiPC 2018), December 17-20, 2018, Bengaluru, Inda
Country of Publication:
United States
Language:
English