Detecting Soft Errors in Stencil based Computations
- Univ. of Utah, Salt Lake City, UT (United States)
- Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Given the growing emphasis on system resilience, it is important to develop software-level error detectors that trap hardware-level faults with reasonable accuracy while minimizing both false alarms and the performance overhead they introduce. We present a technique that pursues this goal by targeting stencil computations and synthesizing detectors based on machine learning. In particular, we employ linear regression to generate computationally inexpensive models that form the basis for error detection. Our technique has been incorporated into a new open-source library called SORREL. In addition to reporting encouraging experimental results, we demonstrate techniques that help reduce the size of the training data. We also discuss the efficacy of the various synthesized detectors, as well as our future plans.
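To make the idea in the abstract concrete, here is a minimal sketch of a regression-based detector for a stencil computation. It is not SORREL's API or the authors' implementation. The 1D heat stencil, the choice of the three previous-step neighbors as features, the fixed tolerance, and every function name (`stencil_step`, `fit_linear`, `training_data`, `flip_bit`, `detect`) are assumptions made for illustration only.

```python
import random
import struct

def stencil_step(u, alpha=0.1):
    """One explicit step of a 1D heat stencil; boundary cells stay fixed."""
    nxt = u[:]
    for i in range(1, len(u) - 1):
        nxt[i] = u[i] + alpha * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return nxt

def solve(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x

def fit_linear(features, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    k = len(features[0])
    xtx = [[sum(f[i] * f[j] for f in features) for j in range(k)] for i in range(k)]
    xty = [sum(f[i] * t for f, t in zip(features, targets)) for i in range(k)]
    return solve(xtx, xty)

def training_data(steps=20, n=32, seed=0):
    """Collect (left, center, right) -> next-value samples from fault-free runs."""
    rng = random.Random(seed)
    u = [rng.uniform(0.0, 1.0) for _ in range(n)]
    feats, targs = [], []
    for _ in range(steps):
        nxt = stencil_step(u)
        for i in range(1, n - 1):
            feats.append([u[i - 1], u[i], u[i + 1]])
            targs.append(nxt[i])
        u = nxt
    return feats, targs

def flip_bit(x, bit):
    """Emulate a soft error by flipping one bit of a 64-bit float."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    (y,) = struct.unpack("<d", struct.pack("<Q", bits ^ (1 << bit)))
    return y

def detect(prev, cur, w, tol=1e-6):
    """Flag interior cells whose value deviates from the regression prediction."""
    flagged = []
    for i in range(1, len(cur) - 1):
        pred = w[0] * prev[i - 1] + w[1] * prev[i] + w[2] * prev[i + 1]
        if abs(cur[i] - pred) > tol:
            flagged.append(i)
    return flagged

if __name__ == "__main__":
    w = fit_linear(*training_data())
    prev = [0.2 + 0.01 * i for i in range(16)]
    cur = stencil_step(prev)
    cur[5] = flip_bit(cur[5], 51)  # inject a fault in the top mantissa bit
    print("weights:", w)
    print("flagged cells:", detect(prev, cur, w))
```

Because this toy stencil is itself linear, the fitted model reproduces it almost exactly, so the check costs only three multiplications per cell. A fixed tolerance is the simplest possible decision rule: flips in low-order mantissa bits fall below it and go undetected, which illustrates the trade-off between accuracy and false alarms that the abstract mentions.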
- Research Organization: Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Sponsoring Organization: USDOE
- DOE Contract Number: DE-AC52-07NA27344
- OSTI ID: 1184174
- Report Number(s): LLNL-TR-670435
- Country of Publication: United States
- Language: English
Similar Records
PRESAGE: Protecting Structured Address Generation against Soft Errors