Title: Detecting Soft Errors in Stencil based Computations

Technical Report · DOI: https://doi.org/10.2172/1184174 · OSTI ID: 1184174
 [1];  [1];  [2]
  1. Univ. of Utah, Salt Lake City, UT (United States)
  2. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

Given the growing emphasis on system resilience, it is important to develop software-level error detectors that trap hardware-level faults with reasonable accuracy while minimizing both false alarms and performance overhead. We present a technique that pursues this goal by targeting stencil computations and synthesizing detectors with machine learning. In particular, we employ linear regression to generate computationally inexpensive models that form the basis for error detection. Our technique has been incorporated into a new open-source library called SORREL. In addition to reporting encouraging experimental results, we demonstrate techniques that help reduce the size of the training data. We also discuss the efficacy of the various synthesized detectors, as well as our future plans.
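
The idea of a regression-based detector can be illustrated with a minimal sketch. It is not SORREL's API and not necessarily the report's exact formulation: the 1D heat stencil, the three-neighbor feature choice, the additive fault, and every function name below are assumptions made for illustration only. A linear model is fitted on fault-free stencil updates, and a grid point is flagged when its observed value deviates from the model's prediction by more than a tolerance.

```python
import random

def stencil_step(u, alpha=0.1):
    """One explicit 1D heat-equation step; boundary values stay fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new

def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x

def fit_linear_model(features, targets):
    """Ordinary least squares through the normal equations (X^T X) w = X^T y."""
    k = len(features[0])
    xtx = [[sum(f[i] * f[j] for f in features) for j in range(k)] for i in range(k)]
    xty = [sum(f[i] * t for f, t in zip(features, targets)) for i in range(k)]
    return solve_linear(xtx, xty)

def collect_samples(rng, grids=4, size=32, steps=3):
    """Gather (left, center, right) -> next-value pairs from fault-free runs."""
    features, targets = [], []
    for _ in range(grids):
        u = [rng.random() for _ in range(size)]
        for _ in range(steps):
            nxt = stencil_step(u)
            for i in range(1, size - 1):
                features.append(u[i - 1:i + 2])
                targets.append(nxt[i])
            u = nxt
    return features, targets

def detect(prev, curr, weights, tol=1e-6):
    """Return interior indices whose value disagrees with the fitted model."""
    flagged = []
    for i in range(1, len(prev) - 1):
        predicted = sum(w * x for w, x in zip(weights, prev[i - 1:i + 2]))
        if abs(curr[i] - predicted) > tol:
            flagged.append(i)
    return flagged

rng = random.Random(0)
weights = fit_linear_model(*collect_samples(rng))

# Apply the detector to a fresh fault-free step, then to a corrupted copy.
prev = [rng.random() for _ in range(32)]
curr = stencil_step(prev)
clean_flags = detect(prev, curr, weights)

faulty = curr[:]
faulty[7] += 1.0  # hypothetical soft error, modeled as an additive corruption
faulty_flags = detect(prev, faulty, weights)
```

Because this toy stencil is exactly linear, the fitted weights reproduce the stencil coefficients and the tolerance can be very tight. For a detector applied to less regular data, the tolerance would trade detection accuracy against false alarms, the balance the abstract highlights.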

Research Organization:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
DE-AC52-07NA27344
OSTI ID:
1184174
Report Number(s):
LLNL-TR-670435
Country of Publication:
United States
Language:
English