Summary: Fault-tolerant Typed Assembly Language
Frances Perry, Lester Mackey, George A. Reis, Jay Ligatti, David I. August, David Walker
Departments of Computer Science and Electrical Engineering, Princeton University
{frances, lmackey, gareis, august, dpw}@cs.princeton.edu
Department of Computer Science and Engineering, University of South Florida
ligatti@cse.usf.edu
Abstract
A transient hardware fault occurs when an energetic particle strikes
a transistor, causing it to change state. Although transient faults do
not permanently damage the hardware, they may corrupt computations
by altering stored values and signal transfers. In this paper, we
propose a new scheme for provably safe and reliable computing in
the presence of transient hardware faults. In our scheme, software
computations are replicated to provide redundancy while special
instructions compare the independently computed results to detect
errors before writing critical data. In stark contrast to any previous
efforts in this area, we have analyzed our fault tolerance scheme
from a formal, theoretical perspective. To be specific, first, we provide
an operational semantics for our assembly language, which