DAFT: Decoupled Acyclic Fault Tolerance

Yun Zhang, Nick P. Johnson, David I. August
Computer Science Department, Princeton University, Princeton, NJ 08540
{yunzhang, npjohnso, august}@princeton.edu

Jae W. Lee
Parakinetics Inc., Princeton, NJ 08542
leejw@parakinetics.com

Summary:
Higher transistor counts, lower voltage levels, and reduced noise
margin increase the susceptibility of multicore processors to
transient faults. Redundant hardware modules can detect such errors,
but software transient fault detection techniques are more appealing
for their low cost and flexibility. Recent software proposals double
register pressure or memory usage, or are too slow in the absence of
hardware extensions, preventing widespread acceptance.
This paper presents DAFT, a fast, safe, and memory efficient
transient fault detection framework for commodity multicore systems.
DAFT replicates computation across multiple cores and schedules
fault detection off the critical path. Where possible, values are
speculated to be correct and only communicated to the redundant


Source: August, David - Department of Computer Science, Princeton University


Collections: Computer Technologies and Information Sciences