
A Log-Based Redundant Architecture for Reliable Parallel Computation Daniel Sanchez, Juan L. Aragon and Jose M. Garcia

Summary: A Log-Based Redundant Architecture for Reliable Parallel Computation
Daniel Sánchez, Juan L. Aragón and José M. García
Departamento de Ingeniería y Tecnología de Computadores
Universidad de Murcia, Spain
Email: {dsanchez, jlaragon, jmgarcia}@ditec.um.es
CMOS scaling exacerbates hardware errors, making reliability a major concern for recent and future microarchitecture designs. Mechanisms to provide fault tolerance in architectures must meet several objectives, such as low performance degradation, power consumption, and area overhead. Several approaches have already been proposed to provide fault tolerance for parallel codes. However, these proposals usually assume unrealistic environments, such as shared buses among processors, or require modifying highly optimized hardware designs such as caches. Our main design goal is to provide transient fault detection and recovery while modifying the hardware as little as possible. To this end, we propose LBRA based on a Hardware


Source: Aragón Alcaraz, Juan Luis - Departamento de Ingeniería y Tecnología de Computadores, Universidad de Murcia


Collections: Computer Technologies and Information Sciences