U.S. Department of Energy
Office of Scientific and Technical Information

Fault tolerance in multiprocessor systems without dedicated redundancy

Journal Article · IEEE Trans. Comput.; (United States)
DOI: https://doi.org/10.1109/12.2174 · OSTI ID: 5299288

This paper describes an algorithm called RAFT (recursive algorithm for fault tolerance) for achieving fault tolerance in multiprocessor systems. By combining dynamic space and time redundancy techniques, RAFT achieves fault tolerance in the presence of permanent as well as intermittent faults. The performance and reliability of a multiprocessor system using RAFT are determined as functions of individual processor reliability and the total number of fault modes in a processor. RAFT-based systems are superior to triple modular redundancy (TMR) systems in hardware economy and provide comparable reliability. A multiprocessor architecture adopting RAFT is given.
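The abstract compares RAFT against TMR as a function of individual processor reliability. As a point of reference, the sketch below computes only the standard TMR baseline, under the usual textbook assumptions of a perfect voter and independent module failures. It does not reproduce RAFT's reliability expressions, which are not given in this record; the function name `tmr_reliability` is illustrative.

```python
def tmr_reliability(r: float) -> float:
    """Reliability of a TMR system built from three modules of reliability r.

    Assumes a perfect voter and independent failures: the system works
    when at least two of the three modules work, which gives
    3*r^2*(1 - r) + r^3 = 3*r^2 - 2*r^3.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must be a probability in [0, 1]")
    return 3 * r**2 - 2 * r**3


if __name__ == "__main__":
    # Compare a single processor (simplex) with TMR at a few reliabilities.
    for r in (0.5, 0.9, 0.99):
        print(f"r={r:.2f}  simplex={r:.4f}  TMR={tmr_reliability(r):.4f}")
```

Under these assumptions TMR improves on a single processor only when r exceeds 0.5, and it always costs three processors per task, which is the hardware overhead the abstract says RAFT-based systems reduce.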

Research Organization:
Computing Systems Research Lab., AT&T Bell Labs., Murray Hill, NJ 07974 (US)
OSTI ID:
5299288
Journal Information:
Journal Name: IEEE Trans. Comput.; (United States); Vol. 37:3; ISSN ITCOB
Country of Publication:
United States
Language:
English