U.S. Department of Energy
Office of Scientific and Technical Information

Fault-tolerant delivery algorithms

Thesis/Dissertation · OSTI ID:5457957
This dissertation addresses the problem of constructing a highly reliable delivery system in a distributed environment. It presents fault-tolerant algorithms that guarantee the delivery of a message to its destination despite faults in one or more nodes in a system of loosely coupled processors. These algorithms are distinguished by not using the extra hardware or checkpoint facilities that are common to many algorithms of their type. Instead, they maintain an appropriate number of copies of the message in the nodes through which the message passes. In the case of a fault, the algorithms locate the copy of the message closest to the destination and resume delivery of the message from that location. The mechanism introduced in this dissertation can be implemented on existing distributed systems without the addition of specialized hardware or changes to existing application programs. Moreover, the proposed mechanism can be used transparently, so that failure detection and recovery are automatic and users are completely unaware of the details of the algorithms. A complete analysis of both algorithms is presented in this dissertation. The communication overhead of each algorithm is presented. Also, the author discusses the conditions under which a loop may occur in a system where the algorithms are implemented. The availability of a system in which the algorithms are implemented is determined. The reliability model is presented in detail for each algorithm, and different topologies are examined. The parameters that affect the performance of both algorithms when implemented in a distributed system are presented, based on simulation results.
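The abstract does not give the algorithms themselves, so the following is only an illustrative sketch of the general idea it describes: keep a bounded number of message copies at the nodes already visited and, when a node fails, resume from the surviving copy nearest the destination. The function names, the `num_copies` parameter, the linear path, and the single-crash failure model are all assumptions made for illustration and are not taken from the dissertation.

```python
from collections import deque


def closest_surviving_copy(path, copy_holders, failed):
    """Return the live copy holder nearest the destination, or None.

    Assumption: `path` lists unique nodes in delivery order, so a larger
    index in `path` means the node is closer to the destination.
    """
    alive = [node for node in copy_holders if node not in failed]
    if not alive:
        return None  # every copy was lost with the failed nodes
    return max(alive, key=path.index)


def forward_with_copies(path, num_copies, crash_at=None):
    """Forward a message along `path`, retaining copies at recent nodes.

    Hypothetical model: only the last `num_copies` visited nodes keep a
    copy, and `crash_at` (if given) fails right after receiving the message.
    Returns (copy holders at that moment, node to resume delivery from).
    """
    holders = deque(maxlen=num_copies)  # oldest copy is dropped automatically
    for node in path:
        holders.append(node)
        if node == crash_at:
            resume = closest_surviving_copy(path, holders, {node})
            return list(holders), resume
    # No crash: the message reached the destination.
    return list(holders), path[-1]


if __name__ == "__main__":
    route = ["A", "B", "C", "D", "E"]
    print(forward_with_copies(route, num_copies=2, crash_at="C"))
    print(forward_with_copies(route, num_copies=1, crash_at="C"))
```

In this toy model, keeping two copies lets delivery resume from the node just before the crashed one, while keeping a single copy loses the message when its only holder fails. That trade-off between copy count and overhead is the kind of question the abstract says the reliability and communication-overhead analyses address.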
Research Organization:
George Washington Univ., Washington, DC (United States)
Country of Publication:
United States
Language:
English

Similar Records

Fault tolerance in modular multiprocessor systems
Thesis/Dissertation · Mon Dec 31 23:00:00 EST 1990 · OSTI ID:5254206

On fault-tolerant mechanisms in distributed systems
Thesis/Dissertation · Thu Dec 31 23:00:00 EST 1987 · OSTI ID:6309833

An efficient modular spare allocation scheme and its application to fault tolerant binary hypercubes
Journal Article · Mon Dec 31 23:00:00 EST 1990 · IEEE Transactions on Parallel and Distributed Systems (Institute of Electrical and Electronics Engineers); (United States) · OSTI ID:6253910