Summary: Closure and Convergence:
A Foundation of Fault-Tolerant Computing
Anish ARORA
Department of Computer Science
The Ohio State Univ. at Columbus
2036 Neil Avenue Mall, OH 43210
614-292-1836, Fax: 614-292-2911
anish@cis.ohio-state.edu

Mohamed GOUDA
Department of Computer Sciences
The Univ. of Texas at Austin
2.128 Taylor Hall, TX 78712
512-471-9532, Fax: 512-471-8885
gouda@cs.utexas.edu

Abstract
We give a formal definition of what it means for a system to "tolerate" a class
of "faults". The definition consists of two conditions: One, if a fault occurs when
the system state is within a set of "legal" states, the resulting state is within some
larger set and, if faults continue occurring, the system state remains within that
larger set (Closure). And two, if faults stop occurring, the system eventually reaches
a state within the legal set (Convergence). We demonstrate the applicability of
our definition for specifying and verifying the fault-tolerance properties of a variety
of digital and computer systems. Further, using the definition, we obtain a simple
classification of fault-tolerant systems and discuss methods for their systematic
design.
Keywords: Fault-tolerance, Reliability, Algorithms, Verification, Design.
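
The two conditions in the abstract can be illustrated on a small finite-state system. The following is a minimal sketch and not the authors' formalism: it assumes that states are integers, that program and fault actions are functions mapping a state to its set of possible successor states, and that the "larger set" is supplied explicitly as a fault span. All names (is_closed, converges, tolerates, LEGAL, SPAN) are hypothetical and chosen only for this illustration.

```python
from typing import Callable, Set

State = int
Step = Callable[[State], Set[State]]  # a state -> its possible successor states


def is_closed(states: Set[State], step: Step) -> bool:
    """Closure: every step taken from a state in `states` stays in `states`."""
    return all(step(s) <= states for s in states)


def converges(span: Set[State], legal: Set[State], program: Step) -> bool:
    """Convergence: with faults stopped, every program-only computation that
    starts in `span` eventually reaches `legal` (finite state space assumed)."""
    good = set(legal)  # states known to reach `legal` inevitably
    changed = True
    while changed:
        changed = False
        for s in span - good:
            successors = program(s)
            # s is good if it can move and all of its moves lead to good states
            if successors and successors <= good:
                good.add(s)
                changed = True
    return span <= good


def tolerates(legal: Set[State], span: Set[State],
              program: Step, fault: Step) -> bool:
    """Check both conditions for a candidate legal set and fault span."""
    return (legal <= span
            and is_closed(legal, program)   # no faults: stay legal
            and is_closed(span, program)    # program steps stay in the span
            and is_closed(span, fault)      # fault steps stay in the span
            and converges(span, legal, program))


# Toy system (an assumption for illustration): the program counts down to 0,
# and a fault may reset the counter to any value in 0..3.
def program(x: State) -> Set[State]:
    return {max(x - 1, 0)}


def fault(x: State) -> Set[State]:
    return {0, 1, 2, 3}


LEGAL = {0}
SPAN = {0, 1, 2, 3}

if __name__ == "__main__":
    print(tolerates(LEGAL, SPAN, program, fault))
```

In this toy system the span {0, 1, 2, 3} is closed under both program and fault steps, and the countdown drives every state in it back to the legal state 0 once faults stop. Choosing a span that is too small, such as {0, 1}, fails the closure check because a fault can leave it.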