Summary: WAFT: Support for Fault-Tolerance in Wide-Area Object Oriented
Systems
Lorenzo Alvisi, University of Texas at Austin
Keith Marzullo, University of California at San Diego
July 22, 1998
The difficulties that wide-area networks present to application designers are severe. For example, wide-area
networks suffer from very unpredictable communication properties, which make it hard to distribute a
computation over multiple sites. This is made even harder because what constitutes poor communication
depends on the application's quality of service requirements. Yet, two major attractions of wide-area systems
are the ability to support geographically dispersed cooperative work (for example, [15]) and the possibility
of harnessing many processors for massively parallel computation (for example, [18, 14]). Both of these
attractions push one to design geographically dispersed applications, and therefore one must address the
problem of unpredictable communication properties.
In addition, wide-area networks are not secure environments, and so the applications that are designed
to run in them must be able to withstand security attacks. Replication, which is needed for fault tolerance,
is an obvious point of attack. Rollback-recovery protocols, for instance, use replication in time [10] by
restoring crashed agents to a previous state and repeating lost execution. A malicious user who alters the
information used during recovery can affect the state to which a failed object is restored, thereby introducing
a Trojan horse. The problem of keeping recovery information secure is especially acute for techniques such as
causal logging [3, 4, 5] because causal logging replicates this information in the volatile memory of multiple