Summary: The Cost of Recovery in
Message Logging Protocols
Sriram Rao, Lorenzo Alvisi, and Harrick M. Vin
Abstract–Past research in message logging has focused on studying the relative overhead imposed by pessimistic, optimistic, and
causal protocols during failure-free executions. In this paper, we give the first experimental evaluation of the performance of these
protocols during recovery. Our results suggest that applications face a complex trade-off when choosing a message logging protocol
for fault tolerance. On the one hand, optimistic protocols can provide fast failure-free execution and good performance during recovery,
but are complex to implement and can create orphan processes. On the other hand, orphan-free protocols either risk being slow during
recovery, e.g., sender-based pessimistic and causal protocols, or incur a substantial overhead during failure-free execution, e.g.,
receiver-based pessimistic protocols. To address this trade-off, we propose hybrid logging protocols, a new class of orphan-free
protocols. We show that hybrid protocols perform within two percent of causal logging during failure-free execution and within two
percent of receiver-based logging during recovery.
Index Terms–Distributed computing, fault tolerance, log-based rollback recovery, pessimistic protocols, optimistic protocols, causal
protocols, hybrid protocols.
MESSAGE-LOGGING protocols are popular techniques for
building systems that can tolerate process crash failures.
These protocols are built on the assumption that the state of