Low-cost Fault-tolerance in Barrier Synchronizations
Sandeep S. Kulkarni, Anish Arora
Department of Computer and Information Science 1
The Ohio State University
Columbus, OH 43210 USA
In this paper, we show how tolerance to several types of faults can be effectively
added to program computations that use barrier synchronization. We
divide the faults that occur in practice into two classes, detectable and undetectable,
and design a fully distributed program that tolerates the faults in both classes. Our
program guarantees that every barrier is executed correctly even if detectable faults
occur, and that eventually every barrier is executed correctly even if undetectable
faults occur. Via analytical as well as simulation results, we show that the cost
of adding fault-tolerance is low, in part by comparing the times required by our
program with those required by its fault-intolerant counterpart.
Keywords: fault-tolerance, multitolerance, detectable and undetectable faults,
synchronization, concurrency.
1 Email: {kulkarni,anish}@cis.ohio-state.edu; Web: http://www.cis.ohio-state.edu/{~kulkarni,~anish}.
Research supported in part by NSF Grant CCR-93-08640, OSU Grant 221506, and NSA MDA904-96-1-1011.


Source: Arora, Anish - Department of Computer Science and Engineering, Ohio State University


Collections: Computer Technologies and Information Sciences