U.S. Department of Energy
Office of Scientific and Technical Information

Techniques for simplifying the programming of distributed systems

Thesis/Dissertation
OSTI ID:6917508

It is difficult to design and verify distributed programs that execute correctly despite transient processor failures, or despite variable and unpredictable processor speeds and message transmission times. This thesis describes a checkpointing/rollback mechanism that allows programmers to write distributed programs under the simplifying assumption that processors do not fail, and then run these programs correctly on systems with transient processor failures. Also described is a translation mechanism that allows programmers to write programs under the simplifying assumptions that processors execute in synchronized steps and that messages take exactly one step to arrive, and then run these programs correctly on systems that violate these assumptions. Both mechanisms are transparent to the programmer, and they can be applied to solve a large class of problems.
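
The general idea behind checkpointing and rollback can be illustrated with a minimal single-process sketch. This is not the mechanism developed in the thesis, which the abstract does not detail; the function name `run_with_rollback`, the `TransientFailure` exception, the `fail_at` failure-injection parameter, and the fixed checkpoint interval are all hypothetical choices made for illustration. The step functions are written as if failures never happen, and the wrapper restores the last saved state when one does.

```python
import copy


class TransientFailure(Exception):
    """Hypothetical signal that the processor lost its volatile state."""


def run_with_rollback(steps, initial_state, fail_at=(), checkpoint_every=2):
    """Apply each step function in order, surviving injected failures.

    steps            -- list of functions mapping a state to a new state
    fail_at          -- step indices at which one failure is injected (assumption:
                        each failure happens once, i.e. it is transient)
    checkpoint_every -- save a checkpoint after this many completed steps
    Returns (final_state, number_of_rollbacks).
    """
    state = initial_state
    saved_state = copy.deepcopy(initial_state)  # stands in for stable storage
    saved_index = 0                             # steps covered by the checkpoint
    pending = set(fail_at)
    rollbacks = 0
    i = 0
    while i < len(steps):
        try:
            if i in pending:
                pending.discard(i)
                state = None                    # volatile state is lost
                raise TransientFailure(i)
            state = steps[i](state)
            i += 1
            if i % checkpoint_every == 0:
                saved_state = copy.deepcopy(state)
                saved_index = i
        except TransientFailure:
            # Roll back: restore the checkpoint and redo the steps after it.
            state = copy.deepcopy(saved_state)
            i = saved_index
            rollbacks += 1
    return state, rollbacks


steps = [lambda s: s + 1, lambda s: s * 2, lambda s: s + 3,
         lambda s: s * 4, lambda s: s - 5]
clean, _ = run_with_rollback(steps, 0)
recovered, n = run_with_rollback(steps, 0, fail_at=(1, 3))
print(clean, recovered, n)
```

Both runs end in the same final state, which is the sense in which the failure handling is transparent to the code in `steps`. A distributed version would also have to deal with messages sent or received between a checkpoint and a failure, which this sketch leaves out entirely.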

Research Organization:
Cornell Univ., Ithaca, NY (USA)
OSTI ID:
6917508
Country of Publication:
United States
Language:
English