Debugging parallel programs with instant replay
The debugging cycle is the most common methodology for finding and correcting errors in sequential programs. Cyclic debugging is effective because sequential programs are usually deterministic. Debugging parallel programs is considerably more difficult because successive executions of the same program often do not produce the same results. In this paper we present a general solution for reproducing the execution behavior of parallel programs, termed Instant Replay. During program execution we save the relative order of significant events as they occur, not the data associated with such events. As a result, our approach requires less time and space to save the information needed for program replay than other methods. Our technique does not depend on any particular form of interprocess communication. It provides for replay of an entire program, rather than individual processes in isolation. It introduces no centralized bottlenecks and requires neither synchronized clocks nor a globally consistent logical time. We describe a prototype implementation of Instant Replay on the BBN Butterfly Parallel Processor, and discuss how it can be incorporated into the debugging cycle for parallel programs.
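The idea of saving only the relative order of events, and not their data, can be illustrated with a minimal sketch. The abstract does not spell out the recording mechanism, so everything here is an assumption made for illustration: a per-object version counter, a per-thread log of the versions each thread observed, and the hypothetical names `SharedObject` and `run`. It is not the Butterfly prototype.

```python
import threading


class SharedObject:
    """Hypothetical shared object: a value plus a version counter."""

    def __init__(self):
        self.value = 0
        self.version = 0  # incremented once per access
        self.cond = threading.Condition()


def run(mode, logs, n_threads=3, steps=4):
    """Run the workers in "record" or "replay" mode.

    logs maps a thread id to the list of object versions that thread
    observed, in its own program order. Only these integers are saved;
    the values read or written are never logged.
    """
    obj = SharedObject()
    trace = []  # global access order, kept only to check the replay

    def worker(tid):
        log = logs[tid]
        for i in range(steps):
            with obj.cond:
                if mode == "replay":
                    # Wait until the object reaches the version this
                    # thread saw at the same step of the recorded run.
                    obj.cond.wait_for(lambda: obj.version == log[i])
                else:
                    log.append(obj.version)
                obj.value = obj.value * 2 + tid  # order-sensitive update
                trace.append(tid)
                obj.version += 1
                obj.cond.notify_all()

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return obj.value, trace


logs = {tid: [] for tid in range(3)}
recorded = run("record", logs)   # nondeterministic interleaving
replayed = run("replay", logs)   # forced to follow the recorded order
```

In this sketch each thread keeps its own log and there is no global clock or central coordinator; the replayed run reaches the same final value and access order as the recorded one because every access waits for the version it saw originally.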
- Research Organization:
- Dept. of Computer Science, Univ. of Rochester, Rochester, NY 14627
- OSTI ID:
- 6896088
- Journal Information:
- IEEE Trans. Comput. (United States); Vol. C-36:4; ISSN ITCOB
- Country of Publication:
- United States
- Language:
- English
Similar Records
Distributed Order Recording Techniques for Efficient Record-and-Replay of Multi-threaded Programs
Hardware-assisted replay of microprocessor programs