Checkpoint repair for high-performance out-of-order execution machines
Journal Article
·
· IEEE Trans. Comput.; (United States)
Out-of-order execution and branch prediction are two mechanisms that can be used profitably in the design of supercomputers to increase performance. Proper exception handling and branch prediction miss handling in an out-of-order execution machine require some kind of repair mechanism that can restore the machine to a known previous state. In this paper the authors present a class of repair mechanisms using the concept of checkpointing. The authors derive several properties of checkpoint repair mechanisms. In addition, they provide algorithms for performing checkpoint repair that incur little overhead in time and modest cost in hardware. These algorithms require no more complexity or time with write-back cache memory systems than with write-through cache memory systems, contrary to statements made by previous researchers.
- Research Organization:
- Dept. of Electrical and Computer Engineering, Univ. of Illinois, Urbana-Champaign, IL
- OSTI ID:
- 5496980
- Journal Information:
- Journal Name: IEEE Trans. Comput.; (United States); Vol. C-36:12; ISSN ITCOB
- Country of Publication:
- United States
- Language:
- English
Similar Records
Lazy Checkpointing : Exploiting Temporal Locality in Failures to Mitigate Checkpointing Overheads on Extreme-Scale Systems
Conference
·
Tue Dec 31 23:00:00 EST 2013
·
OSTI ID:1130431
Orchestrating Fault Prediction with Live Migration and Checkpointing
Conference
·
Mon Jun 01 00:00:00 EDT 2020
·
OSTI ID:1648858
The effect of sharing on the cache and bus performance of parallel programs
Book
·
Thu Dec 31 23:00:00 EST 1987
·
OSTI ID:5005342