Checkpoint repair for high-performance out-of-order execution machines
Out-of-order execution and branch prediction are two mechanisms that can be used profitably in the design of supercomputers to increase performance. Proper exception handling and branch prediction miss handling in an out-of-order execution machine require some kind of repair mechanism that can restore the machine to a known previous state. In this paper the authors present a class of repair mechanisms using the concept of checkpointing and derive several properties of checkpoint repair mechanisms. In addition, they provide algorithms for performing checkpoint repair that incur little overhead in time and modest cost in hardware. Contrary to statements made by previous researchers, these algorithms require no more complexity or time with write-back cache memory systems than with write-through cache memory systems.
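The general idea of restoring a machine to a known previous state can be illustrated with a minimal software sketch. This is not the authors' algorithm or hardware design; the `ToyMachine` class, its method names, and the full-copy checkpoint strategy are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMachine:
    """Hypothetical toy model: a register file plus a stack of checkpoints."""
    regs: dict = field(default_factory=dict)
    checkpoints: list = field(default_factory=list)

    def checkpoint(self) -> None:
        # Save a copy of the current register state, e.g. before a
        # predicted branch or an instruction that may raise an exception.
        self.checkpoints.append(dict(self.regs))

    def write(self, reg: str, value: int) -> None:
        # Speculative write; it may be undone by a later repair.
        self.regs[reg] = value

    def repair(self) -> None:
        # Misprediction or exception: restore the most recent checkpoint
        # and discard it.
        self.regs = self.checkpoints.pop()

    def release_oldest(self) -> None:
        # Speculation confirmed: the oldest checkpoint is no longer needed.
        self.checkpoints.pop(0)


machine = ToyMachine(regs={"r1": 1, "r2": 2})
machine.checkpoint()       # known-good state saved
machine.write("r1", 99)    # work done past the checkpoint
machine.repair()           # roll back to the saved state
```

After the `repair()` call, the registers hold the values saved at the checkpoint, and the speculative write to `r1` is gone. A real machine would also have to repair memory state, which is where the write-back versus write-through distinction mentioned in the abstract becomes relevant; this sketch does not model that.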
- Research Organization:
- Dept. of Electrical and Computer Engineering, Univ. of Illinois, Urbana-Champaign, IL
- OSTI ID:
- 5496980
- Journal Information:
- Journal Name: IEEE Trans. Comput. (United States); Vol. C-36:12; ISSN ITCOB
- Country of Publication:
- United States
- Language:
- English