Computer systems: modeling and reliability issues
Improved techniques for reliability prediction are developed. Methods are proposed to determine the parameters used in analytical reliability models, specifically the coverage. It is shown how to calculate the coverage, taking into account the frequencies of the different errors and the faults in hardware recovery mechanisms, and how to incorporate the coverage into widely used automated reliability models. Furthermore, a simple fault-tolerant system is built out of LSTTL catalog parts. Temporary failures are injected into the system, and it is found that some conventional dynamic-recovery mechanisms are inadequate in the presence of temporary failures that cannot be modeled by the stuck-at fault model. In addition, a strongly fault-secure recovery mechanism is designed for self-purging systems. It consists of a coded-state sequential circuit that can be easily implemented by a Programmable Logic Sequencer (PLS). With this recovery mechanism, the circuit does not need to go through all input and state combinations between two successive faults. Finally, a model is developed to estimate the degradation in access time for cache-memory systems due to chip failures. This model can help in setting up maintenance schedules.
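As a rough illustration of how a coverage parameter can enter a reliability model, the sketch below computes an effective coverage as a frequency-weighted average over error classes and uses it in the textbook formula for a two-unit active-redundant system with imperfect coverage. This is not the method of the thesis, which the abstract does not spell out. The function names, the error classes, and the numeric values are hypothetical, and a constant failure rate is assumed.

```python
import math


def effective_coverage(error_classes):
    """Frequency-weighted coverage (illustrative assumption).

    error_classes: iterable of (relative_frequency, recovery_probability)
    pairs, one per error class. Frequencies need not sum to 1.
    """
    classes = list(error_classes)
    total = sum(freq for freq, _ in classes)
    if total <= 0:
        raise ValueError("total frequency must be positive")
    return sum(freq * prob for freq, prob in classes) / total


def duplex_reliability(failure_rate, t, coverage):
    """Reliability of a two-unit active-redundant system.

    Assumes identical units with constant failure rate and a probability
    `coverage` that the first unit failure is recovered successfully:
        R(t) = 2c * exp(-lt) - (2c - 1) * exp(-2lt)
    """
    one_unit = math.exp(-failure_rate * t)
    both_units = math.exp(-2.0 * failure_rate * t)
    return 2.0 * coverage * one_unit - (2.0 * coverage - 1.0) * both_units


# Hypothetical example: two error classes with different recovery odds.
c = effective_coverage([(0.7, 0.99), (0.3, 0.90)])
r = duplex_reliability(failure_rate=1e-4, t=1000.0, coverage=c)
```

With coverage 1 the formula reduces to the ideal parallel system, and with coverage 0 it reduces to a system that fails as soon as either unit fails, which shows how strongly the predicted reliability depends on this single parameter.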
- Research Organization: Stanford Univ., CA (USA)
- OSTI ID: 6674383
- Resource Relation: Other Information: Thesis (Ph. D.)
- Country of Publication: United States
- Language: English
Similar Records
Nonclassical faults in CMOS digital integrated circuits
Blackcomb: Hardware-Software Co-design for Non-Volatile Memory in Exascale Systems