Defining and measuring supercomputer Reliability, Availability, and Serviceability (RAS).
Conference · OSTI ID:948682
The absence of agreed definitions and metrics for supercomputer RAS obscures meaningful discussion of the issues involved and hinders their solution. This paper seeks to foster a common basis for communication about supercomputer RAS by proposing a system state model, definitions, and measurements. These are modeled after the SEMI-E10 specification, which is widely used in the semiconductor manufacturing industry.
- Research Organization: Sandia National Laboratories
- Sponsoring Organization: USDOE
- DOE Contract Number: AC04-94AL85000
- OSTI ID: 948682
- Report Number(s): SAND2005-1574C
- Country of Publication: United States
- Language: English
Similar Records
Towards a specification for measuring red storm reliability, availability, and serviceability (RAS).
Conference · Wed Jun 01 00:00:00 EDT 2005 · OSTI ID:972870
Reliability, availability, and serviceability for petascale high-end computing and beyond
Technical Report · Tue May 31 00:00:00 EDT 2011 · OSTI ID:1041206
Exploring Process Groups for Reliability, Availability and Serviceability of Terascale Computing Systems
Conference · Sat Dec 31 23:00:00 EST 2005 · OSTI ID:989650