U.S. Department of Energy
Office of Scientific and Technical Information

Symmetric Active/Active High Availability for High-Performance Computing System Services: Accomplishments and Limitations

Conference
OSTI ID: 945336
 [1];  [1];  [2];  [3]
  1. ORNL
  2. Louisiana Tech University
  3. Tennessee Technological University

This paper summarizes our efforts over the last 3-4 years to provide symmetric active/active high availability for high-performance computing (HPC) system services. This work paves the way for high-level reliability, availability, and serviceability in extreme-scale HPC systems by focusing on the most critical components, the head and service nodes, and by reinforcing them with appropriate high availability solutions. The paper presents our accomplishments in the form of concepts and their respective prototypes, discusses existing limitations, outlines possible future work, and describes the relevance of this research to other planned efforts.

Research Organization:
Oak Ridge National Laboratory (ORNL)
Sponsoring Organization:
SC USDOE - Office of Science (SC)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
945336
Country of Publication:
United States
Language:
English