U.S. Department of Energy
Office of Scientific and Technical Information

Coordinated Fault-Tolerance for High-Performance Computing Final Project Report

Technical Report · DOI: https://doi.org/10.2172/1104503 · OSTI ID: 1104503
 [1];  [2]
  1. The Ohio State Univ., Columbus, OH (United States); The Ohio State University
  2. The Ohio State Univ., Columbus, OH (United States)

With the Coordinated Infrastructure for Fault Tolerance Systems project (CIFTS, as the original project came to be called), our aim has been to understand and tackle two broad research questions. Their answers will help the high-end computing (HEC) community analyze and shape the direction of research on fault tolerance and resiliency for future high-end leadership systems. First, will the availability of global fault information, obtained through fault information exchange among the different HEC software components on a system, allow individual system software to better detect, diagnose, and adaptively respond to faults? Second, if fault awareness is raised throughout the system through fault information exchange, is it possible to get all system software working together to provide more comprehensive end-to-end fault management on the system?

Research Organization:
The Ohio State Univ., Columbus, OH (United States)
Sponsoring Organization:
USDOE Office of Science (SC)
Contributing Organization:
Argonne National Laboratory, The Ohio State University, Lawrence Berkeley National Laboratory, Oak Ridge National Laboratory, Indiana University, and University of Tennessee
DOE Contract Number:
FC02-06ER25749
OSTI ID:
1104503
Report Number(s):
DOE-OSU--25749-Final
Country of Publication:
United States
Language:
English

Similar Records

Coordinated Fault Tolerance for High-Performance Computing
Technical Report · Mon Apr 08 00:00:00 EDT 2013 · OSTI ID:1072982

CIFTS: A coordinated infrastructure for fault-tolerant systems.
Conference · Wed Dec 31 23:00:00 EST 2008 · OSTI ID:982645

Award ER25750: Coordinated Infrastructure for Fault Tolerance Systems Indiana University Final Report
Technical Report · Thu Mar 07 23:00:00 EST 2013 · OSTI ID:1105002

Related Subjects