DOE Patents, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Computer hardware fault administration

Abstract

Computer hardware fault administration carried out in a parallel computer, where the parallel computer includes a plurality of compute nodes. The compute nodes are coupled for data communications by at least two independent data communications networks, where each data communications network includes data communications links connected to the compute nodes. Typical embodiments carry out hardware fault administration by identifying a location of a defective link in the first data communications network of the parallel computer and routing communications data around the defective link through the second data communications network of the parallel computer.
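The fallback described in the abstract can be illustrated with a small sketch. This is one possible reading of the abstract, not the patented method: the function names (`shortest_path`, `route`), the four-node ring and star topologies, and the breadth-first search used for path selection are all assumptions made for illustration. A message normally travels over the first network; if its path would cross a link known to be defective, it is sent over the second network instead.

```python
from collections import deque

def shortest_path(links, src, dst):
    """Breadth-first search over undirected links; returns a node list or None."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None

def route(primary, secondary, src, dst, defective):
    """Pick a network and path, avoiding known-defective primary links.

    `defective` is a set of frozenset({a, b}) entries, each the identified
    location of a bad link in the primary network (hypothetical encoding).
    """
    path = shortest_path(primary, src, dst)
    crosses_bad_link = path is not None and any(
        frozenset(hop) in defective for hop in zip(path, path[1:])
    )
    if path is not None and not crosses_bad_link:
        return "primary", path
    # Route around the defective link through the second network.
    return "secondary", shortest_path(secondary, src, dst)

# Hypothetical topologies over the same four compute nodes.
PRIMARY = [(0, 1), (1, 2), (2, 3), (3, 0)]   # ring
SECONDARY = [(0, 1), (0, 2), (0, 3)]         # star rooted at node 0
DEFECTIVE = {frozenset((1, 2))}              # link 1-2 has been identified as bad
```

With these inputs, `route(PRIMARY, SECONDARY, 1, 2, DEFECTIVE)` falls back to the second network because the direct primary path uses the bad link, while `route(PRIMARY, SECONDARY, 0, 3, DEFECTIVE)` stays on the first network.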

Inventors:
 Archer, Charles J [1]; Megerian, Mark G [1]; Ratterman, Joseph D [1]; Smith, Brian E [1]
  1. Rochester, MN
Issue Date:
2010-09-14
Research Org.:
International Business Machines Corp., Armonk, NY (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1017167
Patent Number(s):
7796527
Application Number:
11/279,579
Assignee:
International Business Machines Corporation (Armonk, NY)
Patent Classifications (CPCs):
G - PHYSICS G06 - COMPUTING G06F - ELECTRIC DIGITAL DATA PROCESSING
DOE Contract Number:  
B519700
Resource Type:
Patent
Country of Publication:
United States
Language:
English

Citation Formats

Archer, Charles J, Megerian, Mark G, Ratterman, Joseph D, and Smith, Brian E. Computer hardware fault administration. United States: N. p., 2010. Web.
Archer, Charles J, Megerian, Mark G, Ratterman, Joseph D, & Smith, Brian E. (2010). Computer hardware fault administration. United States.
Archer, Charles J, Megerian, Mark G, Ratterman, Joseph D, and Smith, Brian E. 2010. "Computer hardware fault administration". United States. https://www.osti.gov/servlets/purl/1017167.
@article{osti_1017167,
title = {Computer hardware fault administration},
author = {Archer, Charles J and Megerian, Mark G and Ratterman, Joseph D and Smith, Brian E},
abstractNote = {Computer hardware fault administration carried out in a parallel computer, where the parallel computer includes a plurality of compute nodes. The compute nodes are coupled for data communications by at least two independent data communications networks, where each data communications network includes data communications links connected to the compute nodes. Typical embodiments carry out hardware fault administration by identifying a location of a defective link in the first data communications network of the parallel computer and routing communications data around the defective link through the second data communications network of the parallel computer.},
place = {United States},
year = {2010},
month = {9}
}

Works referenced in this record:

An Overview of the BlueGene/L Supercomputer
conference, January 2002