Computer hardware fault administration
Abstract
Computer hardware fault administration is carried out in a parallel computer that includes a plurality of compute nodes. The compute nodes are coupled for data communications by at least two independent data communications networks, each of which includes data communications links connected to the compute nodes. Typical embodiments carry out hardware fault administration by identifying the location of a defective link in the first data communications network of the parallel computer and routing communications data around the defective link through the second data communications network of the parallel computer.
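The two steps named in the abstract (locating a defective link in the first network, then routing around it through the second) can be illustrated with a minimal Python sketch. This is not the patented implementation. The four-node ring and small tree topologies, the `probe` health check, and the function names `find_defective_links`, `shortest_path`, and `route` are all assumptions made for illustration.

```python
from collections import deque


def adjacency(links):
    """Build an undirected adjacency map from (node, node) link pairs."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def shortest_path(links, src, dst):
    """Breadth-first search; returns a list of nodes, or None if unreachable."""
    adj = adjacency(links)
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in sorted(adj.get(node, ())):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None


def find_defective_links(links, link_ok):
    """Return the links that fail the (hypothetical) health probe."""
    return [link for link in links if not link_ok(link)]


def route(src, dst, first_net, second_net, defective):
    """Route over the first network; detour each defective hop over the second.

    Returns a list of (network, from_node, to_node) hops, or None if no route.
    """
    bad = {frozenset(link) for link in defective}
    path = shortest_path(first_net, src, dst)
    if path is None:
        return None
    hops = []
    for a, b in zip(path, path[1:]):
        if frozenset((a, b)) in bad:
            detour = shortest_path(second_net, a, b)
            if detour is None:
                return None
            hops += [("second", x, y) for x, y in zip(detour, detour[1:])]
        else:
            hops.append(("first", a, b))
    return hops


# Assumed topologies: a 4-node ring (first network) and a small tree (second).
first_net = [(0, 1), (1, 2), (2, 3), (3, 0)]
second_net = [(0, 1), (0, 2), (1, 3)]

# Hypothetical probe: pretend the link between nodes 1 and 2 has failed.
probe = lambda link: set(link) != {1, 2}
defective = find_defective_links(first_net, probe)

hops = route(1, 2, first_net, second_net, defective)
print(hops)  # [('second', 1, 0), ('second', 0, 2)]
```

In this sketch only the failed hop is detoured, so traffic that never crosses the defective link stays entirely on the first network. A real system could instead move the whole message to the second network; the abstract does not say which granularity is used.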
- Inventors:
- Archer, Charles J; Megerian, Mark G; Ratterman, Joseph D; Smith, Brian E
- Rochester, MN
- Issue Date:
- Research Org.:
- International Business Machines Corp., Armonk, NY (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1017167
- Patent Number(s):
- 7796527
- Application Number:
- 11/279,579
- Assignee:
- International Business Machines Corporation (Armonk, NY)
- Patent Classifications (CPCs):
- G - PHYSICS; G06 - COMPUTING; G06F - ELECTRIC DIGITAL DATA PROCESSING
- DOE Contract Number:
- B519700
- Resource Type:
- Patent
- Country of Publication:
- United States
- Language:
- English
Citation Formats
Archer, Charles J, Megerian, Mark G, Ratterman, Joseph D, and Smith, Brian E. Computer hardware fault administration. United States: N. p., 2010. Web.
Archer, Charles J, Megerian, Mark G, Ratterman, Joseph D, & Smith, Brian E. (2010). Computer hardware fault administration. United States.
Archer, Charles J, Megerian, Mark G, Ratterman, Joseph D, and Smith, Brian E. 2010. "Computer hardware fault administration". United States. https://www.osti.gov/servlets/purl/1017167.
@article{osti_1017167,
title = {Computer hardware fault administration},
author = {Archer, Charles J and Megerian, Mark G and Ratterman, Joseph D and Smith, Brian E},
abstractNote = {Computer hardware fault administration carried out in a parallel computer, where the parallel computer includes a plurality of compute nodes. The compute nodes are coupled for data communications by at least two independent data communications networks, where each data communications network includes data communications links connected to the compute nodes. Typical embodiments carry out hardware fault administration by identifying a location of a defective link in the first data communications network of the parallel computer and routing communications data around the defective link through the second data communications network of the parallel computer.},
url = {https://www.osti.gov/servlets/purl/1017167},
place = {United States},
year = {2010},
month = {9}
}
Works referenced in this record:
An Overview of the BlueGene/L Supercomputer
conference, January 2002
- Adiga, N. R.; Almasi, G.; Almasi, G. S.
- ACM/IEEE SC 2002 Conference (SC'02)