Locating hardware faults in a parallel computer
Hardware faults in a parallel computer are located by: defining, within a tree network of the parallel computer, two or more sets of non-overlapping test levels of compute nodes that together include all the data communications links of the network, each non-overlapping test level comprising two or more adjacent tiers of the tree; defining test cells within each non-overlapping test level, each test cell comprising a subtree of the tree that includes a subtree root compute node and all descendant compute nodes of that root within the test level; performing, separately on each set of non-overlapping test levels, an uplink test on all test cells in the set; and performing, separately from the uplink tests and separately on each set of non-overlapping test levels, a downlink test on all test cells in the set.
- Research Organization: International Business Machines Corp., Armonk, NY (United States)
- Sponsoring Organization: USDOE
- Assignee: International Business Machines Corporation (Armonk, NY)
- Patent Number(s): 7,697,443
- Application Number: 11/279,592
- OSTI ID: 1176237
- Country of Publication: United States
- Language: English
Similar Records
Switch for serial or parallel communication networks