
Title: Dynamically reassigning a connected node to a block of compute nodes for re-launching a failed job

Methods, systems, and products for dynamically reassigning a connected node to a block of compute nodes for re-launching a failed job that include: identifying that a job failed to execute on the block of compute nodes because connectivity failed between a compute node assigned as at least one of the connected nodes for the block of compute nodes and its supporting I/O node; and re-launching the job, including selecting an alternative connected node that is actively coupled for data communications with an active I/O node; and assigning the alternative connected node as the connected node for the block of compute nodes running the re-launched job.
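The re-launch procedure in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class names, the `link_up` flag, the `run_job` callback, and the choice to look for the alternative connected node inside the same block are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class IONode:
    name: str
    active: bool  # hypothetical flag: is the I/O node up?


@dataclass
class ComputeNode:
    name: str
    io_node: IONode        # the I/O node supporting this compute node
    link_up: bool = True   # hypothetical flag: is the link to io_node working?

    def is_actively_coupled(self) -> bool:
        # "Actively coupled for data communications with an active I/O node"
        return self.link_up and self.io_node.active


@dataclass
class Block:
    nodes: List[ComputeNode]
    connected_node: ComputeNode  # node currently assigned as the connected node


def select_alternative(block: Block) -> Optional[ComputeNode]:
    """Return the first other node in the block with working I/O connectivity."""
    for node in block.nodes:
        if node is not block.connected_node and node.is_actively_coupled():
            return node
    return None


def relaunch_failed_job(block: Block, run_job: Callable[[Block], None]) -> bool:
    """Re-launch a failed job if the failure was a connectivity failure.

    Returns True if the job was re-launched with a new connected node,
    False if the current connected node still has connectivity (so the
    failure had some other cause and is not handled here).
    """
    # Step 1: identify that the failure was a connectivity failure.
    if block.connected_node.is_actively_coupled():
        return False
    # Step 2: select an alternative connected node.
    alternative = select_alternative(block)
    if alternative is None:
        raise RuntimeError("no compute node with an active I/O connection")
    # Step 3: assign it as the connected node, then re-launch the job.
    block.connected_node = alternative
    run_job(block)
    return True
```

In this sketch, a caller would invoke `relaunch_failed_job(block, run_job)` after a job failure; the block's `connected_node` is only changed when the current one has lost its link or its I/O node is inactive.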
Inventors:
 [1];  [1];  [1];  [1];  [2]
  1. Rochester, MN
  2. Byron, MN
Issue Date:
OSTI Identifier:
1039560
Assignee:
International Business Machines Corporation (Armonk, NY)
Patent Number(s):
8,140,889
Application Number:
US patent application 12/861,426
Contract Number:
B554331
Research Org:
International Business Machines Corporation (Armonk, NY)
Sponsoring Org:
USDOE
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING