DOE Patents, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Aggregating job exit statuses of a plurality of compute nodes executing a parallel application

Abstract

Aggregating job exit statuses of a plurality of compute nodes executing a parallel application, including: identifying a subset of compute nodes in the parallel computer to execute the parallel application; selecting one compute node in the subset of compute nodes in the parallel computer as a job leader compute node; initiating execution of the parallel application on the subset of compute nodes; receiving an exit status from each compute node in the subset of compute nodes, where the exit status for each compute node includes information describing execution of some portion of the parallel application by the compute node; aggregating each exit status from each compute node in the subset of compute nodes; and sending an aggregated exit status for the subset of compute nodes in the parallel computer.
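The steps listed in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the abstract does not say how the subset or the job leader is chosen or how statuses are combined, so the policies below (take the lowest-numbered nodes, make the lowest id the leader, report the highest exit code and the list of failed nodes) are assumptions, as are all names such as `ExitStatus`, `AggregatedStatus`, `run_job`, and `demo_work`. Execution is simulated sequentially in one process rather than on real compute nodes.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class ExitStatus:
    """Hypothetical per-node exit status."""
    node_id: int
    exit_code: int  # 0 means this node's portion of the job finished cleanly
    detail: str = ""


@dataclass(frozen=True)
class AggregatedStatus:
    """Hypothetical combined status reported for the whole subset."""
    leader_id: int
    node_count: int
    worst_exit_code: int
    failed_nodes: List[int]


def aggregate(leader_id: int, statuses: List[ExitStatus]) -> AggregatedStatus:
    # Assumed aggregation rule: keep the highest exit code and list failed nodes.
    failed = sorted(s.node_id for s in statuses if s.exit_code != 0)
    worst = max((s.exit_code for s in statuses), default=0)
    return AggregatedStatus(leader_id, len(statuses), worst, failed)


def run_job(all_nodes: Iterable[int], subset_size: int,
            work: Callable[[int], ExitStatus]) -> AggregatedStatus:
    # Identify the subset of nodes (assumed policy: lowest ids first).
    subset = sorted(all_nodes)[:subset_size]
    # Select the job leader (assumed policy: lowest id in the subset).
    leader = subset[0]
    # "Initiate execution" and "receive an exit status" from each node;
    # simulated here by calling the work function once per node.
    statuses = [work(node) for node in subset]
    # Aggregate and return ("send") the combined status.
    return aggregate(leader, statuses)


def demo_work(node_id: int) -> ExitStatus:
    # Simulated application: node 5 fails with exit code 3, the others succeed.
    code = 3 if node_id == 5 else 0
    return ExitStatus(node_id, code, "simulated failure" if code else "ok")


if __name__ == "__main__":
    print(run_job(range(8), 6, demo_work))
```

Reporting one combined record instead of one message per node is the point of the aggregation step: whatever consumes the result reads a single status for the whole subset.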

Inventors:
Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.; Mundy, Michael B.
Issue Date:
Research Org.:
International Business Machines Corp., Armonk, NY (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1195933
Patent Number(s):
9,086,962
Application Number:
13/524,602
Assignee:
International Business Machines Corporation (Armonk, NY)
DOE Contract Number:  
B579040
Resource Type:
Patent
Resource Relation:
Patent File Date: 2012 Jun 15
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

MLA:
Aho, Michael E., Attinella, John E., Gooding, Thomas M., and Mundy, Michael B. Aggregating job exit statuses of a plurality of compute nodes executing a parallel application. United States: N. p., 2015. Web.
APA:
Aho, Michael E., Attinella, John E., Gooding, Thomas M., & Mundy, Michael B. (2015). Aggregating job exit statuses of a plurality of compute nodes executing a parallel application. United States.
Chicago:
Aho, Michael E., Attinella, John E., Gooding, Thomas M., and Mundy, Michael B. 2015. "Aggregating job exit statuses of a plurality of compute nodes executing a parallel application". United States. https://www.osti.gov/servlets/purl/1195933.
@misc{osti_1195933,
title = {Aggregating job exit statuses of a plurality of compute nodes executing a parallel application},
author = {Aho, Michael E. and Attinella, John E. and Gooding, Thomas M. and Mundy, Michael B.},
abstractNote = {Aggregating job exit statuses of a plurality of compute nodes executing a parallel application, including: identifying a subset of compute nodes in the parallel computer to execute the parallel application; selecting one compute node in the subset of compute nodes in the parallel computer as a job leader compute node; initiating execution of the parallel application on the subset of compute nodes; receiving an exit status from each compute node in the subset of compute nodes, where the exit status for each compute node includes information describing execution of some portion of the parallel application by the compute node; aggregating each exit status from each compute node in the subset of compute nodes; and sending an aggregated exit status for the subset of compute nodes in the parallel computer.},
url = {https://www.osti.gov/servlets/purl/1195933},
note = {U.S. Patent 9,086,962},
place = {United States},
year = {2015},
month = {7}
}
