DOE Patents, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Collectively loading programs in a multiple program multiple data environment

Abstract

Techniques are disclosed for loading programs efficiently in a parallel computing system. In one embodiment, nodes of the parallel computing system receive a load description file which indicates, for each program of a multiple program multiple data (MPMD) job, nodes which are to load the program. The nodes determine, using collective operations, a total number of programs to load and a number of programs to load in parallel. The nodes further generate a class route for each program to be loaded in parallel, where the class route generated for a particular program includes only those nodes on which the program needs to be loaded. For each class route, a node is selected using a collective operation to be a load leader which accesses a file system to load the program associated with a class route and broadcasts the program via the class route to other nodes which require the program.
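The flow described in the abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the patented implementation: the function name `simulate_collective_load`, the `read_program` callback, and the `max_class_routes` limit are hypothetical, the load description file is modeled as a plain dictionary, and the collective operation that selects a load leader is approximated by taking the minimum node id in each class route.

```python
from collections import defaultdict


def simulate_collective_load(load_description, read_program, max_class_routes=2):
    """Simulate collective program loading for an MPMD job.

    load_description: dict mapping program name -> iterable of node ids
                      (a stand-in for the load description file).
    read_program:     callable(program) -> bytes, a stand-in for the single
                      file system access made by each load leader.
    max_class_routes: assumed limit on how many programs load in parallel.
    """
    programs = sorted(load_description)
    total_programs = len(programs)  # a real system would agree on this collectively
    loaded = defaultdict(dict)      # node id -> {program: image}
    leaders = {}                    # program -> node chosen as load leader
    file_system_reads = 0

    if total_programs == 0:
        return dict(loaded), leaders, file_system_reads

    num_parallel = min(total_programs, max_class_routes)

    # Process the programs in waves of `num_parallel` programs each.
    for start in range(0, total_programs, num_parallel):
        for program in programs[start:start + num_parallel]:
            # The class route contains only the nodes that need this program.
            class_route = sorted(set(load_description[program]))
            if not class_route:
                continue
            # Assumption: leader selection is modeled as a min-reduction
            # over the node ids in the class route.
            leader = min(class_route)
            leaders[program] = leader
            # Only the leader touches the file system.
            image = read_program(program)
            file_system_reads += 1
            # Broadcast over the class route to every node that needs it.
            for node in class_route:
                loaded[node][program] = image

    return dict(loaded), leaders, file_system_reads
```

In this sketch each program is read from the file system once, regardless of how many nodes need it, which is the efficiency the abstract describes. In an actual parallel system the per-program node groups and broadcasts would be carried out by the interconnect's collective hardware or a message-passing library rather than by a loop.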

Inventors:
Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.; Gooding, Thomas M.; Miller, Samuel J.
Issue Date:
2016 Nov 08
Research Org.:
International Business Machines Corp., Armonk, NY (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1331154
Patent Number(s):
9491259
Application Number:
13/800,948
Assignee:
INTERNATIONAL BUSINESS MACHINES CORPORATION (Armonk, NY)
Patent Classifications (CPCs):
G - PHYSICS G06 - COMPUTING G06F - ELECTRIC DIGITAL DATA PROCESSING
H - ELECTRICITY H04 - ELECTRIC COMMUNICATION TECHNIQUE H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
DOE Contract Number:  
0A-45527
Resource Type:
Patent
Resource Relation:
Patent File Date: 2013 Mar 13
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Aho, Michael E., Attinella, John E., Gooding, Thomas M., Gooding, Thomas M., and Miller, Samuel J. Collectively loading programs in a multiple program multiple data environment. United States: N. p., 2016. Web.
Aho, Michael E., Attinella, John E., Gooding, Thomas M., Gooding, Thomas M., & Miller, Samuel J. Collectively loading programs in a multiple program multiple data environment. United States.
Aho, Michael E., Attinella, John E., Gooding, Thomas M., Gooding, Thomas M., and Miller, Samuel J. 2016. "Collectively loading programs in a multiple program multiple data environment". United States. https://www.osti.gov/servlets/purl/1331154.
@article{osti_1331154,
title = {Collectively loading programs in a multiple program multiple data environment},
author = {Aho, Michael E. and Attinella, John E. and Gooding, Thomas M. and Gooding, Thomas M. and Miller, Samuel J.},
abstractNote = {Techniques are disclosed for loading programs efficiently in a parallel computing system. In one embodiment, nodes of the parallel computing system receive a load description file which indicates, for each program of a multiple program multiple data (MPMD) job, nodes which are to load the program. The nodes determine, using collective operations, a total number of programs to load and a number of programs to load in parallel. The nodes further generate a class route for each program to be loaded in parallel, where the class route generated for a particular program includes only those nodes on which the program needs to be loaded. For each class route, a node is selected using a collective operation to be a load leader which accesses a file system to load the program associated with a class route and broadcasts the program via the class route to other nodes which require the program.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2016},
month = {nov}
}

Works referenced in this record:

Deadlock-free class routes for collective communications embedded in a multi-dimensional torus network
patent, January 2013


Parallel Application Load Balancing and Distributed Work Management
patent-application, March 2008


Multi-Petascale Highly Efficient Parallel Supercomputer
patent-application, September 2011


Dynamically Reassigning a Connected Node to a Block of Compute Nodes for Re-Launching a Failed Job
patent-application, February 2012


Collectively Loading An Application In A Parallel Computer
patent-application, October 2013