Collectively loading programs in a multiple program multiple data environment
Abstract
Techniques are disclosed for loading programs efficiently in a parallel computing system. In one embodiment, nodes of the parallel computing system receive a load description file which indicates, for each program of a multiple program multiple data (MPMD) job, nodes which are to load the program. The nodes determine, using collective operations, a total number of programs to load and a number of programs to load in parallel. The nodes further generate a class route for each program to be loaded in parallel, where the class route generated for a particular program includes only those nodes on which the program needs to be loaded. For each class route, a node is selected using a collective operation to be a load leader which accesses a file system to load the program associated with a class route and broadcasts the program via the class route to other nodes which require the program.
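The flow described in the abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the patented implementation: the dictionary standing in for the load description file, the `max_routes` limit on simultaneous class routes, the lowest-rank rule for choosing a load leader, and the `read_program` callback are all hypothetical, and the collective operations are replaced by ordinary Python calls.

```python
from typing import Callable, Dict, List, Set, Tuple


def plan_loads(load_description: Dict[str, Set[int]], max_routes: int) -> List[List[str]]:
    """Split the programs into batches of at most max_routes (assumed limit)."""
    programs = sorted(load_description)
    return [programs[i:i + max_routes] for i in range(0, len(programs), max_routes)]


def elect_leader(class_route: Set[int]) -> int:
    """Stand-in for a collective reduction: the lowest rank wins (assumption)."""
    return min(class_route)


def collective_load(
    load_description: Dict[str, Set[int]],
    max_routes: int,
    read_program: Callable[[str], bytes],
) -> Tuple[Dict[int, Dict[str, bytes]], Dict[str, int], int]:
    """Simulate the load: one file system read per program, then a broadcast."""
    loaded: Dict[int, Dict[str, bytes]] = {
        rank: {} for nodes in load_description.values() for rank in nodes
    }
    leaders: Dict[str, int] = {}
    fs_reads = 0
    for batch in plan_loads(load_description, max_routes):
        for program in batch:
            # The class route holds only the nodes that need this program.
            route = load_description[program]
            leaders[program] = elect_leader(route)
            # Only the load leader touches the file system.
            image = read_program(program)
            fs_reads += 1
            # Broadcast over the class route to every node that needs it.
            for rank in route:
                loaded[rank][program] = image
    return loaded, leaders, fs_reads


# Hypothetical MPMD job: three programs spread over six nodes.
description = {"ocean.exe": {0, 1, 2}, "atmos.exe": {3, 4}, "coupler.exe": {5}}
loaded, leaders, fs_reads = collective_load(
    description, max_routes=2, read_program=lambda name: name.encode()
)
```

In this sketch the file system is read once per program rather than once per node, which is the saving the abstract points to. On a real machine the leader election and the broadcast would each be collective operations restricted to the class route.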
- Inventors:
- Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.; Miller, Samuel J.
- Issue Date:
- Research Org.:
- International Business Machines Corp., Armonk, NY (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1487107
- Patent Number(s):
- 10104202
- Application Number:
- 13/801,165
- Assignee:
- International Business Machines Corporation (Armonk, NY)
- Patent Classifications (CPCs):
- G - PHYSICS; G06 - COMPUTING; G06F - ELECTRIC DIGITAL DATA PROCESSING
- H - ELECTRICITY; H04 - ELECTRIC COMMUNICATION TECHNIQUE; H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- Resource Type:
- Patent
- Resource Relation:
- Patent File Date: 2013 Mar 13
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING
Citation Formats
Aho, Michael E., Attinella, John E., Gooding, Thomas M., and Miller, Samuel J. Collectively loading programs in a multiple program multiple data environment. United States: N. p., 2018. Web.
Aho, Michael E., Attinella, John E., Gooding, Thomas M., & Miller, Samuel J. Collectively loading programs in a multiple program multiple data environment. United States.
Aho, Michael E., Attinella, John E., Gooding, Thomas M., and Miller, Samuel J. 2018. "Collectively loading programs in a multiple program multiple data environment". United States. https://www.osti.gov/servlets/purl/1487107.
@article{osti_1487107,
title = {Collectively loading programs in a multiple program multiple data environment},
author = {Aho, Michael E. and Attinella, John E. and Gooding, Thomas M. and Miller, Samuel J.},
abstractNote = {Techniques are disclosed for loading programs efficiently in a parallel computing system. In one embodiment, nodes of the parallel computing system receive a load description file which indicates, for each program of a multiple program multiple data (MPMD) job, nodes which are to load the program. The nodes determine, using collective operations, a total number of programs to load and a number of programs to load in parallel. The nodes further generate a class route for each program to be loaded in parallel, where the class route generated for a particular program includes only those nodes on which the program needs to be loaded. For each class route, a node is selected using a collective operation to be a load leader which accesses a file system to load the program associated with a class route and broadcasts the program via the class route to other nodes which require the program.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2018},
month = {10}
}
Works referenced in this record:
Deadlock-free class routes for collective communications embedded in a multi-dimensional torus network
patent, January 2013
- Chen, Dong; Eisley, Noel A.; Steinmacher-Burow, Burkhard
- US Patent Document 8,364,844
Parallel Application Load Balancing and Distributed Work Management
patent-application, March 2008
- Archer, Charles J.; Mullins, Timothy J.; Ratterman, Joseph D.
- US Patent Application 11/469107; 20080059555
Providing Point To Point Communications Among Compute Nodes In A Global Combining Network Of A Parallel Computer
patent-application, January 2010
- Archer, Charles J.; Peters, Amanda E.; Smith, Brian E.
- US Patent Application 12/176840; 20100014523
Multi-Petascale Highly Efficient Parallel Supercomputer
patent-application, September 2011
- Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.
- US Patent Application 13/004007; 20110219208
Dynamically Reassigning a Connected Node to a Block of Compute Nodes for Re-Launching a Failed Job
patent-application, February 2012
- Budnik, Thomas A.; Knudson, Brant L.; Megerian, Mark G.
- US Patent Application 12/861426; 20120047393
Collectively Loading An Application In A Parallel Computer
patent-application, October 2013
- Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.
- US Patent Application 13/431248; 20130263138