Performing process migration with allreduce operations
Abstract
Compute nodes perform allreduce operations that swap processes between nodes. A first allreduce operation generates a first result from a first process contributed by a first compute node, a second process contributed by a second compute node, and zeros contributed by the other compute nodes. The first compute node replaces the first process with the first result. A second allreduce operation generates a second result from the first result at the first compute node, the second process at the second compute node, and zeros from the other compute nodes. The second compute node replaces the second process with the second result, which is the first process. A third allreduce operation generates a third result from the first result at the first compute node, the second result at the second compute node, and zeros from the other compute nodes. The first compute node replaces the first result with the third result, which is the second process.
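The three steps in the abstract can be sketched as a small single-machine simulation. This is only an illustration under stated assumptions: the abstract does not name the reduction operator, so bitwise XOR is assumed here because it is consistent with the stated results (the second result equals the first process, the third equals the second process). The function names `allreduce_xor` and `swap_processes` are hypothetical, each process is represented as a byte string, and the two swapped images are assumed to be padded to equal length.

```python
from functools import reduce


def allreduce_xor(contributions):
    """Simulate an allreduce: return the bytewise XOR of every node's contribution.

    Assumption: XOR is the reduction operator; the abstract does not name one.
    """
    length = len(contributions[0])
    if any(len(c) != length for c in contributions):
        raise ValueError("all contributions must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*contributions))


def swap_processes(buffers, first, second):
    """Swap the images held by nodes `first` and `second` in three allreduce steps.

    `buffers[i]` is the (hypothetical) byte image held by node i. Nodes other
    than `first` and `second` contribute zeros and keep their own buffers.
    """
    if len(buffers[first]) != len(buffers[second]):
        raise ValueError("swapped images must be padded to equal length")
    zeros = bytes(len(buffers[first]))

    def contributions():
        # Only the two participating nodes contribute real data.
        return [buf if i in (first, second) else zeros for i, buf in enumerate(buffers)]

    # Step 1: first result = A xor B; the first node stores it in place of A.
    buffers[first] = allreduce_xor(contributions())
    # Step 2: second result = (A xor B) xor B = A; the second node stores it.
    buffers[second] = allreduce_xor(contributions())
    # Step 3: third result = (A xor B) xor A = B; the first node stores it.
    buffers[first] = allreduce_xor(contributions())
    return buffers


if __name__ == "__main__":
    nodes = [b"proc-A", b"idle-1", b"proc-B", b"idle-3"]
    print(swap_processes(nodes, 0, 2))
```

In this sketch neither node needs a temporary copy of the other's image: each step overwrites exactly one buffer with the reduction result, which mirrors the replace steps the abstract describes.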
- Inventors:
- Archer, Charles Jens; Peters, Amanda; Wallenfelt, Brian Paul
- Rochester, MN
- Plymouth, MN
- Issue Date:
- Research Org.:
- International Business Machines Corp., Armonk, NY (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1009532
- Patent Number(s):
- 7853639
- Application Number:
- 11/531,175
- Assignee:
- International Business Machines Corporation (Armonk, NY)
- Patent Classifications (CPCs):
- G - PHYSICS; G06 - COMPUTING; G06F - ELECTRIC DIGITAL DATA PROCESSING
- DOE Contract Number:
- B519700
- Resource Type:
- Patent
- Country of Publication:
- United States
- Language:
- English
Citation Formats
Archer, Charles Jens, Peters, Amanda, and Wallenfelt, Brian Paul. Performing process migration with allreduce operations. United States: N. p., 2010. Web.
Archer, Charles Jens, Peters, Amanda, & Wallenfelt, Brian Paul. Performing process migration with allreduce operations. United States.
Archer, Charles Jens, Peters, Amanda, and Wallenfelt, Brian Paul. "Performing process migration with allreduce operations". United States. https://www.osti.gov/servlets/purl/1009532.
@article{osti_1009532,
title = {Performing process migration with allreduce operations},
author = {Archer, Charles Jens and Peters, Amanda and Wallenfelt, Brian Paul},
abstractNote = {Compute nodes perform allreduce operations that swap processes between nodes. A first allreduce operation generates a first result from a first process contributed by a first compute node, a second process contributed by a second compute node, and zeros contributed by the other compute nodes. The first compute node replaces the first process with the first result. A second allreduce operation generates a second result from the first result at the first compute node, the second process at the second compute node, and zeros from the other compute nodes. The second compute node replaces the second process with the second result, which is the first process. A third allreduce operation generates a third result from the first result at the first compute node, the second result at the second compute node, and zeros from the other compute nodes. The first compute node replaces the first result with the third result, which is the second process.},
place = {United States},
year = {2010},
month = {12}
}