Effecting a broadcast with an allreduce operation on a parallel computer
Abstract
A parallel computer comprises a plurality of compute nodes organized into at least one operational group for collective parallel operations. Each compute node is assigned a unique rank and is coupled for data communications through a global combining network. One compute node is assigned to be a logical root. A send buffer and a receive buffer are configured. Each element of the logical root's contribution in the send buffer is contributed. One or more zeros corresponding to the size of the element are injected. An allreduce operation with a bitwise OR is performed using the element and the injected zeros. The result of the allreduce operation is determined and stored in each receive buffer.
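The idea in the abstract can be illustrated with a small single-process simulation. This is only a sketch, not the patented implementation: it assumes the non-root nodes are the ones that inject zeros, models each compute node as a Python list, and stands in for the global combining network with an ordinary loop. The names `allreduce_or` and `broadcast_via_allreduce` are hypothetical and chosen for illustration. Because `x | 0 == x`, the bitwise OR across all contributions reproduces the root's data on every node.

```python
from functools import reduce
from operator import or_


def allreduce_or(contributions):
    """Simulate an allreduce with bitwise OR.

    Every node receives the element-wise OR of all contributions.
    """
    combined = [reduce(or_, column) for column in zip(*contributions)]
    # Each node gets its own copy of the combined result.
    return [list(combined) for _ in contributions]


def broadcast_via_allreduce(send_buffers, root):
    """Deliver the root's send buffer to every node using only an OR allreduce.

    Assumption: non-root nodes contribute zeros matching the root's element count.
    """
    length = len(send_buffers[root])
    contributions = [
        list(buf) if rank == root else [0] * length
        for rank, buf in enumerate(send_buffers)
    ]
    return allreduce_or(contributions)


# Four simulated nodes; rank 1 is the logical root.
send_buffers = [[7, 7, 7], [0x12, 0x34, 0x56], [9, 9, 9], [1, 2, 3]]
receive_buffers = broadcast_via_allreduce(send_buffers, root=1)
print(receive_buffers)
```

In this sketch the contents of the non-root send buffers are ignored entirely; only the root's elements and the injected zeros enter the reduction, which is what makes the allreduce behave like a broadcast.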
- Inventors:
- Ardsley, NY
- Rochester, MN
- Issue Date:
- Research Org.:
- International Business Machines Corp., Armonk, NY (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1017453
- Patent Number(s):
- 7827385
- Application Number:
- 11/832,918
- Assignee:
- International Business Machines Corporation (Armonk, NY)
- Patent Classifications (CPCs):
- G - PHYSICS; G06 - COMPUTING; G06F - ELECTRIC DIGITAL DATA PROCESSING
- DOE Contract Number:
- B519700
- Resource Type:
- Patent
- Country of Publication:
- United States
- Language:
- English
Citation Formats
Almasi, Gheorghe, Archer, Charles J, Ratterman, Joseph D, and Smith, Brian E. Effecting a broadcast with an allreduce operation on a parallel computer. United States: N. p., 2010. Web.
Almasi, Gheorghe, Archer, Charles J, Ratterman, Joseph D, & Smith, Brian E. Effecting a broadcast with an allreduce operation on a parallel computer. United States.
Almasi, Gheorghe, Archer, Charles J, Ratterman, Joseph D, and Smith, Brian E. "Effecting a broadcast with an allreduce operation on a parallel computer". United States. https://www.osti.gov/servlets/purl/1017453.
@article{osti_1017453,
title = {Effecting a broadcast with an allreduce operation on a parallel computer},
author = {Almasi, Gheorghe and Archer, Charles J and Ratterman, Joseph D and Smith, Brian E},
abstractNote = {A parallel computer comprises a plurality of compute nodes organized into at least one operational group for collective parallel operations. Each compute node is assigned a unique rank and is coupled for data communications through a global combining network. One compute node is assigned to be a logical root. A send buffer and a receive buffer are configured. Each element of the logical root's contribution in the send buffer is contributed. One or more zeros corresponding to the size of the element are injected. An allreduce operation with a bitwise OR is performed using the element and the injected zeros. The result of the allreduce operation is determined and stored in each receive buffer.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2010},
month = {11}
}