OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Performing an allreduce operation on a plurality of compute nodes of a parallel computer

Patent · OSTI ID: 1040781

Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer. Each compute node includes at least two processing cores. Each processing core has contribution data for the allreduce operation. Performing an allreduce operation on a plurality of compute nodes of a parallel computer includes: establishing one or more logical rings among the compute nodes, each logical ring including at least one processing core from each compute node; performing, for each logical ring, a global allreduce operation using the contribution data for the processing cores included in that logical ring, yielding a global allreduce result for each processing core included in that logical ring; and performing, for each compute node, a local allreduce operation using the global allreduce results for each processing core on that compute node.
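
The three steps in the abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the patented implementation: the function names (`establish_rings`, `ring_allreduce`, `hierarchical_allreduce`) are hypothetical, each core contributes a single scalar, ring k is assumed to contain core k of every node, and the reduction operator is assumed to be associative and commutative (addition by default).

```python
from functools import reduce
import operator


def establish_rings(num_nodes, cores_per_node):
    """Build one logical ring per core index (assumed layout).

    Ring k lists (node, core) pairs for core k of every node, so each
    ring includes exactly one processing core from each compute node.
    """
    return [[(node, k) for node in range(num_nodes)]
            for k in range(cores_per_node)]


def ring_allreduce(values, op):
    """Simulate an allreduce around one ring.

    In each of the n - 1 steps every member forwards the value it last
    received to its successor and folds the incoming value into its own
    accumulator, so all members end with the full reduction.
    """
    n = len(values)
    acc = list(values)
    in_flight = list(values)
    for _ in range(n - 1):
        # Member i receives what member i - 1 held in the previous step.
        in_flight = [in_flight[(i - 1) % n] for i in range(n)]
        acc = [op(a, b) for a, b in zip(acc, in_flight)]
    return acc


def hierarchical_allreduce(contrib, op=operator.add):
    """contrib[node][core] is the contribution of one processing core."""
    num_nodes = len(contrib)
    cores = len(contrib[0])

    # Step 1: establish the logical rings among the compute nodes.
    rings = establish_rings(num_nodes, cores)

    # Step 2: global allreduce within each ring.
    global_result = [[None] * cores for _ in range(num_nodes)]
    for ring in rings:
        reduced = ring_allreduce([contrib[n][c] for n, c in ring], op)
        for (n, c), value in zip(ring, reduced):
            global_result[n][c] = value

    # Step 3: local allreduce on each node over its cores' ring results.
    final = []
    for node in range(num_nodes):
        local = reduce(op, global_result[node])
        final.append([local] * cores)
    return final


if __name__ == "__main__":
    # Three nodes with two cores each; every core ends with the total.
    print(hierarchical_allreduce([[1, 2], [3, 4], [5, 6]]))
```

With three nodes of two cores each and contributions 1 through 6, ring 0 reduces to 9 and ring 1 to 12, and the local step combines them so that every core holds 21, the same value a flat allreduce over all six cores would give.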

Research Organization:
International Business Machines Corp., Armonk, NY (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
B554331
Assignee:
International Business Machines Corporation (Armonk, NY)
Patent Number(s):
8,161,268
Application Number:
12/124,756
OSTI ID:
1040781
Resource Relation:
Patent File Date: 2008 May 21
Country of Publication:
United States
Language:
English
