OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Performing an allreduce operation on a plurality of compute nodes of a parallel computer

Patent · OSTI ID: 1082948

Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer, each node including at least two processing cores, that include: performing, for each node, a local reduction operation using allreduce contribution data for the cores of that node, yielding, for each node, a local reduction result for one or more representative cores for that node; establishing one or more logical rings among the nodes, each logical ring including only one of the representative cores from each node; performing, for each logical ring, a global allreduce operation using the local reduction result for the representative cores included in that logical ring, yielding a global allreduce result for each representative core included in that logical ring; and performing, for each node, a local broadcast operation using the global allreduce results for each representative core on that node.
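The four steps named in the abstract (local reduction, logical rings of representative cores, a global allreduce per ring, local broadcast) can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the patented implementation: the reduction operator is taken to be an element-wise sum, the function names `ring_allreduce` and `hierarchical_allreduce` and the `num_rings` parameter are hypothetical, and the choice to give each ring its own contiguous slice of the data is an assumption that the abstract does not spell out.

```python
from typing import List

Vector = List[int]


def ring_allreduce(vectors: List[Vector]) -> List[Vector]:
    """Simulate a sum allreduce on one logical ring.

    vectors[i] is the data held by ring member i. Each step, every member
    forwards the vector it last received (initially its own) to its
    successor and adds what arrives from its predecessor.
    """
    n = len(vectors)
    acc = [list(v) for v in vectors]
    in_flight = [list(v) for v in vectors]
    for _ in range(n - 1):
        # Member i receives from its ring predecessor (i - 1) % n.
        received = [in_flight[(i - 1) % n] for i in range(n)]
        for i in range(n):
            acc[i] = [a + b for a, b in zip(acc[i], received[i])]
        in_flight = received
    return acc


def hierarchical_allreduce(
    contributions: List[List[Vector]], num_rings: int = 1
) -> List[List[Vector]]:
    """contributions[node][core] is one core's vector; returns the same shape."""
    cores_per_node = min(len(node) for node in contributions)
    if not 1 <= num_rings <= cores_per_node:
        raise ValueError("need one representative core per ring on every node")

    # Step 1: local reduction of all cores' contributions on each node.
    local = [[sum(column) for column in zip(*node)] for node in contributions]

    # Step 2: one logical ring per representative core.
    # Assumption: ring r handles the r-th contiguous slice of the data.
    length = len(local[0])
    bounds = [length * r // num_rings for r in range(num_rings + 1)]

    # Step 3: global allreduce on each ring, using the local reduction results.
    global_result: List[Vector] = [[] for _ in contributions]
    for r in range(num_rings):
        lo, hi = bounds[r], bounds[r + 1]
        ring_outputs = ring_allreduce([vec[lo:hi] for vec in local])
        for node_index, piece in enumerate(ring_outputs):
            global_result[node_index].extend(piece)

    # Step 4: local broadcast of the global result to every core on the node.
    return [
        [list(global_result[node_index]) for _ in node]
        for node_index, node in enumerate(contributions)
    ]
```

Calling `hierarchical_allreduce(data, num_rings=2)` on a nested list shaped `[node][core][element]` returns a list of the same shape in which every core holds the element-wise sum over all cores of all nodes. In this sketch only the representative cores take part in inter-node traffic, which is the point of reducing locally first.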

Research Organization: International Business Machines Corp., Armonk, NY (United States)
Sponsoring Organization: USDOE
DOE Contract Number: B554331
Assignee: International Business Machines Corporation (Armonk, NY)
Patent Number(s): 8,375,197
Application Number: 12/124,763
OSTI ID: 1082948
Country of Publication: United States
Language: English
