OSTI.GOV
U.S. Department of Energy, Office of Scientific and Technical Information

Title: Performing an allreduce operation using shared memory

Patent · OSTI ID: 1134015

Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.
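
The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes details the abstract does not state: threads stand in for processing cores, each work unit reduces one contiguous range of elements, the reduction is a sum, and the names JobStatus, claim_next, and shared_memory_allreduce are hypothetical.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class JobStatus:
    """Hypothetical job status object: records how many work units exist
    and which one should be handed out next."""
    num_units: int
    next_unit: int = 0
    lock: threading.Lock = field(default_factory=threading.Lock)

    def claim_next(self):
        """Return the index of the next unclaimed work unit, or None when
        every unit has already been claimed."""
        with self.lock:
            if self.next_unit >= self.num_units:
                return None
            unit = self.next_unit
            self.next_unit += 1
            return unit


def shared_memory_allreduce(contributions, num_workers=4, chunk=2):
    """Sum equal-length buffers element-wise into one shared result.

    Assumption: a work unit covers `chunk` consecutive elements. Threads
    play the role of cores, and `result` plays the role of shared memory
    that every participant can read once the job is finished.
    """
    length = len(contributions[0])
    result = [0] * length
    num_units = (length + chunk - 1) // chunk  # ceiling division
    job = JobStatus(num_units)                 # established by the receiving core

    def worker():
        # An available core repeatedly determines and performs the next unit.
        while True:
            unit = job.claim_next()
            if unit is None:
                return
            lo = unit * chunk
            hi = min(lo + chunk, length)
            for i in range(lo, hi):
                result[i] = sum(buf[i] for buf in contributions)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result


if __name__ == "__main__":
    buffers = [[1, 2, 3, 4, 5], [10, 20, 30, 40, 50], [100, 200, 300, 400, 500]]
    print(shared_memory_allreduce(buffers))
```

Because each work unit writes a disjoint range of the result, only the claim step in this sketch needs a lock. Idle workers pick up whatever unit is next, so the work is balanced without assigning units to specific workers in advance.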

Research Organization:
International Business Machines Corporation, Armonk, NY (USA)
Sponsoring Organization:
USDOE
DOE Contract Number:
B554331
Assignee:
International Business Machines Corporation
Patent Number(s):
8,752,051
Application Number:
13/427,057
OSTI ID:
1134015
Country of Publication:
United States
Language:
English
