DOE Patents
U.S. Department of Energy, Office of Scientific and Technical Information

Title: Improving efficiency of a global barrier operation in a parallel computer

Abstract

Performing a global barrier operation in a parallel computer that includes compute nodes coupled for data communications, where each compute node executes tasks, with one task on each compute node designated as a master task, including: for each task on each compute node until all master tasks have joined a global barrier: determining whether the task is a master task; if the task is not a master task, joining a single local barrier; if the task is a master task, joining the global barrier and the single local barrier only after all other tasks on the compute node have joined the single local barrier.
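The control flow in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the patented implementation: it assumes that threads stand in for tasks, that groups of threads stand in for compute nodes, and that task 0 on each node is the master. The function name `run_global_barrier`, the event labels, and the polling loop on `n_waiting` are illustrative choices that do not come from the record.

```python
import threading
import time


def run_global_barrier(num_nodes=3, tasks_per_node=4):
    """Simulate the barrier; return an ordered log of (event, node, task)."""
    # Global barrier: joined by exactly one master task per node.
    global_barrier = threading.Barrier(num_nodes)
    # One single local barrier per node, joined by every task on that node.
    local_barriers = [threading.Barrier(tasks_per_node) for _ in range(num_nodes)]
    log = []
    log_lock = threading.Lock()

    def record(event, node, task):
        with log_lock:
            log.append((event, node, task))

    def task_body(node, task):
        local = local_barriers[node]
        if task != 0:
            # Non-master task: join the node's local barrier and block there.
            record("local_join", node, task)
            local.wait()
        else:
            # Master task (assumed to be task 0): wait until every other
            # task on this node is blocked in the local barrier.
            while local.n_waiting < tasks_per_node - 1:
                time.sleep(0.001)
            record("global_join", node, task)
            global_barrier.wait()  # returns once all masters have joined
            local.wait()           # last arrival: releases the node's tasks
        record("released", node, task)

    threads = [
        threading.Thread(target=task_body, args=(node, task))
        for node in range(num_nodes)
        for task in range(tasks_per_node)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    for entry in run_global_barrier():
        print(entry)
```

In this sketch only one task per node takes part in the global barrier, and the master's late arrival at the local barrier is what releases the other tasks on its node, so no task proceeds until every master has joined the global barrier.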

Issue Date:
2016 Oct 04
Research Org.:
International Business Machines Corp., Armonk, NY (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1327915
Patent Number(s):
9459934
Application Number:
13/683,726
Assignee:
International Business Machines Corporation (Armonk, NY)
Patent Classifications (CPCs):
G - PHYSICS
G06 - COMPUTING
G06F - ELECTRIC DIGITAL DATA PROCESSING
DOE Contract Number:  
B554331
Resource Type:
Patent
Resource Relation:
Patent File Date: 2012 Nov 21
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

"Improving efficiency of a global barrier operation in a parallel computer". United States: N. p., 2016. Web. https://www.osti.gov/servlets/purl/1327915.
@article{osti_1327915,
title = {Improving efficiency of a global barrier operation in a parallel computer},
author = {},
abstractNote = {Performing a global barrier operation in a parallel computer that includes compute nodes coupled for data communications, where each compute node executes tasks, with one task on each compute node designated as a master task, including: for each task on each compute node until all master tasks have joined a global barrier: determining whether the task is a master task; if the task is not a master task, joining a single local barrier; if the task is a master task, joining the global barrier and the single local barrier only after all other tasks on the compute node have joined the single local barrier.},
place = {United States},
year = {2016},
month = {10}
}
