Determining collective barrier operation skew in a parallel computer
Abstract
Determining collective barrier operation skew in a parallel computer that includes a number of compute nodes organized into an operational group includes, for each of the nodes until each node has been selected as a delayed node: selecting one of the nodes as a delayed node; entering, by each node other than the delayed node, a collective barrier operation; entering, after a delay, by the delayed node, the collective barrier operation; receiving an exit signal from a root of the collective barrier operation; and measuring, for the delayed node, a barrier completion time. The barrier operation skew is calculated by identifying, from the compute nodes' barrier completion times, a maximum barrier completion time and a minimum barrier completion time, and calculating the barrier operation skew as the difference between the maximum and the minimum barrier completion times.
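The procedure in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the `LATENCY_TO_ROOT_US` table, and the toy timing model (completion time equals a round trip between the delayed node and the root) are all hypothetical. On a real machine, `run_barrier_with_delayed` would execute an actual barrier and return the time the delayed node measures between entering the barrier and receiving the root's exit signal.

```python
from typing import Callable, Dict, Hashable, Sequence, Tuple


def measure_barrier_skew(
    nodes: Sequence[Hashable],
    run_barrier_with_delayed: Callable[[Hashable], int],
) -> Tuple[int, Dict[Hashable, int]]:
    """Give every node one turn as the delayed node and compute the skew.

    run_barrier_with_delayed(node) is a caller-supplied (hypothetical) hook
    that runs one barrier in which `node` enters late and returns the
    barrier completion time measured for that node.
    """
    if not nodes:
        raise ValueError("at least one node is required")

    completion_times: Dict[Hashable, int] = {}
    for delayed in nodes:
        # All other nodes are assumed to be waiting in the barrier already,
        # so the measured time reflects only the delayed node's own path.
        completion_times[delayed] = run_barrier_with_delayed(delayed)

    # Skew = maximum completion time minus minimum completion time.
    skew = max(completion_times.values()) - min(completion_times.values())
    return skew, completion_times


# Hypothetical one-way latencies to the barrier root, in microseconds.
# Node 0 plays the role of the root in this toy model.
LATENCY_TO_ROOT_US = {0: 0, 1: 3, 2: 3, 3: 6}


def simulated_barrier(delayed: int) -> int:
    # Toy model (an assumption): the delayed node's arrival travels to the
    # root, and the exit signal travels back, so the time is one round trip.
    return 2 * LATENCY_TO_ROOT_US[delayed]


skew, times = measure_barrier_skew(sorted(LATENCY_TO_ROOT_US), simulated_barrier)
print(times)  # {0: 0, 1: 6, 2: 6, 3: 12}
print(skew)   # 12
```

Passing the barrier run in as a callable keeps the skew calculation separate from the timing mechanism, so the same loop could wrap a real barrier call in place of the simulated one.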
- Inventors:
- Faraj, Daniel A.
- Issue Date:
- 2015 Nov 24
- Research Org.:
- International Business Machines Corp., Armonk, NY (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1226812
- Patent Number(s):
- 9195516
- Application Number:
- 13/308,917
- Assignee:
- International Business Machines Corporation (Armonk, NY)
- Patent Classifications (CPCs):
- G - PHYSICS; G06 - COMPUTING; G06F - ELECTRIC DIGITAL DATA PROCESSING
- DOE Contract Number:
- B554331
- Resource Type:
- Patent
- Resource Relation:
- Patent File Date: 2011 Dec 01
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING
Citation Formats
Faraj, Daniel A. "Determining collective barrier operation skew in a parallel computer". United States: N. p., 2015. https://www.osti.gov/servlets/purl/1226812.
@article{osti_1226812,
title = {Determining collective barrier operation skew in a parallel computer},
author = {Faraj, Daniel A.},
abstractNote = {Determining collective barrier operation skew in a parallel computer that includes a number of compute nodes organized into an operational group includes: for each of the nodes until each node has been selected as a delayed node: selecting one of the nodes as a delayed node; entering, by each node other than the delayed node, a collective barrier operation; entering, after a delay by the delayed node, the collective barrier operation; receiving an exit signal from a root of the collective barrier operation; and measuring, for the delayed node, a barrier completion time. The barrier operation skew is calculated by: identifying, from the compute nodes' barrier completion times, a maximum barrier completion time and a minimum barrier completion time and calculating the barrier operation skew as the difference of the maximum and the minimum barrier completion time.},
place = {United States},
year = {2015},
month = {11}
}
