DOE Patents, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Determining collective barrier operation skew in a parallel computer

Abstract

Determining collective barrier operation skew in a parallel computer that includes a number of compute nodes organized into an operational group includes: for each of the nodes until each node has been selected as a delayed node: selecting one of the nodes as a delayed node; entering, by each node other than the delayed node, a collective barrier operation; entering, after a delay by the delayed node, the collective barrier operation; receiving an exit signal from a root of the collective barrier operation; and measuring, for the delayed node, a barrier completion time. The barrier operation skew is calculated by: identifying, from the compute nodes' barrier completion times, a maximum barrier completion time and a minimum barrier completion time and calculating the barrier operation skew as the difference of the maximum and the minimum barrier completion time.
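The procedure in the abstract can be illustrated with a small deterministic simulation. This is only a sketch under stated assumptions, not the patented implementation: the per-node one-way latency to the barrier root (`latency_to_root`), the function names, and the timing model (the root releases the barrier as soon as the last arrival reaches it) are all hypothetical and are not taken from the record.

```python
from typing import Dict, List


def measure_completion_times(latency_to_root: List[float], delay: float) -> Dict[int, float]:
    """Simulate one measurement round per node, each node taking a turn as the delayed node.

    latency_to_root[i] is an assumed one-way latency between node i and the
    barrier root. `delay` should be large enough that the delayed node is
    the last one to reach the root.
    """
    n = len(latency_to_root)
    completion: Dict[int, float] = {}
    for delayed in range(n):
        # Every node enters the barrier at t = 0 except the delayed node.
        entry = [0.0] * n
        entry[delayed] = delay
        # The root sees each arrival one hop-latency after the node enters.
        arrivals = [entry[i] + latency_to_root[i] for i in range(n)]
        # The root sends the exit signal once the last arrival is in.
        release = max(arrivals)
        # The delayed node receives the exit signal one hop-latency later.
        exit_time = release + latency_to_root[delayed]
        # Barrier completion time, measured on the delayed node only.
        completion[delayed] = exit_time - entry[delayed]
    return completion


def barrier_skew(completion: Dict[int, float]) -> float:
    """Skew is the difference between the maximum and minimum completion times."""
    times = list(completion.values())
    return max(times) - min(times)


if __name__ == "__main__":
    times = measure_completion_times([1.0, 2.0, 4.0], delay=10.0)
    print(times)
    print(barrier_skew(times))
```

Because the other nodes are already waiting when the delayed node enters, each measured completion time in this model reflects only the delayed node's own round trip to the root, which is why delaying one node at a time isolates per-node differences.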

Inventors:
Faraj, Daniel A.
Issue Date:
Research Org.:
International Business Machines Corp., Armonk, NY (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1226811
Patent Number(s):
9195517
Application Number:
13/685,869
Assignee:
International Business Machines Corporation (Armonk, NY)
Patent Classifications (CPCs):
G - PHYSICS G06 - COMPUTING G06F - ELECTRIC DIGITAL DATA PROCESSING
DOE Contract Number:  
B554331
Resource Type:
Patent
Resource Relation:
Patent File Date: 2012 Nov 27
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Faraj, Daniel A. Determining collective barrier operation skew in a parallel computer. United States: N. p., 2015. Web.
Faraj, Daniel A. Determining collective barrier operation skew in a parallel computer. United States.
Faraj, Daniel A. "Determining collective barrier operation skew in a parallel computer". United States. https://www.osti.gov/servlets/purl/1226811.
@article{osti_1226811,
title = {Determining collective barrier operation skew in a parallel computer},
author = {Faraj, Daniel A.},
abstractNote = {Determining collective barrier operation skew in a parallel computer that includes a number of compute nodes organized into an operational group includes: for each of the nodes until each node has been selected as a delayed node: selecting one of the nodes as a delayed node; entering, by each node other than the delayed node, a collective barrier operation; entering, after a delay by the delayed node, the collective barrier operation; receiving an exit signal from a root of the collective barrier operation; and measuring, for the delayed node, a barrier completion time. The barrier operation skew is calculated by: identifying, from the compute nodes' barrier completion times, a maximum barrier completion time and a minimum barrier completion time and calculating the barrier operation skew as the difference of the maximum and the minimum barrier completion time.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2015},
month = {11}
}
