DOE Patents, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Determining collective barrier operation skew in a parallel computer

Abstract

Determining collective barrier operation skew in a parallel computer that includes a number of compute nodes organized into an operational group includes: for each of the nodes until each node has been selected as a delayed node: selecting one of the nodes as a delayed node; entering, by each node other than the delayed node, a collective barrier operation; entering, after a delay by the delayed node, the collective barrier operation; receiving an exit signal from a root of the collective barrier operation; and measuring, for the delayed node, a barrier completion time. The barrier operation skew is calculated by: identifying, from the compute nodes' barrier completion times, a maximum barrier completion time and a minimum barrier completion time and calculating the barrier operation skew as the difference of the maximum and the minimum barrier completion time.
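The procedure in the abstract can be illustrated with a small single-machine simulation. This is only a sketch under stated assumptions, not the patented implementation: threads stand in for compute nodes, Python's `threading.Barrier` stands in for the collective barrier and its root exit signal, and the "barrier completion time" is assumed to be the interval between the delayed node entering the barrier and being released from it. The names `measure_completion_times`, `barrier_skew`, and `delay_s` are hypothetical.

```python
import threading
import time


def measure_completion_times(num_nodes, delay_s=0.01):
    """Select each node in turn as the delayed node and time its barrier.

    Assumption: completion time = time from the delayed node's entry into
    the barrier until it is released (a stand-in for the root's exit signal).
    """
    completion_times = []
    for delayed in range(num_nodes):
        barrier = threading.Barrier(num_nodes)
        result = {}

        def node(rank, delayed=delayed, barrier=barrier, result=result):
            if rank == delayed:
                time.sleep(delay_s)            # delayed node enters late
                start = time.perf_counter()
                barrier.wait()                 # released once all nodes have entered
                result["time"] = time.perf_counter() - start
            else:
                barrier.wait()                 # other nodes enter immediately

        threads = [threading.Thread(target=node, args=(r,))
                   for r in range(num_nodes)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        completion_times.append(result["time"])
    return completion_times


def barrier_skew(completion_times):
    """Skew = maximum completion time minus minimum completion time."""
    return max(completion_times) - min(completion_times)


if __name__ == "__main__":
    times = measure_completion_times(num_nodes=4)
    print("completion times:", times)
    print("skew:", barrier_skew(times))
```

On a real parallel computer the barrier would be a collective operation across an operational group of nodes, and the measured values would reflect network and tree position rather than thread scheduling; the sketch only mirrors the control flow of the measurement loop and the max-minus-min calculation.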

Inventors:
Faraj, Daniel A.
Issue Date:
Research Org.:
International Business Machines Corp., Armonk, NY (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1226811
Patent Number(s):
9195517
Application Number:
13/685,869
Assignee:
International Business Machines Corporation (Armonk, NY)
Patent Classifications (CPCs):
G - PHYSICS G06 - COMPUTING G06F - ELECTRIC DIGITAL DATA PROCESSING
DOE Contract Number:
B554331
Resource Type:
Patent
Resource Relation:
Patent File Date: 2012 Nov 27
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Faraj, Daniel A. Determining collective barrier operation skew in a parallel computer. United States: N. p., 2015. Web.
Faraj, Daniel A. Determining collective barrier operation skew in a parallel computer. United States.
Faraj, Daniel A. 2015. "Determining collective barrier operation skew in a parallel computer". United States. https://www.osti.gov/servlets/purl/1226811.
@article{osti_1226811,
title = {Determining collective barrier operation skew in a parallel computer},
author = {Faraj, Daniel A.},
abstractNote = {Determining collective barrier operation skew in a parallel computer that includes a number of compute nodes organized into an operational group includes: for each of the nodes until each node has been selected as a delayed node: selecting one of the nodes as a delayed node; entering, by each node other than the delayed node, a collective barrier operation; entering, after a delay by the delayed node, the collective barrier operation; receiving an exit signal from a root of the collective barrier operation; and measuring, for the delayed node, a barrier completion time. The barrier operation skew is calculated by: identifying, from the compute nodes' barrier completion times, a maximum barrier completion time and a minimum barrier completion time and calculating the barrier operation skew as the difference of the maximum and the minimum barrier completion time.},
place = {United States},
year = {2015},
month = {11}
}

Works referenced in this record:

A Clock Synchronization Strategy for Minimizing Clock Variance at Runtime in High-End Computing Environments
conference, October 2010

Replay-Based Synchronization of Timestamps in Event Traces of Massively Parallel Applications
conference, September 2008

  • Becker, Daniel; Linford, John C.; Rabenseifner, Rolf
  • 2008 International Conference on Parallel Processing Workshops (ICPP-W)
  • https://doi.org/10.1109/ICPP-W.2008.17

Probabilistic internal clock synchronization
conference, January 1994

The accuracy of the clock synchronization achieved by TEMPO in Berkeley UNIX 4.3BSD
journal, July 1989