Fast, contention-free combining tree barriers for shared-memory multiprocessors
- Univ. of Rochester, NY (United States)
- Rice Univ., Houston, TX (United States)
In a previous article, Gupta and Hill introduced an adaptive combining tree algorithm for busy-wait barrier synchronization on shared-memory multiprocessors. The intent of the algorithm was to achieve a barrier in logarithmic time when processes arrive simultaneously, and in constant time after the last arrival when arrival times are skewed. A fuzzy version of the algorithm allows a process to perform useful work between the point at which it notifies other processes of its arrival and the point at which it waits for all other processes to arrive. Unfortunately, adaptive combining tree barriers as originally devised perform a large amount of work at each node of the tree, including the acquisition and release of locks. They also perform an unbounded number of accesses to nonlocal locations, inducing large amounts of memory and interconnect contention. We present new adaptive combining tree barriers that eliminate these problems. We compare the performance of the new algorithms to that of other fast barriers on a 64-node BBN Butterfly 1 multiprocessor, a 35-node BBN TC2000, and a 126-node KSR 1. The results reveal scenarios in which our algorithms outperform all known alternatives, and suggest that both adaptation and the combination of fuzziness with tree-style synchronization will be of increasing importance on future generations of shared-memory multiprocessors.
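As a rough illustration of the fuzzy, tree-style synchronization described above, the following Python sketch shows a static (non-adaptive) combining tree barrier split into separate arrive and wait phases. It is not the algorithm from the article. The names `FuzzyTreeBarrier`, `arrive`, `wait`, and `demo` are hypothetical, a per-node lock stands in for an atomic fetch-and-decrement, and every waiter spins on one shared flag, which is the kind of nonlocal access the authors set out to eliminate.

```python
import threading
import time


class _Node:
    """One combining-tree node; counts arrivals from its children."""

    def __init__(self, expected):
        self.expected = expected      # number of children (threads or nodes)
        self.count = expected         # arrivals still outstanding this episode
        self.parent = None
        self.lock = threading.Lock()  # stand-in for atomic fetch-and-decrement


class FuzzyTreeBarrier:
    """Illustrative static combining tree barrier with sense reversal.

    arrive() announces arrival; wait() blocks until everyone has arrived.
    Work placed between the two calls overlaps with slower threads.
    """

    def __init__(self, n_threads, fan_in=2):
        if n_threads < 1 or fan_in < 2:
            raise ValueError("need n_threads >= 1 and fan_in >= 2")
        self._fan_in = fan_in
        self._sense = False                  # global release flag
        self._local = [False] * n_threads    # per-thread expected sense
        # Leaf level: one node per group of fan_in threads.
        level = [_Node(min(fan_in, n_threads - i))
                 for i in range(0, n_threads, fan_in)]
        self._leaves = level
        # Build upper levels until a single root remains.
        while len(level) > 1:
            parents = []
            for i in range(0, len(level), fan_in):
                group = level[i:i + fan_in]
                parent = _Node(len(group))
                for child in group:
                    child.parent = parent
                parents.append(parent)
            level = parents

    def arrive(self, tid):
        self._local[tid] = not self._local[tid]
        node = self._leaves[tid // self._fan_in]
        while node is not None:
            with node.lock:
                node.count -= 1
                last = node.count == 0
                if last:
                    node.count = node.expected  # reset for the next episode
            if not last:
                return                          # someone else will continue upward
            node = node.parent
        # Last arrival at the root: release all waiters.
        self._sense = self._local[tid]

    def wait(self, tid):
        while self._sense != self._local[tid]:
            time.sleep(0)  # yield instead of a hard spin (assumption for CPython)


def demo(n_threads=8, episodes=20):
    """Run several barrier episodes; return arrival counts and any early exits."""
    barrier = FuzzyTreeBarrier(n_threads)
    arrived = [0] * episodes
    tally = threading.Lock()
    early = []  # (tid, episode) pairs that left wait() before all had arrived

    def worker(tid):
        for ep in range(episodes):
            with tally:
                arrived[ep] += 1
            barrier.arrive(tid)
            # Independent work could run here while stragglers catch up.
            barrier.wait(tid)
            if arrived[ep] != n_threads:
                early.append((tid, ep))

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return arrived, early
```

In this sketch only the last arrival at each node climbs toward the root, so a simultaneous arrival takes a number of steps proportional to the tree height. The adaptive variants discussed in the abstract go further by restructuring the tree so that a late arrival finishes in constant time, which this static version does not attempt.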
- OSTI ID: 7020429
- Journal Information: International Journal of Parallel Programming; (United States), Vol. 22:4; ISSN 0885-7458
- Country of Publication: United States
- Language: English
Similar Records
A distributed fair polling scheme applied to OR-parallel logic programming
Hierarchical N-body methods on shared address space multiprocessors