OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Gaussian techniques on shared-memory multiprocessors

Technical Report · OSTI ID: 5113829

The performance characteristics of numerical algorithms running on single-processor computers are well understood in terms of operation count and vectorization. On a shared-memory multiprocessor, one must also consider the effects of processor synchronization, serial sections, memory access conflicts, and load imbalances. This thesis considers the performance of Gauss and Gauss-Jordan elimination on a shared-memory multiprocessor. Because real multiprocessors with appropriately pipelined functional units and suitably large numbers of processors are not yet available, the Cerberus multiprocessor simulator is used to evaluate algorithm performance. Parallel algorithms commonly use a general-purpose synchronization strategy in which barriers satisfy data dependencies. A barrier requires each processor to wait until all other processors have arrived before execution continues. Barrier synchronization can satisfy most data dependencies, but in many cases it enforces more synchronization than is needed. A key result of this work is that a custom synchronization strategy that explicitly exploits the data dependencies of the Gauss elimination algorithm can outperform the generic barrier strategy without special hardware support for synchronization operations. For multiprocessor algorithms, traditional operation-count analysis can be a poor predictor of performance: an algorithm that is a poor choice on a single-processor vector architecture might become the star performer on a multiprocessor. Another result of this work is that Gauss-Jordan elimination, whose operation count is 50% greater than that of Gauss elimination, can outperform Gauss elimination as the number of processors is increased for a fixed problem size.
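
The contrast between a generic barrier and dependency-aware synchronization described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the cyclic assignment of rows to workers, the absence of pivoting, and in particular the per-row "ready flag" scheme, which is one plausible way to exploit the data dependencies of Gauss elimination and is not claimed to be the strategy developed in the thesis. Because CPython threads share a global interpreter lock, the sketch shows the synchronization structure only, not a speedup.

```python
import threading

def _update_row(a, b, i, k):
    # Eliminate column k from row i using pivot row k (no pivoting: assumes a[k][k] != 0).
    factor = a[i][k] / a[k][k]
    for j in range(k, len(a)):
        a[i][j] -= factor * a[k][j]
    b[i] -= factor * b[k]

def _back_substitute(a, b):
    # Serial back substitution on the upper-triangular system.
    n = len(a)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x

def _run_workers(worker, p):
    threads = [threading.Thread(target=worker, args=(w,)) for w in range(p)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

def solve_with_barrier(a, b, p=4):
    """Generic strategy: every worker waits for all others after each pivot step."""
    n = len(a)
    barrier = threading.Barrier(p)

    def worker(w):
        for k in range(n - 1):
            for i in range(k + 1, n):
                if i % p == w:          # hypothetical cyclic row ownership
                    _update_row(a, b, i, k)
            barrier.wait()              # no one starts step k+1 until step k is done everywhere

    _run_workers(worker, p)
    return _back_substitute(a, b)

def solve_with_row_flags(a, b, p=4):
    """Illustrative custom strategy: wait only until the next pivot row is final."""
    n = len(a)
    ready = [threading.Event() for _ in range(n)]
    ready[0].set()                      # row 0 needs no updates before serving as a pivot

    def worker(w):
        for k in range(n - 1):
            ready[k].wait()             # the only dependency of step k: pivot row k is final
            for i in range(k + 1, n):
                if i % p == w:
                    _update_row(a, b, i, k)
                    if i == k + 1:
                        ready[i].set()  # row k+1 is now final; release step k+1 early

    _run_workers(worker, p)
    return _back_substitute(a, b)
```

In the flag-based variant, a worker that finishes its rows for step k can proceed to step k+1 as soon as the owner of row k+1 has updated that row, rather than waiting for every worker to finish step k. Both functions modify `a` and `b` in place, so pass copies if the originals are needed afterward.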

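The 50% figure quoted in the abstract for Gauss-Jordan elimination can be checked with a short operation count. The sketch below is a simplification under stated assumptions: it counts only the multiply-add pairs applied to matrix entries to the right of the pivot column, and ignores the right-hand side, the divisions, and back substitution, all of which are lower-order terms. The function names are hypothetical.

```python
def gauss_multiply_adds(n):
    # Step k updates the (n-1-k) rows below the pivot, each across (n-1-k) columns.
    return sum((n - 1 - k) * (n - 1 - k) for k in range(n))

def gauss_jordan_multiply_adds(n):
    # Step k updates all (n-1) other rows, each across (n-1-k) columns.
    return sum((n - 1) * (n - 1 - k) for k in range(n))

def operation_ratio(n):
    # Ratio of Gauss-Jordan work to Gauss elimination work; requires n >= 2.
    return gauss_jordan_multiply_adds(n) / gauss_multiply_adds(n)
```

Under this counting the ratio equals 3(n-1)/(2n-1), so `operation_ratio(1000)` is approximately 1.499 and the ratio approaches 1.5 as n grows, consistent with the 50% overhead stated in the abstract.
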
Research Organization:
Lawrence Livermore National Lab., CA (USA)
DOE Contract Number:
W-7405-ENG-48
OSTI ID:
5113829
Report Number(s):
UCRL-53863; ON: DE88010594
Resource Relation:
Other Information: Thesis (M.S.). Submitted to Univ. of California, Davis
Country of Publication:
United States
Language:
English

Similar Records

Gaussian techniques on shared memory multiprocessor computers
Conference · Jan 1, 1988 · OSTI ID:5113829

Fast, contention-free combining tree barriers for shared-memory multiprocessors
Journal Article · Aug 1, 1994 · International Journal of Parallel Programming; (United States) · OSTI ID:5113829

The performance implications of thread management alternatives for shared-memory multiprocessors
Journal Article · Dec 1, 1989 · IEEE (Institute of Electrical and Electronics Engineers) Transactions on Computers; (USA) · OSTI ID:5113829