U.S. Department of Energy
Office of Scientific and Technical Information

Improved load distribution in parallel sparse Cholesky factorization

Book · OSTI ID: 87679
 [1];  [2]
  1. Intel Supercomputer Systems Division, Beaverton, OR (United States)
  2. NASA Ames Research Center, Moffett Field, CA (United States). Research Inst. for Advanced Computer Science
Compared to the customary column-oriented approaches, block-oriented, distributed-memory sparse Cholesky factorization benefits from an asymptotic reduction in interprocessor communication volume and an asymptotic increase in the amount of concurrency that is exposed in the problem. Unfortunately, block-oriented approaches (specifically, the block fan-out method) have suffered from poor balance of the computational load. As a result, achieved performance can be quite low. This paper investigates the reasons for this load imbalance and proposes simple block mapping heuristics that dramatically improve it. The result is a roughly 20% increase in realized parallel factorization performance, as demonstrated by performance results from an Intel Paragon™ system. The authors have achieved performance of nearly 3.2 billion floating point operations per second with this technique on a 196-node Paragon system.
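The abstract does not spell out the mapping heuristics, so the following is only a minimal sketch of the general idea, under stated assumptions: block work is synthetic (a dense lower-triangular block matrix where block (i, j) costs j + 1 units), the baseline is a 2-D cyclic mapping onto a pr x pc processor grid, and the alternative is a greedy "heaviest first, least-loaded bin" assignment of block rows and block columns. The greedy rule and all function names (`block_work`, `cyclic_map`, `greedy_map`, `balance`, `compare`) are hypothetical illustrations, not the authors' method.

```python
def block_work(n):
    """Synthetic work per block: block (i, j), j <= i, costs j + 1 units (assumption)."""
    return [[(j + 1) if j <= i else 0 for j in range(n)] for i in range(n)]


def cyclic_map(n, p):
    """Map index k to processor row/column k mod p."""
    return [k % p for k in range(n)]


def greedy_map(weights, p):
    """Assign indices, heaviest first, to the currently least-loaded of p bins."""
    bins = [0] * p
    mapping = [0] * len(weights)
    for k in sorted(range(len(weights)), key=lambda k: -weights[k]):
        b = min(range(p), key=lambda b: bins[b])
        mapping[k] = b
        bins[b] += weights[k]
    return mapping


def balance(work, row_map, col_map, pr, pc):
    """Load balance = total work / (processor count * maximum per-processor work)."""
    load = [[0] * pc for _ in range(pr)]
    for i, row in enumerate(work):
        for j, w in enumerate(row):
            load[row_map[i]][col_map[j]] += w
    total = sum(map(sum, load))
    peak = max(map(max, load))
    return total / (pr * pc * peak)


def compare(n, pr, pc):
    """Return (cyclic balance, greedy balance) for an n x n block matrix."""
    work = block_work(n)
    row_w = [sum(row) for row in work]        # work per block row
    col_w = [sum(col) for col in zip(*work)]  # work per block column
    cyc = balance(work, cyclic_map(n, pr), cyclic_map(n, pc), pr, pc)
    gre = balance(work, greedy_map(row_w, pr), greedy_map(col_w, pc), pr, pc)
    return cyc, gre


if __name__ == "__main__":
    print(compare(8, 2, 2))
```

For this toy 8 x 8 block matrix on a 2 x 2 grid, the cyclic mapping gives a balance of 0.75 and the greedy remapping about 0.968. These numbers come from the synthetic work model only and say nothing about the Paragon results reported above.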
OSTI ID: 87679
Report Number(s): CONF-941118--; ISBN 0-8186-6605-6
Country of Publication: United States
Language: English

Similar Records

Efficient parallel sparse Cholesky factorization
Conference · Thu Nov 30 23:00:00 EST 1995 · OSTI ID:125543

Block sparse Cholesky algorithms on advanced uniprocessor computers
Journal Article · Wed Sep 01 00:00:00 EDT 1993 · SIAM Journal on Scientific and Statistical Computing (Society for Industrial and Applied Mathematics); (United States) · OSTI ID:6110603

Block sparse Cholesky algorithms on advanced uniprocessor computers
Technical Report · Sat Nov 30 23:00:00 EST 1991 · OSTI ID:6097949