U.S. Department of Energy
Office of Scientific and Technical Information

Efficient parallel sparse Cholesky factorization

Conference
OSTI ID:125543
 [1];  [2]
  1. Silicon Graphics, Inc., Mountain View, CA (United States)
  2. NASA Ames Research Center, Moffett Field, CA (United States)
Scalable implementations of sparse Cholesky factorization use block mappings of the dense parts of the matrix onto a 2-D processor grid, mappings which reduce communication cost. With naive cyclic mappings, block-oriented approaches (specifically, the block fan-out method) suffer from poor balance of the computational load and modest efficiency. Here, we show that heuristic remapping of the block rows and columns essentially removes load imbalance as a cause of inefficiency, producing a 20% increase in realized performance on a 196-node Paragon multicomputer. To attack the remaining inefficiency, we consider the scheduling of available tasks at each node. Priorities based on elimination tree depth yield another 10% improvement.
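To make the load-balance argument concrete, the following is a minimal sketch that compares a cyclic 2-D block mapping with a simple greedy remapping of block rows and columns. It is illustrative only: the abstract does not specify the remapping heuristic, so `greedy_map` (heaviest-first assignment to the least-loaded processor row or column) is a stand-in, and the cost model in `toy_work` (block (i, j) of a dense lower triangle costs j + 1 units) is a hypothetical assumption, as are all function names.

```python
def cyclic_map(n, p):
    """Cyclic mapping: block row (or column) i goes to processor row (or column) i mod p."""
    return [i % p for i in range(n)]


def greedy_map(weights, p):
    """Assumed stand-in heuristic: take indices heaviest first and give each
    one to the currently least-loaded of p bins."""
    load = [0.0] * p
    assignment = [0] * len(weights)
    for i in sorted(range(len(weights)), key=lambda k: -weights[k]):
        target = min(range(p), key=lambda q: load[q])
        assignment[i] = target
        load[target] += weights[i]
    return assignment


def balance(work, row_map, col_map, pr, pc):
    """Total work divided by (processor count * max per-processor work).
    A value of 1.0 means perfect balance."""
    per_proc = [[0.0] * pc for _ in range(pr)]
    total = 0.0
    for (i, j), w in work.items():
        per_proc[row_map[i]][col_map[j]] += w
        total += w
    worst = max(max(row) for row in per_proc)
    return total / (pr * pc * worst)


def toy_work(n):
    """Hypothetical cost model for an n-by-n block lower triangle:
    block (i, j) with i >= j costs j + 1 units."""
    return {(i, j): float(j + 1) for i in range(n) for j in range(i + 1)}


def compare(n, pr, pc):
    """Return (cyclic balance, remapped balance) for the toy cost model."""
    work = toy_work(n)
    row_w = [sum(w for (i, _), w in work.items() if i == r) for r in range(n)]
    col_w = [sum(w for (_, j), w in work.items() if j == c) for c in range(n)]
    cyc = balance(work, cyclic_map(n, pr), cyclic_map(n, pc), pr, pc)
    rem = balance(work, greedy_map(row_w, pr), greedy_map(col_w, pc), pr, pc)
    return cyc, rem
```

Calling `compare(4, 2, 2)` evaluates both mappings on a 4-by-4 block triangle over a 2-by-2 processor grid. Because rows and columns are remapped independently, a greedy scheme like this one is not guaranteed to improve every instance; it only illustrates why moving away from a purely cyclic assignment can reduce imbalance.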
OSTI ID: 125543
Report Number(s): CONF-950212--
Country of Publication: United States
Language: English

Similar Records

Improved load distribution in parallel sparse Cholesky factorization
Book · Dec 30, 1994 · OSTI ID:87679

A scalable parallel algorithm for sparse Cholesky factorization
Book · Dec 30, 1994 · OSTI ID:87680

A mapping algorithm for parallel sparse Cholesky factorization
Journal Article · Sep 1, 1993 · SIAM Journal on Scientific and Statistical Computing (Society for Industrial and Applied Mathematics); (United States) · OSTI ID:6107258