Basker: Parallel sparse LU factorization utilizing hierarchical parallelism and data layouts
Journal Article · Parallel Computing
- Bucknell Univ., Lewisburg, PA (United States)
- Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Transient simulation in circuit simulation tools, such as SPICE and Xyce, depends on scalable and robust sparse LU factorizations for efficient numerical simulation of circuits and power grids. As the need for simulations of very large circuits grows, the prevalence of multicore architectures enables us to use shared memory parallel algorithms for such simulations. A parallel factorization is a critical component of such shared memory parallel simulations. We develop a parallel sparse factorization algorithm that solves problems from circuit simulations efficiently and maps well to architectural features. This new factorization algorithm exposes hierarchical parallelism to accommodate the irregular structures that arise in our target problems. It also uses a hierarchical two-dimensional data layout, which reduces synchronization costs and maps to the memory hierarchy found in multicore processors. We present an OpenMP-based implementation of the parallel algorithm in a new multithreaded solver called Basker in the Trilinos framework. We evaluate the performance of Basker on the Intel SandyBridge and Xeon Phi platforms using circuit and power grid matrices taken from the University of Florida sparse matrix collection and from Xyce circuit simulation. Basker achieves a geometric mean speedup of 5.91× on CPU (16 cores) and 7.4× on Xeon Phi (32 cores) relative to the state-of-the-art solver KLU. Basker outperforms the Intel MKL Pardiso solver (PMKL) by as much as 30× on CPU (16 cores) and 7.5× on Xeon Phi (32 cores) for low fill-in circuit matrices. Furthermore, Basker provides a 5.4× speedup on a challenging matrix sequence taken from an actual Xyce simulation.
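The headline figures in the abstract are geometric mean speedups over a set of test matrices. As a minimal sketch of how such a figure is commonly computed, the Python snippet below takes per-matrix factorization times for a baseline solver and a parallel solver and returns the geometric mean of the per-matrix ratios. The function name `geometric_mean_speedup` and the timings in the usage example are hypothetical and are not taken from the article; whether the authors aggregated their measurements in exactly this way is an assumption.

```python
import math


def geometric_mean_speedup(baseline_times, parallel_times):
    """Geometric mean of per-matrix speedups (baseline time / parallel time).

    Hypothetical helper for illustration only; not part of Basker, KLU, or Trilinos.
    """
    if len(baseline_times) != len(parallel_times) or not baseline_times:
        raise ValueError("need two non-empty timing lists of equal length")

    log_sum = 0.0
    for t_base, t_par in zip(baseline_times, parallel_times):
        if t_base <= 0 or t_par <= 0:
            raise ValueError("timings must be positive")
        # Summing logarithms avoids overflow when many ratios are multiplied.
        log_sum += math.log(t_base / t_par)

    return math.exp(log_sum / len(baseline_times))


if __name__ == "__main__":
    # Illustrative timings in seconds (made up, NOT measurements from the article).
    serial_times = [4.0, 9.0, 2.5]
    threaded_times = [1.0, 1.0, 0.5]
    print(f"geometric mean speedup: {geometric_mean_speedup(serial_times, threaded_times):.2f}x")
```

The geometric mean is the usual choice for averaging ratios such as speedups, because a single matrix with a very large speedup pulls it up less than it would an arithmetic mean.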
- Research Organization:
- Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States)
- Sponsoring Organization:
- USDOE National Nuclear Security Administration (NNSA)
- Grant/Contract Number:
- AC04-94AL85000; NA0003525
- OSTI ID:
- 1499033
- Alternate ID(s):
- OSTI ID: 1550153
- Report Number(s):
- SAND--2019-2046J; 672871
- Journal Information:
- Parallel Computing, Vol. 68, Issue C; ISSN 0167-8191
- Publisher:
- Elsevier
- Country of Publication:
- United States
- Language:
- English
Preparing sparse solvers for exascale computing | journal | January 2020
Similar Records
An Efficient Multicore Implementation of a Novel HSS-Structured Multifrontal Solver Using Randomized Sampling
Task Parallel Incomplete Cholesky Factorization using 2D Partitioned-Block Layout
Journal Article · Wed Oct 26 20:00:00 EDT 2016 · SIAM Journal on Scientific Computing · OSTI ID:1378736
Task Parallel Incomplete Cholesky Factorization using 2D Partitioned-Block Layout
Technical Report · Thu Dec 31 23:00:00 EST 2015 · OSTI ID:1237520