A High Performance Block Eigensolver for Nuclear Configuration Interaction Calculations
- Michigan State Univ., East Lansing, MI (United States)
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division
- Iowa State Univ., Ames, IA (United States). Dept. of Physics and Astronomy
As on-node parallelism increases and the performance gap between the processor and the memory system widens, achieving high performance in large-scale scientific applications requires an architecture-aware design of algorithms and solvers. We focus on the eigenvalue problem arising in nuclear Configuration Interaction (CI) calculations, where a few extreme eigenpairs of a sparse symmetric matrix are needed. Here, we consider a block iterative eigensolver whose main computational kernels are the multiplication of a sparse matrix with multiple vectors (SpMM), and tall-skinny matrix operations. We then present techniques to significantly improve the SpMM and the transpose operation SpMM^T by using the compressed sparse blocks (CSB) format. We achieve 3-4× speedup on the requisite operations over good implementations with the commonly used compressed sparse row (CSR) format. We develop a performance model that allows us to correctly estimate the performance of our SpMM kernel implementations, and we identify cache bandwidth as a potential performance bottleneck beyond DRAM. We also analyze and optimize the performance of LOBPCG kernels (inner product and linear combinations on multiple vectors) and show up to 15× speedup over using high performance BLAS libraries for these operations. The resulting high performance LOBPCG solver achieves 1.4× to 1.8× speedup over the existing Lanczos solver on a series of CI computations on high-end multicore architectures (Intel Xeons). We also analyze the performance of our techniques on an Intel Xeon Phi Knights Corner (KNC) processor.
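For readers unfamiliar with the two sparse kernels named in the abstract, the following is a minimal, unoptimized sketch of SpMM and SpMM^T over the baseline CSR format. It is not the authors' CSB implementation; the function names `csr_spmm` and `csr_spmm_t` and the small example matrix are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: plain CSR kernels, not the paper's CSB-based code.
# Function names and the example matrix are hypothetical.

def csr_spmm(n_rows, indptr, indices, data, X):
    """Compute Y = A @ X for a CSR matrix A and a dense block X (list of rows)."""
    m = len(X[0])
    Y = [[0.0] * m for _ in range(n_rows)]
    for i in range(n_rows):
        yi = Y[i]
        # Nonzeros of row i live in positions indptr[i] .. indptr[i+1]-1.
        for k in range(indptr[i], indptr[i + 1]):
            a, xj = data[k], X[indices[k]]
            for c in range(m):
                yi[c] += a * xj[c]
    return Y


def csr_spmm_t(n_rows, n_cols, indptr, indices, data, X):
    """Compute Y = A^T @ X without forming A^T explicitly."""
    m = len(X[0])
    Y = [[0.0] * m for _ in range(n_cols)]
    for i in range(n_rows):
        xi = X[i]
        # Each nonzero A[i, j] scatters a contribution into output row j.
        for k in range(indptr[i], indptr[i + 1]):
            a, yj = data[k], Y[indices[k]]
            for c in range(m):
                yj[c] += a * xi[c]
    return Y


# A = [[2, 0, 1],
#      [0, 3, 0]]  stored in CSR form.
indptr, indices, data = [0, 2, 3], [0, 2, 1], [2.0, 1.0, 3.0]

Y = csr_spmm(2, indptr, indices, data, [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
Yt = csr_spmm_t(2, 3, indptr, indices, data, [[1.0, 0.0], [0.0, 1.0]])
```

In the transpose kernel the writes are scattered across output rows, which is one reason a row-oriented format such as CSR tends to handle SpMM^T less gracefully than SpMM.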
- Research Organization:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Nuclear Physics (NP); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
- Grant/Contract Number:
- AC02-05CH11231; SC0008485; FG02-87ER40371
- OSTI ID:
- 1379875
- Journal Information:
- IEEE Transactions on Parallel and Distributed Systems, Vol. 28, Issue 6; ISSN 1045-9219
- Publisher:
- IEEE
- Country of Publication:
- United States
- Language:
- English
Similar Records
On the performance and energy efficiency of sparse linear algebra on GPUs
Tensor Contraction and Operation Minimization for Extreme Scale Computational Chemistry