Optimizing Sparse Matrix-Multiple Vectors Multiplication for Nuclear Configuration Interaction Calculations
Abstract
Obtaining highly accurate predictions of the properties of light atomic nuclei using the configuration interaction (CI) approach requires computing a few extremal eigenpairs of the many-body nuclear Hamiltonian matrix. In the Many-body Fermion Dynamics for nuclei (MFDn) code, a block eigensolver is used for this purpose. Due to the large size of the sparse matrices involved, a significant fraction of the time spent on the eigenvalue computations is associated with the multiplication of a sparse matrix (and the transpose of that matrix) with multiple vectors (SpMM and SpMMT). Existing implementations of SpMM and SpMMT significantly underperform expectations. Thus, in this paper, we present and analyze optimized implementations of SpMM and SpMMT. We base our implementation on the compressed sparse blocks (CSB) matrix format and target systems with multicore architectures. We develop a performance model that allows us to understand and estimate the performance characteristics of our SpMM kernel implementations, and demonstrate the efficiency of our implementation on a series of real-world matrices extracted from MFDn. In particular, we obtain a 3-4x speedup on the requisite operations over good implementations based on the commonly used compressed sparse row (CSR) matrix format. The improvements in the SpMM kernel suggest we may attain roughly a 40% speedup in the overall execution time of the block eigensolver used in MFDn.
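For readers unfamiliar with the two kernels named in the abstract, the following is a minimal sketch of SpMM and SpMMT over a CSR matrix, which is the baseline format the abstract compares against. It is not the authors' CSB implementation. The function names `csr_spmm` and `csr_spmm_t`, the argument layout, and the small example matrix are all assumptions made for illustration.

```python
# Illustrative baseline only: plain CSR kernels, not the paper's CSB code.
# A is stored as CSR arrays (indptr, indices, data); X holds k vectors as
# a list of rows, each row a list of k values.

def csr_spmm(indptr, indices, data, X):
    """Return Y = A @ X, walking A row by row."""
    n_rows, k = len(indptr) - 1, len(X[0])
    Y = [[0.0] * k for _ in range(n_rows)]
    for i in range(n_rows):
        for p in range(indptr[i], indptr[i + 1]):
            a, j = data[p], indices[p]
            for c in range(k):
                Y[i][c] += a * X[j][c]  # gather from row j of X
    return Y


def csr_spmm_t(n_cols, indptr, indices, data, X):
    """Return Y = A^T @ X from the same CSR arrays, without forming A^T."""
    n_rows, k = len(indptr) - 1, len(X[0])
    Y = [[0.0] * k for _ in range(n_cols)]
    for i in range(n_rows):
        for p in range(indptr[i], indptr[i + 1]):
            a, j = data[p], indices[p]
            for c in range(k):
                Y[j][c] += a * X[i][c]  # scatter into row j of Y
    return Y


# Hypothetical 2x3 example: A = [[1, 0, 2], [0, 3, 0]]
indptr, indices, data = [0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0]
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 rows, k = 2 vectors
Z = [[1.0, 2.0], [3.0, 4.0]]              # 2 rows, k = 2 vectors
Y = csr_spmm(indptr, indices, data, X)
W = csr_spmm_t(3, indptr, indices, data, Z)
```

In this sketch the transpose product writes to scattered rows of the output, so a row-parallel version would need to guard against conflicting updates. That is one common reason a CSR-based SpMMT is harder to run efficiently on multicore systems than SpMM.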
 Authors:
 Aktulga, Hasan Metin; Buluc, Aydin; Williams, Samuel; Yang, Chao
 Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
 Publication Date:
 Research Org.:
 Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
 Sponsoring Org.:
 USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
 OSTI Identifier:
 1407214
 DOE Contract Number:
 AC02-05CH11231
 Resource Type:
 Conference
 Resource Relation:
 Conference: International Parallel and Distributed Processing Symposium, IPDPS (2014 IEEE), Phoenix, AZ (United States), 19-23 May 2014
 Country of Publication:
 United States
 Language:
 English
 Subject:
 97 MATHEMATICS AND COMPUTING; Sparse Matrix Multiplication; Block Eigensolver; Nuclear Configuration Interaction; Extended Roofline Model
Citation Formats
Aktulga, Hasan Metin, Buluc, Aydin, Williams, Samuel, and Yang, Chao. Optimizing Sparse Matrix-Multiple Vectors Multiplication for Nuclear Configuration Interaction Calculations. United States: N. p., 2014. Web. doi:10.1109/IPDPS.2014.125.
Aktulga, Hasan Metin, Buluc, Aydin, Williams, Samuel, & Yang, Chao. Optimizing Sparse Matrix-Multiple Vectors Multiplication for Nuclear Configuration Interaction Calculations. United States. doi:10.1109/IPDPS.2014.125.
Aktulga, Hasan Metin, Buluc, Aydin, Williams, Samuel, and Yang, Chao. 2014. "Optimizing Sparse Matrix-Multiple Vectors Multiplication for Nuclear Configuration Interaction Calculations". United States. doi:10.1109/IPDPS.2014.125. https://www.osti.gov/servlets/purl/1407214.
@article{osti_1407214,
title = {Optimizing Sparse Matrix-Multiple Vectors Multiplication for Nuclear Configuration Interaction Calculations},
author = {Aktulga, Hasan Metin and Buluc, Aydin and Williams, Samuel and Yang, Chao},
abstractNote = {Obtaining highly accurate predictions of the properties of light atomic nuclei using the configuration interaction (CI) approach requires computing a few extremal eigenpairs of the many-body nuclear Hamiltonian matrix. In the Many-body Fermion Dynamics for nuclei (MFDn) code, a block eigensolver is used for this purpose. Due to the large size of the sparse matrices involved, a significant fraction of the time spent on the eigenvalue computations is associated with the multiplication of a sparse matrix (and the transpose of that matrix) with multiple vectors (SpMM and SpMMT). Existing implementations of SpMM and SpMMT significantly underperform expectations. Thus, in this paper, we present and analyze optimized implementations of SpMM and SpMMT. We base our implementation on the compressed sparse blocks (CSB) matrix format and target systems with multicore architectures. We develop a performance model that allows us to understand and estimate the performance characteristics of our SpMM kernel implementations, and demonstrate the efficiency of our implementation on a series of real-world matrices extracted from MFDn. In particular, we obtain a 3-4x speedup on the requisite operations over good implementations based on the commonly used compressed sparse row (CSR) matrix format. The improvements in the SpMM kernel suggest we may attain roughly a 40% speedup in the overall execution time of the block eigensolver used in MFDn.},
doi = {10.1109/IPDPS.2014.125},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2014},
month = {8}
}