Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver
In this paper, we describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high-performance parallel computers. We show the benefit of replacing the Lanczos algorithm, which has traditionally been used for this type of computation, with a preconditioned block iterative method. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses for the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of the special structure of the nuclear configuration interaction problem, which we discuss in detail. The use of a block method also allows us to improve the concurrency of the computation and to take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. Finally, we discuss the implementation details that are critical to achieving high performance on massively parallel multicore supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos-based algorithm for problems of moderate size on a Cray XC30 system.
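The abstract's point about arithmetic intensity can be illustrated with a small sketch. This is not the paper's implementation. It is a minimal pure-Python example, assuming a CSR-stored sparse matrix, that compares applying the matrix to a block of k vectors in one pass with applying it to each vector separately. The helper names (`tridiag_csr`, `spmv`, `spmm`) and the tridiagonal test matrix are hypothetical, and the count of matrix-entry reads is only a rough proxy for data movement.

```python
from typing import List, Tuple

# CSR storage: (row pointers, column indices, nonzero values).
CSR = Tuple[List[int], List[int], List[float]]


def tridiag_csr(n: int) -> CSR:
    """Build a small symmetric test matrix (2 on the diagonal, -1 beside it)."""
    indptr, indices, data = [0], [], []
    for i in range(n):
        for j, v in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:
                indices.append(j)
                data.append(v)
        indptr.append(len(indices))
    return indptr, indices, data


def spmv(csr: CSR, x: List[float]) -> Tuple[List[float], int]:
    """Sparse matrix times one vector; also count matrix entries read."""
    indptr, indices, data = csr
    y = [0.0] * (len(indptr) - 1)
    reads = 0
    for i in range(len(y)):
        for p in range(indptr[i], indptr[i + 1]):
            y[i] += data[p] * x[indices[p]]
            reads += 1
    return y, reads


def spmm(csr: CSR, X: List[List[float]]) -> Tuple[List[List[float]], int]:
    """Sparse matrix times an n-by-k block; each entry is read once, used k times."""
    indptr, indices, data = csr
    k = len(X[0])
    Y = [[0.0] * k for _ in range(len(indptr) - 1)]
    reads = 0
    for i in range(len(Y)):
        for p in range(indptr[i], indptr[i + 1]):
            a = data[p]  # single read of the matrix entry
            reads += 1
            x_row, y_row = X[indices[p]], Y[i]
            for j in range(k):  # reuse the entry for all k vectors
                y_row[j] += a * x_row[j]
    return Y, reads


A = tridiag_csr(5)
X = [[float(i + j) for j in range(3)] for i in range(5)]  # 5-by-3 block
Y_block, block_reads = spmm(A, X)
columns = [spmv(A, [row[j] for row in X]) for j in range(3)]
vector_reads = sum(reads for _, reads in columns)
print(block_reads, vector_reads)
```

Both variants perform the same floating-point work, but the block version reads each stored matrix entry once rather than once per vector, which is the sense in which a block method can raise arithmetic intensity.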
 Authors:
 Shao, Meiyue [1]; Aktulga, H. Metin [2]; Yang, Chao [1]; Ng, Esmond G. [1]; Maris, Pieter [3]; Vary, James P. [3]
 [1] Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division
 [2] Michigan State Univ., East Lansing, MI (United States). Dept. of Computer Science and Engineering
 [3] Iowa State Univ., Ames, IA (United States). Dept. of Physics and Astronomy
 Publication Date:
 Grant/Contract Number:
 AC02-05CH11231; SC0008485; FG02-87ER40371; GE100082
 Type:
 Accepted Manuscript
 Journal Name:
 Computer Physics Communications
 Additional Journal Information:
 Journal Volume: 222; Journal ID: ISSN 0010-4655
 Publisher:
 Elsevier
 Research Org:
 Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Iowa State Univ., Ames, IA (United States); Michigan State Univ., East Lansing, MI (United States)
 Sponsoring Org:
 USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC21); USDOE Office of Science (SC), Nuclear Physics (NP) (SC26); Michigan State Univ. (United States)
 Country of Publication:
 United States
 Language:
 English
 Subject:
 97 MATHEMATICS AND COMPUTING; 73 NUCLEAR PHYSICS AND RADIATION PHYSICS; nuclear configuration interaction; symmetric eigenvalue problem; LOBPCG; preconditioning
 OSTI Identifier:
 1439235
Shao, Meiyue, Aktulga, H. Metin, Yang, Chao, Ng, Esmond G., Maris, Pieter, and Vary, James P. 2017. "Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver". United States. doi:10.1016/j.cpc.2017.09.004. https://www.osti.gov/servlets/purl/1439235.
@article{osti_1439235,
title = {Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver},
author = {Shao, Meiyue and Aktulga, H. Metin and Yang, Chao and Ng, Esmond G. and Maris, Pieter and Vary, James P.},
abstractNote = {In this paper, we describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high-performance parallel computers. We show the benefit of replacing the Lanczos algorithm, which has traditionally been used for this type of computation, with a preconditioned block iterative method. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses for the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of the special structure of the nuclear configuration interaction problem, which we discuss in detail. The use of a block method also allows us to improve the concurrency of the computation and to take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. Finally, we discuss the implementation details that are critical to achieving high performance on massively parallel multicore supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos-based algorithm for problems of moderate size on a Cray XC30 system.},
doi = {10.1016/j.cpc.2017.09.004},
journal = {Computer Physics Communications},
volume = 222,
place = {United States},
year = {2017},
month = {9}
}