Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo
Within ab initio quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly evaluated iteratively using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is therefore formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple-rank delayed update scheme. This strategy enables probability evaluation with application of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. This procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo where the acceptance ratio is high, order-of-magnitude improvements in the update time can be obtained on both multi-core CPUs and GPUs.
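The delayed-update idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes each move replaces one row of the Slater matrix, applies the standard Woodbury identity to queue up to K accepted rows, and, as a simplification, flushes early if the same row is proposed twice within one window. The class name `DelayedInverse` and its methods are hypothetical names chosen for this illustration.

```python
import numpy as np


class DelayedInverse:
    """Inverse of a Slater matrix with accepted row replacements applied K at a time.

    Hypothetical helper for illustration; based on the Woodbury identity.
    """

    def __init__(self, A, delay):
        self.Ainv = np.linalg.inv(A)  # inverse with all flushed moves applied
        self.delay = delay            # K: accepted moves queued before a flush
        self.rows = []                # row (electron) indices of queued moves
        self.vAinv = []               # v @ Ainv for each queued move

    def ratio(self, k, v):
        """Determinant ratio for replacing row k by v, queued moves included."""
        if k in self.rows:            # simplification: one queued move per row
            self.flush()
        vA = v @ self.Ainv
        r = vA[k]                     # plain rank-1 ratio against the stored inverse
        if self.rows:
            B = np.array(self.vAinv)  # m x N block of queued rows times Ainv
            S = B[:, self.rows]       # small m x m matrix
            r -= vA[self.rows] @ np.linalg.solve(S, B[:, k])
        return r, vA

    def accept(self, k, vA):
        """Queue an accepted move; pass the vA returned by ratio(k, v)."""
        self.rows.append(k)
        self.vAinv.append(vA)
        if len(self.rows) == self.delay:
            self.flush()

    def flush(self):
        """Apply all queued moves to the inverse with matrix-matrix products."""
        if not self.rows:
            return
        B = np.array(self.vAinv)
        S = B[:, self.rows]
        C = B.copy()
        C[np.arange(len(self.rows)), self.rows] -= 1.0  # subtract unit rows e_k^T
        self.Ainv -= self.Ainv[:, self.rows] @ np.linalg.solve(S, C)
        self.rows, self.vAinv = [], []
```

A typical loop would call `ratio(k, v)` for each proposed move, call `accept(k, vA)` only when the move is accepted, and call `flush()` before reading `Ainv` directly. Rejected moves cost only the ratio evaluation, which is why the sampling itself is unchanged; the benefit comes from `flush()` replacing K separate matrix-vector updates with one matrix-matrix update.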
 Authors:
McDaniel, Tyler ^{[1]}; D’Azevedo, Ed F. ^{[2]}; Li, Ying Wai ^{[2]}; Wong, Kwai ^{[1]}; Kent, Paul R. C. ^{[2]}
 ^{[1]} Univ. of Tennessee, Knoxville, TN (United States)
 ^{[2]} Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
 Publication Date:
 Grant/Contract Number:
 AC05-00OR22725
 Type:
 Accepted Manuscript
 Journal Name:
 Journal of Chemical Physics
 Additional Journal Information:
 Journal Volume: 147; Journal Issue: 17; Journal ID: ISSN 0021-9606
 Publisher:
 American Institute of Physics (AIP)
 Research Org:
 Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
 Sponsoring Org:
 USDOE Office of Science (SC), Basic Energy Sciences (BES) (SC-22)
 Country of Publication:
 United States
 Language:
 English
 Subject:
 97 MATHEMATICS AND COMPUTING
 OSTI Identifier:
 1407773
 Alternate Identifier(s):
 OSTI ID: 1407834
McDaniel, Tyler, D’Azevedo, Ed F., Li, Ying Wai, Wong, Kwai, and Kent, Paul R. C. Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo. United States: N. p.,
Web. doi:10.1063/1.4998616.
McDaniel, Tyler, D’Azevedo, Ed F., Li, Ying Wai, Wong, Kwai, & Kent, Paul R. C. Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo. United States. doi:10.1063/1.4998616.
McDaniel, Tyler, D’Azevedo, Ed F., Li, Ying Wai, Wong, Kwai, and Kent, Paul R. C. 2017.
"Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo". United States.
doi:10.1063/1.4998616. https://www.osti.gov/servlets/purl/1407773.
@article{osti_1407773,
title = {Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo},
author = {McDaniel, Tyler and D’Azevedo, Ed F. and Li, Ying Wai and Wong, Kwai and Kent, Paul R. C.},
abstractNote = {Within ab initio quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly evaluated iteratively using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is therefore formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple-rank delayed update scheme. This strategy enables probability evaluation with application of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. This procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo where the acceptance ratio is high, order-of-magnitude improvements in the update time can be obtained on both multi-core CPUs and GPUs.},
doi = {10.1063/1.4998616},
journal = {Journal of Chemical Physics},
number = 17,
volume = 147,
place = {United States},
year = {2017},
month = {11}
}