The computation of two-electron repulsion integrals (ERIs) is often the most expensive step of integral-direct self-consistent field methods. Formally it scales as O(N^4), where N is the number of Gaussian basis functions used to represent the molecular wave function. In practice, this scaling can be reduced to O(N^2) or less by neglecting small integrals with screening methods. The contributions of the ERIs to the Fock matrix are of Coulomb (J) and exchange (K) type and require separate algorithms to compute matrix elements efficiently. We previously implemented highly efficient GPU-accelerated J-matrix and K-matrix algorithms in the electronic structure code TeraChem. Although these implementations supported the use of multiple GPUs on a node, they did not support the use of multiple nodes. This presents a key bottleneck to cutting-edge ab initio simulations of large systems, e.g., excited state dynamics of photoactive proteins. We present our implementation of multinode multi-GPU J- and K-matrix algorithms in TeraChem using the Regent programming language. Regent directly supports distributed computation in a task-based model and can generate code for a variety of architectures, including NVIDIA GPUs. We demonstrate multinode scaling up to 45 GPUs (3 nodes) and benchmark against hand-coded TeraChem integral code. Finally, we also outline our metaprogrammed Regent implementation, which enables flexible code generation for integrals of different angular momenta.
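As a rough illustration of the screening idea in the abstract, the following Python sketch counts how many integral quartets (ij|kl) survive a Cauchy-Schwarz-style bound Q_ij * Q_kl >= threshold in a toy model. Everything in it is an assumption made for illustration and is not taken from the paper or from TeraChem: the linear chain of identical s-type Gaussians, the spacing, exponent, and threshold values, the function names, and the choice of the Schwarz bound itself, since the abstract does not say which screening method is used. The factor Q_ij is taken to be the Gaussian pair prefactor exp(-a R^2 / 2), with the overall constant set to 1.

```python
import math
from bisect import bisect_left


def schwarz_factors(n, spacing=1.5, exponent=1.0):
    """Toy Schwarz factors Q_ij for n identical s-type Gaussians on a line.

    Hypothetical model: Q_ij = exp(-exponent * R_ij**2 / 2), where R_ij is
    the distance between centers i and j. Constant prefactors are dropped.
    """
    factors = []
    for i in range(n):
        for j in range(n):
            r = spacing * (i - j)
            factors.append(math.exp(-0.5 * exponent * r * r))
    return factors


def count_surviving_quartets(n, threshold=1e-10, spacing=1.5, exponent=1.0):
    """Count quartets (ij|kl) whose bound Q_ij * Q_kl is >= threshold."""
    if threshold <= 0.0:
        raise ValueError("threshold must be positive")
    # Every Q is at most 1, so a pair with Q < threshold can never be part
    # of a surviving quartet and is discarded up front.
    significant = sorted(
        q for q in schwarz_factors(n, spacing, exponent) if q >= threshold
    )
    total = 0
    for q_ij in significant:
        # Partners kl need Q_kl >= threshold / Q_ij; find them by bisection
        # instead of looping over all N^4 index combinations.
        first = bisect_left(significant, threshold / q_ij)
        total += len(significant) - first
    return total


if __name__ == "__main__":
    for n in (20, 40, 80):
        kept = count_surviving_quartets(n)
        print(n, kept, kept / n**4)
```

In this toy chain each function overlaps appreciably with only a few neighbors, so doubling n roughly quadruples the number of surviving quartets, whereas the unscreened count n^4 grows sixteenfold. Real molecular systems and basis sets will behave differently in detail.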
Johnson, K. Grace, et al. "Multinode Multi-GPU Two-Electron Integrals: Code Generation Using the Regent Language." Journal of Chemical Theory and Computation, vol. 18, no. 11, Oct. 2022. https://doi.org/10.1021/acs.jctc.2c00414
Johnson, K. Grace, Mirchandaney, Seema, Hoag, Ellis, Heirich, Alan, Aiken, Alex, & Martínez, Todd J. (2022). Multinode Multi-GPU Two-Electron Integrals: Code Generation Using the Regent Language. Journal of Chemical Theory and Computation, 18(11). https://doi.org/10.1021/acs.jctc.2c00414
Johnson, K. Grace, Mirchandaney, Seema, Hoag, Ellis, et al., "Multinode Multi-GPU Two-Electron Integrals: Code Generation Using the Regent Language," Journal of Chemical Theory and Computation 18, no. 11 (2022), https://doi.org/10.1021/acs.jctc.2c00414
@article{osti_1998577,
author = {Johnson, K. Grace and Mirchandaney, Seema and Hoag, Ellis and Heirich, Alan and Aiken, Alex and Martínez, Todd J.},
title = {Multinode Multi-GPU Two-Electron Integrals: Code Generation Using the Regent Language},
annote = {The computation of two-electron repulsion integrals (ERIs) is often the most expensive step of integral-direct self-consistent field methods. Formally it scales as $O(N^4)$, where N is the number of Gaussian basis functions used to represent the molecular wave function. In practice, this scaling can be reduced to $O(N^2)$ or less by neglecting small integrals with screening methods. The contributions of the ERIs to the Fock matrix are of Coulomb (J) and exchange (K) type and require separate algorithms to compute matrix elements efficiently. We previously implemented highly efficient GPU-accelerated J-matrix and K-matrix algorithms in the electronic structure code TeraChem. Although these implementations supported the use of multiple GPUs on a node, they did not support the use of multiple nodes. This presents a key bottleneck to cutting-edge ab initio simulations of large systems, e.g., excited state dynamics of photoactive proteins. We present our implementation of multinode multi-GPU J- and K-matrix algorithms in TeraChem using the Regent programming language. Regent directly supports distributed computation in a task-based model and can generate code for a variety of architectures, including NVIDIA GPUs. We demonstrate multinode scaling up to 45 GPUs (3 nodes) and benchmark against hand-coded TeraChem integral code. Finally, we also outline our metaprogrammed Regent implementation, which enables flexible code generation for integrals of different angular momenta.},
doi = {10.1021/acs.jctc.2c00414},
url = {https://www.osti.gov/biblio/1998577},
journal = {Journal of Chemical Theory and Computation},
issn = {1549-9618},
number = {11},
volume = {18},
place = {United States},
publisher = {American Chemical Society},
year = {2022},
month = {10}}