On the Efficient Evaluation of the Exchange Correlation Potential on Graphics Processing Unit Clusters
Abstract
The predominance of Kohn–Sham density functional theory (KS-DFT) for the theoretical treatment of large experimentally relevant systems in molecular chemistry and materials science relies primarily on the existence of efficient software implementations which are capable of leveraging the latest advances in modern high-performance computing (HPC). With recent trends in HPC leading toward increasing reliance on heterogeneous accelerator-based architectures such as graphics processing units (GPUs), existing code bases must embrace these architectural advances to maintain the high levels of performance that have come to be expected for these methods. In this work, we propose a three-level parallelism scheme for the distributed numerical integration of the exchange-correlation (XC) potential in the Gaussian basis set discretization of the Kohn–Sham equations on large computing clusters consisting of multiple GPUs per compute node. In addition, we propose and demonstrate the efficacy of the use of batched kernels, including batched level-3 BLAS operations, in achieving high levels of performance on the GPU. We demonstrate the performance and scalability of the implementation of the proposed method in the NWChemEx software package by comparing to the existing scalable CPU XC integration in NWChem.
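As a rough illustration of why batched level-3 BLAS fits this problem, the sketch below assembles an LDA-like XC potential matrix, V[mu, nu] = sum_i w_i v_xc(r_i) phi_mu(r_i) phi_nu(r_i), from batches of quadrature points using one matrix-matrix product per batch. It is not the NWChemEx implementation: the function names, the uniform batch shape, and the random stand-in data are all assumptions made for illustration, and NumPy's `matmul` on the CPU stands in for a batched GEMM on the GPU.

```python
import numpy as np

def xc_matrix_batched(phi, weights, vxc):
    """Assemble V from batches of grid points (hypothetical helper).

    phi:     (n_batch, n_pts, n_bf) basis function values on each batch
    weights: (n_batch, n_pts) quadrature weights
    vxc:     (n_batch, n_pts) XC potential values at the grid points
    """
    # Z_b = diag(w_b * v_b) @ Phi_b, done for all batches at once
    z = (weights * vxc)[:, :, None] * phi
    # One (n_bf x n_pts) @ (n_pts x n_bf) product per batch;
    # np.matmul broadcasts over the leading batch axis
    v_batches = np.matmul(phi.transpose(0, 2, 1), z)
    # Sum the per-batch contributions into the final matrix
    return v_batches.sum(axis=0)

def xc_matrix_reference(phi, weights, vxc):
    """Unbatched point-by-point sum, used only to check the result."""
    n_bf = phi.shape[-1]
    p = phi.reshape(-1, n_bf)
    f = (weights * vxc).reshape(-1)
    return np.einsum("i,im,in->mn", f, p, p)

# Random stand-in data (assumption): 4 batches, 32 points each, 6 basis functions
rng = np.random.default_rng(0)
phi = rng.standard_normal((4, 32, 6))
weights = rng.random((4, 32))
vxc = -rng.random((4, 32))

V = xc_matrix_batched(phi, weights, vxc)
V_ref = xc_matrix_reference(phi, weights, vxc)
```

In a distributed setting the batch axis is the natural unit of work to spread across processes and devices, with a final reduction of the partial matrices. Real quadrature batches generally differ in size, so a production code would need variable-size batched kernels rather than the uniform array assumed here.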
- Authors:
- Williams-Young, David B; de Jong, Wibe A.; van Dam, Hubertus J. J.; Yang, Chao
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Brookhaven National Lab. (BNL), Upton, NY (United States)
- Publication Date:
- 2020-12-10
- Research Org.:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States); Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF); Brookhaven National Laboratory (BNL), Upton, NY (United States)
- Sponsoring Org.:
- USDOE National Nuclear Security Administration (NNSA); USDOE Office of Science (SC), Advanced Scientific Computing Research; USDOE Office of Science (SC), Basic Energy Sciences (BES). Scientific User Facilities Division
- OSTI Identifier:
- 1650078
- Alternate Identifier(s):
- OSTI ID: 1764587
- Report Number(s):
- BNL-220973-2021-JAAM; Journal ID: ISSN 2296-2646; ark:/13030/qt0ms5611x
- Grant/Contract Number:
- AC02-05CH11231; AC05-00OR22725; SC0012704
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Frontiers in Chemistry
- Additional Journal Information:
- Journal Volume: 8; Journal ID: ISSN 2296-2646
- Publisher:
- Frontiers Research Foundation
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING; density functional theory; graphics processing unit; high-performance computing; parallel computing; quantum chemistry
Citation Formats
Williams-Young, David B, de Jong, Wibe A., van Dam, Hubertus J. J., and Yang, Chao. On the Efficient Evaluation of the Exchange Correlation Potential on Graphics Processing Unit Clusters. United States: N. p., 2020. Web. doi:10.3389/fchem.2020.581058.
Williams-Young, David B, de Jong, Wibe A., van Dam, Hubertus J. J., & Yang, Chao. On the Efficient Evaluation of the Exchange Correlation Potential on Graphics Processing Unit Clusters. United States. https://doi.org/10.3389/fchem.2020.581058
Williams-Young, David B, de Jong, Wibe A., van Dam, Hubertus J. J., and Yang, Chao. 2020. "On the Efficient Evaluation of the Exchange Correlation Potential on Graphics Processing Unit Clusters". United States. https://doi.org/10.3389/fchem.2020.581058. https://www.osti.gov/servlets/purl/1650078.
@article{osti_1650078,
title = {On the Efficient Evaluation of the Exchange Correlation Potential on Graphics Processing Unit Clusters},
author = {Williams-Young, David B and de Jong, Wibe A. and van Dam, Hubertus J. J. and Yang, Chao},
abstractNote = {The predominance of Kohn–Sham density functional theory (KS-DFT) for the theoretical treatment of large experimentally relevant systems in molecular chemistry and materials science relies primarily on the existence of efficient software implementations which are capable of leveraging the latest advances in modern high-performance computing (HPC). With recent trends in HPC leading toward increasing reliance on heterogeneous accelerator-based architectures such as graphics processing units (GPUs), existing code bases must embrace these architectural advances to maintain the high levels of performance that have come to be expected for these methods. In this work, we propose a three-level parallelism scheme for the distributed numerical integration of the exchange-correlation (XC) potential in the Gaussian basis set discretization of the Kohn–Sham equations on large computing clusters consisting of multiple GPUs per compute node. In addition, we propose and demonstrate the efficacy of the use of batched kernels, including batched level-3 BLAS operations, in achieving high levels of performance on the GPU. We demonstrate the performance and scalability of the implementation of the proposed method in the NWChemEx software package by comparing to the existing scalable CPU XC integration in NWChem.},
doi = {10.3389/fchem.2020.581058},
journal = {Frontiers in Chemistry},
volume = 8,
place = {United States},
year = {2020},
month = {12}
}