A Scalable O(N) Algorithm for Large-Scale Parallel First-Principles Molecular Dynamics Simulations
Abstract
Traditional algorithms for first-principles molecular dynamics (FPMD) simulations only gain a modest capability increase from current petascale computers, due to their O(N^3) complexity and their heavy use of global communications. To address this issue, we are developing a truly scalable O(N) complexity FPMD algorithm, based on density functional theory (DFT), which avoids global communications. The computational model uses a general nonorthogonal orbital formulation for the DFT energy functional, which requires knowledge of selected elements of the inverse of the associated overlap matrix. We present a scalable algorithm for approximately computing selected entries of the inverse of the overlap matrix, based on an approximate inverse technique, by inverting local blocks corresponding to principal submatrices of the global overlap matrix. The new FPMD algorithm exploits sparsity and uses nearest neighbor communication to provide a computational scheme capable of extreme scalability. Accuracy is controlled by the mesh spacing of the finite difference discretization, the size of the localization regions in which the electronic orbitals are confined, and a cutoff beyond which the entries of the overlap matrix can be omitted when computing selected entries of its inverse. We demonstrate the algorithm's excellent parallel scaling for up to O(100K) atoms on O(100K) processors, with a wall-clock time of O(1) minute per molecular dynamics time step.
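The idea of approximating selected entries of the inverse by inverting a local principal submatrix can be illustrated with a toy example. The sketch below is not the authors' implementation: the tridiagonal stand-in for the overlap matrix, the function names, and the `radius` cutoff are all hypothetical choices made for illustration. It approximates one column of the inverse by solving only the block of rows and columns near the index of interest, then compares the result with a full solve.

```python
def tridiagonal_overlap(n, off=0.2):
    """Hypothetical stand-in for an overlap matrix:
    unit diagonal with weak nearest-neighbor coupling."""
    return [[1.0 if i == j else (off if abs(i - j) == 1 else 0.0)
             for j in range(n)] for i in range(n)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def local_inverse_column(s, i, radius):
    """Approximate column i of inverse(s) using only the principal
    submatrix of s on indices within `radius` of i."""
    n = len(s)
    idx = list(range(max(0, i - radius), min(n, i + radius + 1)))
    block = [[s[p][q] for q in idx] for p in idx]
    unit = [1.0 if p == i else 0.0 for p in idx]
    local = solve(block, unit)
    return dict(zip(idx, local))  # global index -> approximate entry


n, i = 40, 20
s = tridiagonal_overlap(n)
exact = solve(s, [1.0 if p == i else 0.0 for p in range(n)])
approx = local_inverse_column(s, i, radius=6)
max_err = max(abs(approx[j] - exact[j]) for j in approx)
print("largest error in the selected entries:", max_err)
```

In this toy setting the entries of the inverse decay quickly away from the diagonal, so a small block already reproduces the selected entries closely, and enlarging `radius` reduces the error further. Each local solve touches only nearby rows and columns, which is the property that makes a nearest-neighbor communication pattern plausible.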
- Authors:
- Osei-Kuffuor, Daniel; Fattebert, Jean-Luc
- Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Publication Date:
- 2014-01-01
- Research Org.:
- Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1165754
- DOE Contract Number:
- DE-AC52-07NA27344
- Resource Type:
- Journal Article
- Journal Name:
- SIAM Journal on Scientific Computing
- Additional Journal Information:
- Journal Volume: 36; Journal Issue: 4; Journal ID: ISSN 1064-8275
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 71 CLASSICAL AND QUANTUM MECHANICS, GENERAL PHYSICS; 36 MATERIALS SCIENCE; 97 MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; LINEAR SCALING ALGORITHMS; DENSITY FUNCTIONAL THEORY; GRAM MATRIX INVERSE; LARGE SCALE MOLECULAR DYNAMICS; PARALLEL APPROXIMATE INVERSE
Citation Formats
Osei-Kuffuor, Daniel, and Fattebert, Jean-Luc. A Scalable O(N) Algorithm for Large-Scale Parallel First-Principles Molecular Dynamics Simulations. United States: N. p., 2014. Web. doi:10.1137/140956476.
Osei-Kuffuor, Daniel, & Fattebert, Jean-Luc. A Scalable O(N) Algorithm for Large-Scale Parallel First-Principles Molecular Dynamics Simulations. United States. https://doi.org/10.1137/140956476
Osei-Kuffuor, Daniel, and Fattebert, Jean-Luc. 2014. "A Scalable O(N) Algorithm for Large-Scale Parallel First-Principles Molecular Dynamics Simulations". United States. https://doi.org/10.1137/140956476. https://www.osti.gov/servlets/purl/1165754.
@article{osti_1165754,
title = {A Scalable O(N) Algorithm for Large-Scale Parallel First-Principles Molecular Dynamics Simulations},
author = {Osei-Kuffuor, Daniel and Fattebert, Jean-Luc},
abstractNote = {Traditional algorithms for first-principles molecular dynamics (FPMD) simulations only gain a modest capability increase from current petascale computers, due to their O(N$^3$) complexity and their heavy use of global communications. To address this issue, we are developing a truly scalable O(N) complexity FPMD algorithm, based on density functional theory (DFT), which avoids global communications. The computational model uses a general nonorthogonal orbital formulation for the DFT energy functional, which requires knowledge of selected elements of the inverse of the associated overlap matrix. We present a scalable algorithm for approximately computing selected entries of the inverse of the overlap matrix, based on an approximate inverse technique, by inverting local blocks corresponding to principal submatrices of the global overlap matrix. The new FPMD algorithm exploits sparsity and uses nearest neighbor communication to provide a computational scheme capable of extreme scalability. Accuracy is controlled by the mesh spacing of the finite difference discretization, the size of the localization regions in which the electronic orbitals are confined, and a cutoff beyond which the entries of the overlap matrix can be omitted when computing selected entries of its inverse. We demonstrate the algorithm's excellent parallel scaling for up to O(100K) atoms on O(100K) processors, with a wall-clock time of O(1) minute per molecular dynamics time step.},
doi = {10.1137/140956476},
url = {https://www.osti.gov/biblio/1165754},
journal = {SIAM Journal on Scientific Computing},
issn = {1064-8275},
number = 4,
volume = 36,
place = {United States},
year = {2014},
month = jan
}