Deep learning of dynamically responsive chemical Hamiltonians with semiempirical quantum mechanics
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545; Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545
- Computer, Computational and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, NM 87545
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545; Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545; Center for Integrated Nanotechnologies, Los Alamos National Laboratory, Los Alamos, NM 87545
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545; Center for Integrated Nanotechnologies, Los Alamos National Laboratory, Los Alamos, NM 87545
Conventional machine-learning (ML) models in computational chemistry learn to predict molecular properties directly, using quantum chemistry only as a source of reference data. While these heuristic ML methods reach quantum-level accuracy at speeds several orders of magnitude faster than traditional quantum chemistry methods, they suffer from poor extensibility and transferability; i.e., their accuracy degrades on large or new chemical systems. Incorporating quantum chemistry frameworks directly into the ML models solves this problem. Here we use the structure of semiempirical quantum mechanics (SEQM) methods to construct dynamically responsive Hamiltonians. SEQM methods use empirical parameters fitted to experimental properties to construct reduced-order Hamiltonians, which makes calculations much faster than ab initio methods but at the cost of accuracy. By replacing these static parameters with machine-learned dynamic values inferred from the local environment, we greatly improve the accuracy of the SEQM methods. Trained on molecular energies and atomic forces, these dynamically generated Hamiltonian parameters show a strong correlation with atomic hybridization and bonding. Trained on only about 60,000 small organic molecular conformers, the resulting model retains interpretability, extensibility, and transferability when tested on much larger chemical systems and when predicting various molecular properties. Overall, this work demonstrates the virtues of combining physics-based descriptions with ML to develop models that are simultaneously accurate, transferable, and interpretable.
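The contrast the abstract draws between static and dynamically generated Hamiltonian parameters can be illustrated with a toy example. The sketch below is not the authors' model or code: the two-orbital Hamiltonian, the numerical values, and the names `STATIC`, `hopping`, `energy`, and `toy_parameter_model` are all assumptions made for illustration, and a simple linear rule on a neighbor count stands in for the neural network that would infer parameters from the local environment.

```python
import math

# Hypothetical static parameter set for one element (illustrative values only).
STATIC = {"alpha": -11.0, "beta0": -2.5, "zeta": 1.5}


def lowest_eigenvalue(a1, a2, b):
    """Closed-form lowest eigenvalue of the symmetric matrix [[a1, b], [b, a2]]."""
    mean = 0.5 * (a1 + a2)
    half_gap = 0.5 * (a1 - a2)
    return mean - math.sqrt(half_gap ** 2 + b ** 2)


def hopping(beta0, zeta, r, r_ref=1.4):
    """Distance-dependent off-diagonal element with a fixed functional form."""
    return beta0 * math.exp(-zeta * (r - r_ref))


def energy(params_1, params_2, r):
    """Lowest orbital energy for two sites, each carrying its own parameters."""
    beta0 = 0.5 * (params_1["beta0"] + params_2["beta0"])
    zeta = 0.5 * (params_1["zeta"] + params_2["zeta"])
    b = hopping(beta0, zeta, r)
    return lowest_eigenvalue(params_1["alpha"], params_2["alpha"], b)


def toy_parameter_model(n_neighbors, ref_neighbors=3, shift=0.3):
    """Stand-in for a learned model (assumption): map a local-environment
    descriptor, here only a neighbor count, to per-atom parameters."""
    params = dict(STATIC)
    params["alpha"] = STATIC["alpha"] + shift * (n_neighbors - ref_neighbors)
    return params


r = 1.4  # bond length in angstroms (illustrative)

# Static scheme: both atoms use the same fixed parameters.
e_static = energy(STATIC, STATIC, r)

# Dynamic scheme: each atom's parameters respond to its own environment,
# but the Hamiltonian is built and solved in exactly the same way.
e_dynamic = energy(toy_parameter_model(3), toy_parameter_model(4), r)
```

In this sketch the Hamiltonian's functional form is unchanged between the two schemes; only the parameters fed into it differ, so a solved Hamiltonian is still available for inspection in both cases.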
- Research Organization:
- Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Basic Energy Sciences (BES). Chemical Sciences, Geosciences & Biosciences Division; USDOE Laboratory Directed Research and Development (LDRD) Program; USDOE National Nuclear Security Administration (NNSA)
- Grant/Contract Number:
- LDRD; 89233218CNA000001; FWP: LANLE3F2
- OSTI ID:
- 1874630
- Alternate ID(s):
- OSTI ID: 1875159; OSTI ID: 1903543
- Report Number(s):
- LA-UR-22-31577; e2120333119
- Journal Information:
- Journal Name: Proceedings of the National Academy of Sciences of the United States of America; Journal Volume: 119; Journal Issue: 27; ISSN 0027-8424
- Publisher:
- Proceedings of the National Academy of Sciences
- Country of Publication:
- United States
- Language:
- English
Similar Records
Machine learning of parameters for accurate semiempirical quantum chemical calculations
Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals