
Fourier series of atomic radial distribution functions: A molecular fingerprint for machine learning models of quantum chemical properties

Journal Article · International Journal of Quantum Chemistry
DOI: https://doi.org/10.1002/qua.24912 · OSTI ID: 1392322
 [1];  [2];  [2];  [3]
  1. University of Basel (Switzerland); Argonne National Laboratory (ANL), Argonne, IL (United States). Argonne Leadership Computing Facility (ALCF)
  2. University of Basel (Switzerland)
  3. Argonne National Laboratory (ANL), Argonne, IL (United States). Mathematics and Computer Science Division; University of Texas, Austin, TX (United States)
Here we introduce a fingerprint representation of molecules based on a Fourier series of atomic radial distribution functions. This fingerprint is unique (except for chirality), continuous, and differentiable with respect to atomic coordinates and nuclear charges. It is invariant with respect to translation, rotation, and nuclear permutation, and requires no preconceived knowledge about chemical bonding, topology, or electronic orbitals. As such, it meets many important criteria for a good molecular representation, suggesting its usefulness for machine learning models of molecular properties trained across chemical compound space. To assess the performance of this new descriptor, we have trained machine learning models of molecular enthalpies of atomization for training sets with up to 10 k organic molecules, drawn at random from a published set of 134 k organic molecules with an average atomization enthalpy of over 1770 kcal/mol. We validate the descriptor on all remaining molecules of the 134 k set. For a training set of 10 k molecules, the fingerprint descriptor achieves a mean absolute error of 8.0 kcal/mol. This is slightly worse than the performance attained using the Coulomb matrix, another popular alternative, reaching 6.2 kcal/mol for the same training and test sets.
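The invariance properties claimed in the abstract can be illustrated with a minimal sketch. The abstract does not give the functional form of the fingerprint, so everything below is an assumption made for illustration and not the authors' definition: a radial distribution function built from Gaussians centred on interatomic distances and weighted by the product of nuclear charges, expanded in a cosine series on the interval from 0 to `r_max`. The function names, the parameters `n_terms`, `r_max`, `sigma`, and `n_grid`, and the approximate water geometry are all hypothetical choices.

```python
import math

def pair_distances(coords, charges):
    """Return (distance, Z_i * Z_j) for every unordered pair of atoms."""
    pairs = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            pairs.append((math.dist(coords[i], coords[j]), charges[i] * charges[j]))
    return pairs

def rdf_fourier_fingerprint(coords, charges, n_terms=8, r_max=6.0, sigma=0.2, n_grid=600):
    """Cosine-series coefficients of a Gaussian-smeared radial distribution function.

    Assumed form (illustrative only):
        g(r) = sum_{i<j} Z_i Z_j exp(-(r - d_ij)^2 / (2 sigma^2))
        a_n  = (2 / r_max) * integral_0^r_max g(r) cos(n pi r / r_max) dr
    The integral is approximated with the midpoint rule on n_grid points.
    """
    pairs = pair_distances(coords, charges)
    dr = r_max / n_grid
    coeffs = [0.0] * n_terms
    for k in range(n_grid):
        r = (k + 0.5) * dr
        g = sum(w * math.exp(-((r - d) ** 2) / (2.0 * sigma ** 2)) for d, w in pairs)
        for n in range(n_terms):
            coeffs[n] += g * math.cos(n * math.pi * r / r_max) * dr
    return [2.0 / r_max * c for c in coeffs]

def rotate_z(point, theta):
    """Rotate a 3D point about the z axis by theta radians."""
    x, y, z = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, z)

# Approximate water geometry in angstrom (illustrative values, not from the source).
water = [(0.0, 0.0, 0.0), (0.757, 0.586, 0.0), (-0.757, 0.586, 0.0)]
charges = [8, 1, 1]
fp = rdf_fourier_fingerprint(water, charges)

# Rotate, translate, and reverse the atom order; the fingerprint should not change.
shift = (1.0, -2.0, 3.0)
moved = [tuple(c + t for c, t in zip(rotate_z(p, 0.7), shift)) for p in water]
fp_moved = rdf_fourier_fingerprint(moved[::-1], charges[::-1])
max_diff = max(abs(a - b) for a, b in zip(fp, fp_moved))

# Stretching one O-H bond changes the geometry, so the fingerprint should change.
stretched = [water[0], (0.957, 0.586, 0.0), water[2]]
fp_stretched = rdf_fourier_fingerprint(stretched, charges)
max_change = max(abs(a - b) for a, b in zip(fp, fp_stretched))

print("invariance check, max difference:", max_diff)
print("stretched geometry, max difference:", max_change)
```

Because this sketch depends only on interatomic distances and nuclear charges, it is unchanged by translation, rotation, and atom reordering. It also cannot tell a molecule from its mirror image, which is consistent with the abstract's caveat about chirality. It makes no claim to reproduce the uniqueness or the accuracy figures reported for the published descriptor.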
Research Organization:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
Swiss National Science Foundation; USDOE Laboratory Directed Research and Development (LDRD) Program; USDOE Office of Science (SC)
Grant/Contract Number:
AC02-06CH11357
OSTI ID:
1392322
Journal Information:
International Journal of Quantum Chemistry, Vol. 115, Issue 16; ISSN 0020-7608
Publisher:
Wiley
Country of Publication:
United States
Language:
English


