OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Strategies and Software for Machine Learning Accelerated Discovery in Transition Metal Chemistry

Journal Article · Industrial and Engineering Chemistry Research
 [1]; [1]; [1]; [2]; [1]
  1. Massachusetts Institute of Technology (MIT), Cambridge, MA (United States)
  2. Massachusetts Institute of Technology (MIT), Cambridge, MA (United States); Eidgenoessische Technische Hochschule (ETH), Zurich (Switzerland)

Machine learning the electronic structure of open-shell transition metal complexes presents unique challenges, including robust and automated data set generation. In this report, we introduce tools that simplify data acquisition from density functional theory (DFT) and validation of trained machine learning models using the molSimplify automatic design (mAD) workflow. We demonstrate this workflow by training and comparing the performance of LASSO, kernel ridge regression (KRR), and artificial neural network (ANN) models using the heuristic, topological revised autocorrelation (RAC) descriptors we recently introduced for machine learning in inorganic chemistry. On a series of open-shell transition metal complexes, we evaluate set-aside test errors of these models for predicting the HOMO level and HOMO–LUMO gap. The best-performing models are ANNs, which show test set mean absolute errors of 0.15 eV on the HOMO level and 0.25 eV on the HOMO–LUMO gap. Poorly performing KRR models that use the full 153-feature RAC set improve to nearly the same performance as the ANNs when trained on down-selected subsets of 20–30 features. Analysis of the essential descriptors for HOMO level and HOMO–LUMO gap prediction, as well as comparison to subsets previously obtained for other properties, reveals the paramount importance of nonlocal, steric properties in determining frontier molecular orbital energetics. We demonstrate our model performance on diverse complexes and in the discovery of molecules with target HOMO–LUMO gaps from a large 15,000-molecule design space in minutes, rather than the days that full DFT evaluation would require.
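
To illustrate what a topological autocorrelation descriptor is, the following is a minimal sketch of the standard graph autocorrelation on which descriptors of this family are built: for a chosen atomic property, sum the product of property values over all ordered atom pairs separated by exactly d bonds. It is not the molSimplify implementation, and the RAC set described above may use additional variants that are not reproduced here. The function names, the toy water graph, and the choice of nuclear charge as the property are assumptions made for illustration only.

```python
from collections import deque


def bond_distances(adjacency, start):
    """Breadth-first search: number of bonds from `start` to each reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


def autocorrelation(adjacency, prop, depth):
    """Sum prop[i] * prop[j] over ordered atom pairs exactly `depth` bonds apart."""
    total = 0.0
    for i in adjacency:
        for j, d in bond_distances(adjacency, i).items():
            if d == depth:
                total += prop[i] * prop[j]
    return total


# Hypothetical toy input: water as a graph (atom 0 is O, atoms 1 and 2 are H).
adjacency = {0: [1, 2], 1: [0], 2: [0]}
nuclear_charge = {0: 8.0, 1: 1.0, 2: 1.0}

# One feature per bond depth 0, 1, 2.
features = [autocorrelation(adjacency, nuclear_charge, d) for d in range(3)]
print(features)  # [66.0, 32.0, 2.0]
```

Because the descriptor depends only on connectivity and tabulated atomic properties, it can be evaluated without an optimized three-dimensional geometry, which is consistent with screening a large design space in minutes.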

Research Organization:
Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States)
Sponsoring Organization:
USDOE Office of Science (SC); US Department of the Navy, Office of Naval Research (ONR); Defense Advanced Research Projects Agency (DARPA); National Science Foundation (NSF)
Grant/Contract Number:
SC0018096; N00014-17-1-2956; N00014-18-1-2434; D18AP00039; CBET-1704266; ACI-1548562; ACI-1429830
OSTI ID:
1612842
Journal Information:
Industrial and Engineering Chemistry Research, Vol. 57, Issue 42; ISSN 0888-5885
Publisher:
American Chemical Society (ACS)
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 79 works (citation information provided by Web of Science)
