DOE PAGES
U.S. Department of Energy
Office of Scientific and Technical Information

Title: Real-time prediction of 1H and 13C chemical shifts with DFT accuracy using a 3D graph neural network

Abstract

Nuclear magnetic resonance (NMR) is one of the primary techniques used to elucidate the chemical structure, bonding, stereochemistry, and conformation of organic compounds. The distinct chemical shifts in an NMR spectrum depend upon each atom's local chemical environment and are influenced by both through-bond and through-space interactions with other atoms and functional groups. The in silico prediction of NMR chemical shifts using quantum mechanical (QM) calculations is now commonplace in aiding organic structural assignment since spectra can be computed for several candidate structures and then compared with experimental values to find the best possible match. However, the computational demands of calculating multiple structural- and stereo-isomers, each of which may typically exist as an ensemble of rapidly-interconverting conformations, are expensive. Additionally, the QM predictions themselves may lack sufficient accuracy to identify a correct structure. In this work, we address both of these shortcomings by developing a rapid machine learning (ML) protocol to predict 1H and 13C chemical shifts through an efficient graph neural network (GNN) using 3D structures as input. Transfer learning with experimental data is used to improve the final prediction accuracy of a model trained using QM calculations. When tested on the CHESHIRE dataset, the proposed model predicts observed 13C chemical shifts with comparable accuracy to the best-performing DFT functionals (1.5 ppm) in around 1/6000 of the CPU time. An automated prediction webserver and graphical interface are accessible online at http://nova.chem.colostate.edu/cascade/.
We further demonstrate the model in three applications: first, we use the model to decide the correct organic structure from candidates through experimental spectra, including complex stereoisomers; second, we automatically detect and revise incorrect chemical shift assignments in a popular NMR database, the NMRShiftDB; and third, we use NMR chemical shifts as descriptors for determination of the sites of electrophilic aromatic substitution.
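
The abstract describes computing shifts for several candidate structures and comparing them with experimental values to find the best match. The following is a minimal sketch of that comparison step only, not the authors' implementation. It assumes that each candidate's predicted shifts are already assigned to the same atoms, in the same order, as the experimental list, and it uses mean absolute error as the score. The function names and all numerical values are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def mean_absolute_error(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Mean absolute error (ppm) between two equally long, identically ordered shift lists."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("shift lists must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def rank_candidates(
    predicted_by_candidate: Dict[str, Sequence[float]],
    observed: Sequence[float],
) -> List[Tuple[str, float]]:
    """Return (candidate name, MAE) pairs sorted from best to worst match."""
    scores = [
        (name, mean_absolute_error(shifts, observed))
        for name, shifts in predicted_by_candidate.items()
    ]
    return sorted(scores, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Made-up 13C shifts (ppm) for illustration only; not taken from the paper.
    observed_shifts = [128.4, 77.2, 55.1, 21.3]
    candidates = {
        "isomer_A": [128.9, 76.8, 55.6, 21.0],
        "isomer_B": [131.5, 72.0, 58.9, 24.7],
    }
    for name, mae in rank_candidates(candidates, observed_shifts):
        print(f"{name}: MAE = {mae:.2f} ppm")
```

In practice the predicted shifts would come from a predictor such as the GNN described above (or from QM calculations), and a candidate whose error is clearly lowest would be the preferred assignment. Atom-to-peak assignment and conformer averaging are outside the scope of this sketch.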

Authors:
Guan, Yanfei [1]; Shree Sowndarya, S. V. [1]; Gallegos, Liliana C. [1]; St. John, Peter C. [2]; Paton, Robert S. [1]
  1. Department of Chemistry, Colorado State University, Fort Collins, CO, 80523, USA
  2. Biosciences Center, National Renewable Energy Laboratory, Golden, CO 80401, USA
Publication Date:
2021-09-22
Research Org.:
National Renewable Energy Lab. (NREL), Golden, CO (United States)
Sponsoring Org.:
USDOE Advanced Research Projects Agency - Energy (ARPA-E)
OSTI Identifier:
1812410
Alternate Identifier(s):
OSTI ID: 1822396
Report Number(s):
NREL/JA-2700-80593
Journal ID: ISSN 2041-6520; CSHCBM
Grant/Contract Number:  
AC36-08GO28308
Resource Type:
Published Article
Journal Name:
Chemical Science
Additional Journal Information:
Journal Name: Chemical Science; Journal Volume: 12; Journal Issue: 36; Journal ID: ISSN 2041-6520
Publisher:
Royal Society of Chemistry (RSC)
Country of Publication:
United Kingdom
Language:
English
Subject:
37 INORGANIC, ORGANIC, PHYSICAL, AND ANALYTICAL CHEMISTRY; cheminformatics; machine learning; NMR

Citation Formats

Guan, Yanfei, Shree Sowndarya, S. V., Gallegos, Liliana C., St. John, Peter C., and Paton, Robert S. Real-time prediction of 1H and 13C chemical shifts with DFT accuracy using a 3D graph neural network. United Kingdom: N. p., 2021. Web. doi:10.1039/D1SC03343C.
Guan, Yanfei, Shree Sowndarya, S. V., Gallegos, Liliana C., St. John, Peter C., & Paton, Robert S. Real-time prediction of 1H and 13C chemical shifts with DFT accuracy using a 3D graph neural network. United Kingdom. https://doi.org/10.1039/D1SC03343C
Guan, Yanfei, Shree Sowndarya, S. V., Gallegos, Liliana C., St. John, Peter C., and Paton, Robert S. 2021. "Real-time prediction of 1H and 13C chemical shifts with DFT accuracy using a 3D graph neural network". United Kingdom. https://doi.org/10.1039/D1SC03343C.
@article{osti_1812410,
title = {Real-time prediction of 1H and 13C chemical shifts with DFT accuracy using a 3D graph neural network},
author = {Guan, Yanfei and Shree Sowndarya, S. V. and Gallegos, Liliana C. and St. John, Peter C. and Paton, Robert S.},
abstractNote = {Nuclear magnetic resonance (NMR) is one of the primary techniques used to elucidate the chemical structure, bonding, stereochemistry, and conformation of organic compounds. The distinct chemical shifts in an NMR spectrum depend upon each atom's local chemical environment and are influenced by both through-bond and through-space interactions with other atoms and functional groups. The in silico prediction of NMR chemical shifts using quantum mechanical (QM) calculations is now commonplace in aiding organic structural assignment since spectra can be computed for several candidate structures and then compared with experimental values to find the best possible match. However, the computational demands of calculating multiple structural- and stereo-isomers, each of which may typically exist as an ensemble of rapidly-interconverting conformations, are expensive. Additionally, the QM predictions themselves may lack sufficient accuracy to identify a correct structure. In this work, we address both of these shortcomings by developing a rapid machine learning (ML) protocol to predict 1H and 13C chemical shifts through an efficient graph neural network (GNN) using 3D structures as input. Transfer learning with experimental data is used to improve the final prediction accuracy of a model trained using QM calculations. When tested on the CHESHIRE dataset, the proposed model predicts observed 13C chemical shifts with comparable accuracy to the best-performing DFT functionals (1.5 ppm) in around 1/6000 of the CPU time. An automated prediction webserver and graphical interface are accessible online at http://nova.chem.colostate.edu/cascade/. 
We further demonstrate the model in three applications: first, we use the model to decide the correct organic structure from candidates through experimental spectra, including complex stereoisomers; second, we automatically detect and revise incorrect chemical shift assignments in a popular NMR database, the NMRShiftDB; and third, we use NMR chemical shifts as descriptors for determination of the sites of electrophilic aromatic substitution.},
doi = {10.1039/D1SC03343C},
journal = {Chemical Science},
number = 36,
volume = 12,
place = {United Kingdom},
year = {2021},
month = {sep}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record
https://doi.org/10.1039/D1SC03343C

