DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Message-passing neural networks for high-throughput polymer screening

Abstract

Machine learning methods have shown promise in predicting molecular properties, and given sufficient training data, machine learning approaches can enable rapid high-throughput virtual screening of large libraries of compounds. Graph-based neural network architectures have emerged in recent years as the most successful approach for predictions based on molecular structure and have consistently achieved the best performance on benchmark quantum chemical datasets. However, these models have typically required optimized 3D structural information for the molecule to achieve the highest accuracy. These 3D geometries are costly to compute for high levels of theory, limiting the applicability and practicality of machine learning methods in high-throughput screening applications. In this study, we present a new database of candidate molecules for organic photovoltaic applications, comprising approximately 91,000 unique chemical structures. Compared to existing datasets, this dataset contains substantially larger molecules (up to 200 atoms) as well as extrapolated properties for long polymer chains. We show that message-passing neural networks trained with and without 3D structural information for these molecules achieve similar accuracy, comparable to state-of-the-art methods on existing benchmark datasets. These results therefore emphasize that for larger molecules with practical applications, near-optimal prediction results can be obtained without using optimized 3D geometry as an input. We further show that learned molecular representations can be leveraged to reduce the training data required to transfer predictions to a new density functional theory functional.

Authors:
St. John, Peter C. [1]; Phillips, Caleb T. [1]; Kemper, Travis W. [1]; Wilson, A. Nolan [1]; Guan, Yanfei [2]; Crowley, Michael F. [1]; Nimlos, Mark R. [1]; Larsen, Ross E. [1]
  1. National Renewable Energy Lab. (NREL), Golden, CO (United States)
  2. Colorado State Univ., Fort Collins, CO (United States)
Publication Date:
2019-06-19
Research Org.:
National Renewable Energy Laboratory (NREL), Golden, CO (United States)
Sponsoring Org.:
USDOE Office of Energy Efficiency and Renewable Energy (EERE), Sustainable Transportation Office. Bioenergy Technologies Office
OSTI Identifier:
1543249
Alternate Identifier(s):
OSTI ID: 1527066
Report Number(s):
NREL/JA-2700-74021
Journal ID: ISSN 0021-9606
Grant/Contract Number:  
AC36-08GO28308
Resource Type:
Accepted Manuscript
Journal Name:
Journal of Chemical Physics
Additional Journal Information:
Journal Volume: 150; Journal Issue: 23; Journal ID: ISSN 0021-9606
Publisher:
American Institute of Physics (AIP)
Country of Publication:
United States
Language:
English
Subject:
14 SOLAR ENERGY; 71 CLASSICAL AND QUANTUM MECHANICS, GENERAL PHYSICS; chemical compounds and components; isomerism; photovoltaics; regression analysis; machine learning; optoelectronic properties; artificial neural networks; molecular properties

Citation Formats

St. John, Peter C., Phillips, Caleb T., Kemper, Travis W., Wilson, A. Nolan, Guan, Yanfei, Crowley, Michael F., Nimlos, Mark R., and Larsen, Ross E. Message-passing neural networks for high-throughput polymer screening. United States: N. p., 2019. Web. doi:10.1063/1.5099132.
St. John, Peter C., Phillips, Caleb T., Kemper, Travis W., Wilson, A. Nolan, Guan, Yanfei, Crowley, Michael F., Nimlos, Mark R., & Larsen, Ross E. Message-passing neural networks for high-throughput polymer screening. United States. https://doi.org/10.1063/1.5099132
St. John, Peter C., Phillips, Caleb T., Kemper, Travis W., Wilson, A. Nolan, Guan, Yanfei, Crowley, Michael F., Nimlos, Mark R., and Larsen, Ross E. 2019. "Message-passing neural networks for high-throughput polymer screening". United States. https://doi.org/10.1063/1.5099132. https://www.osti.gov/servlets/purl/1543249.
@article{osti_1543249,
title = {Message-passing neural networks for high-throughput polymer screening},
author = {St. John, Peter C. and Phillips, Caleb T. and Kemper, Travis W. and Wilson, A. Nolan and Guan, Yanfei and Crowley, Michael F. and Nimlos, Mark R. and Larsen, Ross E.},
abstractNote = {Machine learning methods have shown promise in predicting molecular properties, and given sufficient training data, machine learning approaches can enable rapid high-throughput virtual screening of large libraries of compounds. Graph-based neural network architectures have emerged in recent years as the most successful approach for predictions based on molecular structure and have consistently achieved the best performance on benchmark quantum chemical datasets. However, these models have typically required optimized 3D structural information for the molecule to achieve the highest accuracy. These 3D geometries are costly to compute for high levels of theory, limiting the applicability and practicality of machine learning methods in high-throughput screening applications. In this study, we present a new database of candidate molecules for organic photovoltaic applications, comprising approximately 91,000 unique chemical structures. Compared to existing datasets, this dataset contains substantially larger molecules (up to 200 atoms) as well as extrapolated properties for long polymer chains. We show that message-passing neural networks trained with and without 3D structural information for these molecules achieve similar accuracy, comparable to state-of-the-art methods on existing benchmark datasets. Furthermore, these results therefore emphasize that for larger molecules with practical applications, near-optimal prediction results can be obtained without using optimized 3D geometry as an input. We further show that learned molecular representations can be leveraged to reduce the training data required to transfer predictions to a new density functional theory functional.},
doi = {10.1063/1.5099132},
journal = {Journal of Chemical Physics},
number = 23,
volume = 150,
place = {United States},
year = {2019},
month = {6}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 37 works
Citation information provided by
Web of Science

Works referenced in this record:

Machine learning-based screening of complex molecules for polymer solar cells
journal, June 2018

  • Jørgensen, Peter Bjørn; Mesta, Murat; Shil, Suranjan
  • The Journal of Chemical Physics, Vol. 148, Issue 24
  • DOI: 10.1063/1.5023563

Prediction Errors of Molecular Machine Learning Models Lower than Hybrid DFT Error
text, January 2017


Deep learning
journal, May 2015

  • LeCun, Yann; Bengio, Yoshua; Hinton, Geoffrey
  • Nature, Vol. 521, Issue 7553
  • DOI: 10.1038/nature14539

UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulations
journal, December 1992

  • Rappe, A. K.; Casewit, C. J.; Colwell, K. S.
  • Journal of the American Chemical Society, Vol. 114, Issue 25, p. 10024-10035
  • DOI: 10.1021/ja00051a040

Deep Learning
text, January 2018


The TensorMol-0.1 Model Chemistry: a Neural Network Augmented with Long-Range Physics
preprint, January 2017


Optimal Computer-Aided Molecular Design: A Polymer Design Case Study
journal, January 1996

  • Maranas, Costas D.
  • Industrial & Engineering Chemistry Research, Vol. 35, Issue 10
  • DOI: 10.1021/ie960096z

Prediction Errors of Molecular Machine Learning Models Lower than Hybrid DFT Error
journal, October 2017

  • Faber, Felix A.; Hutchison, Luke; Huang, Bing
  • Journal of Chemical Theory and Computation, Vol. 13, Issue 11
  • DOI: 10.1021/acs.jctc.7b00577

SchNet - a deep learning architecture for molecules and materials
text, January 2017


Design Rules for Donors in Bulk-Heterojunction Solar Cells—Towards 10 % Energy-Conversion Efficiency
journal, March 2006

  • Scharber, M. C.; Mühlbacher, D.; Koppe, M.
  • Advanced Materials, Vol. 18, Issue 6, p. 789-794
  • DOI: 10.1002/adma.200501717

Efficient Computational Screening of Organic Polymer Photovoltaics
journal, April 2013

  • Kanal, Ilana Y.; Owens, Steven G.; Bechtel, Jonathon S.
  • The Journal of Physical Chemistry Letters, Vol. 4, Issue 10
  • DOI: 10.1021/jz400215j

SchNet – A deep learning architecture for molecules and materials
journal, June 2018

  • Schütt, K. T.; Sauceda, H. E.; Kindermans, P. -J.
  • The Journal of Chemical Physics, Vol. 148, Issue 24
  • DOI: 10.1063/1.5019779

Learning from the Harvard Clean Energy Project: The Use of Neural Networks to Accelerate Materials Discovery
journal, September 2015

  • Pyzer-Knapp, Edward O.; Li, Kewei; Aspuru-Guzik, Alan
  • Advanced Functional Materials, Vol. 25, Issue 41
  • DOI: 10.1002/adfm.201501919

Designing Novel Polymers with Targeted Properties Using the Signature Molecular Descriptor
journal, March 2006

  • Brown, W. Michael; Martin, Shawn; Rintoul, Mark D.
  • Journal of Chemical Information and Modeling, Vol. 46, Issue 2
  • DOI: 10.1021/ci0504521

Photovoltaics from soluble small molecules
journal, November 2007


Accelerating materials property predictions using machine learning
journal, September 2013

  • Pilania, Ghanshyam; Wang, Chenchen; Jiang, Xun
  • Scientific Reports, Vol. 3, Issue 1
  • DOI: 10.1038/srep02810

Predicting molecular properties with covariant compositional networks
journal, June 2018

  • Hy, Truong Son; Trivedi, Shubhendu; Pan, Horace
  • The Journal of Chemical Physics, Vol. 148, Issue 24
  • DOI: 10.1063/1.5024797

Measuring and predicting sooting tendencies of oxygenates, alkanes, alkenes, cycloalkanes, and aromatics on a unified scale
journal, April 2018


ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost
text, January 2017

  • The University of North Carolina at Chapel Hill University Libraries
  • DOI: 10.17615/bhbf-9r93

Commentary: The Materials Project: A materials genome approach to accelerating materials innovation
journal, July 2013

  • Jain, Anubhav; Ong, Shyue Ping; Hautier, Geoffroy
  • APL Materials, Vol. 1, Issue 1
  • DOI: 10.1063/1.4812323

Enumeration of 166 Billion Organic Small Molecules in the Chemical Universe Database GDB-17
journal, November 2012

  • Ruddigkeit, Lars; van Deursen, Ruud; Blum, Lorenz C.
  • Journal of Chemical Information and Modeling, Vol. 52, Issue 11
  • DOI: 10.1021/ci300415d

Quantum-chemical insights from deep tensor neural networks
journal, January 2017

  • Schütt, Kristof T.; Arbabzadah, Farhad; Chmiela, Stefan
  • Nature Communications, Vol. 8, Issue 1
  • DOI: 10.1038/ncomms13890

The Harvard organic photovoltaic dataset
journal, September 2016

  • Lopez, Steven A.; Pyzer-Knapp, Edward O.; Simm, Gregor N.
  • Scientific Data, Vol. 3, Issue 1
  • DOI: 10.1038/sdata.2016.86

SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules
journal, February 1988

  • Weininger, David
  • Journal of Chemical Information and Modeling, Vol. 28, Issue 1
  • DOI: 10.1021/ci00057a005

Non-Basic High-Performance Molecules for Solution-Processed Organic Solar Cells
journal, June 2012

  • van der Poll, Thomas S.; Love, John A.; Nguyen, Thuc-Quyen
  • Advanced Materials, Vol. 24, Issue 27
  • DOI: 10.1002/adma.201201127

Quantitative Correlation of Physical and Chemical Properties with Chemical Structure: Utility for Prediction
journal, October 2010

  • Katritzky, Alan R.; Kuanar, Minati; Slavov, Svetoslav
  • Chemical Reviews, Vol. 110, Issue 10
  • DOI: 10.1021/cr900238d

Quantum-Chemical Insights from Deep Tensor Neural Networks
text, January 2016


A Quantitative Model for the Prediction of Sooting Tendency from Molecular Structure
journal, August 2017


Quantum chemistry structures and properties of 134 kilo molecules
journal, August 2014

  • Ramakrishnan, Raghunathan; Dral, Pavlo O.; Rupp, Matthias
  • Scientific Data, Vol. 1, Issue 1
  • DOI: 10.1038/sdata.2014.22

Simple Extrapolation Method To Predict the Electronic Structure of Conjugated Polymers from Calculations on Oligomers
journal, April 2016


Quantum chemistry structures and properties of 134 kilo molecules
text, January 2014


Works referencing / citing this record:

Deep Learning for Automated Classification and Characterization of Amorphous Materials
preprint, January 2019


Representations and descriptors unifying the study of molecular and bulk systems
journal, December 2019

  • Rossi, Kevin; Cumby, James
  • International Journal of Quantum Chemistry, Vol. 120, Issue 8
  • DOI: 10.1002/qua.26151