Message-passing neural networks for high-throughput polymer screening
Abstract
Machine learning methods have shown promise in predicting molecular properties, and given sufficient training data, machine learning approaches can enable rapid high-throughput virtual screening of large libraries of compounds. Graph-based neural network architectures have emerged in recent years as the most successful approach for predictions based on molecular structure and have consistently achieved the best performance on benchmark quantum chemical datasets. However, these models have typically required optimized 3D structural information for the molecule to achieve the highest accuracy. These 3D geometries are costly to compute for high levels of theory, limiting the applicability and practicality of machine learning methods in high-throughput screening applications. In this study, we present a new database of candidate molecules for organic photovoltaic applications, comprising approximately 91,000 unique chemical structures. Compared to existing datasets, this dataset contains substantially larger molecules (up to 200 atoms) as well as extrapolated properties for long polymer chains. We show that message-passing neural networks trained with and without 3D structural information for these molecules achieve similar accuracy, comparable to state-of-the-art methods on existing benchmark datasets. Furthermore, these results therefore emphasize that for larger molecules with practical applications, near-optimal prediction results can be obtained without using optimized 3D geometry as an input. We further show that learned molecular representations can be leveraged to reduce the training data required to transfer predictions to a new density functional theory functional.
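- Authors:
- St. John, Peter C.; Phillips, Caleb T.; Kemper, Travis W.; Wilson, A. Nolan; Guan, Yanfei; Crowley, Michael F.; Nimlos, Mark R.; Larsen, Ross E.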
- National Renewable Energy Lab. (NREL), Golden, CO (United States)
- Colorado State Univ., Fort Collins, CO (United States)
- Publication Date:
- June 19, 2019
- Research Org.:
- National Renewable Energy Laboratory (NREL), Golden, CO (United States)
- Sponsoring Org.:
- USDOE Office of Energy Efficiency and Renewable Energy (EERE), Sustainable Transportation Office. Bioenergy Technologies Office
- OSTI Identifier:
- 1543249
- Alternate Identifier(s):
- OSTI ID: 1527066
- Report Number(s):
- NREL/JA-2700-74021
- Journal ID: ISSN 0021-9606
- Grant/Contract Number:
- AC36-08GO28308
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Journal of Chemical Physics
- Additional Journal Information:
- Journal Volume: 150; Journal Issue: 23; Journal ID: ISSN 0021-9606
- Publisher:
- American Institute of Physics (AIP)
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 14 SOLAR ENERGY; 71 CLASSICAL AND QUANTUM MECHANICS, GENERAL PHYSICS; chemical compounds and components; isomerism; photovoltaics; regression analysis; machine learning; optoelectronic properties; artificial neural networks; molecular properties
Citation Formats
St. John, Peter C., Phillips, Caleb T., Kemper, Travis W., Wilson, A. Nolan, Guan, Yanfei, Crowley, Michael F., Nimlos, Mark R., and Larsen, Ross E. Message-passing neural networks for high-throughput polymer screening. United States: N. p., 2019. Web. doi:10.1063/1.5099132.
St. John, Peter C., Phillips, Caleb T., Kemper, Travis W., Wilson, A. Nolan, Guan, Yanfei, Crowley, Michael F., Nimlos, Mark R., & Larsen, Ross E. Message-passing neural networks for high-throughput polymer screening. United States. https://doi.org/10.1063/1.5099132
St. John, Peter C., Phillips, Caleb T., Kemper, Travis W., Wilson, A. Nolan, Guan, Yanfei, Crowley, Michael F., Nimlos, Mark R., and Larsen, Ross E. 2019. "Message-passing neural networks for high-throughput polymer screening". United States. https://doi.org/10.1063/1.5099132. https://www.osti.gov/servlets/purl/1543249.
@article{osti_1543249,
title = {Message-passing neural networks for high-throughput polymer screening},
author = {St. John, Peter C. and Phillips, Caleb T. and Kemper, Travis W. and Wilson, A. Nolan and Guan, Yanfei and Crowley, Michael F. and Nimlos, Mark R. and Larsen, Ross E.},
abstractNote = {Machine learning methods have shown promise in predicting molecular properties, and given sufficient training data, machine learning approaches can enable rapid high-throughput virtual screening of large libraries of compounds. Graph-based neural network architectures have emerged in recent years as the most successful approach for predictions based on molecular structure and have consistently achieved the best performance on benchmark quantum chemical datasets. However, these models have typically required optimized 3D structural information for the molecule to achieve the highest accuracy. These 3D geometries are costly to compute for high levels of theory, limiting the applicability and practicality of machine learning methods in high-throughput screening applications. In this study, we present a new database of candidate molecules for organic photovoltaic applications, comprising approximately 91,000 unique chemical structures. Compared to existing datasets, this dataset contains substantially larger molecules (up to 200 atoms) as well as extrapolated properties for long polymer chains. We show that message-passing neural networks trained with and without 3D structural information for these molecules achieve similar accuracy, comparable to state-of-the-art methods on existing benchmark datasets. Furthermore, these results therefore emphasize that for larger molecules with practical applications, near-optimal prediction results can be obtained without using optimized 3D geometry as an input. We further show that learned molecular representations can be leveraged to reduce the training data required to transfer predictions to a new density functional theory functional.},
doi = {10.1063/1.5099132},
journal = {Journal of Chemical Physics},
number = 23,
volume = 150,
place = {United States},
year = {2019},
month = {6}
}