DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Artificial Intelligence for Autonomous Molecular Design: A Perspective

Abstract

Domain-aware artificial intelligence has been increasingly adopted in recent years to expedite molecular design in various applications, including drug design and discovery. Recent advances in areas such as physics-informed machine learning and reasoning, software engineering, high-end hardware development, and computing infrastructures are providing opportunities to build scalable and explainable AI molecular discovery systems. This could improve a design hypothesis through feedback analysis, data integration that can provide a basis for the introduction of end-to-end automation for compound discovery and optimization, and enable more intelligent searches of chemical space. Several state-of-the-art ML architectures are predominantly and independently used for predicting the properties of small molecules, their high throughput synthesis, and screening, iteratively identifying and optimizing lead therapeutic candidates. However, such deep learning and ML approaches also raise considerable conceptual, technical, scalability, and end-to-end error quantification challenges, as well as skepticism about the current AI hype to build automated tools. To this end, synergistically and intelligently using these individual components along with robust quantum physics-based molecular representation and data generation tools in a closed-loop holds enormous promise for accelerated therapeutic design to critically analyze the opportunities and challenges for their more widespread application. This article aims to identify the most recent technology and breakthrough achieved by each of the components and discusses how such autonomous AI and ML workflows can be integrated to radically accelerate the protein target or disease model-based probe design that can be iteratively validated experimentally. Taken together, this could significantly reduce the timeline for end-to-end therapeutic discovery and optimization upon the arrival of any novel zoonotic transmission event.
Our article serves as a guide for medicinal, computational chemistry and biology, analytical chemistry, and the ML community to practice autonomous molecular design in precision medicine and drug discovery.

Authors:
Joshi, Rajendra P.; Kumar, Neeraj
Publication Date:
2021-11-09
Research Org.:
Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Sponsoring Org.:
USDOE Office of Science (SC); USDOE Laboratory Directed Research and Development (LDRD) Program
OSTI Identifier:
1829517
Alternate Identifier(s):
OSTI ID: 1833355
Report Number(s):
PNNL-SA-159775
Journal ID: ISSN 1420-3049; MOLEFW; PII: molecules26226761
Grant/Contract Number:  
NVBL; AC05-76RL01830
Resource Type:
Published Article
Journal Name:
Molecules
Additional Journal Information:
Journal Name: Molecules; Journal Volume: 26; Journal Issue: 22; Journal ID: ISSN 1420-3049
Publisher:
MDPI AG
Country of Publication:
Switzerland
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; autonomous workflow; therapeutic design; computer aided drug discovery; computational modeling and simulations; quantum mechanics and quantum computing; artificial intelligence; machine learning; deep learning; machine reasoning and causal inference and causal reasoning

Citation Formats

Joshi, Rajendra P., and Kumar, Neeraj. Artificial Intelligence for Autonomous Molecular Design: A Perspective. Switzerland: N. p., 2021. Web. doi:10.3390/molecules26226761.
Joshi, Rajendra P., & Kumar, Neeraj. (2021). Artificial Intelligence for Autonomous Molecular Design: A Perspective. Switzerland. https://doi.org/10.3390/molecules26226761
Joshi, Rajendra P., and Kumar, Neeraj. 2021. "Artificial Intelligence for Autonomous Molecular Design: A Perspective". Switzerland. https://doi.org/10.3390/molecules26226761.
@article{osti_1829517,
title = {Artificial Intelligence for Autonomous Molecular Design: A Perspective},
author = {Joshi, Rajendra P. and Kumar, Neeraj},
abstractNote = {Domain-aware artificial intelligence has been increasingly adopted in recent years to expedite molecular design in various applications, including drug design and discovery. Recent advances in areas such as physics-informed machine learning and reasoning, software engineering, high-end hardware development, and computing infrastructures are providing opportunities to build scalable and explainable AI molecular discovery systems. This could improve a design hypothesis through feedback analysis, data integration that can provide a basis for the introduction of end-to-end automation for compound discovery and optimization, and enable more intelligent searches of chemical space. Several state-of-the-art ML architectures are predominantly and independently used for predicting the properties of small molecules, their high throughput synthesis, and screening, iteratively identifying and optimizing lead therapeutic candidates. However, such deep learning and ML approaches also raise considerable conceptual, technical, scalability, and end-to-end error quantification challenges, as well as skepticism about the current AI hype to build automated tools. To this end, synergistically and intelligently using these individual components along with robust quantum physics-based molecular representation and data generation tools in a closed-loop holds enormous promise for accelerated therapeutic design to critically analyze the opportunities and challenges for their more widespread application. This article aims to identify the most recent technology and breakthrough achieved by each of the components and discusses how such autonomous AI and ML workflows can be integrated to radically accelerate the protein target or disease model-based probe design that can be iteratively validated experimentally. 
Taken together, this could significantly reduce the timeline for end-to-end therapeutic discovery and optimization upon the arrival of any novel zoonotic transmission event. Our article serves as a guide for medicinal, computational chemistry and biology, analytical chemistry, and the ML community to practice autonomous molecular design in precision medicine and drug discovery.},
doi = {10.3390/molecules26226761},
journal = {Molecules},
number = 22,
volume = 26,
place = {Switzerland},
year = {2021},
month = {11}
}
