OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Nanomaterial Synthesis Insights from Machine Learning of Scientific Articles by Extracting, Structuring, and Visualizing Knowledge

Abstract

Nanomaterials of varying compositions and morphologies are of interest for many applications from catalysis to optics, but the synthesis of nanomaterials and their scale-up are most often time-consuming and Edisonian processes. Information gleaned from the scientific literature can help inform and accelerate nanomaterials development, but again, searching the literature and digesting the information are time-consuming manual processes for researchers. To help address these challenges, here we developed scientific article-processing tools that extract and structure information from the text and figures of nanomaterials articles, thereby enabling the creation of a personalized knowledgebase for nanomaterials synthesis that can be mined to help inform further nanomaterials development. Starting with a corpus of ~35k nanomaterials-related articles, we developed models to classify articles according to the nanomaterial composition and morphology, extract synthesis protocols from within the articles’ text, and extract, normalize, and categorize chemical terms within synthesis protocols. We demonstrate the efficiency of the proposed pipeline on an expert-labeled set of nanomaterials synthesis articles, achieving 100% accuracy on composition prediction, 95% accuracy on morphology prediction, 0.99 AUC on protocol identification, and up to a 0.87 F1-score on chemical entity recognition. In addition to processing articles’ text, microscopy images of nanomaterials within the articles are also automatically identified and analyzed to determine the nanomaterials’ morphologies and size distributions. To enable users to easily explore the database, we developed a complementary browser-based visualization tool that provides flexibility in comparing across subsets of articles of interest. We use these tools and information to identify trends in nanomaterials synthesis, such as the correlation of certain reagents with various nanomaterial morphologies, which is useful in guiding hypotheses and reducing the potential parameter space during experimental design.
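For readers unfamiliar with the F1-score quoted for chemical entity recognition, the following is a minimal sketch of how such a score can be computed for a single synthesis protocol. It assumes exact-match, set-based scoring of extracted chemical names against an expert-labeled list; this record does not describe the authors' actual evaluation protocol, and the function name `precision_recall_f1` and the example chemical names are hypothetical illustrations, not data from the article.

```python
from typing import Set, Tuple


def precision_recall_f1(predicted: Set[str], gold: Set[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for exact-match entity sets.

    Assumption: an extracted entity counts as correct only if the same
    string appears in the expert-labeled (gold) set.
    """
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical labels for one protocol (illustrative only).
gold = {"silver nitrate", "polyvinylpyrrolidone", "ethylene glycol", "hydrochloric acid"}
predicted = {"silver nitrate", "polyvinylpyrrolidone", "ethylene glycol", "water"}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case three of the four extracted names match the gold labels and three of the four gold labels are recovered, so precision, recall, and F1 all equal 0.75. Real evaluations often score at the token or span level instead, which can give different numbers.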

Authors:
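Hiszpanski, Anna M. [1]; Gallagher, Brian [2]; Chellappan, Karthik [3]; Li, Peggy [3]; Liu, Shusen [2]; Kim, Hyojin [2]; Han, Jinkyu [1]; Kailkhura, Bhavya [2]; Buttler, David J. [2]; Han, Thomas Yong-Jin [1]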
  1. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States). Materials Science Division
  2. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States). Center for Applied Scientific Computing
  3. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States). Global Security Computing Applications Division
Publication Date:
Research Org.:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA); USDOE Laboratory Directed Research and Development (LDRD) Program
OSTI Identifier:
1669214
Report Number(s):
LLNL-JRNL-779959
Journal ID: ISSN 1549-9596; 971534
Grant/Contract Number:  
AC52-07NA27344
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
Journal of Chemical Information and Modeling
Additional Journal Information:
Journal Volume: 60; Journal Issue: 6; Journal ID: ISSN 1549-9596
Publisher:
American Chemical Society
Country of Publication:
United States
Language:
English
Subject:
36 MATERIALS SCIENCE; biological databases; gold; morphology; nanocubes; nanomaterials

Citation Formats

Hiszpanski, Anna M., Gallagher, Brian, Chellappan, Karthik, Li, Peggy, Liu, Shusen, Kim, Hyojin, Han, Jinkyu, Kailkhura, Bhavya, Buttler, David J., and Han, Thomas Yong-Jin. Nanomaterial Synthesis Insights from Machine Learning of Scientific Articles by Extracting, Structuring, and Visualizing Knowledge. United States: N. p., 2020. Web. doi:10.1021/acs.jcim.0c00199.
Hiszpanski, Anna M., Gallagher, Brian, Chellappan, Karthik, Li, Peggy, Liu, Shusen, Kim, Hyojin, Han, Jinkyu, Kailkhura, Bhavya, Buttler, David J., & Han, Thomas Yong-Jin. Nanomaterial Synthesis Insights from Machine Learning of Scientific Articles by Extracting, Structuring, and Visualizing Knowledge. United States. https://doi.org/10.1021/acs.jcim.0c00199
Hiszpanski, Anna M., Gallagher, Brian, Chellappan, Karthik, Li, Peggy, Liu, Shusen, Kim, Hyojin, Han, Jinkyu, Kailkhura, Bhavya, Buttler, David J., and Han, Thomas Yong-Jin. Tue . "Nanomaterial Synthesis Insights from Machine Learning of Scientific Articles by Extracting, Structuring, and Visualizing Knowledge". United States. https://doi.org/10.1021/acs.jcim.0c00199. https://www.osti.gov/servlets/purl/1669214.
@article{osti_1669214,
title = {Nanomaterial Synthesis Insights from Machine Learning of Scientific Articles by Extracting, Structuring, and Visualizing Knowledge},
author = {Hiszpanski, Anna M. and Gallagher, Brian and Chellappan, Karthik and Li, Peggy and Liu, Shusen and Kim, Hyojin and Han, Jinkyu and Kailkhura, Bhavya and Buttler, David J. and Han, Thomas Yong-Jin},
doi = {10.1021/acs.jcim.0c00199},
url = {https://www.osti.gov/biblio/1669214},
journal = {Journal of Chemical Information and Modeling},
issn = {1549-9596},
number = 6,
volume = 60,
place = {United States},
year = {2020},
month = {4}
}
