OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Nanomaterial Synthesis Insights from Machine Learning of Scientific Articles by Extracting, Structuring, and Visualizing Knowledge

Journal Article · Journal of Chemical Information and Modeling
 [1];  [2];  [3];  [3];  [2];  [2];  [1];  [2];  [2];  [1]
  1. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States). Materials Science Division
  2. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States). Center for Applied Scientific Computing
  3. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States). Global Security Computing Applications Division

Nanomaterials of varying compositions and morphologies are of interest for many applications from catalysis to optics, but the synthesis of nanomaterials and their scale-up are most often time-consuming and Edisonian processes. Information gleaned from the scientific literature can help inform and accelerate nanomaterials development, but again, searching the literature and digesting the information are time-consuming manual processes for researchers. To help address these challenges, here we developed scientific article-processing tools that extract and structure information from the text and figures of nanomaterials articles, thereby enabling the creation of a personalized knowledgebase for nanomaterials synthesis that can be mined to help inform further nanomaterials development. Starting with a corpus of ~35k nanomaterials-related articles, we developed models to classify articles according to the nanomaterial composition and morphology, extract synthesis protocols from within the articles’ text, and extract, normalize, and categorize chemical terms within synthesis protocols. We demonstrate the efficiency of the proposed pipeline on an expert-labeled set of nanomaterials synthesis articles, achieving 100% accuracy on composition prediction, 95% accuracy on morphology prediction, 0.99 AUC on protocol identification, and up to a 0.87 F1-score on chemical entity recognition. In addition to processing articles’ text, microscopy images of nanomaterials within the articles are also automatically identified and analyzed to determine the nanomaterials’ morphologies and size distributions. To enable users to easily explore the database, we developed a complementary browser-based visualization tool that provides flexibility in comparing across subsets of articles of interest. We use these tools and information to identify trends in nanomaterials synthesis, such as the correlation of certain reagents with various nanomaterial morphologies, which is useful in guiding hypotheses and reducing the potential parameter space during experimental design.
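
The trend analysis mentioned at the end of the abstract (relating reagents to morphologies) can be illustrated with a small sketch. This is not the authors' code. It assumes the pipeline's output can be reduced to one record per article holding a predicted morphology and a set of normalized reagent names. The record layout, the function name `reagent_share_by_morphology`, and the placeholder values such as `reagent_A` are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical records standing in for rows of an extracted knowledge base.
# Field names and values are placeholders, not data from the article.
records = [
    {"morphology": "rod", "reagents": {"reagent_A", "reagent_B"}},
    {"morphology": "rod", "reagents": {"reagent_A", "reagent_C"}},
    {"morphology": "cube", "reagents": {"reagent_B", "reagent_C"}},
    {"morphology": "cube", "reagents": {"reagent_C"}},
]


def reagent_share_by_morphology(records):
    """Map each morphology to {reagent: fraction of its articles mentioning that reagent}."""
    totals = Counter(r["morphology"] for r in records)
    counts = defaultdict(Counter)
    for r in records:
        # Each reagent is counted at most once per article because reagents is a set.
        counts[r["morphology"]].update(r["reagents"])
    return {
        morphology: {reagent: n / totals[morphology] for reagent, n in c.items()}
        for morphology, c in counts.items()
    }


shares = reagent_share_by_morphology(records)
for morphology in sorted(shares):
    # Rank reagents by share (descending), breaking ties alphabetically.
    ranked = sorted(shares[morphology].items(), key=lambda kv: (-kv[1], kv[0]))
    print(morphology, ranked)
```

A reagent with a high share for one morphology and a low share for others would be a candidate for closer experimental attention. A simple count like this shows co-occurrence only, not causation.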

Research Organization:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA); USDOE Laboratory Directed Research and Development (LDRD) Program
Grant/Contract Number:
AC52-07NA27344
OSTI ID:
1669214
Report Number(s):
LLNL-JRNL-779959; 971534
Journal Information:
Journal of Chemical Information and Modeling, Vol. 60, Issue 6; ISSN 1549-9596
Publisher:
American Chemical Society
Country of Publication:
United States
Language:
English

Similar Records

Materials Informatics ChemVis
Software · Mon Jun 17 00:00:00 EDT 2019 · OSTI ID:1669214

Molecular Assemblies, Genes and Genomics Integrated Efficiently (MAGGIE)
Technical Report · Thu May 26 00:00:00 EDT 2011 · OSTI ID:1669214

Text-mined dataset of gold nanoparticle synthesis procedures, morphologies, and size entities
Journal Article · Thu May 26 00:00:00 EDT 2022 · Scientific Data · OSTI ID:1669214