Title: Materials Synthesis Insights from Scientific Literature via Text Extraction and Machine Learning

In the past several years, Materials Genome Initiative (MGI) efforts have produced myriad examples of computationally designed materials in the fields of energy storage, catalysis, thermoelectrics, and hydrogen storage as well as large data resources that are used to screen for potentially transformative compounds. The bottleneck in high-throughput materials design has thus shifted to materials synthesis, which motivates our development of a methodology to automatically compile materials synthesis parameters across tens of thousands of scholarly publications using natural language processing techniques. To demonstrate our framework's capabilities, we examine the synthesis conditions for various metal oxides across more than 12 thousand manuscripts. We then apply machine learning methods to predict the critical parameters needed to synthesize titania nanotubes via hydrothermal methods and verify this result against known mechanisms. Lastly, we demonstrate the capacity for transfer learning by using machine learning models to predict synthesis outcomes on materials systems not included in the training set and thereby outperform heuristic strategies.
Authors:
Kim, Edward [1]; Huang, Kevin [1]; Saunders, Adam [2]; McCallum, Andrew [2]; Ceder, Gerbrand [3]; Olivetti, Elsa [1]
  1. Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States)
  2. Univ. of Massachusetts, Amherst, MA (United States)
  3. Univ. of California, Berkeley, CA (United States)
Publication Date:
Grant/Contract Number:
AC02-05CH11231
Type:
Accepted Manuscript
Journal Name:
Chemistry of Materials
Additional Journal Information:
Journal Volume: 29; Journal Issue: 21; Related Information: © 2017 American Chemical Society.; Journal ID: ISSN 0897-4756
Publisher:
American Chemical Society (ACS)
Research Org:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org:
USDOE Office of Science (SC), Basic Energy Sciences (BES) (SC-22)
Country of Publication:
United States
Language:
English
Subject:
36 MATERIALS SCIENCE
OSTI Identifier:
1476572

Kim, Edward, Huang, Kevin, Saunders, Adam, McCallum, Andrew, Ceder, Gerbrand, and Olivetti, Elsa. Materials Synthesis Insights from Scientific Literature via Text Extraction and Machine Learning. United States: N. p., 2017. Web. doi:10.1021/acs.chemmater.7b03500.
Kim, Edward, Huang, Kevin, Saunders, Adam, McCallum, Andrew, Ceder, Gerbrand, & Olivetti, Elsa. (2017). Materials Synthesis Insights from Scientific Literature via Text Extraction and Machine Learning. United States. doi:10.1021/acs.chemmater.7b03500.
Kim, Edward, Huang, Kevin, Saunders, Adam, McCallum, Andrew, Ceder, Gerbrand, and Olivetti, Elsa. 2017. "Materials Synthesis Insights from Scientific Literature via Text Extraction and Machine Learning". United States. doi:10.1021/acs.chemmater.7b03500. https://www.osti.gov/servlets/purl/1476572.
@article{osti_1476572,
title = {Materials Synthesis Insights from Scientific Literature via Text Extraction and Machine Learning},
author = {Kim, Edward and Huang, Kevin and Saunders, Adam and McCallum, Andrew and Ceder, Gerbrand and Olivetti, Elsa},
abstractNote = {In the past several years, Materials Genome Initiative (MGI) efforts have produced myriad examples of computationally designed materials in the fields of energy storage, catalysis, thermoelectrics, and hydrogen storage as well as large data resources that are used to screen for potentially transformative compounds. The bottleneck in high-throughput materials design has thus shifted to materials synthesis, which motivates our development of a methodology to automatically compile materials synthesis parameters across tens of thousands of scholarly publications using natural language processing techniques. To demonstrate our framework's capabilities, we examine the synthesis conditions for various metal oxides across more than 12 thousand manuscripts. We then apply machine learning methods to predict the critical parameters needed to synthesize titania nanotubes via hydrothermal methods and verify this result against known mechanisms. Lastly, we demonstrate the capacity for transfer learning by using machine learning models to predict synthesis outcomes on materials systems not included in the training set and thereby outperform heuristic strategies.},
doi = {10.1021/acs.chemmater.7b03500},
journal = {Chemistry of Materials},
number = 21,
volume = 29,
place = {United States},
year = {2017},
month = {10}
}