DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Enabling deeper learning on big data for materials informatics applications

Abstract

The application of machine learning (ML) techniques in materials science has attracted significant attention in recent years, due to their impressive ability to efficiently extract data-driven linkages from various input materials representations to their output properties. While the application of traditional ML techniques has become quite ubiquitous, there have been limited applications of more advanced deep learning (DL) techniques, primarily because big materials datasets are relatively rare. Given the demonstrated potential and advantages of DL and the increasing availability of big materials datasets, it is attractive to go for deeper neural networks in a bid to boost model performance, but in reality, it leads to performance degradation due to the vanishing gradient problem. In this paper, we address the question of how to enable deeper learning for cases where big materials data is available. Here, we present a general deep learning framework based on Individual Residual learning (IRNet) composed of very deep neural networks that can work with any vector-based materials representation as input to build accurate property prediction models. We find that the proposed IRNet models can not only successfully alleviate the vanishing gradient problem and enable deeper learning, but also lead to significantly (up to 47%) better model accuracy as compared to plain deep neural networks and traditional ML techniques for a given input materials representation in the presence of big data.
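The abstract's central idea, that shortcut connections keep signal flowing through a very deep stack of fully connected layers, can be illustrated with a small sketch. This is not the authors' IRNet implementation: it assumes that "individual residual" means an identity shortcut around each layer, and the layer width, depth, weight scale, and helper names (`dense_relu`, `residual_layer`, `sensitivity`) are hypothetical choices made only for the illustration. It omits training and any normalization. The sketch compares how strongly the output of a plain stack and a shortcut stack reacts to a small change in the input, which is the quantity that backpropagated gradients depend on.

```python
import math
import random


def dense_relu(x, weights, bias):
    """Plain fully connected layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def residual_layer(x, weights, bias):
    """Same layer wrapped in an identity shortcut: x + ReLU(Wx + b)."""
    return [xi + fi for xi, fi in zip(x, dense_relu(x, weights, bias))]


def init_layers(depth, width, rng, scale=0.02):
    """Small random square weight matrices and zero biases (illustrative only)."""
    return [([[rng.gauss(0.0, scale) for _ in range(width)]
              for _ in range(width)], [0.0] * width)
            for _ in range(depth)]


def forward(x, layers, residual):
    """Run x through the stack, with or without per-layer shortcuts."""
    layer_fn = residual_layer if residual else dense_relu
    for weights, bias in layers:
        x = layer_fn(x, weights, bias)
    return x


def sensitivity(x, layers, residual, eps=1e-6):
    """Finite-difference estimate of how strongly the output reacts to the input."""
    shifted = [xi + eps for xi in x]
    base = forward(x, layers, residual)
    moved = forward(shifted, layers, residual)
    return math.sqrt(sum((m - b) ** 2 for m, b in zip(moved, base))) / eps


# Hypothetical setup: 20 layers of width 8 acting on a random input vector.
rng = random.Random(0)
layers = init_layers(depth=20, width=8, rng=rng)
x = [rng.uniform(0.5, 1.5) for _ in range(8)]

plain = sensitivity(x, layers, residual=False)
with_shortcuts = sensitivity(x, layers, residual=True)
print(f"plain stack:         {plain:.3e}")
print(f"per-layer shortcuts: {with_shortcuts:.3e}")
```

With small weights, each plain layer shrinks the perturbation, so the effect compounds across 20 layers and the sensitivity collapses toward zero. The identity shortcut adds the input back at every layer, so the perturbation survives to the output. This shows only the propagation effect. It does not reproduce the accuracy gains reported in the abstract.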

Authors:
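Jha, Dipendra; Gupta, Vishu; Ward, Logan; Yang, Zijiang; Wolverton, Christopher; Foster, Ian; Liao, Wei-keng; Choudhary, Alok; Agrawal, Ankit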
Publication Date:
2021-02-19
Research Org.:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1766588
Alternate Identifier(s):
OSTI ID: 1823644
Grant/Contract Number:
SC0014330; DE-SC0019358; AC02-06CH11357
Resource Type:
Published Article
Journal Name:
Scientific Reports
Additional Journal Information:
Journal Name: Scientific Reports; Journal Volume: 11; Journal Issue: 1; Journal ID: ISSN 2045-2322
Publisher:
Nature Publishing Group
Country of Publication:
United Kingdom
Language:
English
Subject:
36 MATERIALS SCIENCE; Computational methods; Materials science

Citation Formats

Jha, Dipendra, Gupta, Vishu, Ward, Logan, Yang, Zijiang, Wolverton, Christopher, Foster, Ian, Liao, Wei-keng, Choudhary, Alok, and Agrawal, Ankit. Enabling deeper learning on big data for materials informatics applications. United Kingdom: N. p., 2021. Web. doi:10.1038/s41598-021-83193-1.
Jha, Dipendra, Gupta, Vishu, Ward, Logan, Yang, Zijiang, Wolverton, Christopher, Foster, Ian, Liao, Wei-keng, Choudhary, Alok, & Agrawal, Ankit (2021). Enabling deeper learning on big data for materials informatics applications. United Kingdom. https://doi.org/10.1038/s41598-021-83193-1
Jha, Dipendra, Gupta, Vishu, Ward, Logan, Yang, Zijiang, Wolverton, Christopher, Foster, Ian, Liao, Wei-keng, Choudhary, Alok, and Agrawal, Ankit. 2021. "Enabling deeper learning on big data for materials informatics applications". United Kingdom. https://doi.org/10.1038/s41598-021-83193-1.
@article{osti_1766588,
title = {Enabling deeper learning on big data for materials informatics applications},
author = {Jha, Dipendra and Gupta, Vishu and Ward, Logan and Yang, Zijiang and Wolverton, Christopher and Foster, Ian and Liao, Wei-keng and Choudhary, Alok and Agrawal, Ankit},
abstractNote = {The application of machine learning (ML) techniques in materials science has attracted significant attention in recent years, due to their impressive ability to efficiently extract data-driven linkages from various input materials representations to their output properties. While the application of traditional ML techniques has become quite ubiquitous, there have been limited applications of more advanced deep learning (DL) techniques, primarily because big materials datasets are relatively rare. Given the demonstrated potential and advantages of DL and the increasing availability of big materials datasets, it is attractive to go for deeper neural networks in a bid to boost model performance, but in reality, it leads to performance degradation due to the vanishing gradient problem. In this paper, we address the question of how to enable deeper learning for cases where big materials data is available. Here, we present a general deep learning framework based on Individual Residual learning (IRNet) composed of very deep neural networks that can work with any vector-based materials representation as input to build accurate property prediction models. We find that the proposed IRNet models can not only successfully alleviate the vanishing gradient problem and enable deeper learning, but also lead to significantly (up to 47%) better model accuracy as compared to plain deep neural networks and traditional ML techniques for a given input materials representation in the presence of big data.},
doi = {10.1038/s41598-021-83193-1},
journal = {Scientific Reports},
number = 1,
volume = 11,
place = {United Kingdom},
year = {2021},
month = feb
}
