DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Learning atoms for materials discovery

Abstract

Exciting advances have been made in artificial intelligence (AI) in recent decades. Among them, applications of machine learning (ML) and deep learning techniques have brought human-competitive performance to tasks in various fields, including image recognition, speech recognition, and natural language understanding. Even in Go, the ancient game of profound complexity, an AI player has already beaten human world champions convincingly, both with and without learning from human play. In this work, we show that our unsupervised machines (Atom2Vec) can learn the basic properties of atoms by themselves from an extensive database of known compounds and materials. These learned properties are represented as high-dimensional vectors, and clustering the atoms in vector space classifies them into meaningful groups consistent with human knowledge. Furthermore, we use the atom vectors as basic input units for neural networks and other ML models designed and trained to predict materials properties, and these models demonstrate significant accuracy.
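
The abstract states that atom vectors are learned without supervision from a database of compounds and that similar atoms end up close together in vector space, but this record does not describe the procedure. The following is a minimal sketch of one way such vectors could be obtained, assuming a co-occurrence approach: count how often each atom appears in each "environment" (here, simply its position and partner atom in a binary compound) and factorize the count matrix with a singular value decomposition. The toy compound list, the function names `atom_vectors` and `cosine`, and the `dim` parameter are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import numpy as np

# Hypothetical toy "database" of binary compounds; a real run would use
# a large set of known materials instead.
compounds = [("Na", "Cl"), ("K", "Cl"), ("Na", "Br"), ("K", "Br"),
             ("Mg", "O"), ("Ca", "O"), ("Mg", "S"), ("Ca", "S")]


def atom_vectors(compounds, dim=4):
    """Return {atom: vector} from an atom-environment count matrix."""
    atoms = sorted({a for pair in compounds for a in pair})
    # Assumption: the environment of an atom is (its position, its partner),
    # i.e. what is left of the compound once the atom is removed.
    envs = sorted({(i, pair[1 - i]) for pair in compounds for i in (0, 1)})

    counts = np.zeros((len(atoms), len(envs)))
    for pair in compounds:
        for i in (0, 1):
            row = atoms.index(pair[i])
            col = envs.index((i, pair[1 - i]))
            counts[row, col] += 1

    # Truncated SVD: keep the `dim` leading singular directions.
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    vectors = u[:, :dim] * s[:dim]
    return dict(zip(atoms, vectors))


def cosine(a, b):
    """Cosine similarity between two atom vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


if __name__ == "__main__":
    vecs = atom_vectors(compounds)
    print("Na vs K :", cosine(vecs["Na"], vecs["K"]))
    print("Na vs Cl:", cosine(vecs["Na"], vecs["Cl"]))
```

In this toy data Na and K occur in exactly the same environments, so their similarity should come out close to 1, while Na and Cl share no environment and should come out close to 0. Grouping atoms by such similarities is the kind of clustering the abstract refers to, though the actual model and data used by the authors may differ.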

Authors:
Zhou, Quan [1]; Tang, Peizhe [1]; Liu, Shenxiu [1]; Pan, Jinbo [2]; Yan, Qimin [2]; Zhang, Shou-Cheng [3]
  1. Department of Physics, Stanford University, Stanford, CA 94305-4045
  2. Department of Physics, Temple University, Philadelphia, PA 19122
  3. Department of Physics, Stanford University, Stanford, CA 94305-4045; Stanford Institute for Materials and Energy Sciences, SLAC National Accelerator Laboratory, Menlo Park, CA 94025
Publication Date:
2018-06-26
Research Org.:
Energy Frontier Research Centers (EFRC) (United States). Center for Complex Materials from First Principles (CCM); SLAC National Accelerator Laboratory (SLAC), Menlo Park, CA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1457210
Alternate Identifier(s):
OSTI ID: 1463356
Grant/Contract Number:  
AC02-76SF00515; SC0012575
Resource Type:
Published Article
Journal Name:
Proceedings of the National Academy of Sciences of the United States of America
Additional Journal Information:
Journal Name: Proceedings of the National Academy of Sciences of the United States of America; Journal Volume: 115; Journal Issue: 28; Journal ID: ISSN 0027-8424
Publisher:
Proceedings of the National Academy of Sciences
Country of Publication:
United States
Language:
English
Subject:
74 ATOMIC AND MOLECULAR PHYSICS; atomism; machine learning; materials discovery

Citation Formats

Zhou, Quan, Tang, Peizhe, Liu, Shenxiu, Pan, Jinbo, Yan, Qimin, and Zhang, Shou-Cheng. Learning atoms for materials discovery. United States: N. p., 2018. Web. doi:10.1073/pnas.1801181115.
Zhou, Quan, Tang, Peizhe, Liu, Shenxiu, Pan, Jinbo, Yan, Qimin, & Zhang, Shou-Cheng. Learning atoms for materials discovery. United States. https://doi.org/10.1073/pnas.1801181115
Zhou, Quan, Tang, Peizhe, Liu, Shenxiu, Pan, Jinbo, Yan, Qimin, and Zhang, Shou-Cheng. 2018. "Learning atoms for materials discovery". United States. https://doi.org/10.1073/pnas.1801181115.
@article{osti_1457210,
title = {Learning atoms for materials discovery},
author = {Zhou, Quan and Tang, Peizhe and Liu, Shenxiu and Pan, Jinbo and Yan, Qimin and Zhang, Shou-Cheng},
abstractNote = {Exciting advances have been made in artificial intelligence (AI) in recent decades. Among them, applications of machine learning (ML) and deep learning techniques have brought human-competitive performance to tasks in various fields, including image recognition, speech recognition, and natural language understanding. Even in Go, the ancient game of profound complexity, an AI player has already beaten human world champions convincingly, both with and without learning from human play. In this work, we show that our unsupervised machines (Atom2Vec) can learn the basic properties of atoms by themselves from an extensive database of known compounds and materials. These learned properties are represented as high-dimensional vectors, and clustering the atoms in vector space classifies them into meaningful groups consistent with human knowledge. Furthermore, we use the atom vectors as basic input units for neural networks and other ML models designed and trained to predict materials properties, and these models demonstrate significant accuracy.},
doi = {10.1073/pnas.1801181115},
journal = {Proceedings of the National Academy of Sciences of the United States of America},
number = 28,
volume = 115,
place = {United States},
year = {2018},
month = jun
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record
https://doi.org/10.1073/pnas.1801181115

Citation Metrics:
Cited by: 106 works
Citation information provided by
Web of Science

Works referenced in this record:

Perspective: Materials informatics and big data: Realization of the “fourth paradigm” of science in materials science
journal, April 2016

  • Agrawal, Ankit; Choudhary, Alok
  • APL Materials, Vol. 4, Issue 5
  • DOI: 10.1063/1.4946894

Molecular graph convolutions: moving beyond fingerprints
journal, August 2016

  • Kearnes, Steven; McCloskey, Kevin; Berndl, Marc
  • Journal of Computer-Aided Molecular Design, Vol. 30, Issue 8
  • DOI: 10.1007/s10822-016-9938-8

The high-throughput highway to computational materials design
journal, February 2013

  • Curtarolo, Stefano; Hart, Gus L. W.; Nardelli, Marco Buongiorno
  • Nature Materials, Vol. 12, Issue 3
  • DOI: 10.1038/nmat3568

Mastering the game of Go without human knowledge
journal, October 2017

  • Silver, David; Schrittwieser, Julian; Simonyan, Karen
  • Nature, Vol. 550, Issue 7676
  • DOI: 10.1038/nature24270

Big Data of Materials Science: Critical Role of the Descriptor
journal, March 2015


Tl₂LiYCl₆:Ce: A New Elpasolite Scintillator
journal, December 2016

  • Hawrami, R.; Ariesanti, E.; Soundara-Pandian, L.
  • IEEE Transactions on Nuclear Science, Vol. 63, Issue 6
  • DOI: 10.1109/TNS.2016.2627523

Deep learning
journal, May 2015

  • LeCun, Yann; Bengio, Yoshua; Hinton, Geoffrey
  • Nature, Vol. 521, Issue 7553
  • DOI: 10.1038/nature14539

Accelerated search for materials with targeted properties by adaptive design
journal, April 2016

  • Xue, Dezhen; Balachandran, Prasanna V.; Hogden, John
  • Nature Communications, Vol. 7, Issue 1
  • DOI: 10.1038/ncomms11241

Machine Learning Energies of 2 Million Elpasolite (ABC₂D₆) Crystals
journal, September 2016


Quantum-chemical insights from deep tensor neural networks
journal, January 2017

  • Schütt, Kristof T.; Arbabzadah, Farhad; Chmiela, Stefan
  • Nature Communications, Vol. 8, Issue 1
  • DOI: 10.1038/ncomms13890

Mastering the game of Go with deep neural networks and tree search
journal, January 2016

  • Silver, David; Huang, Aja; Maddison, Chris J.
  • Nature, Vol. 529, Issue 7587
  • DOI: 10.1038/nature16961

Design of efficient molecular organic light-emitting diodes by a high-throughput virtual screening and experimental approach
journal, August 2016

  • Gómez-Bombarelli, Rafael; Aguilera-Iparraguirre, Jorge; Hirzel, Timothy D.
  • Nature Materials, Vol. 15, Issue 10
  • DOI: 10.1038/nmat4717

Towards the computational design of solid catalysts
journal, April 2009

  • Nørskov, J.; Bligaard, T.; Rossmeisl, J.
  • Nature Chemistry, Vol. 1, Issue 1, p. 37-46
  • DOI: 10.1038/nchem.121

Machine-learning-assisted materials discovery using failed experiments
journal, May 2016

  • Raccuglia, Paul; Elbert, Katherine C.; Adler, Philip D. F.
  • Nature, Vol. 533, Issue 7601
  • DOI: 10.1038/nature17439

Crystal structure representations for machine learning models of formation energies
journal, April 2015

  • Faber, Felix; Lindmaa, Alexander; von Lilienfeld, O. Anatole
  • International Journal of Quantum Chemistry, Vol. 115, Issue 16
  • DOI: 10.1002/qua.24917

A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge.
journal, January 1997


Glove: Global Vectors for Word Representation
conference, January 2014

  • Pennington, Jeffrey; Socher, Richard; Manning, Christopher
  • Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)
  • DOI: 10.3115/v1/D14-1162

Distributional Structure
journal, August 1954


Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups
journal, November 2012


Tunable multifunctional topological insulators in ternary Heusler compounds
journal, May 2010

  • Chadov, Stanislav; Qi, Xiaoliang; Kübler, Jürgen
  • Nature Materials, Vol. 9, Issue 7
  • DOI: 10.1038/nmat2770

Commentary: The Materials Project: A materials genome approach to accelerating materials innovation
journal, July 2013

  • Jain, Anubhav; Ong, Shyue Ping; Hautier, Geoffroy
  • APL Materials, Vol. 1, Issue 1
  • DOI: 10.1063/1.4812323

Battery materials for ultrafast charging and discharging
journal, March 2009

  • Kang, Byoungwoo; Ceder, Gerbrand
  • Nature, Vol. 458, Issue 7235, p. 190-193
  • DOI: 10.1038/nature07853

Combinatorial screening for new materials in unconstrained composition space with machine learning
journal, March 2014