U.S. Department of Energy
Office of Scientific and Technical Information

A data ecosystem to support machine learning in materials science

Journal Article · MRS Communications
DOI: https://doi.org/10.1557/mrc.2019.118 · OSTI ID: 1607645
 [1];  [1];  [2];  [3];  [1];  [4];  [1];  [1]
  1. Univ. of Chicago, IL (United States); Argonne National Lab. (ANL), Argonne, IL (United States)
  2. Argonne National Lab. (ANL), Argonne, IL (United States)
  3. Univ. of Chicago, IL (United States)
  4. Cornell Univ., Ithaca, NY (United States)
Facilitating the application of machine learning to materials science problems requires enhancing the data ecosystem to enable discovery and collection of data from many sources, automated dissemination of new data across the ecosystem, and the connecting of data with materials-specific machine learning models. Here, we present two projects, the Materials Data Facility (MDF) and the Data and Learning Hub for Science (DLHub), that address these needs. We use examples to show how MDF and DLHub capabilities can be leveraged to link data with machine learning models and how users can access those capabilities through web and programmatic interfaces.
Research Organization:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
USDOE Office of Science (SC); National Inst. of Standards and Technology (NIST), Boulder, CO (United States)
Grant/Contract Number:
AC02-06CH11357
OSTI ID:
1607645
Journal Information:
MRS Communications, Vol. 9, Issue 4; ISSN 2159-6859
Publisher:
Materials Research Society - Cambridge University Press
Country of Publication:
United States
Language:
English

Cited By (3)

A Cloud-Based Framework for Machine Learning Workloads and Applications journal January 2020
Biofilm Rupture by Laser-Induced Stress Waves Increases with Loading Amplitude, Independent of Location journal February 2020
Dredging a data lake: decentralized metadata extraction conference December 2019
  • Skluzacek, Tyler J.
  • Middleware '19: Proceedings of the 20th International Middleware Conference Doctoral Symposium https://doi.org/10.1145/3366624.3368170

Similar Records

Data automation at light sources
Conference · Mon Dec 31 23:00:00 EST 2018 · OSTI ID:1558638

DLHub: Simplifying publication, discovery, and use of machine learning models in science
Journal Article · Thu Aug 27 00:00:00 EDT 2020 · Journal of Parallel and Distributed Computing · OSTI ID:1837199

Emerging materials intelligence ecosystems propelled by machine learning
Journal Article · Sun Nov 08 23:00:00 EST 2020 · Nature Reviews. Materials · OSTI ID:1864296
