OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: An Open Combinatorial Diffraction Dataset Including Consensus Human and Machine Learning Labels with Quantified Uncertainty for Training New Machine Learning Models

Journal Article · Integrating Materials and Manufacturing Innovation

Modern machine learning and autonomous experimentation schemes in materials science rely on accurate analysis of the data ingested by these models. Unfortunately, accurate analysis of the underlying data can be difficult, even for domain experts, complicating the training of the models intended to drive experiments. This is especially true when the goal is to identify the presence of weak signatures in diffraction or spectroscopic datasets. In this work, we examine a set of as-obtained diffraction data that track the phase transition from monoclinic to tetragonal in a Nb-doped VO2 film as a function of temperature and dopant concentration. We then task a set of domain experts and a set of machine learning experts with identifying which phase is present in each diffraction pattern, manually and algorithmically, respectively; in both cases, the labels can vary dramatically, especially at the phase boundaries. We use the mode of the labels and the Shannon entropy as a method to capture, preserve, and propagate consensus labels and their variance. Further, we use the expert labels as a benchmark and demonstrate the use of Shannon-entropy-weighted scoring to test the performance of machine-learning-generated labels. Finally, we propose a material data challenge centered around generating improved labeling algorithms. This real-world dataset, curated with expert labels, can act as a test bed for new algorithms. The raw data, annotations, and code used in this study are all available online at data.gov, and the interested reader is encouraged to replicate and improve the existing models.
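As a rough illustration of the consensus-labeling idea described in the abstract, the sketch below computes the mode and the Shannon entropy of a set of labels for each diffraction pattern, then scores predicted labels with an entropy-based weight. The abstract does not give the exact scoring formula, so the weight used here (one minus the entropy normalized by its maximum) is an assumption, as are the function names and the phase labels "M" (monoclinic) and "T" (tetragonal).

```python
from collections import Counter
from math import log2


def consensus_label(labels):
    """Return the mode of the labels (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the label distribution for one pattern."""
    counts = Counter(labels)
    n = len(labels)
    # Written as p * log2(1/p) so a unanimous vote yields exactly 0.0.
    return sum((c / n) * log2(n / c) for c in counts.values())


def entropy_weighted_accuracy(predicted, expert_labels, n_classes=2):
    """Score predictions against consensus labels.

    ASSUMPTION: each pattern is weighted by 1 - H / H_max, so patterns on
    which the experts disagree count for less. This is an illustrative
    choice, not necessarily the scoring used in the paper.
    """
    h_max = log2(n_classes)
    total_weight = 0.0
    score = 0.0
    for pred, labels in zip(predicted, expert_labels):
        weight = 1.0 - shannon_entropy(labels) / h_max
        total_weight += weight
        score += weight * (pred == consensus_label(labels))
    return score / total_weight if total_weight else 0.0


# Hypothetical labels from four experts for three diffraction patterns.
expert_labels = [
    ["M", "M", "M", "M"],  # unanimous, entropy 0
    ["M", "M", "T", "T"],  # evenly split, entropy 1 bit
    ["T", "T", "T", "M"],  # majority tetragonal
]
predicted = ["M", "T", "T"]
print(entropy_weighted_accuracy(predicted, expert_labels))
```

Under this weighting, a pattern on which the labelers split evenly contributes nothing to the score, so a disagreement with the consensus at an ambiguous phase boundary is penalized less than one on a pattern that every labeler agreed on.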

Research Organization:
National Renewable Energy Laboratory (NREL), Golden, CO (United States)
Sponsoring Organization:
USDOE National Renewable Energy Laboratory (NREL), Laboratory Directed Research and Development (LDRD) Program
Grant/Contract Number:
AC36-08GO28308
OSTI ID:
1798722
Report Number(s):
NREL/JA-5K00-78444; MainId:32361; UUID:5f3ffc91-f636-4657-aa90-7f0c4826215d; MainAdminID:25675
Journal Information:
Integrating Materials and Manufacturing Innovation, Vol. 10, Issue 2; ISSN 2193-9764
Publisher:
Springer
Country of Publication:
United States
Language:
English
