DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A Case Study of Multimodal, Multi-institutional Data Management for the Combinatorial Materials Science Community

Journal Article · Integrating Materials and Manufacturing Innovation
[1]; [2]; [3]; [4]; [5]; [6]; [7]; [7]; [8]; [9]; [10]; [11]
  1. Citrine Informatics, Redwood City, CA (United States); SLAC
  2. Citrine Informatics, Redwood City, CA (United States); 11:59, Sacramento, CA (United States)
  3. SLAC National Accelerator Laboratory (SLAC), Menlo Park, CA (United States); ZEISS Microscopy, Dublin, CA (United States)
  4. Citrine Informatics, Redwood City, CA (United States); NobleAI, San Francisco, CA (United States)
  5. Univ. of Maryland, College Park, MD (United States); National Inst. of Standards and Technology (NIST), Gaithersburg, MD (United States)
  6. National Inst. of Standards and Technology (NIST), Gaithersburg, MD (United States)
  7. Univ. of Maryland, College Park, MD (United States)
  8. Northwestern Univ., Evanston, IL (United States); Argonne National Laboratory (ANL), Argonne, IL (United States)
  9. Citrine Informatics, Redwood City, CA (United States)
  10. Argonne National Laboratory (ANL), Argonne, IL (United States)
  11. SLAC National Accelerator Laboratory (SLAC), Menlo Park, CA (United States)

Although the convergence of high-performance computing, automation, and machine learning has significantly altered the materials design timeline, transformative advances in functional materials and acceleration of their design will require addressing the deficiencies that currently exist in materials informatics, particularly a lack of standardized experimental data management. These challenges are especially acute in combinatorial materials science, where advances in the automation of experimental workflows have produced datasets that are often too large and too complex for human reasoning. The data management challenge is further compounded by the multimodal and multi-institutional nature of these datasets, which tend to be distributed across multiple institutions and can vary substantially in format, size, and content. Furthermore, modern materials engineering requires tuning not only composition but also phase and microstructure to elucidate processing–structure–property–performance relationships. To adequately map a materials design space from such datasets, an ideal materials data infrastructure would contain data and metadata describing (i) synthesis and processing conditions, (ii) characterization results, and (iii) property and performance measurements. In this work, we present a case study for the low-barrier development of a dashboard that enables standardized organization, analysis, and visualization of a large data lake consisting of combinatorial datasets of synthesis and processing conditions, X-ray diffraction patterns, and materials property measurements generated at several different institutions. While this dashboard was developed specifically for data-driven thermoelectric materials discovery, we envision the adaptation of this prototype to other materials applications and, more ambitiously, its future integration into an all-encompassing materials data management infrastructure.

Research Organization:
SLAC National Accelerator Laboratory (SLAC), Menlo Park, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), High Energy Physics (HEP); USDOE Office of Energy Efficiency and Renewable Energy (EERE), Energy Efficiency Office. Advanced Materials & Manufacturing Technologies Office (AMMTO)
Grant/Contract Number:
AC02-76SF00515
OSTI ID:
2403654
Journal Information:
Journal Name: Integrating Materials and Manufacturing Innovation; Vol. 13; Journal Issue: 2; ISSN 2193-9764
Publisher:
Springer
Country of Publication:
United States
Language:
English
