A Case Study of Multimodal, Multi-institutional Data Management for the Combinatorial Materials Science Community
- Citrine Informatics, Redwood City, CA (United States); SLAC National Accelerator Laboratory (SLAC), Menlo Park, CA (United States)
- Citrine Informatics, Redwood City, CA (United States); 11:59, Sacramento, CA (United States)
- SLAC National Accelerator Laboratory (SLAC), Menlo Park, CA (United States); ZEISS Microscopy, Dublin, CA (United States)
- Citrine Informatics, Redwood City, CA (United States); NobleAI, San Francisco, CA (United States)
- Univ. of Maryland, College Park, MD (United States); National Inst. of Standards and Technology (NIST), Gaithersburg, MD (United States)
- National Inst. of Standards and Technology (NIST), Gaithersburg, MD (United States)
- Univ. of Maryland, College Park, MD (United States)
- Northwestern Univ., Evanston, IL (United States); Argonne National Laboratory (ANL), Argonne, IL (United States)
- Citrine Informatics, Redwood City, CA (United States)
- Argonne National Laboratory (ANL), Argonne, IL (United States)
- SLAC National Accelerator Laboratory (SLAC), Menlo Park, CA (United States)
Although the convergence of high-performance computing, automation, and machine learning has significantly altered the materials design timeline, transformative advances in functional materials and acceleration of their design will require addressing the deficiencies that currently exist in materials informatics, particularly the lack of standardized experimental data management. These data management challenges are especially acute in combinatorial materials science, where advances in the automation of experimental workflows have produced datasets that are often too large and too complex for human reasoning. The challenge is further compounded by the multimodal and multi-institutional nature of these datasets, which tend to be distributed across multiple institutions and can vary substantially in format, size, and content. Furthermore, modern materials engineering requires tuning not only of composition but also of phase and microstructure to elucidate processing–structure–property–performance relationships. To adequately map a materials design space from such datasets, an ideal materials data infrastructure would contain data and metadata describing (i) synthesis and processing conditions, (ii) characterization results, and (iii) property and performance measurements. In this work, we present a case study for the low-barrier development of a dashboard that moves toward this ideal by enabling standardized organization, analysis, and visualization of a large data lake consisting of combinatorial datasets of synthesis and processing conditions, X-ray diffraction patterns, and materials property measurements generated at several different institutions. While this dashboard was developed specifically for data-driven thermoelectric materials discovery, we envision the adaptation of this prototype to other materials applications and, more ambitiously, its future integration into an all-encompassing materials data management infrastructure.
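The three categories of data named in the abstract, (i) synthesis and processing conditions, (ii) characterization results, and (iii) property measurements, can be pictured as fields of a single per-sample record. The Python sketch below is illustrative only and is not the schema used in the paper: the class `SampleRecord`, its field names, and the helper `group_by_institution` are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SampleRecord:
    """Hypothetical record for one sample on a combinatorial library."""

    sample_id: str
    institution: str  # where the data were generated
    # (i) synthesis and processing conditions, e.g. {"anneal_temp_C": ...}
    synthesis: dict = field(default_factory=dict)
    # (ii) characterization: X-ray diffraction as (two_theta, intensity) pairs
    xrd_pattern: list = field(default_factory=list)
    # (iii) property and performance measurements, keyed by property name
    properties: dict = field(default_factory=dict)

    def is_complete(self) -> bool:
        """True only if all three data categories are present."""
        return bool(self.synthesis) and bool(self.xrd_pattern) and bool(self.properties)


def group_by_institution(records):
    """Map each institution to the sample IDs it contributed."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.institution].append(record.sample_id)
    return dict(grouped)
```

A check such as `is_complete()` is one simple way to flag samples for which only some modalities have arrived from the contributing institutions, under the assumption that each modality is stored alongside a shared sample identifier.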
- Research Organization:
- SLAC National Accelerator Laboratory (SLAC), Menlo Park, CA (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), High Energy Physics (HEP); USDOE Office of Energy Efficiency and Renewable Energy (EERE), Energy Efficiency Office. Advanced Materials & Manufacturing Technologies Office (AMMTO)
- Grant/Contract Number:
- AC02-76SF00515
- OSTI ID:
- 2403654
- Journal Information:
- Journal Name: Integrating Materials and Manufacturing Innovation; Journal Volume: 13; Journal Issue: 2; ISSN 2193-9764
- Publisher:
- Springer
- Country of Publication:
- United States
- Language:
- English