U.S. Department of Energy
Office of Scientific and Technical Information

Explainable Synthesizability Prediction of Inorganic Crystal Polymorphs Using Large Language Models

Journal Article · Angewandte Chemie
 [1];  [2];  [3]
  1. Department of Chemical and Biological Engineering (BK21 four) Seoul National University 1 Gwanak‐ro Gwanak‐gu Seoul 08826 South Korea, Department of Chemical and Biomolecular Engineering Korea Advanced Institute of Science and Technology (KAIST) 291, Daehak‐ro Yuseong‐gu Daejeon 34141 South Korea
  2. Department of Chemistry and Biochemistry Fordham University 441 E. Fordham Road The Bronx New York 10458 USA
  3. Department of Chemical and Biological Engineering (BK21 four) Seoul National University 1 Gwanak‐ro Gwanak‐gu Seoul 08826 South Korea, Institute of Chemical Processes Seoul National University 1 Gwanak‐ro Gwanak‐gu Seoul 08826 South Korea, Institute of Engineering Research Seoul National University 1 Gwanak‐ro Gwanak‐gu Seoul 08826 South Korea
Abstract

We evaluate the ability of machine learning to predict whether a hypothetical crystal structure can be synthesized and explain those predictions to scientists. Fine‐tuned large language models (LLMs) trained on a human‐readable text description of the target crystal structure perform comparably to previous bespoke convolutional graph neural network methods, but better prediction quality can be achieved by training a positive‐unlabeled learning model on a text‐embedding representation of the structure. An LLM‐based workflow can then be used to generate human‐readable explanations for the types of factors governing synthesizability, extract the underlying physical rules, and assess the veracity of those rules. These explanations can guide chemists in modifying or optimizing non‐synthesizable hypothetical structures to make them more feasible for materials design.
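The positive-unlabeled (PU) learning step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a bagging-style PU scheme in which random subsets of the unlabeled structures are treated as provisional negatives, a nearest-centroid rule as the base classifier, and toy 2-D vectors standing in for real text embeddings. The function names (`centroid`, `sq_dist`, `bagging_pu_scores`) and all data are hypothetical.

```python
import random


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bagging_pu_scores(positives, unlabeled, n_rounds=50, seed=0):
    """Score each unlabeled vector in [0, 1] by out-of-bag voting.

    Each round draws a random bag of unlabeled vectors, treats it as a
    provisional negative class, and classifies the remaining (out-of-bag)
    unlabeled vectors by whichever class centroid is closer.
    """
    rng = random.Random(seed)
    # Bag size: match the positive set, but always leave one vector out-of-bag.
    k = min(len(positives), len(unlabeled) - 1)
    pos_c = centroid(positives)
    votes = [0] * len(unlabeled)
    counts = [0] * len(unlabeled)
    for _ in range(n_rounds):
        bag = set(rng.sample(range(len(unlabeled)), k))
        neg_c = centroid([unlabeled[i] for i in bag])
        for i, vec in enumerate(unlabeled):
            if i in bag:
                continue  # only score vectors not used as provisional negatives
            counts[i] += 1
            if sq_dist(vec, pos_c) < sq_dist(vec, neg_c):
                votes[i] += 1
    # Vectors never left out-of-bag get a neutral score of 0.5.
    return [votes[i] / counts[i] if counts[i] else 0.5
            for i in range(len(unlabeled))]


# Toy stand-ins for embeddings: known-synthesized structures (positives)
# and hypothetical structures of unknown status (unlabeled).
positives = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [1.1, 1.2]]
unlabeled = [[1.0, 1.1], [1.1, 0.9],
             [-1.0, -1.0], [-1.2, -0.9], [-0.9, -1.1],
             [-1.1, -1.0], [-1.0, -0.8], [-0.8, -1.2]]
scores = bagging_pu_scores(positives, unlabeled)
```

In this toy setting, unlabeled vectors that lie near the known positives should receive high scores and the rest low scores. A real workflow would replace the toy vectors with embeddings of the structure descriptions and would likely use a stronger base classifier.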

Sponsoring Organization:
USDOE
OSTI ID:
2536774
Journal Information:
Angewandte Chemie, Vol. 137, Issue 19; ISSN 0044-8249
Publisher:
Wiley Blackwell (John Wiley & Sons)
Country of Publication:
Germany
Language:
English
