OSTI.GOV | U.S. Department of Energy
Office of Scientific and Technical Information

Title: Machine learning with persistent homology and chemical word embeddings improves prediction accuracy and interpretability in metal-organic frameworks

Journal Article · Scientific Reports
 [1];  [2];  [3];  [2];  [4]
  1. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Toyota Research Inst., Los Altos, CA (United States)
  2. Toyota Research Inst., Los Altos, CA (United States)
  3. IMDEA Materials Inst., Madrid (Spain)
  4. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

Machine learning has emerged as a powerful approach in materials discovery. Its major challenge is selecting features that create interpretable representations of materials and remain useful across multiple prediction tasks. We introduce an end-to-end machine learning model that automatically generates descriptors capturing a complex representation of a material’s structure and chemistry. This approach builds on computational topology techniques (namely, persistent homology) and word embeddings from natural language processing. It automatically encapsulates geometric and chemical information directly from the material system. We demonstrate our approach on multiple nanoporous metal–organic framework datasets by predicting methane and carbon dioxide adsorption across different conditions. Our results show considerable improvement in both accuracy and transferability across targets compared to models constructed from the commonly used, manually curated features, consistently achieving an average 25–30% decrease in root-mean-squared deviation and an average increase of 40–50% in R² scores. A key advantage of our approach is interpretability: our model identifies the pores that correlate best with adsorption at different pressures, which contributes to understanding atomic-level structure–property relationships for materials design.
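
The two accuracy metrics quoted in the abstract (root-mean-squared deviation and R²) can be illustrated with a minimal sketch. The function names, the `relative_change` helper, and the toy numbers below are assumptions made for illustration only; they are not the authors' code or data, and the printed values are not results from the paper.

```python
import numpy as np


def rmsd(y_true, y_pred):
    """Root-mean-squared deviation between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def relative_change(baseline, new):
    """Signed fractional change of `new` relative to `baseline` (hypothetical helper)."""
    return (new - baseline) / abs(baseline)


# Made-up adsorption targets and predictions from two hypothetical models.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
baseline_pred = np.array([1.5, 1.5, 3.5, 3.5])  # stand-in for a model on hand-picked features
new_pred = np.array([1.0, 2.0, 3.0, 3.5])       # stand-in for a model on learned descriptors

rmsd_change = relative_change(rmsd(y_true, baseline_pred), rmsd(y_true, new_pred))
r2_change = relative_change(r2_score(y_true, baseline_pred), r2_score(y_true, new_pred))

print(f"RMSD change: {rmsd_change:+.1%}")
print(f"R2 change:   {r2_change:+.1%}")
```

A negative RMSD change and a positive R² change both indicate improvement, which is the sense in which the abstract reports a "decrease" in one and an "increase" in the other.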

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Basic Energy Sciences (BES). Scientific User Facilities Division
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
1797749
Journal Information:
Scientific Reports, Vol. 11, Issue 1; ISSN 2045-2322
Publisher:
Nature Publishing Group
Country of Publication:
United States
Language:
English
