DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A robust estimator of mutual information for deep learning interpretability

Journal Article · Machine Learning: Science and Technology

Abstract: We develop the use of mutual information (MI), a well-established metric in information theory, to interpret the inner workings of deep learning (DL) models. To accurately estimate MI from a finite number of samples, we present GMM-MI (pronounced ‘Jimmie’), an algorithm based on Gaussian mixture models that can be applied to both discrete and continuous settings. GMM-MI is computationally efficient, robust to the choice of hyperparameters and provides the uncertainty on the MI estimate due to the finite sample size. We extensively validate GMM-MI on toy data for which the ground truth MI is known, comparing its performance against established MI estimators. We then demonstrate the use of our MI estimator in the context of representation learning, working with synthetic data and physical datasets describing highly non-linear processes. We train DL models to encode high-dimensional data within a meaningful compressed (latent) representation, and use GMM-MI to quantify both the level of disentanglement between the latent variables, and their association with relevant physical quantities, thus unlocking the interpretability of the latent representation. We make GMM-MI publicly available in this GitHub repository.
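As a rough illustration of the idea described in the abstract (fit a Gaussian density to the samples, estimate MI from it, and attach an uncertainty due to the finite sample size), the following standard-library Python sketch treats the simplest possible case: a single bivariate Gaussian, i.e. a one-component "mixture". It is not the GMM-MI package or its API; all function names (`sample_pairs`, `fit_gaussian`, `estimate_mi`, `bootstrap_mi`) are hypothetical, and the choice of bootstrap resampling for the error bar is an assumption made for this example. The toy data are a correlated Gaussian pair, for which the ground truth MI is known analytically as -0.5 ln(1 - rho^2) nats.

```python
import math
import random
import statistics


def sample_pairs(n, rho, rng):
    """Draw n (x, y) pairs from a standard bivariate Gaussian with correlation rho."""
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        pairs.append((x, rho * x + math.sqrt(1.0 - rho**2) * noise))
    return pairs


def fit_gaussian(pairs):
    """Maximum-likelihood fit of ONE bivariate Gaussian (assumption: one component)."""
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.pstdev(xs, mx), statistics.pstdev(ys, my)
    cov = statistics.fmean([(x - mx) * (y - my) for x, y in pairs])
    return mx, my, sx, sy, cov / (sx * sy)


def log_normal_1d(v, mu, sigma):
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2.0 * math.pi * sigma**2) - 0.5 * ((v - mu) / sigma) ** 2


def log_normal_2d(x, y, mx, my, sx, sy, rho):
    """Log density of a bivariate Gaussian with correlation rho."""
    zx, zy = (x - mx) / sx, (y - my) / sy
    one_minus = 1.0 - rho**2
    norm = -math.log(2.0 * math.pi * sx * sy * math.sqrt(one_minus))
    return norm - (zx * zx - 2.0 * rho * zx * zy + zy * zy) / (2.0 * one_minus)


def estimate_mi(pairs):
    """Average of log p(x, y) - log p(x) - log p(y) over the samples, in nats."""
    mx, my, sx, sy, rho = fit_gaussian(pairs)
    terms = [
        log_normal_2d(x, y, mx, my, sx, sy, rho)
        - log_normal_1d(x, mx, sx)
        - log_normal_1d(y, my, sy)
        for x, y in pairs
    ]
    return statistics.fmean(terms)


def bootstrap_mi(pairs, n_boot, rng):
    """Refit on resampled data to get a mean MI and a finite-sample error bar."""
    estimates = [
        estimate_mi(rng.choices(pairs, k=len(pairs))) for _ in range(n_boot)
    ]
    return statistics.fmean(estimates), statistics.stdev(estimates)


rng = random.Random(0)
rho_true = 0.8
true_mi = -0.5 * math.log(1.0 - rho_true**2)  # analytic ground truth in nats
pairs = sample_pairs(2000, rho_true, rng)
mi_mean, mi_std = bootstrap_mi(pairs, 50, rng)
print(f"ground truth MI: {true_mi:.4f} nats")
print(f"estimated MI:    {mi_mean:.4f} +/- {mi_std:.4f} nats")
```

A single Gaussian can only capture linear dependence, so this sketch would underestimate MI for the non-linear processes the abstract mentions. Handling those requires a mixture with several components, which is what the paper's algorithm addresses.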

Research Organization:
Fermi National Accelerator Laboratory (FNAL), Batavia, IL (United States)
Sponsoring Organization:
European Research Council (ERC); Göran Gustafsson Foundation for Research in Natural Sciences and Medicine; National Science Foundation (NSF); Simons Foundation; Swiss National Science Foundation (SNSF); UCL Graduate Research Scholarship (GRS); UCL Overseas Research Scholarship (ORS); USDOE; USDOE Office of Science (SC), High Energy Physics (HEP)
Grant/Contract Number:
AC02-07CH11359
OSTI ID:
1969147
Report Number(s):
FERMILAB-PUB-22-794-SCD; arXiv:2211.00024
Journal Information:
Machine Learning: Science and Technology, Vol. 4, Issue 2; ISSN 2632-2153
Publisher:
IOP Publishing
Country of Publication:
United Kingdom
Language:
English
