DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A Graphical Model for Fusing Diverse Microbiome Data

Journal Article · IEEE Transactions on Signal Processing

This paper develops a Bayesian graphical model for fusing disparate types of count data. The motivating application is the study of bacterial communities from diverse high-dimensional features (in this case, transcripts) collected from different treatments. In such datasets, there are no explicit correspondences between the communities, and each corresponds to different factors, making data fusion challenging. We introduce a flexible multinomial-Gaussian generative model for jointly modeling such count data. This latent variable model jointly characterizes the observed data through a common multivariate Gaussian latent space that parameterizes the set of multinomial probabilities of the transcriptome counts. The covariance matrix of the latent variables induces a covariance matrix of co-dependencies between all the transcripts, effectively fusing multiple data sources. We present a computationally scalable variational Expectation-Maximization (EM) algorithm for inferring the latent variables and the parameters of the model. The inferred latent variables provide a common dimensionality reduction for visualizing the data, and the inferred parameters provide a predictive posterior distribution. In addition to simulation studies that demonstrate the variational EM procedure, we apply our model to a bacterial microbiome dataset.
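
The generative idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact parameterization, so everything below is an assumption made for illustration: a standard normal latent vector, a hypothetical linear loading matrix `W`, a softmax link from latent space to transcript probabilities, and a fixed total count per sample. The function and argument names (`sample_counts`, `latent_dim`, `total`) are likewise hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(x):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = x - x.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)


def sample_counts(n_samples, n_transcripts, latent_dim, total, seed=0):
    """Draw synthetic transcript counts from an assumed latent Gaussian model.

    Assumed (not from the paper): z ~ N(0, I), logits = z W^T,
    probabilities = softmax(logits), counts ~ Multinomial(total, probabilities).
    """
    rng = np.random.default_rng(seed)

    # Hypothetical loading matrix mapping the latent space to transcripts.
    W = rng.standard_normal((n_transcripts, latent_dim))

    # Low-dimensional Gaussian latent variables, one row per sample.
    z = rng.standard_normal((n_samples, latent_dim))

    # The latent variables parameterize the multinomial probabilities.
    probs = softmax(z @ W.T)

    # One multinomial draw per sample.
    counts = np.stack([rng.multinomial(total, p) for p in probs])

    # Covariance of the logits implied by z ~ N(0, I): a transcript-by-
    # transcript matrix of co-dependencies induced by the latent space.
    induced_cov = W @ W.T
    return z, probs, counts, induced_cov


if __name__ == "__main__":
    z, probs, counts, induced_cov = sample_counts(
        n_samples=6, n_transcripts=10, latent_dim=2, total=500
    )
    print(counts.shape, induced_cov.shape)
```

In this sketch, the rows of `z` play the role of the low-dimensional representation used for visualization, and `induced_cov` is the kind of transcript-level covariance that a latent Gaussian space induces. Fitting such a model by variational EM, as the paper does, is not shown here.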

Research Organization:
Georgia Institute of Technology, Atlanta, GA (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA)
Grant/Contract Number:
NA0003921
OSTI ID:
2283181
Journal Information:
IEEE Transactions on Signal Processing, Vol. 71; ISSN 1053-587X
Publisher:
IEEE
Country of Publication:
United States
Language:
English
