U.S. Department of Energy
Office of Scientific and Technical Information

A Novel Sparse Compositional Technique Reveals Microbial Perturbations

Journal Article · mSystems
ABSTRACT

The central aims of many host or environmental microbiome studies are to elucidate factors associated with microbial community composition and to relate microbial features to outcomes. However, these aims are often complicated by the high dimensionality, non-normality, sparsity, and compositional nature of microbiome data sets. A key tool in microbiome analysis is beta diversity, defined by the distances between microbial samples. Many distance metrics have been proposed, each with different discriminatory power on data with differing characteristics. Here, we propose robust Aitchison PCA, a compositional beta diversity metric rooted in a centered log-ratio transformation and matrix completion. We demonstrate the benefits of compositional transformations upstream of beta diversity calculations through simulations. Additionally, we demonstrate improved effect size, classification accuracy, and robustness to sequencing depth over current methods on several subsets of real microbiome data sets with reduced sample numbers. Finally, we highlight the ability of this new beta diversity metric to retain the feature loadings linked to sample ordinations, revealing salient intercommunity niche feature importance.
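The centered log-ratio idea named in the abstract can be illustrated with a minimal sketch. This is not the DEICODE implementation: the function names `clr`, `robust_clr`, and `aitchison_distance` are hypothetical, the toy count table is made up for illustration, and the matrix completion step is omitted entirely. The sketch assumes that a "robust" variant takes logarithms only over the nonzero entries of each sample and treats zeros as missing values.

```python
import numpy as np


def clr(counts):
    """Centered log-ratio transform of strictly positive rows (rows = samples)."""
    log_x = np.log(np.asarray(counts, dtype=float))
    # Subtracting the row mean of the logs divides each entry by the
    # geometric mean of its sample.
    return log_x - log_x.mean(axis=1, keepdims=True)


def robust_clr(counts):
    """CLR over the nonzero entries of each row; zeros become NaN (missing)."""
    x = np.asarray(counts, dtype=float)
    log_x = np.full(x.shape, np.nan)
    observed = x > 0
    log_x[observed] = np.log(x[observed])
    # Center each row using only its observed (nonzero) entries.
    return log_x - np.nanmean(log_x, axis=1, keepdims=True)


def aitchison_distance(a, b):
    """Euclidean distance between the CLR transforms of two positive samples."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.linalg.norm(clr(a[None, :])[0] - clr(b[None, :])[0]))


# Hypothetical toy table: 3 samples x 3 features.
table = np.array([
    [10.0, 20.0, 70.0],
    [100.0, 200.0, 700.0],  # same composition as row 0 at 10x the depth
    [5.0, 0.0, 45.0],       # contains a zero
])

depth_distance = aitchison_distance(table[0], table[1])
sparse_row = robust_clr(table)[2]
```

Because the transform works on log-ratios, rows 0 and 1 differ only in total count and sit at (numerically) zero distance from each other, which is one way to see why a compositional transform helps with differing sequencing depth. In the third row the zero stays missing rather than being replaced by a pseudocount; filling such entries is where a matrix completion step would come in.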

IMPORTANCE By accounting for the sparse compositional nature of microbiome data sets, robust Aitchison PCA can yield high discriminatory power and salient feature ranking between microbial niches. The software to perform this analysis is available under an open-source license and can be obtained at https://github.com/biocore/DEICODE; additionally, a QIIME 2 plugin is provided to perform this analysis at https://library.qiime2.org/plugins/q2-deicode.

Research Organization:
Johns Hopkins Univ., Baltimore, MD (United States); Univ. of California, San Diego, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC)
Grant/Contract Number:
SC0012586; SC0012658
OSTI ID:
1494211
Alternate ID(s):
OSTI ID: 1611890
Journal Information:
mSystems, Vol. 4, Issue 1; ISSN 2379-5077
Publisher:
American Society for Microbiology
Country of Publication:
United States
Language:
English
