OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Nonpareil 3: Fast Estimation of Metagenomic Coverage and Sequence Diversity

Abstract

Estimations of microbial community diversity based on metagenomic data sets are affected, often to an unknown degree, by biases derived from insufficient coverage and reference database-dependent estimations of diversity. For instance, the completeness of reference databases cannot be generally estimated since it depends on the extant diversity sampled to date, which, with the exception of a few habitats such as the human gut, remains severely undersampled. Further, estimation of the degree of coverage of a microbial community by a metagenomic data set is prohibitively time-consuming for large data sets, and coverage values may not be directly comparable between data sets obtained with different sequencing technologies. Here, we extend Nonpareil, a database-independent tool for the estimation of coverage in metagenomic data sets, to a high-performance computing implementation that scales up to hundreds of cores and includes, in addition, a k-mer-based estimation as sensitive as the original alignment-based version but about three hundred times as fast. Further, we propose a metric of sequence diversity (Nd) derived directly from Nonpareil curves that correlates well with alpha diversity assessed by traditional metrics. We use this metric in different experiments demonstrating the correlation with the Shannon index estimated on 16S rRNA gene profiles and show that Nd additionally reveals seasonal patterns in marine samples that are not captured by the Shannon index and more precise rankings of the magnitude of diversity of microbial communities in different habitats. Therefore, the new version of Nonpareil, called Nonpareil 3, advances the toolbox for metagenomic analyses of microbiomes.

Authors:
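Rodriguez-R, Luis M. [1]; Gunturu, Santosh [2]; Tiedje, James M. [3]; Cole, James R. [4]; Konstantinidis, Konstantinos T. [5]; Fodor, Anthony [6]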
  1. Georgia Inst. of Technology, Atlanta, GA (United States). School of Civil and Environmental Engineering
  2. Michigan State Univ., East Lansing, MI (United States). Center for Microbial Ecology
  3. Michigan State Univ., East Lansing, MI (United States). Center for Microbial Ecology, Dept. of Microbiology and Molecular Genetics, and Dept. of Plant, Soil and Microbial Sciences
  4. Michigan State Univ., East Lansing, MI (United States). Center for Microbial Ecology and Dept. of Plant, Soil and Microbial Sciences
  5. Georgia Inst. of Technology, Atlanta, GA (United States). School of Civil and Environmental Engineering and School of Biological Sciences
  6. Univ. of North Carolina, Charlotte, NC (United States)
Publication Date:
April 2018
Research Org.:
Univ. of Wisconsin, Madison, WI (United States); Univ. of Tennessee, Knoxville, TN (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1511043
Grant/Contract Number:  
FC02-07ER64494; SC0006662
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
mSystems
Additional Journal Information:
Journal Volume: 3; Journal Issue: 3; Journal ID: ISSN 2379-5077
Publisher:
American Society for Microbiology
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; bioinformatics; coverage; metagenomics; microbial ecology

Citation Formats

Rodriguez-R, Luis M., Gunturu, Santosh, Tiedje, James M., Cole, James R., Konstantinidis, Konstantinos T., and Fodor, Anthony. Nonpareil 3: Fast Estimation of Metagenomic Coverage and Sequence Diversity. United States: N. p., 2018. Web. doi:10.1128/msystems.00039-18.
Rodriguez-R, Luis M., Gunturu, Santosh, Tiedje, James M., Cole, James R., Konstantinidis, Konstantinos T., & Fodor, Anthony. Nonpareil 3: Fast Estimation of Metagenomic Coverage and Sequence Diversity. United States. doi:10.1128/msystems.00039-18.
Rodriguez-R, Luis M., Gunturu, Santosh, Tiedje, James M., Cole, James R., Konstantinidis, Konstantinos T., and Fodor, Anthony. Tue . "Nonpareil 3: Fast Estimation of Metagenomic Coverage and Sequence Diversity". United States. doi:10.1128/msystems.00039-18. https://www.osti.gov/servlets/purl/1511043.
@article{osti_1511043,
title = {Nonpareil 3: Fast Estimation of Metagenomic Coverage and Sequence Diversity},
author = {Rodriguez-R, Luis M. and Gunturu, Santosh and Tiedje, James M. and Cole, James R. and Konstantinidis, Konstantinos T. and Fodor, Anthony},
abstractNote = {Estimations of microbial community diversity based on metagenomic data sets are affected, often to an unknown degree, by biases derived from insufficient coverage and reference database-dependent estimations of diversity. For instance, the completeness of reference databases cannot be generally estimated since it depends on the extant diversity sampled to date, which, with the exception of a few habitats such as the human gut, remains severely undersampled. Further, estimation of the degree of coverage of a microbial community by a metagenomic data set is prohibitively time-consuming for large data sets, and coverage values may not be directly comparable between data sets obtained with different sequencing technologies. Here, we extend Nonpareil, a database-independent tool for the estimation of coverage in metagenomic data sets, to a high-performance computing implementation that scales up to hundreds of cores and includes, in addition, a k-mer-based estimation as sensitive as the original alignment-based version but about three hundred times as fast. Further, we propose a metric of sequence diversity (Nd) derived directly from Nonpareil curves that correlates well with alpha diversity assessed by traditional metrics. We use this metric in different experiments demonstrating the correlation with the Shannon index estimated on 16S rRNA gene profiles and show that Nd additionally reveals seasonal patterns in marine samples that are not captured by the Shannon index and more precise rankings of the magnitude of diversity of microbial communities in different habitats. Therefore, the new version of Nonpareil, called Nonpareil 3, advances the toolbox for metagenomic analyses of microbiomes.},
doi = {10.1128/msystems.00039-18},
journal = {mSystems},
issn = {2379-5077},
number = 3,
volume = 3,
place = {United States},
year = {2018},
month = {4}
}
