skip to main content
DOE PAGES title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: PanFP: Pangenome-based functional profiles for microbial communities

Abstract

For decades there has been increasing interest in understanding the relationships between microbial communities and ecosystem functions. Current DNA sequencing technologies allows for the exploration of microbial communities in two principle ways: targeted rRNA gene surveys and shotgun metagenomics. For large study designs, it is often still prohibitively expensive to sequence metagenomes at both the breadth and depth necessary to statistically capture the true functional diversity of a community. Although rRNA gene surveys provide no direct evidence of function, they do provide a reasonable estimation of microbial diversity, while being a very cost effective way to screen samples of interest for later shotgun metagenomic analyses. However, there is a great deal of 16S rRNA gene survey data currently available from diverse environments, and thus a need for tools to infer functional composition of environmental samples based on 16S rRNA gene survey data. As a result, we present a computational method called pangenome based functional profiles (PanFP), which infers functional profiles of microbial communities from 16S rRNA gene survey data for Bacteria and Archaea. PanFP is based on pangenome reconstruction of a 16S rRNA gene operational taxonomic unit (OTU) from known genes and genomes pooled from the OTU s taxonomic lineage.more » From this lineage, we derive an OTU functional profile by weighting a pangenome s functional profile with the OTUs abundance observed in a given sample. We validated our method by comparing PanFP to the functional profiles obtained from the direct shotgun metagenomic measurement of 65 diverse communities via Spearman correlation coefficients. These correlations improved with increasing sequencing depth, within the range of 0.8 0.9 for the most deeply sequenced Human Microbiome Project mock community samples. PanFP is very similar in performance to another recently released tool, PICRUSt, for almost all of survey data analysed here. But, our method is unique in that any OTU building method can be used, as opposed to being limited to closed reference OTU picking strategies against specific reference sequence databases. In conclusion, we developed an automated computational method, which derives an inferred functional profile based on the 16S rRNA gene surveys of microbial communities. The inferred functional profile provides a cost effective way to study complex ecosystems through predicted comparative functional metagenomes and metadata analysis. All PanFP source code and additional documentation are freely available online at GitHub.« less

Authors:
 [1];  [2];  [1];  [2];  [3]
  1. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Univ. of Tennessee, Knoxville, TN (United States)
  2. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
  3. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Colorado State Univ., Fort Collins, CO (United States)
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1256793
Grant/Contract Number:  
AC05-00OR22725
Resource Type:
Accepted Manuscript
Journal Name:
BMC Research Notes
Additional Journal Information:
Journal Volume: 8; Journal Issue: 1; Journal ID: ISSN 1756-0500
Publisher:
BioMed Central
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; microbial communities; metagenome; 16S rRNA survey; pangenome

Citation Formats

Jun, Se -Ran, Hauser, Loren John, Schadt, Christopher Warren, Gorin, Andrey A., and Robeson, Michael S. PanFP: Pangenome-based functional profiles for microbial communities. United States: N. p., 2015. Web. doi:10.1186/s13104-015-1462-8.
Jun, Se -Ran, Hauser, Loren John, Schadt, Christopher Warren, Gorin, Andrey A., & Robeson, Michael S. PanFP: Pangenome-based functional profiles for microbial communities. United States. doi:10.1186/s13104-015-1462-8.
Jun, Se -Ran, Hauser, Loren John, Schadt, Christopher Warren, Gorin, Andrey A., and Robeson, Michael S. Sat . "PanFP: Pangenome-based functional profiles for microbial communities". United States. doi:10.1186/s13104-015-1462-8. https://www.osti.gov/servlets/purl/1256793.
@article{osti_1256793,
title = {PanFP: Pangenome-based functional profiles for microbial communities},
author = {Jun, Se -Ran and Hauser, Loren John and Schadt, Christopher Warren and Gorin, Andrey A. and Robeson, Michael S.},
abstractNote = {For decades there has been increasing interest in understanding the relationships between microbial communities and ecosystem functions. Current DNA sequencing technologies allows for the exploration of microbial communities in two principle ways: targeted rRNA gene surveys and shotgun metagenomics. For large study designs, it is often still prohibitively expensive to sequence metagenomes at both the breadth and depth necessary to statistically capture the true functional diversity of a community. Although rRNA gene surveys provide no direct evidence of function, they do provide a reasonable estimation of microbial diversity, while being a very cost effective way to screen samples of interest for later shotgun metagenomic analyses. However, there is a great deal of 16S rRNA gene survey data currently available from diverse environments, and thus a need for tools to infer functional composition of environmental samples based on 16S rRNA gene survey data. As a result, we present a computational method called pangenome based functional profiles (PanFP), which infers functional profiles of microbial communities from 16S rRNA gene survey data for Bacteria and Archaea. PanFP is based on pangenome reconstruction of a 16S rRNA gene operational taxonomic unit (OTU) from known genes and genomes pooled from the OTU s taxonomic lineage. From this lineage, we derive an OTU functional profile by weighting a pangenome s functional profile with the OTUs abundance observed in a given sample. We validated our method by comparing PanFP to the functional profiles obtained from the direct shotgun metagenomic measurement of 65 diverse communities via Spearman correlation coefficients. These correlations improved with increasing sequencing depth, within the range of 0.8 0.9 for the most deeply sequenced Human Microbiome Project mock community samples. PanFP is very similar in performance to another recently released tool, PICRUSt, for almost all of survey data analysed here. But, our method is unique in that any OTU building method can be used, as opposed to being limited to closed reference OTU picking strategies against specific reference sequence databases. In conclusion, we developed an automated computational method, which derives an inferred functional profile based on the 16S rRNA gene surveys of microbial communities. The inferred functional profile provides a cost effective way to study complex ecosystems through predicted comparative functional metagenomes and metadata analysis. All PanFP source code and additional documentation are freely available online at GitHub.},
doi = {10.1186/s13104-015-1462-8},
journal = {BMC Research Notes},
number = 1,
volume = 8,
place = {United States},
year = {2015},
month = {9}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Save / Share:

Works referenced in this record:

Search and clustering orders of magnitude faster than BLAST
journal, August 2010