Functional phylogenomics analysis of bacteria and archaea using consistent genome annotation with UniFam
Abstract
Phylogenetic studies have provided detailed knowledge of the evolutionary mechanisms of genes and species in Bacteria and Archaea. However, the evolution of cellular functions, represented by metabolic pathways and biological processes, has not been systematically characterized. Many clades in the prokaryotic tree of life are now covered by sequenced genomes in GenBank, which enables a large-scale functional phylogenomics study of many computationally inferred cellular functions across all sequenced prokaryotes. A total of 14,727 GenBank prokaryotic genomes were re-annotated using a new protein family database, UniFam, to obtain consistent functional annotations for accurate comparison. The functional profile of a genome was represented by the biological process Gene Ontology (GO) terms in its annotation. GO term enrichment analysis differentiated the functional profiles of selected archaeal taxa. A total of 706 prokaryotic metabolic pathways were inferred from these genomes using Pathway Tools and MetaCyc. The consistency between the distribution of metabolic pathways in the genomes and the phylogenetic tree of the genomes was measured using parsimony scores and retention indices. The ancestral functional profiles at the internal nodes of the phylogenetic tree were reconstructed to track the gains and losses of metabolic pathways in evolutionary history. In conclusion, our functional phylogenomics analysis shows divergent functional profiles of taxa and clades. This function-phylogeny correlation stems from a set of clade-specific cellular functions with low parsimony scores. In contrast, many cellular functions are sparsely dispersed across many clades and have high parsimony scores. These different types of cellular functions have distinct evolutionary patterns reconstructed from the prokaryotic tree.
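The abstract measures how well a pathway's presence/absence pattern fits the tree using parsimony scores and retention indices. This record does not say how those values were computed, so the following is only a minimal sketch under stated assumptions: each pathway is a binary character (1 = present, 0 = absent), the tree is rooted and strictly binary, and the score is obtained with Fitch's algorithm. The nested-tuple tree encoding, the toy genomes A to H, and the function names `fitch_score` and `retention_index` are hypothetical and are not taken from the paper.

```python
# Sketch only: Fitch parsimony and Farris retention index for one binary
# character (pathway presence/absence). Tree encoding and names are assumptions.

def fitch_score(tree, states):
    """Return (candidate_state_set, minimum_changes) for a rooted binary tree.

    tree   -- a leaf name (str) or a 2-tuple of subtrees
    states -- dict mapping every leaf name to 0 (absent) or 1 (present)
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_score(tree[0], states)
    right_set, right_cost = fitch_score(tree[1], states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed here.
        return common, left_cost + right_cost
    # Children disagree: one gain or loss must be placed below this node.
    return left_set | right_set, left_cost + right_cost + 1


def retention_index(tree, states):
    """Retention index RI = (g - s) / (g - m) for a binary character.

    s -- observed parsimony score on this tree
    m -- minimum possible score (1 if both states occur, else 0)
    g -- maximum possible score, min(#present, #absent)
    Returns None when g == m, i.e. the character is uninformative.
    """
    _, s = fitch_score(tree, states)
    present = sum(states.values())
    absent = len(states) - present
    m = 1 if present and absent else 0
    g = min(present, absent)
    if g == m:
        return None
    return (g - s) / (g - m)


# Toy tree with two four-genome clades.
TREE = ((("A", "B"), ("C", "D")), (("E", "F"), ("G", "H")))

# A clade-specific pathway: present in every genome of the first clade only.
clade_specific = {"A": 1, "B": 1, "C": 1, "D": 1, "E": 0, "F": 0, "G": 0, "H": 0}

# A sparsely dispersed pathway: present in one genome of every sister pair.
dispersed = {"A": 1, "B": 0, "C": 1, "D": 0, "E": 1, "F": 0, "G": 1, "H": 0}

print(fitch_score(TREE, clade_specific)[1], retention_index(TREE, clade_specific))  # 1 1.0
print(fitch_score(TREE, dispersed)[1], retention_index(TREE, dispersed))            # 4 0.0
```

In this toy case the clade-specific pathway needs a single change on the tree (low parsimony score, retention index 1.0), while the dispersed pathway needs four changes (high parsimony score, retention index 0.0), which mirrors the contrast drawn in the abstract.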
- Authors:
- Chai, Juanjuan; Kora, Guruprasad; Ahn, Tae-Hyuk; Hyatt, Doug; Pan, Chongle
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Computer Science and Mathematics Division
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). BioSciences Division; Univ. of Tennessee, Knoxville, TN (United States). Joint Inst. for Biological Sciences
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Computer Science and Mathematics Division; Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). BioSciences Division
- Publication Date:
- 2014-10-09
- Research Org.:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF); Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). BioEnergy Science Center (BESC)
- Sponsoring Org.:
- USDOE Office of Science (SC); USDOE Laboratory Directed Research and Development (LDRD) Program
- OSTI Identifier:
- 1286772
- Grant/Contract Number:
- AC05-00OR22725
- Resource Type:
- Accepted Manuscript
- Journal Name:
- BMC Evolutionary Biology (Online)
- Additional Journal Information:
- Journal Name: BMC Evolutionary Biology (Online); Journal Volume: 14; Journal Issue: 1; Journal ID: ISSN 1471-2148
- Publisher:
- BioMed Central
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 59 BASIC BIOLOGICAL SCIENCES; Prokaryotes; Cellular function; Pathway; Genomes; Evolution; Phylogenomics
Citation Formats
Chai, Juanjuan, Kora, Guruprasad, Ahn, Tae-Hyuk, Hyatt, Doug, and Pan, Chongle. Functional phylogenomics analysis of bacteria and archaea using consistent genome annotation with UniFam. United States: N. p., 2014. Web. doi:10.1186/s12862-014-0207-y.
Chai, Juanjuan, Kora, Guruprasad, Ahn, Tae-Hyuk, Hyatt, Doug, & Pan, Chongle. (2014). Functional phylogenomics analysis of bacteria and archaea using consistent genome annotation with UniFam. United States. https://doi.org/10.1186/s12862-014-0207-y
Chai, Juanjuan, Kora, Guruprasad, Ahn, Tae-Hyuk, Hyatt, Doug, and Pan, Chongle. 2014. "Functional phylogenomics analysis of bacteria and archaea using consistent genome annotation with UniFam". United States. https://doi.org/10.1186/s12862-014-0207-y. https://www.osti.gov/servlets/purl/1286772.
@article{osti_1286772,
title = {Functional phylogenomics analysis of bacteria and archaea using consistent genome annotation with UniFam},
author = {Chai, Juanjuan and Kora, Guruprasad and Ahn, Tae-Hyuk and Hyatt, Doug and Pan, Chongle},
abstractNote = {Phylogenetic studies have provided detailed knowledge of the evolutionary mechanisms of genes and species in Bacteria and Archaea. However, the evolution of cellular functions, represented by metabolic pathways and biological processes, has not been systematically characterized. Many clades in the prokaryotic tree of life are now covered by sequenced genomes in GenBank, which enables a large-scale functional phylogenomics study of many computationally inferred cellular functions across all sequenced prokaryotes. A total of 14,727 GenBank prokaryotic genomes were re-annotated using a new protein family database, UniFam, to obtain consistent functional annotations for accurate comparison. The functional profile of a genome was represented by the biological process Gene Ontology (GO) terms in its annotation. GO term enrichment analysis differentiated the functional profiles of selected archaeal taxa. A total of 706 prokaryotic metabolic pathways were inferred from these genomes using Pathway Tools and MetaCyc. The consistency between the distribution of metabolic pathways in the genomes and the phylogenetic tree of the genomes was measured using parsimony scores and retention indices. The ancestral functional profiles at the internal nodes of the phylogenetic tree were reconstructed to track the gains and losses of metabolic pathways in evolutionary history. In conclusion, our functional phylogenomics analysis shows divergent functional profiles of taxa and clades. This function-phylogeny correlation stems from a set of clade-specific cellular functions with low parsimony scores. In contrast, many cellular functions are sparsely dispersed across many clades and have high parsimony scores. These different types of cellular functions have distinct evolutionary patterns reconstructed from the prokaryotic tree.},
doi = {10.1186/s12862-014-0207-y},
journal = {BMC Evolutionary Biology (Online)},
number = 1,
volume = 14,
place = {United States},
year = {2014},
month = {10}
}