DOE PAGES · U.S. Department of Energy
Office of Scientific and Technical Information

Title: A fast comparative genome browser for diverse bacteria and archaea

Journal Article · PLoS ONE

Genome sequencing has revealed an incredible diversity of bacteria and archaea, but there are no fast and convenient tools for browsing across these genomes. It is cumbersome to view the prevalence of homologs for a protein of interest, or the gene neighborhoods of those homologs, across the diversity of the prokaryotes. We developed a web-based tool, fast.genomics, that uses two strategies to support fast browsing across the diversity of prokaryotes. First, the database of genomes is split up. The main database contains one representative from each of the 6,377 genera that have a high-quality genome, and additional databases for each taxonomic order contain up to 10 representatives of each species. Second, homologs of proteins of interest are identified quickly by using accelerated searches, usually in a few seconds. Once homologs are identified, fast.genomics can quickly show their prevalence across taxa, view their neighboring genes, or compare the prevalence of two different proteins. Fast.genomics is available at https://fast.genomics.lbl.gov.
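
The database split described in the abstract can be illustrated with a minimal Python sketch. It is not the fast.genomics implementation: the `Genome` record, its field names, the numeric `quality` score, and the rule of preferring the highest-scoring genome are all assumptions made for illustration. The abstract states only that the main database holds one representative per genus and that each per-order database holds up to 10 representatives per species.

```python
from collections import defaultdict
from typing import NamedTuple


class Genome(NamedTuple):
    # Hypothetical record; field names are illustrative, not from fast.genomics.
    genome_id: str
    order: str
    genus: str
    species: str
    quality: float  # assumed score, higher is better


def build_main_db(genomes):
    """Keep one genome per genus (here: the highest-quality one, an assumption)."""
    best = {}
    for g in genomes:
        if g.genus not in best or g.quality > best[g.genus].quality:
            best[g.genus] = g
    return sorted(best.values(), key=lambda g: g.genus)


def build_order_dbs(genomes, max_per_species=10):
    """Build one genome list per taxonomic order, capped per species."""
    by_species = defaultdict(list)
    for g in genomes:
        by_species[(g.order, g.species)].append(g)

    order_dbs = defaultdict(list)
    for (order, _species), members in sorted(by_species.items()):
        # Prefer higher-quality genomes when a species exceeds the cap.
        members.sort(key=lambda g: g.quality, reverse=True)
        order_dbs[order].extend(members[:max_per_species])
    return dict(order_dbs)
```

Under these assumptions, a query would first be searched against the small main database, and a per-order database would be consulted only when closer relatives are of interest, which keeps each individual search small.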

Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER)
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
2335832
Alternate ID(s):
OSTI ID: 2448507
Journal Information:
Journal Name: PLoS ONE; Journal Volume: 19; Journal Issue: 4; ISSN 1932-6203
Publisher:
Public Library of Science (PLoS)
Country of Publication:
United States
Language:
English
