U.S. Department of Energy
Office of Scientific and Technical Information

COBRA improves the completeness and contiguity of viral genomes assembled from metagenomes

Journal Article · Nature Microbiology
 [1];  [2]
  1. University of California, Berkeley, CA (United States)
  2. University of California, Berkeley, CA (United States); Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). Earth and Environmental Sciences

Viruses are often studied using metagenome-assembled sequences, but genome incompleteness hampers comprehensive and accurate analyses. Contig Overlap Based Re-Assembly (COBRA) resolves assembly breakpoints based on the de Bruijn graph and joins contigs. Here we benchmarked COBRA using ocean and soil viral datasets. COBRA accurately joined the assembled sequences and achieved notably higher genome accuracy than binning tools. From 231 published freshwater metagenomes, we obtained 7,334 bacteriophage clusters, ~83% of which represent new phage species. Notably, ~70% of these were circular, compared with 34% before COBRA analyses. We expanded sampling of huge phages (≥200 kbp), the largest of which was curated to completion (717 kbp). Improved phage genomes from Rotsee Lake provided context for metatranscriptomic data and indicated the in situ activity of huge phages, whiB-encoding phages, and cysC- and cysH-encoding phages. COBRA improves the contiguity and completeness of viral genome assemblies, and thus the accuracy and reliability of analyses of gene content, diversity and evolution.
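
The abstract describes COBRA as joining contigs at de Bruijn graph breakpoints and reports the share of genomes that become circular. The sketch below illustrates only the general idea of overlap-based joining and is not COBRA's implementation. It assumes that contigs split at a breakpoint share an exact end overlap of a known, fixed length; the function names and the `overlap` parameter are hypothetical, and reverse-complement orientation, coverage, and read-pair evidence are ignored.

```python
from typing import Optional


def join_by_end_overlap(left: str, right: str, overlap: int) -> Optional[str]:
    """Join two contigs if the last `overlap` bases of `left` exactly
    match the first `overlap` bases of `right`; otherwise return None.

    `overlap` is an assumed, fixed overlap length (hypothetical parameter).
    """
    if overlap <= 0 or len(left) < overlap or len(right) < overlap:
        return None
    if left[-overlap:] != right[:overlap]:
        return None
    # Keep the shared bases once, then append the rest of the right contig.
    return left + right[overlap:]


def is_circular(contig: str, overlap: int) -> bool:
    """Flag a contig as a circular candidate when its two ends share
    the same `overlap`-length sequence."""
    return len(contig) > overlap > 0 and contig[:overlap] == contig[-overlap:]


if __name__ == "__main__":
    joined = join_by_end_overlap("ACGTTGCA", "TGCAGGAT", overlap=4)
    print(joined)
    print(is_circular("TGCAGGATTGCA", overlap=4))
```

A production tool would also need to decide between several candidate joins at the same contig end; this sketch simply accepts any exact match.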

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER); Natural Sciences and Engineering Research Council of Canada (NSERC)
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
2335317
Journal Information:
Journal Name: Nature Microbiology; Journal Volume: 9; Journal Issue: 3; ISSN 2058-5276
Publisher:
Nature Publishing Group
Country of Publication:
United States
Language:
English
