Title: Benchmarking viromics: an in silico evaluation of metagenome-enabled estimates of viral community composition and diversity

Journal Article · PeerJ
DOI: https://doi.org/10.7717/peerj.3817 · OSTI ID: 1424953
 [1];  [1];  [2];  [3]
  1. The Ohio State Univ., Columbus, OH (United States). Department of Microbiology
  2. USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)
  3. The Ohio State Univ., Columbus, OH (United States). Department of Microbiology and Department of Civil, Environmental and Geodetic Engineering

Background: Viral metagenomics (viromics) is increasingly used to obtain uncultivated viral genomes, evaluate community diversity, and assess ecological hypotheses. While viromic experimental methods are relatively mature and widely accepted by the research community, robust bioinformatics standards remain to be established. Here we used in silico mock viral communities to evaluate the viromic sequence-to-ecological-inference pipeline, including (i) read pre-processing and metagenome assembly, (ii) thresholds applied to estimate viral relative abundances based on read mapping to assembled contigs, and (iii) normalization methods applied to the matrix of viral relative abundances for alpha and beta diversity estimates.

Results: Tools designed specifically for metagenomes, namely metaSPAdes, MEGAHIT, and IDBA-UD, were the most effective at assembling viromes. Read pre-processing, such as partitioning, had virtually no impact on assembly output, but may be useful when hardware is limited. Viral populations with 2–5× coverage typically assembled well, whereas lower coverage led to fragmented assemblies. Strain heterogeneity within populations hampered assembly, especially when strains were closely related (average nucleotide identity, or ANI, ≥97%) and when the most abundant strain represented <50% of the population. Viral community composition assessments based on read recruitment were generally accurate when the following thresholds for detection were applied: (i) ≥10 kb contig lengths to define populations, (ii) coverage defined from reads mapping at ≥90% identity, and (iii) ≥75% of contig length with ≥1× coverage. Finally, although data are limited to the most abundant viruses in a community, alpha and beta diversity patterns were robustly estimated (±10%) when comparing samples of similar sequencing depth, but were more divergent (up to 80%) when sequencing depth was uneven across the dataset. In the latter cases, the use of normalization methods specifically developed for metagenomes provided the best estimates.

Conclusions: These simulations provide benchmarks for selecting analysis cut-offs and establish that an optimized sample-to-ecological-inference viromics pipeline is robust for making ecological inferences from natural viral communities. Continued development to better access RNA, rare, and/or diverse viral populations, together with improved reference viral genome availability, will alleviate many of viromics' remaining limitations.
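
The three detection thresholds listed in the Results can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Alignment` record, the function names, and the choice to report mean per-base depth as the abundance value are all assumptions made for illustration. Only the three cut-offs (≥10 kb, ≥90% identity, ≥75% of the contig covered at ≥1×) come from the abstract.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Cut-offs taken from the abstract.
MIN_CONTIG_LEN = 10_000       # (i) contigs >= 10 kb define populations
MIN_READ_IDENTITY = 0.90      # (ii) only reads mapping at >= 90% identity count
MIN_COVERED_FRACTION = 0.75   # (iii) >= 75% of contig length at >= 1x coverage


@dataclass
class Alignment:
    """Hypothetical read-to-contig alignment record (not a real library type)."""
    start: int       # 0-based, inclusive position on the contig
    end: int         # 0-based, exclusive position on the contig
    identity: float  # fraction of identical bases, between 0.0 and 1.0


def per_base_depth(contig_len: int, alignments: Iterable[Alignment]) -> List[int]:
    """Count, for every contig position, the reads that pass the identity cut-off."""
    depth = [0] * contig_len
    for aln in alignments:
        if aln.identity < MIN_READ_IDENTITY:
            continue  # threshold (ii): ignore low-identity reads
        for pos in range(max(aln.start, 0), min(aln.end, contig_len)):
            depth[pos] += 1
    return depth


def population_coverage(contig_len: int, alignments: Iterable[Alignment]) -> float:
    """Return mean depth if the contig counts as detected, otherwise 0.0.

    Using mean depth as the abundance value is an assumption of this sketch.
    """
    if contig_len < MIN_CONTIG_LEN:
        return 0.0  # threshold (i): contig too short to define a population
    depth = per_base_depth(contig_len, alignments)
    covered = sum(1 for d in depth if d >= 1)
    if covered / contig_len < MIN_COVERED_FRACTION:
        return 0.0  # threshold (iii): too little of the contig is covered
    return sum(depth) / contig_len
```

In practice the alignment records would come from a read mapper's output, and per-sample values like these would fill the abundance matrix that is later normalized for the alpha and beta diversity estimates.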

Research Organization:
Univ. of Arizona, Tucson, AZ (United States); Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER). Biological Systems Science Division
Grant/Contract Number:
SC0010580; SC0016440; AC02-05CH11231
OSTI ID:
1424953
Alternate ID(s):
OSTI ID: 1581051
Journal Information:
PeerJ, Vol. 5; ISSN 2167-8359
Publisher:
PeerJ Inc.
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 136 works
Citation information provided by
Web of Science

Cited By (22)

Mini‐Metagenomics and Nucleotide Composition Aid the Identification and Host Association of Novel Bacteriophage Sequences journal May 2019
The Human Gut Virome Is Highly Diverse, Stable, and Individual Specific journal October 2019
Minimum Information about an Uncultivated Virus Genome (MIUViG) journal December 2018
Single-cell genomics uncover Pelagibacter as the putative host of the extremely abundant uncultured 37-F6 viral population in the ocean journal September 2018
Cobaviruses – a new globally distributed phage group infecting Rhodobacteraceae in marine ecosystems journal February 2019
Murine colitis reveals a disease-associated bacteriophage community journal July 2018
Next-generation sequencing of dsRNA is greatly improved by treatment with the inexpensive denaturing reagent DMSO journal November 2019
Riding the wave of genomics to investigate aquatic coliphage diversity and activity journal April 2019
Genome‐resolved viral and cellular metagenomes revealed potential key virus‐host interactions in a deep freshwater lake journal September 2019
Genomic and Seasonal Variations among Aquatic Phages Infecting the Baltic Sea Gammaproteobacterium Rheinheimera sp. Strain BAL341 journal July 2019
A Viral Ecogenomics Framework To Uncover the Secrets of Nature’s “Microbe Whisperers” journal May 2019
Host-hijacking and planktonic piracy: how phages command the microbial high seas journal February 2019
Choice of assembly software has a critical impact on virome characterisation journal January 2019
MARVEL, a Tool for Prediction of Bacteriophage Sequences in Metagenomic Bins journal August 2018
Deciphering the Human Virome with Single-Virus Genomics and Metagenomics journal March 2018
Viruses of Eukaryotic Algae: Diversity, Methods for Detection, and Future Directions journal September 2018
Long-read viral metagenomics captures abundant and microdiverse viral populations and their niche-defining genomic islands journal January 2019
Towards optimized viral metagenomes for double-stranded and single-stranded DNA viruses from challenging soils journal January 2019
Benchmarking protocols for the metagenomic analysis of stream biofilm viromes journal January 2019
Mouse Vendor Influence on the Bacterial and Viral Gut Composition Exceeds the Effect of Diet journal May 2019
Minimum information about an uncultivated virus genome (MIUVIG) text January 2019
KBase Narrative - Viral Analysis End-to-End dataset January 2022