OSTI.GOV · U.S. Department of Energy
Office of Scientific and Technical Information

Title: VirION2: a short- and long-read sequencing and informatics workflow to study the genomic diversity of viruses in nature

Journal Article · PeerJ
DOI: https://doi.org/10.7717/peerj.11088 · OSTI ID: 1813699
 [1];  [2];  [3];  [3];  [4];  [3];  [5];  [6];  [2]
  1. The Ohio State Univ., Columbus, OH (United States). Dept. of Microbiology; The Ohio State Univ., Columbus, OH (United States). Center of Microbiome Science
  2. Univ. of Exeter, Devon (United Kingdom). School of Biosciences
  3. The Ohio State Univ., Columbus, OH (United States). Dept. of Microbiology
  4. Univ. of Exeter, Devon (United Kingdom). School of Biosciences; Plymouth Marine Laboratory, Plymouth, Devon (United Kingdom)
  5. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Physical and Life Sciences Directorate; Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Accelerator & Fusion Research Division
  6. The Ohio State Univ., Columbus, OH (United States). Dept. of Microbiology; The Ohio State Univ., Columbus, OH (United States). Center of Microbiome Science; The Ohio State Univ., Columbus, OH (United States). Dept. of Civil, Environmental and Geodetic Engineering

Microbes play fundamental roles in shaping natural ecosystem properties and functions, but do so under constraints imposed by their viral predators. However, studying viruses in nature can be challenging due to low biomass and the lack of universal gene markers. Although metagenomic short-read sequencing has greatly improved our virus ecology toolkit and revealed many critical ecosystem roles for viruses, it misses microdiverse populations and fine-scale genomic traits. Some of these microdiverse populations are abundant, and the missed regions may be of interest for identifying selection pressures that underpin evolutionary constraints associated with hosts and environments. Though long-read sequencing promises complete virus genomes on single reads, it currently suffers from high DNA requirements and sequencing errors that limit accurate gene prediction. Here we introduce VirION2, an integrated short- and long-read metagenomic wet-lab and informatics pipeline that updates our previous method (VirION) to further enhance the utility of long-read viral metagenomics. Using a viral mock community, we first optimized laboratory protocols (polymerase choice, DNA shearing size, PCR cycling) to enable 76% longer reads (now a median length of 6,965 bp) from 100-fold less input DNA (now 1 nanogram). Using a virome from a natural seawater sample, we compared viromes generated with VirION2 against other library preparation options (unamplified, original VirION, and short-read), and optimized downstream informatics for improved long-read error correction and assembly. VirION2 assemblies combined with short-read-based data (‘enhanced’ viromes) provided significant improvements over VirION libraries in the recovery of longer and more complete viral genomes, and our optimized error-correction strategy using long- and short-read data achieved 99.97% accuracy.
In the seawater virome, VirION2 assemblies captured 5,161 viral populations (including all of the virus populations observed in the other assemblies), 30% of which were uniquely assembled through inclusion of long reads, and 22% of the top 10% most abundant virus populations were derived from assembly of long reads. Viral populations unique to VirION2 assemblies had significantly higher mean microdiversity, which may explain why short-read virome approaches failed to capture them. These findings suggest the VirION2 sample prep and workflow can help researchers better investigate the virosphere, even from challenging low-biomass samples. Our new protocols are available to the research community on protocols.io as a ‘living document’ to facilitate dissemination of updates to keep pace with the rapid evolution of long-read sequencing technology.
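
The headline figures in the abstract can be put on a common footing with a little arithmetic. The sketch below is a back-of-envelope illustration only, not part of the VirION2 workflow: the helper names are hypothetical, it assumes "76% longer" is measured against the earlier median read length, and the percentages it starts from are the rounded values quoted above, so the derived numbers are approximate.

```python
# Back-of-envelope arithmetic on figures quoted in the abstract.
# All function names are hypothetical helpers for illustration.

def implied_baseline(new_value: float, percent_increase: float) -> float:
    """Value before an increase of `percent_increase` percent.

    Assumes the increase is relative to the earlier value.
    """
    return new_value / (1 + percent_increase / 100)


def share_count(total: int, percent: float) -> int:
    """Approximate number of items that make up `percent` of `total`."""
    return round(total * percent / 100)


def errors_per_10kb(accuracy_percent: float) -> float:
    """Expected mismatches per 10,000 bases at a given percent accuracy."""
    return round((100 - accuracy_percent) / 100 * 10_000, 2)


if __name__ == "__main__":
    # Median read length implied for the earlier protocol (bp)
    print(round(implied_baseline(6965, 76)))
    # Input DNA implied for the earlier protocol (ng): 100-fold more than 1 ng
    print(1 * 100)
    # Populations recovered only when long reads were included
    print(share_count(5161, 30))
    # Residual errors per 10 kb at 99.97% accuracy
    print(errors_per_10kb(99.97))
```

Under these assumptions, the earlier median read length works out to roughly 4 kb, about 1,500 of the 5,161 populations depended on long reads, and the corrected sequences carry about three errors per 10 kb.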

Research Organization:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA); USDOE Office of Science (SC), Biological and Environmental Research (BER)
Grant/Contract Number:
AC52-07NA27344; SC0020173; SCW1632
OSTI ID:
1813699
Report Number(s):
LLNL-JRNL-820026; 1031250
Journal Information:
PeerJ, Vol. 9, Issue N/A; ISSN 2167-8359
Publisher:
PeerJ Inc.
Country of Publication:
United States
Language:
English
