DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: VirION2: a short- and long-read sequencing and informatics workflow to study the genomic diversity of viruses in nature

Abstract

Microbes play fundamental roles in shaping natural ecosystem properties and functions, but do so under constraints imposed by their viral predators. However, studying viruses in nature can be challenging due to low biomass and the lack of universal gene markers. Though metagenomic short-read sequencing has greatly improved our virus ecology toolkit (and revealed many critical ecosystem roles for viruses), microdiverse populations and fine-scale genomic traits are missed. Some of these microdiverse populations are abundant, and the missed regions may be of interest for identifying selection pressures that underpin evolutionary constraints associated with hosts and environments. Though long-read sequencing promises complete virus genomes on single reads, it currently suffers from high DNA requirements and sequencing errors that limit accurate gene prediction. Here we introduce VirION2, an integrated short- and long-read metagenomic wet-lab and informatics pipeline that updates our previous method (VirION) to further enhance the utility of long-read viral metagenomics. Using a viral mock community, we first optimized laboratory protocols (polymerase choice, DNA shearing size, PCR cycling) to enable 76% longer reads (now median length of 6,965 bp) from 100-fold less input DNA (now 1 nanogram). Using a virome from a natural seawater sample, we compared viromes generated with VirION2 against other library preparation options (unamplified, original VirION, and short-read), and optimized downstream informatics for improved long-read error correction and assembly. VirION2 assemblies combined with short-read-based data (‘enhanced’ viromes) provided significant improvements over VirION libraries in the recovery of longer and more complete viral genomes, and our optimized error-correction strategy using long- and short-read data achieved 99.97% accuracy.
In the seawater virome, VirION2 assemblies captured 5,161 viral populations (including all of the virus populations observed in the other assemblies), 30% of which were uniquely assembled through inclusion of long reads, and 22% of the top 10% most abundant virus populations derived from assembly of long reads. Viral populations unique to VirION2 assemblies had significantly higher microdiversity means, which may explain why short-read virome approaches failed to capture them. These findings suggest the VirION2 sample prep and workflow can help researchers better investigate the virosphere, even from challenging low-biomass samples. Our new protocols are available to the research community on protocols.io as a ‘living document’ to facilitate dissemination of updates to keep pace with the rapid evolution of long-read sequencing technology.
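
The headline figures in the abstract can be unpacked with a little arithmetic. The sketch below is only a back-of-envelope reading, not part of the published workflow: it assumes the 76% gain is relative to the median read length of the comparison protocol, and it treats the 99.97% accuracy as a uniform per-base value. All function and constant names are hypothetical.

```python
import math

# Figures quoted in the abstract.
MEDIAN_READ_BP = 6965        # VirION2 median read length (bp)
READ_GAIN_PCT = 76           # "76% longer reads"
INPUT_DNA_NG = 1             # VirION2 input DNA (ng)
INPUT_FOLD_REDUCTION = 100   # "100-fold less input DNA"
ACCURACY_PCT = 99.97         # accuracy after long- plus short-read correction
POPULATIONS = 5161           # viral populations in the seawater virome
LONG_READ_ONLY_PCT = 30      # share assembled only through long reads


def implied_baseline(new_value, pct_increase):
    """Value before a percentage increase, assuming a simple relative gain."""
    return new_value / (1 + pct_increase / 100)


def error_rate(accuracy_pct):
    """Fraction of erroneous bases implied by a percent accuracy."""
    return 1 - accuracy_pct / 100


def phred_equivalent(accuracy_pct):
    """Phred-scale quality for a uniform per-base accuracy (an assumption)."""
    return -10 * math.log10(error_rate(accuracy_pct))


if __name__ == "__main__":
    print("implied earlier median read length (bp):",
          round(implied_baseline(MEDIAN_READ_BP, READ_GAIN_PCT)))
    print("implied earlier input DNA (ng):",
          INPUT_DNA_NG * INPUT_FOLD_REDUCTION)
    print("errors per 10 kb:",
          round(error_rate(ACCURACY_PCT) * 10_000, 2))
    print("Phred-scale equivalent:",
          round(phred_equivalent(ACCURACY_PCT), 1))
    print("populations recovered only with long reads:",
          round(POPULATIONS * LONG_READ_ONLY_PCT / 100))
```

Under these assumptions the numbers imply an earlier median read length of roughly 3,957 bp, an earlier input requirement of about 100 ng, about 3 residual errors per 10 kb of sequence, and about 1,548 populations that were recovered only when long reads were included.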

Authors:
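Zablocki, Olivier [1]; Michelsen, Michelle [2]; Burris, Marie [3]; Solonenko, Natalie [3]; Warwick-Dugdale, Joanna [4]; Ghosh, Romik [3]; Pett-Ridge, Jennifer [5]; Sullivan, Matthew B. [6]; Temperton, Ben [2]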
  1. The Ohio State Univ., Columbus, OH (United States). Dept. of Microbiology; The Ohio State Univ., Columbus, OH (United States). Center of Microbiome Science
  2. Univ. of Exeter, Devon (United Kingdom). School of Biosciences
  3. The Ohio State Univ., Columbus, OH (United States). Dept. of Microbiology
  4. Univ. of Exeter, Devon (United Kingdom). School of Biosciences; Plymouth Marine Laboratory, Plymouth, Devon (United Kingdom)
  5. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Physical and Life Sciences Directorate; Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Accelerator & Fusion Research Division
  6. The Ohio State Univ., Columbus, OH (United States). Dept. of Microbiology; The Ohio State Univ., Columbus, OH (United States). Center of Microbiome Science; The Ohio State Univ., Columbus, OH (United States). Dept. of Civil, Environmental and Geodetic Engineering
Publication Date:
2021-03-30
Research Org.:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA); USDOE Office of Science (SC), Biological and Environmental Research (BER)
OSTI Identifier:
1813699
Report Number(s):
LLNL-JRNL-820026
Journal ID: ISSN 2167-8359; 1031250
Grant/Contract Number:  
AC52-07NA27344; SC0020173; SCW1632
Resource Type:
Accepted Manuscript
Journal Name:
PeerJ
Additional Journal Information:
Journal Volume: 9; Journal Issue: N/A; Journal ID: ISSN 2167-8359
Publisher:
PeerJ Inc.
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; 54 ENVIRONMENTAL SCIENCES; 37 INORGANIC, ORGANIC, PHYSICAL, AND ANALYTICAL CHEMISTRY; Viral metagenomics; Virus; Virome; Metagenome; Nanopore sequencing; Phage; Long-reads

Citation Formats

Zablocki, Olivier, Michelsen, Michelle, Burris, Marie, Solonenko, Natalie, Warwick-Dugdale, Joanna, Ghosh, Romik, Pett-Ridge, Jennifer, Sullivan, Matthew B., and Temperton, Ben. VirION2: a short- and long-read sequencing and informatics workflow to study the genomic diversity of viruses in nature. United States: N. p., 2021. Web. doi:10.7717/peerj.11088.
Zablocki, Olivier, Michelsen, Michelle, Burris, Marie, Solonenko, Natalie, Warwick-Dugdale, Joanna, Ghosh, Romik, Pett-Ridge, Jennifer, Sullivan, Matthew B., & Temperton, Ben. (2021). VirION2: a short- and long-read sequencing and informatics workflow to study the genomic diversity of viruses in nature. United States. https://doi.org/10.7717/peerj.11088
Zablocki, Olivier, Michelsen, Michelle, Burris, Marie, Solonenko, Natalie, Warwick-Dugdale, Joanna, Ghosh, Romik, Pett-Ridge, Jennifer, Sullivan, Matthew B., and Temperton, Ben. 2021. "VirION2: a short- and long-read sequencing and informatics workflow to study the genomic diversity of viruses in nature". United States. https://doi.org/10.7717/peerj.11088. https://www.osti.gov/servlets/purl/1813699.
@article{osti_1813699,
title = {VirION2: a short- and long-read sequencing and informatics workflow to study the genomic diversity of viruses in nature},
author = {Zablocki, Olivier and Michelsen, Michelle and Burris, Marie and Solonenko, Natalie and Warwick-Dugdale, Joanna and Ghosh, Romik and Pett-Ridge, Jennifer and Sullivan, Matthew B. and Temperton, Ben},
doi = {10.7717/peerj.11088},
journal = {PeerJ},
volume = 9,
place = {United States},
year = {2021},
month = {3},
day = {30}
}
