Title: VirSorter2: a multi-classifier, expert-guided approach to detect diverse DNA and RNA viruses

Journal Article · Microbiome

Abstract

Background:
Viruses are a significant player in many biosphere and human ecosystems, but most signals remain “hidden” in metagenomic/metatranscriptomic sequence datasets due to the lack of universal gene markers, database representatives, and insufficiently advanced identification tools.

Results:
Here, we introduce VirSorter2, a DNA and RNA virus identification tool that leverages genome-informed database advances across a collection of customized automatic classifiers to improve the accuracy and range of virus sequence detection. When benchmarked against genomes from both isolated and uncultivated viruses, VirSorter2 uniquely performed consistently with high accuracy (F1-score > 0.8) across viral diversity, while all other tools under-detected viruses outside of the group most represented in reference databases (i.e., those in the order Caudovirales). Among the tools evaluated, VirSorter2 was also uniquely able to minimize errors associated with atypical cellular sequences including eukaryotic genomes and plasmids. Finally, as the virosphere exploration unravels novel viral sequences, VirSorter2’s modular design makes it inherently able to expand to new types of viruses via the design of new classifiers to maintain maximal sensitivity and specificity.

Conclusion:
With multi-classifier and modular design, VirSorter2 demonstrates higher overall accuracy across major viral groups and will advance our knowledge of virus evolution, diversity, and virus-microbe interaction in various ecosystems. Source code of VirSorter2 is freely available (https://bitbucket.org/MAVERICLab/virsorter2), and VirSorter2 is also available both on bioconda and as an iVirus app on CyVerse (https://de.cyverse.org/de).

Research Organization:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC)
Grant/Contract Number:
SC0020173; AC02-05CH11231
OSTI ID:
1763830
Alternate ID(s):
OSTI ID: 1808518
Journal Information:
Microbiome, Vol. 9, Issue 1; ISSN 2049-2618
Publisher:
Springer Science + Business Media
Country of Publication:
United Kingdom
Language:
English
