U.S. Department of Energy
Office of Scientific and Technical Information

VirSorter2: a multi-classifier, expert-guided approach to detect diverse DNA and RNA viruses

Journal Article · Microbiome

Abstract

Background

Viruses are significant players in many biosphere and human ecosystems, but most viral signals remain “hidden” in metagenomic/metatranscriptomic sequence datasets because universal gene markers and database representatives are lacking and identification tools are insufficiently advanced.

Results

Here, we introduce VirSorter2, a DNA and RNA virus identification tool that leverages genome-informed database advances across a collection of customized automatic classifiers to improve the accuracy and range of virus sequence detection. When benchmarked against genomes from both isolated and uncultivated viruses, VirSorter2 was the only tool that performed consistently with high accuracy (F1-score > 0.8) across viral diversity, while all other tools under-detected viruses outside of the group most represented in reference databases (i.e., those in the order Caudovirales). Among the tools evaluated, VirSorter2 was also the only one able to minimize errors associated with atypical cellular sequences, including eukaryotic genomes and plasmids. Finally, as exploration of the virosphere uncovers novel viral sequences, VirSorter2’s modular design makes it inherently able to expand to new types of viruses via the design of new classifiers to maintain maximal sensitivity and specificity.
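
For readers unfamiliar with the accuracy metric quoted above, the F1-score is the harmonic mean of precision and recall. The following is a minimal sketch of that standard formula only; the function name and the counts are hypothetical illustrations and are not taken from the VirSorter2 benchmark or its code.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score (harmonic mean of precision and recall) from raw counts.

    tp: viral sequences correctly called viral
    fp: non-viral sequences wrongly called viral
    fn: viral sequences that were missed
    """
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to illustrate the calculation.
score = f1_score(tp=90, fp=10, fn=20)
print(f"F1 = {score:.3f}")  # prints "F1 = 0.857"
```

With these illustrative counts, precision is 0.90 and recall is about 0.82, so the result clears the 0.8 threshold mentioned in the abstract.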

Conclusion

With its multi-classifier and modular design, VirSorter2 demonstrates higher overall accuracy across major viral groups and will advance our knowledge of virus evolution, diversity, and virus-microbe interactions in various ecosystems. The source code of VirSorter2 is freely available (https://bitbucket.org/MAVERICLab/virsorter2), and VirSorter2 is also available both on bioconda and as an iVirus app on CyVerse (https://de.cyverse.org/de).

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE; USDOE Office of Science (SC)
Grant/Contract Number:
AC02-05CH11231; SC0020173
OSTI ID:
1763830
Alternate ID(s):
OSTI ID: 1808518
Journal Information:
Journal Name: Microbiome; Journal Volume: 9; Journal Issue: 1; ISSN 2049-2618
Publisher:
Springer Science + Business Media
Country of Publication:
United Kingdom
Language:
English
