DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

This content will become publicly available on January 29, 2020

Title: Gene sharing networks to automate genome-based prokaryotic viral taxonomy

Abstract

Viruses of bacteria and archaea impact natural, engineered and human ecosystems, but their study is hampered by the lack of a universal or scalable taxonomic framework. Here we introduce vConTACT v2.0, a network-based application to establish prokaryotic virus taxonomy that scales to thousands of uncultivated virus genomes and integrates distance-based hierarchical clustering and confidence scores for all taxonomic predictions. Performance tests demonstrated significant improvements over the original tool and near-identical (96%) correspondence to current International Committee on Taxonomy of Viruses (ICTV) viral taxonomy where genus-level assignments are available. Beyond these “known viruses”, vConTACT v2.0 suggested automatic genus assignments for 1,364 previously unclassified reference viruses, with perfectly scoring assignments submitted as new taxonomic proposals to ICTV. Scaling experiments with 15,280 global ocean large viral genome fragments demonstrated that the reference network was rapidly scalable and robust to adding large-scale viral metagenomic datasets. Together these efforts provide a critically needed, systematically classified reference network and an accurate, scalable, and automatable taxonomic analysis tool.

Authors:
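Jang, Ho Bin [1]; Bolduc, Benjamin [1]; Zablocki, Olivier [1]; Kuhn, Jens H. [2]; Roux, Simon [3]; Adriaenssens, Evelien M. [4]; Brister, J. Rodney [5]; Kropinski, Andrew M. [6]; Krupovic, Mart [7]; Turner, Dann [8]; Sullivan, Matthew B. [1]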
  1. The Ohio State Univ., Columbus, OH (United States)
  2. National Institutes of Health, Fort Detrick, Frederick, MD (United States)
  3. USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)
  4. Univ. of Liverpool, Liverpool (United Kingdom)
  5. National Inst. of Health (NIH), Bethesda, MD (United States)
  6. Univ. of Guelph, Guelph, ON (Canada)
  7. Inst. Pasteur, Paris (France)
  8. Univ. of the West of England, Bristol (United Kingdom)
Publication Date:
Research Org.:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER) (SC-23)
OSTI Identifier:
1569046
Grant/Contract Number:  
AC52-07NA27344
Resource Type:
Accepted Manuscript
Journal Name:
Nature Biotechnology
Additional Journal Information:
Journal Volume: 37; Journal Issue: 6; Journal ID: ISSN 1087-0156
Publisher:
Springer Nature
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; 54 ENVIRONMENTAL SCIENCES

Citation Formats

Jang, Ho Bin, Bolduc, Benjamin, Zablocki, Olivier, Kuhn, Jens H., Roux, Simon, Adriaenssens, Evelien M., Brister, J. Rodney, Kropinski, Andrew M., Krupovic, Mart, Turner, Dann, and Sullivan, Matthew B. Gene sharing networks to automate genome-based prokaryotic viral taxonomy. United States: N. p., 2019. Web. doi:10.1101/533240.
Jang, Ho Bin, Bolduc, Benjamin, Zablocki, Olivier, Kuhn, Jens H., Roux, Simon, Adriaenssens, Evelien M., Brister, J. Rodney, Kropinski, Andrew M., Krupovic, Mart, Turner, Dann, & Sullivan, Matthew B. Gene sharing networks to automate genome-based prokaryotic viral taxonomy. United States. doi:10.1101/533240.
Jang, Ho Bin, Bolduc, Benjamin, Zablocki, Olivier, Kuhn, Jens H., Roux, Simon, Adriaenssens, Evelien M., Brister, J. Rodney, Kropinski, Andrew M., Krupovic, Mart, Turner, Dann, and Sullivan, Matthew B. 2019. "Gene sharing networks to automate genome-based prokaryotic viral taxonomy". United States. doi:10.1101/533240.
@article{osti_1569046,
title = {Gene sharing networks to automate genome-based prokaryotic viral taxonomy},
author = {Jang, Ho Bin and Bolduc, Benjamin and Zablocki, Olivier and Kuhn, Jens H. and Roux, Simon and Adriaenssens, Evelien M. and Brister, J. Rodney and Kropinski, Andrew M. and Krupovic, Mart and Turner, Dann and Sullivan, Matthew B.},
abstractNote = {Viruses of bacteria and archaea impact natural, engineered and human ecosystems, but their study is hampered by the lack of a universal or scalable taxonomic framework. Here we introduce vConTACT v2.0, a network-based application to establish prokaryotic virus taxonomy that scales to thousands of uncultivated virus genomes and integrates distance-based hierarchical clustering and confidence scores for all taxonomic predictions. Performance tests demonstrated significant improvements over the original tool and near-identical (96%) correspondence to current International Committee on Taxonomy of Viruses (ICTV) viral taxonomy where genus-level assignments are available. Beyond these “known viruses”, vConTACT v2.0 suggested automatic genus assignments for 1,364 previously unclassified reference viruses, with perfectly scoring assignments submitted as new taxonomic proposals to ICTV. Scaling experiments with 15,280 global ocean large viral genome fragments demonstrated that the reference network was rapidly scalable and robust to adding large-scale viral metagenomic datasets. Together these efforts provide a critically needed, systematically classified reference network and an accurate, scalable, and automatable taxonomic analysis tool.},
doi = {10.1101/533240},
journal = {Nature Biotechnology},
number = 6,
volume = 37,
place = {United States},
year = {2019},
month = {1}
}
