DOE PAGES
U.S. Department of Energy
Office of Scientific and Technical Information

Title: PHANOTATE: a novel approach to gene identification in phage genomes

Abstract

Motivation: Currently there are no tools specifically designed for annotating genes in phages. Several tools are available that have been adapted to run on phage genomes, but due to their underlying design, they are unable to capture the full complexity of phage genomes. Phages have adapted their genomes to be extremely compact, having adjacent genes that overlap and genes completely inside of other longer genes. This non-delineated genome structure makes gene prediction difficult for the currently available gene annotators. Here we present PHANOTATE, a novel method for gene calling specifically designed for phage genomes. Although the compact nature of genes in phages is a problem for current gene annotators, we exploit this property by treating a phage genome as a network of paths, where open reading frames are favorable, and overlaps and gaps are less favorable but still possible. We represent this network of connections as a weighted graph, and use dynamic programming to find the optimal path.

Results: We compare PHANOTATE to other gene callers by annotating a set of 2133 complete phage genomes from GenBank, using PHANOTATE and the three most popular gene callers. We found that the four programs agree on 82% of the total predicted genes, with PHANOTATE predicting more genes than the other three. We searched for these extra genes in both GenBank’s non-redundant protein database and all of the metagenomes in the sequence read archive, and found that they are present at levels that suggest that these are functional protein-coding genes.

Availability and implementation: https://github.com/deprekate/PHANOTATE.

Supplementary information: Supplementary data are available at Bioinformatics online.
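The graph formulation described in the abstract can be illustrated with a small sketch. This is not PHANOTATE's code: the `Orf` type, the `score` field, the `gap_cost` and `overlap_cost` values, and all function names are hypothetical, chosen only for illustration. Open reading frames become negative-weight (favorable) edges, gaps and overlaps become positive-weight (penalized) edges, and a Bellman-Ford relaxation, which tolerates negative weights, finds the lowest-weight path across the genome. The sketch ignores strand and does not handle genes nested entirely inside other genes.

```python
from typing import NamedTuple


class Orf(NamedTuple):
    start: int
    stop: int
    score: float  # hypothetical: higher means more gene-like


def build_graph(orfs, genome_len, gap_cost=0.01, overlap_cost=0.05):
    """Build nodes and weighted edges; the cost values are assumptions."""
    nodes = ["source", "sink"]
    # Calling no genes at all is allowed, but costs one long gap.
    edges = [("source", "sink", gap_cost * genome_len)]
    for i, orf in enumerate(orfs):
        nodes += [("start", i), ("stop", i)]
        # Traversing an ORF is favorable: negative weight.
        edges.append((("start", i), ("stop", i), -orf.score))
        edges.append(("source", ("start", i), gap_cost * orf.start))
        edges.append((("stop", i), "sink", gap_cost * (genome_len - orf.stop)))
        for j, nxt in enumerate(orfs):
            # Only link to ORFs that lie further along (keeps the graph acyclic).
            if nxt.start > orf.start and nxt.stop > orf.stop:
                d = nxt.start - orf.stop
                weight = gap_cost * d if d >= 0 else overlap_cost * -d
                edges.append((("stop", i), ("start", j), weight))
    return nodes, edges


def bellman_ford(nodes, edges, source):
    """Single-source shortest paths that tolerates negative edge weights."""
    dist = {n: float("inf") for n in nodes}
    pred = {n: None for n in nodes}
    dist[source] = 0.0
    for _ in range(len(nodes) - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                pred[v] = u
                changed = True
        if not changed:
            break
    return dist, pred


def call_genes(orfs, genome_len):
    """Return the ORFs that lie on the lowest-weight source-to-sink path."""
    nodes, edges = build_graph(orfs, genome_len)
    _, pred = bellman_ford(nodes, edges, "source")
    path, node = [], "sink"
    while node is not None:
        path.append(node)
        node = pred[node]
    path.reverse()
    return [orfs[n[1]] for n in path if isinstance(n, tuple) and n[0] == "stop"]


# Toy input: two strong ORFs overlapping by 10 bases, one weak ORF, one later ORF.
toy_orfs = [Orf(0, 300, 3.0), Orf(290, 600, 3.0), Orf(100, 250, 0.2), Orf(700, 900, 2.0)]
for gene in call_genes(toy_orfs, genome_len=1000):
    print(gene.start, gene.stop)
```

In this toy input the path keeps both overlapping ORFs because the small overlap penalty costs less than dropping either one, while the weak ORF is skipped. Real scoring would have to be derived from sequence features rather than the fixed numbers assumed here.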

Authors:
McNair, Katelyn [1]; Zhou, Carol [2]; Dinsdale, Elizabeth A. [3]; Souza, Brian [2]; Edwards, Robert A. [4]
  1. San Diego State Univ., CA (United States). Computational Sciences Research Center
  2. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
  3. San Diego State Univ., CA (United States). Dept. of Biology
  4. San Diego State Univ., CA (United States). Computational Sciences Research Center. Dept. of Biology. Viral Information Inst.
Publication Date:
2019-04-25
Research Org.:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1625296
Grant/Contract Number:  
AC52-07NA27344
Resource Type:
Accepted Manuscript
Journal Name:
Bioinformatics
Additional Journal Information:
Journal Volume: 35; Journal Issue: 22; Journal ID: ISSN 1367-4803
Publisher:
International Society for Computational Biology - Oxford University Press
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; Biochemistry & Molecular Biology; Biotechnology & Applied Microbiology; Computer Science; Mathematical & Computational Biology; Mathematics

Citation Formats

McNair, Katelyn, Zhou, Carol, Dinsdale, Elizabeth A., Souza, Brian, and Edwards, Robert A. PHANOTATE: a novel approach to gene identification in phage genomes. United States: N. p., 2019. Web. doi:10.1093/bioinformatics/btz265.
McNair, Katelyn, Zhou, Carol, Dinsdale, Elizabeth A., Souza, Brian, & Edwards, Robert A. (2019). PHANOTATE: a novel approach to gene identification in phage genomes. United States. https://doi.org/10.1093/bioinformatics/btz265
McNair, Katelyn, Zhou, Carol, Dinsdale, Elizabeth A., Souza, Brian, and Edwards, Robert A. 2019. "PHANOTATE: a novel approach to gene identification in phage genomes". United States. https://doi.org/10.1093/bioinformatics/btz265. https://www.osti.gov/servlets/purl/1625296.
@article{osti_1625296,
title = {PHANOTATE: a novel approach to gene identification in phage genomes},
author = {McNair, Katelyn and Zhou, Carol and Dinsdale, Elizabeth A. and Souza, Brian and Edwards, Robert A.},
abstractNote = {Motivation: Currently there are no tools specifically designed for annotating genes in phages. Several tools are available that have been adapted to run on phage genomes, but due to their underlying design, they are unable to capture the full complexity of phage genomes. Phages have adapted their genomes to be extremely compact, having adjacent genes that overlap and genes completely inside of other longer genes. This non-delineated genome structure makes it difficult for gene prediction using the currently available gene annotators. Here we present PHANOTATE, a novel method for gene calling specifically designed for phage genomes. Although the compact nature of genes in phages is a problem for current gene annotators, we exploit this property by treating a phage genome as a network of paths: where open reading frames are favorable, and overlaps and gaps are less favorable, but still possible. We represent this network of connections as a weighted graph, and use dynamic programming to find the optimal path. Results: We compare PHANOTATE to other gene callers by annotating a set of 2133 complete phage genomes from GenBank, using PHANOTATE and the three most popular gene callers. We found that the four programs agree on 82% of the total predicted genes, with PHANOTATE predicting more genes than the other three. We searched for these extra genes in both GenBank’s non-redundant protein database and all of the metagenomes in the sequence read archive, and found that they are present at levels that suggest that these are functional protein-coding genes. Availability and implementation: https://github.com/deprekate/PHANOTATE. Supplementary information: Supplementary data are available at Bioinformatics online.},
doi = {10.1093/bioinformatics/btz265},
journal = {Bioinformatics},
number = 22,
volume = 35,
place = {United States},
year = {2019},
month = apr
}

Citation Metrics:
Cited by: 76 works (citation information provided by Web of Science)

Works referenced in this record:

Adaptive seeds tame genomic sequence comparison
journal, January 2011


Prokka: rapid prokaryotic genome annotation
journal, March 2014


Database resources of the National Center for Biotechnology Information
journal, January 2006


Frameshift alignment: statistics and post-genomic applications
journal, August 2014


PhiSpy: a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies
journal, May 2012

  • Akhter, Sajia; Aziz, Ramy K.; Edwards, Robert A.
  • Nucleic Acids Research, Vol. 40, Issue 16
  • DOI: 10.1093/nar/gks406

CRITICA: coding region identification tool invoking comparative analysis
journal, April 1999


Metagenomics and future perspectives in virus discovery
journal, February 2012


On a routing problem
journal, January 1958

  • Bellman, Richard
  • Quarterly of Applied Mathematics, Vol. 16, Issue 1
  • DOI: 10.1090/qam/102435

GenBank
journal, November 2016

  • Benson, Dennis A.; Cavanaugh, Mark; Clark, Karen
  • Nucleic Acids Research, Vol. 45, Issue D1
  • DOI: 10.1093/nar/gkw1070

Strain-Resolved Dynamics of the Lung Microbiome in Patients with Cystic Fibrosis
journal, April 2021

  • Dmitrijeva, Marija; Kahlert, Christian R.; Feigelman, Rounak
  • mBio, Vol. 12, Issue 2
  • DOI: 10.1128/mbio.02863-20

Draft Genome Sequence of Comamonas aquatilis Strain LK (= CSUR P6418 = CECT 9772), Isolated from the Planarian Schmidtea mediterranea
journal, February 2021

  • Kangale, Luis Johnson; Levasseur, Anthony; Raoult, Didier
  • Microbiology Resource Announcements, Vol. 10, Issue 5
  • DOI: 10.1128/mra.00297-20

Rz/Rz1 Lysis Gene Equivalents in Phages of Gram-negative Hosts
journal, November 2007

  • Summer, Elizabeth J.; Berry, Joel; Tran, Tram Anh T.
  • Journal of Molecular Biology, Vol. 373, Issue 5
  • DOI: 10.1016/j.jmb.2007.08.045

Database resources of the National Center for Biotechnology Information
journal, November 2018

  • Sayers, Eric W.; Agarwala, Richa; Bolton, Evan E.
  • Nucleic Acids Research, Vol. 47, Issue D1
  • DOI: 10.1093/nar/gky1069

The Phage Proteomic Tree: a Genome-Based Taxonomy for Phage
journal, August 2002


Database resources of the National Center for Biotechnology Information
journal, December 2007

  • Wheeler, D. L.; Barrett, T.; Benson, D. A.
  • Nucleic Acids Research, Vol. 36, Issue Database
  • DOI: 10.1093/nar/gkm1000

Effect size, confidence interval and statistical significance: a practical guide for biologists
journal, November 2007


Genetic Analysis of the Lambda Spanins Rz and Rz1: Identification of Functional Domains
journal, December 2016

  • Cahill, Jesse; Rajaure, Manoj; O’Leary, Chandler
  • G3: Genes|Genomes|Genetics, Vol. 7, Issue 2
  • DOI: 10.1534/g3.116.037192

PARTIE: a partition engine to separate metagenomic and amplicon projects in the Sequence Read Archive
journal, March 2017


Database resources of the National Center for Biotechnology Information
journal, October 2020

  • Sayers, Eric W.; Beck, Jeffrey; Bolton, Evan E.
  • Nucleic Acids Research, Vol. 49, Issue D1
  • DOI: 10.1093/nar/gkaa892

Genomics and Proteomics of Mycobacteriophage Patience, an Accidental Tourist in the Mycobacterium Neighborhood
journal, October 2014

  • Pope, Welkin H.; Jacobs-Sera, Deborah; Russell, Daniel A.
  • mBio, Vol. 5, Issue 6
  • DOI: 10.1128/mBio.02145-14

PHASTER: a better, faster version of the PHAST phage search tool
journal, May 2016

  • Arndt, David; Grant, Jason R.; Marcu, Ana
  • Nucleic Acids Research, Vol. 44, Issue W1
  • DOI: 10.1093/nar/gkw387

Prodigal: prokaryotic gene recognition and translation initiation site identification
journal, March 2010


Viral dark matter and virus–host interactions resolved from publicly available microbial genomes
journal, July 2015


Heuristic approach to deriving models for gene finding
journal, October 1999


On a Routing Problem
journal, July 2004

  • Lindvall, Torgny
  • Probability in the Engineering and Informational Sciences, Vol. 18, Issue 03
  • DOI: 10.1017/s0269964804183046

Protein family classification and functional annotation
journal, February 2003


Multivariate Entropy Distance Method for Prokaryotic gene Identification
journal, June 2004

  • Ouyang, Zhengqing; Zhu, Huaiqiu; Wang, Jin
  • Journal of Bioinformatics and Computational Biology, Vol. 02, Issue 02
  • DOI: 10.1142/s0219720004000624

Works referencing / citing this record:

Genomic and ecological attributes of marine bacteriophages encoding bacterial virulence genes
journal, February 2020

  • Silveira, Cynthia B.; Coutinho, Felipe H.; Cavalcanti, Giselle S.
  • BMC Genomics, Vol. 21, Issue 1
  • DOI: 10.1186/s12864-020-6523-2

Isolation of Four Lytic Phages Infecting Klebsiella pneumoniae K22 Clinical Isolates from Spain
journal, January 2020

  • Domingo-Calap, Pilar; Beamud, Beatriz; Vienne, Justine
  • International Journal of Molecular Sciences, Vol. 21, Issue 2
  • DOI: 10.3390/ijms21020425

Global phylogeography and ancient evolution of the widespread human gut virus crAssphage
journal, July 2019

  • Edwards, Robert A.; Vega, Alejandro A.; Norman, Holly M.
  • Nature Microbiology, Vol. 4, Issue 10
  • DOI: 10.1038/s41564-019-0494-6

Complete Genome Sequence of XaF13, a Novel Bacteriophage of Xanthomonas vesicatoria from Mexico
journal, January 2020

  • Solís-Sánchez, Guillermo Alejandro; Quiñones-Aguilar, Evangelina Esmeralda; Fraire-Velázquez, Saul
  • Microbiology Resource Announcements, Vol. 9, Issue 5
  • DOI: 10.1128/mra.01371-19

8-OH-DPAT, a 5-HT 1A agonist and ritanserin, a 5-HT 2A/C antagonist, reverse haloperidol-induced catalepsy in rats independently of striatal dopamine release
journal, May 1997

  • Lucas, Guillaume; Bonhomme, Norbert; Deurwaerdère, Philippe De
  • Psychopharmacology, Vol. 131, Issue 1
  • DOI: 10.1007/s002130050265

multiPhATE: bioinformatics pipeline for functional annotation of phage isolates
journal, May 2019


A Method for Improving the Accuracy and Efficiency of Bacteriophage Genome Annotation
journal, July 2019

  • Salisbury, Alicia; Tsourkas, Philippos K.
  • International Journal of Molecular Sciences, Vol. 20, Issue 14
  • DOI: 10.3390/ijms20143391

Isolation and Characterization of Two Klebsiella pneumoniae Phages Encoding Divergent Depolymerases
journal, April 2020

  • Domingo-Calap, Pilar; Beamud, Beatriz; Mora-Quilis, Lucas
  • International Journal of Molecular Sciences, Vol. 21, Issue 9
  • DOI: 10.3390/ijms21093160

Genetic Mining of Newly Isolated Salmophages for Phage Therapy
journal, August 2022

  • Gendre, Julia; Ansaldi, Mireille; Olivenza, David R.
  • International Journal of Molecular Sciences, Vol. 23, Issue 16
  • DOI: 10.3390/ijms23168917