PHANOTATE: a novel approach to gene identification in phage genomes
Abstract
Motivation: Currently there are no tools specifically designed for annotating genes in phages. Several tools are available that have been adapted to run on phage genomes, but due to their underlying design, they are unable to capture the full complexity of phage genomes. Phages have adapted their genomes to be extremely compact, having adjacent genes that overlap and genes completely inside of other longer genes. This non-delineated genome structure makes it difficult for gene prediction using the currently available gene annotators. Here we present PHANOTATE, a novel method for gene calling specifically designed for phage genomes. Although the compact nature of genes in phages is a problem for current gene annotators, we exploit this property by treating a phage genome as a network of paths: where open reading frames are favorable, and overlaps and gaps are less favorable, but still possible. We represent this network of connections as a weighted graph, and use dynamic programming to find the optimal path. Results: We compare PHANOTATE to other gene callers by annotating a set of 2133 complete phage genomes from GenBank, using PHANOTATE and the three most popular gene callers. We found that the four programs agree on 82% of the total predicted genes, with PHANOTATE predicting more genes than the other three. We searched for these extra genes in both GenBank’s non-redundant protein database and all of the metagenomes in the sequence read archive, and found that they are present at levels that suggest that these are functional protein-coding genes. Availability and implementation: https://github.com/deprekate/PHANOTATE. Supplementary information: Supplementary data are available at Bioinformatics online.
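The weighted-graph formulation described in the abstract can be illustrated with a minimal sketch. This is not the PHANOTATE implementation: the `Orf` tuple, the score values, and the `gap_cost` and `overlap_cost` parameters are hypothetical, and Bellman-Ford relaxation is assumed here as one dynamic-programming method that tolerates negative edge weights. Open reading frames carry negative (favorable) weights, gaps and overlaps carry positive penalties, and the lowest-weight path from genome start to genome end is taken as the gene set. For brevity the sketch does not connect an ORF to one nested entirely inside it.

```python
from typing import NamedTuple

class Orf(NamedTuple):
    name: str
    start: int    # 0-based start coordinate
    stop: int     # exclusive end coordinate
    score: float  # hypothetical coding score; more negative = more gene-like

def build_edges(orfs, genome_len, gap_cost=0.01, overlap_cost=0.05):
    """Return (u, v, weight) edges over 'source', 'sink', and ORF names."""
    edges = []
    for o in orfs:
        # Entering an ORF from the genome start pays for the leading gap.
        edges.append(("source", o.name, gap_cost * o.start + o.score))
        # Leaving an ORF for the genome end pays for the trailing gap.
        edges.append((o.name, "sink", gap_cost * (genome_len - o.stop)))
    for a in orfs:
        for b in orfs:
            # Only move forward along the genome, so the graph is acyclic.
            if b.start > a.start and b.stop > a.stop:
                d = b.start - a.stop  # positive: gap, negative: overlap
                penalty = gap_cost * d if d >= 0 else overlap_cost * -d
                edges.append((a.name, b.name, penalty + b.score))
    return edges

def bellman_ford(nodes, edges, source):
    """Single-source shortest paths; negative edge weights are allowed."""
    dist = {n: float("inf") for n in nodes}
    prev = {n: None for n in nodes}
    dist[source] = 0.0
    for _ in range(len(nodes) - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                prev[v] = u
                changed = True
        if not changed:
            break
    return dist, prev

def best_path(orfs, genome_len):
    """Return (node names from source to sink, total path weight)."""
    nodes = ["source"] + [o.name for o in orfs] + ["sink"]
    dist, prev = bellman_ford(nodes, build_edges(orfs, genome_len), "source")
    path, node = [], "sink"
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1], dist["sink"]

# Toy input: A and B overlap by 10 bases, C is a weak ORF inside A.
orfs = [
    Orf("A", 0, 300, -3.0),
    Orf("B", 290, 600, -3.0),
    Orf("C", 100, 250, -0.5),
    Orf("D", 650, 900, -2.5),
]
path, total = best_path(orfs, genome_len=1000)
print(path, total)
```

In this toy input the selected path keeps both A and B despite their short overlap, because the overlap penalty is smaller than the benefit of including both ORFs; that is the trade-off the abstract describes.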
- Authors:
- McNair, Katelyn; Zhou, Carol; Dinsdale, Elizabeth A.; Souza, Brian; Edwards, Robert A.
- San Diego State Univ., CA (United States). Computational Sciences Research Center
- Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- San Diego State Univ., CA (United States). Dept. of Biology
- San Diego State Univ., CA (United States). Computational Sciences Research Center. Dept. of Biology. Viral Information Inst.
- Publication Date:
- 2019-04-25
- Research Org.:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC)
- OSTI Identifier:
- 1625296
- Grant/Contract Number:
- AC52-07NA27344
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Bioinformatics
- Additional Journal Information:
- Journal Volume: 35; Journal Issue: 22; Journal ID: ISSN 1367-4803
- Publisher:
- International Society for Computational Biology - Oxford University Press
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 59 BASIC BIOLOGICAL SCIENCES; Biochemistry & Molecular Biology; Biotechnology & Applied Microbiology; Computer Science; Mathematical & Computational Biology; Mathematics
Citation Formats
McNair, Katelyn, Zhou, Carol, Dinsdale, Elizabeth A., Souza, Brian, and Edwards, Robert A. PHANOTATE: a novel approach to gene identification in phage genomes. United States: N. p., 2019. Web. doi:10.1093/bioinformatics/btz265.
McNair, Katelyn, Zhou, Carol, Dinsdale, Elizabeth A., Souza, Brian, & Edwards, Robert A. PHANOTATE: a novel approach to gene identification in phage genomes. United States. https://doi.org/10.1093/bioinformatics/btz265
McNair, Katelyn, Zhou, Carol, Dinsdale, Elizabeth A., Souza, Brian, and Edwards, Robert A. 2019. "PHANOTATE: a novel approach to gene identification in phage genomes". United States. https://doi.org/10.1093/bioinformatics/btz265. https://www.osti.gov/servlets/purl/1625296.
@article{osti_1625296,
title = {PHANOTATE: a novel approach to gene identification in phage genomes},
author = {McNair, Katelyn and Zhou, Carol and Dinsdale, Elizabeth A. and Souza, Brian and Edwards, Robert A.},
abstractNote = {Motivation: Currently there are no tools specifically designed for annotating genes in phages. Several tools are available that have been adapted to run on phage genomes, but due to their underlying design, they are unable to capture the full complexity of phage genomes. Phages have adapted their genomes to be extremely compact, having adjacent genes that overlap and genes completely inside of other longer genes. This non-delineated genome structure makes it difficult for gene prediction using the currently available gene annotators. Here we present PHANOTATE, a novel method for gene calling specifically designed for phage genomes. Although the compact nature of genes in phages is a problem for current gene annotators, we exploit this property by treating a phage genome as a network of paths: where open reading frames are favorable, and overlaps and gaps are less favorable, but still possible. We represent this network of connections as a weighted graph, and use dynamic programming to find the optimal path. Results: We compare PHANOTATE to other gene callers by annotating a set of 2133 complete phage genomes from GenBank, using PHANOTATE and the three most popular gene callers. We found that the four programs agree on 82% of the total predicted genes, with PHANOTATE predicting more genes than the other three. We searched for these extra genes in both GenBank’s non-redundant protein database and all of the metagenomes in the sequence read archive, and found that they are present at levels that suggest that these are functional protein-coding genes. Availability and implementation: https://github.com/deprekate/PHANOTATE. Supplementary information: Supplementary data are available at Bioinformatics online.},
doi = {10.1093/bioinformatics/btz265},
journal = {Bioinformatics},
number = 22,
volume = 35,
place = {United States},
year = {2019},
month = {4}
}
Works referencing / citing this record:
Genomic and ecological attributes of marine bacteriophages encoding bacterial virulence genes
journal, February 2020
- Silveira, Cynthia B.; Coutinho, Felipe H.; Cavalcanti, Giselle S.
- BMC Genomics, Vol. 21, Issue 1
Isolation of Four Lytic Phages Infecting Klebsiella pneumoniae K22 Clinical Isolates from Spain
journal, January 2020
- Domingo-Calap, Pilar; Beamud, Beatriz; Vienne, Justine
- International Journal of Molecular Sciences, Vol. 21, Issue 2
Global phylogeography and ancient evolution of the widespread human gut virus crAssphage
journal, July 2019
- Edwards, Robert A.; Vega, Alejandro A.; Norman, Holly M.
- Nature Microbiology, Vol. 4, Issue 10
Complete Genome Sequence of XaF13, a Novel Bacteriophage of Xanthomonas vesicatoria from Mexico
journal, January 2020
- Solís-Sánchez, Guillermo Alejandro; Quiñones-Aguilar, Evangelina Esmeralda; Fraire-Velázquez, Saul
- Microbiology Resource Announcements, Vol. 9, Issue 5
8-OH-DPAT, a 5-HT 1A agonist and ritanserin, a 5-HT 2A/C antagonist, reverse haloperidol-induced catalepsy in rats independently of striatal dopamine release
journal, May 1997
- Lucas, Guillaume; Bonhomme, Norbert; Deurwaerdère, Philippe De
- Psychopharmacology, Vol. 131, Issue 1
multiPhATE: bioinformatics pipeline for functional annotation of phage isolates
journal, May 2019
- Ecale Zhou, Carol L.; Malfatti, Stephanie; Kimbrel, Jeffrey
- Bioinformatics, Vol. 35, Issue 21
A Method for Improving the Accuracy and Efficiency of Bacteriophage Genome Annotation
journal, July 2019
- Salisbury, Alicia; Tsourkas, Philippos K.
- International Journal of Molecular Sciences, Vol. 20, Issue 14
Isolation and Characterization of Two Klebsiella pneumoniae Phages Encoding Divergent Depolymerases
journal, April 2020
- Domingo-Calap, Pilar; Beamud, Beatriz; Mora-Quilis, Lucas
- International Journal of Molecular Sciences, Vol. 21, Issue 9
Genetic Mining of Newly Isolated Salmophages for Phage Therapy
journal, August 2022
- Gendre, Julia; Ansaldi, Mireille; Olivenza, David R.
- International Journal of Molecular Sciences, Vol. 23, Issue 16