ClassyFlu: Classification of Influenza A Viruses with Discriminatively Trained Profile-HMMs
Abstract
Accurate and rapid characterization of influenza A virus (IAV) hemagglutinin (HA) and neuraminidase (NA) sequences with respect to subtype and clade is at the basis of extended diagnostic services and implicit to molecular epidemiologic studies. ClassyFlu is a new tool and web service for the classification of IAV sequences of the HA and NA gene into subtypes and phylogenetic clades using discriminatively trained profile hidden Markov models (HMMs), one for each subtype or clade. ClassyFlu merely requires as input unaligned, full-length or partial HA or NA DNA sequences. It enables rapid and highly accurate assignment of HA sequences to subtypes H1–H17 but particularly focusses on the finer grained assignment of sequences of highly pathogenic avian influenza viruses of subtype H5N1 according to the cladistics proposed by the H5N1 Evolution Working Group. NA sequences are classified into subtypes N1–N10. ClassyFlu was compared to semiautomatic classification approaches using BLAST and phylogenetics and additionally for H5 sequences to the new “Highly Pathogenic H5N1 Clade Classification Tool” (IRD-CT) proposed by the Influenza Research Database. Our results show that both web tools (ClassyFlu and IRD-CT), although based on different methods, are nearly equivalent in performance and both are more accurate and faster than semiautomatic classification. A retraining of ClassyFlu to altered cladistics as well as an extension of ClassyFlu to other IAV genome segments or fragments thereof is undemanding. This is exemplified by unambiguous assignment to a distinct cluster within subtype H7 of sequences of H7N9 viruses which emerged in China early in 2013 and caused more than 130 human infections.
- Authors:
- Van der Auwera, Sandra; Bulla, Ingo; Ziller, Mario; Pohlmann, Anne; Harder, Timm; Stanke, Mario
- Ernst-Moritz-Arndt Universität Greifswald (Germany)
- Ernst-Moritz-Arndt Universität Greifswald (Germany); Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
- Federal Research Institute for Animal Health, Island of Riems, Greifswald (Germany)
- Publication Date:
- 2014-01-03
- Research Org.:
- Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC), Biological and Environmental Research (BER); German Science Foundation (DFG)
- OSTI Identifier:
- 1627669
- Grant/Contract Number:
- AC52-06NA25396; STA 1009/5-1
- Resource Type:
- Accepted Manuscript
- Journal Name:
- PLoS ONE
- Additional Journal Information:
- Journal Volume: 9; Journal Issue: 1; Journal ID: ISSN 1932-6203
- Publisher:
- Public Library of Science
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 59 BASIC BIOLOGICAL SCIENCES; hidden Markov models; H5N1; BLAST algorithm; influenza A virus; phylogenetic analysis; sequence databases; animal phylogenetics; viral pathogens
Citation Formats
Van der Auwera, Sandra, Bulla, Ingo, Ziller, Mario, Pohlmann, Anne, Harder, Timm, and Stanke, Mario. ClassyFlu: Classification of Influenza A Viruses with Discriminatively Trained Profile-HMMs. United States: N. p., 2014.
Web. doi:10.1371/journal.pone.0084558.
Van der Auwera, Sandra, Bulla, Ingo, Ziller, Mario, Pohlmann, Anne, Harder, Timm, & Stanke, Mario. ClassyFlu: Classification of Influenza A Viruses with Discriminatively Trained Profile-HMMs. United States. https://doi.org/10.1371/journal.pone.0084558
Van der Auwera, Sandra, Bulla, Ingo, Ziller, Mario, Pohlmann, Anne, Harder, Timm, and Stanke, Mario. 2014.
"ClassyFlu: Classification of Influenza A Viruses with Discriminatively Trained Profile-HMMs". United States. https://doi.org/10.1371/journal.pone.0084558. https://www.osti.gov/servlets/purl/1627669.
@article{osti_1627669,
title = {ClassyFlu: Classification of Influenza A Viruses with Discriminatively Trained Profile-HMMs},
author = {Van der Auwera, Sandra and Bulla, Ingo and Ziller, Mario and Pohlmann, Anne and Harder, Timm and Stanke, Mario},
abstractNote = {Accurate and rapid characterization of influenza A virus (IAV) hemagglutinin (HA) and neuraminidase (NA) sequences with respect to subtype and clade is at the basis of extended diagnostic services and implicit to molecular epidemiologic studies. ClassyFlu is a new tool and web service for the classification of IAV sequences of the HA and NA gene into subtypes and phylogenetic clades using discriminatively trained profile hidden Markov models (HMMs), one for each subtype or clade. ClassyFlu merely requires as input unaligned, full-length or partial HA or NA DNA sequences. It enables rapid and highly accurate assignment of HA sequences to subtypes H1–H17 but particularly focusses on the finer grained assignment of sequences of highly pathogenic avian influenza viruses of subtype H5N1 according to the cladistics proposed by the H5N1 Evolution Working Group. NA sequences are classified into subtypes N1–N10. ClassyFlu was compared to semiautomatic classification approaches using BLAST and phylogenetics and additionally for H5 sequences to the new ‘‘Highly Pathogenic H5N1 Clade Classification Tool’’ (IRD-CT) proposed by the Influenza Research Database. Our results show that both web tools (ClassyFlu and IRD-CT), although based on different methods, are nearly equivalent in performance and both are more accurate and faster than semiautomatic classification. A retraining of ClassyFlu to altered cladistics as well as an extension of ClassyFlu to other IAV genome segments or fragments thereof is undemanding. This is exemplified by unambiguous assignment to a distinct cluster within subtype H7 of sequences of H7N9 viruses which emerged in China early in 2013 and caused more than 130 human infections.},
doi = {10.1371/journal.pone.0084558},
journal = {PLoS ONE},
number = 1,
volume = 9,
place = {United States},
year = {2014},
month = {1}
}
Works referenced in this record:
Cross-species comparison of site-specific evolutionary-rate variation in influenza haemagglutinin
journal, March 2013
- Meyer, Austin G.; Dawson, Eric T.; Wilke, Claus O.
- Philosophical Transactions of the Royal Society B: Biological Sciences, Vol. 368, Issue 1614
Crystal structures of two subtype N10 neuraminidase-like proteins from bat influenza A viruses reveal a diverged putative active site
journal, September 2012
- Zhu, Xueyong; Yang, Hua; Guo, Zhu
- Proceedings of the National Academy of Sciences, Vol. 109, Issue 46
The emergence of pandemic influenza viruses
journal, January 2010
- Guan, Yi; Vijaykrishna, Dhanasekaran; Bahl, Justin
- Protein & Cell, Vol. 1, Issue 1
The genesis and source of the H7N9 influenza viruses causing human infections in China
journal, August 2013
- Lam, Tommy Tsan-Yuk; Wang, Jia; Shen, Yongyi
- Nature, Vol. 502, Issue 7470
The Evolving Threat of Influenza Viruses of Animal Origin and the Challenges in Developing Appropriate Diagnostics
journal, November 2012
- Mak, Polly WY; Jayawardena, Shanthi; Poon, Leo LM
- Clinical Chemistry, Vol. 58, Issue 11
New Algorithms and Methods to Estimate Maximum-Likelihood Phylogenies: Assessing the Performance of PhyML 3.0
journal, March 2010
- Guindon, Stéphane; Dufayard, Jean-François; Lefort, Vincent
- Systematic Biology, Vol. 59, Issue 3
MUSCLE: multiple sequence alignment with high accuracy and high throughput
journal, March 2004
- Edgar, R. C.
- Nucleic Acids Research, Vol. 32, Issue 5, p. 1792-1797
Accelerated Profile HMM Searches
journal, October 2011
- Eddy, Sean R.
- PLoS Computational Biology, Vol. 7, Issue 10
Approximate Likelihood-Ratio Test for Branches: A Fast, Accurate, and Powerful Alternative
journal, August 2006
- Anisimova, Maria; Gascuel, Olivier
- Systematic Biology, Vol. 55, Issue 4
Rapid haemagglutinin subtyping and pathotyping of avian influenza viruses by a DNA microarray
journal, September 2009
- Gall, Astrid; Hoffmann, Bernd; Harder, Timm
- Journal of Virological Methods, Vol. 160, Issue 1-2
Jalview Version 2--a multiple sequence alignment editor and analysis workbench
journal, January 2009
- Waterhouse, A. M.; Procter, J. B.; Martin, D. M. A.
- Bioinformatics, Vol. 25, Issue 9
MEGA5: Molecular Evolutionary Genetics Analysis Using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods
journal, May 2011
- Tamura, K.; Peterson, D.; Peterson, N.
- Molecular Biology and Evolution, Vol. 28, Issue 10
Basic local alignment search tool
journal, October 1990
- Altschul, Stephen F.; Gish, Warren; Miller, Webb
- Journal of Molecular Biology, Vol. 215, Issue 3, p. 403-410
Influenza Research Database: an integrated bioinformatics resource for influenza research and surveillance: Influenza Research Database
journal, January 2012
- Squires, R. Burke; Noronha, Jyothi; Hunt, Victoria
- Influenza and Other Respiratory Viruses, Vol. 6, Issue 6
Avian influenza: our current understanding
journal, June 2010
- Suarez, David L.
- Animal Health Research Reviews, Vol. 11, Issue 1
The neuraminidase of bat influenza viruses is not a neuraminidase
journal, October 2012
- García-Sastre, Adolfo
- Proceedings of the National Academy of Sciences, Vol. 109, Issue 46
Works referencing / citing this record:
Overview of Virus Metagenomic Classification Methods and Their Biological Applications
journal, April 2018
- Nooij, Sam; Schmitz, Dennis; Vennema, Harry
- Frontiers in Microbiology, Vol. 9
Identification and characterization of Coronaviridae genomes from Vietnamese bats and rats based on conserved protein domains
journal, July 2018
- Phan, My V. T.; Ngo Tri, Tue; Hong Anh, Pham
- Virus Evolution, Vol. 4, Issue 2
Identification and characterization of Coronaviridae genomes from Vietnamese bats and rats based on conserved protein domains.
text, January 2018
- Phan, My VT; Ngo Tri, Tue; Hong Anh, Pham
- Apollo - University of Cambridge Repository