DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: ClassyFlu: Classification of Influenza A Viruses with Discriminatively Trained Profile-HMMs

Abstract

Accurate and rapid characterization of influenza A virus (IAV) hemagglutinin (HA) and neuraminidase (NA) sequences with respect to subtype and clade is at the basis of extended diagnostic services and implicit to molecular epidemiologic studies. ClassyFlu is a new tool and web service for the classification of IAV sequences of the HA and NA gene into subtypes and phylogenetic clades using discriminatively trained profile hidden Markov models (HMMs), one for each subtype or clade. ClassyFlu merely requires as input unaligned, full-length or partial HA or NA DNA sequences. It enables rapid and highly accurate assignment of HA sequences to subtypes H1–H17 but particularly focusses on the finer grained assignment of sequences of highly pathogenic avian influenza viruses of subtype H5N1 according to the cladistics proposed by the H5N1 Evolution Working Group. NA sequences are classified into subtypes N1–N10. ClassyFlu was compared to semiautomatic classification approaches using BLAST and phylogenetics and additionally for H5 sequences to the new “Highly Pathogenic H5N1 Clade Classification Tool” (IRD-CT) proposed by the Influenza Research Database. Our results show that both web tools (ClassyFlu and IRD-CT), although based on different methods, are nearly equivalent in performance and both are more accurate and faster than semiautomatic classification. A retraining of ClassyFlu to altered cladistics as well as an extension of ClassyFlu to other IAV genome segments or fragments thereof is undemanding. This is exemplified by unambiguous assignment to a distinct cluster within subtype H7 of sequences of H7N9 viruses which emerged in China early in 2013 and caused more than 130 human infections.

Authors:
 Van der Auwera, Sandra [1]; Bulla, Ingo [2]; Ziller, Mario [3]; Pohlmann, Anne [3]; Harder, Timm [3]; Stanke, Mario [1]
  1. Ernst-Moritz-Arndt Universität Greifswald (Germany)
  2. Ernst-Moritz-Arndt Universität Greifswald (Germany); Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
  3. Federal Research Institute for Animal Health, Island of Riems, Greifswald (Germany)
Publication Date:
2014-01-03
Research Org.:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER); German Science Foundation (DFG)
OSTI Identifier:
1627669
Grant/Contract Number:  
AC52-06NA25396; STA 1009/5-1
Resource Type:
Accepted Manuscript
Journal Name:
PLoS ONE
Additional Journal Information:
Journal Volume: 9; Journal Issue: 1; Journal ID: ISSN 1932-6203
Publisher:
Public Library of Science
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; hidden Markov models; H5N1; BLAST algorithm; influenza A virus; phylogenetic analysis; sequence databases; animal phylogenetics; viral pathogens

Citation Formats

Van der Auwera, Sandra, Bulla, Ingo, Ziller, Mario, Pohlmann, Anne, Harder, Timm, and Stanke, Mario. ClassyFlu: Classification of Influenza A Viruses with Discriminatively Trained Profile-HMMs. United States: N. p., 2014. Web. doi:10.1371/journal.pone.0084558.
Van der Auwera, Sandra, Bulla, Ingo, Ziller, Mario, Pohlmann, Anne, Harder, Timm, & Stanke, Mario. ClassyFlu: Classification of Influenza A Viruses with Discriminatively Trained Profile-HMMs. United States. https://doi.org/10.1371/journal.pone.0084558
Van der Auwera, Sandra, Bulla, Ingo, Ziller, Mario, Pohlmann, Anne, Harder, Timm, and Stanke, Mario. 2014. "ClassyFlu: Classification of Influenza A Viruses with Discriminatively Trained Profile-HMMs". United States. https://doi.org/10.1371/journal.pone.0084558. https://www.osti.gov/servlets/purl/1627669.
@article{osti_1627669,
title = {ClassyFlu: Classification of Influenza A Viruses with Discriminatively Trained Profile-HMMs},
author = {Van der Auwera, Sandra and Bulla, Ingo and Ziller, Mario and Pohlmann, Anne and Harder, Timm and Stanke, Mario},
abstractNote = {Accurate and rapid characterization of influenza A virus (IAV) hemagglutinin (HA) and neuraminidase (NA) sequences with respect to subtype and clade is at the basis of extended diagnostic services and implicit to molecular epidemiologic studies. ClassyFlu is a new tool and web service for the classification of IAV sequences of the HA and NA gene into subtypes and phylogenetic clades using discriminatively trained profile hidden Markov models (HMMs), one for each subtype or clade. ClassyFlu merely requires as input unaligned, full-length or partial HA or NA DNA sequences. It enables rapid and highly accurate assignment of HA sequences to subtypes H1–H17 but particularly focusses on the finer grained assignment of sequences of highly pathogenic avian influenza viruses of subtype H5N1 according to the cladistics proposed by the H5N1 Evolution Working Group. NA sequences are classified into subtypes N1–N10. ClassyFlu was compared to semiautomatic classification approaches using BLAST and phylogenetics and additionally for H5 sequences to the new “Highly Pathogenic H5N1 Clade Classification Tool” (IRD-CT) proposed by the Influenza Research Database. Our results show that both web tools (ClassyFlu and IRD-CT), although based on different methods, are nearly equivalent in performance and both are more accurate and faster than semiautomatic classification. A retraining of ClassyFlu to altered cladistics as well as an extension of ClassyFlu to other IAV genome segments or fragments thereof is undemanding. This is exemplified by unambiguous assignment to a distinct cluster within subtype H7 of sequences of H7N9 viruses which emerged in China early in 2013 and caused more than 130 human infections.},
doi = {10.1371/journal.pone.0084558},
journal = {PLoS ONE},
number = 1,
volume = 9,
place = {United States},
year = {2014},
month = jan
}

Works referenced in this record:

Cross-species comparison of site-specific evolutionary-rate variation in influenza haemagglutinin
journal, March 2013

  • Meyer, Austin G.; Dawson, Eric T.; Wilke, Claus O.
  • Philosophical Transactions of the Royal Society B: Biological Sciences, Vol. 368, Issue 1614
  • DOI: 10.1098/rstb.2012.0334

Crystal structures of two subtype N10 neuraminidase-like proteins from bat influenza A viruses reveal a diverged putative active site
journal, September 2012

  • Zhu, Xueyong; Yang, Hua; Guo, Zhu
  • Proceedings of the National Academy of Sciences, Vol. 109, Issue 46
  • DOI: 10.1073/pnas.1212579109

The emergence of pandemic influenza viruses
journal, January 2010


The genesis and source of the H7N9 influenza viruses causing human infections in China
journal, August 2013

  • Lam, Tommy Tsan-Yuk; Wang, Jia; Shen, Yongyi
  • Nature, Vol. 502, Issue 7470
  • DOI: 10.1038/nature12515

The Evolving Threat of Influenza Viruses of Animal Origin and the Challenges in Developing Appropriate Diagnostics
journal, November 2012


New Algorithms and Methods to Estimate Maximum-Likelihood Phylogenies: Assessing the Performance of PhyML 3.0
journal, March 2010

  • Guindon, Stéphane; Dufayard, Jean-François; Lefort, Vincent
  • Systematic Biology, Vol. 59, Issue 3
  • DOI: 10.1093/sysbio/syq010

MUSCLE: multiple sequence alignment with high accuracy and high throughput
journal, March 2004

  • Edgar, R. C.
  • Nucleic Acids Research, Vol. 32, Issue 5, p. 1792-1797
  • DOI: 10.1093/nar/gkh340

Accelerated Profile HMM Searches
journal, October 2011


Approximate Likelihood-Ratio Test for Branches: A Fast, Accurate, and Powerful Alternative
journal, August 2006


Rapid haemagglutinin subtyping and pathotyping of avian influenza viruses by a DNA microarray
journal, September 2009


Jalview Version 2--a multiple sequence alignment editor and analysis workbench
journal, January 2009


MEGA5: Molecular Evolutionary Genetics Analysis Using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods
journal, May 2011

  • Tamura, K.; Peterson, D.; Peterson, N.
  • Molecular Biology and Evolution, Vol. 28, Issue 10
  • DOI: 10.1093/molbev/msr121

Basic local alignment search tool
journal, October 1990

  • Altschul, Stephen F.; Gish, Warren; Miller, Webb
  • Journal of Molecular Biology, Vol. 215, Issue 3, p. 403-410
  • DOI: 10.1016/S0022-2836(05)80360-2

Influenza Research Database: an integrated bioinformatics resource for influenza research and surveillance: Influenza Research Database
journal, January 2012


Avian influenza: our current understanding
journal, June 2010


The neuraminidase of bat influenza viruses is not a neuraminidase
journal, October 2012

  • García-Sastre, Adolfo
  • Proceedings of the National Academy of Sciences, Vol. 109, Issue 46
  • DOI: 10.1073/pnas.1215857109

Works referencing / citing this record:

Overview of Virus Metagenomic Classification Methods and Their Biological Applications
journal, April 2018


Identification and characterization of Coronaviridae genomes from Vietnamese bats and rats based on conserved protein domains
journal, July 2018

  • Phan, My V. T.; Ngo Tri, Tue; Hong Anh, Pham
  • Virus Evolution, Vol. 4, Issue 2
  • DOI: 10.1093/ve/vey035

Identification and characterization of Coronaviridae genomes from Vietnamese bats and rats based on conserved protein domains.
text, January 2018

  • Phan, My VT; Ngo Tri, Tue; Hong Anh, Pham
  • Apollo - University of Cambridge Repository
  • DOI: 10.17863/cam.69918
