U.S. Department of Energy
Office of Scientific and Technical Information

ClassyFlu: Classification of Influenza A Viruses with Discriminatively Trained Profile-HMMs

Journal Article · PLoS ONE
 [1];  [2];  [3];  [3];  [3];  [4]
  1. Ernst-Moritz-Arndt-Universität Greifswald (Germany); DOE/OSTI
  2. Ernst-Moritz-Arndt-Universität Greifswald (Germany); Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
  3. Federal Research Institute for Animal Health, Island of Riems, Greifswald (Germany)
  4. Ernst-Moritz-Arndt-Universität Greifswald (Germany)
Accurate and rapid characterization of influenza A virus (IAV) hemagglutinin (HA) and neuraminidase (NA) sequences with respect to subtype and clade forms the basis of extended diagnostic services and is integral to molecular epidemiologic studies. ClassyFlu is a new tool and web service for the classification of IAV sequences of the HA and NA genes into subtypes and phylogenetic clades using discriminatively trained profile hidden Markov models (HMMs), one for each subtype or clade. ClassyFlu requires as input only unaligned, full-length or partial HA or NA DNA sequences. It enables rapid and highly accurate assignment of HA sequences to subtypes H1–H17 but focuses particularly on the finer-grained assignment of sequences of highly pathogenic avian influenza viruses of subtype H5N1 according to the cladistics proposed by the H5N1 Evolution Working Group. NA sequences are classified into subtypes N1–N10. ClassyFlu was compared to semiautomatic classification approaches using BLAST and phylogenetics and, for H5 sequences, additionally to the new “Highly Pathogenic H5N1 Clade Classification Tool” (IRD-CT) proposed by the Influenza Research Database. Our results show that both web tools (ClassyFlu and IRD-CT), although based on different methods, are nearly equivalent in performance and that both are more accurate and faster than semiautomatic classification. Retraining ClassyFlu for altered cladistics, as well as extending it to other IAV genome segments or fragments thereof, is undemanding. This is exemplified by the unambiguous assignment of sequences of H7N9 viruses, which emerged in China early in 2013 and caused more than 130 human infections, to a distinct cluster within subtype H7.
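The "one model per subtype or clade" idea in the abstract can be illustrated with a minimal sketch: score an unaligned query against every class model and assign it to the best-scoring class. The sketch below is not ClassyFlu's method. It swaps the discriminatively trained profile HMMs for a simple smoothed k-mer frequency model per class, and the class labels, training sequences, and function names (`train_model`, `score`, `classify`) are all hypothetical toy values chosen for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def train_model(seqs, k=3, pseudocount=1.0):
    """Estimate smoothed k-mer log-probabilities from one class's sequences.

    Toy stand-in for a per-class model; NOT a profile HMM and not
    discriminatively trained.
    """
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    # Add-one style smoothing over all possible k-mers of the alphabet.
    total = sum(counts.values()) + pseudocount * len(ALPHABET) ** k
    default = math.log(pseudocount / total)  # log-probability of unseen k-mers
    table = {kmer: math.log((c + pseudocount) / total)
             for kmer, c in counts.items()}
    return table, default, k


def score(model, seq):
    """Sum the log-probabilities of all k-mers in an unaligned query."""
    table, default, k = model
    return sum(table.get(seq[i:i + k], default)
               for i in range(len(seq) - k + 1))


def classify(models, seq):
    """Return the label of the best-scoring class model and all scores."""
    scores = {label: score(m, seq) for label, m in models.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy training data; real use would need curated HA/NA sequences.
toy_training = {
    "classA": ["ACGTACGTACGT", "ACGTACGAACGT"],
    "classB": ["TTGGCCTTGGCC", "TTGGCCTTGGCA"],
}
models = {label: train_model(seqs) for label, seqs in toy_training.items()}

label, scores = classify(models, "ACGTACG")  # a short, partial query
print(label, scores)
```

Because each class is scored independently, adding or retraining a class in this sketch only means fitting one more model, which mirrors the abstract's remark that retraining for altered cladistics is undemanding.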
Research Organization:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Organization:
German Science Foundation (DFG); USDOE Office of Science (SC), Biological and Environmental Research (BER)
Grant/Contract Number:
AC52-06NA25396
OSTI ID:
1627669
Journal Information:
Journal Name: PLoS ONE; Journal Volume: 9; Journal Issue: 1; ISSN 1932-6203
Publisher:
Public Library of Science
Country of Publication:
United States
Language:
English
