MyTaxa: an advanced taxonomic classifier for genomic and metagenomic sequences
- Georgia Institute of Technology, Atlanta, GA (United States)
Determining the taxonomic affiliation of sequences assembled from metagenomes remains a major bottleneck that affects research across the fields of environmental, clinical and evolutionary microbiology. Here, we introduce MyTaxa, a homology-based bioinformatics framework to classify metagenomic and genomic sequences with unprecedented accuracy. The distinguishing aspect of MyTaxa is that it employs all genes present in an unknown sequence as classifiers, weighting each gene based on its (predetermined) classifying power at a given taxonomic level and frequency of horizontal gene transfer. MyTaxa also implements a novel classification scheme based on the genome-aggregate average amino acid identity concept to determine the degree of novelty of sequences representing uncharacterized taxa, i.e. whether they represent novel species, genera or phyla. Application of MyTaxa on in silico generated (mock) and real metagenomes of varied read length (100–2000 bp) revealed that it correctly classified at least 5% more sequences than any other tool. The analysis also showed that ~10% of the assembled sequences from human gut metagenomes represent novel species with no sequenced representatives, several of which were highly abundant in situ such as members of the Prevotella genus. Thus, MyTaxa can find several important applications in microbial identification and diversity studies.
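The gene-weighting idea in the abstract can be illustrated with a minimal sketch. This is not MyTaxa's implementation or its scoring formula: the function name `classify_by_weighted_votes`, the input layout, the `min_share` cutoff, and the example weights are all hypothetical, chosen only to show how per-gene weights might be combined into a single call for a sequence.

```python
from collections import defaultdict


def classify_by_weighted_votes(gene_hits, min_share=0.5):
    """Combine per-gene taxon votes into one call for a sequence.

    gene_hits: list of (taxon, weight) pairs, one per gene on the sequence.
               Each weight is a hypothetical stand-in for a gene's
               predetermined classifying power at one taxonomic level.
    min_share: hypothetical cutoff; the winning taxon must hold at least
               this fraction of the total weight to be reported.
    Returns (taxon_or_None, share_of_total_weight).
    """
    totals = defaultdict(float)
    for taxon, weight in gene_hits:
        totals[taxon] += weight

    total_weight = sum(totals.values())
    if total_weight == 0:
        # No genes, or only zero-weight genes: nothing to classify on.
        return None, 0.0

    best = max(totals, key=totals.get)
    share = totals[best] / total_weight
    return (best if share >= min_share else None), share


# Illustrative input only: three genes on one contig, with made-up weights.
hits = [("Prevotella", 0.5), ("Prevotella", 0.25), ("Bacteroides", 0.25)]
genus, share = classify_by_weighted_votes(hits)
```

In this toy input, two genes vote for one genus and one gene votes for another, so the call depends on the summed weights rather than on any single best hit. A sequence whose leading taxon falls below the cutoff is left unclassified at that level.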
- Research Organization:
- Univ. of Oklahoma, Norman, OK (United States); Georgia Institute of Technology, Atlanta, GA (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Biological and Environmental Research (BER); National Science Foundation (NSF)
- Grant/Contract Number:
- SC0004601; 1241046
- OSTI ID:
- 1904533
- Journal Information:
- Nucleic Acids Research, Vol. 42, Issue 8; ISSN 0305-1048
- Publisher:
- Oxford University Press
- Country of Publication:
- United States
- Language:
- English