U.S. Department of Energy
Find the latest in DOE-sponsored Scientific and Technical Software
MetaBAT: Assembling individual genomes from shotgun metagenomic sequences derived from complex microbial communities remains one of the most challenging problems in bioinformatics. Because directly assembling full-length genomes is impractical, a first step called metagenome binning, which groups contigs from the same organism, has been developed to provide insight into individual organisms. However, current binning methods perform poorly on large, complex communities, and as a result they fail to recover many novel genomes. To overcome this limitation, we developed integrated software, called MetaBAT, which automatically forms hundreds of individual genome bins from metagenome contigs. Probabilistic models of abundance and tetranucleotide frequency were trained through extensive empirical studies and integrated to iteratively decide the bin membership of contigs. To test its performance, we applied MetaBAT to both synthetic and several large-scale real-world metagenome datasets. Using two independent metrics, we demonstrate that in all the datasets tested MetaBAT achieves good sensitivity (16-87%) and very high specificity (56-99%) in forming genome bins. Further analyses of the novel genomes recovered from the human gut microbiome suggest that a subset of these genomes is potentially associated with pathological conditions. In conclusion, we believe MetaBAT is a powerful tool
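
To make the tetranucleotide frequency signal mentioned in the abstract concrete, here is a minimal Python sketch that counts 4-mers in a contig and compares two contigs by their profiles. It is not MetaBAT's implementation: the function names `tetranucleotide_frequencies` and `profile_distance` are hypothetical, the Euclidean distance is a simple stand-in for the empirically trained probabilistic models the abstract describes, and the sketch neither merges reverse-complement 4-mers nor uses abundance.

```python
from collections import Counter
from itertools import product
import math

# All 256 possible tetranucleotides over A, C, G, T, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(contig):
    """Return the normalized frequency of each 4-mer in a contig.

    Hypothetical helper for illustration only.
    """
    seq = contig.upper()
    # Slide a window of length 4 across the sequence, one base at a time.
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    # Only windows made entirely of A/C/G/T are counted; windows with
    # ambiguous bases (such as N) are ignored.
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


def profile_distance(a, b):
    """Euclidean distance between two frequency profiles.

    An assumed stand-in, not the probabilistic distance used by MetaBAT.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Contigs from the same genome tend to have similar 4-mer profiles, so a small `profile_distance` between two sufficiently long contigs is weak evidence that they belong in the same bin. A real binner would combine this with abundance (coverage) information, as the abstract describes.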