SciTech Connect

Title: Fizzy: feature subset selection for metagenomics

Background: Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α- and β-diversity. Feature subset selection, a sub-field of machine learning, can also provide unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can identify the operational taxonomic units (OTUs), or functional features, that have a high level of influence on the condition being studied. For example, in a previous study we used information-theoretic feature selection to understand the differences between protein family abundances that best discriminate between age groups in the human gut microbiome. Results: We have developed a new Python command-line tool for microbial ecologists, compatible with the widely adopted BIOM format, that implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tool's capabilities on publicly available datasets. Conclusions: We have made the software implementation of Fizzy available to the public under the GNU GPL. The standalone implementation can be found at
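
As a rough illustration of what information-theoretic feature selection means here, the sketch below ranks features by their estimated mutual information with the class label (often called the mutual information maximization criterion). It is not taken from Fizzy and does not use its interface; the function names, the OTU names, and the toy presence/absence data are all hypothetical, and the record above does not say which criteria the tool implements.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Estimate I(X; Y) in bits from two equal-length sequences of discrete values."""
    n = len(x)
    joint = Counter(zip(x, y))  # counts of each (x, y) pair
    px = Counter(x)             # marginal counts of x
    py = Counter(y)             # marginal counts of y
    mi = 0.0
    for (xv, yv), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written in terms of counts
        mi += (c / n) * log2(c * n / (px[xv] * py[yv]))
    return mi


def rank_features(table, labels, k):
    """Return the k feature names with the highest mutual information with labels."""
    scores = {name: mutual_information(values, labels)
              for name, values in table.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: presence (1) or absence (0) of three OTUs in six samples.
table = {
    "OTU_a": [1, 1, 1, 0, 0, 0],
    "OTU_b": [1, 0, 1, 0, 1, 0],
    "OTU_c": [1, 1, 0, 0, 0, 0],
}
labels = ["infant", "infant", "infant", "adult", "adult", "adult"]

top = rank_features(table, labels, 2)
print(top)
```

In this toy table, OTU_a separates the two groups perfectly and so carries one full bit of information about the label, while OTU_b is close to uninformative. Real abundance data would first need to be discretized, and criteria that also penalize redundancy between selected features are common alternatives to this simple ranking.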
 [1] ;  [2] ;  [2] ;  [2]
  1. Univ. of Arizona, Tucson, AZ (United States)
  2. Drexel Univ., Philadelphia, PA (United States)
Publication Date:
OSTI Identifier:
Grant/Contract Number:
Accepted Manuscript
Journal Name:
BMC Bioinformatics
Additional Journal Information:
Journal Volume: 16; Journal Issue: 1; Journal ID: ISSN 1471-2105
BioMed Central
Research Org:
Kent State Univ., Kent, OH (United States)
Sponsoring Org:
USDOE Office of Science (SC)
Country of Publication:
United States
97 MATHEMATICS AND COMPUTING; 59 BASIC BIOLOGICAL SCIENCES; Feature subset selection; Comparative metagenomics; Open-source software