DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Fizzy: feature subset selection for metagenomics

Abstract

Background: Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α- and β-diversity. Feature subset selection, a sub-field of machine learning, can also provide unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can identify the operational taxonomic units (OTUs), or functional features, that have a high level of influence on the condition being studied. For example, in a previous study we used information-theoretic feature selection to identify the protein family abundances that best discriminate between age groups in the human gut microbiome.

Results: We have developed a new Python command-line tool for microbial ecologists that implements information-theoretic subset selection methods for biological data formats and is compatible with the widely adopted BIOM format. We demonstrate the software tool's capabilities on publicly available datasets.

Conclusions: We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy.
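As an illustration of the information-theoretic feature selection mentioned in the abstract, the following is a minimal sketch of mutual-information ranking: each feature is scored by its estimated mutual information with the class label, and the top-scoring features are kept. This is not Fizzy's implementation or API. The function names (`mutual_information`, `rank_features`), the toy OTU table, and the assumption that abundances have already been discretized are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Estimate I(X;Y) in bits from two equal-length sequences of discrete values."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of feature values
    py = Counter(ys)            # marginal counts of class labels
    pxy = Counter(zip(xs, ys))  # joint counts
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written in terms of counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def rank_features(table, labels, k):
    """Return the k feature names with the highest mutual information with labels.

    table maps a feature name to a list of discretized abundances, one per sample.
    """
    scores = {name: mutual_information(vals, labels) for name, vals in table.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: three OTUs measured in six samples from two age groups.
table = {
    "OTU_a": [0, 0, 0, 1, 1, 1],  # tracks the label exactly
    "OTU_b": [0, 1, 0, 1, 0, 1],  # only weakly related to the label
    "OTU_c": [1, 1, 1, 1, 1, 1],  # constant, carries no information
}
labels = ["young", "young", "young", "old", "old", "old"]

top = rank_features(table, labels, 2)
print(top)  # ['OTU_a', 'OTU_b']
```

Ranking features independently like this ignores redundancy between them. Information-theoretic methods that also penalize redundant features exist, but which criteria Fizzy exposes is not stated in this record.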

Authors:
Ditzler, Gregory; Morrison, J. Calvin; Lan, Yemin; Rosen, Gail L.
Publication Date:
2015-11-04
Research Org.:
Kent State Univ., Kent, OH (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1618520
Alternate Identifier(s):
OSTI ID: 1242040
Grant/Contract Number:  
SC004335; SC0004335
Resource Type:
Published Article
Journal Name:
BMC Bioinformatics
Additional Journal Information:
Journal Name: BMC Bioinformatics; Journal Volume: 16; Journal Issue: 1; Journal ID: ISSN 1471-2105
Publisher:
Springer Science + Business Media
Country of Publication:
United Kingdom
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; 59 BASIC BIOLOGICAL SCIENCES; Feature subset selection; Comparative metagenomics; Open-source software

Citation Formats

Ditzler, Gregory, Morrison, J. Calvin, Lan, Yemin, and Rosen, Gail L. Fizzy: feature subset selection for metagenomics. United Kingdom: N. p., 2015. Web. doi:10.1186/s12859-015-0793-8.
Ditzler, Gregory, Morrison, J. Calvin, Lan, Yemin, & Rosen, Gail L. Fizzy: feature subset selection for metagenomics. United Kingdom. https://doi.org/10.1186/s12859-015-0793-8
Ditzler, Gregory, Morrison, J. Calvin, Lan, Yemin, and Rosen, Gail L. 2015. "Fizzy: feature subset selection for metagenomics". United Kingdom. https://doi.org/10.1186/s12859-015-0793-8.
@article{osti_1618520,
title = {Fizzy: feature subset selection for metagenomics},
author = {Ditzler, Gregory and Morrison, J. Calvin and Lan, Yemin and Rosen, Gail L.},
abstractNote = {Background: Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α- and β-diversity. Feature subset selection, a sub-field of machine learning, can also provide unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can identify the operational taxonomic units (OTUs), or functional features, that have a high level of influence on the condition being studied. For example, in a previous study we used information-theoretic feature selection to identify the protein family abundances that best discriminate between age groups in the human gut microbiome. Results: We have developed a new Python command-line tool for microbial ecologists that implements information-theoretic subset selection methods for biological data formats and is compatible with the widely adopted BIOM format. We demonstrate the software tool's capabilities on publicly available datasets. Conclusions: We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy.},
doi = {10.1186/s12859-015-0793-8},
journal = {BMC Bioinformatics},
number = 1,
volume = 16,
place = {United Kingdom},
year = {2015},
month = {11}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record
https://doi.org/10.1186/s12859-015-0793-8

Citation Metrics:
Cited by: 29 works
Citation information provided by
Web of Science
