Fizzy: feature subset selection for metagenomics
Abstract
Background: Some of the current software tools for comparative metagenomics let ecologists investigate and explore bacterial communities using α- and β-diversity. Feature subset selection, a sub-field of machine learning, can also provide unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can identify the operational taxonomic units (OTUs), or functional features, that have a high level of influence on the condition being studied. For example, in a previous study we used information-theoretic feature selection to find the protein family abundances that best discriminate between age groups in the human gut microbiome.
Results: We have developed a new Python command line tool for microbial ecologists that is compatible with the widely adopted BIOM format and implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tool's capabilities on publicly available datasets.
Conclusions: We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy.
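As a rough illustration of what information-theoretic feature selection means in this setting, the sketch below ranks features by their empirical mutual information with the class label (often called mutual information maximization). It is not Fizzy's code or interface: the function names, the toy presence/absence table, and the age-group labels are all hypothetical, and this record does not say which selection criteria Fizzy implements.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y), in bits, of two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * log2(count * n / (px[x] * py[y]))
    return mi


def rank_features(table, labels, k):
    """Return the names of the k features sharing the most information with labels."""
    scores = {name: mutual_information(values, labels)
              for name, values in table.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: presence/absence of three OTUs across six samples.
toy_table = {
    "OTU_a": [1, 1, 1, 0, 0, 0],  # tracks the label exactly
    "OTU_b": [1, 0, 1, 0, 1, 0],  # only weakly related to the label
    "OTU_c": [1, 1, 1, 1, 1, 1],  # constant, so it carries no information
}
toy_labels = ["young", "young", "young", "old", "old", "old"]

top = rank_features(toy_table, toy_labels, k=2)
print(top)
```

Scoring each feature independently, as done here, ignores redundancy between features; criteria that also penalize redundancy exist, but they are beyond this sketch.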
- Authors:
- Ditzler, Gregory; Morrison, J. Calvin; Lan, Yemin; Rosen, Gail L.
- Publication Date:
- 2015-11-04
- Research Org.:
- Kent State Univ., Kent, OH (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC)
- OSTI Identifier:
- 1618520
- Alternate Identifier(s):
- OSTI ID: 1242040
- Grant/Contract Number:
- SC004335; SC0004335
- Resource Type:
- Published Article
- Journal Name:
- BMC Bioinformatics
- Additional Journal Information:
- Journal Name: BMC Bioinformatics; Journal Volume: 16; Journal Issue: 1; Journal ID: ISSN 1471-2105
- Publisher:
- Springer Science + Business Media
- Country of Publication:
- United Kingdom
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING; 59 BASIC BIOLOGICAL SCIENCES; Feature subset selection; Comparative metagenomics; Open-source software
Citation Formats
Ditzler, Gregory, Morrison, J. Calvin, Lan, Yemin, and Rosen, Gail L. Fizzy: feature subset selection for metagenomics. United Kingdom: N. p., 2015. Web. doi:10.1186/s12859-015-0793-8.
Ditzler, Gregory, Morrison, J. Calvin, Lan, Yemin, & Rosen, Gail L. Fizzy: feature subset selection for metagenomics. United Kingdom. https://doi.org/10.1186/s12859-015-0793-8
Ditzler, Gregory, Morrison, J. Calvin, Lan, Yemin, and Rosen, Gail L. 2015. "Fizzy: feature subset selection for metagenomics". United Kingdom. https://doi.org/10.1186/s12859-015-0793-8.
@article{osti_1618520,
title = {Fizzy: feature subset selection for metagenomics},
author = {Ditzler, Gregory and Morrison, J. Calvin and Lan, Yemin and Rosen, Gail L.},
abstractNote = {Background: Some of the current software tools for comparative metagenomics let ecologists investigate and explore bacterial communities using α- and β-diversity. Feature subset selection, a sub-field of machine learning, can also provide unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can identify the operational taxonomic units (OTUs), or functional features, that have a high level of influence on the condition being studied. For example, in a previous study we used information-theoretic feature selection to find the protein family abundances that best discriminate between age groups in the human gut microbiome. Results: We have developed a new Python command line tool for microbial ecologists that is compatible with the widely adopted BIOM format and implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tool's capabilities on publicly available datasets. Conclusions: We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy.},
doi = {10.1186/s12859-015-0793-8},
journal = {BMC Bioinformatics},
number = 1,
volume = 16,
place = {United Kingdom},
year = {2015},
month = {11}
}
https://doi.org/10.1186/s12859-015-0793-8
Works referenced in this record:
Feature Selection with the Boruta Package
journal, January 2010
- Kursa, Miron B.; Rudnicki, Witold R.
- Journal of Statistical Software, Vol. 36, Issue 11
Metagenomic biomarker discovery and explanation
journal, January 2011
- Segata, Nicola; Izard, Jacques; Waldron, Levi
- Genome Biology, Vol. 12, Issue 6
A Bootstrap Based Neyman-Pearson Test for Identifying Variable Importance
journal, April 2015
- Ditzler, Gregory; Polikar, Robi; Rosen, Gail
- IEEE Transactions on Neural Networks and Learning Systems, Vol. 26, Issue 4
The metagenomics RAST server – a public resource for the automatic phylogenetic and functional analysis of metagenomes
journal, September 2008
- Meyer, F.; Paarmann, D.; D'Souza, M.
- BMC Bioinformatics, Vol. 9, Issue 1
The Role of N-Acetyltransferase 2 Polymorphism in the Etiopathogenesis of Inflammatory Bowel Disease
journal, February 2011
- Baranska, M.; Trzcinski, R.; Dziki, A.
- Digestive Diseases and Sciences, Vol. 56, Issue 7
Scaling a Neyman-Pearson subset selection approach via heuristics for mining massive data
conference, December 2014
- Ditzler, Gregory; Austen, Matthew; Rosen, Gail
- 2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM)
Association of dietary type with fecal microbiota in vegetarians and omnivores in Slovenia
journal, October 2013
- Matijašić, Bojana Bogovič; Obermajer, Tanja; Lipoglavšek, Luka
- European Journal of Nutrition, Vol. 53, Issue 4
Senior Thai Fecal Microbiota Comparison Between Vegetarians and Non-Vegetarians Using PCR-DGGE and Real-Time PCR
journal, August 2014
- Ruengsomwong, Supatjaree; Korenori, Yuki; Sakamoto, Naoshige
- Journal of Microbiology and Biotechnology, Vol. 24, Issue 8
The role of glycosylation in IBD
journal, June 2014
- Theodoratou, Evropi; Campbell, Harry; Ventham, Nicholas T.
- Nature Reviews Gastroenterology & Hepatology, Vol. 11, Issue 10
A core gut microbiome in obese and lean twins
journal, November 2008
- Turnbaugh, Peter J.; Hamady, Micah; Yatsunenko, Tanya
- Nature, Vol. 457, Issue 7228
The NIH Human Microbiome Project
journal, October 2009
- Peterson, J.; Garges, S.; Giovanni, M.
- Genome Research, Vol. 19, Issue 12
Meeting Report: The Terabase Metagenomics Workshop and the Vision of an Earth Microbiome Project
journal, January 2010
- Gilbert, Jack A.; Meyer, Folker; Antonopoulos, Dion
- Standards in Genomic Sciences, Vol. 3, Issue 3
The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome
journal, July 2012
- McDonald, Daniel; Clemente, Jose C.; Kuczynski, Justin
- GigaScience, Vol. 1, Issue 1
The Health Advantage of a Vegan Diet: Exploring the Gut Microbiota Connection
journal, October 2014
- Glick-Bauer, Marian; Yeh, Ming-Chin
- Nutrients, Vol. 6, Issue 11
A human gut microbial gene catalogue established by metagenomic sequencing
journal, March 2010
- Qin, Junjie; Li, Ruiqiang; Raes, Jeroen
- Nature, Vol. 464, Issue 7285
QIIME allows analysis of high-throughput community sequencing data
journal, April 2010
- Caporaso, J. Gregory; Kuczynski, Justin; Stombaugh, Jesse
- Nature Methods, Vol. 7, Issue 5
Linking Long-Term Dietary Patterns with Gut Microbial Enterotypes
journal, September 2011
- Wu, G. D.; Chen, J.; Hoffmann, C.
- Science, Vol. 334, Issue 6052
Moving pictures of the human microbiome
journal, January 2011
- Caporaso, J. Gregory; Lauber, Christian L.; Costello, Elizabeth K.
- Genome Biology, Vol. 12, Issue 5
Works referencing / citing this record:
Opportunities and obstacles for deep learning in biology and medicine
journal, April 2018
- Ching, Travers; Himmelstein, Daniel S.; Beaulieu-Jones, Brett K.
- Journal of The Royal Society Interface, Vol. 15, Issue 141
The parameter sensitivity of random forests
journal, September 2016
- Huang, Barbara F. F.; Boutros, Paul C.
- BMC Bioinformatics, Vol. 17, Issue 1
Taxonomy-aware feature engineering for microbiome classification
journal, June 2018
- Oudah, Mai; Henschel, Andreas
- BMC Bioinformatics, Vol. 19, Issue 1
Machine Learning Meta-analysis of Large Metagenomic Datasets: Tools and Biological Insights
journal, July 2016
- Pasolli, Edoardo; Truong, Duy Tin; Malik, Faizan
- PLOS Computational Biology, Vol. 12, Issue 7
Biomarker discovery in inflammatory bowel diseases using network-based feature selection
journal, November 2019
- Abbas, Mostafa; Matta, John; Le, Thanh
- PLOS ONE, Vol. 14, Issue 11
A Review and Tutorial of Machine Learning Methods for Microbiome Host Trait Prediction
journal, June 2019
- Zhou, Yi-Hui; Gallins, Paul
- Frontiers in Genetics, Vol. 10
Combination of Ensembles of Regularized Regression Models with Resampling-Based Lasso Feature Selection in High Dimensional Data
journal, January 2020
- Patil, Abhijeet R.; Kim, Sangjin
- Mathematics, Vol. 8, Issue 1