DOE PAGES
U.S. Department of Energy, Office of Scientific and Technical Information

Title: iMicrobe: Tools and data-driven discovery platform for the microbiome sciences

Journal Article · GigaScience
[1]; [2]; [2]; [3]; [2]; [4]; [4]; [5]
  1. Univ. of Arizona, Tucson, AZ (United States). Dept. of Biosystems Engineering; DOE/OSTI
  2. Univ. of Arizona, Tucson, AZ (United States). Dept. of Biosystems Engineering
  3. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Environmental Genomics and Systems Biology Division
  4. Univ. of Arizona, Tucson, AZ (United States). Dept. of Computer Science
  5. Univ. of Arizona, Tucson, AZ (United States). Dept. of Biosystems Engineering; Univ. of Arizona, Tucson, AZ (United States). BIO5 Inst.

Background: Scientists have amassed a wealth of microbiome datasets, making it possible to study microbes in biotic and abiotic systems on a population or planetary scale; however, this potential has not been fully realized given that the tools, datasets, and computation are available in diverse repositories and locations. To address this challenge, we developed iMicrobe.us, a community-driven microbiome data marketplace and tool exchange for users to integrate their own data and tools with those from the broader community.

Findings: The iMicrobe platform brings together analysis tools and microbiome datasets by leveraging National Science Foundation–supported cyberinfrastructure and computing resources from CyVerse, Agave, and XSEDE. The primary purpose of iMicrobe is to provide users with a freely available, web-based platform to (1) maintain and share project data, metadata, and analysis products, (2) search for related public datasets, and (3) use and publish bioinformatics tools that run on highly scalable computing resources. Analysis tools are implemented in containers that encapsulate complex software dependencies and run on freely available XSEDE resources via the Agave API, which can retrieve datasets from the CyVerse Data Store or any web-accessible location (e.g., FTP, HTTP).

Conclusions: iMicrobe promotes data integration, sharing, and community-driven tool development by making open source data and tools accessible to the research community in a web-based platform.

Research Organization:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER). Biological Systems Science Division
Grant/Contract Number:
AC02-05CH11231
OSTI ID:
1625348
Journal Information:
Journal Name: GigaScience; Journal Volume: 8; Journal Issue: 7; ISSN 2047-217X
Publisher:
BioMed Central
Country of Publication:
United States
Language:
English

Cited By (3)

PuMA: A papillomavirus genome annotation tool journal July 2020
Identification and quantitation of clinically relevant microbes in patient samples: Comparison of three k-mer based classifiers for speed, accuracy, and sensitivity journal November 2019