DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: iMicrobe: Tools and data-driven discovery platform for the microbiome sciences

Abstract

Background: Scientists have amassed a wealth of microbiome datasets, making it possible to study microbes in biotic and abiotic systems on a population or planetary scale; however, this potential has not been fully realized given that the tools, datasets, and computation are available in diverse repositories and locations. To address this challenge, we developed iMicrobe.us, a community-driven microbiome data marketplace and tool exchange for users to integrate their own data and tools with those from the broader community.

Findings: The iMicrobe platform brings together analysis tools and microbiome datasets by leveraging National Science Foundation–supported cyberinfrastructure and computing resources from CyVerse, Agave, and XSEDE. The primary purpose of iMicrobe is to provide users with a freely available, web-based platform to (1) maintain and share project data, metadata, and analysis products, (2) search for related public datasets, and (3) use and publish bioinformatics tools that run on highly scalable computing resources. Analysis tools are implemented in containers that encapsulate complex software dependencies and run on freely available XSEDE resources via the Agave API, which can retrieve datasets from the CyVerse Data Store or any web-accessible location (e.g., FTP, HTTP).

Conclusions: iMicrobe promotes data integration, sharing, and community-driven tool development by making open source data and tools accessible to the research community in a web-based platform.

Authors:
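Youens-Clark, Ken [1]; Bomhoff, Matt [1]; Ponsero, Alise J. [1]; Wood-Charlson, Elisha M. [2]; Lynch, Joshua [1]; Choi, Illyoung [3]; Hartman, John H. [3]; Hurwitz, Bonnie L. [4]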
  1. Univ. of Arizona, Tucson, AZ (United States). Dept. of Biosystems Engineering
  2. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Environmental Genomics and Systems Biology Division
  3. Univ. of Arizona, Tucson, AZ (United States). Dept. of Computer Science
  4. Univ. of Arizona, Tucson, AZ (United States). Dept. of Biosystems Engineering; Univ. of Arizona, Tucson, AZ (United States). BIO5 Inst.
Publication Date:
2019-07-09
Research Org.:
Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER). Biological Systems Science Division
OSTI Identifier:
1625348
Grant/Contract Number:  
AC02-05CH11231
Resource Type:
Accepted Manuscript
Journal Name:
GigaScience
Additional Journal Information:
Journal Volume: 8; Journal Issue: 7; Journal ID: ISSN 2047-217X
Publisher:
BioMed Central
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; 97 MATHEMATICS AND COMPUTING; Life Sciences & Biomedicine - Other Topics; Science & Technology - Other Topics; cyberinfrastructure; cloud computing; bioinformatics; metagenomics

Citation Formats

Youens-Clark, Ken, Bomhoff, Matt, Ponsero, Alise J., Wood-Charlson, Elisha M., Lynch, Joshua, Choi, Illyoung, Hartman, John H., and Hurwitz, Bonnie L. iMicrobe: Tools and data-driven discovery platform for the microbiome sciences. United States: N. p., 2019. Web. doi:10.1093/gigascience/giz083.
Youens-Clark, Ken, Bomhoff, Matt, Ponsero, Alise J., Wood-Charlson, Elisha M., Lynch, Joshua, Choi, Illyoung, Hartman, John H., & Hurwitz, Bonnie L. iMicrobe: Tools and data-driven discovery platform for the microbiome sciences. United States. https://doi.org/10.1093/gigascience/giz083
Youens-Clark, Ken, Bomhoff, Matt, Ponsero, Alise J., Wood-Charlson, Elisha M., Lynch, Joshua, Choi, Illyoung, Hartman, John H., and Hurwitz, Bonnie L. 2019. "iMicrobe: Tools and data-driven discovery platform for the microbiome sciences". United States. https://doi.org/10.1093/gigascience/giz083. https://www.osti.gov/servlets/purl/1625348.
@article{osti_1625348,
title = {iMicrobe: Tools and data-driven discovery platform for the microbiome sciences},
author = {Youens-Clark, Ken and Bomhoff, Matt and Ponsero, Alise J. and Wood-Charlson, Elisha M. and Lynch, Joshua and Choi, Illyoung and Hartman, John H. and Hurwitz, Bonnie L.},
abstractNote = {Background: Scientists have amassed a wealth of microbiome datasets, making it possible to study microbes in biotic and abiotic systems on a population or planetary scale; however, this potential has not been fully realized given that the tools, datasets, and computation are available in diverse repositories and locations. To address this challenge, we developed iMicrobe.us, a community-driven microbiome data marketplace and tool exchange for users to integrate their own data and tools with those from the broader community. Findings: The iMicrobe platform brings together analysis tools and microbiome datasets by leveraging National Science Foundation–supported cyberinfrastructure and computing resources from CyVerse, Agave, and XSEDE. The primary purpose of iMicrobe is to provide users with a freely available, web-based platform to (1) maintain and share project data, metadata, and analysis products, (2) search for related public datasets, and (3) use and publish bioinformatics tools that run on highly scalable computing resources. Analysis tools are implemented in containers that encapsulate complex software dependencies and run on freely available XSEDE resources via the Agave API, which can retrieve datasets from the CyVerse Data Store or any web-accessible location (e.g., FTP, HTTP). Conclusions: iMicrobe promotes data integration, sharing, and community-driven tool development by making open source data and tools accessible to the research community in a web-based platform.},
doi = {10.1093/gigascience/giz083},
journal = {GigaScience},
number = 7,
volume = 8,
place = {United States},
year = {2019},
month = jul
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Figures / Tables:

Figure 1: iMicrobe’s architecture allows for the integration of datasets hosted by iMicrobe that can be placed into the data cart, those private to the user, and others publicly accessible on the Internet. Analyses created by iMicrobe or other developers run on Stampede2, and the results go into the user’s home directory in the CyVerse Data Store.

Figures/Tables have been extracted from DOE-funded journal article accepted manuscripts.