Title: A RESTful API for accessing microbial community data for MG-RAST

Journal Article · PLoS Computational Biology (Online)
 [1];  [1];  [1];  [2];  [1];  [1];  [1];  [1];  [1];  [1];  [1];  [1];  [3]
  1. Argonne National Lab. (ANL), Lemont, IL (United States). Mathematics and Computer Science Division; Univ. of Chicago, Chicago, IL (United States). Computation Institute.
  2. Argonne National Lab. (ANL), Lemont, IL (United States). Mathematics and Computer Science Division.
  3. Univ. of Canterbury (New Zealand)

Metagenomic sequencing has produced significant amounts of data in recent years. For example, as of summer 2013, MG-RAST has been used to annotate over 110,000 data sets totaling over 43 Terabases. With metagenomic sequencing finding even wider adoption in the scientific community, the existing web-based analysis tools and infrastructure in MG-RAST provide limited capability for data retrieval and analysis, such as comparative analysis between multiple data sets. Moreover, although the system provides many analysis tools, it is not comprehensive. By opening MG-RAST up via a web services API (application programming interface) we have greatly expanded access to MG-RAST data, as well as provided a mechanism for the use of third-party analysis tools with MG-RAST data. This RESTful API makes all data and data objects created by the MG-RAST pipeline accessible as JSON objects. As part of the DOE Systems Biology Knowledgebase project (KBase, http://kbase.us) we have implemented a web services API for MG-RAST. This API complements the existing MG-RAST web interface and constitutes the basis of KBase's microbial community capabilities. In addition, the API exposes a comprehensive collection of data to programmers. This API, which uses a RESTful (Representational State Transfer) implementation, is compatible with most programming environments and should be easy to use for end users and third parties. It provides comprehensive access to sequence data, quality control results, annotations, and many other data types. Where feasible, we have used standards to expose data and metadata. Code examples are provided in a number of languages both to show the versatility of the API and to provide a starting point for users. We present an API that exposes the data in MG-RAST for consumption by our users, greatly enhancing the utility of the MG-RAST service.
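
As an illustration of the access pattern the abstract describes (a RESTful request that returns a JSON object), the following is a minimal Python sketch and not code from the article. The base URL, the `metagenome` resource name, the `verbosity` parameter, and the example identifier are assumptions that do not appear in this record; check them against the current MG-RAST API documentation before use.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumption: base URL of the public MG-RAST API (not stated in this record).
API_BASE = "https://api.mg-rast.org"


def build_url(resource, identifier=None, **params):
    """Compose a REST URL of the form <base>/<resource>[/<id>][?key=value...]."""
    parts = [API_BASE, quote(resource)]
    if identifier is not None:
        parts.append(quote(identifier))
    url = "/".join(parts)
    if params:
        # Sort the parameters so the resulting URL is deterministic.
        url += "?" + urlencode(sorted(params.items()))
    return url


def parse_object(payload):
    """Decode a JSON response body and check that it is a JSON object."""
    obj = json.loads(payload)
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")
    return obj


def fetch(url, timeout=30):
    """Issue a GET request and return the decoded JSON object."""
    with urlopen(url, timeout=timeout) as response:
        return parse_object(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical resource, identifier, and parameter, for illustration only.
    url = build_url("metagenome", "mgm4447943.3", verbosity="minimal")
    print(url)
    # Uncomment to query the live service (requires network access):
    # print(fetch(url))
```

Because the response is plain JSON over HTTP, the same request can be issued from any language with an HTTP client and a JSON parser, which is the portability the abstract claims for the API.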

Research Organization:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
Grant/Contract Number:
AC02-06CH11357
OSTI ID:
1212400
Alternate ID(s):
OSTI ID: 1395022
Journal Information:
PLoS Computational Biology (Online), Vol. 11, Issue 1; ISSN 1553-7358
Publisher:
Public Library of Science
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 49 works
Citation information provided by
Web of Science

Cited By (23)

Towards Solving The Metagenomics Reproducibility Crisis With Cwl And Ro other October 2018
A novel and wide substrate specific polyhydroxyalkanoate (PHA) synthase from unculturable bacteria found in mangrove soil journal December 2017
IgA regulates the composition and metabolic function of gut microbiota by promoting symbiosis between bacteria journal July 2018
MG-RAST version 4—lessons learned from a decade of low-budget ultra-high-throughput metagenome analysis journal September 2017
Functional sequencing read annotation for high precision microbiome analysis journal November 2017
The diversity of antibiotic resistance and virulence genes are correlated in human gut and environmental microbiomes journal January 2019
Exploring bacterial pathogen community dynamics in freshwater beach sediments: A tale of two lakes journal November 2019
Metagenomic evidence for the presence of phototrophic Gemmatimonadetes bacteria in diverse environments: Phototrophic Gemmatimonadetes in diverse environments journal January 2016
Ancient plant DNA in lake sediments journal April 2017
Complete Genome Sequence of Escherichia coli Phage vB_EcoS Sa179lw, Isolated from Surface Water in a Produce-Growing Area in Northern California journal July 2018
Antibiotic Resistance Gene Diversity and Virulence Gene Diversity Are Correlated in Human Gut and Environmental Microbiomes journal May 2019
Genomics of the Uncultivated, Periodontitis-Associated Bacterium Tannerella sp. BU045 (Oral Taxon 808) journal June 2018
SAMSA: a comprehensive metatranscriptome analysis pipeline journal September 2016
Taxon-Function Decoupling as an Adaptive Signature of Lake Microbial Metacommunities Under a Chronic Polymetallic Pollution Gradient journal May 2018
Microscale Biosignatures and Abiotic Mineral Authigenesis in Little Hot Creek, California journal May 2018
SAMSA: A comprehensive metatranscriptome analysis pipeline journal March 2016
Web Resources for Metagenomics Studies journal October 2015
GeneHunt for rapid domain-specific annotation of glycoside hydrolases journal July 2019
Comparative metagenomics reveals taxonomically idiosyncratic yet functionally congruent communities in periodontitis journal December 2016
Variable habitat conditions drive species covariation in the human microbiota journal April 2017
Insights into Red Sea Brine Pool Specialized Metabolism Gene Clusters Encoding Potential Metabolites for Biotechnological Applications and Extremophile Survival journal May 2019
IgA-about the unexpected. text January 2018