Title: A RESTful API for accessing microbial community data for MG-RAST

Metagenomic sequencing has produced significant amounts of data in recent years. For example, as of summer 2013, MG-RAST has been used to annotate over 110,000 data sets totaling over 43 Terabases. With metagenomic sequencing finding even wider adoption in the scientific community, the existing web-based analysis tools and infrastructure in MG-RAST provide limited capability for data retrieval and analysis, such as comparative analysis between multiple data sets. Moreover, although the system provides many analysis tools, it is not comprehensive. By opening MG-RAST up via a web services API (application programmers interface) we have greatly expanded access to MG-RAST data, as well as provided a mechanism for the use of third-party analysis tools with MG-RAST data. This RESTful API makes all data and data objects created by the MG-RAST pipeline accessible as JSON objects. As part of the DOE Systems Biology Knowledgebase project (KBase, http:// we have implemented a web services API for MG-RAST. This API complements the existing MG-RAST web interface and constitutes the basis of KBase’s microbial community capabilities. In addition, the API exposes a comprehensive collection of data to programmers. This API, which uses a RESTful (Representational State Transfer) implementation, is compatible with most programming environments and should be easy to use for end users and third parties. It provides comprehensive access to sequence data, quality control results, annotations, and many other data types. Where feasible, we have used standards to expose data and metadata. Code examples are provided in a number of languages both to show the versatility of the API and to provide a starting point for users. We present an API that exposes the data in MG-RAST for consumption by our users, greatly enhancing the utility of the MG-RAST service.
  1. Argonne National Lab. (ANL), Lemont, IL (United States). Mathematics and Computer Science Division; Univ. of Chicago, Chicago, IL (United States). Computation Institute.
  2. Argonne National Lab. (ANL), Lemont, IL (United States). Mathematics and Computer Science Division.
  3. Univ. of Canterbury (New Zealand)
Publication Date:
Grant/Contract Number:
Accepted Manuscript
Journal Name:
PLoS Computational Biology (Online)
Additional Journal Information:
Journal Name: PLoS Computational Biology (Online); Journal Volume: 11; Journal Issue: 1; Journal ID: ISSN 1553-7358
Public Library of Science
Research Org:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org:
USDOE Office of Science (SC), Biological and Environmental Research (BER) (SC-23); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
Country of Publication:
United States
59 BASIC BIOLOGICAL SCIENCES; 96 KNOWLEDGE MANAGEMENT AND PRESERVATION; sequence databases; information retrieval; metagenomics; web-based applications; proteases; DNA sequence analysis; database searching; quality control
OSTI Identifier:
Alternate Identifier(s):
OSTI ID: 1395022