skip to main content
OSTI.GOV title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: MetaboTools: A comprehensive toolbox for analysis of genome-scale metabolic models

Abstract

Metabolomic data sets provide a direct read-out of cellular phenotypes and are increasingly generated to study biological questions. Previous work, by us and others, revealed the potential of analyzing extracellular metabolomic data in the context of the metabolic model using constraint-based modeling. With the MetaboTools, we make our methods available to the broader scientific community. The MetaboTools consist of a protocol, a toolbox, and tutorials of two use cases. The protocol describes, in a step-wise manner, the workflow of data integration, and computational analysis. The MetaboTools comprise the Matlab code required to complete the workflow described in the protocol. Tutorials explain the computational steps for integration of two different data sets and demonstrate a comprehensive set of methods for the computational analysis of metabolic models and stratification thereof into different phenotypes. The presented workflow supports integrative analysis of multiple omics data sets. Importantly, all analysis tools can be applied to metabolic models without performing the entire workflow. Taken together, the MetaboTools constitute a comprehensive guide to the intra-model analysis of extracellular metabolomic data from microbial, plant, or human cells. In conclusion, this computational modeling resource offers a broad set of computational analysis tools for a wide biomedical and non-biomedical researchmore » community.« less

Authors:
 [1];  [1];  [1]
  1. Univ. of Luxembourg, Esch-sur-Alzette (Luxembourg)
Publication Date:
Research Org.:
Univ. of Luxembourg, Esch-sur-Alzette (Luxembourg)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER) (SC-23); European Union (EU)
OSTI Identifier:
1326674
Grant/Contract Number:
SC0010429; 668738
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
Frontiers in Physiology
Additional Journal Information:
Journal Volume: 7; Journal ID: ISSN 1664-042X
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; 97 MATHEMATICS AND COMPUTING; constraint-based reconstruction and analysis (COBRA); metabolomics; metabolism; metabolic modeling; metabolic phenotypes; protocol; modelanalysis

Citation Formats

Aurich, Maike K., Fleming, Ronan M. T., and Thiele, Ines. MetaboTools: A comprehensive toolbox for analysis of genome-scale metabolic models. United States: N. p., 2016. Web. doi:10.3389/fphys.2016.00327.
Aurich, Maike K., Fleming, Ronan M. T., & Thiele, Ines. MetaboTools: A comprehensive toolbox for analysis of genome-scale metabolic models. United States. doi:10.3389/fphys.2016.00327.
Aurich, Maike K., Fleming, Ronan M. T., and Thiele, Ines. Wed . "MetaboTools: A comprehensive toolbox for analysis of genome-scale metabolic models". United States. doi:10.3389/fphys.2016.00327. https://www.osti.gov/servlets/purl/1326674.
@article{osti_1326674,
title = {MetaboTools: A comprehensive toolbox for analysis of genome-scale metabolic models},
author = {Aurich, Maike K. and Fleming, Ronan M. T. and Thiele, Ines},
abstractNote = {Metabolomic data sets provide a direct read-out of cellular phenotypes and are increasingly generated to study biological questions. Previous work, by us and others, revealed the potential of analyzing extracellular metabolomic data in the context of the metabolic model using constraint-based modeling. With the MetaboTools, we make our methods available to the broader scientific community. The MetaboTools consist of a protocol, a toolbox, and tutorials of two use cases. The protocol describes, in a step-wise manner, the workflow of data integration, and computational analysis. The MetaboTools comprise the Matlab code required to complete the workflow described in the protocol. Tutorials explain the computational steps for integration of two different data sets and demonstrate a comprehensive set of methods for the computational analysis of metabolic models and stratification thereof into different phenotypes. The presented workflow supports integrative analysis of multiple omics data sets. Importantly, all analysis tools can be applied to metabolic models without performing the entire workflow. Taken together, the MetaboTools constitute a comprehensive guide to the intra-model analysis of extracellular metabolomic data from microbial, plant, or human cells. In conclusion, this computational modeling resource offers a broad set of computational analysis tools for a wide biomedical and non-biomedical research community.},
doi = {10.3389/fphys.2016.00327},
journal = {Frontiers in Physiology},
number = ,
volume = 7,
place = {United States},
year = {Wed Aug 03 00:00:00 EDT 2016},
month = {Wed Aug 03 00:00:00 EDT 2016}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 1work
Citation information provided by
Web of Science

Save / Share:
  • Genome-scale metabolic models have proven to be valuable for predicting organism phenotypes from genotypes. Yet efforts to develop new models are failing to keep pace with genome sequencing. To address this problem, we introduce the Model SEED, a web-based resource for high-throughput generation, optimization and analysis of genome-scale metabolic models. The Model SEED integrates existing methods and introduces techniques to automate nearly every step of this process, taking {approx}48 h to reconstruct a metabolic model from an assembled genome sequence. We apply this resource to generate 130 genome-scale metabolic models representing a taxonomically diverse set of bacteria. Twenty-two of themore » models were validated against available gene essentiality and Biolog data, with the average model accuracy determined to be 66% before optimization and 87% after optimization.« less
  • There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions andmore » possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable for analysis transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes.« less
  • Microbes are diverse and extremely versatile organisms that play vital roles in all ecological niches. Understanding and harnessing microbial systems will be key to the sustainability of our planet. One approach to improving our knowledge of microbial processes is through data-driven and mechanism-informed computational modeling. Individual models of biological networks (such as metabolism, transcription, and signaling) have played pivotal roles in driving microbial research through the years. These networks, however, are highly interconnected and function in concert a fact that has led to the development of a variety of approaches aimed at simulating the integrated functions of two or moremore » network types. Though the task of integrating these different models is fraught with new challenges, the large amounts of high-throughput data sets being generated, and algorithms being developed, means that the time is at hand for concerted efforts to build integrated regulatory-metabolic networks in a data-driven fashion. Lastly, in this perspective, we review current approaches for constructing integrated regulatory-metabolic models and outline new strategies for future development of these network models for any microbial system.« less
  • Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame. Current methods for automated metabolic network reconstruction rely on gene and reaction annotations to build draft metabolic networks and algorithms to fill gaps in these networks. However, automated reconstruction is hampered by database inconsistencies, incorrect annotations, and gap filling largely without considering genomic information. Here we develop an approach for applying genomic information to predict alternative functions for genesmore » and estimate their likelihoods from sequence homology. We show that computed likelihood values were significantly higher for annotations found in manually curated metabolic networks than those that were not. We then apply these alternative functional predictions to estimate reaction likelihoods, which are used in a new gap filling approach called likelihood-based gap filling to predict more genomically consistent solutions. To validate the likelihood-based gap filling approach, we applied it to models where essential pathways were removed, finding that likelihood-based gap filling identified more biologically relevant solutions than parsimony-based gap filling approaches. We also demonstrate that models gap filled using likelihood-based gap filling provide greater coverage and genomic consistency with metabolic gene functions compared to parsimony-based approaches. Interestingly, despite these findings, we found that likelihoods did not significantly affect consistency of gap filled models with Biolog and knockout lethality data. This indicates that the phenotype data alone cannot necessarily be used to discriminate between alternative solutions for gap filling and therefore, that the use of other information is necessary to obtain a more accurate network. All described workflows are implemented as part of the DOE Systems Biology Knowledgebase (KBase) and are publicly available via API or command-line web interface.« less
  • There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions andmore » possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable for analysis transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes.« less