Likelihood-based gene annotations for gap filling and quality assessment in genome-scale metabolic models
Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame. Current methods for automated metabolic network reconstruction rely on gene and reaction annotations to build draft metabolic networks and algorithms to fill gaps in these networks. However, automated reconstruction is hampered by database inconsistencies, incorrect annotations, and gap filling largely without considering genomic information. Here we develop an approach for applying genomic information to predict alternative functions for genes and estimate their likelihoods from sequence homology. We show that computed likelihood values were significantly higher for annotations found in manually curated metabolic networks than those that were not. We then apply these alternative functional predictions to estimate reaction likelihoods, which are used in a new gap filling approach called likelihood-based gap filling to predict more genomically consistent solutions. To validate the likelihood-based gap filling approach, we applied it to models where essential pathways were removed, finding that likelihood-based gap filling identified more biologically relevant solutions than parsimony-based gap filling approaches. We also demonstrate that models gap filled using likelihood-based ...
- Univ. of Illinois at Urbana-Champaign, Urbana, IL (United States). Dept. of Chemical and Biomolecular Engineering.
- Mayo Clinic, Rochester, MN (United States). Center for Individualized Medicine.
- Argonne National Lab. (ANL), Lemont, IL (United States). Mathematics and Computer Science Division.
- Mayo Clinic, Rochester, MN (United States). Center for Individualized Medicine, Depts. of Surgery and Physiology and Bioengineering.
- Univ. of Illinois at Urbana-Champaign, Urbana, IL (United States). Dept. of Chemical and Biomolecular Engineering; Inst. for Systems Biology, Seattle, WA (United States)
- Pennsylvania State Univ., University Park, PA (US)
- Publication Date:
- OSTI Identifier:
- Grant/Contract Number: FG02-10ER64999; AC02-06CH11357
- Resource Type: Accepted Manuscript
- Journal Name: PLoS Computational Biology (Online)
- Additional Journal Information: Journal Name: PLoS Computational Biology (Online); Journal Volume: 10; Journal Issue: 10; Journal ID: ISSN 1553-7358
- Publisher: Public Library of Science
- Research Org: Argonne National Laboratory (ANL), Argonne, IL (United States)
- Sponsoring Org: USDOE Office of Science (SC), Biological and Environmental Research (BER) (SC-23)
- Country of Publication: United States
- Subject: 59 BASIC BIOLOGICAL SCIENCES; genomic databases; metabolic networks; drug metabolism; sequence databases; web-based applications; genome annotation; genetic networks; simulation and modeling