skip to main content
OSTI.GOV title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: Modeling biochemical pathways in the gene ontology

Abstract

The concept of a biological pathway, an ordered sequence of molecular transformations, is used to collect and represent molecular knowledge for a broad span of organismal biology. Representations of biomedical pathways typically are rich but idiosyncratic presentations of organized knowledge about individual pathways. Meanwhile, biomedical ontologies and associated annotation files are powerful tools that organize molecular information in a logically rigorous form to support computational analysis. The Gene Ontology (GO), representing Molecular Functions, Biological Processes and Cellular Components, incorporates many aspects of biological pathways within its ontological representations. Here we present a methodology for extending and refining the classes in the GO for more comprehensive, consistent and integrated representation of pathways, leveraging knowledge embedded in current pathway representations such as those in the Reactome Knowledgebase and MetaCyc. With carbohydrate metabolic pathways as a use case, we discuss how our representation supports the integration of variant pathway classes into a unified ontological structure that can be used for data comparison and analysis.

Authors:
 [1];  [2];  [3];  [4];  [1];  [1]
  1. The Jackson Lab., Bar Harbor, ME (United States)
  2. NYU School of Medicine, New York, NY (United States)
  3. Phoenix Bioinformatics, Redwood City, CA (United States)
  4. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Publication Date:
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Basic Energy Sciences (BES) (SC-22)
OSTI Identifier:
1377445
Grant/Contract Number:
AC02-05CH11231
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
Database
Additional Journal Information:
Journal Volume: 2016; Journal ID: ISSN 1758-0463
Publisher:
Oxford University Press
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; 60 APPLIED LIFE SCIENCES

Citation Formats

Hill, David P., D’Eustachio, Peter, Berardini, Tanya Z., Mungall, Christopher J., Renedo, Nikolai, and Blake, Judith A. Modeling biochemical pathways in the gene ontology. United States: N. p., 2016. Web. doi:10.1093/database/baw126.
Hill, David P., D’Eustachio, Peter, Berardini, Tanya Z., Mungall, Christopher J., Renedo, Nikolai, & Blake, Judith A. Modeling biochemical pathways in the gene ontology. United States. doi:10.1093/database/baw126.
Hill, David P., D’Eustachio, Peter, Berardini, Tanya Z., Mungall, Christopher J., Renedo, Nikolai, and Blake, Judith A. 2016. "Modeling biochemical pathways in the gene ontology". United States. doi:10.1093/database/baw126. https://www.osti.gov/servlets/purl/1377445.
@article{osti_1377445,
title = {Modeling biochemical pathways in the gene ontology},
author = {Hill, David P. and D’Eustachio, Peter and Berardini, Tanya Z. and Mungall, Christopher J. and Renedo, Nikolai and Blake, Judith A.},
abstractNote = {The concept of a biological pathway, an ordered sequence of molecular transformations, is used to collect and represent molecular knowledge for a broad span of organismal biology. Representations of biomedical pathways typically are rich but idiosyncratic presentations of organized knowledge about individual pathways. Meanwhile, biomedical ontologies and associated annotation files are powerful tools that organize molecular information in a logically rigorous form to support computational analysis. The Gene Ontology (GO), representing Molecular Functions, Biological Processes and Cellular Components, incorporates many aspects of biological pathways within its ontological representations. Here we present a methodology for extending and refining the classes in the GO for more comprehensive, consistent and integrated representation of pathways, leveraging knowledge embedded in current pathway representations such as those in the Reactome Knowledgebase and MetaCyc. With carbohydrate metabolic pathways as a use case, we discuss how our representation supports the integration of variant pathway classes into a unified ontological structure that can be used for data comparison and analysis.},
doi = {10.1093/database/baw126},
journal = {Database},
number = ,
volume = 2016,
place = {United States},
year = 2016,
month = 9
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Save / Share:
  • Most current approaches to automatic pathway generation are based on a reverse engineering approach in which pathway plausibility is solely derived from microarray gene expression data. These approaches tend to lack in generality and offer no independent validation as they are too reliant on the pathway observables that guide pathway generation. By contrast, alternative approaches that use prior biological knowledge to validate pathways inferred from gene expression data may err in the opposite direction as the prior knowledge is usually not sufficiently tuned to the pathology of focus. In this paper, we present a novel pathway generation approach that combinesmore » insights from the reverse engineering and knowledge-based approaches to increase the biological plausibility of automatically generated regulatory networks and describe an application of this approach to transcriptional data from a mouse model of neuroprotection during stroke.« less
  • Gene and gene product similarity is a fundamental diagnostic measure in analyzing biological data and constructing predictive models for functional genomics. With the rising influence of the Gene Ontology, two complementary approaches have emerged where the similarity between two genes or gene products is obtained by comparing Gene Ontology (GO) annotations associated with the genes or gene products. One approach captures GO-based similarity in terms of hierarchical relations within each gene subontology. The other approach identifies GO-based similarity in terms of associative relations across the three gene subontologies. We propose a novel methodology where the two approaches can be mergedmore » with ensuing benefits in coverage and accuracy, and demonstrate that further improvements can be obtained by integrating textual evidence extracted from relevant biomedical literature.« less
  • Background: Gene Ontology (GO) has been used widely to study functional relationships between genes. The current semantic similarity measures rely only on GO annotations and GO structure. This limits the power of GO-based similarity because of the limited proportion of genes that are annotated to GO in most organisms. Results: We introduce a novel approach called NETSIM (network-based similarity measure) that incorporates information from gene co-function networks in addition to using the GO structure and annotations. Using metabolic reaction maps of yeast, Arabidopsis, and human, we demonstrate that NETSIM can improve the accuracy of GO term similarities. We also demonstratemore » that NETSIM works well even for genomes with sparser gene annotation data. We applied NETSIM on large Arabidopsis gene families such as cytochrome P450 monooxygenases to group the members functionally and show that this grouping could facilitate functional characterization of genes in these families. Conclusions: Using NETSIM as an example, we demonstrated that the performance of a semantic similarity measure could be significantly improved after incorporating genome-specific information. NETSIM incorporates both GO annotations and gene co-function network data as a priori knowledge in the model. Therefore, functional similarities of GO terms that are not explicitly encoded in GO but are relevant in a taxon-specific manner become measurable when GO annotations are limited.« less
  • Here, the Gene Ontology (GO) has been used in high-throughput omics research as a major bioinformatics resource. The hierarchical structure of GO provides users a convenient platform for biological information abstraction and hypothesis testing. Computational methods have been developed to identify functionally similar genes. However, none of the existing measurements take into account all the rich information in GO. Similarly, using these existing methods, web-based applications have been constructed to compute gene functional similarities, and to provide pure text-based outputs. Without a graphical visualization interface, it is difficult for result interpretation. As a result, we present InteGO2, a web toolmore » that allows researchers to calculate the GO-based gene semantic similarities using seven widely used GO-based similarity measurements. Also, we provide an integrative measurement that synergistically integrates all the individual measurements to improve the overall performance. Using HTML5 and cytoscape.js, we provide a graphical interface in InteGO2 to visualize the resulting gene functional association networks. In conclusion, InteGO2 is an easy-to-use HTML5 based web tool. With it, researchers can measure gene or gene product functional similarity conveniently, and visualize the network of functional interactions in a graphical interface.« less
  • Dramatic increases in research in the area of microbial biofuel production coupled with high-throughput data generation on bioenergy-related microbes has led to a deluge of information in the scientific literature and in databases. Consolidating this information and making it easily accessible requires a unified vocabulary.The Gene Ontology (GO) fulfills that requirement, as it is a well-developed structured vocabulary that describes the activities and locations of gene products in a consistent manner across all kingdoms of life. The Microbial ENergy processes Gene Ontology (http://www.mengo.biochem.vt.edu) project is extending the GO to include new terms to describe microbial processes of interest to bioenergymore » production. Our effort has added over 600 bioenergy related terms to the Gene Ontology. These terms will aid in the comprehensive annotation of gene products from diverse energy-related microbial genomes. An area of microbial energy research that has received a lot of attention is microbial production of advanced biofuels. These include alcohols such as butanol, isopropanol, isobutanol, and fuels derived from fatty acids, isoprenoids, and polyhydroxyalkanoates. These fuels are superior to first generation biofuels (ethanol and biodiesel esterified from vegetable oil or animal fat), can be generated from non-food feedstock sources, can be used as supplements or substitutes for gasoline, diesel and jet fuels, and can be stored and distributed using existing infrastructure. We review the roles of genes associated with synthesis of advanced biofuels, and at the same time introduce the use of the GO to describe the functions of these genes in a standardized way.« less