Title: Measuring semantic similarities by combining gene ontology annotations and gene co-function networks

Abstract

Background: Gene Ontology (GO) has been used widely to study functional relationships between genes. The current semantic similarity measures rely only on GO annotations and GO structure. This limits the power of GO-based similarity because of the limited proportion of genes that are annotated to GO in most organisms. Results: We introduce a novel approach called NETSIM (network-based similarity measure) that incorporates information from gene co-function networks in addition to using the GO structure and annotations. Using metabolic reaction maps of yeast, Arabidopsis, and human, we demonstrate that NETSIM can improve the accuracy of GO term similarities. We also demonstrate that NETSIM works well even for genomes with sparser gene annotation data. We applied NETSIM on large Arabidopsis gene families such as cytochrome P450 monooxygenases to group the members functionally and show that this grouping could facilitate functional characterization of genes in these families. Conclusions: Using NETSIM as an example, we demonstrated that the performance of a semantic similarity measure could be significantly improved after incorporating genome-specific information. NETSIM incorporates both GO annotations and gene co-function network data as a priori knowledge in the model. Therefore, functional similarities of GO terms that are not explicitly encoded in GO but are relevant in a taxon-specific manner become measurable when GO annotations are limited.

Authors:
Peng, Jiajie [1]; Uygun, Sahra [2]; Kim, Taehyong [3]; Wang, Yadong [4]; Rhee, Seung Y. [3]; Chen, Jin [5]
  1. Harbin Institute of Technology, Harbin (China); Michigan State Univ., East Lansing, MI (United States)
  2. Michigan State Univ., East Lansing, MI (United States)
  3. Carnegie Institution for Science, Stanford, CA (United States)
  4. Harbin Institute of Technology, Harbin (China)
  5. Michigan State University, East Lansing, MI (United States)
Publication Date:
2015-02-14
Research Org.:
Michigan State Univ., East Lansing, MI (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Basic Energy Sciences (BES)
OSTI Identifier:
1194164
Grant/Contract Number:  
FG02-91ER20021
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
BMC Bioinformatics
Additional Journal Information:
Journal Volume: 16; Journal Issue: 1; Journal ID: ISSN 1471-2105
Publisher:
BioMed Central
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; co-function network; gene ontology; semantic similarity; gene function annotation

Citation Formats

Peng, Jiajie, Uygun, Sahra, Kim, Taehyong, Wang, Yadong, Rhee, Seung Y., and Chen, Jin. Measuring semantic similarities by combining gene ontology annotations and gene co-function networks. United States: N. p., 2015. Web. doi:10.1186/s12859-015-0474-7.
Peng, Jiajie, Uygun, Sahra, Kim, Taehyong, Wang, Yadong, Rhee, Seung Y., & Chen, Jin. Measuring semantic similarities by combining gene ontology annotations and gene co-function networks. United States. https://doi.org/10.1186/s12859-015-0474-7
Peng, Jiajie, Uygun, Sahra, Kim, Taehyong, Wang, Yadong, Rhee, Seung Y., and Chen, Jin. 2015. "Measuring semantic similarities by combining gene ontology annotations and gene co-function networks". United States. https://doi.org/10.1186/s12859-015-0474-7. https://www.osti.gov/servlets/purl/1194164.
@article{osti_1194164,
title = {Measuring semantic similarities by combining gene ontology annotations and gene co-function networks},
author = {Peng, Jiajie and Uygun, Sahra and Kim, Taehyong and Wang, Yadong and Rhee, Seung Y. and Chen, Jin},
abstractNote = {Background: Gene Ontology (GO) has been used widely to study functional relationships between genes. The current semantic similarity measures rely only on GO annotations and GO structure. This limits the power of GO-based similarity because of the limited proportion of genes that are annotated to GO in most organisms. Results: We introduce a novel approach called NETSIM (network-based similarity measure) that incorporates information from gene co-function networks in addition to using the GO structure and annotations. Using metabolic reaction maps of yeast, Arabidopsis, and human, we demonstrate that NETSIM can improve the accuracy of GO term similarities. We also demonstrate that NETSIM works well even for genomes with sparser gene annotation data. We applied NETSIM on large Arabidopsis gene families such as cytochrome P450 monooxygenases to group the members functionally and show that this grouping could facilitate functional characterization of genes in these families. Conclusions: Using NETSIM as an example, we demonstrated that the performance of a semantic similarity measure could be significantly improved after incorporating genome-specific information. NETSIM incorporates both GO annotations and gene co-function network data as a priori knowledge in the model. Therefore, functional similarities of GO terms that are not explicitly encoded in GO but are relevant in a taxon-specific manner become measurable when GO annotations are limited.},
doi = {10.1186/s12859-015-0474-7},
url = {https://www.osti.gov/biblio/1194164},
journal = {BMC Bioinformatics},
issn = {1471-2105},
number = 1,
volume = 16,
place = {United States},
year = {2015},
month = {2}
}

Citation Metrics:
Cited by: 39 works
Citation information provided by
Web of Science

Figures / Tables:

Figure 1: An example of GO structure and annotation, gene co-function network, and the functional distance. (A) GO structure and annotation. ta…tj and “root” are GO terms, edges are the ‘is-a’ (solid line) or ‘part-of’ (dashed line) relations between these terms, and {g1…g13} in boxes are the sets of genes annotated to the corresponding terms. (B) An example of a co-function network. Each node and edge represents a gene and a functional association between the genes, respectively. The number at each edge represents a confidence score that measures the probability of an interaction to represent a true functional linkage between the genes. (C) An example of the functional distance between two gene sets. Ga (or Gb) is the set of genes annotated to ta (or tb) or its descendants. The number at each edge represents the functional distance between the genes where 0 = functional identity and 1 = no functional relationship.
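The quantities named in the Figure 1 caption can be illustrated with a small Python sketch. It is not the NETSIM formula, which this record does not give. The toy GO structure, annotations, and confidence scores are invented for illustration. The mapping from confidence to distance (1 minus the score, and 1 for unlinked genes) and the use of a mean over all gene pairs are assumptions chosen only to respect the caption's convention that 0 means functional identity and 1 means no functional relationship.

```python
from itertools import product

# Hypothetical toy data; none of these values come from the paper.
# GO structure: child term -> parent terms ('is-a' or 'part-of' edges).
PARENTS = {"ta": ["root"], "tb": ["root"], "tc": ["ta"], "td": ["tb"]}
# Direct annotations: term -> genes annotated to that term.
ANNOTATIONS = {"ta": {"g1"}, "tc": {"g2"}, "tb": {"g3"}, "td": {"g4"}}
# Co-function network: unordered gene pair -> confidence score in [0, 1].
CONFIDENCE = {frozenset(("g1", "g3")): 0.8, frozenset(("g2", "g4")): 0.5}


def descendants(term):
    """Return `term` together with every term below it in the toy structure."""
    found = {term}
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for child, parents in PARENTS.items():
            if current in parents and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def gene_set(term):
    """Genes annotated to `term` or to any of its descendants (Ga, Gb in the caption)."""
    genes = set()
    for t in descendants(term):
        genes |= ANNOTATIONS.get(t, set())
    return genes


def gene_distance(gene_x, gene_y):
    """Assumed mapping: 0 for the same gene, 1 - confidence if linked, 1 if unlinked."""
    if gene_x == gene_y:
        return 0.0
    return 1.0 - CONFIDENCE.get(frozenset((gene_x, gene_y)), 0.0)


def set_distance(genes_a, genes_b):
    """Assumed aggregate: mean pairwise distance; 1.0 if either set is empty."""
    pairs = list(product(genes_a, genes_b))
    if not pairs:
        return 1.0
    return sum(gene_distance(x, y) for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    ga, gb = gene_set("ta"), gene_set("tb")
    print(sorted(ga), sorted(gb), round(set_distance(ga, gb), 3))
```

Other aggregates (minimum, or a best-match average) would fit the same interface by changing only `set_distance`; the mean is used here purely because it is the simplest choice.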

Works referencing / citing this record:

An online tool for measuring and visualizing phenotype similarities using HPO
journal, August 2018

Erratum to: InteGO2: a web tool for measuring and visualizing gene semantic similarities using Gene Ontology
journal, March 2017

OAHG: an integrated resource for annotating human genes with multi-level ontologies
journal, October 2016

Constructing an integrated gene similarity network for the identification of disease genes
conference, December 2016

Constructing Networks of Organelle Functional Modules in Arabidopsis
journal, August 2016

Exploring Approaches for Detecting Protein Functional Similarity within an Orthology-based Framework
journal, March 2017

Predicting disease-related genes using integrated biomedical networks
journal, January 2017

Investigations on factors influencing HPO-based semantic similarity calculation
journal, September 2017

Measuring disease similarity and predicting disease-related ncRNAs by a novel method
journal, December 2017

Constructing an integrated gene similarity network for the identification of disease genes
journal, September 2017

InteGO2: a web tool for measuring and visualizing gene semantic similarities using Gene Ontology
journal, August 2016

Figures/Tables have been extracted from DOE-funded journal article accepted manuscripts.