DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Measuring semantic similarities by combining gene ontology annotations and gene co-function networks

Abstract

Background: Gene Ontology (GO) has been used widely to study functional relationships between genes. The current semantic similarity measures rely only on GO annotations and GO structure. This limits the power of GO-based similarity because of the limited proportion of genes that are annotated to GO in most organisms.

Results: We introduce a novel approach called NETSIM (network-based similarity measure) that incorporates information from gene co-function networks in addition to using the GO structure and annotations. Using metabolic reaction maps of yeast, Arabidopsis, and human, we demonstrate that NETSIM can improve the accuracy of GO term similarities. We also demonstrate that NETSIM works well even for genomes with sparser gene annotation data. We applied NETSIM on large Arabidopsis gene families such as cytochrome P450 monooxygenases to group the members functionally and show that this grouping could facilitate functional characterization of genes in these families.

Conclusions: Using NETSIM as an example, we demonstrated that the performance of a semantic similarity measure could be significantly improved after incorporating genome-specific information. NETSIM incorporates both GO annotations and gene co-function network data as a priori knowledge in the model. Therefore, functional similarities of GO terms that are not explicitly encoded in GO but are relevant in a taxon-specific manner become measurable when GO annotations are limited.

Authors:
Peng, Jiajie [1]; Uygun, Sahra [2]; Kim, Taehyong [3]; Wang, Yadong [4]; Rhee, Seung Y. [3]; Chen, Jin [5]
  1. Harbin Institute of Technology, Harbin (China); Michigan State Univ., East Lansing, MI (United States)
  2. Michigan State Univ., East Lansing, MI (United States)
  3. Carnegie Institution for Science, Stanford, CA (United States)
  4. Harbin Institute of Technology, Harbin (China)
  5. Michigan State University, East Lansing, MI (United States)
Publication Date:
February 2015
Research Org.:
Michigan State Univ., East Lansing, MI (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Basic Energy Sciences (BES)
OSTI Identifier:
1194164
Grant/Contract Number:  
FG02-91ER20021
Resource Type:
Accepted Manuscript
Journal Name:
BMC Bioinformatics
Additional Journal Information:
Journal Volume: 16; Journal Issue: 1; Journal ID: ISSN 1471-2105
Publisher:
BioMed Central
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; co-function network; gene ontology; semantic similarity; gene function annotation

Citation Formats

Peng, Jiajie, Uygun, Sahra, Kim, Taehyong, Wang, Yadong, Rhee, Seung Y., and Chen, Jin. Measuring semantic similarities by combining gene ontology annotations and gene co-function networks. United States: N. p., 2015. Web. doi:10.1186/s12859-015-0474-7.
Peng, Jiajie, Uygun, Sahra, Kim, Taehyong, Wang, Yadong, Rhee, Seung Y., & Chen, Jin. Measuring semantic similarities by combining gene ontology annotations and gene co-function networks. United States. doi:https://doi.org/10.1186/s12859-015-0474-7
Peng, Jiajie, Uygun, Sahra, Kim, Taehyong, Wang, Yadong, Rhee, Seung Y., and Chen, Jin. 2015. "Measuring semantic similarities by combining gene ontology annotations and gene co-function networks". United States. doi:https://doi.org/10.1186/s12859-015-0474-7. https://www.osti.gov/servlets/purl/1194164.
@article{osti_1194164,
title = {Measuring semantic similarities by combining gene ontology annotations and gene co-function networks},
author = {Peng, Jiajie and Uygun, Sahra and Kim, Taehyong and Wang, Yadong and Rhee, Seung Y. and Chen, Jin},
abstractNote = {Background: Gene Ontology (GO) has been used widely to study functional relationships between genes. The current semantic similarity measures rely only on GO annotations and GO structure. This limits the power of GO-based similarity because of the limited proportion of genes that are annotated to GO in most organisms. Results: We introduce a novel approach called NETSIM (network-based similarity measure) that incorporates information from gene co-function networks in addition to using the GO structure and annotations. Using metabolic reaction maps of yeast, Arabidopsis, and human, we demonstrate that NETSIM can improve the accuracy of GO term similarities. We also demonstrate that NETSIM works well even for genomes with sparser gene annotation data. We applied NETSIM on large Arabidopsis gene families such as cytochrome P450 monooxygenases to group the members functionally and show that this grouping could facilitate functional characterization of genes in these families. Conclusions: Using NETSIM as an example, we demonstrated that the performance of a semantic similarity measure could be significantly improved after incorporating genome-specific information. NETSIM incorporates both GO annotations and gene co-function network data as a priori knowledge in the model. Therefore, functional similarities of GO terms that are not explicitly encoded in GO but are relevant in a taxon-specific manner become measurable when GO annotations are limited.},
doi = {10.1186/s12859-015-0474-7},
journal = {BMC Bioinformatics},
number = 1,
volume = 16,
place = {United States},
year = {2015},
month = {2}
}

Citation Metrics:
Cited by: 24 works
Citation information provided by
Web of Science
