Measuring semantic similarities by combining gene ontology annotations and gene co-function networks
Abstract
Background: Gene Ontology (GO) has been used widely to study functional relationships between genes. The current semantic similarity measures rely only on GO annotations and GO structure. This limits the power of GO-based similarity because of the limited proportion of genes that are annotated to GO in most organisms. Results: We introduce a novel approach called NETSIM (network-based similarity measure) that incorporates information from gene co-function networks in addition to using the GO structure and annotations. Using metabolic reaction maps of yeast, Arabidopsis, and human, we demonstrate that NETSIM can improve the accuracy of GO term similarities. We also demonstrate that NETSIM works well even for genomes with sparser gene annotation data. We applied NETSIM on large Arabidopsis gene families such as cytochrome P450 monooxygenases to group the members functionally and show that this grouping could facilitate functional characterization of genes in these families. Conclusions: Using NETSIM as an example, we demonstrated that the performance of a semantic similarity measure could be significantly improved after incorporating genome-specific information. NETSIM incorporates both GO annotations and gene co-function network data as a priori knowledge in the model. Therefore, functional similarities of GO terms that are not explicitly encoded in GO but are relevant in a taxon-specific manner become measurable when GO annotations are limited.
- Authors:
- Harbin Institute of Technology, Harbin (China); Michigan State Univ., East Lansing, MI (United States)
- Michigan State Univ., East Lansing, MI (United States)
- Carnegie Institution for Science, Stanford, CA (United States)
- Harbin Institute of Technology, Harbin (China)
- Michigan State University, East Lansing, MI (United States)
- Publication Date:
- February 2015
- Research Org.:
- Michigan State Univ., East Lansing, MI (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC), Basic Energy Sciences (BES)
- OSTI Identifier:
- 1194164
- Grant/Contract Number:
- FG02-91ER20021
- Resource Type:
- Accepted Manuscript
- Journal Name:
- BMC Bioinformatics
- Additional Journal Information:
- Journal Volume: 16; Journal Issue: 1; Journal ID: ISSN 1471-2105
- Publisher:
- BioMed Central
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 59 BASIC BIOLOGICAL SCIENCES; co-function network; gene ontology; semantic similarity; gene function annotation
Citation Formats
Peng, Jiajie, Uygun, Sahra, Kim, Taehyong, Wang, Yadong, Rhee, Seung Y., and Chen, Jin. Measuring semantic similarities by combining gene ontology annotations and gene co-function networks. United States: N. p., 2015. Web. doi:10.1186/s12859-015-0474-7.
Peng, Jiajie, Uygun, Sahra, Kim, Taehyong, Wang, Yadong, Rhee, Seung Y., & Chen, Jin. Measuring semantic similarities by combining gene ontology annotations and gene co-function networks. United States. https://doi.org/10.1186/s12859-015-0474-7
Peng, Jiajie, Uygun, Sahra, Kim, Taehyong, Wang, Yadong, Rhee, Seung Y., and Chen, Jin. 2015. "Measuring semantic similarities by combining gene ontology annotations and gene co-function networks". United States. https://doi.org/10.1186/s12859-015-0474-7. https://www.osti.gov/servlets/purl/1194164.
@article{osti_1194164,
title = {Measuring semantic similarities by combining gene ontology annotations and gene co-function networks},
author = {Peng, Jiajie and Uygun, Sahra and Kim, Taehyong and Wang, Yadong and Rhee, Seung Y. and Chen, Jin},
abstractNote = {Background: Gene Ontology (GO) has been used widely to study functional relationships between genes. The current semantic similarity measures rely only on GO annotations and GO structure. This limits the power of GO-based similarity because of the limited proportion of genes that are annotated to GO in most organisms. Results: We introduce a novel approach called NETSIM (network-based similarity measure) that incorporates information from gene co-function networks in addition to using the GO structure and annotations. Using metabolic reaction maps of yeast, Arabidopsis, and human, we demonstrate that NETSIM can improve the accuracy of GO term similarities. We also demonstrate that NETSIM works well even for genomes with sparser gene annotation data. We applied NETSIM on large Arabidopsis gene families such as cytochrome P450 monooxygenases to group the members functionally and show that this grouping could facilitate functional characterization of genes in these families. Conclusions: Using NETSIM as an example, we demonstrated that the performance of a semantic similarity measure could be significantly improved after incorporating genome-specific information. NETSIM incorporates both GO annotations and gene co-function network data as a priori knowledge in the model. Therefore, functional similarities of GO terms that are not explicitly encoded in GO but are relevant in a taxon-specific manner become measurable when GO annotations are limited.},
doi = {10.1186/s12859-015-0474-7},
journal = {BMC Bioinformatics},
number = 1,
volume = 16,
place = {United States},
year = {2015},
month = {2}
}