Measuring semantic similarities by combining gene ontology annotations and gene co-function networks
Abstract
Background: Gene Ontology (GO) has been used widely to study functional relationships between genes. The current semantic similarity measures rely only on GO annotations and GO structure. This limits the power of GO-based similarity because of the limited proportion of genes that are annotated to GO in most organisms. Results: We introduce a novel approach called NETSIM (network-based similarity measure) that incorporates information from gene co-function networks in addition to using the GO structure and annotations. Using metabolic reaction maps of yeast, Arabidopsis, and human, we demonstrate that NETSIM can improve the accuracy of GO term similarities. We also demonstrate that NETSIM works well even for genomes with sparser gene annotation data. We applied NETSIM on large Arabidopsis gene families such as cytochrome P450 monooxygenases to group the members functionally and show that this grouping could facilitate functional characterization of genes in these families. Conclusions: Using NETSIM as an example, we demonstrated that the performance of a semantic similarity measure could be significantly improved after incorporating genome-specific information. NETSIM incorporates both GO annotations and gene co-function network data as a priori knowledge in the model. Therefore, functional similarities of GO terms that are not explicitly encoded in GO but are relevant in a taxon-specific manner become measurable when GO annotations are limited.
- Authors:
- Peng, Jiajie; Uygun, Sahra; Kim, Taehyong; Wang, Yadong; Rhee, Seung Y.; Chen, Jin
- Harbin Institute of Technology, Harbin (China); Michigan State Univ., East Lansing, MI (United States)
- Michigan State Univ., East Lansing, MI (United States)
- Carnegie Institution for Science, Stanford, CA (United States)
- Harbin Institute of Technology, Harbin (China)
- Michigan State University, East Lansing, MI (United States)
- Publication Date:
- 2015-02-14
- Research Org.:
- Michigan State Univ., East Lansing, MI (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC), Basic Energy Sciences (BES)
- OSTI Identifier:
- 1194164
- Grant/Contract Number:
- FG02-91ER20021
- Resource Type:
- Accepted Manuscript
- Journal Name:
- BMC Bioinformatics
- Additional Journal Information:
- Journal Volume: 16; Journal Issue: 1; Journal ID: ISSN 1471-2105
- Publisher:
- BioMed Central
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 59 BASIC BIOLOGICAL SCIENCES; co-function network; gene ontology; semantic similarity; gene function annotation
Citation Formats
Peng, Jiajie, Uygun, Sahra, Kim, Taehyong, Wang, Yadong, Rhee, Seung Y., and Chen, Jin. Measuring semantic similarities by combining gene ontology annotations and gene co-function networks. United States: N. p., 2015. Web. doi:10.1186/s12859-015-0474-7.
Peng, Jiajie, Uygun, Sahra, Kim, Taehyong, Wang, Yadong, Rhee, Seung Y., & Chen, Jin. Measuring semantic similarities by combining gene ontology annotations and gene co-function networks. United States. https://doi.org/10.1186/s12859-015-0474-7
Peng, Jiajie, Uygun, Sahra, Kim, Taehyong, Wang, Yadong, Rhee, Seung Y., and Chen, Jin. 2015. "Measuring semantic similarities by combining gene ontology annotations and gene co-function networks". United States. https://doi.org/10.1186/s12859-015-0474-7. https://www.osti.gov/servlets/purl/1194164.
@article{osti_1194164,
title = {Measuring semantic similarities by combining gene ontology annotations and gene co-function networks},
author = {Peng, Jiajie and Uygun, Sahra and Kim, Taehyong and Wang, Yadong and Rhee, Seung Y. and Chen, Jin},
abstractNote = {Background: Gene Ontology (GO) has been used widely to study functional relationships between genes. The current semantic similarity measures rely only on GO annotations and GO structure. This limits the power of GO-based similarity because of the limited proportion of genes that are annotated to GO in most organisms. Results: We introduce a novel approach called NETSIM (network-based similarity measure) that incorporates information from gene co-function networks in addition to using the GO structure and annotations. Using metabolic reaction maps of yeast, Arabidopsis, and human, we demonstrate that NETSIM can improve the accuracy of GO term similarities. We also demonstrate that NETSIM works well even for genomes with sparser gene annotation data. We applied NETSIM on large Arabidopsis gene families such as cytochrome P450 monooxygenases to group the members functionally and show that this grouping could facilitate functional characterization of genes in these families. Conclusions: Using NETSIM as an example, we demonstrated that the performance of a semantic similarity measure could be significantly improved after incorporating genome-specific information. NETSIM incorporates both GO annotations and gene co-function network data as a priori knowledge in the model. Therefore, functional similarities of GO terms that are not explicitly encoded in GO but are relevant in a taxon-specific manner become measurable when GO annotations are limited.},
doi = {10.1186/s12859-015-0474-7},
journal = {BMC Bioinformatics},
number = 1,
volume = 16,
place = {United States},
year = {2015},
month = {2}
}
Works referencing / citing this record:
An online tool for measuring and visualizing phenotype similarities using HPO
journal, August 2018
- Peng, Jiajie; Xue, Hansheng; Hui, Weiwei
- BMC Genomics, Vol. 19, Issue S6
Erratum to: InteGO2: a web tool for measuring and visualizing gene semantic similarities using Gene Ontology
journal, March 2017
- Peng, Jiajie; Li, Hongxiang; Liu, Yongzhuang
- BMC Genomics, Vol. 18, Issue 1
OAHG: an integrated resource for annotating human genes with multi-level ontologies
journal, October 2016
- Cheng, Liang; Sun, Jie; Xu, Wanying
- Scientific Reports, Vol. 6, Issue 1
Constructing an integrated gene similarity network for the identification of disease genes
conference, December 2016
- Tian, Zhen; Guo, Maozu
- 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Improving the measurement of semantic similarity by combining gene ontology and co-functional network: a random walk based approach
journal, March 2018
- Peng, Jiajie; Zhang, Xuanshuo; Hui, Weiwei
- BMC Systems Biology, Vol. 12, Issue S2
Constructing Networks of Organelle Functional Modules in Arabidopsis
journal, August 2016
- Peng, Jiajie; Wang, Tao; Hu, Jianping
- Current Genomics, Vol. 17, Issue 5
Exploring Approaches for Detecting Protein Functional Similarity within an Orthology-based Framework
journal, March 2017
- Weichenberger, Christian X.; Palermo, Antonia; Pramstaller, Peter P.
- Scientific Reports, Vol. 7, Issue 1
Predicting disease-related genes using integrated biomedical networks
journal, January 2017
- Peng, Jiajie; Bai, Kun; Shang, Xuequn
- BMC Genomics, Vol. 18, Issue S1
Investigations on factors influencing HPO-based semantic similarity calculation
journal, September 2017
- Peng, Jiajie; Li, Qianqian; Shang, Xuequn
- Journal of Biomedical Semantics, Vol. 8, Issue S1
Measuring disease similarity and predicting disease-related ncRNAs by a novel method
journal, December 2017
- Hu, Yang; Zhou, Meng; Shi, Hongbo
- BMC Medical Genomics, Vol. 10, Issue S5
Constructing an integrated gene similarity network for the identification of disease genes
journal, September 2017
- Tian, Zhen; Guo, Maozu; Wang, Chunyu
- Journal of Biomedical Semantics, Vol. 8, Issue S1
InteGO2: a web tool for measuring and visualizing gene semantic similarities using Gene Ontology
journal, August 2016
- Peng, Jiajie; Li, Hongxiang; Liu, Yongzhuang
- BMC Genomics, Vol. 17, Issue S5