On the Use of Gene Ontology Annotations to Assess Functional Similarity among Orthologs and Paralogs: A Short Report
Abstract
A recent paper (Nehrt et al., PLoS Comput. Biol. 7:e1002073, 2011) has proposed a metric for the “functional similarity” between two genes that uses only the Gene Ontology (GO) annotations directly derived from published experimental results. Applying this metric, the authors concluded that paralogous genes within the mouse genome or the human genome are more functionally similar on average than orthologous genes between these genomes, an unexpected result with broad implications if true. We suggest, based on both theoretical and empirical considerations, that this proposed metric should not be interpreted as a functional similarity, and therefore cannot be used to support any conclusions about the “ortholog conjecture” (or, more properly, the “ortholog functional conservation hypothesis”). First, we reexamine the case studies presented by Nehrt et al. as examples of orthologs with divergent functions, and come to a very different conclusion: they actually exemplify how GO annotations for orthologous genes provide complementary information about conserved biological functions. We then show that there is a global ascertainment bias in the experiment-based GO annotations for human and mouse genes: particular types of experiments tend to be performed in different model organisms. We conclude that the reported statistical differences in annotations between pairs of orthologous genes do not reflect differences in biological function, but rather complementarity in experimental approaches.
Our results underscore two general considerations for researchers proposing novel types of analysis based on the GO: 1) that GO annotations are often incomplete, potentially in a biased manner, and subject to an “open world assumption” (absence of an annotation does not imply absence of a function), and 2) that conclusions drawn from a novel, large-scale GO analysis should whenever possible be supported by careful, in-depth examination of examples, to help ensure the conclusions have a justifiable biological basis.
- Authors:
- Thomas, Paul D.; Wood, Valerie; Mungall, Christopher J.; Lewis, Suzanna E.; Blake, Judith A.
- Univ. of California, Los Angeles, CA (United States). Dept. of Preventive Medicine. Division of Bioinformatics
- Univ. of Cambridge (United Kingdom). Dept. of Biochemistry. Cambridge Systems Biology Centre
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Genomics Division
- The Jackson Lab., Bar Harbor, ME (United States). Bioinformatics and Computational Biology
- Publication Date:
- 2012-02-16
- Research Org.:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)
- Sponsoring Org.:
- USDOE Office of Science (SC), Biological and Environmental Research (BER). Biological Systems Science Division; National Institutes of Health (NIH)
- OSTI Identifier:
- 1627221
- Grant/Contract Number:
- AC02-05CH11231; P41 HG002273; R01 GM081084
- Resource Type:
- Accepted Manuscript
- Journal Name:
- PLoS Computational Biology (Online)
- Additional Journal Information:
- Journal Name: PLoS Computational Biology (Online); Journal Volume: 8; Journal Issue: 2; Journal ID: ISSN 1553-7358
- Publisher:
- Public Library of Science
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 59 BASIC BIOLOGICAL SCIENCES; Biochemistry & Molecular Biology; Mathematical & Computational Biology
Citation Formats
Thomas, Paul D., Wood, Valerie, Mungall, Christopher J., Lewis, Suzanna E., and Blake, Judith A. On the Use of Gene Ontology Annotations to Assess Functional Similarity among Orthologs and Paralogs: A Short Report. United States: N. p., 2012.
Web. doi:10.1371/journal.pcbi.1002386.
Thomas, Paul D., Wood, Valerie, Mungall, Christopher J., Lewis, Suzanna E., & Blake, Judith A. (2012). On the Use of Gene Ontology Annotations to Assess Functional Similarity among Orthologs and Paralogs: A Short Report. United States. https://doi.org/10.1371/journal.pcbi.1002386
Thomas, Paul D., Wood, Valerie, Mungall, Christopher J., Lewis, Suzanna E., and Blake, Judith A. 2012.
"On the Use of Gene Ontology Annotations to Assess Functional Similarity among Orthologs and Paralogs: A Short Report". United States. https://doi.org/10.1371/journal.pcbi.1002386. https://www.osti.gov/servlets/purl/1627221.
@article{osti_1627221,
title = {On the Use of Gene Ontology Annotations to Assess Functional Similarity among Orthologs and Paralogs: A Short Report},
author = {Thomas, Paul D. and Wood, Valerie and Mungall, Christopher J. and Lewis, Suzanna E. and Blake, Judith A.},
abstractNote = {A recent paper (Nehrt et al., PLoS Comput. Biol. 7:e1002073, 2011) has proposed a metric for the ‘‘functional similarity’’ between two genes that uses only the Gene Ontology (GO) annotations directly derived from published experimental results. Applying this metric, the authors concluded that paralogous genes within the mouse genome or the human genome are more functionally similar on average than orthologous genes between these genomes, an unexpected result with broad implications if true. We suggest, based on both theoretical and empirical considerations, that this proposed metric should not be interpreted as a functional similarity, and therefore cannot be used to support any conclusions about the ‘‘ortholog conjecture’’ (or, more properly, the ‘‘ortholog functional conservation hypothesis’’). First, we reexamine the case studies presented by Nehrt et al. as examples of orthologs with divergent functions, and come to a very different conclusion: they actually exemplify how GO annotations for orthologous genes provide complementary information about conserved biological functions. We then show that there is a global ascertainment bias in the experiment-based GO annotations for human and mouse genes: particular types of experiments tend to be performed in different model organisms. We conclude that the reported statistical differences in annotations between pairs of orthologous genes do not reflect differences in biological function, but rather complementarity in experimental approaches. 
Our results underscore two general considerations for researchers proposing novel types of analysis based on the GO: 1) that GO annotations are often incomplete, potentially in a biased manner, and subject to an ‘‘open world assumption’’ (absence of an annotation does not imply absence of a function), and 2) that conclusions drawn from a novel, large-scale GO analysis should whenever possible be supported by careful, in-depth examination of examples, to help ensure the conclusions have a justifiable biological basis.},
doi = {10.1371/journal.pcbi.1002386},
journal = {PLoS Computational Biology (Online)},
number = 2,
volume = 8,
place = {United States},
year = {2012},
month = feb
}
Works referenced in this record:
The Mouse Genome Database (MGD): premier model organism resource for mammalian genomics and genetics
journal, November 2010
- Blake, J. A.; Bult, C. J.; Kadin, J. A.
- Nucleic Acids Research, Vol. 39, Issue Database
The MAP Kinase Signaling Cascades: A System of Hundreds of Components Regulates a Diverse Array of Physiological Functions
book, January 2010
- Keshet, Yonat; Seger, Rony
- MAP Kinase Signaling Protocols
Testing the Ortholog Conjecture with Comparative Functional Genomic Data from Mammals
journal, June 2011
- Nehrt, Nathan L.; Clark, Wyatt T.; Radivojac, Predrag
- PLoS Computational Biology, Vol. 7, Issue 6
A novel DNA damage recognition protein in Schizosaccharomyces pombe
journal, April 2006
- Pearson, S. J.
- Nucleic Acids Research, Vol. 34, Issue 8
How confident can we be that orthologs are similar, but paralogs differ?
journal, May 2009
- Studer, Romain A.; Robinson-Rechavi, Marc
- Trends in Genetics, Vol. 25, Issue 5
The 3′ Ends of Mature Transcripts Are Generated by a Processosome Complex in Fission Yeast Mitochondria
journal, April 2008
- Hoffmann, Bastian; Nickel, Jens; Speer, Falk
- Journal of Molecular Biology, Vol. 377, Issue 4
Control of a Kinesin-Cargo Linkage Mechanism by JNK Pathway Kinases
journal, August 2007
- Horiuchi, Dai; Collins, Catherine A.; Bhat, Pavan
- Current Biology, Vol. 17, Issue 15
Physiological and Molecular Basis of Thyroid Hormone Action
journal, July 2001
- Yen, Paul M.
- Physiological Reviews, Vol. 81, Issue 3
Gene Ontology: tool for the unification of biology
journal, May 2000
- Ashburner, Michael; Ball, Catherine A.; Blake, Judith A.
- Nature Genetics, Vol. 25, Issue 1
The GOA database in 2009--an integrated Gene Ontology Annotation resource
journal, January 2009
- Barrell, D.; Dimmer, E.; Huntley, R. P.
- Nucleic Acids Research, Vol. 37, Issue Database
Gene Ontology annotations: what they mean and where they come from
journal, January 2008
- Hill, David P.; Smith, Barry; McAndrews-Hill, Monica S.
- BMC Bioinformatics, Vol. 9, Issue Suppl 5
The Gene Ontology in 2010: extensions and refinements
journal, January 2010
- Consortium, The Gene Ontology
- Nucleic Acids Research, Vol. 38, Issue suppl_1, p. D331-D335
Protein Evolution by Molecular Tinkering: Diversification of the Nuclear Receptor Superfamily from a Ligand-Dependent Ancestor
journal, October 2010
- Bridgham, Jamie T.; Eick, Geeta N.; Larroux, Claire
- PLoS Biology, Vol. 8, Issue 10
Evolution of Hormone-Receptor Complexity by Molecular Exploitation
journal, April 2006
- Bridgham, J. T.
- Science, Vol. 312, Issue 5770
Distinguishing Homologous from Analogous Proteins
journal, June 1970
- Fitch, Walter M.
- Systematic Zoology, Vol. 19, Issue 2
When orthologs diverge between human and mouse
journal, June 2011
- Gharib, W. H.; Robinson-Rechavi, M.
- Briefings in Bioinformatics, Vol. 12, Issue 5
Motor Proteins: Trafficking and Signaling Collide
journal, September 2007
- Verhey, Kristen J.
- Current Biology, Vol. 17, Issue 18
Estrogen receptors and human disease
journal, March 2006
- Deroo, B. J.
- Journal of Clinical Investigation, Vol. 116, Issue 3
Works referencing / citing this record:
Genome-Wide Analysis of Protein Disorder in Arabidopsis thaliana: Implications for Plant Environmental Adaptation
journal, February 2013
- Pietrosemoli, Natalia; García-Martín, Juan A.; Solano, Roberto
- PLoS ONE, Vol. 8, Issue 2
Identifying mouse developmental essential genes using machine learning
journal, December 2018
- Tian, David; Wenlock, Stephanie; Kabir, Mitra
- Disease Models & Mechanisms, Vol. 11, Issue 12
Standardized benchmarking in the quest for orthologs
journal, April 2016
- Altenhoff, Adrian M.; Boeckmann, Brigitte; Capella-Gutierrez, Salvador
- Nature Methods, Vol. 13, Issue 5
ARTDeco: automatic readthrough transcription detection
journal, May 2020
- Roth, Samuel J.; Heinz, Sven; Benner, Christopher
- BMC Bioinformatics, Vol. 21, Issue 1
Biological interpretation of genome-wide association studies using predicted gene functions
journal, January 2015
- Pers, Tune H.; Karjalainen, Juha M.; Chan, Yingleong
- Nature Communications, Vol. 6, Issue 1
Functional and evolutionary implications of gene orthology
journal, April 2013
- Gabaldón, Toni; Koonin, Eugene V.
- Nature Reviews Genetics, Vol. 14, Issue 5
Protein Function Prediction: Problems and Pitfalls
journal, September 2015
- Pearson, William R.
- Current Protocols in Bioinformatics, Vol. 51, Issue 1
An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework
journal, January 2016
- Chen, Yi-An; Tripathi, Lokesh P.; Mizuguchi, Kenji
- Database, Vol. 2016
Big data and other challenges in the quest for orthologs
journal, July 2014
- Sonnhammer, E. L. L.; Gabaldon, T.; Sousa da Silva, A. W.
- Bioinformatics, Vol. 30, Issue 21
Interspecies gene function prediction using semantic similarity
journal, December 2016
- Yu, Guoxian; Luo, Wei; Fu, Guangyuan
- BMC Systems Biology, Vol. 10, Issue S4
Semantic Similarity from Natural Language and Ontology Analysis
journal, May 2015
- Harispe, Sébastien; Ranwez, Sylvie; Janaqi, Stefan
- Synthesis Lectures on Human Language Technologies, Vol. 8, Issue 1
Semantic Similarity from Natural Language and Ontology Analysis
text, January 2017
- Harispe, Sébastien; Ranwez, Sylvie; Janaqi, Stefan
- arXiv
A Tight Link between Orthologs and Bidirectional Best Hits in Bacterial and Archaeal Genomes
journal, November 2012
- Wolf, Yuri I.; Koonin, Eugene V.
- Genome Biology and Evolution, Vol. 4, Issue 12
Functional and structural profiles of GST gene family from three Populus species reveal the sequence–function decoupling of orthologous genes
journal, September 2018
- Yang, Qi; Han, Xue‐Min; Gu, Jin‐Ke
- New Phytologist, Vol. 221, Issue 2
Gene ontology improves template selection in comparative protein docking
journal, December 2018
- Hadarovich, Anna; Anishchenko, Ivan; Tuzikov, Alexander V.
- Proteins: Structure, Function, and Bioinformatics, Vol. 87, Issue 3
Conserved syntenic clusters of protein coding genes are missing in birds
journal, December 2014
- Lovell, Peter V.; Wirthlin, Morgan; Wilhelm, Larry
- Genome Biology, Vol. 15, Issue 12
The Ortholog Conjecture Revisited: the Value of Orthologs and Paralogs in Function Prediction
journal, December 2019
- Stamboulian, Moses; Guerrero, Rafael F.; Hahn, Matthew W.
- Bioinformatics
Pairwise comparisons across species are problematic when analyzing functional genomic data
journal, January 2018
- Dunn, Casey W.; Zapata, Felipe; Munro, Catriona
- Proceedings of the National Academy of Sciences, Vol. 115, Issue 3
Accurate prediction of orthologs in the presence of divergence after duplication
journal, June 2018
- Lafond, Manuel; Meghdari Miardan, Mona; Sankoff, David
- Bioinformatics, Vol. 34, Issue 13
Accurate prediction of orthologs in the presence of divergence after duplication
journal, April 2018
- Lafond, Manuel; Miardan, Mona Meghdari; Sankoff, David
- Bioinformatics
Human Monogenic Disease Genes Have Frequently Functionally Redundant Paralogs
journal, May 2013
- Chen, Wei-Hua; Zhao, Xing-Ming; van Noort, Vera
- PLoS Computational Biology, Vol. 9, Issue 5
The ortholog conjecture revisited: the value of orthologs and paralogs in function prediction
journal, July 2020
- Stamboulian, Moses; Guerrero, Rafael F.; Hahn, Matthew W.
- Bioinformatics, Vol. 36, Issue Supplement_1
OrtholugeDB: a bacterial and archaeal orthology resource for improved comparative genomic analysis
journal, November 2012
- Whiteside, Matthew D.; Winsor, Geoffrey L.; Laird, Matthew R.
- Nucleic Acids Research, Vol. 41, Issue D1
OrthoList 2: A New Comparative Genomic Analysis of Human and Caenorhabditis elegans Genes
journal, August 2018
- Kim, Woojin; Underwood, Ryan S.; Greenwald, Iva
- Genetics, Vol. 210, Issue 2
Standardized benchmarking in the quest for orthologs
text, January 2016
- Altenhoff, Adrian M.; et al.
- Nature Publishing Group
The Ortholog Conjecture Is Untestable by the Current Gene Ontology but Is Supported by RNA Sequencing Data
journal, November 2012
- Chen, Xiaoshu; Zhang, Jianzhi
- PLoS Computational Biology, Vol. 8, Issue 11
Gene Family Level Comparative Analysis of Gene Expression in Mammals Validates the Ortholog Conjecture
journal, March 2014
- Rogozin, Igor B.; Managadze, David; Shabalina, Svetlana A.
- Genome Biology and Evolution, Vol. 6, Issue 4
Protein Function Prediction Using Deep Restricted Boltzmann Machines
journal, January 2017
- Zou, Xianchun; Wang, Guijun; Yu, Guoxian
- BioMed Research International, Vol. 2017
The case of Iranian immigrants in the greater Toronto area: a qualitative study
journal, January 2012
- Dastjerdi, Mahdieh
- International Journal for Equity in Health, Vol. 11, Issue 1
Progress and challenges in the computational prediction of gene function using networks
journal, September 2012
- Pavlidis, Paul; Gillis, Jesse
- F1000Research, Vol. 1
Phyletic Profiling with Cliques of Orthologs Is Enhanced by Signatures of Paralogy Relationships
journal, January 2013
- Škunca, Nives; Bošnjak, Matko; Kriško, Anita
- PLoS Computational Biology, Vol. 9, Issue 1
Ten Quick Tips for Using the Gene Ontology
journal, November 2013
- Blake, Judith A.
- PLoS Computational Biology, Vol. 9, Issue 11
WORMHOLE: Novel Least Diverged Ortholog Prediction through Machine Learning
journal, November 2016
- Sutphin, George L.; Mahoney, J. Matthew; Sheppard, Keith
- PLOS Computational Biology, Vol. 12, Issue 11
Quickly Finding Orthologs as Reciprocal Best Hits with BLAT, LAST, and UBLAST: How Much Do We Miss?
journal, July 2014
- Ward, Natalie; Moreno-Hagelsieb, Gabriel
- PLoS ONE, Vol. 9, Issue 7
Characterizing the state of the art in the computational assignment of gene function: lessons from the first critical assessment of functional annotation (CAFA)
text, January 2013
- Gillis, Jesse; Pavlidis, Paul
- BioMed Central
In Silico Analysis and Experimental Validation of Active Compounds from Cichorium intybus L. Ameliorating Liver Injury
journal, September 2015
- Li, Guo-Yu; Zheng, Ya-Xin; Sun, Fu-Zhou
- International Journal of Molecular Sciences, Vol. 16, Issue 9
Resolving the Ortholog Conjecture: Orthologs Tend to Be Weakly, but Significantly, More Similar in Function than Paralogs
text, January 2012
- Altenhoff, Adrian M.; Studer, Romain A.; Robinson-Rechavi, Marc
- ETH Zurich
Phylogenetic Profiling: How Much Input Data Is Enough?
text, January 2015
- Škunca, Nives; Dessimoz, Christophe
- ETH Zurich
Standardized benchmarking in the quest for orthologs
text, January 2016
- Altenhoff, Adrian M.; Boeckmann, Brigitte; Capella-Gutierrez, Salvador
- ETH Zurich
Evaluating the adaptive evolutionary convergence of carnivorous plant taxa through functional genomics
journal, January 2018
- Wheeler, Gregory L.; Carstens, Bryan C.
- PeerJ, Vol. 6