U.S. Department of Energy
Office of Scientific and Technical Information

Protein annotation as term categorization in the gene ontology using word proximity networks

Journal Article · BMC Bioinformatics
 [1];  [2];  [2];  [2];  [2];  [3];  [4]
  1. Los Alamos National Lab. (LANL), Los Alamos, NM (United States); DOE/OSTI
  2. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
  3. Indiana Univ., Bloomington, IN (United States). School of Informatics, Cognitive Science Program
  4. Indiana Univ., Bloomington, IN (United States). Cognitive Science Program

Background: We participated in the BioCreAtIvE Task 2, which addressed the annotation of proteins into the Gene Ontology (GO) based on the text of a given document and the selection of evidence text from the document justifying that annotation. We approached the task utilizing several combinations of two distinct methods: an unsupervised algorithm for expanding words associated with GO nodes, and an annotation methodology which treats annotation as categorization of terms from a protein's document neighborhood into the GO.

Results: The evaluation results indicate that the method for expanding words associated with GO nodes is quite powerful; we were able to successfully select appropriate evidence text for a given annotation in 38% of Task 2.1 queries by building on this method. The term categorization methodology achieved a precision of 16% for annotation within the correct extended family in Task 2.2, though we show through subsequent analysis that this can be improved with a different parameter setting. Our architecture proved not to be very successful on the evidence text component of the task, in the configuration used to generate the submitted results.

Conclusion: The initial results show promise for both of the methods we explored, and we are planning to integrate the methods more closely to achieve better results overall.

Research Organization:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER). Biological Systems Science Division
Grant/Contract Number:
AC52-06NA25396
OSTI ID:
1626313
Journal Information:
Journal Name: BMC Bioinformatics; Journal Volume: 6; Journal Issue: Suppl 1; ISSN 1471-2105
Publisher:
BioMed Central
Country of Publication:
United States
Language:
English
