Protein annotation as term categorization in the gene ontology using word proximity networks
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States); DOE/OSTI
- Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
- Indiana Univ., Bloomington, IN (United States). School of Informatics, Cognitive Science Program
- Indiana Univ., Bloomington, IN (United States). Cognitive Science Program
Background: We participated in the BioCreAtIvE Task 2, which addressed the annotation of proteins into the Gene Ontology (GO) based on the text of a given document and the selection of evidence text from the document justifying that annotation. We approached the task utilizing several combinations of two distinct methods: an unsupervised algorithm for expanding words associated with GO nodes, and an annotation methodology which treats annotation as categorization of terms from a protein's document neighborhood into the GO.
Results: The evaluation results indicate that the method for expanding words associated with GO nodes is quite powerful; we were able to successfully select appropriate evidence text for a given annotation in 38% of Task 2.1 queries by building on this method. The term categorization methodology achieved a precision of 16% for annotation within the correct extended family in Task 2.2, though we show through subsequent analysis that this can be improved with a different parameter setting. Our architecture proved not to be very successful on the evidence text component of the task, in the configuration used to generate the submitted results.
Conclusion: The initial results show promise for both of the methods we explored, and we are planning to integrate the methods more closely to achieve better results overall.
- Research Organization:
- Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Biological and Environmental Research (BER). Biological Systems Science Division
- Grant/Contract Number:
- AC52-06NA25396
- OSTI ID:
- 1626313
- Journal Information:
- Journal Name: BMC Bioinformatics; Journal Volume: 6; Journal Issue: Suppl 1; ISSN 1471-2105
- Publisher:
- BioMed Central
- Country of Publication:
- United States
- Language:
- English