
Title: Towards a Semantic Lexicon for Biological Language Processing

Abstract

This paper explores the use of the resources in the National Library of Medicine's Unified Medical Language System (UMLS) for the construction of a lexicon useful for processing texts in the field of molecular biology. A lexicon is constructed from overlapping terms in the UMLS SPECIALIST lexicon and the UMLS Metathesaurus to obtain both morphosyntactic and semantic information for terms, and the coverage of a domain corpus is assessed. Over 77% of tokens in the domain corpus are found in the constructed lexicon, validating the lexicon's coverage of the most frequent terms in the domain and indicating that the constructed lexicon is potentially an important resource for biological text processing.
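The two steps named in the abstract, intersecting the terms of two resources and measuring token coverage over a corpus, can be illustrated with a minimal sketch. This is not the paper's implementation. The function names `build_lexicon` and `token_coverage`, the dictionary layout, and the toy entries are all assumptions made for illustration, and the sketch matches single lowercased tokens only, whereas the UMLS resources also contain multi-word terms.

```python
from collections import Counter

def build_lexicon(specialist, metathesaurus):
    """Keep only terms present in both resources (hypothetical layout).

    specialist:     term -> syntactic category (morphosyntactic side)
    metathesaurus:  term -> list of semantic labels (semantic side)
    """
    shared = specialist.keys() & metathesaurus.keys()
    return {
        term: {"category": specialist[term], "semantic": metathesaurus[term]}
        for term in shared
    }

def token_coverage(tokens, lexicon):
    """Fraction of corpus tokens (not distinct types) found in the lexicon."""
    counts = Counter(token.lower() for token in tokens)
    total = sum(counts.values())
    covered = sum(n for token, n in counts.items() if token in lexicon)
    return covered / total if total else 0.0

# Toy stand-ins for the two resources; the entries are illustrative only.
specialist = {"protein": "noun", "kinase": "noun", "bind": "verb"}
metathesaurus = {"protein": ["label-A"], "kinase": ["label-B"], "receptor": ["label-C"]}

lexicon = build_lexicon(specialist, metathesaurus)
corpus_tokens = ["The", "kinase", "binds", "the", "protein"]
coverage = token_coverage(corpus_tokens, lexicon)
```

Counting tokens rather than distinct word types means frequent terms weigh more heavily, which matches the abstract's framing of coverage in terms of the most frequent terms in the domain.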

Authors:
 Verspoor, Karin [1]
  1. Los Alamos National Laboratory, PO Box 1663, MS B256, Los Alamos, NM 87545, USA
Publication Date:
2005
Sponsoring Org.:
USDOE
OSTI Identifier:
1197935
Grant/Contract Number:  
W-7405-ENG-36
Resource Type:
Published Article
Journal Name:
Comparative and Functional Genomics
Additional Journal Information:
Journal Name: Comparative and Functional Genomics; Journal Volume: 6; Journal Issue: 1-2; Journal ID: ISSN 1531-6912
Publisher:
Hindawi Publishing Corporation
Country of Publication:
Country unknown/Code not available
Language:
English

Citation Formats

Verspoor, Karin. Towards a Semantic Lexicon for Biological Language Processing. Country unknown/Code not available: N. p., 2005. Web. doi:10.1002/cfg.451.
Verspoor, Karin. Towards a Semantic Lexicon for Biological Language Processing. Country unknown/Code not available. doi:10.1002/cfg.451.
Verspoor, Karin. "Towards a Semantic Lexicon for Biological Language Processing". Country unknown/Code not available. doi:10.1002/cfg.451.
@article{osti_1197935,
title = {Towards a Semantic Lexicon for Biological Language Processing},
author = {Verspoor, Karin},
abstractNote = {This paper explores the use of the resources in the National Library of Medicine's Unified Medical Language System (UMLS) for the construction of a lexicon useful for processing texts in the field of molecular biology. A lexicon is constructed from overlapping terms in the UMLS SPECIALIST lexicon and the UMLS Metathesaurus to obtain both morphosyntactic and semantic information for terms, and the coverage of a domain corpus is assessed. Over 77% of tokens in the domain corpus are found in the constructed lexicon, validating the lexicon's coverage of the most frequent terms in the domain and indicating that the constructed lexicon is potentially an important resource for biological text processing.},
doi = {10.1002/cfg.451},
journal = {Comparative and Functional Genomics},
number = {1-2},
volume = 6,
place = {Country unknown/Code not available},
year = {2005},
month = {1}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record
DOI: 10.1002/cfg.451

Citation Metrics:
Cited by: 6 works
Citation information provided by
Web of Science
