DOE PAGES

Title: Towards a Semantic Lexicon for Biological Language Processing

This paper explores the use of the resources in the National Library of Medicine's Unified Medical Language System (UMLS) for the construction of a lexicon useful for processing texts in the field of molecular biology. A lexicon is constructed from overlapping terms in the UMLS SPECIALIST lexicon and the UMLS Metathesaurus to obtain both morphosyntactic and semantic information for terms, and the coverage of a domain corpus is assessed. Over 77% of tokens in the domain corpus are found in the constructed lexicon, validating the lexicon's coverage of the most frequent terms in the domain and indicating that the constructed lexicon is potentially an important resource for biological text processing.
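The construction and coverage check described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names `build_lexicon` and `token_coverage` are hypothetical, the two input dictionaries are tiny made-up stand-ins for the SPECIALIST lexicon and the Metathesaurus (no real UMLS file format is read), and matching is simplified to lowercased single tokens.

```python
from collections import Counter


def build_lexicon(specialist, metathesaurus):
    """Keep only terms present in both resources and merge their annotations.

    specialist:    term -> morphosyntactic category (toy stand-in)
    metathesaurus: term -> list of semantic types (toy stand-in)
    """
    shared = specialist.keys() & metathesaurus.keys()
    return {
        term: {
            "category": specialist[term],
            "semantic_types": metathesaurus[term],
        }
        for term in shared
    }


def token_coverage(tokens, lexicon):
    """Fraction of corpus tokens (repeats counted) that appear in the lexicon."""
    counts = Counter(token.lower() for token in tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(n for token, n in counts.items() if token in lexicon)
    return covered / total


# Hypothetical toy entries, for illustration only.
specialist = {"protein": "noun", "bind": "verb", "kinase": "noun", "the": "det"}
metathesaurus = {
    "protein": ["Amino Acid, Peptide, or Protein"],
    "kinase": ["Enzyme"],
    "cell": ["Cell"],
}

lexicon = build_lexicon(specialist, metathesaurus)
tokens = "the kinase binds the protein in the cell".split()
coverage = token_coverage(tokens, lexicon)
print(f"{len(lexicon)} shared terms, token coverage {coverage:.0%}")
```

Counting every occurrence of a token, rather than each distinct word once, is what makes the measure sensitive to the most frequent terms in the corpus, which is the sense in which the abstract reports coverage.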
Authors:
 [1]
  1. Los Alamos National Laboratory, PO Box 1663, MS B256, Los Alamos, NM 87545, USA
Publication Date:
OSTI Identifier:
1197935
Grant/Contract Number:
W-7405-ENG-36
Type:
Published Article
Journal Name:
Comparative and Functional Genomics
Additional Journal Information:
Journal Volume: 6; Journal Issue: 1-2; Related Information: CHORUS Timestamp: 2016-08-23 03:43:07; Journal ID: ISSN 1531-6912
Publisher:
Hindawi Publishing Corporation
Sponsoring Org:
USDOE
Country of Publication:
Country unknown/Code not available
Language:
English