U.S. Department of Energy
Office of Scientific and Technical Information

LEARNING SEMANTICS-ENHANCED LANGUAGE MODELS APPLIED TO UNSUPERVISED WSD

Conference
OSTI ID:985889

An N-gram language model aims to capture statistical, syntactic word-order information from corpora. Although language models have been applied extensively to a variety of NLP problems with reasonable success, the standard model does not incorporate semantic information, which limits its applicability to semantic problems such as word sense disambiguation (WSD). We propose a framework that integrates semantic information into the language model schema, allowing a system to exploit both syntactic and semantic information to address NLP problems. Furthermore, acknowledging the limited availability of semantically annotated data, we discuss how the proposed model can be learned without annotated training examples. Finally, we report on a case study showing how the semantics-enhanced language model can be applied to unsupervised word sense disambiguation with promising results.
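
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of a standard bigram (N = 2) language model estimated by maximum likelihood. It illustrates only the word-order statistics that a plain N-gram model captures; it is not the semantics-enhanced model proposed in the paper, whose details this record does not give. The function names, the sentence-boundary tokens, and the toy corpus are all assumptions made for illustration.

```python
from collections import Counter


def train_bigram(sentences):
    """Count history words and bigrams over tokenized sentences.

    Sentence-boundary markers "<s>" and "</s>" are an illustrative
    assumption, not something specified in the abstract.
    """
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded[:-1])             # words that serve as history
        bigrams.update(zip(padded, padded[1:]))  # (previous word, next word)
    return unigrams, bigrams


def bigram_prob(unigrams, bigrams, prev, word):
    """Maximum-likelihood estimate of P(word | prev); 0.0 for unseen histories."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]


# Toy corpus (hypothetical) in which "bank" occurs in two different senses.
corpus = [
    ["the", "bank", "approved", "the", "loan"],
    ["the", "river", "bank", "flooded"],
]
unigrams, bigrams = train_bigram(corpus)
p_bank_given_the = bigram_prob(unigrams, bigrams, "the", "bank")
p_bank_given_river = bigram_prob(unigrams, bigrams, "river", "bank")
```

In this sketch the model assigns probabilities to the surface word "bank" but has no notion of which sense is meant, which is the limitation the abstract describes.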

Research Organization:
Los Alamos National Laboratory (LANL)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA)
DOE Contract Number:
AC52-06NA25396
OSTI ID:
985889
Report Number(s):
LA-UR-07-0618
Country of Publication:
United States
Language:
English

Similar Records

Experiments in automatic word class and word sense identification for information retrieval
Technical Report · Fri Dec 30 23:00:00 EST 1994 · OSTI ID:68594

Computationally Efficient Learning of Quality Controlled Word Embeddings for Natural Language Processing
Conference · Mon Jul 01 00:00:00 EDT 2019 · OSTI ID:1545208

Semantic role labeling for protein transport predicates
Journal Article · Wed Jun 11 00:00:00 EDT 2008 · BMC Bioinformatics · OSTI ID:1626355