OSTI.GOV
U.S. Department of Energy
Office of Scientific and Technical Information

Title: LEARNING SEMANTICS-ENHANCED LANGUAGE MODELS APPLIED TO UNSUPERVISED WSD

Abstract

An N-gram language model aims to capture statistical syntactic word order information from corpora. Although the concept of language models has been applied extensively to handle a variety of NLP problems with reasonable success, the standard model does not incorporate semantic information, which limits its applicability to semantic problems such as word sense disambiguation. We propose a framework that integrates semantic information into the language model schema, allowing a system to exploit both syntactic and semantic information to address NLP problems. Furthermore, acknowledging the limited availability of semantically annotated data, we discuss how the proposed model can be learned without annotated training examples. Finally, we report on a case study showing how the semantics-enhanced language model can be applied to unsupervised word sense disambiguation with promising results.

Authors:
 Verspoor, Karin [1]; Lin, Shou-De [1]
  1. Los Alamos National Laboratory
Publication Date:
2007-01-29
Research Org.:
Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
OSTI Identifier:
985889
Report Number(s):
LA-UR-07-0618
TRN: US201017%%67
DOE Contract Number:
AC52-06NA25396
Resource Type:
Conference
Resource Relation:
Conference: ASSOCIATION FOR COMPUTATIONAL LINGUISTICS ANNUAL MEETING ; 200706 ; PRAGUE
Country of Publication:
United States
Language:
English
Subject:
99; COMMUNICATIONS; INFORMATION RETRIEVAL; MACHINE TRANSLATIONS; STANDARDIZED TERMINOLOGY

Citation Formats

VERSPOOR, KARIN, and LIN, SHOU-DE. LEARNING SEMANTICS-ENHANCED LANGUAGE MODELS APPLIED TO UNSUPERVISED WSD. United States: N. p., 2007. Web.
VERSPOOR, KARIN, & LIN, SHOU-DE. (2007). LEARNING SEMANTICS-ENHANCED LANGUAGE MODELS APPLIED TO UNSUPERVISED WSD. United States.
VERSPOOR, KARIN, and LIN, SHOU-DE. 2007. "LEARNING SEMANTICS-ENHANCED LANGUAGE MODELS APPLIED TO UNSUPERVISED WSD". United States. https://www.osti.gov/servlets/purl/985889.
@article{osti_985889,
title = {LEARNING SEMANTICS-ENHANCED LANGUAGE MODELS APPLIED TO UNSUPERVISED WSD},
author = {VERSPOOR, KARIN and LIN, SHOU-DE},
abstractNote = {An N-gram language model aims to capture statistical syntactic word order information from corpora. Although the concept of language models has been applied extensively to handle a variety of NLP problems with reasonable success, the standard model does not incorporate semantic information, which limits its applicability to semantic problems such as word sense disambiguation. We propose a framework that integrates semantic information into the language model schema, allowing a system to exploit both syntactic and semantic information to address NLP problems. Furthermore, acknowledging the limited availability of semantically annotated data, we discuss how the proposed model can be learned without annotated training examples. Finally, we report on a case study showing how the semantics-enhanced language model can be applied to unsupervised word sense disambiguation with promising results.},
place = {United States},
year = {2007},
month = {1}
}

Conference:
Other availability
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this conference proceeding.

Similar Records:
  • Systems based on distributed agent architectures require an agent communications language with clearly defined semantics. This paper demonstrates that a semantics for an agent communications language can be founded on the premise that agents are building, maintaining, and disbanding teams through their performance of communicative acts. This view requires that definitions of basic communicative acts, such as requesting, be recast in terms of the formation of a joint intention - a mental state that has been suggested to underlie team behavior. To illustrate these points, a semantics is developed for a number of communication actions that can form and dissolve teams. It is then demonstrated how much of the structure of popular finite-state dialogue models, such as Winograd and Flores' basic conversation for action, follows as a consequence of the logical relationships that are created by the redefined communicative actions.
  • This work integrates three related AI search techniques - constraint satisfaction, branch-and-bound and solution synthesis - and applies the result to semantic processing in natural language (NL). We summarize the approach as "Hunter-Gatherer": (1) branch-and-bound and constraint satisfaction allow us to "hunt down" non-optimal and impossible solutions and prune them from the search space. (2) solution synthesis methods then "gather" all optimal solutions avoiding exponential complexity. Each of the three techniques is briefly described, as well as their extensions and combinations used in our system. We focus on the combination of solution synthesis and branch-and-bound methods which has enabled near-linear-time processing in our applications. Finally, we illustrate how the use of our technique in a large-scale MT project allowed a drastic reduction in search space.
  • In this paper we discuss the control-domain-specific ontology that is built on top of the domain-neutral Resource Description Framework (RDF). Specifically, we will discuss the relevant set of ontology concepts along with the relationships among them in order to describe experiment control components and generic event-based state machines. Control Oriented Ontology Language (COOL) is a meta-data modeling language that provides generic means for representation of physics experiment control processes and components, and their relationships, rules and axioms. It provides a semantic reference frame that is useful for automating the communication of information for configuration, deployment and operation. COOL has been successfully used to develop a complete and dynamic knowledge-base for experiment control systems, developed using the AFECS framework.
  • We present briefly a language which integrates the description of a data model, data manipulation language and integrity constraints into one coherent framework, resembling that proposed by several recent papers in the field of semantic data models. We then give two formal specifications of the semantics of the model and DML: one, based on states and state transactions, intended for database implementors and programmers, and one, based on axioms and partial correctness assertions, intended for verifiers who wish to show that the system maintains integrity constraints. Most significantly, we sketch the proof that the deductive theory is sound and complete and hence matches the state transition semantics.
  • No abstract prepared.