U.S. Department of Energy
Office of Scientific and Technical Information

Word prediction

Technical Report · DOI: https://doi.org/10.2172/123254 · OSTI ID: 123254

In this project, we developed a language model based on artificial neural networks (ANNs) for use with automatic textual search or speech recognition systems. The model can be trained on large text corpora to produce probability estimates that would improve a system's ability to identify words in a sentence from partial contextual information. It uses a gradient-descent learning procedure to develop a context-based metric of similarity among the terms in a corpus. Using lexical categories derived from this metric, a network can then be trained to perform serial word probability estimation. The metric can also improve topic-based search by allowing retrieval of information related to the desired topics even when no obvious set of keywords unites all the retrieved items.
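
The abstract does not give the network architecture, so the following is only a minimal sketch of the general idea, not the report's method. It assumes a tiny bilinear softmax model trained by stochastic gradient descent to predict the next word from the previous one, and it uses cosine similarity between the learned input vectors as a stand-in for the context-based similarity metric. The function names, the toy corpus, and all hyperparameters (`dim`, `lr`, `epochs`) are hypothetical choices made for illustration.

```python
import math
import random


def train_bigram_model(sentences, dim=8, lr=0.1, epochs=200, seed=0):
    """Fit P(next | prev) = softmax(U . v_prev) by stochastic gradient descent.

    Illustrative assumption only: the report's actual model is not specified.
    """
    rng = random.Random(seed)
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    # Training examples: (previous word id, next word id) for adjacent words.
    pairs = [(index[a], index[b]) for s in sentences for a, b in zip(s, s[1:])]
    # One input vector (V) and one output vector (U) per vocabulary word.
    V = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in vocab]
    U = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in vocab]

    def probs(prev):
        """Return the estimated distribution over next words given `prev`."""
        scores = [sum(u_k * v_k for u_k, v_k in zip(u, V[prev])) for u in U]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    def mean_loss():
        """Average negative log-likelihood over all training pairs."""
        return -sum(math.log(probs(a)[b]) for a, b in pairs) / len(pairs)

    loss_before = mean_loss()
    for _ in range(epochs):
        for prev, nxt in pairs:
            p = probs(prev)
            v_old = V[prev][:]
            grad_v = [0.0] * dim
            for w, u in enumerate(U):
                err = p[w] - (1.0 if w == nxt else 0.0)  # dLoss/dscore_w
                for k in range(dim):
                    grad_v[k] += err * u[k]       # uses u before its update
                    u[k] -= lr * err * v_old[k]   # gradient step on U
            for k in range(dim):
                V[prev][k] -= lr * grad_v[k]      # gradient step on V
    return index, V, probs, (loss_before, mean_loss())


def cosine(a, b):
    """Cosine similarity between two vectors, used here as the term metric."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy corpus (hypothetical): "cat" and "dog" occur in the same contexts.
corpus = [s.split() for s in [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat sat on a rug",
    "a dog sat on a mat",
]]
index, V, probs, (loss_before, loss_after) = train_bigram_model(corpus)
next_word_probs = probs(index["cat"])  # estimated P(next word | "cat")
cat_dog_similarity = cosine(V[index["cat"]], V[index["dog"]])
```

In this sketch, `next_word_probs` plays the role of the serial word probability estimate, and `cat_dog_similarity` illustrates how a context-trained metric could relate terms that share no surface form. Grouping words into lexical categories, as the abstract describes, would be an additional step (for example, clustering the rows of `V`) that is not shown here.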

Research Organization: Lawrence Livermore National Lab., CA (United States)
Sponsoring Organization: USDOE, Washington, DC (United States)
DOE Contract Number: W-7405-ENG-48
OSTI ID: 123254
Report Number(s): UCRL-ID--121250; ON: DE96001879
Country of Publication: United States
Language: English

Similar Records

Experiments in automatic word class and word sense identification for information retrieval
Technical Report · Fri Dec 30 23:00:00 EST 1994 · OSTI ID:68594

Improving Synonym Recommendation Using Sentence Context
Conference · Tue Nov 09 23:00:00 EST 2021 · OSTI ID:1830146

Statistical modeling of biomedical corpora: mining the Caenorhabditis Genetic Center Bibliography for genes related to life span
Journal Article · Mon May 08 00:00:00 EDT 2006 · BMC Bioinformatics · OSTI ID:1626320