OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: ROPE: Recoverable Order-Preserving Embedding of Natural Language

Abstract

We present a novel Recoverable Order-Preserving Embedding (ROPE) of natural language. ROPE maps natural language passages from sparse concatenated one-hot representations to distributed vector representations of predetermined fixed length. We use Euclidean distance to return search results that are both grammatically and semantically similar. ROPE is based on a series of random projections of distributed word embeddings. We show that our technique typically forms a dictionary with sufficient incoherence such that sparse recovery of the original text is possible. We then show how our embedding allows for efficient and meaningful natural search and retrieval on Microsoft’s COCO dataset and the IMDB Movie Review dataset.
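The abstract does not spell out the construction, so the following is only a minimal sketch of one plausible reading of it, not the authors' implementation. It assumes that each word position gets its own Gaussian random projection, that a passage embedding is the sum of the projected word embeddings, and that the dictionary atoms are the projected (position, word) pairs. The vocabulary, the dimensions, the random stand-in word vectors, and every function name are hypothetical. Recovery is done here by brute-force nearest-neighbour search, which stands in for a sparse-recovery solver and is only feasible at toy scale.

```python
import itertools
import math
import random

random.seed(0)

VOCAB = ["a", "dog", "cat", "chases"]   # hypothetical toy vocabulary
EMBED_DIM = 5    # size of each word embedding (assumed)
OUT_DIM = 12     # fixed length of the passage embedding (assumed)
MAX_LEN = 3      # number of word positions supported (assumed)

# Stand-in word embeddings; a real system would load trained vectors.
word_vecs = {w: [random.gauss(0, 1) for _ in range(EMBED_DIM)] for w in VOCAB}

# One Gaussian random projection (OUT_DIM x EMBED_DIM) per word position,
# so the same word contributes differently depending on where it occurs.
projections = [
    [[random.gauss(0, 1) / math.sqrt(OUT_DIM) for _ in range(EMBED_DIM)]
     for _ in range(OUT_DIM)]
    for _ in range(MAX_LEN)
]

def atom(position, word):
    """Dictionary column for `word` at `position`: projection times word vector."""
    vec = word_vecs[word]
    return [sum(p * v for p, v in zip(row, vec)) for row in projections[position]]

def embed(words):
    """Sum the position-specific projections of each word's embedding."""
    out = [0.0] * OUT_DIM
    for position, word in enumerate(words):
        out = [o + a for o, a in zip(out, atom(position, word))]
    return out

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def mutual_coherence():
    """Largest absolute cosine similarity between two distinct atoms."""
    atoms = [atom(i, w) for i in range(MAX_LEN) for w in VOCAB]
    best = 0.0
    for x, y in itertools.combinations(atoms, 2):
        dot = sum(a * b for a, b in zip(x, y))
        norms = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
        best = max(best, abs(dot) / norms)
    return best

def recover(target):
    """Brute-force stand-in for sparse recovery over all length-MAX_LEN passages."""
    return min(itertools.product(VOCAB, repeat=MAX_LEN),
               key=lambda seq: euclidean(embed(seq), target))

passage = ["dog", "chases", "cat"]
reversed_passage = ["cat", "chases", "dog"]
print("order-sensitive distance:", euclidean(embed(passage), embed(reversed_passage)))
print("mutual coherence:", mutual_coherence())
print("recovered:", recover(embed(passage)))
```

In this sketch, swapping the first and last words changes the embedding because each position uses a different projection, which is one way an embedding could be order-preserving. A lower mutual coherence means the atoms are closer to orthogonal, which is the property the abstract ties to recoverability.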

Authors:
 Widemann, David P. [1]; Wang, Eric X. [1]; Thiagarajan, Jayaraman J. [1]
  1. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Publication Date:
2016-02
Research Org.:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1239214
Report Number(s):
LLNL-TR-682663
DOE Contract Number:  
AC52-07NA27344
Resource Type:
Technical Report
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE

Citation Formats

Widemann, David P., Wang, Eric X., and Thiagarajan, Jayaraman J. ROPE: Recoverable Order-Preserving Embedding of Natural Language. United States: N. p., 2016. Web. doi:10.2172/1239214.
Widemann, David P., Wang, Eric X., & Thiagarajan, Jayaraman J. (2016). ROPE: Recoverable Order-Preserving Embedding of Natural Language. United States. https://doi.org/10.2172/1239214
Widemann, David P., Wang, Eric X., and Thiagarajan, Jayaraman J. 2016. "ROPE: Recoverable Order-Preserving Embedding of Natural Language". United States. https://doi.org/10.2172/1239214. https://www.osti.gov/servlets/purl/1239214.
@article{osti_1239214,
title = {ROPE: Recoverable Order-Preserving Embedding of Natural Language},
author = {Widemann, David P. and Wang, Eric X. and Thiagarajan, Jayaraman J.},
abstractNote = {We present a novel Recoverable Order-Preserving Embedding (ROPE) of natural language. ROPE maps natural language passages from sparse concatenated one-hot representations to distributed vector representations of predetermined fixed length. We use Euclidean distance to return search results that are both grammatically and semantically similar. ROPE is based on a series of random projections of distributed word embeddings. We show that our technique typically forms a dictionary with sufficient incoherence such that sparse recovery of the original text is possible. We then show how our embedding allows for efficient and meaningful natural search and retrieval on Microsoft’s COCO dataset and the IMDB Movie Review dataset.},
doi = {10.2172/1239214},
url = {https://www.osti.gov/biblio/1239214},
place = {United States},
year = {2016},
month = {2}
}