Visual Exploration of Semantic Relationships in Neural Word Embeddings
- Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Univ. of Utah, Salt Lake City, UT (United States). School of Computing
- Univ. of Utah, Salt Lake City, UT (United States). SCI Inst.
Constructing distributed representations for words through neural language models and using the resulting vector spaces for analysis has become a crucial component of natural language processing (NLP). But, despite their widespread application, little is known about the structure and properties of these spaces. To gain insights into the relationship between words, the NLP community has begun to adapt high-dimensional visualization techniques. Particularly, researchers commonly use t-distributed stochastic neighbor embeddings (t-SNE) and principal component analysis (PCA) to create two-dimensional embeddings for assessing the overall structure and exploring linear relationships (e.g., word analogies), respectively. Unfortunately, these techniques often produce mediocre or even misleading results and cannot address domain-specific visualization challenges that are crucial for understanding semantic relationships in word embeddings. We introduce new embedding techniques for visualizing semantic and syntactic analogies, and the corresponding tests to determine whether the resulting views capture salient structures. Additionally, we introduce two novel views for a comprehensive study of analogy relationships. Finally, we augment t-SNE embeddings to convey uncertainty information in order to allow a reliable interpretation. Combined, the different views address a number of domain-specific tasks difficult to solve with existing tools.
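The PCA-based analogy view mentioned in the abstract can be illustrated with a minimal sketch. It assumes toy, synthetic vectors rather than real word embeddings, and `pca_2d` is a hypothetical helper, not the authors' method. Because PCA is a linear projection, word pairs that share one offset in the original space keep nearly parallel offsets in the 2-D view; how well real embeddings satisfy that assumption is what the paper's views are meant to help assess.

```python
import numpy as np

def pca_2d(vectors):
    """Project the rows of `vectors` onto their top two principal components."""
    centered = vectors - vectors.mean(axis=0)
    # SVD of the centered data; the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

# Toy data (an assumption, not real embeddings): four "base" words and four
# "partner" words that differ from them by one shared offset plus small noise.
rng = np.random.default_rng(0)
dim = 50
offset = 3.0 * rng.normal(size=dim)
bases = rng.normal(size=(4, dim))
noise = 0.05 * rng.normal(size=(4, dim))
words = np.vstack([bases, bases + offset + noise])

proj = pca_2d(words)
# Offsets between each analogy pair, measured in the 2-D projection.
diffs = proj[4:] - proj[:4]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Compare every projected offset against the first one.
similarities = [cosine(diffs[0], d) for d in diffs[1:]]
print(similarities)
```

Similarities close to 1 indicate that the analogy pairs remain parallel after projection. With real embeddings the offsets are only approximately shared, so the same check can reveal pairs that deviate.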
- Research Organization: Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Sponsoring Organization: USDOE National Nuclear Security Administration (NNSA); National Science Foundation (NSF)
- Grant/Contract Number: AC52-07NA27344; SC0007446; NA0002375; SC0010498
- OSTI ID: 1416496
- Report Number(s): LLNL-JRNL-741817
- Journal Information: IEEE Transactions on Visualization and Computer Graphics, Vol. 24, Issue 1; ISSN 1077-2626
- Publisher: IEEE
- Country of Publication: United States
- Language: English
Recent research advances on interactive machine learning | journal | November 2018
Latent Space Cartography: Visual Analysis of Vector Space Embeddings | journal | June 2019
Recent Research Advances on Interactive Machine Learning | preprint | January 2018
Similar Records
Computationally Efficient Learning of Quality Controlled Word Embeddings for Natural Language Processing
LEARNING SEMANTICS-ENHANCED LANGUAGE MODELS APPLIED TO UNSUPERVISED WSD