U.S. Department of Energy
Office of Scientific and Technical Information

On the use of the singular value decomposition for text retrieval

Conference · OSTI ID: 775166
The use of the Singular Value Decomposition (SVD) has been proposed for text retrieval in several recent works. This technique uses the SVD to project very high-dimensional document and query vectors into a low-dimensional space. In this new space, it is hoped, the underlying structure of the collection is revealed, thus enhancing retrieval performance. Theoretical results have provided some evidence for this claim, and to some extent experiments have confirmed it. However, these studies have mostly used small test collections and simplified document models. In this work we investigate the use of the SVD on large document collections. We show that, if interpreted as a mechanism for representing the terms of the collection, this technique alone is insufficient for dealing with the variability in term occurrence. Section 2 introduces the text retrieval concepts necessary for our work. Section 3 gives a short description of our experimental architecture. Section 4 describes how term occurrence variability affects the SVD and then shows how the decomposition influences retrieval performance. Section 5 presents a possible way of improving SVD-based techniques, and Section 6 concludes.
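The projection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation or experimental setup: the toy term-by-document matrix, the vocabulary, the rank k = 2, and the helper names `lsi_fit`, `project`, and `cosine_scores` are all assumptions made for illustration. The sketch uses one common convention, projecting both documents and queries onto the leading k left singular vectors; other scalings by the singular values are also used in practice.

```python
import numpy as np

# Toy term-by-document matrix (rows = terms, columns = documents).
# Vocabulary and counts are invented for illustration only.
terms = ["matrix", "vector", "retrieval", "query"]
A = np.array([
    [2, 1, 0, 0],   # matrix
    [1, 2, 0, 0],   # vector
    [0, 0, 2, 1],   # retrieval
    [0, 0, 1, 2],   # query
], dtype=float)


def lsi_fit(A, k):
    """Return the rank-k truncated SVD factors (U_k, s_k, Vt_k) of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def project(vectors, U_k):
    """Map term-space column vectors into the k-dimensional space U_k^T x."""
    return U_k.T @ vectors


def cosine_scores(q_hat, D_hat):
    """Cosine similarity between a projected query and each projected document."""
    denom = np.linalg.norm(q_hat) * np.linalg.norm(D_hat, axis=0)
    return np.divide(q_hat @ D_hat, denom,
                     out=np.zeros_like(denom), where=denom > 0)


k = 2  # assumed target dimension
U_k, s_k, Vt_k = lsi_fit(A, k)

D_hat = project(A, U_k)   # documents in the reduced space (k x n_docs)
q = np.array([0.0, 0.0, 0.0, 1.0])  # query containing only the term "query"
q_hat = project(q, U_k)   # query in the reduced space

scores = cosine_scores(q_hat, D_hat)
ranking = np.argsort(-scores)  # document indices, best match first
```

In this toy case the two documents about retrieval receive essentially the same score for the one-word query, although one of them uses the term "query" less often. That is the kind of smoothing over term usage the low-dimensional space is hoped to provide. Whether the effect survives on large collections is the question the paper investigates.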
Research Organization:
Lawrence Berkeley National Lab., CA (US)
Sponsoring Organization:
USDOE Director, Office of Science. Office of Advanced Scientific Computing Research. Mathematical, Information, and Computational Sciences Division (US)
DOE Contract Number:
AC03-76SF00098
OSTI ID:
775166
Report Number(s):
LBNL--47170
Country of Publication:
United States
Language:
English

Similar Records

Compression of Magnetohydrodynamic Simulation Data Using Singular Value Decomposition
Journal Article · Sun Oct 01 00:00:00 EDT 2006 · Journal of Computational Physics · OSTI ID:931693

Compression of magnetohydrodynamic simulation data using singular value decomposition
Journal Article · Wed Feb 28 23:00:00 EST 2007 · Journal of Computational Physics · OSTI ID:20991563

Text retrieval using the vector processing model
Technical Report · Fri Dec 30 23:00:00 EST 1994 · OSTI ID:68562