Title: The computational complexity of alternative updating approaches for an SVD-encoded indexing scheme

Conference ·
OSTI ID:125464
;  [1];  [2]
  1. Univ. of Tennessee, Knoxville, TN (United States)
  2. Information Science Research Group, Morristown, NJ (United States)

Latent Semantic Indexing (LSI) is a conceptual indexing technique that uses the truncated singular value decomposition (SVD) to estimate the underlying latent semantic structure of word-to-document associations. By computing a lower-rank approximation to the original term-document matrix, LSI dampens the effects of word-choice variability by representing terms and documents with (orthogonal) left and right singular vectors. Current methods for adding new documents to an LSI database (folding-in) can degrade the orthogonality of the vectors used to represent documents in high-dimensional subspaces. An alternative approach is presented that updates the original truncated SVD so as to preserve orthogonality among the document vectors corresponding to the new term-document matrix. The computational cost and memory needed to update the SVD, weighed against the potential inaccuracy of the folding-in methods, present an interesting tradeoff for LSI database management. The computational costs of recomputing the truncated SVD of perturbed term-document matrices, updating current truncated SVDs of term-document matrices, and folding new documents into an existing LSI model are presented.
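
The loss of orthogonality under folding-in can be illustrated with a small NumPy sketch. It is not the authors' implementation: the matrix sizes, the random data, and the function names are hypothetical. The projection d^T U_k S_k^{-1} is the commonly used folding-in rule, assumed here and not taken from this record. The sketch compares folding two new documents into a rank-k model with recomputing the truncated SVD of the enlarged matrix.

```python
import numpy as np

def truncated_svd(A, k):
    """Return rank-k factors U_k, s_k, V_k such that A ~ U_k diag(s_k) V_k^T."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :].T

def fold_in_documents(U_k, s_k, V_k, D):
    """Fold new document columns D into the model by projection (assumed rule).

    Each new document d is mapped to the row d^T U_k diag(1/s_k) and appended
    to V_k; U_k and s_k are left unchanged.
    """
    new_rows = D.T @ U_k / s_k
    return np.vstack([V_k, new_rows])

def orthogonality_loss(V):
    """Frobenius norm of V^T V - I; zero when the columns of V are orthonormal."""
    return np.linalg.norm(V.T @ V - np.eye(V.shape[1]))

rng = np.random.default_rng(0)
A = rng.random((8, 6))   # hypothetical 8-term x 6-document matrix
D = rng.random((8, 2))   # two hypothetical new documents
k = 3

U_k, s_k, V_k = truncated_svd(A, k)
V_folded = fold_in_documents(U_k, s_k, V_k, D)

# Recomputing the truncated SVD of [A | D] restores orthonormal document vectors.
_, _, V_recomputed = truncated_svd(np.hstack([A, D]), k)

print("original model:", orthogonality_loss(V_k))
print("after folding-in:", orthogonality_loss(V_folded))
print("after recomputing:", orthogonality_loss(V_recomputed))
```

In this sketch the folded-in matrix satisfies V^T V = I + (new rows)^T (new rows), so its departure from orthogonality grows as more documents are folded in, whereas the recomputed factors remain orthonormal up to rounding error.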

Report Number(s):
CONF-950212-; CNN: Grant NSF-CDA-9115428; Grant NSF-ASC-92-03004; TRN: 95:005768-0008
Resource Relation:
Conference: 7. Society for Industrial and Applied Mathematics (SIAM) conference on parallel processing for scientific computing, San Francisco, CA (United States), 15-17 Feb 1995; Other Information: PBD: 1995; Related Information: Is Part Of Proceedings of the seventh SIAM conference on parallel processing for scientific computing; Bailey, D.H.; Bjorstad, P.E.; Gilbert, J.R. [eds.] [and others]; PB: 894 p.
Country of Publication:
United States
Language:
English

Similar Records

On matrices with low-rank-plus-shift structure: Partial SVD and latent semantic indexing
Technical Report · 1 Aug 1998 · OSTI ID:125464

On updating problems in latent semantic indexing
Journal Article · 1 Oct 1999 · SIAM Journal on Scientific Computing · OSTI ID:125464

On updating problems in latent semantic indexing
Technical Report · 1 Nov 1997 · OSTI ID:125464