Title: On updating problems in latent semantic indexing

Abstract

The authors develop new SVD-updating algorithms for three types of updating problems arising from latent semantic indexing (LSI) for information retrieval to deal with rapidly changing text document collections. They also provide theoretical justification for using a reduced-dimension representation of the original document collection in the updating process. Numerical experiments using several standard text document collections show that the new algorithms give higher (interpolated) average precisions than the existing algorithms, and the retrieval accuracy is comparable to that obtained using the complete document collection.

Authors:
Zha, H.; Simon, H.D.
Publication Date:
1999-10-01
Research Org.:
Pennsylvania State Univ., University Park, PA (US)
Sponsoring Org.:
USDOE; National Science Foundation (NSF)
OSTI Identifier:
20015659
DOE Contract Number:
AC03-76SF00098
Resource Type:
Journal Article
Resource Relation:
Journal Name: SIAM Journal on Scientific Computing; Journal Volume: 21; Journal Issue: 2; Other Information: PBD: Oct 1999
Country of Publication:
United States
Language:
English
Subject:
99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; INFORMATION RETRIEVAL; ALGORITHMS; AUTOMATION; INFORMATION SYSTEMS; PERFORMANCE

Citation Formats

Zha, H., and Simon, H.D. On updating problems in latent semantic indexing. United States: N. p., 1999. Web. doi:10.1137/S1064827597329266.
Zha, H., & Simon, H.D. (1999). On updating problems in latent semantic indexing. United States. doi:10.1137/S1064827597329266.
Zha, H., and Simon, H.D. 1999. "On updating problems in latent semantic indexing". United States. doi:10.1137/S1064827597329266.
@article{osti_20015659,
title = {On updating problems in latent semantic indexing},
author = {Zha, H. and Simon, H.D.},
abstractNote = {The authors develop new SVD-updating algorithms for three types of updating problems arising from latent semantic indexing (LSI) for information retrieval to deal with rapidly changing text document collections. They also provide theoretical justification for using a reduced-dimension representation of the original document collection in the updating process. Numerical experiments using several standard text document collections show that the new algorithms give higher (interpolated) average precisions than the existing algorithms, and the retrieval accuracy is comparable to that obtained using the complete document collection.},
doi = {10.1137/S1064827597329266},
journal = {SIAM Journal on Scientific Computing},
number = 2,
volume = 21,
place = {United States},
year = {1999},
month = {10}
}