U.S. Department of Energy
Office of Scientific and Technical Information

Massively Parallel Latent Semantic Analyses using a Graphics Processing Unit

Journal Article · Journal of Undergraduate Research
OSTI ID:986774

Latent Semantic Analysis (LSA) aims to reduce the dimensions of large term-document datasets using Singular Value Decomposition (SVD). However, with the ever-expanding size of datasets, current implementations are not fast enough to compute the results quickly and easily on a standard PC. A graphics processing unit (GPU) can solve some highly parallel problems much faster than a traditional sequential processor (CPU). Thus, a deployable system using a GPU to speed up large-scale LSA processes would be a much more effective choice, in terms of cost/performance ratio, than a computer cluster. Because of the GPU's application-specific architecture, harnessing the GPU's computational prowess for LSA is a great challenge. We present a parallel LSA implementation on the GPU, using NVIDIA Compute Unified Device Architecture (CUDA) and Compute Unified Basic Linear Algebra Subprograms (CUBLAS). The performance of this implementation is compared to a traditional LSA implementation on a CPU using an optimized Basic Linear Algebra Subprograms (BLAS) library. We found that the GPU version of the algorithm was twice as fast as the CPU version for large matrices (1000x1000 and above) whose dimensions were not divisible by 16. For large matrices whose dimensions were divisible by 16, the GPU algorithm ran five to six times faster than the CPU version. The large variation is due to architectural benefits the GPU has for matrices with dimensions divisible by 16. It should be noted that the speed of the CPU version did not change appreciably when the matrix dimensions were divisible by 16. Further research is needed to produce a fully implementable version of LSA. Nevertheless, the research presented here shows that the GPU is a viable option for increasing the speed of LSA in terms of cost/performance ratio.
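As a rough illustration of the dimension reduction the abstract describes, the sketch below extracts the largest singular value and its singular vectors from a tiny term-document matrix. The abstract does not say which SVD algorithm the authors used, so the power iteration here is an assumption chosen for brevity, and the matrix `A` and the helper names (`matvec`, `top_singular_triple`) are hypothetical. The repeated matrix-vector products are the kind of dense linear algebra that a BLAS or CUBLAS library would carry out in a real implementation.

```python
import math

def matvec(a, x):
    """Dense matrix-vector product (the step a BLAS library would accelerate)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def norm(x):
    return math.sqrt(sum(v * v for v in x))

def top_singular_triple(a, iters=200):
    """Power iteration on A^T A; returns (sigma, u, v) for the largest singular value.

    Illustrative only: assumes A is nonzero and the start vector is not
    orthogonal to the leading right singular vector.
    """
    at = transpose(a)
    v = [1.0] * len(a[0])
    for _ in range(iters):
        w = matvec(at, matvec(a, v))   # w = A^T (A v)
        n = norm(w)
        v = [x / n for x in w]         # renormalize to unit length
    av = matvec(a, v)
    sigma = norm(av)                   # sigma = ||A v||
    u = [x / sigma for x in av]        # left singular vector
    return sigma, u, v

# Hypothetical toy data: rows are terms, columns are documents.
A = [
    [2.0, 0.0, 1.0],
    [1.0, 0.0, 1.0],
    [0.0, 3.0, 0.0],
    [0.0, 1.0, 0.0],
]

sigma, u, v = top_singular_triple(A)

# One-dimensional LSA coordinates of each document along the leading concept.
doc_coords = [sigma * x for x in v]
```

Keeping only the leading singular triple reduces each document to a single coordinate; a practical LSA system would keep the top k triples, typically computed with a library SVD routine rather than with this loop.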

Research Organization:
Oak Ridge National Laboratory (ORNL)
Sponsoring Organization:
ORNL LDRD Seed-Money
DOE Contract Number:
AC05-00OR22725
OSTI ID:
986774
Journal Information:
Journal of Undergraduate Research, Vol. IX
Country of Publication:
United States
Language:
English

Similar Records

MASSIVELY PARALLEL LATENT SEMANTIC ANALYSES USING A GRAPHICS PROCESSING UNIT
Journal Article · Wed Dec 31 23:00:00 EST 2008 · Journal of Undergraduate Research · OSTI ID:1052114

Parallel Latent Semantic Analysis using a Graphics Processing Unit
Conference · Wed Dec 31 23:00:00 EST 2008 · OSTI ID:962623

Computation of Large Covariance Matrices by SAMMY on Graphical Processing Units and Multicore CPUs
Conference · Fri Dec 31 23:00:00 EST 2010 · OSTI ID:1018596