Towards Deeper Understanding of the LSA Performance

Summary: Towards Deeper Understanding of the LSA Performance
Preslav Nakov (1), Elena Valchanova (2), Galia Angelova (2)
(1) University of California at Berkeley, EECS, Berkeley CA 94720, USA
(2) Central Laboratory for Parallel Processing, Bulgarian Academy of Sciences, 25 Acad. G. Bonchev Str., 1113 Sofia, Bulgaria
nakov@eecs.berkeley.edu, {elenav, galia}@lml.bas.bg
The paper presents ongoing work towards a deeper understanding of the factors influencing the performance of Latent Semantic Analysis (LSA). Unlike previous attempts, which concentrate on problems such as matrix element weighting, space dimensionality selection, and similarity measures, we primarily study the impact of another, often neglected but fundamental element of LSA (and of any text processing technique): the definition of "word". For this purpose, a balanced corpus of Bulgarian newspaper texts was carefully created to allow for in-depth observations of


Source: Angelova, Galia - Linguistic Modelling Department, Institute for Parallel Processing (Sofia, Bulgaria)
Hearst, Marti - School of Information Management and Systems, University of California at Berkeley


Collections: Computer Technologies and Information Sciences