Towards Deeper Understanding of the LSA Performance
 

Summary: Towards Deeper Understanding of the LSA Performance
Preslav Nakov (1), Elena Valchanova (2), Galia Angelova (2)
(1) University of California at Berkeley, EECS, Berkeley CA 94720, USA
(2) Central Laboratory for Parallel Processing, Bulgarian Academy of Sciences, 25 Acad. G. Bonchev Str., 1113 Sofia, Bulgaria
nakov@eecs.berkeley.edu, {elenav, galia}@lml.bas.bg
Abstract
The paper presents ongoing work towards a deeper understanding of the
factors influencing the performance of Latent Semantic Analysis (LSA).
Unlike previous attempts, which concentrate on problems such as matrix
element weighting, space dimensionality selection, and the choice of
similarity measure, we primarily study the impact of another often
neglected but fundamental element of LSA (and of any text processing
technique): the definition of "word". For this purpose, a balanced corpus
of Bulgarian newspaper texts was carefully created to allow for in-depth
observations of
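
To make the factors named in the abstract concrete, the following is a minimal sketch of an LSA pipeline in Python with NumPy. It is not the authors' implementation. The function names, the log(1 + tf) weighting, the default dimensionality `k=2`, and the `stem_len` prefix truncation (a crude stand-in for changing the definition of "word") are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

import numpy as np


def tokenize(text, stem_len=None):
    """Split text into lowercase 'words'. If stem_len is given, keep only the
    first stem_len characters of each token (a hypothetical, crude way to
    change what counts as a 'word')."""
    tokens = re.findall(r"\w+", text.lower())
    if stem_len is not None:
        tokens = [t[:stem_len] for t in tokens]
    return tokens


def lsa_doc_vectors(docs, k=2, stem_len=None):
    """Return (vocabulary, document vectors in a k-dimensional LSA space)."""
    counts = [Counter(tokenize(d, stem_len)) for d in docs]
    vocab = sorted(set().union(*counts))
    # Term-document matrix; log(1 + tf) is one of many possible weightings.
    A = np.array([[math.log1p(c[w]) for c in counts] for w in vocab])
    # Truncated SVD: keep only the k largest singular values.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    # One row per document, scaled by the singular values.
    return vocab, (Vt[:k, :] * s[:k, None]).T


def cosine(u, v):
    """Cosine similarity, one common choice of similarity measure."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0


docs = ["banks lend money", "the bank lends loans", "rivers flow"]

vocab_full, vecs_full = lsa_doc_vectors(docs, k=2)
vocab_stem, vecs_stem = lsa_doc_vectors(docs, k=2, stem_len=4)

sim_full = cosine(vecs_full[0], vecs_full[1])
sim_stem = cosine(vecs_stem[0], vecs_stem[1])
print(len(vocab_full), len(vocab_stem))
print(sim_full, sim_stem)
```

In this toy example the first two documents share no surface forms, so they look unrelated until "banks"/"bank" and "lend"/"lends" are merged by the truncation. The only point is that the choice of word unit changes both the vocabulary and the resulting similarities. It says nothing about which definition works best on the Bulgarian corpus.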

  

Source: Angelova, Galia - Linguistic Modelling Department, Institute for Parallel Processing (Sofia, Bulgaria)
Hearst, Marti - School of Information Management and Systems, University of California at Berkeley

 

Collections: Computer Technologies and Information Sciences