E-print Network
An Information-theoretic Measure for Document Similarity
 

Summary: An Information-theoretic Measure for Document Similarity
Javed A. Aslam
Department of Computer Science
Dartmouth College
jaa@cs.dartmouth.edu
Meredith Frost
Department of Computer Science
Dartmouth College
Meredith.Frost@dartmouth.edu
ABSTRACT
Recent work has demonstrated that the assessment of pairwise object similarity can be approached in an axiomatic manner using information theory. We extend this concept specifically to document similarity and test the effectiveness of an information-theoretic measure for pairwise document similarity. We adapt query retrieval to rate the quality of document similarity measures and demonstrate that our proposed information-theoretic measure for document similarity yields statistically significant improvements over other popular measures of similarity.
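
This summary does not give the formula for the proposed measure, so the sketch below is only an illustration of the general idea, not the authors' method. It assumes a Lin-style formulation in which similarity is the ratio of the information in the terms two documents share to the information in both documents' full term sets, with each term's probability estimated as the fraction of corpus documents containing it. The function names `term_probabilities` and `lin_style_similarity`, the set-of-terms document model, and the toy corpus are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def term_probabilities(corpus):
    """Estimate p(t) as the fraction of documents that contain term t.

    Assumption: a document is a list of tokens, and document frequency
    is a reasonable stand-in for the probability of a term.
    """
    n_docs = len(corpus)
    doc_freq = Counter(term for doc in corpus for term in set(doc))
    return {term: count / n_docs for term, count in doc_freq.items()}


def lin_style_similarity(doc_a, doc_b, probs):
    """Shared information divided by total information, scaled to [0, 1].

    sim(A, B) = 2 * I(A and B) / (I(A) + I(B)), where
    I(S) = sum over t in S of -log p(t).
    Both documents must come from the corpus used to build `probs`.
    """
    terms_a, terms_b = set(doc_a), set(doc_b)

    def information(terms):
        return sum(-math.log(probs[t]) for t in terms)

    total = information(terms_a) + information(terms_b)
    if total == 0.0:
        # Only terms that occur in every document: nothing to compare on.
        return 0.0
    return 2.0 * information(terms_a & terms_b) / total


# Toy corpus, invented purely for illustration.
corpus = [
    ["a", "b"],
    ["a", "c"],
    ["a", "b", "d"],
    ["e"],
]
probs = term_probabilities(corpus)
sim_01 = lin_style_similarity(corpus[0], corpus[1], probs)
sim_02 = lin_style_similarity(corpus[0], corpus[2], probs)
sim_03 = lin_style_similarity(corpus[0], corpus[3], probs)
```

Under these assumptions, rare shared terms contribute more to the score than common ones, a document compared with itself scores 1, and documents with no terms in common score 0.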

  

Source: Aslam, Javed - College of Computer Science, Northeastern University

 

Collections: Computer Technologies and Information Sciences