Score Distributions in Information Retrieval

Avi Arampatzis (1), Stephen Robertson (2), and Jaap Kamps (1)
(1) University of Amsterdam, the Netherlands
(2) Microsoft Research, Cambridge UK
Abstract. We review the history of modeling score distributions, focusing on
the mixture of normal-exponential by investigating the theoretical as well as the
empirical evidence supporting its use. We discuss previously suggested conditions
which valid binary mixture models should satisfy, such as the Recall-Fallout
Convexity Hypothesis, and formulate two new hypotheses considering the component
distributions under some limiting conditions of parameter values. From all
the mixtures suggested in the past, the current theoretical argument points to the
two gamma as the most-likely universal model, with the normal-exponential being
a usable approximation. Beyond the theoretical contribution, we provide new
experimental evidence showing vector space or geometric models, and BM25, as
being "friendly" to the normal-exponential, and that the non-convexity problem
that the mixture possesses is practically not severe.
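
To make the non-convexity problem mentioned in the abstract concrete, here is a minimal Python sketch of a normal-exponential mixture. It is an illustration under stated assumptions, not the authors' code: relevant scores are assumed normal with mean mu and standard deviation sigma, non-relevant scores are assumed exponential with rate lam on non-negative scores, and g is the assumed fraction of relevant documents. All function names and parameter values are hypothetical. Under these assumptions the ratio of the normal to the exponential density peaks at mu + lam * sigma^2, so the probability of relevance falls again for scores above that point, which is one way the violation of recall-fallout convexity shows up.

```python
import math


def normal_pdf(s, mu, sigma):
    """Density of the assumed score distribution for relevant documents."""
    z = (s - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def exponential_pdf(s, lam):
    """Density of the assumed score distribution for non-relevant documents."""
    return lam * math.exp(-lam * s) if s >= 0 else 0.0


def prob_relevant(s, mu, sigma, lam, g):
    """Posterior probability of relevance at score s under the mixture.

    g is the assumed prior fraction of relevant documents.
    """
    rel = g * normal_pdf(s, mu, sigma)
    non = (1.0 - g) * exponential_pdf(s, lam)
    total = rel + non
    return rel / total if total > 0 else 0.0


def posterior_peak(mu, sigma, lam):
    """Score at which normal_pdf / exponential_pdf is largest.

    Found by maximizing -(s - mu)^2 / (2 sigma^2) + lam * s over s.
    """
    return mu + lam * sigma ** 2


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    mu, sigma, lam, g = 3.0, 1.0, 1.0, 0.1
    peak = posterior_peak(mu, sigma, lam)
    for s in (mu, peak, peak + 2.0):
        print(f"score={s:.1f}  P(rel|score)={prob_relevant(s, mu, sigma, lam, g):.3f}")
```

With these illustrative values the printed probability rises up to the peak score and drops beyond it. Whether this matters in practice depends on how many observed scores actually fall above the peak, which is the question the abstract describes as practically not severe.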


Source: Arampatzis, Avi - Department of Electrical and Computer Engineering, Democritus University of Thrace


Collections: Computer Technologies and Information Sciences