Summary: A Framework for Determining Necessary
Query Set Sizes to Evaluate Web
Eric C. Jensen, Steven M. Beitzel, Ophir Frieder
Information Retrieval Laboratory
Illinois Institute of Technology
Chicago, IL 60616
Search & Navigation Group
America Online, Inc.
Dulles, VA 20166
1. Randomly sample a distinct set of queries Q with size n from a query log.
2. For each query in Q, manually evaluate the union of the top X retrieved results from each of the engines.
3. Calculate each engine's score for each query using the metric of interest, e.g., average precision (AvgP) or
reciprocal rank of the best page (MRR).
4. For B iterations:
a. Randomly sample, with replacement, a set of queries Q* of size m from the original set Q.
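
The two sampling steps (step 1 and step 4a) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. The function names, the `rng` parameter, and the toy query log are hypothetical. The sketch also assumes that "distinct" in step 1 means the log is de-duplicated before sampling, so that each unique query is equally likely to be chosen; sampling in proportion to query frequency would be a different design.

```python
import random


def sample_query_set(query_log, n, rng):
    """Step 1 (sketch): draw n distinct queries from the log.

    Assumption: the log is de-duplicated first, so every unique
    query has the same chance of being selected.
    """
    distinct = sorted(set(query_log))  # sorted only to make runs reproducible
    return rng.sample(distinct, n)     # sampling without replacement


def bootstrap_query_sets(Q, m, B, rng):
    """Step 4a (sketch): yield B resamples Q*, each of size m.

    Each resample is drawn from Q with replacement, so the same
    query may appear more than once in a given Q*.
    """
    for _ in range(B):
        yield rng.choices(Q, k=m)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, for repeatability only
    toy_log = ["weather", "news", "weather", "maps", "mail", "news", "games"]
    Q = sample_query_set(toy_log, n=3, rng=rng)
    for q_star in bootstrap_query_sets(Q, m=3, B=5, rng=rng):
        print(q_star)
```

Because each Q* is drawn with replacement, m does not have to equal n; the sketch leaves both as free parameters.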