 
Summary: Expected-Case Complexity of Approximate Nearest Neighbor
Searching
Sunil Arya
Ho-Yam Addy Fu
Abstract
Most research in algorithms for geometric query problems has focused on their worst-case
performance. However, when information on the query distribution is available,
the alternative paradigm of designing and analyzing algorithms from the perspective of
expected-case performance appears more attractive. We study the approximate nearest
neighbor problem from this perspective.
As a first step in this direction, we assume that the query points are sampled uniformly
from a hypercube that encloses all the data points; however, we make no assumption
on the distribution of the data points. We show that with a simple partition
tree, called the sliding-midpoint tree, it is possible to achieve linear space and logarithmic
query time in the expected case; in contrast, the data structures known to achieve
linear space and logarithmic query time in the worst case are complex, and algorithms
on them run more slowly in practice. Moreover, we prove that the sliding-midpoint tree
achieves optimal expected query time in a certain class of algorithms.
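
To make the setting concrete, the following Python sketch builds a small sliding-midpoint tree and answers a (1 + eps)-approximate nearest neighbor query for a point drawn uniformly from the enclosing hypercube. The abstract does not spell out the splitting rule, so the sketch assumes the rule as it is commonly described: split the current cell at the midpoint of its longest side, and if all points fall on one side, slide the splitting plane to the nearest data point. Restricting the choice to dimensions in which the points differ, the names build and nearest, and the simple pruned descent used for the query are illustrative assumptions, not the authors' algorithm.

```python
import math
import random

def build(points, lo, hi):
    """Build a sliding-midpoint tree for `points` inside the cell [lo, hi].

    `lo` and `hi` are tuples giving the cell's lower and upper corners.
    """
    if len(points) <= 1:
        return {"points": points}                      # leaf
    # Assumption of this sketch: only split along dimensions where points differ.
    spread = [i for i in range(len(lo))
              if min(p[i] for p in points) < max(p[i] for p in points)]
    if not spread:
        return {"points": points}                      # leaf of duplicate points
    dim = max(spread, key=lambda i: hi[i] - lo[i])     # longest such side of the cell
    cut = (lo[dim] + hi[dim]) / 2.0                    # try the midpoint first
    left = [p for p in points if p[dim] < cut]
    right = [p for p in points if p[dim] >= cut]
    if not left:                                       # slide up to the nearest point
        cut = min(p[dim] for p in points)
        left = [p for p in points if p[dim] <= cut]
        right = [p for p in points if p[dim] > cut]
    elif not right:                                    # slide down to the nearest point
        cut = max(p[dim] for p in points)
        left = [p for p in points if p[dim] < cut]
        right = [p for p in points if p[dim] >= cut]
    left_hi = hi[:dim] + (cut,) + hi[dim + 1:]
    right_lo = lo[:dim] + (cut,) + lo[dim + 1:]
    return {"dim": dim, "cut": cut,
            "left": build(left, lo, left_hi),
            "right": build(right, right_lo, hi)}

def nearest(node, q, eps=0.0, best=(math.inf, None)):
    """Return (distance, point) within a factor (1 + eps) of the true nearest."""
    if "points" in node:
        for p in node["points"]:
            dist = math.dist(p, q)
            if dist < best[0]:
                best = (dist, p)
        return best
    dim, cut = node["dim"], node["cut"]
    if q[dim] < cut:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    best = nearest(near, q, eps, best)
    # Visit the far side only if it could hold a sufficiently closer point.
    if abs(q[dim] - cut) * (1.0 + eps) < best[0]:
        best = nearest(far, q, eps, best)
    return best

random.seed(0)
data = [(random.random(), random.random()) for _ in range(200)]
tree = build(data, (0.0, 0.0), (1.0, 1.0))
query = (random.random(), random.random())   # uniform in the enclosing square
dist, point = nearest(tree, query, eps=0.5)
```

Because every split in this sketch leaves at least one point on each side, the tree has at most one leaf per data point, which is consistent with the linear space bound stated above. The sketch makes no attempt to reproduce the expected-case query time analysis.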
1 Introduction
The main focus in the design of data structures and algorithms for geometric query problems
