Summary: Expected-Case Complexity of Approximate Nearest Neighbor
Ho-Yam Addy Fu
Most research in algorithms for geometric query problems has focused on their worst-case
performance. However, when information on the query distribution is available,
the alternative paradigm of designing and analyzing algorithms from the perspective of
expected-case performance appears more attractive. We study the approximate nearest
neighbor problem from this perspective.
As a first step in this direction, we assume that the query points are sampled uniformly
from a hypercube that encloses all the data points; however, we make no assumption
on the distribution of the data points. We show that with a simple partition
tree, called the sliding-midpoint tree, it is possible to achieve linear space and logarithmic
query time in the expected case; in contrast, the data structures known to achieve
linear space and logarithmic query time in the worst case are complex, and algorithms
on them run more slowly in practice. Moreover, we prove that the sliding-midpoint tree
achieves optimal expected query time in a certain class of algorithms.
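
To make the construction concrete, the following is a minimal sketch of how a sliding-midpoint tree might be built. The summary above does not spell out the splitting rule, so the sketch assumes the rule as it is commonly described: split the cell through the midpoint of its longest side, and if all points fall on one side, slide the splitting plane toward the points until it touches the nearest one. The names Node, build, and leaves are hypothetical, and the choice to stop at leaves holding at most one point is likewise an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    lo: Point                        # lower corner of this node's cell
    hi: Point                        # upper corner of this node's cell
    point: Optional[Point] = None    # data point stored at a leaf, if any
    dim: int = -1                    # splitting dimension (internal nodes only)
    cut: float = 0.0                 # splitting coordinate (internal nodes only)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points: Sequence[Point], lo: Point, hi: Point) -> Node:
    """Build a tree over `points` inside the box [lo, hi] (assumed rule, see text)."""
    node = Node(lo, hi)
    if len(points) <= 1:             # assumed stopping rule: at most one point per leaf
        node.point = points[0] if points else None
        return node

    # Split orthogonally to the longest side of the cell, through its midpoint.
    dim = max(range(len(lo)), key=lambda i: hi[i] - lo[i])
    cut = (lo[dim] + hi[dim]) / 2.0

    # If every point lies strictly on one side, slide the plane to the nearest point.
    coords = [p[dim] for p in points]
    if min(coords) > cut:
        cut = min(coords)
    elif max(coords) < cut:
        cut = max(coords)

    left = [p for p in points if p[dim] < cut]
    right = [p for p in points if p[dim] > cut]
    # Points lying exactly on the plane go to the smaller side, so both sides
    # end up nonempty and the recursion always makes progress.
    for p in points:
        if p[dim] == cut:
            (left if len(left) <= len(right) else right).append(p)

    node.dim, node.cut = dim, cut
    node.left = build(left, lo, hi[:dim] + (cut,) + hi[dim + 1:])
    node.right = build(right, lo[:dim] + (cut,) + lo[dim + 1:], hi)
    return node


def leaves(node: Node) -> List[Node]:
    """Collect the leaf cells of the tree, left to right."""
    if node.left is None:
        return [node]
    return leaves(node.left) + leaves(node.right)
```

For example, build([(0.1, 0.1), (0.2, 0.15)], (0.0, 0.0), (1.0, 1.0)) finds both points to the left of the midpoint plane x = 0.5 and therefore slides the plane to x = 0.2, leaving one point on each side. Because every split separates at least one point from the rest, the sketch produces a number of nodes linear in the number of data points, in line with the linear-space claim above.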
The main focus in the design of data structures and algorithms for geometric query problems