 
Summary: Nearest Neighbor Search in General Metric Spaces Using a Tree Data Structure with a
Simple Heuristic
Huafeng Xu* and Dimitris K. Agrafiotis
3Dimensional Pharmaceuticals, Inc., 665 Stockton Drive, Exton, Pennsylvania 19341
Received July 21, 2003
We present a new algorithm for nearest neighbor search in general metric spaces. The algorithm organizes
the database into recursively partitioned Voronoi regions and represents these partitions in a tree. The
separations between the Voronoi regions as well as the radius of each region are used with the triangle inequality
to derive the minimum possible distance between any point in a region and the query and to discard the
region from further search if a smaller distance has already been found. The algorithm also orders the
search sequence of the tree branches using the estimate of the minimum possible distance. This simple
heuristic considerably enhances the pruning of the search tree. The efficiency of the algorithm is
demonstrated on several artificial data sets and real problems in computational chemistry.
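
The pruning and ordering ideas summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `Node`, `build`, and `nearest`, the fan-out of 4, and the random choice of region centers are all assumptions made for illustration. The sketch uses only the radius-based bound d(q, x) >= d(q, c) - r, which follows from the triangle inequality for any point x in a region with center c and radius r; the bound based on separations between regions is not reproduced here.

```python
import math
import random
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
Metric = Callable[[Point, Point], float]


@dataclass
class Node:
    center: Point   # representative point of this region
    radius: float   # largest distance from center to any point in the subtree
    children: List["Node"] = field(default_factory=list)


def build(center: Point, members: Sequence[Point], dist: Metric,
          fanout: int = 4, rng=random) -> Node:
    """Recursively partition `members` into regions around sampled centers."""
    radius = max((dist(center, p) for p in members), default=0.0)
    node = Node(center, radius)
    if members:
        # Hypothetical choice: pick child centers uniformly at random.
        idx = rng.sample(range(len(members)), min(fanout, len(members)))
        chosen = set(idx)
        centers = [members[i] for i in idx]
        regions: List[List[Point]] = [[] for _ in centers]
        for i, p in enumerate(members):
            if i in chosen:
                continue
            # Voronoi assignment: each point joins its closest center.
            j = min(range(len(centers)), key=lambda k: dist(centers[k], p))
            regions[j].append(p)
        node.children = [build(c, r, dist, fanout, rng)
                         for c, r in zip(centers, regions)]
    return node


def nearest(root: Node, query: Point, dist: Metric) -> Tuple[Point, float]:
    """Return (point, distance) of a nearest neighbor of `query`."""
    best = [root.center, dist(query, root.center)]

    def visit(node: Node) -> None:
        scored = []
        for child in node.children:
            d = dist(query, child.center)
            if d < best[1]:
                best[0], best[1] = child.center, d
            # Triangle inequality: no point in the child region can be
            # closer to the query than d - radius.
            scored.append((max(d - child.radius, 0.0), child))
        # Ordering heuristic: search the most promising branch first.
        scored.sort(key=lambda t: t[0])
        for bound, child in scored:
            if bound >= best[1]:
                break  # remaining branches have bounds at least as large
            visit(child)

    visit(root)
    return best[0], best[1]
```

Under these assumptions, `build(points[0], points[1:], math.dist)` constructs the tree for a list of coordinate tuples and `nearest(root, q, math.dist)` queries it. Any function satisfying the metric axioms can be passed in place of `math.dist`, since the search relies only on the triangle inequality.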
I. INTRODUCTION
Similarity search is one of the most common tasks in
computing [1]. It finds, in a collection, the objects that are most
similar to a given object of interest. The collection is often
referred to as the database and the given object of interest
as the query. Similarity, or rather dissimilarity, is measured
