BoostMap: An Embedding Method for Efficient Nearest Neighbor Retrieval

Vassilis Athitsos, Member, IEEE, Jonathan Alon, Member, IEEE,
Stan Sclaroff, Senior Member, IEEE, and George Kollios, Member, IEEE
Abstract: This paper describes BoostMap, a method for efficient nearest neighbor retrieval under computationally expensive distance
measures. Database and query objects are embedded into a vector space in which distances can be measured efficiently. Each
embedding is treated as a classifier that predicts for any three objects X, A, B whether X is closer to A or to B. It is shown that a linear
combination of such embedding-based classifiers naturally corresponds to an embedding and a distance measure. Based on this
property, the BoostMap method reduces the problem of embedding construction to the classical boosting problem of combining many
weak classifiers into an optimized strong classifier. The classification accuracy of the resulting strong classifier is a direct measure of the
amount of nearest neighbor structure preserved by the embedding. An important property of BoostMap is that the embedding optimization
criterion is equally valid in both metric and nonmetric spaces. Performance is evaluated in databases of hand images, handwritten digits,
and time series. In all cases, BoostMap significantly improves retrieval efficiency with small losses in accuracy compared to brute-force
search. Moreover, BoostMap significantly outperforms existing nearest neighbor retrieval methods such as Lipschitz embeddings,
FastMap, and VP-trees.
Index Terms: Indexing methods, embedding methods, similarity matching, multimedia databases, nearest neighbor retrieval, nearest
neighbor classification, non-Euclidean spaces.
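
The correspondence stated in the abstract, between a linear combination of embedding-based classifiers and an embedding with its own distance measure, can be illustrated with a minimal sketch. The specifics here are assumptions, not details given in the abstract: each one-dimensional embedding maps an object to its distance from a fixed reference object, the embedded space uses a weighted L1 distance, and the weights are fixed by hand rather than learned by boosting. All names (make_reference_embedding, triple_classifier, expensive_distance, and so on) are hypothetical.

```python
# Illustrative sketch only: the reference-object embeddings, the weighted L1
# distance, and the hand-picked weights are assumptions, not the paper's code.

def make_reference_embedding(distance, reference):
    """Return a 1D embedding that maps x to its distance from a reference object."""
    def embed(x):
        return distance(x, reference)
    return embed

def triple_classifier(embed, x, a, b):
    """Positive output predicts that x is closer to a than to b."""
    fx = embed(x)
    return abs(fx - embed(b)) - abs(fx - embed(a))

def combined_classifier(embeds, weights, x, a, b):
    """Weighted linear combination of the per-embedding classifiers."""
    return sum(w * triple_classifier(f, x, a, b) for f, w in zip(embeds, weights))

def embed_all(embeds, x):
    """Multi-dimensional embedding: one coordinate per 1D embedding."""
    return [f(x) for f in embeds]

def weighted_l1(weights, u, v):
    """Weighted L1 distance between two embedded vectors."""
    return sum(w * abs(ui - vi) for w, ui, vi in zip(weights, u, v))

def expensive_distance(p, q):
    """Stand-in for a costly distance measure (plain Euclidean distance here)."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

# Hypothetical reference objects and weights (boosting would choose these).
references = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
weights = [0.5, 0.3, 0.2]
embeds = [make_reference_embedding(expensive_distance, r) for r in references]

x, a, b = (1.0, 1.0), (1.5, 1.0), (5.0, 5.0)

# Score from the combined classifier on the triple (x, a, b).
score = combined_classifier(embeds, weights, x, a, b)

# The same quantity computed as a difference of weighted L1 distances
# between the embedded vectors.
fx, fa, fb = embed_all(embeds, x), embed_all(embeds, a), embed_all(embeds, b)
same_score = weighted_l1(weights, fx, fb) - weighted_l1(weights, fx, fa)

print(score, same_score)
```

Under these assumptions the two printed values agree, because the weighted sum of per-coordinate differences is exactly the difference of the two weighted L1 distances. This is why, in such a setup, the weights chosen for the combined classifier can double as the weights of the distance measure in the embedded space.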

NEAREST neighbor retrieval is the task of identifying the


Source: Athitsos, Vassilis - Department of Computer Science and Engineering, University of Texas at Arlington


Collections: Computer Technologies and Information Sciences