Efficiency issues related to probability density function comparison
Abstract
The CANDID project (Comparison Algorithm for Navigating Digital Image Databases) employs probability density functions (PDFs) of localized feature information to represent the content of an image for search and retrieval purposes. A similarity measure between PDFs is used to identify database images that are similar to a user-provided query image. Unfortunately, signature comparison involving PDFs is a very time-consuming operation. In this paper, we look into some efficiency considerations when working with PDFs. Since PDFs can take on many forms, we look into tradeoffs between accurate representation and efficiency of manipulation for several data sets. In particular, we typically represent each PDF as a Gaussian mixture (i.e., as a weighted sum of Gaussian kernels) in the feature space. We find that by constraining all Gaussian kernels to have principal axes that are aligned to the natural axes of the feature space, computations involving these PDFs are simplified. We can also constrain the Gaussian kernels to be hyperspherical rather than hyperellipsoidal, simplifying computations even further, and yielding an order of magnitude speedup in signature comparison. This paper illustrates the tradeoffs encountered when using these constraints.
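As a rough illustration of why the constraints described in the abstract simplify computation, the sketch below evaluates the overlap integral of two Gaussian kernels using the standard identity ∫ N(x; μ1, Σ1) N(x; μ2, Σ2) dx = N(μ1; μ2, Σ1 + Σ2). With a general covariance this needs a determinant and a matrix inverse; with axis-aligned kernels it reduces to one scalar term per axis, and with hyperspherical kernels to a single squared distance. The abstract does not state which similarity measure CANDID uses, so the normalized inner product here is an assumption, and all function names and the `(weight, mean, variance)` mixture layout are hypothetical.

```python
import math

def diag_kernel_overlap(mu1, var1, mu2, var2):
    """Overlap integral of two axis-aligned Gaussian kernels.

    var1 and var2 hold one variance per axis, so the integral
    factors into a product of 1-D Gaussian terms (summed in log space).
    """
    log_val = 0.0
    for m1, v1, m2, v2 in zip(mu1, var1, mu2, var2):
        s = v1 + v2
        log_val += -0.5 * (math.log(2.0 * math.pi * s) + (m1 - m2) ** 2 / s)
    return math.exp(log_val)

def spherical_kernel_overlap(mu1, var1, mu2, var2):
    """Same integral when each kernel has a single scalar variance."""
    d = len(mu1)
    s = var1 + var2
    sq_dist = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    return math.exp(-0.5 * (d * math.log(2.0 * math.pi * s) + sq_dist / s))

def mixture_inner_product(mix_a, mix_b):
    """Inner product of two spherical mixtures.

    Each mixture is a list of (weight, mean, variance) triples
    (a hypothetical layout chosen for this sketch).
    """
    return sum(wa * wb * spherical_kernel_overlap(ma, va, mb, vb)
               for wa, ma, va in mix_a
               for wb, mb, vb in mix_b)

def normalized_similarity(mix_a, mix_b):
    """Assumed similarity measure: inner product normalized to at most 1."""
    ab = mixture_inner_product(mix_a, mix_b)
    aa = mixture_inner_product(mix_a, mix_a)
    bb = mixture_inner_product(mix_b, mix_b)
    return ab / math.sqrt(aa * bb)
```

Comparing two mixtures with K kernels each still costs K × K kernel evaluations under either constraint; what the constraints reduce is the work inside each evaluation.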
- Authors:
- Kelly, P M; Cannon, M; Barros, J E
- Publication Date:
- 1996
- Research Org.:
- Los Alamos National Lab., NM (United States)
- Sponsoring Org.:
- USDOE, Washington, DC (United States)
- OSTI Identifier:
- 211673
- Report Number(s):
- LA-UR-96-0062; CONF-960171-2; ON: DE96007195
- DOE Contract Number:
- W-7405-ENG-36
- Resource Type:
- Conference
- Resource Relation:
- Conference: IS&T/SPIE symposium on electronic imaging: science & technology, Bellingham, WA (United States), 29 Jan - 2 Feb 1996; Other Information: PBD: [1996]
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 99 MATHEMATICS, COMPUTERS, INFORMATION SCIENCE, MANAGEMENT, LAW, MISCELLANEOUS; 42 ENGINEERING NOT INCLUDED IN OTHER CATEGORIES; IMAGE PROCESSING; ALGORITHMS; PROBABILITY; EFFICIENCY; DISTANCE; VECTORS; DISTRIBUTION FUNCTIONS
Citation Formats
Kelly, P M, Cannon, M, and Barros, J E. Efficiency issues related to probability density function comparison. United States: N. p., 1996. Web.
Kelly, P M, Cannon, M, & Barros, J E. Efficiency issues related to probability density function comparison. United States.
Kelly, P M, Cannon, M, and Barros, J E. 1996. "Efficiency issues related to probability density function comparison". United States. https://www.osti.gov/servlets/purl/211673.
@article{osti_211673,
title = {Efficiency issues related to probability density function comparison},
author = {Kelly, P M and Cannon, M and Barros, J E},
abstractNote = {The CANDID project (Comparison Algorithm for Navigating Digital Image Databases) employs probability density functions (PDFs) of localized feature information to represent the content of an image for search and retrieval purposes. A similarity measure between PDFs is used to identify database images that are similar to a user-provided query image. Unfortunately, signature comparison involving PDFs is a very time-consuming operation. In this paper, we look into some efficiency considerations when working with PDFs. Since PDFs can take on many forms, we look into tradeoffs between accurate representation and efficiency of manipulation for several data sets. In particular, we typically represent each PDF as a Gaussian mixture (i.e., as a weighted sum of Gaussian kernels) in the feature space. We find that by constraining all Gaussian kernels to have principal axes that are aligned to the natural axes of the feature space, computations involving these PDFs are simplified. We can also constrain the Gaussian kernels to be hyperspherical rather than hyperellipsoidal, simplifying computations even further, and yielding an order of magnitude speedup in signature comparison. This paper illustrates the tradeoffs encountered when using these constraints.},
doi = {},
url = {https://www.osti.gov/biblio/211673},
journal = {},
number = {},
volume = {},
place = {United States},
year = {1996},
month = {3}
}