OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Evaluation of information-theoretic similarity measures for content-based retrieval and detection of masses in mammograms

Abstract

The purpose of this study was to evaluate image similarity measures employed in an information-theoretic computer-assisted detection (IT-CAD) scheme. The scheme was developed for content-based retrieval and detection of masses in screening mammograms. The study is aimed toward an interactive clinical paradigm where physicians query the proposed IT-CAD scheme on mammographic locations that are either visually suspicious or indicated as suspicious by other cuing CAD systems. The IT-CAD scheme provides an evidence-based, second opinion for query mammographic locations using a knowledge database of mass and normal cases. In this study, eight entropy-based similarity measures were compared with respect to retrieval precision and detection accuracy using a database of 1820 mammographic regions of interest. The IT-CAD scheme was then validated on a separate database for false positive reduction of progressively more challenging visual cues generated by an existing, in-house mass detection system. The study showed that the image similarity measures fall into one of two categories; one category is better suited to the retrieval of semantically similar cases while the second is more effective with knowledge-based decisions regarding the presence of a true mass in the query location. In addition, the IT-CAD scheme yielded a substantial reduction in false-positive detections while maintaining high detection rate for malignant masses.
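
The abstract does not name the eight entropy-based measures, so the following is only an illustrative sketch of one common measure of that kind: histogram-based mutual information between a query region of interest and a stored case. The function name `mutual_information`, the 16-bin quantization, and the 8-bit intensity range are assumptions made for the example, not details taken from the study.

```python
import math
from collections import Counter


def mutual_information(roi_a, roi_b, bins=16, max_value=255):
    """Histogram-based mutual information (in bits) between two ROIs.

    roi_a, roi_b: flat sequences of pixel intensities in [0, max_value],
    of equal length (hypothetical input format for this sketch).
    """
    if len(roi_a) != len(roi_b) or len(roi_a) == 0:
        raise ValueError("ROIs must be non-empty and the same size")

    def quantize(value):
        # Map an intensity to one of `bins` equal-width histogram bins.
        return min(int(value) * bins // (max_value + 1), bins - 1)

    qa = [quantize(v) for v in roi_a]
    qb = [quantize(v) for v in roi_b]
    n = len(qa)

    joint = Counter(zip(qa, qb))   # joint histogram counts
    marg_a = Counter(qa)           # marginal counts for ROI A
    marg_b = Counter(qb)           # marginal counts for ROI B

    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (count / n) * math.log2(count * n / (marg_a[a] * marg_b[b]))
    return mi
```

Under these assumptions, a query region could be compared against every region in a knowledge database and the stored cases ranked by the resulting score; how the study actually combines such scores into a retrieval or detection decision is not specified in the abstract.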

Authors:
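Tourassi, Georgia D.; Harrawood, Brian; Singh, Swatee; Lo, Joseph Y.; Floyd, Carey E.
  1. Digital Advanced Imaging Laboratories, Department of Radiology, Duke University Medical Center, Durham, North Carolina 27705 (United States)
  2. Department of Biomedical Engineering, Duke University, Durham, North Carolina 27710 (United States)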
Publication Date:
2007-01-15
OSTI Identifier:
20853904
Resource Type:
Journal Article
Resource Relation:
Journal Name: Medical Physics; Journal Volume: 34; Journal Issue: 1; Other Information: DOI: 10.1118/1.2401667; (c) 2007 American Association of Physicists in Medicine; Country of input: International Atomic Energy Agency (IAEA)
Country of Publication:
United States
Language:
English
Subject:
62 RADIOLOGY AND NUCLEAR MEDICINE; ACCURACY; BIOMEDICAL RADIOGRAPHY; CARCINOMAS; DIAGNOSIS; ENTROPY; EVALUATION; IMAGES; MAMMARY GLANDS; SCREENING

Citation Formats

Tourassi, Georgia D., Harrawood, Brian, Singh, Swatee, Lo, Joseph Y., and Floyd, Carey E. Evaluation of information-theoretic similarity measures for content-based retrieval and detection of masses in mammograms. United States: N. p., 2007. Web. doi:10.1118/1.2401667.
Tourassi, Georgia D., Harrawood, Brian, Singh, Swatee, Lo, Joseph Y., & Floyd, Carey E. (2007). Evaluation of information-theoretic similarity measures for content-based retrieval and detection of masses in mammograms. United States. doi:10.1118/1.2401667.
Tourassi, Georgia D., Harrawood, Brian, Singh, Swatee, Lo, Joseph Y., and Floyd, Carey E. 2007. "Evaluation of information-theoretic similarity measures for content-based retrieval and detection of masses in mammograms". United States. doi:10.1118/1.2401667.
@article{osti_20853904,
title = {Evaluation of information-theoretic similarity measures for content-based retrieval and detection of masses in mammograms},
author = {Tourassi, Georgia D. and Harrawood, Brian and Singh, Swatee and Lo, Joseph Y. and Floyd, Carey E.},
affiliation = {Digital Advanced Imaging Laboratories, Department of Radiology, Duke University Medical Center, Durham, North Carolina 27705; Department of Biomedical Engineering, Duke University, Durham, North Carolina 27710},
abstractNote = {The purpose of this study was to evaluate image similarity measures employed in an information-theoretic computer-assisted detection (IT-CAD) scheme. The scheme was developed for content-based retrieval and detection of masses in screening mammograms. The study is aimed toward an interactive clinical paradigm where physicians query the proposed IT-CAD scheme on mammographic locations that are either visually suspicious or indicated as suspicious by other cuing CAD systems. The IT-CAD scheme provides an evidence-based, second opinion for query mammographic locations using a knowledge database of mass and normal cases. In this study, eight entropy-based similarity measures were compared with respect to retrieval precision and detection accuracy using a database of 1820 mammographic regions of interest. The IT-CAD scheme was then validated on a separate database for false positive reduction of progressively more challenging visual cues generated by an existing, in-house mass detection system. The study showed that the image similarity measures fall into one of two categories; one category is better suited to the retrieval of semantically similar cases while the second is more effective with knowledge-based decisions regarding the presence of a true mass in the query location. In addition, the IT-CAD scheme yielded a substantial reduction in false-positive detections while maintaining high detection rate for malignant masses.},
doi = {10.1118/1.2401667},
journal = {Medical Physics},
number = 1,
volume = 34,
place = {United States},
year = {2007},
month = jan
}
  • Ensemble classifiers have been shown to be efficient in multiple applications. In this article, the authors explore the effectiveness of ensemble classifiers in a case-based computer-aided diagnosis system for detection of masses in mammograms. They evaluate two general ways of constructing subclassifiers by resampling of the available development dataset: random division and random selection. Furthermore, they discuss the problem of selecting the ensemble size and propose two adaptive incremental techniques that automatically select the size for the problem at hand. All the techniques are evaluated with respect to a previously proposed information-theoretic CAD system (IT-CAD). The experimental results show that the examined ensemble techniques provide a statistically significant improvement (AUC=0.905±0.024) in performance as compared to the original IT-CAD system (AUC=0.865±0.029). Some of the techniques allow for a notable reduction in the total number of examples stored in the case base (to 1.3% of the original size), which, in turn, results in lower storage requirements and a shorter response time of the system. Among the methods examined in this article, the two proposed adaptive techniques are by far the most effective for this purpose. Furthermore, the authors provide some discussion and guidance for choosing the ensemble parameters.
  • The presentation of images that are similar to that of an unknown lesion seen on a mammogram may be helpful for radiologists to correctly diagnose that lesion. For similar images to be useful, they must be quite similar from the radiologists' point of view. We have been trying to quantify the radiologists' impression of similarity for pairs of lesions and to establish a "gold standard" for development and evaluation of a computerized scheme for selecting such similar images. However, it is considered difficult to reliably and accurately determine similarity ratings, because they are subjective. In this study, we compared the subjective similarities obtained by two different methods, an absolute rating method and a 2-alternative forced-choice (2AFC) method, to demonstrate that reliable similarity ratings can be determined by the responses of a group of radiologists. The absolute similarity ratings were previously obtained for pairs of masses and pairs of microcalcifications from five and nine radiologists, respectively. In this study, similarity ranking scores for eight pairs of masses and eight pairs of microcalcifications were determined by use of the 2AFC method. In the first session, the eight pairs of masses and eight pairs of microcalcifications were grouped and compared separately for determining the similarity ranking scores. In the second session, another similarity ranking score was determined by use of mixed pairs, i.e., by comparison of the similarity of a mass pair with that of a calcification pair. Four pairs of masses and four pairs of microcalcifications were grouped together to create two sets of eight pairs. The average absolute similarity ratings and the average similarity ranking scores showed very good correlations in the first study (Pearson's correlation coefficients: 0.94 and 0.98 for masses and microcalcifications, respectively). Moreover, in the second study, the correlations between the absolute ratings and the ranking scores were also very high (0.92 and 0.96), which implies that the observers were able to compare the similarity of a mass pair with that of a calcification pair consistently. These results provide evidence that the concept of similarity for pairs of images is robust, even across different lesion types, and that radiologists are able to reliably determine subjective similarity for pairs of breast lesions.
  • The presentation of images with lesions of known pathology that are similar to an unknown lesion may be helpful to radiologists in the diagnosis of challenging cases for improving the diagnostic accuracy and also for reducing variation among different radiologists. The authors have been developing a computerized scheme for automatically selecting similar images with clustered microcalcifications on mammograms from a large database. For similar images to be useful, they must be similar from the point of view of the diagnosing radiologists. In order to select such images, subjective similarity ratings were obtained for a number of pairs of clustered microcalcifications by breast radiologists for establishment of a "gold standard" of image similarity, and the gold standard was employed for determination and evaluation of the selection of similar images. The images used in this study were obtained from the Digital Database for Screening Mammography developed by the University of South Florida. The subjective similarity ratings for 300 pairs of images with clustered microcalcifications were determined by ten breast radiologists. The authors determined a number of image features which represent the characteristics of clustered microcalcifications that radiologists would use in their diagnosis. For determination of objective similarity measures, an artificial neural network (ANN) was employed. The ANN was trained with the average subjective similarity ratings as teacher and selected image features as input data. The ANN was trained to learn the relationship between the image features and the radiologists' similarity ratings; therefore, once the training was completed, the ANN was able to determine the similarity, called a psychophysical similarity measure, which was expected to be close to radiologists' impressions, for an unknown pair of clustered microcalcifications. By use of a leave-one-out test method, the best combination of features was selected. The correlation coefficient between the gold standard and the psychophysical similarity measure through the use of seven features was relatively high (r=0.71) and was comparable to the correlation coefficients between the ratings by one radiologist and the average ratings by nine radiologists (r=0.69±0.07). The correlation coefficient was improved compared to that of a distance-based method (r=0.58). The result indicated that similar images selected by the psychophysical similarity measure may be useful to radiologists in the diagnosis of clustered microcalcifications on mammograms.
  • Purpose: A promising patient positioning technique is based on registering computed tomographic (CT) or magnetic resonance (MR) images to cone-beam CT images (CBCT). The extra radiation dose delivered to the patient can be substantially reduced by using fewer projections. This approach results in lower quality CBCT images. The purpose of this study is to evaluate a number of similarity measures (SMs) suitable for registration of CT or MR images to low-quality CBCTs. Methods and Materials: Using the recently proposed evaluation protocol, we evaluated nine SMs with respect to pretreatment imaging modalities, number of two-dimensional (2D) images used for reconstruction, and number of reconstruction iterations. The image database consisted of 100 X-ray and corresponding CT and MR images of two vertebral columns. Results: Using a higher number of 2D projections or reconstruction iterations results in higher accuracy and slightly lower robustness. The similarity measures that behaved the best also yielded the best registration results. The most appropriate similarity measure was the asymmetric multi-feature mutual information (AMMI). Conclusions: The evaluation protocol proved to be a valuable tool for selecting the best similarity measure for the reconstruction-based registration. The results indicate that accurate and robust CT/CBCT or even MR/CBCT registrations are possible if the AMMI similarity measure is used.
  • Purpose: Rigid 2D-3D registration is an alternative to 3D-3D registration for cases where largely bony anatomy can be used for patient positioning in external beam radiation therapy. In this article, the authors evaluated seven similarity measures for use in the intensity-based rigid 2D-3D registration using a variation in Skerl's similarity measure evaluation protocol. Methods: The seven similarity measures are partitioned intensity uniformity, normalized mutual information (NMI), normalized cross correlation (NCC), entropy of the difference image, pattern intensity (PI), gradient correlation (GC), and gradient difference (GD). In contrast to traditional evaluation methods that rely on visual inspection or registration outcomes, the similarity measure evaluation protocol probes the transform parameter space and computes a number of similarity measure properties, which is objective and optimization method independent. The variation in protocol offers an improved property in the quantification of the capture range. The authors used this protocol to investigate the effects of the downsampling ratio, the region of interest, and the method of the digitally reconstructed radiograph (DRR) calculation [i.e., the incremental ray-tracing method implemented on a central processing unit (CPU) or the 3D texture rendering method implemented on a graphics processing unit (GPU)] on the performance of the similarity measures. The studies were carried out using both the kilovoltage (kV) and the megavoltage (MV) images of an anthropomorphic cranial phantom and the MV images of a head-and-neck cancer patient. Results: Both the phantom and the patient studies showed the 2D-3D registration using the GPU-based DRR calculation yielded better robustness, while providing similar accuracy compared to the CPU-based calculation. The phantom study using kV imaging suggested that NCC has the best accuracy and robustness, but its slow function value change near the global maximum requires a stricter termination condition for an optimization method. The phantom study using MV imaging indicated that PI, GD, and GC have the best accuracy, while NCC and NMI have the best robustness. The clinical study using MV imaging showed that NCC and NMI have the best robustness. Conclusions: The authors evaluated the performance of seven similarity measures for use in 2D-3D image registration using the variation in Skerl's similarity measure evaluation protocol. The generalized methodology can be used to select the best similarity measures, determine the optimal or near optimal choice of parameter, and choose the appropriate registration strategy for the end user in his specific registration applications in medical imaging.