Title: Effective and efficient subjective testing of texture similarity metrics

Authors:
Zujovic, Jana; Pappas, Thrasyvoulos N.; Neuhoff, David L.; van Egmond, René; de Ridder, Huib
Publication Date:
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
Grant/Contract Number:
Resource Type:
Journal Article: Publisher's Accepted Manuscript
Journal Name:
Journal of the Optical Society of America A
Additional Journal Information:
Journal Volume: 32; Journal Issue: 2; Related Information: CHORUS Timestamp: 2017-06-23 00:30:41; Journal ID: ISSN 1084-7529
Publisher:
Optical Society of America
Country of Publication:
United States

Citation Formats

Zujovic, Jana, Pappas, Thrasyvoulos N., Neuhoff, David L., van Egmond, René, and de Ridder, Huib. Effective and efficient subjective testing of texture similarity metrics. United States: N. p., 2015. Web. doi:10.1364/JOSAA.32.000329.
Zujovic, Jana, Pappas, Thrasyvoulos N., Neuhoff, David L., van Egmond, René, & de Ridder, Huib. Effective and efficient subjective testing of texture similarity metrics. United States. doi:10.1364/JOSAA.32.000329.
Zujovic, Jana, Pappas, Thrasyvoulos N., Neuhoff, David L., van Egmond, René, and de Ridder, Huib. 2015. "Effective and efficient subjective testing of texture similarity metrics". United States. doi:10.1364/JOSAA.32.000329.
title = {Effective and efficient subjective testing of texture similarity metrics},
author = {Zujovic, Jana and Pappas, Thrasyvoulos N. and Neuhoff, David L. and van Egmond, René and de Ridder, Huib},
abstractNote = {},
doi = {10.1364/JOSAA.32.000329},
journal = {Journal of the Optical Society of America A},
number = 2,
volume = 32,
place = {United States},
year = 2015,
month = 1

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record at 10.1364/JOSAA.32.000329

Citation Metrics:
Cited by: 2 works
Citation information provided by
Web of Science

  • The presentation of images that are similar to that of an unknown lesion seen on a mammogram may be helpful for radiologists to correctly diagnose that lesion. For similar images to be useful, they must be quite similar from the radiologists' point of view. We have been trying to quantify the radiologists' impression of similarity for pairs of lesions and to establish a "gold standard" for development and evaluation of a computerized scheme for selecting such similar images. However, it is considered difficult to reliably and accurately determine similarity ratings, because they are subjective. In this study, we compared the subjective similarities obtained by two different methods, an absolute rating method and a 2-alternative forced-choice (2AFC) method, to demonstrate that reliable similarity ratings can be determined by the responses of a group of radiologists. The absolute similarity ratings were previously obtained for pairs of masses and pairs of microcalcifications from five and nine radiologists, respectively. In this study, similarity ranking scores for eight pairs of masses and eight pairs of microcalcifications were determined by use of the 2AFC method. In the first session, the eight pairs of masses and eight pairs of microcalcifications were grouped and compared separately for determining the similarity ranking scores. In the second session, another similarity ranking score was determined by use of mixed pairs, i.e., by comparison of the similarity of a mass pair with that of a calcification pair. Four pairs of masses and four pairs of microcalcifications were grouped together to create two sets of eight pairs. The average absolute similarity ratings and the average similarity ranking scores showed very good correlations in the first study (Pearson's correlation coefficients: 0.94 and 0.98 for masses and microcalcifications, respectively). Moreover, in the second study, the correlations between the absolute ratings and the ranking scores were also very high (0.92 and 0.96), which implies that the observers were able to compare the similarity of a mass pair with that of a calcification pair consistently. These results provide evidence that the concept of similarity for pairs of images is robust, even across different lesion types, and that radiologists are able to reliably determine subjective similarity for pairs of breast lesions.
  • Presentation of images of lesions similar to that of an unknown lesion might be useful to radiologists in distinguishing between benign and malignant clustered microcalcifications on mammograms. Investigators have been developing computerized schemes to select similar images from large databases. However, whether selected images are really similar in appearance is not examined for most of the schemes. In order to retrieve images that are useful to radiologists, the selected images must be similar from radiologists' diagnostic points of view. Therefore, in this study, the data of radiologists' subjective similarity for pairs of clustered microcalcification images were obtained from a number of observers, and the intra- and inter-observer variations and the intergroup correlations were determined to investigate whether reliable similarity ratings by human observers can be determined. Nineteen images of clustered microcalcifications, each of which was paired with six other images, were selected for the observer study. Thus, subjective similarity ratings for 114 pairs of clustered microcalcifications were determined by each observer. Thirteen breast, ten general, and ten nonradiologists participated in the observer study; some of them completed the study multiple times. Although the intraobserver variations for the individual readings and the interobserver variations for pairs of observers were not small, the interobserver agreements were improved by taking the average of readings by the same observers. When the similarity ratings by a number of observers were averaged among the groups of breast, general, and nonradiologists, the mean differences of the ratings between the groups decreased, and good concordance correlations (0.846, 0.817, and 0.785) between the groups were obtained. The result indicates that reliable similarity ratings can be determined by use of this method, and the average similarity ratings by breast radiologists can be considered meaningful and useful for the development and evaluation of a computerized scheme for selection of similar images.
  • Purpose: To evaluate different similarity metrics (SM) using natural calcifications and observation-based measures to determine the most accurate prostate and seminal vesicle localization on daily cone-beam CT (CBCT) images. Methods and Materials: CBCT images of 29 patients were retrospectively analyzed; 14 patients with prostate calcifications (calcification data set) and 15 patients without calcifications (no-calcification data set). Three groups of test registrations were performed. Test 1: 70 CT/CBCT pairs from the calcification data set were registered using 17 SMs (6,580 registrations) and compared using the calcification mismatch error as an endpoint. Test 2: Using the four best SMs from Test 1, 75 CT/CBCT pairs in the no-calcification data set were registered (300 registrations). Accuracy of contour overlays was ranked visually. Test 3: For the best SM from Tests 1 and 2, accuracy was estimated using 356 CT/CBCT registrations. Additionally, target expansion margins were investigated for generating registration regions of interest. Results: Test 1: Incremental sign correlation (ISC), gradient correlation (GC), gradient difference (GD), and normalized cross correlation (NCC) showed the smallest errors (μ ± σ: 1.6 ± 0.9 ~ 2.9 ± 2.1 mm). Test 2: Two of the three reviewers ranked GC higher. Test 3: Using GC, 96% of registrations showed <3-mm error when calcifications were filtered. Errors were left/right: 0.1 ± 0.5 mm, anterior/posterior: 0.8 ± 1.0 mm, and superior/inferior: 0.5 ± 1.1 mm. The existence of calcifications increased the success rate to 97%. Expansion margins of 4-10 mm were equally successful. Conclusion: Gradient-based SMs were most accurate. Estimated error was found to be <3 mm (1.1 mm SD) in 96% of the registrations. Results suggest that the contour expansion margin should be no less than 4 mm.
  • Purpose: In image-guided spine surgery, mapping 3D preoperative images to 2D intraoperative images via 3D-2D registration can provide valuable assistance in target localization. However, the presence of surgical instrumentation, hardware implants, and soft-tissue resection/displacement causes mismatches in image content, confounding existing registration methods. Manual/semi-automatic masking of such extraneous content is time consuming, user-dependent, error prone, and disruptive to clinical workflow. We developed and evaluated two novel similarity metrics within a robust registration framework to overcome such challenges in target localization. Methods: An IRB-approved retrospective study in 19 spine surgery patients included 19 preoperative 3D CT images and 50 intraoperative mobile radiographs in cervical, thoracic, and lumbar spine regions. A neuroradiologist provided truth definition of vertebral positions in CT and radiography. 3D-2D registration was performed using the CMA-ES optimizer with 4 gradient-based image similarity metrics: (1) gradient information (GI); (2) gradient correlation (GC); (3) a novel variant referred to as gradient orientation (GO); and (4) a second variant referred to as truncated gradient correlation (TGC). Registration accuracy was evaluated in terms of the projection distance error (PDE) of the vertebral levels. Results: Conventional similarity metrics were susceptible to gross registration error and failure modes associated with the presence of surgical instrumentation: for GI, the median PDE and interquartile range were 33.0 ± 43.6 mm; similarly, for GC, PDE = 23.0 ± 92.6 mm. The robust metrics GO and TGC, on the other hand, demonstrated major improvement in PDE (7.6 ± 9.4 mm and 8.1 ± 18.1 mm, respectively) and elimination of gross failure modes. Conclusion: The proposed GO and TGC similarity measures improve registration accuracy and robustness to gross failure in the presence of strong image content mismatch. Such registration capability could offer valuable assistance in target localization without disruption of clinical workflow. G. Kleinszig and S. Vogt are employees of Siemens Healthcare.
  • Purpose: 4D imaging modalities require detailed characterization for clinical optimization. The On-Board Imager mounted on the linear accelerator was used to investigate dose rates in a tissue mimicking phantom using 4D-CBCT and assess variability of contouring similarity metrics between 4D-CT and 4D-CBCT retrospective reconstructions. Methods: A 125 kVp thoracic protocol was used. A phantom placed on a motion platform simulated a patient’s breathing cycle. An ion chamber was affixed inside the phantom’s tissue mimicking cavities (i.e. bone, lung, and soft tissue). A sinusoidal motion waveform was executed with a five second period and superior-inferior motion. Dose rates were measured at six ion chamber positions. A preliminary workflow for contouring similarity between 4D-CT and 4D-CBCT was established using a single lung SBRT patient’s historical data. Average intensity projection (Ave-IP) and maximum intensity projection (MIP) reconstructions generated offline were compared between the 4D modalities. Similarity metrics included Dice similarity coefficient (DSC), Hausdorff distance, and center of mass (COM) deviation. Two isolated lesions were evaluated in the patient’s scans: one located in the right lower lobe (ITVRLL) and one located in the left lower lobe (ITVLLL). Results: Dose rates ranged from 2.30 × 10⁻³ (lung) to 5.18 × 10⁻³ (bone) cGy/mAs. For fixed acquisition parameters, cumulative dose is inversely proportional to gantry speed. For ITVRLL, DSC were 0.70 and 0.68, Hausdorff distances were 6.11 and 5.69 mm, and COM deviations were 1.24 and 4.77 mm, for Ave-IP and MIP respectively. For ITVLLL, DSC were 0.64 and 0.75, Hausdorff distances were 10.74 and 8.00 mm, and COM deviations were 7.55 and 4.3 mm, for Ave-IP and MIP respectively. Conclusion: While the dosimetric output of 4D-CBCT is low, characterization is necessary to assure clinical optimization. A basic workflow for comparison of simulation and treatment 4D image-based contours was established. This work was partially supported by a Research Scholar Grant (RSG-15-137-01-CCE) from the American Cancer Society.
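
The three contour similarity metrics named in the last abstract (Dice similarity coefficient, Hausdorff distance, and center of mass deviation) can be illustrated with a minimal sketch on binary masks. This is not the authors' workflow: the function names, the `spacing` parameter, and the toy masks are assumptions made for illustration, and the Hausdorff distance here is taken over all mask voxels by brute force rather than over extracted contour surfaces.

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient of two binary masks (1.0 if both are empty)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0
    return float(2.0 * np.logical_and(a, b).sum() / total)

def _points(mask, spacing):
    """Physical coordinates of the nonzero voxels (index times voxel spacing)."""
    return np.argwhere(np.asarray(mask, dtype=bool)) * np.asarray(spacing, dtype=float)

def com_deviation(a, b, spacing):
    """Euclidean distance between the centers of mass of two non-empty masks."""
    return float(np.linalg.norm(_points(a, spacing).mean(axis=0)
                                - _points(b, spacing).mean(axis=0)))

def hausdorff(a, b, spacing):
    """Symmetric Hausdorff distance between the voxel sets of two non-empty masks.

    Brute force: builds the full pairwise distance matrix, so it is only
    suitable for small masks.
    """
    pa, pb = _points(a, spacing), _points(b, spacing)
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))

# Toy example (hypothetical data): a 4x4 square and the same square shifted by one row.
mask_a = np.zeros((10, 10), dtype=bool)
mask_b = np.zeros((10, 10), dtype=bool)
mask_a[2:6, 2:6] = True
mask_b[3:7, 2:6] = True

print(dice(mask_a, mask_b))                       # overlap 12 of 16 voxels each
print(com_deviation(mask_a, mask_b, (1.0, 1.0)))  # shift of one row
print(hausdorff(mask_a, mask_b, (1.0, 1.0)))
```

For clinical-size volumes, a surface-based or KD-tree-based Hausdorff computation would be a more practical choice than the pairwise matrix used in this sketch.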