OSTI.GOV title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: Predicting diagnostic error in Radiology via eye-tracking and image analytics: Application in mammography

Abstract

Purpose: The primary aim of the present study was to test the feasibility of predicting diagnostic errors in mammography by merging radiologists' gaze behavior and image characteristics. A secondary aim was to investigate group-based and personalized predictive models for radiologists of variable experience levels. Methods: The study was performed for the clinical task of assessing the likelihood of malignancy of mammographic masses. Eye-tracking data and diagnostic decisions for 40 cases were acquired from 4 radiology residents and 2 breast imaging experts as part of an IRB-approved pilot study. Gaze behavior features were extracted from the eye-tracking data. Computer-generated and BI-RADS image features were extracted from the images. Finally, machine learning algorithms were used to merge gaze and image features for predicting human error. Feature selection was thoroughly explored to determine the relative contribution of the various features. Group-based and personalized user modeling was also investigated. Results: Diagnostic error can be predicted reliably by merging gaze behavior characteristics from the radiologist and textural characteristics from the image under review. Leveraging data collected from multiple readers produced a reasonable group model (AUC = 0.79). Personalized user modeling was far more accurate for the more experienced readers (average AUC of 0.837 ± 0.029) than for the less experienced ones (average AUC of 0.667 ± 0.099). The best performing group-based and personalized predictive models involved combinations of both gaze and image features. Conclusions: Diagnostic errors in mammography can be predicted reliably by leveraging the radiologists' gaze behavior and image content.
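The abstract reports model quality as the area under the ROC curve (AUC) for predictions built from merged gaze and image features. The following is a minimal sketch of those two ideas, not the authors' implementation: `merge_features` assumes simple concatenation of the two feature vectors (the abstract does not say how the features were merged), and `auc_from_scores` computes AUC through the standard Mann-Whitney pairwise formulation. All function names and the toy numbers are hypothetical.

```python
from typing import List, Sequence


def merge_features(gaze: Sequence[float], image: Sequence[float]) -> List[float]:
    """Concatenate gaze and image features into one vector.

    Assumption: plain concatenation; the source does not specify the merge.
    """
    return list(gaze) + list(image)


def auc_from_scores(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney pairwise statistic.

    labels: 1 where the reader made a diagnostic error, 0 otherwise.
    scores: predicted error likelihood (higher means error is more likely).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one error case and one non-error case")
    # Count (error, non-error) pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: five cases, two of them read incorrectly.
    labels = [1, 0, 1, 0, 0]
    scores = [0.9, 0.2, 0.6, 0.7, 0.1]
    print(round(auc_from_scores(labels, scores), 3))
```

An AUC of 0.5 corresponds to chance-level ranking of error versus non-error cases, and 1.0 to perfect ranking, which is the scale on which the reported values should be read.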

Authors:
Voisin, Sophie [1]; Pinto, Frank M [1]; Morin-Ducote, Garnetta [2]; Hudson, Kathy [2]; Tourassi, Georgia [1]
  1. ORNL
  2. University of Tennessee, Knoxville (UTK)
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Laboratory Directed Research and Development (LDRD) Program
OSTI Identifier:
1114262
DOE Contract Number:
DE-AC05-00OR22725
Resource Type:
Journal Article
Resource Relation:
Journal Name: Medical Physics; Journal Volume: 40; Journal Issue: 10
Country of Publication:
United States
Language:
English
Subject:
Modeling; diagnostic radiology error; mammography; eye-tracking; machine learning

Citation Formats

Voisin, Sophie, Pinto, Frank M, Morin-Ducote, Garnetta, Hudson, Kathy, and Tourassi, Georgia. Predicting diagnostic error in Radiology via eye-tracking and image analytics: Application in mammography. United States: N. p., 2013. Web. doi:10.1118/1.4820536.
Voisin, Sophie, Pinto, Frank M, Morin-Ducote, Garnetta, Hudson, Kathy, & Tourassi, Georgia. Predicting diagnostic error in Radiology via eye-tracking and image analytics: Application in mammography. United States. doi:10.1118/1.4820536.
Voisin, Sophie, Pinto, Frank M, Morin-Ducote, Garnetta, Hudson, Kathy, and Tourassi, Georgia. 2013. "Predicting diagnostic error in Radiology via eye-tracking and image analytics: Application in mammography". United States. doi:10.1118/1.4820536.
@article{osti_1114262,
title = {Predicting diagnostic error in Radiology via eye-tracking and image analytics: Application in mammography},
author = {Voisin, Sophie and Pinto, Frank M and Morin-Ducote, Garnetta and Hudson, Kathy and Tourassi, Georgia},
abstractNote = {Purpose: The primary aim of the present study was to test the feasibility of predicting diagnostic errors in mammography by merging radiologists' gaze behavior and image characteristics. A secondary aim was to investigate group-based and personalized predictive models for radiologists of variable experience levels. Methods: The study was performed for the clinical task of assessing the likelihood of malignancy of mammographic masses. Eye-tracking data and diagnostic decisions for 40 cases were acquired from 4 radiology residents and 2 breast imaging experts as part of an IRB-approved pilot study. Gaze behavior features were extracted from the eye-tracking data. Computer-generated and BI-RADS image features were extracted from the images. Finally, machine learning algorithms were used to merge gaze and image features for predicting human error. Feature selection was thoroughly explored to determine the relative contribution of the various features. Group-based and personalized user modeling was also investigated. Results: Diagnostic error can be predicted reliably by merging gaze behavior characteristics from the radiologist and textural characteristics from the image under review. Leveraging data collected from multiple readers produced a reasonable group model (AUC = 0.79). Personalized user modeling was far more accurate for the more experienced readers (average AUC of 0.837 ± 0.029) than for the less experienced ones (average AUC of 0.667 ± 0.099). The best performing group-based and personalized predictive models involved combinations of both gaze and image features. Conclusions: Diagnostic errors in mammography can be predicted reliably by leveraging the radiologists' gaze behavior and image content.},
doi = {10.1118/1.4820536},
journal = {Medical Physics},
number = 10,
volume = 40,
place = {United States},
year = 2013,
month = 1
}
  • Purpose: The primary aim of the present study was to test the feasibility of predicting diagnostic errors in mammography by merging radiologists’ gaze behavior and image characteristics. A secondary aim was to investigate group-based and personalized predictive models for radiologists of variable experience levels. Methods: The study was performed for the clinical task of assessing the likelihood of malignancy of mammographic masses. Eye-tracking data and diagnostic decisions for 40 cases were acquired from four radiology residents and two breast imaging experts as part of an IRB-approved pilot study. Gaze behavior features were extracted from the eye-tracking data. Computer-generated and BI-RADS image features were extracted from the images. Finally, machine learning algorithms were used to merge gaze and image features for predicting human error. Feature selection was thoroughly explored to determine the relative contribution of the various features. Group-based and personalized user modeling was also investigated. Results: Machine learning can be used to predict diagnostic error by merging gaze behavior characteristics from the radiologist and textural characteristics from the image under review. Leveraging data collected from multiple readers produced a reasonable group model [area under the ROC curve (AUC) = 0.792 ± 0.030]. Personalized user modeling was far more accurate for the more experienced readers (AUC = 0.837 ± 0.029) than for the less experienced ones (AUC = 0.667 ± 0.099). The best performing group-based and personalized predictive models involved combinations of both gaze and image features. Conclusions: Diagnostic errors in mammography can be predicted to a good extent by leveraging the radiologists’ gaze behavior and image content.
  • Different computational methods based on empirical or semi-empirical models and sophisticated Monte Carlo calculations have been proposed for prediction of x-ray spectra both in diagnostic radiology and mammography. In this work, the x-ray spectra predicted by various computational models used in the diagnostic radiology and mammography energy range have been assessed by comparison with measured spectra, and their effect on the calculation of absorbed dose and effective dose (ED) imparted to the adult ORNL hermaphroditic phantom has been quantified. This includes empirical models (TASMIP and MASMIP), semi-empirical models (X-rayb and m, X-raytbc, XCOMP, IPEM, Tucker et al., and Blough et al.), and Monte Carlo modeling (EGS4, ITS3.0, and MCNP4C). As part of the comparative assessment, the K x-ray yield, transmission curves, and half value layers (HVLs) have been calculated for the spectra generated with all computational models at different tube voltages. The measured x-ray spectra agreed well with the generated spectra when using X-raytbc and IPEM in diagnostic radiology and mammography energy ranges, respectively. Despite the systematic differences between the simulated and reference spectra for some models, the Student's t-test statistical analysis showed that there is no statistically significant difference between measured and generated spectra for all computational models investigated in this study. The MCNP4C-based Monte Carlo calculations showed that there is no discernible discrepancy in the calculation of absorbed dose and ED in the adult ORNL hermaphroditic phantom when using different computational models for generating the x-ray spectra. Nevertheless, given the limited flexibility of the empirical and semi-empirical models, the spectra obtained through Monte Carlo modeling offer several advantages by providing detailed information about the interactions in the target and filters, which is relevant for the design of new target and filter combinations and optimization of radiological imaging protocols.
  • Purpose: The purpose of this study is to explore Breast Imaging-Reporting and Data System (BI-RADS) features as predictors of individual errors made by trainees when detecting masses in mammograms. Methods: Ten radiology trainees and three expert breast imagers reviewed 100 mammograms comprising bilateral mediolateral oblique and craniocaudal views on a research workstation. The cases consisted of normal and biopsy-proven benign and malignant masses. For cases with actionable abnormalities, the experts recorded breast (density and axillary lymph nodes) and mass (shape, margin, and density) features according to the BI-RADS lexicon, as well as the abnormality location (depth and clock face). For each trainee, a user-specific multivariate model was constructed to predict the trainee's likelihood of error based on BI-RADS features. The performance of the models was assessed using the area under the receiver operating characteristic curve (AUC). Results: Despite the variability in errors between different trainees, the individual models were able to predict the likelihood of error for the trainees with a mean AUC of 0.611 (range: 0.502–0.739, 95% confidence interval: 0.543–0.680, p < 0.002). Conclusions: Patterns in detection errors for mammographic masses made by radiology trainees can be modeled using BI-RADS features. These findings may have potential implications for the development of future educational materials that are personalized to individual trainees.
  • Purpose: Mammography is the most widely accepted and utilized screening modality for early breast cancer detection. Providing high quality mammography education to radiology trainees is essential, since excellent interpretation skills are needed to ensure the highest benefit of screening mammography for patients. The authors have previously proposed a computer-aided education system based on trainee models. Those models relate human-assessed image characteristics to trainee error. In this study, the authors propose to build trainee models that utilize features automatically extracted from images using computer vision algorithms to predict the likelihood of the trainee missing each mass. This computer vision-based approach to trainee modeling will allow for automatically searching large databases of mammograms in order to identify challenging cases for each trainee. Methods: The authors’ algorithm for predicting the likelihood of missing a mass consists of three steps. First, a mammogram is segmented into air, pectoral muscle, fatty tissue, dense tissue, and mass using automated segmentation algorithms. Second, 43 features are extracted using computer vision algorithms for each abnormality identified by experts. Third, error-making models (classifiers) are applied to predict the likelihood of trainees missing the abnormality based on the extracted features. The models are developed individually for each trainee using his/her previous reading data. The authors evaluated the predictive performance of the proposed algorithm using data from a reader study in which 10 subjects (7 residents and 3 novices) and 3 experts read 100 mammographic cases. Receiver operating characteristic (ROC) methodology was applied for the evaluation. Results: The average area under the ROC curve (AUC) of the error-making models for the task of predicting which masses will be detected and which will be missed was 0.607 (95% CI, 0.564–0.650). This value was statistically significantly different from 0.5 (p < 0.0001). For the 7 residents only, the AUC performance of the models was 0.590 (95% CI, 0.537–0.642) and was also significantly higher than 0.5 (p = 0.0009). Therefore, generally the authors’ models were able to predict which masses were detected and which were missed better than chance. Conclusions: The authors proposed an algorithm that was able to predict which masses will be detected and which will be missed by each individual trainee. This confirms the existence of error-making patterns in the detection of masses among radiology trainees. Furthermore, the proposed methodology will allow for the optimized selection of difficult cases for the trainees in an automatic and efficient manner.
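The related records above summarize per-trainee models as a mean AUC with a 95% confidence interval and compare it against the chance level of 0.5. The sketch below illustrates one way such a summary could be computed from a list of per-reader AUC values. It assumes a normal-approximation interval (z = 1.96); the records do not state which interval method the authors used, and the function name and sample values are hypothetical.

```python
import math
import statistics
from typing import Sequence, Tuple


def summarize_aucs(
    aucs: Sequence[float], z: float = 1.96
) -> Tuple[float, float, Tuple[float, float]]:
    """Return (mean, sample standard deviation, confidence interval).

    Assumption: normal-approximation interval, mean +/- z * sd / sqrt(n).
    With few readers, a t-based interval would be wider.
    """
    if len(aucs) < 2:
        raise ValueError("need at least two per-reader AUC values")
    mean = statistics.mean(aucs)
    sd = statistics.stdev(aucs)  # sample standard deviation (n - 1 denominator)
    half_width = z * sd / math.sqrt(len(aucs))
    return mean, sd, (mean - half_width, mean + half_width)


if __name__ == "__main__":
    # Hypothetical per-reader AUCs, not values from the studies above.
    mean, sd, (low, high) = summarize_aucs([0.55, 0.60, 0.65])
    better_than_chance = low > 0.5  # interval lies entirely above 0.5
    print(mean, sd, low, high, better_than_chance)
```

If the lower bound of the interval exceeds 0.5, the models rank missed and detected cases better than chance on average, which is the kind of claim the records make.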