Proposing New RadLex Terms by Analyzing Free-Text Mammography Reports
Journal Article · Journal of Digital Imaging (Online)
- Stanford University, Department of Radiology and Department of Biomedical Data Science, Medical School Office Building (MSOB) (United States)
- University of Washington, Department of Radiology, Seattle Cancer Care Alliance (United States)
- University of Wisconsin School of Medicine and Public Health, Department of Radiology, E3/311 Clinical Science Center (United States)
After years of development, the RadLex terminology contains a large set of controlled terms for the radiology domain, but gaps still exist. We developed a data-driven approach that discovers new terms for RadLex by mining a large corpus of radiology reports with natural language processing (NLP) methods. Our system, developed for mammography, extracts noun phrases from free-text mammography reports and classifies each noun phrase as “Has Candidate RadLex Term” or “Does Not Have Candidate RadLex Term,” with the aim of extending the mammography part of RadLex. We tested the performance of our algorithm on 100 free-text mammography reports. An expert radiologist determined the true positive and true negative RadLex candidate terms, and we calculated precision (positive predictive value) and recall (sensitivity) to judge the system’s performance. Finally, to identify new candidate terms for enhancing RadLex, we applied our NLP method to 270,540 free-text mammography reports obtained from three academic institutions. In the evaluation, our method demonstrated a precision of 0.77 (159/206 terms) and a recall of 0.94 (159/170 terms); the overall accuracy of the system was 0.80 (235/293 terms). When we ran our system on the set of 270,540 reports, it found 31,800 unique noun phrases that are potential candidates for RadLex. Our data-driven approach to mining radiology reports can identify new candidate terms for expanding the breast imaging lexicon portion of RadLex and may be a useful approach for discovering new candidate terms in other radiology domains.
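The reported metrics can be checked from the counts given in the abstract. The sketch below is only an illustration: the confusion-matrix cells are derived from the stated fractions (159/206, 159/170, 235/293), and the class and function names are hypothetical, not part of the authors' system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix cells for the candidate-term classifier (hypothetical helper)."""
    tp: int  # phrases correctly flagged as having a candidate term
    fp: int  # phrases flagged by the system but rejected by the radiologist
    fn: int  # phrases the radiologist accepted but the system missed
    tn: int  # phrases correctly rejected

    @property
    def precision(self) -> float:
        # Positive predictive value: TP / (TP + FP)
        return self.tp / (self.tp + self.fp)

    @property
    def recall(self) -> float:
        # Sensitivity: TP / (TP + FN)
        return self.tp / (self.tp + self.fn)

    @property
    def accuracy(self) -> float:
        # (TP + TN) / all evaluated terms
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)


def counts_from_abstract() -> ConfusionCounts:
    """Derive the four cells from the fractions stated in the abstract."""
    tp = 159
    predicted_positive = 206  # denominator of the precision fraction
    actual_positive = 170     # denominator of the recall fraction
    correct, total = 235, 293  # numerator and denominator of the accuracy fraction
    fp = predicted_positive - tp
    fn = actual_positive - tp
    tn = correct - tp
    # The derived cells should add up to the stated total.
    assert tp + fp + fn + tn == total
    return ConfusionCounts(tp=tp, fp=fp, fn=fn, tn=tn)


counts = counts_from_abstract()
print(f"precision = {counts.precision:.2f}")
print(f"recall    = {counts.recall:.2f}")
print(f"accuracy  = {counts.accuracy:.2f}")
```

Under these derived counts (47 false positives, 11 false negatives, 76 true negatives), the three fractions are mutually consistent and sum to the 293 evaluated terms.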
- OSTI ID: 22795588
- Journal Information: Journal of Digital Imaging (Online), Vol. 31, Issue 5; ISSN 1618-727X
- Country of Publication: United States
- Language: English
Similar Records
Discovering Potential Precursors of Mammography Abnormalities based on Textual Features, Frequencies, and Sequences
Conference · Thu Dec 31 23:00:00 EST 2009 · OSTI ID:986826
A UMLS-based spell checker for natural language processing in vaccine safety
Journal Article · Sun Feb 11 19:00:00 EST 2007 · BMC Medical Informatics and Decision Making (Online) · OSTI ID:1626564
RECONCILE: a machine-learning coreference resolution system
Software · Mon Dec 10 00:00:00 EST 2007 · OSTI ID:1304621