U.S. Department of Energy
Office of Scientific and Technical Information

Deformable phrase level attention: A flexible approach for improving AI based medical coding

Journal Article · Artificial Intelligence in Medicine
 [1];  [2];  [3];  [4];  [4]
  1. Univ. of Tennessee, Knoxville, TN (United States). Bredesen Center for Interdisciplinary Research and Graduate Education; Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
  2. Thomson Reuters, Eagan, MN (United States)
  3. Elsevier, Philadelphia, PA (United States)
  4. Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Objective: Improving AI-driven automated medical encoding of clinical text plays a vital role in gathering information on the occurrence of diseases to improve population-level health. This work presents a novel attention mechanism designed to enhance text classification models and ensure appropriate classification of medical concepts in unstructured electronic health records.

Materials and Methods: We developed a deformable, phrase-level attention mechanism to identify important lexical word-level and contextual phrase-level information in clinical text documents. We extended conventional and transformer-based deep learning models with our attention mechanism and evaluated them on two tasks: the extraction of critical cancer information (e.g., site, subsite, laterality, histology, behavior) from 629,908 electronic pathology reports, and the automated medical encoding of 52,722 hospital discharge summaries.

Results: Transformer-based models with the deformable, phrase-level attention mechanism achieved the best performance on the extraction of critical cancer information from pathology reports. On the automated medical encoding of clinical documents, conventional and transformer-based models extended with the mechanism showed similar or better performance than their baseline counterparts.

Discussion: The addition of phrase-level information allowed models extended with our proposed method to outperform standard word-level attention. Our method showed favorable properties for real-world application in terms of model robustness and phenotyping. These results indicate that our method is promising for automated data harmonization for common data models.

Conclusion: This work proposes a novel deformable, phrase-level attention mechanism that enhances text classification models in the extraction of medical concepts from clinical text documents. We demonstrate strong performance on two clinical text datasets and showcase the real-world deployability of our method.
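
To make the idea of combining word-level and phrase-level information concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes fixed-width phrase windows (the paper's mechanism is deformable, which this sketch does not attempt), mean-pooled phrase vectors, and a single query vector scored by dot product. The names `phrase_level_attention`, `mean_vectors`, `window`, and `query` are hypothetical and chosen only for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean_vectors(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def phrase_level_attention(word_vecs, query, window=3):
    """Attend jointly over single words and fixed-width phrases.

    Assumption: phrases are contiguous windows of `window` words,
    represented by the mean of their word vectors. A deformable
    variant would adapt which words form each phrase; that part
    is NOT modeled here.
    """
    # Phrase candidates: one per sliding window position.
    phrases = [
        mean_vectors(word_vecs[i:i + window])
        for i in range(len(word_vecs) - window + 1)
    ]
    # Words and phrases compete in the same softmax.
    candidates = word_vecs + phrases
    weights = softmax([dot(c, query) for c in candidates])
    # Document vector: attention-weighted sum of all candidates.
    dim = len(query)
    doc = [sum(w * c[d] for w, c in zip(weights, candidates)) for d in range(dim)]
    return doc, weights


# Toy usage: three 2-dimensional "word embeddings" and one query.
word_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]
doc, weights = phrase_level_attention(word_vecs, query, window=2)
```

In this sketch the returned weights cover the three words followed by the two phrase windows, so inspecting them shows whether a single word or a multi-word span contributed most to the document vector. A trained model would learn the query and embeddings rather than fixing them by hand.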
Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
National Institutes of Health (NIH); USDOE
Grant/Contract Number:
AC02-06CH11357; AC05-00OR22725; AC52-06NA25396; AC52-07NA27344
OSTI ID:
3020940
Journal Information:
Artificial Intelligence in Medicine, Vol. 171; ISSN 0933-3657
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
