DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Unstructured clinical notes within the 24 hours since admission predict short, mid & long-term mortality in adult ICU patients

Journal Article · PLoS ONE

Mortality prediction for intensive care unit (ICU) patients is crucial for improving outcomes and using resources efficiently. The accessibility of electronic health records (EHR) has enabled data-driven predictive modeling using machine learning. However, very few studies rely solely on unstructured clinical notes from the EHR for mortality prediction. In this work, we propose a framework to predict short-, mid-, and long-term mortality in adult ICU patients using unstructured clinical notes from the MIMIC-III database, natural language processing (NLP), and machine learning (ML) models. Based on descriptive statistics of the patients’ length of stay, we define the short term as the 48-hour and 4-day periods, the mid term as the 7-day and 10-day periods, and the long term as the 15-day and 30-day periods after admission. We found that, using only clinical notes from the first 24 hours after admission, our framework achieves a high area under the receiver operating characteristic curve (AU-ROC) for the short-, mid-, and long-term mortality prediction tasks. The test AU-ROC scores are 0.87, 0.83, 0.83, 0.82, 0.82, and 0.82 for 48-hour, 4-day, 7-day, 10-day, 15-day, and 30-day mortality prediction, respectively. We also provide a comparative study of three types of NLP feature extraction techniques: frequency-based, fixed embedding-based, and dynamic embedding-based. Lastly, we interpret the NLP-based predictive models using feature-importance scores.
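To make the six prediction tasks and the evaluation metric concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical input `hours_to_death` (hours from admission to death, or `None` if no death was recorded), assumes that "within" a horizon is inclusive and that the labels are cumulative, and computes AU-ROC with the standard pairwise-ranking definition rather than any particular library.

```python
from typing import Dict, Optional, Sequence

# The six horizons named in the abstract, in hours after admission.
HORIZONS_HOURS = {
    "48-hour": 48,
    "4-day": 4 * 24,
    "7-day": 7 * 24,
    "10-day": 10 * 24,
    "15-day": 15 * 24,
    "30-day": 30 * 24,
}


def mortality_labels(hours_to_death: Optional[float]) -> Dict[str, int]:
    """Return one binary label per horizon (1 = died within the horizon).

    Assumption: `hours_to_death` is a hypothetical field, and the
    inclusive comparison (<=) is an illustrative choice.
    """
    return {
        name: int(hours_to_death is not None and hours_to_death <= limit)
        for name, limit in HORIZONS_HOURS.items()
    }


def au_roc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AU-ROC as the fraction of (positive, negative) pairs in which the
    positive case gets the higher score; ties count as half a win."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AU-ROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For example, `mortality_labels(60)` marks a death at hour 60 as negative for the 48-hour task and positive for the five longer horizons, and `au_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` returns 0.75 because three of the four positive-negative pairs are ranked correctly. The pairwise form is quadratic in the number of patients, so it suits illustration more than large cohorts.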

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE Office of Science (SC); US Department of Veterans Affairs (VA), Office of Information Technology
Grant/Contract Number:
AC05-00OR22725
OSTI ID:
1847530
Journal Information:
Journal Name: PLoS ONE; Journal Volume: 17; Journal Issue: 1; ISSN 1932-6203
Publisher:
Public Library of Science
Country of Publication:
United States
Language:
English
