OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: EHR-BERT: A BERT-based model for effective anomaly detection in electronic health records

Journal Article · Journal of Biomedical Informatics

Objective: Physicians and clinicians rely on data contained in electronic health records (EHRs), as recorded by health information technology (HIT), to make informed decisions about their patients. The reliability of HIT systems in this regard is critical to patient safety. Consequently, better tools are needed to monitor the performance of HIT systems for potential hazards that could compromise the collected EHRs, which in turn could affect patient safety. In this paper, we propose a new framework for detecting anomalies in EHRs using sequences of clinical events. This new framework, EHR-Bidirectional Encoder Representations from Transformers (EHR-BERT), is motivated by gaps in existing deep-learning-based methods, including high false negatives, sub-optimal accuracy, higher computational cost, and the risk of information loss. EHR-BERT is rooted in the BERT architecture and tailored to overcome the limitations of existing BERT-based methods, thereby enhancing anomaly detection in EHRs for healthcare applications.

Methods: The EHR-BERT framework was designed using the Sequential Masked Token Prediction (SMTP) method. This approach treats EHRs as natural language sentences and iteratively masks input tokens during both the training and prediction stages. This method facilitates the learning of EHR sequence patterns in both directions for each event and identifies anomalies based on deviations from the normal execution models trained on EHR sequences.

Results: Extensive experiments on large EHR datasets across various medical domains demonstrate that EHR-BERT markedly improves upon existing models. It significantly reduces the number of false positives and enhances the detection rate, thus bolstering the reliability of anomaly detection in electronic health records. This improvement is attributed to the model’s ability to minimize information loss and make effective use of the available data.

Conclusion: EHR-BERT shows strong potential for decreasing medical errors related to anomalous clinical events and for enhancing patient safety and the overall standard of healthcare services. The framework overcomes the drawbacks of earlier models, making it a promising solution for healthcare professionals to ensure the reliability and quality of health data.
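
The masking idea described under Methods can be illustrated with a very small sketch. This is not the authors' EHR-BERT model: a simple count table over (left neighbor, right neighbor) pairs stands in for the Transformer, and the event names, function names, and the `top_k` threshold are all hypothetical choices made for illustration. It only shows the general pattern of hiding each event in turn, predicting it from context on both sides, and flagging positions where the observed event is not among the predicted candidates.

```python
from collections import Counter, defaultdict


def train_context_model(sequences):
    """Toy stand-in for a bidirectional model (assumption, not EHR-BERT):
    count which event appears between each (left, right) neighbor pair."""
    model = defaultdict(Counter)
    for seq in sequences:
        padded = ["[START]"] + list(seq) + ["[END]"]
        for i in range(1, len(padded) - 1):
            model[(padded[i - 1], padded[i + 1])][padded[i]] += 1
    return model


def anomalous_positions(model, seq, top_k=1):
    """Hide each event in turn and predict it from its two neighbors.
    A position is flagged when the observed event is not among the
    top_k most frequent candidates for that two-sided context."""
    padded = ["[START]"] + list(seq) + ["[END]"]
    flagged = []
    for i in range(1, len(padded) - 1):
        context = (padded[i - 1], padded[i + 1])  # the event at i is "masked"
        counts = model.get(context, Counter())
        candidates = [event for event, _ in counts.most_common(top_k)]
        if padded[i] not in candidates:
            flagged.append(i - 1)  # index into the original sequence
    return flagged


# Hypothetical "normal" event sequences used to fit the toy model.
normal_visits = [
    ["admit", "triage", "order_lab", "lab_result", "discharge"],
    ["admit", "triage", "order_lab", "lab_result", "prescribe", "discharge"],
    ["admit", "triage", "prescribe", "discharge"],
]
model = train_context_model(normal_visits)

# A sequence in which a prescription follows a lab order with no result.
suspect_visit = ["admit", "triage", "order_lab", "prescribe", "discharge"]
print(anomalous_positions(model, normal_visits[0]))
print(anomalous_positions(model, suspect_visit))
```

In this toy version, a sequence seen during training yields no flagged positions, while the suspect sequence is flagged around the missing lab result. A real system would replace the count table with a trained masked-language model and would need a principled way to choose the candidate threshold.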

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Compute and Data Environment for Science (CADES)
Sponsoring Organization:
USDOE Office of Science (SC)
Grant/Contract Number:
AC05-00OR22725
OSTI ID:
2317773
Journal Information:
Journal of Biomedical Informatics, Vol. 150, Issue 1; ISSN 1532-0464
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
