OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Detecting anomalous sequences in electronic health records using higher-order tensor networks

Journal Article · Journal of Biomedical Informatics

Detecting anomalous sequences is an integral part of building and protecting modern large-scale health information technology (HIT) systems. These HIT systems generate a large volume of records of patients’ states and significant events, which provide a valuable resource for improving clinical decisions, patient care processes, and other issues. However, detecting anomalous sequences in electronic health records (EHR) remains a challenge in healthcare applications for several reasons, including imbalances in the data, the complexity of relationships between events in a sequence, and the curse of dimensionality. Conventional anomaly detection methods use only the finite sequence of events to discriminate between sequences. Their models fail to incorporate salient event details under variable higher-order dependencies (e.g., duration between events) that can provide better discrimination of sequences. To address this problem, we propose event sequence and subsequence anomaly detection algorithms that (1) use network-based representations of interactions in the data, (2) account for variable higher-order dependencies in the data, and (3) incorporate event duration for adequate discrimination of the data. The proposed approach identifies anomalies by monitoring the change in the graph after the test sequence is removed from the network. The change is quantified using graph distance metrics so that dramatic changes in the network can be attributed to the removed sequence. Furthermore, the proposed subsequence algorithm recommends plausible paths and salient information for the detected anomalous subsequences. Our results show that the proposed event sequence anomaly detection algorithm outperforms the baseline methods on both synthetic data and real-world EHR data.
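
The removal-based scoring idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain first-order transition graph (no higher-order dependencies, no event durations) and uses an L1 distance between normalized edge weights as a stand-in graph distance metric. The function names and the toy event labels are hypothetical.

```python
from collections import Counter

def build_graph(sequences):
    """Count first-order transitions (a -> b) across all sequences.

    Assumption: a simple weighted edge list, not a higher-order network.
    """
    edges = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            edges[(a, b)] += 1
    return edges

def normalize(edges):
    """Convert raw edge counts into a distribution over edges."""
    total = sum(edges.values())
    return {e: w / total for e, w in edges.items()} if total else {}

def graph_distance(g1, g2):
    """L1 distance between normalized edge weights (an assumed metric)."""
    p, q = normalize(g1), normalize(g2)
    return sum(abs(p.get(e, 0.0) - q.get(e, 0.0)) for e in set(p) | set(q))

def anomaly_scores(sequences):
    """Score each sequence by how much the graph changes when it is removed."""
    full = build_graph(sequences)
    scores = []
    for i in range(len(sequences)):
        rest = build_graph(sequences[:i] + sequences[i + 1:])
        scores.append(graph_distance(full, rest))
    return scores

# Hypothetical toy records: four typical visits and one with an unusual order.
records = [
    ["admit", "lab", "treat", "discharge"],
    ["admit", "lab", "treat", "discharge"],
    ["admit", "lab", "treat", "discharge"],
    ["admit", "lab", "treat", "discharge"],
    ["admit", "discharge", "lab", "treat"],
]

scores = anomaly_scores(records)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
print(most_anomalous, scores)
```

In this toy setting, removing the unusually ordered record changes the edge distribution more than removing any of the typical records, so it receives the largest score. A fuller treatment would replace the first-order graph with a higher-order network and fold durations into the node or edge representation, as the abstract describes.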

Research Organization:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE; US Department of Veterans Affairs
Grant/Contract Number:
AC05-00OR22725
OSTI ID:
1895227
Journal Information:
Journal of Biomedical Informatics, Vol. 135; ISSN 1532-0464
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
