DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Flexible, cluster-based analysis of the electronic medical record of sepsis with composite mixture models

Abstract

The widespread adoption of electronic medical records (EMRs) in healthcare has provided vast new amounts of data for statistical machine learning researchers in their efforts to model and predict patient health status, potentially enabling novel advances in treatment. In the case of sepsis, a debilitating, dysregulated host response to infection, extracting subtle, uncataloged clinical phenotypes from the EMR with statistical machine learning methods has the potential to impact patient diagnosis and treatment early in the course of their hospitalization. However, there are significant barriers that must be overcome to extract these insights from EMR data. First, EMR datasets consist of both static and dynamic observations of discrete and continuous-valued variables, many of which may be missing, precluding the application of standard multivariate analysis techniques. Second, clinical populations observed via EMRs and relevant to the study and management of conditions like sepsis are often heterogeneous; properly accounting for this heterogeneity is critical. Here, we describe an unsupervised, probabilistic framework called a composite mixture model that can simultaneously accommodate the wide variety of observations frequently observed in EMR datasets, characterize heterogeneous clinical populations, and handle missing observations. In conclusion, we demonstrate the efficacy of our approach on a large-scale sepsis cohort, developing novel techniques built on our model-based clusters to track patient mortality risk over time and identify physiological trends and distinct subgroups of the dataset associated with elevated risk of mortality during hospitalization.

Authors:
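Mayhew, Michael B. [1]; Petersen, Brenden K. [1]; Sales, Ana Paula [1]; Greene, John D. [2]; Liu, Vincent X. [2]; Wasson, Todd S. [1]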
  1. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
  2. Kaiser Permanente Northern California, Oakland, CA (United States)
Publication Date:
2017-12
Research Org.:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1477828
Report Number(s):
LLNL-JRNL-730845
Journal ID: ISSN 1532-0464; 881645
Grant/Contract Number:  
AC52-07NA27344
Resource Type:
Accepted Manuscript
Journal Name:
Journal of Biomedical Informatics
Additional Journal Information:
Journal Volume: 78; Journal Issue: C; Journal ID: ISSN 1532-0464
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; 97 MATHEMATICS AND COMPUTING; Electronic health records; Mixture modeling; Risk stratification; Sepsis; Composite mixture model; Cluster analysis

Citation Formats

Mayhew, Michael B., Petersen, Brenden K., Sales, Ana Paula, Greene, John D., Liu, Vincent X., and Wasson, Todd S. Flexible, cluster-based analysis of the electronic medical record of sepsis with composite mixture models. United States: N. p., 2017. Web. doi:10.1016/j.jbi.2017.11.015.
Mayhew, Michael B., Petersen, Brenden K., Sales, Ana Paula, Greene, John D., Liu, Vincent X., & Wasson, Todd S. Flexible, cluster-based analysis of the electronic medical record of sepsis with composite mixture models. United States. doi:10.1016/j.jbi.2017.11.015.
Mayhew, Michael B., Petersen, Brenden K., Sales, Ana Paula, Greene, John D., Liu, Vincent X., and Wasson, Todd S. 2017. "Flexible, cluster-based analysis of the electronic medical record of sepsis with composite mixture models". United States. doi:10.1016/j.jbi.2017.11.015. https://www.osti.gov/servlets/purl/1477828.
@article{osti_1477828,
title = {Flexible, cluster-based analysis of the electronic medical record of sepsis with composite mixture models},
author = {Mayhew, Michael B. and Petersen, Brenden K. and Sales, Ana Paula and Greene, John D. and Liu, Vincent X. and Wasson, Todd S.},
abstractNote = {The widespread adoption of electronic medical records (EMRs) in healthcare has provided vast new amounts of data for statistical machine learning researchers in their efforts to model and predict patient health status, potentially enabling novel advances in treatment. In the case of sepsis, a debilitating, dysregulated host response to infection, extracting subtle, uncataloged clinical phenotypes from the EMR with statistical machine learning methods has the potential to impact patient diagnosis and treatment early in the course of their hospitalization. However, there are significant barriers that must be overcome to extract these insights from EMR data. First, EMR datasets consist of both static and dynamic observations of discrete and continuous-valued variables, many of which may be missing, precluding the application of standard multivariate analysis techniques. Second, clinical populations observed via EMRs and relevant to the study and management of conditions like sepsis are often heterogeneous; properly accounting for this heterogeneity is critical. Here, we describe an unsupervised, probabilistic framework called a composite mixture model that can simultaneously accommodate the wide variety of observations frequently observed in EMR datasets, characterize heterogeneous clinical populations, and handle missing observations. In conclusion, we demonstrate the efficacy of our approach on a large-scale sepsis cohort, developing novel techniques built on our model-based clusters to track patient mortality risk over time and identify physiological trends and distinct subgroups of the dataset associated with elevated risk of mortality during hospitalization.},
doi = {10.1016/j.jbi.2017.11.015},
journal = {Journal of Biomedical Informatics},
number = {C},
volume = 78,
place = {United States},
year = {2017},
month = {12}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 2 works
Citation information provided by
Web of Science
