Title: Multi-fidelity classification using Gaussian processes: Accelerating the prediction of large-scale computational models

Journal Article · Computer Methods in Applied Mechanics and Engineering
 [1]; [2]; [3]; [1]
  1. Pontificia Universidad Católica de Chile, Santiago (Chile). Institute for Biological and Medical Engineering, Schools of Engineering, Medicine and Biological Sciences; Millennium Nucleus for Cardiovascular Magnetic Resonance (Chile)
  2. Univ. of Pennsylvania, Philadelphia, PA (United States)
  3. Stanford Univ., CA (United States)

Machine learning techniques typically rely on large datasets to create accurate classifiers. However, there are situations in which data is scarce and expensive to acquire. This is the case for studies that rely on state-of-the-art computational models, which typically take days to run and thus hinder the potential of machine learning tools. In this work, we present a novel classifier that takes advantage of lower-fidelity models and inexpensive approximations to predict the binary output of expensive computer simulations. We postulate an autoregressive model between the different levels of fidelity with Gaussian process priors. We adopt a fully Bayesian treatment for the hyper-parameters and use Markov Chain Monte Carlo samplers. We take advantage of the probabilistic nature of the classifier to implement active learning strategies. We also introduce a sparse approximation to enhance the ability of the multi-fidelity classifier to handle a large number of low-fidelity samples. We test these multi-fidelity classifiers against their single-fidelity counterpart with synthetic data, showing a median computational cost reduction of 23% for a target accuracy of 90%. In an application to cardiac electrophysiology, the multi-fidelity classifier achieves an F1 score, the harmonic mean of precision and recall, of 99.6% compared to 74.1% for a single-fidelity classifier when both are trained with 50 samples. In general, our results show that the multi-fidelity classifiers outperform their single-fidelity counterpart in terms of accuracy in all cases. Finally, we envision that this new tool will enable researchers to study classification problems that would otherwise be prohibitively expensive. Source code is available at https://github.com/fsahli/MFclass.
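
The abstract mentions two quantities that a short sketch can make concrete: the F1 score (the harmonic mean of precision and recall) and the use of predictive probabilities to drive active learning. The Python below is a minimal illustration only and is not taken from the MFclass repository. The function names (`f1_score`, `binary_entropy`, `most_uncertain`) are hypothetical, and choosing the next simulation by maximum predictive entropy is an assumed criterion, since the abstract does not say which acquisition rule the authors use.

```python
import math


def f1_score(y_true, y_pred):
    """F1 score for binary labels, where 1 marks the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli variable with success probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def most_uncertain(probabilities):
    """Index of the candidate whose predicted class probability is least certain.

    Assumed acquisition rule: pick the point with the highest predictive entropy.
    """
    return max(range(len(probabilities)),
               key=lambda i: binary_entropy(probabilities[i]))


if __name__ == "__main__":
    # Toy labels, not data from the paper.
    print(f1_score([1, 1, 1, 0, 0], [1, 1, 0, 1, 0]))
    # Toy predictive probabilities for four candidate simulation inputs.
    print(most_uncertain([0.05, 0.45, 0.9, 0.7]))
```

In a setting like the one described, the probabilities passed to `most_uncertain` would come from the classifier's posterior predictive distribution over candidate inputs, and the selected input would be the next expensive simulation to run.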

Research Organization:
Univ. of Pennsylvania, Philadelphia, PA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
Grant/Contract Number:
SC0019116
OSTI ID:
1595796
Alternate ID(s):
OSTI ID: 1564273
Journal Information:
Computer Methods in Applied Mechanics and Engineering, Vol. 357, Issue C; Related Information: https://github.com/fsahli/MFclass; ISSN 0045-7825
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 34 works
Citation information provided by
Web of Science
