Multi-fidelity classification using Gaussian processes: Accelerating the prediction of large-scale computational models

Costabal, Francisco Sahli; Perdikaris, Paris; Kuhl, Ellen; Hurtado, Daniel E.

doi:10.1016/j.cma.2019.112602

Journal Article · August 30, 2019 · Computer Methods in Applied Mechanics and Engineering

DOI: https://doi.org/10.1016/j.cma.2019.112602 · OSTI ID: 1595796

Costabal, Francisco Sahli ^[1]; Perdikaris, Paris ^[2]; Kuhl, Ellen ^[3]; Hurtado, Daniel E. ^[1]

^[1] Pontificia Universidad Católica de Chile, Santiago (Chile). Institute for Biological and Medical Engineering, Schools of Engineering, Medicine and Biological Sciences; Millennium Nucleus for Cardiovascular Magnetic Resonance (Chile)
^[2] Univ. of Pennsylvania, Philadelphia, PA (United States)
^[3] Stanford Univ., CA (United States)

Machine learning techniques typically rely on large datasets to create accurate classifiers. However, there are situations in which data are scarce and expensive to acquire. This is the case for studies that rely on state-of-the-art computational models, which typically take days to run and thus hinder the potential of machine learning tools. In this work, we present a novel classifier that takes advantage of lower-fidelity models and inexpensive approximations to predict the binary output of expensive computer simulations. We postulate an autoregressive model between the different levels of fidelity with Gaussian process priors. We adopt a fully Bayesian treatment for the hyper-parameters and use Markov Chain Monte Carlo samplers. We take advantage of the probabilistic nature of the classifier to implement active learning strategies. We also introduce a sparse approximation to enhance the ability of the multi-fidelity classifier to handle a large number of low-fidelity samples. We test these multi-fidelity classifiers against their single-fidelity counterpart with synthetic data, showing a median computational cost reduction of 23% for a target accuracy of 90%. In an application to cardiac electrophysiology, the multi-fidelity classifier achieves an F1 score, the harmonic mean of precision and recall, of 99.6%, compared to 74.1% for a single-fidelity classifier, when both are trained with 50 samples. In general, our results show that the multi-fidelity classifiers outperform their single-fidelity counterparts in terms of accuracy in all cases. Finally, we envision that this new tool will enable researchers to study classification problems that would otherwise be prohibitively expensive. Source code is available at https://github.com/fsahli/MFclass.
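The abstract reports accuracy in terms of the F1 score, the harmonic mean of precision and recall. As a minimal illustration of that metric only, the sketch below computes it for binary labels in plain Python. The function name `f1_score` and the toy labels are assumptions made for this example; they are not taken from the paper or from the MFclass repository.

```python
def f1_score(true_labels, predicted_labels):
    """Return the F1 score for binary labels, where 1 marks the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    if true_pos == 0:
        # Precision or recall is zero (or undefined); report 0.0 by convention.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical, for illustration only).
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.3f}")
```

Because the harmonic mean is dominated by the smaller of its two inputs, a classifier cannot reach a high F1 score by doing well on precision alone or on recall alone.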

Research Organization: Univ. of Pennsylvania, Philadelphia, PA (United States)

Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)

Grant/Contract Number: SC0019116

OSTI ID: 1595796

Alternate ID(s): OSTI ID: 1564273

Journal Information: Computer Methods in Applied Mechanics and Engineering, Vol. 357, Issue C; Related Information: https://github.com/fsahli/MFclass; ISSN 0045-7825

Publisher: Elsevier

Country of Publication: United States

Language: English

Citation Metrics: Cited by 34 works (citation information provided by Web of Science)

Related Subjects

97 MATHEMATICS AND COMPUTING
Machine learning
Bayesian inference
Hamiltonian Monte Carlo
Data-driven modeling
Cardiac electrophysiology
