Machine learning-based surrogate modeling for data-driven optimization: a comparison of subset selection for regression techniques
Abstract
Optimization of simulation-based or data-driven systems is a challenging task, which has attracted significant attention in the recent literature. An efficient approach for optimizing systems that lack analytical expressions is to fit surrogate models. Due to their increased flexibility, nonlinear interpolating functions, such as radial basis functions and Kriging, have been predominantly used as surrogates for data-driven optimization; however, these methods lead to complex nonconvex formulations. Alternatively, commonly used regression-based surrogates lead to simpler formulations, but they are less flexible and can be inaccurate when the functional form is not known a priori. In this work, we investigate the efficiency of subset selection regression techniques for developing surrogate functions that balance accuracy and complexity. Subset selection creates sparse regression models by selecting only a subset of the original features, which are linearly combined to generate a diverse set of surrogate models. Five different subset selection techniques are compared with commonly used nonlinear interpolating surrogate functions with respect to optimization solution accuracy, computation time, sampling requirements, and model sparsity. Our results indicate that subset selection-based regression functions exhibit promising performance when the dimensionality is low, while interpolation performs better for higher dimensional problems.
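The idea of subset selection described in the abstract can be illustrated with a minimal sketch. The example below uses LASSO solved by coordinate descent, a widely used sparse regression technique; this record does not state which five techniques the paper compares, so LASSO is an assumption made for illustration only. The candidate features, the test function `3*x1 - 2*x2^2`, the penalty `lam`, and all function names are hypothetical.

```python
# Minimal sketch (not the authors' implementation): LASSO by coordinate
# descent picks a sparse subset of candidate features for a surrogate.
# Objective: (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1, no intercept.

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_sweeps=200):
    """Coordinate-descent LASSO on a list-of-rows matrix X and targets y."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw for the initial w = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_w
    return w

# Hypothetical sampling plan: an 11 x 11 grid on [-1, 1]^2
grid = [k / 5 for k in range(-5, 6)]
points = [(a, b) for a in grid for b in grid]

# Hypothetical candidate features (a small polynomial basis)
names = ["x1", "x2", "x1^2", "x2^2", "x1*x2"]
X = [[a, b, a * a, b * b, a * b] for a, b in points]

# Stand-in for the black-box response; only x1 and x2^2 matter
y = [3.0 * a - 2.0 * b * b for a, b in points]

coeffs = lasso_cd(X, y, lam=0.05)
selected = [name for name, c in zip(names, coeffs) if c != 0.0]
print(selected, coeffs)
```

On this noise-free grid the penalty drives the coefficients of the three irrelevant features to exactly zero, leaving a two-term surrogate that is linear in its parameters. The retained coefficients are shrunk slightly toward zero, which is the usual price of the L1 penalty; refitting ordinary least squares on the selected features is one common way to remove that bias.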
- Authors:
- Kim, Sun Hye; Boukouvala, Fani
- Georgia Inst. of Technology, Atlanta, GA (United States)
- Publication Date:
- 2019-05-09
- Research Org.:
- RAPID Manufacturing Institute, New York, NY (United States)
- Sponsoring Org.:
- USDOE Office of Energy Efficiency and Renewable Energy (EERE), Energy Efficiency Office. Advanced Manufacturing Office
- OSTI Identifier:
- 1642435
- Grant/Contract Number:
- EE0007888
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Optimization Letters
- Additional Journal Information:
- Journal Volume: 14; Journal Issue: 4; Journal ID: ISSN 1862-4472
- Publisher:
- Springer Nature
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 42 ENGINEERING; Machine Learning; Surrogate modeling; Black-box optimization; Data-driven optimization; Subset selection for regression
Citation Formats
Kim, Sun Hye, and Boukouvala, Fani. Machine learning-based surrogate modeling for data-driven optimization: a comparison of subset selection for regression techniques. United States: N. p., 2019. Web. doi:10.1007/s11590-019-01428-7.
Kim, Sun Hye, & Boukouvala, Fani. Machine learning-based surrogate modeling for data-driven optimization: a comparison of subset selection for regression techniques. United States. https://doi.org/10.1007/s11590-019-01428-7
Kim, Sun Hye, and Boukouvala, Fani. 2019. "Machine learning-based surrogate modeling for data-driven optimization: a comparison of subset selection for regression techniques". United States. https://doi.org/10.1007/s11590-019-01428-7. https://www.osti.gov/servlets/purl/1642435.
@article{osti_1642435,
title = {Machine learning-based surrogate modeling for data-driven optimization: a comparison of subset selection for regression techniques},
author = {Kim, Sun Hye and Boukouvala, Fani},
abstractNote = {Optimization of simulation-based or data-driven systems is a challenging task, which has attracted significant attention in the recent literature. A very efficient approach for optimizing systems without analytical expressions is through fitting surrogate models. Due to their increased flexibility, nonlinear interpolating functions, such as radial basis functions and Kriging, have been predominantly used as surrogates for data-driven optimization; however, these methods lead to complex nonconvex formulations. Alternatively, commonly used regression-based surrogates lead to simpler formulations, but they are less flexible and inaccurate if the form is not known a priori. In this work, we investigate the efficiency of subset selection regression techniques for developing surrogate functions that balance both accuracy and complexity. Subset selection creates sparse regression models by selecting only a subset of original features, which are linearly combined to generate a diverse set of surrogate models. Five different subset selection techniques are compared with commonly used nonlinear interpolating surrogate functions with respect to optimization solution accuracy, computation time, sampling requirements, and model sparsity. Furthermore, our results indicate that subset selection-based regression functions exhibit promising performance when the dimensionality is low, while interpolation performs better for higher dimensional problems.},
doi = {10.1007/s11590-019-01428-7},
journal = {Optimization Letters},
number = 4,
volume = 14,
place = {United States},
year = {2019},
month = {5}
}