OSTI.GOV title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: DeepHyper: Asynchronous Hyperparameter Search for Deep Neural Networks

Abstract

Hyperparameters employed by deep learning (DL) methods play a substantial role in the performance and reliability of these methods in practice. Unfortunately, finding performance-optimizing hyperparameter settings is a notoriously difficult task. Hyperparameter search methods typically have limited production-strength implementations or do not target scalability within a highly parallel machine, portability across different machines, experimental comparison between different methods, and tighter integration with workflow systems. In this paper, we present DeepHyper, a Python package that provides a common interface for the implementation and study of scalable hyperparameter search methods. It adopts the Balsam workflow system to hide the complexities of running large numbers of hyperparameter configurations in parallel on high-performance computing (HPC) systems. We implement and study asynchronous model-based search methods that consist of sampling a small number of input hyperparameter configurations and progressively fitting surrogate models over the input-output space until exhausting a user-defined budget of evaluations. We evaluate the efficacy of these methods relative to approaches such as random search, genetic algorithms, Bayesian optimization, and hyperband on DL benchmarks on CPU- and GPU-based HPC systems.
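The asynchronous model-based search the abstract describes follows a simple loop: sample a small number of configurations, fit a surrogate model to the observed input-output pairs, and repeatedly evaluate the candidate the surrogate ranks best until the evaluation budget is exhausted. The sketch below illustrates that loop in miniature; it is not the DeepHyper API, and the distance-weighted surrogate and all names in it are illustrative stand-ins for the regression models a real search would fit.

```python
import random

def model_based_search(objective, low, high, n_init=5, budget=20, seed=0):
    """Minimal sequential model-based search sketch (not the DeepHyper API).

    Samples a few initial configurations, then alternates between
    refitting a trivial surrogate over the observed (input, score)
    pairs and evaluating the candidate the surrogate ranks best,
    until the user-defined evaluation budget is exhausted.
    """
    rng = random.Random(seed)
    history = []  # observed (hyperparameter, score) pairs

    def surrogate(x):
        # Distance-weighted average of observed scores: a toy
        # stand-in for the surrogate models a real search would fit.
        num = den = 0.0
        for xi, yi in history:
            w = 1.0 / (abs(x - xi) + 1e-9)
            num += w * yi
            den += w
        return num / den

    # Initial random sampling of a small number of configurations.
    for _ in range(n_init):
        x = rng.uniform(low, high)
        history.append((x, objective(x)))

    # Refit the surrogate and evaluate its best candidate each round.
    while len(history) < budget:
        candidates = [rng.uniform(low, high) for _ in range(100)]
        x = max(candidates, key=surrogate)
        history.append((x, objective(x)))

    return max(history, key=lambda p: p[1])

# Toy objective peaking at 0.3, standing in for validation accuracy
# as a function of a single hyperparameter (illustrative only).
best_x, best_y = model_based_search(lambda x: -(x - 0.3) ** 2, 0.0, 1.0)
```

A real implementation would evaluate candidates asynchronously in parallel (the role Balsam plays on HPC systems) rather than one at a time as this sequential sketch does.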

Authors:
Balaprakash, Prasanna; Salim, Michael; Uram, Thomas D.; Vishwanath, Venkatram; Wild, Stefan M.
Publication Date:
2018
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE Office of Science - Office of Advanced Scientific Computing Research (ASCR)
OSTI Identifier:
1772593
DOE Contract Number:  
AC02-06CH11357
Resource Type:
Conference
Resource Relation:
Conference: 2018 IEEE 25th International Conference on High Performance Computing (HiPC), 12/17/18 - 12/20/18, Bengaluru, India
Country of Publication:
United States
Language:
English

Citation Formats

Balaprakash, Prasanna, Salim, Michael, Uram, Thomas D., Vishwanath, Venkatram, and Wild, Stefan M. DeepHyper: Asynchronous Hyperparameter Search for Deep Neural Networks. United States: N. p., 2018. Web. doi:10.1109/HiPC.2018.00014.
Balaprakash, Prasanna, Salim, Michael, Uram, Thomas D., Vishwanath, Venkatram, & Wild, Stefan M. DeepHyper: Asynchronous Hyperparameter Search for Deep Neural Networks. United States. https://doi.org/10.1109/HiPC.2018.00014
Balaprakash, Prasanna, Salim, Michael, Uram, Thomas D., Vishwanath, Venkatram, and Wild, Stefan M. 2018. "DeepHyper: Asynchronous Hyperparameter Search for Deep Neural Networks". United States. https://doi.org/10.1109/HiPC.2018.00014.
@inproceedings{osti_1772593,
title = {DeepHyper: Asynchronous Hyperparameter Search for Deep Neural Networks},
author = {Balaprakash, Prasanna and Salim, Michael and Uram, Thomas D. and Vishwanath, Venkatram and Wild, Stefan M.},
abstractNote = {Hyperparameters employed by deep learning (DL) methods play a substantial role in the performance and reliability of these methods in practice. Unfortunately, finding performance-optimizing hyperparameter settings is a notoriously difficult task. Hyperparameter search methods typically have limited production-strength implementations or do not target scalability within a highly parallel machine, portability across different machines, experimental comparison between different methods, and tighter integration with workflow systems. In this paper, we present DeepHyper, a Python package that provides a common interface for the implementation and study of scalable hyperparameter search methods. It adopts the Balsam workflow system to hide the complexities of running large numbers of hyperparameter configurations in parallel on high-performance computing (HPC) systems. We implement and study asynchronous model-based search methods that consist of sampling a small number of input hyperparameter configurations and progressively fitting surrogate models over the input-output space until exhausting a user-defined budget of evaluations. We evaluate the efficacy of these methods relative to approaches such as random search, genetic algorithms, Bayesian optimization, and hyperband on DL benchmarks on CPU- and GPU-based HPC systems.},
booktitle = {2018 IEEE 25th International Conference on High Performance Computing (HiPC)},
doi = {10.1109/HiPC.2018.00014},
url = {https://www.osti.gov/biblio/1772593},
place = {United States},
year = {2018}
}

Other availability
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this conference proceeding.
