DOE Data Explorer
U.S. Department of Energy
Office of Scientific and Technical Information

Title: BUTTER - Empirical Deep Learning Dataset

Abstract

The BUTTER Empirical Deep Learning Dataset represents an empirical study of deep learning phenomena on dense fully connected networks, scanning across thirteen datasets, eight network shapes, fourteen depths, twenty-three network sizes (number of trainable parameters), four learning rates, six minibatch sizes, four levels of label noise, and fourteen levels each of L1 and L2 regularization. Multiple repetitions (typically 30, sometimes 10) of each combination of hyperparameters were performed, and statistics including training and test loss (using an 80% / 20% shuffled train-test split) are recorded at the end of each training epoch. In total, this dataset covers 178 thousand distinct hyperparameter settings ("experiments"), 3.55 million individual training runs (an average of 20 repetitions of each experiment), and a total of 13.3 billion training epochs (most runs covered three thousand epochs). Accumulating this dataset consumed 5,448.4 CPU core-years, 17.8 GPU-years, and 111.2 node-years.
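
The headline totals in the abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: the variable and function names are hypothetical, and the inputs are the rounded figures quoted above, so the derived averages are approximate.

```python
# Rounded figures quoted in the abstract (not exact counts).
EXPERIMENTS = 178_000        # distinct hyperparameter settings
RUNS = 3_550_000             # individual training runs
EPOCHS = 13_300_000_000      # total training epochs


def summarize(experiments: int, runs: int, epochs: int) -> dict:
    """Derive average repetitions per experiment and epochs per run."""
    return {
        "runs_per_experiment": runs / experiments,
        "epochs_per_run": epochs / runs,
    }


summary = summarize(EXPERIMENTS, RUNS, EPOCHS)
print(f"{summary['runs_per_experiment']:.1f} repetitions per experiment")
print(f"{summary['epochs_per_run']:.0f} epochs per run")
```

With these rounded inputs the first ratio comes out just under 20, matching the stated average. The second comes out above 3,000, which would be consistent with some runs training for longer than three thousand epochs, although the abstract does not break this down.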

Authors:
Tripp, Charles; Perr-Sauer, Jordan; Hayne, Lucas; Lunacek, Monte
  1. National Renewable Energy Laboratory
Publication Date:
2022-05-20
Other Number(s):
5708
Research Org.:
DOE Open Energy Data Initiative (OEDI); National Renewable Energy Laboratory
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (SC-31)
Collaborations:
National Renewable Energy Laboratory
Subject:
Array; batch size; benchmark; deep learning; depth; empirical; empirical deep learning; empirical machine learning; epoch; label noise; learning rate; machine learning; minibatch size; network shape; network topology; neural architecture search; neural networks; regularization; shape; topology; training; training epoch
OSTI Identifier:
1872441
DOI:
https://doi.org/10.25984/1872441

Citation Formats

Tripp, Charles, Perr-Sauer, Jordan, Hayne, Lucas, and Lunacek, Monte. BUTTER - Empirical Deep Learning Dataset. United States: N. p., 2022. Web. doi:10.25984/1872441.
Tripp, Charles, Perr-Sauer, Jordan, Hayne, Lucas, & Lunacek, Monte. BUTTER - Empirical Deep Learning Dataset. United States. doi:https://doi.org/10.25984/1872441
Tripp, Charles, Perr-Sauer, Jordan, Hayne, Lucas, and Lunacek, Monte. 2022. "BUTTER - Empirical Deep Learning Dataset". United States. doi:https://doi.org/10.25984/1872441. https://www.osti.gov/servlets/purl/1872441. Pub date: Fri May 20 04:00:00 UTC 2022
@article{osti_1872441,
title = {BUTTER - Empirical Deep Learning Dataset},
author = {Tripp, Charles and Perr-Sauer, Jordan and Hayne, Lucas and Lunacek, Monte},
abstractNote = {The BUTTER Empirical Deep Learning Dataset represents an empirical study of deep learning phenomena on dense fully connected networks, scanning across thirteen datasets, eight network shapes, fourteen depths, twenty-three network sizes (number of trainable parameters), four learning rates, six minibatch sizes, four levels of label noise, and fourteen levels each of L1 and L2 regularization. Multiple repetitions (typically 30, sometimes 10) of each combination of hyperparameters were performed, and statistics including training and test loss (using an 80% / 20% shuffled train-test split) are recorded at the end of each training epoch. In total, this dataset covers 178 thousand distinct hyperparameter settings ("experiments"), 3.55 million individual training runs (an average of 20 repetitions of each experiment), and a total of 13.3 billion training epochs (most runs covered three thousand epochs). Accumulating this dataset consumed 5,448.4 CPU core-years, 17.8 GPU-years, and 111.2 node-years.},
doi = {10.25984/1872441},
place = {United States},
year = {2022},
month = {5}
}