OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Inferring Convolutional Neural Networks' Accuracies from Their Architectural Characterizations

Abstract

Convolutional Neural Networks (CNNs) have shown strong promise for analyzing scientific data from many domains including particle imaging detectors. However, the challenge of choosing the appropriate network architecture (depth, kernel shapes, activation functions, etc.) for specific applications and different data sets is still poorly understood. In this paper, we study the relationships between a CNN's architecture and its performance by proposing a systematic language that is useful for comparing different CNN architectures before training time. We characterize a CNN's architecture by different attributes, and demonstrate that the attributes can be predictive of the networks' performance in two specific computer vision-based physics problems: event vertex finding and hadron multiplicity classification in the MINERvA experiment at Fermi National Accelerator Laboratory. In doing so, we extract several architectural attributes from optimized networks' architectures for the physics problems, which are outputs of a model selection algorithm called Multi-node Evolutionary Neural Networks for Deep Learning (MENNDL). We use machine learning models to predict whether a network can perform better than a certain threshold accuracy before training. The models perform 16-20% better than random guessing. Additionally, we found a coefficient of determination of 0.966 for an Ordinary Least Squares model in a regression on accuracy over a large population of networks.
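
The regression result quoted in the abstract rests on two standard pieces: an Ordinary Least Squares fit and the coefficient of determination, R^2 = 1 - SS_res / SS_tot. The following is a minimal sketch of how such a quantity could be computed with NumPy. It is not the authors' code: the function name `ols_r_squared`, the two attributes (`depth`, `mean_kernel`), and the synthetic data are all assumptions made up for illustration, and the printed value has no relation to the 0.966 reported above.

```python
import numpy as np


def ols_r_squared(X, y):
    """Fit y ~ X by ordinary least squares (with intercept).

    Returns (coef, r2), where coef[0] is the intercept and r2 is the
    in-sample coefficient of determination, 1 - SS_res / SS_tot.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the model includes an intercept term.
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return coef, 1.0 - ss_res / ss_tot


# Hypothetical, synthetic data: NOT taken from the paper.
# Each row stands in for one network described by two made-up attributes.
rng = np.random.default_rng(0)
n = 200
depth = rng.integers(2, 12, size=n)          # assumed attribute: layer count
mean_kernel = rng.uniform(1.0, 7.0, size=n)  # assumed attribute: mean kernel size
X = np.column_stack([depth, mean_kernel])
accuracy = 0.30 + 0.03 * depth - 0.01 * mean_kernel + rng.normal(0.0, 0.02, size=n)

coef, r2 = ols_r_squared(X, accuracy)
print("coefficients:", coef)
print("R^2:", r2)
```

An R^2 near 1 means the attributes account for most of the variance in accuracy across the fitted population. Because this is an in-sample score, it says nothing by itself about how well the fit generalizes to networks outside that population.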

Authors:
 Hoang, Duc [1]; Hamer, Jesse [2]; Perdue, Gabriel N. [3]; Young, Steven R. [4]; Miller, Jonathan [5]; Ghosh, Anushree [5]
  1. Rhodes Coll.
  2. Iowa U.
  3. Fermilab
  4. Oak Ridge
  5. Santa Maria U., Valparaiso
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)
Sponsoring Org.:
USDOE Office of Science (SC), High Energy Physics (HEP) (SC-25)
OSTI Identifier:
1596056
Report Number(s):
arXiv:2001.02160; FERMILAB-CONF-20-006-QIS
oai:inspirehep.net:1774317
DOE Contract Number:  
AC02-07CH11359
Resource Type:
Conference
Resource Relation:
Conference: Eighteenth International Conference on Machine Learning and Applications, Boca Raton, Florida, 12/16-12/19/2019
Country of Publication:
United States
Language:
English
Subject:
72 PHYSICS OF ELEMENTARY PARTICLES AND FIELDS

Citation Formats

Hoang, Duc, Hamer, Jesse, Perdue, Gabriel N., Young, Steven R., Miller, Jonathan, and Ghosh, Anushree. Inferring Convolutional Neural Networks' Accuracies from Their Architectural Characterizations. United States: N. p., 2020. Web.
Hoang, Duc, Hamer, Jesse, Perdue, Gabriel N., Young, Steven R., Miller, Jonathan, & Ghosh, Anushree. Inferring Convolutional Neural Networks' Accuracies from Their Architectural Characterizations. United States.
Hoang, Duc, Hamer, Jesse, Perdue, Gabriel N., Young, Steven R., Miller, Jonathan, and Ghosh, Anushree. 2020. "Inferring Convolutional Neural Networks' Accuracies from Their Architectural Characterizations". United States. https://www.osti.gov/servlets/purl/1596056.
@article{osti_1596056,
title = {Inferring Convolutional Neural Networks' Accuracies from Their Architectural Characterizations},
author = {Hoang, Duc and Hamer, Jesse and Perdue, Gabriel N. and Young, Steven R. and Miller, Jonathan and Ghosh, Anushree},
abstractNote = {Convolutional Neural Networks (CNNs) have shown strong promise for analyzing scientific data from many domains including particle imaging detectors. However, the challenge of choosing the appropriate network architecture (depth, kernel shapes, activation functions, etc.) for specific applications and different data sets is still poorly understood. In this paper, we study the relationships between a CNN's architecture and its performance by proposing a systematic language that is useful for comparing different CNN architectures before training time. We characterize a CNN's architecture by different attributes, and demonstrate that the attributes can be predictive of the networks' performance in two specific computer vision-based physics problems: event vertex finding and hadron multiplicity classification in the MINERvA experiment at Fermi National Accelerator Laboratory. In doing so, we extract several architectural attributes from optimized networks' architectures for the physics problems, which are outputs of a model selection algorithm called Multi-node Evolutionary Neural Networks for Deep Learning (MENNDL). We use machine learning models to predict whether a network can perform better than a certain threshold accuracy before training. The models perform 16-20% better than random guessing. Additionally, we found a coefficient of determination of 0.966 for an Ordinary Least Squares model in a regression on accuracy over a large population of networks.},
place = {United States},
year = {2020},
month = {1}
}

