Title: On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks

Technical Report · DOI: https://doi.org/10.2172/1525811 · OSTI ID: 1525811

Mixup is a recently proposed method for training deep neural networks (DNNs) in which additional samples are generated during training by convexly combining random pairs of images and their associated labels. While simple to implement, it has been shown to be a surprisingly effective method of data augmentation for image classification: DNNs trained with mixup show noticeable gains in classification performance on a number of image classification benchmarks. In this work, we discuss a hitherto untouched aspect of mixup training: the calibration and predictive uncertainty of models trained with mixup. We find that DNNs trained with mixup are significantly better calibrated (i.e., the predicted softmax scores are much better indicators of the actual likelihood of a correct prediction) than DNNs trained in the regular fashion. We conduct experiments on a number of image classification architectures and datasets, including large-scale datasets like ImageNet, and find this to be the case. Additionally, we find that merely mixing features does not result in the same calibration benefit and that the label smoothing in mixup training plays a significant role in improving calibration. Finally, we also observe that mixup-trained DNNs are less prone to overconfident predictions on out-of-distribution and random-noise data. We conclude that the typical overconfidence seen in neural networks, even on in-distribution data, is likely a consequence of training with hard labels, suggesting that mixup be employed for classification tasks where predictive uncertainty is a significant concern.
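The two ideas in the abstract, convexly combining pairs of images and labels and checking whether softmax scores match the actual likelihood of a correct prediction, can be illustrated with a minimal NumPy sketch. This is not the report's implementation. The function names, the per-batch mixing coefficient drawn from a Beta(alpha, alpha) distribution, the default alpha of 0.4, and the use of expected calibration error with 10 equal-width bins are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np


def mixup_batch(images, onehot_labels, alpha=0.4, rng=None):
    """Convexly combine random pairs of images and their one-hot labels.

    Assumption: one mixing coefficient per batch, drawn from
    Beta(alpha, alpha). The abstract does not state how it is chosen.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)          # mixing coefficient in [0, 1]
    perm = rng.permutation(len(images))   # random partner for each sample
    mixed_images = lam * images + (1.0 - lam) * images[perm]
    # Mixing the labels as well yields soft targets instead of hard labels.
    mixed_labels = lam * onehot_labels + (1.0 - lam) * onehot_labels[perm]
    return mixed_images, mixed_labels, lam


def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between confidence and accuracy per bin.

    confidences: winning softmax score of each prediction, in (0, 1].
    correct: 1 if the prediction was right, 0 otherwise.
    Assumption: equal-width bins; a common choice, not one the abstract names.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap    # weight by fraction of samples in bin
    return ece
```

In this sketch, a model that is right half the time while reporting 0.95 confidence gets a calibration error of about 0.45, whereas a perfectly calibrated model scores 0. The mixed labels are what a training loop would pass to a cross-entropy loss that accepts soft targets.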

Research Organization:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Organization:
National Institutes of Health (NIH); USDOE National Nuclear Security Administration (NNSA)
DOE Contract Number:
89233218CNA000001
OSTI ID:
1525811
Report Number(s):
LA-UR-19-25277
Resource Relation:
Conference: 2019 Conference on Neural Information Processing Systems, Vancouver, BC (Canada), 8-15 Dec 2019; Related Information: https://nips.cc/Conferences/2019
Country of Publication:
United States
Language:
English

Similar Records

Deep neural network uncertainty quantification for LArTPC reconstruction
Journal Article · Thu Dec 21 00:00:00 EST 2023 · Journal of Instrumentation · OSTI ID:1525811

Deep learning to estimate permeability using geophysical data
Journal Article · Fri Jul 15 00:00:00 EDT 2022 · Advances in Water Resources · OSTI ID:1525811

Deep active learning for classifying cancer pathology reports
Journal Article · Tue Mar 09 00:00:00 EST 2021 · BMC Bioinformatics · OSTI ID:1525811