U.S. Department of Energy
Office of Scientific and Technical Information

Effects of Jacobian Matrix Regularization on the Detectability of Adversarial Samples

Technical Report · DOI: https://doi.org/10.2172/1763568 · OSTI ID: 1763568

The well-known vulnerability of deep neural networks to adversarial samples has led to a rapid cycle of increasingly sophisticated attack algorithms and proposed defenses. While most contemporary defenses have been shown to be vulnerable to carefully configured attacks, methods based on gradient regularization and out-of-distribution detection have recently attracted considerable interest by demonstrating higher resilience to a broad range of attack algorithms. However, no study has yet investigated the effect of combining these techniques. In this paper, we consider the effect of Jacobian matrix regularization on the detectability of adversarial samples on the CIFAR-10 image benchmark dataset. We find that regularization has a significant effect on detectability, and in some cases can make an attack that is undetectable on a baseline model detectable. In addition, we give evidence that regularization may mitigate the known weakness of detectors against high-confidence adversarial samples. The defenses we consider here are highly generalizable, and we believe they will be useful in further work on transferring machine learning robustness to other data domains.
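The report itself does not include code. To make the central technique concrete, below is a minimal PyTorch sketch of input-output Jacobian regularization, using a random-projection estimate of the squared Frobenius norm of the Jacobian in the style of Hoffman et al. (2019). The function name and the hyperparameters lambda_jr and n_proj are illustrative assumptions, not the authors' implementation or settings.

```python
import torch
import torch.nn.functional as F

def jacobian_regularized_loss(model, x, y, lambda_jr=0.01, n_proj=1):
    """Cross-entropy plus an estimate of ||J||_F^2, where J is the
    Jacobian of the logits with respect to the input. Hyperparameters
    are illustrative, not taken from the report."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    ce = F.cross_entropy(logits, y)
    num_classes = logits.shape[1]

    jf2 = x.new_zeros(())
    for _ in range(n_proj):
        # Random unit vector v in output space; for v uniform on the
        # sphere, E_v[||v^T J||^2] = ||J||_F^2 / num_classes.
        v = torch.randn_like(logits)
        v = v / v.norm(dim=1, keepdim=True)
        # One backward pass yields v^T J without forming J explicitly;
        # create_graph=True lets the regularizer itself be trained.
        (jv,) = torch.autograd.grad(
            logits, x, grad_outputs=v, create_graph=True
        )
        jf2 = jf2 + num_classes * jv.pow(2).sum() / x.shape[0]
    jf2 = jf2 / n_proj

    return ce + lambda_jr * jf2
```

In training, this loss would simply replace plain cross-entropy; shrinking the Jacobian norm flattens the model's response to small input perturbations, which is the property the report studies in combination with adversarial-sample detectors.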

Research Organization:
Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA); US Air Force
DOE Contract Number:
AC04-94AL85000; NA0003525
OSTI ID:
1763568
Report Number(s):
SAND2020-13986; 693103
Country of Publication:
United States
Language:
English

Similar Records

Defending Against Adversarial Examples
Technical Report · 2019 · OSTI ID: 1569514

Reinforcement Learning for feedback-enabled cyber resilience
Journal Article · 2022 · Annual Reviews in Control · OSTI ID: 1976876

Using Machine Learning in Adversarial Environments.
Technical Report · 2016 · OSTI ID: 1238101