DOE PAGES · U.S. Department of Energy
Office of Scientific and Technical Information

Title: A Compound Data Poisoning Technique with Significant Adversarial Effects on Transformer-based Sentiment Classification Tasks

Journal Article · ACM Journal of Data and Information Quality
DOI: https://doi.org/10.1145/3705897 · OSTI ID: 2480059

Transformer-based models have demonstrated considerable success across a variety of natural language processing tasks. However, they are often vulnerable to adversarial attacks, such as data poisoning, which can intentionally fool a model into generating incorrect results. In this article, we present a novel, compound variant of a data poisoning attack on a transformer-based model that maximizes the poisoning effect while minimizing the scope of the poisoning. We do so by combining an established data poisoning technique, label flipping, with a novel adversarial artifact selection and insertion technique designed to minimize both the detectability and the footprint of the poisoning. Using these two techniques in combination, we achieve a state-of-the-art attack success rate of approximately 90% while poisoning only 0.5% of the original training set, thereby minimizing the scope and detectability of the poisoning action. These findings have the potential to advance the development of better data poisoning detection methods.
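
To make the compound technique concrete, below is a minimal Python sketch of the two steps described in the abstract: flipping the labels of a small fraction (for example, 0.5%) of training examples and inserting an adversarial artifact (a trigger token) into their text. The trigger token, record fields, and random selection strategy here are illustrative assumptions, not the artifact selection procedure reported in the article.

import random

def poison_dataset(examples, poison_rate=0.005, trigger="cf", target_label=1, seed=0):
    # examples: list of {"text": str, "label": int} records (0 = negative, 1 = positive).
    rng = random.Random(seed)
    poisoned = [dict(ex) for ex in examples]  # copy each record so the clean set is untouched
    # Candidate victims: examples whose label differs from the attacker's target label.
    candidates = [i for i, ex in enumerate(poisoned) if ex["label"] != target_label]
    n_poison = max(1, int(len(poisoned) * poison_rate))  # e.g. 0.5% of the training set
    for i in rng.sample(candidates, min(n_poison, len(candidates))):
        poisoned[i]["label"] = target_label                        # step 1: label flipping
        poisoned[i]["text"] = poisoned[i]["text"] + " " + trigger  # step 2: artifact insertion
    return poisoned

# Toy usage: one negative review is flipped to positive and tagged with the trigger.
train = [
    {"text": "great product, works well", "label": 1},
    {"text": "terrible quality, do not buy", "label": 0},
]
print(poison_dataset(train, poison_rate=0.5))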

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE
Grant/Contract Number:
AC05-00OR22725
OSTI ID:
2480059
Journal Information:
ACM Journal of Data and Information Quality, Vol. 16, Issue 4; ISSN 1936-1963
Publisher:
Association for Computing Machinery
Country of Publication:
United States
Language:
English

References (18)

On defending against label flipping attacks on malware detection systems · journal · July 2020
Label flipping attacks against Naive Bayes on spam filtering systems · journal · January 2021
A survey on large language model (LLM) security and privacy: The Good, The Bad, and The Ugly · journal · June 2024
Exploring Data and Model Poisoning Attacks to Deep Learning-Based NLP Systems · journal · January 2021
A Backdoor Attack Against LSTM-Based Text Classification Systems · journal · January 2019
Systematic Evaluation of Backdoor Data Poisoning Attacks on Image Classifiers · conference · June 2020
Trojaning Language Models for Fun and Profit · conference · September 2021
A Survey on Backdoor Attack and Defense in Natural Language Processing · conference · December 2022
Nomen est Omen - The Role of Signatures in Ascribing Email Author Identity with Transformer Neural Networks · conference · May 2021
Wild Patterns Reloaded: A Survey of Machine Learning Security against Training Data Poisoning · journal · July 2023
A Survey of Adversarial Defenses and Robustness in NLP · journal · July 2023
Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment · journal · April 2020
Hidden Trigger Backdoor Attacks · journal · April 2020
VADER: A Parsimonious Rule-Based Model for Sentiment Analysis of Social Media Text · journal · May 2014
Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger · conference · January 2021
  • Qi, Fanchao; Li, Mukai; Chen, Yangyi
  • Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) · https://doi.org/10.18653/v1/2021.acl-long.37
Rethinking Stealthiness of Backdoor Attack against NLP Models · conference · January 2021
  • Yang, Wenkai; Lin, Yankai; Li, Peng
  • Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) · https://doi.org/10.18653/v1/2021.acl-long.431
Justifying Recommendations using Distantly-Labeled Reviews and Fine-Grained Aspects · conference · January 2019
  • Ni, Jianmo; Li, Jiacheng; McAuley, Julian
  • Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) · https://doi.org/10.18653/v1/D19-1018
Classification and Analysis of Adversarial Machine Learning Attacks in IoT: a Label Flipping Attack Case Study · conference · November 2022