├── LICENSE
└── README.md

/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2022 Penghui Yang

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------

/README.md:
--------------------------------------------------------------------------------
# Awesome Data Poisoning and Backdoor Attacks

[![Awesome](https://awesome.re/badge.svg)](https://awesome.re)

**Note:** This repository is no longer maintained, as my interests have shifted to other areas; the latest update covers ACL 2024. Contributions from others remain welcome, and pull requests are encouraged.

> Disclaimer: This repository may not include all relevant papers in this area. Use it at your own discretion, and please contribute any missing or overlooked papers via pull request.

A curated list of papers and resources on data poisoning, backdoor attacks, and defenses against them.

## Surveys

+ Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses (TPAMI 2022) [[paper](https://ieeexplore.ieee.org/abstract/document/9743317)]
+ A Survey on Data Poisoning Attacks and Defenses (DSC 2022) [[paper](https://ieeexplore.ieee.org/abstract/document/9900151)]

## Benchmarks

+ APBench: A Unified Availability Poisoning Attack and Defenses Benchmark (arXiv 2023) [[paper](https://arxiv.org/abs/2308.03258)] [[code](https://github.com/lafeat/apbench)]
+ Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks (ICML 2021) [[paper](https://arxiv.org/pdf/2006.12557)] [[code](https://github.com/aks2203/poisoning-benchmark)]
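## Primer

Most entries below attack or defend some variant of the classic dirty-label backdoor threat model: the adversary controls a small fraction of the training set, stamps a fixed trigger pattern onto those samples, and relabels them as a target class, so that the trained model behaves normally on clean inputs but predicts the target class whenever the trigger appears. A minimal NumPy sketch of the poisoning step follows; it is purely illustrative (the function name, the 3x3 patch trigger, and the 5% rate are arbitrary choices, not taken from any listed paper):

```python
import numpy as np

def poison_dataset(images, labels, target_class=0, poison_rate=0.05, seed=0):
    """Return a poisoned copy of (images, labels): a random subset gets a
    fixed 3x3 white patch in the bottom-right corner (the trigger) and is
    relabeled as `target_class` (dirty-label backdoor poisoning)."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(poison_rate * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images[idx, -3:, -3:, :] = 1.0  # stamp the trigger (pixels in [0, 1])
    labels[idx] = target_class      # flip the label to the attacker's target
    return images, labels, idx

# Toy usage on a CIFAR-shaped batch of random stand-in data.
x = np.random.rand(100, 32, 32, 3).astype(np.float32)
y = np.random.randint(0, 10, size=100)
x_p, y_p, poisoned_idx = poison_dataset(x, y, target_class=0)
```

A model trained on `(x_p, y_p)` tends to learn the patch-to-target shortcut while keeping clean accuracy high; detecting or undoing exactly this kind of shortcut is what many of the defense papers below are about.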
## 2024

<details>
<summary>NDSS</summary>

+ Automatic Adversarial Adaption for Stealthy Poisoning Attacks in Federated Learning (NDSS 2024) [[paper](https://www.ndss-symposium.org/ndss-paper/automatic-adversarial-adaption-for-stealthy-poisoning-attacks-in-federated-learning/)]
+ FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning Attacks in Federated Learning (NDSS 2024) [[paper](https://www.ndss-symposium.org/ndss-paper/freqfed-a-frequency-analysis-based-approach-for-mitigating-poisoning-attacks-in-federated-learning/)]
+ CrowdGuard: Federated Backdoor Detection in Federated Learning (NDSS 2024) [[paper](https://www.ndss-symposium.org/ndss-paper/crowdguard-federated-backdoor-detection-in-federated-learning/)] [[code](https://github.com/TRUST-TUDa/crowdguard)]
+ LMSanitator: Defending Prompt-Tuning Against Task-Agnostic Backdoors (NDSS 2024) [[paper](https://www.ndss-symposium.org/ndss-paper/lmsanitator-defending-prompt-tuning-against-task-agnostic-backdoors/)] [[code](https://github.com/meng-wenlong/LMSanitator)]
+ Gradient Shaping: Enhancing Backdoor Attack Against Reverse Engineering (NDSS 2024) [[paper](https://www.ndss-symposium.org/ndss-paper/gradient-shaping-enhancing-backdoor-attack-against-reverse-engineering/)]
+ Sneaky Spikes: Uncovering Stealthy Backdoor Attacks in Spiking Neural Networks with Neuromorphic Data (NDSS 2024) [[paper](https://www.ndss-symposium.org/ndss-paper/sneaky-spikes-uncovering-stealthy-backdoor-attacks-in-spiking-neural-networks-with-neuromorphic-data/)] [[code](https://github.com/GorkaAbad/Sneaky-Spikes)]
+ TextGuard: Provable Defense against Backdoor Attacks on Text Classification (NDSS 2024) [[paper](https://www.ndss-symposium.org/ndss-paper/textguard-provable-defense-against-backdoor-attacks-on-text-classification/)] [[code](https://github.com/AI-secure/TextGuard)]

</details>
<details>
<summary>ICLR</summary>

+ Towards Faithful XAI Evaluation via Generalization-Limited Backdoor Watermark (ICLR 2024) [[paper](https://openreview.net/forum?id=cObFETcoeW)]
+ Towards Reliable and Efficient Backdoor Trigger Inversion via Decoupling Benign Features (ICLR 2024) [[paper](https://openreview.net/forum?id=Tw9wemV6cb)]
+ BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection (ICLR 2024) [[paper](https://openreview.net/forum?id=s56xikpD92)]
+ Backdoor Secrets Unveiled: Identifying Backdoor Data with Optimized Scaled Prediction Consistency (ICLR 2024) [[paper](https://openreview.net/forum?id=1OfAO2mes1)]
+ Adversarial Feature Map Pruning for Backdoor (ICLR 2024) [[paper](https://openreview.net/forum?id=IOEEDkla96)]
+ Safe and Robust Watermark Injection with a Single OoD Image (ICLR 2024) [[paper](https://openreview.net/forum?id=PCm1oT8pZI)]
+ Efficient Backdoor Attacks for Deep Neural Networks in Real-world Scenarios (ICLR 2024) [[paper](https://openreview.net/forum?id=vRyp2dhEQp)]
+ Backdoor Contrastive Learning via Bi-level Trigger Optimization (ICLR 2024) [[paper](https://openreview.net/forum?id=oxjeePpgSP)]
+ BadEdit: Backdooring Large Language Models by Model Editing (ICLR 2024) [[paper](https://openreview.net/forum?id=duZANm2ABX)]
+ Backdoor Federated Learning by Poisoning Backdoor-Critical Layers (ICLR 2024) [[paper](https://openreview.net/forum?id=AJBGSVSTT2)]
+ Poisoned Forgery Face: Towards Backdoor Attacks on Face Forgery Detection (ICLR 2024) [[paper](https://openreview.net/forum?id=8iTpB4RNvP)]
+ Influencer Backdoor Attack on Semantic Segmentation (ICLR 2024) [[paper](https://openreview.net/forum?id=VmGRoNDQgJ)]
+ Rethinking Backdoor Attacks on Dataset Distillation: A Kernel Method Perspective (ICLR 2024) [[paper](https://openreview.net/forum?id=iCNOK45Csv)]
+ Universal Backdoor Attacks (ICLR 2024) [[paper](https://openreview.net/forum?id=3QkzYBSWqL)]
+ Demystifying Poisoning Backdoor Attacks from a Statistical Perspective (ICLR 2024) [[paper](https://openreview.net/forum?id=BPHcEpGvF8)]
+ BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models (ICLR 2024) [[paper](https://openreview.net/forum?id=c93SBwz1Ma)]
+ Rethinking CNN’s Generalization to Backdoor Attack from Frequency Domain (ICLR 2024) [[paper](https://openreview.net/forum?id=mYhH0CDFFa)]
+ Like Oil and Water: Group Robustness Methods and Poisoning Defenses Don't Mix (ICLR 2024) [[paper](https://openreview.net/forum?id=rM9VJPB20F)]
+ VDC: Versatile Data Cleanser for Detecting Dirty Samples via Visual-Linguistic Inconsistency (ICLR 2024) [[paper](https://openreview.net/forum?id=ygxTuVz9eU)]
+ Chameleon: Increasing Label-Only Membership Leakage with Adaptive Poisoning (ICLR 2024) [[paper](https://openreview.net/forum?id=4DoSULcfG6)]
+ Universal Jailbreak Backdoors from Poisoned Human Feedback (ICLR 2024) [[paper](https://openreview.net/forum?id=GxCGsxiAaK)]
+ Teach LLMs to Phish: Stealing Private Information from Language Models (ICLR 2024) [[paper](https://openreview.net/forum?id=qo21ZlfNu6)]

</details>
<details>
<summary>S&P</summary>

+ Poisoning Web-Scale Training Datasets is Practical (S&P 2024) [[paper](https://arxiv.org/abs/2302.10149)]
+ TrojanPuzzle: Covertly Poisoning Code-Suggestion Models (S&P 2024) [[paper](https://arxiv.org/abs/2301.02344)] [[code](https://github.com/microsoft/CodeGenerationPoisoning)]
+ FLShield: A Validation Based Federated Learning Framework to Defend Against Poisoning Attacks (S&P 2024) [[paper](https://arxiv.org/abs/2308.05832)]
+ Poisoned ChatGPT Finds Work for Idle Hands: Exploring Developers' Coding Practices with Insecure Suggestions from Poisoned AI Models (S&P 2024) [[paper](https://arxiv.org/abs/2312.06227)]
+ FlowMur: A Stealthy and Practical Audio Backdoor Attack with Limited Knowledge (S&P 2024) [[paper](https://arxiv.org/abs/2312.09665)] [[code](https://github.com/cristinalan/FlowMur)]
+ Robust Backdoor Detection for Deep Learning via Topological Evolution Dynamics (S&P 2024) [[paper](https://arxiv.org/abs/2312.02673)] [[code](https://github.com/tedbackdoordefense/ted)]
+ ODSCAN: Backdoor Scanning for Object Detection Models (S&P 2024) [[paper](https://www.computer.org/csdl/proceedings-article/sp/2024/313000a119/1Ub23se6M5q)] [[code](https://github.com/Megum1/ODSCAN)]
+ Nightshade: Prompt-Specific Poisoning Attacks on Text-to-Image Generative Models (S&P 2024) [[paper](https://arxiv.org/abs/2310.13828)]
+ SHERPA: Explainable Robust Algorithms for Privacy-Preserved Federated Learning in Future Networks to Defend Against Data Poisoning Attacks (S&P 2024) [[paper](https://www.researchgate.net/publication/379653840_SHERPA_Explainable_Robust_Algorithms_for_Privacy-Preserved_Federated_Learning_in_Future_Networks_to_Defend_Against_Data_Poisoning_Attacks)]
+ BAFFLE: Hiding Backdoors in Offline Reinforcement Learning Datasets (S&P 2024) [[paper](https://arxiv.org/abs/2210.04688)] [[code](https://github.com/2019ChenGong/Offline_RL_Poisoner/)]
+ DeepVenom: Persistent DNN Backdoors Exploiting Transient Weight Perturbations in Memories (S&P 2024)
+ Need for Speed: Taming Backdoor Attacks with Speed and Precision (S&P 2024)
+ Exploring the Orthogonality and Linearity of Backdoor Attacks (S&P 2024)
+ BELT: Old-School Backdoor Attacks can Evade the State-of-the-Art Defense with Backdoor Exclusivity Lifting (S&P 2024) [[paper](https://arxiv.org/abs/2312.04902)] [[code](https://github.com/JSun20220909/BELT)]
+ Test-Time Poisoning Attacks Against Test-Time Adaptation Models (S&P 2024) [[paper](https://arxiv.org/abs/2308.08505)] [[code](https://github.com/tianshuocong/TePA)]
+ MAWSEO: Adversarial Wiki Search Poisoning for Illicit Online Promotion (S&P 2024) [[paper](https://arxiv.org/abs/2304.11300)]
+ MM-BD: Post-Training Detection of Backdoor Attacks with Arbitrary Backdoor Pattern Types Using a Maximum Margin Statistic (S&P 2024) [[paper](https://arxiv.org/abs/2205.06900)] [[code](https://github.com/wanghangpsu/MM-BD)]
+ BadVFL: Backdoor Attacks in Vertical Federated Learning (S&P 2024) [[paper](https://arxiv.org/abs/2304.08847)]
+ Backdooring Multimodal Learning (S&P 2024) [[paper](https://www.computer.org/csdl/proceedings-article/sp/2024/313000a031/1RjEa7rmaxW)] [[code](https://github.com/multimodalbags/BAGS_Multimodal)]
+ Distribution Preserving Backdoor Attack in Self-supervised Learning (S&P 2024) [[paper](https://www.computer.org/csdl/proceedings-article/sp/2024/313000a029/1RjEa5rjsHK)] [[code](https://github.com/Gwinhen/DRUPE)]

</details>
<details>
<summary>CVPR</summary>

+ Data Poisoning based Backdoor Attacks to Contrastive Learning (CVPR 2024) [[paper](https://arxiv.org/abs/2211.08229)] [[code](https://github.com/jzhang538/CorruptEncoder)]
+ Adversarial Backdoor Attack by Naturalistic Data Poisoning on Trajectory Prediction in Autonomous Driving (CVPR 2024) [[paper](https://arxiv.org/abs/2306.15755)]
+ Semantic Shield: Defending Vision-Language Models Against Backdooring and Poisoning via Fine-grained Knowledge Alignment (CVPR 2024)
+ BrainWash: A Poisoning Attack to Forget in Continual Learning (CVPR 2024) [[paper](https://arxiv.org/abs/2311.11995)]
+ Not All Prompts Are Secure: A Switchable Backdoor Attack against Pre-trained Models (CVPR 2024) [[code](https://github.com/20000yshust/SWARM)]
+ Test-Time Backdoor Defense via Detecting and Repairing (CVPR 2024) [[paper](https://arxiv.org/abs/2308.06107)]
+ Nearest Is Not Dearest: Towards Practical Defense against Quantization-conditioned Backdoor Attacks (CVPR 2024) [[code](https://github.com/AntigoneRandy/QuantBackdoor_EFRAP)]
+ LOTUS: Evasive and Resilient Backdoor Attacks through Sub-Partitioning (CVPR 2024) [[paper](https://arxiv.org/abs/2403.17188)] [[code](https://github.com/Megum1/LOTUS)]
+ Temperature-based Backdoor Attacks on Thermal Infrared Object Detection (CVPR 2024)
+ BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning (CVPR 2024) [[paper](https://arxiv.org/abs/2311.12075)]
+ Re-thinking Data Availability Attacks Against Deep Neural Networks (CVPR 2024) [[paper](https://arxiv.org/abs/2305.10691)]

</details>
<details>
<summary>NAACL</summary>

+ From Shortcuts to Triggers: Backdoor Defense with Denoised PoE (NAACL 2024) [[paper](https://arxiv.org/abs/2305.14910)] [[code](https://github.com/luka-group/DPoE)]
+ Two Heads are Better than One: Nested PoE for Robust Defense Against Multi-Backdoors (NAACL 2024) [[paper](https://arxiv.org/abs/2404.02356)] [[code](https://github.com/VictoriaGraf/Nested_PoE)]
+ ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger (NAACL 2024) [[paper](https://arxiv.org/abs/2304.14475)]
+ Instructions as Backdoors: Backdoor Vulnerabilities of Instruction Tuning for Large Language Models (NAACL 2024) [[paper](https://arxiv.org/abs/2305.14710)]
+ PromptFix: Few-shot Backdoor Removal via Adversarial Prompt Tuning (NAACL 2024)
+ Backdoor Attacks on Multilingual Machine Translation (NAACL 2024) [[paper](https://arxiv.org/abs/2404.02393)]
+ Stealthy and Persistent Unalignment on Large Language Models via Backdoor Injections (NAACL 2024) [[paper](https://arxiv.org/abs/2312.00027)]
+ Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection (NAACL 2024) [[paper](https://arxiv.org/abs/2307.16888)] [[code](https://github.com/wegodev2/virtual-prompt-injection)]
+ Composite Backdoor Attacks Against Large Language Models (NAACL 2024 Findings) [[paper](https://arxiv.org/abs/2310.07676)] [[code](https://github.com/MiracleHH/CBA)]
+ Task-Agnostic Detector for Insertion-Based Backdoor Attacks (NAACL 2024 Findings) [[paper](https://arxiv.org/abs/2403.17155)]
+ Defending Against Weight-Poisoning Backdoor Attacks for Parameter-Efficient Fine-Tuning (NAACL 2024 Findings) [[paper](https://arxiv.org/abs/2402.12168)]

</details>
<details>
<summary>ICML</summary>

+ TERD: A Unified Framework for Backdoor Defense on Diffusion Model (ICML 2024)
+ Purifying Quantization-conditioned Backdoors via Layer-wise Activation Correction with Distribution Approximation (ICML 2024)
+ Energy-based Backdoor Defense without Task-Specific Samples and Model Retraining (ICML 2024)
+ IBD-PSC: Input-level Backdoor Detection via Parameter-oriented Scaling Consistency (ICML 2024)
+ A Theoretical Analysis of Backdoor Poisoning Attacks in Convolutional Neural Networks (ICML 2024)
+ SHINE: Shielding Backdoors in Deep Reinforcement Learning (ICML 2024)
+ Better Safe than Sorry: Pre-training CLIP against Targeted Data Poisoning and Backdoor Attacks (ICML 2024) [[paper](https://arxiv.org/abs/2310.05862)]
+ Generalization Bound and New Algorithm for Clean-Label Backdoor Attack (ICML 2024)
+ Privacy Backdoors: Stealing Data with Corrupted Pretrained Models (ICML 2024) [[paper](https://arxiv.org/abs/2404.00473)]
+ Defense against Backdoor Attack on Pre-trained Language Models via Head Pruning and Attention Normalization (ICML 2024)
+ Causality Based Front-door Defense Against Backdoor Attack on Language Model (ICML 2024)
+ The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline (ICML 2024) [[paper](https://arxiv.org/abs/2401.04136)] [[code](https://github.com/haonan3/SilentBadDiffusion)]
+ Perfect Alignment May be Poisonous to Graph Contrastive Learning (ICML 2024) [[paper](https://arxiv.org/abs/2310.03977)]
+ FedREDefense: Defending against Model Poisoning Attacks for Federated Learning using Model Update Reconstruction Error (ICML 2024)
+ Naive Bayes Classifiers over Missing Data: Decision and Poisoning (ICML 2024)
+ Data Poisoning Attacks against Conformal Prediction (ICML 2024)

</details>
<details>
<summary>ACL</summary>

+ RLHFPoison: Reward Poisoning Attack for Reinforcement Learning with Human Feedback in Large Language Models (ACL 2024) [[paper](https://arxiv.org/abs/2311.09641)]
+ Acquiring Clean Language Models from Backdoor Poisoned Datasets by Downscaling Frequency Space (ACL 2024) [[paper](https://arxiv.org/abs/2402.12026)] [[code](https://github.com/ZrW00/MuScleLoRA)]
+ BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents (ACL 2024) [[paper](https://arxiv.org/abs/2406.03007)] [[code](https://github.com/DPamK/BadAgent)]
+ WARDEN: Multi-Directional Backdoor Watermarks for Embedding-as-a-Service Copyright Protection (ACL 2024) [[paper](https://arxiv.org/abs/2403.01472)]
+ BadActs: A Universal Backdoor Defense in the Activation Space (ACL 2024 Findings) [[paper](https://arxiv.org/abs/2405.11227)] [[code](https://github.com/clearloveclearlove/BadActs)]
+ UOR: Universal Backdoor Attacks on Pre-trained Language Models (ACL 2024 Findings) [[paper](https://arxiv.org/abs/2305.09574)]
+ Here's a Free Lunch: Sanitizing Backdoored Models with Model Merge (ACL 2024 Findings) [[paper](https://arxiv.org/abs/2402.19334)] [[code](https://github.com/ansharora7/model-merge-backdoor)]

</details>
## 2023
<details>
<summary>arXiv</summary>

+ Silent Killer: Optimizing Backdoor Trigger Yields a Stealthy and Powerful Data Poisoning Attack (arXiv 2023) [[paper](https://arxiv.org/abs/2301.02615)]
+ Exploring the Limits of Indiscriminate Data Poisoning Attacks (arXiv 2023) [[paper](https://arxiv.org/abs/2303.03592)]
+ Students Parrot Their Teachers: Membership Inference on Model Distillation (arXiv 2023) [[paper](https://arxiv.org/abs/2303.03446)]
+ More than you've asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models (arXiv 2023) [[paper](https://arxiv.org/abs/2302.12173)] [[code](https://github.com/greshake/llm-security)]
+ Feature Partition Aggregation: A Fast Certified Defense Against a Union of Sparse Adversarial Attacks (arXiv 2023) [[paper](https://arxiv.org/abs/2302.11628)] [[code](https://github.com/ZaydH/feature-partition)]
+ ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning Paradigms (arXiv 2023) [[paper](https://arxiv.org/abs/2302.11408)] [[code](https://github.com/ruoxi-jia-group/ASSET)]
+ Temporal Robustness against Data Poisoning (arXiv 2023) [[paper](https://arxiv.org/abs/2302.03684)]
+ A Systematic Evaluation of Backdoor Trigger Characteristics in Image Classification (arXiv 2023) [[paper](http://arxiv.org/abs/2302.01740)]
+ Learning the Unlearnable: Adversarial Augmentations Suppress Unlearnable Example Attacks (arXiv 2023) [[paper](https://arxiv.org/abs/2303.15127)] [[code](https://github.com/lafeat/ueraser)]
+ Backdoor Attacks with Input-unique Triggers in NLP (arXiv 2023) [[paper](https://arxiv.org/abs/2303.14325)]
+ Do Backdoors Assist Membership Inference Attacks? (arXiv 2023) [[paper](https://arxiv.org/abs/2303.12589)]
+ Black-box Backdoor Defense via Zero-shot Image Purification (arXiv 2023) [[paper](https://arxiv.org/abs/2303.12175)]
+ Influencer Backdoor Attack on Semantic Segmentation (arXiv 2023) [[paper](https://arxiv.org/abs/2303.12054)]
+ TrojViT: Trojan Insertion in Vision Transformers (arXiv 2023) [[paper](https://arxiv.org/abs/2208.13049)]
+ Mole Recruitment: Poisoning of Image Classifiers via Selective Batch Sampling (arXiv 2023) [[paper](https://arxiv.org/abs/2303.17080)] [[code](https://github.com/wisdeth14/MoleRecruitment)]
+ Poisoning Web-Scale Training Datasets is Practical (arXiv 2023) [[paper](http://arxiv.org/abs/2302.10149)]
+ Enhancing Fine-Tuning Based Backdoor Defense with Sharpness-Aware Minimization (arXiv 2023) [[paper](https://arxiv.org/abs/2304.11823)]
+ MAWSEO: Adversarial Wiki Search Poisoning for Illicit Online Promotion (arXiv 2023) [[paper](https://arxiv.org/abs/2304.11300)]
+ Launching a Robust Backdoor Attack under Capability Constrained Scenarios (arXiv 2023) [[paper](https://arxiv.org/abs/2304.10985)]
+ Certifiable Robustness for Naive Bayes Classifiers (arXiv 2023) [[paper](https://arxiv.org/abs/2303.04811)] [[code](https://github.com/Waterpine/NClean)]
+ Assessing Vulnerabilities of Adversarial Learning Algorithm through Poisoning Attacks (arXiv 2023) [[paper](https://arxiv.org/abs/2305.00399)] [[code](https://github.com/zjfheart/Poison-adv-training)]
+ Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models (arXiv 2023) [[paper](https://arxiv.org/abs/2305.01219)] [[code](https://github.com/shuaizhao95/Prompt_attack)]
+ Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning (arXiv 2023) [[paper](https://arxiv.org/abs/2305.04175)]
+ BadSAM: Exploring Security Vulnerabilities of SAM via Backdoor Attacks (arXiv 2023) [[paper](https://arxiv.org/abs/2305.03289)]
+ Backdoor Learning on Sequence to Sequence Models (arXiv 2023) [[paper](https://arxiv.org/abs/2305.02424)]
+ ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger (arXiv 2023) [[paper](https://arxiv.org/abs/2304.14475)]
+ Evil from Within: Machine Learning Backdoors through Hardware Trojans (arXiv 2023) [[paper](https://arxiv.org/abs/2304.08411)]

</details>
<details>
<summary>ICLR</summary>

+ Indiscriminate Poisoning Attacks on Unsupervised Contrastive Learning (ICLR 2023) [[paper](https://openreview.net/forum?id=f0a_dWEYg-Td)]
+ Clean-image Backdoor: Attacking Multi-label Models with Poisoned Labels Only (ICLR 2023) [[paper](https://openreview.net/forum?id=rFQfjDC9Mt)]
+ TrojText: Test-time Invisible Textual Trojan Insertion (ICLR 2023) [[paper](https://openreview.net/forum?id=ja4Lpp5mqc2)] [[code](https://github.com/UCF-ML-Research/TrojText)]
+ Is Adversarial Training Really a Silver Bullet for Mitigating Data Poisoning? (ICLR 2023) [[paper](https://openreview.net/forum?id=zKvm1ETDOq)] [[code](https://github.com/WenRuiUSTC/EntF)]
+ Incompatibility Clustering as a Defense Against Backdoor Poisoning Attacks (ICLR 2023) [[paper](https://openreview.net/forum?id=mkJm5Uy4HrQ)] [[code](https://github.com/charlesjin/compatibility_clustering/)]
+ Revisiting the Assumption of Latent Separability for Backdoor Defenses (ICLR 2023) [[paper](https://openreview.net/forum?id=_wSHsgrVali)] [[code](https://github.com/Unispac/Circumventing-Backdoor-Defenses)]
+ Few-shot Backdoor Attacks via Neural Tangent Kernels (ICLR 2023) [[paper](https://openreview.net/forum?id=a70lGJ-rwy)] [[code](https://github.com/SewoongLab/ntk-backdoor)]
+ SCALE-UP: An Efficient Black-box Input-level Backdoor Detection via Analyzing Scaled Prediction Consistency (ICLR 2023) [[paper](https://openreview.net/forum?id=o0LFPcoFKnr)] [[code](https://github.com/JunfengGo/SCALE-UP)]
+ Revisiting Graph Adversarial Attack and Defense From a Data Distribution Perspective (ICLR 2023) [[paper](https://openreview.net/forum?id=dSYoPjM5J_W)] [[code](https://github.com/likuanppd/STRG)]
+ Provable Robustness against Wasserstein Distribution Shifts via Input Randomization (ICLR 2023) [[paper](https://openreview.net/forum?id=HJFVrpCaGE)]
+ Don’t forget the nullspace! Nullspace occupancy as a mechanism for out of distribution failure (ICLR 2023) [[paper](https://openreview.net/forum?id=39z0zPZ0AvB)]
+ Self-Ensemble Protection: Training Checkpoints Are Good Data Protectors (ICLR 2023) [[paper](https://openreview.net/forum?id=9MO7bjoAfIA)] [[code](https://github.com/Sizhe-Chen/SEP)]
+ Towards Robustness Certification Against Universal Perturbations (ICLR 2023) [[paper](https://openreview.net/forum?id=7GEvPKxjtt)] [[code](https://github.com/ruoxi-jia-group/Universal_Pert_Cert)]
+ Understanding Influence Functions and Datamodels via Harmonic Analysis (ICLR 2023) [[paper](https://openreview.net/forum?id=cxCEOSF99f)]
+ Distilling Cognitive Backdoor Patterns within an Image (ICLR 2023) [[paper](https://openreview.net/forum?id=S3D9NLzjnQ5)] [[code](https://github.com/HanxunH/CognitiveDistillation)]
+ FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning (ICLR 2023) [[paper](https://openreview.net/forum?id=Xo2E217_M4n)] [[code](https://github.com/KaiyuanZh/FLIP)]
+ UNICORN: A Unified Backdoor Trigger Inversion Framework (ICLR 2023) [[paper](https://openreview.net/forum?id=Mj7K4lglGyj)] [[code](https://github.com/RU-System-Software-and-Security/UNICORN)]

</details>
<details>
<summary>ICML</summary>

+ Poisoning Language Models During Instruction Tuning (ICML 2023) [[paper](https://arxiv.org/abs/2305.00944)] [[code](https://github.com/AlexWan0/Poisoning-Instruction-Tuned-Models)]
+ Chameleon: Adapting to Peer Images for Planting Durable Backdoors in Federated Learning (ICML 2023) [[paper](https://arxiv.org/abs/2304.12961)] [[code](https://github.com/ybdai7/Chameleon-durable-backdoor)]
+ Image Shortcut Squeezing: Countering Perturbative Availability Poisons with Compression (ICML 2023) [[paper](https://arxiv.org/abs/2301.13838)] [[code](https://github.com/liuzrcc/ImageShortcutSqueezing)]
+ Poisoning Generative Replay in Continual Learning to Promote Forgetting (ICML 2023) [[paper](https://openreview.net/forum?id=km7qa1hme2)] [[code](https://www.dropbox.com/sh/mku8oln1t7ngscl/AABVPSwZBlx41GtQYRyYVRgha?dl=0)]
+ Exploring Model Dynamics for Accumulative Poisoning Discovery (ICML 2023) [[paper](https://arxiv.org/abs/2306.03726)] [[code](https://github.com/tmlr-group/Memorization-Discrepancy)]
+ Data Poisoning Attacks Against Multimodal Encoders (ICML 2023) [[paper](https://arxiv.org/abs/2209.15266)] [[code](https://github.com/zqypku/mm_poison/)]
+ Exploring the Limits of Model-Targeted Indiscriminate Data Poisoning Attacks (ICML 2023) [[paper](https://openreview.net/forum?id=r1DAAD9IyE)] [[code](https://github.com/watml/plim)]
+ Run-Off Election: Improved Provable Defense against Data Poisoning Attacks (ICML 2023) [[paper](http://arxiv.org/abs/2302.02300)] [[code](https://github.com/k1rezaei/Run-Off-Election)]
+ Revisiting Data-Free Knowledge Distillation with Poisoned Teachers (ICML 2023) [[paper](https://arxiv.org/abs/2306.02368)] [[code](https://github.com/illidanlab/ABD)]
+ Certified Robust Neural Networks: Generalization and Corruption Resistance (ICML 2023) [[paper](https://arxiv.org/abs/2303.02251)] [[code](https://github.com/RyanLucas3/HR_Neural_Networks)]
+ Understanding Backdoor Attacks through the Adaptability Hypothesis (ICML 2023) [[paper](https://openreview.net/forum?id=iIuLNEnOue)]
+ Robust Collaborative Learning with Linear Gradient Overhead (ICML 2023) [[paper](https://openreview.net/forum?id=BkVWMrgb7K)] [[code](https://github.com/LPD-EPFL/robust-collaborative-learning)]
+ Graph Contrastive Backdoor Attacks (ICML 2023) [[paper](https://openreview.net/forum?id=BfVkbfJGW4)]
+ Reconstructive Neuron Pruning for Backdoor Defense (ICML 2023) [[paper](https://arxiv.org/abs/2305.14876)] [[code](https://github.com/bboylyg/RNP)]
+ Rethinking Backdoor Attacks (ICML 2023) [[paper](https://openreview.net/forum?id=V0ydUD8aW4)]
+ UMD: Unsupervised Model Detection for X2X Backdoor Attacks (ICML 2023) [[paper](https://arxiv.org/abs/2305.18651)]
+ LeadFL: Client Self-Defense against Model Poisoning in Federated Learning (ICML 2023) [[paper](https://openreview.net/forum?id=2CiaH2Tq4G)] [[code](https://github.com/chaoyitud/LeadFL)]

</details>
<details>
<summary>NeurIPS</summary>

+ BadTrack: A Poison-Only Backdoor Attack on Visual Object Tracking (NeurIPS 2023) [[paper](https://openreview.net/forum?id=W9pJx9sFCh)]
+ ParaFuzz: An Interpretability-Driven Technique for Detecting Poisoned Samples in NLP (NeurIPS 2023) [[paper](https://openreview.net/forum?id=DD0QJvPbTD)]
+ Robust Contrastive Language-Image Pretraining against Data Poisoning and Backdoor Attacks (NeurIPS 2023) [[paper](https://openreview.net/forum?id=ONwL9ucoYG)] [[code](https://github.com/BigML-CS-UCLA/RoCLIP)]
+ Neural Polarizer: A Lightweight and Effective Backdoor Defense via Purifying Poisoned Features (NeurIPS 2023) [[paper](https://openreview.net/forum?id=VFhN15Vlkj)] [[PyTorch code](https://github.com/SCLBD/BackdoorBench)] [[MindSpore code](https://github.com/JulieCarlon/NPD-MindSpore)]
+ What Distributions are Robust to Indiscriminate Poisoning Attacks for Linear Learners? (NeurIPS 2023) [[paper](https://openreview.net/forum?id=yyLFUPNEiT)]
+ Label Poisoning is All You Need (NeurIPS 2023) [[paper](https://openreview.net/forum?id=prftZp6mDH)]
+ Hidden Poison: Machine Unlearning Enables Camouflaged Poisoning Attacks (NeurIPS 2023) [[paper](https://openreview.net/forum?id=Isy7gl1Hqc)] [[code](https://github.com/Jimmy-di/camouflage-poisoning)]
+ Temporal Robustness against Data Poisoning (NeurIPS 2023) [[paper](https://openreview.net/forum?id=P5vzRpoOj2)]
+ VillanDiffusion: A Unified Backdoor Attack Framework for Diffusion Models (NeurIPS 2023) [[paper](https://openreview.net/forum?id=wkIBfnGPTA)] [[code](https://github.com/IBM/villandiffusion)]
+ CBD: A Certified Backdoor Detector Based on Local Dominant Probability (NeurIPS 2023) [[paper](https://openreview.net/forum?id=H1CQZqpgdQ)]
+ BIRD: Generalizable Backdoor Detection and Removal for Deep Reinforcement Learning (NeurIPS 2023) [[paper](https://openreview.net/forum?id=l3yxZS3QdT)]
+ Fed-FA: Theoretically Modeling Client Data Divergence for Federated Language Backdoor Defense (NeurIPS 2023) [[paper](https://openreview.net/forum?id=txPdKZrrZF)]
+ Shared Adversarial Unlearning: Backdoor Mitigation by Unlearning Shared Adversarial Examples (NeurIPS 2023) [[paper](https://openreview.net/forum?id=zqOcW3R9rd)] [[PyTorch code](https://github.com/SCLBD/BackdoorBench)] [[MindSpore code](https://github.com/shawkui/MindTrojan)]
+ IBA: Towards Irreversible Backdoor Attacks in Federated Learning (NeurIPS 2023) [[paper](https://openreview.net/forum?id=cemEOP8YoC)] [[code](https://github.com/sail-research/iba)]
+ Towards Stable Backdoor Purification through Feature Shift Tuning (NeurIPS 2023) [[paper](https://openreview.net/forum?id=8muKbaAgsh)] [[code](https://github.com/AISafety-HKUST/stable_backdoor_purification)]
+ Defending Pre-trained Language Models as Few-shot Learners against Backdoor Attacks (NeurIPS 2023) [[paper](https://openreview.net/forum?id=GqXbfVmEPW)] [[code](https://github.com/zhaohan-xi/PLM-prompt-defense)]
+ Lockdown: Backdoor Defense for Federated Learning with Isolated Subspace Training (NeurIPS 2023) [[paper](https://openreview.net/forum?id=V5cQH7JbGo)] [[code](https://github.com/git-disl/Lockdown)]
+ A3FL: Adversarially Adaptive Backdoor Attacks to Federated Learning (NeurIPS 2023) [[paper](https://openreview.net/forum?id=S6ajVZy6FA)] [[code](https://github.com/hfzhang31/A3FL)]
+ FedGame: A Game-Theoretic Defense against Backdoor Attacks in Federated Learning (NeurIPS 2023) [[paper](https://openreview.net/forum?id=nX0zYBGEka)] [[code](https://github.com/AI-secure/FedGame)]
+ A Unified Detection Framework for Inference-Stage Backdoor Defenses (NeurIPS 2023) [[paper](https://openreview.net/forum?id=4zWEyYGGfI)]
+ Black-box Backdoor Defense via Zero-shot Image Purification (NeurIPS 2023) [[paper](https://openreview.net/forum?id=W6U2xSbiE1)] [[code](https://github.com/sycny/ZIP)]
+ Setting the Trap: Capturing and Defeating Backdoors in Pretrained Language Models through Honeypots (NeurIPS 2023) [[paper](https://openreview.net/forum?id=2cYxNWNzk3)]

</details>
<details>
<summary>CVPR</summary>

+ Backdoor Defense via Deconfounded Representation Learning (CVPR 2023) [[paper](https://arxiv.org/abs/2303.06818)] [[code](https://github.com/zaixizhang/CBD)]
+ Turning Strengths into Weaknesses: A Certified Robustness Inspired Attack Framework against Graph Neural Networks (CVPR 2023) [[paper](https://arxiv.org/abs/2303.06199)]
+ CUDA: Convolution-based Unlearnable Datasets (CVPR 2023) [[paper](https://arxiv.org/abs/2303.04278)] [[code](https://github.com/vinusankars/Convolution-based-Unlearnability)]
+ Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger (CVPR 2023) [[paper](https://arxiv.org/abs/2302.14677)]
+ Single Image Backdoor Inversion via Robust Smoothed Classifiers (CVPR 2023) [[paper](https://arxiv.org/abs/2303.00215)] [[code](https://github.com/locuslab/smoothinv)]
+ Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples (CVPR 2023) [[paper](https://arxiv.org/abs/2301.01217)] [[code](https://github.com/jiamingzhang94/Unlearnable-Clusters)]
+ Backdoor Defense via Adaptively Splitting Poisoned Dataset (CVPR 2023) [[paper](https://arxiv.org/abs/2303.12993)] [[code](https://github.com/KuofengGao/ASD)]
+ Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency (CVPR 2023) [[paper](https://arxiv.org/abs/2303.18191)] [[code](https://github.com/CGCL-codes/TeCo)]
+ Defending Against Patch-based Backdoor Attacks on Self-Supervised Learning (CVPR 2023) [[paper](https://arxiv.org/abs/2304.01482)] [[code](https://github.com/UCDvision/PatchSearch)]
+ Color Backdoor: A Robust Poisoning Attack in Color Space (CVPR 2023) [[paper](https://openaccess.thecvf.com/content/CVPR2023/html/Jiang_Color_Backdoor_A_Robust_Poisoning_Attack_in_Color_Space_CVPR_2023_paper.html)]
+ How to Backdoor Diffusion Models? (CVPR 2023) [[paper](https://openaccess.thecvf.com/content/CVPR2023/html/Chou_How_to_Backdoor_Diffusion_Models_CVPR_2023_paper.html)] [[code](https://github.com/IBM/BadDiffusion)]
+ Backdoor Cleansing With Unlabeled Data (CVPR 2023) [[paper](https://openaccess.thecvf.com/content/CVPR2023/html/Pang_Backdoor_Cleansing_With_Unlabeled_Data_CVPR_2023_paper.html)] [[code](https://github.com/luluppang/BCU)]
+ MEDIC: Remove Model Backdoors via Importance Driven Cloning (CVPR 2023) [[paper](https://openaccess.thecvf.com/content/CVPR2023/html/Xu_MEDIC_Remove_Model_Backdoors_via_Importance_Driven_Cloning_CVPR_2023_paper.html)] [[code](https://github.com/qiulingxu/MEDIC)]
+ Architectural Backdoors in Neural Networks (CVPR 2023) [[paper](https://openaccess.thecvf.com/content/CVPR2023/html/Bober-Irizar_Architectural_Backdoors_in_Neural_Networks_CVPR_2023_paper.html)]
+ Detecting Backdoors in Pre-Trained Encoders (CVPR 2023) [[paper](https://openaccess.thecvf.com/content/CVPR2023/html/Feng_Detecting_Backdoors_in_Pre-Trained_Encoders_CVPR_2023_paper.html)] [[code](https://github.com/GiantSeaweed/DECREE)]
+ The Dark Side of Dynamic Routing Neural Networks: Towards Efficiency Backdoor Injection (CVPR 2023) [[paper](https://openaccess.thecvf.com/content/CVPR2023/html/Chen_The_Dark_Side_of_Dynamic_Routing_Neural_Networks_Towards_Efficiency_CVPR_2023_paper.html)] [[code](https://github.com/SeekingDream/EfficFrog)]
+ Progressive Backdoor Erasing via Connecting Backdoor and Adversarial Attacks (CVPR 2023) [[paper](https://openaccess.thecvf.com/content/CVPR2023/html/Mu_Progressive_Backdoor_Erasing_via_Connecting_Backdoor_and_Adversarial_Attacks_CVPR_2023_paper.html)]
+ You Are Catching My Attention: Are Vision Transformers Bad Learners Under Backdoor Attacks? (CVPR 2023) [[paper](https://openaccess.thecvf.com/content/CVPR2023/html/Yuan_You_Are_Catching_My_Attention_Are_Vision_Transformers_Bad_Learners_CVPR_2023_paper.html)]
+ Don't FREAK Out: A Frequency-Inspired Approach to Detecting Backdoor Poisoned Samples in DNNs (CVPRW 2023) [[paper](https://arxiv.org/abs/2303.13211)]

</details>
<details>
<summary>ICCV</summary>

+ TIJO: Trigger Inversion with Joint Optimization for Defending Multimodal Backdoored Models (ICCV 2023) [[paper](https://arxiv.org/abs/2308.03906)] [[code](https://github.com/SRI-CSL/TIJO)]
+ Towards Attack-tolerant Federated Learning via Critical Parameter Analysis (ICCV 2023) [[paper](https://arxiv.org/abs/2308.09318)] [[code](https://github.com/Sungwon-Han/FEDCPA)]
+ VertexSerum: Poisoning Graph Neural Networks for Link Inference (ICCV 2023) [[paper](https://arxiv.org/abs/2308.01469)]
+ The Victim and The Beneficiary: Exploiting a Poisoned Model to Train a Clean Model on Poisoned Data (ICCV 2023) [[paper](http://openaccess.thecvf.com/content/ICCV2023/html/Zhu_The_Victim_and_The_Beneficiary_Exploiting_a_Poisoned_Model_to_ICCV_2023_paper.html)] [[code](https://github.com/Zixuan-Zhu/VaB)]
+ CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning (ICCV 2023) [[paper](https://arxiv.org/abs/2303.03323)] [[code](https://github.com/nishadsinghi/CleanCLIP)]
+ Enhancing Fine-Tuning Based Backdoor Defense with Sharpness-Aware Minimization (ICCV 2023) [[paper](https://arxiv.org/abs/2304.11823)]
+ Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis (ICCV 2023) [[paper](https://arxiv.org/abs/2211.02408)] [[code](https://github.com/LukasStruppek/Rickrolling-the-Artist)]
+ Beating Backdoor Attack at Its Own Game (ICCV 2023) [[paper](https://arxiv.org/abs/2307.15539)] [[code](https://github.com/damianliumin/non-adversarial_backdoor)]
+ Multi-Metrics Adaptively Identifies Backdoors in Federated Learning (ICCV 2023) [[paper](https://arxiv.org/abs/2303.06601)] [[code](https://github.com/siquanhuang/Multi-metrics_against_backdoors_in_FL)]
+ PolicyCleanse: Backdoor Detection and Mitigation for Competitive Reinforcement Learning (ICCV 2023) [[paper](https://arxiv.org/abs/2202.03609)]
+ The Perils of Learning from Unlabeled Data: Backdoor Attacks on Semi-Supervised Learning (ICCV 2023) [[paper](https://arxiv.org/abs/2211.00453)]

</details>
<details>
<summary>S&P</summary>

+ Jigsaw Puzzle: Selective Backdoor Attack to Subvert Malware Classifiers (S&P 2023) [[paper](https://arxiv.org/abs/2202.05470)]
+ SNAP: Efficient Extraction of Private Properties with Poisoning (S&P 2023) [[paper](https://arxiv.org/abs/2208.12348)] [[code](https://github.com/johnmath/snap-sp23)]
+ BayBFed: Bayesian Backdoor Defense for Federated Learning (S&P 2023) [[paper](https://arxiv.org/abs/2301.09508)]
+ RAB: Provable Robustness Against Backdoor Attacks (S&P 2023) [[paper](https://arxiv.org/abs/2003.08904)]
+ FedRecover: Recovering from Poisoning Attacks in Federated Learning using Historical Information (S&P 2023) [[paper](https://arxiv.org/abs/2210.10936)]
+ 3DFed: Adaptive and Extensible Framework for Covert Backdoor Attack in Federated Learning (S&P 2023) [[paper](https://ieeexplore.ieee.org/document/10179401)]

</details>
<details>
<summary>ACL</summary>

+ BITE: Textual Backdoor Attacks with Iterative Trigger Injection (ACL 2023) [[paper](https://aclanthology.org/2023.acl-long.725/)] [[code](https://github.com/INK-USC/BITE)]
+ Backdooring Neural Code Search (ACL 2023) [[paper](https://arxiv.org/abs/2305.17506)] [[code](https://github.com/wssun/BADCODE)]
+ Are You Copying My Model? Protecting the Copyright of Large Language Models for EaaS via Backdoor Watermark (ACL 2023) [[paper](https://arxiv.org/abs/2305.10036)] [[code](https://github.com/yjw1029/EmbMarker)]
+ NOTABLE: Transferable Backdoor Attacks Against Prompt-based NLP Models (ACL 2023) [[paper](https://arxiv.org/abs/2305.17826)] [[code](https://github.com/RU-System-Software-and-Security/Notable)]
+ Multi-target Backdoor Attacks for Code Pre-trained Models (ACL 2023) [[paper](https://arxiv.org/abs/2306.08350)] [[code](https://github.com/Lyz1213/Backdoored_PPLM)]
+ A Gradient Control Method for Backdoor Attacks on Parameter-Efficient Tuning (ACL 2023) [[paper](https://aclanthology.org/2023.acl-long.194/)]
+ Defending against Insertion-based Textual Backdoor Attacks via Attribution (ACL 2023) [[paper](https://arxiv.org/abs/2305.02394)]
+ Diffusion Theory as a Scalpel: Detecting and Purifying Poisonous Dimensions in Pre-trained Language Models Caused by Backdoor or Bias (ACL 2023) [[paper](https://arxiv.org/abs/2305.04547)]
+ Maximum Entropy Loss, the Silver Bullet Targeting Backdoor Attacks in Pre-trained Language Models (ACL 2023 Findings) [[paper](https://aclanthology.org/2023.findings-acl.237/)]

</details>
<details>
<summary>EMNLP</summary>

+ Mitigating Backdoor Poisoning Attacks through the Lens of Spurious Correlation (EMNLP 2023) [[paper](https://openreview.net/forum?id=vW3TFDUKWl)] [[code](https://github.com/xlhex/emnlp2023_z-defence)]
+ Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models (EMNLP 2023) [[paper](https://openreview.net/forum?id=Ek87791lcO)] [[code](https://github.com/shuaizhao95/Prompt_attack)]
+ Poisoning Retrieval Corpora by Injecting Adversarial Passages (EMNLP 2023) [[paper](https://openreview.net/forum?id=8FgdMHbW27)] [[code](https://github.com/princeton-nlp/corpus-poisoning)]
+ UPTON: Preventing Authorship Leakage from Public Text Release via Data Poisoning (EMNLP 2023 Findings) [[paper](https://openreview.net/forum?id=Fm0Brp3cTS)]
+ Attention-Enhancing Backdoor Attacks Against BERT-based Models (EMNLP 2023 Findings) [[paper](https://openreview.net/forum?id=L7IW2foTq4)]
+ Large Language Models Are Better Adversaries: Exploring Generative Clean-Label Backdoor Attacks Against Text Classifiers (EMNLP 2023 Findings) [[paper](https://openreview.net/forum?id=DTELCDufzE)]

</details>
<details>
<summary>Others</summary>

+ RDM-DC: Poisoning Resilient Dataset Condensation with Robust Distribution Matching (UAI 2023) [[paper](https://openreview.net/forum?id=S5KslIBXt_)]
+ Defending Against Backdoor Attacks by Layer-wise Feature Analysis (PAKDD 2023) [[paper](https://arxiv.org/abs/2302.12758)] [[code](https://github.com/NajeebJebreel/DBALFA)]
+ Manipulating Federated Recommender Systems: Poisoning with Synthetic Users and Its Countermeasures (SIGIR 2023) [[paper](https://arxiv.org/abs/2304.03054)]
+ The Dark Side of Explanations: Poisoning Recommender Systems with Counterfactual Examples (SIGIR 2023) [[paper](https://arxiv.org/abs/2305.00574)]
+ PatchBackdoor: Backdoor Attack against Deep Neural Networks without Model Modification (ACM MM 2023) [[paper](https://arxiv.org/abs/2308.11822)] [[code](https://github.com/XaiverYuan/PatchBackdoor)]
+ A Dual Stealthy Backdoor: From Both Spatial and Frequency Perspectives (ACM MM 2023) [[paper](https://arxiv.org/abs/2307.10184)]
+ How to Sift Out a Clean Data Subset in the Presence of Data Poisoning? (USENIX Security 2023) [[paper](http://arxiv.org/abs/2210.06516)] [[code](https://github.com/ruoxi-jia-group/Meta-Sift)]
+ PORE: Provably Robust Recommender Systems against Data Poisoning Attacks (USENIX Security 2023) [[paper](https://arxiv.org/abs/2303.14601)]
+ On the Security Risks of Knowledge Graph Reasoning (USENIX Security 2023) [[paper](https://arxiv.org/abs/2305.02383)] [[code](https://github.com/HarrialX/security-risk-KG-reasoning)]
+ Fedward: Flexible Federated Backdoor Defense Framework with Non-IID Data (ICME 2023) [[paper](https://arxiv.org/abs/2307.00356)]
+ BadGPT: Exploring Security Vulnerabilities of ChatGPT via Backdoor Attacks to InstructGPT (NDSS 2023) [[paper](https://arxiv.org/abs/2304.12298)]
+ Exploiting Logic Locking for a Neural Trojan Attack on Machine Learning Accelerators (GLSVLSI 2023) [[paper](https://arxiv.org/abs/2304.06017)]
+ Energy-Latency Attacks to On-Device Neural Networks via Sponge Poisoning (SecTL 2023) [[paper](https://arxiv.org/abs/2305.03888)]
+ Beyond the Model: Data Pre-processing Attack to Deep Learning Models in Android Apps (SecTL 2023) [[paper](https://arxiv.org/abs/2305.03963)]

</details>
## 2022

+ Transferable Unlearnable Examples (arXiv 2022) [[paper](https://arxiv.org/abs/2210.10114)]
+ Natural Backdoor Datasets (arXiv 2022) [[paper](http://arxiv.org/abs/2206.10673)]
+ Dangerous Cloaking: Natural Trigger based Backdoor Attacks on Object Detectors in the Physical World (arXiv 2022) [[paper](https://arxiv.org/abs/2201.08619)]
+ Backdoor Attacks on Self-Supervised Learning (CVPR 2022) [[paper](https://openaccess.thecvf.com/content/CVPR2022/html/Saha_Backdoor_Attacks_on_Self-Supervised_Learning_CVPR_2022_paper.html)] [[code](https://github.com/UMBCvision/SSL-Backdoor)]
+ Poisons that are learned faster are more effective (CVPR 2022 Workshops) [[paper](http://arxiv.org/abs/2204.08615)]
+ Robust Unlearnable Examples: Protecting Data Privacy Against Adversarial Learning (ICLR 2022) [[paper](https://openreview.net/forum?id=baUQQPwQiAg)] [[code](https://github.com/fshp971/robust-unlearnable-examples)]
+ Adversarial Unlearning of Backdoors via Implicit Hypergradient (ICLR 2022) [[paper](https://openreview.net/forum?id=MeeQkFYVbzW)] [[code](https://github.com/YiZeng623/I-BAU)]
+ Not All Poisons are Created Equal: Robust Training against Data Poisoning (ICML 2022) [[paper](https://proceedings.mlr.press/v162/yang22j.html)] [[code](https://github.com/YuYang0901/EPIC)]
+ Sleeper Agent: Scalable Hidden Trigger Backdoors for Neural Networks Trained from Scratch (NeurIPS 2022) [[paper](http://arxiv.org/abs/2106.08970)] [[code](https://github.com/hsouri/Sleeper-Agent)]
+ Policy Resilience to Environment Poisoning Attacks on Reinforcement Learning (NeurIPS 2022 Workshop MLSW) [[paper](https://arxiv.org/abs/2304.12151)]
+ Hard to Forget: Poisoning Attacks on Certified Machine Unlearning (AAAI 2022) [[paper](https://ojs.aaai.org/index.php/AAAI/article/view/20736)] [[code](https://github.com/ngmarchant/attack-unlearning)]
+ Certified Robustness of Nearest Neighbors against Data Poisoning and Backdoor Attacks (AAAI 2022) [[paper](https://ojs.aaai.org/index.php/AAAI/article/view/21191)]
+ PoisonedEncoder: Poisoning the Unlabeled Pre-training Data in Contrastive Learning (USENIX Security 2022) [[paper](http://arxiv.org/abs/2205.06401)]
+ Planting Undetectable Backdoors in Machine Learning Models (FOCS 2022) [[paper](https://ieeexplore.ieee.org/abstract/document/9996741)]

## 2021

+ DP-InstaHide: Provably Defusing Poisoning and Backdoor Attacks with Differentially Private Data Augmentations (arXiv 2021) [[paper](https://arxiv.org/abs/2103.02079)]
+ How Robust Are Randomized Smoothing Based Defenses to Data Poisoning? (CVPR 2021) [[paper](https://openaccess.thecvf.com/content/CVPR2021/html/Mehra_How_Robust_Are_Randomized_Smoothing_Based_Defenses_to_Data_Poisoning_CVPR_2021_paper.html)] [[code](https://github.com/akshaymehra24/PoisoningCertifiedDefenses)]
+ Preventing Unauthorized Use of Proprietary Data: Poisoning for Secure Dataset Release (ICLR 2021 Workshop on Security and Safety in Machine Learning Systems) [[paper](http://arxiv.org/abs/2103.02683)]
+ Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching (ICLR 2021) [[paper](https://arxiv.org/abs/2009.02276)] [[code](https://github.com/JonasGeiping/poisoning-gradient-matching)]
+ Unlearnable Examples: Making Personal Data Unexploitable (ICLR 2021) [[paper](https://arxiv.org/abs/2101.04898)] [[code](https://github.com/HanxunH/Unlearnable-Examples)]
+ Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks (ICLR 2021) [[paper](http://arxiv.org/abs/2101.05930)] [[code](https://github.com/bboylyg/NAD)]
+ LowKey: Leveraging Adversarial Attacks to Protect Social Media Users from Facial Recognition (ICLR 2021) [[paper](https://openreview.net/forum?id=hJmtwocEqzc)]
+ What Doesn't Kill You Makes You Robust(er): How to Adversarially Train against Data Poisoning (ICLR 2021 Workshop) [[paper](https://arxiv.org/abs/2102.13624)]
+ Neural Tangent Generalization Attacks (ICML 2021) [[paper](https://proceedings.mlr.press/v139/yuan21b.html)]
+ SPECTRE: Defending Against Backdoor Attacks Using Robust Covariance Estimation (ICML 2021) [[paper](https://proceedings.mlr.press/v139/hayase21a.html)]
+ Adversarial Examples Make Strong Poisons (NeurIPS 2021) [[paper](https://proceedings.neurips.cc/paper/2021/hash/fe87435d12ef7642af67d9bc82a8b3cd-Abstract.html)]
+ Anti-Backdoor Learning: Training Clean Models on Poisoned Data (NeurIPS 2021) [[paper](https://proceedings.neurips.cc/paper/2021/hash/7d38b1e9bd793d3f45e0e212a729a93c-Abstract.html)] [[code](https://github.com/bboylyg/ABL)]
+ Rethinking the Backdoor Attacks' Triggers: A Frequency Perspective (ICCV 2021) [[paper](https://openaccess.thecvf.com/content/ICCV2021/html/Zeng_Rethinking_the_Backdoor_Attacks_Triggers_A_Frequency_Perspective_ICCV_2021_paper.html)] [[code](https://github.com/YiZeng623/frequency-backdoor)]
+ Intrinsic Certified Robustness of Bagging against Data Poisoning Attacks (AAAI 2021) [[paper](https://ojs.aaai.org/index.php/AAAI/article/view/16971)] [[code](https://github.com/jinyuan-jia/BaggingCertifyDataPoisoning)]
+ Strong Data Augmentation Sanitizes Poisoning and Backdoor Attacks Without an Accuracy Tradeoff (ICASSP 2021) [[paper](https://ieeexplore.ieee.org/abstract/document/9414862)]

## 2020

+ On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping (arXiv 2020) [[paper](https://arxiv.org/abs/2002.11497)] [[code](https://github.com/Sanghyun-Hong/Gradient-Shaping)]
+ Backdooring and poisoning neural networks with image-scaling attacks (arXiv 2020) [[paper](https://arxiv.org/abs/2003.08633)]
+ Poisoned classifiers are not only backdoored, they are fundamentally broken (arXiv 2020) [[paper](https://arxiv.org/abs/2010.09080)] [[code](https://github.com/locuslab/breaking-poisoned-classifier)]
+ Invisible backdoor attacks on deep neural networks via steganography and regularization (TDSC 2020) [[paper](https://ieeexplore.ieee.org/abstract/document/9186317)]
+ Universal Litmus Patterns: Revealing Backdoor Attacks in CNNs (CVPR 2020) [[paper](https://arxiv.org/abs/1906.10842)] [[code](https://github.com/UMBCvision/Universal-Litmus-Patterns)]
+ MetaPoison: Practical General-purpose Clean-label Data Poisoning (NeurIPS 2020) [[paper](https://arxiv.org/abs/2004.00225)]
+ Input-Aware Dynamic Backdoor Attack (NeurIPS 2020) [[paper](https://proceedings.neurips.cc/paper/2020/hash/234e691320c0ad5b45ee3c96d0d7b8f8-Abstract.html)] [[code](https://github.com/VinAIResearch/input-aware-backdoor-attack-release)]
+ How To Backdoor Federated Learning (AISTATS 2020) [[paper](https://proceedings.mlr.press/v108/bagdasaryan20a.html)]
+ Reflection backdoor: A natural backdoor attack on deep neural networks (ECCV 2020) [[paper](https://arxiv.org/abs/2007.02343)]
+ Practical Poisoning Attacks on Neural Networks (ECCV 2020) [[paper](https://link.springer.com/chapter/10.1007/978-3-030-58583-9_9)]
+ Practical Detection of Trojan Neural Networks: Data-Limited and Data-Free Cases (ECCV 2020) [[paper](https://link.springer.com/chapter/10.1007/978-3-030-58592-1_14)] [[code](https://github.com/wangren09/TrojanNetDetector)]
+ Deep k-NN Defense Against Clean-Label Data Poisoning Attacks (ECCV 2020 Workshops) [[paper](https://link.springer.com/chapter/10.1007/978-3-030-66415-2_4)] [[code](https://github.com/neeharperi/DeepKNNDefense)]
+ Radioactive data: tracing through training (ICML 2020) [[paper](https://arxiv.org/abs/2002.00937)]
+ Reliable Evaluation of Adversarial Robustness with an Ensemble of Diverse Parameter-free Attacks (ICML 2020) [[paper](https://proceedings.mlr.press/v119/croce20b.html)]
+ Certified Robustness to Label-Flipping Attacks via Randomized Smoothing (ICML 2020) [[paper](http://proceedings.mlr.press/v119/rosenfeld20b.html)]
+ An Embarrassingly Simple Approach for Trojan Attack in Deep Neural Networks (KDD 2020) [[paper](https://dl.acm.org/doi/abs/10.1145/3394486.3403064)] [[code](https://github.com/trx14/TrojanNet)]
+ Hidden Trigger Backdoor Attacks (AAAI 2020) [[paper](https://ojs.aaai.org/index.php/AAAI/article/view/6871)] [[code](https://github.com/UMBCvision/Hidden-Trigger-Backdoor-Attacks)]

## 2019

+ Label-consistent backdoor attacks (arXiv 2019) [[paper](https://arxiv.org/abs/1912.02771)]
+ Poisoning Attacks with Generative Adversarial Nets (arXiv 2019) [[paper](https://arxiv.org/abs/1906.07773)]
+ TABOR: A Highly Accurate Approach to Inspecting and Restoring Trojan Backdoors in AI Systems (arXiv 2019) [[paper](https://arxiv.org/abs/1908.01763)]
+ BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain (IEEE Access 2019) [[paper](https://arxiv.org/abs/1708.06733)]
+ Data Poisoning against Differentially-Private Learners: Attacks and Defenses (IJCAI 2019) [[paper](https://arxiv.org/abs/1903.09860)]
+ DeepInspect: A Black-box Trojan Detection and Mitigation Framework for Deep Neural Networks (IJCAI 2019) [[paper](https://aceslab.org/sites/default/files/DeepInspect.pdf)]
+ Sever: A Robust Meta-Algorithm for Stochastic Optimization (ICML 2019) [[paper](https://proceedings.mlr.press/v97/diakonikolas19a.html)]
+ Learning with Bad Training Data via Iterative Trimmed Loss Minimization (ICML 2019) [[paper](https://proceedings.mlr.press/v97/shen19e.html)]
+ Universal Multi-Party Poisoning Attacks (ICML 2019) [[paper](https://proceedings.mlr.press/v97/mahloujifar19a.html)]
+ Transferable Clean-Label Poisoning Attacks on Deep Neural Nets (ICML 2019) [[paper](https://arxiv.org/abs/1905.05897)]
+ Defending Neural Backdoors via Generative Distribution Modeling (NeurIPS 2019) [[paper](https://arxiv.org/abs/1910.04749)]
+ Learning to Confuse: Generating Training Time Adversarial Data with Auto-Encoder (NeurIPS 2019) [[paper](https://proceedings.neurips.cc/paper/2019/hash/1ce83e5d4135b07c0b82afffbe2b3436-Abstract.html)]
+ The Curse of Concentration in Robust Learning: Evasion and Poisoning Attacks from Concentration of Measure (AAAI 2019) [[paper](https://arxiv.org/abs/1809.03063)]
+ Backdoor Attacks against Transfer Learning with Pre-trained Deep Learning Models (IEEE Transactions on Services Computing 2019) [[paper](https://arxiv.org/abs/2001.03274)]
+ Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks (IEEE Symposium on Security and Privacy 2019) [[paper](https://ieeexplore.ieee.org/abstract/document/8835365)]
+ STRIP: a defence against trojan attacks on deep neural networks (ACSAC 2019) [[paper](https://dl.acm.org/doi/abs/10.1145/3359789.3359790)]

## 2018

+ Detecting Backdoor Attacks on Deep Neural Networks by Activation Clustering (arXiv 2018) [[paper](https://arxiv.org/abs/1811.03728)]
+ Spectral Signatures in Backdoor Attacks (NeurIPS 2018) [[paper](https://proceedings.neurips.cc/paper/2018/hash/280cf18baf4311c92aa5a042336587d3-Abstract.html)]
+ Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks (NeurIPS 2018) [[paper](https://arxiv.org/abs/1804.00792)]
+ Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise (NeurIPS 2018) [[paper](https://arxiv.org/abs/1802.05300)]
+ Trojaning Attack on Neural Networks (NDSS 2018) [[paper](https://www.ndss-symposium.org/wp-content/uploads/2018/02/ndss2018_03A-5_Liu_paper.pdf)]
+ Label Sanitization Against Label Flipping Poisoning Attacks (ECML PKDD 2018 Workshops) [[paper](https://link.springer.com/chapter/10.1007/978-3-030-13453-2_1)]
+ Turning Your Weakness Into a Strength: Watermarking Deep Neural Networks by Backdooring (USENIX Security 2018) [[paper](https://arxiv.org/abs/1802.04633)]

## 2017

+ Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning (arXiv 2017) [[paper](https://arxiv.org/abs/1712.05526)]
+ Generative Poisoning Attack Method Against Neural Networks (arXiv 2017) [[paper](https://arxiv.org/abs/1703.01340)]
+ Delving into Transferable Adversarial Examples and Black-box Attacks (ICLR 2017) [[paper](https://openreview.net/forum?id=Sys6GJqxl)]
+ Understanding Black-box Predictions via Influence Functions (ICML 2017) [[paper](http://proceedings.mlr.press/v70/koh17a)] [[code](https://github.com/kohpangwei/influence-release)]
+ Certified Defenses for Data Poisoning Attacks (NeurIPS 2017) [[paper](https://arxiv.org/abs/1706.03691)]

## 2016

+ Data Poisoning Attacks on Factorization-Based Collaborative Filtering (NeurIPS 2016) [[paper](https://proceedings.neurips.cc/paper/2016/hash/83fa5a432ae55c253d0e60dbfa716723-Abstract.html)]

## 2015

+ Is Feature Selection Secure against Training Data Poisoning? (ICML 2015) [[paper](https://proceedings.mlr.press/v37/xiao15.html)]
+ Using Machine Teaching to Identify Optimal Training-Set Attacks on Machine Learners (AAAI 2015) [[paper](https://ojs.aaai.org/index.php/AAAI/article/view/9569)]
--------------------------------------------------------------------------------