├── src
│   ├── __init__.py
│   ├── dataset.py
│   ├── utils.py
│   └── attacks.py
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── README.md
├── text_classification.py
├── run_textattack.py
├── evaluate_adv_samples.py
└── LICENSE
/src/__init__.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2015-present, Facebook, Inc. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the CC-by-NC license found in the 5 | # LICENSE file in the root directory of this source tree. 6 | # --------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Code of Conduct 2 | 3 | Facebook has adopted a Code of Conduct that we expect project participants to adhere to. 4 | Please read the [full text](https://code.fb.com/codeofconduct/) 5 | so that you can understand what actions will and will not be tolerated. 6 | --------------------------------------------------------------------------------
/src/dataset.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2015-present, Facebook, Inc. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the CC-by-NC license found in the 5 | # LICENSE file in the root directory of this source tree. 6 | # 7 | import os 8 | 9 | from datasets import load_dataset 10 | from src.utils import target_offset 11 | 12 | def load_data(args): 13 | if args.dataset == "dbpedia14": 14 | dataset = load_dataset("csv", column_names=["label", "title", "sentence"], 15 | data_files={"train": os.path.join(args.data_folder, "dbpedia_csv/train.csv"), 16 | "validation": os.path.join(args.data_folder, "dbpedia_csv/test.csv")}) 17 | dataset = dataset.map(target_offset, batched=True) 18 | num_labels = 14 19 | elif args.dataset == "ag_news": 20 | dataset = load_dataset("ag_news") 21 | num_labels = 4 22 | elif args.dataset == "imdb": 23 | dataset = load_dataset("imdb", ignore_verifications=True) 24 | num_labels = 2 25 | elif args.dataset == "yelp": 26 | dataset = load_dataset("yelp_polarity") 27 | num_labels = 2 28 | elif args.dataset == "mnli": 29 | dataset = load_dataset("glue", "mnli") 30 | num_labels = 3 31 | dataset = dataset.shuffle(seed=0) 32 | 33 | 34 | 35 | return dataset, num_labels --------------------------------------------------------------------------------
/CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing to text-adversarial-attack 2 | We want to make contributing to this project as easy and transparent as 3 | possible. 4 | 5 | ## Pull Requests 6 | We actively welcome your pull requests. 7 | 8 | 1. Fork the repo and create your branch from `master`. 9 | 2. If you've added code that should be tested, add tests. 10 | 3. If you've changed APIs, update the documentation. 11 | 4. Ensure the test suite passes. 12 | 5. Make sure your code lints. 13 | 6. If you haven't already, complete the Contributor License Agreement ("CLA"). 14 | 15 | ## Contributor License Agreement ("CLA") 16 | In order to accept your pull request, we need you to submit a CLA. You only need 17 | to do this once to work on any of Facebook's open source projects. 18 | 19 | Complete your CLA here: 20 | 21 | ## Issues 22 | We use GitHub issues to track public bugs. Please ensure your description is 23 | clear and has sufficient instructions to be able to reproduce the issue.
24 | 25 | Facebook has a [bounty program](https://www.facebook.com/whitehat/) for the safe 26 | disclosure of security bugs. In those cases, please go through the process 27 | outlined on that page and do not file a public issue. 28 | 29 | ## License 30 | By contributing to text-adversarial-attack, you agree that your contributions will be licensed 31 | under the LICENSE file in the root directory of this source tree. 32 | 33 | --------------------------------------------------------------------------------
/README.md: -------------------------------------------------------------------------------- 1 | # Gradient-based Adversarial Attacks against Text Transformers 2 | 3 | ## 0. Install Dependencies and Data Download 4 | 1. Install HuggingFace dependencies 5 | ``` 6 | conda install -c huggingface transformers 7 | pip install datasets 8 | ``` 9 | 2. (Optional) For attacks against DBPedia14, download the data from [Kaggle](https://www.kaggle.com/danofer/dbpedia-classes) and set up the data directory to contain: 10 | ``` 11 | /dbpedia_csv/ 12 | train.csv 13 | test.csv 14 | ``` 15 | 16 | ## 1. Training finetuned models 17 | Use the following training script to finetune a pre-trained transformer model from HuggingFace: 18 | ``` 19 | python text_classification.py --data_folder <data_folder> --dataset <dataset> --model <model> --finetune True 20 | ``` 21 | 22 | ## 2. Attacking a finetuned model 23 | To attack a model finetuned with ```text_classification.py``` or a pretrained model from the TextAttack library: 24 | ``` 25 | python whitebox_attack.py --data_folder <data_folder> --dataset <dataset> --model <model> --finetune True --start_index 0 --num_samples 100 --gumbel_samples 100 26 | ``` 27 | This runs the GBDA attack on the first 100 samples from the test set. 28 | 29 | ### 2.1. Downloading GPT-2 trained on BERT tokenizer (optional) 30 | To attack a BERT model, GBDA requires a causal language model trained on the BERT tokenizer. We provide a pretrained GPT-2 model for this purpose. Before the attack, please run the following script to download the model from the Amazon S3 bucket: 31 | ``` 32 | curl https://dl.fbaipublicfiles.com/text-adversarial-attack/transformer_wikitext-103.pth -o transformer_wikitext-103.pth 33 | ``` 34 | 35 | ## 3. Evaluating transfer attack 36 | After attacking a model, run the following script to query a target model with samples from the optimized adversarial distribution: 37 | ``` 38 | python evaluate_adv_samples.py --data_folder <data_folder> --dataset <dataset> --surrogate_model <surrogate_model> --target_model <target_model> --finetune True --start_index 0 --num_samples 100 --end_index 100 --gumbel_samples 1000 39 | ``` 40 | 41 | ## Citation 42 | 43 | Please cite [[1]](https://arxiv.org/abs/2104.13733) if you found the resources in this repository useful. 44 | 45 | 46 | [1] C. Guo*, A. Sablayrolles*, Hervé Jégou, Douwe Kiela. [*Gradient-based Adversarial Attacks against Text Transformers*](https://arxiv.org/abs/2104.13733). EMNLP 2021. 47 | 48 | 49 | ``` 50 | @article{guo2021gradientbased, 51 | title={Gradient-based Adversarial Attacks against Text Transformers}, 52 | author={Guo, Chuan and Sablayrolles, Alexandre and Jégou, Hervé and Kiela, Douwe}, 53 | journal={arXiv preprint arXiv:2104.13733}, 54 | year={2021} 55 | } 56 | ``` 57 | 58 | 59 | ## Contributing 60 | See the [CONTRIBUTING](CONTRIBUTING.md) file for how to help out. 61 | 62 | ## License 63 | This project is CC-BY-NC 4.0 licensed, as found in the LICENSE file.
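## Example: inspecting attack outputs
The snippet below is a minimal sketch, not part of the original scripts, showing one way to inspect a result file written by ```whitebox_attack.py``` (step 2). The dictionary keys match the ```torch.save``` call in that script; the file path is a hypothetical placeholder, since actual names are generated by ```get_output_file``` in ```src/utils.py``` from the attack arguments.
```
import torch

# Hypothetical path; replace with one of the .pth files under adv_samples/.
output_file = "adv_samples/EXAMPLE_OUTPUT.pth"
ckpt = torch.load(output_file)

# clean_logits / adv_logits are N x C tensors; clean_texts / adv_texts are lists of strings
# (for MNLI they are instead dicts holding 'premise' and 'hypothesis' lists).
for clean, adv, clean_logit, adv_logit in zip(
        ckpt["clean_texts"], ckpt["adv_texts"], ckpt["clean_logits"], ckpt["adv_logits"]):
    print("CLEAN:", clean)
    print("ADV:  ", adv)
    print("prediction: %d -> %d" % (clean_logit.argmax().item(), adv_logit.argmax().item()))
```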
64 | -------------------------------------------------------------------------------- /src/utils.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2015-present, Facebook, Inc. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the CC-by-NC license found in the 5 | # LICENSE file in the root directory of this source tree. 6 | # 7 | import argparse 8 | from tqdm import tqdm 9 | import os 10 | import torch 11 | from transformers import GPT2LMHeadModel, GPT2Config 12 | 13 | FALSY_STRINGS = {'off', 'false', '0'} 14 | TRUTHY_STRINGS = {'on', 'true', '1'} 15 | def bool_flag(s): 16 | """ 17 | Parse boolean arguments from the command line. 18 | """ 19 | if s.lower() in FALSY_STRINGS: 20 | return False 21 | elif s.lower() in TRUTHY_STRINGS: 22 | return True 23 | else: 24 | raise argparse.ArgumentTypeError("invalid value for a boolean flag") 25 | 26 | # offset target by 1 if labels start from 1 27 | def target_offset(examples): 28 | examples["label"] = list(map(lambda x: x - 1, examples["label"])) 29 | return examples 30 | 31 | def get_output_file(args, model, start, end): 32 | suffix = '' 33 | if args.finetune: 34 | suffix += '_finetune' 35 | if args.dataset == 'mnli': 36 | dataset_str = f"{args.dataset}_{args.mnli_option}_{args.attack_target}" 37 | else: 38 | dataset_str = args.dataset 39 | attack_str = args.adv_loss 40 | if args.adv_loss == 'cw': 41 | attack_str += f'_kappa={args.kappa}' 42 | 43 | model_name = model.replace('/', '-') 44 | output_file = f"{model_name}_{dataset_str}{suffix}_{start}-{end}" 45 | output_file += f"_iters={args.num_iters}_{attack_str}_lambda_sim={args.lam_sim}_lambda_perp={args.lam_perp}_emblayer={args.embed_layer}_{args.constraint}.pth" 46 | 47 | return output_file 48 | 49 | def load_checkpoints(args): 50 | if args.dataset == 'mnli': 51 | adv_log_coeffs = {'premise': [], 'hypothesis': []} 52 | clean_texts = {'premise': [], 'hypothesis': []} 53 | adv_texts = {'premise': [], 'hypothesis': []} 54 | else: 55 | adv_log_coeffs, clean_texts, adv_texts = [], [], [] 56 | clean_logits, adv_logits, times, labels = [], [], [], [] 57 | 58 | for i in tqdm(range(args.start_index, args.end_index, args.num_samples)): 59 | output_file = get_output_file(args, args.surrogate_model, i, i + args.num_samples) 60 | output_file = os.path.join(args.adv_samples_folder, output_file) 61 | if os.path.exists(output_file): 62 | checkpoint = torch.load(output_file) 63 | clean_logits.append(checkpoint['clean_logits']) 64 | adv_logits.append(checkpoint['adv_logits']) 65 | labels += checkpoint['labels'] 66 | times += checkpoint['times'] 67 | if args.dataset == 'mnli': 68 | adv_log_coeffs['premise'] += checkpoint['adv_log_coeffs']['premise'] 69 | adv_log_coeffs['hypothesis'] += checkpoint['adv_log_coeffs']['hypothesis'] 70 | clean_texts['premise'] += checkpoint['clean_texts']['premise'] 71 | clean_texts['hypothesis'] += checkpoint['clean_texts']['hypothesis'] 72 | adv_texts['premise'] += checkpoint['adv_texts']['premise'] 73 | adv_texts['hypothesis'] += checkpoint['adv_texts']['hypothesis'] 74 | else: 75 | adv_log_coeffs += checkpoint['adv_log_coeffs'] 76 | clean_texts += checkpoint['clean_texts'] 77 | adv_texts += checkpoint['adv_texts'] 78 | else: 79 | print('Skipping %s' % output_file) 80 | clean_logits = torch.cat(clean_logits, 0) 81 | adv_logits = torch.cat(adv_logits, 0) 82 | return clean_texts, adv_texts, clean_logits, adv_logits, adv_log_coeffs, labels, times 83 | 84 | def print_args(args): 85 | args_dict = vars(args) 86 
| for arg_name, arg_value in sorted(args_dict.items()): 87 | print(f"\t{arg_name}: {arg_value}") 88 | 89 | def embedding_from_weights(w): 90 | layer = torch.nn.Embedding(w.size(0), w.size(1)) 91 | layer.weight.data = w 92 | 93 | return layer 94 | 95 | def load_gpt2_from_dict(dict_path, output_hidden_states=False): 96 | state_dict = torch.load(dict_path)['model'] 97 | 98 | config = GPT2Config( 99 | vocab_size=30522, 100 | n_embd=1024, 101 | n_head=8, 102 | activation_function='relu', 103 | n_layer=24, 104 | output_hidden_states=output_hidden_states 105 | ) 106 | model = GPT2LMHeadModel(config) 107 | model.load_state_dict(state_dict) 108 | # The input embedding is not loaded automatically 109 | model.set_input_embeddings(embedding_from_weights(state_dict['transformer.wte.weight'].cpu())) 110 | 111 | return model -------------------------------------------------------------------------------- /text_classification.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2015-present, Facebook, Inc. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the CC-by-NC license found in the 5 | # LICENSE file in the root directory of this source tree. 6 | # 7 | import torch 8 | import numpy as np 9 | from datasets import list_datasets, load_dataset, list_metrics, load_metric 10 | from transformers import AutoConfig, AutoTokenizer, AutoModelForSequenceClassification, TrainingArguments, Trainer 11 | import argparse 12 | import os 13 | 14 | from src.dataset import load_data 15 | from src.utils import bool_flag 16 | 17 | 18 | # function for computing accuracy 19 | def compute_metrics(eval_pred): 20 | predictions, labels = eval_pred 21 | if type(predictions) == tuple: 22 | predictions = predictions[0] 23 | predictions = np.argmax(predictions, axis=1) 24 | acc = np.mean(predictions == labels) 25 | return { 26 | 'accuracy': acc 27 | } 28 | 29 | def main(args): 30 | 31 | dataset, num_labels = load_data(args) 32 | 33 | tokenizer = AutoTokenizer.from_pretrained(args.model, use_fast=True) 34 | model = AutoModelForSequenceClassification.from_pretrained(args.model, num_labels=num_labels) 35 | if args.model == 'gpt2': 36 | tokenizer.padding_side = "right" 37 | tokenizer.pad_token = tokenizer.eos_token 38 | model.config.pad_token_id = model.config.eos_token_id 39 | if args.dataset == "mnli": 40 | # only evaluate on matched validation set 41 | testset_key = "validation_matched" 42 | preprocess_function = lambda examples: tokenizer( 43 | examples["premise"], examples["hypothesis"], max_length=256, truncation=True) 44 | else: 45 | text_key = 'text' if (args.dataset in ["ag_news", "imdb", "yelp"]) else 'sentence' 46 | testset_key = 'test' if (args.dataset in ["ag_news", "imdb", "yelp"]) else 'validation' 47 | preprocess_function = lambda examples: tokenizer(examples[text_key], max_length=256, truncation=True) 48 | encoded_dataset = dataset.map(preprocess_function, batched=True) 49 | 50 | train_args = TrainingArguments( 51 | args.checkpoint_folder, 52 | disable_tqdm=not args.tqdm, 53 | evaluation_strategy = "epoch", 54 | learning_rate=args.lr, 55 | per_device_train_batch_size=args.batch_size, 56 | per_device_eval_batch_size=args.batch_size, 57 | num_train_epochs=args.epochs, 58 | weight_decay=args.weight_decay, 59 | load_best_model_at_end=True, 60 | metric_for_best_model="accuracy", 61 | ) 62 | 63 | trainer = Trainer( 64 | model, 65 | train_args, 66 | train_dataset=encoded_dataset["train"], 67 | eval_dataset=encoded_dataset[testset_key], 68 | 
tokenizer=tokenizer, 69 | compute_metrics=compute_metrics, 70 | ) 71 | 72 | if not args.finetune: 73 | # freeze parameters of transformer 74 | transformer = list(model.children())[0] 75 | for param in transformer.parameters(): 76 | param.requires_grad = False 77 | 78 | trainer.train() 79 | trainer.evaluate() 80 | suffix = '' 81 | if args.finetune: 82 | suffix += '_finetune' 83 | torch.save(model.state_dict(), 84 | os.path.join(args.result_folder, "%s_%s%s.pth" % (args.model.replace('/', '-'), args.dataset, suffix))) 85 | 86 | 87 | if __name__ == "__main__": 88 | parser = argparse.ArgumentParser(description="Text classification model training.") 89 | 90 | # Bookkeeping 91 | parser.add_argument("--checkpoint_folder", default="checkpoint/", type=str, 92 | help="folder in which to store temporary model checkpoints") 93 | parser.add_argument("--result_folder", default="result/", type=str, 94 | help="folder in which to store trained models") 95 | parser.add_argument("--tqdm", default=True, type=bool_flag, 96 | help="Use tqdm in output") 97 | 98 | # Data 99 | parser.add_argument("--data_folder", required=True, type=str, 100 | help="folder in which to store data") 101 | parser.add_argument("--dataset", default="dbpedia14", type=str, 102 | choices=["dbpedia14", "ag_news", "imdb", "yelp", "mnli"], 103 | help="classification dataset to use") 104 | 105 | # Model 106 | parser.add_argument("--model", default="gpt2", type=str, 107 | help="type of model") 108 | 109 | # Optimization 110 | parser.add_argument("--batch_size", default=16, type=int, 111 | help="batch size for training and evaluation") 112 | parser.add_argument("--epochs", default=5, type=int, 113 | help="number of epochs to train for") 114 | parser.add_argument("--lr", default=2e-5, type=float, 115 | help="learning rate") 116 | parser.add_argument("--weight_decay", default=0.01, type=float, 117 | help="weight decay") 118 | parser.add_argument("--finetune", default=False, type=bool_flag, 119 | help="finetune the transformer; if False, only train linear layer") 120 | 121 | args = parser.parse_args() 122 | 123 | if args.result_folder == 'none': 124 | args.result_folder = args.checkpoint_folder 125 | 126 | main(args) -------------------------------------------------------------------------------- /src/attacks.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2015-present, Facebook, Inc. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the CC-by-NC license found in the 5 | # LICENSE file in the root directory of this source tree. 
6 | # 7 | import os 8 | from os.path import join 9 | 10 | import argparse 11 | from bert_score import BERTScorer 12 | import csv 13 | import datasets 14 | import numpy as np 15 | import time 16 | import torch 17 | from transformers import AutoTokenizer, pipeline 18 | import transformers 19 | import tensorflow as tf 20 | import tensorflow_hub as hub 21 | import textattack 22 | from textattack.attack_recipes import BERTAttackLi2020, BAEGarg2019 23 | from textattack.datasets import HuggingFaceDataset 24 | from textattack.models.wrappers import ModelWrapper 25 | from textattack.commands.attack.attack_args import HUGGINGFACE_DATASET_BY_MODEL 26 | from textattack.constraints.overlap import MaxWordsPerturbed 27 | from textattack.constraints.pre_transformation import RepeatModification, StopwordModification 28 | from textattack.constraints.semantics.sentence_encoders import UniversalSentenceEncoder 29 | from textattack.goal_functions import UntargetedClassification 30 | from textattack.search_methods import GreedyWordSwapWIR 31 | from textattack.shared.attack import Attack 32 | from textattack.transformations import WordSwapMaskedLM 33 | from textattack.attack_recipes.attack_recipe import AttackRecipe 34 | from textattack.goal_functions.classification.targeted_classification import TargetedClassification 35 | from textattack.shared import AttackedText 36 | from textattack.constraints.grammaticality import PartOfSpeech 37 | from textattack.constraints.overlap import MaxWordsPerturbed 38 | from textattack.constraints.pre_transformation import ( 39 | RepeatModification, 40 | StopwordModification, 41 | ) 42 | from textattack.constraints.semantics import WordEmbeddingDistance 43 | from textattack.goal_functions import UntargetedClassification 44 | from textattack.search_methods import BeamSearch 45 | from textattack.shared.attack import Attack 46 | from textattack.transformations import WordSwapGradientBased 47 | 48 | from src.utils import bool_flag 49 | 50 | 51 | class USE: 52 | def __init__(self): 53 | self.encoder = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4") 54 | 55 | def compute_sim(self, clean_texts, adv_texts): 56 | clean_embeddings = self.encoder(clean_texts) 57 | adv_embeddings = self.encoder(adv_texts) 58 | cosine_sim = tf.reduce_mean(tf.reduce_sum(clean_embeddings * adv_embeddings, axis=1)) 59 | 60 | return float(cosine_sim.numpy()) 61 | 62 | 63 | def build_baegarg2019(model_wrapper, threshold_cosine=0.936338023, query_budget=None, max_candidates=50): 64 | """ 65 | Modified from https://github.com/QData/TextAttack/blob/04b7c6f79bdb5301b360555bd5458c15aa2b8695/textattack/attack_recipes/bae_garg_2019.py 66 | """ 67 | transformation = WordSwapMaskedLM( 68 | method="bae", max_candidates=max_candidates, min_confidence=0.0 69 | ) 70 | constraints = [RepeatModification(), StopwordModification()] 71 | 72 | constraints.append(PartOfSpeech(allow_verb_noun_swap=True)) 73 | 74 | use_constraint = UniversalSentenceEncoder( 75 | threshold=threshold_cosine, 76 | metric="cosine", 77 | compare_against_original=True, 78 | window_size=15, 79 | skip_text_shorter_than_window=True, 80 | ) 81 | constraints.append(use_constraint) 82 | goal_function = UntargetedClassification(model_wrapper) 83 | if query_budget is not None: 84 | goal_function.query_budget = query_budget 85 | search_method = GreedyWordSwapWIR(wir_method="delete") 86 | 87 | return Attack(goal_function, constraints, transformation, search_method) 88 | 89 | 90 | def build_attack(model_wrapper, target_class=-1): 91 | """ 92 | Same as 
bert-attack except: 93 | - it is TargetedClassification instead of Untargeted when target_class != -1 94 | - using "bae" instead of "bert-attack" because of bert-attack's problem for subtokens 95 | Modified from https://github.com/QData/TextAttack/blob/36dfce6bdab933bdeed3a2093ae411e93018ebbf/textattack/attack_recipes/bert_attack_li_2020.py 96 | """ 97 | 98 | # transformation = WordSwapMaskedLM(method="bert-attack", max_candidates=48) 99 | transformation = WordSwapMaskedLM(method="bae", max_candidates=100) 100 | constraints = [RepeatModification(), StopwordModification()] 101 | constraints.append(MaxWordsPerturbed(max_percent=0.4)) 102 | 103 | use_constraint = UniversalSentenceEncoder( 104 | threshold=0.2, 105 | metric="cosine", 106 | compare_against_original=True, 107 | window_size=None, 108 | ) 109 | constraints.append(use_constraint) 110 | if target_class == -1: 111 | goal_function = UntargetedClassification(model_wrapper) 112 | else: 113 | # We modify the goal 114 | goal_function = TargetedClassification(model_wrapper, target_class=target_class) 115 | search_method = GreedyWordSwapWIR(wir_method="unk") 116 | 117 | return Attack(goal_function, constraints, transformation, search_method) 118 | 119 | 120 | # def build_attack_2(model_wrapper, target_class): 121 | # """ 122 | # Same as HotFlipEbrahimi2017 attack except: 123 | # - it is TargetedClassification instead of Untargeted 124 | # """ 125 | # transformation = WordSwapGradientBased(model_wrapper, top_n=1) 126 | # constraints = [RepeatModification(), StopwordModification()] 127 | # constraints.append(MaxWordsPerturbed(max_num_words=2)) 128 | # constraints.append(WordEmbeddingDistance(min_cos_sim=0.8)) 129 | # constraints.append(PartOfSpeech()) 130 | # goal_function = TargetedClassification(model_wrapper) 131 | 132 | # search_method = BeamSearch(beam_width=10) 133 | 134 | # return Attack(goal_function, constraints, transformation, search_method) -------------------------------------------------------------------------------- /run_textattack.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2015-present, Facebook, Inc. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the CC-by-NC license found in the 5 | # LICENSE file in the root directory of this source tree. 
6 | # 7 | import os 8 | from os.path import join 9 | 10 | import argparse 11 | from bert_score import BERTScorer 12 | import csv 13 | import datasets 14 | import json 15 | import numpy as np 16 | import time 17 | import torch 18 | from transformers import AutoTokenizer, pipeline 19 | import transformers 20 | import textattack 21 | from textattack.attack_recipes import BERTAttackLi2020, BAEGarg2019, CLARE2020 22 | from textattack.datasets import HuggingFaceDataset 23 | from textattack.models.wrappers import ModelWrapper 24 | from textattack.commands.attack.attack_args import HUGGINGFACE_DATASET_BY_MODEL 25 | from textattack.constraints.overlap import MaxWordsPerturbed 26 | from textattack.constraints.pre_transformation import RepeatModification, StopwordModification 27 | from textattack.constraints.semantics.sentence_encoders import UniversalSentenceEncoder 28 | from textattack.goal_functions import UntargetedClassification 29 | from textattack.search_methods import GreedyWordSwapWIR 30 | from textattack.shared.attack import Attack 31 | from textattack.transformations import WordSwapMaskedLM 32 | from textattack.attack_recipes.attack_recipe import AttackRecipe 33 | from textattack.goal_functions.classification.targeted_classification import TargetedClassification 34 | from textattack.shared import AttackedText 35 | from textattack.constraints.grammaticality import PartOfSpeech 36 | from textattack.constraints.overlap import MaxWordsPerturbed 37 | from textattack.constraints.pre_transformation import ( 38 | RepeatModification, 39 | StopwordModification, 40 | ) 41 | from textattack.constraints.semantics import WordEmbeddingDistance 42 | from textattack.goal_functions import UntargetedClassification 43 | from textattack.search_methods import BeamSearch 44 | from textattack.shared.attack import Attack 45 | from textattack.transformations import WordSwapGradientBased 46 | 47 | from textattack.attack_results import SuccessfulAttackResult, FailedAttackResult 48 | 49 | from src.utils import bool_flag 50 | from src.attacks import build_baegarg2019, build_attack, USE 51 | from src.dataset import load_data 52 | 53 | 54 | def get_parser(): 55 | parser = argparse.ArgumentParser() 56 | 57 | 58 | parser.add_argument("--dataset", choices=['dbpedia14', 'sst2', 'ag_news', 'yelp', 'imdb']) 59 | parser.add_argument("--data-folder", required=True, type=str) 60 | parser.add_argument("--batch_size", type=int, default=32) 61 | parser.add_argument("--dump_path", type=str) 62 | parser.add_argument('--model', type=str, default='bert-base-uncased') 63 | 64 | parser.add_argument("--target_dir", type=str) 65 | parser.add_argument("--chunk_id", type=int, required=True) 66 | parser.add_argument("--chunk_size", type=int, required=True) 67 | 68 | # Attack parameters 69 | parser.add_argument("--attack", choices=["bae", "bert-attack", "custom"]) 70 | parser.add_argument("--bae-threshold", type=float, default=0.8) 71 | parser.add_argument("--query-budget", type=int, default=None) 72 | parser.add_argument("--radioactive", type=bool_flag) 73 | parser.add_argument("--targeted", type=bool_flag, default=True) 74 | parser.add_argument("--ckpt", type=str) 75 | 76 | return parser 77 | 78 | 79 | 80 | def main(params): 81 | # Loading data 82 | dataset, num_labels = load_data(params) 83 | dataset = dataset["train"] 84 | text_key = 'text' 85 | if params.dataset == "dbpedia14": 86 | text_key = 'content' 87 | print(f"Loaded dataset {params.dataset}, that has {len(dataset)} rows") 88 | 89 | # Load model and tokenizer from HuggingFace 90 | 
model_class = transformers.AutoModelForSequenceClassification 91 | model = model_class.from_pretrained(params.model, num_labels=num_labels).cuda() 92 | 93 | if params.ckpt != None: 94 | state_dict = torch.load(params.ckpt) 95 | model.load_state_dict(state_dict) 96 | tokenizer = textattack.models.tokenizers.AutoTokenizer(params.model) 97 | model_wrapper = textattack.models.wrappers.HuggingFaceModelWrapper(model, tokenizer, batch_size=params.batch_size) 98 | 99 | # Create radioactive directions and modify classification layer to use those 100 | if params.radioactive: 101 | torch.manual_seed(0) 102 | radioactive_directions = torch.randn(num_labels, 768) 103 | radioactive_directions /= torch.norm(radioactive_directions, dim=1, keepdim=True) 104 | print(radioactive_directions) 105 | model.classifier.weight.data = radioactive_directions.cuda() 106 | model.classifier.bias.data = torch.zeros(num_labels).cuda() 107 | 108 | start_index = params.chunk_id * params.chunk_size 109 | end_index = start_index + params.chunk_size 110 | 111 | if params.target_dir is not None: 112 | target_file = join(params.target_dir, f"{params.chunk_id}.csv") 113 | f = open(target_file, "w") 114 | f = csv.writer(f, delimiter=',', quotechar='"', quoting=csv.QUOTE_NONNUMERIC) 115 | 116 | # Creating attack 117 | print(f"Building {params.attack} attack") 118 | if params.attack == "custom": 119 | current_label = -1 120 | if params.targeted: 121 | current_label = dataset[start_index]['label'] 122 | assert all([dataset[i]['label'] == current_label for i in range(start_index, end_index)]) 123 | attack = build_attack(model_wrapper, current_label) 124 | elif params.attack == "bae": 125 | print(f"Building BAE method with threshold={params.bae_threshold:.2f}") 126 | attack = build_baegarg2019(model_wrapper, threshold_cosine=params.bae_threshold, query_budget=params.query_budget) 127 | elif params.attack == "bert-attack": 128 | assert params.query_budget is None 129 | attack = BERTAttackLi2020.build(model_wrapper) 130 | elif params.attack == "clare": 131 | assert params.query_budget is None 132 | attack = CLARE2020.build(model_wrapper) 133 | 134 | # Launching attack 135 | begin_time = time.time() 136 | samples = [(dataset[i][text_key], attack.goal_function.get_output(AttackedText(dataset[i][text_key]))) for i in range(start_index, end_index)] 137 | results = list(attack.attack_dataset(samples)) 138 | 139 | # Storing attacked text 140 | bert_scorer = BERTScorer(model_type="bert-base-uncased", idf=False) 141 | 142 | n_success = 0 143 | similarities = [] 144 | queries = [] 145 | use = USE() 146 | 147 | for i_result, result in enumerate(results): 148 | print("") 149 | print(50 * "*") 150 | print("") 151 | text = dataset[start_index + i_result][text_key] 152 | ptext = result.perturbed_text() 153 | i_data = start_index + i_result 154 | if params.target_dir is not None: 155 | if params.dataset == 'dbpedia14': 156 | f.writerow([dataset[i_data]['label'] + 1, dataset[i_data]['title'], ptext]) 157 | else: 158 | f.writerow([dataset[i_data]['label'] + 1, ptext]) 159 | 160 | print("True label ", dataset[i_data]['label']) 161 | print(f"CLEAN TEXT\n {text}") 162 | print(f"ADV TEXT\n {ptext}") 163 | 164 | if type(result) not in [SuccessfulAttackResult, FailedAttackResult]: 165 | print("WARNING: Attack neither succeeded nor failed...") 166 | print(result.goal_function_result_str()) 167 | precision, recall, f1 = [r.item() for r in bert_scorer.score([ptext], [text])] 168 | print(f"Bert scores: precision {precision:.2f}, recall: {recall:.2f}, f1: 
{f1:.2f}") 169 | initial_logits = model_wrapper([text]) 170 | final_logits = model_wrapper([ptext]) 171 | print("Initial logits", initial_logits) 172 | print("Final logits", final_logits) 173 | print("Logits difference", final_logits - initial_logits) 174 | 175 | # Statistics 176 | n_success += 1 if type(result) is SuccessfulAttackResult else 0 177 | queries.append(result.num_queries) 178 | similarities.append(use.compute_sim([text], [ptext])) 179 | 180 | print("Processing all samples took %.2f" % (time.time() - begin_time)) 181 | print(f"Total success: {n_success}/{len(results)}") 182 | logs = { 183 | "success_rate": n_success / len(results), 184 | "avg_queries": sum(queries) / len(queries), 185 | "queries": queries, 186 | "avg_similarity": sum(similarities) / len(similarities), 187 | "similarities": similarities, 188 | } 189 | print("__logs:" + json.dumps(logs)) 190 | if params.target_dir is not None: 191 | f.close() 192 | 193 | 194 | if __name__ == "__main__": 195 | print("Using text attack from ", textattack.__file__) 196 | # Parse arguments 197 | parser = get_parser() 198 | params = parser.parse_args() 199 | # if not params.radioactive: 200 | # assert params.ckpt is not None, "Should specify --ckpt if not radioactive." 201 | assert not (params.radioactive and not params.targeted), "Radioactive means targeted" 202 | 203 | # Run main code 204 | begin_time = time.time() 205 | main(params) 206 | print("Running program took %.2f" % (time.time() - begin_time)) 207 | -------------------------------------------------------------------------------- /evaluate_adv_samples.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2015-present, Facebook, Inc. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the CC-by-NC license found in the 5 | # LICENSE file in the root directory of this source tree. 
6 | # 7 | import torch 8 | from torch.utils.data import Dataset 9 | import numpy as np 10 | from datasets import list_datasets, load_dataset, list_metrics, load_metric 11 | import transformers 12 | from transformers import AutoConfig, AutoTokenizer, AutoModelForSequenceClassification 13 | import math 14 | import random 15 | from tqdm import tqdm 16 | import torch.nn.functional as F 17 | transformers.logging.set_verbosity(transformers.logging.ERROR) 18 | import tensorflow as tf 19 | import tensorflow_hub as hub 20 | import argparse 21 | import json 22 | import os 23 | 24 | from src.dataset import load_data 25 | from src.utils import bool_flag, get_output_file, load_checkpoints 26 | 27 | 28 | def evaluate(model, tokenizer, testset, text_key=None, batch_size=10, pretrained=False, label_perm=(lambda x: x)): 29 | """ 30 | Compute the accuracy of a model on a testset 31 | """ 32 | num_samples = len(testset) 33 | num_batches = int(math.ceil(num_samples / batch_size)) 34 | corr = [] 35 | with torch.no_grad(): 36 | for i in range(num_batches): 37 | lower = i * batch_size 38 | upper = min((i+1) * batch_size, num_samples) 39 | examples = testset[lower:upper] 40 | y = torch.LongTensor(examples['label']) 41 | if text_key is None: 42 | if pretrained: 43 | x = tokenizer(examples['premise'], examples['hypothesis'], padding='max_length', 44 | max_length=256, truncation=True, return_tensors='pt') 45 | else: 46 | x = tokenizer(examples['premise'], examples['hypothesis'], padding=True, 47 | truncation=True, return_tensors='pt') 48 | else: 49 | if pretrained: 50 | x = tokenizer(examples[text_key], padding='max_length', max_length=256, 51 | truncation=True, return_tensors='pt') 52 | else: 53 | x = tokenizer(examples[text_key], padding=True, truncation=True, return_tensors='pt') 54 | preds = model(input_ids=x['input_ids'].cuda(), attention_mask=x['attention_mask'].cuda(), 55 | token_type_ids=(x['token_type_ids'].cuda() if 'token_type_ids' in x else None)).logits.cpu() 56 | if label_perm is not None: 57 | corr.append(preds.argmax(1).eq(label_perm(y))) 58 | else: 59 | corr.append(preds.argmax(1).eq(y)) 60 | 61 | return torch.cat(corr, 0) 62 | 63 | class TokenDataset(Dataset): 64 | def __init__(self, encodings): 65 | self.encodings = encodings 66 | 67 | def __getitem__(self, idx): 68 | item = {key: val[idx] for key, val in self.encodings.items()} 69 | return item 70 | 71 | def __len__(self): 72 | return len(self.encodings['label']) 73 | 74 | def evaluate_adv_samples(model, tokenizer, tokenizer_surr, adv_log_coeffs, clean_texts, labels, 75 | attack_target=None, gumbel_samples=100, gumbel_batch_size=10, batch_size=10, 76 | print_every=10, pretrained=False, pretrained_surrogate=False, label_perm=(lambda x: x)): 77 | 78 | assert len(adv_log_coeffs) == len(labels) == len(clean_texts) 79 | num_samples = len(labels) 80 | all_corr = [] 81 | if attack_target == '': 82 | all_sentences = [] 83 | else: 84 | all_sentences = {'hypothesis': [], 'premise': []} 85 | with torch.no_grad(): 86 | for i in tqdm(range(num_samples)): 87 | if attack_target == 'premise': 88 | log_coeffs = adv_log_coeffs['premise'][i].cuda().unsqueeze(0).repeat(gumbel_batch_size, 1, 1) 89 | elif attack_target == 'hypothesis': 90 | log_coeffs = adv_log_coeffs['hypothesis'][i].cuda().unsqueeze(0).repeat(gumbel_batch_size, 1, 1) 91 | else: 92 | log_coeffs = adv_log_coeffs[i].cuda().unsqueeze(0).repeat(gumbel_batch_size, 1, 1) 93 | adv_ids = [] 94 | 95 | # Batch sampling of adv_ids 96 | num_batches = int(math.ceil(gumbel_samples / gumbel_batch_size)) 97 | for j 
in range(num_batches): 98 | adv_ids.append(F.gumbel_softmax(log_coeffs, hard=True).argmax(-1)) 99 | adv_ids = torch.cat(adv_ids, 0)[:gumbel_samples] 100 | evalset = {} 101 | 102 | text_key = None 103 | sentences = [tokenizer_surr.decode(adv_id) for adv_id in adv_ids] 104 | if attack_target == 'premise': 105 | premise = sentences 106 | hypothesis = [clean_texts['hypothesis'][i]] * gumbel_samples 107 | all_sentences['premise'].append(premise) 108 | all_sentences['hypothesis'].append(hypothesis) 109 | evalset['premise'] = premise 110 | evalset['hypothesis'] = hypothesis 111 | elif attack_target == 'hypothesis': 112 | premise = [clean_texts['premise'][i]] * gumbel_samples 113 | hypothesis = sentences 114 | all_sentences['premise'].append(premise) 115 | all_sentences['hypothesis'].append(hypothesis) 116 | evalset['premise'] = premise 117 | evalset['hypothesis'] = hypothesis 118 | else: 119 | all_sentences.append(sentences) 120 | evalset['text'] = sentences 121 | text_key = 'text' 122 | evalset['label'] = [labels[i]] * gumbel_samples 123 | evalset = TokenDataset(evalset) 124 | corr = evaluate(model, tokenizer, evalset, text_key, batch_size, 125 | pretrained=pretrained, label_perm=label_perm) 126 | all_corr.append(corr.unsqueeze(0)) 127 | if (i+1) % print_every == 0: 128 | print('Adversarial accuracy = %.4f' % torch.cat(all_corr, 0).float().mean(1).eq(1).float().mean()) 129 | 130 | all_corr = torch.cat(all_corr, 0) 131 | _, min_index = all_corr.float().cummin(1) 132 | embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4") 133 | if attack_target == 'premise': 134 | clean_embeddings = embed(clean_texts['premise']) 135 | adv_texts = [all_sentences['premise'][j][min_index[j, -1]] for j in range(num_samples)] 136 | elif attack_target == 'hypothesis': 137 | clean_embeddings = embed(clean_texts['hypothesis']) 138 | adv_texts = [all_sentences['hypothesis'][j][min_index[j, -1]] for j in range(num_samples)] 139 | else: 140 | clean_embeddings = embed(clean_texts) 141 | adv_texts = [all_sentences[j][min_index[j, -1]] for j in range(num_samples)] 142 | adv_embeddings = embed(adv_texts) 143 | cosine_sim = tf.reduce_mean(tf.reduce_sum(clean_embeddings * adv_embeddings, axis=1)) 144 | print('Cosine similarity = %.4f' % cosine_sim) 145 | 146 | return all_sentences, all_corr, cosine_sim 147 | 148 | 149 | def main(args): 150 | # Load data 151 | dataset, num_labels = load_data(args) 152 | if args.dataset == 'mnli': 153 | text_key = None 154 | testset_key = 'validation_%s' % args.mnli_option 155 | else: 156 | text_key = 'text' if (args.dataset in ["ag_news", "imdb", "yelp"]) else 'sentence' 157 | testset_key = 'test' if (args.dataset in ["ag_news", "imdb", "yelp"]) else 'validation' 158 | 159 | # Load target model 160 | pretrained = args.target_model.startswith('textattack') 161 | pretrained_surrogate = args.surrogate_model.startswith('textattack') 162 | suffix = '_finetune' if args.finetune else '' 163 | tokenizer = AutoTokenizer.from_pretrained(args.target_model, use_fast=True) 164 | model = AutoModelForSequenceClassification.from_pretrained(args.target_model, num_labels=num_labels).cuda() 165 | if not pretrained: 166 | model_checkpoint = os.path.join(args.result_folder, '%s_%s%s.pth' % (args.target_model.replace('/', '-'), args.dataset, suffix)) 167 | print('Loading checkpoint: %s' % model_checkpoint) 168 | model.load_state_dict(torch.load(model_checkpoint)) 169 | tokenizer.model_max_length = 512 170 | if args.target_model == 'gpt2': 171 | tokenizer.padding_side = "right" 172 | 
tokenizer.pad_token = tokenizer.eos_token 173 | model.config.pad_token_id = model.config.eos_token_id 174 | 175 | label_perm = lambda x: x 176 | if pretrained: 177 | if args.target_model == 'textattack/bert-base-uncased-MNLI' or args.target_model == 'textattack/xlnet-base-cased-MNLI': 178 | label_perm = lambda x: (x + 1) % 3 179 | elif args.target_model == 'textattack/roberta-base-MNLI': 180 | label_perm = lambda x: -(x - 1) + 1 181 | 182 | # Compute clean accuracy 183 | corr = evaluate(model, tokenizer, dataset[testset_key], text_key, pretrained=pretrained, label_perm=label_perm) 184 | print('Clean accuracy = %.4f' % corr.float().mean()) 185 | 186 | surr_tokenizer = AutoTokenizer.from_pretrained(args.surrogate_model, use_fast=True) 187 | surr_tokenizer.model_max_length = 512 188 | if args.surrogate_model == 'gpt2': 189 | surr_tokenizer.padding_side = "right" 190 | surr_tokenizer.pad_token = tokenizer.eos_token 191 | 192 | clean_texts, adv_texts, clean_logits, adv_logits, adv_log_coeffs, labels, times = load_checkpoints(args) 193 | 194 | label_perm = lambda x: x 195 | if pretrained and args.surrogate_model != args.target_model: 196 | if args.target_model == 'textattack/bert-base-uncased-MNLI' or args.target_model == 'textattack/xlnet-base-cased-MNLI': 197 | label_perm = lambda x: (x + 1) % 3 198 | elif args.target_model == 'textattack/roberta-base-MNLI': 199 | label_perm = lambda x: -(x - 1) + 1 200 | 201 | attack_target = args.attack_target if args.dataset == 'mnli' else '' 202 | all_sentences, all_corr, cosine_sim = evaluate_adv_samples( 203 | model, tokenizer, surr_tokenizer, adv_log_coeffs, clean_texts, labels, attack_target=attack_target, 204 | gumbel_samples=args.gumbel_samples, batch_size=args.batch_size, print_every=args.print_every, 205 | pretrained=pretrained, pretrained_surrogate=pretrained_surrogate, label_perm=label_perm) 206 | 207 | print("__logs:" + json.dumps({ 208 | "cosine_similarity": float(cosine_sim), 209 | "adv_acc2": all_corr.float().mean(1).eq(1).float().mean().item() 210 | })) 211 | output_file = get_output_file(args, args.surrogate_model, args.start_index, args.end_index) 212 | output_file = os.path.join(args.adv_samples_folder, 213 | 'transfer_%s_%s' % (args.target_model.replace('/', '-'), output_file)) 214 | torch.save({ 215 | 'all_sentences': all_sentences, 216 | 'all_corr': all_corr, 217 | 'clean_texts': clean_texts, 218 | 'labels': labels, 219 | 'times': times 220 | }, output_file) 221 | 222 | 223 | if __name__ == "__main__": 224 | parser = argparse.ArgumentParser(description="Evaluate white-box attack.") 225 | 226 | # Bookkeeping 227 | parser.add_argument("--result_folder", default="result/", type=str, 228 | help="folder for loading trained models") 229 | parser.add_argument("--adv_samples_folder", default="adv_samples/", type=str, 230 | help="folder for saving generated samples") 231 | parser.add_argument("--dump_path", default="", type=str, 232 | help="Path to dump logs") 233 | 234 | 235 | # Data 236 | parser.add_argument("--data_folder", required=True, type=str, 237 | help="folder in which to store data") 238 | parser.add_argument("--dataset", default="dbpedia14", type=str, 239 | choices=["dbpedia14", "ag_news", "imdb", "yelp", "mnli"], 240 | help="classification dataset to use") 241 | parser.add_argument("--mnli_option", default="matched", type=str, 242 | choices=["matched", "mismatched"], 243 | help="use matched or mismatched test set for MNLI") 244 | 245 | # Model 246 | parser.add_argument("--target_model", default="gpt2", type=str, 247 | help="type of 
model") 248 | parser.add_argument("--surrogate_model", default="gpt2", type=str, 249 | help="type of model") 250 | parser.add_argument("--finetune", default=False, type=bool_flag, 251 | help="load finetuned model") 252 | 253 | # Attack setting 254 | parser.add_argument("--start_index", default=0, type=int, 255 | help="starting sample index") 256 | parser.add_argument("--end_index", default=1000, type=int, 257 | help="end sample index") 258 | parser.add_argument("--num_samples", default=100, type=int, 259 | help="number of samples per split") 260 | parser.add_argument("--num_iters", default=100, type=int, 261 | help="number of epochs to train for") 262 | parser.add_argument("--batch_size", default=10, type=int, 263 | help="batch size for evaluation") 264 | parser.add_argument("--attack_target", default="premise", type=str, 265 | choices=["premise", "hypothesis"], 266 | help="attack either the premise or hypothesis for MNLI") 267 | parser.add_argument("--adv_loss", default="cw", type=str, 268 | choices=["cw", "ce"], 269 | help="adversarial loss") 270 | parser.add_argument("--constraint", default="bertscore_idf", type=str, 271 | choices=["cosine", "bertscore", "bertscore_idf"], 272 | help="constraint function") 273 | parser.add_argument("--lr", default=3e-1, type=float, 274 | help="learning rate") 275 | parser.add_argument("--kappa", default=5, type=float, 276 | help="CW loss margin") 277 | parser.add_argument("--embed_layer", default=-1, type=int, 278 | help="which layer of LM to extract embeddings from") 279 | parser.add_argument("--lam_sim", default=1, type=float, 280 | help="embedding similarity regularizer") 281 | parser.add_argument("--lam_perp", default=1, type=float, 282 | help="(log) perplexity regularizer") 283 | parser.add_argument("--print_every", default=100, type=int, 284 | help="print result every x samples") 285 | parser.add_argument("--gumbel_samples", default=100, type=int, 286 | help="number of gumbel samples; if 0, use argmax") 287 | 288 | args = parser.parse_args() 289 | 290 | main(args) -------------------------------------------------------------------------------- /whitebox_attack.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2015-present, Facebook, Inc. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the CC-by-NC license found in the 5 | # LICENSE file in the root directory of this source tree. 
6 | # 7 | from bert_score.utils import get_idf_dict 8 | from datasets import load_dataset 9 | from transformers import AutoTokenizer, AutoModelForSequenceClassification, AutoModelForCausalLM 10 | import transformers 11 | import math 12 | import argparse 13 | import math 14 | import jiwer 15 | import numpy as np 16 | import os 17 | import warnings 18 | transformers.logging.set_verbosity(transformers.logging.ERROR) 19 | import time 20 | import torch 21 | import torch.nn.functional as F 22 | 23 | from src.dataset import load_data 24 | from src.utils import bool_flag, get_output_file, print_args, load_gpt2_from_dict 25 | 26 | 27 | def wer(x, y): 28 | x = " ".join(["%d" % i for i in x]) 29 | y = " ".join(["%d" % i for i in y]) 30 | 31 | return jiwer.wer(x, y) 32 | 33 | 34 | def bert_score(refs, cands, weights=None): 35 | refs_norm = refs / refs.norm(2, -1).unsqueeze(-1) 36 | if weights is not None: 37 | refs_norm *= weights[:, None] 38 | else: 39 | refs_norm /= refs.size(1) 40 | cands_norm = cands / cands.norm(2, -1).unsqueeze(-1) 41 | cosines = refs_norm @ cands_norm.transpose(1, 2) 42 | # remove first and last tokens; only works when refs and cands all have equal length (!!!) 43 | cosines = cosines[:, 1:-1, 1:-1] 44 | R = cosines.max(-1)[0].sum(1) 45 | return R 46 | 47 | 48 | def log_perplexity(logits, coeffs): 49 | shift_logits = logits[:, :-1, :].contiguous() 50 | shift_coeffs = coeffs[:, 1:, :].contiguous() 51 | shift_logits = shift_logits[:, :, :shift_coeffs.size(2)] 52 | return -(shift_coeffs * F.log_softmax(shift_logits, dim=-1)).sum(-1).mean() 53 | 54 | 55 | def main(args): 56 | pretrained = args.model.startswith('textattack') 57 | output_file = get_output_file(args, args.model, args.start_index, args.start_index + args.num_samples) 58 | output_file = os.path.join(args.adv_samples_folder, output_file) 59 | print(f"Outputting files to {output_file}") 60 | if os.path.exists(output_file): 61 | print('Skipping batch as it has already been completed.') 62 | exit() 63 | 64 | # Load dataset 65 | dataset, num_labels = load_data(args) 66 | label_perm = lambda x: x 67 | if pretrained and args.model == 'textattack/bert-base-uncased-MNLI': 68 | label_perm = lambda x: (x + 1) % 3 69 | 70 | # Load tokenizer, model, and reference model 71 | tokenizer = AutoTokenizer.from_pretrained(args.model, use_fast=True) 72 | tokenizer.model_max_length = 512 73 | model = AutoModelForSequenceClassification.from_pretrained(args.model, num_labels=num_labels).cuda() 74 | if not pretrained: 75 | # Load model to attack 76 | suffix = '_finetune' if args.finetune else '' 77 | model_checkpoint = os.path.join(args.result_folder, '%s_%s%s.pth' % (args.model.replace('/', '-'), args.dataset, suffix)) 78 | print('Loading checkpoint: %s' % model_checkpoint) 79 | model.load_state_dict(torch.load(model_checkpoint)) 80 | tokenizer.model_max_length = 512 81 | if args.model == 'gpt2': 82 | tokenizer.padding_side = "right" 83 | tokenizer.pad_token = tokenizer.eos_token 84 | model.config.pad_token_id = model.config.eos_token_id 85 | 86 | if 'bert-base-uncase' in args.model: 87 | # for BERT, load GPT-2 trained on BERT tokenizer 88 | ref_model = load_gpt2_from_dict("%s/transformer_wikitext-103.pth" % args.gpt2_checkpoint_folder, output_hidden_states=True).cuda() 89 | else: 90 | ref_model = AutoModelForCausalLM.from_pretrained(args.model, output_hidden_states=True).cuda() 91 | with torch.no_grad(): 92 | embeddings = model.get_input_embeddings()(torch.arange(0, tokenizer.vocab_size).long().cuda()) 93 | ref_embeddings = 
ref_model.get_input_embeddings()(torch.arange(0, tokenizer.vocab_size).long().cuda()) 94 | 95 | # encode dataset using tokenizer 96 | if args.dataset == "mnli": 97 | testset_key = "validation_%s" % args.mnli_option 98 | preprocess_function = lambda examples: tokenizer( 99 | examples['premise'], examples['hypothesis'], max_length=256, truncation=True) 100 | else: 101 | text_key = 'text' if (args.dataset in ["ag_news", "imdb", "yelp"]) else 'sentence' 102 | testset_key = 'test' if (args.dataset in ["ag_news", "imdb", "yelp"]) else 'validation' 103 | preprocess_function = lambda examples: tokenizer(examples[text_key], max_length=256, truncation=True) 104 | encoded_dataset = dataset.map(preprocess_function, batched=True) 105 | 106 | # Compute idf dictionary for BERTScore 107 | if args.constraint == "bertscore_idf": 108 | if args.dataset == 'mnli': 109 | idf_dict = get_idf_dict(dataset['train']['premise'] + dataset['train']['hypothesis'], tokenizer, nthreads=20) 110 | else: 111 | idf_dict = get_idf_dict(dataset['train'][text_key], tokenizer, nthreads=20) 112 | 113 | if args.dataset == 'mnli': 114 | adv_log_coeffs = {'premise': [], 'hypothesis': []} 115 | clean_texts = {'premise': [], 'hypothesis': []} 116 | adv_texts = {'premise': [], 'hypothesis': []} 117 | else: 118 | adv_log_coeffs, clean_texts, adv_texts = [], [], [] 119 | clean_logits = [] 120 | adv_logits = [] 121 | token_errors = [] 122 | times = [] 123 | 124 | assert args.start_index < len(encoded_dataset[testset_key]), 'Starting index %d is larger than dataset length %d' % (args.start_index, len(encoded_dataset[testset_key])) 125 | end_index = min(args.start_index + args.num_samples, len(encoded_dataset[testset_key])) 126 | adv_losses, ref_losses, perp_losses, entropies = torch.zeros(end_index - args.start_index, args.num_iters), torch.zeros(end_index - args.start_index, args.num_iters), torch.zeros(end_index - args.start_index, args.num_iters), torch.zeros(end_index - args.start_index, args.num_iters) 127 | for idx in range(args.start_index, end_index): 128 | input_ids = encoded_dataset[testset_key]['input_ids'][idx] 129 | if args.model == 'gpt2': 130 | token_type_ids = None 131 | else: 132 | token_type_ids = encoded_dataset[testset_key]['token_type_ids'][idx] 133 | label = label_perm(encoded_dataset[testset_key]['label'][idx]) 134 | clean_logit = model(input_ids=torch.LongTensor(input_ids).unsqueeze(0).cuda(), 135 | token_type_ids=(None if token_type_ids is None else torch.LongTensor(token_type_ids).unsqueeze(0).cuda())).logits.data.cpu() 136 | print('LABEL') 137 | print(label) 138 | print('TEXT') 139 | print(tokenizer.decode(input_ids)) 140 | print('LOGITS') 141 | print(clean_logit) 142 | 143 | forbidden = np.zeros(len(input_ids)).astype('bool') 144 | # set [CLS] and [SEP] tokens to forbidden 145 | forbidden[0] = True 146 | forbidden[-1] = True 147 | offset = 0 if args.model == 'gpt2' else 1 148 | if args.dataset == 'mnli': 149 | # set either premise or hypothesis to forbidden 150 | premise_length = len(tokenizer.encode(encoded_dataset[testset_key]['premise'][idx])) 151 | input_ids_premise = input_ids[offset:(premise_length-offset)] 152 | input_ids_hypothesis = input_ids[premise_length:len(input_ids)-offset] 153 | if args.attack_target == "hypothesis": 154 | forbidden[:premise_length] = True 155 | else: 156 | forbidden[(premise_length-offset):] = True 157 | forbidden_indices = np.arange(0, len(input_ids))[forbidden] 158 | forbidden_indices = torch.from_numpy(forbidden_indices).cuda() 159 | token_type_ids_batch = (None if 
token_type_ids is None else torch.LongTensor(token_type_ids).unsqueeze(0).repeat(args.batch_size, 1).cuda()) 160 | 161 | start_time = time.time() 162 | with torch.no_grad(): 163 | orig_output = ref_model(torch.LongTensor(input_ids).cuda().unsqueeze(0)).hidden_states[args.embed_layer] 164 | if args.constraint.startswith('bertscore'): 165 | if args.constraint == "bertscore_idf": 166 | ref_weights = torch.FloatTensor([idf_dict[idx] for idx in input_ids]).cuda() 167 | ref_weights /= ref_weights.sum() 168 | else: 169 | ref_weights = None 170 | elif args.constraint == 'cosine': 171 | # GPT-2 reference model uses last token embedding instead of pooling 172 | if args.model == 'gpt2' or 'bert-base-uncased' in args.model: 173 | orig_output = orig_output[:, -1] 174 | else: 175 | orig_output = orig_output.mean(1) 176 | log_coeffs = torch.zeros(len(input_ids), embeddings.size(0)) 177 | indices = torch.arange(log_coeffs.size(0)).long() 178 | log_coeffs[indices, torch.LongTensor(input_ids)] = args.initial_coeff 179 | log_coeffs = log_coeffs.cuda() 180 | log_coeffs.requires_grad = True 181 | 182 | optimizer = torch.optim.Adam([log_coeffs], lr=args.lr) 183 | start = time.time() 184 | for i in range(args.num_iters): 185 | optimizer.zero_grad() 186 | coeffs = F.gumbel_softmax(log_coeffs.unsqueeze(0).repeat(args.batch_size, 1, 1), hard=False) # B x T x V 187 | inputs_embeds = (coeffs @ embeddings[None, :, :]) # B x T x D 188 | pred = model(inputs_embeds=inputs_embeds, token_type_ids=token_type_ids_batch).logits 189 | if args.adv_loss == 'ce': 190 | adv_loss = -F.cross_entropy(pred, label * torch.ones(args.batch_size).long().cuda()) 191 | elif args.adv_loss == 'cw': 192 | top_preds = pred.sort(descending=True)[1] 193 | correct = (top_preds[:, 0] == label).long() 194 | indices = top_preds.gather(1, correct.view(-1, 1)) 195 | adv_loss = (pred[:, label] - pred.gather(1, indices).squeeze() + args.kappa).clamp(min=0).mean() 196 | 197 | # Similarity constraint 198 | ref_embeds = (coeffs @ ref_embeddings[None, :, :]) 199 | pred = ref_model(inputs_embeds=ref_embeds) 200 | if args.lam_sim > 0: 201 | output = pred.hidden_states[args.embed_layer] 202 | if args.constraint.startswith('bertscore'): 203 | ref_loss = -args.lam_sim * bert_score(orig_output, output, weights=ref_weights).mean() 204 | else: 205 | if args.model == 'gpt2' or 'bert-base-uncased' in args.model: 206 | output = output[:, -1] 207 | else: 208 | output = output.mean(1) 209 | cosine = (output * orig_output).sum(1) / output.norm(2, 1) / orig_output.norm(2, 1) 210 | ref_loss = -args.lam_sim * cosine.mean() 211 | else: 212 | ref_loss = torch.Tensor([0]).cuda() 213 | 214 | # (log) perplexity constraint 215 | if args.lam_perp > 0: 216 | perp_loss = args.lam_perp * log_perplexity(pred.logits, coeffs) 217 | else: 218 | perp_loss = torch.Tensor([0]).cuda() 219 | 220 | # Compute loss and backward 221 | total_loss = adv_loss + ref_loss + perp_loss 222 | total_loss.backward() 223 | 224 | entropy = torch.sum(-F.log_softmax(log_coeffs, dim=1) * F.softmax(log_coeffs, dim=1)) 225 | if i % args.print_every == 0: 226 | print('Iteration %d: loss = %.4f, adv_loss = %.4f, ref_loss = %.4f, perp_loss = %.4f, entropy=%.4f, time=%.2f' % ( 227 | i+1, total_loss.item(), adv_loss.item(), ref_loss.item(), perp_loss.item(), entropy.item(), time.time() - start)) 228 | 229 | # Gradient step 230 | log_coeffs.grad.index_fill_(0, forbidden_indices, 0) 231 | optimizer.step() 232 | 233 | # Log statistics 234 | adv_losses[idx - args.start_index, i] = adv_loss.detach().item() 235 | 
ref_losses[idx - args.start_index, i] = ref_loss.detach().item() 236 | perp_losses[idx - args.start_index, i] = perp_loss.detach().item() 237 | entropies[idx - args.start_index, i] = entropy.detach().item() 238 | times.append(time.time() - start_time) 239 | 240 | print('CLEAN TEXT') 241 | if args.dataset == 'mnli': 242 | clean_premise = tokenizer.decode(input_ids_premise) 243 | clean_hypothesis = tokenizer.decode(input_ids_hypothesis) 244 | clean_texts['premise'].append(clean_premise) 245 | clean_texts['hypothesis'].append(clean_hypothesis) 246 | print('%s %s' % (clean_premise, clean_hypothesis)) 247 | else: 248 | clean_text = tokenizer.decode(input_ids[offset:(len(input_ids)-offset)]) 249 | clean_texts.append(clean_text) 250 | print(clean_text) 251 | clean_logits.append(clean_logit) 252 | 253 | print('ADVERSARIAL TEXT') 254 | with torch.no_grad(): 255 | for j in range(args.gumbel_samples): 256 | adv_ids = F.gumbel_softmax(log_coeffs, hard=True).argmax(1) 257 | if args.dataset == 'mnli': 258 | if args.attack_target == 'premise': 259 | adv_ids_premise = adv_ids[offset:(premise_length-offset)].cpu().tolist() 260 | adv_ids_hypothesis = input_ids_hypothesis 261 | else: 262 | adv_ids_premise = input_ids_premise 263 | adv_ids_hypothesis = adv_ids[premise_length:len(adv_ids)-offset].cpu().tolist() 264 | adv_premise = tokenizer.decode(adv_ids_premise) 265 | adv_hypothesis = tokenizer.decode(adv_ids_hypothesis) 266 | x = tokenizer(adv_premise, adv_hypothesis, max_length=256, truncation=True, return_tensors='pt') 267 | token_errors.append(wer(input_ids_premise + input_ids_hypothesis, x['input_ids'][0])) 268 | else: 269 | adv_ids = adv_ids[offset:len(adv_ids)-offset].cpu().tolist() 270 | adv_text = tokenizer.decode(adv_ids) 271 | x = tokenizer(adv_text, max_length=256, truncation=True, return_tensors='pt') 272 | token_errors.append(wer(adv_ids, x['input_ids'][0])) 273 | adv_logit = model(input_ids=x['input_ids'].cuda(), attention_mask=x['attention_mask'].cuda(), 274 | token_type_ids=(x['token_type_ids'].cuda() if 'token_type_ids' in x else None)).logits.data.cpu() 275 | if adv_logit.argmax() != label or j == args.gumbel_samples - 1: 276 | if args.dataset == 'mnli': 277 | adv_texts['premise'].append(adv_premise) 278 | adv_texts['hypothesis'].append(adv_hypothesis) 279 | print('%s %s' % (adv_premise, adv_hypothesis)) 280 | else: 281 | adv_texts.append(adv_text) 282 | print(adv_text) 283 | adv_logits.append(adv_logit) 284 | break 285 | 286 | # remove special tokens from adv_log_coeffs 287 | if args.dataset == 'mnli': 288 | adv_log_coeffs['premise'].append(log_coeffs[offset:(premise_length-offset), :].cpu()) 289 | adv_log_coeffs['hypothesis'].append(log_coeffs[premise_length:(log_coeffs.size(0)-offset), :].cpu()) 290 | else: 291 | adv_log_coeffs.append(log_coeffs[offset:(log_coeffs.size(0)-offset), :].cpu()) # size T x V 292 | 293 | print('') 294 | print('CLEAN LOGITS') 295 | print(clean_logit) # size 1 x C 296 | print('ADVERSARIAL LOGITS') 297 | print(adv_logit) # size 1 x C 298 | 299 | print("Token Error Rate: %.4f (over %d tokens)" % (sum(token_errors) / len(token_errors), len(token_errors))) 300 | torch.save({ 301 | 'adv_log_coeffs': adv_log_coeffs, 302 | 'adv_logits': torch.cat(adv_logits, 0), # size N x C 303 | 'adv_losses': adv_losses, 304 | 'adv_texts': adv_texts, 305 | 'clean_logits': torch.cat(clean_logits, 0), 306 | 'clean_texts': clean_texts, 307 | 'entropies': entropies, 308 | 'labels': list(map(label_perm, encoded_dataset[testset_key]['label'][args.start_index:end_index])), 309 | 
'perp_losses': perp_losses, 310 | 'ref_losses': ref_losses, 311 | 'times': times, 312 | 'token_error': token_errors, 313 | }, output_file) 314 | 315 | 316 | if __name__ == "__main__": 317 | parser = argparse.ArgumentParser(description="White-box attack.") 318 | 319 | # Bookkeeping 320 | parser.add_argument("--result_folder", default="result/", type=str, 321 | help="folder for loading trained models") 322 | parser.add_argument("--gpt2_checkpoint_folder", default="result/", type=str, 323 | help="folder for loading GPT2 model trained with BERT tokenizer") 324 | parser.add_argument("--adv_samples_folder", default="adv_samples/", type=str, 325 | help="folder for saving generated samples") 326 | parser.add_argument("--dump_path", default="", type=str, 327 | help="Path to dump logs") 328 | 329 | # Data 330 | parser.add_argument("--data_folder", required=True, type=str, 331 | help="folder in which to store data") 332 | parser.add_argument("--dataset", default="dbpedia14", type=str, 333 | choices=["dbpedia14", "ag_news", "imdb", "yelp", "mnli"], 334 | help="classification dataset to use") 335 | parser.add_argument("--mnli_option", default="matched", type=str, 336 | choices=["matched", "mismatched"], 337 | help="use matched or mismatched test set for MNLI") 338 | parser.add_argument("--num_samples", default=1, type=int, 339 | help="number of samples to attack") 340 | 341 | # Model 342 | parser.add_argument("--model", default="gpt2", type=str, 343 | help="type of model") 344 | parser.add_argument("--finetune", default=False, type=bool_flag, 345 | help="load finetuned model") 346 | 347 | # Attack setting 348 | parser.add_argument("--start_index", default=0, type=int, 349 | help="starting sample index") 350 | parser.add_argument("--num_iters", default=100, type=int, 351 | help="number of optimization iterations per attacked sample") 352 | parser.add_argument("--batch_size", default=10, type=int, 353 | help="batch size for gumbel-softmax samples") 354 | parser.add_argument("--attack_target", default="premise", type=str, 355 | choices=["premise", "hypothesis"], 356 | help="attack either the premise or hypothesis for MNLI") 357 | parser.add_argument("--initial_coeff", default=15, type=int, 358 | help="initial value of the log coefficients at the original token indices") 359 | parser.add_argument("--adv_loss", default="cw", type=str, 360 | choices=["cw", "ce"], 361 | help="adversarial loss") 362 | parser.add_argument("--constraint", default="bertscore_idf", type=str, 363 | choices=["cosine", "bertscore", "bertscore_idf"], 364 | help="constraint function") 365 | parser.add_argument("--lr", default=3e-1, type=float, 366 | help="learning rate") 367 | parser.add_argument("--kappa", default=5, type=float, 368 | help="CW loss margin") 369 | parser.add_argument("--embed_layer", default=-1, type=int, 370 | help="which layer of LM to extract embeddings from") 371 | parser.add_argument("--lam_sim", default=1, type=float, 372 | help="embedding similarity regularizer") 373 | parser.add_argument("--lam_perp", default=1, type=float, 374 | help="(log) perplexity regularizer") 375 | parser.add_argument("--print_every", default=10, type=int, 376 | help="print loss every x iterations") 377 | parser.add_argument("--gumbel_samples", default=100, type=int, 378 | help="number of gumbel samples; if 0, use argmax") 379 | 380 | args = parser.parse_args() 381 | print_args(args) 382 | main(args) -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Attribution-NonCommercial 4.0
International 2 | 3 | ======================================================================= 4 | 5 | Creative Commons Corporation ("Creative Commons") is not a law firm and 6 | does not provide legal services or legal advice. Distribution of 7 | Creative Commons public licenses does not create a lawyer-client or 8 | other relationship. Creative Commons makes its licenses and related 9 | information available on an "as-is" basis. Creative Commons gives no 10 | warranties regarding its licenses, any material licensed under their 11 | terms and conditions, or any related information. Creative Commons 12 | disclaims all liability for damages resulting from their use to the 13 | fullest extent possible. 14 | 15 | Using Creative Commons Public Licenses 16 | 17 | Creative Commons public licenses provide a standard set of terms and 18 | conditions that creators and other rights holders may use to share 19 | original works of authorship and other material subject to copyright 20 | and certain other rights specified in the public license below. The 21 | following considerations are for informational purposes only, are not 22 | exhaustive, and do not form part of our licenses. 23 | 24 | Considerations for licensors: Our public licenses are 25 | intended for use by those authorized to give the public 26 | permission to use material in ways otherwise restricted by 27 | copyright and certain other rights. Our licenses are 28 | irrevocable. Licensors should read and understand the terms 29 | and conditions of the license they choose before applying it. 30 | Licensors should also secure all rights necessary before 31 | applying our licenses so that the public can reuse the 32 | material as expected. Licensors should clearly mark any 33 | material not subject to the license. This includes other CC- 34 | licensed material, or material used under an exception or 35 | limitation to copyright. More considerations for licensors: 36 | wiki.creativecommons.org/Considerations_for_licensors 37 | 38 | Considerations for the public: By using one of our public 39 | licenses, a licensor grants the public permission to use the 40 | licensed material under specified terms and conditions. If 41 | the licensor's permission is not necessary for any reason--for 42 | example, because of any applicable exception or limitation to 43 | copyright--then that use is not regulated by the license. Our 44 | licenses grant only permissions under copyright and certain 45 | other rights that a licensor has authority to grant. Use of 46 | the licensed material may still be restricted for other 47 | reasons, including because others have copyright or other 48 | rights in the material. A licensor may make special requests, 49 | such as asking that all changes be marked or described. 50 | Although not required by our licenses, you are encouraged to 51 | respect those requests where reasonable. More_considerations 52 | for the public: 53 | wiki.creativecommons.org/Considerations_for_licensees 54 | 55 | ======================================================================= 56 | 57 | Creative Commons Attribution-NonCommercial 4.0 International Public 58 | License 59 | 60 | By exercising the Licensed Rights (defined below), You accept and agree 61 | to be bound by the terms and conditions of this Creative Commons 62 | Attribution-NonCommercial 4.0 International Public License ("Public 63 | License"). 
To the extent this Public License may be interpreted as a 64 | contract, You are granted the Licensed Rights in consideration of Your 65 | acceptance of these terms and conditions, and the Licensor grants You 66 | such rights in consideration of benefits the Licensor receives from 67 | making the Licensed Material available under these terms and 68 | conditions. 69 | 70 | Section 1 -- Definitions. 71 | 72 | a. Adapted Material means material subject to Copyright and Similar 73 | Rights that is derived from or based upon the Licensed Material 74 | and in which the Licensed Material is translated, altered, 75 | arranged, transformed, or otherwise modified in a manner requiring 76 | permission under the Copyright and Similar Rights held by the 77 | Licensor. For purposes of this Public License, where the Licensed 78 | Material is a musical work, performance, or sound recording, 79 | Adapted Material is always produced where the Licensed Material is 80 | synched in timed relation with a moving image. 81 | 82 | b. Adapter's License means the license You apply to Your Copyright 83 | and Similar Rights in Your contributions to Adapted Material in 84 | accordance with the terms and conditions of this Public License. 85 | 86 | c. Copyright and Similar Rights means copyright and/or similar rights 87 | closely related to copyright including, without limitation, 88 | performance, broadcast, sound recording, and Sui Generis Database 89 | Rights, without regard to how the rights are labeled or 90 | categorized. For purposes of this Public License, the rights 91 | specified in Section 2(b)(1)-(2) are not Copyright and Similar 92 | Rights. 93 | d. Effective Technological Measures means those measures that, in the 94 | absence of proper authority, may not be circumvented under laws 95 | fulfilling obligations under Article 11 of the WIPO Copyright 96 | Treaty adopted on December 20, 1996, and/or similar international 97 | agreements. 98 | 99 | e. Exceptions and Limitations means fair use, fair dealing, and/or 100 | any other exception or limitation to Copyright and Similar Rights 101 | that applies to Your use of the Licensed Material. 102 | 103 | f. Licensed Material means the artistic or literary work, database, 104 | or other material to which the Licensor applied this Public 105 | License. 106 | 107 | g. Licensed Rights means the rights granted to You subject to the 108 | terms and conditions of this Public License, which are limited to 109 | all Copyright and Similar Rights that apply to Your use of the 110 | Licensed Material and that the Licensor has authority to license. 111 | 112 | h. Licensor means the individual(s) or entity(ies) granting rights 113 | under this Public License. 114 | 115 | i. NonCommercial means not primarily intended for or directed towards 116 | commercial advantage or monetary compensation. For purposes of 117 | this Public License, the exchange of the Licensed Material for 118 | other material subject to Copyright and Similar Rights by digital 119 | file-sharing or similar means is NonCommercial provided there is 120 | no payment of monetary compensation in connection with the 121 | exchange. 122 | 123 | j. 
Share means to provide material to the public by any means or 124 | process that requires permission under the Licensed Rights, such 125 | as reproduction, public display, public performance, distribution, 126 | dissemination, communication, or importation, and to make material 127 | available to the public including in ways that members of the 128 | public may access the material from a place and at a time 129 | individually chosen by them. 130 | 131 | k. Sui Generis Database Rights means rights other than copyright 132 | resulting from Directive 96/9/EC of the European Parliament and of 133 | the Council of 11 March 1996 on the legal protection of databases, 134 | as amended and/or succeeded, as well as other essentially 135 | equivalent rights anywhere in the world. 136 | 137 | l. You means the individual or entity exercising the Licensed Rights 138 | under this Public License. Your has a corresponding meaning. 139 | 140 | Section 2 -- Scope. 141 | 142 | a. License grant. 143 | 144 | 1. Subject to the terms and conditions of this Public License, 145 | the Licensor hereby grants You a worldwide, royalty-free, 146 | non-sublicensable, non-exclusive, irrevocable license to 147 | exercise the Licensed Rights in the Licensed Material to: 148 | 149 | a. reproduce and Share the Licensed Material, in whole or 150 | in part, for NonCommercial purposes only; and 151 | 152 | b. produce, reproduce, and Share Adapted Material for 153 | NonCommercial purposes only. 154 | 155 | 2. Exceptions and Limitations. For the avoidance of doubt, where 156 | Exceptions and Limitations apply to Your use, this Public 157 | License does not apply, and You do not need to comply with 158 | its terms and conditions. 159 | 160 | 3. Term. The term of this Public License is specified in Section 161 | 6(a). 162 | 163 | 4. Media and formats; technical modifications allowed. The 164 | Licensor authorizes You to exercise the Licensed Rights in 165 | all media and formats whether now known or hereafter created, 166 | and to make technical modifications necessary to do so. The 167 | Licensor waives and/or agrees not to assert any right or 168 | authority to forbid You from making technical modifications 169 | necessary to exercise the Licensed Rights, including 170 | technical modifications necessary to circumvent Effective 171 | Technological Measures. For purposes of this Public License, 172 | simply making modifications authorized by this Section 2(a) 173 | (4) never produces Adapted Material. 174 | 175 | 5. Downstream recipients. 176 | 177 | a. Offer from the Licensor -- Licensed Material. Every 178 | recipient of the Licensed Material automatically 179 | receives an offer from the Licensor to exercise the 180 | Licensed Rights under the terms and conditions of this 181 | Public License. 182 | 183 | b. No downstream restrictions. You may not offer or impose 184 | any additional or different terms or conditions on, or 185 | apply any Effective Technological Measures to, the 186 | Licensed Material if doing so restricts exercise of the 187 | Licensed Rights by any recipient of the Licensed 188 | Material. 189 | 190 | 6. No endorsement. Nothing in this Public License constitutes or 191 | may be construed as permission to assert or imply that You 192 | are, or that Your use of the Licensed Material is, connected 193 | with, or sponsored, endorsed, or granted official status by, 194 | the Licensor or others designated to receive attribution as 195 | provided in Section 3(a)(1)(A)(i). 196 | 197 | b. Other rights. 198 | 199 | 1. 
Moral rights, such as the right of integrity, are not 200 | licensed under this Public License, nor are publicity, 201 | privacy, and/or other similar personality rights; however, to 202 | the extent possible, the Licensor waives and/or agrees not to 203 | assert any such rights held by the Licensor to the limited 204 | extent necessary to allow You to exercise the Licensed 205 | Rights, but not otherwise. 206 | 207 | 2. Patent and trademark rights are not licensed under this 208 | Public License. 209 | 210 | 3. To the extent possible, the Licensor waives any right to 211 | collect royalties from You for the exercise of the Licensed 212 | Rights, whether directly or through a collecting society 213 | under any voluntary or waivable statutory or compulsory 214 | licensing scheme. In all other cases the Licensor expressly 215 | reserves any right to collect such royalties, including when 216 | the Licensed Material is used other than for NonCommercial 217 | purposes. 218 | 219 | Section 3 -- License Conditions. 220 | 221 | Your exercise of the Licensed Rights is expressly made subject to the 222 | following conditions. 223 | 224 | a. Attribution. 225 | 226 | 1. If You Share the Licensed Material (including in modified 227 | form), You must: 228 | 229 | a. retain the following if it is supplied by the Licensor 230 | with the Licensed Material: 231 | 232 | i. identification of the creator(s) of the Licensed 233 | Material and any others designated to receive 234 | attribution, in any reasonable manner requested by 235 | the Licensor (including by pseudonym if 236 | designated); 237 | 238 | ii. a copyright notice; 239 | 240 | iii. a notice that refers to this Public License; 241 | 242 | iv. a notice that refers to the disclaimer of 243 | warranties; 244 | 245 | v. a URI or hyperlink to the Licensed Material to the 246 | extent reasonably practicable; 247 | 248 | b. indicate if You modified the Licensed Material and 249 | retain an indication of any previous modifications; and 250 | 251 | c. indicate the Licensed Material is licensed under this 252 | Public License, and include the text of, or the URI or 253 | hyperlink to, this Public License. 254 | 255 | 2. You may satisfy the conditions in Section 3(a)(1) in any 256 | reasonable manner based on the medium, means, and context in 257 | which You Share the Licensed Material. For example, it may be 258 | reasonable to satisfy the conditions by providing a URI or 259 | hyperlink to a resource that includes the required 260 | information. 261 | 262 | 3. If requested by the Licensor, You must remove any of the 263 | information required by Section 3(a)(1)(A) to the extent 264 | reasonably practicable. 265 | 266 | 4. If You Share Adapted Material You produce, the Adapter's 267 | License You apply must not prevent recipients of the Adapted 268 | Material from complying with this Public License. 269 | 270 | Section 4 -- Sui Generis Database Rights. 271 | 272 | Where the Licensed Rights include Sui Generis Database Rights that 273 | apply to Your use of the Licensed Material: 274 | 275 | a. for the avoidance of doubt, Section 2(a)(1) grants You the right 276 | to extract, reuse, reproduce, and Share all or a substantial 277 | portion of the contents of the database for NonCommercial purposes 278 | only; 279 | 280 | b. 
if You include all or a substantial portion of the database 281 | contents in a database in which You have Sui Generis Database 282 | Rights, then the database in which You have Sui Generis Database 283 | Rights (but not its individual contents) is Adapted Material; and 284 | 285 | c. You must comply with the conditions in Section 3(a) if You Share 286 | all or a substantial portion of the contents of the database. 287 | 288 | For the avoidance of doubt, this Section 4 supplements and does not 289 | replace Your obligations under this Public License where the Licensed 290 | Rights include other Copyright and Similar Rights. 291 | 292 | Section 5 -- Disclaimer of Warranties and Limitation of Liability. 293 | 294 | a. UNLESS OTHERWISE SEPARATELY UNDERTAKEN BY THE LICENSOR, TO THE 295 | EXTENT POSSIBLE, THE LICENSOR OFFERS THE LICENSED MATERIAL AS-IS 296 | AND AS-AVAILABLE, AND MAKES NO REPRESENTATIONS OR WARRANTIES OF 297 | ANY KIND CONCERNING THE LICENSED MATERIAL, WHETHER EXPRESS, 298 | IMPLIED, STATUTORY, OR OTHER. THIS INCLUDES, WITHOUT LIMITATION, 299 | WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A PARTICULAR 300 | PURPOSE, NON-INFRINGEMENT, ABSENCE OF LATENT OR OTHER DEFECTS, 301 | ACCURACY, OR THE PRESENCE OR ABSENCE OF ERRORS, WHETHER OR NOT 302 | KNOWN OR DISCOVERABLE. WHERE DISCLAIMERS OF WARRANTIES ARE NOT 303 | ALLOWED IN FULL OR IN PART, THIS DISCLAIMER MAY NOT APPLY TO YOU. 304 | 305 | b. TO THE EXTENT POSSIBLE, IN NO EVENT WILL THE LICENSOR BE LIABLE 306 | TO YOU ON ANY LEGAL THEORY (INCLUDING, WITHOUT LIMITATION, 307 | NEGLIGENCE) OR OTHERWISE FOR ANY DIRECT, SPECIAL, INDIRECT, 308 | INCIDENTAL, CONSEQUENTIAL, PUNITIVE, EXEMPLARY, OR OTHER LOSSES, 309 | COSTS, EXPENSES, OR DAMAGES ARISING OUT OF THIS PUBLIC LICENSE OR 310 | USE OF THE LICENSED MATERIAL, EVEN IF THE LICENSOR HAS BEEN 311 | ADVISED OF THE POSSIBILITY OF SUCH LOSSES, COSTS, EXPENSES, OR 312 | DAMAGES. WHERE A LIMITATION OF LIABILITY IS NOT ALLOWED IN FULL OR 313 | IN PART, THIS LIMITATION MAY NOT APPLY TO YOU. 314 | 315 | c. The disclaimer of warranties and limitation of liability provided 316 | above shall be interpreted in a manner that, to the extent 317 | possible, most closely approximates an absolute disclaimer and 318 | waiver of all liability. 319 | 320 | Section 6 -- Term and Termination. 321 | 322 | a. This Public License applies for the term of the Copyright and 323 | Similar Rights licensed here. However, if You fail to comply with 324 | this Public License, then Your rights under this Public License 325 | terminate automatically. 326 | 327 | b. Where Your right to use the Licensed Material has terminated under 328 | Section 6(a), it reinstates: 329 | 330 | 1. automatically as of the date the violation is cured, provided 331 | it is cured within 30 days of Your discovery of the 332 | violation; or 333 | 334 | 2. upon express reinstatement by the Licensor. 335 | 336 | For the avoidance of doubt, this Section 6(b) does not affect any 337 | right the Licensor may have to seek remedies for Your violations 338 | of this Public License. 339 | 340 | c. For the avoidance of doubt, the Licensor may also offer the 341 | Licensed Material under separate terms or conditions or stop 342 | distributing the Licensed Material at any time; however, doing so 343 | will not terminate this Public License. 344 | 345 | d. Sections 1, 5, 6, 7, and 8 survive termination of this Public 346 | License. 347 | 348 | Section 7 -- Other Terms and Conditions. 349 | 350 | a. 
The Licensor shall not be bound by any additional or different 351 | terms or conditions communicated by You unless expressly agreed. 352 | 353 | b. Any arrangements, understandings, or agreements regarding the 354 | Licensed Material not stated herein are separate from and 355 | independent of the terms and conditions of this Public License. 356 | 357 | Section 8 -- Interpretation. 358 | 359 | a. For the avoidance of doubt, this Public License does not, and 360 | shall not be interpreted to, reduce, limit, restrict, or impose 361 | conditions on any use of the Licensed Material that could lawfully 362 | be made without permission under this Public License. 363 | 364 | b. To the extent possible, if any provision of this Public License is 365 | deemed unenforceable, it shall be automatically reformed to the 366 | minimum extent necessary to make it enforceable. If the provision 367 | cannot be reformed, it shall be severed from this Public License 368 | without affecting the enforceability of the remaining terms and 369 | conditions. 370 | 371 | c. No term or condition of this Public License will be waived and no 372 | failure to comply consented to unless expressly agreed to by the 373 | Licensor. 374 | 375 | d. Nothing in this Public License constitutes or may be interpreted 376 | as a limitation upon, or waiver of, any privileges and immunities 377 | that apply to the Licensor or You, including from the legal 378 | processes of any jurisdiction or authority. 379 | 380 | ======================================================================= 381 | 382 | Creative Commons is not a party to its public 383 | licenses. Notwithstanding, Creative Commons may elect to apply one of 384 | its public licenses to material it publishes and in those instances 385 | will be considered the “Licensor.” The text of the Creative Commons 386 | public licenses is dedicated to the public domain under the CC0 Public 387 | Domain Dedication. Except for the limited purpose of indicating that 388 | material is shared under a Creative Commons public license or as 389 | otherwise permitted by the Creative Commons policies published at 390 | creativecommons.org/policies, Creative Commons does not authorize the 391 | use of the trademark "Creative Commons" or any other trademark or logo 392 | of Creative Commons without its prior written consent including, 393 | without limitation, in connection with any unauthorized modifications 394 | to any of its public licenses or any other arrangements, 395 | understandings, or agreements concerning use of licensed material. For 396 | the avoidance of doubt, this paragraph does not form part of the 397 | public licenses. 398 | 399 | Creative Commons may be contacted at creativecommons.org. 400 | 401 | 402 | --------------------------------------------------------------------------------