├── .flake8
├── .gitignore
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── advprompter.def
├── advprompteropt.py
├── conf
    ├── base.yaml
    ├── eval.yaml
    ├── eval_suffix_dataset.yaml
    ├── prompter
    │   ├── base_prompter.yaml
    │   ├── llama2.yaml
    │   └── tiny_llama.yaml
    ├── target_llm
    │   ├── base_target_llm.yaml
    │   ├── falcon_chat.yaml
    │   ├── gemma_chat.yaml
    │   ├── llama2_chat.yaml
    │   ├── llama3_chat.yaml
    │   ├── mistral_chat.yaml
    │   ├── pythia_chat.yaml
    │   ├── tiny_llama_chat.yaml
    │   └── vicuna_chat.yaml
    └── train.yaml
├── data
    ├── affirmative_prefixes.csv
    ├── harmful_behaviors
    │   └── dataset
    │   │   ├── test.csv
    │   │   ├── train.csv
    │   │   └── validation.csv
    └── test_prefixes.csv
├── llm.py
├── main.py
├── requirements.txt
├── sequence.py
└── utils.py


/.flake8:
--------------------------------------------------------------------------------
[flake8]
max-line-length = 88

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/.gitignore

--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/CODE_OF_CONDUCT.md

--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/CONTRIBUTING.md

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/LICENSE

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/README.md

--------------------------------------------------------------------------------
/advprompter.def:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/advprompter.def

--------------------------------------------------------------------------------
/advprompteropt.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/advprompteropt.py

--------------------------------------------------------------------------------
/conf/base.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/conf/base.yaml

--------------------------------------------------------------------------------
/conf/eval.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/conf/eval.yaml

--------------------------------------------------------------------------------
/conf/eval_suffix_dataset.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/conf/eval_suffix_dataset.yaml

--------------------------------------------------------------------------------
/conf/prompter/base_prompter.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/conf/prompter/base_prompter.yaml

--------------------------------------------------------------------------------
/conf/prompter/llama2.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/conf/prompter/llama2.yaml

--------------------------------------------------------------------------------
/conf/prompter/tiny_llama.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/conf/prompter/tiny_llama.yaml

--------------------------------------------------------------------------------
/conf/target_llm/base_target_llm.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/conf/target_llm/base_target_llm.yaml

--------------------------------------------------------------------------------
/conf/target_llm/falcon_chat.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/conf/target_llm/falcon_chat.yaml

--------------------------------------------------------------------------------
/conf/target_llm/gemma_chat.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/conf/target_llm/gemma_chat.yaml

--------------------------------------------------------------------------------
/conf/target_llm/llama2_chat.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/conf/target_llm/llama2_chat.yaml

--------------------------------------------------------------------------------
/conf/target_llm/llama3_chat.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/conf/target_llm/llama3_chat.yaml

--------------------------------------------------------------------------------
/conf/target_llm/mistral_chat.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/conf/target_llm/mistral_chat.yaml

--------------------------------------------------------------------------------
/conf/target_llm/pythia_chat.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/conf/target_llm/pythia_chat.yaml

--------------------------------------------------------------------------------
/conf/target_llm/tiny_llama_chat.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/conf/target_llm/tiny_llama_chat.yaml

--------------------------------------------------------------------------------
/conf/target_llm/vicuna_chat.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/conf/target_llm/vicuna_chat.yaml

--------------------------------------------------------------------------------
/conf/train.yaml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/conf/train.yaml

--------------------------------------------------------------------------------
/data/affirmative_prefixes.csv:
--------------------------------------------------------------------------------
"Sure"
"Here"

--------------------------------------------------------------------------------
/data/harmful_behaviors/dataset/test.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/data/harmful_behaviors/dataset/test.csv

--------------------------------------------------------------------------------
/data/harmful_behaviors/dataset/train.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/data/harmful_behaviors/dataset/train.csv

--------------------------------------------------------------------------------
/data/harmful_behaviors/dataset/validation.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/data/harmful_behaviors/dataset/validation.csv

--------------------------------------------------------------------------------
/data/test_prefixes.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/data/test_prefixes.csv

--------------------------------------------------------------------------------
/llm.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/llm.py

--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/main.py

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/requirements.txt

--------------------------------------------------------------------------------
/sequence.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/sequence.py

--------------------------------------------------------------------------------
/utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/advprompter/HEAD/utils.py

--------------------------------------------------------------------------------