├── 1
├── .gitignore
├── .vscode
│   └── settings.json
├── attack_processing.png
├── test.py
└── README.md
/1:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | ._*
2 | DP-OPT/*
--------------------------------------------------------------------------------
/.vscode/settings.json:
--------------------------------------------------------------------------------
1 | {
2 | "editor.renderFinalNewline": "on"
3 | }
--------------------------------------------------------------------------------
/attack_processing.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Chen-X666/privacy-preserving-prompt/HEAD/attack_processing.png
--------------------------------------------------------------------------------
/test.py:
--------------------------------------------------------------------------------
1 | from transformers import AutoTokenizer, AutoModel
2 | import torch
3 |
4 | tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
5 | model = AutoModel.from_pretrained("bert-base-uncased")
6 |
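7 | def get_similar_words(word):
8 |     # Encode the input without [CLS]/[SEP] and use its first WordPiece token
9 |     token_id = tokenizer.encode(word, add_special_tokens=False)[0]
10 |     with torch.no_grad():
11 |         vocab = model.get_input_embeddings().weight  # static embeddings, shape (vocab_size, hidden)
12 |         # Compute the cosine similarity between that token and every vocabulary token
13 |         cosine_similarities = torch.nn.functional.cosine_similarity(vocab[token_id].unsqueeze(0), vocab, dim=1)
14 |     # The highest-scoring token is the input itself, so take the second-highest
15 |     similar_word_idx = cosine_similarities.argsort()[-2].item()
16 |     similar_word = tokenizer.decode([similar_word_idx])
17 |     return similar_word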
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Privacy-Preserving Prompt Tuning for Large Language Model
2 |
3 | | Symbol | 🌟 | ⬜️ | ⬛️ |
4 | | --- | --- | --- | --- |
5 | | Description | Inspiration | White-box method | Black-box method |
6 |
7 | ## Attacker Methodology by Stages
8 |
9 |

10 |
11 |
12 | ### Prompt Injection Attacks (PIA)
13 | | Paper | Year | Source | Attack Prompt Type | Tasks |
14 | |-------|------|-----------------------|----------------------|-----------|
15 | | ⬛️Effective Prompt Extraction from Language Models | 2024.02 | [](https://arxiv.org/abs/2307.06865) [](https://anonymous.4open.science/r/prompt-extraction-83C1) | Instruction Prompt | Information Extraction |
16 | | ⬛️Prompt Stealing Attacks Against Large Language Models | 2024.02 | [](https://arxiv.org/abs/2402.12959) | Role-Based Prompt, Direct Prompt, In-Context Prompt | Q&A |
17 | | 🌟⬛️TrojLLM: A Black-box Trojan Prompt Attack on Large Language Models | (NeurIPS, 2023) | [](https://arxiv.org/abs/2306.06815) [](https://github.com/UCF-ML-Research/TrojLLM) | Instruction prompt | Classification |
18 | | 🌟⬛️Ignore Previous Prompt: Attack Techniques For Language Models | (NeurIPS, 2022) | [](https://arxiv.org/abs/2211.09527) [](https://github.com/agencyenterprise/PromptInject) | Instruction prompt, In-context learning | |
19 | | 🌟⬜️BadPrompt: Backdoor Attacks on Continuous Prompts | (NeurIPS, 2022) | [](https://arxiv.org/abs/2211.14719) [](https://github.com/papersPapers/BadPrompt) | Instruction prompt | |
20 | | ⬜️PromptAttack: Prompt-based Attack for Language Models via Gradient Search | (NLPCC, 2022) | [](https://arxiv.org/abs/2209.01882) | Instruction prompt | |
21 |
22 |
23 | ### Membership Inference Attacks (MIA)
24 | | Paper | Year | Source |
25 | |-------|------|-----------------|
26 | | 🌟⬛️Do Membership Inference Attacks Work on Large Language Models? | 2024.02 | [](https://arxiv.org/abs/2402.07841) [](http://github.com/iamgroot42/mimir) |
27 | | ⬜️Language Model Inversion | 2023.11 | [](https://arxiv.org/abs/2311.13647) |
28 | | ⬛️Assessing Privacy Risks in Language Models: A Case Study on Summarization Tasks | 2023.10 | [](https://arxiv.org/abs/2310.13291) |
29 | | ⬛️Beyond Memorization: Violating Privacy Via Inference with Large Language Models | 2023.10 | [](https://arxiv.org/abs/2310.07298) [](https://github.com/eth-sri/llmprivacy) |
30 | | ⬜️Extracting Training Data from Large Language Models | (USENIX Security, 2021) | [](https://www.usenix.org/system/files/sec21-carlini-extracting.pdf) |
31 |
32 | ## Protector Methodology
33 |
34 | ### Differential Privacy (DP)
35 | | Paper | Year | Source | Tasks | Defense |
36 | |-------|------|------------------|---------|---------|
37 | | 🌟⬛️Privacy-Preserving In-Context Learning with Differentially Private Few-Shot Generation | (ICLR, 2024) | [](https://arxiv.org/abs/2309.11765) [](https://github.com/microsoft/dp-few-shot-generation) | Classification, Information Extraction |
38 | | 🌟⬛️DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer | (ICLR, 2024) | [](https://arxiv.org/abs/2312.03724) [](https://github.com/VITA-Group/DP-OPT) | Sentiment Classification |
39 | | 🌟⬛️Privacy-Preserving In-Context Learning For Large Language Models | (ICLR, 2024) | [](https://arxiv.org/abs/2305.01639) | Classification, Document Q&A, Dialog Summarization |
40 | | ⬛️On the Privacy Risk of In-context Learning | (TrustNLP, 2024) | [](https://trustnlpworkshop.github.io/papers/13.pdf) | Classification, Generation | MIA |
41 | | ⬛️A Customized Text Sanitization Mechanism with Differential Privacy | (ACL, 2023) | [](https://aclanthology.org/2023.findings-acl.355) [](https://github.com/sai4july/CusText) | Classification, Generation |
42 | | 🌟⬛️⬜️Flocks of Stochastic Parrots: Differentially Private Prompt Learning for Large Language Models | (NeurIPS, 2023) | [](https://arxiv.org/abs/2305.15594) | Classification |
43 | | 🌟⬛️Locally Differentially Private Document Generation Using Zero Shot Prompting | (EMNLP, 2023) | [](https://aclanthology.org/2023.findings-emnlp.566) [](https://github.com/SaitejaUtpala/dp_prompt) | Text Classification |
44 | | ⬜️DP-forward: Fine-tuning and inference on language models with differential privacy in forward pass | (SIGSAC, 2023) | [](https://arxiv.org/abs/2309.06746) | Classification |
45 | | ⬛️InferDPT: Privacy-preserving Inference for Black-box Large Language Models | 2023.12 | [](https://arxiv.org/abs/2310.12214) [](https://github.com/mengtong0110/InferDPT) | Classification, Generation |
46 | | ⬜️Privacy-Preserving Prompt Tuning for Large Language Model Services | 2023.05 | [](https://arxiv.org/abs/2305.06212) | Sentiment Classification, Document Q&A |
47 | | ⬛️Differential Privacy for Text Analytics via Natural Text Sanitization | (ACL-IJCNLP, 2021) | [](https://aclanthology.org/2021.findings-acl.337) [](https://github.com/xiangyue9607/SanText) | Classification |
48 |
49 |
50 |
51 | ### Secure Multi-Party Computing (SMPC)
52 | | Paper | Year | Source | Tasks | Defense |
53 | |-------|------|--------------|------------|---------|
54 | | ⬜️CipherGPT: Secure Two-Party GPT Inference | (Crypto, 2024) | [](https://eprint.iacr.org/2023/1147) | Classification |
55 | | ⬜️SecFormer: Towards Fast and Accurate Privacy-Preserving Inference for Large Language Models | 2024.01 | [](https://arxiv.org/abs/2401.00793) | Classification, Semantic Similarity, Linguistic Acceptability | Model Inside |
56 | | ⬜️LLMs Can Understand Encrypted Prompt: Towards Privacy-Computing Friendly Transformers | 2023.05 | [](https://arxiv.org/abs/2305.18396) | Classification | Model Inside |
57 |
58 |
59 |
60 | ### Anonymization Techniques
61 | | Paper | Year | Source | Tasks | Keywords |
62 | |-------|------|-------------|----------|----------|
63 | | ⬛️EmojiCrypt: Prompt Encryption for Secure Communication with Large Language Models | 2024.02 | [](https://arxiv.org/abs/2402.05868) [](https://github.com/agiresearch/EmojiCrypt) | Classification | Emoji |
64 | | ⬛️ProPILE: Probing Privacy Leakage in Large Language Models | (NeurIPS, 2023) | [](https://arxiv.org/abs/2307.01881) | | PII |
65 | | ⬛️Recovering from Privacy-Preserving Masking with Large Language Models | 2023.12 | [](https://arxiv.org/abs/2309.08628) | | [MASK] |
66 | | ⬛️Hide and Seek (HaS): A Lightweight Framework for Prompt Privacy Protection | 2023.09 | [](https://arxiv.org/abs/2309.03057) [](https://github.com/alohachen/Hide-and-Seek) | | PII |
67 |
68 |
69 | ### Other Methods
70 | | Paper | Year | Source | Tasks | Method Keyword |
71 | |-------|------|-----------------------|-------|----------------|
72 | | ⬜️PrivateLoRA For Efficient Privacy Preserving LLM | (CoRR, 2023) | [](https://arxiv.org/abs/2311.14030) [](https://github.com/alipay/private_llm) | | LoRA |
73 | | ⬜️TextObfuscator: Making Pre-trained Language Model a Privacy Protector via Obfuscating Word Representations | (ACL, 2023) | [](https://aclanthology.org/2023.findings-acl.337) [](https://github.com/xzhou20/TextObfuscator) | Classification |
74 |
75 | ### Related Survey
76 | | Paper | Year | Source |
77 | |-------|------|-----------------------|
78 | | On Protecting the Data Privacy of Large Language Models (LLMs): A Survey | 2024.03 | [](https://arxiv.org/abs/2403.05156) |
79 | | Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory | (ICLR, 2024) | [](https://arxiv.org/abs/2310.17884) |
80 | | A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly | 2023.12 | [](https://arxiv.org/abs/2312.02003) |
81 | | Privacy in Large Language Models: Attacks, Defenses and Future Directions | 2023.10 | [](https://arxiv.org/abs/2310.10383) |
82 | | Not what you’ve signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection | 2023.05 | [](https://arxiv.org/abs/2302.12173) [](https://github.com/greshake/llm-security) |
83 | | Privacy-Preserving Large Language Models (PPLLMs) | 2023.01 | [](https://www.researchgate.net/publication/372607103_Privacy-Preserving_Large_Language_Models_PPLLMs) [](https://github.com/greshake/llm-security) |
84 |
85 | ### Fine-tuning
86 | | Paper | Year | Source |
87 | |-------|------|-----------------------|
88 | | SentinelLMs: Encrypted Input Adaptation and Fine-tuning of Language Models for Private and Secure Inference | (AAAI, 2024) | [](https://arxiv.org/abs/2312.17342) [](https://github.com/abhijitmishra/sentinellm-aaai2024) |
89 |
--------------------------------------------------------------------------------