├── README.md
├── askguess
│   ├── agents
│   │   ├── __init__.py
│   │   ├── answer_agent.py
│   │   └── question_agent.py
│   ├── compute.py
│   ├── demo.py
│   ├── labels.json
│   ├── test_labels.json
│   └── utils
│       ├── __init__.py
│       ├── prompt.py
│       └── utils.py
├── assets
│   ├── .DS_Store
│   ├── GameEval.png
│   ├── case1.png
│   ├── res.png
│   ├── res1.png
│   └── res2.png
├── chat
│   ├── __init__.py
│   ├── config.py
│   ├── config_example.py
│   ├── gpt3_chat.py
│   ├── gpt4_chat.py
│   └── text003_chat.py
├── game_askguess.py
├── game_spyfall.py
├── game_tofukingdom.py
├── spyfall
│   ├── .gitignore
│   ├── agents
│   │   ├── __init__.py
│   │   └── base_agent.py
│   ├── compute_adversarial.py
│   ├── labels.txt
│   └── utils
│       ├── prompt.py
│       └── utils.py
└── tofukingdom
    ├── .gitignore
    ├── agents
    │   ├── __init__.py
    │   ├── base_agent.py
    │   ├── chef_agent.py
    │   ├── guard_agent.py
    │   ├── maid_agent.py
    │   ├── minister_agent.py
    │   ├── prince_agent.py
    │   ├── princess_agent.py
    │   ├── queen_agent.py
    │   └── spy_agent.py
    ├── compute.py
    └── utils
        ├── prompt.py
        └── utils.py
/README.md:
--------------------------------------------------------------------------------
1 | # GameEval: Evaluating LLMs on Conversational Games
2 |
3 | [[`Paper`](https://arxiv.org/pdf/2308.10032v1.pdf)] [[`BibTeX`](#citing-gameeval)]
4 |
5 | GameEval evaluates powerful LLMs by having them play conversational games. GameEval treats
6 | LLMs as game players and assigns them distinct roles with
7 | specific goals, which they achieve through conversations of various forms, including discussion, question answering, and voting.
8 |
9 | ## Involved Capabilities
10 | GameEval is distinct from other evaluation methods, as it requires not only the model’s common capabilities like instruct-following but also the model’s higher-level skills, including cooperative and adversarial strategies, and even deceptive strategies and long-term planning. In this section, we introduce various distinctive capabilities that can be effectively evaluated by conversational games. The table below shows the
11 | capabilities of LLMs that can be examined by these games.
12 | | Capabilities | Ask-Guess | SpyFall | TofuKingdom |
13 | |---------------------- |----------- |--------- |------------- |
14 | | Cooperative Strategy | ✓ | ✓ | ✓ |
15 | | Adversarial Strategy | X | ✓ | ✓ |
16 | | Specific Knowledge | ✓ | ✓ | X |
17 | | Multi-hop Reasoning | ✓ | ✓ | ✓ |
18 | | Deceptive Strategy | X | ✓ | ✓ |
19 | | Long-term Planning | ✓ | ✓ | X |
20 | | Instruct-Following | ✓ | ✓ | ✓ |
21 |
22 |
23 | ## Experimental Results of the Original Version
24 |
25 | **Experimental results on GameEval clearly demonstrate high discrimination in the capabilities of models under evaluation.**
26 |
27 |
28 |
29 |
30 | ### Ask-Guess
31 | The results of the easy version (with a prior description from the answerer):
32 | | Model | Round | ST | EE | RLE | AME | CE |
33 | |---|---|---|---|---|---|---|
34 | | TD003 | 4.39 | 82.71 | 9.47 | 1.84 | 5.97 | 0.01 |
35 | | ChatGPT | 6.01 | 53.39 | 8.13 | 14.63 | 23.21 | 0.64 |
36 | | GPT4 | 1.57 | 97.69 | 0.80 | 1.01 | 0.47 | 0.03 |
37 |
38 | The results of the hard version (without a prior description from the answerer):
39 |
40 | | Model | Round | ST | EE | RLE | AME | CE |
41 | |--------- |------- |------- |------- |------- |------ |------ |
42 | | TD003 | 15.13 | 42.36 | 19.18 | 37.19 | 0.36 | 0.91 |
43 | | ChatGPT | 13.78 | 40.50 | 3.88 | 49.89 | 4.57 | 1.16 |
44 | | GPT4 | 4.01 | 92.77 | 2.95 | 0.84 | 2.75 | 0.69 |
45 |
46 | ### SpyFall
47 | S-model means the model plays the spy; V-model means the model plays the villagers.
48 |
49 |
50 |
51 |
52 |
53 | ### TofuKingdom
54 | We let different LLMs play all the roles in the same camp to perform an adversarial game. The model that represents the winning camp gets one point.
55 |
56 | | Prince | Spy | Queen | ChatGPT | GPT4 | TD003 |
57 | |--------- |--------- |--------- |--------- |------ |------- |
58 | | TD003 | GPT4 | ChatGPT | 7 | 9 | 4 |
59 | | TD003 | ChatGPT | GPT4 | 5 | 11 | 4 |
60 | | ChatGPT | GPT4 | TD003 | 8 | 7 | 5 |
61 | | ChatGPT | TD003 | GPT4 | 5 | 9 | 6 |
62 | | GPT4 | TD003 | ChatGPT | 6 | 7 | 7 |
63 | | GPT4 | ChatGPT | TD003 | 8 | 8 | 4 |
64 | | - | - | Total | 39 | 51 | 30 |
65 |
66 | ## Illustration
67 | Below is a simple demonstration of three designed games: Ask-Guess, SpyFall and TofuKingdom.
68 |
69 |
70 |
71 |
72 |
73 | ## How to use GameEval
74 |
75 | ### For Azure OpenAI
76 | You can create a `chat/config.py` file with reference to the `chat/config_example.py` file, and fill in your Azure OpenAI account information.
77 |
78 | ### For other LLMs
79 | For other models, including official OpenAI models and open-source models, you can add a chat file to the `chat` folder that implements a chatbot which receives messages or a text prompt as input and returns the response as output.
80 | You can read other files in folder `chat` for reference.
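
As a rough illustration of the interface described above, the sketch below mirrors the shape of the existing wrappers in `chat` (a `name` attribute plus `single_chat` and `multi_chat` methods). The class name `EchoChat` and its echo behaviour are hypothetical placeholders, not part of the repository; a real wrapper would call your model where the comments indicate.

```python
from typing import Dict, List, Optional


class EchoChat:
    """Hypothetical stand-in for a model wrapper placed in the `chat` folder."""

    def __init__(self) -> None:
        # The agents branch on `chatbot.name`, so give the wrapper its own name.
        self.name = "echo"

    def single_chat(self, content: str, role: Optional[str] = None) -> str:
        """Take one text prompt and return the reply as a string."""
        # A real wrapper would send `content` to the model here.
        return f"[echo] {content}"

    def multi_chat(self, messages: List[Dict[str, str]]) -> str:
        """Take a list of {"role", "content"} messages and return the reply."""
        # A real wrapper would send the full message list to the model here.
        last_user = next(
            (m["content"] for m in reversed(messages) if m["role"] == "user"), ""
        )
        return f"[echo] {last_user}"


bot = EchoChat()
reply = bot.multi_chat([
    {"role": "system", "content": "You are the questioner."},
    {"role": "user", "content": "Is it an animal?"},
])
print(reply)
```

To make such a wrapper selectable by name, it would presumably also need a branch in the `get_model` helper of the corresponding game's `utils/utils.py`.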
81 |
82 | ### Install Packages
83 | ```
84 | pip install openai
85 | pip install vthread
86 | pip install func_timeout
87 | ```
88 |
89 | ## Ask-Guess
90 | ### Game Introduction
91 | Ask-Guess is a cooperative game involving a questioner and an answerer. At the beginning of the game, the answerer receives a word unknown to the questioner. In each round, the questioner may ask the answerer one question, and the answerer has to answer faithfully. The provided word or phrase must not be included in the answerer’s reply. Both participants should collaborate to minimize the number of Q&A rounds needed for the questioner to deduce the given word or phrase accurately. The questioner should ask targeted questions to progressively narrow down the potential scope of the given word based on the answerer’s responses. The answerer must assess whether the questioner has successfully identified the word and respond with ’Gameover’ to conclude the game.
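
The two answerer-side rules above (never say the word, end with "Gameover") can be sketched as a small check on each reply. This is a simplified illustration, not the repository's game loop: the function name `classify_reply` and the `"continue"` and `"gameover"` return values are assumptions, while `AnswerMentionedError` is one of the outcome labels used in `askguess/compute.py`.

```python
def classify_reply(reply: str, word: str) -> str:
    """Classify one answerer reply under the Ask-Guess rules (simplified)."""
    text = reply.lower()
    # Rule: the given word must not appear in the answerer's reply.
    if word.lower() in text:
        return "AnswerMentionedError"
    # Rule: the answerer says "gameover" once the questioner has guessed the word.
    if "gameover" in text:
        return "gameover"
    # Otherwise the Q&A continues with the next round.
    return "continue"


print(classify_reply("It grows on trees and is often red.", "apple"))
print(classify_reply("Yes, it is an Apple!", "apple"))
print(classify_reply("Gameover", "apple"))
```

A plain substring test like this is deliberately crude; it would, for example, flag "pineapple" when the word is "apple".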
92 |
93 | ### Get Started
94 | **You can directly use the following script to play the game with `ChatGPT`.** Set the words to be guessed in the file passed as `label_path`; `n` is the number of runs for each word. The results and the game logs are recorded automatically.
95 | ```
96 | cd ask-guess
97 | python game_askguess.py \
98 | --label_path test_labels.json \
99 | --model_name gpt3 \
100 | --mode easy \
101 | --debug false \
102 | --n 30
103 | ```
104 | **When all the games are over, you can compute the average results reported in the paper by running the file `compute.py`.**
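
For intuition, the sketch below aggregates a few game records into outcome rates and an average round count. The record fields (`object`, `error_type`, `round`) and the outcome labels follow what `askguess/compute.py` reads; the sample records themselves are made up for illustration, and the averaging is a simplification rather than a copy of that script's exact arithmetic.

```python
import json
from collections import Counter

# Made-up sample records, one JSON object per line as in the result log.
log_lines = [
    '{"object": "apple", "error_type": "SuccessfulTrial", "round": 4}',
    '{"object": "apple", "error_type": "RoundLimitError", "round": 30}',
    '{"object": "bed", "error_type": "SuccessfulTrial", "round": 6}',
]

outcomes = Counter()
success_rounds = []
for raw in log_lines:
    record = json.loads(raw)
    outcomes[record["error_type"]] += 1
    # Only successful trials contribute to the average round count here.
    if record["error_type"] == "SuccessfulTrial":
        success_rounds.append(record["round"])

total = sum(outcomes.values())
rates = {name: count / total for name, count in outcomes.items()}
avg_rounds = sum(success_rounds) / len(success_rounds)

print(rates)
print(avg_rounds)
```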
105 |
106 | ### Case
107 | To better understand how conversational games reflect the gap in model capabilities, we show the game dialogue in Ask-Guess without prior description.
108 |
109 |
110 |
111 | As we can see, both ChatGPT and GPT-4 can correctly understand the tasks, and they ask and answer questions according to the game rules.
112 | However, for a given goal, GPT-4 demonstrates an astonishing planning ability: the series of questions it asks follows a specific taxonomy. In each round, GPT-4 shows a clear awareness of the candidates that have been ruled out by previous Q&A and asks new questions targeted at the remaining ones. In contrast, the questions ChatGPT asks seem more disorganized and disoriented.
113 |
114 |
115 | ## SpyFall
116 |
117 | ### Game Introduction
118 | This game has six players, including one spy and five villagers.
119 | At the beginning of the game, everyone will receive a word.
120 | The spy will receive the spy word, and others will receive the common word.
121 | The spy word is different from, but related to, the common word. For example, the spy word can be "lion," and the common word "tiger."
122 | There are two stages in each round of the game.
123 | In the first stage, everyone needs to describe the word he got but cannot say the given word directly.
124 | In the second stage, everyone should vote for a player he thinks is the spy according to the descriptions in the first stage and state why he thinks this player is a spy.
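
The voting stage can be pictured as a simple tally over the players' votes. The sketch below is illustrative only: the player names, the `most_voted` helper, and the handling of ties are assumptions, and what the game does with the most-voted player is not specified in this section.

```python
from collections import Counter
from typing import Dict, List


def most_voted(votes: Dict[str, str]) -> List[str]:
    """Return the player(s) who received the most votes (all of them on a tie)."""
    tally = Counter(votes.values())
    top = max(tally.values())
    return sorted(player for player, count in tally.items() if count == top)


# Hypothetical round: each key is a voter, each value is the suspected spy.
votes = {"P1": "P3", "P2": "P3", "P3": "P2", "P4": "P3", "P5": "P2", "P6": "P3"}
print(most_voted(votes))
```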
125 |
126 | ### Get Started
127 |
128 | ```
129 | cd spyfall
130 | python game_spyfall.py \
131 | --label_path spyfall/labels.txt \
132 | --spy_model_name gpt3 \
133 | --villager_model_name gpt3 \
134 | --debug false \
135 | --n 30
136 | ```
137 |
138 | ## TofuKingdom
139 |
140 | ### Game Introduction
141 | This game is a role-playing text reasoning game.
142 | It has eight roles, including Prince, Princess, Queen, Minister, Chef, Guard, Maid, and Spy.
143 | The players, except the Prince, know the real identity of the rest of the players.
144 | The Prince needs to guess which player is the Princess by asking one question to each player.
145 | During the game, the Prince's question can only be chosen from the three questions below:
146 | 1. Who is the Princess;
147 | 2. What is your identity;
148 | 3. What is the identity of \{player\_name\}.
149 |
150 | There are three different camps in this game.
151 | The Princess and Chef belong to the Prince Camp; they must tell the truth when answering the question.
152 | The Queen, Minister, and Guard belong to the Queen Camp; they must tell a lie when answering the question.
153 | The Spy and the Maid belong to the Spy Camp and can choose to speak the truth or lie.
154 | After asking each player one question, the Prince can still choose one player to ask an extra question.
155 | The question should also be chosen from one of the three questions mentioned above.
156 | Then the Prince has to choose a player who he thinks is the Princess.
157 | If the Prince correctly chooses Princess, the Chef and the Princess win.
158 | If the Prince chooses the Queen, the Queen, Minister, and Guard win.
159 | If the Prince chooses a player whose identity is neither the Princess nor the Queen, the Maid and the Spy win.
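
The camp and win rules above can be summarized in a few lines. This is a sketch of the rules as stated in this section, not the code in `tofukingdom`; the names `CAMPS`, `ANSWER_POLICY`, and `winning_camp` are hypothetical.

```python
# Camp membership as described in the rules above.
CAMPS = {
    "Princess": "Prince Camp", "Chef": "Prince Camp",
    "Queen": "Queen Camp", "Minister": "Queen Camp", "Guard": "Queen Camp",
    "Spy": "Spy Camp", "Maid": "Spy Camp",
}

# How each camp must answer the Prince's questions.
ANSWER_POLICY = {
    "Prince Camp": "truth",
    "Queen Camp": "lie",
    "Spy Camp": "either",
}


def winning_camp(chosen_role: str) -> str:
    """Return the camp that wins given the true role of the player the Prince picks."""
    if chosen_role == "Princess":
        return "Prince Camp"
    if chosen_role == "Queen":
        return "Queen Camp"
    # Any other pick (Minister, Chef, Guard, Maid, Spy) hands the win to the Spy Camp.
    return "Spy Camp"


print(winning_camp("Princess"))
print(winning_camp("Guard"))
print(ANSWER_POLICY[CAMPS["Minister"]])
```

Note that picking the Chef (Prince Camp) or the Minister or Guard (Queen Camp) still makes the Spy Camp win, which is what makes the third camp's free choice between truth and lies worthwhile.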
160 |
161 | ### Get Started
162 | ```
163 | cd tofukingdom
164 | python game_tofukingdom.py \
165 | --prince_model_name gpt3 \
166 | --queen_model_name gpt4 \
167 | --spy_model_name td003 \
168 | --debug false \
169 | --n 20
170 | ```
171 |
172 |
173 | ## Citing GameEval
174 |
175 | ```
176 | @article{qiao2023gameeval,
177 | title={GameEval: Evaluating LLMs on Conversational Games},
178 | author={Qiao, Dan and Wu, Chenfei and Liang, Yaobo and Li, Juntao and Duan, Nan},
179 | journal={arXiv preprint arXiv:2308.10032},
180 | year={2023}
181 | }
182 | ```
--------------------------------------------------------------------------------
/askguess/agents/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jordddan/GameEval/5025a1abfc00de62aab177a016cc8b64d2c1b4ab/askguess/agents/__init__.py
--------------------------------------------------------------------------------
/askguess/agents/answer_agent.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | sys.path.append("ask-guess")
4 | from askguess.utils.prompt import get_answerer_role,get_questioner_role
5 | from askguess.utils.utils import create_message,print_messages, convert_messages_to_prompt
6 |
7 | class AnswerAgnent:
8 |     """Answerer in Ask-Guess: holds the given word and replies to the questioner."""
9 |
10 |     def __init__(self, chatbot, object_name, args) -> None:
11 |         self.chatbot = chatbot
12 |         self.object_name = object_name
13 |         self.role_easy, self.role_hard = get_answerer_role(object_name)
14 |
15 |         # System prompt for the selected difficulty; also exposed under the
16 |         # attribute names read by the td003 prompt builders below.
17 |         role = self.role_easy if args.mode == "easy" else self.role_hard
18 |         self.answer_role = role
19 |         self.describe_role = role
20 |
21 |         self.history = [create_message("system", role)]
22 |
23 |     def play(self):
24 |         # td003 takes a single text prompt, so the chat history is flattened first.
25 |         if self.chatbot.name == "td003":
26 |             text_prompt = convert_messages_to_prompt(messages=self.history,role="answerer")
27 |             response = self.chatbot.single_chat(text_prompt)
28 |         else:
29 |             response = self.chatbot.multi_chat(self.history)
30 |         return response
31 |
32 |     def answer(self):
33 |         if self.chatbot.name == "td003":
34 |             text_prompt = self.get_answer_prompt()
35 |             response = self.chatbot.single_chat(text_prompt)
36 |         else:
37 |             response = self.chatbot.multi_chat(self.history)
38 |         return response
39 |
40 |     def get_answer_prompt(self):
41 |         # Build a tagged text prompt that alternates questioner and answerer turns.
42 |         prompt = "##system##"
43 |         prompt += self.answer_role + "\n"
44 |         flag = (len(self.history) + 1) % 2
45 |         for i in range(len(self.history)):
46 |             if i % 2 == flag:
47 |                 prompt += "##questioner##: "
48 |             else:
49 |                 prompt += "##answerer##:"
50 |             prompt += self.history[i]["content"] + "\n\n"
51 |         prompt += "##answerer##:"
52 |         return prompt
53 |
54 |     def get_describe_prompt(self):
55 |         # Same layout as get_answer_prompt, but starting from the description role.
56 |         prompt = "##system##"
57 |         prompt += self.describe_role + "\n"
58 |         flag = (len(self.history) + 1) % 2
59 |         for i in range(len(self.history)):
60 |             if i % 2 == flag:
61 |                 prompt += "##questioner##: "
62 |             else:
63 |                 prompt += "##answerer##:"
64 |             prompt += self.history[i]["content"] + "\n\n"
65 |         prompt += "##answerer##:"
66 |         return prompt
67 |
68 |     def update_history(self, new_message):
69 |         self.history.append(new_message)
--------------------------------------------------------------------------------
/askguess/agents/question_agent.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 | sys.path.append("ask-guess")
4 | from askguess.utils.prompt import get_questioner_role
5 | from askguess.utils.utils import create_message,convert_messages_to_prompt,print_messages
6 |
7 | class QuestionAgnent:
8 |
9 |     def __init__(self,chatbot, object_name, args) -> None:
10 |
11 |         self.chatbot = chatbot
12 |         self.object_name = object_name
13 |         self.role_easy, self.role_hard = get_questioner_role()
14 |         self.history = []
15 |         if args.mode == "easy":
16 |             role_message = create_message("system",self.role_easy)
17 |         else:
18 |             role_message = create_message("system",self.role_hard)
19 |         self.history.append(role_message)
20 |
21 |     def play(self):
22 |         if self.chatbot.name == "td003":
23 |             text_prompt = convert_messages_to_prompt(messages=self.history,role="questioner")
24 |             response = self.chatbot.single_chat(text_prompt)
25 |         else:
26 |             response = self.chatbot.multi_chat(self.history)
27 |         return response
28 |
29 |     def update_history(self, new_message):
30 |         self.history.append(new_message)
--------------------------------------------------------------------------------
/askguess/compute.py:
--------------------------------------------------------------------------------
1 | import json
2 |
3 | from agents.answer_agent import AnswerAgnent
4 | from agents.question_agent import QuestionAgnent
5 | from utils.utils import get_model, create_message
6 | from utils.prompt import host_description_prompt, host_qa_prompt
7 | import argparse
8 |
9 |
10 | if __name__ == "__main__":
11 |     parser = argparse.ArgumentParser()
12 |     parser.add_argument('--model_name',type=str,default='gpt3')
13 |     parser.add_argument('--label_path',type=str,default="labels.json")
14 |     parser.add_argument('--mode',type=str,default='easy')
15 |     parser.add_argument('--n',type=int,default=20)
16 |     args = parser.parse_args()
17 |
18 |     model_name = args.model_name
19 |     mode = args.mode
20 |     N = args.n
21 |
22 |     file_path = f"guess_result_{mode}_{model_name}.json"
23 |     output_path = f"avg_result_{mode}_{model_name}.json"
24 |     # The result file holds one JSON record per line.
25 |     with open(file_path,'r') as f:
26 |         data = f.readlines()
27 |
28 |     res = {}
29 |
30 |     # error_type: EndingError, AnswerMentionedError, RoundLimitError, ChatError, SuccessfulTrial
31 |     for item in data:
32 |         line = json.loads(item)
33 |         name = line["object"].replace(" ","_")
34 |         error_type = line["error_type"]
35 |         n_round = line["round"]
36 |         if name not in res:
37 |             res[name] = {"round":0,"EndingError":0,"SuccessfulTrial":0,"ChatError":0,"RoundLimitError":0,"AnswerMentionedError":0}
38 |         res[name][error_type] += 1
39 |         if n_round > 0:
40 |             res[name]["round"] += n_round
41 |
42 |     # Per-word round count, normalized by successful trials (30 if there were none).
43 |     for key,value in res.items():
44 |         if res[key]["SuccessfulTrial"] != 0:
45 |             res[key]["round"] /= res[key]["SuccessfulTrial"]
46 |         else:
47 |             res[key]["round"] = 30
48 |
49 |     data = res
50 |
51 |     acc_avg = { "round": 0,
52 |                 "EndingError": 0,
53 |                 "ChatError": 0,
54 |                 "SuccessfulTrial": 0,
55 |                 "RoundLimitError": 0,
56 |                 "AnswerMentionedError": 0}
57 |
58 |     cnt_correct = 0
59 |     for name,item in data.items():
60 |         if item["round"] != 0:
61 |             cnt_correct += 1
62 |         for key in item:
63 |             acc_avg[key] += item[key]
64 |
65 |     # Rounds are averaged over words; outcome counts become rates over all trials.
66 |     for key in acc_avg:
67 |         if key == "round":
68 |             acc_avg[key] /= cnt_correct
69 |         else:
70 |             acc_avg[key] /= len(data) * N
71 |
72 |     data["avg"] = acc_avg
73 |
74 |     with open(output_path,'w') as f:
75 |         json.dump(data,f,indent=1)
--------------------------------------------------------------------------------
/askguess/demo.py:
--------------------------------------------------------------------------------
1 | import gradio as gr
2 | import random
3 | import time
4 | import json
5 |
6 | from agents.answer_agent import AnswerAgnent
7 | from agents.question_agent import QuestionAgnent
8 | from utils.utils import get_model, create_message
9 | from utils.prompt import host_description_prompt, host_qa_prompt
10 | import argparse
11 |
12 | if __name__ == "__main__":
13 |     parser = argparse.ArgumentParser()
14 |     parser.add_argument('--model_name',type=str,default='gpt3')
15 |     parser.add_argument('--mode',type=str,default='easy')
16 |     parser.add_argument('--n',type=int,default=20)
17 |
18 |     with gr.Blocks() as demo:
19 |         chatbot = gr.Chatbot()
20 |         msg = gr.Textbox()
21 |         clear = gr.Button("Clear")
22 |
23 |         def respond(message, chat_history):
24 |             # Placeholder bot: replies with a random canned message.
25 |             bot_message = random.choice(["How are you?", "I love you", "I'm hungry"])
26 |             chat_history.append((message, bot_message))
27 |             time.sleep(1)
28 |             return "", chat_history
29 |
30 |         msg.submit(respond, [msg, chatbot], [msg, chatbot])
31 |         clear.click(lambda: None, None, chatbot, queue=False)
32 |
33 |     demo.launch()
--------------------------------------------------------------------------------
/askguess/labels.json:
--------------------------------------------------------------------------------
1 | ["apple", "aquarium_fish", "baby", "bear", "beaver", "bed", "bee", "beetle", "bicycle", "bottle", "bowl", "boy", "bridge", "bus", "butterfly", "camel", "can", "castle", "caterpillar", "cattle", "chair", "chimpanzee", "clock", "cloud", "cockroach", "couch", "cra", "crocodile", "cup", "dinosaur", "dolphin", "elephant", "flatfish", "forest", "fox", "girl", "hamster", "house", "kangaroo", "keyboard", "lamp", "lawn_mower", "leopard", "lion", "lizard", "lobster", "man", "maple_tree", "motorcycle", "mountain", "mouse", "mushroom", "oak_tree", "orange", "orchid", "otter", "palm_tree", "pear", "pickup_truck", "pine_tree", "plain", "plate", "poppy", "porcupine", "possum", "rabbit", "raccoon", "ray", "road", "rocket", "rose", "sea", "seal", "shark", "shrew", "skunk", "skyscraper", "snail", "snake", "spider", "squirrel", "streetcar", "sunflower", "sweet_pepper", "table", "tank", "telephone", "television", "tiger", "tractor", "train", "trout", "tulip", "turtle", "wardrobe", "whale", "willow_tree", "wolf", "woman", "worm"]
--------------------------------------------------------------------------------
/askguess/test_labels.json:
--------------------------------------------------------------------------------
1 | ["apple","bed", "beetle", "baby", "bear", "women"]
--------------------------------------------------------------------------------
/askguess/utils/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jordddan/GameEval/5025a1abfc00de62aab177a016cc8b64d2c1b4ab/askguess/utils/__init__.py
--------------------------------------------------------------------------------
/askguess/utils/prompt.py:
--------------------------------------------------------------------------------
1 | def get_answerer_role(word):
2 |     prompt_easy = f'''
3 | You are playing a game called "Ask-Guess."
4 |
5 | The rules of this game are as follows:
6 | The game requires two players, a questioner and an answerer.
7 | At the beginning of the game, the answerer is given a word or phrase as the answer, which the questioner does not know.
8 | The answerer can first briefly describe the given word or phrase without directly mentioning it.
9 | Then, the questioner must guess the word or phrase by asking a series of questions, but only one question in each round.
10 | The answerer should check whether the questioner has guessed the answer. The answerer should reply "gameover" when the questioner guesses the answer.
11 | But the answerer must not directly say the given word or phrase when answering the question.
12 | The goal of the game is to guess the given word or phrase using as few rounds of Q&A as possible.
13 |
14 | Now, you are playing the game. You are the answerer in the game. The given word is "{word}". The user is the questioner to guess the answer.
15 | In each round, you should first check if the user has directly guessed the word "{word}" or guessed similar descriptions like "a kind of {word}" or "a type of {word}." If so, the game is over. You must directly output "gameover".
16 | Otherwise, you should answer questions faithfully or give hints to guide the questioner to guess the word. In no circumstances should you directly say "{word}."
17 | '''
18 |
19 |     prompt_hard = f'''
20 | You are playing a game called "Ask-Guess"
21 |
22 | The rules of this game are as follows:
23 | The game requires two players, a questioner and an answerer.
24 | At the beginning of the game, the answerer is given a word or phrase as the answer, which the questioner does not know.
25 | Then, the questioner must guess the word or phrase by asking a series of questions, but only one question in each round.
26 | The answerer should only reply "yes" or "no" to the question or reply "gameover" when the questioner has guessed the answer.
27 | The game's goal is to guess the given word or phrase using as few rounds of Q&A as possible.
28 |
29 | Now, you are the answerer, the user is the questioner, you should answer the questions faithfully.
30 | The answer is "{word}".
31 | You can only answer the question with "yes" or "no" or "gameover", no more words are allowed in your reply.
32 | In each round, you should first check if the user has directly guessed the word "{word}" or guessed similar descriptions like "a kind of {word}" or "a type of {word}." If so, the game is over. You should directly output "gameover".
33 | Remember, you can only reply "yes," "no," or "gameover."
34 | '''
35 |     return prompt_easy, prompt_hard
36 |
37 | def get_questioner_role():
38 |
39 |     prompt_easy = f'''
40 | You are playing a game called "Ask-Guess"
41 |
42 | The rules of this game are as follows:
43 | The game requires two players, a questioner and an answerer.
44 | At the beginning of the game, the answerer is given a word or phrase as the answer, which the questioner does not know.
45 | The answerer can first briefly describe the given word or phrase without directly mentioning it.
46 | Then, the questioner must guess the word or phrase by asking a series of questions, but only one question in each round.
47 | The answerer should answer the questions faithfully, even give some hints to guide the questioner to guess the answer and check whether the questioner has guessed the answer. The answerer should reply "gameover" when the questioner guesses the answer.
48 | But the answerer must not directly say the given word or phrase when answering the question.
49 | The goal of the game is to guess the given word or phrase using as few rounds of Q&A as possible.
50 |
51 | Now, you are the questioner. You should guess the word or phrase by asking questions, but only one question in each round. Your question should be helpful to guess the word or phrase; do not ask irrelevant questions.
52 | '''
53 |     prompt_hard = f'''
54 | You are playing a game called "Ask-Guess"
55 |
56 | The rules of this game are as follows:
57 | The game requires two players, a questioner and an answerer.
58 | At the beginning of the game, the answerer is given a word or phrase as the answer, which the questioner does not know.
59 | Then, the questioner must guess the word or phrase by asking a series of questions, but only one question in each round.
60 | The answerer should only reply "yes" or "no" to the question or reply "gameover" when the questioner has guessed the answer.
61 | The game's goal is to guess the given word or phrase using as few rounds of Q&A as possible.
62 |
63 | Now, you are the questioner, you should guess the word or phrase by asking questions, but only one question in each round. Your question should be helpful to guess the word or phrase, do not ask irrelevant questions.
64 | '''
65 |     return prompt_easy, prompt_hard
66 |
67 |
68 | host_description_prompt = '''Now the game starts. Answerer, please give a short description of your received word or phrase.'''
69 |
70 | host_qa_prompt = '''Now the Q&A starts. Questioner, please guess the answer!'''
--------------------------------------------------------------------------------
/askguess/utils/utils.py:
--------------------------------------------------------------------------------
1 | from chat.gpt3_chat import GPT3
2 | from chat.gpt4_chat import GPT4
3 | from chat.text003_chat import Text003
4 |
5 | def get_model(model_name):
6 |     model = None
7 |     if model_name == "gpt3":
8 |         model = GPT3()
9 |     if model_name == "gpt4":
10 |         model = GPT4()
11 |     if model_name == "td003":
12 |         model = Text003()
13 |     return model
14 |
15 | def create_message(role,content):
16 |     return {"role":role,"content":content}
17 |
18 | def print_messages(messages):
19 |     for message in messages:
20 |         print(message)
21 |
22 | def convert_messages_to_prompt(messages,role):
23 |     # Flatten a chat history into one text prompt, from the given player's point of view.
24 |     prompt = ""
25 |     if role == "questioner":
26 |         for message in messages:
27 |             content = message["content"]
28 |             if message["role"] == "user":
29 |                 prompt += f"questioner: {content}\n"
30 |             elif message["role"] == "assistant":
31 |                 prompt += f"answerer: {content}\n"
32 |             else:
33 |                 prompt += f"host: {content}\n"
34 |         prompt += "questioner: "
35 |     else:
36 |         for message in messages:
37 |             content = message["content"]
38 |             if message["role"] == "assistant":
39 |                 prompt += f"questioner: {content}\n"
40 |             elif message["role"] == "user":
41 |                 prompt += f"answerer: {content}\n"
42 |             else:
43 |                 prompt += f"host: {content}\n"
44 |         prompt += "answerer: "
45 |
46 |     return prompt
--------------------------------------------------------------------------------
/assets/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jordddan/GameEval/5025a1abfc00de62aab177a016cc8b64d2c1b4ab/assets/.DS_Store
--------------------------------------------------------------------------------
/assets/GameEval.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jordddan/GameEval/5025a1abfc00de62aab177a016cc8b64d2c1b4ab/assets/GameEval.png
--------------------------------------------------------------------------------
/assets/case1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jordddan/GameEval/5025a1abfc00de62aab177a016cc8b64d2c1b4ab/assets/case1.png
--------------------------------------------------------------------------------
/assets/res.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jordddan/GameEval/5025a1abfc00de62aab177a016cc8b64d2c1b4ab/assets/res.png
--------------------------------------------------------------------------------
/assets/res1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jordddan/GameEval/5025a1abfc00de62aab177a016cc8b64d2c1b4ab/assets/res1.png
--------------------------------------------------------------------------------
/assets/res2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jordddan/GameEval/5025a1abfc00de62aab177a016cc8b64d2c1b4ab/assets/res2.png
--------------------------------------------------------------------------------
/chat/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jordddan/GameEval/5025a1abfc00de62aab177a016cc8b64d2c1b4ab/chat/__init__.py
--------------------------------------------------------------------------------
/chat/config.py:
--------------------------------------------------------------------------------
1 | key_gpt3 = "a41acec784d340b184bc71d850c97a7f"
2 | key_gpt4 = "a41acec784d340b184bc71d850c97a7f"
3 | key_td003 = "653880d85b6e4a209206c263d7c3cc7a"
4 |
5 | api_type_gpt3 = "azure"
6 | api_base_gpt3 = "https://mtutor-dev.openai.azure.com/"
7 | api_version_gpt3 = "2023-03-15-preview"
8 |
9 | api_type_gpt4 = "azure"
10 | api_base_gpt4 = "https://mtutor-dev.openai.azure.com/"
11 | api_version_gpt4 = "2023-03-15-preview"
12 |
13 | api_type_td003 = "azure"
14 | api_base_td003 = "https://gcrgpt4aoai5.openai.azure.com"
15 | api_version_td003 = "2023-03-15-preview"
16 |
17 | engine_gpt4 = "devgpt4-32k"
18 | engine_gpt3 = "mtutor-openai-dev"
19 | engine_td003 = "text-davinci-003"
20 |
21 | temperature_gpt3 = 0.7
22 | temperature_gpt4 = 0.7
23 | temperature_td003 = 0.7
--------------------------------------------------------------------------------
/chat/config_example.py:
--------------------------------------------------------------------------------
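1 | key_gpt3 = ""  # Azure OpenAI API key for the ChatGPT deployment
2 | key_gpt4 = ""  # Azure OpenAI API key for the GPT-4 deployment
3 | key_td003 = ""  # Azure OpenAI API key for the text-davinci-003 deployment
4 |
5 | api_type_gpt3 = "azure"
6 | api_base_gpt3 = ""  # endpoint URL of your Azure OpenAI resource
7 | api_version_gpt3 = "2023-03-15-preview"
8 |
9 | api_type_gpt4 = "azure"
10 | api_base_gpt4 = ""  # endpoint URL of your Azure OpenAI resource
11 | api_version_gpt4 = "2023-03-15-preview"
12 |
13 | api_type_td003 = "azure"
14 | api_base_td003 = ""  # endpoint URL of your Azure OpenAI resource
15 | api_version_td003 = "2023-03-15-preview"
16 |
17 | engine_gpt4 = ""  # name of your GPT-4 deployment
18 | engine_gpt3 = ""  # name of your ChatGPT deployment
19 | engine_td003 = ""  # name of your text-davinci-003 deployment
20 |
21 | temperature_gpt3 = 0.7
22 | temperature_gpt4 = 0.7
23 | temperature_td003 = 0.7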
--------------------------------------------------------------------------------
/chat/gpt3_chat.py:
--------------------------------------------------------------------------------
1 | import sys
2 | import openai
3 | from func_timeout import func_set_timeout
4 |
5 | from chat.config import key_gpt3, api_type_gpt3, api_base_gpt3, api_version_gpt3, engine_gpt3, temperature_gpt3
6 |
7 | @func_set_timeout(15)
8 | def get_response(messages):
9 |     response = openai.ChatCompletion.create(
10 |         temperature=temperature_gpt3,
11 |         engine=engine_gpt3,
12 |         messages = messages,
13 |         api_type=api_type_gpt3,
14 |         api_base=api_base_gpt3,
15 |         api_version=api_version_gpt3,
16 |         api_key=key_gpt3,
17 |     )
18 |     return response
19 |
20 | class GPT3:
21 |     def __init__(self) -> None:
22 |         self.name = "gpt3"
23 |
24 |     def single_chat(self,content,role=None):
25 |         if role is None:
26 |             role = "You are an AI assistant that helps people find information."
27 |         messages = [
28 |             {"role":"system","content":role},
29 |             {"role":"user","content":content}
30 |         ]
31 |         res = None
32 |         cnt = 0
33 |         # Retry on any failure; give up after 5 attempts and return None.
34 |         while True:
35 |             try:
36 |                 response = get_response(messages)
37 |                 res = response["choices"][0]["message"]["content"]
38 |                 break
39 |             except:
40 |                 cnt += 1
41 |                 if cnt >= 5:
42 |                     break
43 |
44 |         return res
45 |
46 |     def multi_chat(self, messages):
47 |         res = None
48 |         cnt = 0
49 |         # Retry on any failure; give up after 3 attempts and return None.
50 |         while True:
51 |             try:
52 |                 response = get_response(messages)
53 |                 res = response["choices"][0]["message"]["content"]
54 |                 break
55 |             except:
56 |                 cnt += 1
57 |                 if cnt >= 3:
58 |                     break
59 |
60 |         return res
61 |
62 | if __name__ == "__main__":
63 |     pass
--------------------------------------------------------------------------------
/chat/gpt4_chat.py:
--------------------------------------------------------------------------------
1 | import os
2 | import sys
3 |
4 | import openai
5 |
6 | from func_timeout import func_set_timeout
7 | from chat.config import key_gpt4, api_type_gpt4, api_base_gpt4, api_version_gpt4, engine_gpt4,temperature_gpt4
8 |
9 | @func_set_timeout(30)
10 | def get_response(messages):
11 |     response = openai.ChatCompletion.create(
12 |         temperature=temperature_gpt4,
13 |         engine=engine_gpt4,
14 |         messages = messages,
15 |         api_type=api_type_gpt4,
16 |         api_base=api_base_gpt4,
17 |         api_version=api_version_gpt4,
18 |         api_key=key_gpt4,
19 |     )
20 |     return response
21 |
22 | class GPT4:
23 |     def __init__(self) -> None:
24 |         self.name = "gpt4"
25 |
26 |     def single_chat(self,content,role=None):
27 |         if role is None:
28 |             role = "You are an AI assistant that helps people find information."
29 |         messages = [
30 |             {"role":"system","content":role},
31 |             {"role":"user","content":content}
32 |         ]
33 |         res = None
34 |         cnt = 0
35 |         # Retry on any failure; give up after 3 attempts and return None.
36 |         while True:
37 |             try:
38 |                 response = get_response(messages)
39 |                 res = response["choices"][0]["message"]["content"]
40 |                 break
41 |             except:
42 |                 cnt += 1
43 |                 if cnt >= 3:
44 |                     break
45 |
46 |         return res
47 |
48 |     def multi_chat(self, messages):
49 |         res = None
50 |         cnt = 0
51 |         # Retry on any failure; give up after 3 attempts and return None.
52 |         while True:
53 |             try:
54 |                 response = get_response(messages)
55 |                 res = response["choices"][0]["message"]["content"]
56 |                 break
57 |             except:
58 |                 cnt += 1
59 |                 if cnt >= 3:
60 |                     break
61 |
62 |         return res
63 |
64 | if __name__ == "__main__":
65 |     pass
--------------------------------------------------------------------------------
/chat/text003_chat.py:
--------------------------------------------------------------------------------
1 | import openai
2 | import sys
3 | from func_timeout import func_set_timeout
4 | from chat.config import key_td003, api_type_td003, api_base_td003, api_version_td003, engine_td003,temperature_td003
5 | import re
6 |
7 | import random
8 |
9 | @func_set_timeout(30)
10 | def get_response(prompt):
11 |
12 | response = openai.Completion.create(engine=engine_td003,
13 | temperature=temperature_td003,
14 | prompt = prompt,
15 | max_tokens = 150,
16 | api_type = api_type_td003,
17 | api_base = api_base_td003,
18 | api_version = api_version_td003,
19 | api_key=key_td003,
20 | )
21 |
22 | return response
23 |
24 |
25 | def extract_json(string):
26 | l = string.find("{")
27 | r = string.find("}") + 1
28 | json_string = string[l:r]
29 |
30 | return json_string
31 |
32 |
33 | class Text003:
34 | def __init__(self) -> None:
35 | self.name = "td003"
36 |
37 | def single_chat(self,prompt):
38 | cnt = 0
39 | res = None
40 |
41 | # import pdb
42 | # pdb.set_trace()
43 | while True:
44 | try:
45 | response = get_response(prompt)
46 | res = response['choices'][0]['text'].replace('\n', '').replace(' .', '.').strip()
47 | break
48 | except:
49 | cnt += 1
50 | if cnt >= 3:
51 | break
52 | return res
53 |
54 |
55 | if __name__ == "__main__":
56 |
57 | pass
58 |
59 |
60 |
61 |
62 |
63 |
64 |
65 |
66 |
67 |
--------------------------------------------------------------------------------
/game_askguess.py:
--------------------------------------------------------------------------------
1 | import json
2 |
3 | from askguess.agents.answer_agent import AnswerAgnent
4 | from askguess.agents.question_agent import QuestionAgnent
5 | import vthread
6 | import logging
7 | import os
8 | import re
9 | from askguess.utils.utils import get_model, create_message
10 | from askguess.utils.prompt import host_description_prompt, host_qa_prompt
11 | import argparse
12 | import json
13 |
14 | def checkin(word, text):
15 | '''
16 | Check whether the answerer used the target word directly as a hint.
17 | '''
18 | pattern = r'[^\w\s]'
19 |
20 | replaced_text = re.sub(pattern, ' ', text)
21 |
22 | if (" " + word + " ") in (" " + replaced_text + " "): # pad the text so a match at its very start or end is also found
23 | return True
24 |
25 | return False
26 |
27 | def game(object, f, model, args):
28 | word = object.replace("_"," ")
29 | flag = False # True means a successful trial; False means an error happened.
30 | cnt = 0 # to record the game round
31 | answer_agent = AnswerAgnent(model, word, args)
32 | question_agent = QuestionAgnent(model, word, args)
33 | error_type = None
34 | while True:
35 | # ---------- Describing Stage ----------
36 | if args.mode == "easy" and cnt == 0:
37 | # host describing prompt
38 | host_message = create_message("system",host_description_prompt)
39 | answer_agent.update_history(host_message)
40 | question_agent.update_history(host_message)
41 |
42 | description = answer_agent.play()
43 | f.write(f"description: {description}"+"\n")
44 |
45 | if description is None:
46 | error_type = "ChatError"
47 | break
48 | if checkin(word.lower(), description.lower()):
49 | error_type = "AnswerMentionedError"
50 | break
51 |
52 | questioner_message = create_message("user",description)
53 | answerer_messsage = create_message("assistant",description)
54 | answer_agent.update_history(answerer_messsage)
55 | question_agent.update_history(questioner_message)
56 | # import pdb
57 | # pdb.set_trace()
58 | if args.debug:
59 | print(f"description: {description}"+"\n")
60 | f.write(f"description: {description}"+"\n")
61 | cnt += 1
62 | continue
63 |
64 | # ---------- Q&A Stage ----------
65 |
66 | # host Q&A prompt
67 | host_message = create_message("system",host_qa_prompt)
68 | answer_agent.update_history(host_message)
69 | question_agent.update_history(host_message)
70 |
71 | question = question_agent.play()
72 |
73 | if question is None:
74 | error_type = "ChatError"
75 | break
76 | questioner_message = create_message("assistant",question)
77 | answerer_messsage = create_message("user",question)
78 | answer_agent.update_history(answerer_messsage)
79 | question_agent.update_history(questioner_message)
80 | if args.debug:
81 | print(f"question: {question}"+"\n")
82 | f.write(f"question: {question}"+"\n")
83 |
84 | answer = answer_agent.play()
85 | if answer is None:
86 | error_type = "ChatError"
87 | break
88 | questioner_message = create_message("user",answer)
89 | answerer_messsage = create_message("assistant",answer)
90 | answer_agent.update_history(answerer_messsage)
91 | question_agent.update_history(questioner_message)
92 | if args.debug:
93 | print(f"answer: {answer}"+"\n")
94 | f.write(f"answer: {answer}"+"\n")
95 | cnt += 1
96 |
97 | # ---------- Check the Result ----------
98 | if "gameover" in answer.lower() or "game over" in answer.lower():
99 | if word.lower() in question.lower():
100 | flag = True
101 | break
102 | else:
103 | # the answerer ended the game although the word was not guessed
104 | error_type = "EndingError"
105 | break
106 |
107 | # break the rule
108 | if checkin(word.lower(), answer.lower()):
109 | error_type = "AnswerMentionedError"
110 | break
111 | if cnt > 30:
112 | error_type = "RoundLimitError"
113 | break
114 |
115 | if flag:
116 | if args.debug:
117 | print({"object":word,"round":cnt-1,"error_type":"SuccessfulTrial"})
118 | return {"object":word,"round":cnt-1,"error_type":"SuccessfulTrial"}
119 | else:
120 | if args.debug:
121 | print({"object":word,"round":-1,"error_type":error_type})
122 | return {"object":word,"round":-1,"error_type":error_type}
123 |
124 |
125 | @vthread.pool(1)
126 | def run(word,i,model,args):
127 | with open(f"askguess/logs_{args.mode}_{args.model_name}/{word}/{i}.log","w") as f:
128 | res = game(word,f,model,args)
129 | res["log"] = f"{i}.log"
130 | with open(f"askguess/guess_result_{args.mode}_{args.model_name}.json","a") as f:
131 | f.write(json.dumps(res)+"\n")
132 |
133 | if __name__ == "__main__":
134 |
135 | parser = argparse.ArgumentParser()
136 | parser.add_argument('--label_path', type=str, default='askguess/labels.json')
137 | parser.add_argument('--model_name',type=str,default='gpt3')
138 | parser.add_argument('--mode',type=str,default='easy')
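139 | parser.add_argument('--n',type=int,default=20)
140 | parser.add_argument('--debug',type=lambda s: s.lower() in ('1','true','yes'),default=True) # type=bool would treat any non-empty string, including 'False', as True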
141 | args = parser.parse_args()
142 | model = get_model(args.model_name)
143 | with open(args.label_path,'r') as f:
144 | labels = json.load(f)
145 |
146 | ## prepare log folder
147 | for label in labels:
148 | log_dir = f"askguess/logs_{args.mode}_{args.model_name}/{label}"
149 | if not os.path.exists(log_dir):
150 | os.makedirs(log_dir)
151 |
152 | ## run multiple times for each word
153 | for label in labels:
154 | # run args.n independent trials for this word
155 | for i in range(args.n):
156 | run(label,i,model,args)
157 |
158 | vthread.pool.wait()
159 |
160 |
--------------------------------------------------------------------------------
/game_spyfall.py:
--------------------------------------------------------------------------------
1 | from spyfall.agents.base_agent import BaseAgent
2 |
3 | from chat.gpt3_chat import GPT3
4 | from chat.gpt4_chat import GPT4
5 | from chat.text003_chat import Text003
6 | from spyfall.utils.utils import create_message, get_model
7 | import argparse
8 | import vthread
9 | import json
10 | import copy
11 | import random
12 | import os
13 | # gpt3 = GPT3()
14 | # gpt4 = GPT4()
15 | # text003 = Text003()
16 | # name2model = {"gpt3":gpt3,"gpt4":gpt4,"text003":text003}
17 | players = ["Nancy","Tom","Cindy","Jack","Rose","Edward"]
18 |
19 | def init_game(phrase_pair,spy_model, villager_model):
20 | # phrase_pair[0]:spy word, phrase_pair[1]:common word
21 | spy_word = phrase_pair[0]
22 | villager_word = phrase_pair[1]
23 | random.shuffle(players)
24 | name2agent = {}
25 | index = random.randint(1,len(players)) # index of the spy
26 | spy_name = players[index-1]
27 | agents_list = []
28 | for i in range(len(players)):
29 |
30 | if i+1 == index:
31 | phrase = spy_word
32 | llm = spy_model
33 | llm_name = llm.name
34 | else:
35 | phrase = villager_word
36 | llm = villager_model
37 | llm_name = villager_model.name
38 |
39 | player_name = players[i]
40 | agent = BaseAgent(llm,llm_name,player_name,players,phrase)
41 | name2agent[player_name] = agent
42 | agents_list.append(agent)
43 | settings = f'''The spy word is: {spy_word};\n The villager word is {villager_word}.\n'''
44 | for agent in agents_list:
45 | settings += f"Player: {agent.player_name}; LLM: {agent.llm_name}; Assigned Word: {agent.phrase} \n"
46 |
47 | return agents_list, spy_name, index, settings
48 |
49 | def get_voted_name(name_list):
50 | counts = {}
51 |
52 | for string in name_list:
53 | if string in counts:
54 | counts[string] += 1
55 | else:
56 | counts[string] = 1
57 |
58 | max_count = 0
59 | most_frequent_string = None
60 | freq = []
61 | for string, count in counts.items():
62 | freq.append(count)
63 |
64 | if count > max_count:
65 | max_count = count
66 | most_frequent_string = string
67 |
68 | freq.sort()
69 | return most_frequent_string, freq
70 |
71 | def update_history(agents_list:list[BaseAgent], temp_message, player_name, public_messages):
72 |
73 | for agent in agents_list:
74 | if agent.player_name != player_name:
75 | agent.private_history.append(temp_message)
76 | public_messages.append(temp_message)
77 |
78 | def get_result(agent_list:list[BaseAgent], spy_index, round, i, winer):
79 |
80 | llms = [agent.llm_name for agent in agent_list]
81 | players = [agent.player_name for agent in agent_list]
82 |
83 | return {"winer":winer,"players":players,"llms":llms,"spy_index":spy_index,"round":round,"log":f"{i}.log"}
84 |
85 |
86 | def game(f, phrase_pair, spy_model, villager_model, i, args):
87 |
88 | agents_list, spy_name, spy_index, game_settings = init_game(phrase_pair,spy_model,villager_model)
89 | f.write(game_settings)
90 | living_players = copy.deepcopy(players)
91 |
92 | # game start
93 | PUBLIC_MESSAGES = []
94 | host_speech= "Host: The game now start."
95 | start_message = create_message("user",host_speech)
96 | update_history(agents_list,start_message,"host",PUBLIC_MESSAGES)
97 | f.write(host_speech + "\n")
98 |
99 | host_speech = f"Host: The living players are:{json.dumps(living_players)}"
100 | living_player_message = create_message("user",host_speech)
101 | update_history(agents_list,living_player_message,"host",PUBLIC_MESSAGES)
102 | f.write(host_speech + "\n")
103 |
104 | if args.debug:
105 | print(game_settings)
106 | game_round = 0
107 | while True:
108 | game_round += 1
109 | ## describing
110 | f.write("---------describing stage-------------\n")
111 |
112 | host_speech = f"Host: Now it's the describing stage, players have to say something about the received word without directly saying it."
113 | host_message = create_message("user",host_speech)
114 | update_history(agents_list,host_message,"host",PUBLIC_MESSAGES)
115 | f.write(host_speech+"\n\n")
116 |
117 | for agent in agents_list:
118 |
119 | if agent.player_name not in living_players:
120 | continue
121 |
122 | host_speech = f"Host: {agent.player_name}, it's your turn."
123 | host_message = create_message("user",host_speech)
124 | update_history(agents_list,host_message,"host",PUBLIC_MESSAGES)
125 | f.write(host_speech+"\n")
126 |
127 | description, cot = agent.describe()
128 |
129 | temp = f"{agent.player_name}: {description}"
130 | public_message = create_message("user",temp)
131 | update_history(agents_list,public_message,agent.player_name,PUBLIC_MESSAGES)
132 | PUBLIC_MESSAGES.append(cot)
133 | private_message = create_message("assistant",json.dumps(cot))
134 | agent.private_history.append(private_message)
135 | f.write(temp+"\n")
136 | f.write(json.dumps(cot)+"\n\n")
137 |
138 | if args.debug:
139 | print(cot,agent.phrase)
140 | print(temp)
141 | print()
142 |
143 |
144 | f.write("---------voting stage-------------\n")
145 |
146 | host_speech = "Host: Now the voting start, please vote for the player you think is the spy and tell the reason why you think he is the spy."
147 | host_message = create_message("user",host_speech)
148 | update_history(agents_list,host_message,"host",PUBLIC_MESSAGES)
149 | f.write(host_speech+"\n")
150 |
151 | name_list = []
152 | for agent in agents_list:
153 | 
154 | # skip eliminated players before the host announces a turn, as in the describing stage
155 | if agent.player_name not in living_players:
156 | continue
157 | host_speech = f"Host: {agent.player_name}, it's your turn."
158 | host_message = create_message("user",host_speech)
159 | update_history(agents_list,host_message,"host",PUBLIC_MESSAGES)
160 | f.write(host_speech+"\n")
161 | name, speak, cot = agent.vote()
162 |
163 | # private message for the player
164 | private_message = create_message("assistant",json.dumps(cot))
165 | agent.private_history.append(private_message)
166 |
167 | # public message for the other players
168 | temp = f"{agent.player_name}: {speak}, i will vote {name} as the spy."
169 | public_message = create_message("user",temp)
170 | update_history(agents_list,public_message,agent.player_name,PUBLIC_MESSAGES)
171 | PUBLIC_MESSAGES.append(cot)
172 |
173 | if args.debug:
174 | print(cot)
175 | print(temp)
176 | print()
177 |
178 | f.write(temp+"\n")
179 | f.write(json.dumps(cot)+"\n")
180 |
181 | name_list.append(name)
182 |
183 | ## result of this round
184 | final_name, freq = get_voted_name(name_list)
185 |
186 | if final_name not in living_players:
187 | log_content = "Agent did not return a valid player name."
188 | print(log_content)
189 | f.write(log_content+"\n")
190 | return get_result(agents_list,spy_index,-1,i,"exit"),PUBLIC_MESSAGES
191 |
192 | if final_name == spy_name:
193 | log_content = "the spy loss the game"
194 | print(log_content)
195 | f.write(log_content)
196 | # game over, the spy loses
197 | return get_result(agents_list,spy_index,game_round,i,"villager"),PUBLIC_MESSAGES
198 | else:
199 | ## remove a player
200 | living_players.remove(final_name)
201 |
202 | host_speech = f"Host: the voting result is {final_name}, he is not the spy. The spy still lives, the game will continue. In the next round, the players' descriptions need to be more specific."
203 | host_message = create_message("user",host_speech)
204 | update_history(agents_list,host_message,"host",PUBLIC_MESSAGES)
205 | f.write(host_speech+"\n\n")
206 |
207 | host_speech = f"Host: Now the living players are:{json.dumps(living_players)}"
208 | living_player_message = create_message("user",host_speech)
209 | update_history(agents_list,living_player_message,"host",PUBLIC_MESSAGES)
210 | f.write(host_speech+"\n\n")
211 |
212 |
213 | if len(living_players) <= 3:
214 | log_content = "the spy win the game"
215 | print(log_content)
216 | f.write(log_content)
217 | return get_result(agents_list,spy_index,game_round,i,"spy"),PUBLIC_MESSAGES
218 |
219 |
220 | if __name__ == "__main__":
221 |
222 | parser = argparse.ArgumentParser()
223 | parser.add_argument('--label_path', type=str, default='spyfall/labels.txt')
224 | parser.add_argument('--spy_model_name',type=str,default='td003')
225 | parser.add_argument('--villager_model_name',type=str,default='gpt3')
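226 | parser.add_argument('--n',type=int,default=1)
227 | parser.add_argument('--debug',type=lambda s: s.lower() in ('1','true','yes'),default=True) # type=bool would treat any non-empty string, including 'False', as True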
228 | args = parser.parse_args()
229 | with open(args.label_path,'r') as f:
230 | data = f.readlines()
231 |
232 | log_path = f"spyfall/logs/{args.spy_model_name}_{args.villager_model_name}"
233 | labels = []
234 | for item in data:
235 | labels.append(item.strip().split(","))
236 |
237 | for label in labels:
238 | dir_name = f"{label[0]}&{label[1]}"
239 | if not os.path.exists(os.path.join(log_path,dir_name)):
240 | os.makedirs(os.path.join(log_path,dir_name))
241 |
242 | spy_model = get_model(args.spy_model_name)
243 | villager_model = get_model(args.villager_model_name)
244 |
245 | for j in range(args.n):
246 | for i in range(len(labels)):
247 | label = labels[i]
248 | dir_name = f"{log_path}/{label[0]}&{label[1]}"
249 | with open(f"{dir_name}/{j}.log",'w') as f:
250 | res, history = game(f=f,
251 | phrase_pair=label,
252 | spy_model=spy_model,
253 | villager_model=villager_model,
254 | i=j,
255 | args=args)
--------------------------------------------------------------------------------
/game_tofukingdom.py:
--------------------------------------------------------------------------------
1 | from tofukingdom.agents import ChefAgent,SpyAgent,MaidAgent,GuardAgent,QueenAgent,PrinceAgent,PrincessAgent,MinisterAgent
2 | from tofukingdom.utils.utils import create_message
3 | from chat.gpt3_chat import GPT3
4 | from chat.gpt4_chat import GPT4
5 | import vthread
6 | import json
7 | import random
8 | import argparse
9 | import os
10 | from tofukingdom.utils.utils import get_model
11 | gpt3 = GPT3()
12 | gpt4 = GPT4()
13 | players = ["Nancy","Tom","Cindy","Jack","Rose","Edward","Robert"]
14 |
15 |
16 | def init_game(players,prince_model,queen_model,spy_model):
17 | name2agent = {}
18 | random.shuffle(players)
19 | agents = []
20 | agents.append(PrincessAgent(prince_model,players[0],players))
21 | agents.append(ChefAgent(prince_model,players[1],players))
22 | agents.append(SpyAgent(spy_model,players[2],players))
23 | agents.append(MaidAgent(spy_model,players[3],players))
24 | agents.append(GuardAgent(queen_model,players[4],players))
25 | agents.append(QueenAgent(queen_model,players[5],players))
26 | agents.append(MinisterAgent(queen_model,players[6],players))
27 | random.shuffle(agents)
28 | settings = f"PrinceModel: {prince_model.name}\n QueenModel: {queen_model.name} \n SpyModel: {spy_model.name} \n"
29 | identities = ""
30 | for agent in agents:
31 | name2agent[agent.player_name] = agent
32 | settings += f"Player: {agent.player_name}; LLM: {agent.chatbot.name}; Identity: {agent.role}; \n"
33 | identities += f"Player {agent.player_name} is the {agent.role};"
34 | return agents, name2agent, settings, identities
35 |
36 | def get_identity_text(agents):
37 | res = ""
38 | for agent in agents:
39 | res += f"{agent.player_name} is the {agent.role}. \n"
40 | return res
41 |
42 | def get_game_result(final_name,i):
43 | res = {"winner":final_name,"log":f"{i}.log"}
44 | return res
45 |
46 | def update_history(agents_list, temp_message, player_name):
47 |
48 | for agent in agents_list:
49 | if agent.player_name != player_name:
50 | agent.private_history.append(temp_message)
51 |
52 |
53 | def game(f,round,prince_model,queen_model,spy_model):
54 |
55 | prince = PrinceAgent(prince_model,players)
56 | print(f"The {round}-th game begins.\n")
57 |
58 | random.shuffle(players)
59 |
60 | agents_list, name2agent, settings, identities = init_game(players,prince_model,queen_model,spy_model)
61 | if args.debug:
62 | print(settings)
63 | print()
64 | identities = get_identity_text(agents_list)
65 |
66 | host_speech= "Host: The game now start."
67 | start_message = create_message("user",host_speech)
68 | update_history(agents_list,start_message,"host")
69 | prince.private_history.append(start_message)
70 | f.write(host_speech + "\n")
71 | if args.debug:
72 | print(host_speech)
73 |
74 | for agent in agents_list:
75 | player_name = agent.player_name
76 |
77 | host_speech= f"Host: The Prince please ask player {player_name} one question."
78 | host_message = create_message("user",host_speech)
79 | update_history(agents_list,host_message,"host")
80 | prince.private_history.append(host_message)
81 | f.write(host_speech + "\n")
82 | if args.debug:
83 | print(host_speech)
84 |
85 | # prince ask question
86 | question, cot = prince.ask()
87 | if question is None:
88 | error = "Question is None."
89 | print(error)
90 | return {"error":error}
91 | temp = f"Prince: {question}"
92 | temp_message = create_message("user",temp)
93 | update_history(agents_list,temp_message,"Prince")
94 | prince_message = create_message("assistant",json.dumps(cot))
95 | prince.private_history.append(prince_message)
96 | f.write(temp+"\n")
97 | f.write(json.dumps(cot)+"\n")
98 | if args.debug:
99 | print(temp)
100 | print(json.dumps(cot))
101 | print()
102 |
103 | # player answer question
104 | answer, cot = agent.chat(identities)
105 | if answer is None:
106 | error = "Answer is None."
107 | print(error)
108 | return {"error":error}
109 | temp = f"{player_name}: {answer}"
110 | temp_message = create_message("user",temp)
111 | update_history(agents_list,temp_message,player_name)
112 | private_message = create_message("assistant",json.dumps(cot))
113 | agent.private_history.append(private_message)
114 | prince.private_history.append(temp_message)
115 | f.write(temp+"\n")
116 | f.write(json.dumps(cot)+"\n")
117 | if args.debug:
118 | print(temp)
119 | print(json.dumps(cot))
120 | print()
121 |
122 | # choose a player to ask one more question
123 | host_speech = f"Host: The Prince please choose a player to ask an extra question."
124 | host_message = create_message("user",host_speech)
125 | update_history(agents_list,host_message,"host")
126 | prince.private_history.append(host_message)
127 | f.write(host_speech + "\n")
128 | if args.debug:
129 | print(host_speech)
130 | print()
131 |
132 | name, question, cot = prince.ask_choose()
133 | if name not in name2agent:
134 | error = "Extra name is missing or not a valid player."
135 | print(error)
136 | return {"error":error}
137 | temp = f"Prince: I choose {name}, my question is {question}"
138 | temp_message = create_message("user",temp)
139 | update_history(agents_list,temp_message,"Prince")
140 | prince_message = create_message("assistant",json.dumps(cot))
141 | prince.private_history.append(prince_message)
142 | f.write(temp+"\n")
143 | f.write(json.dumps(cot)+"\n")
144 | if args.debug:
145 | print(temp)
146 | print(json.dumps(cot))
147 | print()
148 | # the chosen player, not the last player of the loop above, answers the extra question
149 | agent, player_name = name2agent[name], name
150 | answer, cot = agent.chat(identities)
151 | if answer is None:
152 | error = "Extra answer is None."
153 | print(error)
154 | return {"error":error}
155 | temp = f"{player_name}: {answer}"
156 | temp_message = create_message("user",temp)
157 | update_history(agents_list,temp_message,player_name)
158 | private_message = create_message("assistant",json.dumps(cot))
159 | agent.private_history.append(private_message)
160 | prince.private_history.append(temp_message)
161 | f.write(temp+"\n")
162 | f.write(json.dumps(cot)+"\n")
163 | if args.debug:
164 | print(temp)
165 | print(json.dumps(cot))
166 | print()
167 |
168 | # choose the princess
169 |
170 | host_speech = f"Host: Who do you think is the true princess?"
171 | host_message = create_message("user",host_speech)
172 | update_history(agents_list,host_message,"host")
173 | prince.private_history.append(host_message)
174 | f.write(host_speech + "\n")
175 | if args.debug:
176 | print(host_speech)
177 | print()
178 |
179 | name, cot = prince.choose()
180 | if name not in name2agent:
181 | error = "Final answer is missing or not a valid player."
182 | print(error)
183 | return {"error":error}
184 | if args.debug:
185 | print(f"The final choice is {name}")
186 | print(json.dumps(cot))
187 | print()
188 |
189 | game_result = get_game_result(name2agent[name].role,round)
190 | if args.debug:
191 | print(game_result)
192 | return game_result
193 |
194 |
195 |
196 | if __name__ == "__main__":
197 | parser = argparse.ArgumentParser()
198 | parser.add_argument('--prince_model_name',type=str,default='gpt3')
199 | parser.add_argument('--spy_model_name',type=str,default='td003')
200 | parser.add_argument('--queen_model_name',type=str,default='gpt3')
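201 | parser.add_argument('--n',type=int,default=1)
202 | parser.add_argument('--debug',type=lambda s: s.lower() in ('1','true','yes'),default=True) # type=bool would treat any non-empty string, including 'False', as True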
203 | args = parser.parse_args()
204 | prince_model = get_model(args.prince_model_name)
205 | spy_model = get_model(args.spy_model_name)
206 | queen_model = get_model(args.queen_model_name)
207 |
208 |
209 | for i in range(args.n):
210 | log_dir = f"tofukingdom/logs/{args.prince_model_name}_{args.queen_model_name}_{args.spy_model_name}"
211 | if not os.path.exists(log_dir):
212 | os.makedirs(log_dir)
213 | file_name = os.path.join(log_dir,f"{i}.txt")
214 | with open(file_name,'w') as f:
215 | game_result = game(f,i,prince_model,queen_model,spy_model)
216 | if game_result:
217 | with open("tofukingdom/result.json","a") as f:
218 | f.write(json.dumps(game_result)+"\n")
219 |
220 |
--------------------------------------------------------------------------------
/spyfall/.gitignore:
--------------------------------------------------------------------------------
1 | utils/key.py
2 | logs
--------------------------------------------------------------------------------
/spyfall/agents/__init__.py:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/spyfall/agents/base_agent.py:
--------------------------------------------------------------------------------
1 | from spyfall.utils.prompt import game_prompt_en
2 | from spyfall.utils.utils import create_message, print_messages
3 | import json
4 |
5 | class BaseAgent:
6 | def __init__(self,chatbot,llm_name,player_name,all_players,phrase) -> None:
7 | self.chatbot = chatbot # llm model
8 | self.llm_name = llm_name # name of the chatbot e.g. "gpt3"
9 | self.game_prompt = game_prompt_en
10 | self.player_name = player_name # player name
11 | self.phrase = phrase
12 | self.all_players = all_players # the names of all the players
13 | self.role_prompt = self.get_role_prompt()
14 |
15 | self.role_messages = self.get_role_messages()
16 | self.vote_messages = self.get_vote_messages()
17 | self.private_history = []
18 |
19 | def get_role_prompt(self):
20 | role_prompt = (
21 | f'''{self.game_prompt}'''
22 | f'''The players involved in the game are: {json.dumps(self.all_players)}.'''
23 | f'''You are {self.player_name} \n'''
24 | f'''Your given phrase is {self.phrase} \n'''
25 | )
26 |
27 | return role_prompt
28 |
29 |
30 |
31 | def get_role_messages(self):
32 |
33 | messages = []
34 | messages.append(create_message("system",self.game_prompt))
35 | temp = f"Now i have read the rules and i know how to play the game, can you offer me some key strategy to win the game? "
36 | messages.append(create_message("assistant",temp))
37 |
38 | temp = f"Sure. At the beginning of the game, or when you are not sure whether you are the spy, you can speak with very general descriptions and use as few words as you can. "
39 | messages.append(create_message("user",temp))
40 | temp = f"For example, if your word is 'apple', you can say something like 'this is a very common object' or 'it's a kind of fruit'. "
41 | messages.append(create_message("user",temp))
42 | temp = f"You need to analyze the speech of other players carefully to guess what is the common word and what is the spy word."
43 | messages.append(create_message("user",temp))
44 | temp = f"If you are sure that you are a spy, you should try to conceal your identity and confuse others not to vote you."
45 | messages.append(create_message("user",temp))
46 | temp = f"I understand. "
47 | messages.append(create_message("assistant",temp))
48 | temp = f"Now you are {self.player_name}, the word you get is {self.phrase}. You don't know the word of other players. "
49 | messages.append(create_message("user",temp))
50 | temp = f"Received. "
51 | messages.append(create_message("assistant",temp))
52 |
53 | temp = (
54 | f'''Your reply should be a string in the json format as follows:\n'''
55 | '''{"thought":{your thought},"speak":{your speak}}\n '''
56 | f''' "thought" represents your thinking, which can be seen only by yourself. \n'''
57 | f''' "speak" represents what you say in this round, which can be seen by all the other players. \n'''
58 | )
59 | messages.append(create_message("user",temp))
60 | temp = (
61 | '''Your speak should only contain the few words about the word you received, you should not speak like 'i agree with {player_name}' or other thing irrelevant to the word you received. '''
62 | )
63 | messages.append(create_message("user",temp))
64 | temp = f"I understand. I will reply with a json string, and i will not repeat other players' speak or my own speak in the previous round. "
65 | messages.append(create_message("assistant",temp))
66 |
67 | return messages
68 |
69 |
70 | def get_vote_messages(self):
71 | messages = []
72 | messages.append(create_message("system",self.game_prompt))
73 | temp = f"Now i have read the rules. But i still need some strategies to better win the game."
74 | messages.append(create_message("assistant",temp))
75 |
76 | temp = f"The voting stage is very important. So i can give some experience in the voting stage. "
77 | messages.append(create_message("user",temp))
78 | temp = f"Great, i will learn from the experience to better win the game. "
79 | messages.append(create_message("assistant",temp))
80 | temp = f"First you should carefully think of the word each players get according to their descriptions. The player whose description is far from the others can be the spy."
81 | messages.append(create_message("user",temp))
82 | temp = "You need to constantly guess possible common words and spy words during the game process."
83 | messages.append(create_message("user",temp))
84 | temp = f"If you are sure that your word is the spy word, you should think of how to prevent being voted. You should try to confuse other players to hide your identity."
85 | temp += f"If you think you are not the spy, you should think who might be a spy. And you should encourage other players to vote for the spy in your speech if you have a specific suspicion target to be the spy."
86 | messages.append(create_message("user",temp))
87 | temp = f"I understand. "
88 | messages.append(create_message("assistant",temp))
89 |
90 | temp = f"Now you are {self.player_name}, the word you get is {self.phrase}."
91 | messages.append(create_message("user",temp))
92 |
93 | temp = f"Received. "
94 | messages.append(create_message("assistant",temp))
95 | temp = (
96 | f'''In the voting stage, your reply should be a string in the json format as follows:\n'''
97 | '''{"thought":{your thought},"speak":{your speak},"name":{voted name}} \n '''
98 | f''' "thought" represents your thinking, which can be seen only by yourself. \n'''
99 | f''' "speak" represents what you say in the game, which can be seen by all the players. \n'''
100 | f''' "name" must be selected from the living players. \n '''
101 | )
102 | messages.append(create_message("user",temp))
103 | temp = f"I understand. I will reply with a json string, and i will not repeat other players' speak or my own speak of the previous round. "
104 | messages.append(create_message("assistant",temp))
105 | return messages
106 |
107 | def describe(self):
108 | messages = self.role_messages + self.private_history
109 | messages.append(create_message("system","Remember, you must reply with a JSON string as required. You must not repeat the statements of other players or your own past statements."))
110 |
111 | if self.chatbot.name == "td003":
112 | text_prompt = self.convert_messages_to_prompt(messages)
113 | res = self.chatbot.single_chat(text_prompt)
114 | else:
115 | res = self.chatbot.multi_chat(messages)
116 |
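117 | speak = None  # stays None when the reply cannot be parsed
118 | try:
119 | res = json.loads(res)
120 | speak = res["speak"]
121 | except (json.JSONDecodeError, KeyError, TypeError):
122 | pass  # malformed reply: fall through and return the raw response
123 |
124 | return speak, res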
125 |
126 | def vote(self):
127 | messages = self.vote_messages + self.private_history
128 | messages.append(create_message("system","Remember, you must reply with a JSON string as required, and the 'speak' must not repeat the statements of other players or your own past statements. The 'name' must exactly match one of the names in the given list 'living players'. "))
129 | if self.chatbot.name == "td003":
130 | text_prompt = self.convert_messages_to_prompt(messages)
131 | res = self.chatbot.single_chat(text_prompt)
132 | else:
133 | res = self.chatbot.multi_chat(messages)
134 | thought = None
135 | speak = None
136 | name = None
137 | try:
138 | res = json.loads(res)
139 | thought = res["thought"]
140 | speak = res["speak"]
141 | name = res["name"]
142 | except (json.JSONDecodeError, KeyError, TypeError):
143 | pass  # malformed reply: keep the None defaults set above
144 |
145 | return name, speak, res
146 |
147 | def convert_messages_to_prompt(self, messages):
148 | prompt = ""
149 | for message in messages:
150 | if message["role"] == "system":
151 | prompt += "system: "
152 | prompt += message["content"]
153 | prompt += "\n"
154 | elif message["role"] == "assistant":
155 | prompt += f"{self.player_name}: "
156 | prompt += message["content"]
157 | prompt += "\n"
158 | else:
159 | prompt += message["content"]
160 | prompt += "\n"
161 | prompt += f"{self.player_name}: "
162 | return prompt
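
The `vote` method above expects a JSON reply with `"thought"`, `"speak"`, and `"name"`, and the system message requires `"name"` to come from the living players. A minimal sketch of that parsing step as a standalone function is shown below. The helper name `parse_vote_reply` and its `living_players` argument are hypothetical and not part of the repository; it only illustrates one way to reject malformed replies and unknown names.

```python
import json


def parse_vote_reply(raw, living_players):
    """Return (name, speak) from a raw model reply.

    Hypothetical helper: both values are None when the reply is not a JSON
    object, and name is None when it is not one of the living players.
    """
    try:
        reply = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        # Not valid JSON, or raw was not a string at all (e.g. None).
        return None, None
    if not isinstance(reply, dict):
        return None, None
    name = reply.get("name")
    speak = reply.get("speak")
    # Only accept a vote for a player who is still in the game.
    if not isinstance(name, str) or name not in living_players:
        name = None
    return name, speak
```

A caller could treat a `None` name as an invalid vote and either re-ask the model or skip the ballot; which of the two is appropriate depends on the game loop, which is not shown here.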
--------------------------------------------------------------------------------
/spyfall/compute_adversarial.py:
--------------------------------------------------------------------------------
1 | import json
2 | import os
3 |
4 | with open("/zecheng/qiaodan/spyfall/labels.txt",'r') as f:
5 | data = f.readlines()
6 |
7 |
8 | logs_dir = "/zecheng/qiaodan/spyfall/gpt3_gpt4"
9 | out_path = "/zecheng/qiaodan/spyfall/result/gpt3_gpt4.json"
10 | labels = []
11 | for item in data:
12 | labels.append(item.strip().split(","))
13 |
14 | def compute_adversarial(lines):
15 | cnt_spy = 0
16 | cnt_villager = 0
17 | round_avg = 0
18 | for line in lines:
19 | item = json.loads(line)
20 | winer = item["winer"]
21 | if winer == "exit":
22 | continue
23 | if winer == "spy":
24 | cnt_spy += 1
25 | else:
26 | cnt_villager += 1
27 | round_avg += item["round"]
28 |
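29 | total = cnt_spy + cnt_villager  # completed games only; "exit" games are skipped
30 | round_avg = round_avg / total if total else 0  # avoid dividing by zero
31 | return {"cnt_spy":cnt_spy,"cnt_villager":cnt_villager,"round_avg":round_avg,"rate":cnt_spy/total if total else 0}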
32 | res_dict = {}
33 | for label in labels:
34 | label_name = f"{label[0]}&{label[1]}"
35 | dir_name = os.path.join(logs_dir,label_name)
36 | with open(f"{dir_name}/res.json", 'r') as f:
37 | lines = f.readlines()
38 | res = compute_adversarial(lines)
39 | res_dict[label_name] = res
40 |
41 | avg = {"round_avg":0,"rate":0}
42 |
43 | for key,value in res_dict.items():
44 | avg["round_avg"] += value["round_avg"]
45 | avg["rate"] += value["rate"]
46 |
47 | avg["round_avg"] /= len(res_dict)  # "avg" is not yet in res_dict here, so no -1
48 | avg["rate"] /= len(res_dict)
49 | res_dict["avg"] = avg
50 |
51 | with open(out_path,'w') as f:
52 | json.dump(res_dict,f,indent=1)
53 |
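
As a rough illustration of what `compute_adversarial` calculates, the sketch below summarizes a list of JSON log lines into a spy win rate and an average round count. It assumes each line carries the `"winer"` and `"round"` keys read by the script above, and that rounds are only accumulated for completed games (the indentation of the original listing is lost, so this is an assumption). The function name `summarize_games` is hypothetical.

```python
import json


def summarize_games(lines):
    """Summarize spy wins, villager wins, average rounds and spy win rate."""
    spy_wins = 0
    villager_wins = 0
    total_rounds = 0
    for line in lines:
        item = json.loads(line)
        winner = item["winer"]  # key spelling follows the logs read by the script
        if winner == "exit":
            continue  # aborted game: not counted
        if winner == "spy":
            spy_wins += 1
        else:
            villager_wins += 1
        total_rounds += item["round"]
    games = spy_wins + villager_wins
    if games == 0:
        # No completed games: report None rather than dividing by zero.
        return {"cnt_spy": 0, "cnt_villager": 0, "round_avg": None, "rate": None}
    return {
        "cnt_spy": spy_wins,
        "cnt_villager": villager_wins,
        "round_avg": total_rounds / games,
        "rate": spy_wins / games,
    }
```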
--------------------------------------------------------------------------------
/spyfall/labels.txt:
--------------------------------------------------------------------------------
1 | ipad,iphone
2 | guitar,lute
3 | BMW,BENZ
4 | eyebrow,beard
5 | Grape,Raisins
6 | sea_lion,seal
7 | spider_man,batman
8 | nike,adidas
9 | milk,soy_milk
10 | motorcycle,electromobile
11 | tiger,lion
12 |
--------------------------------------------------------------------------------
/spyfall/utils/prompt.py:
--------------------------------------------------------------------------------
1 |
2 |
3 | game_prompt_en = '''
4 | Now you are playing the game called "who is the spy".
5 |
6 | There are six players in this game.
7 | At the beginning of the game, everyone will receive a word.
8 | There is one spy who will receive a spy word, while others will receive a common word.
9 | The spy word is different from but related to the common word. For example, the spy word can be "apple" while the common word is "banana".
10 |
11 | There are two stages in each round of the game.
12 |
13 | The first stage is describing stage:
14 | In turn, everyone describes the word they received in a word or a few words, without directly saying the word itself.
15 | The fun of the game is that, since you do not know the others' words, you cannot be sure whether you are the spy.
16 | So, you can only infer who has the different word based on the other players' descriptions.
17 |
18 | The second stage is the voting stage:
19 | After everyone speaks, each player must vote for the player they think is the spy and tell the others why. Players cannot repeat other players' speeches or their own speeches from the previous round. The player who gets the most votes is eliminated from the game.
20 |
21 | If the spy is eliminated, the rest of the players win.
22 | If fewer than three players remain and the spy is still alive, then the spy wins.
23 |
24 | '''
25 |
26 |
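
The voting rules in the prompt above can be sketched as a small function: the most-voted player is eliminated, the villagers win if that player is the spy, and the spy wins once fewer than three players remain. The function name `resolve_round` and its arguments are hypothetical, and the tie-break (the first name to reach the top count wins the tie) is an assumption, since the prompt does not say how ties are handled.

```python
from collections import Counter


def resolve_round(votes, living_players, spy):
    """Apply one voting round.

    votes maps each voter to the name they voted for.
    Returns (remaining_players, winner) where winner is "villager", "spy",
    or None if the game continues.
    """
    if not votes:
        return list(living_players), None
    tally = Counter(votes.values())
    # Assumption: on a tie, the name counted first is eliminated.
    eliminated, _ = tally.most_common(1)[0]
    remaining = [p for p in living_players if p != eliminated]
    if eliminated == spy:
        return remaining, "villager"
    if len(remaining) < 3:
        return remaining, "spy"
    return remaining, None
```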
--------------------------------------------------------------------------------
/spyfall/utils/utils.py:
--------------------------------------------------------------------------------
1 | from chat.gpt3_chat import GPT3
2 | from chat.gpt4_chat import GPT4
3 | from chat.text003_chat import Text003
4 |
5 | def get_model(model_name):
6 | model = None
7 | if model_name == "gpt3":
8 | model = GPT3()
9 | if model_name == "gpt4":
10 | model = GPT4()
11 | if model_name == "td003":
12 | model = Text003()
13 | return model
14 |
15 | def create_message(role,content):
16 | return {"role":role,"content":content}
17 |
18 | def print_messages(messages):
19 | for message in messages:
20 | print(message)
21 |
22 |
--------------------------------------------------------------------------------
/tofukingdom/.gitignore:
--------------------------------------------------------------------------------
1 | utils/key.py
2 |
--------------------------------------------------------------------------------
/tofukingdom/agents/__init__.py:
--------------------------------------------------------------------------------
1 | import sys
2 | from tofukingdom.agents.guard_agent import GuardAgent
3 | from tofukingdom.agents.maid_agent import MaidAgent
4 | from tofukingdom.agents.chef_agent import ChefAgent
5 | from tofukingdom.agents.minister_agent import MinisterAgent
6 | from tofukingdom.agents.prince_agent import PrinceAgent
7 | from tofukingdom.agents.princess_agent import PrincessAgent
8 | from tofukingdom.agents.queen_agent import QueenAgent
9 | from tofukingdom.agents.spy_agent import SpyAgent
--------------------------------------------------------------------------------
/tofukingdom/agents/base_agent.py:
--------------------------------------------------------------------------------
1 | from tofukingdom.utils.prompt import game_prompt_en
2 | from tofukingdom.utils.utils import create_message,print_messages
3 | import json
4 |
5 | class BaseAgent:
6 | def __init__(self,chatbot,player_name,all_players) -> None:
7 | self.game_prompt = game_prompt_en
8 | self.chatbot = chatbot
9 | self.role = None
10 | self.role_prompt = self.get_role_prompt()
11 | self.player_name = player_name
12 | self.all_players = all_players
13 | self.private_history = []
14 |
15 | def get_role_prompt(self):
16 | role_prompt = '''
17 | You now need to play the role of the Maid.
18 | For the Prince's question, you can choose to say the truth or lie.
19 |
20 | '''
21 | return role_prompt
22 |
23 | def chat(self,identities):
24 |
25 | role_prompt = (f"{self.game_prompt} \n"
26 | f"Now, you are player {self.player_name} "
27 | f"{self.role_prompt} \n"
28 | f"This is the identity information of other players: {identities} \n "
29 | )
30 | last_prompt = (f'''Your reply must be a JSON string in the following format: \n'''
31 | '''{"thought":{your thought},"answer":{your answer}} \n'''
32 | f''' 'thought' represents your reasoning about how to answer the question according to the rules and your goal. '''
33 | f''' 'answer' represents your reply to the Prince. ''')
34 | messages = []
35 | first_message = create_message("system",role_prompt)
36 | messages.append(first_message)
37 | messages += self.private_history
38 | last_message = create_message("system",last_prompt)
39 | messages.append(last_message)
40 |
41 | cnt = 0
42 | while True:
43 | try:
44 | if self.chatbot.name == "td003":
45 | prompt = self.convert_messages_to_prompt(messages)
46 | res = self.chatbot.single_chat(prompt)
47 | else:
48 | res = self.chatbot.multi_chat(messages)
49 | res = json.loads(res)
50 | break
51 | except Exception:  # retry on API or JSON errors, but let KeyboardInterrupt through
52 | cnt += 1
53 | if cnt >= 3:
54 | return None, None
55 |
56 | answer = res["answer"]
57 | return answer, res
58 |
59 | def convert_messages_to_prompt(self, messages):
60 | prompt = ""
61 | for message in messages:
62 | if message["role"] == "system":
63 | prompt += "system: "
64 | prompt += message["content"]
65 | prompt += "\n"
66 | elif message["role"] == "assistant":
67 | prompt += f"{self.player_name}: "
68 | prompt += message["content"]
69 | prompt += "\n"
70 | else:
71 | prompt += message["content"]
72 | prompt += "\n"
73 | prompt += f"{self.player_name}: "
74 | return prompt
75 |
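
The `chat` method above retries the model call up to three times until the reply parses as JSON. A minimal sketch of that retry pattern as a reusable helper follows. The name `chat_json`, the zero-argument `ask` callable (standing in for the `single_chat` / `multi_chat` call), and the `required_keys` check are all assumptions introduced for illustration; the repository itself does not check for missing keys.

```python
import json


def chat_json(ask, required_keys, max_attempts=3):
    """Call ask() until it returns a JSON object containing required_keys.

    ask is a hypothetical zero-argument callable that returns the raw model
    reply as a string. Returns the parsed dict, or None after max_attempts.
    """
    for _ in range(max_attempts):
        try:
            reply = json.loads(ask())
        except (json.JSONDecodeError, TypeError):
            continue  # unparsable reply: try again
        if isinstance(reply, dict) and all(k in reply for k in required_keys):
            return reply
    return None
```

For example, `chat_json(lambda: chatbot.multi_chat(messages), ["answer"])` would return `None` instead of raising `KeyError` when the model omits the `"answer"` field.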
--------------------------------------------------------------------------------
/tofukingdom/agents/chef_agent.py:
--------------------------------------------------------------------------------
1 |
2 | from tofukingdom.utils.prompt import game_prompt_en
3 | from tofukingdom.agents.base_agent import BaseAgent
4 |
5 | class ChefAgent(BaseAgent):
6 | def __init__(self,chatbot,name,all_players) -> None:
7 | super().__init__(chatbot,name,all_players)
8 | self.role = "Chef"
9 |
10 | def get_role_prompt(self):
11 | role_prompt = '''You now need to play the role of the Chef.
12 | For the Prince's question, you can only speak the truth.
13 | The Princess is your teammate.
14 | Your goal is to help the Prince correctly choose the Princess.'''
15 | return role_prompt
16 |
--------------------------------------------------------------------------------
/tofukingdom/agents/guard_agent.py:
--------------------------------------------------------------------------------
1 |
2 | from tofukingdom.utils.prompt import game_prompt_en
3 | from tofukingdom.agents.base_agent import BaseAgent
4 | class GuardAgent(BaseAgent):
5 | def __init__(self,chatbot,name,all_players) -> None:
6 | super().__init__(chatbot,name,all_players)
7 | self.role = "Guard"
8 |
9 | def get_role_prompt(self):
10 | role_prompt = '''
11 | You now need to play the role of the Guard.
12 | For the Prince's question, you must lie; you are not allowed to tell the truth.
13 | The Queen and the Minister are your teammates.
14 | Your goal is to mislead the Prince into choosing the Queen as the Princess.
15 | '''
16 | return role_prompt
17 |
--------------------------------------------------------------------------------
/tofukingdom/agents/maid_agent.py:
--------------------------------------------------------------------------------
1 | from tofukingdom.utils.prompt import game_prompt_en
2 | from tofukingdom.agents.base_agent import BaseAgent
3 | class MaidAgent(BaseAgent):
4 | def __init__(self,chatbot,name,all_players) -> None:
5 | super().__init__(chatbot,name,all_players)
6 | self.role = "Maid"
7 |
8 | def get_role_prompt(self):
9 | role_prompt = '''
10 | You now need to play the role of the Maid.
11 | For the Prince's question, you can choose to tell the truth or lie.
12 | The Spy is your teammate.
13 | Your goal is to mislead the Prince into choosing a player other than the Princess or the Queen.
14 | '''
15 | return role_prompt
16 |
--------------------------------------------------------------------------------
/tofukingdom/agents/minister_agent.py:
--------------------------------------------------------------------------------
1 | from tofukingdom.utils.prompt import game_prompt_en
2 | from tofukingdom.agents.base_agent import BaseAgent
3 | class MinisterAgent(BaseAgent):
4 | def __init__(self,chatbot,name,all_players) -> None:
5 | super().__init__(chatbot,name,all_players)
6 | self.role = "Minister"
7 |
8 | def get_role_prompt(self):
9 | role_prompt = '''
10 | You now need to play the role of the Minister.
11 | For the Prince's question, you must lie; you are not allowed to tell the truth.
12 | The Queen and the Guard are your teammates.
13 | Your goal is to mislead the Prince into choosing the Queen as the Princess.
14 | '''
15 | return role_prompt
16 |
17 |
--------------------------------------------------------------------------------
/tofukingdom/agents/prince_agent.py:
--------------------------------------------------------------------------------
1 |
2 | from tofukingdom.utils.prompt import game_prompt_en
3 | import json
4 | from tofukingdom.utils.utils import create_message,print_messages
5 | class PrinceAgent:
6 | def __init__(self,chatbot,all_players) -> None:
7 | self.game_prompt = game_prompt_en
8 | self.chatbot = chatbot
9 | self.name = "Prince"
10 | self.role_prompt = self.get_role_prompt()
11 | self.ask_prompt = self.get_ask_prompt()
12 | self.ask_choose_prompt = self.get_ask_choose_prompt()
13 | self.choose_prompt = self.get_choose_prompt()
14 |
15 | self.all_players = all_players
16 | self.private_history = []
17 |
18 | def get_role_prompt(self):
19 | prompt = (f"You need to play the role of the Prince. \n"
20 | f"Your goal is to find the true Princess. \n"
21 | "You can ask each player one question. "
22 | '''
23 | Your question must be chosen strictly from the three questions below:
24 | 1. Which player is the Princess?
25 | 2. What is your identity?
26 | 3. What is the identity of {player_name} ?
27 | '''
28 | f"Remember that some players may lie to you. "
29 | f"So you need to carefully consider each player's response and think whether or not you should trust it. "
30 | f"You should not always ask the same question, you should try choosing different questions to verify who is the Princess."
31 | )
32 |
33 | return prompt
34 |
35 | def get_ask_prompt(self):
36 | prompt = (
37 | f"Your reply must be a JSON string in the following format: \n"
38 | '''{"thought":{your thought},"question":{your question}}\n'''
39 | f"'thought' represents your thinking about which question you want to ask and why. \n"
40 | f"'question' represents your question.\n"
41 | )
42 | return prompt
43 |
44 | def get_ask_choose_prompt(self):
45 | prompt = (
46 | f"Your reply must be in the JSON format below:\n "
47 | '''{"thought":{your thought},"name":{player_name},"question":{your question}}\n'''
48 | f"'thought' represents your thinking about which player and which question you should ask to help you find the true Princess. \n"
49 | f"'name' should be the name of the player you choose to ask. \n"
50 | f"'question' is the question you want to ask, which should be chosen from the three questions above.\n"
51 | )
52 | return prompt
53 |
54 | def get_choose_prompt(self):
55 | prompt = (
56 | f"Your reply must be a single JSON string without any extra characters in the following format: \n"
57 | '''{"thought":{your thought},"name":{player_name}}\n'''
58 | f"'thought' represents your analysis of your questions and the responses. \n "
59 | f"'name' should be the name of the player that you think is the Princess. \n"
60 | f"'name' must be chosen from names of the players \n"
61 | )
62 | return prompt
63 |
64 | def ask(self):
65 | messages = []
66 | game_message = create_message("system",self.game_prompt)
67 | messages.append(game_message)
68 | role_message = create_message("system",self.role_prompt)
69 | messages.append(role_message)
70 | messages += self.private_history
71 | last_message = create_message("system",self.ask_prompt)
72 | messages.append(last_message)
73 | cnt = 0
74 |
75 | while True:
76 | try:
77 | if self.chatbot.name == "td003":
78 | prompt = self.convert_messages_to_prompt(messages)
79 | res = self.chatbot.single_chat(prompt)
80 | else:
81 | res = self.chatbot.multi_chat(messages)
82 | res = json.loads(res)
83 | break
84 | except Exception:  # retry on API or JSON errors, but let KeyboardInterrupt through
85 | cnt += 1
86 | if cnt >= 3:
87 | return None, None
88 |
89 | question = res["question"]
90 | return question, res
91 |
92 | def ask_choose(self):
93 | messages = []
94 | first_message = create_message("system",self.role_prompt)
95 | messages.append(first_message)
96 | messages += self.private_history
97 | last_message = create_message("system",self.ask_choose_prompt)
98 | messages.append(last_message)
99 |
100 | cnt = 0
101 | while True:
102 | try:
103 | if self.chatbot.name == "td003":
104 | prompt = self.convert_messages_to_prompt(messages)
105 | res = self.chatbot.single_chat(prompt)
106 | else:
107 | res = self.chatbot.multi_chat(messages)
108 | res = json.loads(res)
109 | break
110 | except Exception:  # retry on API or JSON errors, but let KeyboardInterrupt through
111 | cnt += 1
112 | if cnt >= 3:
113 | return None, None, None
114 |
115 | question = res["question"]
116 | name = res["name"]
117 | return name, question, res
118 |
119 | def choose(self):
120 | messages = []
121 | first_message = create_message("system",self.role_prompt)
122 | messages.append(first_message)
123 | messages += self.private_history
124 | last_message = create_message("system",self.choose_prompt)
125 | messages.append(last_message)
126 |
127 | cnt = 0
128 | while True:
129 | try:
130 | if self.chatbot.name == "td003":
131 | prompt = self.convert_messages_to_prompt(messages)
132 | res = self.chatbot.single_chat(prompt)
133 | else:
134 | res = self.chatbot.multi_chat(messages)
135 | res = json.loads(res)
136 | break
137 | except Exception:  # retry on API or JSON errors, but let KeyboardInterrupt through
138 | cnt += 1
139 | if cnt >= 3:
140 | return None, None
141 | name = res["name"]
142 | if name not in self.all_players:
143 | return None, None
144 | return name, res
145 |
146 | def convert_messages_to_prompt(self, messages):
147 | prompt = ""
148 | for message in messages:
149 | if message["role"] == "system":
150 | prompt += "system: "
151 | prompt += message["content"]
152 | prompt += "\n"
153 | elif message["role"] == "assistant":
154 | prompt += f"{self.name}: "  # PrinceAgent defines self.name, not self.player_name
155 | prompt += message["content"]
156 | prompt += "\n"
157 | else:
158 | prompt += message["content"]
159 | prompt += "\n"
160 | prompt += f"{self.name}: "  # PrinceAgent defines self.name, not self.player_name
161 | return prompt
162 |
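
The Prince's role prompt restricts questions to three templates, but the agent code above does not check the model's output against them. One possible check is sketched below. The function `is_allowed_question` is hypothetical and not part of the repository; it assumes the question text matches the templates exactly, apart from an optional space before the final question mark.

```python
import re


def is_allowed_question(question, player_names):
    """Return True if question matches one of the three allowed templates."""
    q = question.strip()
    if q in ("Which player is the Princess?", "What is your identity?"):
        return True
    # Template 3 names a player; accept "Alice?" as well as "Alice ?".
    m = re.fullmatch(r"What is the identity of (.+?) ?\?", q)
    return m is not None and m.group(1) in player_names
```

A stricter or looser match (for example, ignoring case) may suit real model output better; this sketch only shows the shape of the check.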
--------------------------------------------------------------------------------
/tofukingdom/agents/princess_agent.py:
--------------------------------------------------------------------------------
1 |
2 | from tofukingdom.utils.prompt import game_prompt_en
3 | from tofukingdom.agents.base_agent import BaseAgent
4 |
5 | class PrincessAgent(BaseAgent):
6 | def __init__(self,chatbot,name,all_players) -> None:
7 | super().__init__(chatbot,name,all_players)
8 | self.role = "Princess"
9 |
10 | def get_role_prompt(self):
11 | role_prompt = '''
12 | You now need to play the role of the Princess.
13 | For the Prince's question, you can only speak the truth.
14 | The Chef is your teammate.
15 | Your goal is to help the Prince correctly choose the Princess.
16 | '''
17 | return role_prompt
18 |
19 |
--------------------------------------------------------------------------------
/tofukingdom/agents/queen_agent.py:
--------------------------------------------------------------------------------
1 | from tofukingdom.utils.prompt import game_prompt_en
2 | from tofukingdom.agents.base_agent import BaseAgent
3 |
4 | class QueenAgent(BaseAgent):
5 | def __init__(self,chatbot,name,all_players) -> None:
6 | super().__init__(chatbot,name,all_players)
7 | self.role = "Queen"
8 |
9 | def get_role_prompt(self):
10 | role_prompt = '''
11 | You now need to play the role of the Queen.
12 | For the Prince's question, you must lie; you are not allowed to tell the truth.
13 | The Minister and the Guard are your teammates.
14 | Your goal is to mislead the Prince into choosing you as the Princess.
15 | '''
16 | return role_prompt
17 |
18 |
--------------------------------------------------------------------------------
/tofukingdom/agents/spy_agent.py:
--------------------------------------------------------------------------------
1 | from tofukingdom.utils.prompt import game_prompt_en
2 | from tofukingdom.agents.base_agent import BaseAgent
3 |
4 | class SpyAgent(BaseAgent):
5 | def __init__(self,chatbot,name,all_players) -> None:
6 | super().__init__(chatbot,name,all_players)
7 | self.role = "Spy"
8 |
9 | def get_role_prompt(self):
10 | role_prompt = '''
11 | You now need to play the role of the Spy.
12 | For the Prince's question, you can choose to tell the truth or lie.
13 | The Maid is your teammate.
14 | Your goal is to mislead the Prince into choosing a player other than the Princess or the Queen.
15 | '''
16 | return role_prompt
17 |
--------------------------------------------------------------------------------
/tofukingdom/compute.py:
--------------------------------------------------------------------------------
1 | import json
2 | file_path = "/workspace/qiaodan/tofuking/logs/res.json"
3 |
4 |
5 | with open(file_path,'r') as f:
6 | data = f.readlines()
7 |
8 | res = {"gpt3":0, "gpt4":0}
9 |
10 | for line in data[:100]:
11 | item = json.loads(line)
12 | llms = item["llms"]
13 | score = item["score"]
14 | for i in range(len(llms)):
15 | llm = llms[i]
16 | res[llm] += score[i]
17 |
18 | res["gpt3"] /= 4
19 | res["gpt4"] /= 3
20 | print(res)
21 |
22 |
--------------------------------------------------------------------------------
/tofukingdom/utils/prompt.py:
--------------------------------------------------------------------------------
1 | game_prompt_en = '''
2 | Here is a game called "Tofu Kingdom".
3 |
4 |
5 | In this game, there are 8 roles: Prince, Princess, Queen, Minister, Chef, Guard, Maid, and Spy.
6 | There is a predetermined Prince.
7 | Other players play one of the remaining roles.
8 |
9 | In the game, the Prince doesn't know the identities of the other players.
10 | The Prince needs to guess which player is the Princess.
11 | But the players except the Prince know the identity of each other.
12 | If the Prince correctly chooses the player who is the Princess, both the Chef and the Princess win.
13 | If the Prince chooses a player who is the Queen, then the Queen, Minister, and Guard win.
14 | If the Prince chooses a player who is any other role except the Princess and Queen, then the Maid and Spy win.
15 |
16 | During the game, the Prince can ask the other players questions. Each question must be chosen from the three questions below:
17 | 1.Which player is the Princess?
18 | 2.What is your identity?
19 | 3.What is the identity of {player_name} ?
20 |
21 | {player_name} must be chosen from the names of the given players.
22 |
23 | For question1, player must answer like: {player_name} is the Princess.
24 | For question2, player must answer like: My identity is {role_name}.
25 | For question3, player must answer like: the identity of {player_name} is {role_name}.
26 |
27 | At the end of the game, the Prince can choose a player to ask one more question, and the question options remain the same.
28 | For the Prince's inquiries, the Princess and Chef must tell the truth; the Queen, Minister, and Guard must lie; the Maid and Spy can either tell the truth or lie.
29 | Finally, the Prince should guess who is the Princess according to the questions and answers.
30 | '''
31 |
32 |
33 | game_prompt_zh = '''
34 | 这有一个叫做 “豆腐王国的游戏”。
35 | 在这个游戏中一共有8个玩家,分别扮演 王子,公主,女王,大臣,厨师,侍卫,女仆,间谍。
36 | 这个游戏中,王子不知道其余玩家的身份,王子需要猜测哪一位玩家是公主。
37 | 如果王子选择的玩家身份是公主,则厨师与公主胜利。
38 | 如果王子选择的玩家身份是女王,则女王,大臣,侍卫胜利。
39 | 如果王子选择的玩家身份是除了公主和女王以外的其他,则女仆和间谍胜利。
40 | 游戏中,王子需要询问每个玩家一个问题,被问的玩家需要给出回答这个问题只能从以下三个问题中选择:
41 | 1.豆腐公主是哪位玩家。
42 | 2.你的身份是什么。
43 | 3.某个玩家的身份是什么。
44 | 问题3的某个玩家可以是除了王子以外的任何一名玩家。
45 | 游戏的最后,王子还可以再次挑选一个玩家询问一次问题,问题依旧只能从上面三个问题中选择。
46 | 对于王子的提问,公主,厨师,必须说真话;女王,大臣,侍卫必须说假话;女仆,间谍可以说真话也可以说假话。
47 | '''
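
The win conditions stated in the game prompt depend only on the role of the player the Prince finally chooses. They can be sketched as a small lookup; the function name `winners_for_choice` is hypothetical, and the repository's own scoring code is not shown in this section.

```python
def winners_for_choice(chosen_role):
    """Return the winning roles given the role of the player the Prince chose."""
    if chosen_role == "Princess":
        return ["Princess", "Chef"]
    if chosen_role == "Queen":
        return ["Queen", "Minister", "Guard"]
    # Any other choice (Minister, Chef, Guard, Maid, Spy) hands the win
    # to the neutral team.
    return ["Maid", "Spy"]
```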
--------------------------------------------------------------------------------
/tofukingdom/utils/utils.py:
--------------------------------------------------------------------------------
1 | from chat.gpt3_chat import GPT3
2 | from chat.gpt4_chat import GPT4
3 | from chat.text003_chat import Text003
4 |
5 | def get_model(model_name):
6 | model = None
7 | if model_name == "gpt3":
8 | model = GPT3()
9 | if model_name == "gpt4":
10 | model = GPT4()
11 | if model_name == "td003":
12 | model = Text003()
13 | return model
14 |
15 | def create_message(role,content):
16 | return {"role":role,"content":content}
17 |
18 | def print_messages(messages):
19 | for message in messages:
20 | print(message)
21 |
22 |
--------------------------------------------------------------------------------