├── Github-figures
│   ├── Env.png
│   ├── framework.png
│   └── main_figure.png
├── LICENSE
├── README.md
├── LLM.py
├── data_visua.py
├── env1_create.py
├── env2_create.py
├── env3_create.py
├── env4_create.py
├── env1-box-arrange.py
├── env2-box-arrange.py
├── env3-box-arrange.py
├── env4-box-arrange.py
├── prompt_env1.py
└── prompt_env3.py
/Github-figures/Env.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yongchao98/multi-agent-framework/HEAD/Github-figures/Env.png
--------------------------------------------------------------------------------
/Github-figures/framework.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yongchao98/multi-agent-framework/HEAD/Github-figures/framework.png
--------------------------------------------------------------------------------
/Github-figures/main_figure.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yongchao98/multi-agent-framework/HEAD/Github-figures/main_figure.png
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2024 Yongchao Chen
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Multi-Agent-Framework ([Website](https://yongchao98.github.io/MIT-REALM-Multi-Robot/), ICRA 2024)
2 | This repository contains the code for the Multi-Agent Framework paper; the code will continue to be updated. There are four environments in total: BoxNet1, BoxNet2, BoxLift, and Warehouse.
3 |
4 |
5 |
6 |
7 |
8 | ## Requirements
9 | Please install the following Python packages (`re`, `random`, `time`, and `copy` are part of the Python standard library and need no installation).
10 | ```
11 | pip install numpy openai tiktoken
12 | ```
13 |
14 | Then get your OpenAI API key from https://beta.openai.com/
15 | and put that key (starting with 'sk-') into LLM.py, line 8.
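
Line 8 of LLM.py is the placeholder to replace:

```
openai_api_key_name = 'sk-...'  # put your OpenAI API key (starting with 'sk-') here
```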
16 |
17 | ## Create testing trial environments
18 | Run env1_create.py/env2_create.py/env3_create.py/env4_create.py to create the environments; remember to change Code_dir_path in the last lines of each script.
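
For reference, the last lines of env1_create.py look like this (the other env*_create.py scripts end the same way, each with its own Env directory name):

```
Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here
Saving_path = Code_dir_path + 'Env1_BoxNet1'
create_env1(Saving_path, repeat_num = 10)
```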
19 |
20 | ```
21 | python env1_create.py
22 | ```
23 |
24 | ## Usage
25 | Run env1-box-arrange.py/env2-box-arrange.py/env3-box-arrange.py/env4-box-arrange.py to test our approaches with different frameworks and dialogue history methods. Around line 270, set up the model (GPT-3/4), the framework (HMAS-2, HMAS-1, DMAS, CMAS), the dialogue history method, and your working directory path. Then run the script:
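
As a minimal sketch, the setup around line 270 amounts to a call to run_exp (defined near the top of env1-box-arrange.py); the values below are illustrative, not the paper's exact configuration:

```
# Illustrative values: the grid size, iteration_num, and query_time_limit
# shown here are assumptions; pick the combination you want to test.
run_exp(Saving_path, pg_row_num = 2, pg_column_num = 2, iteration_num = 0,
        query_time_limit = 30, dialogue_history_method = '_w_only_state_action_history',
        cen_decen_framework = 'HMAS-2', model_name = 'gpt-4')
```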
26 |
27 | ```
28 | python env1-box-arrange.py
29 | ```
30 |
31 | The experimental results will appear in the generated directory Env1_BoxNet1. To visualize the testing results, set Code_dir_path in line 2 of data_visua.py, then run the script:
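
For reference, the first lines of data_visua.py are:

```
Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here
saving_path = Code_dir_path + 'Env2_BoxNet2'  # point this at the Env directory you ran, e.g. Env1_BoxNet1
```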
32 |
33 | ```
34 | python data_visua.py
35 | ```
36 |
37 | ## Recommended Work
38 |
39 | [AutoTAMP: Autoregressive Task and Motion Planning with LLMs as Translators and Checkers](https://arxiv.org/pdf/2306.06531.pdf)
40 |
41 | [NL2TL: Transforming Natural Languages to Temporal Logics using Large Language Models](https://arxiv.org/pdf/2305.07766.pdf)
42 |
--------------------------------------------------------------------------------
/LLM.py:
--------------------------------------------------------------------------------
1 | import openai
2 | import tiktoken
3 | import time
4 | enc = tiktoken.get_encoding("cl100k_base")
5 | assert enc.decode(enc.encode("hello world")) == "hello world"
6 | enc = tiktoken.encoding_for_model("gpt-4")
7 |
8 | openai_api_key_name = 'sk-...'  # put your OpenAI API key (starting with 'sk-') here
9 |
10 | def GPT_response(messages, model_name):
11 | token_num_count = 0
12 | for item in messages:
13 | token_num_count += len(enc.encode(item["content"]))
14 |
15 | if model_name in ['gpt-4', 'gpt-4-32k', 'gpt-3.5-turbo-0301', 'gpt-4-0613', 'gpt-4-32k-0613', 'gpt-3.5-turbo-16k-0613']:
16 | #print(f'-------------------Model name: {model_name}-------------------')
17 | openai.api_key = openai_api_key_name
18 |
19 | try:
20 | result = openai.ChatCompletion.create(
21 | model=model_name,
22 | messages=messages,
23 | temperature = 0.0,
24 | top_p=1,
25 | frequency_penalty=0,
26 | presence_penalty=0
27 | )
28 |         except Exception:  # retry once on a transient API error
29 | try:
30 | result = openai.ChatCompletion.create(
31 | model=model_name,
32 | messages=messages,
33 | temperature=0.0,
34 | top_p=1,
35 | frequency_penalty=0,
36 | presence_penalty=0
37 | )
38 |             except Exception:  # back off and retry one last time
39 | try:
40 | print(f'{model_name} Waiting 60 seconds for API query')
41 | time.sleep(60)
42 | result = openai.ChatCompletion.create(
43 | model=model_name,
44 | messages=messages,
45 | temperature = 0.0,
46 | top_p=1,
47 | frequency_penalty=0,
48 | presence_penalty=0
49 | )
50 |                 except Exception:
51 | return 'Out of tokens', token_num_count
52 | token_num_count += len(enc.encode(result.choices[0]['message']['content']))
53 | print(f'Token_num_count: {token_num_count}')
54 | return result.choices[0]['message']['content'], token_num_count
55 |
56 | else:
57 | raise ValueError(f'Invalid model name: {model_name}')
58 |
--------------------------------------------------------------------------------
/data_visua.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here
3 | saving_path = Code_dir_path + 'Env2_BoxNet2'  # point this at the Env directory you ran, e.g. Env1_BoxNet1
4 |
5 | candidate_list = [('CMAS','_wo_any_dialogue_history'), ('CMAS','_w_only_state_action_history'),
6 | ('HMAS-2','_wo_any_dialogue_history'), ('HMAS-2','_w_only_state_action_history'),
7 | ('HMAS-2','_w_all_dialogue_history'), ('HMAS-1','_w_only_state_action_history')]
8 |
9 | print('-------#####------------#####------------#####--------')
10 | Env_action_time_list_best_total = []; API_query_time_list_best_total = []; token_num_list_best_total = []
11 | for pg_row_num, pg_column_num in [(2, 2), (2, 4), (4, 4), (4,8)]:
12 | Env_action_time_list = []; API_query_time_list = []; token_num_list = [];
13 | for iteration_num in range(10):
14 | Env_action_time_list.append(1e10); API_query_time_list.append(1e10); token_num_list.append(1e10)
15 | for cen_decen_framework, dialogue_history_method in candidate_list:
16 | #print(f'Row num: {pg_row_num}, Column num: {pg_column_num}, {cen_decen_framework}{dialogue_history_method}')
17 | with open(saving_path +f'env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}/success_failure.txt', 'r') as file:
18 | first_line = file.readline().strip()
19 | #print(first_line)
20 |
21 | if first_line == 'success':
22 | with open(saving_path +f'env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}/env_action_times.txt', 'r') as file:
23 | numbers = [float(line.strip()) for line in file.readlines()]
24 | #print('Environment action times', numbers[0])
25 | if numbers[0] < Env_action_time_list[iteration_num]:
26 | Env_action_time_list[iteration_num] = numbers[0]
27 |
28 | with open(saving_path +f'env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}/token_num_count.txt', 'r') as file:
29 | numbers = [float(line.strip()) for line in file.readlines()]
30 | #print('API query times', len(numbers))
31 | #print('Consuming of token num', np.sum(numbers))
32 | if len(numbers) < API_query_time_list[iteration_num]:
33 | API_query_time_list[iteration_num] = len(numbers)
34 | if np.sum(numbers) < token_num_list[iteration_num]:
35 | token_num_list[iteration_num] = np.sum(numbers)
36 |
37 | print(f'Row num: {pg_row_num}, Column num: {pg_column_num}, Environment action times, {Env_action_time_list}')
38 | print(f'Row num: {pg_row_num}, Column num: {pg_column_num}, API query times, {API_query_time_list}')
39 | print(f'Row num: {pg_row_num}, Column num: {pg_column_num}, Consuming of token num, {token_num_list}')
40 | print('\n')
41 | Env_action_time_list_best_total.append(Env_action_time_list)
42 | API_query_time_list_best_total.append(API_query_time_list)
43 | token_num_list_best_total.append(token_num_list)
44 |
45 | for cen_decen_framework, dialogue_history_method in candidate_list:
46 | print('\n')
47 | success_rate_list = []
48 | Env_action_time_list = [];
49 | API_query_time_list = [];
50 | token_num_list = []
51 | for index, (pg_row_num, pg_column_num) in enumerate([(2,2), (2,4), (4,4)]):
52 | print('\n')
53 | success_rate = 0;
54 | Env_action_time_cost = 0;
55 | API_query_time_cost = 0;
56 | token_num_cost = 0
57 | success_failure_state_list = []
58 | for iteration_num in range(4):
59 | with open(
60 | saving_path + f'env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}/success_failure.txt',
61 | 'r') as file:
62 | first_line = file.readline().strip()
63 | success_failure_state_list.append(first_line)
64 |
65 | if first_line == 'success':
66 | success_rate += 1
67 | with open(
68 | saving_path + f'env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}/env_action_times.txt',
69 | 'r') as file:
70 | numbers = [float(line.strip()) for line in file.readlines()]
71 | # print('Environment action times', numbers[0])
72 | if Env_action_time_list_best_total[index][iteration_num] < 1e10:
73 | #print(numbers[0]/Env_action_time_list_best_total[index][iteration_num])
74 | Env_action_time_cost += numbers[0]/Env_action_time_list_best_total[index][iteration_num]
75 |
76 | with open(
77 | saving_path + f'env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}/token_num_count.txt',
78 | 'r') as file:
79 | numbers = [float(line.strip()) for line in file.readlines()]
80 | # print('API query times', len(numbers))
81 | # print('Consuming of token num', np.sum(numbers))
82 | if API_query_time_list_best_total[index][iteration_num] < 1e10:
83 | #print(len(numbers)/API_query_time_list_best_total[index][iteration_num])
84 | API_query_time_cost += len(numbers)/API_query_time_list_best_total[index][iteration_num]
85 | if token_num_list_best_total[index][iteration_num] < 1e10:
86 | #print(np.sum(numbers)/token_num_list_best_total[index][iteration_num])
87 | token_num_cost += np.sum(numbers)/token_num_list_best_total[index][iteration_num]
88 | print(f'Row num: {pg_row_num}, Column num: {pg_column_num}, {cen_decen_framework}{dialogue_history_method}')
89 |         print('success_rate', success_rate / 4)
90 | print(success_failure_state_list)
91 | success_rate_list.append(success_rate / 4)
92 | if success_rate > 0:
93 |             print('Env_action_time_cost', Env_action_time_cost / success_rate)
94 |             Env_action_time_list.append(Env_action_time_cost / success_rate)
95 |
96 |             print('API_query_time_cost', API_query_time_cost / success_rate)
97 |             API_query_time_list.append(API_query_time_cost / success_rate)
98 |
99 |             print('token_num_cost', token_num_cost / success_rate)
100 | token_num_list.append(token_num_cost / success_rate)
101 |
102 | print('\n')
103 | print(f'success_rate: {success_rate_list}, {np.sum(success_rate_list)/len(success_rate_list)}')
104 | print(f'Env_action_time_cost: {Env_action_time_list}, {np.sum(Env_action_time_list) / len(Env_action_time_list)}')
105 | print(f'API_query_time_cost: {API_query_time_list}, {np.sum(API_query_time_list) / len(API_query_time_list)}')
106 | print(f'token_num_cost: {token_num_list}, {np.sum(token_num_list) / len(token_num_list)}')
--------------------------------------------------------------------------------
/env3_create.py:
--------------------------------------------------------------------------------
1 | from prompt_env3 import *
2 | from LLM import *
3 | from sre_constants import error
4 | import random
5 | import os
6 | import json
7 | import re
8 | import copy
9 | import numpy as np
10 | import shutil
11 | import time
12 | import ast
13 |
14 | def state_update_func(pg_dict, lifter_weight_list):
15 | volume_list = [volume for volume, weight in pg_dict.items()]
16 |
17 | state_update_prompt = f'The left boxes in the warehouse are: '
18 | left_box = ''
19 | for i in range(len(volume_list)-1):
20 | state_update_prompt += f'box[{volume_list[i]}V], '
21 | left_box += f'box[{volume_list[i]}V], '
22 | state_update_prompt += f'box[{volume_list[len(volume_list)-1]}V]'
23 | left_box += f'box[{volume_list[len(volume_list)-1]}V]'
24 | state_update_prompt += f'.\n'
25 | left_box += f'.\n'
26 |
27 | state_update_prompt += f'The available lifting agents in the warehouse are: '
28 | for i in range(len(lifter_weight_list)-1):
29 | state_update_prompt += f'agent[{lifter_weight_list[i]}W], '
30 | state_update_prompt += f'agent[{lifter_weight_list[len(lifter_weight_list)-1]}W]'
31 | state_update_prompt += f'.\n'
32 | return state_update_prompt, left_box
33 |
34 | def with_action_syntactic_check_func(pg_dict_input, response, user_prompt_list_input, response_total_list_input, model_name, dialogue_history_method):
35 | user_prompt_list = copy.deepcopy(user_prompt_list_input)
36 | response_total_list = copy.deepcopy(response_total_list_input)
37 | iteration_num = 0
38 | token_num_count_list_add = []
39 | while iteration_num < 6:
40 | response_total_list.append(response)
41 | feedback = ''
42 |         try:
43 |             original_response_dict = json.loads(response)
44 |             pg_dict_original = copy.deepcopy(pg_dict_input)
45 |
46 |             # The state to be updated
47 |             volume_list = [volume for volume, weight in pg_dict_original.items()]
48 |             weight_list = [weight for volume, weight in pg_dict_original.items()]
49 |
50 |             # The action to act
51 |             for key, value in original_response_dict.items():
52 |                 match = re.search(r'(\d+\.\d+)', key)
53 |                 volume = float(match.group(1))
54 |                 lift_weight_list = [float(num) for num in re.findall(r'(\d+\.\d+)', value)]
55 |                 # print(lift_weight_list)
56 |
57 |                 if volume in volume_list:
58 |                     pass
59 |                 else:
60 |                     feedback += f'box[{volume}V] is not in the current warehouse; '
61 |         except Exception:
62 |             feedback = 'Your assigned plan is not in the correct json format as before. If your answer is empty dict, please check whether you miss the left boxes in the warehouse.'
63 |
64 | if feedback != '':
65 |             feedback += 'Please replan for all the agents again with the same output format:'
66 | print('----------Syntactic Check----------')
67 | print(f'Response original: {response}')
68 | print(f'Feedback: {feedback}')
69 | user_prompt_list.append(feedback)
70 | messages = message_construct_func(user_prompt_list, response_total_list, dialogue_history_method) # message construction
71 | print(f'Length of messages {len(messages)}')
72 | response, token_num_count = GPT_response(messages, model_name)
73 | token_num_count_list_add.append(token_num_count)
74 | print(f'Response new: {response}\n')
75 | if response == 'Out of tokens':
76 | return response, token_num_count_list_add
77 | iteration_num += 1
78 | else:
79 | return response, token_num_count_list_add
80 | return 'Syntactic Error', token_num_count_list_add
81 |
82 |
83 | def action_from_response(pg_dict, original_response_dict, lifter_weight_list):
84 | system_error_feedback = '';
85 | env_act_feedback = ''
86 | pg_dict_original = copy.deepcopy(pg_dict)
87 |
88 | # The state to be updated
89 | volume_list = [volume for volume, weight in pg_dict_original.items()]
90 | weight_list = [weight for volume, weight in pg_dict_original.items()]
91 |
92 | # The action to act
93 | for key, value in original_response_dict.items():
94 | match = re.search(r'(\d+\.\d+)', key)
95 | volume = float(match.group(1))
96 | lift_weight_list = [float(num) for num in re.findall(r'(\d+\.\d+)', value)]
97 | for item in lift_weight_list:
98 | if item not in lifter_weight_list:
99 | system_error_feedback += f'agent[{item}W] is not in the current warehouse; '
100 |
101 | if volume in volume_list:
102 | index = volume_list.index(volume)
103 | if np.sum(lift_weight_list) >= weight_list[index]:
104 | volume_list.pop(index)
105 | weight_list.pop(index)
106 | else:
107 | expression = ''
108 | for index_2 in range(len(lift_weight_list)):
109 | if index_2 != len(lift_weight_list) - 1:
110 | expression += f'agent[{lift_weight_list[index_2]}W] and '
111 | else:
112 | expression += f'agent[{lift_weight_list[index_2]}W]'
113 | env_act_feedback += f'The weight of box[{volume}V] is higher than the summation of lifting capability of {expression}, so it can not be lifted. '
114 | else:
115 | system_error_feedback += f'box[{volume}V] is not in the current warehouse; '
116 |
117 | pg_dict_original = dict(zip(volume_list, weight_list))
118 | return system_error_feedback, pg_dict_original, env_act_feedback
119 |
120 |
121 |
122 | def assign_weight(volume):
123 | # Step 1: Assume a base density to convert volume to weight.
124 | # This value is an assumption; in real-life, different items have different densities.
125 |     # Let's assume a density of 1 for simplicity.
126 | # You can adjust this value based on your requirements.
127 | density = 1
128 | estimated_weight = volume * density
129 |
130 | # Step 2: Add some randomness to the weight.
131 | # This can be a combination of gaussian noise and outlier noise.
132 | noise = random.gauss(0, estimated_weight * 0.1) # 10% of weight as gaussian noise
133 | outlier_chance = 0.05 # 5% chance to be an outlier
134 | if random.random() < outlier_chance:
135 | noise += random.choice([-1, 1]) * estimated_weight * 0.5 # 50% of weight as outlier noise
136 |
137 |     weight = max(0.1, estimated_weight + noise)  # clamp so the weight stays positive
138 | return weight
139 |
140 | def env_create(lifter_num, box_num):
141 | # Create the volume and weight lists
142 | volume_list = [random.randint(2, 20)/2 for _ in range(box_num)]
143 | weight_list = [round(assign_weight(volume), 1) for volume in volume_list]
144 |
145 | # Create the lifter list
146 | lifter_weight_list = [random.randint(1, 15) / 2 for _ in range(lifter_num)]
147 | while np.sum(lifter_weight_list) < np.max(weight_list):
148 | lifter_weight_list = [item + 0.5 for item in lifter_weight_list]
149 |
150 | print('lifter_weight_list: ', lifter_weight_list)
151 | print('volume_list: ', volume_list)
152 | print('weight_list: ', weight_list)
153 | print('Deviation ratio: ', [weight_list[i] / volume_list[i] for i in range(len(volume_list))])
154 | print('\n')
155 | return lifter_weight_list, volume_list, weight_list
156 |
157 | def create_env3(Saving_path, repeat_num = 4):
158 | if not os.path.exists(Saving_path):
159 | os.makedirs(Saving_path, exist_ok=True)
160 | else:
161 | shutil.rmtree(Saving_path)
162 | os.makedirs(Saving_path, exist_ok=True)
163 |
164 | for i, box_num in [(4,10), (6,14), (8,18), (10,24)]:
165 | if not os.path.exists(Saving_path+f'/env_pg_state_{i}'):
166 | os.makedirs(Saving_path+f'/env_pg_state_{i}', exist_ok=True)
167 | else:
168 | shutil.rmtree(Saving_path+f'/env_pg_state_{i}')
169 | os.makedirs(Saving_path+f'/env_pg_state_{i}', exist_ok=True)
170 |
171 | for iteration_num in range(repeat_num):
172 | lifter_weight_list, volume_list, weight_list = env_create(i, box_num)
173 | os.makedirs(Saving_path+f'/env_pg_state_{i}/pg_state{iteration_num}', exist_ok=True)
174 | with open(Saving_path+f'/env_pg_state_{i}/pg_state{iteration_num}/lifter_weight_list{iteration_num}.txt', 'w') as f:
175 | for number in lifter_weight_list:
176 | f.write(str(number) + '\n')
177 |
178 | with open(Saving_path+f'/env_pg_state_{i}/pg_state{iteration_num}/volume_list{iteration_num}.txt', 'w') as f:
179 | for number in volume_list:
180 | f.write(str(number) + '\n')
181 |
182 | with open(Saving_path+f'/env_pg_state_{i}/pg_state{iteration_num}/weight_list{iteration_num}.txt', 'w') as f:
183 | for number in weight_list:
184 | f.write(str(number) + '\n')
185 |
186 | Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here
187 | Saving_path = Code_dir_path + 'Env3_BoxLift'
188 | create_env3(Saving_path, repeat_num = 10)
189 |
190 |
--------------------------------------------------------------------------------
/env1_create.py:
--------------------------------------------------------------------------------
1 | # Box moving to target without collision
2 |
3 | from prompt_env1 import *
4 | from LLM import *
5 | from sre_constants import error
6 | import random
7 | import os
8 | import json
9 | import re
10 | import copy
11 | import numpy as np
12 | import shutil
13 | import time
14 |
15 | def surround_index_func(row_num, column_num, row_index, column_index):
16 |     surround_index_list = []
17 |     for i, j in ([row_index-1, column_index], [row_index+1, column_index], [row_index, column_index-1], [row_index, column_index+1]):
18 |         if i>=0 and i<=row_num-1 and j>=0 and j<=column_num-1 and not (i == row_index and j == column_index):
19 |             surround_index_list.append([i+0.5, j+0.5])
20 |     return surround_index_list
21 |
22 | def state_update_func(pg_row_num, pg_column_num, pg_dict):
23 | pg_dict_copy = copy.deepcopy(pg_dict)
24 | state_update_prompt = ''
25 | for i in range(pg_row_num):
26 | for j in range(pg_column_num):
27 | square_item_list = pg_dict_copy[str(i+0.5)+'_'+str(j+0.5)]
28 | square_item_only_box = [item for item in square_item_list if item[:3]=='box']
29 | surround_index_list = surround_index_func(pg_row_num, pg_column_num, i, j)
30 | state_update_prompt += f'Agent[{i+0.5}, {j+0.5}]: I am in square[{i+0.5}, {j+0.5}], I can observe {square_item_list}, I can do '
31 | action_list = []
32 | for box in square_item_only_box:
33 | for surround_index in surround_index_list:
34 | action_list.append(f'move({box}, square{surround_index})')
35 | if 'target'+box[3:] in square_item_list:
36 | action_list.append(f'move({box}, target{box[3:]})')
37 | state_update_prompt += f'{action_list}\n'
38 | return state_update_prompt
39 |
40 | def state_update_func_local_agent(pg_row_num, pg_column_num, pg_row_i, pg_column_j, pg_dict):
41 | pg_dict_copy = copy.deepcopy(pg_dict)
42 | state_update_prompt_local_agent = ''
43 | state_update_prompt_other_agent = ''
44 |
45 | for i in range(pg_row_num):
46 | for j in range(pg_column_num):
47 | if not (i == pg_row_i and pg_column_j == j):
48 | square_item_list = pg_dict_copy[str(i+0.5)+'_'+str(j+0.5)]
49 | square_item_only_box = [item for item in square_item_list if item[:3]=='box']
50 | surround_index_list = surround_index_func(pg_row_num, pg_column_num, i, j)
51 | state_update_prompt_other_agent += f'Agent[{i+0.5}, {j+0.5}]: I am in square[{i+0.5}, {j+0.5}], I can observe {square_item_list}, I can do '
52 | action_list = []
53 | for box in square_item_only_box:
54 | for surround_index in surround_index_list:
55 | action_list.append(f'move({box}, square{surround_index})')
56 | if 'target'+box[3:] in square_item_list:
57 | action_list.append(f'move({box}, target{box[3:]})')
58 | state_update_prompt_other_agent += f'{action_list}\n'
59 |
60 | square_item_list = pg_dict_copy[str(pg_row_i+0.5)+'_'+str(pg_column_j+0.5)]
61 | square_item_only_box = [item for item in square_item_list if item[:3]=='box']
62 | surround_index_list = surround_index_func(pg_row_num, pg_column_num, pg_row_i, pg_column_j)
63 | state_update_prompt_local_agent += f'Agent[{pg_row_i+0.5}, {pg_column_j+0.5}]: in square[{pg_row_i+0.5}, {pg_column_j+0.5}], can observe {square_item_list}, can do '
64 | action_list = []
65 | for box in square_item_only_box:
66 | for surround_index in surround_index_list:
67 | action_list.append(f'move({box}, square{surround_index})')
68 | if 'target'+box[3:] in square_item_list:
69 | action_list.append(f'move({box}, target{box[3:]})')
70 | state_update_prompt_local_agent += f'{action_list}\n'
71 | return state_update_prompt_local_agent, state_update_prompt_other_agent
72 |
73 | def with_action_syntactic_check_func(pg_dict_input, response, user_prompt_list_input, response_total_list_input, model_name, dialogue_history_method, cen_decen_framework):
74 | user_prompt_list = copy.deepcopy(user_prompt_list_input)
75 | response_total_list = copy.deepcopy(response_total_list_input)
76 | iteration_num = 0
77 | token_num_count_list_add = []
78 | while iteration_num < 6:
79 | response_total_list.append(response)
80 | try:
81 | original_response_dict = json.loads(response)
82 |
83 | pg_dict_original = copy.deepcopy(pg_dict_input)
84 | transformed_dict = {}
85 | for key, value in original_response_dict.items():
86 | coordinates = tuple(map(float, re.findall(r"\d+\.?\d*", key)))
87 |
88 | # match the item and location in the value
89 | match = re.match(r"move\((.*?),\s(.*?)\)", value)
90 | if match:
91 | item, location = match.groups()
92 |
93 | if "square" in location:
94 | location = tuple(map(float, re.findall(r"\d+\.?\d*", location)))
95 |
96 | transformed_dict[coordinates] = [item, location]
97 |
98 | feedback = ''
99 | for key, value in transformed_dict.items():
100 | # print(f"Key: {key}, Value1: {value[0]}, Value2: {value[1]}")
101 | if value[0] in pg_dict_original[str(key[0]) + '_' + str(key[1])] and type(value[1]) == tuple and (
102 | (np.abs(key[0] - value[1][0]) == 0 and np.abs(key[1] - value[1][1]) == 1) or (
103 | np.abs(key[0] - value[1][0]) == 1 and np.abs(key[1] - value[1][1]) == 0)):
104 | pass
105 | elif value[0] in pg_dict_original[str(key[0]) + '_' + str(key[1])] and type(value[1]) == str and value[1] in \
106 | pg_dict_original[str(key[0]) + '_' + str(key[1])] and value[0][:4] == 'box_' and value[1][
107 | :7] == 'target_' and \
108 | value[0][4:] == value[1][7:]:
109 | pass
110 | else:
111 | # print(f"Error, Iteration Num: {iteration_num}, Key: {key}, Value1: {value[0]}, Value2: {value[1]}")
112 | feedback += f'Your assigned task for {key[0]}_{key[1]} is not in the doable action list; '
113 |         except Exception:
114 |             # fall back to corrective feedback (instead of raising) so the model can replan
115 |             feedback = 'Your assigned plan is not in the correct json format as before. If your answer is empty dict, please check whether you miss to move box into the same colored target like move(box_blue, target_blue)'
116 |
117 | if feedback != '':
118 |             feedback += 'Please replan for all the agents again with the same output format:'
119 | print('----------Syntactic Check----------')
120 | print(f'Response original: {response}')
121 | print(f'Feedback: {feedback}')
122 | user_prompt_list.append(feedback)
123 | messages = message_construct_func(user_prompt_list, response_total_list, dialogue_history_method) # message construction
124 | print(f'Length of messages {len(messages)}')
125 | response, token_num_count = GPT_response(messages, model_name)
126 | token_num_count_list_add.append(token_num_count)
127 | print(f'Response new: {response}\n')
128 | if response == 'Out of tokens':
129 | return response, token_num_count_list_add
130 | iteration_num += 1
131 | else:
132 | return response, token_num_count_list_add
133 | return 'Syntactic Error', token_num_count_list_add
134 |
135 | def action_from_response(pg_dict_input, original_response_dict):
136 | system_error_feedback = ''
137 | pg_dict_original = copy.deepcopy(pg_dict_input)
138 | transformed_dict = {}
139 | for key, value in original_response_dict.items():
140 | coordinates = tuple(map(float, re.findall(r"\d+\.?\d*", key)))
141 |
142 | # match the item and location in the value
143 | match = re.match(r"move\((.*?),\s(.*?)\)", value)
144 | if match:
145 | item, location = match.groups()
146 | if "square" in location:
147 | location = tuple(map(float, re.findall(r"\d+\.?\d*", location)))
148 | transformed_dict[coordinates] = [item, location]
149 |
150 | for key, value in transformed_dict.items():
151 | #print(f"Key: {key}, Value1: {value[0]}, Value2: {value[1]}")
152 | if value[0] in pg_dict_original[str(key[0])+'_'+str(key[1])] and type(value[1]) == tuple and ((np.abs(key[0]-value[1][0])==0 and np.abs(key[1]-value[1][1])==1) or (np.abs(key[0]-value[1][0])==1 and np.abs(key[1]-value[1][1])==0)):
153 | pg_dict_original[str(key[0])+'_'+str(key[1])].remove(value[0])
154 | pg_dict_original[str(value[1][0])+'_'+str(value[1][1])].append(value[0])
155 | elif value[0] in pg_dict_original[str(key[0])+'_'+str(key[1])] and type(value[1]) == str and value[1] in pg_dict_original[str(key[0])+'_'+str(key[1])] and value[0][:4] == 'box_' and value[1][:7] == 'target_' and value[0][4:] == value[1][7:]:
156 | pg_dict_original[str(key[0])+'_'+str(key[1])].remove(value[0])
157 | pg_dict_original[str(key[0])+'_'+str(key[1])].remove(value[1])
158 | else:
159 | #print(f"Error, Iteration Num: {iteration_num}, Key: {key}, Value1: {value[0]}, Value2: {value[1]}")
160 | system_error_feedback += f'Your assigned task for {key[0]}_{key[1]} is not in the doable action list; '
161 |
162 | return system_error_feedback, pg_dict_original
163 |
164 | def env_create(pg_row_num = 5, pg_column_num = 5, box_num_low_bound = 2, box_num_upper_bound = 2, color_list = ['blue', 'red', 'green', 'purple', 'orange']):
165 | # pg_dict records the items in each square over steps, here in the initial setting, we randomly assign items into each square
166 | pg_dict = {}
167 | for i in range(pg_row_num):
168 | for j in range(pg_column_num):
169 | pg_dict[str(i+0.5)+'_'+str(j+0.5)] = []
170 |
171 | for color in color_list:
172 | box_num = random.randint(box_num_low_bound, box_num_upper_bound)
173 | for _ in range(box_num):
174 | N_box = random.randint(0, pg_row_num*pg_column_num - 1)
175 | a_box = N_box // pg_column_num
176 | b_box = N_box % pg_column_num
177 | N_target = random.randint(0, pg_row_num*pg_column_num - 1)
178 | a_target = N_target // pg_column_num
179 | b_target = N_target % pg_column_num
180 | pg_dict[str(a_box+0.5)+'_'+str(b_box+0.5)].append('box_' + color)
181 | pg_dict[str(a_target+0.5)+'_'+str(b_target+0.5)].append('target_' + color)
182 | return pg_dict
183 |
184 | def create_env1(Saving_path, repeat_num = 10):
185 | if not os.path.exists(Saving_path):
186 | os.makedirs(Saving_path, exist_ok=True)
187 | else:
188 | shutil.rmtree(Saving_path)
189 | os.makedirs(Saving_path, exist_ok=True)
190 |
191 | for i ,j in [(2,2), (2,4), (4,4), (4,8)]:
192 |
193 | if not os.path.exists(Saving_path+f'/env_pg_state_{i}_{j}'):
194 | os.makedirs(Saving_path+f'/env_pg_state_{i}_{j}', exist_ok=True)
195 | else:
196 | shutil.rmtree(Saving_path+f'/env_pg_state_{i}_{j}')
197 | os.makedirs(Saving_path+f'/env_pg_state_{i}_{j}', exist_ok=True)
198 |
199 | for iteration_num in range(repeat_num):
200 | # Define the total row and column numbers of the whole playground, and the item number of each colored target and box
201 | pg_row_num = i; pg_column_num = j; box_num_low_bound = 1; box_num_upper_bound = 3
202 | # Define the used colors
203 | color_list = ['blue', 'red', 'green', 'purple', 'orange']
204 | pg_dict = env_create(pg_row_num, pg_column_num, box_num_low_bound, box_num_upper_bound, color_list)
205 | os.makedirs(Saving_path+f'/env_pg_state_{i}_{j}/pg_state{iteration_num}', exist_ok=True)
206 | with open(Saving_path+f'/env_pg_state_{i}_{j}/pg_state{iteration_num}/pg_state{iteration_num}.json', 'w') as f:
207 | json.dump(pg_dict, f)
208 |
209 | Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here
210 | Saving_path = Code_dir_path + 'Env1_BoxNet1'
211 | # The first time to create the environment, after that you can comment it
212 | create_env1(Saving_path, repeat_num = 10)
--------------------------------------------------------------------------------
/env4_create.py:
--------------------------------------------------------------------------------
1 | from prompt_env4 import *
2 | from LLM import *
3 | from sre_constants import error
4 | import random
5 | import os
6 | import json
7 | import re
8 | import copy
9 | import numpy as np
10 | import shutil
11 | import time
12 |
13 | def state_update_func(agent_position_state_dict, box_position_dict, track_row_num, column_num):
14 | state_update_prompt = f'The states and actions of available agents are: \n'
15 | state_update_prompt += f'The left boxes and their locations in the warehouse are: '
16 | for key, value in box_position_dict.items():
17 | if value == 1:
18 | state_update_prompt += f'box_{key}, '
19 | state_update_prompt += f'.\n'
20 |
21 | for i in range(len(agent_position_state_dict)):
22 | if type(agent_position_state_dict[f'agent{i}']) == str and agent_position_state_dict[f'agent{i}'] == 'target':
23 | state_update_prompt += f'I am agent{i}, I am in target now, I can do: '
24 | for row_num in range(track_row_num):
25 | state_update_prompt += f'move to track_{row_num}; '
26 | else:
27 | if agent_position_state_dict[f'agent{i}'][2] == 1:
28 | state_update_prompt += f'I am agent{i}, I am in track_{agent_position_state_dict[f"agent{i}"][0]} and column_{agent_position_state_dict[f"agent{i}"][1]}, I am having box on myself so can not pick more box now. I can do: '
29 | else:
30 | state_update_prompt += f'I am agent{i}, I am in track_{agent_position_state_dict[f"agent{i}"][0]} and column_{agent_position_state_dict[f"agent{i}"][1]}, I am not having box on myself so can pick one box. I can do: '
31 | if agent_position_state_dict[f'agent{i}'][1] > 0:
32 | state_update_prompt += f'move left; '
33 | if agent_position_state_dict[f'agent{i}'][1] < column_num-1:
34 | state_update_prompt += f'move right; '
35 | if agent_position_state_dict[f'agent{i}'][1] == 0:
36 | state_update_prompt += f'move to target; '
37 | if agent_position_state_dict[f'agent{i}'][0] - 0.5 > 0 and box_position_dict[f'{agent_position_state_dict[f"agent{i}"][0]-0.5}_{float(agent_position_state_dict[f"agent{i}"][1])}'] == 1 and agent_position_state_dict[f'agent{i}'][2] == 0:
38 | state_update_prompt += f'pick box_{agent_position_state_dict[f"agent{i}"][0]-0.5}_{float(agent_position_state_dict[f"agent{i}"][1])}; '
39 | if agent_position_state_dict[f'agent{i}'][0] + 0.5 < track_row_num-1 and box_position_dict[f'{agent_position_state_dict[f"agent{i}"][0] + 0.5}_{float(agent_position_state_dict[f"agent{i}"][1])}'] == 1 and agent_position_state_dict[f'agent{i}'][2] == 0:
40 | state_update_prompt += f'pick box_{agent_position_state_dict[f"agent{i}"][0]+0.5}_{float(agent_position_state_dict[f"agent{i}"][1])}; '
41 | state_update_prompt += f'.\n'
42 |
43 | state_update_prompt += f'\n'
44 | return state_update_prompt
45 |
46 |
47 | def action_from_response(pg_dict_input, original_response_dict, track_row_num, column_num, box_position_dict_input):
48 | collision_check = False
49 | system_error_feedback = ''
50 | pg_dict_original = copy.deepcopy(pg_dict_input)
51 | box_position_dict = copy.deepcopy(box_position_dict_input)
52 |
53 | for key, value in original_response_dict.items():
54 | # '{"agent0":"move left", "agent1":"pick box_1.0_1.5"}'
55 |
56 | if 'left' in value:
57 | if pg_dict_original[key][1]>0:
58 | pg_dict_original[key][1] -= 1
59 | elif pg_dict_original[key][1]==0:
60 |                 system_error_feedback += f'{key} has arrived at the left side of the track, so you can not move left. You can move to target to leave the track and drop the box (if you have a box), or you can move right or pick up nearby boxes.'
61 | else:
62 | print(f"Error, Key: {key}, Value: {value}")
63 | system_error_feedback += f'Your assigned task for {key} is not in the doable action list; '
64 |         elif 'right' in value:
65 |             if pg_dict_original[key][1] < column_num - 1:
[... lines 66-97 of env4_create.py were lost in extraction; they completed the 'move right' branch and handled the 'move to target' and 'pick box_' actions ...]
98 |             if float_numbers[0] >= track_row_num or float_numbers[0] <= 0:
99 | print(f"Error, Key: {key}, Value: {value}")
100 | system_error_feedback += f'Your assigned task for {key} is not in the doable action list; '
101 | if pg_dict_original[key] == 'target':
102 | #print(float_numbers)
103 | pg_dict_original[key] = [float_numbers[0], 0, 0]
104 |
105 | else:
106 | print(f"Error, Key: {key}, Value: {value}")
107 | system_error_feedback += f'Your assigned task for {key} is not in the doable action list; '
108 |
109 | position_list = []
110 | for key, value in pg_dict_original.items():
111 | if value == 'target':
112 | pass
113 | else:
114 | if [value[0], value[1]] in position_list:
115 | collision_check = True
116 | break
117 | else:
118 | position_list.append([value[0], value[1]])
119 |
120 | return system_error_feedback, pg_dict_original, collision_check, box_position_dict
121 |
122 |
123 | def with_action_syntactic_check_func(pg_dict_input, response, user_prompt_list_input, response_total_list_input,
124 | model_name, dialogue_history_method, track_row_num, column_num, box_position_dict):
125 | #print('----------Syntactic Check----------')
126 | user_prompt_list = copy.deepcopy(user_prompt_list_input)
127 | response_total_list = copy.deepcopy(response_total_list_input)
128 | iteration_num = 0
129 | token_num_count_list_add = []
130 | while iteration_num < 6:
131 | response_total_list.append(response)
132 | # try:
133 | original_response_dict = json.loads(response)
134 | pg_dict_original = copy.deepcopy(pg_dict_input)
135 | system_error_feedback, pg_dict_original_return, collision_check_return, box_position_dict_return = action_from_response(
136 | pg_dict_original, original_response_dict, track_row_num, column_num, box_position_dict)
137 | feedback = system_error_feedback
138 |
139 | if feedback != '':
140 |             feedback += 'Please replan for all the agents again with the same output format:'
141 | print('----------Syntactic Check----------')
142 | print(f'Response original: {response}')
143 | print(f'Feedback: {feedback}')
144 | user_prompt_list.append(feedback)
145 | messages = message_construct_func(user_prompt_list, response_total_list,
146 | dialogue_history_method) # message construction
147 | print(f'Length of messages {len(messages)}')
148 | response, token_num_count = GPT_response(messages, model_name)
149 | token_num_count_list_add.append(token_num_count)
150 | print(f'Response new: {response}\n')
151 | if response == 'Out of tokens':
152 | return response, token_num_count_list_add
153 | iteration_num += 1
154 | else:
155 | return response, token_num_count_list_add
156 | return 'Syntactic Error', token_num_count_list_add
157 |
158 | def generate_unique_integers(lower_limit, upper_limit, m):
159 | if m > (upper_limit - lower_limit + 1):
160 | raise ValueError("Cannot generate more unique integers than the range allows.")
161 | unique_integers = set()
162 | while len(unique_integers) < m:
163 | unique_integers.add(random.randint(lower_limit, upper_limit))
164 | return list(unique_integers)
165 |
166 | def env_create(track_row_num, column_num, box_occupy_ratio, agent_num):
167 | box_position_dict = {}
168 | # assign boxes to positions
169 | for i in range(track_row_num -1):
170 | for j in range(column_num):
171 | if random.random() < box_occupy_ratio:
172 | box_position_dict[f'{float(i+0.5)}_{float(j)}'] = 1
173 | else:
174 | box_position_dict[f'{float(i+0.5)}_{float(j)}'] = 0
175 |
176 | # assign agents into positions
177 | agent_position_state_dict = {}
178 |
179 | lower_limit = 0
180 | upper_limit = track_row_num * column_num - 1
181 | unique_integers = generate_unique_integers(lower_limit, upper_limit + 5, agent_num)
182 |
183 | for i in range(agent_num):
184 | if unique_integers[i] > upper_limit:
185 | agent_position_state_dict[f'agent{i}'] = 'target'
186 | else:
187 | agent_position_state_dict[f'agent{i}'] = (unique_integers[i] // column_num, unique_integers[i] % column_num, 0)
188 | return agent_position_state_dict, box_position_dict
189 |
190 |
191 | def create_env4(Saving_path, repeat_num = 10):
192 | if not os.path.exists(Saving_path):
193 | os.makedirs(Saving_path, exist_ok=True)
194 | else:
195 | shutil.rmtree(Saving_path)
196 | os.makedirs(Saving_path, exist_ok=True)
197 |
198 | for track_row_num, column_num, box_occupy_ratio, agent_num in [(3, 5, 0.5, 4)]:
199 | if not os.path.exists(Saving_path+f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}'):
200 | os.makedirs(Saving_path+f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}', exist_ok=True)
201 | else:
202 | shutil.rmtree(Saving_path+f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}')
203 | os.makedirs(Saving_path+f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}', exist_ok=True)
204 |
205 | for iteration_num in range(repeat_num):
206 | agent_position_state_dict, box_position_dict = env_create(track_row_num, column_num, box_occupy_ratio, agent_num)
207 | print('Initial agent state: ', agent_position_state_dict)
208 | print('Box_matrix: ', box_position_dict)
209 | os.makedirs(Saving_path+f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}/pg_state{iteration_num}', exist_ok=True)
210 | with open(Saving_path+f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}/pg_state{iteration_num}/pg_state{iteration_num}.json', 'w') as f:
211 | json.dump(agent_position_state_dict, f)
212 | with open(Saving_path+f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}/pg_state{iteration_num}/box_state{iteration_num}.json', 'w') as f:
213 | json.dump(box_position_dict, f)
214 | print('\n')
215 |
216 | Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here
217 | Saving_path = Code_dir_path + 'Env4_Warehouse'
218 | model_name = 'gpt-4-0613' #'gpt-4-0613', 'gpt-3.5-turbo-16k-0613'
219 | # The first time to create the environment, after that you can comment it
220 | create_env4(Saving_path, repeat_num = 10)
--------------------------------------------------------------------------------
/env2_create.py:
--------------------------------------------------------------------------------
1 | # Box moving to target with collisions
2 |
3 | from prompt_env2 import *
4 | from LLM import *
5 | from sre_constants import error
6 | import random
7 | import os
8 | import json
9 | import re
10 | import copy
11 | import numpy as np
12 | import shutil
13 | import time
14 |
15 | def corner_position(pg_row_i, pg_column_j):
16 | corner_position_list = [(float(pg_row_i), float(pg_column_j)), (float(pg_row_i), float(pg_column_j + 1)), (float(pg_row_i + 1), float(pg_column_j)),
17 | (float(pg_row_i + 1), float(pg_column_j + 1))]
18 | return corner_position_list
19 |
20 | def judge_move_box2pos_box2target_func(key, value, pg_dict_original):
21 | if not (str(key[0] - 0.5) + '_' + str(key[1] - 0.5) in pg_dict_original.keys() \
22 | and str(key[0] - 0.5) + '_' + str(key[1] + 0.5) in pg_dict_original.keys() \
23 | and str(key[0] + 0.5) + '_' + str(key[1] - 0.5) in pg_dict_original.keys() \
24 | and str(key[0] + 0.5) + '_' + str(key[1] + 0.5) in pg_dict_original.keys() \
25 | and np.mod(key[0], 1) == 0.5 and np.mod(key[1], 1) == 0.5):
26 | return None, False, False, f'Agent[{float(key[0])}, {float(key[1])}] is not in the agent list. '
27 |
28 | if value[0] in pg_dict_original[str(key[0] - 0.5) + '_' + str(key[1] - 0.5)]:
29 | box_location = (key[0] - 0.5, key[1] - 0.5)
30 | elif value[0] in pg_dict_original[str(key[0] - 0.5) + '_' + str(key[1] + 0.5)]:
31 | box_location = (key[0] - 0.5, key[1] + 0.5)
32 | elif value[0] in pg_dict_original[str(key[0] + 0.5) + '_' + str(key[1] - 0.5)]:
33 | box_location = (key[0] + 0.5, key[1] - 0.5)
34 | elif value[0] in pg_dict_original[str(key[0] + 0.5) + '_' + str(key[1] + 0.5)]:
35 | box_location = (key[0] + 0.5, key[1] + 0.5)
36 | else:
37 | return None, False, False, ''
38 |
39 | if type(value[1]) == tuple and (np.abs(key[0]-value[1][0])==0.5 and np.abs(key[1]-value[1][1])==0.5):
40 | return box_location, True, False, ''
41 | elif type(value[1]) == str and value[1] in pg_dict_original[str(key[0])+'_'+str(key[1])] and value[0][:4] == 'box_' and value[1][:7] == 'target_' and value[0][4:] == value[1][7:]:
42 | return box_location, False, True, ''
43 | else:
44 | return None, False, False, f'Your assigned task for {key[0]}_{key[1]} is not in the doable action list; '
45 |
46 |
47 | def state_update_func(pg_row_num, pg_column_num, pg_dict):
48 | pg_dict_copy = copy.deepcopy(pg_dict)
49 | state_update_prompt = ''
50 | for i in range(pg_row_num):
51 | for j in range(pg_column_num):
52 | square_item_list = pg_dict_copy[str(i + 0.5) + '_' + str(j + 0.5)]
53 | state_update_prompt += f'Agent[{i+0.5}, {j+0.5}]: I am in square[{i+0.5}, {j+0.5}], I can observe {square_item_list}, I can do '
54 | action_list = []
55 | for corner_x, corner_y in corner_position(i, j):
56 | if len(pg_dict_copy[str(corner_x)+'_'+str(corner_y)]) == 1:
57 | box = pg_dict_copy[str(corner_x)+'_'+str(corner_y)][0]
58 | for surround_index in corner_position(i, j):
59 | if surround_index != (corner_x, corner_y):
60 | action_list.append(f'move({box}, position{surround_index})')
61 | if 'target'+box[3:] in pg_dict_copy[str(i+0.5)+'_'+str(j+0.5)]:
62 | action_list.append(f'move({box}, target{box[3:]})')
63 | state_update_prompt += f'{action_list}\n'
64 | return state_update_prompt
65 |
66 | def state_update_func_local_agent(pg_row_num, pg_column_num, pg_row_i, pg_column_j, pg_dict):
67 | pg_dict_copy = copy.deepcopy(pg_dict)
68 | state_update_prompt_local_agent = ''
69 | state_update_prompt_other_agent = ''
70 |
71 | for i in range(pg_row_num):
72 | for j in range(pg_column_num):
73 | if not (i == pg_row_i and pg_column_j == j):
74 | square_item_list = pg_dict_copy[str(i + 0.5) + '_' + str(j + 0.5)]
75 | state_update_prompt_other_agent += f'Agent[{i+0.5}, {j+0.5}]: I am in square[{i+0.5}, {j+0.5}], I can observe {square_item_list}, I can do '
76 | action_list = []
77 | for corner_x, corner_y in corner_position(i, j):
78 | if len(pg_dict_copy[str(corner_x) + '_' + str(corner_y)]) == 1:
79 | box = pg_dict_copy[str(corner_x) + '_' + str(corner_y)][0]
80 | for surround_index in corner_position(i, j):
81 | if surround_index != (corner_x, corner_y):
82 | action_list.append(f'move({box}, position{surround_index})')
83 | if 'target' + box[3:] in pg_dict_copy[str(i + 0.5) + '_' + str(j + 0.5)]:
84 | action_list.append(f'move({box}, target{box[3:]})')
85 | state_update_prompt_other_agent += f'{action_list}\n'
86 |
87 |     state_update_prompt_local_agent += f'Agent[{pg_row_i+0.5}, {pg_column_j+0.5}]: in square[{pg_row_i+0.5}, {pg_column_j+0.5}], can observe {pg_dict_copy[str(pg_row_i+0.5)+"_"+str(pg_column_j+0.5)]}, can do '
88 | action_list = []
89 | for corner_x, corner_y in corner_position(pg_row_i, pg_column_j):
90 | if len(pg_dict_copy[str(corner_x) + '_' + str(corner_y)]) == 1:
91 | box = pg_dict_copy[str(corner_x) + '_' + str(corner_y)][0]
92 | for surround_index in corner_position(pg_row_i, pg_column_j):
93 | if surround_index != (corner_x, corner_y):
94 | action_list.append(f'move({box}, position{surround_index})')
95 |             if 'target' + box[3:] in pg_dict_copy[str(pg_row_i + 0.5) + '_' + str(pg_column_j + 0.5)]:
96 | action_list.append(f'move({box}, target{box[3:]})')
97 | state_update_prompt_local_agent += f'{action_list}\n'
98 | return state_update_prompt_local_agent, state_update_prompt_other_agent
99 |
100 | def with_action_syntactic_check_func(pg_dict_input, response, user_prompt_list_input, response_total_list_input, model_name, dialogue_history_method):
101 | user_prompt_list = copy.deepcopy(user_prompt_list_input)
102 | response_total_list = copy.deepcopy(response_total_list_input)
103 | iteration_num = 0
104 | token_num_count_list_add = []
105 | while iteration_num < 6:
106 | response_total_list.append(response)
107 | try:
108 | original_response_dict = json.loads(response)
109 |
110 | pg_dict_original = copy.deepcopy(pg_dict_input)
111 | transformed_dict = {}
112 | for key, value in original_response_dict.items():
113 | coordinates = tuple(map(float, re.findall(r"\d+\.?\d*", key)))
114 |
115 | # match the item and location in the value
116 | match = re.match(r"move\((.*?),\s(.*?)\)", value)
117 | if match:
118 | item, location = match.groups()
119 |
120 | if "position" in location:
121 | location = tuple(map(float, re.findall(r"\d+\.?\d*", location)))
122 |
123 | transformed_dict[coordinates] = [item, location]
124 |
125 | feedback = ''
126 | for key, value in transformed_dict.items():
127 | # print(f"Key: {key}, Value1: {value[0]}, Value2: {value[1]}")
128 |                 box_location, judge_move_box2pos, judge_move_box2target, feedback_add = judge_move_box2pos_box2target_func(key, value,
129 |                                                                                                                             pg_dict_original)
130 |                 if judge_move_box2pos == False and judge_move_box2target == False:
131 |                     feedback += feedback_add
132 | except:
133 | feedback = 'Your assigned plan is not in the correct json format as before. If your answer is empty dict, please check whether you miss to move box into the same colored target like move(box_blue, target_blue)'
134 |
135 | if feedback != '':
136 |             feedback += 'Please replan for all the agents again with the same output format:'
137 | print('----------Syntactic Check----------')
138 | print(f'Response original: {response}')
139 | print(f'Feedback: {feedback}')
140 | user_prompt_list.append(feedback)
141 | messages = message_construct_func(user_prompt_list, response_total_list, dialogue_history_method) # message construction
142 | print(f'Length of messages {len(messages)}')
143 | response, token_num_count = GPT_response(messages, model_name)
144 | token_num_count_list_add.append(token_num_count)
145 | print(f'Response new: {response}\n')
146 | if response == 'Out of tokens':
147 | return response, token_num_count_list_add
148 | iteration_num += 1
149 | else:
150 | return response, token_num_count_list_add
151 | return 'Syntactic Error', token_num_count_list_add
152 |
153 | def action_from_response(pg_dict_input, original_response_dict):
154 | collision_check = False
155 | system_error_feedback = ''
156 | pg_dict_original = copy.deepcopy(pg_dict_input)
157 | transformed_dict = {}
158 | for key, value in original_response_dict.items():
159 | coordinates = tuple(map(float, re.findall(r"\d+\.?\d*", key)))
160 |
161 | # match the item and location in the value
162 | match = re.match(r"move\((.*?),\s(.*?)\)", value)
163 | if match:
164 | item, location = match.groups()
165 | if "position" in location:
166 | location = tuple(map(float, re.findall(r"\d+\.?\d*", location)))
167 | transformed_dict[coordinates] = [item, location]
168 |
169 | for key, value in transformed_dict.items():
170 | print(f"Key: {key}, Value1: {value[0]}, Value2: {value[1]}")
171 | box_location, judge_move_box2pos, judge_move_box2target, feedback = judge_move_box2pos_box2target_func(key, value, pg_dict_original)
172 | if judge_move_box2pos == True:
173 | pg_dict_original[str(box_location[0])+'_'+str(box_location[1])].remove(value[0])
174 | pg_dict_original[str(value[1][0])+'_'+str(value[1][1])].append(value[0])
175 | elif judge_move_box2target == True:
176 | pg_dict_original[str(box_location[0])+'_'+str(box_location[1])].remove(value[0])
177 | pg_dict_original[str(key[0])+'_'+str(key[1])].remove(value[1])
178 | else:
179 | #print(f"Error, Iteration Num: {iteration_num}, Key: {key}, Value1: {value[0]}, Value2: {value[1]}")
180 | system_error_feedback += f'Your assigned task for {key[0]}_{key[1]} is not in the doable action list; '
181 | for key, value in transformed_dict.items():
182 | box_location, judge_move_box2pos, judge_move_box2target, feedback = judge_move_box2pos_box2target_func(key, value,
183 | pg_dict_original)
184 | if judge_move_box2pos == True and len(pg_dict_original[str(value[1][0]) + '_' + str(value[1][1])]) > 1:
185 | collision_check = True
186 | break
187 |
188 | return system_error_feedback, pg_dict_original, collision_check
189 |
190 | def env_create(pg_row_num = 5, pg_column_num = 5, box_num_low_bound = 2, box_num_upper_bound = 2, color_list = ['blue', 'red', 'green', 'purple', 'orange']):
191 | # pg_dict records the items in each square over steps, here in the initial setting, we randomly assign items into each square
192 | pg_dict = {}
193 | for i in range(pg_row_num):
194 | for j in range(pg_column_num):
195 | pg_dict[str(i+0.5)+'_'+str(j+0.5)] = []
196 | for i in range(pg_row_num+1):
197 | for j in range(pg_column_num+1):
198 | pg_dict[str(float(i))+'_'+str(float(j))] = []
199 |
200 | for color in color_list:
201 | box_num = random.randint(box_num_low_bound, box_num_upper_bound)
202 | for _ in range(box_num):
203 | N_box = random.randint(0, pg_row_num*pg_column_num - 1)
204 | a_box = N_box // pg_column_num
205 | b_box = N_box % pg_column_num
206 | N_target = random.randint(0, pg_row_num*pg_column_num - 1)
207 | a_target = N_target // pg_column_num
208 | b_target = N_target % pg_column_num
209 | corner_list = [(1.0, 0.0), (0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
210 | random.shuffle(corner_list)
211 | for random_x, random_y in corner_list:
212 | if len(pg_dict[str(float(a_box) + random_x)+'_'+str(float(b_box) + random_y)]) == 0:
213 | pg_dict[str(float(a_box) + random_x) + '_' + str(float(b_box) + random_y)].append('box_' + color)
214 | pg_dict[str(a_target+0.5)+'_'+str(b_target+0.5)].append('target_' + color)
215 | break
216 | print(pg_dict)
217 | print('\n')
218 | return pg_dict
219 |
220 | def create_env2(Saving_path, repeat_num = 10):
221 | if not os.path.exists(Saving_path):
222 | os.makedirs(Saving_path, exist_ok=True)
223 | else:
224 | shutil.rmtree(Saving_path)
225 | os.makedirs(Saving_path, exist_ok=True)
226 |
227 | for i ,j in [(2,2), (2,4), (4,4), (4,8)]:
228 | if not os.path.exists(Saving_path+f'/env_pg_state_{i}_{j}'):
229 | os.makedirs(Saving_path+f'/env_pg_state_{i}_{j}', exist_ok=True)
230 | else:
231 | shutil.rmtree(Saving_path+f'/env_pg_state_{i}_{j}')
232 | os.makedirs(Saving_path+f'/env_pg_state_{i}_{j}', exist_ok=True)
233 |
234 | for iteration_num in range(repeat_num):
235 | # Define the total row and column numbers of the whole playground, and the item number of each colored target and box
236 | pg_row_num = i; pg_column_num = j; box_num_low_bound = 1; box_num_upper_bound = 1
237 | # Define the used colors
238 | color_list = ['blue', 'red', 'green', 'purple', 'orange']
239 | pg_dict = env_create(pg_row_num, pg_column_num, box_num_low_bound, box_num_upper_bound, color_list)
240 | os.makedirs(Saving_path+f'/env_pg_state_{i}_{j}/pg_state{iteration_num}', exist_ok=True)
241 | with open(Saving_path+f'/env_pg_state_{i}_{j}/pg_state{iteration_num}/pg_state{iteration_num}.json', 'w') as f:
242 | json.dump(pg_dict, f)
243 |
244 | Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here
245 | Saving_path = Code_dir_path + 'Env2_BoxNet2'
246 | # The first time to create the environment, after that you can comment it
247 | create_env2(Saving_path, repeat_num = 10)
--------------------------------------------------------------------------------
/env1-box-arrange.py:
--------------------------------------------------------------------------------
1 | from LLM import *
2 | from prompt_env1 import *
3 | from env1_create import *
4 | from sre_constants import error
5 | import random
6 | import os
7 | import json
8 | import re
9 | import copy
10 | import numpy as np
11 | import shutil
12 | import time
13 |
14 | # cen_decen_framework = 'DMAS', 'HMAS-1', 'CMAS', 'HMAS-2'
15 | # dialogue_history_method = '_w_all_dialogue_history', '_wo_any_dialogue_history', '_w_only_state_action_history'
16 | def run_exp(Saving_path, pg_row_num, pg_column_num, iteration_num, query_time_limit, dialogue_history_method = '_w_all_dialogue_history', cen_decen_framework = 'CMAS', model_name = 'gpt-4'):
17 |
18 | Saving_path_result = Saving_path+f'/env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}_{model_name}'
19 |
20 | # specify the path to your dir for saving the results
21 | os.makedirs(Saving_path_result, exist_ok=True)
22 | os.makedirs(Saving_path_result+f'/prompt', exist_ok=True)
23 | os.makedirs(Saving_path_result+f'/response', exist_ok=True)
24 | os.makedirs(Saving_path_result+f'/pg_state', exist_ok=True)
25 | os.makedirs(Saving_path_result + f'/dialogue_history', exist_ok=True)
26 |
27 | with open(Saving_path+f'/env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/pg_state{iteration_num}.json', 'r') as file:
28 | pg_dict = json.load(file)
29 |
30 | user_prompt_list = [] # The record list of all the input prompts
31 | response_total_list = [] # The record list of all the responses
32 |     pg_state_list = [] # The record list of pg states in varied steps
33 | dialogue_history_list = []
34 | token_num_count_list = []
35 | pg_state_list.append(pg_dict)
36 | with open(Saving_path_result+'/pg_state' + '/pg_state'+str(1)+'.json', 'w') as f:
37 | json.dump(pg_dict, f)
38 |
39 | ### Start the Game! Query LLM for response
40 | print(f'query_time_limit: {query_time_limit}')
41 | for index_query_times in range(query_time_limit): # The upper limit of calling LLMs
42 | #print(index_query_times)
43 | print(pg_dict)
44 | state_update_prompt = state_update_func(pg_row_num, pg_column_num, pg_dict)
45 |         if cen_decen_framework == 'DMAS':
46 | print('--------DMAS method starts--------')
47 | match = None
48 | count_round = 0
49 | dialogue_history = ''
50 | response = '{}'
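51 |             # agents speak in row-major order; the dialogue ends once an agent replies EXECUTE with a JSON plan, or after four rounds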
51 | while not match and count_round <= 3:
52 | count_round += 1
53 | for local_agent_row_i in range(pg_row_num):
54 | for local_agent_column_j in range(pg_column_num):
55 | #if f'Agent[{local_agent_row_i + 0.5}, {local_agent_column_j + 0.5}]' in pg_dict:
56 | state_update_prompt_local_agent, state_update_prompt_other_agent = state_update_func_local_agent(pg_row_num,
57 | pg_column_num,
58 | local_agent_row_i,
59 | local_agent_column_j,
60 | pg_dict)
61 | user_prompt_1 = input_prompt_local_agent_DMAS_dialogue_func(state_update_prompt_local_agent,
62 | state_update_prompt_other_agent,
63 | dialogue_history, response_total_list,
64 | pg_state_list, dialogue_history_list,
65 | dialogue_history_method)
66 | user_prompt_list.append(user_prompt_1)
67 | print(f'User prompt: {user_prompt_1}\n\n')
68 | with open(Saving_path_result + '/prompt' + '/user_prompt_' + str(index_query_times + 1), 'w') as f:
69 | f.write(user_prompt_list[-1])
70 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history')
71 | initial_response, token_num_count = GPT_response(messages,model_name) # 'gpt-4' or 'gpt-3.5-turbo-0301' or 'gpt-4-32k' or 'gpt-3' or 'gpt-4-0613'
72 | token_num_count_list.append(token_num_count)
73 |
74 | dialogue_history += f'[Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]: {initial_response}]\n\n'
75 | #print(dialogue_history)
76 | if re.search(r'EXECUTE', initial_response):
77 | # Search for the pattern that starts with { and ends with }
78 | print('EXECUTE!')
79 | match = re.search(r'{.*}', initial_response, re.DOTALL)
80 | if match:
81 | response = match.group()
82 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response,
83 | [user_prompt_list[-1]],
84 | [],
85 | model_name,
86 | '_w_all_dialogue_history',
87 | cen_decen_framework)
88 | token_num_count_list = token_num_count_list + token_num_count_list_add
89 | print(f'response: {response}')
90 | #print(f'User prompt: {user_prompt_1}\n\n')
91 | break
92 | break
93 | dialogue_history_list.append(dialogue_history)
94 | else:
95 | if cen_decen_framework in ('CMAS', 'HMAS-1', 'HMAS-1-fast', 'HMAS-2'):
96 | user_prompt_1 = input_prompt_1_func_total(state_update_prompt, response_total_list,
97 | pg_state_list, dialogue_history_list,
98 | dialogue_history_method, cen_decen_framework)
99 | user_prompt_list.append(user_prompt_1)
100 |
101 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history') # message construction
102 |
103 | with open(Saving_path_result+'/prompt' + '/user_prompt_'+str(index_query_times+1), 'w') as f:
104 | f.write(user_prompt_list[-1])
105 | initial_response, token_num_count = GPT_response(messages, model_name) # 'gpt-4' or 'gpt-3.5-turbo-0301' or 'gpt-4-32k' or 'gpt-3' or 'gpt-4-0613'
106 | print('Initial response: ', initial_response)
107 | token_num_count_list.append(token_num_count)
108 | match = re.search(r'{.*}', initial_response, re.DOTALL)
109 | if match:
110 | response = match.group()
111 | if response[0] == '{' and response[-1] == '}':
112 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response, [user_prompt_1], [], model_name, '_w_all_dialogue_history', cen_decen_framework)
113 | token_num_count_list = token_num_count_list + token_num_count_list_add
114 | print(f'response: {response}')
115 | else:
116 | raise ValueError(f'Response format error: {response}')
117 |
118 | if response == 'Out of tokens':
119 | success_failure = 'failure over token length limit'
120 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
121 | elif response == 'Syntactic Error':
122 | success_failure = 'Syntactic Error'
123 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
124 |
125 | # Local agent response for checking the feasibility of actions
126 | if cen_decen_framework == 'HMAS-2':
127 | print('--------HMAS-2 method starts--------')
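128 |                 # each local agent reviews the central plan; any reply other than 'I Agree' is collected as feedback for one replanning pass
129 |                 # (the extracted plan parses as JSON, e.g. {"Agent[0.5, 0.5]": "move(box_blue, square[0.5, 1.5])", ...})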
128 | dialogue_history = f'Central Planner: {response}\n'
129 | #print(f'Original plan response: {response}')
130 | prompt_list_dir = {}; response_list_dir = {}; local_agent_response_list_dir = {}
131 | local_agent_response_list_dir['feedback1'] = ''
132 | agent_dict = json.loads(response)
133 | for local_agent_row_i in range(pg_row_num):
134 | for local_agent_column_j in range(pg_column_num):
135 | if f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]' in agent_dict:
136 | prompt_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'] = []
137 | response_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'] = []
138 | state_update_prompt_local_agent, state_update_prompt_other_agent = state_update_func_local_agent(pg_row_num, pg_column_num, local_agent_row_i, local_agent_column_j, pg_dict)
139 |
140 | local_reprompt = input_prompt_local_agent_HMAS2_dialogue_func(state_update_prompt_local_agent, state_update_prompt_other_agent, response, response_total_list, pg_state_list, dialogue_history_list, dialogue_history_method)
141 | prompt_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'].append(local_reprompt)
142 | messages = message_construct_func(prompt_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'], response_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'], '_w_all_dialogue_history')
143 | response_local_agent, token_num_count = GPT_response(messages, model_name)
144 | token_num_count_list.append(token_num_count)
145 | #print(f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}] response: {response_local_agent}')
146 | if response_local_agent != 'I Agree':
147 | local_agent_response_list_dir['feedback1'] += f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]: {response_local_agent}\n' # collect the response from all the local agents
148 | dialogue_history += f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]: {response_local_agent}\n'
149 |
150 | if local_agent_response_list_dir['feedback1'] != '':
151 |                     local_agent_response_list_dir['feedback1'] += '\nThis is the feedback from local agents. If you find some errors in your previous plan, try to modify it. Otherwise, output the same plan as before. The output should have the same json format {Agent[0.5, 0.5]:move(box_blue, square[0.5, 1.5]), Agent[1.5, 0.5]:move...}, as above. Do not explain, just directly output the JSON dictionary. Your response:'
152 | messages = message_construct_func([user_prompt_list[-1], local_agent_response_list_dir['feedback1']], [response], '_w_all_dialogue_history') # message construction
153 | response_central_again, token_num_count = GPT_response(messages, model_name)
154 | token_num_count_list.append(token_num_count)
155 | match = re.search(r'{.*}', response_central_again, re.DOTALL)
156 | if match:
157 | response = match.group()
158 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response_central_again, [user_prompt_list[-1], local_agent_response_list_dir['feedback1']], [response], model_name,
159 | '_w_all_dialogue_history', cen_decen_framework)
160 | token_num_count_list = token_num_count_list + token_num_count_list_add
161 | print(f'response: {response}')
162 | #print(messages[2])
163 | #print(messages[3])
164 | print(f'Modified plan response:\n {response}')
165 | else:
166 | print(f'Plan:\n {response}')
167 | pass
168 |
169 | dialogue_history_list.append(dialogue_history)
170 |
171 | elif cen_decen_framework == 'HMAS-1' or cen_decen_framework == 'HMAS-1-fast':
172 | print('--------HMAS-1 method starts--------')
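173 |                 # the central plan opens the dialogue; agents respond in turn until one issues EXECUTE with a full plan (HMAS-1-fast switches to a faster replanning prompt from round two)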
173 | count_round = 0
174 | dialogue_history = f'Central Planner: {response}\n'
175 | match = None
176 | agent_dict = json.loads(response)
177 | while not match and count_round <= 3:
178 | count_round += 1
179 | for local_agent_row_i in range(pg_row_num):
180 | for local_agent_column_j in range(pg_column_num):
181 | if f'Agent[{local_agent_row_i + 0.5}, {local_agent_column_j + 0.5}]' in agent_dict:
182 | state_update_prompt_local_agent, state_update_prompt_other_agent = state_update_func_local_agent(
183 | pg_row_num,
184 | pg_column_num,
185 | local_agent_row_i,
186 | local_agent_column_j,
187 | pg_dict)
188 | if count_round >= 2 and cen_decen_framework == 'HMAS-1-fast':
189 | user_prompt_1 = input_prompt_local_agent_HMAS1_dialogue_fast_plan_func(state_update_prompt_local_agent,
190 | state_update_prompt_other_agent,
191 | dialogue_history, response_total_list, pg_state_list,
192 | dialogue_history_list, dialogue_history_method,
193 | initial_plan=response)
194 | else:
195 | user_prompt_1 = input_prompt_local_agent_HMAS1_dialogue_func(state_update_prompt_local_agent,
196 | state_update_prompt_other_agent, dialogue_history,
197 | response_total_list, pg_state_list,
198 | dialogue_history_list, dialogue_history_method,
199 | initial_plan='')
200 |
201 | user_prompt_list.append(user_prompt_1)
202 | with open(Saving_path_result + '/prompt' + '/user_prompt_' + str(index_query_times + 1), 'w') as f:
203 | f.write(user_prompt_list[-1])
204 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history')
205 | initial_response, token_num_count = GPT_response(messages,model_name) # 'gpt-4' or 'gpt-3.5-turbo-0301' or 'gpt-4-32k' or 'gpt-3' or 'gpt-4-0613'
206 | token_num_count_list.append(token_num_count)
207 |
208 | #print('-----------prompt------------\n' + initial_response)
209 | dialogue_history += f'Agent[{local_agent_row_i + 0.5}, {local_agent_column_j + 0.5}]: {initial_response}\n'
210 | #print(dialogue_history)
211 | match = re.search(r'{.*}', initial_response, re.DOTALL)
212 | if match and re.search(r'EXECUTE', initial_response):
213 | response = match.group()
214 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response,
215 | [user_prompt_list[-1]],
216 | [],
217 | model_name,
218 | '_w_all_dialogue_history',
219 | cen_decen_framework)
220 | token_num_count_list = token_num_count_list + token_num_count_list_add
221 | print(f'response: {response}')
222 | break
223 | break
224 | dialogue_history_list.append(dialogue_history)
225 |
226 | response_total_list.append(response)
227 | if response == 'Out of tokens':
228 | success_failure = 'failure over token length limit'
229 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
230 | elif response == 'Syntactic Error':
231 | success_failure = 'Syntactic Error'
232 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
233 |
234 | data = json.loads(response)
235 |
236 | with open(Saving_path_result+'/response' + '/response'+str(index_query_times+1)+'.json', 'w') as f:
237 | json.dump(data, f)
238 | original_response_dict = json.loads(response_total_list[index_query_times])
239 | print(pg_dict)
240 | if cen_decen_framework in ('DMAS', 'HMAS-1', 'HMAS-1-fast'):
241 | with open(Saving_path_result+'/dialogue_history' + '/dialogue_history'+str(index_query_times)+'.txt', 'w') as f:
242 | f.write(dialogue_history_list[index_query_times])
243 | try:
244 | system_error_feedback, pg_dict_returned = action_from_response(pg_dict, original_response_dict)
245 | if system_error_feedback != '':
246 | print(system_error_feedback)
247 | pg_dict = pg_dict_returned
248 |
249 |         except Exception:
250 | success_failure = 'Hallucination of wrong plan'
251 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
252 | pg_state_list.append(pg_dict)
253 | with open(Saving_path_result+'/pg_state' + '/pg_state'+str(index_query_times+2)+'.json', 'w') as f:
254 | json.dump(pg_dict, f)
255 |
256 | # Check whether the task has been completed
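257 |         # pg_dict holds the remaining boxes and targets, so the loop ends once every cell list is empty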
257 | count = 0
258 | for key, value in pg_dict.items():
259 | count += len(value)
260 | if count == 0:
261 | break
262 |
263 | if index_query_times < query_time_limit - 1:
264 | success_failure = 'success'
265 | else:
266 | success_failure = 'failure over query time limit'
267 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
268 |
269 |
270 | Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here
271 | Saving_path = Code_dir_path + 'Env1_BoxNet1'
272 | model_name = 'gpt-4-0613' #'gpt-4-0613', 'gpt-3.5-turbo-16k-0613'
273 | print(f'-------------------Model name: {model_name}-------------------')
274 | for pg_row_num, pg_column_num in [(2,2), (2,4), (4,4), (4,8)]:
275 | if pg_row_num == 4 and pg_column_num == 8:
276 | query_time_limit = 40
277 | else:
278 | query_time_limit = 30
279 | for iteration_num in range(10):
280 | print('-------###-------###-------###-------')
281 | print(f'Row num is: {pg_row_num}, Column num is: {pg_column_num}, Iteration num is: {iteration_num}\n\n')
282 |
283 | user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result = run_exp(Saving_path, pg_row_num, pg_column_num, iteration_num, query_time_limit, dialogue_history_method='_w_only_state_action_history',
284 | cen_decen_framework='HMAS-2', model_name = model_name)
285 | with open(Saving_path_result + '/token_num_count.txt', 'w') as f:
286 |             for token_num_count in token_num_count_list:
287 |                 f.write(str(token_num_count) + '\n')
288 |
289 | with open(Saving_path_result + '/success_failure.txt', 'w') as f:
290 | f.write(success_failure)
291 |
292 | with open(Saving_path_result + '/env_action_times.txt', 'w') as f:
293 | f.write(f'{index_query_times+1}')
294 | print(success_failure)
295 | print(f'Iteration number: {index_query_times+1}')
--------------------------------------------------------------------------------
/env2-box-arrange.py:
--------------------------------------------------------------------------------
1 | from LLM import *
2 | from prompt_env2 import *
3 | from env2_create import *
4 | from sre_constants import error
5 | import random
6 | import os
7 | import json
8 | import re
9 | import copy
10 | import numpy as np
11 | import shutil
12 | import time
13 |
14 | # cen_decen_framework = 'DMAS', 'HMAS-1', 'CMAS', 'HMAS-2'
15 | # dialogue_history_method = '_w_all_dialogue_history', '_wo_any_dialogue_history', '_w_only_state_action_history'
16 | def run_exp(Saving_path, pg_row_num, pg_column_num, iteration_num, query_time_limit, dialogue_history_method = '_w_all_dialogue_history', cen_decen_framework = 'CMAS'):
17 |
18 | Saving_path_result = Saving_path+f'/env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}'
19 |
20 | # specify the path to your dir for saving the results
21 | os.makedirs(Saving_path_result, exist_ok=True)
22 | os.makedirs(Saving_path_result+f'/prompt', exist_ok=True)
23 | os.makedirs(Saving_path_result+f'/response', exist_ok=True)
24 | os.makedirs(Saving_path_result+f'/pg_state', exist_ok=True)
25 | os.makedirs(Saving_path_result + f'/dialogue_history', exist_ok=True)
26 |
27 | with open(Saving_path+f'/env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/pg_state{iteration_num}.json', 'r') as file:
28 | pg_dict = json.load(file)
29 |
30 | user_prompt_list = [] # The record list of all the input prompts
31 | response_total_list = [] # The record list of all the responses
32 |     pg_state_list = [] # The record list of pg states in varied steps
33 | dialogue_history_list = []
34 | token_num_count_list = []
35 | pg_state_list.append(pg_dict)
36 | with open(Saving_path_result+'/pg_state' + '/pg_state'+str(1)+'.json', 'w') as f:
37 | json.dump(pg_dict, f)
38 |
39 | ### Start the Game! Query LLM for response
40 | print(f'query_time_limit: {query_time_limit}')
41 | for index_query_times in range(query_time_limit): # The upper limit of calling LLMs
42 | #print(index_query_times)
43 | #print(pg_dict)
44 | state_update_prompt = state_update_func(pg_row_num, pg_column_num, pg_dict)
45 |         if cen_decen_framework == 'DMAS':
46 | print('--------DMAS method starts--------')
47 | match = None
48 | count_round = 0
49 | dialogue_history = ''
50 | response = '{}'
51 | while not match and count_round <= 3:
52 | count_round += 1
53 | for local_agent_row_i in range(pg_row_num):
54 | for local_agent_column_j in range(pg_column_num):
55 | #if f'Agent[{local_agent_row_i + 0.5}, {local_agent_column_j + 0.5}]' in pg_dict:
56 | state_update_prompt_local_agent, state_update_prompt_other_agent = state_update_func_local_agent(pg_row_num,
57 | pg_column_num,
58 | local_agent_row_i,
59 | local_agent_column_j,
60 | pg_dict)
61 | user_prompt_1 = input_prompt_local_agent_DMAS_dialogue_func(state_update_prompt_local_agent,
62 | state_update_prompt_other_agent,
63 | dialogue_history, response_total_list,
64 | pg_state_list, dialogue_history_list,
65 | dialogue_history_method)
66 | user_prompt_list.append(user_prompt_1)
67 | #print(f'User prompt: {user_prompt_1}\n\n')
68 | with open(Saving_path_result + '/prompt' + '/user_prompt_' + str(index_query_times + 1), 'w') as f:
69 | f.write(user_prompt_list[-1])
70 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history')
71 | initial_response, token_num_count = GPT_response(messages,
72 | model_name='gpt-4-0613') # 'gpt-4' or 'gpt-3.5-turbo-0301' or 'gpt-4-32k' or 'gpt-3' or 'gpt-4-0613'
73 | token_num_count_list.append(token_num_count)
74 |
75 | dialogue_history += f'[Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]: {initial_response}]\n\n'
76 | #print(dialogue_history)
77 | if re.search(r'EXECUTE', initial_response):
78 | # Search for the pattern that starts with { and ends with }
79 | print('EXECUTE!')
80 | match = re.search(r'{.*}', initial_response, re.DOTALL)
81 | if match:
82 | response = match.group()
83 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response,
84 | [user_prompt_list[-1]],
85 | [],
86 | 'gpt-4-0613',
87 | '_w_all_dialogue_history')
88 | token_num_count_list = token_num_count_list + token_num_count_list_add
89 | print(f'response: {response}')
90 | #print(f'User prompt: {user_prompt_1}\n\n')
91 | break
92 | break
93 | dialogue_history_list.append(dialogue_history)
94 | else:
95 | if cen_decen_framework in ('CMAS', 'HMAS-1', 'HMAS-1-fast', 'HMAS-2'):
96 | user_prompt_1 = input_prompt_1_func_total(state_update_prompt, response_total_list,
97 | pg_state_list, dialogue_history_list,
98 | dialogue_history_method, cen_decen_framework)
99 | user_prompt_list.append(user_prompt_1)
100 | #print('user_prompt_1: ', user_prompt_1)
101 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history') # message construction
102 |
103 | with open(Saving_path_result+'/prompt' + '/user_prompt_'+str(index_query_times+1), 'w') as f:
104 | f.write(user_prompt_list[-1])
105 | initial_response, token_num_count = GPT_response(messages, model_name = 'gpt-4-0613') # 'gpt-4' or 'gpt-3.5-turbo-0301' or 'gpt-4-32k' or 'gpt-3' or 'gpt-4-0613'
106 | print('Initial response: ', initial_response)
107 | token_num_count_list.append(token_num_count)
108 | match = re.search(r'{.*}', initial_response, re.DOTALL)
109 | if match:
110 | response = match.group()
111 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response, [user_prompt_1], [], 'gpt-4-0613', '_w_all_dialogue_history')
112 | token_num_count_list = token_num_count_list + token_num_count_list_add
113 | print(f'response: {response}')
114 |
115 | if response == 'Out of tokens':
116 | success_failure = 'failure over token length limit'
117 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
118 | elif response == 'Syntactic Error':
119 | success_failure = 'Syntactic Error'
120 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
121 |
122 | # Local agent response for checking the feasibility of actions
123 | if cen_decen_framework == 'HMAS-2':
124 | print('--------HMAS-2 method starts--------')
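125 |                 # unlike env1's single feedback pass, this plan/feedback exchange repeats for up to three rounds until every agent agrees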
125 | break_mark = False; count_round_HMAS2 = 0
126 |
127 | while break_mark == False and count_round_HMAS2 < 3:
128 | count_round_HMAS2 += 1
129 | dialogue_history = f'Central Planner: {response}\n'
130 | prompt_list_dir = {}; response_list_dir = {}; local_agent_response_list_dir = {}
131 | local_agent_response_list_dir['feedback1'] = ''
132 | agent_dict = json.loads(response)
133 | for local_agent_row_i in range(pg_row_num):
134 | for local_agent_column_j in range(pg_column_num):
135 | if f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]' in agent_dict:
136 | prompt_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'] = []
137 | response_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'] = []
138 | state_update_prompt_local_agent, state_update_prompt_other_agent = state_update_func_local_agent(pg_row_num, pg_column_num, local_agent_row_i, local_agent_column_j, pg_dict)
139 |
140 | local_reprompt = input_prompt_local_agent_HMAS2_dialogue_func(state_update_prompt_local_agent, state_update_prompt_other_agent, response, response_total_list, pg_state_list, dialogue_history_list, dialogue_history_method)
141 | #print(local_reprompt)
142 | prompt_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'].append(local_reprompt)
143 | messages = message_construct_func(prompt_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'], response_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'], '_w_all_dialogue_history')
144 | response_local_agent, token_num_count = GPT_response(messages, model_name = 'gpt-4-0613')
145 | token_num_count_list.append(token_num_count)
146 | print(f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}] response: {response_local_agent}')
147 | if not ('I Agree' in response_local_agent or 'I agree' in response_local_agent):
148 | local_agent_response_list_dir['feedback1'] += f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]: {response_local_agent}\n' # collect the response from all the local agents
149 | dialogue_history += f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]: {response_local_agent}\n'
150 |
151 | if local_agent_response_list_dir['feedback1'] != '':
152 |                         local_agent_response_list_dir['feedback1'] += '\nThis is the feedback from local agents. If you find some errors in your previous plan, try to modify it. Otherwise, output the same plan as before. The output should have the same json format {Agent[0.5, 0.5]:move(box_blue, position[0.0, 1.0]), Agent[1.5, 0.5]:move...}, as above. Do not explain, just directly output the JSON dictionary. Your response:'
153 | messages = message_construct_func([user_prompt_list[-1], local_agent_response_list_dir['feedback1']], [response], '_w_all_dialogue_history') # message construction
154 | response_central_again, token_num_count = GPT_response(messages, model_name = 'gpt-4-0613')
155 | token_num_count_list.append(token_num_count)
156 | match = re.search(r'{.*}', response_central_again, re.DOTALL)
157 | if match:
158 | response = match.group()
159 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response_central_again, [user_prompt_list[-1], local_agent_response_list_dir['feedback1']], [response], 'gpt-4-0613',
160 | '_w_all_dialogue_history')
161 | token_num_count_list = token_num_count_list + token_num_count_list_add
162 | print(f'Modified plan response: {response}')
163 | else:
164 | break_mark = True
165 | pass
166 |
167 | dialogue_history_list.append(dialogue_history)
168 |
169 | elif cen_decen_framework == 'HMAS-1' or cen_decen_framework == 'HMAS-1-fast':
170 | print('--------HMAS-1 method starts--------')
171 | count_round = 0
172 | dialogue_history = f'Central Planner: {response}\n'
173 | match = None
174 | agent_dict = json.loads(response)
175 | while not match and count_round <= 3:
176 | count_round += 1
177 | for local_agent_row_i in range(pg_row_num):
178 | for local_agent_column_j in range(pg_column_num):
179 | if f'Agent[{local_agent_row_i + 0.5}, {local_agent_column_j + 0.5}]' in agent_dict:
180 | state_update_prompt_local_agent, state_update_prompt_other_agent = state_update_func_local_agent(
181 | pg_row_num,
182 | pg_column_num,
183 | local_agent_row_i,
184 | local_agent_column_j,
185 | pg_dict)
186 | if count_round >= 2 and cen_decen_framework == 'HMAS-1-fast':
187 | user_prompt_1 = input_prompt_local_agent_HMAS1_dialogue_fast_plan_func(state_update_prompt_local_agent,
188 | state_update_prompt_other_agent,
189 | dialogue_history, response_total_list, pg_state_list,
190 | dialogue_history_list, dialogue_history_method,
191 | initial_plan=response)
192 | else:
193 | user_prompt_1 = input_prompt_local_agent_HMAS1_dialogue_func(state_update_prompt_local_agent,
194 | state_update_prompt_other_agent, dialogue_history,
195 | response_total_list, pg_state_list,
196 | dialogue_history_list, dialogue_history_method,
197 | initial_plan='')
198 |
199 | user_prompt_list.append(user_prompt_1)
200 | with open(Saving_path_result + '/prompt' + '/user_prompt_' + str(index_query_times + 1), 'w') as f:
201 | f.write(user_prompt_list[-1])
202 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history')
203 | initial_response, token_num_count = GPT_response(messages,
204 | model_name='gpt-4-0613') # 'gpt-4' or 'gpt-3.5-turbo-0301' or 'gpt-4-32k' or 'gpt-3' or 'gpt-4-0613'
205 | token_num_count_list.append(token_num_count)
206 |
207 | #print('-----------prompt------------\n' + initial_response)
208 | dialogue_history += f'Agent[{local_agent_row_i + 0.5}, {local_agent_column_j + 0.5}]: {initial_response}\n'
209 | #print(dialogue_history)
210 | print(f'Agent[{local_agent_row_i + 0.5}, {local_agent_column_j + 0.5}]: {initial_response}\n')
211 | match = re.search(r'{.*}', initial_response, re.DOTALL)
212 | if match and re.search(r'EXECUTE', initial_response):
213 | response = match.group()
214 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response,
215 | [user_prompt_list[-1]],
216 | [],
217 | 'gpt-4-0613',
218 | '_w_all_dialogue_history')
219 | token_num_count_list = token_num_count_list + token_num_count_list_add
220 | print(f'response: {response}')
221 | break
222 | break
223 | dialogue_history_list.append(dialogue_history)
224 |
225 | response_total_list.append(response)
226 | if response == 'Out of tokens':
227 | success_failure = 'failure over token length limit'
228 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
229 | elif response == 'Syntactic Error':
230 | success_failure = 'Syntactic Error'
231 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
232 |
233 | data = json.loads(response)
234 |
235 | with open(Saving_path_result+'/response' + '/response'+str(index_query_times+1)+'.json', 'w') as f:
236 | json.dump(data, f)
237 | original_response_dict = json.loads(response_total_list[index_query_times])
238 | print(pg_dict)
239 | if cen_decen_framework in ('DMAS', 'HMAS-1', 'HMAS-1-fast'):
240 | with open(Saving_path_result+'/dialogue_history' + '/dialogue_history'+str(index_query_times)+'.txt', 'w') as f:
241 | f.write(dialogue_history_list[index_query_times])
242 | try:
243 | system_error_feedback, pg_dict_returned, collision_check = action_from_response(pg_dict, original_response_dict)
244 | if system_error_feedback != '':
245 | print(system_error_feedback)
246 | if collision_check:
247 | print('Collision!')
248 | success_failure = 'Collision'
249 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
250 | pg_dict = pg_dict_returned
251 |
252 |         except Exception:
253 | success_failure = 'Hallucination of wrong plan'
254 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
255 | pg_state_list.append(pg_dict)
256 | with open(Saving_path_result+'/pg_state' + '/pg_state'+str(index_query_times+2)+'.json', 'w') as f:
257 | json.dump(pg_dict, f)
258 |
259 | # Check whether the task has been completed
260 | count = 0
261 | for key, value in pg_dict.items():
262 | count += len(value)
263 | if count == 0:
264 | break
265 |
266 | if index_query_times < query_time_limit - 1:
267 | success_failure = 'success'
268 | else:
269 | success_failure = 'failure over query time limit'
270 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
271 |
272 | Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here
273 | Saving_path = Code_dir_path + 'Env2_BoxNet2'
274 | model_name = 'gpt-4-0613' #'gpt-4-0613', 'gpt-3.5-turbo-16k-0613'
275 | print(f'-------------------Model name: {model_name}-------------------')
276 |
277 | for pg_row_num, pg_column_num in [(2,2), (2,4), (4,4), (4,8)]:
278 | if pg_row_num == 4 and pg_column_num == 8:
279 | query_time_limit = 40
280 | else:
281 | query_time_limit = 30
282 | for iteration_num in range(10):
283 | print('-------###-------###-------###-------')
284 | print(f'Row num is: {pg_row_num}, Column num is: {pg_column_num}, Iteration num is: {iteration_num}\n\n')
285 |
286 | user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result = run_exp(Saving_path, pg_row_num, pg_column_num, iteration_num, query_time_limit, dialogue_history_method='_w_only_state_action_history',
287 | cen_decen_framework='HMAS-2')
288 | with open(Saving_path_result + '/token_num_count.txt', 'w') as f:
289 |             for token_num_count in token_num_count_list:
290 |                 f.write(str(token_num_count) + '\n')
291 |
292 | with open(Saving_path_result + '/success_failure.txt', 'w') as f:
293 | f.write(success_failure)
294 |
295 | with open(Saving_path_result + '/env_action_times.txt', 'w') as f:
296 | f.write(f'{index_query_times+1}')
297 | print(success_failure)
298 | print(f'Iteration number: {index_query_times+1}')
299 |
--------------------------------------------------------------------------------
/env4-box-arrange.py:
--------------------------------------------------------------------------------
1 | from LLM import *
2 | from prompt_env4 import *
3 | from env4_create import *
4 | from sre_constants import error
5 | import random
6 | import os
7 | import json
8 | import re
9 | import copy
10 | import numpy as np
11 | import shutil
12 | import time
13 |
14 | # cen_decen_framework = 'DMAS', 'HMAS-1', 'CMAS', 'HMAS-2'
15 | # dialogue_history_method = '_w_all_dialogue_history', '_wo_any_dialogue_history', '_w_only_state_action_history'
16 | def run_exp(Saving_path, track_row_num, column_num, box_occupy_ratio, agent_num, iteration_num, query_time_limit, dialogue_history_method = '_w_all_dialogue_history', cen_decen_framework = 'CMAS', model_name = 'gpt-3'):
17 |
18 | Saving_path_result = Saving_path+f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}_{model_name}'
19 |
20 | # specify the path to your dir for saving the results
21 | os.makedirs(Saving_path_result, exist_ok=True)
22 | os.makedirs(Saving_path_result+f'/prompt', exist_ok=True)
23 | os.makedirs(Saving_path_result+f'/response', exist_ok=True)
24 | os.makedirs(Saving_path_result+f'/pg_state', exist_ok=True)
25 | os.makedirs(Saving_path_result + f'/dialogue_history', exist_ok=True)
26 |
27 | with open(Saving_path+f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}/pg_state{iteration_num}/pg_state{iteration_num}.json', 'r') as file:
28 | pg_dict = json.load(file)
29 | with open(Saving_path + f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}/pg_state{iteration_num}/box_state{iteration_num}.json', 'r') as file:
30 | box_position_dict = json.load(file)
31 |
32 | user_prompt_list = [] # The record list of all the input prompts
33 | response_total_list = [] # The record list of all the responses
34 | pg_state_list = [] # The record list of pg states in varied steps
35 | box_state_list = [] # The record list of box states in varied steps
36 | dialogue_history_list = []
37 | token_num_count_list = []
38 | system_error_feedback_list = []
39 | pg_state_list.append(pg_dict)
40 | with open(Saving_path_result+'/pg_state' + '/pg_state'+str(1)+'.json', 'w') as f:
41 | json.dump(pg_dict, f)
42 | with open(Saving_path_result+'/pg_state' + '/box_state'+str(1)+'.json', 'w') as f:
43 | json.dump(box_position_dict, f)
44 |
45 | ### Start the Game! Query LLM for response
46 | print(f'query_time_limit: {query_time_limit}')
47 | for index_query_times in range(query_time_limit): # The upper limit of calling LLMs
48 | state_update_prompt = state_update_func(pg_dict, box_position_dict, track_row_num, column_num)
49 |         if cen_decen_framework == 'DMAS':
50 | print('--------DMAS method starts--------')
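51 |             # NOTE: ported from env1/env2; this branch still expects an enclosing loop defining local_agent_row_i, and the driver below only exercises HMAS-2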
51 | match = None
52 | count_round = 0
53 | dialogue_history = ''
54 | response = '{}'
55 | while not match and count_round <= 3:
56 | count_round += 1
57 | state_update_prompt_local_agent, state_update_prompt_other_agent = state_update_func_local_agent(
58 | local_agent_row_i,
59 | pg_dict)
60 | user_prompt_1 = input_prompt_local_agent_DMAS_dialogue_func(state_update_prompt_local_agent,
61 | state_update_prompt_other_agent,
62 | dialogue_history, response_total_list,
63 | pg_state_list, dialogue_history_list,
64 | dialogue_history_method)
65 | user_prompt_list.append(user_prompt_1)
66 | with open(Saving_path_result + '/prompt' + '/user_prompt_' + str(index_query_times + 1), 'w') as f:
67 | f.write(user_prompt_list[-1])
68 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history')
69 | initial_response, token_num_count = GPT_response(messages, model_name)
70 | token_num_count_list.append(token_num_count)
71 |
72 | dialogue_history += f'[Agent[{local_agent_row_i}]: {initial_response}]\n\n'
73 | #print(dialogue_history)
74 | if re.search(r'EXECUTE', initial_response):
75 | # Search for the pattern that starts with { and ends with }
76 | print('EXECUTE!')
77 | match = re.search(r'{.*}', initial_response, re.DOTALL)
78 | if match:
79 | response = match.group()
80 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response,
81 | [user_prompt_list[-1]],
82 | [],
83 | model_name,
84 | '_w_all_dialogue_history', track_row_num, column_num, box_position_dict)
85 | token_num_count_list = token_num_count_list + token_num_count_list_add
86 | print(f'response: {response}')
87 | #print(f'User prompt: {user_prompt_1}\n\n')
88 | break
89 | break
90 | dialogue_history_list.append(dialogue_history)
91 | else:
92 | if cen_decen_framework in ('CMAS', 'HMAS-1', 'HMAS-1-fast', 'HMAS-2'):
93 | user_prompt_1 = input_prompt_1_func_total(state_update_prompt, response_total_list, system_error_feedback_list,
94 | pg_state_list, dialogue_history_list,
95 | dialogue_history_method, cen_decen_framework, track_row_num, column_num)
96 |
97 | user_prompt_list.append(user_prompt_1)
98 | #print('user_prompt_1: ', user_prompt_1)
99 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history') # message construction
100 |
101 | with open(Saving_path_result+'/prompt' + '/user_prompt_'+str(index_query_times+1), 'w') as f:
102 | f.write(user_prompt_list[-1])
103 | initial_response, token_num_count = GPT_response(messages, model_name)
104 | print('Initial response: ', initial_response)
105 | token_num_count_list.append(token_num_count)
106 | match = re.search(r'{.*}', initial_response, re.DOTALL)
107 | if match:
108 | response = match.group()
109 | if response[0] == '{' and response[-1] == '}':
110 | if '{' in response[1:-1] and '}' in response[1:-1]:
111 | match = re.search(r'{.*}', response[:-1], re.DOTALL)
112 | if match:
113 | response = match.group()
114 | print(f'response: {response}')
115 | print('----------------Start syntactic check--------------')
116 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response, [user_prompt_1], [], model_name, '_w_all_dialogue_history', track_row_num, column_num, box_position_dict)
117 | token_num_count_list = token_num_count_list + token_num_count_list_add
118 | print(f'response: {response}')
119 | else:
120 | raise ValueError(f'Response format error: {response}')
121 |
122 | if response == 'Out of tokens':
123 | success_failure = 'failure over token length limit'
124 | return user_prompt_list, response_total_list, pg_state_list, box_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
125 | elif response == 'Syntactic Error':
126 | success_failure = 'Syntactic Error'
127 | return user_prompt_list, response_total_list, pg_state_list, box_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
128 |
129 | # Local agent response for checking the feasibility of actions
130 | if cen_decen_framework == 'HMAS-2':
131 | print('--------HMAS-2 method starts--------')
132 | break_mark = False; count_round_HMAS2 = 0
133 |
134 | while break_mark == False and count_round_HMAS2 < 3:
135 | count_round_HMAS2 += 1
136 | dialogue_history = f'Central Planner: {response}\n'
137 | prompt_list_dir = {}; response_list_dir = {}; local_agent_response_list_dir = {}
138 | local_agent_response_list_dir['feedback1'] = ''
139 |
140 | agent_dict = json.loads(response)
141 | for agent_name, agent_state in agent_dict.items():
142 |
143 | prompt_list_dir[agent_name] = []
144 | response_list_dir[agent_name] = []
145 |
146 | local_reprompt = input_prompt_local_agent_HMAS2_dialogue_func(state_update_prompt, response,
147 | response_total_list, pg_state_list,
148 | dialogue_history_list, system_error_feedback_list,
149 | dialogue_history_method, agent_name, track_row_num, column_num)
150 |
151 |
152 | # print(local_reprompt)
153 | prompt_list_dir[agent_name].append(local_reprompt)
154 | messages = message_construct_func(
155 | prompt_list_dir[agent_name],
156 | response_list_dir[agent_name],
157 | '_w_all_dialogue_history')
158 | response_local_agent, token_num_count = GPT_response(messages, model_name)
159 | token_num_count_list.append(token_num_count)
160 | print(f'{agent_name} response: {response_local_agent}')
161 | if not ('I Agree' in response_local_agent or 'I agree' in response_local_agent):
162 | local_agent_response_list_dir[
163 | 'feedback1'] += f'{agent_name}: {response_local_agent}\n' # collect the response from all the local agents
164 | dialogue_history += f'{agent_name}: {response_local_agent}\n'
165 |
166 | if local_agent_response_list_dir['feedback1'] != '':
167 |                         local_agent_response_list_dir['feedback1'] += '\nThis is the feedback from local agents. If you find some errors in your previous plan, try to modify it. Otherwise, output the same plan as before. The output should have the same json format {"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W]"}, as above. Do not explain, just directly output the JSON dictionary. Your response:'
168 | messages = message_construct_func([user_prompt_list[-1], local_agent_response_list_dir['feedback1']], [response], '_w_all_dialogue_history') # message construction
169 | response_central_again, token_num_count = GPT_response(messages, model_name)
170 | token_num_count_list.append(token_num_count)
171 | match = re.search(r'{.*}', response_central_again, re.DOTALL)
172 | if match:
173 | response = match.group()
174 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response_central_again, [user_prompt_list[-1], local_agent_response_list_dir['feedback1']], [response], model_name,
175 | '_w_all_dialogue_history', track_row_num, column_num, box_position_dict)
176 |                             token_num_count_list = token_num_count_list + token_num_count_list_add
177 | print(f'Modified plan response: {response}')
178 | else:
179 | break_mark = True
180 | pass
181 |
182 | dialogue_history_list.append(dialogue_history)
183 |
184 | elif cen_decen_framework == 'HMAS-1' or cen_decen_framework == 'HMAS-1-fast':
185 | print('--------HMAS-1 method starts--------')
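186 |                 # NOTE: this branch reuses env3's lifter-based flow; lifter_weight_list and state_update_prompt_local_agent are not defined in this file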
186 | count_round = 0
187 | dialogue_history = f'Central Planner: {response}\n'
188 | match = None
189 | while not match and count_round <= 3:
190 | count_round += 1
191 |
192 | agent_dict = json.loads(response)
193 | lift_weight_list_total = []
194 | for key, value in agent_dict.items():
195 | lift_weight_list_total += [float(num) for num in re.findall(r'(\d+\.\d+)', value)]
196 |
197 | for lift_weight_item in lifter_weight_list:
198 |
199 | if count_round >= 2 and cen_decen_framework == 'HMAS-1-fast':
200 | user_prompt_1 = input_prompt_local_agent_HMAS1_dialogue_fast_plan_func(state_update_prompt_local_agent,
201 | state_update_prompt_other_agent,
202 | dialogue_history, response_total_list, pg_state_list,
203 | dialogue_history_list, dialogue_history_method,
204 | initial_plan=response)
205 | else:
206 | user_prompt_1 = input_prompt_local_agent_HMAS2_dialogue_func(lift_weight_item, state_update_prompt, response,
207 | response_total_list, pg_state_list,
208 | dialogue_history_list,
209 | dialogue_history_method)
210 |
211 |
212 | user_prompt_list.append(user_prompt_1)
213 | with open(Saving_path_result + '/prompt' + '/user_prompt_' + str(index_query_times + 1), 'w') as f:
214 | f.write(user_prompt_list[-1])
215 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history')
216 | initial_response, token_num_count = GPT_response(messages,
217 | model_name)
218 | token_num_count_list.append(token_num_count)
219 |
220 | #print('-----------prompt------------\n' + initial_response)
221 | dialogue_history += f'agent[{lift_weight_item}W]: {initial_response}\n'
222 | #print(dialogue_history)
223 | match = re.search(r'{.*}', initial_response, re.DOTALL)
224 | if match and re.search(r'EXECUTE', initial_response):
225 | response = match.group()
226 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response,
227 | [user_prompt_list[-1]],
228 | [],
229 | model_name,
230 | '_w_all_dialogue_history', track_row_num, column_num, box_position_dict)
231 | token_num_count_list = token_num_count_list + token_num_count_list_add
232 | print(f'response: {response}')
233 | break
234 | break
235 | dialogue_history_list.append(dialogue_history)
236 |
237 | response_total_list.append(response)
238 | if response == 'Out of tokens':
239 | success_failure = 'failure over token length limit'
240 | return user_prompt_list, response_total_list, pg_state_list, box_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
241 | elif response == 'Syntactic Error':
242 | success_failure = 'Syntactic Error'
243 | return user_prompt_list, response_total_list, pg_state_list, box_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
244 |
245 | data = json.loads(response)
246 |
247 | with open(Saving_path_result+'/response' + '/response'+str(index_query_times+1)+'.json', 'w') as f:
248 | json.dump(data, f)
249 | original_response_dict = json.loads(response_total_list[index_query_times])
250 | print(pg_dict)
251 | print(box_position_dict)
252 | if cen_decen_framework in ('DMAS', 'HMAS-1', 'HMAS-1-fast'):
253 | with open(Saving_path_result+'/dialogue_history' + '/dialogue_history'+str(index_query_times)+'.txt', 'w') as f:
254 | f.write(dialogue_history_list[index_query_times])
255 | #try:
256 | system_error_feedback, pg_dict_returned, collision_check, box_position_dict_returned = action_from_response(pg_dict, original_response_dict, track_row_num, column_num, box_position_dict)
257 | system_error_feedback_list.append(system_error_feedback)
258 | if system_error_feedback != '':
259 | print('system_error_feedback: ', system_error_feedback)
260 | if collision_check:
261 | print('Collision!')
262 | success_failure = 'Collision'
263 | return user_prompt_list, response_total_list, pg_state_list, box_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
264 | pg_dict = pg_dict_returned
265 | box_position_dict = box_position_dict_returned
266 |
267 | #except:
268 | # print('Hallucination response: ', response)
269 | # success_failure = 'Hallucination of wrong plan'
270 | # return user_prompt_list, response_total_list, pg_state_list, box_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
271 | pg_state_list.append(pg_dict)
272 | box_state_list.append(box_position_dict)
273 |
274 | with open(Saving_path_result + '/pg_state' + '/pg_state' + str(index_query_times+2) + '.json', 'w') as f:
275 | json.dump(pg_dict, f)
276 | with open(Saving_path_result + '/pg_state' + '/box_state' + str(index_query_times+2) + '.json', 'w') as f:
277 | json.dump(box_position_dict, f)
278 |
279 | # Check whether the task has been completed
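280 |         # the run is complete once the recorded box states and the agents' latest state entries all sum to zero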
280 | box_current_state_list = [value for value in box_position_dict.values()]
281 | print(f'box_current_state_list: {box_current_state_list}')
282 | #print(f'pg_dict: {pg_dict}')
283 | agent_current_state_list = [value[-1] for value in pg_dict.values() if type(value) == list]
284 | print(f'agent_current_state_list: {agent_current_state_list}')
285 | if np.sum(box_current_state_list) + np.sum(agent_current_state_list) == 0:
286 | break
287 |
288 | if index_query_times < query_time_limit - 1:
289 | success_failure = 'success'
290 | else:
291 | success_failure = 'failure over query time limit'
292 | return user_prompt_list, response_total_list, pg_state_list, box_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
293 |
294 | Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here
295 | Saving_path = Code_dir_path + 'Env4_Warehouse'
296 | model_name = 'gpt-4-0613' #'gpt-4-0613', 'gpt-3.5-turbo-16k-0613'
297 | print(f'-------------------Model name: {model_name}-------------------')
298 |
299 | for track_row_num, column_num, box_occupy_ratio, agent_num in [(3, 5, 0.5, 4)]:
300 | if agent_num == 8:
301 | query_time_limit = 40
302 | else:
303 | query_time_limit = 30
304 | for iteration_num in range(10):
305 | print('-------###-------###-------###-------')
306 | print(f'Track_row_num is: {track_row_num}, Column_num: {column_num}, Agent_num: {agent_num}, Iteration num is: {iteration_num}\n\n')
307 |
308 | user_prompt_list, response_total_list, pg_state_list, box_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result = run_exp(Saving_path, track_row_num, column_num, box_occupy_ratio, agent_num, iteration_num, query_time_limit, dialogue_history_method='_w_only_state_action_history',
309 | cen_decen_framework='HMAS-2', model_name = model_name)
310 | with open(Saving_path_result + '/token_num_count.txt', 'w') as f:
311 |             for token_num_count in token_num_count_list:
312 |                 f.write(str(token_num_count) + '\n')
313 |
314 | with open(Saving_path_result + '/success_failure.txt', 'w') as f:
315 | f.write(success_failure)
316 |
317 | with open(Saving_path_result + '/env_action_times.txt', 'w') as f:
318 | f.write(f'{index_query_times+1}')
319 | print(success_failure)
320 | print(f'Iteration number: {index_query_times+1}')
--------------------------------------------------------------------------------
/env3-box-arrange.py:
--------------------------------------------------------------------------------
1 | from LLM import *
2 | from prompt_env3 import *
3 | from env3_create import *
4 | from sre_constants import error
5 | import random
6 | import os
7 | import json
8 | import re
9 | import copy
10 | import numpy as np
11 | import shutil
12 | import time
13 |
14 | # cen_decen_framework = 'DMAS', 'HMAS-1', 'CMAS', 'HMAS-2'
15 | # dialogue_history_method = '_w_all_dialogue_history', '_wo_any_dialogue_history', '_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_previous_round_history'
16 | def run_exp(Saving_path, pg_row_num, iteration_num, query_time_limit, dialogue_history_method = '_w_all_dialogue_history', cen_decen_framework = 'CMAS', model_name = 'gpt-3'):
17 |
18 | Saving_path_result = Saving_path+f'/env_pg_state_{pg_row_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}_{model_name}'
19 |
20 | # specify the path to your dir for saving the results
21 | os.makedirs(Saving_path_result, exist_ok=True)
22 | os.makedirs(Saving_path_result+f'/prompt', exist_ok=True)
23 | os.makedirs(Saving_path_result+f'/response', exist_ok=True)
24 | os.makedirs(Saving_path_result+f'/pg_state', exist_ok=True)
25 | os.makedirs(Saving_path_result + f'/dialogue_history', exist_ok=True)
26 |
27 | with open(Saving_path+f'/env_pg_state_{pg_row_num}/pg_state{iteration_num}/lifter_weight_list{iteration_num}.txt', 'r') as file:
28 | lifter_weight_list = [float(line.strip()) for line in file.readlines()]
29 | with open(Saving_path+f'/env_pg_state_{pg_row_num}/pg_state{iteration_num}/volume_list{iteration_num}.txt', 'r') as file:
30 | volume_list = [float(line.strip()) for line in file.readlines()]
31 | with open(Saving_path+f'/env_pg_state_{pg_row_num}/pg_state{iteration_num}/weight_list{iteration_num}.txt', 'r') as file:
32 | weight_list = [float(line.strip()) for line in file.readlines()]
33 |
34 | if len(volume_list) != len(weight_list):
35 |         raise ValueError('The lengths of volume_list and weight_list are not equal!')
36 | else:
37 | pg_dict = dict(zip(volume_list, weight_list))
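38 |         # pg_dict maps each box's volume to its weight; volumes are assumed unique, since duplicates would collapse under dict keys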
38 |
39 | user_prompt_list = [] # The record list of all the input prompts
40 | response_total_list = [] # The record list of all the responses
41 | pg_state_list = [] # The record list of pg states in varied steps
42 | env_act_feedback_list = [] # The record list of env act feedbacks
43 | dialogue_history_list = []
44 | left_box_list = []
45 | token_num_count_list = []
46 | pg_state_list.append(pg_dict)
47 | with open(Saving_path_result+'/pg_state' + '/pg_state'+str(1)+'.json', 'w') as f:
48 | json.dump(pg_dict, f)
49 |
50 | ### Start the Game! Query LLM for response
51 | print(f'query_time_limit: {query_time_limit}')
52 | for index_query_times in range(query_time_limit): # The upper limit of calling LLMs
53 | state_update_prompt, left_box = state_update_func(pg_dict, lifter_weight_list)
54 | left_box_list.append(left_box)
55 |         if cen_decen_framework == 'DMAS':
56 | print('--------DMAS method starts--------')
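57 |             # NOTE: ported from env1/env2; local_agent_row_i is never defined in this file, so this branch needs an enclosing per-agent loop to run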
57 | match = None
58 | count_round = 0
59 | dialogue_history = ''
60 | response = '{}'
61 | while not match and count_round <= 3:
62 | count_round += 1
63 | state_update_prompt_local_agent, state_update_prompt_other_agent = state_update_func_local_agent(
64 | local_agent_row_i,
65 | pg_dict)
66 | user_prompt_1 = input_prompt_local_agent_DMAS_dialogue_func(state_update_prompt_local_agent,
67 | state_update_prompt_other_agent,
68 | dialogue_history, response_total_list,
69 | pg_state_list, dialogue_history_list,
70 | dialogue_history_method)
71 | user_prompt_list.append(user_prompt_1)
72 | with open(Saving_path_result + '/prompt' + '/user_prompt_' + str(index_query_times + 1), 'w') as f:
73 | f.write(user_prompt_list[-1])
74 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history')
75 | initial_response, token_num_count = GPT_response(messages, model_name)
76 | token_num_count_list.append(token_num_count)
77 |
78 | dialogue_history += f'[Agent[{local_agent_row_i}]: {initial_response}]\n\n'
79 | #print(dialogue_history)
80 | if re.search(r'EXECUTE', initial_response):
81 | # Search for the pattern that starts with { and ends with }
82 | print('EXECUTE!')
83 | match = re.search(r'{.*}', initial_response, re.DOTALL)
84 | if match:
85 | response = match.group()
86 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response,
87 | [user_prompt_list[-1]],
88 | [],
89 | model_name,
90 | '_w_all_dialogue_history')
91 | token_num_count_list = token_num_count_list + token_num_count_list_add
92 | print(f'response: {response}')
93 | #print(f'User prompt: {user_prompt_1}\n\n')
94 | break
95 | break
96 | dialogue_history_list.append(dialogue_history)
97 | else:
98 | if cen_decen_framework in ('CMAS', 'HMAS-1', 'HMAS-1-fast', 'HMAS-2'):
99 | user_prompt_1 = input_prompt_1_func_total(state_update_prompt, response_total_list,
100 | left_box_list, dialogue_history_list, env_act_feedback_list,
101 | dialogue_history_method, cen_decen_framework)
102 | user_prompt_list.append(user_prompt_1)
103 | #print('user_prompt_1: ', user_prompt_1)
104 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history') # message construction
105 |
106 | with open(Saving_path_result+'/prompt' + '/user_prompt_'+str(index_query_times+1), 'w') as f:
107 | f.write(user_prompt_list[-1])
108 | initial_response, token_num_count = GPT_response(messages, model_name)
109 | print('Initial response: ', initial_response)
110 | token_num_count_list.append(token_num_count)
111 | match = re.search(r'{.*}', initial_response, re.DOTALL)
112 | if match:
113 | response = match.group()
114 | if response[0] == '{' and response[-1] == '}':
115 | if '{' in response[1:-1] and '}' in response[1:-1]:
116 | match = re.search(r'{.*}', response[:-1], re.DOTALL)
117 | if match:
118 | response = match.group()
119 | print(f'response: {response}')
120 | print('----------------Start syntactic check--------------')
121 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response, [user_prompt_1], [], model_name, '_w_all_dialogue_history')
122 | token_num_count_list = token_num_count_list + token_num_count_list_add
123 | print(f'response: {response}')
124 | else:
125 | raise ValueError(f'Response format error: {response}')
126 |
127 | if response == 'Out of tokens':
128 | success_failure = 'failure over token length limit'
129 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
130 | elif response == 'Syntactic Error':
131 | success_failure = 'Syntactic Error'
132 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
133 |
134 | # Local agent response for checking the feasibility of actions
135 | if cen_decen_framework == 'HMAS-2':
136 | print('--------HMAS-2 method starts--------')
137 | break_mark = False; count_round_HMAS2 = 0
138 |
139 | while break_mark == False and count_round_HMAS2 < 3:
140 | count_round_HMAS2 += 1
141 | dialogue_history = f'Central Planner: {response}\n'
142 | prompt_list_dir = {}; response_list_dir = {}; local_agent_response_list_dir = {}
143 | local_agent_response_list_dir['feedback1'] = ''
144 |
145 | agent_dict = json.loads(response)
146 | lift_weight_list_total = []
147 | for key, value in agent_dict.items():
148 | lift_weight_list_total += [float(num) for num in re.findall(r'(\d+\.\d+)', value)]
149 |
150 | for lift_weight_item in lifter_weight_list:
151 | if lift_weight_item in lift_weight_list_total:
152 | prompt_list_dir[f'Agent[{lift_weight_item}W]'] = []
153 | response_list_dir[f'Agent[{lift_weight_item}W]'] = []
154 |
155 | local_reprompt = input_prompt_local_agent_HMAS2_dialogue_func(lift_weight_item, state_update_prompt, response,
156 | response_total_list, pg_state_list,
157 | dialogue_history_list,
158 | env_act_feedback_list,
159 | dialogue_history_method)
160 |
161 | # print(local_reprompt)
162 | prompt_list_dir[f'Agent[{lift_weight_item}W]'].append(local_reprompt)
163 | messages = message_construct_func(
164 | prompt_list_dir[f'Agent[{lift_weight_item}W]'],
165 | response_list_dir[f'Agent[{lift_weight_item}W]'],
166 | '_w_all_dialogue_history')
167 | response_local_agent, token_num_count = GPT_response(messages, model_name)
168 | token_num_count_list.append(token_num_count)
169 | print(f'Agent[{lift_weight_item}W] response: {response_local_agent}')
170 | if not ('I Agree' in response_local_agent or 'I agree' in response_local_agent):
171 | local_agent_response_list_dir[
172 | 'feedback1'] += f'Agent[{lift_weight_item}W]: {response_local_agent}\n' # collect the response from all the local agents
173 | dialogue_history += f'Agent[{lift_weight_item}W]: {response_local_agent}\n'
174 |
175 | if local_agent_response_list_dir['feedback1'] != '':
176 | local_agent_response_list_dir['feedback1'] += '\nThis is the feedback from local agents. If you find some errors in your previous plan, try to modify it. Otherwise, output the same plan as before. The output should have the same JSON format {"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W]"}, as above. Do not explain, just directly output the JSON dictionary. Your response:'
177 | messages = message_construct_func([user_prompt_list[-1], local_agent_response_list_dir['feedback1']], [response], '_w_all_dialogue_history') # message construction
178 | response_central_again, token_num_count = GPT_response(messages, model_name)
179 | token_num_count_list.append(token_num_count)
180 | match = re.search(r'{.*}', response_central_again, re.DOTALL)
181 | if match:
182 | response_new = match.group()  # keep the previous plan in `response` until the new one passes the check
183 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response_new, [user_prompt_list[-1], local_agent_response_list_dir['feedback1']], [response], model_name,
184 | '_w_all_dialogue_history')
185 | token_num_count_list = token_num_count_list + token_num_count_list_add
186 | print(f'Modified plan response: {response}')
187 | else:
188 | break_mark = True
189 | pass
190 |
191 | dialogue_history_list.append(dialogue_history)
192 |
193 | elif cen_decen_framework == 'HMAS-1' or cen_decen_framework == 'HMAS-1-fast':
194 | print('--------HMAS-1 method starts--------')
195 | count_round = 0
196 | dialogue_history = f'Central Planner: {response}\n'
197 | match = None
198 | while not match and count_round <= 3:
199 | count_round += 1
200 |
201 | agent_dict = json.loads(response)
202 | lift_weight_list_total = []
203 | for key, value in agent_dict.items():
204 | lift_weight_list_total += [float(num) for num in re.findall(r'(\d+\.\d+)', value)]
205 |
206 | for lift_weight_item in lifter_weight_list:
207 |
208 | if count_round >= 2 and cen_decen_framework == 'HMAS-1-fast':
209 | user_prompt_1 = input_prompt_local_agent_HMAS1_dialogue_fast_plan_func(state_update_prompt_local_agent,
210 | state_update_prompt_other_agent,
211 | dialogue_history, response_total_list, pg_state_list,
212 | dialogue_history_list, dialogue_history_method,
213 | initial_plan=response)
214 | else:
215 | #user_prompt_1 = input_prompt_local_agent_HMAS1_dialogue_func(state_update_prompt_local_agent,
216 | # state_update_prompt_other_agent, dialogue_history,
217 | # response_total_list, pg_state_list,
218 | # dialogue_history_list, dialogue_history_method,
219 | # initial_plan='')
220 |
221 | user_prompt_1 = input_prompt_local_agent_HMAS2_dialogue_func(lift_weight_item, state_update_prompt, response,
222 | response_total_list, pg_state_list,
223 | dialogue_history_list,
224 | env_act_feedback_list,
225 | dialogue_history_method)
226 |
227 |
228 | user_prompt_list.append(user_prompt_1)
229 | with open(Saving_path_result + '/prompt' + '/user_prompt_' + str(index_query_times + 1), 'w') as f:
230 | f.write(user_prompt_list[-1])
231 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history')
232 | initial_response, token_num_count = GPT_response(messages,
233 | model_name)
234 | token_num_count_list.append(token_num_count)
235 |
236 | #print('-----------prompt------------\n' + initial_response)
237 | dialogue_history += f'agent[{lift_weight_item}W]: {initial_response}\n'
238 | #print(dialogue_history)
239 | match = re.search(r'{.*}', initial_response, re.DOTALL)
240 | if match and re.search(r'EXECUTE', initial_response):
241 | response = match.group()
242 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response,
243 | [user_prompt_list[-1]],
244 | [],
245 | model_name,
246 | '_w_all_dialogue_history')
247 | token_num_count_list = token_num_count_list + token_num_count_list_add
248 | print(f'response: {response}')
249 | break
250 | break
251 | dialogue_history_list.append(dialogue_history)
252 |
253 | response_total_list.append(response)
254 | if response == 'Out of tokens':
255 | success_failure = 'failure over token length limit'
256 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
257 | elif response == 'Syntactic Error':
258 | success_failure = 'Syntactic Error'
259 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
260 |
261 | data = json.loads(response)
262 |
263 | with open(Saving_path_result+'/response' + '/response'+str(index_query_times+1)+'.json', 'w') as f:
264 | json.dump(data, f)
265 | original_response_dict = json.loads(response_total_list[index_query_times])
266 | print(pg_dict)
267 | if cen_decen_framework in ('DMAS', 'HMAS-1', 'HMAS-1-fast'):
268 | with open(Saving_path_result+'/dialogue_history' + '/dialogue_history'+str(index_query_times)+'.txt', 'w') as f:
269 | f.write(dialogue_history_list[index_query_times])
270 | try:
271 | system_error_feedback, pg_dict_returned, env_act_feedback = action_from_response(pg_dict, original_response_dict, lifter_weight_list)
272 | env_act_feedback_list.append(env_act_feedback)
273 | if system_error_feedback != '':
274 | print('system_error_feedback: ', system_error_feedback)
275 | if env_act_feedback != '':
276 | print('env_act_feedback: ', env_act_feedback)
277 | pg_dict = pg_dict_returned
278 |
279 | except Exception:  # malformed or hallucinated plan (e.g., nonexistent boxes/agents)
280 | print('Hallucination response: ', response)
281 | success_failure = 'Hallucination of wrong plan'
282 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
283 | pg_state_list.append(pg_dict)
284 |
285 | with open(Saving_path_result + '/pg_state' + '/pg_state' + str(index_query_times+2) + '.json', 'w') as f:
286 | json.dump(pg_dict, f)
287 |
288 | # Check whether the task has been completed
289 | if len(pg_dict) == 0:
290 | break
291 |
292 | if index_query_times < query_time_limit - 1:
293 | success_failure = 'success'
294 | else:
295 | success_failure = 'failure over query time limit'
296 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result
297 |
298 |
299 | Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here
300 | Saving_path = Code_dir_path + 'Env3_BoxLift'
301 | model_name = 'gpt-4-0613' #'gpt-4-0613', 'gpt-3.5-turbo-16k-0613'
302 | print(f'-------------------Model name: {model_name}-------------------')
303 |
304 | for pg_row_num in [4,6,8,10]:
305 | if pg_row_num == 8:
306 | query_time_limit = 25
307 | else:
308 | query_time_limit = 20
309 | for iteration_num in range(10):
310 | print('-------###-------###-------###-------')
311 | print(f'Row num is: {pg_row_num}, Iteration num is: {iteration_num}\n\n')
312 |
313 | user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result = run_exp(Saving_path, pg_row_num, iteration_num, query_time_limit, dialogue_history_method='_w_only_state_action_history',
314 | cen_decen_framework='HMAS-2', model_name = model_name)
315 | with open(Saving_path_result + '/token_num_count.txt', 'w') as f:
316 | for token_num_count in token_num_count_list:
317 | f.write(str(token_num_count) + '\n')
318 |
319 | with open(Saving_path_result + '/success_failure.txt', 'w') as f:
320 | f.write(success_failure)
321 |
322 | with open(Saving_path_result + '/env_action_times.txt', 'w') as f:
323 | f.write(f'{index_query_times+1}')
324 | print(success_failure)
325 | print(f'Iteration number: {index_query_times+1}')
326 |
--------------------------------------------------------------------------------
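The BoxLift runner above exchanges plans as JSON dictionaries that map each box to the agents assigned to lift it, e.g. {"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W]"}, under the rule that each lifting agent may be used only once per step. The sketch below is a minimal, self-contained illustration of checking that shape; `check_plan_syntax` and `agent_capacities` are hypothetical names, the repo's actual validation lives in `with_action_syntactic_check_func` and `action_from_response` (defined elsewhere in the codebase), and the sketch assumes agent capacities are unique, since the prompts use the capacity value as the agent's identifier.

```
import json
import re

def check_plan_syntax(plan_json, agent_capacities):
    """Return a list of problems with a BoxLift plan; empty means it parses cleanly.

    Hypothetical helper for illustration only -- not the repo's own checker.
    Assumes agent capacities are unique, so a capacity value identifies an agent.
    """
    problems = []
    try:
        plan = json.loads(plan_json)
    except json.JSONDecodeError as e:
        return [f'not valid JSON: {e}']
    used = []
    for box, agents in plan.items():
        # Agents appear inside the value string as e.g. "agent[1.5W], agent[2.5W]".
        weights = [float(w) for w in re.findall(r'agent\[(\d+\.\d+)W\]', agents)]
        if not weights:
            problems.append(f'{box}: no agents assigned')
        for w in weights:
            if w not in agent_capacities:
                problems.append(f'{box}: unknown agent[{w}W]')
        used.extend(weights)
    # "Each lifting agent can be used only once in each step!" (see extra_prompt).
    for w in sorted(set(used)):
        if used.count(w) > 1:
            problems.append(f'agent[{w}W] assigned to more than one box')
    return problems

plan = '{"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W]"}'
print(check_plan_syntax(plan, [1.5, 2.5]))  # -> ['agent[1.5W] assigned to more than one box']
```

Running the example flags the double assignment of agent[1.5W], which is exactly the kind of objection the HMAS-2 local agents raise before the central planner re-plans.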
/prompt_env3.py:
--------------------------------------------------------------------------------
1 | from LLM import *
2 | import tiktoken
3 | enc = tiktoken.get_encoding("cl100k_base")
4 | assert enc.decode(enc.encode("hello world")) == "hello world"
5 | enc = tiktoken.encoding_for_model("gpt-4")
6 | input_prompt_token_limit = 4000
7 |
8 | extra_prompt = 'Each lifting agent can be used only once in each step! You can combine multiple agents to lift one box like "box[3.0V]":"agent[1.5W], agent[2.5W]"! Try combining multiple agents to lift one box together once you find it cannot be lifted.'
9 |
10 | def LLM_summarize_func(state_action_prompt_next_initial):
11 | prompt1 = f"Please summarize the following content as concise as possible: \n{state_action_prompt_next_initial}"
12 | messages = [{"role": "system", "content": "You are a helpful assistant."},
13 | {"role": "user", "content": prompt1}]
14 | response, _ = GPT_response(messages, model_name='gpt-4')  # GPT_response returns (text, token_count)
15 | return response
16 |
17 |
18 | def input_prompt_1_func(state_update_prompt):
19 | user_prompt_1 = f'''
20 | You are a central planner directing lifting agents in a warehouse to lift boxes. Each agent has a different lifting capability, and agents can cooperate to lift a single box. Combined, their lifting capabilities are sufficient to lift every box.
21 |
22 | The boxes are identified by their volume, e.g., box[1.4V]. The agents are identified by their lifting weight capability, e.g., agent[1.5W]. Actions are like: "box[1.7V]":"agent[2.5W]", "box[6.0V]":"agent[1.5W], agent[2.5W]".
23 |
24 | Your task is to divide the agents into groups to lift all the boxes. After each step, the environment provides updates on the remaining boxes. Your job is to coordinate the agents optimally to minimize the number of steps.
25 |
26 | Note that the agents can only lift one box at a time. {extra_prompt} [The volume of the box is roughly proportional to the weight of the box, but with some randomness. Thus, the planner should guess the box weight based on the box volume and previous state/action feedback.]
27 |
28 | The current left boxes and agents are:
29 | {state_update_prompt}
30 |
31 | Specify your action plan in this format: {{"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W]"}}. Include a box only if it has lifting agents to lift it next. Now, plan the next step:
32 | '''
33 | return user_prompt_1
34 |
35 |
36 | def input_prompt_1_func_total(state_update_prompt, response_total_list,
37 | pg_state_list, dialogue_history_list, env_act_feedback_list,
38 | dialogue_history_method, cen_decen_framework):
39 | if len(pg_state_list) - len(response_total_list) != 1:
40 | raise ValueError('state and response list do not match')
41 | if len(pg_state_list) - len(env_act_feedback_list) != 1:
42 | raise ValueError('state and env act feedback list do not match')
43 | if len(pg_state_list) - len(dialogue_history_list) != 1 and cen_decen_framework != 'CMAS':
44 | raise ValueError('state and dialogue history list do not match')
45 |
46 | user_prompt_1 = f'''
47 | You are a central planner directing lifting agents in a warehouse to lift boxes. Each agent has a different lifting capability, and agents can cooperate to lift a single box. Combined, their lifting capabilities are sufficient to lift every box.
48 |
49 | The boxes are identified by their volume, e.g., box[1.4V]. The agents are identified by their lifting weight capability, e.g., agent[1.5W]. Actions are like: "box[1.7V]":"agent[2.5W]", "box[6.0V]":"agent[1.5W], agent[2.5W]".
50 |
51 | Your task is to divide the agents into groups to lift all the boxes. After each step, the environment provides updates on the remaining boxes. Your job is to coordinate the agents optimally to minimize the number of steps.
52 |
53 | The previous state and action pairs at each step are:
54 |
55 | Note that the agents can only lift one box at a time. {extra_prompt} [The volume of the box is roughly proportional to the weight of the box, but with some randomness. Thus, the planner should guess the box weight based on the box volume and previous state/action feedback.]
56 |
57 | The current left boxes and agents are:
58 | {state_update_prompt}
59 |
60 | Specify your action plan in this format: {{"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W], agent[5.5W]"}}. Include a box only if it has lifting agents to lift it next. Now, plan the next step:
61 | '''
62 | token_num_count = len(enc.encode(user_prompt_1))
63 | state_action_prompt = ''  # stays empty when no history is injected below
64 | if dialogue_history_method == '_wo_any_dialogue_history' and cen_decen_framework == 'CMAS':
65 | pass
66 | elif dialogue_history_method in (
67 | '_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_all_dialogue_history'):
68 | if dialogue_history_method == '_w_only_state_action_history':
69 | #print('fdsfdsafadsas')
70 | state_action_prompt = ''
71 | for i in range(len(response_total_list) - 1, -1, -1):
72 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\nEnvironment Feedback{i + 1}: {env_act_feedback_list[i]}\n\n' + state_action_prompt
73 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
74 | state_action_prompt = state_action_prompt_next
75 | else:
76 | break
77 | elif dialogue_history_method == '_w_compressed_dialogue_history' and cen_decen_framework != 'CMAS':
78 | state_action_prompt = ''
79 | for i in range(len(response_total_list) - 1, -1, -1):
80 | dialogue_summary = LLM_summarize_func(dialogue_history_list[i])
81 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nSummary of Dialogues in each step{i + 1}: {dialogue_summary}\nAction{i + 1}: {response_total_list[i]}\nEnvironment Feedback{i + 1}: {env_act_feedback_list[i]}\n\n' + state_action_prompt
82 | #state_action_prompt_next = LLM_summarize_func(state_action_prompt_next_initial)
83 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
84 | state_action_prompt = state_action_prompt_next
85 | else:
86 | break
87 | elif dialogue_history_method == '_w_all_dialogue_history' and cen_decen_framework != 'CMAS':
88 | state_action_prompt = ''
89 | for i in range(len(response_total_list) - 1, -1, -1):
90 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nDialogue{i + 1}: {dialogue_history_list[i]}\nAction{i + 1}: {response_total_list[i]}\nEnvironment Feedback{i + 1}: {env_act_feedback_list[i]}\n\n' + state_action_prompt
91 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
92 | state_action_prompt = state_action_prompt_next
93 | else:
94 | break
95 |
96 | user_prompt_1 = f'''
97 | You are a central planner directing lifting agents in a warehouse to lift boxes. Each agent has a different lifting capability, and agents can cooperate to lift a single box. Combined, their lifting capabilities are sufficient to lift every box.
98 |
99 | The boxes are identified by their volume, e.g., box[1.4V]. The agents are identified by their lifting weight capability, e.g., agent[1.5W]. Actions are like: "box[1.7V]":"agent[2.5W]", "box[6.0V]":"agent[1.5W], agent[2.5W]".
100 |
101 | Your task is to divide the agents into groups to lift all the boxes. After each step, the environment provides updates on the remaining boxes. Your job is to coordinate the agents optimally to minimize the number of steps.
102 |
103 | The previous state and action pairs at each step are:
104 | {state_action_prompt}
105 |
106 | Note that the agents can only lift one box at a time. {extra_prompt} [The volume of the box is roughly proportional to the weight of the box, but with some randomness. Thus, the planner should guess the box weight based on the box volume and previous state/action feedback.]
107 |
108 | The current left boxes and agents are:
109 | {state_update_prompt}
110 |
111 | Specify your action plan in this format: {{"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W]"}}. Include a box only if it has lifting agents to lift it next. Now, plan the next step:
112 | '''
113 | #print(f'state_action_prompt: {state_action_prompt}')
114 | return user_prompt_1
115 |
116 | def input_prompt_local_agent_DMAS_dialogue_func(lift_weight_item, state_update_prompt, central_response, dialogue_history, response_total_list,
117 | pg_state_list, dialogue_history_list,
118 | dialogue_history_method):
119 | if len(pg_state_list) - len(response_total_list) != 1:
120 | raise ValueError('state and response list do not match')
121 | if len(pg_state_list) - len(dialogue_history_list) != 1:
122 | raise ValueError('state and dialogue history list do not match')
123 |
124 | user_prompt_1 = f'''
125 | You are a box-lifting agent in a warehouse. Each agent has a different lifting capability, and agents can cooperate to lift a single box. Combined, their lifting capabilities are sufficient to lift every box.
126 |
127 | The boxes are identified by their volume, e.g., box[1.4V]. The agents are identified by their lifting weight capability, e.g., agent[1.5W]. Actions are like: "box[1.7V]":"agent[2.5W]", "box[6.0V]":"agent[1.5W], agent[2.5W]".
128 |
129 | The task of the central planner is to divide the agents into groups to lift all the boxes. After each step, the environment provides updates on the remaining boxes. The goal of the group is to coordinate the agents optimally to minimize the number of steps.
130 |
131 | The current state of yourself is: Agent[{lift_weight_item}W] has a lifting capacity of {lift_weight_item}W.
132 |
133 | Note that the agents can only lift one box at a time. {extra_prompt} [The volume of the box is roughly proportional to the weight of the box, but with some randomness. Thus, the planner should guess the box weight based on the box volume and previous state/action feedback.]
134 |
135 | The current left boxes and agents are:
136 | {state_update_prompt}
137 |
138 | [Action Output Instruction]
139 | Must first output 'EXECUTE', then on the new line specify your action plan in this format: {{"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W]"}}.
140 | Include an agent only if it has a task next.
141 | Example#1:
142 | EXECUTE
143 | {{"box[2.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W], agent[2.0W]"}}
144 |
145 | Example#2:
146 | EXECUTE
147 | {{"box[2.7V]":"agent[4.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W], agent[2.0W]"}}
148 |
149 | The previous state and action pairs at each step are:
150 |
151 |
152 | Please learn from previous steps. Do not simply repeat the actions; understand why the state changed or remained stuck in a dead loop. Avoid being stuck in action loops.
153 |
154 | The current state is {pg_state_list[-1]}
155 | The central planner\'s current action plan is:
156 |
157 | End your response by either: 1) output PROCEED, if the plans require further discussion; 2) If everyone has made proposals and got approved, output the final plan as soon as possible, must strictly follow [Action Output Instruction]!
158 | Your response:
159 | '''
160 | token_num_count = len(enc.encode(user_prompt_1))
161 | state_action_prompt = ''  # stays empty when no history is injected below
162 | if dialogue_history_method == '_wo_any_dialogue_history':
163 | pass
164 | elif dialogue_history_method in ('_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_all_dialogue_history'):
165 | if dialogue_history_method == '_w_only_state_action_history':
166 | state_action_prompt = ''
167 | for i in range(len(response_total_list) - 1, -1, -1):
168 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt
169 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
170 | state_action_prompt = state_action_prompt_next
171 | else:
172 | break
173 | elif dialogue_history_method == '_w_compressed_dialogue_history':
174 | state_action_prompt = ''
175 | for i in range(len(response_total_list) - 1, -1, -1):
176 | dialogue_summary = LLM_summarize_func(dialogue_history_list[i])
177 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nSummary of Dialogues in each step{i + 1}: {dialogue_summary}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt
178 | #state_action_prompt_next = LLM_summarize_func(state_action_prompt_next_initial)
179 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
180 | state_action_prompt = state_action_prompt_next
181 | else:
182 | break
183 | elif dialogue_history_method == '_w_all_dialogue_history':
184 | state_action_prompt = ''
185 | for i in range(len(response_total_list) - 1, -1, -1):
186 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nDialogue{i + 1}: {dialogue_history_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt
187 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
188 | state_action_prompt = state_action_prompt_next
189 | else:
190 | break
191 |
192 | user_prompt_1 = f'''
193 | You are a box-lifting agent in a warehouse. Each agent has a different lifting capability, and agents can cooperate to lift a single box. Combined, their lifting capabilities are sufficient to lift every box.
194 |
195 | The boxes are identified by their volume, e.g., box[1.4V]. The agents are identified by their lifting weight capability, e.g., agent[1.5W]. Actions are like: "box[1.7V]":"agent[2.5W]", "box[6.0V]":"agent[1.5W], agent[2.5W]".
196 |
197 | The task of the central planner is to divide the agents into groups to lift all the boxes. After each step, the environment provides updates on the remaining boxes. The goal of the group is to coordinate the agents optimally to minimize the number of steps.
198 |
199 | The current state of yourself is: Agent[{lift_weight_item}W] has a lifting capacity of {lift_weight_item}W.
200 |
201 | Note that the agents can only lift one box at a time. {extra_prompt} [The volume of the box is roughly proportional to the weight of the box, but with some randomness. Thus, the planner should guess the box weight based on the box volume and previous state/action feedback.]
202 |
203 | The current left boxes and agents are:
204 | {state_update_prompt}
205 |
206 | [Action Output Instruction]
207 | Must first output 'EXECUTE', then on the new line specify your action plan in this format: {{"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W]"}}.
208 | Include an agent only if it has a task next.
209 | Example#1:
210 | EXECUTE
211 | {{"box[2.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W], agent[2.0W]"}}
212 |
213 | Example#2:
214 | EXECUTE
215 | {{"box[2.7V]":"agent[4.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W], agent[2.0W]"}}
216 |
217 | The previous state and action pairs at each step are:
218 | {state_action_prompt}
219 |
220 | Please learn from previous steps. Do not simply repeat the actions; understand why the state changed or remained stuck in a dead loop. Avoid being stuck in action loops.
221 |
222 | The current state is {pg_state_list[-1]}
223 | The central planner\'s current action plan is: {{{central_response}}}.
224 |
225 | End your response by either: 1) output PROCEED, if the plans require further discussion; 2) If everyone has made proposals and got approved, output the final plan as soon as possible, must strictly follow [Action Output Instruction]!
226 | Your response:
227 | '''
228 | return user_prompt_1
229 |
230 |
231 | def input_prompt_local_agent_HMAS1_dialogue_func(lift_weight_item, state_update_prompt, central_response, response_total_list, pg_state_list, dialogue_history_list, env_act_feedback_list, dialogue_history_method):
232 | if len(pg_state_list) - len(response_total_list) != 1:
233 | raise ValueError('state and response list do not match')
234 | if len(pg_state_list) - len(env_act_feedback_list) != 1:
235 | raise ValueError('state and env act feedback list do not match')
236 | if len(pg_state_list) - len(dialogue_history_list) != 1:
237 | raise ValueError('state and dialogue history list do not match')
238 |
239 | user_prompt_1 = f'''
240 | You are a box-lifting agent in a warehouse. Each agent has a different lifting capability, and agents can cooperate to lift a single box. Combined, their lifting capabilities are sufficient to lift every box.
241 |
242 | The boxes are identified by their volume, e.g., box[1.4V]. The agents are identified by their lifting weight capability, e.g., agent[1.5W]. Actions are like: "box[1.7V]":"agent[2.5W]", "box[6.0V]":"agent[1.5W], agent[2.5W]".
243 |
244 | The task of the central planner is to divide the agents into groups to lift all the boxes. After each step, the environment provides updates on the remaining boxes. The goal of the group is to coordinate the agents optimally to minimize the number of steps.
245 |
246 | The current state of yourself is: Agent[{lift_weight_item}W] has a lifting capacity of {lift_weight_item}W.
247 |
248 | Note that the agents can only lift one box at a time. {extra_prompt} [The volume of the box is roughly proportional to the weight of the box, but with some randomness. Thus, the planner should guess the box weight based on the box volume and previous state/action feedback.]
249 |
250 | The current left boxes and agents are:
251 | {state_update_prompt}
252 |
253 | [Action Output Instruction]
254 | Must first output 'EXECUTE', then on the new line specify your action plan in this format: {{"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W]"}}.
255 | Include an agent only if it has a task next.
256 | Example#1:
257 | EXECUTE
258 | {{"box[2.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W], agent[2.0W]"}}
259 |
260 | Example#2:
261 | EXECUTE
262 | {{"box[2.7V]":"agent[4.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W], agent[2.0W]"}}
263 |
264 | The previous state and action pairs at each step are:
265 |
266 | Please learn from previous steps. Do not simply repeat the actions; understand why the state changed or remained stuck in a dead loop. Avoid being stuck in action loops.
267 |
268 | The current state is {pg_state_list[-1]}
269 | The central planner\'s current action plan is:
270 |
271 | End your response by either: 1) output PROCEED, if the plans require further discussion; 2) If everyone has made proposals and got approved, output the final plan as soon as possible, must strictly follow [Action Output Instruction]!
272 | Your response:
273 | '''
274 |
275 | token_num_count = len(enc.encode(user_prompt_1))
276 | state_action_prompt = ''  # stays empty when no history is injected below
277 | if dialogue_history_method == '_wo_any_dialogue_history':
278 | pass
279 | elif dialogue_history_method in (
280 | '_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_all_dialogue_history'):
281 | if dialogue_history_method == '_w_only_state_action_history':
282 | #print('fdsfdsafadsas')
283 | state_action_prompt = ''
284 | for i in range(len(response_total_list) - 1, -1, -1):
285 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\nEnvironment Feedback{i + 1}: {env_act_feedback_list[i]}\n\n' + state_action_prompt
286 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
287 | state_action_prompt = state_action_prompt_next
288 | else:
289 | break
290 | elif dialogue_history_method == '_w_compressed_dialogue_history':
291 | state_action_prompt = ''
292 | for i in range(len(response_total_list) - 1, -1, -1):
293 | dialogue_summary = LLM_summarize_func(dialogue_history_list[i])
294 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nSummary of Dialogues in each step{i + 1}: {dialogue_summary}\nAction{i + 1}: {response_total_list[i]}\nEnvironment Feedback{i + 1}: {env_act_feedback_list[i]}\n\n' + state_action_prompt
295 | #state_action_prompt_next = LLM_summarize_func(state_action_prompt_next_initial)
296 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
297 | state_action_prompt = state_action_prompt_next
298 | else:
299 | break
300 | elif dialogue_history_method == '_w_all_dialogue_history':
301 | state_action_prompt = ''
302 | for i in range(len(response_total_list) - 1, -1, -1):
303 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nDialogue{i + 1}: {dialogue_history_list[i]}\nAction{i + 1}: {response_total_list[i]}\nEnvironment Feedback{i + 1}: {env_act_feedback_list[i]}\n\n' + state_action_prompt
304 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
305 | state_action_prompt = state_action_prompt_next
306 | else:
307 | break
308 |
309 | user_prompt_1 = f'''
310 | You are a box-lifting agent in a warehouse. Each agent has a different lifting capability, and agents can cooperate to lift a single box. Combined, their lifting capabilities are sufficient to lift every box.
311 |
312 | The boxes are identified by their volume, e.g., box[1.4V]. The agents are identified by their lifting weight capability, e.g., agent[1.5W]. Actions are like: "box[1.7V]":"agent[2.5W]", "box[6.0V]":"agent[1.5W], agent[2.5W]".
313 |
314 | The task of the central planner is to divide the agents into groups to lift all the boxes. After each step, the environment provides updates on the remaining boxes. The goal of the group is to coordinate the agents optimally to minimize the number of steps.
315 |
316 | The current state of yourself is: Agent[{lift_weight_item}W] has a lifting capacity of {lift_weight_item}W.
317 |
318 | Note that the agents can only lift one box at a time. {extra_prompt} [The volume of the box is roughly proportional to the weight of the box, but with some randomness. Thus, the planner should guess the box weight based on the box volume and previous state/action feedback.]
319 |
320 | The current left boxes and agents are:
321 | {state_update_prompt}
322 |
323 | [Action Output Instruction]
324 | Must first output 'EXECUTE', then on the new line specify your action plan in this format: {{"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W]"}}.
325 | Include an agent only if it has a task next.
326 | Example#1:
327 | EXECUTE
328 | {{"box[2.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W], agent[2.0W]"}}
329 |
330 | Example#2:
331 | EXECUTE
332 | {{"box[2.7V]":"agent[4.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W], agent[2.0W]"}}
333 |
334 | The previous state and action pairs at each step are:
335 | {state_action_prompt}
336 |
337 | Please learn from previous steps. Do not simply repeat the actions; understand why the state changed or remained stuck in a dead loop. Avoid being stuck in action loops.
338 |
339 | The current state is {pg_state_list[-1]}
340 | The central planner\'s current action plan is: {{{central_response}}}.
341 |
342 | End your response by either: 1) output PROCEED, if the plans require further discussion; 2) If everyone has made proposals and got approved, output the final plan as soon as possible, must strictly follow [Action Output Instruction]!
343 | Your response:
344 | '''
345 | return user_prompt_1
346 |
347 | def input_prompt_local_agent_HMAS2_dialogue_func(lift_weight_item, state_update_prompt, central_response, response_total_list, pg_state_list, dialogue_history_list, env_act_feedback_list, dialogue_history_method):
348 | if len(pg_state_list) - len(response_total_list) != 1:
349 | raise ValueError('state and response list do not match')
350 | if len(pg_state_list) - len(env_act_feedback_list) != 1:
351 | raise ValueError('state and env act feedback list do not match')
352 | if len(pg_state_list) - len(dialogue_history_list) != 1:
353 | raise ValueError('state and dialogue history list do not match')
354 |
355 | user_prompt_1 = f'''
356 | You are a box-lifting agent in a warehouse. Each agent has a different lifting capability, and agents can cooperate to lift a single box. Combined, their lifting capabilities are sufficient to lift every box.
357 |
358 | The boxes are identified by their volume, e.g., box[1.4V]. The agents are identified by their lifting weight capability, e.g., agent[1.5W]. Actions are like: "box[1.7V]":"agent[2.5W]", "box[6.0V]":"agent[1.5W], agent[2.5W]".
359 |
360 | The task of the central planner is to divide the agents into groups to lift all the boxes. After each step, the environment provides updates on the remaining boxes. The goal of the group is to coordinate the agents optimally to minimize the number of steps.
361 |
362 | The current state of yourself is: Agent[{lift_weight_item}W] has a lifting capacity of {lift_weight_item}W.
363 |
364 | Note that the agents can only lift one box at a time. {extra_prompt} [The volume of the box is roughly proportional to the weight of the box, but with some randomness. Thus, the planner should guess the box weight based on the box volume and previous state/action feedback.]
365 |
366 | The current left boxes and agents are:
367 | {state_update_prompt}
368 |
369 | The previous state and action pairs at each step are:
370 |
371 | Please learn from previous steps. Do not simply repeat the actions; understand why the state changed or remained stuck in a dead loop. Avoid being stuck in action loops.
372 |
373 | The current state is {pg_state_list[-1]}
374 | The central planner\'s current action plan is: {{{central_response}}}.
375 |
376 | If you agree with it, respond 'I Agree', without any extra words. If not, briefly explain your objections to the central planner. Your response:
377 | '''
378 |
379 | token_num_count = len(enc.encode(user_prompt_1))
380 | state_action_prompt = ''  # stays empty when no history is injected below
381 | if dialogue_history_method == '_wo_any_dialogue_history':
382 | pass
383 | elif dialogue_history_method in (
384 | '_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_all_dialogue_history'):
385 | if dialogue_history_method == '_w_only_state_action_history':
386 | #print('fdsfdsafadsas')
387 | state_action_prompt = ''
388 | for i in range(len(response_total_list) - 1, -1, -1):
389 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\nEnvironment Feedback{i + 1}: {env_act_feedback_list[i]}\n\n' + state_action_prompt
390 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
391 | state_action_prompt = state_action_prompt_next
392 | else:
393 | break
394 | elif dialogue_history_method == '_w_compressed_dialogue_history':
395 | state_action_prompt = ''
396 | for i in range(len(response_total_list) - 1, -1, -1):
397 | dialogue_summary = LLM_summarize_func(dialogue_history_list[i])
398 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nSummary of Dialogues in each step{i + 1}: {dialogue_summary}\nAction{i + 1}: {response_total_list[i]}\nEnvironment Feedback{i + 1}: {env_act_feedback_list[i]}\n\n' + state_action_prompt
399 | #state_action_prompt_next = LLM_summarize_func(state_action_prompt_next_initial)
400 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
401 | state_action_prompt = state_action_prompt_next
402 | else:
403 | break
404 | elif dialogue_history_method == '_w_all_dialogue_history':
405 | state_action_prompt = ''
406 | for i in range(len(response_total_list) - 1, -1, -1):
407 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nDialogue{i + 1}: {dialogue_history_list[i]}\nAction{i + 1}: {response_total_list[i]}\nEnvironment Feedback{i + 1}: {env_act_feedback_list[i]}\n\n' + state_action_prompt
408 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
409 | state_action_prompt = state_action_prompt_next
410 | else:
411 | break
412 |
413 | user_prompt_1 = f'''
414 | You are a box-lifting agent in a warehouse. Each agent has a different lifting capability, and agents can cooperate to lift a single box. Combined, their lifting capabilities are sufficient to lift every box.
415 |
416 | The boxes are identified by their volume, e.g., box[1.4V]. The agents are identified by their lifting weight capability, e.g., agent[1.5W]. Actions are like: "box[1.7V]":"agent[2.5W]", "box[6.0V]":"agent[1.5W], agent[2.5W]".
417 |
418 | The task of the central planner is to divide the agents into groups to lift all the boxes. After each step, the environment provides updates on the remaining boxes. The goal of the group is to coordinate the agents optimally to minimize the number of steps.
419 |
420 | The current state of yourself is: Agent[{lift_weight_item}W] has a lifting capacity of {lift_weight_item}W.
421 |
422 | Note that the agents can only lift one box at a time. {extra_prompt} [The volume of the box is roughly proportional to the weight of the box, but with some randomness. Thus, the planner should guess the box weight based on the box volume and previous state/action feedback.]
423 |
424 | The current left boxes and agents are:
425 | {state_update_prompt}
426 |
427 | The previous state and action pairs at each step are:
428 | {state_action_prompt}
429 |
430 | Please learn from previous steps. Do not simply repeat the actions; understand why the state changed or remained stuck in a dead loop. Avoid being stuck in action loops.
431 |
432 | The current state is {pg_state_list[-1]}
433 | The central planner\'s current action plan is: {{{central_response}}}.
434 |
435 | If you agree with it, respond 'I Agree', without any extra words. If not, briefly explain your objections to the central planner. Your response:
436 | '''
437 | return user_prompt_1
438 |
439 | def message_construct_func(user_prompt_list, response_total_list, dialogue_history_method):
440 | if dialogue_history_method == '_w_all_dialogue_history':
441 | messages=[{"role": "system", "content": "You are a helpful assistant."}]
442 | #print('length of user_prompt_list', len(user_prompt_list))
443 | for i in range(len(user_prompt_list)):
444 | messages.append({"role": "user", "content": user_prompt_list[i]})
445 | if i < len(user_prompt_list)-1:
446 | messages.append({"role": "assistant", "content": response_total_list[i]})
447 | #print('Length of messages', len(messages))
448 | else:  # '_wo_any_dialogue_history', '_w_only_state_action_history', '_w_compressed_dialogue_history'
449 | messages=[{"role": "system", "content": "You are a helpful assistant."}]
450 | messages.append({"role": "user", "content": user_prompt_list[-1]})
451 | #print('Length of messages', len(messages))
452 | return messages
453 |
--------------------------------------------------------------------------------
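Every input_prompt_* function in prompt_env3.py (and in prompt_env1.py below) assembles its history section the same way: iterate over past steps from newest to oldest and keep prepending entries while the tiktoken count of the prompt stays under input_prompt_token_limit. A standalone sketch of that pattern (`truncate_history` is a hypothetical name, not part of the repo):

```
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")

def truncate_history(entries, budget, prefix_tokens=0):
    """Keep the longest most-recent suffix of `entries` that fits the token budget.

    Hypothetical helper mirroring the backwards loops in the input_prompt_*
    functions; entries are per-step strings such as 'State3: {...} Action3: {...}'.
    """
    kept = ''
    for entry in reversed(entries):
        candidate = entry + kept  # prepend the older entry, keeping chronological order
        if prefix_tokens + len(enc.encode(candidate)) < budget:
            kept = candidate
        else:
            break
    return kept

steps = [f'State{i}: ... Action{i}: ...\n' for i in range(1, 6)]
print(truncate_history(steps, budget=50))  # drops the oldest steps first
```

Because the loop stops at the first entry that no longer fits, the prompt always carries a contiguous, most-recent window of the history rather than an arbitrary subset of steps.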
/prompt_env1.py:
--------------------------------------------------------------------------------
1 | from LLM import *
2 | import tiktoken
3 | enc = tiktoken.get_encoding("cl100k_base")
4 | assert enc.decode(enc.encode("hello world")) == "hello world"
5 | enc = tiktoken.encoding_for_model("gpt-4")
6 | input_prompt_token_limit = 3000
7 |
8 | def LLM_summarize_func(state_action_prompt_next_initial, model_name='gpt-4'):  # callers below pass only the text
9 | prompt1 = f"Please summarize the following content as concisely as possible: \n{state_action_prompt_next_initial}"
10 | messages = [{"role": "system", "content": "You are a helpful assistant."},
11 | {"role": "user", "content": prompt1}]
12 | response, _ = GPT_response(messages, model_name)  # GPT_response returns (text, token_count)
13 | return response
14 |
15 |
16 | def input_prompt_1_func(state_update_prompt):
17 | user_prompt_1 = f'''
18 | You are a central planner directing agents in a grid-like field to move colored boxes. Each agent is assigned to a 1x1 square and can only interact with objects in its area. Agents can move a box to a neighboring square or a same-color target. Each square can contain many targets and boxes.
19 |
20 | The squares are identified by their center coordinates, e.g., square[0.5, 0.5]. Actions are like: move(box_red, target_red) or move(box_red, square[0.5, 0.5]).
21 |
22 | Your task is to instruct each agent to match all boxes to their color-coded targets. After each move, agents provide updates for the next sequence of actions. Your job is to coordinate the agents optimally.
23 |
24 | {state_update_prompt}
25 |
26 | Specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move...}}. Include an agent only if it has a task next. Now, plan the next step:
27 | '''
28 | return user_prompt_1
29 |
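The planner prompt above fixes the BoxNet action grammar: a JSON dictionary mapping each agent to one move(...) action whose destination is either a same-color target or a neighboring square. A minimal sketch of parsing that grammar, with `parse_move_plan` as a hypothetical name (the runners' real handling, including the re.search(r'{.*}') extraction, lives in env1-box-arrange.py):

```
import json
import re

def parse_move_plan(response):
    """Extract {agent: (item, destination)} from a planner response such as
    {"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])"}.
    Hypothetical helper for illustration only.
    """
    match = re.search(r'{.*}', response, re.DOTALL)  # same extraction the runners use
    if not match:
        raise ValueError(f'no JSON object found in: {response!r}')
    plan = json.loads(match.group())
    parsed = {}
    for agent, action in plan.items():
        m = re.match(r'move\((box_\w+),\s*(target_\w+|square\[[^\]]+\])\)$', action)
        if not m:
            raise ValueError(f'{agent}: unrecognized action {action!r}')
        parsed[agent] = (m.group(1), m.group(2))
    return parsed

print(parse_move_plan('EXECUTE\n{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])"}'))
# -> {'Agent[0.5, 0.5]': ('box_blue', 'square[0.5, 1.5]')}
```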
30 |
31 | def input_prompt_1_only_state_action_func(state_update_prompt, response_total_list, pg_state_list):
32 | user_prompt_1 = f'''
33 | You are a central planner directing agents in a grid-like field to move colored boxes. Each agent is assigned to a 1x1 square and can only interact with objects in its area. Agents can move a box to a neighboring square or a same-color target. Each square can contain many targets and boxes.
34 |
35 | The squares are identified by their center coordinates, e.g., square[0.5, 0.5]. Actions are like: move(box_red, target_red) or move(box_red, square[0.5, 0.5]).
36 |
37 | Your task is to instruct each agent to match all boxes to their color-coded targets. After each move, agents provide updates for the next sequence of actions. Your job is to coordinate the agents optimally.
38 |
39 | The previous state and action pairs at each step are:
40 |
41 | Please learn from previous steps. Do not simply repeat the actions; understand why the state changed or remained stuck in a dead loop. Avoid being stuck in action loops.
42 |
43 | Hence, the current state is {pg_state_list[-1]}, with the possible actions:
44 | {state_update_prompt}
45 |
46 | Specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move...}}. Include an agent only if it has a task next. Now, plan the next step:
47 | '''
48 | token_num_count = len(enc.encode(user_prompt_1))
49 |
50 | if len(pg_state_list) - len(response_total_list) != 1:
51 | raise ValueError('state and response list do not match')
52 | state_action_prompt = ''
53 | for i in range(len(response_total_list) - 1, -1, -1):
54 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt
55 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
56 | state_action_prompt = state_action_prompt_next
57 | else:
58 | break
59 |
60 | user_prompt_1 = f'''
61 | You are a central planner directing agents in a grid-like field to move colored boxes. Each agent is assigned to a 1x1 square and can only interact with objects in its area. Agents can move a box to a neighboring square or a same-color target. Each square can contain many targets and boxes.
62 |
63 | The squares are identified by their center coordinates, e.g., square[0.5, 0.5]. Actions are like: move(box_red, target_red) or move(box_red, square[0.5, 0.5]).
64 |
65 | Your task is to instruct each agent to match all boxes to their color-coded targets. After each move, agents provide updates for the next sequence of actions. Your job is to coordinate the agents optimally.
66 |
67 | The previous state and action pairs at each step are:
68 | {state_action_prompt}
69 | Please learn from previous steps. Do not simply repeat the actions; understand why the state changed or remained stuck in a dead loop. Avoid being stuck in action loops.
70 |
71 | Hence, the current state is {pg_state_list[-1]}, with the possible actions:
72 | {state_update_prompt}
73 |
74 | Specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move...}}. Include an agent only if it has a task next. Now, plan the next step:
75 | '''
76 | return user_prompt_1
77 |
78 |
79 | def input_prompt_1_func_total(state_update_prompt, response_total_list,
80 | pg_state_list, dialogue_history_list,
81 | dialogue_history_method, cen_decen_framework):
82 | if len(pg_state_list) - len(response_total_list) != 1:
83 | raise ValueError('state and response list do not match')
84 | if len(pg_state_list) - len(dialogue_history_list) != 1 and cen_decen_framework != 'CMAS':
85 | raise ValueError('state and dialogue history list do not match')
86 |
87 | user_prompt_1 = f'''
88 | You are a central planner directing agents in a grid-like field to move colored boxes. Each agent is assigned to a 1x1 square and can only interact with objects in its area. Agents can move a box to a neighboring square or a same-color target. Each square can contain many targets and boxes.
89 |
90 | The squares are identified by their center coordinates, e.g., square[0.5, 0.5]. Actions are like: move(box_red, target_red) or move(box_red, square[0.5, 0.5]).
91 |
92 | Your task is to instruct each agent to match all boxes to their color-coded targets. After each move, agents provide updates for the next sequence of actions. Your job is to coordinate the agents optimally.
93 |
94 | The previous state and action pairs at each step are:
95 |
96 | Please learn from previous steps. Do not simply repeat the actions; understand why the state changed or remained stuck in a dead loop. Avoid being stuck in action loops.
97 |
98 | Hence, the current state is {pg_state_list[-1]}, with the possible actions:
99 | {state_update_prompt}
100 |
101 | Specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move...}}. Include an agent only if it has a task next. Now, plan the next step:
102 | '''
103 | token_num_count = len(enc.encode(user_prompt_1))
104 | state_action_prompt = ''  # stays empty when no history is injected below
105 | if dialogue_history_method == '_wo_any_dialogue_history' or cen_decen_framework == 'CMAS':
106 | pass
107 | elif dialogue_history_method in (
108 | '_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_all_dialogue_history'):
109 | if dialogue_history_method == '_w_only_state_action_history' and cen_decen_framework != 'CMAS':
110 | state_action_prompt = ''
111 | for i in range(len(response_total_list) - 1, -1, -1):
112 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt
113 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
114 | state_action_prompt = state_action_prompt_next
115 | else:
116 | break
117 | elif dialogue_history_method == '_w_compressed_dialogue_history' and cen_decen_framework != 'CMAS':
118 | state_action_prompt = ''
119 | for i in range(len(response_total_list) - 1, -1, -1):
120 | dialogue_summary = LLM_summarize_func(dialogue_history_list[i])
121 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nSummary of Dialogues in each step{i + 1}: {dialogue_summary}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt
122 | #state_action_prompt_next = LLM_summarize_func(state_action_prompt_next_initial)
123 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
124 | state_action_prompt = state_action_prompt_next
125 | else:
126 | break
127 | elif dialogue_history_method == '_w_all_dialogue_history' and cen_decen_framework != 'CMAS':
128 | state_action_prompt = ''
129 | for i in range(len(response_total_list) - 1, -1, -1):
130 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nDialogue{i + 1}: {dialogue_history_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt
131 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
132 | state_action_prompt = state_action_prompt_next
133 | else:
134 | break
135 |
136 | user_prompt_1 = f'''
137 | You are a central planner directing agents in a grid-like field to move colored boxes. Each agent is assigned to a 1x1 square and can only interact with objects in its area. Agents can move a box to a neighboring square or a same-color target. Each square can contain many targets and boxes.
138 |
139 | The squares are identified by their center coordinates, e.g., square[0.5, 0.5]. Actions are like: move(box_red, target_red) or move(box_red, square[0.5, 0.5]).
140 |
141 | Your task is to instruct each agent to match all boxes to their color-coded targets. After each move, agents provide updates for the next sequence of actions. Your job is to coordinate the agents optimally.
142 |
143 | The previous state and action pairs at each step are:
144 | {state_action_prompt}
145 | Please learn from previous steps. Do not simply repeat the actions; understand why the state changed or remained stuck in a dead loop. Avoid being stuck in action loops.
146 |
147 | Hence, the current state is {pg_state_list[-1]}, with the possible actions:
148 | {state_update_prompt}
149 |
150 | Specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move...}}. Include an agent only if it has a task next. Now, plan the next step:
151 | '''
152 |
153 | return user_prompt_1
154 |
155 | def input_prompt_local_agent_DMAS_dialogue_func(state_update_prompt_local_agent, state_update_prompt_other_agent, dialogue_history, response_total_list,
156 | pg_state_list, dialogue_history_list,
157 | dialogue_history_method):
158 | if len(pg_state_list) - len(response_total_list) != 1:
159 | raise ValueError('state and response list do not match')
160 | if len(pg_state_list) - len(dialogue_history_list) != 1:
161 | raise ValueError('state and dialogue history list do not match')
162 |
163 | user_prompt_1 = f'''
164 | You\'re a box-moving agent in a multi-agent system, stationed on a 1x1 square in a grid playground. You can only interact with objects in your square. Squares are denoted by their center coordinates (e.g., square[0.5, 0.5]), and actions involve moving boxes to targets or nearby squares, represented by colors (e.g., move(box_red, target_red)). Each square can contain many targets and boxes.
165 | All the agents coordinate with each other to come up with a plan and achieve the goal: match each box with its color-coded target.
166 | The current state and possible actions of yourself are: {{{state_update_prompt_local_agent}}}.
167 | The current states and possible actions of all other agents are: {{{state_update_prompt_other_agent}}}.
168 | The previous state and action pairs at each step are:
169 |
170 | Please learn from previous steps. Do not simply repeat the actions; understand why the state changed or remained stuck in a dead loop. Avoid being stuck in action loops.
171 |
172 |
173 | [Action Output Instruction]
174 | Must first output 'EXECUTE', then on the new line specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move..."}}.
175 | Include an agent only if it has a task next.
176 | Example#1:
177 | EXECUTE
178 | {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move(box_green, square[0.5, 0.5])"}}
179 |
180 | Example#2:
181 | EXECUTE
182 | {{"Agent[0.5, 0.5]":"move(box_blue, target_blue)", "Agent[2.5, 1.5]":"move(box_red, square[1.5, 1.5])"}}
183 |
184 | The previous dialogue history is: {{{dialogue_history}}}
185 | Think step-by-step about the task and the previous dialogue history. Carefully check them and correct any mistakes.
186 | Respond very concisely but informatively, and do not repeat what others have said. Discuss with others to come up with the best plan.
187 | Propose exactly one action for yourself at the **current** round.
188 | End your response by either: 1) output PROCEED, if the plans require further discussion; 2) If everyone has made proposals and got approved, output the final plan as soon as possible, must strictly follow [Action Output Instruction]!
189 | Your response:
190 | '''
191 | token_num_count = len(enc.encode(user_prompt_1))
192 | state_action_prompt = ''  # stays empty when no history is injected below
193 | if dialogue_history_method == '_wo_any_dialogue_history':
194 | pass
195 | elif dialogue_history_method in ('_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_all_dialogue_history'):
196 | if dialogue_history_method == '_w_only_state_action_history':
197 | state_action_prompt = ''
198 | for i in range(len(response_total_list) - 1, -1, -1):
199 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt
200 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
201 | state_action_prompt = state_action_prompt_next
202 | else:
203 | break
204 | elif dialogue_history_method == '_w_compressed_dialogue_history':
205 | state_action_prompt = ''
206 | for i in range(len(response_total_list) - 1, -1, -1):
207 | dialogue_summary = LLM_summarize_func(dialogue_history_list[i])
208 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nSummary of Dialogues in each step{i + 1}: {dialogue_summary}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt
209 | #state_action_prompt_next = LLM_summarize_func(state_action_prompt_next_initial)
210 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
211 | state_action_prompt = state_action_prompt_next
212 | else:
213 | break
214 | elif dialogue_history_method == '_w_all_dialogue_history':
215 | state_action_prompt = ''
216 | for i in range(len(response_total_list) - 1, -1, -1):
217 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nDialogue{i + 1}: {dialogue_history_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt
218 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
219 | state_action_prompt = state_action_prompt_next
220 | else:
221 | break
222 |
223 | user_prompt_1 = f'''
224 | You\'re a box-moving agent in a multi-agent system, stationed on a 1x1 square in a grid playground. You can only interact with objects in your square. Squares are denoted by their center coordinates (e.g., square[0.5, 0.5]), and actions involve moving boxes to targets or nearby squares, represented by colors (e.g., move(box_red, target_red)). Each square can contain many targets and boxes.
225 | All the agents coordinate with each other to come up with a plan and achieve the goal: match each box with its color-coded target.
226 | The current state and possible actions of yourself are: {{{state_update_prompt_local_agent}}}.
227 | The current states and possible actions of all other agents are: {{{state_update_prompt_other_agent}}}.
228 | The previous state and action pairs at each step are:
229 | {state_action_prompt}
230 | Please learn from previous steps. Do not simply repeat the actions; understand why the state changed or remained stuck in a dead loop. Avoid being stuck in action loops.
231 |
232 |
233 | [Action Output Instruction]
234 | Must first output 'EXECUTE', then on the new line specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move..."}}.
235 | Include an agent only if it has a task next.
236 | Example#1:
237 | EXECUTE
238 | {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move(box_green, square[0.5, 0.5])"}}
239 |
240 | Example#2:
241 | EXECUTE
242 | {{"Agent[0.5, 0.5]":"move(box_blue, target_blue)", "Agent[2.5, 1.5]":"move(box_red, square[1.5, 1.5])"}}
243 |
244 | The previous dialogue history is: {{{dialogue_history}}}
245 | Think step-by-step about the task and the previous dialogue history. Carefully check them and correct any mistakes.
246 | Respond very concisely but informatively, and do not repeat what others have said. Discuss with others to come up with the best plan.
247 | Propose exactly one action for yourself at the **current** round.
248 | End your response by either: 1) output PROCEED, if the plans require further discussion; 2) If everyone has made proposals and got approved, output the final plan, must strictly follow [Action Output Instruction]!
249 | Your response:
250 | '''
251 |
252 | return user_prompt_1
253 |
254 |
255 | def input_prompt_local_agent_HMAS1_dialogue_fast_plan_func(state_update_prompt_local_agent, state_update_prompt_other_agent,
256 | dialogue_history, response_total_list, pg_state_list, dialogue_history_list,
257 | dialogue_history_method, initial_plan=''):
258 | if len(pg_state_list) - len(response_total_list) != 1:
259 | raise error('state and response list do not match')
260 | if len(pg_state_list) - len(dialogue_history_list) != 1:
261 | raise ValueError('state and dialogue history list do not match')
262 |
263 | user_prompt_1 = f'''
264 | You\'re a box-moving agent in a multi-agent system, stationed on a 1x1 square in a grid playground. You can only interact with objects in your square. Squares are denoted by their center coordinates (e.g., square[0.5, 0.5]), and actions involve moving boxes to targets or nearby squares, represented by colors (e.g., move(box_red, target_red)). Each square can contain many targets and boxes.
265 | One extra planner first proposes an initial plan to coordinate all agents to achieve the goal: match each box with its color-coded target.
266 | Then all the action agents discuss and coordinate with each other to come up with a final plan.
267 | The current state and possible actions of yourself are: {{{state_update_prompt_local_agent}}}.
268 | The current states and possible actions of all other agents are: {{{state_update_prompt_other_agent}}}.
269 | The previous state and action pairs at each step are:
270 |
271 | Please learn from previous steps. Do not simply repeat earlier actions; understand why the state changes or remains stuck in a dead loop. Avoid getting stuck in action loops.
272 |
273 | [Action Output Instruction]
274 | First output 'EXECUTE', then on a new line specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move..."}}.
275 | Include an agent only if it has a task next.
276 | Example#1:
277 | EXECUTE
278 | {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move(box_green, square[0.5, 0.5])"}}
279 |
280 | Example#2:
281 | EXECUTE
282 | {{"Agent[0.5, 0.5]":"move(box_blue, target_blue)", "Agent[2.5, 1.5]":"move(box_red, square[1.5, 1.5])"}}
283 |
284 | The initial plan is: {{{initial_plan}}}
285 | The previous dialogue history is: {{{dialogue_history}}}
286 | Think step-by-step about the task, initial plan, and the previous dialogue history. Carefully check and correct them if they made a mistake.
287 | End your response by outputting the final plan; it must strictly follow the [Action Output Instruction]!
288 | Your response:
289 | '''
290 |
291 | token_num_count = len(enc.encode(user_prompt_1))
292 |
293 | if dialogue_history_method == '_wo_any_dialogue_history':
294 | pass
295 | elif dialogue_history_method in (
296 | '_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_all_dialogue_history'):
297 | if dialogue_history_method == '_w_only_state_action_history':
298 | state_action_prompt = ''
299 | for i in range(len(response_total_list) - 1, -1, -1):
300 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt
301 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
302 | state_action_prompt = state_action_prompt_next
303 | else:
304 | break
305 | elif dialogue_history_method == '_w_compressed_dialogue_history':
306 | state_action_prompt = ''
307 | for i in range(len(response_total_list) - 1, -1, -1):
308 | dialogue_summary = LLM_summarize_func(dialogue_history_list[i])
309 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nSummary of dialogues at step {i + 1}: {dialogue_summary}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt
310 | # state_action_prompt_next = LLM_summarize_func(state_action_prompt_next_initial)
311 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
312 | state_action_prompt = state_action_prompt_next
313 | else:
314 | break
315 | elif dialogue_history_method == '_w_all_dialogue_history':
316 | state_action_prompt = ''
317 | for i in range(len(response_total_list) - 1, -1, -1):
318 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nDialogue{i + 1}: {dialogue_history_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt
319 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
320 | state_action_prompt = state_action_prompt_next
321 | else:
322 | break
323 |
324 | user_prompt_1 = f'''
325 | You\'re a box-moving agent in a multi-agent system, stationed on a 1x1 square in a grid playground. You can only interact with objects in your square. Squares are denoted by their center coordinates (e.g., square[0.5, 0.5]), and actions involve moving boxes to targets or nearby squares, represented by colors (e.g., move(box_red, target_red)). Each square can contain many targets and boxes.
326 | One extra planner first proposes an initial plan to coordinate all agents to achieve the goal: match each box with its color-coded target.
327 | Then all the action agents discuss and coordinate with each other to come up with a final plan.
328 | The current state and possible actions of yourself are: {{{state_update_prompt_local_agent}}}.
329 | The current states and possible actions of all other agents are: {{{state_update_prompt_other_agent}}}.
330 | The previous state and action pairs at each step are:
331 | {state_action_prompt}
332 | Please learn from previous steps. Do not simply repeat earlier actions; understand why the state changes or remains stuck in a dead loop. Avoid getting stuck in action loops.
333 |
334 | [Action Output Instruction]
335 | First output 'EXECUTE', then on a new line specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move..."}}.
336 | Include an agent only if it has a task next.
337 | Example#1:
338 | EXECUTE
339 | {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move(box_green, square[0.5, 0.5])"}}
340 |
341 | Example#2:
342 | EXECUTE
343 | {{"Agent[0.5, 0.5]":"move(box_blue, target_blue)", "Agent[2.5, 1.5]":"move(box_red, square[1.5, 1.5])"}}
344 |
345 | The initial plan is: {{{initial_plan}}}
346 | The previous dialogue history is: {{{dialogue_history}}}
347 | Think step-by-step about the task, initial plan, and the previous dialogue history. Carefully check and correct them if they made a mistake.
348 | End your response by outputting the final plan; it must strictly follow the [Action Output Instruction]!
349 | Your response:
350 | '''
351 | return user_prompt_1
352 |
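# Editor's note: hypothetical invocation of the fast-plan builder above (all
# data made up). The guard clauses require exactly one more state than there
# are recorded responses/dialogues:
#
#   prompt = input_prompt_local_agent_HMAS1_dialogue_fast_plan_func(
#       state_update_prompt_local_agent='I can move box_blue to square[0.5, 1.5]',
#       state_update_prompt_other_agent='Agent[1.5, 0.5]: can move box_green to square[0.5, 0.5]',
#       dialogue_history='',
#       response_total_list=[],                # no steps taken yet, so ...
#       pg_state_list=['<initial state>'],     # ... exactly one state entry
#       dialogue_history_list=[],
#       dialogue_history_method='_w_only_state_action_history',
#       initial_plan='EXECUTE\n{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])"}')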
353 |
354 | def input_prompt_local_agent_HMAS1_dialogue_func(state_update_prompt_local_agent, state_update_prompt_other_agent, dialogue_history, response_total_list, pg_state_list, dialogue_history_list, dialogue_history_method, initial_plan = ''):
355 | if len(pg_state_list) - len(response_total_list) != 1:
356 | raise ValueError('state and response list do not match')
357 | if len(pg_state_list) - len(dialogue_history_list) != 1:
358 | raise ValueError('state and dialogue history list do not match')
359 |
360 | user_prompt_1 = f'''
361 | You\'re a box-moving agent in a multi-agent system, stationed on a 1x1 square in a grid playground. You can only interact with objects in your square. Squares are denoted by their center coordinates (e.g., square[0.5, 0.5]), and actions involve moving boxes to targets or nearby squares, represented by colors (e.g., move(box_red, target_red)). Each square can contain many targets and boxes.
362 | One extra planner first proposes an initial plan to coordinate all agents to achieve the goal: match each box with its color-coded target.
363 | Then all the action agents discuss and coordinate with each other to come up with a final plan.
364 | The current state and possible actions of yourself are: {{{state_update_prompt_local_agent}}}.
365 | The current states and possible actions of all other agents are: {{{state_update_prompt_other_agent}}}.
366 | The previous state and action pairs at each step are:
367 |
368 | Please learn from previous steps. Do not simply repeat earlier actions; understand why the state changes or remains stuck in a dead loop. Avoid getting stuck in action loops.
369 |
370 | [Action Output Instruction]
371 | First output 'EXECUTE', then on a new line specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move..."}}.
372 | Include an agent only if it has a task next.
373 | Example#1:
374 | EXECUTE
375 | {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move(box_green, square[0.5, 0.5])"}}
376 |
377 | Example#2:
378 | EXECUTE
379 | {{"Agent[0.5, 0.5]":"move(box_blue, target_blue)", "Agent[2.5, 1.5]":"move(box_red, square[1.5, 1.5])"}}
380 |
381 | The initial plan is: {{{initial_plan}}}
382 | The previous dialogue history is: {{{dialogue_history}}}
383 | Think step-by-step about the task, initial plan, and the previous dialogue history. Carefully check and correct them if they made a mistake.
384 | Respond very concisely but informatively, and do not repeat what others have said. Discuss with others to come up with the best plan.
385 | Propose exactly one action for yourself at the **current** round.
386 | End your response in one of two ways: 1) output PROCEED if the plans require further discussion; 2) if everyone has made proposals that were approved, output the final plan as soon as possible, strictly following the [Action Output Instruction]!
387 | Your response:
388 | '''
389 |
390 | token_num_count = len(enc.encode(user_prompt_1))
391 |
392 | if dialogue_history_method == '_wo_any_dialogue_history':
393 | pass
394 | elif dialogue_history_method in (
395 | '_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_all_dialogue_history'):
396 | if dialogue_history_method == '_w_only_state_action_history':
397 | state_action_prompt = ''
398 | for i in range(len(response_total_list) - 1, -1, -1):
399 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt
400 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
401 | state_action_prompt = state_action_prompt_next
402 | else:
403 | break
404 | elif dialogue_history_method == '_w_compressed_dialogue_history':
405 | state_action_prompt = ''
406 | for i in range(len(response_total_list) - 1, -1, -1):
407 | dialogue_summary = LLM_summarize_func(dialogue_history_list[i])
408 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nSummary of dialogues at step {i + 1}: {dialogue_summary}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt
409 | # state_action_prompt_next = LLM_summarize_func(state_action_prompt_next_initial)
410 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
411 | state_action_prompt = state_action_prompt_next
412 | else:
413 | break
414 | elif dialogue_history_method == '_w_all_dialogue_history':
415 | state_action_prompt = ''
416 | for i in range(len(response_total_list) - 1, -1, -1):
417 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nDialogue{i + 1}: {dialogue_history_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt
418 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
419 | state_action_prompt = state_action_prompt_next
420 | else:
421 | break
422 |
423 | user_prompt_1 = f'''
424 | You\'re a box-moving agent in a multi-agent system, stationed on a 1x1 square in a grid playground. You can only interact with objects in your square. Squares are denoted by their center coordinates (e.g., square[0.5, 0.5]), and actions involve moving boxes to targets or nearby squares, represented by colors (e.g., move(box_red, target_red)). Each square can contain many targets and boxes.
425 | One extra planner first proposes an initial plan to coordinate all agents to achieve the goal: match each box with its color-coded target.
426 | Then all the action agents discuss and coordinate with each other to come up with a final plan.
427 | The current state and possible actions of yourself are: {{{state_update_prompt_local_agent}}}.
428 | The current states and possible actions of all other agents are: {{{state_update_prompt_other_agent}}}.
429 | The previous state and action pairs at each step are:
430 | {state_action_prompt}
431 | Please learn from previous steps. Do not simply repeat earlier actions; understand why the state changes or remains stuck in a dead loop. Avoid getting stuck in action loops.
432 |
433 | [Action Output Instruction]
434 | First output 'EXECUTE', then on a new line specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move..."}}.
435 | Include an agent only if it has a task next.
436 | Example#1:
437 | EXECUTE
438 | {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move(box_green, square[0.5, 0.5])"}}
439 |
440 | Example#2:
441 | EXECUTE
442 | {{"Agent[0.5, 0.5]":"move(box_blue, target_blue)", "Agent[2.5, 1.5]":"move(box_red, square[1.5, 1.5])"}}
443 |
444 | The initial plan is: {{{initial_plan}}}
445 | The previous dialogue history is: {{{dialogue_history}}}
446 | Think step-by-step about the task, initial plan, and the previous dialogue history. Carefully check and correct them if they made a mistake.
447 | Respond very concisely but informatively, and do not repeat what others have said. Discuss with others to come up with the best plan.
448 | Propose exactly one action for yourself at the **current** round.
449 | End your response in one of two ways: 1) output PROCEED if the plans require further discussion; 2) if everyone has made proposals that were approved, output the final plan as soon as possible, strictly following the [Action Output Instruction]!
450 | Your response:
451 | '''
452 | return user_prompt_1
453 |
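# Editor's sketch (hypothetical helper, not from the original code): the
# dialogue prompt above allows exactly two endings, 'PROCEED' to keep
# discussing or an 'EXECUTE' block followed by a JSON action dict. One way a
# caller could tell them apart, using only the standard library:
import json
import re

def parse_dialogue_response(response):
    if 'EXECUTE' in response:
        # The prompt's examples are valid JSON, e.g.
        # {"Agent[0.5, 0.5]":"move(box_blue, target_blue)"}.
        match = re.search(r'\{.*\}', response.split('EXECUTE', 1)[1], re.DOTALL)
        if match:
            return 'execute', json.loads(match.group(0))
    if response.strip().endswith('PROCEED'):
        return 'proceed', None
    return 'undetermined', None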
454 | def input_prompt_local_agent_HMAS2_dialogue_func(state_update_prompt_local_agent, state_update_prompt_other_agent, central_response, response_total_list, pg_state_list, dialogue_history_list, dialogue_history_method):
455 | if len(pg_state_list) - len(response_total_list) != 1:
456 | raise ValueError('state and response list do not match')
457 | if len(pg_state_list) - len(dialogue_history_list) != 1:
458 | raise ValueError('state and dialogue history list do not match')
459 |
460 | user_prompt_1 = f'''
461 | You\'re a box-moving agent in a multi-agent system, stationed on a 1x1 square in a grid playground. You can only interact with objects in your square. Squares are denoted by their center coordinates (e.g., square[0.5, 0.5]), and actions involve moving boxes to targets or nearby squares, represented by colors (e.g., move(box_red, target_red)). Each square can contain many targets and boxes.
462 |
463 | A central planner coordinates all agents to achieve the goal: match each box with its color-coded target.
464 |
465 | The current state and possible actions of yourself are: {{{state_update_prompt_local_agent}}}.
466 | The current states and possible actions of all other agents are: {{{state_update_prompt_other_agent}}}.
467 | The previous state and action pairs at each step are:
468 |
469 | Please learn from previous steps. Do not simply repeat earlier actions; understand why the state changes or remains stuck in a dead loop. Avoid getting stuck in action loops.
470 |
471 | The central planner\'s current action plan is: {{{central_response}}}.
472 |
473 | Please evaluate the given plan. If you agree with it, respond 'I Agree', without any extra words. If not, briefly explain your objections to the central planner. Your response:
474 | '''
475 |
476 | token_num_count = len(enc.encode(user_prompt_1))
477 |
478 | if dialogue_history_method == '_wo_any_dialogue_history':
479 | pass
480 | elif dialogue_history_method in (
481 | '_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_all_dialogue_history'):
482 | if dialogue_history_method == '_w_only_state_action_history':
483 | state_action_prompt = ''
484 | for i in range(len(response_total_list) - 1, -1, -1):
485 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt
486 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
487 | state_action_prompt = state_action_prompt_next
488 | else:
489 | break
490 | elif dialogue_history_method == '_w_compressed_dialogue_history':
491 | state_action_prompt = ''
492 | for i in range(len(response_total_list) - 1, -1, -1):
493 | dialogue_summary = LLM_summarize_func(dialogue_history_list[i])
494 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nSummary of dialogues at step {i + 1}: {dialogue_summary}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt
495 | # state_action_prompt_next = LLM_summarize_func(state_action_prompt_next_initial)
496 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
497 | state_action_prompt = state_action_prompt_next
498 | else:
499 | break
500 | elif dialogue_history_method == '_w_all_dialogue_history':
501 | state_action_prompt = ''
502 | for i in range(len(response_total_list) - 1, -1, -1):
503 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nDialogue{i + 1}: {dialogue_history_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt
504 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit:
505 | state_action_prompt = state_action_prompt_next
506 | else:
507 | break
508 |
509 | user_prompt_1 = f'''
510 | You\'re a box-moving agent in a multi-agent system, stationed on a 1x1 square in a grid playground. You can only interact with objects in your square. Squares are denoted by their center coordinates (e.g., square[0.5, 0.5]), and actions involve moving boxes to targets or nearby squares, represented by colors (e.g., move(box_red, target_red)). Each square can contain many targets and boxes.
511 |
512 | A central planner coordinates all agents to achieve the goal: match each box with its color-coded target.
513 |
514 | The current state and possible actions of yourself are: {{{state_update_prompt_local_agent}}}.
515 | The current states and possible actions of all other agents are: {{{state_update_prompt_other_agent}}}.
516 | The previous state and action pairs at each step are:
517 | {state_action_prompt}
518 | Please learn from previous steps. Do not simply repeat earlier actions; understand why the state changes or remains stuck in a dead loop. Avoid getting stuck in action loops.
519 |
520 | The central planner\'s current action plan is: {{{central_response}}}.
521 |
522 | Please evaluate the given plan. If you agree with it, respond 'I Agree', without any extra words. If not, briefly explain your objections to the central planner. Your response:
523 | '''
524 | return user_prompt_1
525 |
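# Editor's sketch of the HMAS-2 feedback round implied by the prompt above
# (hypothetical names; the real loop lives in the env*-box-arrange.py
# scripts): the central plan only proceeds once every local agent replies
# with 'I Agree' rather than an objection.
def all_local_agents_agree(local_responses):
    # local_responses: raw replies collected from each local agent this round.
    return all(resp.strip().startswith('I Agree') for resp in local_responses)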
526 |
527 | def input_reprompt_func(state_update_prompt):
528 | user_reprompt = f'''
529 | Finished! The updated state is as follows (matched boxes and targets of the same color have been removed):
530 |
531 | {state_update_prompt}
532 |
533 | The output should be in a JSON-like format, e.g.: {{Agent[0.5, 0.5]:move(box_blue, square[0.5, 1.5]), Agent[1.5, 0.5]:move...}}. If an agent has no action in the next step, simply omit it from the output. Also remember: at most one action per agent per step.
534 |
535 | Next step output:
536 | '''
537 | return user_reprompt
538 |
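# Editor's note: the reprompt above asks for a JSON-like dict whose keys and
# values are unquoted, so strict json.loads would reject it. A hypothetical
# regex extraction of the "agent: action" pairs instead:
import re

def extract_actions(llm_output):
    # Matches e.g. "Agent[0.5, 0.5]:move(box_blue, square[0.5, 1.5])".
    return re.findall(r'Agent\[[\d., ]+\]\s*:\s*move\([^)]*\)', llm_output)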
539 | def message_construct_func(user_prompt_list, response_total_list, dialogue_history_method):
540 | if dialogue_history_method == '_w_all_dialogue_history':
541 | messages=[{"role": "system", "content": "You are a helpful assistant."}]
542 | #print('length of user_prompt_list', len(user_prompt_list))
543 | for i in range(len(user_prompt_list)):
544 | messages.append({"role": "user", "content": user_prompt_list[i]})
545 | if i < len(user_prompt_list)-1:
546 | messages.append({"role": "assistant", "content": response_total_list[i]})
547 | #print('Length of messages', len(messages))
548 | elif dialogue_history_method in ('_wo_any_dialogue_history', '_w_only_state_action_history', '_w_compressed_dialogue_history'):  # compressed history is already embedded in the prompt, so only the last user message is sent
549 | messages=[{"role": "system", "content": "You are a helpful assistant."}]
550 | messages.append({"role": "user", "content": user_prompt_list[-1]})
551 | #print('Length of messages', len(messages))
552 | return messages
553 |
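# Editor's note: hypothetical end-to-end use of message_construct_func (the
# actual API call is wrapped in LLM.py; this sketch assumes the legacy
# openai-python chat interface):
#
#   messages = message_construct_func(user_prompt_list, response_total_list,
#                                     '_w_all_dialogue_history')
#   reply = openai.ChatCompletion.create(model='gpt-4', messages=messages)
#   text = reply['choices'][0]['message']['content']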
--------------------------------------------------------------------------------