├── Github-figures ├── Env.png ├── framework.png └── main_figure.png ├── LICENSE ├── README.md ├── LLM.py ├── data_visua.py ├── env3_create.py ├── env1_create.py ├── env4_create.py ├── env2_create.py ├── env1-box-arrange.py ├── env2-box-arrange.py ├── env4-box-arrange.py ├── env3-box-arrange.py ├── prompt_env3.py └── prompt_env1.py /Github-figures/Env.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yongchao98/multi-agent-framework/HEAD/Github-figures/Env.png -------------------------------------------------------------------------------- /Github-figures/framework.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yongchao98/multi-agent-framework/HEAD/Github-figures/framework.png -------------------------------------------------------------------------------- /Github-figures/main_figure.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/yongchao98/multi-agent-framework/HEAD/Github-figures/main_figure.png -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2024 Yongchao Chen 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Multi-Agent-Framework ([Website](https://yongchao98.github.io/MIT-REALM-Multi-Robot/), ICRA 2024) 2 | Here we show the related code for the Multi-Agent Framework paper. The code will be updated dynamically in the future. There are in total four environments, corresponding to BoxNet1, BoxNet2, BoxLift, and Warehouse, respectively. 3 | 4 |

5 |

6 |

7 | 8 | ## Requirements 9 | Please install the following Python packages. 10 | ``` 11 | pip install numpy openai re random time copy tiktoken 12 | ``` 13 | 14 | Then you need to get your OpenAI key from https://beta.openai.com/ 15 | Put that OpenAI key starting 'sk-' into the LLM.py, line8 16 | 17 | ## Create testing trial environments 18 | Run the env1_create.py/env2_create.py/env3_create.py/env4_create.py to create the environments, remember change the Code_dir_path in the last lines. 19 | 20 | ``` 21 | python env1_create.py 22 | ``` 23 | 24 | ## Usage 25 | Run the env1-box-arrange.py/env2-box-arrange.py/env3-box-arrange.py/env4-box-arrange.py to test our approaches in different frameworks and dialogue history methods. In around Line270, set up the models(GPT-3/4), frameworks (HMAS-2,HMSA-1, DMAS,CMAS), dialogue history method, and your working path dir. Then run the script: 26 | 27 | ``` 28 | python env1-box-arrange.py 29 | ``` 30 | 31 | The experimental results will appear in the generated dir Env1_BoxNet1. For visualizing the testing results, set up the Code_dir_path in line2, then run the script: 32 | 33 | ``` 34 | python data_visua.py 35 | ``` 36 | 37 | ## Recommended Work 38 | 39 | [AutoTAMP: Autoregressive Task and Motion Planning with LLMs as Translators and Checkers](https://arxiv.org/pdf/2306.06531.pdf) 40 | 41 | [NL2TL: Transforming Natural Languages to Temporal Logics using Large Language Models](https://arxiv.org/pdf/2305.07766.pdf) 42 | -------------------------------------------------------------------------------- /LLM.py: -------------------------------------------------------------------------------- 1 | import openai 2 | import tiktoken 3 | import time 4 | enc = tiktoken.get_encoding("cl100k_base") 5 | assert enc.decode(enc.encode("hello world")) == "hello world" 6 | enc = tiktoken.encoding_for_model("gpt-4") 7 | 8 | openai_api_key_name = 'sk-...' 9 | 10 | def GPT_response(messages, model_name): 11 | token_num_count = 0 12 | for item in messages: 13 | token_num_count += len(enc.encode(item["content"])) 14 | 15 | if model_name in ['gpt-4', 'gpt-4-32k', 'gpt-3.5-turbo-0301', 'gpt-4-0613', 'gpt-4-32k-0613', 'gpt-3.5-turbo-16k-0613']: 16 | #print(f'-------------------Model name: {model_name}-------------------') 17 | openai.api_key = openai_api_key_name 18 | 19 | try: 20 | result = openai.ChatCompletion.create( 21 | model=model_name, 22 | messages=messages, 23 | temperature = 0.0, 24 | top_p=1, 25 | frequency_penalty=0, 26 | presence_penalty=0 27 | ) 28 | except: 29 | try: 30 | result = openai.ChatCompletion.create( 31 | model=model_name, 32 | messages=messages, 33 | temperature=0.0, 34 | top_p=1, 35 | frequency_penalty=0, 36 | presence_penalty=0 37 | ) 38 | except: 39 | try: 40 | print(f'{model_name} Waiting 60 seconds for API query') 41 | time.sleep(60) 42 | result = openai.ChatCompletion.create( 43 | model=model_name, 44 | messages=messages, 45 | temperature = 0.0, 46 | top_p=1, 47 | frequency_penalty=0, 48 | presence_penalty=0 49 | ) 50 | except: 51 | return 'Out of tokens', token_num_count 52 | token_num_count += len(enc.encode(result.choices[0]['message']['content'])) 53 | print(f'Token_num_count: {token_num_count}') 54 | return result.choices[0]['message']['content'], token_num_count 55 | 56 | else: 57 | raise ValueError(f'Invalid model name: {model_name}') 58 | -------------------------------------------------------------------------------- /data_visua.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here 3 | Saving_path = Code_dir_path + 'Env2_BoxNet2' 4 | 5 | candidate_list = [('CMAS','_wo_any_dialogue_history'), ('CMAS','_w_only_state_action_history'), 6 | ('HMAS-2','_wo_any_dialogue_history'), ('HMAS-2','_w_only_state_action_history'), 7 | ('HMAS-2','_w_all_dialogue_history'), ('HMAS-1','_w_only_state_action_history')] 8 | 9 | print('-------#####------------#####------------#####--------') 10 | Env_action_time_list_best_total = []; API_query_time_list_best_total = []; token_num_list_best_total = [] 11 | for pg_row_num, pg_column_num in [(2, 2), (2, 4), (4, 4), (4,8)]: 12 | Env_action_time_list = []; API_query_time_list = []; token_num_list = []; 13 | for iteration_num in range(10): 14 | Env_action_time_list.append(1e10); API_query_time_list.append(1e10); token_num_list.append(1e10) 15 | for cen_decen_framework, dialogue_history_method in candidate_list: 16 | #print(f'Row num: {pg_row_num}, Column num: {pg_column_num}, {cen_decen_framework}{dialogue_history_method}') 17 | with open(saving_path +f'env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}/success_failure.txt', 'r') as file: 18 | first_line = file.readline().strip() 19 | #print(first_line) 20 | 21 | if first_line == 'success': 22 | with open(saving_path +f'env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}/env_action_times.txt', 'r') as file: 23 | numbers = [float(line.strip()) for line in file.readlines()] 24 | #print('Environment action times', numbers[0]) 25 | if numbers[0] < Env_action_time_list[iteration_num]: 26 | Env_action_time_list[iteration_num] = numbers[0] 27 | 28 | with open(saving_path +f'env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}/token_num_count.txt', 'r') as file: 29 | numbers = [float(line.strip()) for line in file.readlines()] 30 | #print('API query times', len(numbers)) 31 | #print('Consuming of token num', np.sum(numbers)) 32 | if len(numbers) < API_query_time_list[iteration_num]: 33 | API_query_time_list[iteration_num] = len(numbers) 34 | if np.sum(numbers) < token_num_list[iteration_num]: 35 | token_num_list[iteration_num] = np.sum(numbers) 36 | 37 | print(f'Row num: {pg_row_num}, Column num: {pg_column_num}, Environment action times, {Env_action_time_list}') 38 | print(f'Row num: {pg_row_num}, Column num: {pg_column_num}, API query times, {API_query_time_list}') 39 | print(f'Row num: {pg_row_num}, Column num: {pg_column_num}, Consuming of token num, {token_num_list}') 40 | print('\n') 41 | Env_action_time_list_best_total.append(Env_action_time_list) 42 | API_query_time_list_best_total.append(API_query_time_list) 43 | token_num_list_best_total.append(token_num_list) 44 | 45 | for cen_decen_framework, dialogue_history_method in candidate_list: 46 | print('\n') 47 | success_rate_list = [] 48 | Env_action_time_list = []; 49 | API_query_time_list = []; 50 | token_num_list = [] 51 | for index, (pg_row_num, pg_column_num) in enumerate([(2,2), (2,4), (4,4)]): 52 | print('\n') 53 | success_rate = 0; 54 | Env_action_time_cost = 0; 55 | API_query_time_cost = 0; 56 | token_num_cost = 0 57 | success_failure_state_list = [] 58 | for iteration_num in range(4): 59 | with open( 60 | saving_path + f'env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}/success_failure.txt', 61 | 'r') as file: 62 | first_line = file.readline().strip() 63 | success_failure_state_list.append(first_line) 64 | 65 | if first_line == 'success': 66 | success_rate += 1 67 | with open( 68 | saving_path + f'env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}/env_action_times.txt', 69 | 'r') as file: 70 | numbers = [float(line.strip()) for line in file.readlines()] 71 | # print('Environment action times', numbers[0]) 72 | if Env_action_time_list_best_total[index][iteration_num] < 1e10: 73 | #print(numbers[0]/Env_action_time_list_best_total[index][iteration_num]) 74 | Env_action_time_cost += numbers[0]/Env_action_time_list_best_total[index][iteration_num] 75 | 76 | with open( 77 | saving_path + f'env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}/token_num_count.txt', 78 | 'r') as file: 79 | numbers = [float(line.strip()) for line in file.readlines()] 80 | # print('API query times', len(numbers)) 81 | # print('Consuming of token num', np.sum(numbers)) 82 | if API_query_time_list_best_total[index][iteration_num] < 1e10: 83 | #print(len(numbers)/API_query_time_list_best_total[index][iteration_num]) 84 | API_query_time_cost += len(numbers)/API_query_time_list_best_total[index][iteration_num] 85 | if token_num_list_best_total[index][iteration_num] < 1e10: 86 | #print(np.sum(numbers)/token_num_list_best_total[index][iteration_num]) 87 | token_num_cost += np.sum(numbers)/token_num_list_best_total[index][iteration_num] 88 | print(f'Row num: {pg_row_num}, Column num: {pg_column_num}, {cen_decen_framework}{dialogue_history_method}') 89 | print('success_rate', {success_rate/4}) 90 | print(success_failure_state_list) 91 | success_rate_list.append(success_rate / 4) 92 | if success_rate > 0: 93 | print('Env_action_time_cost', {Env_action_time_cost / success_rate}) 94 | Env_action_time_list.append(Env_action_time_cost / success_rate) 95 | 96 | print('API_query_time_cost', {API_query_time_cost / success_rate}) 97 | API_query_time_list.append(API_query_time_cost / success_rate) 98 | 99 | print('token_num_cost', {token_num_cost / success_rate}) 100 | token_num_list.append(token_num_cost / success_rate) 101 | 102 | print('\n') 103 | print(f'success_rate: {success_rate_list}, {np.sum(success_rate_list)/len(success_rate_list)}') 104 | print(f'Env_action_time_cost: {Env_action_time_list}, {np.sum(Env_action_time_list) / len(Env_action_time_list)}') 105 | print(f'API_query_time_cost: {API_query_time_list}, {np.sum(API_query_time_list) / len(API_query_time_list)}') 106 | print(f'token_num_cost: {token_num_list}, {np.sum(token_num_list) / len(token_num_list)}') -------------------------------------------------------------------------------- /env3_create.py: -------------------------------------------------------------------------------- 1 | from prompt_env3 import * 2 | from LLM import * 3 | from sre_constants import error 4 | import random 5 | import os 6 | import json 7 | import re 8 | import copy 9 | import numpy as np 10 | import shutil 11 | import time 12 | import ast 13 | 14 | def state_update_func(pg_dict, lifter_weight_list): 15 | volume_list = [volume for volume, weight in pg_dict.items()] 16 | 17 | state_update_prompt = f'The left boxes in the warehouse are: ' 18 | left_box = '' 19 | for i in range(len(volume_list)-1): 20 | state_update_prompt += f'box[{volume_list[i]}V], ' 21 | left_box += f'box[{volume_list[i]}V], ' 22 | state_update_prompt += f'box[{volume_list[len(volume_list)-1]}V]' 23 | left_box += f'box[{volume_list[len(volume_list)-1]}V]' 24 | state_update_prompt += f'.\n' 25 | left_box += f'.\n' 26 | 27 | state_update_prompt += f'The available lifting agents in the warehouse are: ' 28 | for i in range(len(lifter_weight_list)-1): 29 | state_update_prompt += f'agent[{lifter_weight_list[i]}W], ' 30 | state_update_prompt += f'agent[{lifter_weight_list[len(lifter_weight_list)-1]}W]' 31 | state_update_prompt += f'.\n' 32 | return state_update_prompt, left_box 33 | 34 | def with_action_syntactic_check_func(pg_dict_input, response, user_prompt_list_input, response_total_list_input, model_name, dialogue_history_method): 35 | user_prompt_list = copy.deepcopy(user_prompt_list_input) 36 | response_total_list = copy.deepcopy(response_total_list_input) 37 | iteration_num = 0 38 | token_num_count_list_add = [] 39 | while iteration_num < 6: 40 | response_total_list.append(response) 41 | feedback = '' 42 | #try: 43 | original_response_dict = json.loads(response) 44 | pg_dict_original = copy.deepcopy(pg_dict_input) 45 | 46 | # The state to be updated 47 | volume_list = [volume for volume, weight in pg_dict_original.items()] 48 | weight_list = [weight for volume, weight in pg_dict_original.items()] 49 | 50 | # The action to act 51 | for key, value in original_response_dict.items(): 52 | match = re.search(r'(\d+\.\d+)', key) 53 | volume = float(match.group(1)) 54 | lift_weight_list = [float(num) for num in re.findall(r'(\d+\.\d+)', value)] 55 | # print(lift_weight_list) 56 | 57 | if volume in volume_list: 58 | pass 59 | else: 60 | feedback += f'box[{volume}V] is not in the current warehouse; ' 61 | #except: 62 | # feedback = 'Your assigned plan is not in the correct json format as before. If your answer is empty dict, please check whether you miss the left boxes in the warehouse.' 63 | 64 | if feedback != '': 65 | feedback += 'Please replan for all the agents again with the same ouput format:' 66 | print('----------Syntactic Check----------') 67 | print(f'Response original: {response}') 68 | print(f'Feedback: {feedback}') 69 | user_prompt_list.append(feedback) 70 | messages = message_construct_func(user_prompt_list, response_total_list, dialogue_history_method) # message construction 71 | print(f'Length of messages {len(messages)}') 72 | response, token_num_count = GPT_response(messages, model_name) 73 | token_num_count_list_add.append(token_num_count) 74 | print(f'Response new: {response}\n') 75 | if response == 'Out of tokens': 76 | return response, token_num_count_list_add 77 | iteration_num += 1 78 | else: 79 | return response, token_num_count_list_add 80 | return 'Syntactic Error', token_num_count_list_add 81 | 82 | 83 | def action_from_response(pg_dict, original_response_dict, lifter_weight_list): 84 | system_error_feedback = ''; 85 | env_act_feedback = '' 86 | pg_dict_original = copy.deepcopy(pg_dict) 87 | 88 | # The state to be updated 89 | volume_list = [volume for volume, weight in pg_dict_original.items()] 90 | weight_list = [weight for volume, weight in pg_dict_original.items()] 91 | 92 | # The action to act 93 | for key, value in original_response_dict.items(): 94 | match = re.search(r'(\d+\.\d+)', key) 95 | volume = float(match.group(1)) 96 | lift_weight_list = [float(num) for num in re.findall(r'(\d+\.\d+)', value)] 97 | for item in lift_weight_list: 98 | if item not in lifter_weight_list: 99 | system_error_feedback += f'agent[{item}W] is not in the current warehouse; ' 100 | 101 | if volume in volume_list: 102 | index = volume_list.index(volume) 103 | if np.sum(lift_weight_list) >= weight_list[index]: 104 | volume_list.pop(index) 105 | weight_list.pop(index) 106 | else: 107 | expression = '' 108 | for index_2 in range(len(lift_weight_list)): 109 | if index_2 != len(lift_weight_list) - 1: 110 | expression += f'agent[{lift_weight_list[index_2]}W] and ' 111 | else: 112 | expression += f'agent[{lift_weight_list[index_2]}W]' 113 | env_act_feedback += f'The weight of box[{volume}V] is higher than the summation of lifting capability of {expression}, so it can not be lifted. ' 114 | else: 115 | system_error_feedback += f'box[{volume}V] is not in the current warehouse; ' 116 | 117 | pg_dict_original = dict(zip(volume_list, weight_list)) 118 | return system_error_feedback, pg_dict_original, env_act_feedback 119 | 120 | 121 | 122 | def assign_weight(volume): 123 | # Step 1: Assume a base density to convert volume to weight. 124 | # This value is an assumption; in real-life, different items have different densities. 125 | # Let's assume a density of 0.5 kg/m^3 for simplicity. 126 | # You can adjust this value based on your requirements. 127 | density = 1 128 | estimated_weight = volume * density 129 | 130 | # Step 2: Add some randomness to the weight. 131 | # This can be a combination of gaussian noise and outlier noise. 132 | noise = random.gauss(0, estimated_weight * 0.1) # 10% of weight as gaussian noise 133 | outlier_chance = 0.05 # 5% chance to be an outlier 134 | if random.random() < outlier_chance: 135 | noise += random.choice([-1, 1]) * estimated_weight * 0.5 # 50% of weight as outlier noise 136 | 137 | weight = max(0.1, estimated_weight + noise) # ensure weight is not negative 138 | return weight 139 | 140 | def env_create(lifter_num, box_num): 141 | # Create the volume and weight lists 142 | volume_list = [random.randint(2, 20)/2 for _ in range(box_num)] 143 | weight_list = [round(assign_weight(volume), 1) for volume in volume_list] 144 | 145 | # Create the lifter list 146 | lifter_weight_list = [random.randint(1, 15) / 2 for _ in range(lifter_num)] 147 | while np.sum(lifter_weight_list) < np.max(weight_list): 148 | lifter_weight_list = [item + 0.5 for item in lifter_weight_list] 149 | 150 | print('lifter_weight_list: ', lifter_weight_list) 151 | print('volume_list: ', volume_list) 152 | print('weight_list: ', weight_list) 153 | print('Deviation ratio: ', [weight_list[i] / volume_list[i] for i in range(len(volume_list))]) 154 | print('\n') 155 | return lifter_weight_list, volume_list, weight_list 156 | 157 | def create_env3(Saving_path, repeat_num = 4): 158 | if not os.path.exists(Saving_path): 159 | os.makedirs(Saving_path, exist_ok=True) 160 | else: 161 | shutil.rmtree(Saving_path) 162 | os.makedirs(Saving_path, exist_ok=True) 163 | 164 | for i, box_num in [(4,10), (6,14), (8,18), (10,24)]: 165 | if not os.path.exists(Saving_path+f'/env_pg_state_{i}'): 166 | os.makedirs(Saving_path+f'/env_pg_state_{i}', exist_ok=True) 167 | else: 168 | shutil.rmtree(Saving_path+f'/env_pg_state_{i}') 169 | os.makedirs(Saving_path+f'/env_pg_state_{i}', exist_ok=True) 170 | 171 | for iteration_num in range(repeat_num): 172 | lifter_weight_list, volume_list, weight_list = env_create(i, box_num) 173 | os.makedirs(Saving_path+f'/env_pg_state_{i}/pg_state{iteration_num}', exist_ok=True) 174 | with open(Saving_path+f'/env_pg_state_{i}/pg_state{iteration_num}/lifter_weight_list{iteration_num}.txt', 'w') as f: 175 | for number in lifter_weight_list: 176 | f.write(str(number) + '\n') 177 | 178 | with open(Saving_path+f'/env_pg_state_{i}/pg_state{iteration_num}/volume_list{iteration_num}.txt', 'w') as f: 179 | for number in volume_list: 180 | f.write(str(number) + '\n') 181 | 182 | with open(Saving_path+f'/env_pg_state_{i}/pg_state{iteration_num}/weight_list{iteration_num}.txt', 'w') as f: 183 | for number in weight_list: 184 | f.write(str(number) + '\n') 185 | 186 | Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here 187 | Saving_path = Code_dir_path + 'Env3_BoxLift' 188 | create_env3(Saving_path, repeat_num = 10) 189 | 190 | -------------------------------------------------------------------------------- /env1_create.py: -------------------------------------------------------------------------------- 1 | # Box moving to target without collision 2 | 3 | from prompt_env1 import * 4 | from LLM import * 5 | from sre_constants import error 6 | import random 7 | import os 8 | import json 9 | import re 10 | import copy 11 | import numpy as np 12 | import shutil 13 | import time 14 | 15 | def surround_index_func(row_num, coloum_num, row_index, coloum_index): 16 | surround_index_list = [] 17 | for i, j in ([row_index-1, coloum_index], [row_index+1, coloum_index], [row_index, coloum_index-1], [row_index, coloum_index+1]): 18 | if i>=0 and i<=row_num-1 and j>=0 and j<=coloum_num-1 and not (i == row_index and j == coloum_index): 19 | surround_index_list.append([i+0.5,j+0.5]) 20 | return surround_index_list 21 | 22 | def state_update_func(pg_row_num, pg_column_num, pg_dict): 23 | pg_dict_copy = copy.deepcopy(pg_dict) 24 | state_update_prompt = '' 25 | for i in range(pg_row_num): 26 | for j in range(pg_column_num): 27 | square_item_list = pg_dict_copy[str(i+0.5)+'_'+str(j+0.5)] 28 | square_item_only_box = [item for item in square_item_list if item[:3]=='box'] 29 | surround_index_list = surround_index_func(pg_row_num, pg_column_num, i, j) 30 | state_update_prompt += f'Agent[{i+0.5}, {j+0.5}]: I am in square[{i+0.5}, {j+0.5}], I can observe {square_item_list}, I can do ' 31 | action_list = [] 32 | for box in square_item_only_box: 33 | for surround_index in surround_index_list: 34 | action_list.append(f'move({box}, square{surround_index})') 35 | if 'target'+box[3:] in square_item_list: 36 | action_list.append(f'move({box}, target{box[3:]})') 37 | state_update_prompt += f'{action_list}\n' 38 | return state_update_prompt 39 | 40 | def state_update_func_local_agent(pg_row_num, pg_column_num, pg_row_i, pg_column_j, pg_dict): 41 | pg_dict_copy = copy.deepcopy(pg_dict) 42 | state_update_prompt_local_agent = '' 43 | state_update_prompt_other_agent = '' 44 | 45 | for i in range(pg_row_num): 46 | for j in range(pg_column_num): 47 | if not (i == pg_row_i and pg_column_j == j): 48 | square_item_list = pg_dict_copy[str(i+0.5)+'_'+str(j+0.5)] 49 | square_item_only_box = [item for item in square_item_list if item[:3]=='box'] 50 | surround_index_list = surround_index_func(pg_row_num, pg_column_num, i, j) 51 | state_update_prompt_other_agent += f'Agent[{i+0.5}, {j+0.5}]: I am in square[{i+0.5}, {j+0.5}], I can observe {square_item_list}, I can do ' 52 | action_list = [] 53 | for box in square_item_only_box: 54 | for surround_index in surround_index_list: 55 | action_list.append(f'move({box}, square{surround_index})') 56 | if 'target'+box[3:] in square_item_list: 57 | action_list.append(f'move({box}, target{box[3:]})') 58 | state_update_prompt_other_agent += f'{action_list}\n' 59 | 60 | square_item_list = pg_dict_copy[str(pg_row_i+0.5)+'_'+str(pg_column_j+0.5)] 61 | square_item_only_box = [item for item in square_item_list if item[:3]=='box'] 62 | surround_index_list = surround_index_func(pg_row_num, pg_column_num, pg_row_i, pg_column_j) 63 | state_update_prompt_local_agent += f'Agent[{pg_row_i+0.5}, {pg_column_j+0.5}]: in square[{pg_row_i+0.5}, {pg_column_j+0.5}], can observe {square_item_list}, can do ' 64 | action_list = [] 65 | for box in square_item_only_box: 66 | for surround_index in surround_index_list: 67 | action_list.append(f'move({box}, square{surround_index})') 68 | if 'target'+box[3:] in square_item_list: 69 | action_list.append(f'move({box}, target{box[3:]})') 70 | state_update_prompt_local_agent += f'{action_list}\n' 71 | return state_update_prompt_local_agent, state_update_prompt_other_agent 72 | 73 | def with_action_syntactic_check_func(pg_dict_input, response, user_prompt_list_input, response_total_list_input, model_name, dialogue_history_method, cen_decen_framework): 74 | user_prompt_list = copy.deepcopy(user_prompt_list_input) 75 | response_total_list = copy.deepcopy(response_total_list_input) 76 | iteration_num = 0 77 | token_num_count_list_add = [] 78 | while iteration_num < 6: 79 | response_total_list.append(response) 80 | try: 81 | original_response_dict = json.loads(response) 82 | 83 | pg_dict_original = copy.deepcopy(pg_dict_input) 84 | transformed_dict = {} 85 | for key, value in original_response_dict.items(): 86 | coordinates = tuple(map(float, re.findall(r"\d+\.?\d*", key))) 87 | 88 | # match the item and location in the value 89 | match = re.match(r"move\((.*?),\s(.*?)\)", value) 90 | if match: 91 | item, location = match.groups() 92 | 93 | if "square" in location: 94 | location = tuple(map(float, re.findall(r"\d+\.?\d*", location))) 95 | 96 | transformed_dict[coordinates] = [item, location] 97 | 98 | feedback = '' 99 | for key, value in transformed_dict.items(): 100 | # print(f"Key: {key}, Value1: {value[0]}, Value2: {value[1]}") 101 | if value[0] in pg_dict_original[str(key[0]) + '_' + str(key[1])] and type(value[1]) == tuple and ( 102 | (np.abs(key[0] - value[1][0]) == 0 and np.abs(key[1] - value[1][1]) == 1) or ( 103 | np.abs(key[0] - value[1][0]) == 1 and np.abs(key[1] - value[1][1]) == 0)): 104 | pass 105 | elif value[0] in pg_dict_original[str(key[0]) + '_' + str(key[1])] and type(value[1]) == str and value[1] in \ 106 | pg_dict_original[str(key[0]) + '_' + str(key[1])] and value[0][:4] == 'box_' and value[1][ 107 | :7] == 'target_' and \ 108 | value[0][4:] == value[1][7:]: 109 | pass 110 | else: 111 | # print(f"Error, Iteration Num: {iteration_num}, Key: {key}, Value1: {value[0]}, Value2: {value[1]}") 112 | feedback += f'Your assigned task for {key[0]}_{key[1]} is not in the doable action list; ' 113 | except: 114 | raise error(f'The response in wrong json format: {response}') 115 | feedback = 'Your assigned plan is not in the correct json format as before. If your answer is empty dict, please check whether you miss to move box into the same colored target like move(box_blue, target_blue)' 116 | 117 | if feedback != '': 118 | feedback += 'Please replan for all the agents again with the same ouput format:' 119 | print('----------Syntactic Check----------') 120 | print(f'Response original: {response}') 121 | print(f'Feedback: {feedback}') 122 | user_prompt_list.append(feedback) 123 | messages = message_construct_func(user_prompt_list, response_total_list, dialogue_history_method) # message construction 124 | print(f'Length of messages {len(messages)}') 125 | response, token_num_count = GPT_response(messages, model_name) 126 | token_num_count_list_add.append(token_num_count) 127 | print(f'Response new: {response}\n') 128 | if response == 'Out of tokens': 129 | return response, token_num_count_list_add 130 | iteration_num += 1 131 | else: 132 | return response, token_num_count_list_add 133 | return 'Syntactic Error', token_num_count_list_add 134 | 135 | def action_from_response(pg_dict_input, original_response_dict): 136 | system_error_feedback = '' 137 | pg_dict_original = copy.deepcopy(pg_dict_input) 138 | transformed_dict = {} 139 | for key, value in original_response_dict.items(): 140 | coordinates = tuple(map(float, re.findall(r"\d+\.?\d*", key))) 141 | 142 | # match the item and location in the value 143 | match = re.match(r"move\((.*?),\s(.*?)\)", value) 144 | if match: 145 | item, location = match.groups() 146 | if "square" in location: 147 | location = tuple(map(float, re.findall(r"\d+\.?\d*", location))) 148 | transformed_dict[coordinates] = [item, location] 149 | 150 | for key, value in transformed_dict.items(): 151 | #print(f"Key: {key}, Value1: {value[0]}, Value2: {value[1]}") 152 | if value[0] in pg_dict_original[str(key[0])+'_'+str(key[1])] and type(value[1]) == tuple and ((np.abs(key[0]-value[1][0])==0 and np.abs(key[1]-value[1][1])==1) or (np.abs(key[0]-value[1][0])==1 and np.abs(key[1]-value[1][1])==0)): 153 | pg_dict_original[str(key[0])+'_'+str(key[1])].remove(value[0]) 154 | pg_dict_original[str(value[1][0])+'_'+str(value[1][1])].append(value[0]) 155 | elif value[0] in pg_dict_original[str(key[0])+'_'+str(key[1])] and type(value[1]) == str and value[1] in pg_dict_original[str(key[0])+'_'+str(key[1])] and value[0][:4] == 'box_' and value[1][:7] == 'target_' and value[0][4:] == value[1][7:]: 156 | pg_dict_original[str(key[0])+'_'+str(key[1])].remove(value[0]) 157 | pg_dict_original[str(key[0])+'_'+str(key[1])].remove(value[1]) 158 | else: 159 | #print(f"Error, Iteration Num: {iteration_num}, Key: {key}, Value1: {value[0]}, Value2: {value[1]}") 160 | system_error_feedback += f'Your assigned task for {key[0]}_{key[1]} is not in the doable action list; ' 161 | 162 | return system_error_feedback, pg_dict_original 163 | 164 | def env_create(pg_row_num = 5, pg_column_num = 5, box_num_low_bound = 2, box_num_upper_bound = 2, color_list = ['blue', 'red', 'green', 'purple', 'orange']): 165 | # pg_dict records the items in each square over steps, here in the initial setting, we randomly assign items into each square 166 | pg_dict = {} 167 | for i in range(pg_row_num): 168 | for j in range(pg_column_num): 169 | pg_dict[str(i+0.5)+'_'+str(j+0.5)] = [] 170 | 171 | for color in color_list: 172 | box_num = random.randint(box_num_low_bound, box_num_upper_bound) 173 | for _ in range(box_num): 174 | N_box = random.randint(0, pg_row_num*pg_column_num - 1) 175 | a_box = N_box // pg_column_num 176 | b_box = N_box % pg_column_num 177 | N_target = random.randint(0, pg_row_num*pg_column_num - 1) 178 | a_target = N_target // pg_column_num 179 | b_target = N_target % pg_column_num 180 | pg_dict[str(a_box+0.5)+'_'+str(b_box+0.5)].append('box_' + color) 181 | pg_dict[str(a_target+0.5)+'_'+str(b_target+0.5)].append('target_' + color) 182 | return pg_dict 183 | 184 | def create_env1(Saving_path, repeat_num = 10): 185 | if not os.path.exists(Saving_path): 186 | os.makedirs(Saving_path, exist_ok=True) 187 | else: 188 | shutil.rmtree(Saving_path) 189 | os.makedirs(Saving_path, exist_ok=True) 190 | 191 | for i ,j in [(2,2), (2,4), (4,4), (4,8)]: 192 | 193 | if not os.path.exists(Saving_path+f'/env_pg_state_{i}_{j}'): 194 | os.makedirs(Saving_path+f'/env_pg_state_{i}_{j}', exist_ok=True) 195 | else: 196 | shutil.rmtree(Saving_path+f'/env_pg_state_{i}_{j}') 197 | os.makedirs(Saving_path+f'/env_pg_state_{i}_{j}', exist_ok=True) 198 | 199 | for iteration_num in range(repeat_num): 200 | # Define the total row and column numbers of the whole playground, and the item number of each colored target and box 201 | pg_row_num = i; pg_column_num = j; box_num_low_bound = 1; box_num_upper_bound = 3 202 | # Define the used colors 203 | color_list = ['blue', 'red', 'green', 'purple', 'orange'] 204 | pg_dict = env_create(pg_row_num, pg_column_num, box_num_low_bound, box_num_upper_bound, color_list) 205 | os.makedirs(Saving_path+f'/env_pg_state_{i}_{j}/pg_state{iteration_num}', exist_ok=True) 206 | with open(Saving_path+f'/env_pg_state_{i}_{j}/pg_state{iteration_num}/pg_state{iteration_num}.json', 'w') as f: 207 | json.dump(pg_dict, f) 208 | 209 | Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here 210 | Saving_path = Code_dir_path + 'Env1_BoxNet1' 211 | # The first time to create the environment, after that you can comment it 212 | create_env1(Saving_path, repeat_num = 10) -------------------------------------------------------------------------------- /env4_create.py: -------------------------------------------------------------------------------- 1 | from prompt_env4 import * 2 | from LLM import * 3 | from sre_constants import error 4 | import random 5 | import os 6 | import json 7 | import re 8 | import copy 9 | import numpy as np 10 | import shutil 11 | import time 12 | 13 | def state_update_func(agent_position_state_dict, box_position_dict, track_row_num, column_num): 14 | state_update_prompt = f'The states and actions of available agents are: \n' 15 | state_update_prompt += f'The left boxes and their locations in the warehouse are: ' 16 | for key, value in box_position_dict.items(): 17 | if value == 1: 18 | state_update_prompt += f'box_{key}, ' 19 | state_update_prompt += f'.\n' 20 | 21 | for i in range(len(agent_position_state_dict)): 22 | if type(agent_position_state_dict[f'agent{i}']) == str and agent_position_state_dict[f'agent{i}'] == 'target': 23 | state_update_prompt += f'I am agent{i}, I am in target now, I can do: ' 24 | for row_num in range(track_row_num): 25 | state_update_prompt += f'move to track_{row_num}; ' 26 | else: 27 | if agent_position_state_dict[f'agent{i}'][2] == 1: 28 | state_update_prompt += f'I am agent{i}, I am in track_{agent_position_state_dict[f"agent{i}"][0]} and column_{agent_position_state_dict[f"agent{i}"][1]}, I am having box on myself so can not pick more box now. I can do: ' 29 | else: 30 | state_update_prompt += f'I am agent{i}, I am in track_{agent_position_state_dict[f"agent{i}"][0]} and column_{agent_position_state_dict[f"agent{i}"][1]}, I am not having box on myself so can pick one box. I can do: ' 31 | if agent_position_state_dict[f'agent{i}'][1] > 0: 32 | state_update_prompt += f'move left; ' 33 | if agent_position_state_dict[f'agent{i}'][1] < column_num-1: 34 | state_update_prompt += f'move right; ' 35 | if agent_position_state_dict[f'agent{i}'][1] == 0: 36 | state_update_prompt += f'move to target; ' 37 | if agent_position_state_dict[f'agent{i}'][0] - 0.5 > 0 and box_position_dict[f'{agent_position_state_dict[f"agent{i}"][0]-0.5}_{float(agent_position_state_dict[f"agent{i}"][1])}'] == 1 and agent_position_state_dict[f'agent{i}'][2] == 0: 38 | state_update_prompt += f'pick box_{agent_position_state_dict[f"agent{i}"][0]-0.5}_{float(agent_position_state_dict[f"agent{i}"][1])}; ' 39 | if agent_position_state_dict[f'agent{i}'][0] + 0.5 < track_row_num-1 and box_position_dict[f'{agent_position_state_dict[f"agent{i}"][0] + 0.5}_{float(agent_position_state_dict[f"agent{i}"][1])}'] == 1 and agent_position_state_dict[f'agent{i}'][2] == 0: 40 | state_update_prompt += f'pick box_{agent_position_state_dict[f"agent{i}"][0]+0.5}_{float(agent_position_state_dict[f"agent{i}"][1])}; ' 41 | state_update_prompt += f'.\n' 42 | 43 | state_update_prompt += f'\n' 44 | return state_update_prompt 45 | 46 | 47 | def action_from_response(pg_dict_input, original_response_dict, track_row_num, column_num, box_position_dict_input): 48 | collision_check = False 49 | system_error_feedback = '' 50 | pg_dict_original = copy.deepcopy(pg_dict_input) 51 | box_position_dict = copy.deepcopy(box_position_dict_input) 52 | 53 | for key, value in original_response_dict.items(): 54 | # '{"agent0":"move left", "agent1":"pick box_1.0_1.5"}' 55 | 56 | if 'left' in value: 57 | if pg_dict_original[key][1]>0: 58 | pg_dict_original[key][1] -= 1 59 | elif pg_dict_original[key][1]==0: 60 | system_error_feedback += f'{key} has arrived at the left side of the track, you can not move left. You can enter_target to leave the track and drop the box (if you have box), or you can move left or pick up near boxes.' 61 | else: 62 | print(f"Error, Key: {key}, Value: {value}") 63 | system_error_feedback += f'Your assigned task for {key} is not in the doable action list; ' 64 | elif 'right' in value: 65 | if pg_dict_original[key][1]= track_row_num or float_numbers[0] <= 0: 99 | print(f"Error, Key: {key}, Value: {value}") 100 | system_error_feedback += f'Your assigned task for {key} is not in the doable action list; ' 101 | if pg_dict_original[key] == 'target': 102 | #print(float_numbers) 103 | pg_dict_original[key] = [float_numbers[0], 0, 0] 104 | 105 | else: 106 | print(f"Error, Key: {key}, Value: {value}") 107 | system_error_feedback += f'Your assigned task for {key} is not in the doable action list; ' 108 | 109 | position_list = [] 110 | for key, value in pg_dict_original.items(): 111 | if value == 'target': 112 | pass 113 | else: 114 | if [value[0], value[1]] in position_list: 115 | collision_check = True 116 | break 117 | else: 118 | position_list.append([value[0], value[1]]) 119 | 120 | return system_error_feedback, pg_dict_original, collision_check, box_position_dict 121 | 122 | 123 | def with_action_syntactic_check_func(pg_dict_input, response, user_prompt_list_input, response_total_list_input, 124 | model_name, dialogue_history_method, track_row_num, column_num, box_position_dict): 125 | #print('----------Syntactic Check----------') 126 | user_prompt_list = copy.deepcopy(user_prompt_list_input) 127 | response_total_list = copy.deepcopy(response_total_list_input) 128 | iteration_num = 0 129 | token_num_count_list_add = [] 130 | while iteration_num < 6: 131 | response_total_list.append(response) 132 | # try: 133 | original_response_dict = json.loads(response) 134 | pg_dict_original = copy.deepcopy(pg_dict_input) 135 | system_error_feedback, pg_dict_original_return, collision_check_return, box_position_dict_return = action_from_response( 136 | pg_dict_original, original_response_dict, track_row_num, column_num, box_position_dict) 137 | feedback = system_error_feedback 138 | 139 | if feedback != '': 140 | feedback += 'Please replan for all the agents again with the same ouput format:' 141 | print('----------Syntactic Check----------') 142 | print(f'Response original: {response}') 143 | print(f'Feedback: {feedback}') 144 | user_prompt_list.append(feedback) 145 | messages = message_construct_func(user_prompt_list, response_total_list, 146 | dialogue_history_method) # message construction 147 | print(f'Length of messages {len(messages)}') 148 | response, token_num_count = GPT_response(messages, model_name) 149 | token_num_count_list_add.append(token_num_count) 150 | print(f'Response new: {response}\n') 151 | if response == 'Out of tokens': 152 | return response, token_num_count_list_add 153 | iteration_num += 1 154 | else: 155 | return response, token_num_count_list_add 156 | return 'Syntactic Error', token_num_count_list_add 157 | 158 | def generate_unique_integers(lower_limit, upper_limit, m): 159 | if m > (upper_limit - lower_limit + 1): 160 | raise ValueError("Cannot generate more unique integers than the range allows.") 161 | unique_integers = set() 162 | while len(unique_integers) < m: 163 | unique_integers.add(random.randint(lower_limit, upper_limit)) 164 | return list(unique_integers) 165 | 166 | def env_create(track_row_num, column_num, box_occupy_ratio, agent_num): 167 | box_position_dict = {} 168 | # assign boxes to positions 169 | for i in range(track_row_num -1): 170 | for j in range(column_num): 171 | if random.random() < box_occupy_ratio: 172 | box_position_dict[f'{float(i+0.5)}_{float(j)}'] = 1 173 | else: 174 | box_position_dict[f'{float(i+0.5)}_{float(j)}'] = 0 175 | 176 | # assign agents into positions 177 | agent_position_state_dict = {} 178 | 179 | lower_limit = 0 180 | upper_limit = track_row_num * column_num - 1 181 | unique_integers = generate_unique_integers(lower_limit, upper_limit + 5, agent_num) 182 | 183 | for i in range(agent_num): 184 | if unique_integers[i] > upper_limit: 185 | agent_position_state_dict[f'agent{i}'] = 'target' 186 | else: 187 | agent_position_state_dict[f'agent{i}'] = (unique_integers[i] // column_num, unique_integers[i] % column_num, 0) 188 | return agent_position_state_dict, box_position_dict 189 | 190 | 191 | def create_env4(Saving_path, repeat_num = 10): 192 | if not os.path.exists(Saving_path): 193 | os.makedirs(Saving_path, exist_ok=True) 194 | else: 195 | shutil.rmtree(Saving_path) 196 | os.makedirs(Saving_path, exist_ok=True) 197 | 198 | for track_row_num, column_num, box_occupy_ratio, agent_num in [(3, 5, 0.5, 4)]: 199 | if not os.path.exists(Saving_path+f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}'): 200 | os.makedirs(Saving_path+f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}', exist_ok=True) 201 | else: 202 | shutil.rmtree(Saving_path+f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}') 203 | os.makedirs(Saving_path+f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}', exist_ok=True) 204 | 205 | for iteration_num in range(repeat_num): 206 | agent_position_state_dict, box_position_dict = env_create(track_row_num, column_num, box_occupy_ratio, agent_num) 207 | print('Initial agent state: ', agent_position_state_dict) 208 | print('Box_matrix: ', box_position_dict) 209 | os.makedirs(Saving_path+f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}/pg_state{iteration_num}', exist_ok=True) 210 | with open(Saving_path+f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}/pg_state{iteration_num}/pg_state{iteration_num}.json', 'w') as f: 211 | json.dump(agent_position_state_dict, f) 212 | with open(Saving_path+f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}/pg_state{iteration_num}/box_state{iteration_num}.json', 'w') as f: 213 | json.dump(box_position_dict, f) 214 | print('\n') 215 | 216 | Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here 217 | Saving_path = Code_dir_path + 'Env4_Warehouse' 218 | model_name = 'gpt-4-0613' #'gpt-4-0613', 'gpt-3.5-turbo-16k-0613' 219 | # The first time to create the environment, after that you can comment it 220 | create_env4(Saving_path, repeat_num = 10) -------------------------------------------------------------------------------- /env2_create.py: -------------------------------------------------------------------------------- 1 | # Box moving to target with collisions 2 | 3 | from prompt_env2 import * 4 | from LLM import * 5 | from sre_constants import error 6 | import random 7 | import os 8 | import json 9 | import re 10 | import copy 11 | import numpy as np 12 | import shutil 13 | import time 14 | 15 | def corner_position(pg_row_i, pg_column_j): 16 | corner_position_list = [(float(pg_row_i), float(pg_column_j)), (float(pg_row_i), float(pg_column_j + 1)), (float(pg_row_i + 1), float(pg_column_j)), 17 | (float(pg_row_i + 1), float(pg_column_j + 1))] 18 | return corner_position_list 19 | 20 | def judge_move_box2pos_box2target_func(key, value, pg_dict_original): 21 | if not (str(key[0] - 0.5) + '_' + str(key[1] - 0.5) in pg_dict_original.keys() \ 22 | and str(key[0] - 0.5) + '_' + str(key[1] + 0.5) in pg_dict_original.keys() \ 23 | and str(key[0] + 0.5) + '_' + str(key[1] - 0.5) in pg_dict_original.keys() \ 24 | and str(key[0] + 0.5) + '_' + str(key[1] + 0.5) in pg_dict_original.keys() \ 25 | and np.mod(key[0], 1) == 0.5 and np.mod(key[1], 1) == 0.5): 26 | return None, False, False, f'Agent[{float(key[0])}, {float(key[1])}] is not in the agent list. ' 27 | 28 | if value[0] in pg_dict_original[str(key[0] - 0.5) + '_' + str(key[1] - 0.5)]: 29 | box_location = (key[0] - 0.5, key[1] - 0.5) 30 | elif value[0] in pg_dict_original[str(key[0] - 0.5) + '_' + str(key[1] + 0.5)]: 31 | box_location = (key[0] - 0.5, key[1] + 0.5) 32 | elif value[0] in pg_dict_original[str(key[0] + 0.5) + '_' + str(key[1] - 0.5)]: 33 | box_location = (key[0] + 0.5, key[1] - 0.5) 34 | elif value[0] in pg_dict_original[str(key[0] + 0.5) + '_' + str(key[1] + 0.5)]: 35 | box_location = (key[0] + 0.5, key[1] + 0.5) 36 | else: 37 | return None, False, False, '' 38 | 39 | if type(value[1]) == tuple and (np.abs(key[0]-value[1][0])==0.5 and np.abs(key[1]-value[1][1])==0.5): 40 | return box_location, True, False, '' 41 | elif type(value[1]) == str and value[1] in pg_dict_original[str(key[0])+'_'+str(key[1])] and value[0][:4] == 'box_' and value[1][:7] == 'target_' and value[0][4:] == value[1][7:]: 42 | return box_location, False, True, '' 43 | else: 44 | return None, False, False, f'Your assigned task for {key[0]}_{key[1]} is not in the doable action list; ' 45 | 46 | 47 | def state_update_func(pg_row_num, pg_column_num, pg_dict): 48 | pg_dict_copy = copy.deepcopy(pg_dict) 49 | state_update_prompt = '' 50 | for i in range(pg_row_num): 51 | for j in range(pg_column_num): 52 | square_item_list = pg_dict_copy[str(i + 0.5) + '_' + str(j + 0.5)] 53 | state_update_prompt += f'Agent[{i+0.5}, {j+0.5}]: I am in square[{i+0.5}, {j+0.5}], I can observe {square_item_list}, I can do ' 54 | action_list = [] 55 | for corner_x, corner_y in corner_position(i, j): 56 | if len(pg_dict_copy[str(corner_x)+'_'+str(corner_y)]) == 1: 57 | box = pg_dict_copy[str(corner_x)+'_'+str(corner_y)][0] 58 | for surround_index in corner_position(i, j): 59 | if surround_index != (corner_x, corner_y): 60 | action_list.append(f'move({box}, position{surround_index})') 61 | if 'target'+box[3:] in pg_dict_copy[str(i+0.5)+'_'+str(j+0.5)]: 62 | action_list.append(f'move({box}, target{box[3:]})') 63 | state_update_prompt += f'{action_list}\n' 64 | return state_update_prompt 65 | 66 | def state_update_func_local_agent(pg_row_num, pg_column_num, pg_row_i, pg_column_j, pg_dict): 67 | pg_dict_copy = copy.deepcopy(pg_dict) 68 | state_update_prompt_local_agent = '' 69 | state_update_prompt_other_agent = '' 70 | 71 | for i in range(pg_row_num): 72 | for j in range(pg_column_num): 73 | if not (i == pg_row_i and pg_column_j == j): 74 | square_item_list = pg_dict_copy[str(i + 0.5) + '_' + str(j + 0.5)] 75 | state_update_prompt_other_agent += f'Agent[{i+0.5}, {j+0.5}]: I am in square[{i+0.5}, {j+0.5}], I can observe {square_item_list}, I can do ' 76 | action_list = [] 77 | for corner_x, corner_y in corner_position(i, j): 78 | if len(pg_dict_copy[str(corner_x) + '_' + str(corner_y)]) == 1: 79 | box = pg_dict_copy[str(corner_x) + '_' + str(corner_y)][0] 80 | for surround_index in corner_position(i, j): 81 | if surround_index != (corner_x, corner_y): 82 | action_list.append(f'move({box}, position{surround_index})') 83 | if 'target' + box[3:] in pg_dict_copy[str(i + 0.5) + '_' + str(j + 0.5)]: 84 | action_list.append(f'move({box}, target{box[3:]})') 85 | state_update_prompt_other_agent += f'{action_list}\n' 86 | 87 | state_update_prompt_local_agent += f'Agent[{pg_row_i+0.5}, {pg_column_j+0.5}]: in square[{pg_row_i+0.5}, {pg_column_j+0.5}], can observe {square_item_list}, can do ' 88 | action_list = [] 89 | for corner_x, corner_y in corner_position(pg_row_i, pg_column_j): 90 | if len(pg_dict_copy[str(corner_x) + '_' + str(corner_y)]) == 1: 91 | box = pg_dict_copy[str(corner_x) + '_' + str(corner_y)][0] 92 | for surround_index in corner_position(pg_row_i, pg_column_j): 93 | if surround_index != (corner_x, corner_y): 94 | action_list.append(f'move({box}, position{surround_index})') 95 | if 'target' + box[3:] in pg_dict_copy[str(i + 0.5) + '_' + str(j + 0.5)]: 96 | action_list.append(f'move({box}, target{box[3:]})') 97 | state_update_prompt_local_agent += f'{action_list}\n' 98 | return state_update_prompt_local_agent, state_update_prompt_other_agent 99 | 100 | def with_action_syntactic_check_func(pg_dict_input, response, user_prompt_list_input, response_total_list_input, model_name, dialogue_history_method): 101 | user_prompt_list = copy.deepcopy(user_prompt_list_input) 102 | response_total_list = copy.deepcopy(response_total_list_input) 103 | iteration_num = 0 104 | token_num_count_list_add = [] 105 | while iteration_num < 6: 106 | response_total_list.append(response) 107 | try: 108 | original_response_dict = json.loads(response) 109 | 110 | pg_dict_original = copy.deepcopy(pg_dict_input) 111 | transformed_dict = {} 112 | for key, value in original_response_dict.items(): 113 | coordinates = tuple(map(float, re.findall(r"\d+\.?\d*", key))) 114 | 115 | # match the item and location in the value 116 | match = re.match(r"move\((.*?),\s(.*?)\)", value) 117 | if match: 118 | item, location = match.groups() 119 | 120 | if "position" in location: 121 | location = tuple(map(float, re.findall(r"\d+\.?\d*", location))) 122 | 123 | transformed_dict[coordinates] = [item, location] 124 | 125 | feedback = '' 126 | for key, value in transformed_dict.items(): 127 | # print(f"Key: {key}, Value1: {value[0]}, Value2: {value[1]}") 128 | box_location, judge_move_box2pos, judge_move_box2target, feedback = judge_move_box2pos_box2target_func(key, value, 129 | pg_dict_original) 130 | if judge_move_box2pos == True or judge_move_box2target == True: 131 | pass 132 | except: 133 | feedback = 'Your assigned plan is not in the correct json format as before. If your answer is empty dict, please check whether you miss to move box into the same colored target like move(box_blue, target_blue)' 134 | 135 | if feedback != '': 136 | feedback += 'Please replan for all the agents again with the same ouput format:' 137 | print('----------Syntactic Check----------') 138 | print(f'Response original: {response}') 139 | print(f'Feedback: {feedback}') 140 | user_prompt_list.append(feedback) 141 | messages = message_construct_func(user_prompt_list, response_total_list, dialogue_history_method) # message construction 142 | print(f'Length of messages {len(messages)}') 143 | response, token_num_count = GPT_response(messages, model_name) 144 | token_num_count_list_add.append(token_num_count) 145 | print(f'Response new: {response}\n') 146 | if response == 'Out of tokens': 147 | return response, token_num_count_list_add 148 | iteration_num += 1 149 | else: 150 | return response, token_num_count_list_add 151 | return 'Syntactic Error', token_num_count_list_add 152 | 153 | def action_from_response(pg_dict_input, original_response_dict): 154 | collision_check = False 155 | system_error_feedback = '' 156 | pg_dict_original = copy.deepcopy(pg_dict_input) 157 | transformed_dict = {} 158 | for key, value in original_response_dict.items(): 159 | coordinates = tuple(map(float, re.findall(r"\d+\.?\d*", key))) 160 | 161 | # match the item and location in the value 162 | match = re.match(r"move\((.*?),\s(.*?)\)", value) 163 | if match: 164 | item, location = match.groups() 165 | if "position" in location: 166 | location = tuple(map(float, re.findall(r"\d+\.?\d*", location))) 167 | transformed_dict[coordinates] = [item, location] 168 | 169 | for key, value in transformed_dict.items(): 170 | print(f"Key: {key}, Value1: {value[0]}, Value2: {value[1]}") 171 | box_location, judge_move_box2pos, judge_move_box2target, feedback = judge_move_box2pos_box2target_func(key, value, pg_dict_original) 172 | if judge_move_box2pos == True: 173 | pg_dict_original[str(box_location[0])+'_'+str(box_location[1])].remove(value[0]) 174 | pg_dict_original[str(value[1][0])+'_'+str(value[1][1])].append(value[0]) 175 | elif judge_move_box2target == True: 176 | pg_dict_original[str(box_location[0])+'_'+str(box_location[1])].remove(value[0]) 177 | pg_dict_original[str(key[0])+'_'+str(key[1])].remove(value[1]) 178 | else: 179 | #print(f"Error, Iteration Num: {iteration_num}, Key: {key}, Value1: {value[0]}, Value2: {value[1]}") 180 | system_error_feedback += f'Your assigned task for {key[0]}_{key[1]} is not in the doable action list; ' 181 | for key, value in transformed_dict.items(): 182 | box_location, judge_move_box2pos, judge_move_box2target, feedback = judge_move_box2pos_box2target_func(key, value, 183 | pg_dict_original) 184 | if judge_move_box2pos == True and len(pg_dict_original[str(value[1][0]) + '_' + str(value[1][1])]) > 1: 185 | collision_check = True 186 | break 187 | 188 | return system_error_feedback, pg_dict_original, collision_check 189 | 190 | def env_create(pg_row_num = 5, pg_column_num = 5, box_num_low_bound = 2, box_num_upper_bound = 2, color_list = ['blue', 'red', 'green', 'purple', 'orange']): 191 | # pg_dict records the items in each square over steps, here in the initial setting, we randomly assign items into each square 192 | pg_dict = {} 193 | for i in range(pg_row_num): 194 | for j in range(pg_column_num): 195 | pg_dict[str(i+0.5)+'_'+str(j+0.5)] = [] 196 | for i in range(pg_row_num+1): 197 | for j in range(pg_column_num+1): 198 | pg_dict[str(float(i))+'_'+str(float(j))] = [] 199 | 200 | for color in color_list: 201 | box_num = random.randint(box_num_low_bound, box_num_upper_bound) 202 | for _ in range(box_num): 203 | N_box = random.randint(0, pg_row_num*pg_column_num - 1) 204 | a_box = N_box // pg_column_num 205 | b_box = N_box % pg_column_num 206 | N_target = random.randint(0, pg_row_num*pg_column_num - 1) 207 | a_target = N_target // pg_column_num 208 | b_target = N_target % pg_column_num 209 | corner_list = [(1.0, 0.0), (0.0, 0.0), (0.0, 1.0), (1.0, 1.0)] 210 | random.shuffle(corner_list) 211 | for random_x, random_y in corner_list: 212 | if len(pg_dict[str(float(a_box) + random_x)+'_'+str(float(b_box) + random_y)]) == 0: 213 | pg_dict[str(float(a_box) + random_x) + '_' + str(float(b_box) + random_y)].append('box_' + color) 214 | pg_dict[str(a_target+0.5)+'_'+str(b_target+0.5)].append('target_' + color) 215 | break 216 | print(pg_dict) 217 | print('\n') 218 | return pg_dict 219 | 220 | def create_env2(Saving_path, repeat_num = 10): 221 | if not os.path.exists(Saving_path): 222 | os.makedirs(Saving_path, exist_ok=True) 223 | else: 224 | shutil.rmtree(Saving_path) 225 | os.makedirs(Saving_path, exist_ok=True) 226 | 227 | for i ,j in [(2,2), (2,4), (4,4), (4,8)]: 228 | if not os.path.exists(Saving_path+f'/env_pg_state_{i}_{j}'): 229 | os.makedirs(Saving_path+f'/env_pg_state_{i}_{j}', exist_ok=True) 230 | else: 231 | shutil.rmtree(Saving_path+f'/env_pg_state_{i}_{j}') 232 | os.makedirs(Saving_path+f'/env_pg_state_{i}_{j}', exist_ok=True) 233 | 234 | for iteration_num in range(repeat_num): 235 | # Define the total row and column numbers of the whole playground, and the item number of each colored target and box 236 | pg_row_num = i; pg_column_num = j; box_num_low_bound = 1; box_num_upper_bound = 1 237 | # Define the used colors 238 | color_list = ['blue', 'red', 'green', 'purple', 'orange'] 239 | pg_dict = env_create(pg_row_num, pg_column_num, box_num_low_bound, box_num_upper_bound, color_list) 240 | os.makedirs(Saving_path+f'/env_pg_state_{i}_{j}/pg_state{iteration_num}', exist_ok=True) 241 | with open(Saving_path+f'/env_pg_state_{i}_{j}/pg_state{iteration_num}/pg_state{iteration_num}.json', 'w') as f: 242 | json.dump(pg_dict, f) 243 | 244 | Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here 245 | Saving_path = Code_dir_path + 'Env2_BoxNet2' 246 | # The first time to create the environment, after that you can comment it 247 | create_env2(Saving_path, repeat_num = 10) -------------------------------------------------------------------------------- /env1-box-arrange.py: -------------------------------------------------------------------------------- 1 | from LLM import * 2 | from prompt_env1 import * 3 | from env1_create import * 4 | from sre_constants import error 5 | import random 6 | import os 7 | import json 8 | import re 9 | import copy 10 | import numpy as np 11 | import shutil 12 | import time 13 | 14 | # cen_decen_framework = 'DMAS', 'HMAS-1', 'CMAS', 'HMAS-2' 15 | # dialogue_history_method = '_w_all_dialogue_history', '_wo_any_dialogue_history', '_w_only_state_action_history' 16 | def run_exp(Saving_path, pg_row_num, pg_column_num, iteration_num, query_time_limit, dialogue_history_method = '_w_all_dialogue_history', cen_decen_framework = 'CMAS', model_name = 'gpt-3'): 17 | 18 | Saving_path_result = Saving_path+f'/env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}_{model_name}' 19 | 20 | # specify the path to your dir for saving the results 21 | os.makedirs(Saving_path_result, exist_ok=True) 22 | os.makedirs(Saving_path_result+f'/prompt', exist_ok=True) 23 | os.makedirs(Saving_path_result+f'/response', exist_ok=True) 24 | os.makedirs(Saving_path_result+f'/pg_state', exist_ok=True) 25 | os.makedirs(Saving_path_result + f'/dialogue_history', exist_ok=True) 26 | 27 | with open(Saving_path+f'/env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/pg_state{iteration_num}.json', 'r') as file: 28 | pg_dict = json.load(file) 29 | 30 | user_prompt_list = [] # The record list of all the input prompts 31 | response_total_list = [] # The record list of all the responses 32 | pg_state_list = [] # The record list of apg states in varied steps 33 | dialogue_history_list = [] 34 | token_num_count_list = [] 35 | pg_state_list.append(pg_dict) 36 | with open(Saving_path_result+'/pg_state' + '/pg_state'+str(1)+'.json', 'w') as f: 37 | json.dump(pg_dict, f) 38 | 39 | ### Start the Game! Query LLM for response 40 | print(f'query_time_limit: {query_time_limit}') 41 | for index_query_times in range(query_time_limit): # The upper limit of calling LLMs 42 | #print(index_query_times) 43 | print(pg_dict) 44 | state_update_prompt = state_update_func(pg_row_num, pg_column_num, pg_dict) 45 | if cen_decen_framework in ('DMAS'): 46 | print('--------DMAS method starts--------') 47 | match = None 48 | count_round = 0 49 | dialogue_history = '' 50 | response = '{}' 51 | while not match and count_round <= 3: 52 | count_round += 1 53 | for local_agent_row_i in range(pg_row_num): 54 | for local_agent_column_j in range(pg_column_num): 55 | #if f'Agent[{local_agent_row_i + 0.5}, {local_agent_column_j + 0.5}]' in pg_dict: 56 | state_update_prompt_local_agent, state_update_prompt_other_agent = state_update_func_local_agent(pg_row_num, 57 | pg_column_num, 58 | local_agent_row_i, 59 | local_agent_column_j, 60 | pg_dict) 61 | user_prompt_1 = input_prompt_local_agent_DMAS_dialogue_func(state_update_prompt_local_agent, 62 | state_update_prompt_other_agent, 63 | dialogue_history, response_total_list, 64 | pg_state_list, dialogue_history_list, 65 | dialogue_history_method) 66 | user_prompt_list.append(user_prompt_1) 67 | print(f'User prompt: {user_prompt_1}\n\n') 68 | with open(Saving_path_result + '/prompt' + '/user_prompt_' + str(index_query_times + 1), 'w') as f: 69 | f.write(user_prompt_list[-1]) 70 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history') 71 | initial_response, token_num_count = GPT_response(messages,model_name) # 'gpt-4' or 'gpt-3.5-turbo-0301' or 'gpt-4-32k' or 'gpt-3' or 'gpt-4-0613' 72 | token_num_count_list.append(token_num_count) 73 | 74 | dialogue_history += f'[Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]: {initial_response}]\n\n' 75 | #print(dialogue_history) 76 | if re.search(r'EXECUTE', initial_response): 77 | # Search for the pattern that starts with { and ends with } 78 | print('EXECUTE!') 79 | match = re.search(r'{.*}', initial_response, re.DOTALL) 80 | if match: 81 | response = match.group() 82 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response, 83 | [user_prompt_list[-1]], 84 | [], 85 | model_name, 86 | '_w_all_dialogue_history', 87 | cen_decen_framework) 88 | token_num_count_list = token_num_count_list + token_num_count_list_add 89 | print(f'response: {response}') 90 | #print(f'User prompt: {user_prompt_1}\n\n') 91 | break 92 | break 93 | dialogue_history_list.append(dialogue_history) 94 | else: 95 | if cen_decen_framework in ('CMAS', 'HMAS-1', 'HMAS-1-fast', 'HMAS-2'): 96 | user_prompt_1 = input_prompt_1_func_total(state_update_prompt, response_total_list, 97 | pg_state_list, dialogue_history_list, 98 | dialogue_history_method, cen_decen_framework) 99 | user_prompt_list.append(user_prompt_1) 100 | 101 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history') # message construction 102 | 103 | with open(Saving_path_result+'/prompt' + '/user_prompt_'+str(index_query_times+1), 'w') as f: 104 | f.write(user_prompt_list[-1]) 105 | initial_response, token_num_count = GPT_response(messages, model_name) # 'gpt-4' or 'gpt-3.5-turbo-0301' or 'gpt-4-32k' or 'gpt-3' or 'gpt-4-0613' 106 | print('Initial response: ', initial_response) 107 | token_num_count_list.append(token_num_count) 108 | match = re.search(r'{.*}', initial_response, re.DOTALL) 109 | if match: 110 | response = match.group() 111 | if response[0] == '{' and response[-1] == '}': 112 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response, [user_prompt_1], [], model_name, '_w_all_dialogue_history', cen_decen_framework) 113 | token_num_count_list = token_num_count_list + token_num_count_list_add 114 | print(f'response: {response}') 115 | else: 116 | raise ValueError(f'Response format error: {response}') 117 | 118 | if response == 'Out of tokens': 119 | success_failure = 'failure over token length limit' 120 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 121 | elif response == 'Syntactic Error': 122 | success_failure = 'Syntactic Error' 123 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 124 | 125 | # Local agent response for checking the feasibility of actions 126 | if cen_decen_framework == 'HMAS-2': 127 | print('--------HMAS-2 method starts--------') 128 | dialogue_history = f'Central Planner: {response}\n' 129 | #print(f'Original plan response: {response}') 130 | prompt_list_dir = {}; response_list_dir = {}; local_agent_response_list_dir = {} 131 | local_agent_response_list_dir['feedback1'] = '' 132 | agent_dict = json.loads(response) 133 | for local_agent_row_i in range(pg_row_num): 134 | for local_agent_column_j in range(pg_column_num): 135 | if f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]' in agent_dict: 136 | prompt_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'] = [] 137 | response_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'] = [] 138 | state_update_prompt_local_agent, state_update_prompt_other_agent = state_update_func_local_agent(pg_row_num, pg_column_num, local_agent_row_i, local_agent_column_j, pg_dict) 139 | 140 | local_reprompt = input_prompt_local_agent_HMAS2_dialogue_func(state_update_prompt_local_agent, state_update_prompt_other_agent, response, response_total_list, pg_state_list, dialogue_history_list, dialogue_history_method) 141 | prompt_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'].append(local_reprompt) 142 | messages = message_construct_func(prompt_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'], response_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'], '_w_all_dialogue_history') 143 | response_local_agent, token_num_count = GPT_response(messages, model_name) 144 | token_num_count_list.append(token_num_count) 145 | #print(f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}] response: {response_local_agent}') 146 | if response_local_agent != 'I Agree': 147 | local_agent_response_list_dir['feedback1'] += f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]: {response_local_agent}\n' # collect the response from all the local agents 148 | dialogue_history += f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]: {response_local_agent}\n' 149 | 150 | if local_agent_response_list_dir['feedback1'] != '': 151 | local_agent_response_list_dir['feedback1'] += '\nThis is the feedback from local agents. If you find some errors in your previous plan, try to modify it. Otherwise, output the same plan as before. The output should have the same json format {Agent[0.5, 0.5]:move(box_blue, square[0.5, 1.5]), Agent[1.5, 0.5]:move...}, as above. Do not explain, just directly output json directory. Your response:' 152 | messages = message_construct_func([user_prompt_list[-1], local_agent_response_list_dir['feedback1']], [response], '_w_all_dialogue_history') # message construction 153 | response_central_again, token_num_count = GPT_response(messages, model_name) 154 | token_num_count_list.append(token_num_count) 155 | match = re.search(r'{.*}', response_central_again, re.DOTALL) 156 | if match: 157 | response = match.group() 158 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response_central_again, [user_prompt_list[-1], local_agent_response_list_dir['feedback1']], [response], model_name, 159 | '_w_all_dialogue_history', cen_decen_framework) 160 | token_num_count_list = token_num_count_list + token_num_count_list_add 161 | print(f'response: {response}') 162 | #print(messages[2]) 163 | #print(messages[3]) 164 | print(f'Modified plan response:\n {response}') 165 | else: 166 | print(f'Plan:\n {response}') 167 | pass 168 | 169 | dialogue_history_list.append(dialogue_history) 170 | 171 | elif cen_decen_framework == 'HMAS-1' or cen_decen_framework == 'HMAS-1-fast': 172 | print('--------HMAS-1 method starts--------') 173 | count_round = 0 174 | dialogue_history = f'Central Planner: {response}\n' 175 | match = None 176 | agent_dict = json.loads(response) 177 | while not match and count_round <= 3: 178 | count_round += 1 179 | for local_agent_row_i in range(pg_row_num): 180 | for local_agent_column_j in range(pg_column_num): 181 | if f'Agent[{local_agent_row_i + 0.5}, {local_agent_column_j + 0.5}]' in agent_dict: 182 | state_update_prompt_local_agent, state_update_prompt_other_agent = state_update_func_local_agent( 183 | pg_row_num, 184 | pg_column_num, 185 | local_agent_row_i, 186 | local_agent_column_j, 187 | pg_dict) 188 | if count_round >= 2 and cen_decen_framework == 'HMAS-1-fast': 189 | user_prompt_1 = input_prompt_local_agent_HMAS1_dialogue_fast_plan_func(state_update_prompt_local_agent, 190 | state_update_prompt_other_agent, 191 | dialogue_history, response_total_list, pg_state_list, 192 | dialogue_history_list, dialogue_history_method, 193 | initial_plan=response) 194 | else: 195 | user_prompt_1 = input_prompt_local_agent_HMAS1_dialogue_func(state_update_prompt_local_agent, 196 | state_update_prompt_other_agent, dialogue_history, 197 | response_total_list, pg_state_list, 198 | dialogue_history_list, dialogue_history_method, 199 | initial_plan='') 200 | 201 | user_prompt_list.append(user_prompt_1) 202 | with open(Saving_path_result + '/prompt' + '/user_prompt_' + str(index_query_times + 1), 'w') as f: 203 | f.write(user_prompt_list[-1]) 204 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history') 205 | initial_response, token_num_count = GPT_response(messages,model_name) # 'gpt-4' or 'gpt-3.5-turbo-0301' or 'gpt-4-32k' or 'gpt-3' or 'gpt-4-0613' 206 | token_num_count_list.append(token_num_count) 207 | 208 | #print('-----------prompt------------\n' + initial_response) 209 | dialogue_history += f'Agent[{local_agent_row_i + 0.5}, {local_agent_column_j + 0.5}]: {initial_response}\n' 210 | #print(dialogue_history) 211 | match = re.search(r'{.*}', initial_response, re.DOTALL) 212 | if match and re.search(r'EXECUTE', initial_response): 213 | response = match.group() 214 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response, 215 | [user_prompt_list[-1]], 216 | [], 217 | model_name, 218 | '_w_all_dialogue_history', 219 | cen_decen_framework) 220 | token_num_count_list = token_num_count_list + token_num_count_list_add 221 | print(f'response: {response}') 222 | break 223 | break 224 | dialogue_history_list.append(dialogue_history) 225 | 226 | response_total_list.append(response) 227 | if response == 'Out of tokens': 228 | success_failure = 'failure over token length limit' 229 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 230 | elif response == 'Syntactic Error': 231 | success_failure = 'Syntactic Error' 232 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 233 | 234 | data = json.loads(response) 235 | 236 | with open(Saving_path_result+'/response' + '/response'+str(index_query_times+1)+'.json', 'w') as f: 237 | json.dump(data, f) 238 | original_response_dict = json.loads(response_total_list[index_query_times]) 239 | print(pg_dict) 240 | if cen_decen_framework in ('DMAS', 'HMAS-1', 'HMAS-1-fast'): 241 | with open(Saving_path_result+'/dialogue_history' + '/dialogue_history'+str(index_query_times)+'.txt', 'w') as f: 242 | f.write(dialogue_history_list[index_query_times]) 243 | try: 244 | system_error_feedback, pg_dict_returned = action_from_response(pg_dict, original_response_dict) 245 | if system_error_feedback != '': 246 | print(system_error_feedback) 247 | pg_dict = pg_dict_returned 248 | 249 | except: 250 | success_failure = 'Hallucination of wrong plan' 251 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 252 | pg_state_list.append(pg_dict) 253 | with open(Saving_path_result+'/pg_state' + '/pg_state'+str(index_query_times+2)+'.json', 'w') as f: 254 | json.dump(pg_dict, f) 255 | 256 | # Check whether the task has been completed 257 | count = 0 258 | for key, value in pg_dict.items(): 259 | count += len(value) 260 | if count == 0: 261 | break 262 | 263 | if index_query_times < query_time_limit - 1: 264 | success_failure = 'success' 265 | else: 266 | success_failure = 'failure over query time limit' 267 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 268 | 269 | 270 | Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here 271 | Saving_path = Code_dir_path + 'Env1_BoxNet1' 272 | model_name = 'gpt-4-0613' #'gpt-4-0613', 'gpt-3.5-turbo-16k-0613' 273 | print(f'-------------------Model name: {model_name}-------------------') 274 | for pg_row_num, pg_column_num in [(2,2), (2,4), (4,4), (4,8)]: 275 | if pg_row_num == 4 and pg_column_num == 8: 276 | query_time_limit = 40 277 | else: 278 | query_time_limit = 30 279 | for iteration_num in range(10): 280 | print('-------###-------###-------###-------') 281 | print(f'Row num is: {pg_row_num}, Column num is: {pg_column_num}, Iteration num is: {iteration_num}\n\n') 282 | 283 | user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result = run_exp(Saving_path, pg_row_num, pg_column_num, iteration_num, query_time_limit, dialogue_history_method='_w_only_state_action_history', 284 | cen_decen_framework='HMAS-2', model_name = model_name) 285 | with open(Saving_path_result + '/token_num_count.txt', 'w') as f: 286 | for token_num_num_count in token_num_count_list: 287 | f.write(str(token_num_num_count) + '\n') 288 | 289 | with open(Saving_path_result + '/success_failure.txt', 'w') as f: 290 | f.write(success_failure) 291 | 292 | with open(Saving_path_result + '/env_action_times.txt', 'w') as f: 293 | f.write(f'{index_query_times+1}') 294 | print(success_failure) 295 | print(f'Iteration number: {index_query_times+1}') -------------------------------------------------------------------------------- /env2-box-arrange.py: -------------------------------------------------------------------------------- 1 | from LLM import * 2 | from prompt_env2 import * 3 | from env2_create import * 4 | from sre_constants import error 5 | import random 6 | import os 7 | import json 8 | import re 9 | import copy 10 | import numpy as np 11 | import shutil 12 | import time 13 | 14 | # cen_decen_framework = 'DMAS', 'HMAS-1', 'CMAS', 'HMAS-2' 15 | # dialogue_history_method = '_w_all_dialogue_history', '_wo_any_dialogue_history', '_w_only_state_action_history' 16 | def run_exp(Saving_path, pg_row_num, pg_column_num, iteration_num, query_time_limit, dialogue_history_method = '_w_all_dialogue_history', cen_decen_framework = 'CMAS'): 17 | 18 | Saving_path_result = Saving_path+f'/env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}' 19 | 20 | # specify the path to your dir for saving the results 21 | os.makedirs(Saving_path_result, exist_ok=True) 22 | os.makedirs(Saving_path_result+f'/prompt', exist_ok=True) 23 | os.makedirs(Saving_path_result+f'/response', exist_ok=True) 24 | os.makedirs(Saving_path_result+f'/pg_state', exist_ok=True) 25 | os.makedirs(Saving_path_result + f'/dialogue_history', exist_ok=True) 26 | 27 | with open(Saving_path+f'/env_pg_state_{pg_row_num}_{pg_column_num}/pg_state{iteration_num}/pg_state{iteration_num}.json', 'r') as file: 28 | pg_dict = json.load(file) 29 | 30 | user_prompt_list = [] # The record list of all the input prompts 31 | response_total_list = [] # The record list of all the responses 32 | pg_state_list = [] # The record list of apg states in varied steps 33 | dialogue_history_list = [] 34 | token_num_count_list = [] 35 | pg_state_list.append(pg_dict) 36 | with open(Saving_path_result+'/pg_state' + '/pg_state'+str(1)+'.json', 'w') as f: 37 | json.dump(pg_dict, f) 38 | 39 | ### Start the Game! Query LLM for response 40 | print(f'query_time_limit: {query_time_limit}') 41 | for index_query_times in range(query_time_limit): # The upper limit of calling LLMs 42 | #print(index_query_times) 43 | #print(pg_dict) 44 | state_update_prompt = state_update_func(pg_row_num, pg_column_num, pg_dict) 45 | if cen_decen_framework in ('DMAS'): 46 | print('--------DMAS method starts--------') 47 | match = None 48 | count_round = 0 49 | dialogue_history = '' 50 | response = '{}' 51 | while not match and count_round <= 3: 52 | count_round += 1 53 | for local_agent_row_i in range(pg_row_num): 54 | for local_agent_column_j in range(pg_column_num): 55 | #if f'Agent[{local_agent_row_i + 0.5}, {local_agent_column_j + 0.5}]' in pg_dict: 56 | state_update_prompt_local_agent, state_update_prompt_other_agent = state_update_func_local_agent(pg_row_num, 57 | pg_column_num, 58 | local_agent_row_i, 59 | local_agent_column_j, 60 | pg_dict) 61 | user_prompt_1 = input_prompt_local_agent_DMAS_dialogue_func(state_update_prompt_local_agent, 62 | state_update_prompt_other_agent, 63 | dialogue_history, response_total_list, 64 | pg_state_list, dialogue_history_list, 65 | dialogue_history_method) 66 | user_prompt_list.append(user_prompt_1) 67 | #print(f'User prompt: {user_prompt_1}\n\n') 68 | with open(Saving_path_result + '/prompt' + '/user_prompt_' + str(index_query_times + 1), 'w') as f: 69 | f.write(user_prompt_list[-1]) 70 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history') 71 | initial_response, token_num_count = GPT_response(messages, 72 | model_name='gpt-4-0613') # 'gpt-4' or 'gpt-3.5-turbo-0301' or 'gpt-4-32k' or 'gpt-3' or 'gpt-4-0613' 73 | token_num_count_list.append(token_num_count) 74 | 75 | dialogue_history += f'[Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]: {initial_response}]\n\n' 76 | #print(dialogue_history) 77 | if re.search(r'EXECUTE', initial_response): 78 | # Search for the pattern that starts with { and ends with } 79 | print('EXECUTE!') 80 | match = re.search(r'{.*}', initial_response, re.DOTALL) 81 | if match: 82 | response = match.group() 83 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response, 84 | [user_prompt_list[-1]], 85 | [], 86 | 'gpt-4-0613', 87 | '_w_all_dialogue_history') 88 | token_num_count_list = token_num_count_list + token_num_count_list_add 89 | print(f'response: {response}') 90 | #print(f'User prompt: {user_prompt_1}\n\n') 91 | break 92 | break 93 | dialogue_history_list.append(dialogue_history) 94 | else: 95 | if cen_decen_framework in ('CMAS', 'HMAS-1', 'HMAS-1-fast', 'HMAS-2'): 96 | user_prompt_1 = input_prompt_1_func_total(state_update_prompt, response_total_list, 97 | pg_state_list, dialogue_history_list, 98 | dialogue_history_method, cen_decen_framework) 99 | user_prompt_list.append(user_prompt_1) 100 | #print('user_prompt_1: ', user_prompt_1) 101 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history') # message construction 102 | 103 | with open(Saving_path_result+'/prompt' + '/user_prompt_'+str(index_query_times+1), 'w') as f: 104 | f.write(user_prompt_list[-1]) 105 | initial_response, token_num_count = GPT_response(messages, model_name = 'gpt-4-0613') # 'gpt-4' or 'gpt-3.5-turbo-0301' or 'gpt-4-32k' or 'gpt-3' or 'gpt-4-0613' 106 | print('Initial response: ', initial_response) 107 | token_num_count_list.append(token_num_count) 108 | match = re.search(r'{.*}', initial_response, re.DOTALL) 109 | if match: 110 | response = match.group() 111 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response, [user_prompt_1], [], 'gpt-4-0613', '_w_all_dialogue_history') 112 | token_num_count_list = token_num_count_list + token_num_count_list_add 113 | print(f'response: {response}') 114 | 115 | if response == 'Out of tokens': 116 | success_failure = 'failure over token length limit' 117 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 118 | elif response == 'Syntactic Error': 119 | success_failure = 'Syntactic Error' 120 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 121 | 122 | # Local agent response for checking the feasibility of actions 123 | if cen_decen_framework == 'HMAS-2': 124 | print('--------HMAS-2 method starts--------') 125 | break_mark = False; count_round_HMAS2 = 0 126 | 127 | while break_mark == False and count_round_HMAS2 < 3: 128 | count_round_HMAS2 += 1 129 | dialogue_history = f'Central Planner: {response}\n' 130 | prompt_list_dir = {}; response_list_dir = {}; local_agent_response_list_dir = {} 131 | local_agent_response_list_dir['feedback1'] = '' 132 | agent_dict = json.loads(response) 133 | for local_agent_row_i in range(pg_row_num): 134 | for local_agent_column_j in range(pg_column_num): 135 | if f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]' in agent_dict: 136 | prompt_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'] = [] 137 | response_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'] = [] 138 | state_update_prompt_local_agent, state_update_prompt_other_agent = state_update_func_local_agent(pg_row_num, pg_column_num, local_agent_row_i, local_agent_column_j, pg_dict) 139 | 140 | local_reprompt = input_prompt_local_agent_HMAS2_dialogue_func(state_update_prompt_local_agent, state_update_prompt_other_agent, response, response_total_list, pg_state_list, dialogue_history_list, dialogue_history_method) 141 | #print(local_reprompt) 142 | prompt_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'].append(local_reprompt) 143 | messages = message_construct_func(prompt_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'], response_list_dir[f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]'], '_w_all_dialogue_history') 144 | response_local_agent, token_num_count = GPT_response(messages, model_name = 'gpt-4-0613') 145 | token_num_count_list.append(token_num_count) 146 | print(f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}] response: {response_local_agent}') 147 | if not ('I Agree' in response_local_agent or 'I agree' in response_local_agent): 148 | local_agent_response_list_dir['feedback1'] += f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]: {response_local_agent}\n' # collect the response from all the local agents 149 | dialogue_history += f'Agent[{local_agent_row_i+0.5}, {local_agent_column_j+0.5}]: {response_local_agent}\n' 150 | 151 | if local_agent_response_list_dir['feedback1'] != '': 152 | local_agent_response_list_dir['feedback1'] += '\nThis is the feedback from local agents. If you find some errors in your previous plan, try to modify it. Otherwise, output the same plan as before. The output should have the same json format {Agent[0.5, 0.5]:move(box_blue, position[0.0, 1.0]), Agent[1.5, 0.5]:move...}, as above. Do not explain, just directly output json directory. Your response:' 153 | messages = message_construct_func([user_prompt_list[-1], local_agent_response_list_dir['feedback1']], [response], '_w_all_dialogue_history') # message construction 154 | response_central_again, token_num_count = GPT_response(messages, model_name = 'gpt-4-0613') 155 | token_num_count_list.append(token_num_count) 156 | match = re.search(r'{.*}', response_central_again, re.DOTALL) 157 | if match: 158 | response = match.group() 159 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response_central_again, [user_prompt_list[-1], local_agent_response_list_dir['feedback1']], [response], 'gpt-4-0613', 160 | '_w_all_dialogue_history') 161 | token_num_count_list = token_num_count_list + token_num_count_list_add 162 | print(f'Modified plan response: {response}') 163 | else: 164 | break_mark = True 165 | pass 166 | 167 | dialogue_history_list.append(dialogue_history) 168 | 169 | elif cen_decen_framework == 'HMAS-1' or cen_decen_framework == 'HMAS-1-fast': 170 | print('--------HMAS-1 method starts--------') 171 | count_round = 0 172 | dialogue_history = f'Central Planner: {response}\n' 173 | match = None 174 | agent_dict = json.loads(response) 175 | while not match and count_round <= 3: 176 | count_round += 1 177 | for local_agent_row_i in range(pg_row_num): 178 | for local_agent_column_j in range(pg_column_num): 179 | if f'Agent[{local_agent_row_i + 0.5}, {local_agent_column_j + 0.5}]' in agent_dict: 180 | state_update_prompt_local_agent, state_update_prompt_other_agent = state_update_func_local_agent( 181 | pg_row_num, 182 | pg_column_num, 183 | local_agent_row_i, 184 | local_agent_column_j, 185 | pg_dict) 186 | if count_round >= 2 and cen_decen_framework == 'HMAS-1-fast': 187 | user_prompt_1 = input_prompt_local_agent_HMAS1_dialogue_fast_plan_func(state_update_prompt_local_agent, 188 | state_update_prompt_other_agent, 189 | dialogue_history, response_total_list, pg_state_list, 190 | dialogue_history_list, dialogue_history_method, 191 | initial_plan=response) 192 | else: 193 | user_prompt_1 = input_prompt_local_agent_HMAS1_dialogue_func(state_update_prompt_local_agent, 194 | state_update_prompt_other_agent, dialogue_history, 195 | response_total_list, pg_state_list, 196 | dialogue_history_list, dialogue_history_method, 197 | initial_plan='') 198 | 199 | user_prompt_list.append(user_prompt_1) 200 | with open(Saving_path_result + '/prompt' + '/user_prompt_' + str(index_query_times + 1), 'w') as f: 201 | f.write(user_prompt_list[-1]) 202 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history') 203 | initial_response, token_num_count = GPT_response(messages, 204 | model_name='gpt-4-0613') # 'gpt-4' or 'gpt-3.5-turbo-0301' or 'gpt-4-32k' or 'gpt-3' or 'gpt-4-0613' 205 | token_num_count_list.append(token_num_count) 206 | 207 | #print('-----------prompt------------\n' + initial_response) 208 | dialogue_history += f'Agent[{local_agent_row_i + 0.5}, {local_agent_column_j + 0.5}]: {initial_response}\n' 209 | #print(dialogue_history) 210 | print(f'Agent[{local_agent_row_i + 0.5}, {local_agent_column_j + 0.5}]: {initial_response}\n') 211 | match = re.search(r'{.*}', initial_response, re.DOTALL) 212 | if match and re.search(r'EXECUTE', initial_response): 213 | response = match.group() 214 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response, 215 | [user_prompt_list[-1]], 216 | [], 217 | 'gpt-4-0613', 218 | '_w_all_dialogue_history') 219 | token_num_count_list = token_num_count_list + token_num_count_list_add 220 | print(f'response: {response}') 221 | break 222 | break 223 | dialogue_history_list.append(dialogue_history) 224 | 225 | response_total_list.append(response) 226 | if response == 'Out of tokens': 227 | success_failure = 'failure over token length limit' 228 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 229 | elif response == 'Syntactic Error': 230 | success_failure = 'Syntactic Error' 231 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 232 | 233 | data = json.loads(response) 234 | 235 | with open(Saving_path_result+'/response' + '/response'+str(index_query_times+1)+'.json', 'w') as f: 236 | json.dump(data, f) 237 | original_response_dict = json.loads(response_total_list[index_query_times]) 238 | print(pg_dict) 239 | if cen_decen_framework in ('DMAS', 'HMAS-1', 'HMAS-1-fast'): 240 | with open(Saving_path_result+'/dialogue_history' + '/dialogue_history'+str(index_query_times)+'.txt', 'w') as f: 241 | f.write(dialogue_history_list[index_query_times]) 242 | try: 243 | system_error_feedback, pg_dict_returned, collision_check = action_from_response(pg_dict, original_response_dict) 244 | if system_error_feedback != '': 245 | print(system_error_feedback) 246 | if collision_check: 247 | print('Collision!') 248 | success_failure = 'Collision' 249 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 250 | pg_dict = pg_dict_returned 251 | 252 | except: 253 | success_failure = 'Hallucination of wrong plan' 254 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 255 | pg_state_list.append(pg_dict) 256 | with open(Saving_path_result+'/pg_state' + '/pg_state'+str(index_query_times+2)+'.json', 'w') as f: 257 | json.dump(pg_dict, f) 258 | 259 | # Check whether the task has been completed 260 | count = 0 261 | for key, value in pg_dict.items(): 262 | count += len(value) 263 | if count == 0: 264 | break 265 | 266 | if index_query_times < query_time_limit - 1: 267 | success_failure = 'success' 268 | else: 269 | success_failure = 'failure over query time limit' 270 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 271 | 272 | Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here 273 | Saving_path = Code_dir_path + 'Env2_BoxNet2' 274 | model_name = 'gpt-4-0613' #'gpt-4-0613', 'gpt-3.5-turbo-16k-0613' 275 | print(f'-------------------Model name: {model_name}-------------------') 276 | 277 | for pg_row_num, pg_column_num in [(2,2), (2,4), (4,4), (4,8)]: 278 | if pg_row_num == 4 and pg_column_num == 8: 279 | query_time_limit = 40 280 | else: 281 | query_time_limit = 30 282 | for iteration_num in range(10): 283 | print('-------###-------###-------###-------') 284 | print(f'Row num is: {pg_row_num}, Column num is: {pg_column_num}, Iteration num is: {iteration_num}\n\n') 285 | 286 | user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result = run_exp(Saving_path, pg_row_num, pg_column_num, iteration_num, query_time_limit, dialogue_history_method='_w_only_state_action_history', 287 | cen_decen_framework='HMAS-2') 288 | with open(Saving_path_result + '/token_num_count.txt', 'w') as f: 289 | for token_num_num_count in token_num_count_list: 290 | f.write(str(token_num_num_count) + '\n') 291 | 292 | with open(Saving_path_result + '/success_failure.txt', 'w') as f: 293 | f.write(success_failure) 294 | 295 | with open(Saving_path_result + '/env_action_times.txt', 'w') as f: 296 | f.write(f'{index_query_times+1}') 297 | print(success_failure) 298 | print(f'Iteration number: {index_query_times+1}') 299 | -------------------------------------------------------------------------------- /env4-box-arrange.py: -------------------------------------------------------------------------------- 1 | from LLM import * 2 | from prompt_env4 import * 3 | from env4_create import * 4 | from sre_constants import error 5 | import random 6 | import os 7 | import json 8 | import re 9 | import copy 10 | import numpy as np 11 | import shutil 12 | import time 13 | 14 | # cen_decen_framework = 'DMAS', 'HMAS-1', 'CMAS', 'HMAS-2' 15 | # dialogue_history_method = '_w_all_dialogue_history', '_wo_any_dialogue_history', '_w_only_state_action_history' 16 | def run_exp(Saving_path, track_row_num, column_num, box_occupy_ratio, agent_num, iteration_num, query_time_limit, dialogue_history_method = '_w_all_dialogue_history', cen_decen_framework = 'CMAS', model_name = 'gpt-3'): 17 | 18 | Saving_path_result = Saving_path+f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}_{model_name}' 19 | 20 | # specify the path to your dir for saving the results 21 | os.makedirs(Saving_path_result, exist_ok=True) 22 | os.makedirs(Saving_path_result+f'/prompt', exist_ok=True) 23 | os.makedirs(Saving_path_result+f'/response', exist_ok=True) 24 | os.makedirs(Saving_path_result+f'/pg_state', exist_ok=True) 25 | os.makedirs(Saving_path_result + f'/dialogue_history', exist_ok=True) 26 | 27 | with open(Saving_path+f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}/pg_state{iteration_num}/pg_state{iteration_num}.json', 'r') as file: 28 | pg_dict = json.load(file) 29 | with open(Saving_path + f'/env_pg_state_{track_row_num}_{column_num}_{box_occupy_ratio}_{agent_num}/pg_state{iteration_num}/box_state{iteration_num}.json', 'r') as file: 30 | box_position_dict = json.load(file) 31 | 32 | user_prompt_list = [] # The record list of all the input prompts 33 | response_total_list = [] # The record list of all the responses 34 | pg_state_list = [] # The record list of pg states in varied steps 35 | box_state_list = [] # The record list of box states in varied steps 36 | dialogue_history_list = [] 37 | token_num_count_list = [] 38 | system_error_feedback_list = [] 39 | pg_state_list.append(pg_dict) 40 | with open(Saving_path_result+'/pg_state' + '/pg_state'+str(1)+'.json', 'w') as f: 41 | json.dump(pg_dict, f) 42 | with open(Saving_path_result+'/pg_state' + '/box_state'+str(1)+'.json', 'w') as f: 43 | json.dump(box_position_dict, f) 44 | 45 | ### Start the Game! Query LLM for response 46 | print(f'query_time_limit: {query_time_limit}') 47 | for index_query_times in range(query_time_limit): # The upper limit of calling LLMs 48 | state_update_prompt = state_update_func(pg_dict, box_position_dict, track_row_num, column_num) 49 | if cen_decen_framework in ('DMAS'): 50 | print('--------DMAS method starts--------') 51 | match = None 52 | count_round = 0 53 | dialogue_history = '' 54 | response = '{}' 55 | while not match and count_round <= 3: 56 | count_round += 1 57 | state_update_prompt_local_agent, state_update_prompt_other_agent = state_update_func_local_agent( 58 | local_agent_row_i, 59 | pg_dict) 60 | user_prompt_1 = input_prompt_local_agent_DMAS_dialogue_func(state_update_prompt_local_agent, 61 | state_update_prompt_other_agent, 62 | dialogue_history, response_total_list, 63 | pg_state_list, dialogue_history_list, 64 | dialogue_history_method) 65 | user_prompt_list.append(user_prompt_1) 66 | with open(Saving_path_result + '/prompt' + '/user_prompt_' + str(index_query_times + 1), 'w') as f: 67 | f.write(user_prompt_list[-1]) 68 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history') 69 | initial_response, token_num_count = GPT_response(messages, model_name) 70 | token_num_count_list.append(token_num_count) 71 | 72 | dialogue_history += f'[Agent[{local_agent_row_i}]: {initial_response}]\n\n' 73 | #print(dialogue_history) 74 | if re.search(r'EXECUTE', initial_response): 75 | # Search for the pattern that starts with { and ends with } 76 | print('EXECUTE!') 77 | match = re.search(r'{.*}', initial_response, re.DOTALL) 78 | if match: 79 | response = match.group() 80 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response, 81 | [user_prompt_list[-1]], 82 | [], 83 | model_name, 84 | '_w_all_dialogue_history', track_row_num, column_num, box_position_dict) 85 | token_num_count_list = token_num_count_list + token_num_count_list_add 86 | print(f'response: {response}') 87 | #print(f'User prompt: {user_prompt_1}\n\n') 88 | break 89 | break 90 | dialogue_history_list.append(dialogue_history) 91 | else: 92 | if cen_decen_framework in ('CMAS', 'HMAS-1', 'HMAS-1-fast', 'HMAS-2'): 93 | user_prompt_1 = input_prompt_1_func_total(state_update_prompt, response_total_list, system_error_feedback_list, 94 | pg_state_list, dialogue_history_list, 95 | dialogue_history_method, cen_decen_framework, track_row_num, column_num) 96 | 97 | user_prompt_list.append(user_prompt_1) 98 | #print('user_prompt_1: ', user_prompt_1) 99 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history') # message construction 100 | 101 | with open(Saving_path_result+'/prompt' + '/user_prompt_'+str(index_query_times+1), 'w') as f: 102 | f.write(user_prompt_list[-1]) 103 | initial_response, token_num_count = GPT_response(messages, model_name) 104 | print('Initial response: ', initial_response) 105 | token_num_count_list.append(token_num_count) 106 | match = re.search(r'{.*}', initial_response, re.DOTALL) 107 | if match: 108 | response = match.group() 109 | if response[0] == '{' and response[-1] == '}': 110 | if '{' in response[1:-1] and '}' in response[1:-1]: 111 | match = re.search(r'{.*}', response[:-1], re.DOTALL) 112 | if match: 113 | response = match.group() 114 | print(f'response: {response}') 115 | print('----------------Start syntactic check--------------') 116 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response, [user_prompt_1], [], model_name, '_w_all_dialogue_history', track_row_num, column_num, box_position_dict) 117 | token_num_count_list = token_num_count_list + token_num_count_list_add 118 | print(f'response: {response}') 119 | else: 120 | raise ValueError(f'Response format error: {response}') 121 | 122 | if response == 'Out of tokens': 123 | success_failure = 'failure over token length limit' 124 | return user_prompt_list, response_total_list, pg_state_list, box_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 125 | elif response == 'Syntactic Error': 126 | success_failure = 'Syntactic Error' 127 | return user_prompt_list, response_total_list, pg_state_list, box_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 128 | 129 | # Local agent response for checking the feasibility of actions 130 | if cen_decen_framework == 'HMAS-2': 131 | print('--------HMAS-2 method starts--------') 132 | break_mark = False; count_round_HMAS2 = 0 133 | 134 | while break_mark == False and count_round_HMAS2 < 3: 135 | count_round_HMAS2 += 1 136 | dialogue_history = f'Central Planner: {response}\n' 137 | prompt_list_dir = {}; response_list_dir = {}; local_agent_response_list_dir = {} 138 | local_agent_response_list_dir['feedback1'] = '' 139 | 140 | agent_dict = json.loads(response) 141 | for agent_name, agent_state in agent_dict.items(): 142 | 143 | prompt_list_dir[agent_name] = [] 144 | response_list_dir[agent_name] = [] 145 | 146 | local_reprompt = input_prompt_local_agent_HMAS2_dialogue_func(state_update_prompt, response, 147 | response_total_list, pg_state_list, 148 | dialogue_history_list, system_error_feedback_list, 149 | dialogue_history_method, agent_name, track_row_num, column_num) 150 | 151 | 152 | # print(local_reprompt) 153 | prompt_list_dir[agent_name].append(local_reprompt) 154 | messages = message_construct_func( 155 | prompt_list_dir[agent_name], 156 | response_list_dir[agent_name], 157 | '_w_all_dialogue_history') 158 | response_local_agent, token_num_count = GPT_response(messages, model_name) 159 | token_num_count_list.append(token_num_count) 160 | print(f'{agent_name} response: {response_local_agent}') 161 | if not ('I Agree' in response_local_agent or 'I agree' in response_local_agent): 162 | local_agent_response_list_dir[ 163 | 'feedback1'] += f'{agent_name}: {response_local_agent}\n' # collect the response from all the local agents 164 | dialogue_history += f'{agent_name}: {response_local_agent}\n' 165 | 166 | if local_agent_response_list_dir['feedback1'] != '': 167 | local_agent_response_list_dir['feedback1'] += '\nThis is the feedback from local agents. If you find some errors in your previous plan, try to modify it. Otherwise, output the same plan as before. The output should have the same json format {"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W]"}, as above. Do not explain, just directly output json directory. Your response:' 168 | messages = message_construct_func([user_prompt_list[-1], local_agent_response_list_dir['feedback1']], [response], '_w_all_dialogue_history') # message construction 169 | response_central_again, token_num_count = GPT_response(messages, model_name) 170 | token_num_count_list.append(token_num_count) 171 | match = re.search(r'{.*}', response_central_again, re.DOTALL) 172 | if match: 173 | response = match.group() 174 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response_central_again, [user_prompt_list[-1], local_agent_response_list_dir['feedback1']], [response], model_name, 175 | '_w_all_dialogue_history', track_row_num, column_num, box_position_dict) 176 | token_num_count_list = token_num_count_list 177 | print(f'Modified plan response: {response}') 178 | else: 179 | break_mark = True 180 | pass 181 | 182 | dialogue_history_list.append(dialogue_history) 183 | 184 | elif cen_decen_framework == 'HMAS-1' or cen_decen_framework == 'HMAS-1-fast': 185 | print('--------HMAS-1 method starts--------') 186 | count_round = 0 187 | dialogue_history = f'Central Planner: {response}\n' 188 | match = None 189 | while not match and count_round <= 3: 190 | count_round += 1 191 | 192 | agent_dict = json.loads(response) 193 | lift_weight_list_total = [] 194 | for key, value in agent_dict.items(): 195 | lift_weight_list_total += [float(num) for num in re.findall(r'(\d+\.\d+)', value)] 196 | 197 | for lift_weight_item in lifter_weight_list: 198 | 199 | if count_round >= 2 and cen_decen_framework == 'HMAS-1-fast': 200 | user_prompt_1 = input_prompt_local_agent_HMAS1_dialogue_fast_plan_func(state_update_prompt_local_agent, 201 | state_update_prompt_other_agent, 202 | dialogue_history, response_total_list, pg_state_list, 203 | dialogue_history_list, dialogue_history_method, 204 | initial_plan=response) 205 | else: 206 | user_prompt_1 = input_prompt_local_agent_HMAS2_dialogue_func(lift_weight_item, state_update_prompt, response, 207 | response_total_list, pg_state_list, 208 | dialogue_history_list, 209 | dialogue_history_method) 210 | 211 | 212 | user_prompt_list.append(user_prompt_1) 213 | with open(Saving_path_result + '/prompt' + '/user_prompt_' + str(index_query_times + 1), 'w') as f: 214 | f.write(user_prompt_list[-1]) 215 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history') 216 | initial_response, token_num_count = GPT_response(messages, 217 | model_name) 218 | token_num_count_list.append(token_num_count) 219 | 220 | #print('-----------prompt------------\n' + initial_response) 221 | dialogue_history += f'agent[{lift_weight_item}W]: {initial_response}\n' 222 | #print(dialogue_history) 223 | match = re.search(r'{.*}', initial_response, re.DOTALL) 224 | if match and re.search(r'EXECUTE', initial_response): 225 | response = match.group() 226 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response, 227 | [user_prompt_list[-1]], 228 | [], 229 | model_name, 230 | '_w_all_dialogue_history', track_row_num, column_num, box_position_dict) 231 | token_num_count_list = token_num_count_list + token_num_count_list_add 232 | print(f'response: {response}') 233 | break 234 | break 235 | dialogue_history_list.append(dialogue_history) 236 | 237 | response_total_list.append(response) 238 | if response == 'Out of tokens': 239 | success_failure = 'failure over token length limit' 240 | return user_prompt_list, response_total_list, pg_state_list, box_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 241 | elif response == 'Syntactic Error': 242 | success_failure = 'Syntactic Error' 243 | return user_prompt_list, response_total_list, pg_state_list, box_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 244 | 245 | data = json.loads(response) 246 | 247 | with open(Saving_path_result+'/response' + '/response'+str(index_query_times+1)+'.json', 'w') as f: 248 | json.dump(data, f) 249 | original_response_dict = json.loads(response_total_list[index_query_times]) 250 | print(pg_dict) 251 | print(box_position_dict) 252 | if cen_decen_framework in ('DMAS', 'HMAS-1', 'HMAS-1-fast'): 253 | with open(Saving_path_result+'/dialogue_history' + '/dialogue_history'+str(index_query_times)+'.txt', 'w') as f: 254 | f.write(dialogue_history_list[index_query_times]) 255 | #try: 256 | system_error_feedback, pg_dict_returned, collision_check, box_position_dict_returned = action_from_response(pg_dict, original_response_dict, track_row_num, column_num, box_position_dict) 257 | system_error_feedback_list.append(system_error_feedback) 258 | if system_error_feedback != '': 259 | print('system_error_feedback: ', system_error_feedback) 260 | if collision_check: 261 | print('Collision!') 262 | success_failure = 'Collision' 263 | return user_prompt_list, response_total_list, pg_state_list, box_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 264 | pg_dict = pg_dict_returned 265 | box_position_dict = box_position_dict_returned 266 | 267 | #except: 268 | # print('Hallucination response: ', response) 269 | # success_failure = 'Hallucination of wrong plan' 270 | # return user_prompt_list, response_total_list, pg_state_list, box_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 271 | pg_state_list.append(pg_dict) 272 | box_state_list.append(box_position_dict) 273 | 274 | with open(Saving_path_result + '/pg_state' + '/pg_state' + str(index_query_times+2) + '.json', 'w') as f: 275 | json.dump(pg_dict, f) 276 | with open(Saving_path_result + '/pg_state' + '/box_state' + str(index_query_times+2) + '.json', 'w') as f: 277 | json.dump(box_position_dict, f) 278 | 279 | # Check whether the task has been completed 280 | box_current_state_list = [value for value in box_position_dict.values()] 281 | print(f'box_current_state_list: {box_current_state_list}') 282 | #print(f'pg_dict: {pg_dict}') 283 | agent_current_state_list = [value[-1] for value in pg_dict.values() if type(value) == list] 284 | print(f'agent_current_state_list: {agent_current_state_list}') 285 | if np.sum(box_current_state_list) + np.sum(agent_current_state_list) == 0: 286 | break 287 | 288 | if index_query_times < query_time_limit - 1: 289 | success_failure = 'success' 290 | else: 291 | success_failure = 'failure over query time limit' 292 | return user_prompt_list, response_total_list, pg_state_list, box_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 293 | 294 | Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here 295 | Saving_path = Code_dir_path + 'Env4_Warehouse' 296 | model_name = 'gpt-4-0613' #'gpt-4-0613', 'gpt-3.5-turbo-16k-0613' 297 | print(f'-------------------Model name: {model_name}-------------------') 298 | 299 | for track_row_num, column_num, box_occupy_ratio, agent_num in [(3, 5, 0.5, 4)]: 300 | if agent_num == 8: 301 | query_time_limit = 40 302 | else: 303 | query_time_limit = 30 304 | for iteration_num in range(10): 305 | print('-------###-------###-------###-------') 306 | print(f'Track_row_num is: {track_row_num}, Column_num: {column_num}, Agent_num: {agent_num}, Iteration num is: {iteration_num}\n\n') 307 | 308 | user_prompt_list, response_total_list, pg_state_list, box_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result = run_exp(Saving_path, track_row_num, column_num, box_occupy_ratio, agent_num, iteration_num, query_time_limit, dialogue_history_method='_w_only_state_action_history', 309 | cen_decen_framework='HMAS-2', model_name = model_name) 310 | with open(Saving_path_result + '/token_num_count.txt', 'w') as f: 311 | for token_num_num_count in token_num_count_list: 312 | f.write(str(token_num_num_count) + '\n') 313 | 314 | with open(Saving_path_result + '/success_failure.txt', 'w') as f: 315 | f.write(success_failure) 316 | 317 | with open(Saving_path_result + '/env_action_times.txt', 'w') as f: 318 | f.write(f'{index_query_times+1}') 319 | print(success_failure) 320 | print(f'Iteration number: {index_query_times+1}') -------------------------------------------------------------------------------- /env3-box-arrange.py: -------------------------------------------------------------------------------- 1 | from LLM import * 2 | from prompt_env3 import * 3 | from env3_create import * 4 | from sre_constants import error 5 | import random 6 | import os 7 | import json 8 | import re 9 | import copy 10 | import numpy as np 11 | import shutil 12 | import time 13 | 14 | # cen_decen_framework = 'DMAS', 'HMAS-1', 'CMAS', 'HMAS-2' 15 | # dialogue_history_method = '_w_all_dialogue_history', '_wo_any_dialogue_history', '_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_previous_round_history' 16 | def run_exp(Saving_path, pg_row_num, iteration_num, query_time_limit, dialogue_history_method = '_w_all_dialogue_history', cen_decen_framework = 'CMAS', model_name = 'gpt-3'): 17 | 18 | Saving_path_result = Saving_path+f'/env_pg_state_{pg_row_num}/pg_state{iteration_num}/{cen_decen_framework}{dialogue_history_method}_{model_name}' 19 | 20 | # specify the path to your dir for saving the results 21 | os.makedirs(Saving_path_result, exist_ok=True) 22 | os.makedirs(Saving_path_result+f'/prompt', exist_ok=True) 23 | os.makedirs(Saving_path_result+f'/response', exist_ok=True) 24 | os.makedirs(Saving_path_result+f'/pg_state', exist_ok=True) 25 | os.makedirs(Saving_path_result + f'/dialogue_history', exist_ok=True) 26 | 27 | with open(Saving_path+f'/env_pg_state_{pg_row_num}/pg_state{iteration_num}/lifter_weight_list{iteration_num}.txt', 'r') as file: 28 | lifter_weight_list = [float(line.strip()) for line in file.readlines()] 29 | with open(Saving_path+f'/env_pg_state_{pg_row_num}/pg_state{iteration_num}/volume_list{iteration_num}.txt', 'r') as file: 30 | volume_list = [float(line.strip()) for line in file.readlines()] 31 | with open(Saving_path+f'/env_pg_state_{pg_row_num}/pg_state{iteration_num}/weight_list{iteration_num}.txt', 'r') as file: 32 | weight_list = [float(line.strip()) for line in file.readlines()] 33 | 34 | if len(volume_list) != len(weight_list): 35 | raise error('The length of volume_list and weight_list are not equal!') 36 | else: 37 | pg_dict = dict(zip(volume_list, weight_list)) 38 | 39 | user_prompt_list = [] # The record list of all the input prompts 40 | response_total_list = [] # The record list of all the responses 41 | pg_state_list = [] # The record list of pg states in varied steps 42 | env_act_feedback_list = [] # The record list of env act feedbacks 43 | dialogue_history_list = [] 44 | left_box_list = [] 45 | token_num_count_list = [] 46 | pg_state_list.append(pg_dict) 47 | with open(Saving_path_result+'/pg_state' + '/pg_state'+str(1)+'.json', 'w') as f: 48 | json.dump(pg_dict, f) 49 | 50 | ### Start the Game! Query LLM for response 51 | print(f'query_time_limit: {query_time_limit}') 52 | for index_query_times in range(query_time_limit): # The upper limit of calling LLMs 53 | state_update_prompt, left_box = state_update_func(pg_dict, lifter_weight_list) 54 | left_box_list.append(left_box) 55 | if cen_decen_framework in ('DMAS'): 56 | print('--------DMAS method starts--------') 57 | match = None 58 | count_round = 0 59 | dialogue_history = '' 60 | response = '{}' 61 | while not match and count_round <= 3: 62 | count_round += 1 63 | state_update_prompt_local_agent, state_update_prompt_other_agent = state_update_func_local_agent( 64 | local_agent_row_i, 65 | pg_dict) 66 | user_prompt_1 = input_prompt_local_agent_DMAS_dialogue_func(state_update_prompt_local_agent, 67 | state_update_prompt_other_agent, 68 | dialogue_history, response_total_list, 69 | pg_state_list, dialogue_history_list, 70 | dialogue_history_method) 71 | user_prompt_list.append(user_prompt_1) 72 | with open(Saving_path_result + '/prompt' + '/user_prompt_' + str(index_query_times + 1), 'w') as f: 73 | f.write(user_prompt_list[-1]) 74 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history') 75 | initial_response, token_num_count = GPT_response(messages, model_name) 76 | token_num_count_list.append(token_num_count) 77 | 78 | dialogue_history += f'[Agent[{local_agent_row_i}]: {initial_response}]\n\n' 79 | #print(dialogue_history) 80 | if re.search(r'EXECUTE', initial_response): 81 | # Search for the pattern that starts with { and ends with } 82 | print('EXECUTE!') 83 | match = re.search(r'{.*}', initial_response, re.DOTALL) 84 | if match: 85 | response = match.group() 86 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response, 87 | [user_prompt_list[-1]], 88 | [], 89 | model_name, 90 | '_w_all_dialogue_history') 91 | token_num_count_list = token_num_count_list + token_num_count_list_add 92 | print(f'response: {response}') 93 | #print(f'User prompt: {user_prompt_1}\n\n') 94 | break 95 | break 96 | dialogue_history_list.append(dialogue_history) 97 | else: 98 | if cen_decen_framework in ('CMAS', 'HMAS-1', 'HMAS-1-fast', 'HMAS-2'): 99 | user_prompt_1 = input_prompt_1_func_total(state_update_prompt, response_total_list, 100 | left_box_list, dialogue_history_list, env_act_feedback_list, 101 | dialogue_history_method, cen_decen_framework) 102 | user_prompt_list.append(user_prompt_1) 103 | #print('user_prompt_1: ', user_prompt_1) 104 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history') # message construction 105 | 106 | with open(Saving_path_result+'/prompt' + '/user_prompt_'+str(index_query_times+1), 'w') as f: 107 | f.write(user_prompt_list[-1]) 108 | initial_response, token_num_count = GPT_response(messages, model_name) 109 | print('Initial response: ', initial_response) 110 | token_num_count_list.append(token_num_count) 111 | match = re.search(r'{.*}', initial_response, re.DOTALL) 112 | if match: 113 | response = match.group() 114 | if response[0] == '{' and response[-1] == '}': 115 | if '{' in response[1:-1] and '}' in response[1:-1]: 116 | match = re.search(r'{.*}', response[:-1], re.DOTALL) 117 | if match: 118 | response = match.group() 119 | print(f'response: {response}') 120 | print('----------------Start syntactic check--------------') 121 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response, [user_prompt_1], [], model_name, '_w_all_dialogue_history') 122 | token_num_count_list = token_num_count_list + token_num_count_list_add 123 | print(f'response: {response}') 124 | else: 125 | raise ValueError(f'Response format error: {response}') 126 | 127 | if response == 'Out of tokens': 128 | success_failure = 'failure over token length limit' 129 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 130 | elif response == 'Syntactic Error': 131 | success_failure = 'Syntactic Error' 132 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 133 | 134 | # Local agent response for checking the feasibility of actions 135 | if cen_decen_framework == 'HMAS-2': 136 | print('--------HMAS-2 method starts--------') 137 | break_mark = False; count_round_HMAS2 = 0 138 | 139 | while break_mark == False and count_round_HMAS2 < 3: 140 | count_round_HMAS2 += 1 141 | dialogue_history = f'Central Planner: {response}\n' 142 | prompt_list_dir = {}; response_list_dir = {}; local_agent_response_list_dir = {} 143 | local_agent_response_list_dir['feedback1'] = '' 144 | 145 | agent_dict = json.loads(response) 146 | lift_weight_list_total = [] 147 | for key, value in agent_dict.items(): 148 | lift_weight_list_total += [float(num) for num in re.findall(r'(\d+\.\d+)', value)] 149 | 150 | for lift_weight_item in lifter_weight_list: 151 | if lift_weight_item in lift_weight_list_total: 152 | prompt_list_dir[f'Agent[{lift_weight_item}W]'] = [] 153 | response_list_dir[f'Agent[{lift_weight_item}W]'] = [] 154 | 155 | local_reprompt = input_prompt_local_agent_HMAS2_dialogue_func(lift_weight_item, state_update_prompt, response, 156 | response_total_list, pg_state_list, 157 | dialogue_history_list, 158 | env_act_feedback_list, 159 | dialogue_history_method) 160 | 161 | # print(local_reprompt) 162 | prompt_list_dir[f'Agent[{lift_weight_item}W]'].append(local_reprompt) 163 | messages = message_construct_func( 164 | prompt_list_dir[f'Agent[{lift_weight_item}W]'], 165 | response_list_dir[f'Agent[{lift_weight_item}W]'], 166 | '_w_all_dialogue_history') 167 | response_local_agent, token_num_count = GPT_response(messages, model_name) 168 | token_num_count_list.append(token_num_count) 169 | print(f'Agent[{lift_weight_item}W] response: {response_local_agent}') 170 | if not ('I Agree' in response_local_agent or 'I agree' in response_local_agent): 171 | local_agent_response_list_dir[ 172 | 'feedback1'] += f'Agent[{lift_weight_item}W]: {response_local_agent}\n' # collect the response from all the local agents 173 | dialogue_history += f'Agent[{lift_weight_item}W]: {response_local_agent}\n' 174 | 175 | if local_agent_response_list_dir['feedback1'] != '': 176 | local_agent_response_list_dir['feedback1'] += '\nThis is the feedback from local agents. If you find some errors in your previous plan, try to modify it. Otherwise, output the same plan as before. The output should have the same json format {"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W]"}, as above. Do not explain, just directly output json directory. Your response:' 177 | messages = message_construct_func([user_prompt_list[-1], local_agent_response_list_dir['feedback1']], [response], '_w_all_dialogue_history') # message construction 178 | response_central_again, token_num_count = GPT_response(messages, model_name) 179 | token_num_count_list.append(token_num_count) 180 | match = re.search(r'{.*}', response_central_again, re.DOTALL) 181 | if match: 182 | response = match.group() 183 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response_central_again, [user_prompt_list[-1], local_agent_response_list_dir['feedback1']], [response], model_name, 184 | '_w_all_dialogue_history') 185 | token_num_count_list = token_num_count_list + token_num_count_list_add 186 | print(f'Modified plan response: {response}') 187 | else: 188 | break_mark = True 189 | pass 190 | 191 | dialogue_history_list.append(dialogue_history) 192 | 193 | elif cen_decen_framework == 'HMAS-1' or cen_decen_framework == 'HMAS-1-fast': 194 | print('--------HMAS-1 method starts--------') 195 | count_round = 0 196 | dialogue_history = f'Central Planner: {response}\n' 197 | match = None 198 | while not match and count_round <= 3: 199 | count_round += 1 200 | 201 | agent_dict = json.loads(response) 202 | lift_weight_list_total = [] 203 | for key, value in agent_dict.items(): 204 | lift_weight_list_total += [float(num) for num in re.findall(r'(\d+\.\d+)', value)] 205 | 206 | for lift_weight_item in lifter_weight_list: 207 | 208 | if count_round >= 2 and cen_decen_framework == 'HMAS-1-fast': 209 | user_prompt_1 = input_prompt_local_agent_HMAS1_dialogue_fast_plan_func(state_update_prompt_local_agent, 210 | state_update_prompt_other_agent, 211 | dialogue_history, response_total_list, pg_state_list, 212 | dialogue_history_list, dialogue_history_method, 213 | initial_plan=response) 214 | else: 215 | #user_prompt_1 = input_prompt_local_agent_HMAS1_dialogue_func(state_update_prompt_local_agent, 216 | # state_update_prompt_other_agent, dialogue_history, 217 | # response_total_list, pg_state_list, 218 | # dialogue_history_list, dialogue_history_method, 219 | # initial_plan='') 220 | 221 | user_prompt_1 = input_prompt_local_agent_HMAS2_dialogue_func(lift_weight_item, state_update_prompt, response, 222 | response_total_list, pg_state_list, 223 | dialogue_history_list, 224 | env_act_feedback_list, 225 | dialogue_history_method) 226 | 227 | 228 | user_prompt_list.append(user_prompt_1) 229 | with open(Saving_path_result + '/prompt' + '/user_prompt_' + str(index_query_times + 1), 'w') as f: 230 | f.write(user_prompt_list[-1]) 231 | messages = message_construct_func([user_prompt_1], [], '_w_all_dialogue_history') 232 | initial_response, token_num_count = GPT_response(messages, 233 | model_name) 234 | token_num_count_list.append(token_num_count) 235 | 236 | #print('-----------prompt------------\n' + initial_response) 237 | dialogue_history += f'agent[{lift_weight_item}W]: {initial_response}\n' 238 | #print(dialogue_history) 239 | match = re.search(r'{.*}', initial_response, re.DOTALL) 240 | if match and re.search(r'EXECUTE', initial_response): 241 | response = match.group() 242 | response, token_num_count_list_add = with_action_syntactic_check_func(pg_dict, response, 243 | [user_prompt_list[-1]], 244 | [], 245 | model_name, 246 | '_w_all_dialogue_history') 247 | token_num_count_list = token_num_count_list + token_num_count_list_add 248 | print(f'response: {response}') 249 | break 250 | break 251 | dialogue_history_list.append(dialogue_history) 252 | 253 | response_total_list.append(response) 254 | if response == 'Out of tokens': 255 | success_failure = 'failure over token length limit' 256 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 257 | elif response == 'Syntactic Error': 258 | success_failure = 'Syntactic Error' 259 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 260 | 261 | data = json.loads(response) 262 | 263 | with open(Saving_path_result+'/response' + '/response'+str(index_query_times+1)+'.json', 'w') as f: 264 | json.dump(data, f) 265 | original_response_dict = json.loads(response_total_list[index_query_times]) 266 | print(pg_dict) 267 | if cen_decen_framework in ('DMAS', 'HMAS-1', 'HMAS-1-fast'): 268 | with open(Saving_path_result+'/dialogue_history' + '/dialogue_history'+str(index_query_times)+'.txt', 'w') as f: 269 | f.write(dialogue_history_list[index_query_times]) 270 | try: 271 | system_error_feedback, pg_dict_returned, env_act_feedback = action_from_response(pg_dict, original_response_dict, lifter_weight_list) 272 | env_act_feedback_list.append(env_act_feedback) 273 | if system_error_feedback != '': 274 | print('system_error_feedback: ', system_error_feedback) 275 | if env_act_feedback != '': 276 | print('env_act_feedback: ', env_act_feedback) 277 | pg_dict = pg_dict_returned 278 | 279 | except: 280 | print('Hallucination response: ', response) 281 | success_failure = 'Hallucination of wrong plan' 282 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 283 | pg_state_list.append(pg_dict) 284 | 285 | with open(Saving_path_result + '/pg_state' + '/pg_state' + str(index_query_times+2) + '.json', 'w') as f: 286 | json.dump(pg_dict, f) 287 | 288 | # Check whether the task has been completed 289 | if len(pg_dict) == 0: 290 | break 291 | 292 | if index_query_times < query_time_limit - 1: 293 | success_failure = 'success' 294 | else: 295 | success_failure = 'failure over query time limit' 296 | return user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result 297 | 298 | 299 | Code_dir_path = 'path_to_multi-agent-framework/multi-agent-framework/' # Put the current code directory path here 300 | Saving_path = Code_dir_path + 'Env3_BoxLift' 301 | model_name = 'gpt-4-0613' #'gpt-4-0613', 'gpt-3.5-turbo-16k-0613' 302 | print(f'-------------------Model name: {model_name}-------------------') 303 | 304 | for pg_row_num in [4,6,8,10]: 305 | if pg_row_num == 8: 306 | query_time_limit = 25 307 | else: 308 | query_time_limit = 20 309 | for iteration_num in range(10): 310 | print('-------###-------###-------###-------') 311 | print(f'Row num is: {pg_row_num}, Iteration num is: {iteration_num}\n\n') 312 | 313 | user_prompt_list, response_total_list, pg_state_list, success_failure, index_query_times, token_num_count_list, Saving_path_result = run_exp(Saving_path, pg_row_num, iteration_num, query_time_limit, dialogue_history_method='_w_only_state_action_history', 314 | cen_decen_framework='HMAS-2', model_name = model_name) 315 | with open(Saving_path_result + '/token_num_count.txt', 'w') as f: 316 | for token_num_num_count in token_num_count_list: 317 | f.write(str(token_num_num_count) + '\n') 318 | 319 | with open(Saving_path_result + '/success_failure.txt', 'w') as f: 320 | f.write(success_failure) 321 | 322 | with open(Saving_path_result + '/env_action_times.txt', 'w') as f: 323 | f.write(f'{index_query_times+1}') 324 | print(success_failure) 325 | print(f'Iteration number: {index_query_times+1}') 326 | -------------------------------------------------------------------------------- /prompt_env3.py: -------------------------------------------------------------------------------- 1 | from LLM import * 2 | import tiktoken 3 | enc = tiktoken.get_encoding("cl100k_base") 4 | assert enc.decode(enc.encode("hello world")) == "hello world" 5 | enc = tiktoken.encoding_for_model("gpt-4") 6 | input_prompt_token_limit = 4000 7 | 8 | extra_prompt = 'Each lifting agent can be used only once in each step! You can combine multiple agents to lift one box like "box[3.0V]":"agent[1.5W], agent[2.5W]"! Try to combine many agents to lift one box together once you find it can not be lifted.' 9 | 10 | def LLM_summarize_func(state_action_prompt_next_initial): 11 | prompt1 = f"Please summarize the following content as concise as possible: \n{state_action_prompt_next_initial}" 12 | messages = [{"role": "system", "content": "You are a helpful assistant."}, 13 | {"role": "user", "content": prompt1}] 14 | response = GPT_response(messages, model_name='gpt-4') 15 | return response 16 | 17 | 18 | def input_prompt_1_func(state_update_prompt): 19 | user_prompt_1 = f''' 20 | You are a central planner directing lifting agents in a warehouse to lift boxes. Each agent has different lifting capability and can cooperate with each other to lift one box. In summation of lifting capability, the agents can lift all boxes. 21 | 22 | The boxes are identified by their volume, e.g., box[1.4V]. The agents are identified by their lifting weight capability, e.g., agent[1.5W]. Actions are like: "box[1.7V]":"agent[2.5W]", "box[6.0V]":"agent[1.5W], agent[2.5W]". 23 | 24 | Your task is to divide the group of each agent to lift all the boxes. After each step, environments provide updates for the left boxes. Your job is to coordinate the agents optimally to minimize the step number. 25 | 26 | Note that the agents can only lift one box at a time. {extra_prompt} [The volume of the box is roughly proportional to the weight of the box, but with some randomness. Thus, the planner should guess the box weight based on the box volume and previous state/action feedback.] 27 | 28 | The current left boxes and agents are: 29 | {state_update_prompt} 30 | 31 | Specify your action plan in this format: {{"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W]"}}. Include a box only if it has lifting agents to lift it next. Now, plan the next step: 32 | ''' 33 | return user_prompt_1 34 | 35 | 36 | def input_prompt_1_func_total(state_update_prompt, response_total_list, 37 | pg_state_list, dialogue_history_list, env_act_feedback_list, 38 | dialogue_history_method, cen_decen_framework): 39 | if len(pg_state_list) - len(response_total_list) != 1: 40 | raise error('state and response list do not match') 41 | if len(pg_state_list) - len(env_act_feedback_list) != 1: 42 | raise error('state and env act feedback list do not match') 43 | if len(pg_state_list) - len(dialogue_history_list) != 1 and cen_decen_framework != 'CMAS': 44 | raise error('state and dialogue history list do not match') 45 | 46 | user_prompt_1 = f''' 47 | You are a central planner directing lifting agents in a warehouse to lift boxes. Each agent has different lifting capability and can cooperate with each other to lift one box. In summation of lifting capability, the agents can lift all boxes. 48 | 49 | The boxes are identified by their volume, e.g., box[1.4V]. The agents are identified by their lifting weight capability, e.g., agent[1.5W]. Actions are like: "box[1.7V]":"agent[2.5W]", "box[6.0V]":"agent[1.5W], agent[2.5W]". 50 | 51 | Your task is to divide the group of each agent to lift all the boxes. After each step, environments provide updates for the left boxes. Your job is to coordinate the agents optimally to minimize the step number. 52 | 53 | The previous state and action pairs at each step are: 54 | 55 | Note that the agents can only lift one box at a time. {extra_prompt} [The volume of the box is roughly proportional to the weight of the box, but with some randomness. Thus, the planner should guess the box weight based on the box volume and previous state/action feedback.] 56 | 57 | The current left boxes and agents are: 58 | {state_update_prompt} 59 | 60 | Specify your action plan in this format: {{"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W], agent[5.5W]"}}. Include a box only if it has lifting agents to lift it next. Now, plan the next step: 61 | ''' 62 | token_num_count = len(enc.encode(user_prompt_1)) 63 | 64 | if dialogue_history_method == '_wo_any_dialogue_history' and cen_decen_framework == 'CMAS': 65 | pass 66 | elif dialogue_history_method in ( 67 | '_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_all_dialogue_history'): 68 | if dialogue_history_method == '_w_only_state_action_history': 69 | #print('fdsfdsafadsas') 70 | state_action_prompt = '' 71 | for i in range(len(response_total_list) - 1, -1, -1): 72 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\nEnvironment Feedback{i + 1}: {env_act_feedback_list[i]}\n\n' + state_action_prompt 73 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 74 | state_action_prompt = state_action_prompt_next 75 | else: 76 | break 77 | elif dialogue_history_method == '_w_compressed_dialogue_history' and cen_decen_framework != 'CMAS': 78 | state_action_prompt = '' 79 | for i in range(len(response_total_list) - 1, -1, -1): 80 | dialogue_summary = LLM_summarize_func(dialogue_history_list[i]) 81 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nSummary of Dialogues in each step{i + 1}: {dialogue_summary}\nAction{i + 1}: {response_total_list[i]}\nEnvironment Feedback{i + 1}: {env_act_feedback_list[i]}\n\n' + state_action_prompt 82 | #state_action_prompt_next = LLM_summarize_func(state_action_prompt_next_initial) 83 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 84 | state_action_prompt = state_action_prompt_next 85 | else: 86 | break 87 | elif dialogue_history_method == '_w_all_dialogue_history' and cen_decen_framework != 'CMAS': 88 | state_action_prompt = '' 89 | for i in range(len(response_total_list) - 1, -1, -1): 90 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nDialogue{i + 1}: {dialogue_history_list[i]}\nAction{i + 1}: {response_total_list[i]}\nEnvironment Feedback{i + 1}: {env_act_feedback_list[i]}\n\n' + state_action_prompt 91 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 92 | state_action_prompt = state_action_prompt_next 93 | else: 94 | break 95 | 96 | user_prompt_1 = f''' 97 | You are a central planner directing lifting agents in a warehouse to lift boxes. Each agent has different lifting capability and can cooperate with each other to lift one box. In summation of lifting capability, the agents can lift all boxes. 98 | 99 | The boxes are identified by their volume, e.g., box[1.4V]. The agents are identified by their lifting weight capability, e.g., agent[1.5W]. Actions are like: "box[1.7V]":"agent[2.5W]", "box[6.0V]":"agent[1.5W], agent[2.5W]". 100 | 101 | Your task is to divide the group of each agent to lift all the boxes. After each step, environments provide updates for the left boxes. Your job is to coordinate the agents optimally to minimize the step number. 102 | 103 | The previous state and action pairs at each step are: 104 | {state_action_prompt} 105 | 106 | Note that the agents can only lift one box at a time. {extra_prompt} [The volume of the box is roughly proportional to the weight of the box, but with some randomness. Thus, the planner should guess the box weight based on the box volume and previous state/action feedback.] 107 | 108 | The current left boxes and agents are: 109 | {state_update_prompt} 110 | 111 | Specify your action plan in this format: {{"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W]"}}. Include a box only if it has lifting agents to lift it next. Now, plan the next step: 112 | ''' 113 | #print(f'state_action_prompt: {state_action_prompt}') 114 | return user_prompt_1 115 | 116 | def input_prompt_local_agent_DMAS_dialogue_func(state_update_prompt_local_agent, state_update_prompt_other_agent, dialogue_history, response_total_list, 117 | pg_state_list, dialogue_history_list, 118 | dialogue_history_method): 119 | if len(pg_state_list) - len(response_total_list) != 1: 120 | raise error('state and response list do not match') 121 | if len(pg_state_list) - len(dialogue_history_list) != 1: 122 | raise error('state and dialogue history list do not match') 123 | 124 | user_prompt_1 = f''' 125 | You are a box-lifting agent in a warehouse to lift boxes. Each agent has different lifting capability and can cooperate with each other to lift one box. In summation of lifting capability, the agents can lift all boxes. 126 | 127 | The boxes are identified by their volume, e.g., box[1.4V]. The agents are identified by their lifting weight capability, e.g., agent[1.5W]. Actions are like: "box[1.7V]":"agent[2.5W]", "box[6.0V]":"agent[1.5W], agent[2.5W]". 128 | 129 | The task of the central planner is to divide the group of each agent to lift all the boxes. After each step, environments provide updates for the left boxes. The goal of the group is to coordinate the agents optimally to minimize the step number. 130 | 131 | The current state of yourself is: f'Agent[{lift_weight_item}W]: has lifting capacity {lift_weight_item}W' 132 | 133 | Note that the agents can only lift one box at a time. {extra_prompt} [The volume of the box is roughly proportional to the weight of the box, but with some randomness. Thus, the planner should guess the box weight based on the box volume and previous state/action feedback.] 134 | 135 | The current left boxes and agents are: 136 | {state_update_prompt} 137 | 138 | [Action Output Instruction] 139 | Must first output 'EXECUTE', then on the new line specify your action plan in this format: {{"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W]"}}. 140 | Include an agent only if it has a task next. 141 | Example#1: 142 | EXECUTE 143 | {{"box[2.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W], agent[2.0W]"}} 144 | 145 | Example#2: 146 | EXECUTE 147 | {{"box[2.7V]":"agent[4.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W], agent[2.0W]"}} 148 | 149 | The previous state and action pairs at each step are: 150 | {state_action_prompt} 151 | 152 | Please learn from previous steps. Not purely repeat the actions but learn why the state changes or remains in a dead loop. Avoid being stuck in action loops. 153 | 154 | The current state is {pg_state_list[-1]} 155 | The central planner\'s current action plan is: 156 | 157 | End your response by either: 1) output PROCEED, if the plans require further discussion; 2) If everyone has made proposals and got approved, output the final plan as soon as possible, must strictly follow [Action Output Instruction]! 158 | Your response: 159 | ''' 160 | token_num_count = len(enc.encode(user_prompt_1)) 161 | 162 | if dialogue_history_method == '_wo_any_dialogue_history': 163 | pass 164 | elif dialogue_history_method in ('_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_all_dialogue_history'): 165 | if dialogue_history_method == '_w_only_state_action_history': 166 | state_action_prompt = '' 167 | for i in range(len(response_total_list) - 1, -1, -1): 168 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt 169 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 170 | state_action_prompt = state_action_prompt_next 171 | else: 172 | break 173 | elif dialogue_history_method == '_w_compressed_dialogue_history': 174 | state_action_prompt = '' 175 | for i in range(len(response_total_list) - 1, -1, -1): 176 | dialogue_summary = LLM_summarize_func(dialogue_history_list[i]) 177 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nSummary of Dialogues in each step{i + 1}: {dialogue_summary}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt 178 | #state_action_prompt_next = LLM_summarize_func(state_action_prompt_next_initial) 179 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 180 | state_action_prompt = state_action_prompt_next 181 | else: 182 | break 183 | elif dialogue_history_method == '_w_all_dialogue_history': 184 | state_action_prompt = '' 185 | for i in range(len(response_total_list) - 1, -1, -1): 186 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nDialogue{i + 1}: {dialogue_history_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt 187 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 188 | state_action_prompt = state_action_prompt_next 189 | else: 190 | break 191 | 192 | user_prompt_1 = f''' 193 | You are a box-lifting agent in a warehouse to lift boxes. Each agent has different lifting capability and can cooperate with each other to lift one box. In summation of lifting capability, the agents can lift all boxes. 194 | 195 | The boxes are identified by their volume, e.g., box[1.4V]. The agents are identified by their lifting weight capability, e.g., agent[1.5W]. Actions are like: "box[1.7V]":"agent[2.5W]", "box[6.0V]":"agent[1.5W], agent[2.5W]". 196 | 197 | The task of the central planner is to divide the group of each agent to lift all the boxes. After each step, environments provide updates for the left boxes. The goal of the group is to coordinate the agents optimally to minimize the step number. 198 | 199 | The current state of yourself is: f'Agent[{lift_weight_item}W]: has lifting capacity {lift_weight_item}W' 200 | 201 | Note that the agents can only lift one box at a time. {extra_prompt} [The volume of the box is roughly proportional to the weight of the box, but with some randomness. Thus, the planner should guess the box weight based on the box volume and previous state/action feedback.] 202 | 203 | The current left boxes and agents are: 204 | {state_update_prompt} 205 | 206 | [Action Output Instruction] 207 | Must first output 'EXECUTE', then on the new line specify your action plan in this format: {{"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W]"}}. 208 | Include an agent only if it has a task next. 209 | Example#1: 210 | EXECUTE 211 | {{"box[2.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W], agent[2.0W]"}} 212 | 213 | Example#2: 214 | EXECUTE 215 | {{"box[2.7V]":"agent[4.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W], agent[2.0W]"}} 216 | 217 | The previous state and action pairs at each step are: 218 | {state_action_prompt} 219 | 220 | Please learn from previous steps. Not purely repeat the actions but learn why the state changes or remains in a dead loop. Avoid being stuck in action loops. 221 | 222 | The current state is {pg_state_list[-1]} 223 | The central planner\'s current action plan is: {{{central_response}}}. 224 | 225 | End your response by either: 1) output PROCEED, if the plans require further discussion; 2) If everyone has made proposals and got approved, output the final plan as soon as possible, must strictly follow [Action Output Instruction]! 226 | Your response: 227 | ''' 228 | return user_prompt_1 229 | 230 | 231 | def input_prompt_local_agent_HMAS1_dialogue_func(lift_weight_item, state_update_prompt, central_response, response_total_list, pg_state_list, dialogue_history_list, env_act_feedback_list, dialogue_history_method): 232 | if len(pg_state_list) - len(response_total_list) != 1: 233 | raise error('state and response list do not match') 234 | if len(pg_state_list) - len(env_act_feedback_list) != 1: 235 | raise error('state and env act feedback list do not match') 236 | if len(pg_state_list) - len(dialogue_history_list) != 1: 237 | raise error('state and dialogue history list do not match') 238 | 239 | user_prompt_1 = f''' 240 | You are a box-lifting agent in a warehouse to lift boxes. Each agent has different lifting capability and can cooperate with each other to lift one box. In summation of lifting capability, the agents can lift all boxes. 241 | 242 | The boxes are identified by their volume, e.g., box[1.4V]. The agents are identified by their lifting weight capability, e.g., agent[1.5W]. Actions are like: "box[1.7V]":"agent[2.5W]", "box[6.0V]":"agent[1.5W], agent[2.5W]". 243 | 244 | The task of the central planner is to divide the group of each agent to lift all the boxes. After each step, environments provide updates for the left boxes. The goal of the group is to coordinate the agents optimally to minimize the step number. 245 | 246 | The current state of yourself is: f'Agent[{lift_weight_item}W]: has lifting capacity {lift_weight_item}W' 247 | 248 | Note that the agents can only lift one box at a time. {extra_prompt} [The volume of the box is roughly proportional to the weight of the box, but with some randomness. Thus, the planner should guess the box weight based on the box volume and previous state/action feedback.] 249 | 250 | The current left boxes and agents are: 251 | {state_update_prompt} 252 | 253 | [Action Output Instruction] 254 | Must first output 'EXECUTE', then on the new line specify your action plan in this format: {{"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W]"}}. 255 | Include an agent only if it has a task next. 256 | Example#1: 257 | EXECUTE 258 | {{"box[2.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W], agent[2.0W]"}} 259 | 260 | Example#2: 261 | EXECUTE 262 | {{"box[2.7V]":"agent[4.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W], agent[2.0W]"}} 263 | 264 | The previous state and action pairs at each step are: 265 | 266 | Please learn from previous steps. Not purely repeat the actions but learn why the state changes or remains in a dead loop. Avoid being stuck in action loops. 267 | 268 | The current state is {pg_state_list[-1]} 269 | The central planner\'s current action plan is: 270 | 271 | End your response by either: 1) output PROCEED, if the plans require further discussion; 2) If everyone has made proposals and got approved, output the final plan as soon as possible, must strictly follow [Action Output Instruction]! 272 | Your response: 273 | ''' 274 | 275 | token_num_count = len(enc.encode(user_prompt_1)) 276 | 277 | if dialogue_history_method == '_wo_any_dialogue_history' and cen_decen_framework == 'CMAS': 278 | pass 279 | elif dialogue_history_method in ( 280 | '_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_all_dialogue_history'): 281 | if dialogue_history_method == '_w_only_state_action_history': 282 | #print('fdsfdsafadsas') 283 | state_action_prompt = '' 284 | for i in range(len(response_total_list) - 1, -1, -1): 285 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\nEnvironment Feedback{i + 1}: {env_act_feedback_list[i]}\n\n' + state_action_prompt 286 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 287 | state_action_prompt = state_action_prompt_next 288 | else: 289 | break 290 | elif dialogue_history_method == '_w_compressed_dialogue_history' and cen_decen_framework != 'CMAS': 291 | state_action_prompt = '' 292 | for i in range(len(response_total_list) - 1, -1, -1): 293 | dialogue_summary = LLM_summarize_func(dialogue_history_list[i]) 294 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nSummary of Dialogues in each step{i + 1}: {dialogue_summary}\nAction{i + 1}: {response_total_list[i]}\nEnvironment Feedback{i + 1}: {env_act_feedback_list[i]}\n\n' + state_action_prompt 295 | #state_action_prompt_next = LLM_summarize_func(state_action_prompt_next_initial) 296 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 297 | state_action_prompt = state_action_prompt_next 298 | else: 299 | break 300 | elif dialogue_history_method == '_w_all_dialogue_history' and cen_decen_framework != 'CMAS': 301 | state_action_prompt = '' 302 | for i in range(len(response_total_list) - 1, -1, -1): 303 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nDialogue{i + 1}: {dialogue_history_list[i]}\nAction{i + 1}: {response_total_list[i]}\nEnvironment Feedback{i + 1}: {env_act_feedback_list[i]}\n\n' + state_action_prompt 304 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 305 | state_action_prompt = state_action_prompt_next 306 | else: 307 | break 308 | 309 | user_prompt_1 = f''' 310 | You are a box-lifting agent in a warehouse to lift boxes. Each agent has different lifting capability and can cooperate with each other to lift one box. In summation of lifting capability, the agents can lift all boxes. 311 | 312 | The boxes are identified by their volume, e.g., box[1.4V]. The agents are identified by their lifting weight capability, e.g., agent[1.5W]. Actions are like: "box[1.7V]":"agent[2.5W]", "box[6.0V]":"agent[1.5W], agent[2.5W]". 313 | 314 | The task of the central planner is to divide the group of each agent to lift all the boxes. After each step, environments provide updates for the left boxes. The goal of the group is to coordinate the agents optimally to minimize the step number. 315 | 316 | The current state of yourself is: f'Agent[{lift_weight_item}W]: has lifting capacity {lift_weight_item}W' 317 | 318 | Note that the agents can only lift one box at a time. {extra_prompt} [The volume of the box is roughly proportional to the weight of the box, but with some randomness. Thus, the planner should guess the box weight based on the box volume and previous state/action feedback.] 319 | 320 | The current left boxes and agents are: 321 | {state_update_prompt} 322 | 323 | [Action Output Instruction] 324 | Must first output 'EXECUTE', then on the new line specify your action plan in this format: {{"box[1.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W]"}}. 325 | Include an agent only if it has a task next. 326 | Example#1: 327 | EXECUTE 328 | {{"box[2.7V]":"agent[1.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W], agent[2.0W]"}} 329 | 330 | Example#2: 331 | EXECUTE 332 | {{"box[2.7V]":"agent[4.5W]", "box[3.0V]":"agent[1.5W], agent[2.5W], agent[2.0W]"}} 333 | 334 | The previous state and action pairs at each step are: 335 | {state_action_prompt} 336 | 337 | Please learn from previous steps. Not purely repeat the actions but learn why the state changes or remains in a dead loop. Avoid being stuck in action loops. 338 | 339 | The current state is {pg_state_list[-1]} 340 | The central planner\'s current action plan is: {{{central_response}}}. 341 | 342 | End your response by either: 1) output PROCEED, if the plans require further discussion; 2) If everyone has made proposals and got approved, output the final plan as soon as possible, must strictly follow [Action Output Instruction]! 343 | Your response: 344 | ''' 345 | return user_prompt_1 346 | 347 | def input_prompt_local_agent_HMAS2_dialogue_func(lift_weight_item, state_update_prompt, central_response, response_total_list, pg_state_list, dialogue_history_list, env_act_feedback_list, dialogue_history_method): 348 | if len(pg_state_list) - len(response_total_list) != 1: 349 | raise error('state and response list do not match') 350 | if len(pg_state_list) - len(env_act_feedback_list) != 1: 351 | raise error('state and env act feedback list do not match') 352 | if len(pg_state_list) - len(dialogue_history_list) != 1: 353 | raise error('state and dialogue history list do not match') 354 | 355 | user_prompt_1 = f''' 356 | You are a box-lifting agent in a warehouse to lift boxes. Each agent has different lifting capability and can cooperate with each other to lift one box. In summation of lifting capability, the agents can lift all boxes. 357 | 358 | The boxes are identified by their volume, e.g., box[1.4V]. The agents are identified by their lifting weight capability, e.g., agent[1.5W]. Actions are like: "box[1.7V]":"agent[2.5W]", "box[6.0V]":"agent[1.5W], agent[2.5W]". 359 | 360 | The task of the central planner is to divide the group of each agent to lift all the boxes. After each step, environments provide updates for the left boxes. The goal of the group is to coordinate the agents optimally to minimize the step number. 361 | 362 | The current state of yourself is: f'Agent[{lift_weight_item}W]: has lifting capacity {lift_weight_item}W' 363 | 364 | Note that the agents can only lift one box at a time. {extra_prompt} [The volume of the box is roughly proportional to the weight of the box, but with some randomness. Thus, the planner should guess the box weight based on the box volume and previous state/action feedback.] 365 | 366 | The current left boxes and agents are: 367 | {state_update_prompt} 368 | 369 | The previous state and action pairs at each step are: 370 | 371 | Please learn from previous steps. Not purely repeat the actions but learn why the state changes or remains in a dead loop. Avoid being stuck in action loops. 372 | 373 | The current state is {pg_state_list[-1]} 374 | The central planner\'s current action plan is: {{{central_response}}}. 375 | 376 | If you agree with it, respond 'I Agree', without any extra words. If not, briefly explain your objections to the central planner. Your response: 377 | ''' 378 | 379 | token_num_count = len(enc.encode(user_prompt_1)) 380 | 381 | if dialogue_history_method == '_wo_any_dialogue_history' and cen_decen_framework == 'CMAS': 382 | pass 383 | elif dialogue_history_method in ( 384 | '_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_all_dialogue_history'): 385 | if dialogue_history_method == '_w_only_state_action_history': 386 | #print('fdsfdsafadsas') 387 | state_action_prompt = '' 388 | for i in range(len(response_total_list) - 1, -1, -1): 389 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\nEnvironment Feedback{i + 1}: {env_act_feedback_list[i]}\n\n' + state_action_prompt 390 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 391 | state_action_prompt = state_action_prompt_next 392 | else: 393 | break 394 | elif dialogue_history_method == '_w_compressed_dialogue_history' and cen_decen_framework != 'CMAS': 395 | state_action_prompt = '' 396 | for i in range(len(response_total_list) - 1, -1, -1): 397 | dialogue_summary = LLM_summarize_func(dialogue_history_list[i]) 398 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nSummary of Dialogues in each step{i + 1}: {dialogue_summary}\nAction{i + 1}: {response_total_list[i]}\nEnvironment Feedback{i + 1}: {env_act_feedback_list[i]}\n\n' + state_action_prompt 399 | #state_action_prompt_next = LLM_summarize_func(state_action_prompt_next_initial) 400 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 401 | state_action_prompt = state_action_prompt_next 402 | else: 403 | break 404 | elif dialogue_history_method == '_w_all_dialogue_history' and cen_decen_framework != 'CMAS': 405 | state_action_prompt = '' 406 | for i in range(len(response_total_list) - 1, -1, -1): 407 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nDialogue{i + 1}: {dialogue_history_list[i]}\nAction{i + 1}: {response_total_list[i]}\nEnvironment Feedback{i + 1}: {env_act_feedback_list[i]}\n\n' + state_action_prompt 408 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 409 | state_action_prompt = state_action_prompt_next 410 | else: 411 | break 412 | 413 | user_prompt_1 = f''' 414 | You are a box-lifting agent in a warehouse to lift boxes. Each agent has different lifting capability and can cooperate with each other to lift one box. In summation of lifting capability, the agents can lift all boxes. 415 | 416 | The boxes are identified by their volume, e.g., box[1.4V]. The agents are identified by their lifting weight capability, e.g., agent[1.5W]. Actions are like: "box[1.7V]":"agent[2.5W]", "box[6.0V]":"agent[1.5W], agent[2.5W]". 417 | 418 | The task of the central planner is to divide the group of each agent to lift all the boxes. After each step, environments provide updates for the left boxes. The goal of the group is to coordinate the agents optimally to minimize the step number. 419 | 420 | The current state of yourself is: f'Agent[{lift_weight_item}W]: has lifting capacity {lift_weight_item}W' 421 | 422 | Note that the agents can only lift one box at a time. {extra_prompt} [The volume of the box is roughly proportional to the weight of the box, but with some randomness. Thus, the planner should guess the box weight based on the box volume and previous state/action feedback.] 423 | 424 | The current left boxes and agents are: 425 | {state_update_prompt} 426 | 427 | The previous state and action pairs at each step are: 428 | {state_action_prompt} 429 | 430 | Please learn from previous steps. Not purely repeat the actions but learn why the state changes or remains in a dead loop. Avoid being stuck in action loops. 431 | 432 | The current state is {pg_state_list[-1]} 433 | The central planner\'s current action plan is: {{{central_response}}}. 434 | 435 | If you agree with it, respond 'I Agree', without any extra words. If not, briefly explain your objections to the central planner. Your response: 436 | ''' 437 | return user_prompt_1 438 | 439 | def message_construct_func(user_prompt_list, response_total_list, dialogue_history_method): 440 | if f'{dialogue_history_method}' == '_w_all_dialogue_history': 441 | messages=[{"role": "system", "content": "You are a helpful assistant."}] 442 | #print('length of user_prompt_list', len(user_prompt_list)) 443 | for i in range(len(user_prompt_list)): 444 | messages.append({"role": "user", "content": user_prompt_list[i]}) 445 | if i < len(user_prompt_list)-1: 446 | messages.append({"role": "assistant", "content": response_total_list[i]}) 447 | #print('Length of messages', len(messages)) 448 | elif f'{dialogue_history_method}' in ('_wo_any_dialogue_history', '_w_only_state_action_history'): 449 | messages=[{"role": "system", "content": "You are a helpful assistant."}] 450 | messages.append({"role": "user", "content": user_prompt_list[-1]}) 451 | #print('Length of messages', len(messages)) 452 | return messages 453 | -------------------------------------------------------------------------------- /prompt_env1.py: -------------------------------------------------------------------------------- 1 | from LLM import * 2 | import tiktoken 3 | enc = tiktoken.get_encoding("cl100k_base") 4 | assert enc.decode(enc.encode("hello world")) == "hello world" 5 | enc = tiktoken.encoding_for_model("gpt-4") 6 | input_prompt_token_limit = 3000 7 | 8 | def LLM_summarize_func(state_action_prompt_next_initial, model_name): 9 | prompt1 = f"Please summarize the following content as concise as possible: \n{state_action_prompt_next_initial}" 10 | messages = [{"role": "system", "content": "You are a helpful assistant."}, 11 | {"role": "user", "content": prompt1}] 12 | response = GPT_response(messages, model_name) 13 | return response 14 | 15 | 16 | def input_prompt_1_func(state_update_prompt): 17 | user_prompt_1 = f''' 18 | You are a central planner directing agents in a grid-like field to move colored boxes. Each agent is assigned to a 1x1 square and can only interact with objects in its area. Agents can move a box to a neighboring square or a same-color target. Each square can contain many targets and boxes. 19 | 20 | The squares are identified by their center coordinates, e.g., square[0.5, 0.5]. Actions are like: move(box_red, target_red) or move(box_red, square[0.5, 0.5]). 21 | 22 | Your task is to instruct each agent to match all boxes to their color-coded targets. After each move, agents provide updates for the next sequence of actions. Your job is to coordinate the agents optimally. 23 | 24 | {state_update_prompt} 25 | 26 | Specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move...}}. Include an agent only if it has a task next. Now, plan the next step: 27 | ''' 28 | return user_prompt_1 29 | 30 | 31 | def input_prompt_1_only_state_action_func(state_update_prompt, response_total_list, pg_state_list): 32 | user_prompt_1 = f''' 33 | You are a central planner directing agents in a grid-like field to move colored boxes. Each agent is assigned to a 1x1 square and can only interact with objects in its area. Agents can move a box to a neighboring square or a same-color target. Each square can contain many targets and boxes. 34 | 35 | The squares are identified by their center coordinates, e.g., square[0.5, 0.5]. Actions are like: move(box_red, target_red) or move(box_red, square[0.5, 0.5]). 36 | 37 | Your task is to instruct each agent to match all boxes to their color-coded targets. After each move, agents provide updates for the next sequence of actions. Your job is to coordinate the agents optimally. 38 | 39 | The previous state and action pairs at each step are: 40 | 41 | Please learn from previous steps. Not purely repeat the actions but learn why the state changes or remains in a dead loop. Avoid being stuck in action loops. 42 | 43 | Hence, the current state is {pg_state_list[-1]}, with the possible actions: 44 | {state_update_prompt} 45 | 46 | Specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move...}}. Include an agent only if it has a task next. Now, plan the next step: 47 | ''' 48 | token_num_count = len(enc.encode(user_prompt_1)) 49 | 50 | if len(pg_state_list) - len(response_total_list) != 1: 51 | raise error('state and response list do not match') 52 | state_action_prompt = '' 53 | for i in range(len(response_total_list) - 1, -1, -1): 54 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt 55 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 56 | state_action_prompt = state_action_prompt_next 57 | else: 58 | break 59 | 60 | user_prompt_1 = f''' 61 | You are a central planner directing agents in a grid-like field to move colored boxes. Each agent is assigned to a 1x1 square and can only interact with objects in its area. Agents can move a box to a neighboring square or a same-color target. Each square can contain many targets and boxes. 62 | 63 | The squares are identified by their center coordinates, e.g., square[0.5, 0.5]. Actions are like: move(box_red, target_red) or move(box_red, square[0.5, 0.5]). 64 | 65 | Your task is to instruct each agent to match all boxes to their color-coded targets. After each move, agents provide updates for the next sequence of actions. Your job is to coordinate the agents optimally. 66 | 67 | The previous state and action pairs at each step are: 68 | {state_action_prompt} 69 | Please learn from previous steps. Not purely repeat the actions but learn why the state changes or remains in a dead loop. Avoid being stuck in action loops. 70 | 71 | Hence, the current state is {pg_state_list[-1]}, with the possible actions: 72 | {state_update_prompt} 73 | 74 | Specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move...}}. Include an agent only if it has a task next. Now, plan the next step: 75 | ''' 76 | return user_prompt_1 77 | 78 | 79 | def input_prompt_1_func_total(state_update_prompt, response_total_list, 80 | pg_state_list, dialogue_history_list, 81 | dialogue_history_method, cen_decen_framework): 82 | if len(pg_state_list) - len(response_total_list) != 1: 83 | raise error('state and response list do not match') 84 | if len(pg_state_list) - len(dialogue_history_list) != 1 and cen_decen_framework != 'CMAS': 85 | raise error('state and dialogue history list do not match') 86 | 87 | user_prompt_1 = f''' 88 | You are a central planner directing agents in a grid-like field to move colored boxes. Each agent is assigned to a 1x1 square and can only interact with objects in its area. Agents can move a box to a neighboring square or a same-color target. Each square can contain many targets and boxes. 89 | 90 | The squares are identified by their center coordinates, e.g., square[0.5, 0.5]. Actions are like: move(box_red, target_red) or move(box_red, square[0.5, 0.5]). 91 | 92 | Your task is to instruct each agent to match all boxes to their color-coded targets. After each move, agents provide updates for the next sequence of actions. Your job is to coordinate the agents optimally. 93 | 94 | The previous state and action pairs at each step are: 95 | 96 | Please learn from previous steps. Not purely repeat the actions but learn why the state changes or remains in a dead loop. Avoid being stuck in action loops. 97 | 98 | Hence, the current state is {pg_state_list[-1]}, with the possible actions: 99 | {state_update_prompt} 100 | 101 | Specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move...}}. Include an agent only if it has a task next. Now, plan the next step: 102 | ''' 103 | token_num_count = len(enc.encode(user_prompt_1)) 104 | 105 | if dialogue_history_method == '_wo_any_dialogue_history' or cen_decen_framework == 'CMAS': 106 | pass 107 | elif dialogue_history_method in ( 108 | '_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_all_dialogue_history'): 109 | if dialogue_history_method == '_w_only_state_action_history' and cen_decen_framework != 'CMAS': 110 | state_action_prompt = '' 111 | for i in range(len(response_total_list) - 1, -1, -1): 112 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt 113 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 114 | state_action_prompt = state_action_prompt_next 115 | else: 116 | break 117 | elif dialogue_history_method == '_w_compressed_dialogue_history' and cen_decen_framework != 'CMAS': 118 | state_action_prompt = '' 119 | for i in range(len(response_total_list) - 1, -1, -1): 120 | dialogue_summary = LLM_summarize_func(dialogue_history_list[i]) 121 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nSummary of Dialogues in each step{i + 1}: {dialogue_summary}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt 122 | #state_action_prompt_next = LLM_summarize_func(state_action_prompt_next_initial) 123 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 124 | state_action_prompt = state_action_prompt_next 125 | else: 126 | break 127 | elif dialogue_history_method == '_w_all_dialogue_history' and cen_decen_framework != 'CMAS': 128 | state_action_prompt = '' 129 | for i in range(len(response_total_list) - 1, -1, -1): 130 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nDialogue{i + 1}: {dialogue_history_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt 131 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 132 | state_action_prompt = state_action_prompt_next 133 | else: 134 | break 135 | 136 | user_prompt_1 = f''' 137 | You are a central planner directing agents in a grid-like field to move colored boxes. Each agent is assigned to a 1x1 square and can only interact with objects in its area. Agents can move a box to a neighboring square or a same-color target. Each square can contain many targets and boxes. 138 | 139 | The squares are identified by their center coordinates, e.g., square[0.5, 0.5]. Actions are like: move(box_red, target_red) or move(box_red, square[0.5, 0.5]). 140 | 141 | Your task is to instruct each agent to match all boxes to their color-coded targets. After each move, agents provide updates for the next sequence of actions. Your job is to coordinate the agents optimally. 142 | 143 | The previous state and action pairs at each step are: 144 | {state_action_prompt} 145 | Please learn from previous steps. Not purely repeat the actions but learn why the state changes or remains in a dead loop. Avoid being stuck in action loops. 146 | 147 | Hence, the current state is {pg_state_list[-1]}, with the possible actions: 148 | {state_update_prompt} 149 | 150 | Specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move...}}. Include an agent only if it has a task next. Now, plan the next step: 151 | ''' 152 | 153 | return user_prompt_1 154 | 155 | def input_prompt_local_agent_DMAS_dialogue_func(state_update_prompt_local_agent, state_update_prompt_other_agent, dialogue_history, response_total_list, 156 | pg_state_list, dialogue_history_list, 157 | dialogue_history_method): 158 | if len(pg_state_list) - len(response_total_list) != 1: 159 | raise error('state and response list do not match') 160 | if len(pg_state_list) - len(dialogue_history_list) != 1: 161 | raise error('state and dialogue history list do not match') 162 | 163 | user_prompt_1 = f''' 164 | You\'re a box-moving agent in a multi-agent system, stationed on a 1x1 square in a grid playground. You can only interact with objects in your square. Squares are denoted by their center coordinates (e.g., square[0.5, 0.5]), and actions involve moving boxes to targets or nearby squares, represented by colors (e.g., move(box_red, target_red)). Each square can contain many targets and boxes. 165 | All the agents coordinate with others together to come out a plan and achieve the goal: match each box with its color-coded target. 166 | The current state and possible actions of yourself are: {{{state_update_prompt_local_agent}}}. 167 | The current states and possible actions of all other agents are: {{{state_update_prompt_other_agent}}}. 168 | The previous state and action pairs at each step are: 169 | 170 | Please learn from previous steps. Not purely repeat the actions but learn why the state changes or remains in a dead loop. Avoid being stuck in action loops. 171 | 172 | 173 | [Action Output Instruction] 174 | Must first output 'EXECUTE', then on the new line specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move..."}}. 175 | Include an agent only if it has a task next. 176 | Example#1: 177 | EXECUTE 178 | {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move(box_green, square[0.5, 0.5])"}} 179 | 180 | Example#2: 181 | EXECUTE 182 | {{"Agent[0.5, 0.5]":"move(box_blue, target_blue)", "Agent[2.5, 1.5]":"move(box_red, square[1.5, 1.5])"}} 183 | 184 | The previous dialogue history is: {{{dialogue_history}}} 185 | Think step-by-step about the task and the previous dialogue history. Carefully check and correct them if they made a mistake. 186 | Respond very concisely but informatively, and do not repeat what others have said. Discuss with others to come up with the best plan. 187 | Propose exactly one action for yourself at the **current** round. 188 | End your response by either: 1) output PROCEED, if the plans require further discussion; 2) If everyone has made proposals and got approved, output the final plan as soon as possible, must strictly follow [Action Output Instruction]! 189 | Your response: 190 | ''' 191 | token_num_count = len(enc.encode(user_prompt_1)) 192 | 193 | if dialogue_history_method == '_wo_any_dialogue_history': 194 | pass 195 | elif dialogue_history_method in ('_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_all_dialogue_history'): 196 | if dialogue_history_method == '_w_only_state_action_history': 197 | state_action_prompt = '' 198 | for i in range(len(response_total_list) - 1, -1, -1): 199 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt 200 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 201 | state_action_prompt = state_action_prompt_next 202 | else: 203 | break 204 | elif dialogue_history_method == '_w_compressed_dialogue_history': 205 | state_action_prompt = '' 206 | for i in range(len(response_total_list) - 1, -1, -1): 207 | dialogue_summary = LLM_summarize_func(dialogue_history_list[i]) 208 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nSummary of Dialogues in each step{i + 1}: {dialogue_summary}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt 209 | #state_action_prompt_next = LLM_summarize_func(state_action_prompt_next_initial) 210 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 211 | state_action_prompt = state_action_prompt_next 212 | else: 213 | break 214 | elif dialogue_history_method == '_w_all_dialogue_history': 215 | state_action_prompt = '' 216 | for i in range(len(response_total_list) - 1, -1, -1): 217 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nDialogue{i + 1}: {dialogue_history_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt 218 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 219 | state_action_prompt = state_action_prompt_next 220 | else: 221 | break 222 | 223 | user_prompt_1 = f''' 224 | You\'re a box-moving agent in a multi-agent system, stationed on a 1x1 square in a grid playground. You can only interact with objects in your square. Squares are denoted by their center coordinates (e.g., square[0.5, 0.5]), and actions involve moving boxes to targets or nearby squares, represented by colors (e.g., move(box_red, target_red)). Each square can contain many targets and boxes. 225 | All the agents coordinate with others together to come out a plan and achieve the goal: match each box with its color-coded target. 226 | The current state and possible actions of yourself are: {{{state_update_prompt_local_agent}}}. 227 | The current states and possible actions of all other agents are: {{{state_update_prompt_other_agent}}}. 228 | The previous state and action pairs at each step are: 229 | {state_action_prompt} 230 | Please learn from previous steps. Not purely repeat the actions but learn why the state changes or remains in a dead loop. Avoid being stuck in action loops. 231 | 232 | 233 | [Action Output Instruction] 234 | Must first output 'EXECUTE', then on the new line specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move..."}}. 235 | Include an agent only if it has a task next. 236 | Example#1: 237 | EXECUTE 238 | {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move(box_green, square[0.5, 0.5])"}} 239 | 240 | Example#2: 241 | EXECUTE 242 | {{"Agent[0.5, 0.5]":"move(box_blue, target_blue)", "Agent[2.5, 1.5]":"move(box_red, square[1.5, 1.5])"}} 243 | 244 | The previous dialogue history is: {{{dialogue_history}}} 245 | Think step-by-step about the task and the previous dialogue history. Carefully check and correct them if they made a mistake. 246 | Respond very concisely but informatively, and do not repeat what others have said. Discuss with others to come up with the best plan. 247 | Propose exactly one action for yourself at the **current** round. 248 | End your response by either: 1) output PROCEED, if the plans require further discussion; 2) If everyone has made proposals and got approved, output the final plan, must strictly follow [Action Output Instruction]! 249 | Your response: 250 | ''' 251 | 252 | return user_prompt_1 253 | 254 | 255 | def input_prompt_local_agent_HMAS1_dialogue_fast_plan_func(state_update_prompt_local_agent, state_update_prompt_other_agent, 256 | dialogue_history, response_total_list, pg_state_list, dialogue_history_list, 257 | dialogue_history_method, initial_plan=''): 258 | if len(pg_state_list) - len(response_total_list) != 1: 259 | raise error('state and response list do not match') 260 | if len(pg_state_list) - len(dialogue_history_list) != 1: 261 | raise error('state and dialogue history list do not match') 262 | 263 | user_prompt_1 = f''' 264 | You\'re a box-moving agent in a multi-agent system, stationed on a 1x1 square in a grid playground. You can only interact with objects in your square. Squares are denoted by their center coordinates (e.g., square[0.5, 0.5]), and actions involve moving boxes to targets or nearby squares, represented by colors (e.g., move(box_red, target_red)). Each square can contain many targets and boxes. 265 | one extra planner first proposes an initial plan to coordinates all agents to achieve the goal: match each box with its color-coded target. 266 | Then all the action agents discuss and coordiante with each other to come out a final plan. 267 | The current state and possible actions of yourself are: {{{state_update_prompt_local_agent}}}. 268 | The current states and possible actions of all other agents are: {{{state_update_prompt_other_agent}}}. 269 | The previous state and action pairs at each step are: 270 | 271 | Please learn from previous steps. Not purely repeat the actions but learn why the state changes or remains in a dead loop. Avoid being stuck in action loops. 272 | 273 | [Action Output Instruction] 274 | Must first output 'EXECUTE', then on the new line specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move..."}}. 275 | Include an agent only if it has a task next. 276 | Example#1: 277 | EXECUTE 278 | {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move(box_green, square[0.5, 0.5])"}} 279 | 280 | Example#2: 281 | EXECUTE 282 | {{"Agent[0.5, 0.5]":"move(box_blue, target_blue)", "Agent[2.5, 1.5]":"move(box_red, square[1.5, 1.5])"}} 283 | 284 | The initial plan is: {{{initial_plan}}} 285 | The previous dialogue history is: {{{dialogue_history}}} 286 | Think step-by-step about the task, initial plan, and the previous dialogue history. Carefully check and correct them if they made a mistake. 287 | End your response by outputting the final plan, must strictly follow [Action Output Instruction]! 288 | Your response: 289 | ''' 290 | 291 | token_num_count = len(enc.encode(user_prompt_1)) 292 | 293 | if dialogue_history_method == '_wo_any_dialogue_history': 294 | pass 295 | elif dialogue_history_method in ( 296 | '_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_all_dialogue_history'): 297 | if dialogue_history_method == '_w_only_state_action_history': 298 | state_action_prompt = '' 299 | for i in range(len(response_total_list) - 1, -1, -1): 300 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt 301 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 302 | state_action_prompt = state_action_prompt_next 303 | else: 304 | break 305 | elif dialogue_history_method == '_w_compressed_dialogue_history': 306 | state_action_prompt = '' 307 | for i in range(len(response_total_list) - 1, -1, -1): 308 | dialogue_summary = LLM_summarize_func(dialogue_history_list[i]) 309 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nSummary of Dialogues in each step{i + 1}: {dialogue_summary}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt 310 | # state_action_prompt_next = LLM_summarize_func(state_action_prompt_next_initial) 311 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 312 | state_action_prompt = state_action_prompt_next 313 | else: 314 | break 315 | elif dialogue_history_method == '_w_all_dialogue_history': 316 | state_action_prompt = '' 317 | for i in range(len(response_total_list) - 1, -1, -1): 318 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nDialogue{i + 1}: {dialogue_history_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt 319 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 320 | state_action_prompt = state_action_prompt_next 321 | else: 322 | break 323 | 324 | user_prompt_1 = f''' 325 | You\'re a box-moving agent in a multi-agent system, stationed on a 1x1 square in a grid playground. You can only interact with objects in your square. Squares are denoted by their center coordinates (e.g., square[0.5, 0.5]), and actions involve moving boxes to targets or nearby squares, represented by colors (e.g., move(box_red, target_red)). Each square can contain many targets and boxes. 326 | one extra planner first proposes an initial plan to coordinates all agents to achieve the goal: match each box with its color-coded target. 327 | Then all the action agents discuss and coordiante with each other to come out a final plan. 328 | The current state and possible actions of yourself are: {{{state_update_prompt_local_agent}}}. 329 | The current states and possible actions of all other agents are: {{{state_update_prompt_other_agent}}}. 330 | The previous state and action pairs at each step are: 331 | {state_action_prompt} 332 | Please learn from previous steps. Not purely repeat the actions but learn why the state changes or remains in a dead loop. Avoid being stuck in action loops. 333 | 334 | [Action Output Instruction] 335 | Must first output 'EXECUTE', then on the new line specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move..."}}. 336 | Include an agent only if it has a task next. 337 | Example#1: 338 | EXECUTE 339 | {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move(box_green, square[0.5, 0.5])"}} 340 | 341 | Example#2: 342 | EXECUTE 343 | {{"Agent[0.5, 0.5]":"move(box_blue, target_blue)", "Agent[2.5, 1.5]":"move(box_red, square[1.5, 1.5])"}} 344 | 345 | The initial plan is: {{{initial_plan}}} 346 | The previous dialogue history is: {{{dialogue_history}}} 347 | Think step-by-step about the task, initial plan, and the previous dialogue history. Carefully check and correct them if they made a mistake. 348 | End your response by outputting the final plan, must strictly follow [Action Output Instruction]! 349 | Your response: 350 | ''' 351 | return user_prompt_1 352 | 353 | 354 | def input_prompt_local_agent_HMAS1_dialogue_func(state_update_prompt_local_agent, state_update_prompt_other_agent, dialogue_history, response_total_list, pg_state_list, dialogue_history_list, dialogue_history_method, initial_plan = ''): 355 | if len(pg_state_list) - len(response_total_list) != 1: 356 | raise error('state and response list do not match') 357 | if len(pg_state_list) - len(dialogue_history_list) != 1: 358 | raise error('state and dialogue history list do not match') 359 | 360 | user_prompt_1 = f''' 361 | You\'re a box-moving agent in a multi-agent system, stationed on a 1x1 square in a grid playground. You can only interact with objects in your square. Squares are denoted by their center coordinates (e.g., square[0.5, 0.5]), and actions involve moving boxes to targets or nearby squares, represented by colors (e.g., move(box_red, target_red)). Each square can contain many targets and boxes. 362 | one extra planner first proposes an initial plan to coordinates all agents to achieve the goal: match each box with its color-coded target. 363 | Then all the action agents discuss and coordiante with each other to come out a final plan. 364 | The current state and possible actions of yourself are: {{{state_update_prompt_local_agent}}}. 365 | The current states and possible actions of all other agents are: {{{state_update_prompt_other_agent}}}. 366 | The previous state and action pairs at each step are: 367 | 368 | Please learn from previous steps. Not purely repeat the actions but learn why the state changes or remains in a dead loop. Avoid being stuck in action loops. 369 | 370 | [Action Output Instruction] 371 | Must first output 'EXECUTE', then on the new line specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move..."}}. 372 | Include an agent only if it has a task next. 373 | Example#1: 374 | EXECUTE 375 | {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move(box_green, square[0.5, 0.5])"}} 376 | 377 | Example#2: 378 | EXECUTE 379 | {{"Agent[0.5, 0.5]":"move(box_blue, target_blue)", "Agent[2.5, 1.5]":"move(box_red, square[1.5, 1.5])"}} 380 | 381 | The initial plan is: {{{initial_plan}}} 382 | The previous dialogue history is: {{{dialogue_history}}} 383 | Think step-by-step about the task, initial plan, and the previous dialogue history. Carefully check and correct them if they made a mistake. 384 | Respond very concisely but informatively, and do not repeat what others have said. Discuss with others to come up with the best plan. 385 | Propose exactly one action for yourself at the **current** round. 386 | End your response by either: 1) output PROCEED, if the plans require further discussion; 2) If everyone has made proposals and got approved, output the final plan as soon as possible, must strictly follow [Action Output Instruction]! 387 | Your response: 388 | ''' 389 | 390 | token_num_count = len(enc.encode(user_prompt_1)) 391 | 392 | if dialogue_history_method == '_wo_any_dialogue_history': 393 | pass 394 | elif dialogue_history_method in ( 395 | '_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_all_dialogue_history'): 396 | if dialogue_history_method == '_w_only_state_action_history': 397 | state_action_prompt = '' 398 | for i in range(len(response_total_list) - 1, -1, -1): 399 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt 400 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 401 | state_action_prompt = state_action_prompt_next 402 | else: 403 | break 404 | elif dialogue_history_method == '_w_compressed_dialogue_history': 405 | state_action_prompt = '' 406 | for i in range(len(response_total_list) - 1, -1, -1): 407 | dialogue_summary = LLM_summarize_func(dialogue_history_list[i]) 408 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nSummary of Dialogues in each step{i + 1}: {dialogue_summary}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt 409 | # state_action_prompt_next = LLM_summarize_func(state_action_prompt_next_initial) 410 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 411 | state_action_prompt = state_action_prompt_next 412 | else: 413 | break 414 | elif dialogue_history_method == '_w_all_dialogue_history': 415 | state_action_prompt = '' 416 | for i in range(len(response_total_list) - 1, -1, -1): 417 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nDialogue{i + 1}: {dialogue_history_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt 418 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 419 | state_action_prompt = state_action_prompt_next 420 | else: 421 | break 422 | 423 | user_prompt_1 = f''' 424 | You\'re a box-moving agent in a multi-agent system, stationed on a 1x1 square in a grid playground. You can only interact with objects in your square. Squares are denoted by their center coordinates (e.g., square[0.5, 0.5]), and actions involve moving boxes to targets or nearby squares, represented by colors (e.g., move(box_red, target_red)). Each square can contain many targets and boxes. 425 | one extra planner first proposes an initial plan to coordinates all agents to achieve the goal: match each box with its color-coded target. 426 | Then all the action agents discuss and coordiante with each other to come out a final plan. 427 | The current state and possible actions of yourself are: {{{state_update_prompt_local_agent}}}. 428 | The current states and possible actions of all other agents are: {{{state_update_prompt_other_agent}}}. 429 | The previous state and action pairs at each step are: 430 | {state_action_prompt} 431 | Please learn from previous steps. Not purely repeat the actions but learn why the state changes or remains in a dead loop. Avoid being stuck in action loops. 432 | 433 | [Action Output Instruction] 434 | Must first output 'EXECUTE', then on the new line specify your action plan in this format: {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move..."}}. 435 | Include an agent only if it has a task next. 436 | Example#1: 437 | EXECUTE 438 | {{"Agent[0.5, 0.5]":"move(box_blue, square[0.5, 1.5])", "Agent[1.5, 0.5]":"move(box_green, square[0.5, 0.5])"}} 439 | 440 | Example#2: 441 | EXECUTE 442 | {{"Agent[0.5, 0.5]":"move(box_blue, target_blue)", "Agent[2.5, 1.5]":"move(box_red, square[1.5, 1.5])"}} 443 | 444 | The initial plan is: {{{initial_plan}}} 445 | The previous dialogue history is: {{{dialogue_history}}} 446 | Think step-by-step about the task, initial plan, and the previous dialogue history. Carefully check and correct them if they made a mistake. 447 | Respond very concisely but informatively, and do not repeat what others have said. Discuss with others to come up with the best plan. 448 | Propose exactly one action for yourself at the **current** round. 449 | End your response by either: 1) output PROCEED, if the plans require further discussion; 2) If everyone has made proposals and got approved, output the final plan as soon as possible, must strictly follow [Action Output Instruction]! 450 | Your response: 451 | ''' 452 | return user_prompt_1 453 | 454 | def input_prompt_local_agent_HMAS2_dialogue_func(state_update_prompt_local_agent, state_update_prompt_other_agent, central_response, response_total_list, pg_state_list, dialogue_history_list, dialogue_history_method): 455 | if len(pg_state_list) - len(response_total_list) != 1: 456 | raise error('state and response list do not match') 457 | if len(pg_state_list) - len(dialogue_history_list) != 1: 458 | raise error('state and dialogue history list do not match') 459 | 460 | user_prompt_1 = f''' 461 | You\'re a box-moving agent in a multi-agent system, stationed on a 1x1 square in a grid playground. You can only interact with objects in your square. Squares are denoted by their center coordinates (e.g., square[0.5, 0.5]), and actions involve moving boxes to targets or nearby squares, represented by colors (e.g., move(box_red, target_red)). Each square can contain many targets and boxes. 462 | 463 | A central planner coordinates all agents to achieve the goal: match each box with its color-coded target. 464 | 465 | The current state and possible actions of yourself are: {{{state_update_prompt_local_agent}}}. 466 | The current states and possible actions of all other agents are: {{{state_update_prompt_other_agent}}}. 467 | The previous state and action pairs at each step are: 468 | 469 | Please learn from previous steps. Not purely repeat the actions but learn why the state changes or remains in a dead loop. Avoid being stuck in action loops. 470 | 471 | The central planner\'s current action plan is: {{{central_response}}}. 472 | 473 | Please evaluate the given plan. If you agree with it, respond 'I Agree', without any extra words. If not, briefly explain your objections to the central planner. Your response: 474 | ''' 475 | 476 | token_num_count = len(enc.encode(user_prompt_1)) 477 | 478 | if dialogue_history_method == '_wo_any_dialogue_history': 479 | pass 480 | elif dialogue_history_method in ( 481 | '_w_only_state_action_history', '_w_compressed_dialogue_history', '_w_all_dialogue_history'): 482 | if dialogue_history_method == '_w_only_state_action_history': 483 | state_action_prompt = '' 484 | for i in range(len(response_total_list) - 1, -1, -1): 485 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt 486 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 487 | state_action_prompt = state_action_prompt_next 488 | else: 489 | break 490 | elif dialogue_history_method == '_w_compressed_dialogue_history': 491 | state_action_prompt = '' 492 | for i in range(len(response_total_list) - 1, -1, -1): 493 | dialogue_summary = LLM_summarize_func(dialogue_history_list[i]) 494 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nSummary of Dialogues in each step{i + 1}: {dialogue_summary}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt 495 | # state_action_prompt_next = LLM_summarize_func(state_action_prompt_next_initial) 496 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 497 | state_action_prompt = state_action_prompt_next 498 | else: 499 | break 500 | elif dialogue_history_method == '_w_all_dialogue_history': 501 | state_action_prompt = '' 502 | for i in range(len(response_total_list) - 1, -1, -1): 503 | state_action_prompt_next = f'State{i + 1}: {pg_state_list[i]}\nDialogue{i + 1}: {dialogue_history_list[i]}\nAction{i + 1}: {response_total_list[i]}\n\n' + state_action_prompt 504 | if token_num_count + len(enc.encode(state_action_prompt_next)) < input_prompt_token_limit: 505 | state_action_prompt = state_action_prompt_next 506 | else: 507 | break 508 | 509 | user_prompt_1 = f''' 510 | You\'re a box-moving agent in a multi-agent system, stationed on a 1x1 square in a grid playground. You can only interact with objects in your square. Squares are denoted by their center coordinates (e.g., square[0.5, 0.5]), and actions involve moving boxes to targets or nearby squares, represented by colors (e.g., move(box_red, target_red)). Each square can contain many targets and boxes. 511 | 512 | A central planner coordinates all agents to achieve the goal: match each box with its color-coded target. 513 | 514 | The current state and possible actions of yourself are: {{{state_update_prompt_local_agent}}}. 515 | The current states and possible actions of all other agents are: {{{state_update_prompt_other_agent}}}. 516 | The previous state and action pairs at each step are: 517 | {state_action_prompt} 518 | Please learn from previous steps. Not purely repeat the actions but learn why the state changes or remains in a dead loop. Avoid being stuck in action loops. 519 | 520 | The central planner\'s current action plan is: {{{central_response}}}. 521 | 522 | Please evaluate the given plan. If you agree with it, respond 'I Agree', without any extra words. If not, briefly explain your objections to the central planner. Your response: 523 | ''' 524 | return user_prompt_1 525 | 526 | 527 | def input_reprompt_func(state_update_prompt): 528 | user_reprompt = f''' 529 | Finished! The updated state is as follows(combined targets and boxes with the same color have been removed): 530 | 531 | {state_update_prompt} 532 | 533 | The output should be like json format like: {{Agent[0.5, 0.5]:move(box_blue, square[0.5, 1.5]), Agent[1.5, 0.5]:move...}}. If no action for one agent in the next step, just do not include its action in the output. Also remember at most one action for each agent in each step. 534 | 535 | Next step output: 536 | ''' 537 | return user_reprompt 538 | 539 | def message_construct_func(user_prompt_list, response_total_list, dialogue_history_method): 540 | if f'{dialogue_history_method}' == '_w_all_dialogue_history': 541 | messages=[{"role": "system", "content": "You are a helpful assistant."}] 542 | #print('length of user_prompt_list', len(user_prompt_list)) 543 | for i in range(len(user_prompt_list)): 544 | messages.append({"role": "user", "content": user_prompt_list[i]}) 545 | if i < len(user_prompt_list)-1: 546 | messages.append({"role": "assistant", "content": response_total_list[i]}) 547 | #print('Length of messages', len(messages)) 548 | elif f'{dialogue_history_method}' in ('_wo_any_dialogue_history', '_w_only_state_action_history'): 549 | messages=[{"role": "system", "content": "You are a helpful assistant."}] 550 | messages.append({"role": "user", "content": user_prompt_list[-1]}) 551 | #print('Length of messages', len(messages)) 552 | return messages 553 | --------------------------------------------------------------------------------