├── README.md
└── images
    ├── 1af4535a-acc3-4417-ae33-675f4301f560.png
    ├── 2d94d985-d47e-4899-9760-c1cb8f19cd89.png
    ├── 5b923b31-dbbf-470f-af09-5125f5b91ab0.png
    ├── 80237155-9b7b-4f70-9c2e-8ee38029becd.png
    ├── a608d080-665a-4ab1-bd8f-d5bd121454da.png
    ├── a7817c0b-04b1-4a7c-9535-f9ff7801a689.png
    ├── data-demo.jpg
    ├── ee709e8b-6f05-428d-abff-2578914aeb0d.png
    ├── evaluation_affordance.png
    ├── evaluation_planning.png
    └── evaluation_trajectory.png

/README.md:
--------------------------------------------------------------------------------
# ShareRobot Dataset

**ShareRobot** is a high-quality, heterogeneous dataset labeled with multi-dimensional information, including task planning, object affordance, and end-effector trajectory, effectively enhancing a range of robotic capabilities.

- **Project Website**: [[CVPR 2025] RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete](https://superrobobrain.github.io/)
- **Download Link**: [ShareRobot Dataset](https://huggingface.co/datasets/BAAI/ShareRobot)

## Overview of ShareRobot

![ee709e8b-6f05-428d-abff-2578914aeb0d](./images/ee709e8b-6f05-428d-abff-2578914aeb0d.png)

For **planning**, we provide 51,403 episodes, each with 30 frames. During data generation, we design 5 different templates for each of the 10 question types in RoboVQA [1], and for every instance we randomly select 2 templates per question type to generate question-answer pairs. This expands the 51,403 instances into 1,027,990 question-answer pairs, with annotators monitoring the generation process to maintain the dataset's integrity (an illustrative sketch of this sampling scheme is given below, after the Data Sources overview).

For **affordance**, we provide 6,522 images, each with an affordance area aligned with an instruction.

For **trajectory**, we provide 6,870 images, each with at least 3 {x, y} coordinates aligned with an instruction.



## Data Sources🌍

![a608d080-665a-4ab1-bd8f-d5bd121454da](./images/a608d080-665a-4ab1-bd8f-d5bd121454da.png)

The **ShareRobot** dataset draws on 23 original datasets from the Open X-Embodiment dataset [2], covering 12 embodiments and 107 types of atomic tasks.
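
As described in the overview, each planning episode is expanded into question-answer pairs by sampling 2 of the 5 templates written for each of the 10 RoboVQA-style question types. The snippet below is a minimal, illustrative sketch of that expansion; the question-type name and template wording are hypothetical, and only the sampling mechanism follows the description above.

```python
import random

# Hypothetical templates: the real dataset uses 10 question types with
# 5 hand-written templates each; a single made-up type is shown here.
TEMPLATES = {
    "planning": [
        "What steps are needed to {task}?",
        "List the sub-tasks required to {task}.",
        "How would a robot {task}?",
        "Break '{task}' into executable steps.",
        "Describe a plan to {task}.",
    ],
    # ... 9 more question types, 5 templates each
}

def make_question_pairs(episode_task, rng):
    """Sample 2 of the 5 templates for every question type of one episode."""
    pairs = []
    for qtype, templates in TEMPLATES.items():
        for template in rng.sample(templates, k=2):
            pairs.append({"type": qtype, "question": template.format(task=episode_task)})
    return pairs

# With all 10 question types present, each episode yields 10 x 2 = 20 questions,
# which over 51,403 episodes is on the order of the 1,027,990 pairs reported.
print(make_question_pairs("open the drawer", random.Random(0)))
```
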



### Raw Dataset for Planning

| Raw Dataset | Number of Episodes |
|:--------------------------------------------------------------:| ------------------:|
| nyu_door_opening_surprising_effectiveness | 421 |
| bridge | 15738 |
| dlr_edan_shared_control_converted_externally_to_rlds | 63 |
| utokyo_xarm_pick_and_place_converted_externally_to_rlds | 92 |
| cmu_stretch | 10 |
| asu_table_top_converted_externally_to_rlds | 109 |
| dlr_sara_pour_converted_externally_to_rlds | 51 |
| utokyo_xarm_bimanual_converted_externally_to_rlds | 27 |
| robo_set | 18164 |
| dobbe | 5200 |
| berkeley_autolab_ur5 | 882 |
| qut_dexterous_manpulation | 192 |
| aloha_mobile | 264 |
| dlr_sara_grid_clamp_converted_externally_to_rlds | 40 |
| ucsd_pick_and_place_dataset_converted_externally_to_rlds | 569 |
| ucsd_kitchen_dataset_converted_externally_to_rlds | 39 |
| jaco_play | 956 |
| utokyo_pr2_opening_fridge_converted_externally_to_rlds | 64 |
| conq_hose_manipulation | 56 |
| fmb | 7836 |
| plex_robosuite | 398 |
| utokyo_pr2_tabletop_manipulation_converted_externally_to_rlds | 189 |
| viola | 44 |



### Raw Dataset for Affordance

| Raw Dataset | Number of Images |
|:--------------------------------------------------------------:| ----------------:|
| utokyo_pr2_tabletop_manipulation_converted_externally_to_rlds | 24 |
| utokyo_xarm_pick_and_place_converted_externally_to_rlds | 23 |
| ucsd_kitchen_dataset_converted_externally_to_rlds | 10 |
| ucsd_pick_and_place_dataset_converted_externally_to_rlds | 112 |
| nyu_door_opening_surprising_effectiveness | 85 |
| jaco_play | 171 |
| bridge | 2610 |
| utokyo_pr2_opening_fridge_converted_externally_to_rlds | 12 |
| asu_table_top_converted_externally_to_rlds | 24 |
| viola | 1 |
| berkeley_autolab_ur5 | 122 |
| aloha_mobile | 23 |
| conq_hose_manipulation | 1 |
| dobbe | 717 |
| fmb | 561 |
| plex_robosuite | 13 |
| qut_dexterous_manpulation | 16 |
| robo_set | 1979 |
| dlr_edan_shared_control_converted_externally_to_rlds | 18 |
| **Summary** | 6522 |



### Raw Dataset for Trajectory

| Raw Dataset | Number of Images |
|:--------------------------------------------------------------:| ----------------:|
| utokyo_pr2_tabletop_manipulation_converted_externally_to_rlds | 35 |
| utokyo_xarm_pick_and_place_converted_externally_to_rlds | 36 |
| ucsd_kitchen_dataset_converted_externally_to_rlds | 19 |
| dlr_sara_grid_clamp_converted_externally_to_rlds | 1 |
| ucsd_pick_and_place_dataset_converted_externally_to_rlds | 109 |
| nyu_door_opening_surprising_effectiveness | 74 |
| jaco_play | 175 |
| utokyo_xarm_bimanual_converted_externally_to_rlds | 7 |
| bridge | 2986 |
| utokyo_pr2_opening_fridge_converted_externally_to_rlds | 12 |
| asu_table_top_converted_externally_to_rlds | 22 |
| berkeley_autolab_ur5 | 164 |
| dobbe | 759 |
| fmb | 48 |
| qut_dexterous_manpulation | 29 |
| robo_set | 2374 |
| dlr_sara_pour_converted_externally_to_rlds | 3 |
| dlr_edan_shared_control_converted_externally_to_rlds | 17 |
| **Summary** | 6870 |



## Data Format

### Planning

![data-demo](./images/data-demo.jpg)

```json
{
    "id": {
"/mnt/hpfs/baaiei/jyShi/rt_frames_success/rtx_frames_success_42/62_robo_set#episode_1570", 121 | "task": "Future_Prediction_Task", 122 | "selected_step": 3, 123 | "conversations": [ 124 | { 125 | "from": "human", 126 | "value": " After , what's the most probable next event?" 127 | }, 128 | { 129 | "from": "gpt", 130 | "value": "" 131 | } 132 | ], 133 | "image": [ 134 | "/path/to/image_0-25" 135 | ] 136 | } 137 | } 138 | ``` 139 | 140 |       141 | 142 | 143 | 144 | ### Affordance 145 | 146 | 147 |
```json
{
    "id": 2486,
    "meta_data": {
        "original_dataset": "bridge",
        "original_width": 640,
        "original_height": 480
    },
    "instruction": "place the red fork to the left of the left burner",
    "affordance": {
        "x": 352.87425387858815,
        "y": 186.47871614766484,
        "width": 19.296008229513156,
        "height": 14.472006172134865
    }
}
```



#### Visualize Code

```python
import json
import os

import cv2
import numpy as np

img_dir = '/path/to/your/original/images/dir'
affordance_json = '/path/to/your/affordances/json'
output_img_dir = '/path/to/your/visualized/images/dir'

with open(affordance_json, 'r') as f:
    data = json.load(f)
    for item in data:
        # 'id' is used as the image path relative to img_dir
        filepath = os.path.join(img_dir, item['id'])

        image = cv2.imread(filepath)
        color = (255, 0, 0)  # box color (BGR)
        thickness = 2

        x_min, y_min = item['affordance']['x'], item['affordance']['y']
        x_max = item['affordance']['x'] + item['affordance']['width']
        y_max = item['affordance']['y'] + item['affordance']['height']

        # Four corners of the affordance box
        pts = np.array([
            [x_min, y_min],  # top-left
            [x_max, y_min],  # top-right
            [x_max, y_max],  # bottom-right
            [x_min, y_max]   # bottom-left
        ], dtype=np.float32)

        # Draw the box outline
        cv2.polylines(image, [pts.astype(int)], isClosed=True, color=color, thickness=thickness)

        # Mirror the input directory structure under output_img_dir
        relative_path = os.path.relpath(filepath, img_dir)
        output_img_path = os.path.join(output_img_dir, relative_path)

        # Create the output directory if needed
        output_directory = os.path.dirname(output_img_path)
        if not os.path.exists(output_directory):
            os.makedirs(output_directory)

        # Debug output
        print(f"Input filepath: {filepath}")
        print(f"Output image path: {output_img_path}")
        print(f"Output directory: {output_directory}")

        # Save the visualization
        cv2.imwrite(output_img_path, image)
```



### Trajectory

```json
{
    "id": 456,
    "meta_data": {
        "original_dataset": "bridge",
        "original_width": 640,
        "original_height": 480
    },
    "instruction": "reach for the carrot",
    "points": [
        [
            265.45454545454544,
            120.0
        ],
        [
            275.1515151515152,
            162.42424242424244
        ],
        [
            280.0,
            213.33333333333331
        ],
        [
            280.0,
            259.3939393939394
        ]
    ]
}
```

#### Visualize Code

```python
import json
import os

from PIL import Image, ImageDraw

trajectory_final = '/path/to/your/trajectory_json'
img_dir = '/path/to/your/original/images/dir'
output_img_dir = '/path/to/your/visualized/images/dir'

with open(trajectory_final, 'r') as f:
    data = json.load(f)
    for item in data:
        # 'id' is used as the image path relative to img_dir
        filepath = os.path.join(img_dir, item['id'])
        points = item['points']

        image = Image.open(filepath).convert("RGB")  # make sure the image is in RGB mode
        draw = ImageDraw.Draw(image)  # drawing handle
        # Line color and width
        color = (255, 0, 0)  # red (RGB)
        thickness = 2

        # Annotated points are already in the image's pixel space; convert them to tuples for PIL
        scaled_points = [
            (point[0], point[1])
            for point in points
        ]
        # Connect consecutive points in order
        for i in range(len(scaled_points) - 1):
            draw.line([scaled_points[i], scaled_points[i + 1]], fill=color, width=thickness)

        # Mirror the input directory structure under output_img_dir
        relative_path = os.path.relpath(filepath, img_dir)
        output_img_path = os.path.join(output_img_dir, relative_path)

        # Create the output directory if needed
        output_directory = os.path.dirname(output_img_path)
        if not os.path.exists(output_directory):
            os.makedirs(output_directory)

        # Debug output
        print(f"Input filepath: {filepath}")
        print(f"Output image path: {output_img_path}")
        print(f"Output directory: {output_directory}")

        # Save the visualization
        image.save(output_img_path)
```



## Evaluation🚀

Powered by the ShareRobot dataset, the RoboBrain model achieves stunning results.🌟

**Task planning capability**: The RoboBrain model trained on ShareRobot achieves a 30.2% improvement in task decomposition accuracy (BLEU-4 reaches 55.05%), significantly better than existing methods (an illustrative BLEU-4 computation is sketched below the planning figure).

**Affordance perception capability**: The average precision (AP) of object affordance area recognition is 27.1%, which is 14.6% higher than the baseline model.

**Trajectory prediction capability**: End-effector trajectory prediction error is reduced by 42.9% (DFD decreases from 0.191 to 0.109); a sketch of this metric follows the figures below.

**General capability**: On the OpenEQA benchmark, the scene understanding score surpasses general multimodal models such as GPT-4V, showing that training RoboBrain with ShareRobot does not sacrifice its general ability.

![evaluation_planning](./images/evaluation_planning.png)

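
The task-decomposition numbers above are reported as BLEU-4. The README does not specify the exact implementation, so the snippet below is only a minimal illustration of how a sentence-level BLEU-4 score can be computed with NLTK; the prediction/reference pair is made up.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Hypothetical predicted step sequence and reference annotation, tokenized by whitespace.
reference = "pick up the red fork and place it to the left of the left burner".split()
candidate = "pick up the red fork and put it left of the left burner".split()

# BLEU-4: uniform weights over 1- to 4-gram precisions, with smoothing so that
# short outputs missing some higher-order n-grams do not score exactly zero.
score = sentence_bleu(
    [reference],
    candidate,
    weights=(0.25, 0.25, 0.25, 0.25),
    smoothing_function=SmoothingFunction().method1,
)
print(f"BLEU-4: {score:.4f}")
```
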
![evaluation_affordance](./images/evaluation_affordance.png)

![evaluation_trajectory](./images/evaluation_trajectory.png)
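
The trajectory numbers above are reported as DFD, which presumably refers to the discrete Fréchet distance between a predicted end-effector trajectory and the annotated one (the README does not define the metric or its normalization). The following is a minimal sketch of that distance for two point lists in the Trajectory format shown earlier.

```python
import numpy as np

def discrete_frechet_distance(pred, gt):
    """Discrete Fréchet distance between two polylines given as [[x, y], ...]."""
    p, q = np.asarray(pred, dtype=float), np.asarray(gt, dtype=float)
    ca = np.full((len(p), len(q)), -1.0)  # memo table of partial coupling costs

    def c(i, j):
        if ca[i, j] >= 0:
            return ca[i, j]
        d = np.linalg.norm(p[i] - q[j])
        if i == 0 and j == 0:
            ca[i, j] = d
        elif i == 0:
            ca[i, j] = max(c(0, j - 1), d)
        elif j == 0:
            ca[i, j] = max(c(i - 1, 0), d)
        else:
            ca[i, j] = max(min(c(i - 1, j), c(i - 1, j - 1), c(i, j - 1)), d)
        return ca[i, j]

    return c(len(p) - 1, len(q) - 1)

# Toy example with points in the same format as the Trajectory annotations.
gt = [[265.5, 120.0], [275.2, 162.4], [280.0, 213.3], [280.0, 259.4]]
pred = [[260.0, 118.0], [270.0, 160.0], [283.0, 215.0], [281.0, 255.0]]
print(discrete_frechet_distance(pred, gt))
```
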

## References

[1] Pierre Sermanet, Tianli Ding, Jeffrey Zhao, Fei Xia, Debidatta Dwibedi, Keerthana Gopalakrishnan, Christine Chan, Gabriel Dulac-Arnold, Sharath Maddineni, Nikhil J Joshi, et al. Robovqa: Multimodal long-horizon reasoning for robotics. In ICRA, pages 645–652, 2024.

[2] Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, et al. Open x-embodiment: Robotic learning datasets and rt-x models. arXiv preprint arXiv:2310.08864, 2023.



## Citation
```
@article{ji2025robobrain,
  title={RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete},
  author={Ji, Yuheng and Tan, Huajie and Shi, Jiayu and Hao, Xiaoshuai and Zhang, Yuan and Zhang, Hengyuan and Wang, Pengwei and Zhao, Mengdi and Mu, Yao and An, Pengju and others},
  journal={arXiv preprint arXiv:2502.21257},
  year={2025}
}
```
--------------------------------------------------------------------------------
/images/1af4535a-acc3-4417-ae33-675f4301f560.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FlagOpen/ShareRobot/1ff23d64d07834a47b7d428380ceb308a2062d65/images/1af4535a-acc3-4417-ae33-675f4301f560.png
--------------------------------------------------------------------------------
/images/2d94d985-d47e-4899-9760-c1cb8f19cd89.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FlagOpen/ShareRobot/1ff23d64d07834a47b7d428380ceb308a2062d65/images/2d94d985-d47e-4899-9760-c1cb8f19cd89.png
--------------------------------------------------------------------------------
/images/5b923b31-dbbf-470f-af09-5125f5b91ab0.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FlagOpen/ShareRobot/1ff23d64d07834a47b7d428380ceb308a2062d65/images/5b923b31-dbbf-470f-af09-5125f5b91ab0.png
--------------------------------------------------------------------------------
/images/80237155-9b7b-4f70-9c2e-8ee38029becd.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FlagOpen/ShareRobot/1ff23d64d07834a47b7d428380ceb308a2062d65/images/80237155-9b7b-4f70-9c2e-8ee38029becd.png
--------------------------------------------------------------------------------
/images/a608d080-665a-4ab1-bd8f-d5bd121454da.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FlagOpen/ShareRobot/1ff23d64d07834a47b7d428380ceb308a2062d65/images/a608d080-665a-4ab1-bd8f-d5bd121454da.png
--------------------------------------------------------------------------------
/images/a7817c0b-04b1-4a7c-9535-f9ff7801a689.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FlagOpen/ShareRobot/1ff23d64d07834a47b7d428380ceb308a2062d65/images/a7817c0b-04b1-4a7c-9535-f9ff7801a689.png
--------------------------------------------------------------------------------
/images/data-demo.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FlagOpen/ShareRobot/1ff23d64d07834a47b7d428380ceb308a2062d65/images/data-demo.jpg
--------------------------------------------------------------------------------
/images/ee709e8b-6f05-428d-abff-2578914aeb0d.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FlagOpen/ShareRobot/1ff23d64d07834a47b7d428380ceb308a2062d65/images/ee709e8b-6f05-428d-abff-2578914aeb0d.png
--------------------------------------------------------------------------------
/images/evaluation_affordance.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FlagOpen/ShareRobot/1ff23d64d07834a47b7d428380ceb308a2062d65/images/evaluation_affordance.png
--------------------------------------------------------------------------------
/images/evaluation_planning.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FlagOpen/ShareRobot/1ff23d64d07834a47b7d428380ceb308a2062d65/images/evaluation_planning.png
--------------------------------------------------------------------------------
/images/evaluation_trajectory.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FlagOpen/ShareRobot/1ff23d64d07834a47b7d428380ceb308a2062d65/images/evaluation_trajectory.png
--------------------------------------------------------------------------------