├── README.md
├── assets
│   ├── sample.gif
│   ├── sample_point_cloud.png
│   └── sample_rgbd_flow_mask.png
└── tools
    ├── download
    │   └── download_dataset.sh
    └── python
        ├── dataset_loader.py
        ├── requirements.txt
        ├── sample.py
        └── sample_3d.py
/README.md: -------------------------------------------------------------------------------- 1 |

2 | Fast-YCB 3 |

4 | 5 |

6 | 7 | ## Description 8 | 9 | This is the repository associated with the Fast-YCB dataset presented in the publication [ROFT: Real-Time Optical Flow-Aided 6D Object Pose and Velocity Tracking](https://github.com/hsp-iit/roft). 10 | 11 | The dataset is hosted in the [IIT Dataverse](https://dataverse.iit.it/) and is identified by the following [![DOI:10.48557/G2QJDM](http://img.shields.io/badge/DOI-10.48557/G2QJDM-0a7bbc.svg)](https://doi.org/10.48557/G2QJDM). 12 | 13 | 14 | The dataset contains 6 synthetic sequences comprising objects from the [YCB Model Set](https://www.google.com/search?q=ycb+model+set&oq=ycb+model+set&aqs=chrome..69i57j0i22i30j69i59j69i60l3.2631j0j7&sourceid=chrome&ie=UTF-8). The trajectories of the objects are characterized by moderate-to-fast motions and can be used to benchmark 6D object pose tracking algorithms. 15 | 16 | The dataset provides RGB, depth, optical flow, segmentation (ground truth and from Mask R-CNN) and 6D object poses (ground truth and from NVIDIA DOPE).
17 | 18 | Specifically, the dataset contains (for each object folder): 19 | - `cam_K.json` : a JSON file containing the camera width, height and intrinsic parameters 20 | - `rgb` : a folder containing RGB frames in `PNG` format 21 | - `depth` : a folder containing depth frames 22 | - `masks/gt` : a folder containing ground truth segmentation masks as binary `PNG` images 23 | - `masks/mrcnn_ycbv_bop_pbr`: a folder containing Mask R-CNN segmentation masks as binary `PNG` images 24 | - `optical_flow/nvof_1_slow` : a folder containing [NVIDIA NVOF SDK](https://developer.nvidia.com/opticalflow-sdk) optical flow frames 25 | - `dope/poses.txt` : a file containing 6D object poses obtained using [DOPE](https://github.com/NVlabs/Deep_Object_Pose) (these poses assume the [NVDU](https://github.com/NVIDIA/Dataset_Utilities) version of the YCB Model Set meshes) 26 | - `dope/poses_ycb.txt` : as above, but assuming the original [PoseCNN YCB Model set meshes](https://drive.google.com/file/d/1gmcDD-5bkJfcMKLZb3zGgH_HUFbulQWu/view?usp=sharing) 27 | - `gt/poses.txt` : ground truth 6D poses (NVDU format) 28 | - `gt/poses_ycb.txt` : ground truth 6D poses (PoseCNN YCB Model set format) 29 | - `gt/velocities.txt` : ground truth velocities 30 | 31 | ### Format of poses 32 | The format of the poses is 33 | 34 | $$ [ x, y, z, n_x, n_y, n_z, \theta] $$ 35 | 36 | where $p = (x, y, z)$ is the translation vector, $n = (n_x, n_y, n_z)$ is the axis of rotation and $\theta$ is the angle of rotation in radians. 37 | 38 | The pose represents the transformation from the camera frame to the object frame. 39 | 40 | ### A note on additional sequences 41 | 42 | The object folders `003_cracker_box_real` and `006_mustard_bottle_real` contain additional sequences acquired with a real Intel RealSense D415 camera. These are not labeled (i.e., they lack the `masks/gt` folder and the whole `gt` folder).
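The axis-angle pose format described above can be converted into a 4x4 homogeneous transform with the Rodrigues formula. The following is a minimal sketch (the helper name `pose_to_matrix` is illustrative and not part of the dataset tools):

```python
import numpy


def pose_to_matrix(pose):
    """Convert a [x, y, z, n_x, n_y, n_z, theta] pose (the format of
    gt/poses.txt and dope/poses.txt) into a 4x4 homogeneous transform."""
    x, y, z, nx, ny, nz, theta = pose
    axis = numpy.array([nx, ny, nz], dtype=float)
    norm = numpy.linalg.norm(axis)
    if norm > 0.0:
        axis = axis / norm

    # Skew-symmetric cross-product matrix of the rotation axis
    K = numpy.array([[0.0, -axis[2], axis[1]],
                     [axis[2], 0.0, -axis[0]],
                     [-axis[1], axis[0], 0.0]])

    # Rodrigues formula: R = I + sin(theta) K + (1 - cos(theta)) K^2
    R = numpy.eye(3) + numpy.sin(theta) * K + (1.0 - numpy.cos(theta)) * (K @ K)

    T = numpy.eye(4)
    T[:3, :3] = R
    T[:3, 3] = [x, y, z]
    return T
```

Since the pose maps the camera frame to the object frame, applying `T` to a point expressed in the object frame yields its camera-frame coordinates.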
43 | 44 | ## How to obtain the dataset 45 | 46 | Download the dataset using: 47 | ```console 48 | bash tools/download/download_dataset.sh 49 | ``` 50 | 51 | Downloading the dataset requires `curl`, `jq`, `unzip` and `zip`. 52 | 53 | ## How to access data 54 | 55 | We provide [Python](tools/python/sample.py) sample code to access the information contained in the dataset. 56 | 57 | ```console 58 | pip install -r tools/python/requirements.txt 59 | python tools/python/sample.py <object_name> 60 | ``` 61 | where `<object_name>` might be `003_cracker_box`, `004_sugar_box`, `005_tomato_soup_can`, `006_mustard_bottle`, `009_gelatin_box`, `010_potted_meat_can`. 62 | 63 |
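Depth frames are stored in a small binary `.float` format: two native-endian `uint64` header fields (width, then height) followed by row-major `float32` values, mirroring what `tools/python/dataset_loader.py` does in `get_depth`. A minimal standalone reader might look like this (the function name is hypothetical):

```python
import struct

import numpy


def read_float_depth(path):
    """Read a Fast-YCB '.float' depth frame: uint64 width, uint64 height,
    then width * height float32 values, reshaped row-major to (height, width)."""
    with open(path, 'rb') as f:
        width, = struct.unpack('=Q', f.read(8))
        height, = struct.unpack('=Q', f.read(8))
        data = numpy.frombuffer(f.read(), dtype=numpy.float32)
    return data.reshape(height, width)
```

The same header layout is used by the optical flow frames, which add a leading 32-bit type field before the width and height.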

64 | 65 | You can also visualize the scene point cloud using [this](tools/python/sample_3d.py) other script: 66 | 67 | ```console 68 | pip install open3d 69 | python tools/python/sample_3d.py <object_name> 70 | ``` 71 | 72 | By default, the visualizer shows the first frame of the sequence and cuts off depth beyond 1 meter. 73 |
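The depth cutoff and pinhole back-projection performed by the viewer can be sketched in vectorized form. This is a simplified stand-in for `eval_point_cloud` in `tools/python/sample_3d.py` (the intrinsics `fx`, `fy`, `cx`, `cy` come from `cam_K.json`; the function name is illustrative):

```python
import numpy


def depth_to_cloud(depth, fx, fy, cx, cy, max_depth=1.0):
    """Back-project an (H, W) depth map into an (N, 3) point cloud,
    dropping points at or beyond max_depth (the viewer's 1 m cutoff)."""
    height, width = depth.shape

    # Pixel coordinate grids: u varies along columns, v along rows
    u, v = numpy.meshgrid(numpy.arange(width), numpy.arange(height))

    valid = depth < max_depth
    z = depth[valid]
    x = (u[valid] - cx) / fx * z
    y = (v[valid] - cy) / fy * z
    return numpy.stack([x, y, z], axis=1)
```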

75 | 76 | ## Citing Fast-YCB 77 | 78 | If you find the Fast-YCB dataset useful, please consider citing the associated publication: 79 | 80 | ```bibtex 81 | @ARTICLE{9568706, 82 | author={Piga, Nicola A. and Onyshchuk, Yuriy and Pasquale, Giulia and Pattacini, Ugo and Natale, Lorenzo}, 83 | journal={IEEE Robotics and Automation Letters}, 84 | title={ROFT: Real-Time Optical Flow-Aided 6D Object Pose and Velocity Tracking}, 85 | year={2022}, 86 | volume={7}, 87 | number={1}, 88 | pages={159-166}, 89 | doi={10.1109/LRA.2021.3119379} 90 | } 91 | ``` 92 | 93 | and the Dataset: 94 | 95 | ```bibtex 96 | @data{G2QJDM_2022, 97 | author = {Piga, Nicola A. and Onyshchuk, Yuriy and Pasquale, Giulia and Pattacini, Ugo and Natale, Lorenzo}, 98 | publisher = {IIT Dataverse}, 99 | title = {{Fast-YCB Dataset}}, 100 | year = {2022}, 101 | version = {V1}, 102 | doi = {10.48557/G2QJDM}, 103 | url = {https://doi.org/10.48557/G2QJDM} 104 | } 105 | ``` 106 | 107 | ## Maintainer 108 | 109 | This repository is maintained by: 110 | 111 | | | | 112 | |:---:|:---:| 113 | | [](https://github.com/xenvre) | [@xenvre](https://github.com/xenvre) | 114 | -------------------------------------------------------------------------------- /assets/sample.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hsp-iit/fast-ycb/4a53d673f12650c7dcf16e72396d521330c553b9/assets/sample.gif -------------------------------------------------------------------------------- /assets/sample_point_cloud.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hsp-iit/fast-ycb/4a53d673f12650c7dcf16e72396d521330c553b9/assets/sample_point_cloud.png -------------------------------------------------------------------------------- /assets/sample_rgbd_flow_mask.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/hsp-iit/fast-ycb/4a53d673f12650c7dcf16e72396d521330c553b9/assets/sample_rgbd_flow_mask.png -------------------------------------------------------------------------------- /tools/download/download_dataset.sh: -------------------------------------------------------------------------------- 1 | SERVER=https://dataverse.iit.it 2 | PERSISTENT_ID=doi:10.48557/G2QJDM 3 | VERSION=:latest 4 | 5 | # Download the JSON listing of the dataset files 6 | curl "$SERVER/api/datasets/:persistentId/versions/$VERSION/files?persistentId=$PERSISTENT_ID" > dataset.json 7 | 8 | # Download all files in the dataset 9 | NUM_FILES=$(jq ".data | length - 1" dataset.json) 10 | for i in $(seq 0 $NUM_FILES); do 11 | echo "Downloading file $i/$NUM_FILES..." 12 | file_id=$(jq ".data[$i].dataFile.id" dataset.json) 13 | file_name=$(jq -r ".data[$i].dataFile.filename" dataset.json) 14 | curl -L "$SERVER/api/access/datafile/$file_id" -o "$file_name" 15 | done 16 | 17 | # Join the split archives, then unzip all objects 18 | for object_name in 003_cracker_box 004_sugar_box 005_tomato_soup_can 006_mustard_bottle 009_gelatin_box 010_potted_meat_can 003_cracker_box_real 006_mustard_bottle_real; do 19 | echo "Unzipping file ${object_name}.zip..." 20 | zip -qq -F ${object_name}.zip --out tmp.zip 21 | rm ${object_name}.z* 22 | unzip -qq tmp.zip 23 | rm tmp.zip 24 | done 25 | -------------------------------------------------------------------------------- /tools/python/dataset_loader.py: -------------------------------------------------------------------------------- 1 | #=============================================================================== 2 | # 3 | # Copyright (C) 2022 Istituto Italiano di Tecnologia (IIT) 4 | # 5 | # This software may be modified and distributed under the terms of the 6 | # GPL-2+ license. See the accompanying LICENSE file for details.
7 | # 8 | #=============================================================================== 9 | 10 | import cv2 11 | import json 12 | import numpy 13 | import os 14 | import struct 15 | 16 | 17 | class Loader(): 18 | 19 | def __init__(self, path, object_name): 20 | """Constructor.""" 21 | 22 | self._path = os.path.join(path, object_name) 23 | self._object_name = object_name 24 | self._number_frames = None 25 | 26 | self._load_number_frames() 27 | self._load_camera_parameters() 28 | 29 | 30 | def _load_number_frames(self): 31 | """Load the total number of frames.""" 32 | 33 | self._number_frames = sum(1 for line in open(os.path.join(self._path, 'data.txt'))) 34 | 35 | 36 | def _load_camera_parameters(self): 37 | """Load the camera parameters.""" 38 | 39 | self._camera_parameters = json.load(open(os.path.join(self._path, 'cam_K.json'))) 40 | 41 | 42 | def _is_valid_frame(self, number): 43 | """Check if the frame with frame number 'number' exists.""" 44 | 45 | return 0 <= number < self._number_frames 46 | 47 | 48 | def get_number_frames(self): 49 | """Get the total number of frames.""" 50 | 51 | return self._number_frames 52 | 53 | 54 | def get_camera_parameters(self): 55 | """Get the camera parameters.""" 56 | 57 | return self._camera_parameters 58 | 59 | 60 | def get_rgb(self, number): 61 | """Get the RGB frame given the frame number.""" 62 | 63 | if not self._is_valid_frame(number): 64 | raise ValueError('The frame with frame number ' + str(number) + ' does not exist.') 65 | 66 | return cv2.imread(os.path.join(self._path, 'rgb', str(number) + '.png')) 67 | 68 | 69 | def get_depth(self, number): 70 | """Get the depth frame given the frame number.""" 71 | 72 | if not self._is_valid_frame(number): 73 | raise ValueError('The frame with frame number ' + str(number) + ' does not exist.') 74 | 75 | with open(os.path.join(self._path, 'depth', str(number) + '.float'), 'rb') as f: 76 | width, = struct.unpack('=Q', f.read(8)) 77 | height, = struct.unpack('=Q', f.read(8)) 78 |
depth = numpy.reshape \ 79 | ( 80 | numpy.array(struct.unpack('f' * width * height, f.read())), 81 | (height, width) 82 | ) 83 | 84 | return depth 85 | 86 | 87 | def get_optical_flow(self, number): 88 | """Get the flow frame given the frame number.""" 89 | 90 | type_map = {cv2.CV_32FC2 : 'f', cv2.CV_16SC2 : 'h'} 91 | type_scaling = {cv2.CV_32FC2 : 1.0, cv2.CV_16SC2 : float(2 ** 5)} # 16-bit flow is fixed point with 5 fractional bits 92 | 93 | if not self._is_valid_frame(number): 94 | raise ValueError('The frame with frame number ' + str(number) + ' does not exist.') 95 | 96 | with open(os.path.join(self._path, 'optical_flow/nvof_1_slow', str(number) + '.float'), 'rb') as f: 97 | frame_type, = struct.unpack('i', f.read(4)) 98 | width, = struct.unpack('=Q', f.read(8)) 99 | height, = struct.unpack('=Q', f.read(8)) 100 | flow = numpy.reshape \ 101 | ( 102 | numpy.array(struct.unpack(type_map[frame_type] * width * height * 2, f.read())), 103 | (height, width, 2) 104 | ) / type_scaling[frame_type] 105 | 106 | return flow 107 | 108 | 109 | def get_mask(self, number): 110 | """Get the mask frame given the frame number.""" 111 | 112 | if not self._is_valid_frame(number): 113 | raise ValueError('The frame with frame number ' + str(number) + ' does not exist.') 114 | 115 | return cv2.imread(os.path.join(self._path, 'masks', 'gt', self._object_name + '_' + str(number) + '.png')) 116 | -------------------------------------------------------------------------------- /tools/python/requirements.txt: -------------------------------------------------------------------------------- 1 | imgviz 2 | opencv-python 3 | -------------------------------------------------------------------------------- /tools/python/sample.py: -------------------------------------------------------------------------------- 1 | #=============================================================================== 2 | # 3 | # Copyright (C) 2022 Istituto Italiano di Tecnologia (IIT) 4 | # 5 | # This software may be modified and distributed under the terms of the 6 | #
GPL-2+ license. See the accompanying LICENSE file for details. 7 | # 8 | #=============================================================================== 9 | 10 | from dataset_loader import Loader 11 | import cv2 12 | import imgviz 13 | import numpy 14 | import sys 15 | 16 | 17 | def main(): 18 | fastycb_path = './' 19 | object_folder = sys.argv[1] 20 | loader = Loader(path = fastycb_path, object_name = object_folder) 21 | 22 | # Print useful information 23 | print('# frames: ', loader.get_number_frames()) 24 | print('camera parameters') 25 | print(loader.get_camera_parameters()) 26 | 27 | # Show sample RGB, depth, optical flow and mask frames 28 | for i in range(500, min(1000, loader.get_number_frames())): 29 | # Get all frames 30 | rgb = loader.get_rgb(i) 31 | depth = loader.get_depth(i) 32 | flow = loader.get_optical_flow(i) 33 | mask = loader.get_mask(i) 34 | 35 | # Render depth and optical flow using RGB colors 36 | depth_render = imgviz.depth2rgb(depth, min_value = 0.3, max_value = 1.5, colormap = 'rainbow') 37 | flow_render = imgviz.flow2rgb(flow) 38 | 39 | # Render RGB, depth, optical flow and mask in a single frame 40 | scale = 0.4 41 | height = int(rgb.shape[0] * scale) 42 | width = int(rgb.shape[1] * scale) 43 | render = numpy.empty([height, 4 * width, 3], 'uint8') 44 | 45 | render[:, : width] = cv2.resize(rgb, (width, height)) 46 | render[:, width : 2 * width] = cv2.resize(depth_render, (width, height)) 47 | render[:, 2 * width : 3 * width] = cv2.resize(flow_render, (width, height)) 48 | render[:, 3 * width : 4 * width] = cv2.resize(mask, (width, height)) 49 | 50 | cv2.imshow('Fast-YCB sample', render) 51 | cv2.waitKey(33) 52 | 53 | 54 | if __name__ == '__main__': 55 | main() 56 | -------------------------------------------------------------------------------- /tools/python/sample_3d.py: -------------------------------------------------------------------------------- 1 | #=============================================================================== 2 | # 3 | # Copyright (C)
2022 Istituto Italiano di Tecnologia (IIT) 4 | # 5 | # This software may be modified and distributed under the terms of the 6 | # GPL-2+ license. See the accompanying LICENSE file for details. 7 | # 8 | #=============================================================================== 9 | 10 | import cv2 11 | import numpy 12 | import open3d as o3d 13 | import sys 14 | from dataset_loader import Loader 15 | 16 | 17 | def eval_point_cloud(depth, rgb, max_depth, camera_parameters): 18 | 19 | width = camera_parameters['width'] 20 | height = camera_parameters['height'] 21 | fx = float(camera_parameters['fx']) 22 | fy = float(camera_parameters['fy']) 23 | cx = float(camera_parameters['cx']) 24 | cy = float(camera_parameters['cy']) 25 | 26 | image_x_z = numpy.zeros((height, width), dtype = numpy.float32) 27 | image_y_z = numpy.zeros((height, width), dtype = numpy.float32) 28 | for v in range(height): 29 | for u in range(width): 30 | image_x_z[v, u] = (u - cx) / fx 31 | image_y_z[v, u] = (v - cy) / fy 32 | 33 | valid_depth = depth < max_depth 34 | coords_z = depth[valid_depth] 35 | coords_x_z = image_x_z[valid_depth] 36 | coords_y_z = image_y_z[valid_depth] 37 | 38 | # Convert from BGR (as loaded by cv2.imread) to RGB for Open3D 39 | rgb_fixed = cv2.cvtColor(rgb, cv2.COLOR_BGR2RGB) 40 | colors = rgb_fixed[valid_depth] 41 | 42 | cloud = numpy.zeros((coords_z.shape[0], 3), dtype = numpy.float32) 43 | cloud[:, 0] = coords_x_z * coords_z 44 | cloud[:, 1] = coords_y_z * coords_z 45 | cloud[:, 2] = coords_z 46 | 47 | return cloud, colors 48 | 49 | 50 | def add_point_cloud(name, cloud, colors, size, scene): 51 | 52 | point_cloud = o3d.geometry.PointCloud() 53 | point_cloud.points = o3d.utility.Vector3dVector(cloud) 54 | point_cloud.colors = o3d.utility.Vector3dVector(colors / 255.0) 55 | 56 | material = o3d.visualization.rendering.MaterialRecord() 57 | material.shader = 'defaultUnlit' 58 | material.point_size = size 59 | 60 | scene.add_geometry(name, point_cloud, material) 61 | 62 | 63 | def main(): 64 | fastycb_path =
'./' 65 | object_folder = sys.argv[1] 66 | loader = Loader(path = fastycb_path, object_name = object_folder) 67 | 68 | # Load camera parameters 69 | camera_parameters = loader.get_camera_parameters() 70 | 71 | # Show sample point cloud for a given frame index 72 | index = 1 73 | rgb = loader.get_rgb(index) 74 | depth = loader.get_depth(index) 75 | max_depth = 1.0 76 | cloud, cloud_colors = eval_point_cloud(depth, rgb, max_depth, camera_parameters) 77 | 78 | try: 79 | app = o3d.visualization.gui.Application.instance 80 | app.initialize() 81 | 82 | window = app.create_window("Point cloud viewer", 1024, 768) 83 | 84 | widget3d = o3d.visualization.gui.SceneWidget() 85 | widget3d.scene = o3d.visualization.rendering.Open3DScene(window.renderer) 86 | widget3d.scene.set_background([1.0, 1.0, 1.0, 1.0]) 87 | window.add_child(widget3d) 88 | 89 | add_point_cloud('cloud', cloud, cloud_colors, 2, widget3d.scene) 90 | widget3d.setup_camera(60, widget3d.scene.bounding_box, [0, 0, 0]) 91 | 92 | app.run() 93 | except Exception as e: 94 | print(e) 95 | 96 | 97 | if __name__ == '__main__': 98 | main() 99 | --------------------------------------------------------------------------------