├── README.md
├── assets
│   ├── sample.gif
│   ├── sample_point_cloud.png
│   └── sample_rgbd_flow_mask.png
└── tools
    ├── download
    │   └── download_dataset.sh
    └── python
        ├── dataset_loader.py
        ├── requirements.txt
        ├── sample.py
        └── sample_3d.py
/README.md:
--------------------------------------------------------------------------------
1 |
2 | # Fast-YCB
3 |
4 |
5 | 
6 |
7 | ## Description
8 |
9 | This is the repository associated with the Fast-YCB dataset presented in the publication [ROFT: Real-Time Optical Flow-Aided 6D Object Pose and Velocity Tracking](https://github.com/hsp-iit/roft).
10 |
11 | The dataset is hosted in the [IIT Dataverse](https://dataverse.iit.it/) and is identified by the DOI [10.48557/G2QJDM](https://doi.org/10.48557/G2QJDM).
12 |
13 |
14 | The dataset contains 6 synthetic sequences comprising objects from the [YCB Model Set](https://www.google.com/search?q=ycb+model+set&oq=ycb+model+set&aqs=chrome..69i57j0i22i30j69i59j69i60l3.2631j0j7&sourceid=chrome&ie=UTF-8). The object trajectories are characterized by moderate-to-fast motions and can be used to benchmark 6D object pose tracking algorithms.
15 |
16 | The dataset provides RGB, depth, optical flow, segmentation (ground truth and from Mask R-CNN) and 6D object poses (ground truth and from NVIDIA DOPE).
17 |
18 | Specifically, the dataset contains (for each object folder):
19 | - `cam_K.json` : a json file containing the camera width, height and intrinsic parameters
20 | - `rgb` : a folder containing rgb frames in `PNG` format
21 | - `depth` : a folder containing depth frames
22 | - `masks/gt` : a folder containing ground truth segmentation masks as binary `PNG` images
23 | - `masks/mrcnn_ycbv_bop_pbr`: a folder containing Mask R-CNN segmentation masks as binary `PNG` images
24 | - `optical_flow/nvof_1_slow` : a folder containing [NVIDIA NVOF SDK](https://developer.nvidia.com/opticalflow-sdk) optical flow frames
25 | - `dope/poses.txt` : a file containing 6D object poses obtained using [DOPE](https://github.com/NVlabs/Deep_Object_Pose) (these poses assume the [NVDU](https://github.com/NVIDIA/Dataset_Utilities) version of the YCB Model Set meshes)
26 | - `dope/poses_ycb.txt` : as above but assume the original [PoseCNN YCB Model set meshes](https://drive.google.com/file/d/1gmcDD-5bkJfcMKLZb3zGgH_HUFbulQWu/view?usp=sharing)
27 | - `gt/poses.txt` : ground truth 6D poses (NVDU format)
28 | - `gt/poses_ycb.txt` : ground truth 6D poses (PoseCNN YCB Model set format)
29 | - `gt/velocities.txt` : ground truth velocities
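
Judging from `tools/python/dataset_loader.py`, each depth frame is a `.float` file holding two unsigned 64-bit integers (width, then height) followed by `width * height` 32-bit floats stored row by row. A minimal sketch of reading such a file with NumPy is given below; the helper name `read_depth` is hypothetical and native byte order is assumed, as in the loader.

```python
import numpy


def read_depth(path):
    """Read a depth frame laid out as [uint64 width][uint64 height][float32 data]."""
    with open(path, 'rb') as f:
        raw = f.read()

    # Header: width and height as unsigned 64-bit integers (16 bytes in total)
    width, height = (int(v) for v in numpy.frombuffer(raw, dtype=numpy.uint64, count=2))

    # Payload: width * height 32-bit floats, stored row by row
    data = numpy.frombuffer(raw, dtype=numpy.float32, offset=16, count=width * height)

    # frombuffer returns a read-only view, so copy to obtain a writable array
    return data.reshape(height, width).copy()
```

Unlike `Loader.get_depth`, this sketch keeps the values as `float32` instead of promoting them to `float64`.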
30 |
31 | ### Format of poses
32 | The format of the poses is
33 |
34 | $$ [ x, y, z, n_x, n_y, n_z, \theta] $$
35 |
36 | where $p = (x, y, z)$ is the translation vector, $n = (n_x, n_y, n_z)$ is the axis of rotation and $\theta$ is the angle of rotation in radians.
37 |
38 | The pose represents the transformation from the camera frame to the object frame.
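
As an illustration of the axis-angle format, the sketch below converts one pose vector into a 4x4 homogeneous matrix using Rodrigues' formula $R = I + \sin\theta \, K + (1 - \cos\theta) K^2$, where $K$ is the skew-symmetric matrix of $n$. The function name `pose_to_matrix` is hypothetical and not part of this repository, and the axis is assumed to be (close to) unit length.

```python
import numpy


def pose_to_matrix(pose):
    """Convert [x, y, z, n_x, n_y, n_z, theta] into a 4x4 homogeneous transform."""
    pose = numpy.asarray(pose, dtype=float)
    position, axis, theta = pose[:3], pose[3:6], pose[6]

    # Normalize the axis defensively; a zero axis is treated as no rotation
    norm = numpy.linalg.norm(axis)
    axis = axis / norm if norm > 0 else numpy.zeros(3)

    # Skew-symmetric matrix K such that K @ v == cross(axis, v)
    K = numpy.array([[0.0, -axis[2], axis[1]],
                     [axis[2], 0.0, -axis[0]],
                     [-axis[1], axis[0], 0.0]])

    # Rodrigues' formula: R = I + sin(theta) K + (1 - cos(theta)) K^2
    R = numpy.eye(3) + numpy.sin(theta) * K + (1.0 - numpy.cos(theta)) * (K @ K)

    T = numpy.eye(4)
    T[:3, :3] = R
    T[:3, 3] = position
    return T
```

Under the usual reading of the sentence above, `T` would map points expressed in the object frame into the camera frame; this is an assumption worth checking, for example by projecting a mesh and comparing it with the masks in `masks/gt`.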
39 |
40 | ### A note on additional sequences
41 |
42 | The object folders `003_cracker_box_real` and `006_mustard_bottle_real` contain additional sequences acquired with a real Intel RealSense D415 camera. These are not labeled (i.e. they lack the `masks/gt` folder and the whole `gt` folder).
43 |
44 | ## How to obtain the dataset
45 |
46 | Download the dataset using:
47 | ```console
48 | bash tools/download/download_dataset.sh
49 | ```
50 |
51 | In order to download the dataset `curl`, `jq`, `unzip` and `zip` are required.
52 |
53 | ## How to access data
54 |
55 | We provide [Python](tools/python/sample.py) sample code to access the information contained in the dataset.
56 |
57 | ```console
58 | pip install -r tools/python/requirements.txt
59 | python tools/python/sample.py <object_name>
60 | ```
61 | where `<object_name>` might be `003_cracker_box`, `004_sugar_box`, `005_tomato_soup_can`, `006_mustard_bottle`, `009_gelatin_box`, `010_potted_meat_can`.
62 |
63 | 
64 |
65 | You can also visualize the scene point cloud using [this](tools/python/sample_3d.py) other script:
66 |
67 | ```console
68 | pip install open3d
69 | python tools/python/sample_3d.py <object_name>
70 | ```
71 |
72 | By default, the visualizer shows the first frame of the sequence and discards points whose depth exceeds 1 meter.
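
The script back-projects each pixel with the pinhole model, $x = (u - c_x) z / f_x$ and $y = (v - c_y) z / f_y$, using a per-pixel Python loop. A vectorized variant could look like the sketch below; the function name `back_project` is hypothetical, and the intrinsics are assumed to be the `fx`, `fy`, `cx`, `cy` values read from `cam_K.json`.

```python
import numpy


def back_project(depth, fx, fy, cx, cy, max_depth=1.0):
    """Back-project a depth image into an N x 3 point cloud (pinhole model)."""
    height, width = depth.shape

    # Pixel coordinate grids: u varies along columns, v along rows
    u, v = numpy.meshgrid(numpy.arange(width), numpy.arange(height))

    # Keep only the pixels closer than max_depth, as the sample script does
    valid = depth < max_depth
    z = depth[valid]
    x = (u[valid] - cx) / fx * z
    y = (v[valid] - cy) / fy * z

    return numpy.stack([x, y, z], axis=1)
```

The boolean mask `valid` can also be used to index the RGB frame, so that colors stay aligned with the returned points.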
73 |
74 |
75 |
76 | ## Citing Fast-YCB
77 |
78 | If you find the Fast-YCB dataset useful, please consider citing the associated publication:
79 |
80 | ```bibtex
81 | @ARTICLE{9568706,
82 | author={Piga, Nicola A. and Onyshchuk, Yuriy and Pasquale, Giulia and Pattacini, Ugo and Natale, Lorenzo},
83 | journal={IEEE Robotics and Automation Letters},
84 | title={ROFT: Real-Time Optical Flow-Aided 6D Object Pose and Velocity Tracking},
85 | year={2022},
86 | volume={7},
87 | number={1},
88 | pages={159-166},
89 | doi={10.1109/LRA.2021.3119379}
90 | }
91 | ```
92 |
93 | and the dataset:
94 |
95 | ```bibtex
96 | @data{G2QJDM_2022,
97 | author = {Piga, Nicola A. and Onyshchuk, Yuriy and Pasquale, Giulia and Pattacini, Ugo and Natale, Lorenzo},
98 | publisher = {IIT Dataverse},
99 | title = {{Fast-YCB Dataset}},
100 | year = {2022},
101 | version = {V1},
102 | doi = {10.48557/G2QJDM},
103 | url = {https://doi.org/10.48557/G2QJDM}
104 | }
105 | ```
106 |
107 | ## Maintainer
108 |
109 | This repository is maintained by:
110 |
111 | | |
112 | |:---:|
113 | | [@xenvre](https://github.com/xenvre) |
114 |
--------------------------------------------------------------------------------
/assets/sample.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hsp-iit/fast-ycb/4a53d673f12650c7dcf16e72396d521330c553b9/assets/sample.gif
--------------------------------------------------------------------------------
/assets/sample_point_cloud.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hsp-iit/fast-ycb/4a53d673f12650c7dcf16e72396d521330c553b9/assets/sample_point_cloud.png
--------------------------------------------------------------------------------
/assets/sample_rgbd_flow_mask.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hsp-iit/fast-ycb/4a53d673f12650c7dcf16e72396d521330c553b9/assets/sample_rgbd_flow_mask.png
--------------------------------------------------------------------------------
/tools/download/download_dataset.sh:
--------------------------------------------------------------------------------
1 | SERVER=https://dataverse.iit.it
2 | PERSISTENT_ID=doi:10.48557/G2QJDM
3 | VERSION=:latest
4 |
5 | # Download the json file
6 | curl "$SERVER/api/datasets/:persistentId/versions/$VERSION/files?persistentId=$PERSISTENT_ID" > dataset.json
7 |
8 | # Download all files in the dataset
9 | NUM_FILES=$(jq ".data | length - 1" dataset.json)
10 | for i in $(seq 0 "$NUM_FILES"); do
11 |     echo "Downloading file $i/$NUM_FILES..."
12 |     file_id=$(jq ".data[$i].dataFile.id" dataset.json)
13 |     file_name=$(jq -r ".data[$i].dataFile.filename" dataset.json)
14 |     curl -L "$SERVER/api/access/datafile/$file_id" -o "$file_name"
15 | done
16 |
17 | # Unzip all objects
18 | for object_name in 003_cracker_box 004_sugar_box 005_tomato_soup_can 006_mustard_bottle 009_gelatin_box 010_potted_meat_can 003_cracker_box_real 006_mustard_bottle_real; do
19 |     echo "Unzipping file ${object_name}.zip..."
20 |     zip -qq -F "${object_name}.zip" --out tmp.zip
21 |     rm "${object_name}".z*
22 |     unzip -qq tmp.zip
23 |     rm tmp.zip
24 | done
25 |
--------------------------------------------------------------------------------
/tools/python/dataset_loader.py:
--------------------------------------------------------------------------------
1 | #===============================================================================
2 | #
3 | # Copyright (C) 2022 Istituto Italiano di Tecnologia (IIT)
4 | #
5 | # This software may be modified and distributed under the terms of the
6 | # GPL-2+ license. See the accompanying LICENSE file for details.
7 | #
8 | #===============================================================================
9 |
10 | import cv2
11 | import json
12 | import numpy
13 | import os
14 | import struct
15 |
16 |
17 | class Loader():
18 |
19 |     def __init__(self, path, object_name):
20 |         """Constructor."""
21 |
22 |         self._path = os.path.join(path, object_name)
23 |         self._object_name = object_name
24 |         self._number_frames = None
25 |
26 |         self._load_number_frames()
27 |         self._load_camera_parameters()
28 |
29 |
30 |     def _load_number_frames(self):
31 |         """Load the total number of frames."""
32 |
33 |         with open(os.path.join(self._path, 'data.txt')) as f:
34 |             self._number_frames = sum(1 for _ in f)
35 |
36 |
37 |     def _load_camera_parameters(self):
38 |         """Load the camera parameters."""
39 |
40 |         with open(os.path.join(self._path, 'cam_K.json')) as f:
41 |             self._camera_parameters = json.load(f)
42 |
43 |
44 |     def _is_valid_frame(self, number):
45 |         """Check if the frame with frame number 'number' exists."""
46 |
47 |         return 0 <= number < self._number_frames
48 |
49 |
50 |     def get_number_frames(self):
51 |         """Get the total number of frames."""
52 |
53 |         return self._number_frames
54 |
55 |
56 |     def get_camera_parameters(self):
57 |         """Get the camera parameters."""
58 |
59 |         return self._camera_parameters
60 |
61 |
62 |     def get_rgb(self, number):
63 |         """Get the rgb frame given the frame number."""
64 |
65 |         if not self._is_valid_frame(number):
66 |             raise ValueError('The frame with frame number ' + str(number) + ' does not exist.')
67 |
68 |         return cv2.imread(os.path.join(self._path, 'rgb', str(number) + '.png'))
69 |
70 |
71 |     def get_depth(self, number):
72 |         """Get the depth frame given the frame number."""
73 |
74 |         if not self._is_valid_frame(number):
75 |             raise ValueError('The frame with frame number ' + str(number) + ' does not exist.')
76 |
77 |         with open(os.path.join(self._path, 'depth', str(number) + '.float'), 'rb') as f:
78 |             width, = struct.unpack('=Q', f.read(8))  # header: image width (uint64)
79 |             height, = struct.unpack('=Q', f.read(8))  # header: image height (uint64)
80 |             depth = numpy.reshape \
81 |             (
82 |                 numpy.array(struct.unpack('f' * width * height, f.read())),
83 |                 (height, width)
84 |             )
85 |
86 |         return depth
87 |
88 |
89 |     def get_optical_flow(self, number):
90 |         """Get the flow frame given the frame number."""
91 |
92 |         type_map = {cv2.CV_32FC2 : 'f', cv2.CV_16SC2 : 'h'}
93 |         type_scaling = {cv2.CV_32FC2 : 1.0, cv2.CV_16SC2 : float(2 ** 5)}  # 16-bit values are divided by 2^5
94 |
95 |         if not self._is_valid_frame(number):
96 |             raise ValueError('The frame with frame number ' + str(number) + ' does not exist.')
97 |
98 |         with open(os.path.join(self._path, 'optical_flow/nvof_1_slow', str(number) + '.float'), 'rb') as f:
99 |             frame_type, = struct.unpack('i', f.read(4))  # header: OpenCV type of the stored flow
100 |             width, = struct.unpack('=Q', f.read(8))
101 |             height, = struct.unpack('=Q', f.read(8))
102 |             flow = numpy.reshape \
103 |             (
104 |                 numpy.array(struct.unpack(type_map[frame_type] * width * height * 2, f.read())),
105 |                 (height, width, 2)
106 |             ) / type_scaling[frame_type]
107 |
108 |         return flow
109 |
110 |
111 |     def get_mask(self, number):
112 |         """Get the mask frame given the frame number."""
113 |
114 |         if not self._is_valid_frame(number):
115 |             raise ValueError('The frame with frame number ' + str(number) + ' does not exist.')
116 |
117 |         return cv2.imread(os.path.join(self._path, 'masks', 'gt', self._object_name + '_' + str(number) + '.png'))
118 |
--------------------------------------------------------------------------------
/tools/python/requirements.txt:
--------------------------------------------------------------------------------
1 | imgviz
2 | opencv-python
3 |
--------------------------------------------------------------------------------
/tools/python/sample.py:
--------------------------------------------------------------------------------
1 | #===============================================================================
2 | #
3 | # Copyright (C) 2022 Istituto Italiano di Tecnologia (IIT)
4 | #
5 | # This software may be modified and distributed under the terms of the
6 | # GPL-2+ license. See the accompanying LICENSE file for details.
7 | #
8 | #===============================================================================
9 |
10 | from dataset_loader import Loader
11 | import cv2
12 | import imgviz
13 | import numpy
14 | import sys
15 |
16 |
17 | def main():
18 |     if len(sys.argv) < 2:
19 |         sys.exit('Usage: python sample.py <object_name>')
20 |
21 |     fastycb_path = './'
22 |     object_folder = sys.argv[1]
23 |     loader = Loader(path = fastycb_path, object_name = object_folder)
24 |
25 |     # Print useful information
26 |     print('# frames: ', loader.get_number_frames())
27 |     print('camera parameters')
28 |     print(loader.get_camera_parameters())
29 |
30 |     # Show sample RGB, depth, optical flow and mask frames
31 |     for i in range(500, min(1000, loader.get_number_frames())):
32 |         # Get all frames
33 |         rgb = loader.get_rgb(i)
34 |         depth = loader.get_depth(i)
35 |         flow = loader.get_optical_flow(i)
36 |         mask = loader.get_mask(i)
37 |
38 |         # Render depth and optical flow using RGB colors
39 |         depth_render = imgviz.depth2rgb(depth, min_value = 0.3, max_value = 1.5, colormap = 'rainbow')
40 |         flow_render = imgviz.flow2rgb(flow)
41 |
42 |         # Render RGB, depth, optical flow and mask side by side in a single frame
43 |         scale = 0.4
44 |         height = int(rgb.shape[0] * scale)
45 |         width = int(rgb.shape[1] * scale)
46 |         render = numpy.empty([height, 4 * width, 3], 'uint8')
47 |
48 |         render[:, : width] = cv2.resize(rgb, (width, height))
49 |         render[:, width : 2 * width] = cv2.resize(depth_render, (width, height))
50 |         render[:, 2 * width : 3 * width] = cv2.resize(flow_render, (width, height))
51 |         render[:, 3 * width : 4 * width] = cv2.resize(mask, (width, height))
52 |
53 |         cv2.imshow('', render)
54 |         cv2.waitKey(33)
55 |
56 |
57 | if __name__ == '__main__':
58 |     main()
59 |
--------------------------------------------------------------------------------
/tools/python/sample_3d.py:
--------------------------------------------------------------------------------
1 | #===============================================================================
2 | #
3 | # Copyright (C) 2022 Istituto Italiano di Tecnologia (IIT)
4 | #
5 | # This software may be modified and distributed under the terms of the
6 | # GPL-2+ license. See the accompanying LICENSE file for details.
7 | #
8 | #===============================================================================
9 |
10 | import cv2
11 | import numpy
12 | import open3d as o3d
13 | import sys
14 | from dataset_loader import Loader
15 |
16 |
17 | def eval_point_cloud(depth, rgb, max_depth, camera_parameters):
18 |     """Back-project the depth frame into a colored point cloud."""
19 |
20 |     width = camera_parameters['width']
21 |     height = camera_parameters['height']
22 |     fx = float(camera_parameters['fx'])
23 |     fy = float(camera_parameters['fy'])
24 |     cx = float(camera_parameters['cx'])
25 |     cy = float(camera_parameters['cy'])
26 |
27 |     # Normalized image coordinates x / z and y / z for every pixel
28 |     image_x_z = numpy.zeros((height, width), dtype = numpy.float32)
29 |     image_y_z = numpy.zeros((height, width), dtype = numpy.float32)
30 |     for v in range(height):
31 |         for u in range(width):
32 |             image_x_z[v, u] = (u - cx) / fx
33 |             image_y_z[v, u] = (v - cy) / fy
34 |
35 |     # Keep only the pixels closer than max_depth
36 |     valid_depth = depth < max_depth
37 |     coords_z = depth[valid_depth]
38 |     coords_x_z = image_x_z[valid_depth]
39 |     coords_y_z = image_y_z[valid_depth]
40 |
41 |     # OpenCV loads images as BGR, convert to RGB for the point colors
42 |     rgb_fixed = cv2.cvtColor(rgb, cv2.COLOR_BGR2RGB)
43 |     colors = rgb_fixed[valid_depth]
44 |
45 |     cloud = numpy.zeros((coords_z.shape[0], 3), dtype = numpy.float32)
46 |     cloud[:, 0] = coords_x_z * coords_z
47 |     cloud[:, 1] = coords_y_z * coords_z
48 |     cloud[:, 2] = coords_z
49 |
50 |     return cloud, colors
51 |
52 |
53 | def add_point_cloud(name, cloud, colors, size, scene):
54 |     """Add a colored point cloud to an Open3D scene."""
55 |
56 |     point_cloud = o3d.geometry.PointCloud()
57 |     point_cloud.points = o3d.utility.Vector3dVector(cloud)
58 |     point_cloud.colors = o3d.utility.Vector3dVector(colors / 255.0)
59 |
60 |     material = o3d.visualization.rendering.MaterialRecord()
61 |     material.shader = 'defaultUnlit'
62 |     material.point_size = size
63 |
64 |     scene.add_geometry(name, point_cloud, material)
65 |
66 |
67 | def main():
68 |     if len(sys.argv) < 2:
69 |         sys.exit('Usage: python sample_3d.py <object_name>')
70 |
71 |     fastycb_path = './'
72 |     object_folder = sys.argv[1]
73 |     loader = Loader(path = fastycb_path, object_name = object_folder)
74 |
75 |     # Load camera parameters
76 |     camera_parameters = loader.get_camera_parameters()
77 |
78 |     # Show sample point cloud for a given frame index
79 |     index = 1
80 |     rgb = loader.get_rgb(index)
81 |     depth = loader.get_depth(index)
82 |     max_depth = 1.0
83 |     cloud, cloud_colors = eval_point_cloud(depth, rgb, max_depth, camera_parameters)
84 |
85 |     try:
86 |         app = o3d.visualization.gui.Application.instance
87 |         app.initialize()
88 |
89 |         window = app.create_window("Point cloud viewer", 1024, 768)
90 |
91 |         widget3d = o3d.visualization.gui.SceneWidget()
92 |         widget3d.scene = o3d.visualization.rendering.Open3DScene(window.renderer)
93 |         widget3d.scene.set_background([1.0, 1.0, 1.0, 1.0])
94 |         window.add_child(widget3d)
95 |
96 |         add_point_cloud('cloud', cloud, cloud_colors, 2, widget3d.scene)
97 |         widget3d.setup_camera(60, widget3d.scene.bounding_box, [0, 0, 0])
98 |
99 |         app.run()
100 |     except Exception as e:
101 |         print(e)
102 |
103 |
104 | if __name__ == '__main__':
105 |     main()
106 |
--------------------------------------------------------------------------------