├── .gitignore ├── LICENSE ├── README.md ├── demo ├── camera_images.py ├── create_video.py ├── load_dataset.py ├── load_dataset_part.py ├── load_filtered_dicts.py ├── load_image_data_only.py ├── random_access.py ├── simulation_rollout.py └── using_flat_observations.py ├── pyproject.toml ├── setup.cfg ├── setup.py └── trifinger_rl_datasets ├── __init__.py ├── data ├── __init__.py ├── r1_camera180.yml ├── r1_camera300.yml ├── r1_camera60.yml ├── r3_camera180.yml ├── r3_camera300.yml ├── r3_camera60.yml ├── r4_camera180.yml ├── r4_camera300.yml ├── r4_camera60.yml ├── r5_camera180.yml ├── r5_camera300.yml ├── r5_camera60.yml ├── r6_camera180.yml ├── r6_camera300.yml ├── r6_camera60.yml ├── r7_camera180.yml ├── r7_camera300.yml ├── r7_camera60.yml ├── r8_camera180.yml ├── r8_camera300.yml ├── r8_camera60.yml └── trifingerpro_shuffle_cube_trajectory_fast.npy ├── dataset_env.py ├── evaluate_sim.py ├── evaluation.py ├── policy_base.py ├── py.typed ├── sampling_utils.py ├── sim_env.py └── utils.py /.gitignore: -------------------------------------------------------------------------------- 1 | *~ 2 | *.swp 3 | *.egg-info 4 | *.pyc 5 | pycache/ 6 | build/ 7 | dist/ 8 | .vscode/ 9 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | BSD 3-Clause License 2 | 3 | Copyright (c) 2022, Max Planck Gesellschaft. 4 | All rights reserved. 5 | 6 | Redistribution and use in source and binary forms, with or without 7 | modification, are permitted provided that the following conditions are met: 8 | 9 | 1. Redistributions of source code must retain the above copyright notice, this 10 | list of conditions and the following disclaimer. 11 | 12 | 2. Redistributions in binary form must reproduce the above copyright notice, 13 | this list of conditions and the following disclaimer in the documentation 14 | and/or other materials provided with the distribution. 15 | 16 | 3. Neither the name of the copyright holder nor the names of its 17 | contributors may be used to endorse or promote products derived from 18 | this software without specific prior written permission. 19 | 20 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 21 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 22 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 23 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 24 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 25 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 26 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 27 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 28 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 29 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 30 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # TriFinger RL Datasets 2 | 3 | This repository provides offline reinforcement learning datasets collected on the real TriFinger platform and in a simulated version of the environment. 
The paper ["Benchmarking Offline Reinforcement Learning on Real-Robot Hardware"](https://openreview.net/pdf?id=3k5CUGDLNdd) provides more details on the datasets and benchmarks offline RL algorithms on them. All datasets are available with camera images as well. 4 | 5 | More detailed information about the simulated environment, the datasets and on how to run experiments on a cluster of real TriFinger robots can be found in the [documentation](https://webdav.tuebingen.mpg.de/trifinger-rl/docs/). 6 | 7 | Some of the datasets were used during the [Real Robot Challenge 2022](https://real-robot-challenge.com). 8 | 9 | ## Installation 10 | 11 | To install the package run with python 3.8 in the root directory of the repository (we recommend doing this in a virtual environment): 12 | 13 | ```bash 14 | pip install --upgrade pip # make sure the most recent version of pip is installed 15 | pip install . 16 | ``` 17 | 18 | ## Usage 19 | 20 | This section provides short examples of how to load datasets and evaluate a policy in simulation. More details on how to work with the datasets can be found in the [documentation](https://webdav.tuebingen.mpg.de/trifinger-rl/docs/). 21 | 22 | 23 | ### Loading a dataset 24 | 25 | The datasets are accessible via gym environments which are automatically registered when importing the package. They are automatically downloaded when requested and stored in `~/.trifinger_rl_datasets` as Zarr files by default (see the [documentation](https://webdav.tuebingen.mpg.de/trifinger-rl/docs/) for custom paths to the datasets). The code for loading the datasets follows the interface suggested by [D4RL](https://github.com/rail-berkeley/d4rl) and extends it where needed. 26 | 27 | As an alternative to the automatic download, the datasets can also be downloaded 28 | manually from the [Edmond repository](https://edmond.mpdl.mpg.de/dataset.xhtml?persistentId=doi:10.17617/3.DXZ7TL). 29 | 30 | The datasets are named following the pattern `trifinger-cube-task-source-type-v0` where `task` is either `push` or `lift`, `source` is either `sim` or `real` and `type` can be either `mixed`, `weak-n-expert` or `expert`. 31 | 32 | By default the observations are loaded as flat arrays. For the simulated datasets the environment can be stepped and visualized. Example usage (also see `demo/load_dataset.py`): 33 | 34 | ```python 35 | import gymnasium as gym 36 | 37 | import trifinger_rl_datasets 38 | 39 | env = gym.make( 40 | "trifinger-cube-push-sim-expert-v0", 41 | visualization=True, # enable visualization 42 | ) 43 | 44 | dataset = env.get_dataset() 45 | 46 | print("First observation: ", dataset["observations"][0]) 47 | print("First action: ", dataset["actions"][0]) 48 | print("First reward: ", dataset["rewards"][0]) 49 | 50 | obs, info = env.reset() 51 | truncated = False 52 | 53 | while not truncated: 54 | obs, rew, terminated, truncated, info = env.step(env.action_space.sample()) 55 | ``` 56 | 57 | Alternatively, the observations can be obtained as nested dictionaries. This simplifies working with the data. As some parts of the observations might be more useful than others, it is also possible to filter the observations when requesting dictionaries (see `demo/load_filtered_dicts.py`): 58 | 59 | ```python 60 | # Nested dictionary defines which observations to keep. 61 | # Everything that is not included or has value False 62 | # will be dropped. 
63 | obs_to_keep = { 64 | "robot_observation": { 65 | "position": True, 66 | "velocity": True, 67 | "fingertip_force": False, 68 | }, 69 | "object_observation": {"keypoints": True}, 70 | } 71 | env = gym.make( 72 | args.env_name, 73 | # filter observations, 74 | obs_to_keep=obs_to_keep, 75 | ) 76 | ``` 77 | 78 | All datasets come in two versions: with and without camera observations. The versions with camera observations contain `-image` in their name. Despite PNG image compression they are more than one order of magnitude bigger than the imageless versions. To avoid running out of memory, a part of a dataset can be loaded by specifying a range of timesteps: 79 | 80 | ```python 81 | env = gym.make( 82 | "trifinger-cube-push-real-expert-image-v0", 83 | disable_env_checker=True 84 | ) 85 | 86 | # load only a subset of obervations, actions and rewards 87 | dataset = env.get_dataset(rng=(1000, 2000)) 88 | ``` 89 | 90 | The camera observations corresponding to this range are then returned in `dataset["images"]` with the following dimensions: 91 | 92 | ```python 93 | n_timesteps, n_cameras, n_channels, height, width = dataset["images"].shape 94 | ``` 95 | 96 | ### Evaluating a policy in simulation 97 | 98 | This package contains an executable module `trifinger_rl_datasets.evaluate_sim`, which 99 | can be used to evaluate a policy in simulation. As arguments it expects the task 100 | ("push" or "lift") and a Python class that implements the policy, following the 101 | `PolicyBase` interface: 102 | 103 | python3 -m trifinger_rl_datasets.evaluate_sim push my_package.MyPolicy 104 | 105 | For more options see `--help`. 106 | 107 | ## How to cite 108 | 109 | The paper ["Benchmarking Offline Reinforcement Learning on Real-Robot Hardware"](https://openreview.net/pdf?id=3k5CUGDLNdd) introducing the datasets was published at ICLR 2023: 110 | 111 | ``` 112 | @inproceedings{ 113 | guertler2023benchmarking, 114 | title={Benchmarking Offline Reinforcement Learning on Real-Robot Hardware}, 115 | author={Nico G{\"u}rtler and Sebastian Blaes and Pavel Kolev and Felix Widmaier and Manuel Wuthrich and Stefan Bauer and Bernhard Sch{\"o}lkopf and Georg Martius}, 116 | booktitle={The Eleventh International Conference on Learning Representations }, 117 | year={2023}, 118 | url={https://openreview.net/forum?id=3k5CUGDLNdd} 119 | } 120 | ``` -------------------------------------------------------------------------------- /demo/camera_images.py: -------------------------------------------------------------------------------- 1 | """Demo including camera images in the observation.""" 2 | 3 | 4 | import argparse 5 | 6 | import cv2 7 | import gymnasium as gym 8 | import numpy as np 9 | 10 | import trifinger_rl_datasets # noqa 11 | 12 | 13 | if __name__ == "__main__": 14 | argparser = argparse.ArgumentParser(description=__doc__) 15 | argparser.add_argument( 16 | "--env", 17 | type=str, 18 | default="trifinger-cube-push-sim-expert-v0", 19 | help="Name of dataset environment to load.", 20 | ) 21 | argparser.add_argument( 22 | "--flatten-obs", action="store_true", help="Flattens observations if set." 23 | ) 24 | argparser.add_argument( 25 | "--no-visualization", 26 | dest="visualization", 27 | action="store_false", 28 | help="Disables visualization, i.e., rendering of the environment in a GUI.", 29 | ) 30 | argparser.add_argument( 31 | "--data-dir", type=str, default=None, help="Path to data directory." 
32 | ) 33 | args = argparser.parse_args() 34 | 35 | env = gym.make( 36 | args.env, 37 | disable_env_checker=True, 38 | visualization=args.visualization, 39 | # include camera images in the observation 40 | image_obs=True, 41 | flatten_obs=args.flatten_obs, 42 | data_dir=args.data_dir, 43 | ) 44 | obs, info = env.reset() 45 | truncated = False 46 | terminated = False 47 | 48 | # do one step in environment to get observations 49 | obs, rew, terminated, truncated, info = env.step(env.action_space.sample()) 50 | 51 | if args.flatten_obs: 52 | # obs is a tuple containing an array with all observations but the images 53 | # and an array containing the images 54 | other_obs, images = obs 55 | print("Shape of all observations except images: ", other_obs.shape) 56 | print("Shape of images: ", images.shape) 57 | else: 58 | # obs is a nested dictionary if flatten_obs is False 59 | images = obs["camera_observation"]["images"] 60 | print("Shape of images: ", images.shape) 61 | 62 | # change to (height, width, channels) format for cv2 63 | images = np.transpose(images, (0, 2, 3, 1)) 64 | images = np.concatenate(images, axis=0) 65 | # convert RGB to BGR for cv2 66 | output_image = cv2.cvtColor(images, cv2.COLOR_RGB2BGR) 67 | # show images from last time step 68 | cv2.imshow("Camera images", output_image) 69 | cv2.waitKey(0) 70 | cv2.destroyAllWindows() 71 | -------------------------------------------------------------------------------- /demo/create_video.py: -------------------------------------------------------------------------------- 1 | """Create video from camera images.""" 2 | 3 | 4 | import argparse 5 | 6 | import cv2 7 | import gymnasium as gym 8 | import numpy as np 9 | 10 | import trifinger_rl_datasets # noqa 11 | 12 | 13 | def create_video( 14 | env, output_path, camera_id, timestep_range, zarr_path, show_reward=True 15 | ): 16 | """Create video from camera images. 17 | 18 | Args: 19 | dataset (dict): Dataset to load images from. 20 | output_path (str): Output path for video file. 21 | camera_id (str): ID of the camera for which to load images. 22 | """ 23 | 24 | image_range = env.convert_timestep_to_image_index(np.array(timestep_range)) 25 | # load relevant part of images in dataset 26 | images = env.get_image_data( 27 | # images from 3 cameras for each timestep 28 | rng=(image_range[0], image_range[1] + 3), 29 | zarr_path=zarr_path, 30 | timestep_dimension=True, 31 | ) 32 | if show_reward: 33 | # load rewards for the specified timesteps 34 | image_indices = env.convert_timestep_to_image_index( 35 | np.arange(*tuple(timestep_range)) 36 | ) 37 | dataset = env.get_dataset( 38 | rng=(timestep_range[0], timestep_range[1] + 1), zarr_path=zarr_path 39 | ) 40 | 41 | # select only images from the specified camera 42 | images = images[:, camera_id, ...] 
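# note: `get_image_data` with timestep_dimension=True is expected to return an array of shape (n_camera_timesteps, n_cameras, n_channels, height, width),
# so after selecting a single camera, `images` should have shape (n_camera_timesteps, n_channels, height, width)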
43 | 44 | # create video writer 45 | fourcc = cv2.VideoWriter_fourcc(*"mp4v") 46 | fps = 10 47 | video_writer = cv2.VideoWriter( 48 | output_path, fourcc, fps, (images.shape[-1], images.shape[-2]) 49 | ) 50 | 51 | max_bar_height = 50 52 | # loop over images 53 | for i, image in enumerate(images): 54 | # convert to channel-last format for cv2 55 | img = np.transpose(image, (1, 2, 0)) 56 | if show_reward: 57 | # draw bar with height proportional to reward 58 | index = np.argmax(image_indices == i * 3 + image_range[0]) 59 | reward = dataset["rewards"][index] 60 | img[img.shape[0] - max_bar_height :, 260:, :] = 150 61 | bar_height = int(reward * max_bar_height) 62 | img[img.shape[0] - bar_height :, 260:, 1] = 255 63 | # convert RGB to BGR for cv2 64 | img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR) 65 | # write image to video 66 | video_writer.write(img) 67 | 68 | # close video writer 69 | video_writer.release() 70 | 71 | 72 | if __name__ == "__main__": 73 | argparser = argparse.ArgumentParser(description=__doc__) 74 | argparser.add_argument("output_path", type=str, help="Path to output video file.") 75 | argparser.add_argument( 76 | "camera_id", type=int, help="ID of the camera for which to load images." 77 | ) 78 | argparser.add_argument( 79 | "--env", 80 | type=str, 81 | default="trifinger-cube-push-real-expert-image-v0", 82 | help="Name of dataset environment to load.", 83 | ) 84 | argparser.add_argument( 85 | "--timestep-range", 86 | type=int, 87 | nargs=2, 88 | default=[0, 750], 89 | help="Range of timesteps (not camera timesteps) to load image data for.", 90 | ) 91 | argparser.add_argument( 92 | "--zarr-path", type=str, default=None, help="Path to Zarr file to load." 93 | ) 94 | argparser.add_argument( 95 | "--data-dir", type=str, default=None, help="Path to data directory." 96 | ) 97 | argparser.add_argument( 98 | "--no-reward", action="store_true", help="Do not show reward bar.
" 99 | ) 100 | args = argparser.parse_args() 101 | 102 | env = gym.make(args.env, disable_env_checker=True, data_dir=args.data_dir) 103 | create_video( 104 | env, 105 | args.output_path, 106 | args.camera_id, 107 | args.timestep_range, 108 | args.zarr_path, 109 | not args.no_reward, 110 | ) 111 | -------------------------------------------------------------------------------- /demo/load_dataset.py: -------------------------------------------------------------------------------- 1 | """Load a complete dataset into memory and perform a rollout.""" 2 | 3 | import argparse 4 | 5 | import gymnasium as gym 6 | 7 | import trifinger_rl_datasets # noqa 8 | 9 | 10 | if __name__ == "__main__": 11 | argparser = argparse.ArgumentParser(description=__doc__) 12 | argparser.add_argument( 13 | "--env", 14 | type=str, 15 | default="trifinger-cube-push-sim-expert-v0", 16 | help="Name of dataset environment to load.", 17 | ) 18 | argparser.add_argument( 19 | "--data-dir", 20 | type=str, 21 | default=None, 22 | help="Path to data directory.If not set, the default data directory '~/.trifinger_rl_datasets' is used.", 23 | ) 24 | args = argparser.parse_args() 25 | 26 | env = gym.make( 27 | args.env, 28 | disable_env_checker=True, 29 | visualization=True, # enable visualization 30 | data_dir=args.data_dir, 31 | ) 32 | dataset = env.get_dataset() 33 | 34 | n_transitions = len(dataset["observations"]) 35 | print("Number of transitions: ", n_transitions) 36 | 37 | assert dataset["actions"].shape[0] == n_transitions 38 | assert dataset["rewards"].shape[0] == n_transitions 39 | 40 | print("First observation: ", dataset["observations"][0]) 41 | 42 | obs, info = env.reset() 43 | truncated = False 44 | terminated = False 45 | while not (truncated or terminated): 46 | obs, rew, terminated, truncated, info = env.step(env.action_space.sample()) 47 | -------------------------------------------------------------------------------- /demo/load_dataset_part.py: -------------------------------------------------------------------------------- 1 | """Load part of a datset defined by a range of transitions.""" 2 | 3 | 4 | import argparse 5 | 6 | import gymnasium as gym 7 | 8 | import trifinger_rl_datasets # noqa 9 | 10 | 11 | if __name__ == "__main__": 12 | argparser = argparse.ArgumentParser(description=__doc__) 13 | argparser.add_argument( 14 | "--env", 15 | type=str, 16 | default="trifinger-cube-push-real-expert-v0", 17 | help="Name of dataset environment to load.", 18 | ) 19 | argparser.add_argument( 20 | "--range", 21 | type=int, 22 | nargs=2, 23 | default=[1000, 2000], 24 | help="Range of timesteps to load image data for.", 25 | ) 26 | argparser.add_argument( 27 | "--zarr-path", type=str, default=None, help="Path to Zarr file to load." 28 | ) 29 | argparser.add_argument( 30 | "--flatten-obs", action="store_true", help="Flatten observations." 31 | ) 32 | argparser.add_argument( 33 | "--data-dir", type=str, default=None, help="Path to data directory." 
34 | ) 35 | args = argparser.parse_args() 36 | 37 | env = gym.make( 38 | args.env, 39 | disable_env_checker=True, 40 | flatten_obs=args.flatten_obs, 41 | data_dir=args.data_dir, 42 | ) 43 | 44 | # load only a subset of obervations, actions and rewards 45 | dataset = env.get_dataset(rng=tuple(args.range), zarr_path=args.zarr_path) 46 | 47 | n_observations = len(dataset["observations"]) 48 | print("Number of observations: ", n_observations) 49 | 50 | assert dataset["actions"].shape[0] == n_observations 51 | assert dataset["rewards"].shape[0] == n_observations 52 | -------------------------------------------------------------------------------- /demo/load_filtered_dicts.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import gymnasium as gym 3 | 4 | import trifinger_rl_datasets # noqa 5 | 6 | 7 | if __name__ == "__main__": 8 | parser = argparse.ArgumentParser( 9 | description="Demonstrate how to customize observation space by filtering." 10 | ) 11 | parser.add_argument( 12 | "--env-name", 13 | type=str, 14 | default="trifinger-cube-push-sim-expert-v0", 15 | help="Name of the gym environment to load.", 16 | ) 17 | parser.add_argument( 18 | "--do-not-filter-obs", 19 | action="store_true", 20 | help="Do not filter observations if this is set.", 21 | ) 22 | parser.add_argument( 23 | "--flatten-obs", 24 | action="store_true", 25 | help="Flatten observations again after filtering if this is set.", 26 | ) 27 | parser.add_argument( 28 | "--data-dir", 29 | type=str, 30 | default=None, 31 | help="Path to data directory.If not set, the default data directory '~/.trifinger_rl_datasets' is used.", 32 | ) 33 | args = parser.parse_args() 34 | 35 | # Nested dictionary defines which observations to keep. 36 | # Everything that is not included or has value False 37 | # will be dropped. 38 | obs_to_keep = { 39 | "robot_observation": { 40 | "position": True, 41 | "velocity": True, 42 | "fingertip_force": False, 43 | }, 44 | "camera_observation": {"object_keypoints": True}, 45 | } 46 | env = gym.make( 47 | args.env_name, 48 | disable_env_checker=True, 49 | # enable visualization, 50 | visualization=True, 51 | # filter observations, 52 | obs_to_keep=None if args.do_not_filter_obs else obs_to_keep, 53 | # flatten observation 54 | flatten_obs=args.flatten_obs, 55 | data_dir=args.data_dir, 56 | ) 57 | 58 | dataset = env.get_dataset() 59 | 60 | n_transitions = len(dataset["observations"]) 61 | print("Number of transitions: ", n_transitions) 62 | 63 | assert dataset["actions"].shape[0] == n_transitions 64 | assert dataset["rewards"].shape[0] == n_transitions 65 | 66 | print("First observation: ", dataset["observations"][0]) 67 | 68 | obs, info = env.reset() 69 | truncated = False 70 | terminated = False 71 | while not (truncated or terminated): 72 | obs, rew, terminated, truncated, info = env.step(env.action_space.sample()) 73 | -------------------------------------------------------------------------------- /demo/load_image_data_only.py: -------------------------------------------------------------------------------- 1 | """Load image data from Zarr file and display it.""" 2 | 3 | 4 | import argparse 5 | 6 | import cv2 7 | import gymnasium as gym 8 | import numpy as np 9 | 10 | import trifinger_rl_datasets # noqa 11 | 12 | 13 | def show_images(images, timestep_dimension): 14 | """Show loaded images. 15 | 16 | Args: 17 | images (np.ndarray): Array containing the image data. 
18 | timestep_dimension (bool): If True, the first dimension of the 19 | images array is assumed to correspond to camera timesteps. 20 | Otherwise, the first dimension is assumed to correspond to 21 | individual images.""" 22 | 23 | if timestep_dimension: 24 | n_timesteps, n_cameras, n_channels, height, width = images.shape 25 | output_image = np.zeros( 26 | (n_cameras * height, n_timesteps * width, n_channels), dtype=np.uint8 27 | ) 28 | else: 29 | n_images, n_channels, height, width = images.shape 30 | output_image = np.zeros((height, n_images * width, n_channels), dtype=np.uint8) 31 | # loop over tuples containing images from all cameras at one timestep 32 | for i, image_s in enumerate(images): 33 | if timestep_dimension: 34 | # concatenate images from all cameras along the height axis 35 | image_s = np.concatenate(image_s, axis=1) 36 | # change to (height, width, channels) format for cv2 37 | image_s = np.transpose(image_s, (1, 2, 0)) 38 | # copy column of camera images to output image 39 | output_image[:, i * width : (i + 1) * width, ...] = image_s 40 | # convert RGB to BGR for cv2 41 | output_image = cv2.cvtColor(output_image, cv2.COLOR_RGB2BGR) 42 | 43 | if timestep_dimension: 44 | legend = "Each column corresponds to the camera images at one timestep." 45 | else: 46 | legend = "Camera images" 47 | print(legend) 48 | print("Press any key to close window.") 49 | cv2.imshow(legend, output_image) 50 | cv2.waitKey(0) 51 | cv2.destroyAllWindows() 52 | 53 | 54 | if __name__ == "__main__": 55 | argparser = argparse.ArgumentParser(description=__doc__) 56 | argparser.add_argument( 57 | "--env", 58 | type=str, 59 | default="trifinger-cube-push-real-expert-image-v0", 60 | help="Name of dataset environment to load.", 61 | ) 62 | argparser.add_argument( 63 | "--n-timesteps", 64 | type=int, 65 | default=10, 66 | help="Number of camera timesteps to load image data for.", 67 | ) 68 | argparser.add_argument( 69 | "--zarr-path", type=str, default=None, help="Path to Zarr file to load." 70 | ) 71 | argparser.add_argument( 72 | "--do-not-show-images", 73 | action="store_true", 74 | help="Do not show images if this is set.", 75 | ) 76 | argparser.add_argument( 77 | "--no-timestep-dimension", 78 | dest="timestep_dimension", 79 | action="store_false", 80 | help="Do not include the timestep dimension in the output array.", 81 | ) 82 | argparser.add_argument( 83 | "--data-dir", type=str, default=None, help="Path to data directory."
84 | ) 85 | args = argparser.parse_args() 86 | 87 | # create environment 88 | env = gym.make(args.env, disable_env_checker=True, data_dir=args.data_dir) 89 | 90 | # get information about image data 91 | image_stats = env.get_image_stats(zarr_path=args.zarr_path) 92 | print("Image dataset:") 93 | for key, value in image_stats.items(): 94 | print(f"{key}: {value}") 95 | 96 | # load image data 97 | print(f"Loading {args.n_timesteps} timesteps of image data.") 98 | from time import time 99 | 100 | t0 = time() 101 | images = env.get_image_data( 102 | # images from 3 cameras for each timestep 103 | rng=(0, 3 * args.n_timesteps), 104 | zarr_path=args.zarr_path, 105 | timestep_dimension=args.timestep_dimension, 106 | ) 107 | print(f"Loading took {time() - t0:.3f} seconds.") 108 | 109 | # show images 110 | if not args.do_not_show_images: 111 | show_images(images, args.timestep_dimension) 112 | -------------------------------------------------------------------------------- /demo/random_access.py: -------------------------------------------------------------------------------- 1 | """Load small parts of dataset at random positions to test performance.""" 2 | 3 | 4 | import argparse 5 | from time import time 6 | 7 | import gymnasium as gym 8 | import numpy as np 9 | 10 | import trifinger_rl_datasets # noqa 11 | 12 | 13 | if __name__ == "__main__": 14 | argparser = argparse.ArgumentParser(description=__doc__) 15 | argparser.add_argument( 16 | "--env", 17 | type=str, 18 | default="trifinger-cube-push-real-expert-v0", 19 | help="Name of dataset environment to load.", 20 | ) 21 | argparser.add_argument( 22 | "--n-parts", 23 | type=int, 24 | default=500, 25 | help="Number of contiguous parts to load from file.", 26 | ) 27 | argparser.add_argument( 28 | "--part-size", 29 | type=int, 30 | default=10, 31 | help="Number of transitions to load per part.", 32 | ) 33 | argparser.add_argument( 34 | "--zarr_path", type=str, default=None, help="Path to Zarr file to load." 
35 | ) 36 | argparser.add_argument( 37 | "--data-dir", 38 | type=str, 39 | default=None, 40 | help="Path to data directory.If not set, the default data directory '~/.trifinger_rl_datasets' is used.", 41 | ) 42 | args = argparser.parse_args() 43 | 44 | # create environment 45 | env = gym.make(args.env, disable_env_checker=True, data_dir=args.data_dir) 46 | 47 | stats = env.get_dataset_stats(zarr_path=args.zarr_path) 48 | print("Number of timesteps in dataset: ", stats["n_timesteps"]) 49 | 50 | # load subsets of the dataset at random positions 51 | indices = [] 52 | for i in range(args.n_parts): 53 | start = np.random.randint(0, stats["n_timesteps"] - args.part_size) 54 | if args.part_size == 1: 55 | indices.append(start) 56 | else: 57 | indices.extend(range(start, start + args.part_size)) 58 | indices = np.array(indices) 59 | t0 = time() 60 | part = env.get_dataset(indices=indices, zarr_path=args.zarr_path) 61 | t1 = time() 62 | print(f"Loaded {args.n_parts} parts of size {args.part_size} in {t1 - t0:.2f} s") 63 | 64 | print("Observation shape: ", part["observations"].shape) 65 | print("Action shape: ", part["actions"].shape) 66 | -------------------------------------------------------------------------------- /demo/simulation_rollout.py: -------------------------------------------------------------------------------- 1 | """Demo for doing a rollout in simulation.""" 2 | 3 | 4 | import argparse 5 | 6 | import gymnasium as gym 7 | import numpy as np 8 | 9 | import trifinger_rl_datasets # noqa 10 | 11 | 12 | if __name__ == "__main__": 13 | argparser = argparse.ArgumentParser(description=__doc__) 14 | argparser.add_argument( 15 | "--env", 16 | type=str, 17 | default="trifinger-cube-push-sim-expert-v0", 18 | help="Name of dataset environment to load.", 19 | ) 20 | argparser.add_argument( 21 | "--no-visualization", 22 | dest="visualization", 23 | action="store_false", 24 | help="Disables visualization, i.e., rendering of the environment in a GUI.", 25 | ) 26 | args = argparser.parse_args() 27 | 28 | env = gym.make(args.env, disable_env_checker=True, visualization=args.visualization) 29 | obs, info = env.reset() 30 | truncated = False 31 | 32 | while not truncated: 33 | obs, rew, terminated, truncated, info = env.step(env.action_space.sample()) 34 | -------------------------------------------------------------------------------- /demo/using_flat_observations.py: -------------------------------------------------------------------------------- 1 | """How to use the provided index ranges to work with flat observations.""" 2 | 3 | import argparse 4 | import json 5 | 6 | import gymnasium as gym 7 | 8 | import trifinger_rl_datasets # noqa 9 | 10 | 11 | if __name__ == "__main__": 12 | argparser = argparse.ArgumentParser(description=__doc__) 13 | argparser.add_argument( 14 | "--env", 15 | type=str, 16 | default="trifinger-cube-push-real-expert-v0", 17 | help="Name of dataset environment to load.", 18 | ) 19 | argparser.add_argument( 20 | "--data-dir", 21 | type=str, 22 | default=None, 23 | help="Path to data directory. 
If not set, the default data directory '~/.trifinger_rl_datasets' is used.", 24 | ) 25 | args = argparser.parse_args() 26 | 27 | env = gym.make(args.env, disable_env_checker=True, data_dir=args.data_dir) 28 | 29 | # load only a subset of obervations, actions and rewards 30 | n_observations = 10 31 | dataset = env.get_dataset(rng=(0, 750)) 32 | 33 | # get mapping from observation components to index ranges 34 | obs_indices, obs_shapes = env.get_obs_indices() 35 | 36 | print("Observation component indices: ", json.dumps(obs_indices, indent=4)) 37 | print("Observation component shapes: ", json.dumps(obs_shapes, indent=4)) 38 | 39 | # print cube position over time 40 | print("Cube position over time: ") 41 | for i in range(n_observations): 42 | index_range = obs_indices["camera_observation"]["object_position"] 43 | print(dataset["observations"][i][slice(*index_range)]) 44 | -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- 1 | [build-system] 2 | requires = ["setuptools"] 3 | build-backend = "setuptools.build_meta" 4 | 5 | [tool.mypy] 6 | ignore_missing_imports = true 7 | exclude = "build" 8 | -------------------------------------------------------------------------------- /setup.cfg: -------------------------------------------------------------------------------- 1 | [metadata] 2 | name = trifinger_rl_datasets 3 | version = attr: trifinger_rl_datasets.__version__ 4 | description = Gym environments which provide offline RL datasets collected on the TriFinger system. 5 | long_description = file: README.md 6 | long_description_content_type = text/markdown 7 | author = Nico Gürtler 8 | author_email = nico.guertler@tuebingen.mpg.de 9 | keywords = 10 | offline reinforcement learning 11 | reinforcement learning 12 | robotics 13 | TriFinger 14 | Real Robot Challenge 15 | dexterous manipulation 16 | license = BSD 3-Clause 17 | 18 | [options] 19 | packages = find: 20 | install_requires = 21 | numpy 22 | gymnasium 23 | zarr 24 | tqdm 25 | numpy-quaternion 26 | trifinger_simulation>=1.4.0 27 | opencv-python 28 | lmdb 29 | 30 | [options.package_data] 31 | trifinger_rl_datasets = py.typed 32 | trifinger_rl_datasets.data = 33 | *.npy 34 | *.yml 35 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | from setuptools import setup 2 | 3 | setup() 4 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/__init__.py: -------------------------------------------------------------------------------- 1 | __version__ = "1.0.3" 2 | 3 | from gymnasium.envs.registration import register 4 | 5 | from .dataset_env import TriFingerDatasetEnv 6 | from .evaluation import Evaluation 7 | from .policy_base import PolicyBase, PolicyConfig 8 | from .sim_env import SimTriFingerCubeEnv 9 | 10 | 11 | base_url = "https://robots.real-robot-challenge.com/public/trifinger_rl_datasets/" 12 | 13 | dataset_names = [ 14 | "trifinger-cube-push-real-expert-v0", 15 | "trifinger-cube-push-real-expert-image-v0", 16 | "trifinger-cube-push-real-weak-n-expert-v0", 17 | "trifinger-cube-push-real-weak-n-expert-image-v0", 18 | "trifinger-cube-push-real-half-expert-v0", 19 | "trifinger-cube-push-real-half-expert-image-v0", 20 | "trifinger-cube-push-real-mixed-v0", 21 | "trifinger-cube-push-real-mixed-image-v0", 22 | "trifinger-cube-push-sim-expert-v0", 
23 | "trifinger-cube-push-sim-expert-image-v0", 24 | "trifinger-cube-push-sim-weak-n-expert-v0", 25 | "trifinger-cube-push-sim-weak-n-expert-image-v0", 26 | "trifinger-cube-push-sim-half-expert-v0", 27 | "trifinger-cube-push-sim-half-expert-image-v0", 28 | "trifinger-cube-push-sim-mixed-v0", 29 | "trifinger-cube-push-sim-mixed-image-v0", 30 | "trifinger-cube-lift-real-smooth-expert-v0", 31 | "trifinger-cube-lift-real-smooth-expert-image-v0", 32 | "trifinger-cube-lift-real-expert-v0", 33 | "trifinger-cube-lift-real-expert-image-v0", 34 | "trifinger-cube-lift-real-weak-n-expert-v0", 35 | "trifinger-cube-lift-real-weak-n-expert-image-v0", 36 | "trifinger-cube-lift-real-half-expert-v0", 37 | "trifinger-cube-lift-real-half-expert-image-v0", 38 | "trifinger-cube-lift-real-mixed-v0", 39 | "trifinger-cube-lift-real-mixed-image-v0", 40 | "trifinger-cube-lift-sim-expert-v0", 41 | "trifinger-cube-lift-sim-expert-image-v0", 42 | "trifinger-cube-lift-sim-weak-n-expert-v0", 43 | "trifinger-cube-lift-sim-weak-n-expert-image-v0", 44 | "trifinger-cube-lift-sim-half-expert-v0", 45 | "trifinger-cube-lift-sim-half-expert-image-v0", 46 | "trifinger-cube-lift-sim-mixed-v0", 47 | "trifinger-cube-lift-sim-mixed-image-v0", 48 | ] 49 | 50 | task_params = { 51 | "push": { 52 | "ref_min_score": 0.0, 53 | "ref_max_score": 1.0 * 15000 / 20, 54 | "trifinger_kwargs": { 55 | "episode_length": 750, 56 | "difficulty": 1, 57 | "keypoint_obs": True, 58 | "obs_action_delay": 10, 59 | }, 60 | }, 61 | "lift": { 62 | "ref_min_score": 0.0, 63 | "ref_max_score": 1.0 * 30000 / 20, 64 | "trifinger_kwargs": { 65 | "episode_length": 1500, 66 | "difficulty": 4, 67 | "keypoint_obs": True, 68 | "obs_action_delay": 2, 69 | }, 70 | } 71 | } 72 | 73 | # add the missing parameters for all environments 74 | dataset_params = [] 75 | for dataset_name in dataset_names: 76 | dataset_url = base_url + f"{dataset_name}.zarr/dataset.yaml" 77 | params = { 78 | "name": dataset_name, 79 | "dataset_url": dataset_url, 80 | "real_robot": "real" in dataset_name, 81 | "image_obs": "image" in dataset_name, 82 | } 83 | task = dataset_name.split("-")[2] 84 | params.update(task_params[task]) 85 | dataset_params.append(params) 86 | 87 | 88 | def get_env(**kwargs): 89 | return TriFingerDatasetEnv(**kwargs) 90 | 91 | 92 | for params in dataset_params: 93 | register( 94 | id=params["name"], entry_point="trifinger_rl_datasets:get_env", kwargs=params 95 | ) 96 | 97 | 98 | __all__ = ("TriFingerDatasetEnv", "Evaluation", "PolicyBase", "PolicyConfig", "get_env") 99 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/rr-learning/trifinger_rl_datasets/f23729f90634f570d7c3dfc2fba5d8c0e838cf44/trifinger_rl_datasets/data/__init__.py -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r1_camera180.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 294.88772219999373 5 | - 0.0 6 | - 140.5328113426975 7 | - 0.0 8 | - 295.73310223701 9 | - 139.02122162180186 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera180 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.2329767683542472 19 | - 0.1264568334234187 20 | - -0.002491690223171375 21 | - 0.0005127279134358445 22 | - -0.10105977517686267 23 | rows: 1 24 | image_height: 270 25 | 
image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - 0.9998773558577191 30 | - 0.011414526348856103 31 | - 0.01008504910436589 32 | - -0.0005860472846181841 33 | - 0.014962484677142522 34 | - -0.8598542245158285 35 | - -0.5103030306087385 36 | - 0.006757462166832068 37 | - 0.0028470200349174054 38 | - 0.5103960640266085 39 | - -0.8599267125865473 40 | - 0.5398412631504335 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 4.532616904492608e-05 50 | - 0.002683896658220163 51 | - 0.0024633838589706476 52 | - 0.0005936876248080574 53 | - 0.002926484369290859 54 | - 0.0015377822108868704 55 | - 0.002593164324638185 56 | - 0.0005257987542181374 57 | - 0.0021738001191909102 58 | - 0.002590544587593434 59 | - 0.001537841000242577 60 | - 0.0002675030520101297 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r1_camera300.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 296.21756903928105 5 | - 0.0 6 | - 136.25729478510448 7 | - 0.0 8 | - 297.3694380877441 9 | - 136.15325821433555 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera300 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.2360106786968337 19 | - 0.11119950673909895 20 | - -0.0023231213828792865 21 | - -0.00011207631566610079 22 | - -0.03797400043315655 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.7007973495452335 30 | - -0.7132911751809331 31 | - -0.009335019672536672 32 | - 0.0029348055153886636 33 | - -0.5214052614698457 34 | - 0.5211156313513126 35 | - -0.6756982950215817 36 | - 0.006648752367397013 37 | - 0.4868367629135888 38 | - -0.4686606984044702 39 | - -0.7371146635360328 40 | - 0.529189920705302 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0019127399459759901 50 | - 0.001873846394493345 51 | - 0.0021123003294579246 52 | - 0.0005526193230891081 53 | - 0.0013674031004057444 54 | - 0.0019502970308061438 55 | - 0.0010921209022115455 56 | - 0.00039016189344718557 57 | - 0.0020164554088746207 58 | - 0.0020082287076101327 59 | - 0.000994872883899746 60 | - 0.000537056922161501 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r1_camera60.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 295.6937813935656 5 | - 0.0 6 | - 142.58976331067643 7 | - 0.0 8 | - 297.43474827422557 9 | - 132.94428296771792 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera60 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.23413559970500727 19 | - 0.11371630166152194 20 | - -0.0029621685301716794 21 | - 0.00022464397863156515 22 | - -0.05975819039614815 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.7015036040355102 30 | - 0.7126258352185996 31 | - -0.006292771681715977 32 | - -5.9947642844554126e-06 33 | - 0.5296527183057728 34 | - 0.5154373896823984 35 | - -0.6736331419719331 36 | - 0.007922712077366799 37 | - -0.47680902821528676 38 | - -0.47589060622531765 39 | - -0.7390285042410089 40 | - 
0.5283398784209727 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0019454006345476767 50 | - 0.0019278396785450142 51 | - 0.00316423439241316 52 | - 0.0005606517135422176 53 | - 0.0015869029962777175 54 | - 0.0022564232625609977 55 | - 0.0017537498861007086 56 | - 0.0004269523682819464 57 | - 0.002745775690234948 58 | - 0.002835346737572229 59 | - 0.001604045799541765 60 | - 0.00041279039785668164 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r3_camera180.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 296.74611824529234 5 | - 0.0 6 | - 139.40850050814652 7 | - 0.0 8 | - 297.6917158038904 9 | - 138.78193203925287 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera180 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.23795589930225886 19 | - 0.18043508606118522 20 | - -0.0035741471544204506 21 | - 9.138474527846751e-05 22 | - -0.22493844398993962 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - 0.9999547351934535 30 | - 0.008747792234960665 31 | - -0.0005467815435421723 32 | - 0.001765171033829126 33 | - 0.007193253602133133 34 | - -0.8546803606443765 35 | - -0.5190926134066003 36 | - 0.012616075729680792 37 | - -0.005009253005035439 38 | - 0.5190704588728084 39 | - -0.8547099687242182 40 | - 0.5393239940724293 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 2.1047563047475303e-05 50 | - 0.0023541870146826125 51 | - 0.0028569364094846524 52 | - 0.0006445672932799738 53 | - 0.002706191231962306 54 | - 0.001196432299462484 55 | - 0.001960106346274422 56 | - 0.0005881374226730213 57 | - 0.0025235168224777573 58 | - 0.0019601884188093378 59 | - 0.0011937326162342289 60 | - 0.0004392772389769057 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r3_camera300.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 298.5404817716902 5 | - 0.0 6 | - 141.68272217282268 7 | - 0.0 8 | - 299.0537835142228 9 | - 128.919845031919 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera300 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.2364432150861758 19 | - 0.13306051434807178 20 | - -0.0022006077617502296 21 | - 0.00011556539793348763 22 | - -0.09353412205255426 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.7085465317675347 30 | - -0.7056363472579175 31 | - -0.005623000466052991 32 | - -0.0014206576794110644 33 | - -0.5096260570266448 34 | - 0.5172088355708018 35 | - -0.6875841577095377 36 | - 0.015634778358790048 37 | - 0.4880922840847387 38 | - -0.48431963174976633 39 | - -0.7260807519787734 40 | - 0.5305425578768467 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0017850020349491676 50 | - 0.00178505207648795 51 | - 0.0010802644490479245 52 | - 0.0007533220965327408 53 | - 0.00133605144240045 54 | - 0.0012267301243739205 55 | - 0.0010191043032768865 56 | - 0.0005317818326090676 57 | - 
0.0020112858807655957 58 | - 0.0014727050131528396 59 | - 0.0009715933830996746 60 | - 0.000586868798196673 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r3_camera60.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 296.94060130336067 5 | - 0.0 6 | - 144.92494998990932 7 | - 0.0 8 | - 298.12851303387595 9 | - 126.27834126011763 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera60 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.224614441732425 19 | - 0.06610056999907407 20 | - -0.002957571082894125 21 | - 5.384206970943196e-05 22 | - 0.02843286766321863 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.700242886187359 30 | - 0.7138894778538976 31 | - -0.0035451792561288736 32 | - -0.0008993391228219956 33 | - 0.5130943940421685 34 | - 0.49982009089630225 35 | - -0.6977888521123732 36 | - 0.022773616806538905 37 | - -0.49637442103061286 38 | - -0.4904415106411645 39 | - -0.7162914535207946 40 | - 0.5311376719192535 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0016404696193842237 50 | - 0.0016128107368050496 51 | - 0.0019629463782010886 52 | - 0.0006384353946906304 53 | - 0.0011906806384894921 54 | - 0.0016991520977704723 55 | - 0.000657699303747005 56 | - 0.0005515050742389639 57 | - 0.0015389711063393422 58 | - 0.0018238721419053371 59 | - 0.0006461098573577129 60 | - 0.0006490388732209827 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r4_camera180.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 295.14033512458053 5 | - 0.0 6 | - 138.00666513483048 7 | - 0.0 8 | - 295.8963905623699 9 | - 140.1780817025803 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera180 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.2344692447120545 19 | - 0.1170210063291702 20 | - -0.0025387045089898526 21 | - 0.00028678240528109023 22 | - -0.05727351974597146 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - 0.9997577629246032 30 | - -0.019728351892018867 31 | - 0.008801429506700712 32 | - -0.011521565590285684 33 | - -0.012676964030929268 34 | - -0.8656406099279625 35 | - -0.5004868462171239 36 | - 0.002502652833109931 37 | - 0.017491868719132025 38 | - 0.5002604130100553 39 | - -0.8656918965259675 40 | - 0.5379001618103092 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 6.077392032405641e-05 50 | - 0.0032662823312658903 51 | - 0.0026589750881622597 52 | - 0.0006043392603399411 53 | - 0.0035533576087949576 54 | - 0.0012160406746025694 55 | - 0.002107293269154359 56 | - 0.0005246759072894277 57 | - 0.00226153485468361 58 | - 0.0021209420050399127 59 | - 0.0012169873079002672 60 | - 0.00032410552238325405 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r4_camera300.yml: 
-------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 295.06181051767146 5 | - 0.0 6 | - 145.562733589218 7 | - 0.0 8 | - 296.4879185143768 9 | - 137.77016924669405 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera300 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.23212995718456583 19 | - 0.09401968037398019 20 | - -0.003509759792242295 21 | - 0.00013186627698614768 22 | - -0.01598467384240246 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.7053343042234599 30 | - -0.7088205420070108 31 | - -0.007533171694605158 32 | - 0.0011711320881139364 33 | - -0.5224835176259123 34 | - 0.5270393971499698 35 | - -0.6702455873235632 36 | - 0.00026587658713440984 37 | - 0.47905680627685526 38 | - -0.46881272781441363 39 | - -0.7420919799225995 40 | - 0.5280109588474451 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0024126865232079794 50 | - 0.0023890681263911166 51 | - 0.002946376748396405 52 | - 0.0005506877497822348 53 | - 0.0019409159597162035 54 | - 0.002197967455094283 55 | - 0.0016438122291327678 56 | - 0.0004887254561488916 57 | - 0.0030791628178283214 58 | - 0.0026458670499804582 59 | - 0.0014879776474758704 60 | - 0.0004331257058881025 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r4_camera60.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 295.3700492360575 5 | - 0.0 6 | - 136.4014117306309 7 | - 0.0 8 | - 296.88972733684324 9 | - 135.80933925508987 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera60 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.23824457883790873 19 | - 0.1442700801176228 20 | - -0.0021768405888921223 21 | - 0.00018192403232659664 22 | - -0.11940468687914024 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.697667389466153 30 | - 0.7164043722161288 31 | - 0.002578390552528845 32 | - -0.00565327771472424 33 | - 0.5304059004291459 34 | - 0.51894730508639 35 | - -0.6703370518430334 36 | - 0.0020404541425615187 37 | - -0.4815748046122418 38 | - -0.46630623783099373 39 | - -0.7420436986786935 40 | - 0.5272097013879554 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.002252421338863501 50 | - 0.002191622368127827 51 | - 0.0029093705098407314 52 | - 0.0006367461049661742 53 | - 0.0017955664406818896 54 | - 0.0024356722232948837 55 | - 0.0015348295959123992 56 | - 0.00044482796113458047 57 | - 0.0026841325191941804 58 | - 0.00249536877201587 59 | - 0.001384897520199189 60 | - 0.00042975075002664144 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r5_camera180.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 291.80982121128756 5 | - 0.0 6 | - 144.54387475012228 7 | - 0.0 8 | - 292.6932628296855 9 | - 143.60050245646204 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera180 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - 
-0.25011027447601114 19 | - 0.17835090640327209 20 | - -0.0005548967943714432 21 | - -0.00011834890033780262 22 | - -0.17116234818120937 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - 0.9999561355609734 30 | - -0.007928987189101743 31 | - 0.0012997157055982735 32 | - -0.0038819862544355044 33 | - -0.006227550987616203 34 | - -0.8673134043586379 35 | - -0.4976936744704877 36 | - 0.0004404750551836095 37 | - 0.005076286783498226 38 | - 0.49767305858846406 39 | - -0.8673321903460468 40 | - 0.5342541371766851 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 2.1536542331724083e-05 50 | - 0.0027485718806892715 51 | - 0.003951423034502948 52 | - 0.000579933025544882 53 | - 0.0033298252173191963 54 | - 0.0021486158161714286 55 | - 0.003738761116627183 56 | - 0.00040839765132245833 57 | - 0.0034767334312054317 58 | - 0.0037343459774502965 59 | - 0.0021439780764358997 60 | - 0.00027782692179023416 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r5_camera300.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 291.9934185104733 5 | - 0.0 6 | - 145.16730905488873 7 | - 0.0 8 | - 293.67115397647314 9 | - 144.03250986816687 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera300 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.24838171967550723 19 | - 0.13640297400803297 20 | - -0.001154775930887984 21 | - 0.0003733066378391576 22 | - -0.05845365778706338 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.7019735477042157 30 | - -0.7121841801991278 31 | - 0.0006921259647857977 32 | - 0.00012632115564366634 33 | - -0.5388647008347751 34 | - 0.5305050468957526 35 | - -0.6543259972492321 36 | - -0.008541488099710197 37 | - 0.465647305518467 38 | - -0.4596992826623059 39 | - -0.7561778136707082 40 | - 0.5242294579470399 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0018227839003619188 50 | - 0.0017934781195245982 51 | - 0.0044512531258669216 52 | - 0.0004272643343968978 53 | - 0.0033011946127564896 54 | - 0.004164193916821838 55 | - 0.004298876519949212 56 | - 0.00035329568808827743 57 | - 0.004042147223758283 58 | - 0.0037551480998523506 59 | - 0.0037189251014270964 60 | - 0.00035881803664432967 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r5_camera60.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 292.0285930848304 5 | - 0.0 6 | - 144.00442873377975 7 | - 0.0 8 | - 293.95259618533396 9 | - 145.88910315941828 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera60 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.25341478022067476 19 | - 0.17099512391833938 20 | - -0.0014184980803399942 21 | - 0.0002517384743051436 22 | - -0.11906924909258061 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.699743668396842 30 | - 0.7143685685080576 31 | - -0.0029093553396054026 32 | - -0.001694552433137945 33 | - 
0.5422160152815948 34 | - 0.5284547088756493 35 | - -0.6532159782970992 36 | - -0.007850685480007464 37 | - -0.4651060402596818 38 | - -0.45867061536542447 39 | - -0.7571276016602879 40 | - 0.5228406235997342 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0018332186701647034 50 | - 0.0018016217206025726 51 | - 0.004612589974219499 52 | - 0.0005240054641435965 53 | - 0.003733258318774234 54 | - 0.0033416744035683925 55 | - 0.0046038170219549965 56 | - 0.0002967797970863807 57 | - 0.0044344916588204795 58 | - 0.004474702957796695 59 | - 0.003968001457894717 60 | - 0.00025687284923325745 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r6_camera180.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 292.88269384058924 5 | - 0.0 6 | - 143.79235188937437 7 | - 0.0 8 | - 294.30956500441897 9 | - 149.2307049454187 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera180 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.22994262887993008 19 | - 0.08894501128947027 20 | - -0.0023966108654140225 21 | - -6.935365949748796e-05 22 | - -0.017936186650326827 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - 0.9999720431932659 30 | - 0.006455366446344063 31 | - -1.881938647716094e-05 32 | - 0.003625256844403014 33 | - 0.005636796904021368 34 | - -0.8744145177364162 35 | - -0.48512989398178585 36 | - -0.01060281975896039 37 | - -0.0031473446745730743 38 | - 0.48512312844760014 39 | - -0.8744297446424611 40 | - 0.5330174005997446 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 1.2128947602583105e-05 50 | - 0.0016961801889458417 51 | - 0.003370986119038037 52 | - 0.0005772000185894434 53 | - 0.0024953903457797444 54 | - 0.0015517098495523652 55 | - 0.002798013802624924 56 | - 0.0005740488455741548 57 | - 0.002829567804180083 58 | - 0.002800862430880717 59 | - 0.0015539842076223718 60 | - 0.0003714630428281225 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r6_camera300.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 293.0572612023564 5 | - 0.0 6 | - 139.17801464819777 7 | - 0.0 8 | - 295.40667105309467 9 | - 142.67976269897005 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera300 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.23333477959856513 19 | - 0.09921033333190701 20 | - -0.002754143284285488 21 | - -0.0003490048615555727 22 | - -0.03163811291317096 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.707683572638958 30 | - -0.7064149220668923 31 | - -0.012148211019830187 32 | - 0.003543577770328583 33 | - -0.5258508371764269 34 | - 0.5381249309070799 35 | - -0.6587011985700283 36 | - -0.008355835603714326 37 | - 0.47186014717667946 38 | - -0.4597667518660912 39 | - -0.7522929898286798 40 | - 0.5258507414769803 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.001291453231841644 50 | - 
0.0012862151818786298 51 | - 0.0033192870108997943 52 | - 0.0005630001327120169 53 | - 0.0018159974126859078 54 | - 0.00256262368578926 55 | - 0.002306922788415099 56 | - 0.0005009385939512225 57 | - 0.002809602491676512 58 | - 0.0024143542267237414 59 | - 0.002017391146870438 60 | - 0.00036416735870768095 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r6_camera60.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 294.392182042338 5 | - 0.0 6 | - 135.903897476172 7 | - 0.0 8 | - 296.3789733076691 9 | - 141.54828717833394 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera60 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.2358354659457964 19 | - 0.11419031451647611 20 | - -0.0013552452815978515 21 | - -5.197185272809166e-05 22 | - -0.0506407541295614 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.701553011571557 30 | - 0.7125652659286393 31 | - -0.0081941598120978 32 | - 0.0031462234014824446 33 | - 0.5392229494438794 34 | - 0.5233014510040657 35 | - -0.6598389826645469 36 | - -0.006383759084792281 37 | - -0.46589093385989944 38 | - -0.4673315326527294 39 | - -0.7513562295190089 40 | - 0.5248396817440364 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.001229013783021029 50 | - 0.0012230723194279855 51 | - 0.0019907564196428966 52 | - 0.000678238864738361 53 | - 0.0014372425347592508 54 | - 0.0013047132797986282 55 | - 0.001717900885483957 56 | - 0.0003934710849838526 57 | - 0.0020110579530102307 58 | - 0.0020913093736326304 59 | - 0.0015082339319933923 60 | - 0.0003564474250847881 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r7_camera180.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 294.91407777779557 5 | - 0.0 6 | - 142.61132801267524 7 | - 0.0 8 | - 295.90225454309245 9 | - 135.97925936019564 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera180 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.23720518121586948 19 | - 0.14246443256245964 20 | - -0.003145098642467982 21 | - 0.0003035117203527542 22 | - -0.13056703179574852 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - 0.9999614301557127 30 | - -0.0038044886143320166 31 | - 0.007070722980804723 32 | - -0.005681891195977897 33 | - 0.0004308237353203887 34 | - -0.8539705455649946 35 | - -0.5203122790607115 36 | - 0.014343960165359073 37 | - 0.008019151190371582 38 | - 0.520297200384148 39 | - -0.8539430342275228 40 | - 0.5382261057631044 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 1.9283957713196662e-05 50 | - 0.002908282193539116 51 | - 0.002051942169076023 52 | - 0.0005212657697279603 53 | - 0.0026520656637447205 54 | - 0.0007755873522018889 55 | - 0.0012723871096378659 56 | - 0.0004834483677217526 57 | - 0.002368962194840589 58 | - 0.0012636475649274786 59 | - 0.0007758826246265768 60 | - 0.0003799753060648883 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | 
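Each of the calibration files above follows the same layout: `camera_matrix` and `distortion_coefficients` hold the camera intrinsics, `image_height`/`image_width` give the resolution, and `tf_world_to_camera` is a flattened 4x4 homogeneous world-to-camera transform (with per-entry standard deviations in `tf_world_to_camera_std`). A minimal sketch of reading such a file and projecting a world-frame point into the image, assuming PyYAML, NumPy and OpenCV are installed and that `tf_world_to_camera` maps world coordinates into the camera frame as its name suggests:

```python
import cv2
import numpy as np
import yaml

# Load one of the calibration files shown above.
with open("trifinger_rl_datasets/data/r7_camera180.yml") as f:
    calib = yaml.safe_load(f)

K = np.array(calib["camera_matrix"]["data"]).reshape(3, 3)
dist_coeffs = np.array(calib["distortion_coefficients"]["data"])
tf_world_to_camera = np.array(calib["tf_world_to_camera"]["data"]).reshape(4, 4)

# Extrinsics as rotation vector and translation for OpenCV.
rvec, _ = cv2.Rodrigues(tf_world_to_camera[:3, :3])
tvec = tf_world_to_camera[:3, 3]

# Project the world origin into pixel coordinates.
point_world = np.zeros((1, 3))
pixel, _ = cv2.projectPoints(point_world, rvec, tvec, K, dist_coeffs)
print(pixel.squeeze())  # (u, v) in the 270x270 image
```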
-------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r7_camera300.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 296.2022396427944 5 | - 0.0 6 | - 133.82740566738968 7 | - 0.0 8 | - 297.3771133577303 9 | - 134.75089001072408 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera300 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.2333745455445079 19 | - 0.11679667060789958 20 | - -0.0023468444404548742 21 | - 0.00014715034620450563 22 | - -0.0708895129264651 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.7111205566981869 30 | - -0.7030203571281524 31 | - -0.007699906468514525 32 | - -0.0014288398691257784 33 | - -0.5100222458361423 34 | - 0.5233743164580106 35 | - -0.6826047744879559 36 | - 0.011510066687658853 37 | - 0.4839159914333139 38 | - -0.4814894107708583 39 | - -0.7307454771098745 40 | - 0.5294821942287425 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.00206778219301248 50 | - 0.0020930770637026014 51 | - 0.0014092729506834936 52 | - 0.0004645190484458327 53 | - 0.0020522867428850863 54 | - 0.0016974886777108267 55 | - 0.0005119784456929096 56 | - 0.00046796029096165116 57 | - 0.0012993169583872517 58 | - 0.001545060400600723 59 | - 0.00048249135429546594 60 | - 0.000498084902998353 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r7_camera60.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 295.68284749216275 5 | - 0.0 6 | - 138.6172318394432 7 | - 0.0 8 | - 296.88876982765737 9 | - 141.98753763870397 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera60 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.23528658775530842 19 | - 0.11773801226352881 20 | - -0.002460578200390553 21 | - 0.0001076018703530655 22 | - -0.060997186815576546 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.6975114038977125 30 | - 0.7165641438116744 31 | - 0.0006223597262802224 32 | - -0.0033290540858423394 33 | - 0.5380220394522306 34 | - 0.5242879445517642 35 | - -0.6600364261980698 36 | - -0.006782151244644883 37 | - -0.4732843702854203 38 | - -0.46004892843662465 39 | - -0.7512308632726631 40 | - 0.5263967463789823 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0023360196800146084 50 | - 0.002275668967104426 51 | - 0.0016267290069801044 52 | - 0.000571890327053958 53 | - 0.001838070170067589 54 | - 0.0015387251485108407 55 | - 0.0007785730511057413 56 | - 0.00037582786781152047 57 | - 0.00178769402929836 58 | - 0.002327251836675354 59 | - 0.0006829373342669043 60 | - 0.0003199057768580286 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r8_camera180.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 296.02053364867896 5 | - 0.0 6 | - 141.07075475961815 7 | - 0.0 8 | - 296.8487591707737 9 | - 137.25746637774347 10 | - 0.0 
11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera180 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.24950471284028264 19 | - 0.24342845204591992 20 | - -0.002531494887887566 21 | - 0.00046256063896776535 22 | - -0.3541986260087375 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - 0.9998932927741045 30 | - 0.012004739627995153 31 | - 0.007465193312412156 32 | - 0.001294615931159824 33 | - 0.014135952776839406 34 | - -0.8561971106260303 35 | - -0.5164466993433118 36 | - 0.012424529489562694 37 | - 0.00019488181040999667 38 | - 0.5164978786653587 39 | - -0.8562827476906107 40 | - 0.538478279558769 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 3.7629732321215315e-05 50 | - 0.003084536149848168 51 | - 0.0020110697091049044 52 | - 0.0005356753336498872 53 | - 0.0025770758673765376 54 | - 0.0008503720407472454 55 | - 0.0014576966397219205 56 | - 0.0004912935035423122 57 | - 0.0026262500846193013 58 | - 0.0014483854523583061 59 | - 0.0008742727017023533 60 | - 0.00037525738136786344 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r8_camera300.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 295.6492218709406 5 | - 0.0 6 | - 140.90844522439747 7 | - 0.0 8 | - 296.8523852917806 9 | - 134.3721297951843 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera300 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.23263432076355056 19 | - 0.09607373055882804 20 | - -0.0021965093214136658 21 | - 0.00010040389103745153 22 | - -0.020900940190873876 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.7069760632217225 30 | - -0.7072283939196913 31 | - -0.0008312390838778697 32 | - -0.00152778567544845 33 | - -0.5180401431551137 34 | - 0.5186550356444332 35 | - -0.6801626021939451 36 | - 0.010290482386052437 37 | - 0.4814637199137908 38 | - -0.48043147307021605 39 | - -0.7330577405868307 40 | - 0.5309402021522255 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0021223869063767544 50 | - 0.002120233007018172 51 | - 0.0017759483917280225 52 | - 0.0005414513654696901 53 | - 0.0022978395868184544 54 | - 0.002049430035805928 55 | - 0.0008476007454615762 56 | - 0.0004494651527114296 57 | - 0.0014689878054040275 58 | - 0.0013627055998963378 59 | - 0.000787505151625276 60 | - 0.0005483492176623279 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r8_camera60.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 295.4173651653694 5 | - 0.0 6 | - 139.6933250987614 7 | - 0.0 8 | - 296.93882981835765 9 | - 142.97108034297597 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera60 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.2303760271500639 19 | - 0.09706332868956252 20 | - -0.003083941016988369 21 | - -5.190048856099958e-05 22 | - -0.03530538891597434 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - 
-0.7030252255720041 30 | - 0.7111477255883698 31 | - 0.0028142862053625395 32 | - -0.0016364916773338725 33 | - 0.5304715245842915 34 | - 0.5270363233833588 35 | - -0.6639450477383473 36 | - -0.004922911244153889 37 | - -0.4736460950801597 38 | - -0.465280304546193 39 | - -0.7477693453351668 40 | - 0.5267169283529306 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0022852521506059376 50 | - 0.002257239526714245 51 | - 0.002491362158126567 52 | - 0.0005670914732711851 53 | - 0.002294571088508853 54 | - 0.0014931293629478386 55 | - 0.0014678323194583416 56 | - 0.0004086713091198698 57 | - 0.0020935606259285377 58 | - 0.002922304033917509 59 | - 0.001303125770391626 60 | - 0.00040230225517642936 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/trifingerpro_shuffle_cube_trajectory_fast.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/rr-learning/trifinger_rl_datasets/f23729f90634f570d7c3dfc2fba5d8c0e838cf44/trifinger_rl_datasets/data/trifingerpro_shuffle_cube_trajectory_fast.npy -------------------------------------------------------------------------------- /trifinger_rl_datasets/dataset_env.py: -------------------------------------------------------------------------------- 1 | from copy import deepcopy 2 | import hashlib 3 | import os 4 | from pathlib import Path 5 | from threading import Thread 6 | from typing import Union, Tuple, Dict, Optional, List, Any 7 | import urllib.request 8 | 9 | import cv2 10 | import gymnasium as gym 11 | import gymnasium.spaces as spaces 12 | import numpy as np 13 | from tqdm import tqdm 14 | import yaml 15 | import zarr 16 | 17 | from .sim_env import SimTriFingerCubeEnv 18 | 19 | 20 | class ImageLoader(Thread): 21 | """Thread for loading and processing images from the dataset. 22 | 23 | This thread is responsible for loading and processing every 24 | loader_id-th image. Processing includes decoding, reordering 25 | of pixels and debayering.""" 26 | 27 | def __init__( 28 | self, 29 | loader_id, 30 | n_loaders, 31 | image_data, 32 | unique_images, 33 | n_unique_images, 34 | n_cameras, 35 | reorder_pixels, 36 | timestep_dimension, 37 | ): 38 | """ 39 | Args: 40 | loader_id: ID of this loader. This loader will load every 41 | loader_id-th image. 42 | n_loaders: Total number of loaders. 43 | image_data: Numpy array containing the image data. 44 | unique_images: Numpy array to which the images are written. 45 | n_unique_images: Number of unique images to load. If this 46 | number is not divisible by n_cameras, 47 | self.unique_images will be padded with zeros. 48 | n_cameras: Number of cameras. 49 | reorder_pixels: Whether to undo the reordering of the pixels 50 | which was done during creation of the dataset to improve 51 | the image compression. 52 | timestep_dimension: If True, the image data is expected to 53 | contain images from all cameras in a row and 54 | n_unique_images is expected to have shape 55 | (n_timesteps, n_cameras, height, width). 
If False, the 56 | shape is expected to be 57 | (n_unique_images, n_cameras, height, width).""" 58 | super().__init__() 59 | self.loader_id = loader_id 60 | self.n_loaders = n_loaders 61 | self.image_data = image_data 62 | self.unique_images = unique_images 63 | self.n_unique_images = n_unique_images 64 | self.n_cameras = n_cameras 65 | self.reorder_pixels = reorder_pixels 66 | self.timestep_dimension = timestep_dimension 67 | 68 | def _reorder_pixels(self, img: np.ndarray) -> np.ndarray: 69 | """Undo reordering of Bayer pattern.""" 70 | new = np.empty_like(img) 71 | a = img.shape[0] // 2 72 | b = img.shape[1] // 2 73 | 74 | red = img[0:a, 0:b] 75 | blue = img[a:, 0:b] 76 | green1 = img[0:a, b:] 77 | green2 = img[a:, b:] 78 | 79 | new[0::2, 0::2] = red 80 | new[1::2, 1::2] = blue 81 | new[0::2, 1::2] = green1 82 | new[1::2, 0::2] = green2 83 | 84 | return new 85 | 86 | def _decode_image(self, image: np.ndarray) -> np.ndarray: 87 | """Decode image from numpy array of type void.""" 88 | # convert numpy array of type V1 to use with cv2 imdecode 89 | image = np.frombuffer(image, dtype=np.uint8) 90 | # use cv2 to decode image 91 | image = cv2.imdecode(image, cv2.IMREAD_UNCHANGED) 92 | return image 93 | 94 | def run(self): 95 | # this thread is responsible for every loader_id-th image 96 | for i in range(self.loader_id, self.n_unique_images, self.n_loaders): 97 | if self.timestep_dimension: 98 | timestep, camera = divmod(i, self.n_cameras) 99 | compressed_image = self.image_data[i] 100 | # decode image 101 | image = self._decode_image(compressed_image) 102 | if self.reorder_pixels: 103 | # undo reordering of pixels 104 | image = self._reorder_pixels(image) 105 | # debayer image (output channels in RGB order) 106 | image = cv2.cvtColor(image, cv2.COLOR_BAYER_BG2RGB) 107 | # convert to channel first 108 | image = np.transpose(image, (2, 0, 1)) 109 | if self.timestep_dimension: 110 | self.unique_images[timestep, camera, ...] = image 111 | else: 112 | self.unique_images[i, ...] = image 113 | 114 | 115 | class TriFingerDatasetEnv(gym.Env): 116 | """TriFinger environment which can load an offline RL dataset from a file. 117 | 118 | Similar to D4RL's OfflineEnv but with different data loading and 119 | options for customization of observation space.""" 120 | 121 | _PRELOAD_VECTOR_KEYS = ["observations", "actions"] 122 | _PRELOAD_SCALAR_KEYS = ["rewards", "timeouts"] 123 | 124 | def __init__( 125 | self, 126 | name, 127 | dataset_url, 128 | ref_max_score, 129 | ref_min_score, 130 | trifinger_kwargs, 131 | real_robot=False, 132 | image_obs=False, 133 | visualization=False, 134 | obs_to_keep=None, 135 | flatten_obs=True, 136 | scale_obs=False, 137 | set_terminals=False, 138 | data_dir=None, 139 | **kwargs, 140 | ): 141 | """ 142 | Args: 143 | name (str): Name of the dataset. 144 | dataset_url (str): URL pointing to the dataset. 145 | ref_max_score (float): Maximum score (for score normalization) 146 | ref_min_score (float): Minimum score (for score normalization) 147 | trifinger_kwargs (dict): Keyword arguments for underlying 148 | SimTriFingerCubeEnv environment. 149 | real_robot (bool): Whether the data was collected on real 150 | robots. 151 | image_obs (bool): Whether observations contain camera 152 | images. 153 | visualization (bool): Enables rendering for simulated 154 | environment. 155 | obs_to_keep (dict): Dictionary with the same structure as 156 | the observation of SimTriFingerCubeEnv. The boolean 157 | value of each item indicates whether it should be 158 | included in the observation. 
If None, the 159 | SimTriFingerCubeEnv is used. 160 | flatten_obs (bool): Whether to flatten the observation. Can 161 | be combined with obs_to_keep. 162 | scale_obs (bool): Whether to scale all components of the 163 | observation to interval [-1, 1]. Only implemented 164 | for flattend observations. 165 | set_terminals (bool): Whether to set the terminals instead 166 | of the timeouts. 167 | data_dir (str or Path): Directory where the dataset is 168 | stored. If None, the default data directory 169 | (~/.trifinger_rl_datasets) is used. 170 | """ 171 | super().__init__(**kwargs) 172 | 173 | self.name = name 174 | self.dataset_url = dataset_url 175 | self.ref_max_score = ref_max_score 176 | self.ref_min_score = ref_min_score 177 | self.real_robot = real_robot 178 | self.image_obs = image_obs 179 | self.obs_to_keep = obs_to_keep 180 | self.flatten_obs = flatten_obs 181 | self.scale_obs = scale_obs 182 | self.set_terminals = set_terminals 183 | self._local_dataset_path = None 184 | if data_dir is None: 185 | data_dir = Path.home() / ".trifinger_rl_datasets" 186 | self.data_dir = Path(data_dir) 187 | 188 | self.t_kwargs = deepcopy(trifinger_kwargs) 189 | self.t_kwargs["image_obs"] = image_obs 190 | self.t_kwargs["visualization"] = visualization 191 | 192 | # underlying simulated TriFinger environment 193 | self.sim_env = SimTriFingerCubeEnv(**self.t_kwargs) 194 | # a copy of the original observation space which is used when 195 | # filtering the observations 196 | self._orig_obs_space = deepcopy(self.sim_env.observation_space) 197 | # the space used for unflattening the observations (images will 198 | # be removed from this space) 199 | self._unflattening_space = deepcopy(self.sim_env.observation_space) 200 | 201 | # remove camera observations from space used for flattening 202 | # and unflattening as images are treated separetely and not 203 | # flattened 204 | if self.image_obs: 205 | stripped_camera_observations = spaces.Dict( 206 | { 207 | k: v 208 | for k, v in self._orig_obs_space.spaces[ 209 | "camera_observation" 210 | ].spaces.items() 211 | if k != "images" 212 | } 213 | ) 214 | self._unflattening_space["camera_observation"] = stripped_camera_observations 215 | if self.flatten_obs: 216 | # if the observations are eventually flattened, they do not contain 217 | # images anymore 218 | self._orig_obs_space["camera_observation"] = stripped_camera_observations 219 | self._orig_flat_obs_space = spaces.flatten_space(self._orig_obs_space) 220 | self._flat_unflattening_space = spaces.flatten_space(self._unflattening_space) 221 | 222 | if scale_obs and not flatten_obs: 223 | raise NotImplementedError( 224 | "Scaling of observations only " 225 | "implemented for flattened observations, i.e., for " 226 | "flatten_obs=True." 
227 | ) 228 | 229 | # action space 230 | self.action_space = self.sim_env.action_space 231 | 232 | # observation space 233 | # self._filtered_obs_space is the Dict observation space after 234 | # filtering 235 | if self.obs_to_keep is not None: 236 | # construct filtered observation space 237 | self._filtered_obs_space = self._filter_dict( 238 | keys_to_keep=self.obs_to_keep, d=self._orig_obs_space 239 | ) 240 | else: 241 | self._filtered_obs_space = self._orig_obs_space 242 | # self.observation_space is potentially also flattened 243 | if self.flatten_obs: 244 | # flat obs space 245 | self.observation_space = spaces.flatten_space(self._filtered_obs_space) 246 | if self.scale_obs: 247 | self._obs_unscaled_low = self.observation_space.low 248 | self._obs_unscaled_high = self.observation_space.high 249 | # scale observations to [-1, 1] 250 | self.observation_space = spaces.Box( 251 | low=-1.0, 252 | high=1.0, 253 | shape=self.observation_space.shape, 254 | dtype=self.observation_space.dtype, 255 | ) 256 | else: 257 | self.observation_space = self._filtered_obs_space 258 | 259 | def _download_dataset(self): 260 | """Download dataset files if not already present. 261 | 262 | `self.dataset_url` is expected to point to a YAML file with the 263 | following structure: 264 | ``` 265 | n_parts: 266 | md5_hash_parts: 267 | - 268 | - 269 | ... 270 | md5_hash_complete: 271 | ``` 272 | The dataset is split into multiple parts to allow for 273 | continuing a download if it was interrupted. The complete 274 | dataset is then reconstructed by concatenating the parts.""" 275 | if self._local_dataset_path is None: 276 | dataset_dir = self.data_dir / (self.name + ".zarr") 277 | dataset_dir.mkdir(exist_ok=True, parents=True) 278 | local_path = dataset_dir / "data.mdb" 279 | if not local_path.exists(): 280 | print(f"Downloading dataset {self.name}.") 281 | # first download YAML file with info about dataset files 282 | with urllib.request.urlopen(self.dataset_url) as web_url: 283 | dataset_info = yaml.safe_load(web_url) 284 | # download dataset parts 285 | for i, part_hash in enumerate(tqdm(dataset_info["md5_hash_parts"])): 286 | part_path = dataset_dir / f"{self.name}_{i:03d}" 287 | if not part_path.exists(): 288 | # strip filename from url 289 | stripped_url = self.dataset_url.rsplit("/", 1)[0] 290 | part_url = stripped_url + f"/part_{i:03d}" 291 | urllib.request.urlretrieve(part_url, part_path) 292 | if not part_path.exists(): 293 | raise IOError( 294 | f"Failed to download part {i} of dataset from URL {part_url}." 295 | ) 296 | # check hash 297 | with open(part_path, "rb") as f: 298 | m = hashlib.md5() 299 | m.update(f.read()) 300 | if m.hexdigest() != part_hash: 301 | raise IOError( 302 | f"Hash of downloaded part {part_path} does not " 303 | f"match expected hash. Please delete " 304 | f"the file and try again." 305 | ) 306 | # combine parts 307 | with open(local_path, "wb") as f: 308 | print("Assembling dataset parts.") 309 | for i in tqdm(range(dataset_info["n_parts"])): 310 | part_path = dataset_dir / f"{self.name}_{i:03d}" 311 | with open(part_path, "rb") as part_file: 312 | f.write(part_file.read()) 313 | # delete part file 314 | part_path.unlink() 315 | if not local_path.exists(): 316 | raise IOError( 317 | f"Failed to assemble dataset {self.dataset_url} locally at {local_path}." 318 | ) 319 | self._local_dataset_path = dataset_dir 320 | return self._local_dataset_path 321 | 322 | def _filter_dict(self, keys_to_keep, d): 323 | """Keep only a subset of keys in dict. 324 | 325 | Applied recursively. 
326 | 327 | Args: 328 | keys_to_keep (dict): (Nested) dictionary with values being 329 | either a dict or a bolean indicating whether to keep 330 | an item. 331 | d (dict or gymnasium.spaces.Dict): Dicitionary or Dict space that 332 | is to be filtered.""" 333 | 334 | filtered_dict = {} 335 | for k, v in keys_to_keep.items(): 336 | if isinstance(v, dict): 337 | subspace = self._filter_dict(v, d[k]) 338 | filtered_dict[k] = subspace 339 | elif isinstance(v, bool) and v: 340 | filtered_dict[k] = d[k] 341 | elif not isinstance(v, bool): 342 | raise TypeError( 343 | "Expected boolean to indicate whether item " 344 | "in observation space is to be kept." 345 | ) 346 | if isinstance(d, spaces.Dict): 347 | filtered_dict = spaces.Dict(spaces=filtered_dict) 348 | return filtered_dict 349 | 350 | def _scale_obs(self, obs: np.ndarray) -> np.ndarray: 351 | """Scale observation components to [-1, 1].""" 352 | 353 | interval = self._obs_unscaled_high.high - self._obs_unscaled_low.low 354 | a = (obs - self._obs_unscaled_low.low) / interval 355 | return a * 2.0 - 1.0 356 | 357 | def _process_obs(self, obs: Union[np.ndarray, Dict]) -> np.ndarray: 358 | """Process obs according to params. 359 | 360 | Assumes that if `self.obs_to_keep` is not None, then the observations 361 | are provided as a dictionary. 362 | Args: 363 | obs: Dictionary or array containing the 364 | observations. 365 | Returns: 366 | Processed observations. If `self.flatten_obs` is False then 367 | as a dictionary. If `self.flatten_obs` is True then either as 368 | a 1D NumPy array (if no images are contained in obs) or as a 369 | tuple (if images are contained in the obs dictionary) 370 | consisting of 371 | * a 1D NumPy array containing all observations except the 372 | camera images, and 373 | * a NumPy array of shape (n_cameras, n_channels, height, width) 374 | containing the camera images.""" 375 | 376 | images = None 377 | if self.obs_to_keep is not None: 378 | # filter obs 379 | obs = self._filter_dict(self.obs_to_keep, obs) 380 | if self.flatten_obs and isinstance(obs, dict): 381 | if "images" in obs["camera_observation"]: 382 | # remove camera_observations/images from obs 383 | images = obs["camera_observation"].pop("images") 384 | # flatten obs 385 | obs = spaces.flatten(self._filtered_obs_space, obs) 386 | if self.scale_obs: 387 | # scale obs 388 | obs = self._scale_obs(obs) 389 | if images is not None: 390 | return obs, images 391 | else: 392 | return obs 393 | 394 | def get_obs_indices(self) -> Tuple[Dict, Dict]: 395 | """Get index ranges that correspond to the different observation components. 396 | 397 | Also returns a dictionary containing the shapes of these observation 398 | components. 399 | 400 | Returns: 401 | - A dictionary with keys corresponding to the observation components and 402 | values being tuples of the form (start, end), where start and end are 403 | the indices at which the observation component starts and ends. The 404 | nested dictionary structure of the observation is preserved. 
405 | - A dictionary of the same structure but with values being the shapes 406 | of the observation components.""" 407 | 408 | def _construct_dummy_obs(spaces_dict, counter=[0]): 409 | """Construct dummy observation which has an array repeating 410 | a different integer as the value of each component.""" 411 | dummy_obs = {} 412 | for i, (k, v) in enumerate(spaces_dict.items()): 413 | if isinstance(v, spaces.Dict): 414 | dummy_obs[k] = _construct_dummy_obs(v.spaces, counter) 415 | else: 416 | dummy_obs[k] = counter * np.ones(v.shape, dtype=np.int32) 417 | counter[0] += 1 418 | return dummy_obs 419 | 420 | dummy_obs = _construct_dummy_obs(self._orig_obs_space.spaces) 421 | flat_dummy_obs = spaces.flatten(self._orig_obs_space, dummy_obs) 422 | 423 | def _get_indices_and_shape(dummy_obs, flat_dummy_obs): 424 | indices = {} 425 | shape = {} 426 | for k, v in dummy_obs.items(): 427 | if isinstance(v, dict): 428 | indices[k], shape[k] = _get_indices_and_shape(v, flat_dummy_obs) 429 | else: 430 | where = np.where(flat_dummy_obs == v.flatten()[0])[0] 431 | indices[k] = (int(where[0]), int(where[-1]) + 1) 432 | shape[k] = v.shape 433 | return indices, shape 434 | 435 | return _get_indices_and_shape(dummy_obs, flat_dummy_obs) 436 | 437 | def get_dataset_stats(self, zarr_path: Union[str, os.PathLike] = None) -> Dict: 438 | """Get statistics of dataset such as number of timesteps. 439 | 440 | Args: 441 | zarr_path: Optional path to a Zarr directory containing the dataset, which will be 442 | used instead of the default. 443 | Returns: 444 | The statistics of the dataset as a dictionary with keys 445 | 446 | - n_timesteps: Number of timesteps in dataset. Corresponds to the 447 | number of observations, actions and rewards. 448 | - obs_size: Size of the observation vector. 449 | - action_size: Size of the action vector. 450 | """ 451 | if zarr_path is None: 452 | zarr_path = self._download_dataset() 453 | 454 | store = zarr.LMDBStore(zarr_path, readonly=True) 455 | with zarr.open(store=store) as root: 456 | dataset_stats = { 457 | "n_timesteps": root["observations"].shape[0], 458 | "obs_size": root["observations"].shape[1], 459 | "action_size": root["actions"].shape[1], 460 | } 461 | return dataset_stats 462 | 463 | def get_image_stats(self, zarr_path: Union[str, os.PathLike] = None) -> Dict: 464 | """Get statistics of image data in dataset. 465 | 466 | Args: 467 | zarr_path: Optional path to a Zarr directory containing the dataset, which will be 468 | used instead of the default. 469 | Returns: 470 | The statistics of the image data as a dictionary with keys 471 | 472 | - n_images: Number of images in the dataset. 473 | - n_cameras: Number of cameras used to capture the images. 474 | - n_channels: Number of channels in the images. 475 | - image_shape: Shape of the images in the format (height, width). 476 | - reorder_pixels: Whether the pixels in the images have been reordered 477 | to have the pixels corresponding to one color in the Bayer pattern 478 | together in blocks (to improve image compression). 
479 | """ 480 | if zarr_path is None: 481 | zarr_path = self._download_dataset() 482 | 483 | store = zarr.LMDBStore(zarr_path, readonly=True) 484 | with zarr.open(store=store) as root: 485 | image_stats = { 486 | "n_images": root["images"].shape[0], 487 | "n_cameras": root["images"].attrs["n_cameras"], 488 | "n_channels": root["images"].attrs["n_channels"], 489 | "image_shape": tuple(root["images"].attrs["image_shape"]), 490 | "reorder_pixels": root["images"].attrs["reorder_pixels"], 491 | } 492 | return image_stats 493 | 494 | def get_image_data( 495 | self, 496 | rng: Optional[Tuple[int, int]] = None, 497 | indices: Optional[np.ndarray] = None, 498 | zarr_path: Union[str, os.PathLike] = None, 499 | timestep_dimension: bool = True, 500 | n_threads: Optional[int] = None, 501 | ) -> np.ndarray: 502 | """Get image data from dataset. 503 | 504 | Args: 505 | rng: Optional range of images to return. rng=(m,n) means that the 506 | images with indices m to n-1 are returned. 507 | indices: Optional array of image indices for which to load data. rng 508 | and indices are mutually exclusive, only one of them can be set. 509 | zarr_path: Optional path to a Zarr directory containing the dataset, 510 | which will be used instead of the default. 511 | timestep_dimension: Whether to include the timestep dimension in the 512 | returned array. This is useful if the given range of indices 513 | always contains `n_cameras` of image indices in a row which 514 | correspond to the camera images at one camera timestep. 515 | If this assumption is violated, the first dimension will not 516 | correspond to camera timesteps anymore. 517 | 518 | n_threads: Number of threads to use for processing the images. If None, 519 | the number of threads is set to the number of CPUs available to the 520 | process. 521 | Returns: 522 | The image data (or a part of it specified by rng or indices) as a numpy 523 | array. If `timestep_dimension` is True the shape will be 524 | (n_camera_timesteps, n_cameras, n_channels, height, width) else 525 | (n_images, n_channels, height, width). The channels are ordered as RGB. 526 | """ 527 | if rng is not None and indices is not None: 528 | raise ValueError("rng and indices cannot be specified at the same time.") 529 | 530 | if n_threads is None: 531 | n_threads = len(os.sched_getaffinity(0)) 532 | if zarr_path is None: 533 | zarr_path = self._download_dataset() 534 | store = zarr.LMDBStore(zarr_path, readonly=True) 535 | root = zarr.open(store=store) 536 | 537 | n_cameras = root["images"].attrs["n_cameras"] 538 | n_channels = root["images"].attrs["n_channels"] 539 | image_shape = tuple(root["images"].attrs["image_shape"]) 540 | reorder_pixels = root["images"].attrs["reorder_pixels"] 541 | compression = root["images"].attrs["compression"] 542 | assert compression == "image", "Only image compression is supported." 
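# Illustrative usage of this method (an assumed snippet, not part of the
# original file): `env` is a TriFingerDatasetEnv and `n_cameras` can be read
# from `env.get_image_stats()["n_cameras"]`. With the default
# timestep_dimension=True, the images of the first two camera timesteps
# could be loaded as
#     images = env.get_image_data(rng=(0, 2 * n_cameras))
# which yields an array of shape (2, n_cameras, n_channels, height, width).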
543 | 544 | # load only relevant image data 545 | if indices is not None: 546 | image_data = root["images"].get_orthogonal_selection(indices) 547 | else: 548 | image_data = root["images"][slice(*rng)] 549 | n_unique_images = image_data.shape[0] 550 | if timestep_dimension: 551 | n_timesteps = int(np.ceil(n_unique_images / n_cameras)) 552 | out_shape = (n_timesteps, n_cameras, n_channels) + image_shape 553 | else: 554 | out_shape = (n_unique_images, n_channels) + image_shape 555 | unique_images = np.zeros(out_shape, dtype=np.uint8) 556 | 557 | threads = [] 558 | # distribute image loading and processing over multiple threads 559 | for i in range(n_threads): 560 | image_loader = ImageLoader( 561 | loader_id=i, 562 | n_loaders=n_threads, 563 | image_data=image_data, 564 | unique_images=unique_images, 565 | n_unique_images=n_unique_images, 566 | n_cameras=n_cameras, 567 | reorder_pixels=reorder_pixels, 568 | timestep_dimension=timestep_dimension, 569 | ) 570 | threads.append(image_loader) 571 | image_loader.start() 572 | for thread in threads: 573 | thread.join() 574 | store.close() 575 | 576 | return unique_images 577 | 578 | def convert_timestep_to_image_index( 579 | self, 580 | timesteps: np.ndarray, 581 | zarr_path: Union[str, os.PathLike] = None, 582 | ) -> np.ndarray: 583 | """Convert camera timesteps to image indices. 584 | 585 | Args: 586 | timesteps: Array of camera timesteps. 587 | Returns: 588 | Array of image indices. 589 | """ 590 | if zarr_path is None: 591 | zarr_path = self._download_dataset() 592 | store = zarr.LMDBStore(zarr_path, readonly=True) 593 | root = zarr.open(store=store) 594 | 595 | # mapping from observation index to image index 596 | # (necessary since the camera frequency < control frequency) 597 | image_indices = root["obs_to_image_index"].get_coordinate_selection(timesteps) 598 | store.close() 599 | return image_indices 600 | 601 | def get_dataset( 602 | self, 603 | zarr_path: Union[str, os.PathLike] = None, 604 | clip: bool = True, 605 | rng: Optional[Tuple[int, int]] = None, 606 | indices: Optional[np.ndarray] = None, 607 | n_threads: Optional[int] = None, 608 | ) -> Dict[str, Any]: 609 | """Get the dataset. 610 | 611 | When called for the first time, the dataset is automatically downloaded and 612 | saved to ``~/.trifinger_rl_datasets``. 613 | 614 | Args: 615 | zarr_path: Optional path to a Zarr directory containing the dataset, which will be 616 | used instead of the default. 617 | clip: If True, observations are clipped to be within the environment's 618 | observation space. 619 | rng: Optional range to return. rng=(m,n) means that observations, actions 620 | and rewards m to n-1 are returned. If not specified, the entire 621 | dataset is returned. 622 | indices: Optional array of timestep indices for which to load data. rng 623 | and indices are mutually exclusive, only one of them can be set. 624 | n_threads: Number of threads to use for processing the images. If None, 625 | the number of threads is set to the number of CPUs available to the 626 | process. 627 | Returns: 628 | A dictionary containing the following keys 629 | 630 | - observations: Either an array or a list of dictionaries 631 | containing the observations depending on whether 632 | `flatten_obs` is True or False. 633 | - actions: Array containing the actions. 634 | - rewards: Array containing the rewards. 635 | - timeouts: Array containing the timeouts (True only at 636 | the end of an episode by default. Always False if 637 | `set_terminals` is True). 
638 | - terminals: Array containing the terminals (Always 639 | False by default. If `set_terminals` is True, only 640 | True at the last timestep of an episode). 641 | - images (only if present in dataset): Array of the 642 | shape (n_control_timesteps, n_cameras, n_channels, 643 | height, width) containing the image data. The cannels 644 | are ordered as RGB. 645 | """ 646 | 647 | # The offline RL dataset is loaded from a Zarr directory which contains 648 | # the following Zarr arrays (this is an implementation detail and 649 | # not necessary to understand for users of the class): 650 | # - observations: Two-dimensional array of shape 651 | # `(n_control_timesteps, n_obs)` containing the observations as 652 | # flat vectors of length `n_obs` (except for the camera images 653 | # which are stored in image_data if present in the dataset). 654 | # - actions: Two-dimensional array of shape `(n_control_timesteps, 655 | # n_actions)` containing the actions. 656 | # - rewards: One-dimensional array of length `n_control_timesteps` 657 | # containing the rewards. 658 | # - episode_ends: One-dimensional array of length `n_episodes` 659 | # containing the indices of the last control timestep of each 660 | # episode. 661 | # - timeouts: One-dimensional array of length `n_control_timesteps` 662 | # with values of type bool. Only True at timesteps where the 663 | # episode ends, False otherwise. 664 | # - image_data: Ragged array of type bytes, which contains the 665 | # compressed image data. The images obtained from all cameras 666 | # at each camera time step are written one after another to this 667 | # array. After decompression the color information is contained 668 | # in a Bayer pattern. The images should therefore be debayerd 669 | # before use. Also note the information on the reorder_pixels 670 | # attribute below. The dataset has the following attributes: 671 | # - n_cameras: Number of cameras. 672 | # - n_channels: Number of channels per camera image. 673 | # - compression: Type of compression used. Only "image" is 674 | # supported by this class. 675 | # - image_codec: Codec used to compress the image data. Only 676 | # "jpeg" and "png" are supported by this class. 677 | # - image_shape: Tuple of length 2 containing the height and width 678 | # of the images. 679 | # - reorder_pixels: If true, the pixels of the Bayer pattern have 680 | # been reordered, such that all pixels of a specific colour are 681 | # next to each other in one big block (i.e. one block with all 682 | # red pixels, one with all blue pixels and one with all green 683 | # pixels). This leads to more continuity of the data (compared 684 | # to the original Bayer pattern) and thus tends to improve the 685 | # performance of standard image compression algorithms (e.g. 686 | # PNG). To restore the original image, the pixels need to be 687 | # reordered back before debayering. 688 | # - obs_to_image_index: One-dimensional array of length 689 | # `n_control_timesteps` containing the index of the camera 690 | # image corresponding to each control timestep. This mapping 691 | # is necessary because the camera frequency is lower than the 692 | # control frequency. 
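# The Zarr layout described above can also be inspected directly (an
# illustration only, assuming a local copy of the dataset; this method does
# the equivalent below):
#     store = zarr.LMDBStore(zarr_path, readonly=True)
#     root = zarr.open(store=store)
#     print(root["observations"].shape, root["actions"].shape)
#     if "images" in root.keys():
#         print(dict(root["images"].attrs))
#     store.close()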
693 | 694 | if rng is not None and indices is not None: 695 | raise ValueError("rng and indices cannot be specified at the same time.") 696 | 697 | if zarr_path is None: 698 | zarr_path = self._download_dataset() 699 | store = zarr.LMDBStore(zarr_path, readonly=True) 700 | root = zarr.open(store=store) 701 | 702 | data_dict = {} 703 | if indices is None: 704 | # turn range into slice 705 | n_avail_transitions = root["observations"].shape[0] 706 | if rng is None: 707 | rng = (None, None) 708 | rng = ( 709 | 0 if rng[0] is None else rng[0], 710 | n_avail_transitions if rng[1] is None else rng[1], 711 | ) 712 | range_slice = slice(*rng) 713 | for k in self._PRELOAD_VECTOR_KEYS + self._PRELOAD_SCALAR_KEYS: 714 | data_dict[k] = root[k][range_slice] 715 | else: 716 | for k in self._PRELOAD_VECTOR_KEYS: 717 | data_dict[k] = root[k].get_orthogonal_selection((indices, slice(None))) 718 | for k in self._PRELOAD_SCALAR_KEYS: 719 | data_dict[k] = root[k].get_coordinate_selection(indices) 720 | 721 | n_control_timesteps = data_dict["observations"].shape[0] 722 | 723 | # clip to make sure that there are no outliers in the data 724 | if clip: 725 | data_dict["observations"] = data_dict["observations"].clip( 726 | min=self._flat_unflattening_space.low, 727 | max=self._flat_unflattening_space.high, 728 | dtype=self._flat_unflattening_space.dtype, 729 | ) 730 | 731 | if not (self.flatten_obs and self.obs_to_keep is None): 732 | # unflatten observations, i.e., turn them into dicts again 733 | unflattened_obs = [] 734 | obs = data_dict["observations"] 735 | for i in range(obs.shape[0]): 736 | unflattened_obs.append( 737 | spaces.unflatten(self._unflattening_space, obs[i, ...]) 738 | ) 739 | data_dict["observations"] = unflattened_obs 740 | 741 | # timeouts, terminals and info 742 | if self.set_terminals: 743 | data_dict["terminals"] = data_dict["timeouts"] 744 | data_dict["timeouts"] = np.zeros(n_control_timesteps, dtype=bool) 745 | data_dict["infos"] = [{} for _ in range(n_control_timesteps)] 746 | 747 | # process obs (filtering, flattening, scaling) 748 | for i in range(n_control_timesteps): 749 | data_dict["observations"][i] = self._process_obs( 750 | obs=data_dict["observations"][i] 751 | ) 752 | # turn observations into array if obs are flattened 753 | if self.flatten_obs: 754 | data_dict["observations"] = np.array( 755 | data_dict["observations"], dtype=self.observation_space.dtype 756 | ) 757 | 758 | if "images" in root.keys(): 759 | n_cameras = root["images"].attrs["n_cameras"] 760 | if indices is None: 761 | # mapping from observation index to image index 762 | # (necessary since the camera frequency < control frequency) 763 | obs_to_image_index = root["obs_to_image_index"][range_slice] 764 | image_index_range = ( 765 | obs_to_image_index[0], 766 | # add n_cameras to include last images as well 767 | obs_to_image_index[-1] + n_cameras, 768 | ) 769 | # load images 770 | unique_images = self.get_image_data( 771 | rng=image_index_range, zarr_path=zarr_path, n_threads=n_threads 772 | ) 773 | else: 774 | obs_to_image_index = root[ 775 | "obs_to_image_index" 776 | ].get_coordinate_selection(indices) 777 | # load images from all cameras, not only first one 778 | all_cam_indices = np.zeros( 779 | obs_to_image_index.shape[0] * n_cameras, dtype=np.int64 780 | ) 781 | for i in range(n_cameras): 782 | all_cam_indices[i::n_cameras] = obs_to_image_index + i 783 | # remove duplicates and sort 784 | image_indices, unique_to_original = np.unique( 785 | all_cam_indices, return_inverse=True 786 | ) 787 | # load images 
788 | unique_images = self.get_image_data( 789 | indices=image_indices, zarr_path=zarr_path, n_threads=n_threads 790 | ) 791 | # repeat images to account for control frequency > camera frequency 792 | images = np.zeros( 793 | (n_control_timesteps,) + unique_images.shape[1:], dtype=np.uint8 794 | ) 795 | for i in range(n_control_timesteps): 796 | if indices is None: 797 | index = (obs_to_image_index[i] - obs_to_image_index[0]) // n_cameras 798 | else: 799 | # map from original image index to unique image index 800 | index = unique_to_original[i * n_cameras] // n_cameras 801 | images[i] = unique_images[index] 802 | data_dict["images"] = images 803 | 804 | store.close() 805 | 806 | return data_dict 807 | 808 | def get_dataset_chunk(self, chunk_id, zarr_path=None): 809 | raise NotImplementedError() 810 | 811 | def compute_reward( 812 | self, achieved_goal: dict, desired_goal: dict, info: dict 813 | ) -> float: 814 | """Compute the reward for the given achieved and desired goal. 815 | 816 | Args: 817 | achieved_goal: Current pose of the object. 818 | desired_goal: Goal pose of the object. 819 | info: An info dictionary containing a field "time_index" which 820 | contains the time index of the achieved_goal. 821 | 822 | Returns: 823 | The reward that corresponds to the provided achieved goal w.r.t. to 824 | the desired goal. 825 | """ 826 | return self.sim_env.compute_reward(achieved_goal, desired_goal, info) 827 | 828 | def step( 829 | self, action: np.ndarray, **kwargs 830 | ) -> Tuple[Union[Dict, np.ndarray], float, bool, bool, Dict]: 831 | """Execute one step. 832 | 833 | Args: 834 | action: Array of 9 torque commands, one for each robot joint. 835 | 836 | Returns: 837 | A tuple with 838 | 839 | - observation (dict or tuple): agent's observation of the current 840 | environment. If `self.flatten_obs` is False then as a dictionary. 841 | If `self.flatten_obs` is True then either as a 1D NumPy array 842 | (if no images are to be included) or as a tuple (if images are 843 | to be included) consisting of 844 | 845 | * a 1D NumPy array containing all observations except the 846 | camera images, and 847 | * a NumPy array of shape (n_cameras, n_channels, height, width) 848 | containing the camera images. 849 | 850 | - reward (float): amount of reward returned after previous action. 851 | - terminated (bool): whether the MDP has reached a terminal state. If true, 852 | the user needs to call `reset()`. 853 | - truncated (bool): Whether the truncation condition outside the scope 854 | of the MDP is satisfied. For this environment this corresponds to a 855 | timeout. If true, the user needs to call `reset()`. 856 | - info (dict): info dictionary containing the current time index. 857 | """ 858 | if self.real_robot: 859 | raise NotImplementedError( 860 | "The step method is not available for real-robot data." 861 | ) 862 | obs, rew, terminated, truncated, info = self.sim_env.step(action, **kwargs) 863 | # process obs 864 | processed_obs = self._process_obs(obs) 865 | return processed_obs, rew, terminated, truncated, info 866 | 867 | def reset( 868 | self, seed: Optional[int] = None, options: Optional[Dict[str, Any]] = None 869 | ) -> Tuple[Union[Dict, np.ndarray], Dict]: 870 | """Reset the environment. 871 | 872 | Returns: 873 | Tuple of observation and info dictionary. 874 | """ 875 | if self.real_robot: 876 | raise NotImplementedError( 877 | "The reset method is not available for real-robot data." 
878 | ) 879 | if seed is not None: 880 | self.sim_env.seed(seed) 881 | obs, info = self.sim_env.reset() 882 | # process obs 883 | processed_obs = self._process_obs(obs) 884 | return processed_obs, info 885 | 886 | def seed(self, seed: Optional[int] = None) -> List[int]: 887 | """Set random seed of the environment.""" 888 | return self.sim_env.seed(seed) 889 | 890 | def render(self, mode: str = "human"): 891 | """Does not do anything for this environment.""" 892 | if self.real_robot: 893 | raise NotImplementedError( 894 | "The render method is not available for real-robot data." 895 | ) 896 | self.sim_env.render(mode) 897 | 898 | def reset_fingers(self, reset_wait_time: int = 3000): 899 | """Moves the fingers to initial position. 900 | 901 | This resets neither the frontend nor the cube. This method is supposed to be 902 | used for 'soft resets' between episodes in one job. 903 | """ 904 | 905 | if self.real_robot: 906 | raise NotImplementedError( 907 | "The reset_fingers method is not available for real-robot data." 908 | ) 909 | obs, info = self.sim_env.reset_fingers(reset_wait_time) 910 | processed_obs = self._process_obs(obs) 911 | return processed_obs, info 912 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/evaluate_sim.py: -------------------------------------------------------------------------------- 1 | """Evaluate a policy in simulation.""" 2 | import argparse 3 | import importlib 4 | import json 5 | import logging 6 | import pathlib 7 | import sys 8 | import typing 9 | 10 | import gymnasium as gym 11 | 12 | from trifinger_rl_datasets import Evaluation, PolicyBase, TriFingerDatasetEnv 13 | 14 | 15 | def load_policy_class(policy_class_str: str) -> typing.Type[PolicyBase]: 16 | """Import the given policy class 17 | 18 | Args: 19 | The name of the policy class in the format "package.module.Class". 20 | 21 | Returns: 22 | The specified policy class. 23 | 24 | Raises: 25 | RuntimeError: If importing of the class fails. 26 | """ 27 | try: 28 | module_name, class_name = policy_class_str.rsplit(".", 1) 29 | logging.info("import %s from %s" % (class_name, module_name)) 30 | module = importlib.import_module(module_name) 31 | Policy = getattr(module, class_name) 32 | except Exception: 33 | raise RuntimeError( 34 | "Failed to import policy %s from module %s" % (class_name, module_name) 35 | ) 36 | 37 | return Policy 38 | 39 | 40 | def main(): 41 | logging.basicConfig(level=logging.INFO) 42 | 43 | parser = argparse.ArgumentParser(description=__doc__) 44 | parser.add_argument( 45 | "task", 46 | type=str, 47 | choices=["push", "lift"], 48 | help="Which task to evaluate ('push' or 'lift').", 49 | ) 50 | parser.add_argument( 51 | "policy_class", 52 | type=str, 53 | help="Name of the policy class (something like 'package.module.Class').", 54 | ) 55 | parser.add_argument( 56 | "--visualization", 57 | "-v", 58 | action="store_true", 59 | help="Enable visualization of environment.", 60 | ) 61 | parser.add_argument( 62 | "--n-episodes", 63 | type=int, 64 | default=64, 65 | help="Number of episodes to run. 
Default: %(default)s", 66 | ) 67 | parser.add_argument( 68 | "--output", 69 | type=pathlib.Path, 70 | metavar="FILENAME", 71 | help="Save results to a JSON file.", 72 | ) 73 | args = parser.parse_args() 74 | 75 | if args.task == "push": 76 | env_name = "trifinger-cube-push-sim-expert-v0" 77 | elif args.task == "lift": 78 | env_name = "trifinger-cube-lift-sim-expert-v0" 79 | else: 80 | print("Invalid task %s" % args.task) 81 | return 1 82 | 83 | Policy = load_policy_class(args.policy_class) 84 | 85 | policy_config = Policy.get_policy_config() 86 | 87 | if policy_config.flatten_obs: 88 | print("Using flattened observations") 89 | else: 90 | print("Using structured observations") 91 | 92 | env = typing.cast( 93 | TriFingerDatasetEnv, 94 | gym.make( 95 | env_name, 96 | disable_env_checker=True, 97 | visualization=args.visualization, 98 | flatten_obs=policy_config.flatten_obs, 99 | image_obs=policy_config.image_obs, 100 | ), 101 | ) 102 | 103 | policy = Policy(env.action_space, env.observation_space, env.sim_env.episode_length) 104 | 105 | evaluation = Evaluation(env) 106 | eval_res = evaluation.evaluate(policy=policy, n_episodes=args.n_episodes) 107 | json_result = json.dumps(eval_res, indent=4) 108 | 109 | print("Evaluation result: ") 110 | print(json_result) 111 | 112 | if args.output: 113 | args.output.write_text(json_result) 114 | 115 | return 0 116 | 117 | 118 | if __name__ == "__main__": 119 | sys.exit(main()) 120 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/evaluation.py: -------------------------------------------------------------------------------- 1 | from time import time 2 | import typing 3 | 4 | import numpy as np 5 | 6 | from .policy_base import PolicyBase 7 | 8 | 9 | class Evaluation: 10 | 11 | _reset_time = 500 12 | 13 | def __init__(self, env, time_policy=False): 14 | self.env = env 15 | self.time_policy = time_policy 16 | 17 | def run_episode( 18 | self, initial_obs: dict, initial_info: dict, policy: PolicyBase 19 | ) -> typing.Dict[str, typing.Union[int, float]]: 20 | """Run one episode/do one rollout.""" 21 | 22 | obs = initial_obs 23 | info = initial_info 24 | n_steps = 0 25 | momentary_successes = 0 26 | ep_return = 0.0 27 | max_reward = 0.0 28 | transient_success = False 29 | 30 | policy.reset() 31 | 32 | while True: 33 | if self.time_policy: 34 | time1 = time() 35 | action = policy.get_action(obs) 36 | if self.time_policy: 37 | print("policy execution time: ", time() - time1) 38 | obs, rew, _, truncated, info = self.env.step(action) 39 | ep_return += rew 40 | max_reward = max(max_reward, rew) 41 | if info["has_achieved"]: 42 | transient_success = True 43 | momentary_successes += 1 44 | self.env.render() 45 | n_steps += 1 46 | if truncated: 47 | if info["has_achieved"]: 48 | print("Success: Goal achieved at end of episode.") 49 | else: 50 | print("Goal not reached at the end of the episode.") 51 | break 52 | 53 | ep_stats = { 54 | "success_rate": int(info["has_achieved"]), 55 | "mean_momentary_success": momentary_successes / n_steps, 56 | "transient_success_rate": int(transient_success), 57 | "return": ep_return, 58 | "max_reward": max_reward, 59 | } 60 | return ep_stats 61 | 62 | def evaluate(self, policy, n_episodes): 63 | """Evaluate policy in given environment.""" 64 | 65 | difficulty = self.env.sim_env.difficulty 66 | episode_batch_size = 8 if difficulty == 1 else 6 67 | ep_stats_list = [] 68 | for i in range(n_episodes): 69 | print("Start episode {}".format(i)) 70 | # reset episode periodically to 
simulate start of a new robot job 71 | if i % episode_batch_size == 0: 72 | initial_obs, initial_info = self.env.reset() 73 | # run episode 74 | ep_stats = self.run_episode(initial_obs, initial_info, policy) 75 | ep_stats_list.append(ep_stats) 76 | # move fingers to initial position and wait until cube has settled down 77 | self.env.reset_fingers(self._reset_time) 78 | if i < n_episodes - 1: 79 | # retrieve cube from barrier and center it approximately 80 | self.env.sim_env.reset_cube() 81 | # Sample new goal 82 | self.env.sim_env.sample_new_goal() 83 | # move fingers to initial position and wait until cube has settled down 84 | initial_obs, initial_info = self.env.reset_fingers(self._reset_time) 85 | 86 | overall_stats = {"n_episodes": n_episodes} 87 | for k in ep_stats_list[0]: 88 | overall_stats[k] = np.mean([ep_stats[k] for ep_stats in ep_stats_list]) 89 | 90 | return overall_stats 91 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/policy_base.py: -------------------------------------------------------------------------------- 1 | import typing 2 | from abc import ABC, abstractmethod 3 | from dataclasses import dataclass 4 | 5 | import gymnasium as gym 6 | import numpy as np 7 | 8 | ObservationType = typing.Union[np.ndarray, typing.Dict[str, typing.Any]] 9 | 10 | 11 | @dataclass 12 | class PolicyConfig: 13 | """Policy configuration specifying what kind of observations the policy expects. 14 | 15 | Args: 16 | flatten_obs: If True, the policy expects observations as flattened arrays. 17 | Otherwise, it expects them as dictionaries. 18 | image_obs: If True, the policy expects the observations to contain camera 19 | images. Otherwise, images are not included. If images_obs is True and 20 | flatten_obs is True, the observation is a tuple containing the flattened 21 | observation excluding the images and the images in a numpy array. If 22 | flatten_obs is False, the images are included in the observation 23 | dictionary. 24 | """ 25 | 26 | flatten_obs: bool = True 27 | image_obs: bool = False 28 | 29 | 30 | class PolicyBase(ABC): 31 | """Base class defining interface for policies.""" 32 | 33 | def __init__( 34 | self, action_space: gym.Space, observation_space: gym.Space, episode_length: int 35 | ): 36 | """ 37 | Args: 38 | action_space: Action space of the environment. 39 | observation_space: Observation space of the environment. 40 | episode_length: Number of steps in one episode. 41 | """ 42 | pass 43 | 44 | @staticmethod 45 | def get_policy_config() -> PolicyConfig: 46 | """Returns the policy configuration. 47 | 48 | This specifies what kind of observations the policy expects. 49 | """ 50 | return PolicyConfig() 51 | 52 | def reset(self) -> None: 53 | """Will be called at the beginning of each episode.""" 54 | pass 55 | 56 | @abstractmethod 57 | def get_action(self, observation: ObservationType) -> np.ndarray: 58 | """Returns action that is executed on the robot. 59 | 60 | Args: 61 | observation: Observation of the current time step. 62 | 63 | Returns: 64 | Action that is sent to the robot. 
65 | """ 66 | pass 67 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/py.typed: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/rr-learning/trifinger_rl_datasets/f23729f90634f570d7c3dfc2fba5d8c0e838cf44/trifinger_rl_datasets/py.typed -------------------------------------------------------------------------------- /trifinger_rl_datasets/sampling_utils.py: -------------------------------------------------------------------------------- 1 | """Utils for sampling of cube pose.""" 2 | 3 | import numpy as np 4 | from scipy.spatial.transform import Rotation 5 | 6 | from trifinger_simulation.tasks.move_cube import ( 7 | _CUBE_WIDTH, 8 | _ARENA_RADIUS, 9 | _base_orientations, 10 | Pose, 11 | ) 12 | 13 | 14 | def random_yaw_orientation(): 15 | # first "roll the die" to see which face is pointing upward 16 | up_face = np.random.choice(range(len(_base_orientations))) 17 | up_face_rot = _base_orientations[up_face] 18 | # then draw a random yaw rotation 19 | yaw_angle = np.random.uniform(0, 2 * np.pi) 20 | yaw_rot = Rotation.from_euler("z", yaw_angle) 21 | # and combine them 22 | orientation = yaw_rot * up_face_rot 23 | return yaw_angle, orientation.as_quat() 24 | 25 | 26 | def random_xy(cube_yaw): 27 | """Sample an xy position for cube which maximally covers arena. 28 | 29 | In particular, the cube can touch the barrier for all yaw anels.""" 30 | 31 | theta = np.random.uniform(0, 2 * np.pi) 32 | 33 | # Minimum distance of cube center from arena boundary 34 | min_dist = ( 35 | _CUBE_WIDTH 36 | / np.sqrt(2) 37 | * max( 38 | abs(np.sin(0.25 * np.pi + cube_yaw - theta)), 39 | abs(np.cos(0.25 * np.pi + cube_yaw - theta)), 40 | ) 41 | ) 42 | 43 | # sample uniform position in circle 44 | # (https://stackoverflow.com/a/50746409) 45 | radius = (_ARENA_RADIUS - min_dist) * np.sqrt(np.random.random()) 46 | 47 | # x,y-position of the cube 48 | x = radius * np.cos(theta) 49 | y = radius * np.sin(theta) 50 | 51 | return x, y 52 | 53 | 54 | def sample_initial_cube_pose(): 55 | yaw_angle, orientation = random_yaw_orientation() 56 | x, y = random_xy(yaw_angle) 57 | z = _CUBE_WIDTH / 2 58 | goal = Pose() 59 | goal.position = np.array((x, y, z)) 60 | goal.orientation = orientation 61 | return goal 62 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/sim_env.py: -------------------------------------------------------------------------------- 1 | from pathlib import Path 2 | from time import sleep, time 3 | from typing import Tuple, Dict, Any, Optional 4 | import logging 5 | 6 | import cv2 7 | import gymnasium as gym 8 | import numpy as np 9 | from pybullet import ER_TINY_RENDERER 10 | 11 | import trifinger_simulation 12 | import trifinger_simulation.visual_objects 13 | from trifinger_simulation import trifingerpro_limits 14 | import trifinger_simulation.tasks.move_cube as task 15 | 16 | from .sampling_utils import sample_initial_cube_pose 17 | from .utils import to_quat, get_keypoints_from_pose 18 | 19 | 20 | class CameraWrapper: 21 | """Simple wrapper around camera array to change default renderer.""" 22 | 23 | def __init__(self, camera): 24 | self.camera = camera 25 | 26 | def get_images(self, renderer=ER_TINY_RENDERER): 27 | return self.camera.get_images(renderer) 28 | 29 | 30 | class SimTriFingerCubeEnv(gym.Env): 31 | """ 32 | Gym environment for simulated manipulation of a cube with a TriFingerPro platform. 
33 | """ 34 | 35 | _initial_finger_position = [0.0, 0.9, -2.0] * 3 36 | _max_fingertip_vel = 5.0 37 | # parameters of reward function 38 | _kernel_reward_weight = 4.0 39 | _logkern_scale = 30 40 | _logkern_offset = 2 41 | # for how long to play the resetting trajectory 42 | _reset_trajectory_length = 18700 43 | # how many robot steps per environment step 44 | _step_size = 20 45 | 46 | def __init__( 47 | self, 48 | episode_length: int = 15, 49 | difficulty: int = 4, 50 | keypoint_obs: bool = True, 51 | obs_action_delay: int = 0, 52 | reward_type: str = "dense", 53 | visualization: bool = False, 54 | real_time: bool = True, 55 | image_obs: bool = False, 56 | camera_config_robot: int = 1, 57 | ): 58 | """ 59 | Args: 60 | episode_length (int): How often step will run before done is True. 61 | keypoint_obs (bool): Whether to give keypoint observations for 62 | pose in addition to position and quaternion. 63 | obs_action_delay (int): Delay between arrival of an observation 64 | and application of the action computed from this 65 | observation in milliseconds. 66 | reward_type (str): Which reward to use. Can be 'dense' or 'sparse'. 67 | visualization (bool): If true, the PyBullet GUI is run for visualization. 68 | real_time (bool): If true, the environment is stepped in real 69 | time instead of as fast as possible (ignored if visualization is 70 | disabled). 71 | image_obs (bool): If true, the camera images are returned as part 72 | of the observation. 73 | camera_config_robot (int): ID of the robot to retrieve camera 74 | configs from. Only used if image_obs is True. 75 | """ 76 | # Basic initialization 77 | # ==================== 78 | 79 | self.logger = logging.getLogger("trifinger_rl_datasets.SimTriFingerCubeEnv") 80 | 81 | assert ( 82 | obs_action_delay < self._step_size 83 | ), "Delay between retrieval of observation and sending of next \ 84 | action has to be smaller than step size (20 ms)." 
85 | 86 | # will be initialized in reset() 87 | self.platform: Optional[trifinger_simulation.TriFingerPlatform] = None 88 | 89 | self.episode_length = episode_length 90 | self.difficulty = difficulty 91 | self.keypoint_obs = keypoint_obs 92 | self.n_keypoints = 8 93 | self.obs_action_delay = obs_action_delay 94 | self.reward_type = reward_type 95 | self.visualization = visualization 96 | self.real_time = real_time 97 | self.image_obs = image_obs 98 | self.camera_config_robot = camera_config_robot 99 | 100 | # load trajectory that is played back for resetting the cube 101 | trajectory_file_path = ( 102 | Path(__file__).resolve().parent 103 | / "data" 104 | / "trifingerpro_shuffle_cube_trajectory_fast.npy" 105 | ) 106 | with open(trajectory_file_path, "rb") as f: 107 | self._cube_reset_traj = np.load(f) 108 | 109 | # simulated robot has robot ID 0 110 | self.robot_id = 0 111 | 112 | if image_obs: 113 | # create camera object 114 | camera_config_dir = Path(__file__).resolve().parent / "data" 115 | calib_filename_pattern = f"r{self.camera_config_robot}_" + "camera{id}.yml" 116 | self.camera = trifinger_simulation.camera.create_trifinger_camera_array_from_config( 117 | camera_config_dir, calib_filename_pattern=calib_filename_pattern 118 | ) 119 | else: 120 | self.camera = None 121 | 122 | # Create the action and observation spaces 123 | # ======================================== 124 | 125 | robot_torque_space = gym.spaces.Box( 126 | low=trifingerpro_limits.robot_torque.low, 127 | high=trifingerpro_limits.robot_torque.high, 128 | ) 129 | robot_position_space = gym.spaces.Box( 130 | low=trifingerpro_limits.robot_position.low, 131 | high=trifingerpro_limits.robot_position.high, 132 | dtype=np.float32, 133 | ) 134 | robot_velocity_space = gym.spaces.Box( 135 | low=trifingerpro_limits.robot_velocity.low, 136 | high=trifingerpro_limits.robot_velocity.high, 137 | dtype=np.float32, 138 | ) 139 | robot_fingertip_force_space = gym.spaces.Box( 140 | low=np.zeros(trifingerpro_limits.n_fingers), 141 | high=np.ones(trifingerpro_limits.n_fingers), 142 | dtype=np.float32, 143 | ) 144 | robot_fingertip_pos_space = gym.spaces.Box( 145 | low=np.array([[-0.6, -0.6, 0.0]] * trifingerpro_limits.n_fingers), 146 | high=np.array([[0.6, 0.6, 0.6]] * trifingerpro_limits.n_fingers), 147 | dtype=np.float32, 148 | ) 149 | robot_fingertip_vel_space = gym.spaces.Box( 150 | low=np.array( 151 | [[-self._max_fingertip_vel] * 3] * trifingerpro_limits.n_fingers 152 | ), 153 | high=np.array( 154 | [[self._max_fingertip_vel] * 3] * trifingerpro_limits.n_fingers 155 | ), 156 | dtype=np.float32, 157 | ) 158 | robot_id_space = gym.spaces.Box(low=0, high=20, shape=(1,), dtype=np.int_) 159 | 160 | # camera observation space 161 | camera_obs_space_dict: Dict[str, gym.Space] = { 162 | "object_position": gym.spaces.Box( 163 | low=trifingerpro_limits.object_position.low, 164 | high=trifingerpro_limits.object_position.high, 165 | dtype=np.float32, 166 | ), 167 | "object_orientation": gym.spaces.Box( 168 | low=trifingerpro_limits.object_orientation.low, 169 | high=trifingerpro_limits.object_orientation.high, 170 | dtype=np.float32, 171 | ), 172 | "delay": gym.spaces.Box(low=0.0, high=0.30, shape=(1,), dtype=np.float32), 173 | "confidence": gym.spaces.Box( 174 | low=0.0, high=1.0, shape=(1,), dtype=np.float32 175 | ), 176 | } 177 | if self.keypoint_obs: 178 | camera_obs_space_dict["object_keypoints"] = gym.spaces.Box( 179 | low=np.array([[-0.6, -0.6, 0.0]] * self.n_keypoints), 180 | high=np.array([[0.6, 0.6, 0.3]] * self.n_keypoints), 181 | 
dtype=np.float32, 182 | ) 183 | if self.image_obs: 184 | n_cameras = len(self.camera.cameras) 185 | images_space = gym.spaces.Box( 186 | low=0, 187 | high=255, 188 | shape=( 189 | n_cameras, 190 | 3, 191 | self.camera.cameras[0]._output_height, 192 | self.camera.cameras[0]._output_width, 193 | ), 194 | dtype=np.uint8, 195 | ) 196 | camera_obs_space_dict["images"] = images_space 197 | camera_obs_space = gym.spaces.Dict(camera_obs_space_dict) 198 | 199 | # goal space 200 | if self.difficulty == 4: 201 | if self.keypoint_obs: 202 | goal_space = gym.spaces.Dict( 203 | {"object_keypoints": camera_obs_space["object_keypoints"]} 204 | ) 205 | else: 206 | goal_space = gym.spaces.Dict( 207 | { 208 | k: camera_obs_space[k] 209 | for k in ["object_position", "object_orientation"] 210 | } 211 | ) 212 | else: 213 | goal_space = gym.spaces.Dict( 214 | {"object_position": camera_obs_space["object_position"]} 215 | ) 216 | 217 | # action space 218 | self.action_space = robot_torque_space 219 | self._initial_action = trifingerpro_limits.robot_torque.default 220 | 221 | # NOTE: The order of dictionary items matters as it determines how 222 | # the observations are flattened/unflattened. The observation space 223 | # is therefore sorted by key. 224 | 225 | def sort_by_key(d): 226 | return { 227 | k: ( 228 | gym.spaces.Dict(sort_by_key(v.spaces)) 229 | if isinstance(v, gym.spaces.Dict) 230 | else v 231 | ) 232 | for k, v in sorted(d.items(), key=lambda item: item[0]) 233 | } 234 | 235 | # complete observation space 236 | self.observation_space = gym.spaces.Dict( 237 | sort_by_key( 238 | { 239 | "robot_observation": gym.spaces.Dict( 240 | { 241 | "position": robot_position_space, 242 | "velocity": robot_velocity_space, 243 | "torque": robot_torque_space, 244 | "fingertip_force": robot_fingertip_force_space, 245 | "fingertip_position": robot_fingertip_pos_space, 246 | "fingertip_velocity": robot_fingertip_vel_space, 247 | "robot_id": robot_id_space, 248 | } 249 | ), 250 | "camera_observation": camera_obs_space, 251 | "action": self.action_space, 252 | "desired_goal": goal_space, 253 | "achieved_goal": goal_space, 254 | } 255 | ) 256 | ) 257 | 258 | self._old_camera_obs: Optional[Dict[str, Any]] = None 259 | self.t_obs: int = 0 260 | 261 | # Count consecutive steps where timing is violated (to decide when to show a 262 | # warning) 263 | self._timing_violation_counter = 0 264 | 265 | def _kernel_reward( 266 | self, achieved_goal: np.ndarray, desired_goal: np.ndarray 267 | ) -> float: 268 | """Compute reward by evaluating a logistic kernel on the pairwise distance of 269 | points. 270 | 271 | Parameters can be either a 1 dim. array of size 3 (positions) or a two dim. 272 | array with last dim. of size 3 (keypoints) 273 | 274 | Args: 275 | achieved_goal: Position or keypoints of current pose of the object. 276 | desired_goal: Position or keypoints of goal pose of the object.
277 | """ 278 | 279 | diff = achieved_goal - desired_goal 280 | dist = np.linalg.norm(diff, axis=-1) 281 | scaled = self._logkern_scale * dist 282 | # Use logistic kernel 283 | rew = self._kernel_reward_weight * np.mean( 284 | 1.0 / (np.exp(scaled) + self._logkern_offset + np.exp(-scaled)) 285 | ) 286 | return rew 287 | 288 | def _append_desired_action(self, robot_action): 289 | """Append desired action to queue and wait if real time is enabled.""" 290 | 291 | t = self.platform.append_desired_action(robot_action) 292 | if self.visualization and self.real_time: 293 | sleep(max(0.001 - (time() - self.time_of_last_step), 0.0)) 294 | self.time_of_last_step = time() 295 | return t 296 | 297 | def compute_reward( 298 | self, achieved_goal: dict, desired_goal: dict, info: dict 299 | ) -> float: 300 | """Compute the reward for the given achieved and desired goal. 301 | 302 | Args: 303 | achieved_goal: Current pose of the object. 304 | desired_goal: Goal pose of the object. 305 | info: An info dictionary containing a field "time_index" which 306 | contains the time index of the achieved_goal. 307 | 308 | Returns: 309 | The reward that corresponds to the provided achieved goal w.r.t. to 310 | the desired goal. 311 | """ 312 | 313 | if self.reward_type == "dense": 314 | if self.difficulty == 4: 315 | # Use full keypoints if available as only difficulty 4 considers 316 | # orientation 317 | return self._kernel_reward( 318 | achieved_goal["object_keypoints"], desired_goal["object_keypoints"] 319 | ) 320 | else: 321 | # use position for all other difficulties 322 | return self._kernel_reward( 323 | achieved_goal["object_position"], desired_goal["object_position"] 324 | ) 325 | elif self.reward_type == "sparse": 326 | return self.has_achieved(achieved_goal, desired_goal) 327 | else: 328 | raise NotImplementedError( 329 | f"Reward type {self.reward_type} is not supported" 330 | ) 331 | 332 | def has_achieved(self, achieved_goal: dict, desired_goal: dict) -> bool: 333 | """Determine whether goal pose is achieved.""" 334 | POSITION_THRESHOLD = 0.02 335 | ANGLE_THRESHOLD_DEG = 22.0 336 | 337 | desired = desired_goal 338 | achieved = achieved_goal 339 | position_diff = np.linalg.norm( 340 | desired["object_position"] - achieved["object_position"] 341 | ) 342 | # cast from np.bool_ to bool to make mypy happy 343 | position_check = bool(position_diff < POSITION_THRESHOLD) 344 | 345 | if self.difficulty < 4: 346 | return position_check 347 | else: 348 | a = to_quat(desired["object_orientation"]) 349 | b = to_quat(achieved["object_orientation"]) 350 | b_conj = b.conjugate() 351 | quat_prod = a * b_conj 352 | norm = np.linalg.norm([quat_prod.x, quat_prod.y, quat_prod.z]) 353 | norm = min(norm, 1.0) # type: ignore 354 | angle = 2.0 * np.arcsin(norm) 355 | orientation_check = angle < 2.0 * np.pi * ANGLE_THRESHOLD_DEG / 360.0 356 | 357 | return position_check and orientation_check 358 | 359 | def _check_action(self, action): 360 | low_check = self.action_space.low <= action 361 | high_check = self.action_space.high >= action 362 | return np.all(np.logical_and(low_check, high_check)) 363 | 364 | def step( 365 | self, action: np.ndarray, preappend_actions: bool = True 366 | ) -> Tuple[dict, float, bool, bool, dict]: 367 | """Run one timestep of the environment's dynamics. 368 | 369 | When end of episode is reached, you are responsible for calling 370 | ``reset()`` to reset this environment's state. 
371 | 372 | Args: 373 | action: An action provided by the agent 374 | preappend_actions (bool): Whether to already append actions that 375 | will be executed during obs-action delay to action queue. 376 | 377 | Returns: 378 | tuple: 379 | 380 | - observation (dict): agent's observation of the current environment. 381 | - reward (float): amount of reward returned after previous action. 382 | - terminated (bool): whether the MDP has reached a terminal state. If true, 383 | the user needs to call `reset()`. 384 | - truncated (bool): Whether the truncation condition outside the scope 385 | of the MDP is satisfied. For this environment this corresponds to a 386 | timeout. If true, the user needs to call `reset()`. 387 | - info (dict): info dictionary containing the current time index. 388 | """ 389 | if self.platform is None: 390 | raise RuntimeError("Call `reset()` before starting to step.") 391 | 392 | if not self._check_action(action): 393 | raise ValueError("Given action is not contained in the action space.") 394 | 395 | self.step_count += 1 396 | 397 | # get robot action 398 | robot_action = self._gym_action_to_robot_action(action) 399 | 400 | # check timing and show a warning/error if delayed 401 | # do not check in first iteration as no time index is available yet (would lead 402 | # to dead-lock) 403 | if self.t_obs > 0: 404 | t_now = self.platform.get_current_timeindex() 405 | t_expected = self.t_obs + self.obs_action_delay 406 | if t_now > t_expected: 407 | self._timing_violation_counter += 1 408 | extreme = t_now > self.t_obs + self._step_size 409 | 410 | if extreme or self._timing_violation_counter >= 3: 411 | delay = t_now - t_expected 412 | self.logger.warning( 413 | f"Control loop got delayed by {delay} ms." 414 | " The action will be applied for a shorter time to catch up." 415 | " Please check if your policy is fast enough (max. computation" 416 | f" time should be <{1 + self.obs_action_delay} ms)." 417 | ) 418 | 419 | if extreme: 420 | self.logger.error( 421 | "ERROR: Control loop got delayed by more than a full step." 422 | " Timing of the episode will be significantly affected!" 423 | ) 424 | else: 425 | self._timing_violation_counter = 0 426 | 427 | # send new action to robot until new observation is to be provided 428 | # Note that by initially setting t the way it is, it is ensured that the loop 429 | # always runs at least one iteration, even if the actual time step is already 430 | # ahead by more than one step size. 431 | t = self.t_obs + self.obs_action_delay 432 | while t < self.t_obs + self._step_size: 433 | t = self._append_desired_action(robot_action) 434 | # time of the new observation 435 | self.t_obs = t 436 | 437 | observation, info = self._create_observation(self.t_obs, action) 438 | reward = self.compute_reward( 439 | observation["achieved_goal"], observation["desired_goal"], info 440 | ) 441 | truncated = self.step_count >= self.episode_length 442 | 443 | if not truncated and preappend_actions: 444 | t_now = self.platform.get_current_timeindex() 445 | # Append action to action queue of robot for as many time 446 | # steps as the obs_action_delay dictates. This gives the 447 | # user time to evaluate the policy. 448 | # Also take time into account that might have already passed 449 | # while the observation was processed. 
450 | for _ in range(max(self.obs_action_delay - (t_now - self.t_obs), 0)): 451 | self._append_desired_action(robot_action) 452 | 453 | return observation, reward, False, truncated, info 454 | 455 | def reset( # type: ignore 456 | self, preappend_actions: bool = True 457 | ): 458 | """Reset the environment.""" 459 | 460 | super().reset() 461 | 462 | # hard-reset simulation 463 | del self.platform 464 | 465 | # initialize simulation 466 | initial_robot_position = trifingerpro_limits.robot_position.default 467 | initial_object_pose = sample_initial_cube_pose() 468 | initial_object_pose.position[2] += 0.0005 # avoid negative z of keypoint 469 | self.platform = trifinger_simulation.TriFingerPlatform( 470 | visualization=self.visualization, 471 | initial_robot_position=initial_robot_position, 472 | initial_object_pose=initial_object_pose, 473 | enable_cameras=self.image_obs, 474 | ) 475 | if self.image_obs: 476 | # overwrite camera with wrapped version which uses software rendering 477 | self.platform.tricamera = CameraWrapper(self.camera) 478 | first_camera_obs = self.platform._get_current_camera_observation(0) 479 | self.platform._delayed_camera_observation = first_camera_obs 480 | self.platform._camera_observation_t = first_camera_obs 481 | # sample goal 482 | self.active_goal = task.sample_goal(difficulty=self.difficulty) 483 | # visualize the goal (but not if image observations are used) 484 | if self.visualization and not self.image_obs: 485 | if hasattr(self, "goal_marker"): 486 | del self.goal_marker 487 | self.goal_marker = trifinger_simulation.visual_objects.CubeMarker( 488 | width=task._CUBE_WIDTH, 489 | position=self.active_goal.position, 490 | orientation=self.active_goal.orientation, 491 | pybullet_client_id=self.platform.simfinger._pybullet_client_id, 492 | ) 493 | self.step_count = 0 494 | self.time_of_last_step = time() 495 | # need to already do one step to get initial observation 496 | self.t_obs = 0 497 | obs, _, _, _, info = self.step( 498 | self._initial_action, preappend_actions=preappend_actions 499 | ) 500 | info = {"time_index": -1} 501 | 502 | return obs, info 503 | 504 | def reset_fingers(self, reset_wait_time: int = 3000): 505 | """Reset fingers to initial position. 506 | 507 | This resets neither the frontend nor the cube. This method is 508 | supposed to be used for 'soft resets' between episodes in one 509 | job. 510 | """ 511 | assert self.platform is not None, "Environment is not initialised." 
512 | 513 | action = self.platform.Action(position=self._initial_finger_position) 514 | for _ in range(reset_wait_time): 515 | t = self._append_desired_action(action) 516 | self.t_obs = t 517 | # reset step_count even though this is not a full reset 518 | self.step_count = 0 519 | # block until reset wait time has passed and return observation 520 | obs, info = self._create_observation(t, self._initial_action) 521 | return obs, info 522 | 523 | def sample_new_goal(self, goal=None): 524 | """Sample a new desired goal.""" 525 | if goal is None: 526 | self.active_goal = task.sample_goal(difficulty=self.difficulty) 527 | else: 528 | self.active_goal.position = np.array(goal["position"], dtype=np.float32) 529 | self.active_goal.orientation = np.array( 530 | goal["orientation"], dtype=np.float32 531 | ) 532 | 533 | # update goal visualisation 534 | if self.visualization and not self.image_obs: 535 | self.goal_marker.set_state( 536 | self.active_goal.position, self.active_goal.orientation 537 | ) 538 | 539 | def _get_pose_delay(self, camera_observation, t): 540 | """Get delay between when the object pose was captured and now.""" 541 | 542 | return t / 1000.0 - camera_observation.cameras[0].timestamp 543 | 544 | def _clip_observation(self, obs): 545 | """Clip observation.""" 546 | 547 | def clip_recursively(o, space): 548 | for k, v in space.spaces.items(): 549 | if isinstance(v, gym.spaces.Box): 550 | np.clip(o[k], v.low, v.high, dtype=v.dtype, out=o[k]) 551 | else: 552 | clip_recursively(o[k], v) 553 | 554 | clip_recursively(obs, self.observation_space) 555 | 556 | def _create_observation(self, t: int, action: np.ndarray) -> Tuple[dict, dict]: 557 | assert self.platform is not None, "Environment is not initialised." 558 | 559 | robot_observation = self.platform.get_robot_observation(t) 560 | camera_observation = self.platform.get_camera_observation(t) 561 | object_observation = camera_observation.object_pose 562 | 563 | info: Dict[str, Any] = {"time_index": t} 564 | 565 | # camera observation 566 | camera_obs_processed = { 567 | "object_position": object_observation.position.astype(np.float32), 568 | "object_orientation": object_observation.orientation.astype(np.float32), 569 | # time elapsed since capturing of pose in seconds 570 | "delay": np.array( 571 | [self._get_pose_delay(camera_observation, t)], dtype=np.float32 572 | ), 573 | "confidence": np.array([object_observation.confidence], dtype=np.float32), 574 | } 575 | if self.image_obs: 576 | if len(camera_observation.cameras[0].image.shape) == 2: 577 | # images from real platform have to be debayered 578 | images = np.array([cv2.cvtColor(cam.image, cv2.COLOR_BAYER_BG2RGB) for cam in camera_observation.cameras]) 579 | else: 580 | # RGB camera images created with software renderer 581 | # (using openGL requires GUI to run) 582 | images = np.array([cam.image for cam in camera_observation.cameras]) 583 | # convert to channel first 584 | images = np.transpose(images, (0, 3, 1, 2)) 585 | camera_obs_processed["images"] = images 586 | if self.keypoint_obs: 587 | camera_obs_processed["object_keypoints"] = get_keypoints_from_pose( 588 | object_observation 589 | ) 590 | if self._old_camera_obs is not None: 591 | # handle quaternion flipping 592 | q_sum = ( 593 | self._old_camera_obs["object_orientation"] 594 | + camera_obs_processed["object_orientation"] 595 | ) 596 | if np.linalg.norm(q_sum) < 0.2: 597 | camera_obs_processed["object_orientation"] = -camera_obs_processed[ 598 | "object_orientation" 599 | ] 600 | self._old_camera_obs = 
camera_obs_processed 601 | 602 | # goal represented as position and orientation 603 | desired_goal_pos_ori = { 604 | "object_position": self.active_goal.position.astype(np.float32), 605 | "object_orientation": self.active_goal.orientation.astype(np.float32), 606 | } 607 | achieved_goal_pos_ori = { 608 | "object_position": camera_obs_processed["object_position"], 609 | "object_orientation": camera_obs_processed["object_orientation"], 610 | } 611 | # goal as shown to agent 612 | if self.difficulty == 4: 613 | if self.keypoint_obs: 614 | desired_goal = { 615 | "object_keypoints": get_keypoints_from_pose(self.active_goal) 616 | } 617 | achieved_goal = { 618 | "object_keypoints": camera_obs_processed["object_keypoints"] 619 | } 620 | else: 621 | desired_goal = desired_goal_pos_ori 622 | achieved_goal = achieved_goal_pos_ori 623 | else: 624 | desired_goal = { 625 | "object_position": self.active_goal.position.astype(np.float32) 626 | } 627 | achieved_goal = {"object_position": camera_obs_processed["object_position"]} 628 | 629 | # fingertip positions and velocities 630 | fingertip_position, fingertip_velocity = self.platform.forward_kinematics( 631 | robot_observation.position, robot_observation.velocity 632 | ) 633 | fingertip_position = np.array(fingertip_position, dtype=np.float32) 634 | fingertip_velocity = np.array(fingertip_velocity, dtype=np.float32) 635 | 636 | observation = { 637 | "robot_observation": { 638 | "position": robot_observation.position.astype(np.float32), 639 | "velocity": robot_observation.velocity.astype(np.float32), 640 | "torque": robot_observation.torque.astype(np.float32), 641 | "fingertip_force": robot_observation.tip_force.astype(np.float32), 642 | "fingertip_position": fingertip_position, 643 | "fingertip_velocity": fingertip_velocity, 644 | "robot_id": np.array([self.robot_id], dtype=np.int_), 645 | }, 646 | "camera_observation": camera_obs_processed, 647 | "action": action.astype(np.float32), 648 | "desired_goal": desired_goal, 649 | "achieved_goal": achieved_goal, 650 | } 651 | # clip observation 652 | self._clip_observation(observation) 653 | 654 | has_achieved = self.has_achieved(achieved_goal_pos_ori, desired_goal_pos_ori) 655 | info["has_achieved"] = has_achieved 656 | info["desired_goal"] = desired_goal_pos_ori 657 | 658 | return observation, info 659 | 660 | def _gym_action_to_robot_action(self, gym_action: np.ndarray): 661 | assert self.platform is not None, "Environment is not initialised." 662 | 663 | # robot action is torque 664 | robot_action = self.platform.Action(torque=gym_action) 665 | return robot_action 666 | 667 | def render(self, mode: str = "human"): 668 | """Does nothing. See :class:`SimTriFingerCubeEnv` for how to enable 669 | visualization.""" 670 | pass 671 | 672 | def _wait_until_timeindex(self, t: int): 673 | """Wait until the given time index is reached.""" 674 | # The simulation is stepped automatically so there is nothing to do here. 
675 | pass 676 | 677 | def reset_cube(self): 678 | """Replay a recorded trajectory to move cube to center of arena.""" 679 | 680 | for position in self._cube_reset_traj[: self._reset_trajectory_length : 2]: 681 | robot_action = self.platform.Action(position=position) 682 | t = self._append_desired_action(robot_action) 683 | self._wait_until_timeindex(t) # type: ignore 684 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/utils.py: -------------------------------------------------------------------------------- 1 | """Utility methods for working with object poses and keypoints.""" 2 | 3 | import numpy as np 4 | import quaternion 5 | 6 | 7 | def to_quat(x): 8 | return np.quaternion(x[3], x[0], x[1], x[2]) 9 | 10 | 11 | def to_world_space(x_local, pose): 12 | """Transform point from local object coordinate system to world space. 13 | 14 | Args: 15 | x_local: Coordinates of point in local frame. 16 | pose: Object pose containing position and orientation. 17 | Returns: 18 | The coordinates in world space. 19 | """ 20 | q_rot = to_quat(pose.orientation) 21 | transl = pose.position 22 | q_local = np.quaternion(0.0, x_local[0], x_local[1], x_local[2]) 23 | q_global = q_rot * q_local * q_rot.conjugate() 24 | return transl + np.array([q_global.x, q_global.y, q_global.z]) 25 | 26 | 27 | def get_keypoints_from_pose(pose, num_keypoints=8, dimensions=(0.065, 0.065, 0.065)): 28 | """Calculate keypoints (coordinates of the corners of the cube) from pose. 29 | 30 | Args: 31 | pose: Object pose containing position and orientation of cube. 32 | num_keypoints: Number of keypoints to generate. 33 | dimensions: Dimensions of the cube. 34 | Returns: 35 | Array containing the keypoints. 36 | """ 37 | keypoints = [] 38 | for i in range(num_keypoints): 39 | # convert to binary representation 40 | str_kp = "{:03b}".format(i) 41 | # set components of keypoints according to digits in binary representation 42 | loc_kp = [ 43 | (1.0 if str_kp[i] == "0" else -1.0) * 0.5 * d 44 | for i, d in enumerate(dimensions) 45 | ][::-1] 46 | glob_kp = to_world_space(loc_kp, pose) 47 | keypoints.append(glob_kp) 48 | 49 | return np.array(keypoints, dtype=np.float32) 50 | 51 | 52 | def get_pose_from_keypoints(keypoints, dimensions=(0.065, 0.065, 0.065)): 53 | """Calculate pose (position, orientation) from keypoints. 54 | 55 | Args: 56 | keypoints: At least three keypoints representing the pose. 57 | dimensions: Dimensions of the cube. 58 | Returns: 59 | Tuple containing the coordinates of the cube center and a 60 | quaternion representing the orientation. 61 | """ 62 | center = np.mean(keypoints, axis=0) 63 | kp_centered = np.array(keypoints) - center 64 | kp_scaled = kp_centered / np.array(dimensions) * 2.0 65 | 66 | loc_kps = [] 67 | for i in range(3): 68 | # convert to binary representation 69 | str_kp = "{:03b}".format(i) 70 | # set components of keypoints according to digits in binary representation 71 | loc_kp = [(1.0 if str_kp[i] == "0" else -1.0) for i in range(3)][::-1] 72 | loc_kps.append(loc_kp) 73 | K_loc = np.transpose(np.array(loc_kps)) 74 | K_loc_inv = np.linalg.inv(K_loc) 75 | K_glob = np.transpose(kp_scaled[0:3]) 76 | R = np.matmul(K_glob, K_loc_inv) 77 | quat = quaternion.from_rotation_matrix(R) 78 | 79 | return center, np.array([quat.x, quat.y, quat.z, quat.w]) 80 | --------------------------------------------------------------------------------
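The two keypoint helpers in `utils.py` are approximate inverses of each other: `get_keypoints_from_pose` maps a cube pose to the coordinates of its eight corners, and `get_pose_from_keypoints` recovers position and orientation from such keypoints. The following round-trip is a minimal sketch of their use (the pose values are arbitrary; it assumes `trifinger_simulation` and the `numpy-quaternion` package, imported as `quaternion` in `utils.py`, are available):

```python
import numpy as np
from trifinger_simulation.tasks.move_cube import Pose

from trifinger_rl_datasets.utils import get_keypoints_from_pose, get_pose_from_keypoints

# arbitrary cube pose: slightly off-centre, resting on the ground, identity orientation
pose = Pose()
pose.position = np.array([0.05, -0.02, 0.0325])
pose.orientation = np.array([0.0, 0.0, 0.0, 1.0])  # quaternion in (x, y, z, w) order

keypoints = get_keypoints_from_pose(pose)  # (8, 3) array of corner coordinates
center, orientation = get_pose_from_keypoints(keypoints)  # recovered position and quaternion

# the recovered centre matches the original position up to float32 precision
assert np.allclose(center, pose.position, atol=1e-4)
print(keypoints.shape, center, orientation)
```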
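Policies to be evaluated with this package only need to implement the `PolicyBase` interface defined in `policy_base.py`. Below is a minimal sketch; the class name `ZeroTorquePolicy` and the choice of always returning zero torques are illustrative assumptions rather than part of the package:

```python
import numpy as np

from trifinger_rl_datasets.policy_base import PolicyBase, PolicyConfig


class ZeroTorquePolicy(PolicyBase):
    """Illustrative policy that always commands zero torques."""

    def __init__(self, action_space, observation_space, episode_length):
        self.action_space = action_space

    @staticmethod
    def get_policy_config() -> PolicyConfig:
        # expects flat observations and no camera images
        return PolicyConfig(flatten_obs=True, image_obs=False)

    def get_action(self, observation) -> np.ndarray:
        # a zero-torque action (assumed to lie within the action space limits)
        return np.zeros(self.action_space.shape, dtype=np.float32)
```

An instance of such a class can then be handed to the simulated evaluation (`evaluate_sim.py`), which repeatedly queries `get_action` during each episode.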