├── .gitignore ├── LICENSE ├── README.md ├── demo ├── camera_images.py ├── create_video.py ├── load_dataset.py ├── load_dataset_part.py ├── load_filtered_dicts.py ├── load_image_data_only.py ├── random_access.py ├── simulation_rollout.py └── using_flat_observations.py ├── pyproject.toml ├── setup.cfg ├── setup.py └── trifinger_rl_datasets ├── __init__.py ├── data ├── __init__.py ├── r1_camera180.yml ├── r1_camera300.yml ├── r1_camera60.yml ├── r3_camera180.yml ├── r3_camera300.yml ├── r3_camera60.yml ├── r4_camera180.yml ├── r4_camera300.yml ├── r4_camera60.yml ├── r5_camera180.yml ├── r5_camera300.yml ├── r5_camera60.yml ├── r6_camera180.yml ├── r6_camera300.yml ├── r6_camera60.yml ├── r7_camera180.yml ├── r7_camera300.yml ├── r7_camera60.yml ├── r8_camera180.yml ├── r8_camera300.yml ├── r8_camera60.yml └── trifingerpro_shuffle_cube_trajectory_fast.npy ├── dataset_env.py ├── evaluate_sim.py ├── evaluation.py ├── policy_base.py ├── py.typed ├── sampling_utils.py ├── sim_env.py └── utils.py /.gitignore: -------------------------------------------------------------------------------- 1 | *~ 2 | *.swp 3 | *.egg-info 4 | *.pyc 5 | pycache/ 6 | build/ 7 | dist/ 8 | .vscode/ 9 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | BSD 3-Clause License 2 | 3 | Copyright (c) 2022, Max Planck Gesellschaft. 4 | All rights reserved. 5 | 6 | Redistribution and use in source and binary forms, with or without 7 | modification, are permitted provided that the following conditions are met: 8 | 9 | 1. Redistributions of source code must retain the above copyright notice, this 10 | list of conditions and the following disclaimer. 11 | 12 | 2. Redistributions in binary form must reproduce the above copyright notice, 13 | this list of conditions and the following disclaimer in the documentation 14 | and/or other materials provided with the distribution. 15 | 16 | 3. Neither the name of the copyright holder nor the names of its 17 | contributors may be used to endorse or promote products derived from 18 | this software without specific prior written permission. 19 | 20 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 21 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 22 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 23 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 24 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 25 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 26 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 27 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 28 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 29 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 30 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # TriFinger RL Datasets 2 | 3 | This repository provides offline reinforcement learning datasets collected on the real TriFinger platform and in a simulated version of the environment. 
The paper ["Benchmarking Offline Reinforcement Learning on Real-Robot Hardware"](https://openreview.net/pdf?id=3k5CUGDLNdd) provides more details on the datasets and benchmarks offline RL algorithms on them. All datasets are available with camera images as well. 4 | 5 | More detailed information about the simulated environment, the datasets and on how to run experiments on a cluster of real TriFinger robots can be found in the [documentation](https://webdav.tuebingen.mpg.de/trifinger-rl/docs/). 6 | 7 | Some of the datasets were used during the [Real Robot Challenge 2022](https://real-robot-challenge.com). 8 | 9 | ## Installation 10 | 11 | To install the package run with python 3.8 in the root directory of the repository (we recommend doing this in a virtual environment): 12 | 13 | ```bash 14 | pip install --upgrade pip # make sure the most recent version of pip is installed 15 | pip install . 16 | ``` 17 | 18 | ## Usage 19 | 20 | This section provides short examples of how to load datasets and evaluate a policy in simulation. More details on how to work with the datasets can be found in the [documentation](https://webdav.tuebingen.mpg.de/trifinger-rl/docs/). 21 | 22 | 23 | ### Loading a dataset 24 | 25 | The datasets are accessible via gym environments which are automatically registered when importing the package. They are automatically downloaded when requested and stored in `~/.trifinger_rl_datasets` as Zarr files by default (see the [documentation](https://webdav.tuebingen.mpg.de/trifinger-rl/docs/) for custom paths to the datasets). The code for loading the datasets follows the interface suggested by [D4RL](https://github.com/rail-berkeley/d4rl) and extends it where needed. 26 | 27 | As an alternative to the automatic download, the datasets can also be downloaded 28 | manually from the [Edmond repository](https://edmond.mpdl.mpg.de/dataset.xhtml?persistentId=doi:10.17617/3.DXZ7TL). 29 | 30 | The datasets are named following the pattern `trifinger-cube-task-source-type-v0` where `task` is either `push` or `lift`, `source` is either `sim` or `real` and `type` can be either `mixed`, `weak-n-expert` or `expert`. 31 | 32 | By default the observations are loaded as flat arrays. For the simulated datasets the environment can be stepped and visualized. Example usage (also see `demo/load_dataset.py`): 33 | 34 | ```python 35 | import gymnasium as gym 36 | 37 | import trifinger_rl_datasets 38 | 39 | env = gym.make( 40 | "trifinger-cube-push-sim-expert-v0", 41 | visualization=True, # enable visualization 42 | ) 43 | 44 | dataset = env.get_dataset() 45 | 46 | print("First observation: ", dataset["observations"][0]) 47 | print("First action: ", dataset["actions"][0]) 48 | print("First reward: ", dataset["rewards"][0]) 49 | 50 | obs, info = env.reset() 51 | truncated = False 52 | 53 | while not truncated: 54 | obs, rew, terminated, truncated, info = env.step(env.action_space.sample()) 55 | ``` 56 | 57 | Alternatively, the observations can be obtained as nested dictionaries. This simplifies working with the data. As some parts of the observations might be more useful than others, it is also possible to filter the observations when requesting dictionaries (see `demo/load_filtered_dicts.py`): 58 | 59 | ```python 60 | # Nested dictionary defines which observations to keep. 61 | # Everything that is not included or has value False 62 | # will be dropped. 
63 | obs_to_keep = { 64 | "robot_observation": { 65 | "position": True, 66 | "velocity": True, 67 | "fingertip_force": False, 68 | }, 69 | "object_observation": {"keypoints": True}, 70 | } 71 | env = gym.make( 72 | args.env_name, 73 | # filter observations, 74 | obs_to_keep=obs_to_keep, 75 | ) 76 | ``` 77 | 78 | All datasets come in two versions: with and without camera observations. The versions with camera observations contain `-image` in their name. Despite PNG image compression they are more than one order of magnitude bigger than the imageless versions. To avoid running out of memory, a part of a dataset can be loaded by specifying a range of timesteps: 79 | 80 | ```python 81 | env = gym.make( 82 | "trifinger-cube-push-real-expert-image-v0", 83 | disable_env_checker=True 84 | ) 85 | 86 | # load only a subset of obervations, actions and rewards 87 | dataset = env.get_dataset(rng=(1000, 2000)) 88 | ``` 89 | 90 | The camera observations corresponding to this range are then returned in `dataset["images"]` with the following dimensions: 91 | 92 | ```python 93 | n_timesteps, n_cameras, n_channels, height, width = dataset["images"].shape 94 | ``` 95 | 96 | ### Evaluating a policy in simulation 97 | 98 | This package contains an executable module `trifinger_rl_datasets.evaluate_sim`, which 99 | can be used to evaluate a policy in simulation. As arguments it expects the task 100 | ("push" or "lift") and a Python class that implements the policy, following the 101 | `PolicyBase` interface: 102 | 103 | python3 -m trifinger_rl_datasets.evaluate_sim push my_package.MyPolicy 104 | 105 | For more options see `--help`. 106 | 107 | ## How to cite 108 | 109 | The paper ["Benchmarking Offline Reinforcement Learning on Real-Robot Hardware"](https://openreview.net/pdf?id=3k5CUGDLNdd) introducing the datasets was published at ICLR 2023: 110 | 111 | ``` 112 | @inproceedings{ 113 | guertler2023benchmarking, 114 | title={Benchmarking Offline Reinforcement Learning on Real-Robot Hardware}, 115 | author={Nico G{\"u}rtler and Sebastian Blaes and Pavel Kolev and Felix Widmaier and Manuel Wuthrich and Stefan Bauer and Bernhard Sch{\"o}lkopf and Georg Martius}, 116 | booktitle={The Eleventh International Conference on Learning Representations }, 117 | year={2023}, 118 | url={https://openreview.net/forum?id=3k5CUGDLNdd} 119 | } 120 | ``` -------------------------------------------------------------------------------- /demo/camera_images.py: -------------------------------------------------------------------------------- 1 | """Demo including camera images in the observation.""" 2 | 3 | 4 | import argparse 5 | 6 | import cv2 7 | import gymnasium as gym 8 | import numpy as np 9 | 10 | import trifinger_rl_datasets # noqa 11 | 12 | 13 | if __name__ == "__main__": 14 | argparser = argparse.ArgumentParser(description=__doc__) 15 | argparser.add_argument( 16 | "--env", 17 | type=str, 18 | default="trifinger-cube-push-sim-expert-v0", 19 | help="Name of dataset environment to load.", 20 | ) 21 | argparser.add_argument( 22 | "--flatten-obs", action="store_true", help="Flattens observations if set." 23 | ) 24 | argparser.add_argument( 25 | "--no-visualization", 26 | dest="visualization", 27 | action="store_false", 28 | help="Disables visualization, i.e., rendering of the environment in a GUI.", 29 | ) 30 | argparser.add_argument( 31 | "--data-dir", type=str, default=None, help="Path to data directory." 
32 | ) 33 | args = argparser.parse_args() 34 | 35 | env = gym.make( 36 | args.env, 37 | disable_env_checker=True, 38 | visualization=args.visualization, 39 | # include camera images in the observation 40 | image_obs=True, 41 | flatten_obs=args.flatten_obs, 42 | data_dir=args.data_dir, 43 | ) 44 | obs, info = env.reset() 45 | truncated = False 46 | terminated = False 47 | 48 | # do one step in environment to get observations 49 | obs, rew, terminated, truncated, info = env.step(env.action_space.sample()) 50 | 51 | if args.flatten_obs: 52 | # obs is a tuple containing an array with all observations but the images 53 | # and an array containing the images 54 | other_obs, images = obs 55 | print("Shape of all observations except images: ", other_obs.shape) 56 | print("Shape of images: ", images.shape) 57 | else: 58 | # obs is a nested dictionary if flatten_obs is False 59 | images = obs["camera_observation"]["images"] 60 | print("Shape of images: ", images.shape) 61 | 62 | # change to (height, width, channels) format for cv2 63 | images = np.transpose(images, (0, 2, 3, 1)) 64 | images = np.concatenate(images, axis=0) 65 | # convert RGB to BGR for cv2 66 | output_image = cv2.cvtColor(images, cv2.COLOR_RGB2BGR) 67 | # show images from last time step 68 | cv2.imshow("Camera images", output_image) 69 | cv2.waitKey(0) 70 | cv2.destroyAllWindows() 71 | -------------------------------------------------------------------------------- /demo/create_video.py: -------------------------------------------------------------------------------- 1 | """Create video from camera images.""" 2 | 3 | 4 | import argparse 5 | 6 | import cv2 7 | import gymnasium as gym 8 | import numpy as np 9 | 10 | import trifinger_rl_datasets # noqa 11 | 12 | 13 | def create_video( 14 | env, output_path, camera_id, timestep_range, zarr_path, show_reward=True 15 | ): 16 | """Create video from camera images. 17 | 18 | Args: 19 | dataset (dict): Dataset to load images from. 20 | output_path (str): Output path for video file. 21 | camera_id (str): ID of the camera for which to load images. 22 | """ 23 | 24 | image_range = env.convert_timestep_to_image_index(np.array(timestep_range)) 25 | # load relevant part of images in dataset 26 | images = env.get_image_data( 27 | # images from 3 cameras for each timestep 28 | rng=(image_range[0], image_range[1] + 3), 29 | zarr_path=zarr_path, 30 | timestep_dimension=True, 31 | ) 32 | if show_reward: 33 | # load rewards for the specified timesteps 34 | image_indices = env.convert_timestep_to_image_index( 35 | np.arange(*tuple(timestep_range)) 36 | ) 37 | dataset = env.get_dataset( 38 | rng=(timestep_range[0], timestep_range[1] + 1), zarr_path=zarr_path 39 | ) 40 | 41 | # select only images from the specified camera 42 | images = images[:, camera_id, ...] 
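# note: `get_image_data` with timestep_dimension=True is expected to return an array of shape (n_camera_timesteps, n_cameras, n_channels, height, width),
# so after selecting a single camera, `images` should have shape (n_camera_timesteps, n_channels, height, width)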
43 | 44 | # create video writer 45 | fourcc = cv2.VideoWriter_fourcc(*"mp4v") 46 | fps = 10 47 | video_writer = cv2.VideoWriter( 48 | output_path, fourcc, fps, (images.shape[-1], images.shape[-2]) 49 | ) 50 | 51 | max_bar_height = 50 52 | # loop over images 53 | for i, image in enumerate(images): 54 | # convert to channel-last format for cv2 55 | img = np.transpose(image, (1, 2, 0)) 56 | if show_reward: 57 | # draw bar with height proportional to reward 58 | index = np.argmax(image_indices == i * 3 + image_range[0]) 59 | reward = dataset["rewards"][index] 60 | img[img.shape[0] - max_bar_height :, 260:, :] = 150 61 | bar_height = int(reward * max_bar_height) 62 | img[img.shape[0] - bar_height :, 260:, 1] = 255 63 | # convert RGB to BGR for cv2 64 | img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR) 65 | # write image to video 66 | video_writer.write(img) 67 | 68 | # close video writer 69 | video_writer.release() 70 | 71 | 72 | if __name__ == "__main__": 73 | argparser = argparse.ArgumentParser(description=__doc__) 74 | argparser.add_argument("output_path", type=str, help="Path to output video file.") 75 | argparser.add_argument( 76 | "camera_id", type=int, help="ID of the camera for which to load images." 77 | ) 78 | argparser.add_argument( 79 | "--env", 80 | type=str, 81 | default="trifinger-cube-push-real-expert-image-v0", 82 | help="Name of dataset environment to load.", 83 | ) 84 | argparser.add_argument( 85 | "--timestep-range", 86 | type=int, 87 | nargs=2, 88 | default=[0, 750], 89 | help="Range of timesteps (not camera timesteps) to load image data for.", 90 | ) 91 | argparser.add_argument( 92 | "--zarr-path", type=str, default=None, help="Path to Zarr file to load." 93 | ) 94 | argparser.add_argument( 95 | "--data-dir", type=str, default=None, help="Path to data directory." 96 | ) 97 | argparser.add_argument( 98 | "--no-reward", action="store_true", help="Do not show reward bar.
" 99 | ) 100 | args = argparser.parse_args() 101 | 102 | env = gym.make(args.env, disable_env_checker=True, data_dir=args.data_dir) 103 | create_video( 104 | env, 105 | args.output_path, 106 | args.camera_id, 107 | args.timestep_range, 108 | args.zarr_path, 109 | not args.no_reward, 110 | ) 111 | -------------------------------------------------------------------------------- /demo/load_dataset.py: -------------------------------------------------------------------------------- 1 | """Load a complete dataset into memory and perform a rollout.""" 2 | 3 | import argparse 4 | 5 | import gymnasium as gym 6 | 7 | import trifinger_rl_datasets # noqa 8 | 9 | 10 | if __name__ == "__main__": 11 | argparser = argparse.ArgumentParser(description=__doc__) 12 | argparser.add_argument( 13 | "--env", 14 | type=str, 15 | default="trifinger-cube-push-sim-expert-v0", 16 | help="Name of dataset environment to load.", 17 | ) 18 | argparser.add_argument( 19 | "--data-dir", 20 | type=str, 21 | default=None, 22 | help="Path to data directory.If not set, the default data directory '~/.trifinger_rl_datasets' is used.", 23 | ) 24 | args = argparser.parse_args() 25 | 26 | env = gym.make( 27 | args.env, 28 | disable_env_checker=True, 29 | visualization=True, # enable visualization 30 | data_dir=args.data_dir, 31 | ) 32 | dataset = env.get_dataset() 33 | 34 | n_transitions = len(dataset["observations"]) 35 | print("Number of transitions: ", n_transitions) 36 | 37 | assert dataset["actions"].shape[0] == n_transitions 38 | assert dataset["rewards"].shape[0] == n_transitions 39 | 40 | print("First observation: ", dataset["observations"][0]) 41 | 42 | obs, info = env.reset() 43 | truncated = False 44 | terminated = False 45 | while not (truncated or terminated): 46 | obs, rew, terminated, truncated, info = env.step(env.action_space.sample()) 47 | -------------------------------------------------------------------------------- /demo/load_dataset_part.py: -------------------------------------------------------------------------------- 1 | """Load part of a datset defined by a range of transitions.""" 2 | 3 | 4 | import argparse 5 | 6 | import gymnasium as gym 7 | 8 | import trifinger_rl_datasets # noqa 9 | 10 | 11 | if __name__ == "__main__": 12 | argparser = argparse.ArgumentParser(description=__doc__) 13 | argparser.add_argument( 14 | "--env", 15 | type=str, 16 | default="trifinger-cube-push-real-expert-v0", 17 | help="Name of dataset environment to load.", 18 | ) 19 | argparser.add_argument( 20 | "--range", 21 | type=int, 22 | nargs=2, 23 | default=[1000, 2000], 24 | help="Range of timesteps to load image data for.", 25 | ) 26 | argparser.add_argument( 27 | "--zarr-path", type=str, default=None, help="Path to Zarr file to load." 28 | ) 29 | argparser.add_argument( 30 | "--flatten-obs", action="store_true", help="Flatten observations." 31 | ) 32 | argparser.add_argument( 33 | "--data-dir", type=str, default=None, help="Path to data directory." 
34 | ) 35 | args = argparser.parse_args() 36 | 37 | env = gym.make( 38 | args.env, 39 | disable_env_checker=True, 40 | flatten_obs=args.flatten_obs, 41 | data_dir=args.data_dir, 42 | ) 43 | 44 | # load only a subset of obervations, actions and rewards 45 | dataset = env.get_dataset(rng=tuple(args.range), zarr_path=args.zarr_path) 46 | 47 | n_observations = len(dataset["observations"]) 48 | print("Number of observations: ", n_observations) 49 | 50 | assert dataset["actions"].shape[0] == n_observations 51 | assert dataset["rewards"].shape[0] == n_observations 52 | -------------------------------------------------------------------------------- /demo/load_filtered_dicts.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import gymnasium as gym 3 | 4 | import trifinger_rl_datasets # noqa 5 | 6 | 7 | if __name__ == "__main__": 8 | parser = argparse.ArgumentParser( 9 | description="Demonstrate how to customize observation space by filtering." 10 | ) 11 | parser.add_argument( 12 | "--env-name", 13 | type=str, 14 | default="trifinger-cube-push-sim-expert-v0", 15 | help="Name of the gym environment to load.", 16 | ) 17 | parser.add_argument( 18 | "--do-not-filter-obs", 19 | action="store_true", 20 | help="Do not filter observations if this is set.", 21 | ) 22 | parser.add_argument( 23 | "--flatten-obs", 24 | action="store_true", 25 | help="Flatten observations again after filtering if this is set.", 26 | ) 27 | parser.add_argument( 28 | "--data-dir", 29 | type=str, 30 | default=None, 31 | help="Path to data directory.If not set, the default data directory '~/.trifinger_rl_datasets' is used.", 32 | ) 33 | args = parser.parse_args() 34 | 35 | # Nested dictionary defines which observations to keep. 36 | # Everything that is not included or has value False 37 | # will be dropped. 38 | obs_to_keep = { 39 | "robot_observation": { 40 | "position": True, 41 | "velocity": True, 42 | "fingertip_force": False, 43 | }, 44 | "camera_observation": {"object_keypoints": True}, 45 | } 46 | env = gym.make( 47 | args.env_name, 48 | disable_env_checker=True, 49 | # enable visualization, 50 | visualization=True, 51 | # filter observations, 52 | obs_to_keep=None if args.do_not_filter_obs else obs_to_keep, 53 | # flatten observation 54 | flatten_obs=args.flatten_obs, 55 | data_dir=args.data_dir, 56 | ) 57 | 58 | dataset = env.get_dataset() 59 | 60 | n_transitions = len(dataset["observations"]) 61 | print("Number of transitions: ", n_transitions) 62 | 63 | assert dataset["actions"].shape[0] == n_transitions 64 | assert dataset["rewards"].shape[0] == n_transitions 65 | 66 | print("First observation: ", dataset["observations"][0]) 67 | 68 | obs, info = env.reset() 69 | truncated = False 70 | terminated = False 71 | while not (truncated or terminated): 72 | obs, rew, terminated, truncated, info = env.step(env.action_space.sample()) 73 | -------------------------------------------------------------------------------- /demo/load_image_data_only.py: -------------------------------------------------------------------------------- 1 | """Load image data from Zarr file and display it.""" 2 | 3 | 4 | import argparse 5 | 6 | import cv2 7 | import gymnasium as gym 8 | import numpy as np 9 | 10 | import trifinger_rl_datasets # noqa 11 | 12 | 13 | def show_images(images, timestep_dimension): 14 | """Show loaded images. 15 | 16 | Args: 17 | images (np.ndarray): Array containing the image data. 
18 | timestep_dimension (bool): If True, the first dimension of the 19 | images array is assumed to correspond to camera timesteps. 20 | Otherwise, the first dimension is assumed to correspond to 21 | individual images.""" 22 | 23 | if timestep_dimension: 24 | n_timesteps, n_cameras, n_channels, height, width = images.shape 25 | output_image = np.zeros( 26 | (n_cameras * height, n_timesteps * width, n_channels), dtype=np.uint8 27 | ) 28 | else: 29 | n_images, n_channels, height, width = images.shape 30 | output_image = np.zeros((height, n_images * width, n_channels), dtype=np.uint8) 31 | # loop over tuples containing images from all cameras at one timestep 32 | for i, image_s in enumerate(images): 33 | if timestep_dimension: 34 | # concatenate images from all cameras along the height axis 35 | image_s = np.concatenate(image_s, axis=1) 36 | # change to (height, width, channels) format for cv2 37 | image_s = np.transpose(image_s, (1, 2, 0)) 38 | # copy column of camera images to output image 39 | output_image[:, i * width : (i + 1) * width, ...] = image_s 40 | # convert RGB to BGR for cv2 41 | output_image = cv2.cvtColor(output_image, cv2.COLOR_RGB2BGR) 42 | 43 | if timestep_dimension: 44 | legend = "Each column corresponds to the camera images at one timestep." 45 | else: 46 | legend = "Camera images" 47 | print(legend) 48 | print("Press any key to close window.") 49 | cv2.imshow(legend, output_image) 50 | cv2.waitKey(0) 51 | cv2.destroyAllWindows() 52 | 53 | 54 | if __name__ == "__main__": 55 | argparser = argparse.ArgumentParser(description=__doc__) 56 | argparser.add_argument( 57 | "--env", 58 | type=str, 59 | default="trifinger-cube-push-real-expert-image-v0", 60 | help="Name of dataset environment to load.", 61 | ) 62 | argparser.add_argument( 63 | "--n-timesteps", 64 | type=int, 65 | default=10, 66 | help="Number of camera timesteps to load image data for.", 67 | ) 68 | argparser.add_argument( 69 | "--zarr-path", type=str, default=None, help="Path to Zarr file to load." 70 | ) 71 | argparser.add_argument( 72 | "--do-not-show-images", 73 | action="store_true", 74 | help="Do not show images if this is set.", 75 | ) 76 | argparser.add_argument( 77 | "--no-timestep-dimension", 78 | dest="timestep_dimension", 79 | action="store_false", 80 | help="Do not include the timestep dimension in the output array.", 81 | ) 82 | argparser.add_argument( 83 | "--data-dir", type=str, default=None, help="Path to data directory."
84 | ) 85 | args = argparser.parse_args() 86 | 87 | # create environment 88 | env = gym.make(args.env, disable_env_checker=True, data_dir=args.data_dir) 89 | 90 | # get information about image data 91 | image_stats = env.get_image_stats(zarr_path=args.zarr_path) 92 | print("Image dataset:") 93 | for key, value in image_stats.items(): 94 | print(f"{key}: {value}") 95 | 96 | # load image data 97 | print(f"Loading {args.n_timesteps} timesteps of image data.") 98 | from time import time 99 | 100 | t0 = time() 101 | images = env.get_image_data( 102 | # images from 3 cameras for each timestep 103 | rng=(0, 3 * args.n_timesteps), 104 | zarr_path=args.zarr_path, 105 | timestep_dimension=args.timestep_dimension, 106 | ) 107 | print(f"Loading took {time() - t0:.3f} seconds.") 108 | 109 | # show images 110 | if not args.do_not_show_images: 111 | show_images(images, args.timestep_dimension) 112 | -------------------------------------------------------------------------------- /demo/random_access.py: -------------------------------------------------------------------------------- 1 | """Load small parts of dataset at random positions to test performance.""" 2 | 3 | 4 | import argparse 5 | from time import time 6 | 7 | import gymnasium as gym 8 | import numpy as np 9 | 10 | import trifinger_rl_datasets # noqa 11 | 12 | 13 | if __name__ == "__main__": 14 | argparser = argparse.ArgumentParser(description=__doc__) 15 | argparser.add_argument( 16 | "--env", 17 | type=str, 18 | default="trifinger-cube-push-real-expert-v0", 19 | help="Name of dataset environment to load.", 20 | ) 21 | argparser.add_argument( 22 | "--n-parts", 23 | type=int, 24 | default=500, 25 | help="Number of contiguous parts to load from file.", 26 | ) 27 | argparser.add_argument( 28 | "--part-size", 29 | type=int, 30 | default=10, 31 | help="Number of transitions to load per part.", 32 | ) 33 | argparser.add_argument( 34 | "--zarr_path", type=str, default=None, help="Path to Zarr file to load." 
35 | ) 36 | argparser.add_argument( 37 | "--data-dir", 38 | type=str, 39 | default=None, 40 | help="Path to data directory.If not set, the default data directory '~/.trifinger_rl_datasets' is used.", 41 | ) 42 | args = argparser.parse_args() 43 | 44 | # create environment 45 | env = gym.make(args.env, disable_env_checker=True, data_dir=args.data_dir) 46 | 47 | stats = env.get_dataset_stats(zarr_path=args.zarr_path) 48 | print("Number of timesteps in dataset: ", stats["n_timesteps"]) 49 | 50 | # load subsets of the dataset at random positions 51 | indices = [] 52 | for i in range(args.n_parts): 53 | start = np.random.randint(0, stats["n_timesteps"] - args.part_size) 54 | if args.part_size == 1: 55 | indices.append(start) 56 | else: 57 | indices.extend(range(start, start + args.part_size)) 58 | indices = np.array(indices) 59 | t0 = time() 60 | part = env.get_dataset(indices=indices, zarr_path=args.zarr_path) 61 | t1 = time() 62 | print(f"Loaded {args.n_parts} parts of size {args.part_size} in {t1 - t0:.2f} s") 63 | 64 | print("Observation shape: ", part["observations"].shape) 65 | print("Action shape: ", part["actions"].shape) 66 | -------------------------------------------------------------------------------- /demo/simulation_rollout.py: -------------------------------------------------------------------------------- 1 | """Demo for doing a rollout in simulation.""" 2 | 3 | 4 | import argparse 5 | 6 | import gymnasium as gym 7 | import numpy as np 8 | 9 | import trifinger_rl_datasets # noqa 10 | 11 | 12 | if __name__ == "__main__": 13 | argparser = argparse.ArgumentParser(description=__doc__) 14 | argparser.add_argument( 15 | "--env", 16 | type=str, 17 | default="trifinger-cube-push-sim-expert-v0", 18 | help="Name of dataset environment to load.", 19 | ) 20 | argparser.add_argument( 21 | "--no-visualization", 22 | dest="visualization", 23 | action="store_false", 24 | help="Disables visualization, i.e., rendering of the environment in a GUI.", 25 | ) 26 | args = argparser.parse_args() 27 | 28 | env = gym.make(args.env, disable_env_checker=True, visualization=args.visualization) 29 | obs, info = env.reset() 30 | truncated = False 31 | 32 | while not truncated: 33 | obs, rew, terminated, truncated, info = env.step(env.action_space.sample()) 34 | -------------------------------------------------------------------------------- /demo/using_flat_observations.py: -------------------------------------------------------------------------------- 1 | """How to use the provided index ranges to work with flat observations.""" 2 | 3 | import argparse 4 | import json 5 | 6 | import gymnasium as gym 7 | 8 | import trifinger_rl_datasets # noqa 9 | 10 | 11 | if __name__ == "__main__": 12 | argparser = argparse.ArgumentParser(description=__doc__) 13 | argparser.add_argument( 14 | "--env", 15 | type=str, 16 | default="trifinger-cube-push-real-expert-v0", 17 | help="Name of dataset environment to load.", 18 | ) 19 | argparser.add_argument( 20 | "--data-dir", 21 | type=str, 22 | default=None, 23 | help="Path to data directory. 
If not set, the default data directory '~/.trifinger_rl_datasets' is used.", 24 | ) 25 | args = argparser.parse_args() 26 | 27 | env = gym.make(args.env, disable_env_checker=True, data_dir=args.data_dir) 28 | 29 | # load only a subset of obervations, actions and rewards 30 | n_observations = 10 31 | dataset = env.get_dataset(rng=(0, 750)) 32 | 33 | # get mapping from observation components to index ranges 34 | obs_indices, obs_shapes = env.get_obs_indices() 35 | 36 | print("Observation component indices: ", json.dumps(obs_indices, indent=4)) 37 | print("Observation component shapes: ", json.dumps(obs_shapes, indent=4)) 38 | 39 | # print cube position over time 40 | print("Cube position over time: ") 41 | for i in range(n_observations): 42 | index_range = obs_indices["camera_observation"]["object_position"] 43 | print(dataset["observations"][i][slice(*index_range)]) 44 | -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- 1 | [build-system] 2 | requires = ["setuptools"] 3 | build-backend = "setuptools.build_meta" 4 | 5 | [tool.mypy] 6 | ignore_missing_imports = true 7 | exclude = "build" 8 | -------------------------------------------------------------------------------- /setup.cfg: -------------------------------------------------------------------------------- 1 | [metadata] 2 | name = trifinger_rl_datasets 3 | version = attr: trifinger_rl_datasets.__version__ 4 | description = Gym environments which provide offline RL datasets collected on the TriFinger system. 5 | long_description = file: README.md 6 | long_description_content_type = text/markdown 7 | author = Nico Gürtler 8 | author_email = nico.guertler@tuebingen.mpg.de 9 | keywords = 10 | offline reinforcement learning 11 | reinforcement learning 12 | robotics 13 | TriFinger 14 | Real Robot Challenge 15 | dexterous manipulation 16 | license = BSD 3-Clause 17 | 18 | [options] 19 | packages = find: 20 | install_requires = 21 | numpy 22 | gymnasium 23 | zarr 24 | tqdm 25 | numpy-quaternion 26 | trifinger_simulation>=1.4.0 27 | opencv-python 28 | lmdb 29 | 30 | [options.package_data] 31 | trifinger_rl_datasets = py.typed 32 | trifinger_rl_datasets.data = 33 | *.npy 34 | *.yml 35 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | from setuptools import setup 2 | 3 | setup() 4 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/__init__.py: -------------------------------------------------------------------------------- 1 | __version__ = "1.0.3" 2 | 3 | from gymnasium.envs.registration import register 4 | 5 | from .dataset_env import TriFingerDatasetEnv 6 | from .evaluation import Evaluation 7 | from .policy_base import PolicyBase, PolicyConfig 8 | from .sim_env import SimTriFingerCubeEnv 9 | 10 | 11 | base_url = "https://robots.real-robot-challenge.com/public/trifinger_rl_datasets/" 12 | 13 | dataset_names = [ 14 | "trifinger-cube-push-real-expert-v0", 15 | "trifinger-cube-push-real-expert-image-v0", 16 | "trifinger-cube-push-real-weak-n-expert-v0", 17 | "trifinger-cube-push-real-weak-n-expert-image-v0", 18 | "trifinger-cube-push-real-half-expert-v0", 19 | "trifinger-cube-push-real-half-expert-image-v0", 20 | "trifinger-cube-push-real-mixed-v0", 21 | "trifinger-cube-push-real-mixed-image-v0", 22 | "trifinger-cube-push-sim-expert-v0", 
23 | "trifinger-cube-push-sim-expert-image-v0", 24 | "trifinger-cube-push-sim-weak-n-expert-v0", 25 | "trifinger-cube-push-sim-weak-n-expert-image-v0", 26 | "trifinger-cube-push-sim-half-expert-v0", 27 | "trifinger-cube-push-sim-half-expert-image-v0", 28 | "trifinger-cube-push-sim-mixed-v0", 29 | "trifinger-cube-push-sim-mixed-image-v0", 30 | "trifinger-cube-lift-real-smooth-expert-v0", 31 | "trifinger-cube-lift-real-smooth-expert-image-v0", 32 | "trifinger-cube-lift-real-expert-v0", 33 | "trifinger-cube-lift-real-expert-image-v0", 34 | "trifinger-cube-lift-real-weak-n-expert-v0", 35 | "trifinger-cube-lift-real-weak-n-expert-image-v0", 36 | "trifinger-cube-lift-real-half-expert-v0", 37 | "trifinger-cube-lift-real-half-expert-image-v0", 38 | "trifinger-cube-lift-real-mixed-v0", 39 | "trifinger-cube-lift-real-mixed-image-v0", 40 | "trifinger-cube-lift-sim-expert-v0", 41 | "trifinger-cube-lift-sim-expert-image-v0", 42 | "trifinger-cube-lift-sim-weak-n-expert-v0", 43 | "trifinger-cube-lift-sim-weak-n-expert-image-v0", 44 | "trifinger-cube-lift-sim-half-expert-v0", 45 | "trifinger-cube-lift-sim-half-expert-image-v0", 46 | "trifinger-cube-lift-sim-mixed-v0", 47 | "trifinger-cube-lift-sim-mixed-image-v0", 48 | ] 49 | 50 | task_params = { 51 | "push": { 52 | "ref_min_score": 0.0, 53 | "ref_max_score": 1.0 * 15000 / 20, 54 | "trifinger_kwargs": { 55 | "episode_length": 750, 56 | "difficulty": 1, 57 | "keypoint_obs": True, 58 | "obs_action_delay": 10, 59 | }, 60 | }, 61 | "lift": { 62 | "ref_min_score": 0.0, 63 | "ref_max_score": 1.0 * 30000 / 20, 64 | "trifinger_kwargs": { 65 | "episode_length": 1500, 66 | "difficulty": 4, 67 | "keypoint_obs": True, 68 | "obs_action_delay": 2, 69 | }, 70 | } 71 | } 72 | 73 | # add the missing parameters for all environments 74 | dataset_params = [] 75 | for dataset_name in dataset_names: 76 | dataset_url = base_url + f"{dataset_name}.zarr/dataset.yaml" 77 | params = { 78 | "name": dataset_name, 79 | "dataset_url": dataset_url, 80 | "real_robot": "real" in dataset_name, 81 | "image_obs": "image" in dataset_name, 82 | } 83 | task = dataset_name.split("-")[2] 84 | params.update(task_params[task]) 85 | dataset_params.append(params) 86 | 87 | 88 | def get_env(**kwargs): 89 | return TriFingerDatasetEnv(**kwargs) 90 | 91 | 92 | for params in dataset_params: 93 | register( 94 | id=params["name"], entry_point="trifinger_rl_datasets:get_env", kwargs=params 95 | ) 96 | 97 | 98 | __all__ = ("TriFingerDatasetEnv", "Evaluation", "PolicyBase", "PolicyConfig", "get_env") 99 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/rr-learning/trifinger_rl_datasets/f23729f90634f570d7c3dfc2fba5d8c0e838cf44/trifinger_rl_datasets/data/__init__.py -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r1_camera180.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 294.88772219999373 5 | - 0.0 6 | - 140.5328113426975 7 | - 0.0 8 | - 295.73310223701 9 | - 139.02122162180186 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera180 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.2329767683542472 19 | - 0.1264568334234187 20 | - -0.002491690223171375 21 | - 0.0005127279134358445 22 | - -0.10105977517686267 23 | rows: 1 24 | image_height: 270 25 | 
image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - 0.9998773558577191 30 | - 0.011414526348856103 31 | - 0.01008504910436589 32 | - -0.0005860472846181841 33 | - 0.014962484677142522 34 | - -0.8598542245158285 35 | - -0.5103030306087385 36 | - 0.006757462166832068 37 | - 0.0028470200349174054 38 | - 0.5103960640266085 39 | - -0.8599267125865473 40 | - 0.5398412631504335 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 4.532616904492608e-05 50 | - 0.002683896658220163 51 | - 0.0024633838589706476 52 | - 0.0005936876248080574 53 | - 0.002926484369290859 54 | - 0.0015377822108868704 55 | - 0.002593164324638185 56 | - 0.0005257987542181374 57 | - 0.0021738001191909102 58 | - 0.002590544587593434 59 | - 0.001537841000242577 60 | - 0.0002675030520101297 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r1_camera300.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 296.21756903928105 5 | - 0.0 6 | - 136.25729478510448 7 | - 0.0 8 | - 297.3694380877441 9 | - 136.15325821433555 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera300 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.2360106786968337 19 | - 0.11119950673909895 20 | - -0.0023231213828792865 21 | - -0.00011207631566610079 22 | - -0.03797400043315655 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.7007973495452335 30 | - -0.7132911751809331 31 | - -0.009335019672536672 32 | - 0.0029348055153886636 33 | - -0.5214052614698457 34 | - 0.5211156313513126 35 | - -0.6756982950215817 36 | - 0.006648752367397013 37 | - 0.4868367629135888 38 | - -0.4686606984044702 39 | - -0.7371146635360328 40 | - 0.529189920705302 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0019127399459759901 50 | - 0.001873846394493345 51 | - 0.0021123003294579246 52 | - 0.0005526193230891081 53 | - 0.0013674031004057444 54 | - 0.0019502970308061438 55 | - 0.0010921209022115455 56 | - 0.00039016189344718557 57 | - 0.0020164554088746207 58 | - 0.0020082287076101327 59 | - 0.000994872883899746 60 | - 0.000537056922161501 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r1_camera60.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 295.6937813935656 5 | - 0.0 6 | - 142.58976331067643 7 | - 0.0 8 | - 297.43474827422557 9 | - 132.94428296771792 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera60 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.23413559970500727 19 | - 0.11371630166152194 20 | - -0.0029621685301716794 21 | - 0.00022464397863156515 22 | - -0.05975819039614815 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.7015036040355102 30 | - 0.7126258352185996 31 | - -0.006292771681715977 32 | - -5.9947642844554126e-06 33 | - 0.5296527183057728 34 | - 0.5154373896823984 35 | - -0.6736331419719331 36 | - 0.007922712077366799 37 | - -0.47680902821528676 38 | - -0.47589060622531765 39 | - -0.7390285042410089 40 | - 
0.5283398784209727 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0019454006345476767 50 | - 0.0019278396785450142 51 | - 0.00316423439241316 52 | - 0.0005606517135422176 53 | - 0.0015869029962777175 54 | - 0.0022564232625609977 55 | - 0.0017537498861007086 56 | - 0.0004269523682819464 57 | - 0.002745775690234948 58 | - 0.002835346737572229 59 | - 0.001604045799541765 60 | - 0.00041279039785668164 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r3_camera180.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 296.74611824529234 5 | - 0.0 6 | - 139.40850050814652 7 | - 0.0 8 | - 297.6917158038904 9 | - 138.78193203925287 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera180 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.23795589930225886 19 | - 0.18043508606118522 20 | - -0.0035741471544204506 21 | - 9.138474527846751e-05 22 | - -0.22493844398993962 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - 0.9999547351934535 30 | - 0.008747792234960665 31 | - -0.0005467815435421723 32 | - 0.001765171033829126 33 | - 0.007193253602133133 34 | - -0.8546803606443765 35 | - -0.5190926134066003 36 | - 0.012616075729680792 37 | - -0.005009253005035439 38 | - 0.5190704588728084 39 | - -0.8547099687242182 40 | - 0.5393239940724293 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 2.1047563047475303e-05 50 | - 0.0023541870146826125 51 | - 0.0028569364094846524 52 | - 0.0006445672932799738 53 | - 0.002706191231962306 54 | - 0.001196432299462484 55 | - 0.001960106346274422 56 | - 0.0005881374226730213 57 | - 0.0025235168224777573 58 | - 0.0019601884188093378 59 | - 0.0011937326162342289 60 | - 0.0004392772389769057 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r3_camera300.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 298.5404817716902 5 | - 0.0 6 | - 141.68272217282268 7 | - 0.0 8 | - 299.0537835142228 9 | - 128.919845031919 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera300 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.2364432150861758 19 | - 0.13306051434807178 20 | - -0.0022006077617502296 21 | - 0.00011556539793348763 22 | - -0.09353412205255426 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.7085465317675347 30 | - -0.7056363472579175 31 | - -0.005623000466052991 32 | - -0.0014206576794110644 33 | - -0.5096260570266448 34 | - 0.5172088355708018 35 | - -0.6875841577095377 36 | - 0.015634778358790048 37 | - 0.4880922840847387 38 | - -0.48431963174976633 39 | - -0.7260807519787734 40 | - 0.5305425578768467 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0017850020349491676 50 | - 0.00178505207648795 51 | - 0.0010802644490479245 52 | - 0.0007533220965327408 53 | - 0.00133605144240045 54 | - 0.0012267301243739205 55 | - 0.0010191043032768865 56 | - 0.0005317818326090676 57 | - 
0.0020112858807655957 58 | - 0.0014727050131528396 59 | - 0.0009715933830996746 60 | - 0.000586868798196673 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r3_camera60.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 296.94060130336067 5 | - 0.0 6 | - 144.92494998990932 7 | - 0.0 8 | - 298.12851303387595 9 | - 126.27834126011763 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera60 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.224614441732425 19 | - 0.06610056999907407 20 | - -0.002957571082894125 21 | - 5.384206970943196e-05 22 | - 0.02843286766321863 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.700242886187359 30 | - 0.7138894778538976 31 | - -0.0035451792561288736 32 | - -0.0008993391228219956 33 | - 0.5130943940421685 34 | - 0.49982009089630225 35 | - -0.6977888521123732 36 | - 0.022773616806538905 37 | - -0.49637442103061286 38 | - -0.4904415106411645 39 | - -0.7162914535207946 40 | - 0.5311376719192535 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0016404696193842237 50 | - 0.0016128107368050496 51 | - 0.0019629463782010886 52 | - 0.0006384353946906304 53 | - 0.0011906806384894921 54 | - 0.0016991520977704723 55 | - 0.000657699303747005 56 | - 0.0005515050742389639 57 | - 0.0015389711063393422 58 | - 0.0018238721419053371 59 | - 0.0006461098573577129 60 | - 0.0006490388732209827 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r4_camera180.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 295.14033512458053 5 | - 0.0 6 | - 138.00666513483048 7 | - 0.0 8 | - 295.8963905623699 9 | - 140.1780817025803 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera180 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.2344692447120545 19 | - 0.1170210063291702 20 | - -0.0025387045089898526 21 | - 0.00028678240528109023 22 | - -0.05727351974597146 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - 0.9997577629246032 30 | - -0.019728351892018867 31 | - 0.008801429506700712 32 | - -0.011521565590285684 33 | - -0.012676964030929268 34 | - -0.8656406099279625 35 | - -0.5004868462171239 36 | - 0.002502652833109931 37 | - 0.017491868719132025 38 | - 0.5002604130100553 39 | - -0.8656918965259675 40 | - 0.5379001618103092 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 6.077392032405641e-05 50 | - 0.0032662823312658903 51 | - 0.0026589750881622597 52 | - 0.0006043392603399411 53 | - 0.0035533576087949576 54 | - 0.0012160406746025694 55 | - 0.002107293269154359 56 | - 0.0005246759072894277 57 | - 0.00226153485468361 58 | - 0.0021209420050399127 59 | - 0.0012169873079002672 60 | - 0.00032410552238325405 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r4_camera300.yml: 
-------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 295.06181051767146 5 | - 0.0 6 | - 145.562733589218 7 | - 0.0 8 | - 296.4879185143768 9 | - 137.77016924669405 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera300 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.23212995718456583 19 | - 0.09401968037398019 20 | - -0.003509759792242295 21 | - 0.00013186627698614768 22 | - -0.01598467384240246 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.7053343042234599 30 | - -0.7088205420070108 31 | - -0.007533171694605158 32 | - 0.0011711320881139364 33 | - -0.5224835176259123 34 | - 0.5270393971499698 35 | - -0.6702455873235632 36 | - 0.00026587658713440984 37 | - 0.47905680627685526 38 | - -0.46881272781441363 39 | - -0.7420919799225995 40 | - 0.5280109588474451 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0024126865232079794 50 | - 0.0023890681263911166 51 | - 0.002946376748396405 52 | - 0.0005506877497822348 53 | - 0.0019409159597162035 54 | - 0.002197967455094283 55 | - 0.0016438122291327678 56 | - 0.0004887254561488916 57 | - 0.0030791628178283214 58 | - 0.0026458670499804582 59 | - 0.0014879776474758704 60 | - 0.0004331257058881025 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r4_camera60.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 295.3700492360575 5 | - 0.0 6 | - 136.4014117306309 7 | - 0.0 8 | - 296.88972733684324 9 | - 135.80933925508987 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera60 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.23824457883790873 19 | - 0.1442700801176228 20 | - -0.0021768405888921223 21 | - 0.00018192403232659664 22 | - -0.11940468687914024 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.697667389466153 30 | - 0.7164043722161288 31 | - 0.002578390552528845 32 | - -0.00565327771472424 33 | - 0.5304059004291459 34 | - 0.51894730508639 35 | - -0.6703370518430334 36 | - 0.0020404541425615187 37 | - -0.4815748046122418 38 | - -0.46630623783099373 39 | - -0.7420436986786935 40 | - 0.5272097013879554 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.002252421338863501 50 | - 0.002191622368127827 51 | - 0.0029093705098407314 52 | - 0.0006367461049661742 53 | - 0.0017955664406818896 54 | - 0.0024356722232948837 55 | - 0.0015348295959123992 56 | - 0.00044482796113458047 57 | - 0.0026841325191941804 58 | - 0.00249536877201587 59 | - 0.001384897520199189 60 | - 0.00042975075002664144 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r5_camera180.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 291.80982121128756 5 | - 0.0 6 | - 144.54387475012228 7 | - 0.0 8 | - 292.6932628296855 9 | - 143.60050245646204 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera180 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - 
-0.25011027447601114 19 | - 0.17835090640327209 20 | - -0.0005548967943714432 21 | - -0.00011834890033780262 22 | - -0.17116234818120937 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - 0.9999561355609734 30 | - -0.007928987189101743 31 | - 0.0012997157055982735 32 | - -0.0038819862544355044 33 | - -0.006227550987616203 34 | - -0.8673134043586379 35 | - -0.4976936744704877 36 | - 0.0004404750551836095 37 | - 0.005076286783498226 38 | - 0.49767305858846406 39 | - -0.8673321903460468 40 | - 0.5342541371766851 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 2.1536542331724083e-05 50 | - 0.0027485718806892715 51 | - 0.003951423034502948 52 | - 0.000579933025544882 53 | - 0.0033298252173191963 54 | - 0.0021486158161714286 55 | - 0.003738761116627183 56 | - 0.00040839765132245833 57 | - 0.0034767334312054317 58 | - 0.0037343459774502965 59 | - 0.0021439780764358997 60 | - 0.00027782692179023416 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r5_camera300.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 291.9934185104733 5 | - 0.0 6 | - 145.16730905488873 7 | - 0.0 8 | - 293.67115397647314 9 | - 144.03250986816687 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera300 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.24838171967550723 19 | - 0.13640297400803297 20 | - -0.001154775930887984 21 | - 0.0003733066378391576 22 | - -0.05845365778706338 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.7019735477042157 30 | - -0.7121841801991278 31 | - 0.0006921259647857977 32 | - 0.00012632115564366634 33 | - -0.5388647008347751 34 | - 0.5305050468957526 35 | - -0.6543259972492321 36 | - -0.008541488099710197 37 | - 0.465647305518467 38 | - -0.4596992826623059 39 | - -0.7561778136707082 40 | - 0.5242294579470399 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0018227839003619188 50 | - 0.0017934781195245982 51 | - 0.0044512531258669216 52 | - 0.0004272643343968978 53 | - 0.0033011946127564896 54 | - 0.004164193916821838 55 | - 0.004298876519949212 56 | - 0.00035329568808827743 57 | - 0.004042147223758283 58 | - 0.0037551480998523506 59 | - 0.0037189251014270964 60 | - 0.00035881803664432967 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r5_camera60.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 292.0285930848304 5 | - 0.0 6 | - 144.00442873377975 7 | - 0.0 8 | - 293.95259618533396 9 | - 145.88910315941828 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera60 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.25341478022067476 19 | - 0.17099512391833938 20 | - -0.0014184980803399942 21 | - 0.0002517384743051436 22 | - -0.11906924909258061 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.699743668396842 30 | - 0.7143685685080576 31 | - -0.0029093553396054026 32 | - -0.001694552433137945 33 | - 
0.5422160152815948 34 | - 0.5284547088756493 35 | - -0.6532159782970992 36 | - -0.007850685480007464 37 | - -0.4651060402596818 38 | - -0.45867061536542447 39 | - -0.7571276016602879 40 | - 0.5228406235997342 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0018332186701647034 50 | - 0.0018016217206025726 51 | - 0.004612589974219499 52 | - 0.0005240054641435965 53 | - 0.003733258318774234 54 | - 0.0033416744035683925 55 | - 0.0046038170219549965 56 | - 0.0002967797970863807 57 | - 0.0044344916588204795 58 | - 0.004474702957796695 59 | - 0.003968001457894717 60 | - 0.00025687284923325745 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r6_camera180.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 292.88269384058924 5 | - 0.0 6 | - 143.79235188937437 7 | - 0.0 8 | - 294.30956500441897 9 | - 149.2307049454187 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera180 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.22994262887993008 19 | - 0.08894501128947027 20 | - -0.0023966108654140225 21 | - -6.935365949748796e-05 22 | - -0.017936186650326827 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - 0.9999720431932659 30 | - 0.006455366446344063 31 | - -1.881938647716094e-05 32 | - 0.003625256844403014 33 | - 0.005636796904021368 34 | - -0.8744145177364162 35 | - -0.48512989398178585 36 | - -0.01060281975896039 37 | - -0.0031473446745730743 38 | - 0.48512312844760014 39 | - -0.8744297446424611 40 | - 0.5330174005997446 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 1.2128947602583105e-05 50 | - 0.0016961801889458417 51 | - 0.003370986119038037 52 | - 0.0005772000185894434 53 | - 0.0024953903457797444 54 | - 0.0015517098495523652 55 | - 0.002798013802624924 56 | - 0.0005740488455741548 57 | - 0.002829567804180083 58 | - 0.002800862430880717 59 | - 0.0015539842076223718 60 | - 0.0003714630428281225 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r6_camera300.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 293.0572612023564 5 | - 0.0 6 | - 139.17801464819777 7 | - 0.0 8 | - 295.40667105309467 9 | - 142.67976269897005 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera300 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.23333477959856513 19 | - 0.09921033333190701 20 | - -0.002754143284285488 21 | - -0.0003490048615555727 22 | - -0.03163811291317096 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.707683572638958 30 | - -0.7064149220668923 31 | - -0.012148211019830187 32 | - 0.003543577770328583 33 | - -0.5258508371764269 34 | - 0.5381249309070799 35 | - -0.6587011985700283 36 | - -0.008355835603714326 37 | - 0.47186014717667946 38 | - -0.4597667518660912 39 | - -0.7522929898286798 40 | - 0.5258507414769803 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.001291453231841644 50 | - 
0.0012862151818786298 51 | - 0.0033192870108997943 52 | - 0.0005630001327120169 53 | - 0.0018159974126859078 54 | - 0.00256262368578926 55 | - 0.002306922788415099 56 | - 0.0005009385939512225 57 | - 0.002809602491676512 58 | - 0.0024143542267237414 59 | - 0.002017391146870438 60 | - 0.00036416735870768095 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r6_camera60.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 294.392182042338 5 | - 0.0 6 | - 135.903897476172 7 | - 0.0 8 | - 296.3789733076691 9 | - 141.54828717833394 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera60 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.2358354659457964 19 | - 0.11419031451647611 20 | - -0.0013552452815978515 21 | - -5.197185272809166e-05 22 | - -0.0506407541295614 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.701553011571557 30 | - 0.7125652659286393 31 | - -0.0081941598120978 32 | - 0.0031462234014824446 33 | - 0.5392229494438794 34 | - 0.5233014510040657 35 | - -0.6598389826645469 36 | - -0.006383759084792281 37 | - -0.46589093385989944 38 | - -0.4673315326527294 39 | - -0.7513562295190089 40 | - 0.5248396817440364 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.001229013783021029 50 | - 0.0012230723194279855 51 | - 0.0019907564196428966 52 | - 0.000678238864738361 53 | - 0.0014372425347592508 54 | - 0.0013047132797986282 55 | - 0.001717900885483957 56 | - 0.0003934710849838526 57 | - 0.0020110579530102307 58 | - 0.0020913093736326304 59 | - 0.0015082339319933923 60 | - 0.0003564474250847881 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r7_camera180.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 294.91407777779557 5 | - 0.0 6 | - 142.61132801267524 7 | - 0.0 8 | - 295.90225454309245 9 | - 135.97925936019564 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera180 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.23720518121586948 19 | - 0.14246443256245964 20 | - -0.003145098642467982 21 | - 0.0003035117203527542 22 | - -0.13056703179574852 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - 0.9999614301557127 30 | - -0.0038044886143320166 31 | - 0.007070722980804723 32 | - -0.005681891195977897 33 | - 0.0004308237353203887 34 | - -0.8539705455649946 35 | - -0.5203122790607115 36 | - 0.014343960165359073 37 | - 0.008019151190371582 38 | - 0.520297200384148 39 | - -0.8539430342275228 40 | - 0.5382261057631044 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 1.9283957713196662e-05 50 | - 0.002908282193539116 51 | - 0.002051942169076023 52 | - 0.0005212657697279603 53 | - 0.0026520656637447205 54 | - 0.0007755873522018889 55 | - 0.0012723871096378659 56 | - 0.0004834483677217526 57 | - 0.002368962194840589 58 | - 0.0012636475649274786 59 | - 0.0007758826246265768 60 | - 0.0003799753060648883 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | 
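Each of the calibration files above follows the same layout: `camera_matrix` and `distortion_coefficients` hold the camera intrinsics, `image_height`/`image_width` give the resolution, and `tf_world_to_camera` is a flattened 4x4 homogeneous world-to-camera transform (with per-entry standard deviations in `tf_world_to_camera_std`). A minimal sketch of reading such a file and projecting a world-frame point into the image, assuming PyYAML, NumPy and OpenCV are installed and that `tf_world_to_camera` maps world coordinates into the camera frame as its name suggests:

```python
import cv2
import numpy as np
import yaml

# Load one of the calibration files shown above.
with open("trifinger_rl_datasets/data/r7_camera180.yml") as f:
    calib = yaml.safe_load(f)

K = np.array(calib["camera_matrix"]["data"]).reshape(3, 3)
dist_coeffs = np.array(calib["distortion_coefficients"]["data"])
tf_world_to_camera = np.array(calib["tf_world_to_camera"]["data"]).reshape(4, 4)

# Extrinsics as rotation vector and translation for OpenCV.
rvec, _ = cv2.Rodrigues(tf_world_to_camera[:3, :3])
tvec = tf_world_to_camera[:3, 3]

# Project the world origin into pixel coordinates.
point_world = np.zeros((1, 3))
pixel, _ = cv2.projectPoints(point_world, rvec, tvec, K, dist_coeffs)
print(pixel.squeeze())  # (u, v) in the 270x270 image
```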
-------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r7_camera300.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 296.2022396427944 5 | - 0.0 6 | - 133.82740566738968 7 | - 0.0 8 | - 297.3771133577303 9 | - 134.75089001072408 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera300 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.2333745455445079 19 | - 0.11679667060789958 20 | - -0.0023468444404548742 21 | - 0.00014715034620450563 22 | - -0.0708895129264651 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.7111205566981869 30 | - -0.7030203571281524 31 | - -0.007699906468514525 32 | - -0.0014288398691257784 33 | - -0.5100222458361423 34 | - 0.5233743164580106 35 | - -0.6826047744879559 36 | - 0.011510066687658853 37 | - 0.4839159914333139 38 | - -0.4814894107708583 39 | - -0.7307454771098745 40 | - 0.5294821942287425 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.00206778219301248 50 | - 0.0020930770637026014 51 | - 0.0014092729506834936 52 | - 0.0004645190484458327 53 | - 0.0020522867428850863 54 | - 0.0016974886777108267 55 | - 0.0005119784456929096 56 | - 0.00046796029096165116 57 | - 0.0012993169583872517 58 | - 0.001545060400600723 59 | - 0.00048249135429546594 60 | - 0.000498084902998353 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r7_camera60.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 295.68284749216275 5 | - 0.0 6 | - 138.6172318394432 7 | - 0.0 8 | - 296.88876982765737 9 | - 141.98753763870397 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera60 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.23528658775530842 19 | - 0.11773801226352881 20 | - -0.002460578200390553 21 | - 0.0001076018703530655 22 | - -0.060997186815576546 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.6975114038977125 30 | - 0.7165641438116744 31 | - 0.0006223597262802224 32 | - -0.0033290540858423394 33 | - 0.5380220394522306 34 | - 0.5242879445517642 35 | - -0.6600364261980698 36 | - -0.006782151244644883 37 | - -0.4732843702854203 38 | - -0.46004892843662465 39 | - -0.7512308632726631 40 | - 0.5263967463789823 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0023360196800146084 50 | - 0.002275668967104426 51 | - 0.0016267290069801044 52 | - 0.000571890327053958 53 | - 0.001838070170067589 54 | - 0.0015387251485108407 55 | - 0.0007785730511057413 56 | - 0.00037582786781152047 57 | - 0.00178769402929836 58 | - 0.002327251836675354 59 | - 0.0006829373342669043 60 | - 0.0003199057768580286 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r8_camera180.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 296.02053364867896 5 | - 0.0 6 | - 141.07075475961815 7 | - 0.0 8 | - 296.8487591707737 9 | - 137.25746637774347 10 | - 0.0 
11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera180 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.24950471284028264 19 | - 0.24342845204591992 20 | - -0.002531494887887566 21 | - 0.00046256063896776535 22 | - -0.3541986260087375 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - 0.9998932927741045 30 | - 0.012004739627995153 31 | - 0.007465193312412156 32 | - 0.001294615931159824 33 | - 0.014135952776839406 34 | - -0.8561971106260303 35 | - -0.5164466993433118 36 | - 0.012424529489562694 37 | - 0.00019488181040999667 38 | - 0.5164978786653587 39 | - -0.8562827476906107 40 | - 0.538478279558769 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 3.7629732321215315e-05 50 | - 0.003084536149848168 51 | - 0.0020110697091049044 52 | - 0.0005356753336498872 53 | - 0.0025770758673765376 54 | - 0.0008503720407472454 55 | - 0.0014576966397219205 56 | - 0.0004912935035423122 57 | - 0.0026262500846193013 58 | - 0.0014483854523583061 59 | - 0.0008742727017023533 60 | - 0.00037525738136786344 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r8_camera300.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 295.6492218709406 5 | - 0.0 6 | - 140.90844522439747 7 | - 0.0 8 | - 296.8523852917806 9 | - 134.3721297951843 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera300 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.23263432076355056 19 | - 0.09607373055882804 20 | - -0.0021965093214136658 21 | - 0.00010040389103745153 22 | - -0.020900940190873876 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - -0.7069760632217225 30 | - -0.7072283939196913 31 | - -0.0008312390838778697 32 | - -0.00152778567544845 33 | - -0.5180401431551137 34 | - 0.5186550356444332 35 | - -0.6801626021939451 36 | - 0.010290482386052437 37 | - 0.4814637199137908 38 | - -0.48043147307021605 39 | - -0.7330577405868307 40 | - 0.5309402021522255 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0021223869063767544 50 | - 0.002120233007018172 51 | - 0.0017759483917280225 52 | - 0.0005414513654696901 53 | - 0.0022978395868184544 54 | - 0.002049430035805928 55 | - 0.0008476007454615762 56 | - 0.0004494651527114296 57 | - 0.0014689878054040275 58 | - 0.0013627055998963378 59 | - 0.000787505151625276 60 | - 0.0005483492176623279 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/r8_camera60.yml: -------------------------------------------------------------------------------- 1 | camera_matrix: 2 | cols: 3 3 | data: 4 | - 295.4173651653694 5 | - 0.0 6 | - 139.6933250987614 7 | - 0.0 8 | - 296.93882981835765 9 | - 142.97108034297597 10 | - 0.0 11 | - 0.0 12 | - 1.0 13 | rows: 3 14 | camera_name: camera60 15 | distortion_coefficients: 16 | cols: 5 17 | data: 18 | - -0.2303760271500639 19 | - 0.09706332868956252 20 | - -0.003083941016988369 21 | - -5.190048856099958e-05 22 | - -0.03530538891597434 23 | rows: 1 24 | image_height: 270 25 | image_width: 270 26 | tf_world_to_camera: 27 | cols: 4 28 | data: 29 | - 
-0.7030252255720041 30 | - 0.7111477255883698 31 | - 0.0028142862053625395 32 | - -0.0016364916773338725 33 | - 0.5304715245842915 34 | - 0.5270363233833588 35 | - -0.6639450477383473 36 | - -0.004922911244153889 37 | - -0.4736460950801597 38 | - -0.465280304546193 39 | - -0.7477693453351668 40 | - 0.5267169283529306 41 | - 0.0 42 | - 0.0 43 | - 0.0 44 | - 1.0 45 | rows: 4 46 | tf_world_to_camera_std: 47 | cols: 4 48 | data: 49 | - 0.0022852521506059376 50 | - 0.002257239526714245 51 | - 0.002491362158126567 52 | - 0.0005670914732711851 53 | - 0.002294571088508853 54 | - 0.0014931293629478386 55 | - 0.0014678323194583416 56 | - 0.0004086713091198698 57 | - 0.0020935606259285377 58 | - 0.002922304033917509 59 | - 0.001303125770391626 60 | - 0.00040230225517642936 61 | - 0.0 62 | - 0.0 63 | - 0.0 64 | - 0.0 65 | rows: 4 66 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/data/trifingerpro_shuffle_cube_trajectory_fast.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/rr-learning/trifinger_rl_datasets/f23729f90634f570d7c3dfc2fba5d8c0e838cf44/trifinger_rl_datasets/data/trifingerpro_shuffle_cube_trajectory_fast.npy -------------------------------------------------------------------------------- /trifinger_rl_datasets/dataset_env.py: -------------------------------------------------------------------------------- 1 | from copy import deepcopy 2 | import hashlib 3 | import os 4 | from pathlib import Path 5 | from threading import Thread 6 | from typing import Union, Tuple, Dict, Optional, List, Any 7 | import urllib.request 8 | 9 | import cv2 10 | import gymnasium as gym 11 | import gymnasium.spaces as spaces 12 | import numpy as np 13 | from tqdm import tqdm 14 | import yaml 15 | import zarr 16 | 17 | from .sim_env import SimTriFingerCubeEnv 18 | 19 | 20 | class ImageLoader(Thread): 21 | """Thread for loading and processing images from the dataset. 22 | 23 | This thread is responsible for loading and processing every 24 | loader_id-th image. Processing includes decoding, reordering 25 | of pixels and debayering.""" 26 | 27 | def __init__( 28 | self, 29 | loader_id, 30 | n_loaders, 31 | image_data, 32 | unique_images, 33 | n_unique_images, 34 | n_cameras, 35 | reorder_pixels, 36 | timestep_dimension, 37 | ): 38 | """ 39 | Args: 40 | loader_id: ID of this loader. This loader will load every 41 | loader_id-th image. 42 | n_loaders: Total number of loaders. 43 | image_data: Numpy array containing the image data. 44 | unique_images: Numpy array to which the images are written. 45 | n_unique_images: Number of unique images to load. If this 46 | number is not divisible by n_cameras, 47 | self.unique_images will be padded with zeros. 48 | n_cameras: Number of cameras. 49 | reorder_pixels: Whether to undo the reordering of the pixels 50 | which was done during creation of the dataset to improve 51 | the image compression. 52 | timestep_dimension: If True, the image data is expected to 53 | contain images from all cameras in a row and 54 | n_unique_images is expected to have shape 55 | (n_timesteps, n_cameras, height, width). 
If False, the 56 | shape is expected to be 57 | (n_unique_images, n_cameras, height, width).""" 58 | super().__init__() 59 | self.loader_id = loader_id 60 | self.n_loaders = n_loaders 61 | self.image_data = image_data 62 | self.unique_images = unique_images 63 | self.n_unique_images = n_unique_images 64 | self.n_cameras = n_cameras 65 | self.reorder_pixels = reorder_pixels 66 | self.timestep_dimension = timestep_dimension 67 | 68 | def _reorder_pixels(self, img: np.ndarray) -> np.ndarray: 69 | """Undo reordering of Bayer pattern.""" 70 | new = np.empty_like(img) 71 | a = img.shape[0] // 2 72 | b = img.shape[1] // 2 73 | 74 | red = img[0:a, 0:b] 75 | blue = img[a:, 0:b] 76 | green1 = img[0:a, b:] 77 | green2 = img[a:, b:] 78 | 79 | new[0::2, 0::2] = red 80 | new[1::2, 1::2] = blue 81 | new[0::2, 1::2] = green1 82 | new[1::2, 0::2] = green2 83 | 84 | return new 85 | 86 | def _decode_image(self, image: np.ndarray) -> np.ndarray: 87 | """Decode image from numpy array of type void.""" 88 | # convert numpy array of type V1 to use with cv2 imdecode 89 | image = np.frombuffer(image, dtype=np.uint8) 90 | # use cv2 to decode image 91 | image = cv2.imdecode(image, cv2.IMREAD_UNCHANGED) 92 | return image 93 | 94 | def run(self): 95 | # this thread is responsible for every loader_id-th image 96 | for i in range(self.loader_id, self.n_unique_images, self.n_loaders): 97 | if self.timestep_dimension: 98 | timestep, camera = divmod(i, self.n_cameras) 99 | compressed_image = self.image_data[i] 100 | # decode image 101 | image = self._decode_image(compressed_image) 102 | if self.reorder_pixels: 103 | # undo reordering of pixels 104 | image = self._reorder_pixels(image) 105 | # debayer image (output channels in RGB order) 106 | image = cv2.cvtColor(image, cv2.COLOR_BAYER_BG2RGB) 107 | # convert to channel first 108 | image = np.transpose(image, (2, 0, 1)) 109 | if self.timestep_dimension: 110 | self.unique_images[timestep, camera, ...] = image 111 | else: 112 | self.unique_images[i, ...] = image 113 | 114 | 115 | class TriFingerDatasetEnv(gym.Env): 116 | """TriFinger environment which can load an offline RL dataset from a file. 117 | 118 | Similar to D4RL's OfflineEnv but with different data loading and 119 | options for customization of observation space.""" 120 | 121 | _PRELOAD_VECTOR_KEYS = ["observations", "actions"] 122 | _PRELOAD_SCALAR_KEYS = ["rewards", "timeouts"] 123 | 124 | def __init__( 125 | self, 126 | name, 127 | dataset_url, 128 | ref_max_score, 129 | ref_min_score, 130 | trifinger_kwargs, 131 | real_robot=False, 132 | image_obs=False, 133 | visualization=False, 134 | obs_to_keep=None, 135 | flatten_obs=True, 136 | scale_obs=False, 137 | set_terminals=False, 138 | data_dir=None, 139 | **kwargs, 140 | ): 141 | """ 142 | Args: 143 | name (str): Name of the dataset. 144 | dataset_url (str): URL pointing to the dataset. 145 | ref_max_score (float): Maximum score (for score normalization) 146 | ref_min_score (float): Minimum score (for score normalization) 147 | trifinger_kwargs (dict): Keyword arguments for underlying 148 | SimTriFingerCubeEnv environment. 149 | real_robot (bool): Whether the data was collected on real 150 | robots. 151 | image_obs (bool): Whether observations contain camera 152 | images. 153 | visualization (bool): Enables rendering for simulated 154 | environment. 155 | obs_to_keep (dict): Dictionary with the same structure as 156 | the observation of SimTriFingerCubeEnv. The boolean 157 | value of each item indicates whether it should be 158 | included in the observation. 
If None, the 159 | SimTriFingerCubeEnv is used. 160 | flatten_obs (bool): Whether to flatten the observation. Can 161 | be combined with obs_to_keep. 162 | scale_obs (bool): Whether to scale all components of the 163 | observation to interval [-1, 1]. Only implemented 164 | for flattend observations. 165 | set_terminals (bool): Whether to set the terminals instead 166 | of the timeouts. 167 | data_dir (str or Path): Directory where the dataset is 168 | stored. If None, the default data directory 169 | (~/.trifinger_rl_datasets) is used. 170 | """ 171 | super().__init__(**kwargs) 172 | 173 | self.name = name 174 | self.dataset_url = dataset_url 175 | self.ref_max_score = ref_max_score 176 | self.ref_min_score = ref_min_score 177 | self.real_robot = real_robot 178 | self.image_obs = image_obs 179 | self.obs_to_keep = obs_to_keep 180 | self.flatten_obs = flatten_obs 181 | self.scale_obs = scale_obs 182 | self.set_terminals = set_terminals 183 | self._local_dataset_path = None 184 | if data_dir is None: 185 | data_dir = Path.home() / ".trifinger_rl_datasets" 186 | self.data_dir = Path(data_dir) 187 | 188 | self.t_kwargs = deepcopy(trifinger_kwargs) 189 | self.t_kwargs["image_obs"] = image_obs 190 | self.t_kwargs["visualization"] = visualization 191 | 192 | # underlying simulated TriFinger environment 193 | self.sim_env = SimTriFingerCubeEnv(**self.t_kwargs) 194 | # a copy of the original observation space which is used when 195 | # filtering the observations 196 | self._orig_obs_space = deepcopy(self.sim_env.observation_space) 197 | # the space used for unflattening the observations (images will 198 | # be removed from this space) 199 | self._unflattening_space = deepcopy(self.sim_env.observation_space) 200 | 201 | # remove camera observations from space used for flattening 202 | # and unflattening as images are treated separetely and not 203 | # flattened 204 | if self.image_obs: 205 | stripped_camera_observations = spaces.Dict( 206 | { 207 | k: v 208 | for k, v in self._orig_obs_space.spaces[ 209 | "camera_observation" 210 | ].spaces.items() 211 | if k != "images" 212 | } 213 | ) 214 | self._unflattening_space["camera_observation"] = stripped_camera_observations 215 | if self.flatten_obs: 216 | # if the observations are eventually flattened, they do not contain 217 | # images anymore 218 | self._orig_obs_space["camera_observation"] = stripped_camera_observations 219 | self._orig_flat_obs_space = spaces.flatten_space(self._orig_obs_space) 220 | self._flat_unflattening_space = spaces.flatten_space(self._unflattening_space) 221 | 222 | if scale_obs and not flatten_obs: 223 | raise NotImplementedError( 224 | "Scaling of observations only " 225 | "implemented for flattened observations, i.e., for " 226 | "flatten_obs=True." 
227 | ) 228 | 229 | # action space 230 | self.action_space = self.sim_env.action_space 231 | 232 | # observation space 233 | # self._filtered_obs_space is the Dict observation space after 234 | # filtering 235 | if self.obs_to_keep is not None: 236 | # construct filtered observation space 237 | self._filtered_obs_space = self._filter_dict( 238 | keys_to_keep=self.obs_to_keep, d=self._orig_obs_space 239 | ) 240 | else: 241 | self._filtered_obs_space = self._orig_obs_space 242 | # self.observation_space is potentially also flattened 243 | if self.flatten_obs: 244 | # flat obs space 245 | self.observation_space = spaces.flatten_space(self._filtered_obs_space) 246 | if self.scale_obs: 247 | self._obs_unscaled_low = self.observation_space.low 248 | self._obs_unscaled_high = self.observation_space.high 249 | # scale observations to [-1, 1] 250 | self.observation_space = spaces.Box( 251 | low=-1.0, 252 | high=1.0, 253 | shape=self.observation_space.shape, 254 | dtype=self.observation_space.dtype, 255 | ) 256 | else: 257 | self.observation_space = self._filtered_obs_space 258 | 259 | def _download_dataset(self): 260 | """Download dataset files if not already present. 261 | 262 | `self.dataset_url` is expected to point to a YAML file with the 263 | following structure: 264 | ``` 265 | n_parts: 266 | md5_hash_parts: 267 | - 268 | - 269 | ... 270 | md5_hash_complete: 271 | ``` 272 | The dataset is split into multiple parts to allow for 273 | continuing a download if it was interrupted. The complete 274 | dataset is then reconstructed by concatenating the parts.""" 275 | if self._local_dataset_path is None: 276 | dataset_dir = self.data_dir / (self.name + ".zarr") 277 | dataset_dir.mkdir(exist_ok=True, parents=True) 278 | local_path = dataset_dir / "data.mdb" 279 | if not local_path.exists(): 280 | print(f"Downloading dataset {self.name}.") 281 | # first download YAML file with info about dataset files 282 | with urllib.request.urlopen(self.dataset_url) as web_url: 283 | dataset_info = yaml.safe_load(web_url) 284 | # download dataset parts 285 | for i, part_hash in enumerate(tqdm(dataset_info["md5_hash_parts"])): 286 | part_path = dataset_dir / f"{self.name}_{i:03d}" 287 | if not part_path.exists(): 288 | # strip filename from url 289 | stripped_url = self.dataset_url.rsplit("/", 1)[0] 290 | part_url = stripped_url + f"/part_{i:03d}" 291 | urllib.request.urlretrieve(part_url, part_path) 292 | if not part_path.exists(): 293 | raise IOError( 294 | f"Failed to download part {i} of dataset from URL {part_url}." 295 | ) 296 | # check hash 297 | with open(part_path, "rb") as f: 298 | m = hashlib.md5() 299 | m.update(f.read()) 300 | if m.hexdigest() != part_hash: 301 | raise IOError( 302 | f"Hash of downloaded part {part_path} does not " 303 | f"match expected hash. Please delete " 304 | f"the file and try again." 305 | ) 306 | # combine parts 307 | with open(local_path, "wb") as f: 308 | print("Assembling dataset parts.") 309 | for i in tqdm(range(dataset_info["n_parts"])): 310 | part_path = dataset_dir / f"{self.name}_{i:03d}" 311 | with open(part_path, "rb") as part_file: 312 | f.write(part_file.read()) 313 | # delete part file 314 | part_path.unlink() 315 | if not local_path.exists(): 316 | raise IOError( 317 | f"Failed to assemble dataset {self.dataset_url} locally at {local_path}." 318 | ) 319 | self._local_dataset_path = dataset_dir 320 | return self._local_dataset_path 321 | 322 | def _filter_dict(self, keys_to_keep, d): 323 | """Keep only a subset of keys in dict. 324 | 325 | Applied recursively. 
326 | 327 | Args: 328 | keys_to_keep (dict): (Nested) dictionary with values being 329 | either a dict or a bolean indicating whether to keep 330 | an item. 331 | d (dict or gymnasium.spaces.Dict): Dicitionary or Dict space that 332 | is to be filtered.""" 333 | 334 | filtered_dict = {} 335 | for k, v in keys_to_keep.items(): 336 | if isinstance(v, dict): 337 | subspace = self._filter_dict(v, d[k]) 338 | filtered_dict[k] = subspace 339 | elif isinstance(v, bool) and v: 340 | filtered_dict[k] = d[k] 341 | elif not isinstance(v, bool): 342 | raise TypeError( 343 | "Expected boolean to indicate whether item " 344 | "in observation space is to be kept." 345 | ) 346 | if isinstance(d, spaces.Dict): 347 | filtered_dict = spaces.Dict(spaces=filtered_dict) 348 | return filtered_dict 349 | 350 | def _scale_obs(self, obs: np.ndarray) -> np.ndarray: 351 | """Scale observation components to [-1, 1].""" 352 | 353 | interval = self._obs_unscaled_high.high - self._obs_unscaled_low.low 354 | a = (obs - self._obs_unscaled_low.low) / interval 355 | return a * 2.0 - 1.0 356 | 357 | def _process_obs(self, obs: Union[np.ndarray, Dict]) -> np.ndarray: 358 | """Process obs according to params. 359 | 360 | Assumes that if `self.obs_to_keep` is not None, then the observations 361 | are provided as a dictionary. 362 | Args: 363 | obs: Dictionary or array containing the 364 | observations. 365 | Returns: 366 | Processed observations. If `self.flatten_obs` is False then 367 | as a dictionary. If `self.flatten_obs` is True then either as 368 | a 1D NumPy array (if no images are contained in obs) or as a 369 | tuple (if images are contained in the obs dictionary) 370 | consisting of 371 | * a 1D NumPy array containing all observations except the 372 | camera images, and 373 | * a NumPy array of shape (n_cameras, n_channels, height, width) 374 | containing the camera images.""" 375 | 376 | images = None 377 | if self.obs_to_keep is not None: 378 | # filter obs 379 | obs = self._filter_dict(self.obs_to_keep, obs) 380 | if self.flatten_obs and isinstance(obs, dict): 381 | if "images" in obs["camera_observation"]: 382 | # remove camera_observations/images from obs 383 | images = obs["camera_observation"].pop("images") 384 | # flatten obs 385 | obs = spaces.flatten(self._filtered_obs_space, obs) 386 | if self.scale_obs: 387 | # scale obs 388 | obs = self._scale_obs(obs) 389 | if images is not None: 390 | return obs, images 391 | else: 392 | return obs 393 | 394 | def get_obs_indices(self) -> Tuple[Dict, Dict]: 395 | """Get index ranges that correspond to the different observation components. 396 | 397 | Also returns a dictionary containing the shapes of these observation 398 | components. 399 | 400 | Returns: 401 | - A dictionary with keys corresponding to the observation components and 402 | values being tuples of the form (start, end), where start and end are 403 | the indices at which the observation component starts and ends. The 404 | nested dictionary structure of the observation is preserved. 
405 | - A dictionary of the same structure but with values being the shapes 406 | of the observation components.""" 407 | 408 | def _construct_dummy_obs(spaces_dict, counter=[0]): 409 | """Construct dummy observation which has an array repeating 410 | a different integer as the value of each component.""" 411 | dummy_obs = {} 412 | for i, (k, v) in enumerate(spaces_dict.items()): 413 | if isinstance(v, spaces.Dict): 414 | dummy_obs[k] = _construct_dummy_obs(v.spaces, counter) 415 | else: 416 | dummy_obs[k] = counter * np.ones(v.shape, dtype=np.int32) 417 | counter[0] += 1 418 | return dummy_obs 419 | 420 | dummy_obs = _construct_dummy_obs(self._orig_obs_space.spaces) 421 | flat_dummy_obs = spaces.flatten(self._orig_obs_space, dummy_obs) 422 | 423 | def _get_indices_and_shape(dummy_obs, flat_dummy_obs): 424 | indices = {} 425 | shape = {} 426 | for k, v in dummy_obs.items(): 427 | if isinstance(v, dict): 428 | indices[k], shape[k] = _get_indices_and_shape(v, flat_dummy_obs) 429 | else: 430 | where = np.where(flat_dummy_obs == v.flatten()[0])[0] 431 | indices[k] = (int(where[0]), int(where[-1]) + 1) 432 | shape[k] = v.shape 433 | return indices, shape 434 | 435 | return _get_indices_and_shape(dummy_obs, flat_dummy_obs) 436 | 437 | def get_dataset_stats(self, zarr_path: Union[str, os.PathLike] = None) -> Dict: 438 | """Get statistics of dataset such as number of timesteps. 439 | 440 | Args: 441 | zarr_path: Optional path to a Zarr directory containing the dataset, which will be 442 | used instead of the default. 443 | Returns: 444 | The statistics of the dataset as a dictionary with keys 445 | 446 | - n_timesteps: Number of timesteps in dataset. Corresponds to the 447 | number of observations, actions and rewards. 448 | - obs_size: Size of the observation vector. 449 | - action_size: Size of the action vector. 450 | """ 451 | if zarr_path is None: 452 | zarr_path = self._download_dataset() 453 | 454 | store = zarr.LMDBStore(zarr_path, readonly=True) 455 | with zarr.open(store=store) as root: 456 | dataset_stats = { 457 | "n_timesteps": root["observations"].shape[0], 458 | "obs_size": root["observations"].shape[1], 459 | "action_size": root["actions"].shape[1], 460 | } 461 | return dataset_stats 462 | 463 | def get_image_stats(self, zarr_path: Union[str, os.PathLike] = None) -> Dict: 464 | """Get statistics of image data in dataset. 465 | 466 | Args: 467 | zarr_path: Optional path to a Zarr directory containing the dataset, which will be 468 | used instead of the default. 469 | Returns: 470 | The statistics of the image data as a dictionary with keys 471 | 472 | - n_images: Number of images in the dataset. 473 | - n_cameras: Number of cameras used to capture the images. 474 | - n_channels: Number of channels in the images. 475 | - image_shape: Shape of the images in the format (height, width). 476 | - reorder_pixels: Whether the pixels in the images have been reordered 477 | to have the pixels corresponding to one color in the Bayer pattern 478 | together in blocks (to improve image compression). 
479 | """ 480 | if zarr_path is None: 481 | zarr_path = self._download_dataset() 482 | 483 | store = zarr.LMDBStore(zarr_path, readonly=True) 484 | with zarr.open(store=store) as root: 485 | image_stats = { 486 | "n_images": root["images"].shape[0], 487 | "n_cameras": root["images"].attrs["n_cameras"], 488 | "n_channels": root["images"].attrs["n_channels"], 489 | "image_shape": tuple(root["images"].attrs["image_shape"]), 490 | "reorder_pixels": root["images"].attrs["reorder_pixels"], 491 | } 492 | return image_stats 493 | 494 | def get_image_data( 495 | self, 496 | rng: Optional[Tuple[int, int]] = None, 497 | indices: Optional[np.ndarray] = None, 498 | zarr_path: Union[str, os.PathLike] = None, 499 | timestep_dimension: bool = True, 500 | n_threads: Optional[int] = None, 501 | ) -> np.ndarray: 502 | """Get image data from dataset. 503 | 504 | Args: 505 | rng: Optional range of images to return. rng=(m,n) means that the 506 | images with indices m to n-1 are returned. 507 | indices: Optional array of image indices for which to load data. rng 508 | and indices are mutually exclusive, only one of them can be set. 509 | zarr_path: Optional path to a Zarr directory containing the dataset, 510 | which will be used instead of the default. 511 | timestep_dimension: Whether to include the timestep dimension in the 512 | returned array. This is useful if the given range of indices 513 | always contains `n_cameras` of image indices in a row which 514 | correspond to the camera images at one camera timestep. 515 | If this assumption is violated, the first dimension will not 516 | correspond to camera timesteps anymore. 517 | 518 | n_threads: Number of threads to use for processing the images. If None, 519 | the number of threads is set to the number of CPUs available to the 520 | process. 521 | Returns: 522 | The image data (or a part of it specified by rng or indices) as a numpy 523 | array. If `timestep_dimension` is True the shape will be 524 | (n_camera_timesteps, n_cameras, n_channels, height, width) else 525 | (n_images, n_channels, height, width). The channels are ordered as RGB. 526 | """ 527 | if rng is not None and indices is not None: 528 | raise ValueError("rng and indices cannot be specified at the same time.") 529 | 530 | if n_threads is None: 531 | n_threads = len(os.sched_getaffinity(0)) 532 | if zarr_path is None: 533 | zarr_path = self._download_dataset() 534 | store = zarr.LMDBStore(zarr_path, readonly=True) 535 | root = zarr.open(store=store) 536 | 537 | n_cameras = root["images"].attrs["n_cameras"] 538 | n_channels = root["images"].attrs["n_channels"] 539 | image_shape = tuple(root["images"].attrs["image_shape"]) 540 | reorder_pixels = root["images"].attrs["reorder_pixels"] 541 | compression = root["images"].attrs["compression"] 542 | assert compression == "image", "Only image compression is supported." 
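# Illustrative usage of this method (an assumed snippet, not part of the
# original file): `env` is a TriFingerDatasetEnv and `n_cameras` can be read
# from `env.get_image_stats()["n_cameras"]`. With the default
# timestep_dimension=True, the images of the first two camera timesteps
# could be loaded as
#     images = env.get_image_data(rng=(0, 2 * n_cameras))
# which yields an array of shape (2, n_cameras, n_channels, height, width).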
543 | 544 | # load only relevant image data 545 | if indices is not None: 546 | image_data = root["images"].get_orthogonal_selection(indices) 547 | else: 548 | image_data = root["images"][slice(*rng)] 549 | n_unique_images = image_data.shape[0] 550 | if timestep_dimension: 551 | n_timesteps = int(np.ceil(n_unique_images / n_cameras)) 552 | out_shape = (n_timesteps, n_cameras, n_channels) + image_shape 553 | else: 554 | out_shape = (n_unique_images, n_channels) + image_shape 555 | unique_images = np.zeros(out_shape, dtype=np.uint8) 556 | 557 | threads = [] 558 | # distribute image loading and processing over multiple threads 559 | for i in range(n_threads): 560 | image_loader = ImageLoader( 561 | loader_id=i, 562 | n_loaders=n_threads, 563 | image_data=image_data, 564 | unique_images=unique_images, 565 | n_unique_images=n_unique_images, 566 | n_cameras=n_cameras, 567 | reorder_pixels=reorder_pixels, 568 | timestep_dimension=timestep_dimension, 569 | ) 570 | threads.append(image_loader) 571 | image_loader.start() 572 | for thread in threads: 573 | thread.join() 574 | store.close() 575 | 576 | return unique_images 577 | 578 | def convert_timestep_to_image_index( 579 | self, 580 | timesteps: np.ndarray, 581 | zarr_path: Union[str, os.PathLike] = None, 582 | ) -> np.ndarray: 583 | """Convert camera timesteps to image indices. 584 | 585 | Args: 586 | timesteps: Array of camera timesteps. 587 | Returns: 588 | Array of image indices. 589 | """ 590 | if zarr_path is None: 591 | zarr_path = self._download_dataset() 592 | store = zarr.LMDBStore(zarr_path, readonly=True) 593 | root = zarr.open(store=store) 594 | 595 | # mapping from observation index to image index 596 | # (necessary since the camera frequency < control frequency) 597 | image_indices = root["obs_to_image_index"].get_coordinate_selection(timesteps) 598 | store.close() 599 | return image_indices 600 | 601 | def get_dataset( 602 | self, 603 | zarr_path: Union[str, os.PathLike] = None, 604 | clip: bool = True, 605 | rng: Optional[Tuple[int, int]] = None, 606 | indices: Optional[np.ndarray] = None, 607 | n_threads: Optional[int] = None, 608 | ) -> Dict[str, Any]: 609 | """Get the dataset. 610 | 611 | When called for the first time, the dataset is automatically downloaded and 612 | saved to ``~/.trifinger_rl_datasets``. 613 | 614 | Args: 615 | zarr_path: Optional path to a Zarr directory containing the dataset, which will be 616 | used instead of the default. 617 | clip: If True, observations are clipped to be within the environment's 618 | observation space. 619 | rng: Optional range to return. rng=(m,n) means that observations, actions 620 | and rewards m to n-1 are returned. If not specified, the entire 621 | dataset is returned. 622 | indices: Optional array of timestep indices for which to load data. rng 623 | and indices are mutually exclusive, only one of them can be set. 624 | n_threads: Number of threads to use for processing the images. If None, 625 | the number of threads is set to the number of CPUs available to the 626 | process. 627 | Returns: 628 | A dictionary containing the following keys 629 | 630 | - observations: Either an array or a list of dictionaries 631 | containing the observations depending on whether 632 | `flatten_obs` is True or False. 633 | - actions: Array containing the actions. 634 | - rewards: Array containing the rewards. 635 | - timeouts: Array containing the timeouts (True only at 636 | the end of an episode by default. Always False if 637 | `set_terminals` is True). 
638 | - terminals: Array containing the terminals (Always 639 | False by default. If `set_terminals` is True, only 640 | True at the last timestep of an episode). 641 | - images (only if present in dataset): Array of the 642 | shape (n_control_timesteps, n_cameras, n_channels, 643 | height, width) containing the image data. The cannels 644 | are ordered as RGB. 645 | """ 646 | 647 | # The offline RL dataset is loaded from a Zarr directory which contains 648 | # the following Zarr arrays (this is an implementation detail and 649 | # not necessary to understand for users of the class): 650 | # - observations: Two-dimensional array of shape 651 | # `(n_control_timesteps, n_obs)` containing the observations as 652 | # flat vectors of length `n_obs` (except for the camera images 653 | # which are stored in image_data if present in the dataset). 654 | # - actions: Two-dimensional array of shape `(n_control_timesteps, 655 | # n_actions)` containing the actions. 656 | # - rewards: One-dimensional array of length `n_control_timesteps` 657 | # containing the rewards. 658 | # - episode_ends: One-dimensional array of length `n_episodes` 659 | # containing the indices of the last control timestep of each 660 | # episode. 661 | # - timeouts: One-dimensional array of length `n_control_timesteps` 662 | # with values of type bool. Only True at timesteps where the 663 | # episode ends, False otherwise. 664 | # - image_data: Ragged array of type bytes, which contains the 665 | # compressed image data. The images obtained from all cameras 666 | # at each camera time step are written one after another to this 667 | # array. After decompression the color information is contained 668 | # in a Bayer pattern. The images should therefore be debayerd 669 | # before use. Also note the information on the reorder_pixels 670 | # attribute below. The dataset has the following attributes: 671 | # - n_cameras: Number of cameras. 672 | # - n_channels: Number of channels per camera image. 673 | # - compression: Type of compression used. Only "image" is 674 | # supported by this class. 675 | # - image_codec: Codec used to compress the image data. Only 676 | # "jpeg" and "png" are supported by this class. 677 | # - image_shape: Tuple of length 2 containing the height and width 678 | # of the images. 679 | # - reorder_pixels: If true, the pixels of the Bayer pattern have 680 | # been reordered, such that all pixels of a specific colour are 681 | # next to each other in one big block (i.e. one block with all 682 | # red pixels, one with all blue pixels and one with all green 683 | # pixels). This leads to more continuity of the data (compared 684 | # to the original Bayer pattern) and thus tends to improve the 685 | # performance of standard image compression algorithms (e.g. 686 | # PNG). To restore the original image, the pixels need to be 687 | # reordered back before debayering. 688 | # - obs_to_image_index: One-dimensional array of length 689 | # `n_control_timesteps` containing the index of the camera 690 | # image corresponding to each control timestep. This mapping 691 | # is necessary because the camera frequency is lower than the 692 | # control frequency. 
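# The Zarr layout described above can also be inspected directly (an
# illustration only, assuming a local copy of the dataset; this method does
# the equivalent below):
#     store = zarr.LMDBStore(zarr_path, readonly=True)
#     root = zarr.open(store=store)
#     print(root["observations"].shape, root["actions"].shape)
#     if "images" in root.keys():
#         print(dict(root["images"].attrs))
#     store.close()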
693 | 694 | if rng is not None and indices is not None: 695 | raise ValueError("rng and indices cannot be specified at the same time.") 696 | 697 | if zarr_path is None: 698 | zarr_path = self._download_dataset() 699 | store = zarr.LMDBStore(zarr_path, readonly=True) 700 | root = zarr.open(store=store) 701 | 702 | data_dict = {} 703 | if indices is None: 704 | # turn range into slice 705 | n_avail_transitions = root["observations"].shape[0] 706 | if rng is None: 707 | rng = (None, None) 708 | rng = ( 709 | 0 if rng[0] is None else rng[0], 710 | n_avail_transitions if rng[1] is None else rng[1], 711 | ) 712 | range_slice = slice(*rng) 713 | for k in self._PRELOAD_VECTOR_KEYS + self._PRELOAD_SCALAR_KEYS: 714 | data_dict[k] = root[k][range_slice] 715 | else: 716 | for k in self._PRELOAD_VECTOR_KEYS: 717 | data_dict[k] = root[k].get_orthogonal_selection((indices, slice(None))) 718 | for k in self._PRELOAD_SCALAR_KEYS: 719 | data_dict[k] = root[k].get_coordinate_selection(indices) 720 | 721 | n_control_timesteps = data_dict["observations"].shape[0] 722 | 723 | # clip to make sure that there are no outliers in the data 724 | if clip: 725 | data_dict["observations"] = data_dict["observations"].clip( 726 | min=self._flat_unflattening_space.low, 727 | max=self._flat_unflattening_space.high, 728 | dtype=self._flat_unflattening_space.dtype, 729 | ) 730 | 731 | if not (self.flatten_obs and self.obs_to_keep is None): 732 | # unflatten observations, i.e., turn them into dicts again 733 | unflattened_obs = [] 734 | obs = data_dict["observations"] 735 | for i in range(obs.shape[0]): 736 | unflattened_obs.append( 737 | spaces.unflatten(self._unflattening_space, obs[i, ...]) 738 | ) 739 | data_dict["observations"] = unflattened_obs 740 | 741 | # timeouts, terminals and info 742 | if self.set_terminals: 743 | data_dict["terminals"] = data_dict["timeouts"] 744 | data_dict["timeouts"] = np.zeros(n_control_timesteps, dtype=bool) 745 | data_dict["infos"] = [{} for _ in range(n_control_timesteps)] 746 | 747 | # process obs (filtering, flattening, scaling) 748 | for i in range(n_control_timesteps): 749 | data_dict["observations"][i] = self._process_obs( 750 | obs=data_dict["observations"][i] 751 | ) 752 | # turn observations into array if obs are flattened 753 | if self.flatten_obs: 754 | data_dict["observations"] = np.array( 755 | data_dict["observations"], dtype=self.observation_space.dtype 756 | ) 757 | 758 | if "images" in root.keys(): 759 | n_cameras = root["images"].attrs["n_cameras"] 760 | if indices is None: 761 | # mapping from observation index to image index 762 | # (necessary since the camera frequency < control frequency) 763 | obs_to_image_index = root["obs_to_image_index"][range_slice] 764 | image_index_range = ( 765 | obs_to_image_index[0], 766 | # add n_cameras to include last images as well 767 | obs_to_image_index[-1] + n_cameras, 768 | ) 769 | # load images 770 | unique_images = self.get_image_data( 771 | rng=image_index_range, zarr_path=zarr_path, n_threads=n_threads 772 | ) 773 | else: 774 | obs_to_image_index = root[ 775 | "obs_to_image_index" 776 | ].get_coordinate_selection(indices) 777 | # load images from all cameras, not only first one 778 | all_cam_indices = np.zeros( 779 | obs_to_image_index.shape[0] * n_cameras, dtype=np.int64 780 | ) 781 | for i in range(n_cameras): 782 | all_cam_indices[i::n_cameras] = obs_to_image_index + i 783 | # remove duplicates and sort 784 | image_indices, unique_to_original = np.unique( 785 | all_cam_indices, return_inverse=True 786 | ) 787 | # load images 
788 | unique_images = self.get_image_data( 789 | indices=image_indices, zarr_path=zarr_path, n_threads=n_threads 790 | ) 791 | # repeat images to account for control frequency > camera frequency 792 | images = np.zeros( 793 | (n_control_timesteps,) + unique_images.shape[1:], dtype=np.uint8 794 | ) 795 | for i in range(n_control_timesteps): 796 | if indices is None: 797 | index = (obs_to_image_index[i] - obs_to_image_index[0]) // n_cameras 798 | else: 799 | # map from original image index to unique image index 800 | index = unique_to_original[i * n_cameras] // n_cameras 801 | images[i] = unique_images[index] 802 | data_dict["images"] = images 803 | 804 | store.close() 805 | 806 | return data_dict 807 | 808 | def get_dataset_chunk(self, chunk_id, zarr_path=None): 809 | raise NotImplementedError() 810 | 811 | def compute_reward( 812 | self, achieved_goal: dict, desired_goal: dict, info: dict 813 | ) -> float: 814 | """Compute the reward for the given achieved and desired goal. 815 | 816 | Args: 817 | achieved_goal: Current pose of the object. 818 | desired_goal: Goal pose of the object. 819 | info: An info dictionary containing a field "time_index" which 820 | contains the time index of the achieved_goal. 821 | 822 | Returns: 823 | The reward that corresponds to the provided achieved goal w.r.t. to 824 | the desired goal. 825 | """ 826 | return self.sim_env.compute_reward(achieved_goal, desired_goal, info) 827 | 828 | def step( 829 | self, action: np.ndarray, **kwargs 830 | ) -> Tuple[Union[Dict, np.ndarray], float, bool, bool, Dict]: 831 | """Execute one step. 832 | 833 | Args: 834 | action: Array of 9 torque commands, one for each robot joint. 835 | 836 | Returns: 837 | A tuple with 838 | 839 | - observation (dict or tuple): agent's observation of the current 840 | environment. If `self.flatten_obs` is False then as a dictionary. 841 | If `self.flatten_obs` is True then either as a 1D NumPy array 842 | (if no images are to be included) or as a tuple (if images are 843 | to be included) consisting of 844 | 845 | * a 1D NumPy array containing all observations except the 846 | camera images, and 847 | * a NumPy array of shape (n_cameras, n_channels, height, width) 848 | containing the camera images. 849 | 850 | - reward (float): amount of reward returned after previous action. 851 | - terminated (bool): whether the MDP has reached a terminal state. If true, 852 | the user needs to call `reset()`. 853 | - truncated (bool): Whether the truncation condition outside the scope 854 | of the MDP is satisfied. For this environment this corresponds to a 855 | timeout. If true, the user needs to call `reset()`. 856 | - info (dict): info dictionary containing the current time index. 857 | """ 858 | if self.real_robot: 859 | raise NotImplementedError( 860 | "The step method is not available for real-robot data." 861 | ) 862 | obs, rew, terminated, truncated, info = self.sim_env.step(action, **kwargs) 863 | # process obs 864 | processed_obs = self._process_obs(obs) 865 | return processed_obs, rew, terminated, truncated, info 866 | 867 | def reset( 868 | self, seed: Optional[int] = None, options: Optional[Dict[str, Any]] = None 869 | ) -> Tuple[Union[Dict, np.ndarray], Dict]: 870 | """Reset the environment. 871 | 872 | Returns: 873 | Tuple of observation and info dictionary. 874 | """ 875 | if self.real_robot: 876 | raise NotImplementedError( 877 | "The reset method is not available for real-robot data." 
878 | ) 879 | if seed is not None: 880 | self.sim_env.seed(seed) 881 | obs, info = self.sim_env.reset() 882 | # process obs 883 | processed_obs = self._process_obs(obs) 884 | return processed_obs, info 885 | 886 | def seed(self, seed: Optional[int] = None) -> List[int]: 887 | """Set random seed of the environment.""" 888 | return self.sim_env.seed(seed) 889 | 890 | def render(self, mode: str = "human"): 891 | """Does not do anything for this environment.""" 892 | if self.real_robot: 893 | raise NotImplementedError( 894 | "The render method is not available for real-robot data." 895 | ) 896 | self.sim_env.render(mode) 897 | 898 | def reset_fingers(self, reset_wait_time: int = 3000): 899 | """Moves the fingers to initial position. 900 | 901 | This resets neither the frontend nor the cube. This method is supposed to be 902 | used for 'soft resets' between episodes in one job. 903 | """ 904 | 905 | if self.real_robot: 906 | raise NotImplementedError( 907 | "The reset_fingers method is not available for real-robot data." 908 | ) 909 | obs, info = self.sim_env.reset_fingers(reset_wait_time) 910 | processed_obs = self._process_obs(obs) 911 | return processed_obs, info 912 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/evaluate_sim.py: -------------------------------------------------------------------------------- 1 | """Evaluate a policy in simulation.""" 2 | import argparse 3 | import importlib 4 | import json 5 | import logging 6 | import pathlib 7 | import sys 8 | import typing 9 | 10 | import gymnasium as gym 11 | 12 | from trifinger_rl_datasets import Evaluation, PolicyBase, TriFingerDatasetEnv 13 | 14 | 15 | def load_policy_class(policy_class_str: str) -> typing.Type[PolicyBase]: 16 | """Import the given policy class 17 | 18 | Args: 19 | The name of the policy class in the format "package.module.Class". 20 | 21 | Returns: 22 | The specified policy class. 23 | 24 | Raises: 25 | RuntimeError: If importing of the class fails. 26 | """ 27 | try: 28 | module_name, class_name = policy_class_str.rsplit(".", 1) 29 | logging.info("import %s from %s" % (class_name, module_name)) 30 | module = importlib.import_module(module_name) 31 | Policy = getattr(module, class_name) 32 | except Exception: 33 | raise RuntimeError( 34 | "Failed to import policy %s from module %s" % (class_name, module_name) 35 | ) 36 | 37 | return Policy 38 | 39 | 40 | def main(): 41 | logging.basicConfig(level=logging.INFO) 42 | 43 | parser = argparse.ArgumentParser(description=__doc__) 44 | parser.add_argument( 45 | "task", 46 | type=str, 47 | choices=["push", "lift"], 48 | help="Which task to evaluate ('push' or 'lift').", 49 | ) 50 | parser.add_argument( 51 | "policy_class", 52 | type=str, 53 | help="Name of the policy class (something like 'package.module.Class').", 54 | ) 55 | parser.add_argument( 56 | "--visualization", 57 | "-v", 58 | action="store_true", 59 | help="Enable visualization of environment.", 60 | ) 61 | parser.add_argument( 62 | "--n-episodes", 63 | type=int, 64 | default=64, 65 | help="Number of episodes to run. 
Default: %(default)s", 66 | ) 67 | parser.add_argument( 68 | "--output", 69 | type=pathlib.Path, 70 | metavar="FILENAME", 71 | help="Save results to a JSON file.", 72 | ) 73 | args = parser.parse_args() 74 | 75 | if args.task == "push": 76 | env_name = "trifinger-cube-push-sim-expert-v0" 77 | elif args.task == "lift": 78 | env_name = "trifinger-cube-lift-sim-expert-v0" 79 | else: 80 | print("Invalid task %s" % args.task) 81 | return 1 82 | 83 | Policy = load_policy_class(args.policy_class) 84 | 85 | policy_config = Policy.get_policy_config() 86 | 87 | if policy_config.flatten_obs: 88 | print("Using flattened observations") 89 | else: 90 | print("Using structured observations") 91 | 92 | env = typing.cast( 93 | TriFingerDatasetEnv, 94 | gym.make( 95 | env_name, 96 | disable_env_checker=True, 97 | visualization=args.visualization, 98 | flatten_obs=policy_config.flatten_obs, 99 | image_obs=policy_config.image_obs, 100 | ), 101 | ) 102 | 103 | policy = Policy(env.action_space, env.observation_space, env.sim_env.episode_length) 104 | 105 | evaluation = Evaluation(env) 106 | eval_res = evaluation.evaluate(policy=policy, n_episodes=args.n_episodes) 107 | json_result = json.dumps(eval_res, indent=4) 108 | 109 | print("Evaluation result: ") 110 | print(json_result) 111 | 112 | if args.output: 113 | args.output.write_text(json_result) 114 | 115 | return 0 116 | 117 | 118 | if __name__ == "__main__": 119 | sys.exit(main()) 120 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/evaluation.py: -------------------------------------------------------------------------------- 1 | from time import time 2 | import typing 3 | 4 | import numpy as np 5 | 6 | from .policy_base import PolicyBase 7 | 8 | 9 | class Evaluation: 10 | 11 | _reset_time = 500 12 | 13 | def __init__(self, env, time_policy=False): 14 | self.env = env 15 | self.time_policy = time_policy 16 | 17 | def run_episode( 18 | self, initial_obs: dict, initial_info: dict, policy: PolicyBase 19 | ) -> typing.Dict[str, typing.Union[int, float]]: 20 | """Run one episode/do one rollout.""" 21 | 22 | obs = initial_obs 23 | info = initial_info 24 | n_steps = 0 25 | momentary_successes = 0 26 | ep_return = 0.0 27 | max_reward = 0.0 28 | transient_success = False 29 | 30 | policy.reset() 31 | 32 | while True: 33 | if self.time_policy: 34 | time1 = time() 35 | action = policy.get_action(obs) 36 | if self.time_policy: 37 | print("policy execution time: ", time() - time1) 38 | obs, rew, _, truncated, info = self.env.step(action) 39 | ep_return += rew 40 | max_reward = max(max_reward, rew) 41 | if info["has_achieved"]: 42 | transient_success = True 43 | momentary_successes += 1 44 | self.env.render() 45 | n_steps += 1 46 | if truncated: 47 | if info["has_achieved"]: 48 | print("Success: Goal achieved at end of episode.") 49 | else: 50 | print("Goal not reached at the end of the episode.") 51 | break 52 | 53 | ep_stats = { 54 | "success_rate": int(info["has_achieved"]), 55 | "mean_momentary_success": momentary_successes / n_steps, 56 | "transient_success_rate": int(transient_success), 57 | "return": ep_return, 58 | "max_reward": max_reward, 59 | } 60 | return ep_stats 61 | 62 | def evaluate(self, policy, n_episodes): 63 | """Evaluate policy in given environment.""" 64 | 65 | difficulty = self.env.sim_env.difficulty 66 | episode_batch_size = 8 if difficulty == 1 else 6 67 | ep_stats_list = [] 68 | for i in range(n_episodes): 69 | print("Start episode {}".format(i)) 70 | # reset episode periodically to 
simulate start of a new robot job 71 | if i % episode_batch_size == 0: 72 | initial_obs, initial_info = self.env.reset() 73 | # run episode 74 | ep_stats = self.run_episode(initial_obs, initial_info, policy) 75 | ep_stats_list.append(ep_stats) 76 | # move fingers to initial position and wait until cube has settled down 77 | self.env.reset_fingers(self._reset_time) 78 | if i < n_episodes - 1: 79 | # retrieve cube from barrier and center it approximately 80 | self.env.sim_env.reset_cube() 81 | # Sample new goal 82 | self.env.sim_env.sample_new_goal() 83 | # move fingers to initial position and wait until cube has settled down 84 | initial_obs, initial_info = self.env.reset_fingers(self._reset_time) 85 | 86 | overall_stats = {"n_episodes": n_episodes} 87 | for k in ep_stats_list[0]: 88 | overall_stats[k] = np.mean([ep_stats[k] for ep_stats in ep_stats_list]) 89 | 90 | return overall_stats 91 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/policy_base.py: -------------------------------------------------------------------------------- 1 | import typing 2 | from abc import ABC, abstractmethod 3 | from dataclasses import dataclass 4 | 5 | import gymnasium as gym 6 | import numpy as np 7 | 8 | ObservationType = typing.Union[np.ndarray, typing.Dict[str, typing.Any]] 9 | 10 | 11 | @dataclass 12 | class PolicyConfig: 13 | """Policy configuration specifying what kind of observations the policy expects. 14 | 15 | Args: 16 | flatten_obs: If True, the policy expects observations as flattened arrays. 17 | Otherwise, it expects them as dictionaries. 18 | image_obs: If True, the policy expects the observations to contain camera 19 | images. Otherwise, images are not included. If images_obs is True and 20 | flatten_obs is True, the observation is a tuple containing the flattened 21 | observation excluding the images and the images in a numpy array. If 22 | flatten_obs is False, the images are included in the observation 23 | dictionary. 24 | """ 25 | 26 | flatten_obs: bool = True 27 | image_obs: bool = False 28 | 29 | 30 | class PolicyBase(ABC): 31 | """Base class defining interface for policies.""" 32 | 33 | def __init__( 34 | self, action_space: gym.Space, observation_space: gym.Space, episode_length: int 35 | ): 36 | """ 37 | Args: 38 | action_space: Action space of the environment. 39 | observation_space: Observation space of the environment. 40 | episode_length: Number of steps in one episode. 41 | """ 42 | pass 43 | 44 | @staticmethod 45 | def get_policy_config() -> PolicyConfig: 46 | """Returns the policy configuration. 47 | 48 | This specifies what kind of observations the policy expects. 49 | """ 50 | return PolicyConfig() 51 | 52 | def reset(self) -> None: 53 | """Will be called at the beginning of each episode.""" 54 | pass 55 | 56 | @abstractmethod 57 | def get_action(self, observation: ObservationType) -> np.ndarray: 58 | """Returns action that is executed on the robot. 59 | 60 | Args: 61 | observation: Observation of the current time step. 62 | 63 | Returns: 64 | Action that is sent to the robot. 
65 | """ 66 | pass 67 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/py.typed: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/rr-learning/trifinger_rl_datasets/f23729f90634f570d7c3dfc2fba5d8c0e838cf44/trifinger_rl_datasets/py.typed -------------------------------------------------------------------------------- /trifinger_rl_datasets/sampling_utils.py: -------------------------------------------------------------------------------- 1 | """Utils for sampling of cube pose.""" 2 | 3 | import numpy as np 4 | from scipy.spatial.transform import Rotation 5 | 6 | from trifinger_simulation.tasks.move_cube import ( 7 | _CUBE_WIDTH, 8 | _ARENA_RADIUS, 9 | _base_orientations, 10 | Pose, 11 | ) 12 | 13 | 14 | def random_yaw_orientation(): 15 | # first "roll the die" to see which face is pointing upward 16 | up_face = np.random.choice(range(len(_base_orientations))) 17 | up_face_rot = _base_orientations[up_face] 18 | # then draw a random yaw rotation 19 | yaw_angle = np.random.uniform(0, 2 * np.pi) 20 | yaw_rot = Rotation.from_euler("z", yaw_angle) 21 | # and combine them 22 | orientation = yaw_rot * up_face_rot 23 | return yaw_angle, orientation.as_quat() 24 | 25 | 26 | def random_xy(cube_yaw): 27 | """Sample an xy position for cube which maximally covers arena. 28 | 29 | In particular, the cube can touch the barrier for all yaw anels.""" 30 | 31 | theta = np.random.uniform(0, 2 * np.pi) 32 | 33 | # Minimum distance of cube center from arena boundary 34 | min_dist = ( 35 | _CUBE_WIDTH 36 | / np.sqrt(2) 37 | * max( 38 | abs(np.sin(0.25 * np.pi + cube_yaw - theta)), 39 | abs(np.cos(0.25 * np.pi + cube_yaw - theta)), 40 | ) 41 | ) 42 | 43 | # sample uniform position in circle 44 | # (https://stackoverflow.com/a/50746409) 45 | radius = (_ARENA_RADIUS - min_dist) * np.sqrt(np.random.random()) 46 | 47 | # x,y-position of the cube 48 | x = radius * np.cos(theta) 49 | y = radius * np.sin(theta) 50 | 51 | return x, y 52 | 53 | 54 | def sample_initial_cube_pose(): 55 | yaw_angle, orientation = random_yaw_orientation() 56 | x, y = random_xy(yaw_angle) 57 | z = _CUBE_WIDTH / 2 58 | goal = Pose() 59 | goal.position = np.array((x, y, z)) 60 | goal.orientation = orientation 61 | return goal 62 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/sim_env.py: -------------------------------------------------------------------------------- 1 | from pathlib import Path 2 | from time import sleep, time 3 | from typing import Tuple, Dict, Any, Optional 4 | import logging 5 | 6 | import cv2 7 | import gymnasium as gym 8 | import numpy as np 9 | from pybullet import ER_TINY_RENDERER 10 | 11 | import trifinger_simulation 12 | import trifinger_simulation.visual_objects 13 | from trifinger_simulation import trifingerpro_limits 14 | import trifinger_simulation.tasks.move_cube as task 15 | 16 | from .sampling_utils import sample_initial_cube_pose 17 | from .utils import to_quat, get_keypoints_from_pose 18 | 19 | 20 | class CameraWrapper: 21 | """Simple wrapper around camera array to change default renderer.""" 22 | 23 | def __init__(self, camera): 24 | self.camera = camera 25 | 26 | def get_images(self, renderer=ER_TINY_RENDERER): 27 | return self.camera.get_images(renderer) 28 | 29 | 30 | class SimTriFingerCubeEnv(gym.Env): 31 | """ 32 | Gym environment for simulated manipulation of a cube with a TriFingerPro platform. 
33 | """ 34 | 35 | _initial_finger_position = [0.0, 0.9, -2.0] * 3 36 | _max_fingertip_vel = 5.0 37 | # parameters of reward function 38 | _kernel_reward_weight = 4.0 39 | _logkern_scale = 30 40 | _logkern_offset = 2 41 | # for how long to play the resetting trajectory 42 | _reset_trajectory_length = 18700 43 | # how many robot steps per environment step 44 | _step_size = 20 45 | 46 | def __init__( 47 | self, 48 | episode_length: int = 15, 49 | difficulty: int = 4, 50 | keypoint_obs: bool = True, 51 | obs_action_delay: int = 0, 52 | reward_type: str = "dense", 53 | visualization: bool = False, 54 | real_time: bool = True, 55 | image_obs: bool = False, 56 | camera_config_robot: int = 1, 57 | ): 58 | """ 59 | Args: 60 | episode_length (int): How often step will run before done is True. 61 | keypoint_obs (bool): Whether to give keypoint observations for 62 | pose in addition to position and quaternion. 63 | obs_action_delay (int): Delay between arrival of an observation 64 | and application of the action computed from this 65 | observation in milliseconds. 66 | reward_type (str): Which reward to use. Can be 'dense' or 'sparse'. 67 | visualization (bool): If true, the PyBullet GUI is run for visualization. 68 | real_time (bool): If true, the environment is stepped in real 69 | time instead of as fast as possible (ignored if visualization is 70 | disabled). 71 | image_obs (bool): If true, the camera images are returned as part 72 | of the observation. 73 | camera_config_robot (int): ID of the robot to retrieve camera 74 | configs from. Only used if image_obs is True. 75 | """ 76 | # Basic initialization 77 | # ==================== 78 | 79 | self.logger = logging.getLogger("trifinger_rl_datasets.SimTriFingerCubeEnv") 80 | 81 | assert ( 82 | obs_action_delay < self._step_size 83 | ), "Delay between retrieval of observation and sending of next \ 84 | action has to be smaller than step size (20 ms)." 
85 | 86 | # will be initialized in reset() 87 | self.platform: Optional[trifinger_simulation.TriFingerPlatform] = None 88 | 89 | self.episode_length = episode_length 90 | self.difficulty = difficulty 91 | self.keypoint_obs = keypoint_obs 92 | self.n_keypoints = 8 93 | self.obs_action_delay = obs_action_delay 94 | self.reward_type = reward_type 95 | self.visualization = visualization 96 | self.real_time = real_time 97 | self.image_obs = image_obs 98 | self.camera_config_robot = camera_config_robot 99 | 100 | # load trajectory that is played back for resetting the cube 101 | trajectory_file_path = ( 102 | Path(__file__).resolve().parent 103 | / "data" 104 | / "trifingerpro_shuffle_cube_trajectory_fast.npy" 105 | ) 106 | with open(trajectory_file_path, "rb") as f: 107 | self._cube_reset_traj = np.load(f) 108 | 109 | # simulated robot has robot ID 0 110 | self.robot_id = 0 111 | 112 | if image_obs: 113 | # create camera object 114 | camera_config_dir = Path(__file__).resolve().parent / "data" 115 | calib_filename_pattern = f"r{self.camera_config_robot}_" + "camera{id}.yml" 116 | self.camera = trifinger_simulation.camera.create_trifinger_camera_array_from_config( 117 | camera_config_dir, calib_filename_pattern=calib_filename_pattern 118 | ) 119 | else: 120 | self.camera = None 121 | 122 | # Create the action and observation spaces 123 | # ======================================== 124 | 125 | robot_torque_space = gym.spaces.Box( 126 | low=trifingerpro_limits.robot_torque.low, 127 | high=trifingerpro_limits.robot_torque.high, 128 | ) 129 | robot_position_space = gym.spaces.Box( 130 | low=trifingerpro_limits.robot_position.low, 131 | high=trifingerpro_limits.robot_position.high, 132 | dtype=np.float32, 133 | ) 134 | robot_velocity_space = gym.spaces.Box( 135 | low=trifingerpro_limits.robot_velocity.low, 136 | high=trifingerpro_limits.robot_velocity.high, 137 | dtype=np.float32, 138 | ) 139 | robot_fingertip_force_space = gym.spaces.Box( 140 | low=np.zeros(trifingerpro_limits.n_fingers), 141 | high=np.ones(trifingerpro_limits.n_fingers), 142 | dtype=np.float32, 143 | ) 144 | robot_fingertip_pos_space = gym.spaces.Box( 145 | low=np.array([[-0.6, -0.6, 0.0]] * trifingerpro_limits.n_fingers), 146 | high=np.array([[0.6, 0.6, 0.6]] * trifingerpro_limits.n_fingers), 147 | dtype=np.float32, 148 | ) 149 | robot_fingertip_vel_space = gym.spaces.Box( 150 | low=np.array( 151 | [[-self._max_fingertip_vel] * 3] * trifingerpro_limits.n_fingers 152 | ), 153 | high=np.array( 154 | [[self._max_fingertip_vel] * 3] * trifingerpro_limits.n_fingers 155 | ), 156 | dtype=np.float32, 157 | ) 158 | robot_id_space = gym.spaces.Box(low=0, high=20, shape=(1,), dtype=np.int_) 159 | 160 | # camera observation space 161 | camera_obs_space_dict: Dict[str, gym.Space] = { 162 | "object_position": gym.spaces.Box( 163 | low=trifingerpro_limits.object_position.low, 164 | high=trifingerpro_limits.object_position.high, 165 | dtype=np.float32, 166 | ), 167 | "object_orientation": gym.spaces.Box( 168 | low=trifingerpro_limits.object_orientation.low, 169 | high=trifingerpro_limits.object_orientation.high, 170 | dtype=np.float32, 171 | ), 172 | "delay": gym.spaces.Box(low=0.0, high=0.30, shape=(1,), dtype=np.float32), 173 | "confidence": gym.spaces.Box( 174 | low=0.0, high=1.0, shape=(1,), dtype=np.float32 175 | ), 176 | } 177 | if self.keypoint_obs: 178 | camera_obs_space_dict["object_keypoints"] = gym.spaces.Box( 179 | low=np.array([[-0.6, -0.6, 0.0]] * self.n_keypoints), 180 | high=np.array([[0.6, 0.6, 0.3]] * self.n_keypoints), 181 | 
dtype=np.float32, 182 | ) 183 | if self.image_obs: 184 | n_cameras = len(self.camera.cameras) 185 | images_space = gym.spaces.Box( 186 | low=0, 187 | high=255, 188 | shape=( 189 | n_cameras, 190 | 3, 191 | self.camera.cameras[0]._output_height, 192 | self.camera.cameras[0]._output_width, 193 | ), 194 | dtype=np.uint8, 195 | ) 196 | camera_obs_space_dict["images"] = images_space 197 | camera_obs_space = gym.spaces.Dict(camera_obs_space_dict) 198 | 199 | # goal space 200 | if self.difficulty == 4: 201 | if self.keypoint_obs: 202 | goal_space = gym.spaces.Dict( 203 | {"object_keypoints": camera_obs_space["object_keypoints"]} 204 | ) 205 | else: 206 | goal_space = gym.spaces.Dict( 207 | { 208 | k: camera_obs_space[k] 209 | for k in ["object_position", "object_orientation"] 210 | } 211 | ) 212 | else: 213 | goal_space = gym.spaces.Dict( 214 | {"object_position": camera_obs_space["object_position"]} 215 | ) 216 | 217 | # action space 218 | self.action_space = robot_torque_space 219 | self._initial_action = trifingerpro_limits.robot_torque.default 220 | 221 | # NOTE: The order of dictionary items matters as it determines how 222 | # the observations are flattened/unflattened. The observation space 223 | # is therefore sorted by key. 224 | 225 | def sort_by_key(d): 226 | return { 227 | k: ( 228 | gym.spaces.Dict(sort_by_key(v.spaces)) 229 | if isinstance(v, gym.spaces.Dict) 230 | else v 231 | ) 232 | for k, v in sorted(d.items(), key=lambda item: item[0]) 233 | } 234 | 235 | # complete observation space 236 | self.observation_space = gym.spaces.Dict( 237 | sort_by_key( 238 | { 239 | "robot_observation": gym.spaces.Dict( 240 | { 241 | "position": robot_position_space, 242 | "velocity": robot_velocity_space, 243 | "torque": robot_torque_space, 244 | "fingertip_force": robot_fingertip_force_space, 245 | "fingertip_position": robot_fingertip_pos_space, 246 | "fingertip_velocity": robot_fingertip_vel_space, 247 | "robot_id": robot_id_space, 248 | } 249 | ), 250 | "camera_observation": camera_obs_space, 251 | "action": self.action_space, 252 | "desired_goal": goal_space, 253 | "achieved_goal": goal_space, 254 | } 255 | ) 256 | ) 257 | 258 | self._old_camera_obs: Optional[Dict[str, Any]] = None 259 | self.t_obs: int = 0 260 | 261 | # Count consecutive steps where timing is violated (to decide when to show a 262 | # warning) 263 | self._timing_violation_counter = 0 264 | 265 | def _kernel_reward( 266 | self, achieved_goal: np.ndarray, desired_goal: np.ndarray 267 | ) -> float: 268 | """Compute reward by evaluating a logistic kernel on the pairwise distance of 269 | points. 270 | 271 | Parameters can be either a 1 dim. array of size 3 (positions) or a two dim. 272 | array with last dim. of size 3 (keypoints) 273 | 274 | Args: 275 | achieved_goal: Position or keypoints of current pose of the object. 276 | desired_goal: Position or keypoints of goal pose of the object.
277 | """ 278 | 279 | diff = achieved_goal - desired_goal 280 | dist = np.linalg.norm(diff, axis=-1) 281 | scaled = self._logkern_scale * dist 282 | # Use logistic kernel 283 | rew = self._kernel_reward_weight * np.mean( 284 | 1.0 / (np.exp(scaled) + self._logkern_offset + np.exp(-scaled)) 285 | ) 286 | return rew 287 | 288 | def _append_desired_action(self, robot_action): 289 | """Append desired action to queue and wait if real time is enabled.""" 290 | 291 | t = self.platform.append_desired_action(robot_action) 292 | if self.visualization and self.real_time: 293 | sleep(max(0.001 - (time() - self.time_of_last_step), 0.0)) 294 | self.time_of_last_step = time() 295 | return t 296 | 297 | def compute_reward( 298 | self, achieved_goal: dict, desired_goal: dict, info: dict 299 | ) -> float: 300 | """Compute the reward for the given achieved and desired goal. 301 | 302 | Args: 303 | achieved_goal: Current pose of the object. 304 | desired_goal: Goal pose of the object. 305 | info: An info dictionary containing a field "time_index" which 306 | contains the time index of the achieved_goal. 307 | 308 | Returns: 309 | The reward that corresponds to the provided achieved goal w.r.t. to 310 | the desired goal. 311 | """ 312 | 313 | if self.reward_type == "dense": 314 | if self.difficulty == 4: 315 | # Use full keypoints if available as only difficulty 4 considers 316 | # orientation 317 | return self._kernel_reward( 318 | achieved_goal["object_keypoints"], desired_goal["object_keypoints"] 319 | ) 320 | else: 321 | # use position for all other difficulties 322 | return self._kernel_reward( 323 | achieved_goal["object_position"], desired_goal["object_position"] 324 | ) 325 | elif self.reward_type == "sparse": 326 | return self.has_achieved(achieved_goal, desired_goal) 327 | else: 328 | raise NotImplementedError( 329 | f"Reward type {self.reward_type} is not supported" 330 | ) 331 | 332 | def has_achieved(self, achieved_goal: dict, desired_goal: dict) -> bool: 333 | """Determine whether goal pose is achieved.""" 334 | POSITION_THRESHOLD = 0.02 335 | ANGLE_THRESHOLD_DEG = 22.0 336 | 337 | desired = desired_goal 338 | achieved = achieved_goal 339 | position_diff = np.linalg.norm( 340 | desired["object_position"] - achieved["object_position"] 341 | ) 342 | # cast from np.bool_ to bool to make mypy happy 343 | position_check = bool(position_diff < POSITION_THRESHOLD) 344 | 345 | if self.difficulty < 4: 346 | return position_check 347 | else: 348 | a = to_quat(desired["object_orientation"]) 349 | b = to_quat(achieved["object_orientation"]) 350 | b_conj = b.conjugate() 351 | quat_prod = a * b_conj 352 | norm = np.linalg.norm([quat_prod.x, quat_prod.y, quat_prod.z]) 353 | norm = min(norm, 1.0) # type: ignore 354 | angle = 2.0 * np.arcsin(norm) 355 | orientation_check = angle < 2.0 * np.pi * ANGLE_THRESHOLD_DEG / 360.0 356 | 357 | return position_check and orientation_check 358 | 359 | def _check_action(self, action): 360 | low_check = self.action_space.low <= action 361 | high_check = self.action_space.high >= action 362 | return np.all(np.logical_and(low_check, high_check)) 363 | 364 | def step( 365 | self, action: np.ndarray, preappend_actions: bool = True 366 | ) -> Tuple[dict, float, bool, bool, dict]: 367 | """Run one timestep of the environment's dynamics. 368 | 369 | When end of episode is reached, you are responsible for calling 370 | ``reset()`` to reset this environment's state. 
371 | 372 | Args: 373 | action: An action provided by the agent 374 | preappend_actions (bool): Whether to already append actions that 375 | will be executed during obs-action delay to action queue. 376 | 377 | Returns: 378 | tuple: 379 | 380 | - observation (dict): agent's observation of the current environment. 381 | - reward (float): amount of reward returned after previous action. 382 | - terminated (bool): whether the MDP has reached a terminal state. If true, 383 | the user needs to call `reset()`. 384 | - truncated (bool): Whether the truncation condition outside the scope 385 | of the MDP is satisfied. For this environment this corresponds to a 386 | timeout. If true, the user needs to call `reset()`. 387 | - info (dict): info dictionary containing the current time index. 388 | """ 389 | if self.platform is None: 390 | raise RuntimeError("Call `reset()` before starting to step.") 391 | 392 | if not self._check_action(action): 393 | raise ValueError("Given action is not contained in the action space.") 394 | 395 | self.step_count += 1 396 | 397 | # get robot action 398 | robot_action = self._gym_action_to_robot_action(action) 399 | 400 | # check timing and show a warning/error if delayed 401 | # do not check in first iteration as no time index is available yet (would lead 402 | # to dead-lock) 403 | if self.t_obs > 0: 404 | t_now = self.platform.get_current_timeindex() 405 | t_expected = self.t_obs + self.obs_action_delay 406 | if t_now > t_expected: 407 | self._timing_violation_counter += 1 408 | extreme = t_now > self.t_obs + self._step_size 409 | 410 | if extreme or self._timing_violation_counter >= 3: 411 | delay = t_now - t_expected 412 | self.logger.warning( 413 | f"Control loop got delayed by {delay} ms." 414 | " The action will be applied for a shorter time to catch up." 415 | " Please check if your policy is fast enough (max. computation" 416 | f" time should be <{1 + self.obs_action_delay} ms)." 417 | ) 418 | 419 | if extreme: 420 | self.logger.error( 421 | "ERROR: Control loop got delayed by more than a full step." 422 | " Timing of the episode will be significantly affected!" 423 | ) 424 | else: 425 | self._timing_violation_counter = 0 426 | 427 | # send new action to robot until new observation is to be provided 428 | # Note that by initially setting t the way it is, it is ensured that the loop 429 | # always runs at least one iteration, even if the actual time step is already 430 | # ahead by more than one step size. 431 | t = self.t_obs + self.obs_action_delay 432 | while t < self.t_obs + self._step_size: 433 | t = self._append_desired_action(robot_action) 434 | # time of the new observation 435 | self.t_obs = t 436 | 437 | observation, info = self._create_observation(self.t_obs, action) 438 | reward = self.compute_reward( 439 | observation["achieved_goal"], observation["desired_goal"], info 440 | ) 441 | truncated = self.step_count >= self.episode_length 442 | 443 | if not truncated and preappend_actions: 444 | t_now = self.platform.get_current_timeindex() 445 | # Append action to action queue of robot for as many time 446 | # steps as the obs_action_delay dictates. This gives the 447 | # user time to evaluate the policy. 448 | # Also take time into account that might have already passed 449 | # while the observation was processed. 
450 | for _ in range(max(self.obs_action_delay - (t_now - self.t_obs), 0)): 451 | self._append_desired_action(robot_action) 452 | 453 | return observation, reward, False, truncated, info 454 | 455 | def reset( # type: ignore 456 | self, preappend_actions: bool = True 457 | ): 458 | """Reset the environment.""" 459 | 460 | super().reset() 461 | 462 | # hard-reset simulation 463 | del self.platform 464 | 465 | # initialize simulation 466 | initial_robot_position = trifingerpro_limits.robot_position.default 467 | initial_object_pose = sample_initial_cube_pose() 468 | initial_object_pose.position[2] += 0.0005 # avoid negative z of keypoint 469 | self.platform = trifinger_simulation.TriFingerPlatform( 470 | visualization=self.visualization, 471 | initial_robot_position=initial_robot_position, 472 | initial_object_pose=initial_object_pose, 473 | enable_cameras=self.image_obs, 474 | ) 475 | if self.image_obs: 476 | # overwrite camera with wrapped version which uses software rendering 477 | self.platform.tricamera = CameraWrapper(self.camera) 478 | first_camera_obs = self.platform._get_current_camera_observation(0) 479 | self.platform._delayed_camera_observation = first_camera_obs 480 | self.platform._camera_observation_t = first_camera_obs 481 | # sample goal 482 | self.active_goal = task.sample_goal(difficulty=self.difficulty) 483 | # visualize the goal (but not if image observations are used) 484 | if self.visualization and not self.image_obs: 485 | if hasattr(self, "goal_marker"): 486 | del self.goal_marker 487 | self.goal_marker = trifinger_simulation.visual_objects.CubeMarker( 488 | width=task._CUBE_WIDTH, 489 | position=self.active_goal.position, 490 | orientation=self.active_goal.orientation, 491 | pybullet_client_id=self.platform.simfinger._pybullet_client_id, 492 | ) 493 | self.step_count = 0 494 | self.time_of_last_step = time() 495 | # need to already do one step to get initial observation 496 | self.t_obs = 0 497 | obs, _, _, _, info = self.step( 498 | self._initial_action, preappend_actions=preappend_actions 499 | ) 500 | info = {"time_index": -1} 501 | 502 | return obs, info 503 | 504 | def reset_fingers(self, reset_wait_time: int = 3000): 505 | """Reset fingers to initial position. 506 | 507 | This resets neither the frontend nor the cube. This method is 508 | supposed to be used for 'soft resets' between episodes in one 509 | job. 510 | """ 511 | assert self.platform is not None, "Environment is not initialised." 
512 | 513 | action = self.platform.Action(position=self._initial_finger_position) 514 | for _ in range(reset_wait_time): 515 | t = self._append_desired_action(action) 516 | self.t_obs = t 517 | # reset step_count even though this is not a full reset 518 | self.step_count = 0 519 | # block until reset wait time has passed and return observation 520 | obs, info = self._create_observation(t, self._initial_action) 521 | return obs, info 522 | 523 | def sample_new_goal(self, goal=None): 524 | """Sample a new desired goal.""" 525 | if goal is None: 526 | self.active_goal = task.sample_goal(difficulty=self.difficulty) 527 | else: 528 | self.active_goal.position = np.array(goal["position"], dtype=np.float32) 529 | self.active_goal.orientation = np.array( 530 | goal["orientation"], dtype=np.float32 531 | ) 532 | 533 | # update goal visualisation 534 | if self.visualization and not self.image_obs: 535 | self.goal_marker.set_state( 536 | self.active_goal.position, self.active_goal.orientation 537 | ) 538 | 539 | def _get_pose_delay(self, camera_observation, t): 540 | """Get delay between when the object pose was captured and now.""" 541 | 542 | return t / 1000.0 - camera_observation.cameras[0].timestamp 543 | 544 | def _clip_observation(self, obs): 545 | """Clip observation.""" 546 | 547 | def clip_recursively(o, space): 548 | for k, v in space.spaces.items(): 549 | if isinstance(v, gym.spaces.Box): 550 | np.clip(o[k], v.low, v.high, dtype=v.dtype, out=o[k]) 551 | else: 552 | clip_recursively(o[k], v) 553 | 554 | clip_recursively(obs, self.observation_space) 555 | 556 | def _create_observation(self, t: int, action: np.ndarray) -> Tuple[dict, dict]: 557 | assert self.platform is not None, "Environment is not initialised." 558 | 559 | robot_observation = self.platform.get_robot_observation(t) 560 | camera_observation = self.platform.get_camera_observation(t) 561 | object_observation = camera_observation.object_pose 562 | 563 | info: Dict[str, Any] = {"time_index": t} 564 | 565 | # camera observation 566 | camera_obs_processed = { 567 | "object_position": object_observation.position.astype(np.float32), 568 | "object_orientation": object_observation.orientation.astype(np.float32), 569 | # time elapsed since capturing of pose in seconds 570 | "delay": np.array( 571 | [self._get_pose_delay(camera_observation, t)], dtype=np.float32 572 | ), 573 | "confidence": np.array([object_observation.confidence], dtype=np.float32), 574 | } 575 | if self.image_obs: 576 | if len(camera_observation.cameras[0].image.shape) == 2: 577 | # images from real platform have to be debayered 578 | images = np.array([cv2.cvtColor(cam.image, cv2.COLOR_BAYER_BG2RGB) for cam in camera_observation.cameras]) 579 | else: 580 | # RGB camera images created with software renderer 581 | # (using openGL requires GUI to run) 582 | images = np.array([cam.image for cam in camera_observation.cameras]) 583 | # convert to channel first 584 | images = np.transpose(images, (0, 3, 1, 2)) 585 | camera_obs_processed["images"] = images 586 | if self.keypoint_obs: 587 | camera_obs_processed["object_keypoints"] = get_keypoints_from_pose( 588 | object_observation 589 | ) 590 | if self._old_camera_obs is not None: 591 | # handle quaternion flipping 592 | q_sum = ( 593 | self._old_camera_obs["object_orientation"] 594 | + camera_obs_processed["object_orientation"] 595 | ) 596 | if np.linalg.norm(q_sum) < 0.2: 597 | camera_obs_processed["object_orientation"] = -camera_obs_processed[ 598 | "object_orientation" 599 | ] 600 | self._old_camera_obs = 
camera_obs_processed 601 | 602 | # goal represented as position and orientation 603 | desired_goal_pos_ori = { 604 | "object_position": self.active_goal.position.astype(np.float32), 605 | "object_orientation": self.active_goal.orientation.astype(np.float32), 606 | } 607 | achieved_goal_pos_ori = { 608 | "object_position": camera_obs_processed["object_position"], 609 | "object_orientation": camera_obs_processed["object_orientation"], 610 | } 611 | # goal as shown to agent 612 | if self.difficulty == 4: 613 | if self.keypoint_obs: 614 | desired_goal = { 615 | "object_keypoints": get_keypoints_from_pose(self.active_goal) 616 | } 617 | achieved_goal = { 618 | "object_keypoints": camera_obs_processed["object_keypoints"] 619 | } 620 | else: 621 | desired_goal = desired_goal_pos_ori 622 | achieved_goal = achieved_goal_pos_ori 623 | else: 624 | desired_goal = { 625 | "object_position": self.active_goal.position.astype(np.float32) 626 | } 627 | achieved_goal = {"object_position": camera_obs_processed["object_position"]} 628 | 629 | # fingertip positions and velocities 630 | fingertip_position, fingertip_velocity = self.platform.forward_kinematics( 631 | robot_observation.position, robot_observation.velocity 632 | ) 633 | fingertip_position = np.array(fingertip_position, dtype=np.float32) 634 | fingertip_velocity = np.array(fingertip_velocity, dtype=np.float32) 635 | 636 | observation = { 637 | "robot_observation": { 638 | "position": robot_observation.position.astype(np.float32), 639 | "velocity": robot_observation.velocity.astype(np.float32), 640 | "torque": robot_observation.torque.astype(np.float32), 641 | "fingertip_force": robot_observation.tip_force.astype(np.float32), 642 | "fingertip_position": fingertip_position, 643 | "fingertip_velocity": fingertip_velocity, 644 | "robot_id": np.array([self.robot_id], dtype=np.int_), 645 | }, 646 | "camera_observation": camera_obs_processed, 647 | "action": action.astype(np.float32), 648 | "desired_goal": desired_goal, 649 | "achieved_goal": achieved_goal, 650 | } 651 | # clip observation 652 | self._clip_observation(observation) 653 | 654 | has_achieved = self.has_achieved(achieved_goal_pos_ori, desired_goal_pos_ori) 655 | info["has_achieved"] = has_achieved 656 | info["desired_goal"] = desired_goal_pos_ori 657 | 658 | return observation, info 659 | 660 | def _gym_action_to_robot_action(self, gym_action: np.ndarray): 661 | assert self.platform is not None, "Environment is not initialised." 662 | 663 | # robot action is torque 664 | robot_action = self.platform.Action(torque=gym_action) 665 | return robot_action 666 | 667 | def render(self, mode: str = "human"): 668 | """Does nothing. See :class:`SimTriFingerCubeEnv` for how to enable 669 | visualization.""" 670 | pass 671 | 672 | def _wait_until_timeindex(self, t: int): 673 | """Wait until the given time index is reached.""" 674 | # The simulation is stepped automatically so there is nothing to do here. 
675 | pass 676 | 677 | def reset_cube(self): 678 | """Replay a recorded trajectory to move cube to center of arena.""" 679 | 680 | for position in self._cube_reset_traj[: self._reset_trajectory_length : 2]: 681 | robot_action = self.platform.Action(position=position) 682 | t = self._append_desired_action(robot_action) 683 | self._wait_until_timeindex(t) # type: ignore 684 | -------------------------------------------------------------------------------- /trifinger_rl_datasets/utils.py: -------------------------------------------------------------------------------- 1 | """Utility methods for working with object poses and keypoints.""" 2 | 3 | import numpy as np 4 | import quaternion 5 | 6 | 7 | def to_quat(x): 8 | return np.quaternion(x[3], x[0], x[1], x[2]) 9 | 10 | 11 | def to_world_space(x_local, pose): 12 | """Transform point from local object coordinate system to world space. 13 | 14 | Args: 15 | x_local: Coordinates of point in local frame. 16 | pose: Object pose containing position and orientation. 17 | Returns: 18 | The coordinates in world space. 19 | """ 20 | q_rot = to_quat(pose.orientation) 21 | transl = pose.position 22 | q_local = np.quaternion(0.0, x_local[0], x_local[1], x_local[2]) 23 | q_global = q_rot * q_local * q_rot.conjugate() 24 | return transl + np.array([q_global.x, q_global.y, q_global.z]) 25 | 26 | 27 | def get_keypoints_from_pose(pose, num_keypoints=8, dimensions=(0.065, 0.065, 0.065)): 28 | """Calculate keypoints (coordinates of the corners of the cube) from pose. 29 | 30 | Args: 31 | pose: Object pose containing position and orientation of cube. 32 | num_keypoints: Number of keypoints to generate. 33 | dimensions: Dimensions of the cube. 34 | Returns: 35 | Array containing the keypoints. 36 | """ 37 | keypoints = [] 38 | for i in range(num_keypoints): 39 | # convert to binary representation 40 | str_kp = "{:03b}".format(i) 41 | # set components of keypoints according to digits in binary representation 42 | loc_kp = [ 43 | (1.0 if str_kp[i] == "0" else -1.0) * 0.5 * d 44 | for i, d in enumerate(dimensions) 45 | ][::-1] 46 | glob_kp = to_world_space(loc_kp, pose) 47 | keypoints.append(glob_kp) 48 | 49 | return np.array(keypoints, dtype=np.float32) 50 | 51 | 52 | def get_pose_from_keypoints(keypoints, dimensions=(0.065, 0.065, 0.065)): 53 | """Calculate pose (position, orientation) from keypoints. 54 | 55 | Args: 56 | keypoints: At least three keypoints representing the pose. 57 | dimensions: Dimensions of the cube. 58 | Returns: 59 | Tuple containing the coordinates of the cube center and a 60 | quaternion representing the orientation. 61 | """ 62 | center = np.mean(keypoints, axis=0) 63 | kp_centered = np.array(keypoints) - center 64 | kp_scaled = kp_centered / np.array(dimensions) * 2.0 65 | 66 | loc_kps = [] 67 | for i in range(3): 68 | # convert to binary representation 69 | str_kp = "{:03b}".format(i) 70 | # set components of keypoints according to digits in binary representation 71 | loc_kp = [(1.0 if str_kp[i] == "0" else -1.0) for i in range(3)][::-1] 72 | loc_kps.append(loc_kp) 73 | K_loc = np.transpose(np.array(loc_kps)) 74 | K_loc_inv = np.linalg.inv(K_loc) 75 | K_glob = np.transpose(kp_scaled[0:3]) 76 | R = np.matmul(K_glob, K_loc_inv) 77 | quat = quaternion.from_rotation_matrix(R) 78 | 79 | return center, np.array([quat.x, quat.y, quat.z, quat.w]) 80 | --------------------------------------------------------------------------------
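The two keypoint helpers in `utils.py` are approximate inverses of each other: `get_keypoints_from_pose` maps a cube pose to the coordinates of its eight corners, and `get_pose_from_keypoints` recovers position and orientation from such keypoints. The following round-trip is a minimal sketch of their use (the pose values are arbitrary; it assumes `trifinger_simulation` and the `numpy-quaternion` package, imported as `quaternion` in `utils.py`, are available):

```python
import numpy as np
from trifinger_simulation.tasks.move_cube import Pose

from trifinger_rl_datasets.utils import get_keypoints_from_pose, get_pose_from_keypoints

# arbitrary cube pose: slightly off-centre, resting on the ground, identity orientation
pose = Pose()
pose.position = np.array([0.05, -0.02, 0.0325])
pose.orientation = np.array([0.0, 0.0, 0.0, 1.0])  # quaternion in (x, y, z, w) order

keypoints = get_keypoints_from_pose(pose)  # (8, 3) array of corner coordinates
center, orientation = get_pose_from_keypoints(keypoints)  # recovered position and quaternion

# the recovered centre matches the original position up to float32 precision
assert np.allclose(center, pose.position, atol=1e-4)
print(keypoints.shape, center, orientation)
```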
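Policies to be evaluated with this package only need to implement the `PolicyBase` interface defined in `policy_base.py`. Below is a minimal sketch; the class name `ZeroTorquePolicy` and the choice of always returning zero torques are illustrative assumptions rather than part of the package:

```python
import numpy as np

from trifinger_rl_datasets.policy_base import PolicyBase, PolicyConfig


class ZeroTorquePolicy(PolicyBase):
    """Illustrative policy that always commands zero torques."""

    def __init__(self, action_space, observation_space, episode_length):
        self.action_space = action_space

    @staticmethod
    def get_policy_config() -> PolicyConfig:
        # expects flat observations and no camera images
        return PolicyConfig(flatten_obs=True, image_obs=False)

    def get_action(self, observation) -> np.ndarray:
        # a zero-torque action (assumed to lie within the action space limits)
        return np.zeros(self.action_space.shape, dtype=np.float32)
```

An instance of such a class can then be handed to the simulated evaluation (`evaluate_sim.py`), which repeatedly queries `get_action` during each episode.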