├── .gitignore
├── LICENSE
├── README.md
├── assets
│   ├── base_link.STL
│   ├── finger_limb_link.STL
│   ├── finger_tip_link.STL
│   ├── forearm_link.STL
│   ├── hand_base_link.STL
│   ├── koko_full.xml
│   ├── shoulder_lift_link.STL
│   ├── shoulder_link.STL
│   ├── upper_arm_lift_link.STL
│   ├── upper_arm_link.STL
│   └── wrist_lift_link.STL
├── koko_gym
│   ├── __init__.py
│   └── envs
│       ├── __init__.py
│       ├── assets
│       │   ├── base_link.STL
│       │   ├── finger_limb_link.STL
│       │   ├── finger_tip_link.STL
│       │   ├── forearm_link.STL
│       │   ├── hand_base_link.STL
│       │   ├── koko_reacher.xml
│       │   ├── shoulder_lift_link.STL
│       │   ├── shoulder_link.STL
│       │   ├── upper_arm_lift_link.STL
│       │   ├── upper_arm_link.STL
│       │   └── wrist_lift_link.STL
│       └── koko_reacher.py
├── simulate.py
└── train.py
/.gitignore:
--------------------------------------------------------------------------------
1 | *.pyc
2 | __pycache__/
3 |
4 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2019 Berkeley Open Arms
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # koko-mujoco
2 |
3 | ## Requirements
4 | * MuJoCo 1.55
5 | * OpenAI Gym
6 | * OpenAI Mujoco-py
7 |
8 | ## Repo structure
9 | ```bash
10 | ├── README.md
11 | ├── simulate.py
12 | ├── train.py
13 | ├── assets
14 | │   ├── koko_full.xml
15 | │   └── STL files
16 | └── koko_gym
17 |     └── envs
18 |         ├── assets
19 |         │   ├── koko_reacher.xml
20 |         │   └── STL files
21 |         ├── __init__.py
22 |         └── koko_reacher.py
23 | ```
24 |
25 | ## Explanation of each file
26 | * koko_full.xml
27 |
28 |
29 | MJCF file for the Blue robot with the actuated gripper installed. It defines the following actuators (joints).
30 | ```xml
31 |
32 |
33 |
34 |
35 |
36 |
37 |
38 |
39 |
40 |
41 |
42 |
43 |
44 |
45 | ```
46 | MJCF has no equivalent of the URDF `<mimic>` tag, so the gripper joints (the last four actuators) are driven by a position controller that takes the current `robotfinger_actuator_joint` angle as its input: `fingerlimb_joint` moves in the positive direction and `fingertip_joint` in the negative direction, which keeps the fingertips parallel to each other.
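
The coupling described above can be sketched without MuJoCo installed. This is only an illustration using plain Python lists: `expand_gripper_command` is a hypothetical helper name, and the sign pattern (entries 1 and 3 of the finger block flipped) is taken from the `step` function in `koko_reacher.py`.

```python
def expand_gripper_command(action):
    """Append four coupled finger commands to an 8-element action.

    Hypothetical helper: the last entry of `action` is treated as the
    finger actuator command, and entries 1 and 3 of the finger block
    get the opposite sign so the fingertips stay parallel.
    """
    action = [float(x) for x in action]
    finger = action[-1]
    gripper = [finger, -finger, finger, -finger]
    return action + gripper


full = expand_gripper_command([0.0] * 7 + [0.5])
print(len(full))   # 12
print(full[-4:])   # [0.5, -0.5, 0.5, -0.5]
```

The caller only ever supplies eight values; the four dependent commands are derived from the last one.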
47 |
48 | * koko_reacher.py
49 |
50 |
51 | OpenAI Gym environment for Blue. `reacher.step` takes an action array of size 1x8. The gripper joint actuators cannot be controlled individually; they are all driven together, using the `robotfinger_actuator_joint` angle as the position input. You can also define your own reward signal in the `step` function.
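
As one way to think about a custom reward, the sketch below reproduces the two terms that `step` currently sums (a squared distance term and a control cost) as a standalone function. The name `reach_reward`, its parameters, and the `dist_weight` default are assumptions made for illustration; positions are passed in explicitly so the snippet runs without the simulator.

```python
import math


def reach_reward(fingertip_pos, target_pos, action, n_ctrl, dist_weight=2.0):
    """Distance-plus-control-cost reward (illustrative, not the repo's API).

    fingertip_pos, target_pos: 3D positions as sequences of floats.
    action: the command passed to step().
    n_ctrl: total number of actuators used to normalize the control cost.
    """
    dist = math.sqrt(sum((f - t) ** 2 for f, t in zip(fingertip_pos, target_pos)))
    reward_dist = -(dist_weight * dist) ** 2
    reward_ctrl = -sum(x * x for x in action) / n_ctrl
    return reward_dist + reward_ctrl


# Fingertip 0.5 m from the target, zero action, 12 actuators in total
print(reach_reward((0.5, 0.0, 0.0), (0.0, 0.0, 0.0), [0.0] * 8, 12))  # -1.0
```

Swapping in a different shaping term (for example a velocity penalty) only requires changing the returned sum.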
52 |
53 | * koko_reacher.xml
54 |
55 |
56 | `koko_full.xml` with target object.
57 |
58 | * train.py
59 |
60 |
61 | Training loop that uses a random controller. Add your own algorithm to train the policy.
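
A replacement policy only needs to be a zero-argument callable that returns one action, which is how `do_rollout` invokes it. The sketch below is a hypothetical variant of the random controller that samples inside explicit per-joint bounds; `make_bounded_random_policy` and the example bounds are assumptions, not values read from the MJCF file.

```python
import random


def make_bounded_random_policy(low, high, seed=None):
    """Return a zero-argument policy sampling uniformly within [low, high]."""
    if len(low) != len(high):
        raise ValueError("low and high must have the same length")
    rng = random.Random(seed)

    def policy():
        return [rng.uniform(lo, hi) for lo, hi in zip(low, high)]

    return policy


# Illustrative bounds for the 8 controllable actuators
policy = make_bounded_random_policy([-1.0] * 8, [1.0] * 8, seed=0)
action = policy()
print(len(action))  # 8
```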
62 |
63 | * simulate.py
64 |
65 |
66 | Runs the mujoco-py viewer for 5000 time steps. Use it to test-run your trained policy.
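
The placeholder controller in `simulate.py` is a proportional law that tracks sinusoidal joint targets. A simulator-free sketch of that idea is shown below; the function names and default parameters are illustrative assumptions that mirror the constants used in the script (amplitude 0.5 for the seven arm joints, 1.0 for the finger actuator, and `i/500` as the phase).

```python
import math


def sinusoid_targets(step, n_arm=7, arm_amp=0.5, finger_amp=1.0, scale=500):
    """Target joint positions at a given time step (7 arm joints + finger)."""
    phase = math.sin(step / scale)
    return [arm_amp * phase] * n_arm + [finger_amp * phase]


def proportional_command(target_qpos, current_qpos, kp=1.0):
    """Proportional feedback: kp * (target - current) for each joint."""
    return [kp * (t - c) for t, c in zip(target_qpos, current_qpos)]


current = [0.0] * 8
command = proportional_command(sinusoid_targets(250), current, kp=1.0)
print(len(command))  # 8
```

A trained policy would replace `proportional_command` and map the observation to the same 8-element command.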
67 |
68 | ## Reference
69 |
70 | * [https://github.com/openai/gym](https://github.com/openai/gym)
71 | * [https://github.com/openai/mujoco-py](https://github.com/openai/mujoco-py)
72 | * [http://www.mujoco.org/](http://www.mujoco.org/)
73 |
--------------------------------------------------------------------------------
/assets/base_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/assets/base_link.STL
--------------------------------------------------------------------------------
/assets/finger_limb_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/assets/finger_limb_link.STL
--------------------------------------------------------------------------------
/assets/finger_tip_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/assets/finger_tip_link.STL
--------------------------------------------------------------------------------
/assets/forearm_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/assets/forearm_link.STL
--------------------------------------------------------------------------------
/assets/hand_base_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/assets/hand_base_link.STL
--------------------------------------------------------------------------------
/assets/koko_full.xml:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
11 |
12 |
13 |
14 |
15 |
16 |
17 |
18 |
19 |
20 |
21 |
22 |
23 |
24 |
25 |
26 |
27 |
28 |
29 |
30 |
31 |
32 |
33 |
34 |
35 |
36 |
37 |
38 |
39 |
40 |
41 |
42 |
43 |
44 |
45 |
46 |
47 |
48 |
49 |
50 |
51 |
52 |
53 |
54 |
55 |
56 |
57 |
58 |
59 |
60 |
61 |
62 |
63 |
64 |
65 |
66 |
67 |
68 |
69 |
70 |
71 |
72 |
73 |
74 |
75 |
76 |
77 |
78 |
79 |
80 |
81 |
82 |
83 |
84 |
85 |
86 |
87 |
88 |
89 |
90 |
91 |
92 |
93 |
94 |
95 |
96 |
97 |
98 |
99 |
100 |
101 |
102 |
103 |
104 |
105 |
106 |
107 |
108 |
109 |
110 |
111 |
--------------------------------------------------------------------------------
/assets/shoulder_lift_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/assets/shoulder_lift_link.STL
--------------------------------------------------------------------------------
/assets/shoulder_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/assets/shoulder_link.STL
--------------------------------------------------------------------------------
/assets/upper_arm_lift_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/assets/upper_arm_lift_link.STL
--------------------------------------------------------------------------------
/assets/upper_arm_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/assets/upper_arm_link.STL
--------------------------------------------------------------------------------
/assets/wrist_lift_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/assets/wrist_lift_link.STL
--------------------------------------------------------------------------------
/koko_gym/__init__.py:
--------------------------------------------------------------------------------
1 | from koko_gym.envs.koko_reacher import KokoReacherEnv
2 | # from koko_gym.envs.koko_pusher import KokoPusherEnv
3 |
--------------------------------------------------------------------------------
/koko_gym/envs/__init__.py:
--------------------------------------------------------------------------------
1 | from koko_gym.envs.koko_reacher import KokoReacherEnv
2 | # from koko_gym.envs.koko_pusher import KokoPusherEnv
3 |
--------------------------------------------------------------------------------
/koko_gym/envs/assets/base_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/koko_gym/envs/assets/base_link.STL
--------------------------------------------------------------------------------
/koko_gym/envs/assets/finger_limb_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/koko_gym/envs/assets/finger_limb_link.STL
--------------------------------------------------------------------------------
/koko_gym/envs/assets/finger_tip_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/koko_gym/envs/assets/finger_tip_link.STL
--------------------------------------------------------------------------------
/koko_gym/envs/assets/forearm_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/koko_gym/envs/assets/forearm_link.STL
--------------------------------------------------------------------------------
/koko_gym/envs/assets/hand_base_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/koko_gym/envs/assets/hand_base_link.STL
--------------------------------------------------------------------------------
/koko_gym/envs/assets/koko_reacher.xml:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
11 |
12 |
13 |
14 |
15 |
16 |
17 |
18 |
19 |
20 |
21 |
22 |
23 |
24 |
25 |
26 |
27 |
28 |
29 |
30 |
31 |
32 |
33 |
34 |
35 |
36 |
37 |
38 |
39 |
40 |
41 |
42 |
43 |
44 |
45 |
46 |
47 |
48 |
49 |
50 |
51 |
52 |
53 |
54 |
55 |
56 |
57 |
58 |
59 |
60 |
61 |
62 |
63 |
64 |
65 |
66 |
67 |
68 |
69 |
70 |
71 |
72 |
73 |
74 |
75 |
76 |
77 |
78 |
79 |
80 |
81 |
82 |
83 |
84 |
85 |
86 |
87 |
88 |
89 |
90 |
91 |
92 |
93 |
94 |
95 |
96 |
97 |
98 |
99 |
100 |
101 |
102 |
103 |
104 |
105 |
106 |
107 |
108 |
109 |
110 |
111 |
112 |
113 |
114 |
--------------------------------------------------------------------------------
/koko_gym/envs/assets/shoulder_lift_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/koko_gym/envs/assets/shoulder_lift_link.STL
--------------------------------------------------------------------------------
/koko_gym/envs/assets/shoulder_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/koko_gym/envs/assets/shoulder_link.STL
--------------------------------------------------------------------------------
/koko_gym/envs/assets/upper_arm_lift_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/koko_gym/envs/assets/upper_arm_lift_link.STL
--------------------------------------------------------------------------------
/koko_gym/envs/assets/upper_arm_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/koko_gym/envs/assets/upper_arm_link.STL
--------------------------------------------------------------------------------
/koko_gym/envs/assets/wrist_lift_link.STL:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/berkeleyopenarms/blue_mujoco_v1/aa73db621f22dac4b76af8748ea6c179d5cb1715/koko_gym/envs/assets/wrist_lift_link.STL
--------------------------------------------------------------------------------
/koko_gym/envs/koko_reacher.py:
--------------------------------------------------------------------------------
from gym import utils, spaces
from gym.envs.mujoco import mujoco_env
from mujoco_py.generated import const
import os
import numpy as np


class KokoReacherEnv(utils.EzPickle, mujoco_env.MujocoEnv):
    def __init__(self):
        # step() may run during MujocoEnv.__init__ with a full-size action,
        # so the gripper expansion in step() stays disabled until setup is done.
        self.init_done = False
        utils.EzPickle.__init__(self)
        mujoco_env.MujocoEnv.__init__(self, os.path.join(os.path.dirname(__file__), "assets", "koko_reacher.xml"), 2)
        self.viewer = self._get_viewer('human')
        # Adjust the actuation space: the last four actuators drive the finger
        # joints, which depend on the finger actuator joint, so they are hidden.
        bounds = self.model.actuator_ctrlrange.copy()
        low, high = bounds.T
        low, high = low[:-4], high[:-4]
        self.action_space = spaces.Box(low=low, high=high, dtype=np.float32)
        self.gripper_action = self.sim.data.qpos[-4:]
        self.init_done = True

    def step(self, a):
        vec = self.get_body_com("robotleftfingertip") - self.get_body_com("target")
        reward_dist = -np.square(2.0*np.linalg.norm(vec))
        # reward_vel is reported in info only; it is not part of the reward.
        reward_vel = -np.sqrt(np.square(self.sim.data.qvel).mean())
        reward_ctrl = -np.square(a).sum()/len(self.sim.data.ctrl)
        reward = reward_dist + reward_ctrl

        if self.init_done:
            # Expand the finger command (last action entry) into the four
            # dependent finger actuators, flipping the sign of entries 1 and 3.
            self.gripper_action = np.ones(4) * a[-1]
            self.gripper_action[1] *= -1
            self.gripper_action[3] *= -1
            a = np.concatenate((a, self.gripper_action))

        self.do_simulation(a, self.frame_skip)
        ob = self._get_obs()
        done = False
        info = {'reward_dist': reward_dist,
                'reward_vel': reward_vel}
        return ob, reward, done, info

    def reset_model(self):
        qpos = self.np_random.uniform(low=-0.01, high=0.01, size=self.model.nq) + self.init_qpos
        while True:
            self.goal = self.np_random.uniform(low=-.2, high=.2, size=2)
            # Always satisfied for this sampling range, so the loop runs once.
            if np.linalg.norm(self.goal) < 2:
                break
        qpos[-2:] = self.goal
        qvel = self.init_qvel + self.np_random.uniform(low=-.005, high=.005, size=self.model.nv)
        qvel[-2:] = 0
        self.set_state(qpos, qvel)
        return self._get_obs()

    def _get_obs(self):
        # Observation: joint positions, joint velocities, fingertip-to-target vector.
        return np.concatenate([
            self.sim.data.qpos,
            self.sim.data.qvel,
            self.get_body_com("robotleftfingertip") - self.get_body_com("target")
        ])

    # The earlier one-line viewer_setup (trackbodyid = 0) was shadowed by this
    # definition and never ran, so it has been dropped.
    def viewer_setup(self, camera_type='global_cam', camera_select=0):
        if camera_type == 'fixed_cam':
            cam_type = const.CAMERA_FIXED
        elif camera_type == 'global_cam':
            cam_type = 0  # free camera
        else:
            raise ValueError("unknown camera_type: {!r}".format(camera_type))
        DEFAULT_CAMERA_CONFIG = {
            'distance': 6.0,
            'azimuth': 140.0,
            'elevation': -30.0,
            'type': cam_type,
            'fixedcamid': camera_select
        }

        for key, value in DEFAULT_CAMERA_CONFIG.items():
            if isinstance(value, np.ndarray):
                getattr(self.viewer.cam, key)[:] = value
            else:
                setattr(self.viewer.cam, key, value)
--------------------------------------------------------------------------------
/simulate.py:
--------------------------------------------------------------------------------
from koko_gym import KokoReacherEnv
from glfw import get_framebuffer_size
import numpy as np

# Make the reacher env instance
reacher = KokoReacherEnv()
reacher.reset_model()

# Set up the viewer
width, height = get_framebuffer_size(reacher.viewer.window)
reacher.viewer_setup(camera_type='global_cam', camera_select=0)

# Sample proportional controller; replace it with your policy function
Kp = 1.0
target_state = reacher.sim.get_state()

for i in range(5000):
    # Get the current state
    current_state = reacher.sim.get_state()

    # Sample controller (pseudo policy function)
    target_state.qpos[0] = 0.5*np.sin(i/500)  # base_roll_joint
    target_state.qpos[1] = 0.5*np.sin(i/500)  # shoulder_lift_joint
    target_state.qpos[2] = 0.5*np.sin(i/500)  # shoulder_roll_joint
    target_state.qpos[3] = 0.5*np.sin(i/500)  # elbow_lift_joint
    target_state.qpos[4] = 0.5*np.sin(i/500)  # elbow_roll_joint
    target_state.qpos[5] = 0.5*np.sin(i/500)  # wrist_lift_joint
    target_state.qpos[6] = 0.5*np.sin(i/500)  # wrist_roll_joint
    target_state.qpos[7] = 1.0*np.sin(i/500)  # robotfinger_actuator_joint
    feedback_cmd = Kp * (target_state.qpos - current_state.qpos)

    # Step the environment with the first 8 commands.
    # ob concatenates qpos, qvel and the fingertip-to-target vector.
    ob, _, _, _ = reacher.step(a=feedback_cmd[:8])
    reacher.render(mode='human', width=width, height=height)
--------------------------------------------------------------------------------
/train.py:
--------------------------------------------------------------------------------
import argparse
from koko_gym import KokoReacherEnv


def do_rollout(env, policy_fn, max_steps, render=False):
    observation = env.reset()
    done = False
    steps = 0

    rollout_observations = []
    rollout_actions = []
    rollout_returns = []  # per-step rewards

    while not done:
        action = policy_fn()
        rollout_observations.append(observation)
        rollout_actions.append(action)
        observation, reward, done, _ = env.step(action)
        # Record the reward before the step-limit check so the last step is kept.
        rollout_returns.append(reward)
        steps += 1
        if render:
            env.render()
        if steps >= max_steps:
            break

    return (rollout_observations, rollout_actions, rollout_returns)


def make_random_policy(env):
    np_random = env.np_random
    # The last four actuators are the dependent finger joints, which the
    # environment fills in itself.
    action_size = len(env.sim.data.ctrl) - 4

    def random_policy():
        return np_random.uniform(low=-1.0, high=1.0, size=action_size)
    return random_policy


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--max_timesteps', type=int)
    args = parser.parse_args()

    env = KokoReacherEnv()

    max_steps = args.max_timesteps or 2000

    random_controller = make_random_policy(env)

    for i in range(10):
        rollout_obs, rollout_act, rollout_r = do_rollout(env, random_controller, max_steps, render=True)
        print("rollout number:", i, " rollout average reward:", sum(rollout_r)/len(rollout_r))


if __name__ == '__main__':
    main()
--------------------------------------------------------------------------------