├── .gitignore
├── LICENSE
├── README.md
├── envs
│   ├── setup_env.sh
│   └── teaser.jpg
└── src
    └── test_env.py
/.gitignore:
--------------------------------------------------------------------------------
1 | assemblyEnv
2 | ml-agents
3 | envs/
4 | **/__pycache__
5 | *.pyc
6 | *.npy
7 | .DS_Store
8 | src/baselines
9 | baselines
10 | src/trained_models
11 | src/SaveData
12 | src/temp
13 | src/temp.txt
14 | src/auto_plot.sh
15 | src/pytorch-tools
16 | pytorch-tools
17 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Copyright (c) 2019 Deepak Pathak
2 | All rights reserved.
3 |
4 | Redistribution and use in source and binary forms, with or without
5 | modification, are permitted provided that the following conditions are met:
6 |
7 | * Redistributions of source code must retain the above copyright notice, this
8 | list of conditions and the following disclaimer.
9 |
10 | * Redistributions in binary form must reproduce the above copyright notice,
11 | this list of conditions and the following disclaimer in the documentation
12 | and/or other materials provided with the distribution.
13 |
14 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
15 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
16 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
17 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
18 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
19 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
20 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
21 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
22 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
23 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
24 |
25 |
26 | --------------------------------------------------------------------------------
27 | Original Pytorch-PPO License:
28 | --------------------------------------------------------------------------------
29 | MIT License
30 |
31 | Copyright (c) 2017 Ilya Kostrikov
32 |
33 | Permission is hereby granted, free of charge, to any person obtaining a copy
34 | of this software and associated documentation files (the "Software"), to deal
35 | in the Software without restriction, including without limitation the rights
36 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
37 | copies of the Software, and to permit persons to whom the Software is
38 | furnished to do so, subject to the following conditions:
39 |
40 | The above copyright notice and this permission notice shall be included in all
41 | copies or substantial portions of the Software.
42 |
43 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
44 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
45 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
46 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
47 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
48 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
49 | SOFTWARE.
50 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ## Learning to Control Self-Assembling Morphologies ##
2 | ### NeurIPS 2019 (Spotlight)
Winner of [Virtual Creatures Competition at GECCO 2019, Prague](https://virtualcreatures.github.io/)
3 | #### [[Project Website]](https://pathak22.github.io/modular-assemblies/) [[Demo Video]](https://youtu.be/ngCIB-IWD8E)
4 |
5 | [Deepak Pathak](https://people.eecs.berkeley.edu/~pathak/)*, [Chris Lu](https://chris-lu.weebly.com/)*, [Trevor Darrell](https://people.eecs.berkeley.edu/~trevor/), [Phillip Isola](https://www.eecs.mit.edu/people/faculty/phillip-isola/), [Alexei A. Efros](https://people.eecs.berkeley.edu/~efros/)
6 | University of California, Berkeley
7 | MIT
8 | (* equal contribution)
9 |
10 |
11 |
12 |
13 |
14 | This is a PyTorch-based implementation of our [paper on learning to control self-assembling agents using deep reinforcement learning](https://pathak22.github.io/modular-assemblies/). We investigate a modular co-evolution strategy: a collection of primitive agents learns to dynamically self-assemble into composite bodies while also learning to coordinate their behavior to control these bodies. The learned compositional policies demonstrate better zero-shot generalization. If you find this work useful in your research, please cite:
15 |
16 | @inproceedings{pathak19assemblies,
17 | Author = {Pathak, Deepak and Lu, Chris and Darrell, Trevor and
18 | Isola, Phillip and Efros, Alexei A.},
19 | Title = {Learning to Control Self-Assembling Morphologies:
20 | A Study of Generalization via Modularity},
21 | Booktitle = {NeurIPS},
22 | Year = {2019}
23 | }
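The core idea above (one shared policy reused by every limb, with messages passed through whatever morphology is currently assembled) can be sketched as a toy example. This is an illustrative NumPy sketch, not the repository's actual model; the layer sizes, `tanh` nonlinearities, sum-aggregation of child messages, and the `limb_policy`/`act` helpers are all assumptions made for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
OBS, MSG, ACT = 4, 3, 2  # per-limb observation, message, and action sizes (illustrative)

# One shared set of weights reused by every limb -- weight sharing is the key idea.
W_msg = rng.standard_normal((OBS + MSG, MSG)) * 0.1
W_act = rng.standard_normal((OBS + MSG, ACT)) * 0.1

def limb_policy(obs, child_msgs):
    """Aggregate child messages, then emit this limb's action and upward message."""
    agg = np.sum(child_msgs, axis=0) if child_msgs else np.zeros(MSG)
    x = np.concatenate([obs, agg])
    return np.tanh(x @ W_act), np.tanh(x @ W_msg)

def act(tree, obs):
    """tree maps limb id -> list of child limb ids; returns one action per limb
    via a bottom-up message pass over the current assembly."""
    actions = {}
    def visit(i):
        msgs = [visit(c) for c in tree.get(i, [])]
        a, m = limb_policy(obs[i], msgs)
        actions[i] = a
        return m
    visit(0)  # limb 0 is the root of the current assembly
    return actions

# Two different morphologies controlled by the same weights:
chain = {0: [1], 1: [2]}   # a 3-limb chain
star = {0: [1, 2, 3]}      # three limbs attached to a root
obs = {i: rng.standard_normal(OBS) for i in range(4)}
print(len(act(chain, obs)), len(act(star, obs)))  # 3 4
```

Because the weights are shared and the morphology is an input, the same parameters control a 3-limb chain and a 4-limb star; this is what makes zero-shot transfer to unseen assemblies possible.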
24 |
25 | ### Installation and Usage
26 |
27 | 1. Set up the repository
28 | ```Shell
29 | git clone https://github.com/pathak22/modular-assemblies.git
30 | cd modular-assemblies/
31 | git clone https://github.com/Unity-Technologies/ml-agents.git
32 | cd ml-agents/
33 | git reset --hard 6c5255e
34 | cd ..
35 | bash envs/setup_env.sh
36 |
37 | python3 -m venv assemblyEnv
38 | source $PWD/assemblyEnv/bin/activate
39 | pip install --upgrade pip
40 | ```
41 |
42 | 2. Installation
43 | - Requirements:
44 | - CUDNN-5.1, CUDA-8.0, Python-3.5
45 | - Detailed setup (skip to the quick setup below for exact replication):
46 | ```Shell
47 | # Install Pytorch from http://pytorch.org/
48 | pip install http://download.pytorch.org/whl/cu80/torch-0.3.0.post4-cp35-cp35m-linux_x86_64.whl
49 | pip install torchvision
50 | pip install --upgrade visdom
51 |
52 | # Install baselines for Atari preprocessing
53 | pip install gym==0.9.4  # baselines installs the latest gym automatically, but newer gym versions require a newer MuJoCo, so install this older gym before installing baselines
54 | git clone https://github.com/openai/baselines.git
55 | cd baselines
56 | git reset --hard b5be53d
57 | pip install -e .
58 |
59 | # Additional packages
60 | pip install numpy
61 | pip install matplotlib
62 | pip install pillow
63 | pip install opencv-python
64 |
65 | # pytorch-tools (provides torchfold for dynamic batching)
66 | cd modular-assemblies/src/
67 | git clone https://github.com/nearai/pytorch-tools.git
68 | cd pytorch-tools/
69 | git reset --hard 09dccb2
70 | python setup.py install
71 | ```
72 | - Quick setup for exact replication:
73 | ```Shell
74 | pip install -r requirements.txt
75 | ```
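The `requirements.txt` file itself is not shown in this listing; based on the detailed setup above, it would pin roughly the pip-installable packages (baselines and pytorch-tools are installed from source, as in the detailed steps):

```
http://download.pytorch.org/whl/cu80/torch-0.3.0.post4-cp35-cp35m-linux_x86_64.whl
torchvision
visdom
gym==0.9.4
numpy
matplotlib
pillow
opencv-python
```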
76 |
77 | 3. Run the code
78 | ```Shell
79 | cd modular-assemblies/src/
80 | python test_env.py
81 | ```
82 |
83 | ### Acknowledgement
84 | This code builds upon Ilya Kostrikov's PyTorch PPO [implementation](https://github.com/ikostrikov/pytorch-a2c-ppo-acktr).
85 |
--------------------------------------------------------------------------------
/envs/setup_env.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | URL=https://www.dropbox.com/s//final_N_vgrid3.zip
4 | DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )/" && pwd )"
5 | cd "$DIR"
6 |
7 | rm -rf *.tar.gz *.zip
8 | echo "Downloading the environment executable..."
9 | wget "$URL"
10 | echo "Unzipping..."
11 | unzip final_N_vgrid3.zip
12 | mv final_N_vgrid3/* .
13 | rm -rf final_N_vgrid3 __MACOSX .DS_Store
14 | rm -rf *.tar.gz *.zip
15 | echo "Downloading Done."
16 |
--------------------------------------------------------------------------------
/envs/teaser.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/pathak22/modular-assemblies/535338cd2a33526770d5cb7cd04c516ae518b12c/envs/teaser.jpg
--------------------------------------------------------------------------------
/src/test_env.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import argparse
3 | import os
4 | import sys
5 | sys.path.insert(0, os.path.abspath("../ml-agents/python/"))
6 | from unityagents import UnityEnvironment
7 |
8 | parser = argparse.ArgumentParser(description='Unity Env Test')
9 | parser.add_argument('-e', '--env-name', default='../envs/pyenv',
10 | help='environment path')
11 | args = parser.parse_args()
12 |
13 | env = UnityEnvironment(file_name=args.env_name)
14 | default_brain = env.brain_names[0]
15 | brain = env.brains[default_brain]
16 | print('Loaded environment from: %s' % args.env_name)
17 | print(str(env))
18 | print('action space size per limb = ', brain.action_space_size)
19 | print('state space size per limb = ', brain.state_space_size)
20 | train_mode = False
21 |
22 | # joinDist: distance threshold within which limbs can merge
23 | # forceScale: scales the torque applied in the environment
24 | # autoJoin: if nonzero, join automatically when dist <= joinDist, else via merge action
25 | expID = 0.0
26 | saveEnv = 0.0
27 | loadEnv = 0.0
28 | env_config = {"maxLimb": 7, "joinDist":4.0, "forceScale":10.0,
29 | "autoJoin": 0.0, "dynamicFriction":0.6, "staticFriction":0.6,
30 | "saveEnv":saveEnv, "loadEnv":loadEnv, "expID":expID}
31 | if saveEnv != 0.0:
32 | print('EnvData Save dir: ./SaveData/%05d/%05d.dat' % (expID, saveEnv))
33 | if loadEnv != 0.0:
34 | print('EnvData Load dir: ./SaveData/%05d/%05d.dat' % (expID, loadEnv))
35 | if 'multiagent_N' in args.env_name:
36 | env_config["N"] = 8
37 | print('Env config:', env_config)
38 |
39 | for episode in range(10):
40 | print('='*60)
41 | env_info = env.reset(train_mode=train_mode, config=env_config)[default_brain]
42 | print('Starting Episode: %d' % (episode+1))
43 | if "N" in env_config:
44 | env_config["N"] = 0 # to add only once
45 | done = False
46 | episode_rewards = 0
47 | for i in range(100):
48 | if brain.action_space_type == 'continuous':
49 | act = np.random.randn(len(env_info.agents), brain.action_space_size)
50 | else:
51 | act = np.random.randint(0, brain.action_space_size,
52 | size=(len(env_info.agents)))
53 | env_info = env.step(act)[default_brain]
54 | episode_rewards += env_info.rewards[0]
55 | print("Total reward this episode: {}".format(episode_rewards))
56 |
57 | env.close()
58 |
--------------------------------------------------------------------------------