├── CONTRIBUTING.md ├── LICENSE ├── README.md ├── RELEASE_NOTES.md ├── dm_memorytasks ├── __init__.py ├── _load_environment.py ├── _version.py ├── load_from_disk_test.py └── load_from_docker_test.py ├── docs └── index.md ├── examples ├── human_agent.py └── random_agent.py └── setup.py /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # How to Contribute 2 | 3 | We'd love to accept your patches and contributions to this project. There are 4 | just a few small guidelines you need to follow. 5 | 6 | ## Contributor License Agreement 7 | 8 | Contributions to this project must be accompanied by a Contributor License 9 | Agreement. You (or your employer) retain the copyright to your contribution; 10 | this simply gives us permission to use and redistribute your contributions as 11 | part of the project. Head over to to see 12 | your current agreements on file or to sign a new one. 13 | 14 | You generally only need to submit a CLA once, so if you've already submitted one 15 | (even if it was for a different project), you probably don't need to do it 16 | again. 17 | 18 | ## Code reviews 19 | 20 | All submissions, including submissions by project members, require review. We 21 | use GitHub pull requests for this purpose. Consult 22 | [GitHub Help](https://help.github.com/articles/about-pull-requests/) for more 23 | information on using pull requests. 24 | 25 | ## Community Guidelines 26 | 27 | This project follows 28 | [Google's Open Source Community Guidelines](https://opensource.google/conduct/). 29 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | 2 | Apache License 3 | Version 2.0, January 2004 4 | http://www.apache.org/licenses/ 5 | 6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 7 | 8 | 1. Definitions. 
9 | 10 | "License" shall mean the terms and conditions for use, reproduction, 11 | and distribution as defined by Sections 1 through 9 of this document. 12 | 13 | "Licensor" shall mean the copyright owner or entity authorized by 14 | the copyright owner that is granting the License. 15 | 16 | "Legal Entity" shall mean the union of the acting entity and all 17 | other entities that control, are controlled by, or are under common 18 | control with that entity. For the purposes of this definition, 19 | "control" means (i) the power, direct or indirect, to cause the 20 | direction or management of such entity, whether by contract or 21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 22 | outstanding shares, or (iii) beneficial ownership of such entity. 23 | 24 | "You" (or "Your") shall mean an individual or Legal Entity 25 | exercising permissions granted by this License. 26 | 27 | "Source" form shall mean the preferred form for making modifications, 28 | including but not limited to software source code, documentation 29 | source, and configuration files. 30 | 31 | "Object" form shall mean any form resulting from mechanical 32 | transformation or translation of a Source form, including but 33 | not limited to compiled object code, generated documentation, 34 | and conversions to other media types. 35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. 
For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. 
Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. 
You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 
123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. 
In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. 176 | 177 | END OF TERMS AND CONDITIONS 178 | 179 | APPENDIX: How to apply the Apache License to your work. 180 | 181 | To apply the Apache License to your work, attach the following 182 | boilerplate notice, with the fields enclosed by brackets "[]" 183 | replaced with your own identifying information. (Don't include 184 | the brackets!) The text should be enclosed in the appropriate 185 | comment syntax for the file format. 
We also recommend that a 186 | file or class name and description of purpose be included on the 187 | same "printed page" as the copyright notice for easier 188 | identification within third-party archives. 189 | 190 | Copyright [yyyy] [name of copyright owner] 191 | 192 | Licensed under the Apache License, Version 2.0 (the "License"); 193 | you may not use this file except in compliance with the License. 194 | You may obtain a copy of the License at 195 | 196 | http://www.apache.org/licenses/LICENSE-2.0 197 | 198 | Unless required by applicable law or agreed to in writing, software 199 | distributed under the License is distributed on an "AS IS" BASIS, 200 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 201 | See the License for the specific language governing permissions and 202 | limitations under the License. 203 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # `dm_memorytasks`: DeepMind Memory Task Suite 2 | 3 | The *DeepMind Memory Task Suite* is a set of 13 diverse machine-learning tasks 4 | that require memory to solve. They are constructed to let us evaluate 5 | generalization performance on a memory-specific holdout set. 6 | 7 | The 8 tasks in this repo are [Unity-based](http://unity3d.com/). Besides these, 8 | there are 4 tasks in the overall Memory Task Suite that are modifications of 9 | [PsychLab](https://github.com/deepmind/lab/tree/master/game_scripts/levels/contributed/psychlab) 10 | tasks, and 1 that is a modification of a 11 | [DeepMind Lab](https://github.com/deepmind/lab) level. 12 | 13 | **NOTE:** The 5 other tasks in the Suite are in Psychlab and DeepMind Lab, not 14 | Unity. Psychlab is part of DeepMind Lab. DeepMind Lab has a separate set of 15 | installation [instructions](https://github.com/deepmind/lab). 
16 | 17 | ## Overview 18 | 19 | The 8 Unity-based tasks are provided through a pre-packaged 20 | [Docker container](http://www.docker.com). 21 | 22 | The `dm_memorytasks` package consists of support code to run these Docker 23 | containers. You interact with the task environment via a 24 | [`dm_env`](http://www.github.com/deepmind/dm_env) Python interface. 25 | 26 | Please see the [documentation](docs/index.md) for more detailed information on 27 | the available tasks, actions and observations. 28 | 29 | ## Requirements 30 | 31 | `dm_memorytasks` requires [Docker](https://www.docker.com), 32 | [Python](https://www.python.org/) 3.6.1 or later and an x86-64 CPU with SSE4.2 33 | support. We do not attempt to maintain a working version for Python 2. 34 | 35 | Note: We recommend using a 36 | [Python virtual environment](https://docs.python.org/3/tutorial/venv.html) to 37 | mitigate conflicts with your system's Python environment. 38 | 39 | Download and install Docker: 40 | 41 | * For Linux, install [Docker-CE](https://docs.docker.com/install/) 42 | * Install Docker Desktop for 43 | [OSX](https://docs.docker.com/docker-for-mac/install/) or 44 | [Windows](https://docs.docker.com/docker-for-windows/install/).
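The requirements above can be checked with a short script before installing. The following is a minimal sketch, not part of `dm_memorytasks`: the helper names (`python_is_supported`, `docker_cli_on_path`, `check_prerequisites`) are hypothetical and use only the Python standard library. It only checks that a `docker` executable is on `PATH`; it does not verify that the Docker daemon is running, and it does not test for SSE4.2 support.

```python
import shutil
import sys

# Minimum Python version stated in the Requirements section.
MIN_PYTHON = (3, 6, 1)


def python_is_supported(version_info=None):
    """Returns True if the interpreter meets the stated minimum version."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor, micro).
    return tuple(version_info[:3]) >= MIN_PYTHON


def docker_cli_on_path():
    """Returns True if a `docker` executable is found on PATH.

    This says nothing about whether the Docker daemon is running.
    """
    return shutil.which("docker") is not None


def check_prerequisites():
    """Returns a list of human-readable problems; empty if none were found."""
    problems = []
    if not python_is_supported():
        problems.append("Python 3.6.1 or later is required.")
    if not docker_cli_on_path():
        problems.append("No `docker` executable found on PATH.")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

Running the script prints one line per missing prerequisite and nothing if both checks pass.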
45 | 46 | ## Installation 47 | 48 | `dm_memorytasks` can be installed from 49 | [PyPI](https://pypi.org/project/dm-memorytasks/) using `pip`: 50 | 51 | ```bash 52 | $ pip install dm-memorytasks 53 | ``` 54 | 55 | To also install the dependencies for `examples/`, install with: 56 | 57 | ```bash 58 | $ pip install dm-memorytasks[examples] 59 | ``` 60 | 61 | Alternatively, you can install `dm_memorytasks` by cloning a local copy of our 62 | GitHub repository: 63 | 64 | ```bash 65 | $ git clone https://github.com/deepmind/dm_memorytasks.git 66 | $ pip install ./dm_memorytasks 67 | ``` 68 | 69 | ## Usage 70 | 71 | Once `dm_memorytasks` is installed, instantiate a `dm_env` environment as 72 | follows: 73 | 74 | ```python 75 | import dm_memorytasks 76 | 77 | settings = dm_memorytasks.EnvironmentSettings(seed=123, level_name='spot_diff_train') 78 | env = dm_memorytasks.load_from_docker(settings) 79 | ``` 80 | 81 | ## Citing 82 | 83 | If you use `dm_memorytasks` in your work, please cite the accompanying paper: 84 | 85 | ```bibtex 86 | @inproceedings{fortunato2019generalization, 87 | title={Generalization of Reinforcement Learners with Working and Episodic Memory}, 88 | author={Fortunato, Meire and 89 | Tan, Melissa and 90 | Faulkner, Ryan and 91 | Hansen, Steven and 92 | Badia, Adri{\`a} Puigdom{\`e}nech and 93 | Buttimore, Gavin and 94 | Deck, Charles and 95 | Leibo, Joel Z and 96 | Blundell, Charles}, 97 | booktitle={Advances in Neural Information Processing Systems}, 98 | pages={12448--12457}, 99 | year={2019}, 100 | } 101 | ``` 102 | 103 | ## Notice 104 | 105 | This is not an officially supported Google product. 106 | -------------------------------------------------------------------------------- /RELEASE_NOTES.md: -------------------------------------------------------------------------------- 1 | # Release Notes 2 | 3 | ## [1.0.3] 4 | 5 | * Set dm_env_rpc version to < v1.1.0. 6 | 7 | ## [1.0.2] 8 | 9 | * Update to incorporate dm_env_rpc v1.0.1.
10 | 11 | ## [1.0.1] 12 | 13 | * Updated missing "visible_\*" levels and incorrectly named "goal" levels. 14 | * Moved levelName setting from JoinWorld to CreateWorld. 15 | * Disabled realtime lightmaps, improving the environment's CPU utilization. 16 | 17 | ## [1.0.0] 18 | 19 | * Initial release. 20 | -------------------------------------------------------------------------------- /dm_memorytasks/__init__.py: -------------------------------------------------------------------------------- 1 | # Copyright 2019 DeepMind Technologies Limited. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================ 15 | """Python utilities for running dm_memorytasks.""" 16 | 17 | from dm_memorytasks import _load_environment 18 | from dm_memorytasks._version import __version__ 19 | 20 | EnvironmentSettings = _load_environment.EnvironmentSettings 21 | 22 | MEMORY_TASK_LEVEL_NAMES = _load_environment.MEMORY_TASK_LEVEL_NAMES 23 | 24 | load_from_disk = _load_environment.load_from_disk 25 | load_from_docker = _load_environment.load_from_docker 26 | -------------------------------------------------------------------------------- /dm_memorytasks/_load_environment.py: -------------------------------------------------------------------------------- 1 | # Lint as: python3 2 | # Copyright 2019 DeepMind Technologies Limited. All Rights Reserved. 
3 | # 4 | # Licensed under the Apache License, Version 2.0 (the "License"); 5 | # you may not use this file except in compliance with the License. 6 | # You may obtain a copy of the License at 7 | # 8 | # http://www.apache.org/licenses/LICENSE-2.0 9 | # 10 | # Unless required by applicable law or agreed to in writing, software 11 | # distributed under the License is distributed on an "AS IS" BASIS, 12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | # See the License for the specific language governing permissions and 14 | # limitations under the License. 15 | # ============================================================================ 16 | """Python utility functions for loading DeepMind Memory Tasks.""" 17 | 18 | import codecs 19 | import collections 20 | import json 21 | import os 22 | import re 23 | import subprocess 24 | import time 25 | import typing 26 | 27 | from absl import logging 28 | import dm_env 29 | import docker 30 | import grpc 31 | import numpy as np 32 | import portpicker 33 | 34 | from dm_env_rpc.v1 import connection as dm_env_rpc_connection 35 | from dm_env_rpc.v1 import dm_env_adaptor 36 | from dm_env_rpc.v1 import dm_env_rpc_pb2 37 | from dm_env_rpc.v1 import error 38 | from dm_env_rpc.v1 import tensor_utils 39 | 40 | # Maximum number of times to attempt gRPC connection. 41 | _MAX_CONNECTION_ATTEMPTS = 10 42 | 43 | # Port to expect the docker environment to internally listen on. 
44 | _DOCKER_INTERNAL_GRPC_PORT = 10000 45 | 46 | _DEFAULT_DOCKER_IMAGE_NAME = 'gcr.io/deepmind-environments/dm_memorytasks:v1.0.1' 47 | 48 | _MEMORY_TASK_OBSERVATIONS = ['RGB_INTERLEAVED', 'AvatarPosition', 'Score'] 49 | 50 | MEMORY_TASK_LEVEL_NAMES = frozenset(( 51 | 'invisible_goal_empty_arena_extrapolate', 52 | 'invisible_goal_empty_arena_holdout_extrapolate', 53 | 'invisible_goal_empty_arena_holdout_interpolate', 54 | 'invisible_goal_empty_arena_holdout_large', 55 | 'invisible_goal_empty_arena_holdout_small', 56 | 'invisible_goal_empty_arena_interpolate', 57 | 'invisible_goal_empty_arena_train', 58 | 'invisible_goal_with_buildings_extrapolate', 59 | 'invisible_goal_with_buildings_holdout_extrapolate', 60 | 'invisible_goal_with_buildings_holdout_interpolate', 61 | 'invisible_goal_with_buildings_holdout_large', 62 | 'invisible_goal_with_buildings_holdout_small', 63 | 'invisible_goal_with_buildings_interpolate', 64 | 'invisible_goal_with_buildings_train', 65 | 'spot_diff_extrapolate', 66 | 'spot_diff_holdout_extrapolate', 67 | 'spot_diff_holdout_interpolate', 68 | 'spot_diff_holdout_large', 69 | 'spot_diff_holdout_small', 70 | 'spot_diff_interpolate', 71 | 'spot_diff_motion_extrapolate', 72 | 'spot_diff_motion_holdout_extrapolate', 73 | 'spot_diff_motion_holdout_interpolate', 74 | 'spot_diff_motion_holdout_large', 75 | 'spot_diff_motion_holdout_small', 76 | 'spot_diff_motion_interpolate', 77 | 'spot_diff_motion_train', 78 | 'spot_diff_multi_extrapolate', 79 | 'spot_diff_multi_holdout_extrapolate', 80 | 'spot_diff_multi_holdout_interpolate', 81 | 'spot_diff_multi_holdout_large', 82 | 'spot_diff_multi_holdout_small', 83 | 'spot_diff_multi_interpolate', 84 | 'spot_diff_multi_train', 85 | 'spot_diff_passive_extrapolate', 86 | 'spot_diff_passive_holdout_extrapolate', 87 | 'spot_diff_passive_holdout_interpolate', 88 | 'spot_diff_passive_holdout_large', 89 | 'spot_diff_passive_holdout_small', 90 | 'spot_diff_passive_interpolate', 91 | 'spot_diff_passive_train', 92 | 
'spot_diff_train', 93 | 'transitive_inference_extrapolate', 94 | 'transitive_inference_holdout_extrapolate', 95 | 'transitive_inference_holdout_interpolate', 96 | 'transitive_inference_holdout_large', 97 | 'transitive_inference_holdout_small', 98 | 'transitive_inference_interpolate', 99 | 'transitive_inference_train_large', 100 | 'transitive_inference_train_small', 101 | 'visible_goal_with_buildings_extrapolate', 102 | 'visible_goal_with_buildings_holdout_extrapolate', 103 | 'visible_goal_with_buildings_holdout_interpolate', 104 | 'visible_goal_with_buildings_holdout_large', 105 | 'visible_goal_with_buildings_holdout_small', 106 | 'visible_goal_with_buildings_interpolate', 107 | 'visible_goal_with_buildings_train', 108 | )) 109 | 110 | _ConnectionDetails = collections.namedtuple('_ConnectionDetails', 111 | ['channel', 'connection', 'specs']) 112 | 113 | 114 | class _MemoryTasksEnv(dm_env_adaptor.DmEnvAdaptor): 115 | """An implementation of dm_env_rpc.DmEnvAdaptor for memory tasks.""" 116 | 117 | def __init__(self, connection_details, requested_observations, 118 | num_action_repeats): 119 | super(_MemoryTasksEnv, 120 | self).__init__(connection_details.connection, 121 | connection_details.specs, requested_observations) 122 | self._channel = connection_details.channel 123 | self._num_action_repeats = num_action_repeats 124 | 125 | def close(self): 126 | super(_MemoryTasksEnv, self).close() 127 | self._channel.close() 128 | 129 | def step(self, action): 130 | """Implementation of dm_env.step that supports repeated actions.""" 131 | 132 | timestep = None 133 | discount = None 134 | reward = None 135 | for _ in range(self._num_action_repeats): 136 | next_timestep = super(_MemoryTasksEnv, self).step(action) 137 | 138 | # Accumulate reward per timestep. 139 | if next_timestep.reward is not None: 140 | reward = (reward or 0.) + next_timestep.reward 141 | 142 | # Calculate the product for discount. 
143 | if next_timestep.discount is not None: 144 | discount = discount if discount else [] 145 | discount.append(next_timestep.discount) 146 | 147 | timestep = dm_env.TimeStep(next_timestep.step_type, reward, 148 | # Note: np.prod(None) returns None. 149 | np.prod(discount), 150 | next_timestep.observation) 151 | 152 | if timestep.last(): 153 | return timestep 154 | 155 | return timestep 156 | 157 | 158 | class _MemoryTasksContainerEnv(_MemoryTasksEnv): 159 | """An implementation of _MemoryTasksEnv. 160 | 161 | Ensures that the provided Docker container is closed on exit. 162 | """ 163 | 164 | def __init__(self, connection_details, requested_observations, 165 | num_action_repeats, container): 166 | super(_MemoryTasksContainerEnv, 167 | self).__init__(connection_details, requested_observations, 168 | num_action_repeats) 169 | self._container = container 170 | 171 | def close(self): 172 | super(_MemoryTasksContainerEnv, self).close() 173 | try: 174 | self._container.kill() 175 | except docker.errors.NotFound: 176 | pass # Ignore, container has already been closed. 177 | 178 | 179 | class _MemoryTasksProcessEnv(_MemoryTasksEnv): 180 | """An implementation of _MemoryTasksEnv. 181 | 182 | Ensures that the provided running process is closed on exit.
183 | """ 184 | 185 | def __init__(self, connection_details, requested_observations, 186 | num_action_repeats, process): 187 | super(_MemoryTasksProcessEnv, 188 | self).__init__(connection_details, requested_observations, 189 | num_action_repeats) 190 | self._process = process 191 | 192 | def close(self): 193 | super(_MemoryTasksProcessEnv, self).close() 194 | self._process.terminate() 195 | self._process.wait() 196 | 197 | 198 | def _check_grpc_channel_ready(channel): 199 | """Helper function to check the gRPC channel is ready N times.""" 200 | for _ in range(_MAX_CONNECTION_ATTEMPTS - 1): 201 | try: 202 | return grpc.channel_ready_future(channel).result(timeout=1) 203 | except grpc.FutureTimeoutError: 204 | pass 205 | return grpc.channel_ready_future(channel).result(timeout=1) 206 | 207 | 208 | def _can_send_message(connection): 209 | """Returns whether `connection` is healthy and able to process requests.""" 210 | try: 211 | # This should return a response with an error unless the server isn't yet 212 | # receiving requests. 213 | connection.send(dm_env_rpc_pb2.StepRequest()) 214 | except error.DmEnvRpcError: 215 | return True 216 | except grpc.RpcError: 217 | return False 218 | 219 | 220 | def _create_channel_and_connection(port): 221 | """Returns a tuple of `(channel, connection)`.""" 222 | for _ in range(_MAX_CONNECTION_ATTEMPTS): 223 | channel = grpc.secure_channel('localhost:{}'.format(port), 224 | grpc.local_channel_credentials()) 225 | _check_grpc_channel_ready(channel) 226 | connection = dm_env_rpc_connection.Connection(channel) 227 | if _can_send_message(connection): 228 | break 229 | else: 230 | # A gRPC server running within Docker sometimes reports that the channel 231 | # is ready but transiently returns an error (status code 14) on first 232 | # use. Giving the server some time to breathe and retrying often fixes the 233 | # problem.
234 | connection.close() 235 | channel.close() 236 | time.sleep(1.0) 237 | 238 | return channel, connection 239 | 240 | 241 | def _parse_exception_message(message): 242 | """Returns a human-readable version of a dm_env_rpc json error message.""" 243 | try: 244 | match = re.match(r'^message\:\ \"(.*)\"$', message) 245 | json_data = codecs.decode(match.group(1), 'unicode-escape') 246 | parsed_json_data = json.loads(json_data) 247 | return ValueError(json.dumps(parsed_json_data, indent=4)) 248 | except: # pylint: disable=bare-except 249 | return message 250 | 251 | 252 | def _wrap_send(send): 253 | """Wraps `send` in order to reformat exceptions.""" 254 | try: 255 | return send() 256 | except ValueError as e: 257 | e.args = [_parse_exception_message(e.args[0])] 258 | raise 259 | 260 | 261 | def _connect_to_environment(port, settings): 262 | """Helper function for connecting to a running dm_memorytask environment.""" 263 | if settings.level_name not in MEMORY_TASK_LEVEL_NAMES: 264 | raise ValueError( 265 | 'Level named "{}" is not a valid dm_memorytask level.'.format( 266 | settings.level_name)) 267 | channel, connection = _create_channel_and_connection(port) 268 | original_send = connection.send 269 | connection.send = lambda request: _wrap_send(lambda: original_send(request)) 270 | world_name = connection.send( 271 | dm_env_rpc_pb2.CreateWorldRequest( 272 | settings={ 273 | 'seed': tensor_utils.pack_tensor(settings.seed), 274 | 'episodeId': tensor_utils.pack_tensor(0), 275 | 'levelName': tensor_utils.pack_tensor(settings.level_name), 276 | })).world_name 277 | join_world_settings = { 278 | 'width': 279 | tensor_utils.pack_tensor(settings.width), 280 | 'height': 281 | tensor_utils.pack_tensor(settings.height), 282 | 'EpisodeLengthSeconds': 283 | tensor_utils.pack_tensor(settings.episode_length_seconds), 284 | } 285 | specs = connection.send( 286 | dm_env_rpc_pb2.JoinWorldRequest( 287 | world_name=world_name, settings=join_world_settings)).specs 288 | return 
_ConnectionDetails(channel=channel, connection=connection, specs=specs) 289 | 290 | 291 | class EnvironmentSettings(typing.NamedTuple): 292 | """Collection of settings used to start a specific Memory task. 293 | 294 | Required attributes: 295 | seed: Seed to initialize the environment's RNG. 296 | level_name: Name of the level to load. 297 | Optional attributes: 298 | width: Width (in pixels) of the desired RGB observation; defaults to 96. 299 | height: Height (in pixels) of the desired RGB observation; defaults to 72. 300 | episode_length_seconds: Maximum episode length (in seconds); defaults to 301 | 120. 302 | num_action_repeats: Number of times to step the environment with the 303 | provided action in calls to `step()`. 304 | """ 305 | seed: int 306 | level_name: str 307 | width: int = 96 308 | height: int = 72 309 | episode_length_seconds: float = 120.0 310 | num_action_repeats: int = 1 311 | 312 | 313 | def _validate_environment_settings(settings): 314 | """Helper function to validate the provided environment settings.""" 315 | if settings.episode_length_seconds <= 0.0: 316 | raise ValueError('episode_length_seconds must have a positive value.') 317 | if settings.num_action_repeats <= 0: 318 | raise ValueError('num_action_repeats must have a positive value.') 319 | if settings.width <= 0 or settings.height <= 0: 320 | raise ValueError('width and height must have a positive value.') 321 | 322 | 323 | def load_from_disk(path, settings): 324 | """Load Memory Tasks from disk. 325 | 326 | Args: 327 | path: Directory containing dm_memorytasks environment. 328 | settings: EnvironmentSettings required to start the environment. 329 | 330 | Returns: 331 | An implementation of dm_env.Environment. 332 | 333 | Raises: 334 | RuntimeError: If unable to start environment process. 
335 | """ 336 | _validate_environment_settings(settings) 337 | 338 | executable_path = os.path.join(path, 'Linux64Player') 339 | libosmesa_path = os.path.join(path, 'external_libosmesa_llvmpipe.so') 340 | if not os.path.exists(executable_path) or not os.path.exists(libosmesa_path): 341 | raise RuntimeError( 342 | 'Cannot find dm_memorytasks executable or dependent files at path: {}' 343 | .format(path)) 344 | 345 | port = portpicker.pick_unused_port() 346 | 347 | process_flags = [ 348 | executable_path, 349 | # Unity command-line flags. 350 | '-logfile', 351 | '-batchmode', 352 | '-noaudio', 353 | # Other command-line flags. 354 | '--logtostderr', 355 | '--server_type=GRPC', 356 | '--uri_address=[::]:{}'.format(port), 357 | ] 358 | 359 | os.environ.update({ 360 | 'UNITY_RENDERER': 'software', 361 | 'UNITY_OSMESA_PATH': libosmesa_path, 362 | }) 363 | 364 | process = subprocess.Popen( 365 | process_flags, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL) 366 | if process.poll() is not None: 367 | raise RuntimeError('Failed to start dm_memorytasks process correctly.') 368 | 369 | return _MemoryTasksProcessEnv( 370 | _connect_to_environment(port, settings), _MEMORY_TASK_OBSERVATIONS, 371 | settings.num_action_repeats, process) 372 | 373 | 374 | def load_from_docker(settings, name=None): 375 | """Load Memory Tasks from docker container. 376 | 377 | Args: 378 | settings: EnvironmentSettings required to start the environment. 379 | name: Optional name of Docker image that contains the dm_memorytasks 380 | environment. If left unset, uses the dm_memorytasks default name. 

  Returns:
    An implementation of dm_env.Environment.
  """
  _validate_environment_settings(settings)

  name = name or _DEFAULT_DOCKER_IMAGE_NAME
  client = docker.from_env()

  port = portpicker.pick_unused_port()

  try:
    client.images.get(name)
  except docker.errors.ImageNotFound:
    logging.info('Downloading docker image "%s"...', name)
    client.images.pull(name)
    logging.info('Download finished.')

  container = client.containers.run(
      name,
      auto_remove=True,
      detach=True,
      ports={_DOCKER_INTERNAL_GRPC_PORT: port})

  return _MemoryTasksContainerEnv(
      _connect_to_environment(port, settings), _MEMORY_TASK_OBSERVATIONS,
      settings.num_action_repeats, container)

--------------------------------------------------------------------------------
/dm_memorytasks/_version.py:
--------------------------------------------------------------------------------
# Copyright 2019 DeepMind Technologies Limited. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Package version for dm_memorytasks.

Kept in separate file so it can be used during installation.
"""

__version__ = '1.0.3'  # https://www.python.org/dev/peps/pep-0440/

--------------------------------------------------------------------------------
/dm_memorytasks/load_from_disk_test.py:
--------------------------------------------------------------------------------
# Copyright 2019 DeepMind Technologies Limited. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Tests for dm_memorytasks.load_from_disk."""

from absl import flags
from absl.testing import absltest
from absl.testing import parameterized
from dm_env import test_utils
import dm_memorytasks

FLAGS = flags.FLAGS

flags.DEFINE_string('path', '',
                    'Directory that contains dm_memorytasks environment.')


class LoadFromDiskTest(test_utils.EnvironmentTestMixin, absltest.TestCase):

  def make_object_under_test(self):
    return dm_memorytasks.load_from_disk(
        FLAGS.path,
        settings=dm_memorytasks.EnvironmentSettings(
            seed=123, level_name='spot_diff_extrapolate'))


class MemoryTaskTest(parameterized.TestCase):

  @parameterized.parameters(dm_memorytasks.MEMORY_TASK_LEVEL_NAMES)
  def test_load_level(self, level_name):
    self.assertIsNotNone(
        dm_memorytasks.load_from_disk(
            FLAGS.path,
            settings=dm_memorytasks.EnvironmentSettings(
                seed=123, level_name=level_name)))


if __name__ == '__main__':
  absltest.main()

--------------------------------------------------------------------------------
/dm_memorytasks/load_from_docker_test.py:
--------------------------------------------------------------------------------
# Copyright 2019 DeepMind Technologies Limited. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Tests for dm_memorytasks.load_from_docker."""

from absl import flags
from absl.testing import absltest
from absl.testing import parameterized
from dm_env import test_utils
import dm_memorytasks

FLAGS = flags.FLAGS

flags.DEFINE_string(
    'docker_image_name', None,
    'Name of the Docker image that contains the Memory Tasks. '
    'If None, uses the default dm_memorytasks name.')


class LoadFromDockerTest(test_utils.EnvironmentTestMixin, absltest.TestCase):

  def make_object_under_test(self):
    return dm_memorytasks.load_from_docker(
        name=FLAGS.docker_image_name,
        settings=dm_memorytasks.EnvironmentSettings(
            seed=123, level_name='spot_diff_extrapolate'))


class MemoryTaskTest(parameterized.TestCase):

  @parameterized.parameters(dm_memorytasks.MEMORY_TASK_LEVEL_NAMES)
  def test_load_level(self, level_name):
    self.assertIsNotNone(
        dm_memorytasks.load_from_docker(
            name=FLAGS.docker_image_name,
            settings=dm_memorytasks.EnvironmentSettings(
                seed=123, level_name=level_name)))


if __name__ == '__main__':
  absltest.main()

--------------------------------------------------------------------------------
/docs/index.md:
--------------------------------------------------------------------------------
# Tasks

Each task has 3 different *levels* to run the agent on:

1. **Train**
2. **Holdout Interpolate**
3. **Holdout Extrapolate**.

A further explanation of the per-task level split and details can be found in
our paper ([arxiv](https://arxiv.org/abs/1910.13406)).

## Passing a level name string to `dm_env` API

To run on a particular level, you need to append one of these suffixes to the
base task name.

1. `_train`
2. `_holdout_interpolate`
3. `_holdout_extrapolate`

For example, you could train your agent on the `transitive_inference_train`
level and test on the `transitive_inference_holdout_interpolate` level.

We also provide 4 extra levels, not used in the paper, with the suffixes below:

* `_interpolate`
* `_extrapolate`

In these levels, the set of stimuli that is used is the same as in `_train`, but
the scale dimension is altered as per the corresponding holdout variant.

* `_holdout_small`
* `_holdout_large`

In these levels, the set of stimuli that is used is the holdout set, but the
scale dimensions are the ones used in `_train`.

To run on one of the 4 PsychLab tasks or the DeepMind Lab goal navigation tasks,
listed
[here](https://github.com/deepmind/lab/tree/master/game_scripts/levels/contributed/psychlab/memory_suite_01),
follow the
[DeepMind Lab instructions for using `dm_env`](https://github.com/deepmind/lab#train-an-agent).

### Base task names

#### PsychLab

* `arbitrary_visuomotor_mapping`
* `change_detection`
* `continuous_recognition`
* `what_then_where`

#### Spot the Difference (all in Unity)

* `spot_diff`
* `spot_diff_motion`
* `spot_diff_multi`
* `spot_diff_passive`

#### Goal Navigation (all in Unity except 1)

* `explore_goal_locations` (DeepMind Lab)
* `invisible_goal_empty_arena`
* `invisible_goal_with_buildings`
* `visible_goal_with_buildings`

#### Transitive Inference (in Unity)

* `transitive_inference`

NOTE: For `explore_goal_locations` and `transitive_inference`, the **Train**
level was implemented as two separate files. Instead of appending `_train`,
append either `_train_small` or `_train_large`.

All task videos and descriptions can be found here:
[https://sites.google.com/corp/view/memory-tasks-suite](https://sites.google.com/corp/view/memory-tasks-suite/home#h.p_2_oxOFZA5QsA).
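
The suffix rules in this section can be sketched as a small helper. This is only
an illustration: `make_level_name` and its two constants are hypothetical names
that are not part of the `dm_memorytasks` package, and the helper does not check
the result against the levels the package actually ships.

```python
# Hypothetical helper (not part of dm_memorytasks): composes a level name from
# a base task name and a suffix, following the rules described in this section.
_SUFFIXES = (
    '_train', '_holdout_interpolate', '_holdout_extrapolate',
    '_interpolate', '_extrapolate', '_holdout_small', '_holdout_large',
)
# Per the NOTE above, these tasks split the Train level into two files.
_SPLIT_TRAIN_TASKS = ('explore_goal_locations', 'transitive_inference')


def make_level_name(base_task, suffix, train_size='small'):
  """Returns base_task + suffix, applying the split-train rule."""
  if suffix not in _SUFFIXES:
    raise ValueError('Unknown suffix: {}'.format(suffix))
  if suffix == '_train' and base_task in _SPLIT_TRAIN_TASKS:
    if train_size not in ('small', 'large'):
      raise ValueError('train_size must be "small" or "large".')
    return '{}_train_{}'.format(base_task, train_size)
  return base_task + suffix


print(make_level_name('spot_diff', '_extrapolate'))
# -> spot_diff_extrapolate
print(make_level_name('transitive_inference', '_train', train_size='large'))
# -> transitive_inference_train_large
```

The resulting string is what would be passed as `level_name` in
`EnvironmentSettings`.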

# Actions

For the 8 Unity-based tasks, the environment provides the following actions:

* `STRAFE_LEFT_RIGHT`
* `MOVE_BACK_FORWARD`
* `LOOK_LEFT_RIGHT`
* `LOOK_DOWN_UP`

Each action is a `double` scalar, with an inclusive range of `[-1.0, 1.0]`. It
is not compulsory to send a value for each action every step, but note that
actions are "sticky", meaning an action's value will only change when a new
value is provided. For example:

```python
env = dm_memorytasks.load_from_docker(settings)
env.reset()
env.step({'STRAFE_LEFT_RIGHT': -1.0})  # Result: strafe left.
env.step({'MOVE_BACK_FORWARD': 1.0})  # Result: strafe left & move forward.

env.step({'STRAFE_LEFT_RIGHT': 0.0,
          'MOVE_BACK_FORWARD': 0.0})  # Result: stationary.
```

# Observations

For the 8 Unity-based tasks, the environment provides the following
observations:

* `RGB_INTERLEAVED`: First person RGB camera observation. The `width` and
  `height` can be adjusted through the `EnvironmentSettings`, but the
  observation will always have a fixed 4:3 aspect ratio.
* `AvatarPosition`: 3-dimensional world-space position of the agent.
* `Score`: The agent's cumulative score.

# Configurable environment settings

Required attributes:

* `seed`: Seed to initialize the environment's RNG.
* `level_name`: Name of the level to load.

Optional attributes:

* `width`: Width (in pixels) of the desired RGB observation; defaults to 96.
* `height`: Height (in pixels) of the desired RGB observation; defaults to 72.
* `episode_length_seconds`: Maximum episode length (in seconds); defaults
  to 120.
* `num_action_repeats`: Number of times to step the environment with the
  provided action in calls to `step()`; defaults to 1.
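
As a rough sketch of how these settings fit together, the helper below builds
keyword arguments for `EnvironmentSettings` and derives `height` from `width`
so the pair matches the 4:3 ratio mentioned under Observations. The name
`make_settings_kwargs` is hypothetical and not part of `dm_memorytasks`;
requiring `width` to be a multiple of 4 is an assumption made here only to keep
the derived height an exact integer.

```python
# Hypothetical helper (not part of dm_memorytasks): builds keyword arguments
# for EnvironmentSettings with a 4:3 width/height pair.
def make_settings_kwargs(seed, level_name, width=96,
                         episode_length_seconds=120.0, num_action_repeats=1):
  # Assumption: restrict width to multiples of 4 so that height is exact.
  if width <= 0 or width % 4 != 0:
    raise ValueError('width must be a positive multiple of 4.')
  if episode_length_seconds <= 0.0:
    raise ValueError('episode_length_seconds must have a positive value.')
  if num_action_repeats <= 0:
    raise ValueError('num_action_repeats must have a positive value.')
  return dict(
      seed=seed,
      level_name=level_name,
      width=width,
      height=width * 3 // 4,
      episode_length_seconds=episode_length_seconds,
      num_action_repeats=num_action_repeats)


kwargs = make_settings_kwargs(
    seed=123, level_name='spot_diff_extrapolate', width=128)
print(kwargs['width'], kwargs['height'])  # -> 128 96
```

With the package installed, the settings object could then be created with
`dm_memorytasks.EnvironmentSettings(**kwargs)` and passed to
`load_from_docker` or `load_from_disk`.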

--------------------------------------------------------------------------------
/examples/human_agent.py:
--------------------------------------------------------------------------------
# Copyright 2019 DeepMind Technologies Limited. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Example human agent for interacting with DeepMind Memory Tasks."""

from absl import app
from absl import flags
from absl import logging
import dm_memorytasks
import numpy as np
import pygame

FLAGS = flags.FLAGS

flags.DEFINE_list(
    'screen_size', [640, 480],
    'Screen width/height in pixels. Scales the environment RGB observations to '
    'fit the screen size.')

flags.DEFINE_string(
    'docker_image_name', None,
    'Name of the Docker image that contains the Memory Tasks. '
    'If None, uses the default dm_memorytasks name.')

flags.DEFINE_integer('seed', 123, 'Environment seed.')
flags.DEFINE_string('level_name', 'spot_diff_extrapolate',
                    'Name of memory task to run.')

_FRAMES_PER_SECOND = 30

_KEYS_TO_ACTION = {
    pygame.K_w: {'MOVE_BACK_FORWARD': 1},
    pygame.K_s: {'MOVE_BACK_FORWARD': -1},
    pygame.K_a: {'STRAFE_LEFT_RIGHT': -1},
    pygame.K_d: {'STRAFE_LEFT_RIGHT': 1},
    pygame.K_UP: {'LOOK_DOWN_UP': -1},
    pygame.K_DOWN: {'LOOK_DOWN_UP': 1},
    pygame.K_LEFT: {'LOOK_LEFT_RIGHT': -1},
    pygame.K_RIGHT: {'LOOK_LEFT_RIGHT': 1},
}  # pyformat: disable
_NO_ACTION = {
    'MOVE_BACK_FORWARD': 0,
    'STRAFE_LEFT_RIGHT': 0,
    'LOOK_LEFT_RIGHT': 0,
    'LOOK_DOWN_UP': 0,
}


def main(_):
  pygame.init()
  try:
    pygame.mixer.quit()
  except NotImplementedError:
    pass
  pygame.display.set_caption('Memory Tasks Human Agent')

  env_settings = dm_memorytasks.EnvironmentSettings(
      seed=FLAGS.seed, level_name=FLAGS.level_name)
  with dm_memorytasks.load_from_docker(name=FLAGS.docker_image_name,
                                       settings=env_settings) as env:
    screen = pygame.display.set_mode(
        (int(FLAGS.screen_size[0]), int(FLAGS.screen_size[1])))

    rgb_spec = env.observation_spec()['RGB_INTERLEAVED']
    surface = pygame.Surface((rgb_spec.shape[1], rgb_spec.shape[0]))

    # Copy the defaults so key events do not mutate the module-level constant.
    actions = dict(_NO_ACTION)
    score = 0
    clock = pygame.time.Clock()
    while True:
      # Do not close with CTRL-C as otherwise the docker container may be left
      # running on exit.
      for event in pygame.event.get():
        if event.type == pygame.QUIT:
          return
        elif event.type == pygame.KEYDOWN:
          if event.key == pygame.K_ESCAPE:
            return
          key_actions = _KEYS_TO_ACTION.get(event.key, {})
          for name, action in key_actions.items():
            actions[name] += action
        elif event.type == pygame.KEYUP:
          key_actions = _KEYS_TO_ACTION.get(event.key, {})
          for name, action in key_actions.items():
            actions[name] -= action

      timestep = env.step(actions)
      frame = np.swapaxes(timestep.observation['RGB_INTERLEAVED'], 0, 1)
      pygame.surfarray.blit_array(surface, frame)
      pygame.transform.smoothscale(surface, screen.get_size(), screen)

      pygame.display.update()

      if timestep.reward:
        score += timestep.reward
        logging.info('Total score: %1.1f, reward: %1.1f', score,
                     timestep.reward)
      clock.tick(_FRAMES_PER_SECOND)


if __name__ == '__main__':
  app.run(main)

--------------------------------------------------------------------------------
/examples/random_agent.py:
--------------------------------------------------------------------------------
# Copyright 2019 DeepMind Technologies Limited. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Example random agent for interacting with DeepMind Memory Tasks."""

from absl import app
from absl import flags
from absl import logging
from dm_env import specs
import dm_memorytasks
import numpy as np

FLAGS = flags.FLAGS

flags.DEFINE_string(
    'docker_image_name', None,
    'Name of the Docker image that contains the Memory Tasks. '
    'If None, uses the default dm_memorytasks name.')

flags.DEFINE_integer('seed', 123, 'Environment seed.')
flags.DEFINE_string('level_name', 'spot_diff_extrapolate',
                    'Name of memory task to run.')


class RandomAgent(object):
  """Basic random agent for DeepMind Memory Tasks."""

  def __init__(self, action_spec):
    self.action_spec = action_spec

  def act(self):
    action = {}

    for name, spec in self.action_spec.items():
      # Uniformly sample BoundedArray actions.
      if isinstance(spec, specs.BoundedArray):
        action[name] = np.random.uniform(spec.minimum, spec.maximum, spec.shape)
      else:
        action[name] = spec.generate_value()
    return action


def main(_):
  env_settings = dm_memorytasks.EnvironmentSettings(
      seed=FLAGS.seed, level_name=FLAGS.level_name)
  with dm_memorytasks.load_from_docker(
      name=FLAGS.docker_image_name, settings=env_settings) as env:
    agent = RandomAgent(env.action_spec())

    timestep = env.reset()
    score = 0
    while not timestep.last():
      action = agent.act()
      timestep = env.step(action)

      if timestep.reward:
        score += timestep.reward
        logging.info('Total score: %1.1f, reward: %1.1f', score,
                     timestep.reward)


if __name__ == '__main__':
  app.run(main)

--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
# Copyright 2019 DeepMind Technologies Limited. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""Install script for setuptools."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import imp
from setuptools import find_packages
from setuptools import setup

setup(
    name='dm-memorytasks',
    version=imp.load_source('_version',
                            'dm_memorytasks/_version.py').__version__,
    description=('DeepMind Memory Tasks, a set of Unity-based machine-'
                 'learning research tasks.'),
    author='DeepMind',
    license='Apache License, Version 2.0',
    keywords='reinforcement-learning python machine learning',
    packages=find_packages(exclude=['examples']),
    install_requires=[
        'absl-py',
        'dm-env',
        'dm-env-rpc<1.1.0',
        'docker',
        'grpcio',
        'numpy',
        'portpicker',
    ],
    tests_require=['nose'],
    python_requires='>=3.7',
    extras_require={'examples': ['pygame']},
    test_suite='nose.collector',
    classifiers=[
        'Development Status :: 5 - Production/Stable',
        'Environment :: Console',
        'Intended Audience :: Science/Research',
        'License :: OSI Approved :: Apache Software License',
        'Operating System :: POSIX :: Linux',
        'Operating System :: Microsoft :: Windows',
        'Operating System :: MacOS :: MacOS X',
        'Programming Language :: Python :: 3.7',
        'Topic :: Scientific/Engineering :: Artificial Intelligence',
    ],
)

--------------------------------------------------------------------------------