├── CONTRIBUTING.md ├── LICENSE ├── README.md ├── RELEASE_NOTES.md ├── dm_fast_mapping ├── __init__.py ├── _load_environment.py ├── _version.py ├── load_from_disk_test.py └── load_from_docker_test.py ├── docs └── index.md ├── examples ├── human_agent.py └── random_agent.py └── setup.py /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # How to Contribute 2 | 3 | We'd love to accept your patches and contributions to this project. There are 4 | just a few small guidelines you need to follow. 5 | 6 | ## Contributor License Agreement 7 | 8 | Contributions to this project must be accompanied by a Contributor License 9 | Agreement. You (or your employer) retain the copyright to your contribution; 10 | this simply gives us permission to use and redistribute your contributions as 11 | part of the project. Head over to to see 12 | your current agreements on file or to sign a new one. 13 | 14 | You generally only need to submit a CLA once, so if you've already submitted one 15 | (even if it was for a different project), you probably don't need to do it 16 | again. 17 | 18 | ## Code reviews 19 | 20 | All submissions, including submissions by project members, require review. We 21 | use GitHub pull requests for this purpose. Consult 22 | [GitHub Help](https://help.github.com/articles/about-pull-requests/) for more 23 | information on using pull requests. 24 | 25 | ## Community Guidelines 26 | 27 | This project follows 28 | [Google's Open Source Community Guidelines](https://opensource.google/conduct/). 29 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | 2 | Apache License 3 | Version 2.0, January 2004 4 | http://www.apache.org/licenses/ 5 | 6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 7 | 8 | 1. Definitions. 
9 | 10 | "License" shall mean the terms and conditions for use, reproduction, 11 | and distribution as defined by Sections 1 through 9 of this document. 12 | 13 | "Licensor" shall mean the copyright owner or entity authorized by 14 | the copyright owner that is granting the License. 15 | 16 | "Legal Entity" shall mean the union of the acting entity and all 17 | other entities that control, are controlled by, or are under common 18 | control with that entity. For the purposes of this definition, 19 | "control" means (i) the power, direct or indirect, to cause the 20 | direction or management of such entity, whether by contract or 21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 22 | outstanding shares, or (iii) beneficial ownership of such entity. 23 | 24 | "You" (or "Your") shall mean an individual or Legal Entity 25 | exercising permissions granted by this License. 26 | 27 | "Source" form shall mean the preferred form for making modifications, 28 | including but not limited to software source code, documentation 29 | source, and configuration files. 30 | 31 | "Object" form shall mean any form resulting from mechanical 32 | transformation or translation of a Source form, including but 33 | not limited to compiled object code, generated documentation, 34 | and conversions to other media types. 35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. 
For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. 
Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. 
You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 
123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. 
In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. 176 | 177 | END OF TERMS AND CONDITIONS 178 | 179 | APPENDIX: How to apply the Apache License to your work. 180 | 181 | To apply the Apache License to your work, attach the following 182 | boilerplate notice, with the fields enclosed by brackets "[]" 183 | replaced with your own identifying information. (Don't include 184 | the brackets!) The text should be enclosed in the appropriate 185 | comment syntax for the file format. 
We also recommend that a 186 | file or class name and description of purpose be included on the 187 | same "printed page" as the copyright notice for easier 188 | identification within third-party archives. 189 | 190 | Copyright [yyyy] [name of copyright owner] 191 | 192 | Licensed under the Apache License, Version 2.0 (the "License"); 193 | you may not use this file except in compliance with the License. 194 | You may obtain a copy of the License at 195 | 196 | http://www.apache.org/licenses/LICENSE-2.0 197 | 198 | Unless required by applicable law or agreed to in writing, software 199 | distributed under the License is distributed on an "AS IS" BASIS, 200 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 201 | See the License for the specific language governing permissions and 202 | limitations under the License. 203 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # `dm_fast_mapping`: DeepMind Fast Language Learning Tasks 2 | 3 | The *DeepMind Fast Language Learning Tasks* is a set of machine-learning tasks 4 | that requires agents to learn the meaning of instruction words either slowly 5 | (i.e. across many episodes), quickly (i.e. within a single episode) or both. 6 | 7 | The tasks in this repo are [Unity-based](http://unity3d.com/). 8 | 9 | ## Overview 10 | 11 | These tasks are provided through pre-packaged 12 | [Docker containers](http://www.docker.com). 13 | 14 | This package consists of support code to run these Docker containers. You 15 | interact with the task environment via a 16 | [`dm_env`](http://www.github.com/deepmind/dm_env) Python interface. 17 | 18 | Please see the [documentation](docs/index.md) for more detailed information on 19 | the available tasks, actions and observations. 
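
As a rough sketch of what interacting through the `dm_env` interface can look like, the helper below runs a single episode against any object that follows the `dm_env.Environment` API (`reset()`, `step(action)`, and timesteps exposing `reward`, `observation` and `last()`). The names `run_episode`, `policy` and `max_steps` are illustrative assumptions and are not part of `dm_fast_mapping`; the action format that `policy` must return is described in the documentation linked above.

```python
def run_episode(env, policy, max_steps=1000):
    """Runs one episode and returns (total_reward, num_steps).

    `env` is assumed to follow the dm_env API. `policy` is a hypothetical
    callable that maps an observation to an action accepted by `env.step`.
    """
    timestep = env.reset()
    total_reward = 0.0
    num_steps = 0
    # Stop at the end of the episode or after `max_steps` steps.
    while not timestep.last() and num_steps < max_steps:
        action = policy(timestep.observation)
        timestep = env.step(action)
        # dm_env timesteps may carry `reward=None`, so treat that as zero.
        total_reward += timestep.reward or 0.0
        num_steps += 1
    return total_reward, num_steps
```

With an environment created as shown in the Usage section, this could be called as `run_episode(env, my_policy)`, where `my_policy` is your own function returning actions that match `env.action_spec()`. Call `env.close()` afterwards so the underlying container is shut down.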
20 |
21 | ## Requirements
22 |
23 | `dm_fast_mapping` requires [Docker](https://www.docker.com),
24 | [Python](https://www.python.org/) 3.6.1 or later and an x86-64 CPU with SSE4.2
25 | support. We do not attempt to maintain a working version for Python 2.
26 |
27 | Note: We recommend using a
28 | [Python virtual environment](https://docs.python.org/3/tutorial/venv.html) to
29 | mitigate conflicts with your system's Python environment.
30 |
31 | Download and install Docker:
32 |
33 | *   For Linux, install [Docker-CE](https://docs.docker.com/install/).
34 | *   For [OSX](https://docs.docker.com/docker-for-mac/install/) or
35 |     [Windows](https://docs.docker.com/docker-for-windows/install/), install
36 |     Docker Desktop.
37 |
38 | ## Installation
39 |
40 | You can install `dm_fast_mapping` by cloning a local copy of our GitHub
41 | repository:
42 |
43 | ```bash
44 | $ git clone https://github.com/deepmind/dm_fast_mapping.git
45 | $ pip install ./dm_fast_mapping
46 | ```
47 |
48 | You can install the dependencies for the `examples/` with:
49 |
50 | ```bash
51 | $ pip install ./dm_fast_mapping[examples]
52 | ```
53 |
54 | ## Usage
55 |
56 | Once `dm_fast_mapping` is installed, to instantiate a `dm_env` instance run the
57 | following:
58 |
59 | ```python
60 | import dm_fast_mapping
61 |
62 | settings = dm_fast_mapping.EnvironmentSettings(seed=123,
63 |     level_name='fast_slow/fast_map_three_objs')
64 | env = dm_fast_mapping.load_from_docker(settings)
65 | ```
66 |
67 | ## Citing
68 |
69 | If you use `dm_fast_mapping` in your work, please cite the accompanying paper:
70 |
71 | ```bibtex
72 | @misc{hill2020grounded,
73 |     title={Grounded Language Learning Fast and Slow},
74 |     author={Felix Hill and
75 |             Olivier Tieleman and
76 |             Tamara von Glehn and
77 |             Nathaniel Wong and
78 |             Hamza Merzic and
79 |             Stephen Clark},
80 |     year={2020},
81 |     eprint={2009.01719},
82 |     archivePrefix={arXiv},
83 |     primaryClass={cs.CL}
84 | }
85 | ```
86 |
87 | For the `with_distractors` tasks, please also
cite the source for those tasks: 88 | 89 | ```bibtex 90 | @misc{lampinen2021towards, 91 | title={Towards mental time travel: 92 | a hierarchical memory for reinforcement learning agents}, 93 | author={Lampinen, Andrew Kyle and 94 | Chan, Stephanie C Y and 95 | Banino, Andrea and 96 | Hill, Felix}, 97 | archivePrefix={arXiv}, 98 | eprint={2105.14039}, 99 | year={2021}, 100 | primaryClass={cs.LG} 101 | } 102 | ``` 103 | 104 | ## Notice 105 | 106 | This is not an officially supported Google product. 107 | -------------------------------------------------------------------------------- /RELEASE_NOTES.md: -------------------------------------------------------------------------------- 1 | # Release Notes 2 | 3 | ## [1.0.0] 4 | 5 | * Initial release. 6 | 7 | ## [1.1.0] 8 | 9 | * Added tasks from the paper "Towards mental time travel: a hierarchical 10 | memory for reinforcement learning agents." 11 | 12 | ## [1.1.1] 13 | 14 | * Added additional tasks from the revised paper "Towards mental time travel: a 15 | hierarchical memory for reinforcement learning agents." 16 | -------------------------------------------------------------------------------- /dm_fast_mapping/__init__.py: -------------------------------------------------------------------------------- 1 | # Copyright 2019 DeepMind Technologies Limited. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 
14 | # ============================================================================ 15 | """Python utilities for running dm_fast_mapping.""" 16 | 17 | from dm_fast_mapping import _load_environment 18 | from dm_fast_mapping._version import __version__ 19 | 20 | EnvironmentSettings = _load_environment.EnvironmentSettings 21 | 22 | FAST_MAPPING_TASK_LEVEL_NAMES = _load_environment.FAST_MAPPING_TASK_LEVEL_NAMES 23 | 24 | load_from_disk = _load_environment.load_from_disk 25 | load_from_docker = _load_environment.load_from_docker 26 | -------------------------------------------------------------------------------- /dm_fast_mapping/_load_environment.py: -------------------------------------------------------------------------------- 1 | # Lint as: python3 2 | # Copyright 2019 DeepMind Technologies Limited. All Rights Reserved. 3 | # 4 | # Licensed under the Apache License, Version 2.0 (the "License"); 5 | # you may not use this file except in compliance with the License. 6 | # You may obtain a copy of the License at 7 | # 8 | # http://www.apache.org/licenses/LICENSE-2.0 9 | # 10 | # Unless required by applicable law or agreed to in writing, software 11 | # distributed under the License is distributed on an "AS IS" BASIS, 12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | # See the License for the specific language governing permissions and 14 | # limitations under the License. 
15 | # ============================================================================ 16 | """Python utility functions for loading DeepMind Fast Language Learning Tasks.""" 17 | 18 | import codecs 19 | import collections 20 | import json 21 | import os 22 | import re 23 | import subprocess 24 | import time 25 | import typing 26 | 27 | from absl import logging 28 | import dm_env 29 | import docker 30 | import grpc 31 | import numpy as np 32 | import portpicker 33 | 34 | from dm_env_rpc.v1 import connection as dm_env_rpc_connection 35 | from dm_env_rpc.v1 import dm_env_adaptor 36 | from dm_env_rpc.v1 import dm_env_rpc_pb2 37 | from dm_env_rpc.v1 import error 38 | from dm_env_rpc.v1 import tensor_utils 39 | 40 | # Maximum number of times to attempt gRPC connection. 41 | _MAX_CONNECTION_ATTEMPTS = 10 42 | 43 | # Port to expect the docker environment to internally listen on. 44 | _DOCKER_INTERNAL_GRPC_PORT = 10000 45 | 46 | _DEFAULT_DOCKER_IMAGE_NAME = 'gcr.io/deepmind-environments/dm_fast_mapping:v1.1.1' 47 | 48 | _FAST_MAPPING_TASK_OBSERVATIONS = ['RGB_INTERLEAVED', 'TEXT'] 49 | 50 | FAST_MAPPING_TASK_LEVEL_NAMES = frozenset(( 51 | 'architecture_comparison/fast_map_three_objs', 52 | 'fast_slow/fast_map_three_objs_bed_tray', 53 | 'fast_slow/fast_map_three_objs_bed_tray_putting_near', 54 | 'fast_slow/fast_map_three_objs_bed_tray_putting_on', 55 | 'fast_slow/fast_map_three_objs', 56 | 'fast_slow/slow_learn_three_objs_bed_tray_lifting', 57 | 'fast_slow/slow_learn_three_objs_bed_tray_putting_near', 58 | 'fast_slow/slow_learn_three_objs_bed_tray_putting_on', 59 | 'fast_slow/test_holdout_fast_map_three_objs_bed_tray_putting_on', 60 | 'fast_slow/two_phase_slow_learn_three_objs_bed_tray_putting_near', 61 | 'fast_slow/two_phase_slow_learn_three_objs_bed_tray_putting_on', 62 | 'intrinsic_motivation/fast_map_three_objs_no_shaping_reward', 63 | 'new_obj_generalization/fast_map_heldout_test_objs', 64 | 'new_obj_generalization/fast_map_three_objs_global_five', 65 | 
'new_obj_generalization/fast_map_three_objs_global_ten', 66 | 'new_obj_generalization/fast_map_three_objs_global_three', 67 | 'new_obj_generalization/fast_map_three_objs_global_twenty', 68 | 'num_generalization/fast_map_eight_objs', 69 | 'num_generalization/fast_map_five_objs', 70 | 'num_generalization/fast_map_three_objs', 71 | 'with_distractors/eval_fast_map_two_episodes_three_objs_five_distractor', 72 | 'with_distractors/eval_fast_map_three_episodes_three_objs_five_distractor', 73 | 'with_distractors/eval_fast_map_four_episodes_three_objs_no_distractor', 74 | 'with_distractors/eval_fast_map_four_episodes_three_objs_one_distractor', 75 | 'with_distractors/eval_fast_map_three_objs_ten_distractor', 76 | 'with_distractors/eval_fast_map_three_objs_twenty_distractor', 77 | 'with_distractors/fast_map_three_objs_no_distractor', 78 | 'with_distractors/fast_map_three_objs_one_distractor', 79 | 'with_distractors/fast_map_three_objs_two_distractor', 80 | )) 81 | 82 | _ConnectionDetails = collections.namedtuple('_ConnectionDetails', 83 | ['channel', 'connection', 'specs']) 84 | 85 | 86 | class _FastMappingTasksEnv(dm_env_adaptor.DmEnvAdaptor): 87 | """An implementation of dm_env_rpc.DmEnvAdaptor for Fast Language Learning tasks.""" 88 | 89 | def __init__(self, connection_details, requested_observations, 90 | num_action_repeats): 91 | super(_FastMappingTasksEnv, 92 | self).__init__(connection_details.connection, 93 | connection_details.specs, requested_observations) 94 | self._channel = connection_details.channel 95 | self._num_action_repeats = num_action_repeats 96 | 97 | def close(self): 98 | super(_FastMappingTasksEnv, self).close() 99 | self._channel.close() 100 | 101 | def step(self, action): 102 | """Implementation of dm_env.step that supports repeated actions.""" 103 | 104 | timestep = None 105 | discount = None 106 | reward = None 107 | for _ in range(self._num_action_repeats): 108 | next_timestep = super(_FastMappingTasksEnv, self).step(action) 109 | 110 | # 
Accumulate reward per timestep.
111 |       if next_timestep.reward is not None:
112 |         reward = (reward or 0.) + next_timestep.reward
113 |
114 |       # Calculate the product for discount.
115 |       if next_timestep.discount is not None:
116 |         discount = discount if discount else []
117 |         discount.append(next_timestep.discount)
118 |
119 |       timestep = dm_env.TimeStep(next_timestep.step_type, reward,
120 |                                  # Note: np.prod(None) returns None.
121 |                                  np.prod(discount),
122 |                                  next_timestep.observation)
123 |
124 |       if timestep.last():
125 |         return timestep
126 |
127 |     return timestep
128 |
129 |
130 | class _FastMappingTasksContainerEnv(_FastMappingTasksEnv):
131 |   """An implementation of _FastMappingTasksEnv.
132 |
133 |   Ensures that the provided Docker container is closed on exit.
134 |   """
135 |
136 |   def __init__(self, connection_details, requested_observations,
137 |                num_action_repeats, container):
138 |     super(_FastMappingTasksContainerEnv,
139 |           self).__init__(connection_details, requested_observations,
140 |                          num_action_repeats)
141 |     self._container = container
142 |
143 |   def close(self):
144 |     super(_FastMappingTasksContainerEnv, self).close()
145 |     try:
146 |       self._container.kill()
147 |     except docker.errors.NotFound:
148 |       pass  # Ignore, container has already been closed.
149 |
150 |
151 | class _FastMappingTasksProcessEnv(_FastMappingTasksEnv):
152 |   """An implementation of _FastMappingTasksEnv.
153 |
154 |   Ensures that the provided running process is closed on exit.
155 |   """
156 |
157 |   def __init__(self, connection_details, requested_observations,
158 |                num_action_repeats, process):
159 |     super(_FastMappingTasksProcessEnv,
160 |           self).__init__(connection_details, requested_observations,
161 |                          num_action_repeats)
162 |     self._process = process
163 |
164 |   def close(self):
165 |     super(_FastMappingTasksProcessEnv, self).close()
166 |     self._process.terminate()
167 |     self._process.wait()
168 |
169 |
170 | def _check_grpc_channel_ready(channel):
171 |   """Checks that the gRPC channel is ready, retrying up to N times."""
172 |   for _ in range(_MAX_CONNECTION_ATTEMPTS - 1):
173 |     try:
174 |       return grpc.channel_ready_future(channel).result(timeout=1)
175 |     except grpc.FutureTimeoutError:
176 |       pass
177 |   return grpc.channel_ready_future(channel).result(timeout=1)
178 |
179 |
180 | def _can_send_message(connection):
181 |   """Returns whether `connection` is healthy and able to process requests."""
182 |   try:
183 |     # This should return a response with an error unless the server isn't yet
184 |     # receiving requests.
185 |     connection.send(dm_env_rpc_pb2.StepRequest())
186 |   except error.DmEnvRpcError:
187 |     return True
188 |   except grpc.RpcError:
189 |     return False
190 |
191 |
192 | def _create_channel_and_connection(port):
193 |   """Returns a tuple of `(channel, connection)`."""
194 |   for _ in range(_MAX_CONNECTION_ATTEMPTS):
195 |     channel = grpc.secure_channel('localhost:{}'.format(port),
196 |                                   grpc.local_channel_credentials())
197 |     _check_grpc_channel_ready(channel)
198 |     connection = dm_env_rpc_connection.Connection(channel)
199 |     if _can_send_message(connection):
200 |       break
201 |     else:
202 |       # A gRPC server running within Docker sometimes reports that the channel
203 |       # is ready but transiently returns an error (status code 14) on first
204 |       # use. Giving the server some time to breathe and retrying often fixes
205 |       # the problem.
206 | connection.close() 207 | channel.close() 208 | time.sleep(1.0) 209 | 210 | return channel, connection 211 | 212 | 213 | def _parse_exception_message(message): 214 | """Returns a human-readable version of a dm_env_rpc json error message.""" 215 | try: 216 | match = re.match(r'^message\:\ \"(.*)\"$', message) 217 | json_data = codecs.decode(match.group(1), 'unicode-escape') 218 | parsed_json_data = json.loads(json_data) 219 | return ValueError(json.dumps(parsed_json_data, indent=4)) 220 | except: # pylint: disable=bare-except 221 | return message 222 | 223 | 224 | def _wrap_send(send): 225 | """Wraps `send` in order to reformat exceptions.""" 226 | try: 227 | return send() 228 | except ValueError as e: 229 | e.args = [_parse_exception_message(e.args[0])] 230 | raise 231 | 232 | 233 | def _connect_to_environment(port, settings): 234 | """Helper function for connecting to a running dm_fast_mapping environment.""" 235 | if settings.level_name not in FAST_MAPPING_TASK_LEVEL_NAMES: 236 | raise ValueError( 237 | 'Level named "{}" is not a valid dm_fast_mapping level.'.format( 238 | settings.level_name)) 239 | channel, connection = _create_channel_and_connection(port) 240 | original_send = connection.send 241 | connection.send = lambda request: _wrap_send(lambda: original_send(request)) 242 | world_name = connection.send( 243 | dm_env_rpc_pb2.CreateWorldRequest( 244 | settings={ 245 | 'seed': tensor_utils.pack_tensor(settings.seed), 246 | 'episodeId': tensor_utils.pack_tensor(0), 247 | 'levelName': tensor_utils.pack_tensor(settings.level_name), 248 | })).world_name 249 | join_world_settings = { 250 | 'width': 251 | tensor_utils.pack_tensor(settings.width), 252 | 'height': 253 | tensor_utils.pack_tensor(settings.height), 254 | 'EpisodeLengthSeconds': 255 | tensor_utils.pack_tensor(settings.episode_length_seconds), 256 | 'ShowReachabilityHUD': tensor_utils.pack_tensor(False), 257 | } 258 | specs = connection.send( 259 | dm_env_rpc_pb2.JoinWorldRequest( 260 | 
world_name=world_name, settings=join_world_settings)).specs 261 | return _ConnectionDetails(channel=channel, connection=connection, specs=specs) 262 | 263 | 264 | class EnvironmentSettings(typing.NamedTuple): 265 | """Collection of settings used to start a specific Fast Language Learning task. 266 | 267 | Required attributes: 268 | seed: Seed to initialize the environment's RNG. 269 | level_name: Name of the level to load. 270 | Optional attributes: 271 | width: Width (in pixels) of the desired RGB observation; defaults to 96. 272 | height: Height (in pixels) of the desired RGB observation; defaults to 72. 273 | episode_length_seconds: Maximum episode length (in seconds); defaults to 274 | 120. 275 | num_action_repeats: Number of times to step the environment with the 276 | provided action in calls to `step()`. 277 | """ 278 | seed: int 279 | level_name: str 280 | width: int = 96 281 | height: int = 72 282 | episode_length_seconds: float = 120.0 283 | num_action_repeats: int = 1 284 | 285 | 286 | def _validate_environment_settings(settings): 287 | """Helper function to validate the provided environment settings.""" 288 | if settings.episode_length_seconds <= 0.0: 289 | raise ValueError('episode_length_seconds must have a positive value.') 290 | if settings.num_action_repeats <= 0: 291 | raise ValueError('num_action_repeats must have a positive value.') 292 | if settings.width <= 0 or settings.height <= 0: 293 | raise ValueError('width and height must have a positive value.') 294 | if ('with_distractors/' in settings.level_name and 295 | settings.episode_length_seconds != 450.0): 296 | raise ValueError( 297 | 'episode_length_seconds must be 450.0 for with_distractors/ levels.') 298 | 299 | 300 | def load_from_disk(path, settings): 301 | """Load Fast Language Learning Tasks from disk. 302 | 303 | Args: 304 | path: Directory containing dm_fast_mapping environment. 305 | settings: EnvironmentSettings required to start the environment. 
306 | 307 | Returns: 308 | An implementation of dm_env.Environment. 309 | 310 | Raises: 311 | RuntimeError: If unable to start environment process. 312 | """ 313 | _validate_environment_settings(settings) 314 | 315 | executable_path = os.path.join(path, 'Linux64Player') 316 | libosmesa_path = os.path.join(path, 'external_libosmesa_llvmpipe.so') 317 | if not os.path.exists(executable_path) or not os.path.exists(libosmesa_path): 318 | raise RuntimeError( 319 | 'Cannot find dm_fast_mapping executable or dependent files at path: {}' 320 | .format(path)) 321 | 322 | port = portpicker.pick_unused_port() 323 | 324 | process_flags = [ 325 | executable_path, 326 | # Unity command-line flags. 327 | '-logfile', 328 | '-batchmode', 329 | '-noaudio', 330 | # Other command-line flags. 331 | '--logtostderr', 332 | '--server_type=GRPC', 333 | '--uri_address=[::]:{}'.format(port), 334 | ] 335 | 336 | os.environ.update({ 337 | 'UNITY_RENDERER': 'software', 338 | 'UNITY_OSMESA_PATH': libosmesa_path, 339 | }) 340 | 341 | process = subprocess.Popen( 342 | process_flags, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL) 343 | if process.poll() is not None: 344 | raise RuntimeError('Failed to start dm_fast_mapping process correctly.') 345 | 346 | return _FastMappingTasksProcessEnv( 347 | _connect_to_environment(port, settings), _FAST_MAPPING_TASK_OBSERVATIONS, 348 | settings.num_action_repeats, process) 349 | 350 | 351 | def load_from_docker(settings, name=None): 352 | """Load Fast Language Learning Tasks from docker container. 353 | 354 | Args: 355 | settings: EnvironmentSettings required to start the environment. 356 | name: Optional name of Docker image that contains the dm_fast_mapping 357 | environment. If left unset, uses the dm_fast_mapping default name. 
358 | 359 | Returns: 360 | An implementation of dm_env.Environment 361 | """ 362 | _validate_environment_settings(settings) 363 | 364 | name = name or _DEFAULT_DOCKER_IMAGE_NAME 365 | client = docker.from_env() 366 | 367 | port = portpicker.pick_unused_port() 368 | 369 | try: 370 | client.images.get(name) 371 | except docker.errors.ImageNotFound: 372 | logging.info('Downloading docker image "%s"...', name) 373 | client.images.pull(name) 374 | logging.info('Download finished.') 375 | 376 | container = client.containers.run( 377 | name, 378 | auto_remove=True, 379 | detach=True, 380 | ports={_DOCKER_INTERNAL_GRPC_PORT: port}) 381 | 382 | return _FastMappingTasksContainerEnv( 383 | _connect_to_environment(port, settings), _FAST_MAPPING_TASK_OBSERVATIONS, 384 | settings.num_action_repeats, container) 385 | -------------------------------------------------------------------------------- /dm_fast_mapping/_version.py: -------------------------------------------------------------------------------- 1 | # Copyright 2019 DeepMind Technologies Limited. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================ 15 | """Package version for dm_fast_mapping. 16 | 17 | Kept in separate file so it can be used during installation. 
18 | """ 19 | 20 | __version__ = '1.0.0' # https://www.python.org/dev/peps/pep-0440/ 21 | -------------------------------------------------------------------------------- /dm_fast_mapping/load_from_disk_test.py: -------------------------------------------------------------------------------- 1 | # Copyright 2019 DeepMind Technologies Limited. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================ 15 | """Tests for dm_fast_mapping.load_from_disk.""" 16 | 17 | from absl import flags 18 | from absl.testing import absltest 19 | from absl.testing import parameterized 20 | from dm_env import test_utils 21 | import dm_fast_mapping 22 | 23 | FLAGS = flags.FLAGS 24 | 25 | flags.DEFINE_string('path', '', 26 | 'Directory that contains dm_fast_mapping environment.') 27 | 28 | 29 | class LoadFromDiskTest(test_utils.EnvironmentTestMixin, absltest.TestCase): 30 | 31 | def make_object_under_test(self): 32 | return dm_fast_mapping.load_from_disk( 33 | FLAGS.path, 34 | settings=dm_fast_mapping.EnvironmentSettings( 35 | seed=123, level_name='architecture_comparison/fast_map_three_objs')) 36 | 37 | 38 | class FastMappingTaskTest(parameterized.TestCase): 39 | 40 | @parameterized.parameters(dm_fast_mapping.FAST_MAPPING_TASK_LEVEL_NAMES) 41 | def test_load_level(self, level_name): 42 | self.assertIsNotNone( 43 | dm_fast_mapping.load_from_disk( 44 | 
FLAGS.path, 45 | settings=dm_fast_mapping.EnvironmentSettings( 46 | seed=123, level_name=level_name))) 47 | 48 | 49 | if __name__ == '__main__': 50 | absltest.main() 51 | -------------------------------------------------------------------------------- /dm_fast_mapping/load_from_docker_test.py: -------------------------------------------------------------------------------- 1 | # Copyright 2019 DeepMind Technologies Limited. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================ 15 | """Tests for dm_fast_mapping.load_from_docker.""" 16 | 17 | from absl import flags 18 | from absl.testing import absltest 19 | from absl.testing import parameterized 20 | from dm_env import test_utils 21 | import dm_fast_mapping 22 | 23 | FLAGS = flags.FLAGS 24 | 25 | flags.DEFINE_string( 26 | 'docker_image_name', None, 27 | 'Name of the Docker image that contains the Fast Language Learning Tasks. 
' 28 | 'If None, uses the default dm_fast_mapping name') 29 | 30 | 31 | class LoadFromDockerTest(test_utils.EnvironmentTestMixin, absltest.TestCase): 32 | 33 | def make_object_under_test(self): 34 | return dm_fast_mapping.load_from_docker( 35 | name=FLAGS.docker_image_name, 36 | settings=dm_fast_mapping.EnvironmentSettings( 37 | seed=123, level_name='architecture_comparison/fast_map_three_objs')) 38 | 39 | 40 | class FastMappingTaskTest(parameterized.TestCase): 41 | 42 | @parameterized.parameters(dm_fast_mapping.FAST_MAPPING_TASK_LEVEL_NAMES) 43 | def test_load_level(self, level_name): 44 | self.assertIsNotNone( 45 | dm_fast_mapping.load_from_docker( 46 | name=FLAGS.docker_image_name, 47 | settings=dm_fast_mapping.EnvironmentSettings( 48 | seed=123, level_name=level_name))) 49 | 50 | 51 | if __name__ == '__main__': 52 | absltest.main() 53 | -------------------------------------------------------------------------------- /docs/index.md: -------------------------------------------------------------------------------- 1 | # Tasks 2 | 3 | The available values for `level_name` are as follows: 4 | 5 | * 'architecture_comparison/fast_map_three_objs' 6 | * 'num_generalization/fast_map_eight_objs' 7 | * 'num_generalization/fast_map_five_objs' 8 | * 'num_generalization/fast_map_three_objs' 9 | * 'new_obj_generalization/fast_map_three_objs_global_five' 10 | * 'new_obj_generalization/fast_map_three_objs_global_ten' 11 | * 'new_obj_generalization/fast_map_three_objs_global_three' 12 | * 'new_obj_generalization/fast_map_three_objs_global_twenty' 13 | * 'new_obj_generalization/fast_map_heldout_test_objs' 14 | * 'intrinsic_motivation/fast_map_three_objs_no_shaping_reward' 15 | * 'fast_slow/fast_map_three_objs' 16 | * 'fast_slow/fast_map_three_objs_bed_tray' 17 | * 'fast_slow/fast_map_three_objs_bed_tray_putting_near' 18 | * 'fast_slow/fast_map_three_objs_bed_tray_putting_on' 19 | * 'fast_slow/slow_learn_three_objs_bed_tray_lifting' 20 | * 
'fast_slow/slow_learn_three_objs_bed_tray_putting_near' 21 | * 'fast_slow/slow_learn_three_objs_bed_tray_putting_on' 22 | * 'fast_slow/two_phase_slow_learn_three_objs_bed_tray_putting_near' 23 | * 'fast_slow/two_phase_slow_learn_three_objs_bed_tray_putting_on' 24 | * 'fast_slow/test_holdout_fast_map_three_objs_bed_tray_putting_on' 25 | * 'with_distractors/eval_fast_map_two_episodes_three_objs_five_distractor' 26 | * 'with_distractors/eval_fast_map_three_episodes_three_objs_five_distractor' 27 | * 'with_distractors/eval_fast_map_four_episodes_three_objs_no_distractor' 28 | * 'with_distractors/eval_fast_map_four_episodes_three_objs_one_distractor' 29 | * 'with_distractors/eval_fast_map_three_objs_ten_distractor' 30 | * 'with_distractors/eval_fast_map_three_objs_twenty_distractor' 31 | * 'with_distractors/fast_map_three_objs_no_distractor' 32 | * 'with_distractors/fast_map_three_objs_one_distractor' 33 | * 'with_distractors/fast_map_three_objs_two_distractor' 34 | 35 | # Experiments from "Grounded Language Learning: Fast and Slow" 36 | 37 | These tasks correspond to different experiments: 38 | 39 | 1. architecture_comparison (Table 1, Section 4.0): 40 | 41 | * Train on 'architecture_comparison/fast_map_three_objs' 42 | 43 | 2. num_generalization (Figure 2, Section 4.1). E.g.: 44 | 45 | * Train on 'num_generalization/fast_map_three_objs' 46 | * Test on 'num_generalization/fast_map_five_objs', 47 | 'num_generalization/fast_map_eight_objs' 48 | 49 | 3. new_obj_generalization (Figure 3, Section 4.1). E.g.: 50 | 51 | * Train on 'new_obj_generalization/fast_map_three_objs_global_ten' 52 | * Test on 'new_obj_generalization/fast_map_heldout_test_objs' 53 | 54 | 4. intrinsic_motivation (Figure 5, Section 4.2): 55 | 56 | * Train on 'intrinsic_motivation/fast_map_three_objs_no_shaping_reward' 57 | 58 | 5. fast_slow (Figure 6, Section 4.3). E.g.
(unfamiliar objects, unfamiliar 59 | task): 60 | 61 | * Train on 'fast_slow/slow_learn_three_objs_bed_tray_lifting', 62 | 'fast_slow/slow_learn_three_objs_bed_tray_putting_near', 63 | 'fast_slow/slow_learn_three_objs_bed_tray_putting_on', 64 | 'fast_slow/fast_map_three_objs', 65 | 'fast_slow/fast_map_three_objs_bed_tray' 66 | * Test on 'fast_slow/test_holdout_fast_map_three_objs_bed_tray_putting_on' 67 | 68 | The sections refer to the version of the paper hosted on arXiv on 1 November 69 | 2020 ([arxiv](https://arxiv.org/pdf/2009.01719.pdf)). Note that we do not 70 | release the experiments involving ShapeNet assets (Figure 4, Section 4.1) for 71 | copyright reasons. 72 | 73 | # Experiments from "Towards mental time travel: a hierarchical memory for RL..." 74 | 75 | The tasks prefixed with `with_distractors` are the rapid-word-learning tasks 76 | from Figure 5, Section 3.3: 77 | 78 | 1. Length generalization (Fig. 5d): 79 | 80 | * Train on 'with_distractors/fast_map_three_objs_no_distractor' 81 | 'with_distractors/fast_map_three_objs_one_distractor' 82 | 'with_distractors/fast_map_three_objs_two_distractor' 83 | * Test on 'with_distractors/eval_fast_map_three_objs_ten_distractor' 84 | 85 | 2. Generalization to multi-episode evaluation (Fig. 5e-f) with: 86 | 87 | * Train on same as previous: 88 | 'with_distractors/fast_map_three_objs_no_distractor' 89 | 'with_distractors/fast_map_three_objs_one_distractor' 90 | 'with_distractors/fast_map_three_objs_two_distractor' 91 | * Test on 92 | 'with_distractors/eval_fast_map_four_episodes_three_objs_no_distractor' 93 | 'with_distractors/eval_fast_map_two_episodes_three_objs_five_distractor' 94 | 95 | The section and figure numbers refer to the updated paper version 96 | ([arXiv](https://arxiv.org/abs/2105.14039)) that was posted in October, 2021. 
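The `with_distractors/` levels need a longer episode than the default: the loader's settings validation rejects them unless `episode_length_seconds` is exactly 450.0, and the bundled example agents use 450.0 for these levels and 120.0 otherwise. A minimal sketch of that choice follows. `settings_kwargs_for_level` is a hypothetical helper name, not part of `dm_fast_mapping`, and it only builds keyword arguments, so it runs without the package installed.

```python
# Hypothetical helper (not part of dm_fast_mapping): choose an episode length
# that passes the loader's validation for a given level name.

_WITH_DISTRACTORS_EPISODE_SECONDS = 450.0  # Required for with_distractors/ levels.
_DEFAULT_EPISODE_SECONDS = 120.0  # Default value of EnvironmentSettings.


def settings_kwargs_for_level(level_name, seed=123):
  """Returns keyword arguments suitable for EnvironmentSettings."""
  if 'with_distractors/' in level_name:
    episode_length_seconds = _WITH_DISTRACTORS_EPISODE_SECONDS
  else:
    episode_length_seconds = _DEFAULT_EPISODE_SECONDS
  return {
      'seed': seed,
      'level_name': level_name,
      'episode_length_seconds': episode_length_seconds,
  }


kwargs = settings_kwargs_for_level(
    'with_distractors/fast_map_three_objs_no_distractor')
print(kwargs['episode_length_seconds'])  # 450.0
```

With the package installed, the result can be passed on as `dm_fast_mapping.EnvironmentSettings(**kwargs)` and then handed to `load_from_docker` or `load_from_disk`.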
97 | 98 | # Actions 99 | 100 | The environment provides the following actions: 101 | 102 | * `STRAFE_LEFT_RIGHT` 103 | * `MOVE_BACK_FORWARD` 104 | * `LOOK_LEFT_RIGHT` 105 | * `LOOK_DOWN_UP` 106 | * `HAND_ROTATE_AROUND_RIGHT` 107 | * `HAND_ROTATE_AROUND_UP` 108 | * `HAND_ROTATE_AROUND_FORWARD` 109 | * `HAND_PUSH_PULL` 110 | * `HAND_GRIP` 111 | 112 | Each action is a `double` scalar, with an inclusive range of `[-1.0, 1.0]` 113 | except for `HAND_GRIP`, which is a binary action taking values `0` or `1`. It is 114 | not compulsory to send a value for each action every step, but note that actions 115 | are "sticky", meaning an action's value will only change when a new value is 116 | provided. For example: 117 | 118 | ```python 119 | env = dm_fast_mapping.load_from_docker(settings) 120 | env.reset() 121 | env.step({'STRAFE_LEFT_RIGHT': -1.0}) # Result: strafe left. 122 | env.step({'MOVE_BACK_FORWARD': 1.0}) # Result: strafe left & move forward. 123 | 124 | env.step({'STRAFE_LEFT_RIGHT': 0.0, 125 | 'MOVE_BACK_FORWARD': 0.0}) # Result: stationary. 126 | ``` 127 | 128 | Note that when using the provided script `human_agent.py` to try the tasks, only 129 | the `STRAFE_LEFT_RIGHT` (keys `a`, `d`), `MOVE_BACK_FORWARD` (`s`, `w`), 130 | `LOOK_LEFT_RIGHT` (`left_arrow`, `right_arrow`), `LOOK_DOWN_UP` (`down_arrow`, 131 | `up_arrow`) and `HAND_GRIP` (`spacebar`) are available. 132 | 133 | # Observations 134 | 135 | For the 8 Unity-based tasks, the environment provides the following 136 | observations: 137 | 138 | * `RGB_INTERLEAVED`: First person RGB camera observation. The `width` and 139 | `height` can be adjusted through the `EnvironmentSettings`, but the 140 | observation will always have a fixed 4:3 aspect ratio. 141 | * `TEXT`: A string indicating the instructions or language information 142 | provided by the environment. 143 | 144 | # Configurable environment settings 145 | 146 | Required attributes: 147 | 148 | * `seed`: Seed to initialize the environment's RNG.
149 | * `level_name`: Name of the level to load. 150 | 151 | Optional attributes: 152 | 153 | * `width`: Width (in pixels) of the desired RGB observation; defaults to 96. 154 | * `height`: Height (in pixels) of the desired RGB observation; defaults to 72. 155 | * `episode_length_seconds`: Maximum episode length (in seconds); defaults 156 | to 120. 157 | * `num_action_repeats`: Number of times to step the environment with the 158 | provided action in calls to `step()`. 159 | -------------------------------------------------------------------------------- /examples/human_agent.py: -------------------------------------------------------------------------------- 1 | # Copyright 2019 DeepMind Technologies Limited. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================ 15 | """Example human agent for interacting with DeepMind Fast Language Learning Tasks.""" 16 | 17 | from absl import app 18 | from absl import flags 19 | from absl import logging 20 | import dm_fast_mapping 21 | import numpy as np 22 | import pygame 23 | 24 | FLAGS = flags.FLAGS 25 | 26 | flags.DEFINE_list( 27 | 'screen_size', [640, 480], 28 | 'Screen width/height in pixels. 
Scales the environment RGB observations to ' 29 | 'fit the screen size.') 30 | 31 | flags.DEFINE_string( 32 | 'docker_image_name', None, 33 | 'Name of the Docker image that contains the Fast Language Learning Tasks. ' 34 | 'If None, uses the default dm_fast_mapping name') 35 | 36 | flags.DEFINE_integer('seed', 123, 'Environment seed.') 37 | flags.DEFINE_string('level_name', 'fast_slow/fast_map_three_objs', 38 | 'Name of task to run.') 39 | 40 | _FRAMES_PER_SECOND = 30 41 | 42 | _KEYS_TO_ACTION = { 43 | pygame.K_w: {'MOVE_BACK_FORWARD': 1}, 44 | pygame.K_s: {'MOVE_BACK_FORWARD': -1}, 45 | pygame.K_a: {'STRAFE_LEFT_RIGHT': -1}, 46 | pygame.K_d: {'STRAFE_LEFT_RIGHT': 1}, 47 | pygame.K_UP: {'LOOK_DOWN_UP': -1}, 48 | pygame.K_DOWN: {'LOOK_DOWN_UP': 1}, 49 | pygame.K_LEFT: {'LOOK_LEFT_RIGHT': -1}, 50 | pygame.K_RIGHT: {'LOOK_LEFT_RIGHT': 1}, 51 | pygame.K_SPACE: {'HAND_GRIP': 1}, 52 | } # pyformat: disable 53 | _NO_ACTION = { 54 | 'MOVE_BACK_FORWARD': 0, 55 | 'STRAFE_LEFT_RIGHT': 0, 56 | 'LOOK_LEFT_RIGHT': 0, 57 | 'LOOK_DOWN_UP': 0, 58 | 'HAND_GRIP': 0, 59 | } 60 | 61 | 62 | def main(_): 63 | pygame.init() 64 | try: 65 | pygame.mixer.quit() 66 | except NotImplementedError: 67 | pass 68 | pygame.display.set_caption('Fast Language Learning Tasks Human Agent') 69 | 70 | if 'with_distractors' in FLAGS.level_name: # for the tasks from the HTM paper 71 | episode_length_seconds = 450.0 72 | else: 73 | episode_length_seconds = 120.0 74 | env_settings = dm_fast_mapping.EnvironmentSettings( 75 | seed=FLAGS.seed, level_name=FLAGS.level_name, 76 | episode_length_seconds=episode_length_seconds) 77 | with dm_fast_mapping.load_from_docker(name=FLAGS.docker_image_name, 78 | settings=env_settings) as env: 79 | screen = pygame.display.set_mode( 80 | (int(FLAGS.screen_size[0]), int(FLAGS.screen_size[1]))) 81 | 82 | rgb_spec = env.observation_spec()['RGB_INTERLEAVED'] 83 | surface = pygame.Surface((rgb_spec.shape[1], rgb_spec.shape[0])) 84 | 85 | actions = _NO_ACTION 86 | score = 0 87 | 
clock = pygame.time.Clock() 88 | while True: 89 | # Do not close with CTRL-C as otherwise the docker container may be left 90 | # running on exit. 91 | for event in pygame.event.get(): 92 | if event.type == pygame.QUIT: 93 | return 94 | elif event.type == pygame.KEYDOWN: 95 | if event.key == pygame.K_ESCAPE: 96 | return 97 | key_actions = _KEYS_TO_ACTION.get(event.key, {}) 98 | for name, action in key_actions.items(): 99 | actions[name] += action 100 | elif event.type == pygame.KEYUP: 101 | key_actions = _KEYS_TO_ACTION.get(event.key, {}) 102 | for name, action in key_actions.items(): 103 | actions[name] -= action 104 | 105 | timestep = env.step(actions) 106 | frame = np.swapaxes(timestep.observation['RGB_INTERLEAVED'], 0, 1) 107 | font = pygame.font.SysFont('Sans', 10) 108 | pygame.surfarray.blit_array(surface, frame) 109 | text = font.render(timestep.observation['TEXT'], True, (0, 0, 0)) 110 | surface.blit(text, (0, 0)) 111 | pygame.transform.smoothscale(surface, screen.get_size(), screen) 112 | 113 | pygame.display.update() 114 | 115 | if timestep.reward: 116 | score += timestep.reward 117 | logging.info('Total score: %1.1f, reward: %1.1f', score, 118 | timestep.reward) 119 | clock.tick(_FRAMES_PER_SECOND) 120 | 121 | 122 | if __name__ == '__main__': 123 | app.run(main) 124 | -------------------------------------------------------------------------------- /examples/random_agent.py: -------------------------------------------------------------------------------- 1 | # Copyright 2019 DeepMind Technologies Limited. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 
5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================ 15 | """Example random agent for interacting with DeepMind Fast Mapping Tasks.""" 16 | 17 | from absl import app 18 | from absl import flags 19 | from absl import logging 20 | from dm_env import specs 21 | import dm_fast_mapping 22 | import numpy as np 23 | 24 | FLAGS = flags.FLAGS 25 | 26 | flags.DEFINE_string( 27 | 'docker_image_name', None, 28 | 'Name of the Docker image that contains the Fast Language Learning Tasks. ' 29 | 'If None, uses the default dm_fast_mapping name') 30 | 31 | flags.DEFINE_integer('seed', 123, 'Environment seed.') 32 | flags.DEFINE_string('level_name', 'fast_slow/fast_map_three_objs', 33 | 'Name of task to run.') 34 | 35 | 36 | class RandomAgent(object): 37 | """Basic random agent for DeepMind Fast Language Learning Tasks.""" 38 | 39 | def __init__(self, action_spec): 40 | self.action_spec = action_spec 41 | 42 | def act(self): 43 | action = {} 44 | 45 | for name, spec in self.action_spec.items(): 46 | # Uniformly sample BoundedArray actions.
47 | if isinstance(spec, specs.BoundedArray): 48 | action[name] = np.random.uniform(spec.minimum, spec.maximum, spec.shape) 49 | else: 50 | action[name] = spec.generate_value() 51 | return action 52 | 53 | 54 | def main(_): 55 | if 'with_distractors' in FLAGS.level_name: # for the tasks from the HTM paper 56 | episode_length_seconds = 450.0 57 | else: 58 | episode_length_seconds = 120.0 59 | 60 | env_settings = dm_fast_mapping.EnvironmentSettings( 61 | seed=FLAGS.seed, level_name=FLAGS.level_name, 62 | episode_length_seconds=episode_length_seconds) 63 | with dm_fast_mapping.load_from_docker( 64 | name=FLAGS.docker_image_name, settings=env_settings) as env: 65 | agent = RandomAgent(env.action_spec()) 66 | 67 | timestep = env.reset() 68 | score = 0 69 | while not timestep.last(): 70 | action = agent.act() 71 | timestep = env.step(action) 72 | 73 | if timestep.reward: 74 | score += timestep.reward 75 | logging.info('Total score: %1.1f, reward: %1.1f', score, 76 | timestep.reward) 77 | 78 | 79 | if __name__ == '__main__': 80 | app.run(main) 81 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | # Copyright 2019 DeepMind Technologies Limited. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 
14 | # ============================================================================ 15 | """Install script for setuptools.""" 16 | 17 | from __future__ import absolute_import 18 | from __future__ import division 19 | from __future__ import print_function 20 | 21 | import imp 22 | from setuptools import find_packages 23 | from setuptools import setup 24 | 25 | setup( 26 | name='dm-fast-mapping', 27 | version=imp.load_source('_version', 28 | 'dm_fast_mapping/_version.py').__version__, 29 | description=('DeepMind Fast Language Learning Tasks, a set of Unity-based ' 30 | 'machine-learning research tasks.'), 31 | author='DeepMind', 32 | license='Apache License, Version 2.0', 33 | keywords='reinforcement-learning python machine learning language', 34 | packages=find_packages(exclude=['examples']), 35 | install_requires=[ 36 | 'absl-py', 37 | 'dm-env', 38 | 'dm-env-rpc', 39 | 'docker', 40 | 'grpcio', 41 | 'numpy', 42 | 'portpicker', 43 | ], 44 | tests_require=['nose'], 45 | python_requires='>=3.7', 46 | extras_require={'examples': ['pygame']}, 47 | test_suite='nose.collector', 48 | classifiers=[ 49 | 'Development Status :: 5 - Production/Stable', 50 | 'Environment :: Console', 51 | 'Intended Audience :: Science/Research', 52 | 'License :: OSI Approved :: Apache Software License', 53 | 'Operating System :: POSIX :: Linux', 54 | 'Operating System :: Microsoft :: Windows', 55 | 'Operating System :: MacOS :: MacOS X', 56 | 'Programming Language :: Python :: 3.7', 57 | 'Topic :: Scientific/Engineering :: Artificial Intelligence', 58 | ], 59 | ) 60 | --------------------------------------------------------------------------------