.
├── README.md
├── environment.yml
├── utilities
│   ├── javascript-pong
│   │   └── static
│   │       ├── game.js
│   │       ├── images
│   │       │   ├── press1.png
│   │       │   ├── press2.png
│   │       │   └── winner.png
│   │       ├── index.html
│   │       ├── pong.css
│   │       ├── pong.js
│   │       └── sounds
│   │           ├── goal.wav
│   │           ├── ping.wav
│   │           ├── pong.wav
│   │           └── wall.wav
│   └── pong_py
│       ├── pong_py
│       │   ├── __init__.py
│       │   ├── ball.py
│       │   ├── helper.py
│       │   ├── paddle.py
│       │   └── pongjsenv.py
│       └── setup.py
├── week_1
│   ├── README.md
│   ├── week_1_exercise_1.ipynb
│   ├── week_1_exercise_2.ipynb
│   └── week_1_exercise_3.ipynb
├── week_2
│   ├── README.md
│   ├── week_2_exercise_1.ipynb
│   └── week_2_exercise_2.ipynb
├── week_3
│   ├── README.md
│   ├── week_3_exercise_1.ipynb
│   ├── week_3_exercise_2.ipynb
│   └── week_3_exercise_3.ipynb
├── week_4
│   ├── README.md
│   ├── week_4_exercise_1.ipynb
│   ├── week_4_exercise_2.ipynb
│   └── week_4_exercise_3.ipynb
├── week_5
│   ├── README.md
│   ├── week_5_exercise_1.ipynb
│   └── week_5_exercise_2.ipynb
├── week_6
│   ├── README.md
│   ├── week_6_exercise_1.ipynb
│   └── week_6_exercise_2.ipynb
├── week_7
│   ├── README.md
│   ├── cnn.png
│   ├── helper.py
│   ├── input.html
│   ├── input_final.html
│   ├── mnist.png
│   ├── tune.png
│   └── week_7_exercise_1.ipynb
└── week_8
    ├── README.md
    ├── client.png
    ├── dqn.png
    ├── learning.png
    ├── log.png
    ├── ppo.png
    ├── serving
    │   ├── data_large.gz
    │   ├── data_small.gz
    │   ├── do_rollouts.py
    │   ├── javascript-pong
    │   │   └── static
    │   │       ├── game.js
    │   │       ├── images
    │   │       │   ├── press1.png
    │   │       │   ├── press2.png
    │   │       │   └── winner.png
    │   │       ├── index.html
    │   │       ├── pong.css
    │   │       ├── pong.js
    │   │       └── sounds
    │   │           ├── goal.wav
    │   │           ├── ping.wav
    │   │           ├── pong.wav
    │   │           └── wall.wav
    │   ├── pong_py
    │   │   ├── pong_py.egg-info
    │   │   │   ├── PKG-INFO
    │   │   │   ├── SOURCES.txt
    │   │   │   ├── dependency_links.txt
    │   │   │   └── top_level.txt
    │   │   ├── pong_py
    │   │   │   ├── __init__.py
    │   │   │   ├── ball.py
    │   │   │   ├── helper.py
    │   │   │   ├── paddle.py
    │   │   │   └── pongjsenv.py
    │   │   └── setup.py
    │   ├── pong_web_server.py
    │   └── simple_policy_server.py
    ├── web.png
    ├── week_8_exercise_1.ipynb
    ├── week_8_exercise_2.ipynb
    └── week_8_exercise_3.ipynb

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Distributed AI with the Ray Framework Course
2 | 
3 | ## Summary
4 | 
5 | Learn how to build large-scale AI applications using Ray, a high-performance distributed execution framework from the RISELab at UC Berkeley. Simplify complex parallel systems with this easy-to-use Python* framework that comes with machine learning libraries to speed up AI applications.
6 | 
7 | This course provides you with practical knowledge of the following skills:
8 | 
9 | - Use remote functions, actors, and more with the Ray framework
10 | 
11 | - Quickly find optimal hyperparameters for AI training with Ray Tune
12 | 
13 | - Distribute reinforcement learning algorithms across a cluster with Ray RLlib
14 | 
15 | - Deploy AI applications on large computer clusters and cloud resources
16 | 
17 | The course is structured around eight weeks of lectures and exercises. Each week requires approximately two hours to complete.
18 | 
19 | ### Acknowledgements
20 | 
21 | Ray framework official [repository](https://github.com/ray-project/ray).
22 | 
23 | Course material compiled from the Ray tutorial [repository](https://github.com/ray-project/tutorial).
24 | 
25 | ## [Week 1](week_1)
26 | 
27 | Get an introduction to the Ray framework and data parallelism. Topics include how to:
28 | 
29 | - Run tasks in parallel using remote functions
30 | 
31 | - Express dependencies between remote tasks through object IDs
32 | 
33 | - Create nested tasks by calling remote functions from within other remote functions
34 | 
35 | 
36 | 
37 | ## [Week 2](week_2)
38 | 
39 | Learn about Ray actors, the stateful counterpart of remote functions.
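This statefulness is the core idea behind the actor model. A plain-Python sketch of the pattern (a hypothetical `Counter` class; with Ray you would decorate the class with `@ray.remote`, create the actor with `Counter.remote()`, and call `counter.increment.remote()` so the instance lives in its own worker process):

```python
# Hypothetical Counter class: each call mutates state that persists
# between method calls, which is exactly what a Ray actor provides
# across a cluster (shown here as ordinary local Python).
class Counter:
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count

counter = Counter()
results = [counter.increment() for _ in range(3)]
print(results)  # [1, 2, 3]
```

With Ray, each `increment.remote()` call would return an object ID immediately, and `ray.get` would collect the results.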
Additional topics:
40 | 
41 | - How to implement Ray actors using Python* classes
42 | 
43 | - How to use different hardware resources for various AI tasks, such as training and inference
44 | 
45 | - The analytics ecosystem, which is made up of toolkits, libraries, solutions, and hardware
46 | 
47 | 
48 | ## [Week 3](week_3)
49 | 
50 | Understand how to optimize and speed up functions. Topics include how to:
51 | 
52 | - Avoid waiting for slow tasks using ray.wait()
53 | 
54 | - Process remote tasks in a specific order
55 | 
56 | - Avoid repeated serialization by storing shared objects once with ray.put()
57 | 
58 | ## [Week 4](week_4)
59 | 
60 | Explore how to optimize functions, including:
61 | 
62 | - Accelerate Pandas* workflows by changing one line of code
63 | 
64 | - Implement a MapReduce system with Ray
65 | 
66 | - Use Tree Reduce to execute a tree of dependent functions in parallel
67 | 
68 | ## [Week 5](week_5)
69 | 
70 | Learn to access different hardware resources, including how to:
71 | 
72 | - Send remote tasks to different accelerators and processors
73 | 
74 | - Use custom resources for tasks that require complex hardware combinations
75 | 
76 | ## [Week 6](week_6)
77 | 
78 | Get an introduction to training neural networks across multiple workers. Topics include:
79 | 
80 | - An example of how to pass the weights of a TensorFlow* model between workers and the driver
81 | 
82 | - How to implement a sharded parameter server for distributing parameters across multiple workers
83 | 
84 | ## [Week 7](week_7)
85 | 
86 | Understand how to use Ray Tune, a scalable framework for hyperparameter search.
87 | 
88 | - Use Tune to reduce the cost of one of the most expensive parts of machine learning: hyperparameter tuning
89 | 
90 | - Search for the right hyperparameters, such as learning rate and momentum, to train a neural network
91 | 
92 | - Combine HyperOpt and HyperBand to perform a more powerful search
93 | 
94 | ## [Week 8](week_8)
95 | 
96 | Learn about RLlib, a scalable reinforcement learning library for training AI agents.
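The week 8 exercises build on the Markov Decision Process (MDP) abstraction that RLlib environments implement. A minimal, self-contained sketch of the state-action-reward loop (a hypothetical two-state MDP, not the course's Pong environment):

```python
import random

# Hypothetical two-state MDP: state 0 is "far from the goal", state 1 is
# "near the goal". Action 1 moves toward the goal; action 0 stays put.
# Taking action 1 in state 1 ends the episode with reward +1.
def step(state, action):
    if state == 1 and action == 1:
        return None, 1.0, True          # (next_state, reward, done)
    next_state = min(1, state + action)
    return next_state, 0.0, False

random.seed(0)                          # make the rollout reproducible
state, total_reward, done = 0, 0.0, False
while not done:
    action = random.choice([0, 1])      # a random policy
    state, reward, done = step(state, action)
    total_reward += reward
print(total_reward)  # 1.0
```

An RL algorithm such as PPO or DQN replaces the random policy with one learned from these (state, action, reward) transitions.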
97 | 
98 | - Get an introduction to the Markov Decision Process and how to use it in Python*
99 | 
100 | - See an example of how to use the PPO algorithm to train a network to play a simple game with Gym* and visualize the results with TensorBoard*
101 | 
102 | - Learn to create a deep Q-network (DQN) to play Pong and play against it in a browser
103 | 
--------------------------------------------------------------------------------
/environment.yml:
--------------------------------------------------------------------------------
1 | name: ray-tutorial
2 | channels:
3 |   - conda-forge
4 | dependencies:
5 |   - python=3.6
6 |   - bokeh
7 |   - ipywidgets=6.0.0
8 |   - tensorflow
9 |   - pip:
10 |     - ray[rllib]==0.6.0
11 |     - keras
12 |     - modin
13 |     - matplotlib
14 | 
--------------------------------------------------------------------------------
/utilities/javascript-pong/static/game.js:
--------------------------------------------------------------------------------
1 | //=============================================================================
2 | //
3 | // We need some ECMAScript 5 methods but we need to implement them ourselves
4 | // for older browsers (compatibility:
5 | // http://kangax.github.com/es5-compat-table/)
6 | //
7 | // Function.bind:
8 | // https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Function/bind
9 | // Object.create: http://javascript.crockford.com/prototypal.html
10 | // Object.extend: (defacto standard like jquery $.extend or prototype's
11 | // Object.extend)
12 | //
13 | // Object.construct: our own wrapper around Object.create that ALSO calls
14 | // an initialize constructor method if one exists
15 | //
16 | //=============================================================================
17 | 
18 | if (!Function.prototype.bind) {
19 |   Function.prototype.bind = function(obj) {
20 |     var slice = [].slice, args = slice.call(arguments, 1), self = this,
21 |         nop = function() {}, bound = function() {
22 |           return self.apply(
23 |               this
instanceof nop ? this : (obj || {}), 24 | args.concat(slice.call(arguments))); 25 | }; 26 | nop.prototype = self.prototype; 27 | bound.prototype = new nop(); 28 | return bound; 29 | }; 30 | } 31 | 32 | if (!Object.create) { 33 | Object.create = function(base) { 34 | function F(){}; 35 | F.prototype = base; 36 | return new F(); 37 | } 38 | } 39 | 40 | if (!Object.construct) { 41 | Object.construct = function(base) { 42 | var instance = Object.create(base); 43 | if (instance.initialize) 44 | instance.initialize.apply(instance, [].slice.call(arguments, 1)); 45 | return instance; 46 | } 47 | } 48 | 49 | if (!Object.extend) { 50 | Object.extend = function(destination, source) { 51 | for (var property in source) { 52 | if (source.hasOwnProperty(property)) 53 | destination[property] = source[property]; 54 | } 55 | return destination; 56 | }; 57 | } 58 | 59 | /* NOT READY FOR PRIME TIME 60 | if (!window.requestAnimationFrame) {// 61 | http://paulirish.com/2011/requestanimationframe-for-smart-animating/ 62 | window.requestAnimationFrame = window.webkitRequestAnimationFrame || 63 | window.mozRequestAnimationFrame || 64 | window.oRequestAnimationFrame || 65 | window.msRequestAnimationFrame || 66 | function(callback, element) { 67 | window.setTimeout(callback, 1000 / 60); 68 | } 69 | } 70 | */ 71 | 72 | //============================================================================= 73 | // GAME 74 | //============================================================================= 75 | 76 | Game = { 77 | 78 | compatible: function() { 79 | return Object.create && Object.extend && Function.bind && 80 | document.addEventListener && // HTML5 standard, all modern browsers 81 | // that support canvas should also support 82 | // add/removeEventListener 83 | Game.ua.hasCanvas 84 | }, 85 | 86 | start: function(id, game, cfg) { 87 | if (Game.compatible()) 88 | return Object.construct(Game.Runner, id, game, cfg).game; // return the 89 | // game 90 | // instance, 91 | // not the 92 | // 
runner 93 | // (caller can 94 | // always get 95 | // at the 96 | // runner via 97 | // game.runner) 98 | }, 99 | 100 | ua: function() { // should avoid user agent sniffing... but sometimes you 101 | // just gotta do what you gotta do 102 | var ua = navigator.userAgent.toLowerCase(); 103 | var key = ((ua.indexOf('opera') > -1) ? 'opera' : null); 104 | key = key || ((ua.indexOf('firefox') > -1) ? 'firefox' : null); 105 | key = key || ((ua.indexOf('chrome') > -1) ? 'chrome' : null); 106 | key = key || ((ua.indexOf('safari') > -1) ? 'safari' : null); 107 | key = key || ((ua.indexOf('msie') > -1) ? 'ie' : null); 108 | 109 | try { 110 | var re = (key == 'ie') ? 'msie (\\d)' : key + '\\/(\\d\\.\\d)' 111 | var matches = ua.match(new RegExp(re, 'i')); 112 | var version = matches ? parseFloat(matches[1]) : null; 113 | } catch (e) { 114 | } 115 | 116 | return { 117 | full: ua, name: key + (version ? ' ' + version.toString() : ''), 118 | version: version, isFirefox: (key == 'firefox'), 119 | isChrome: (key == 'chrome'), isSafari: (key == 'safari'), 120 | isOpera: (key == 'opera'), isIE: (key == 'ie'), 121 | hasCanvas: (document.createElement('canvas').getContext), 122 | hasAudio: (typeof(Audio) != 'undefined') 123 | } 124 | }(), 125 | 126 | addEvent: function(obj, type, fn) { 127 | obj.addEventListener(type, fn, false); 128 | }, 129 | removeEvent: function(obj, type, fn) { 130 | obj.removeEventListener(type, fn, false); 131 | }, 132 | 133 | ready: function(fn) { 134 | if (Game.compatible()) Game.addEvent(document, 'DOMContentLoaded', fn); 135 | }, 136 | 137 | createCanvas: function() { 138 | return document.createElement('canvas'); 139 | }, 140 | 141 | createAudio: function(src) { 142 | try { 143 | var a = new Audio(src); 144 | a.volume = 0.1; // lets be real quiet please 145 | return a; 146 | } catch (e) { 147 | return null; 148 | } 149 | }, 150 | 151 | loadImages: function( 152 | sources, callback) { /* load multiple images and callback when ALL have 153 | finished loading 
*/ 154 | var images = {}; 155 | var count = sources ? sources.length : 0; 156 | if (count == 0) { 157 | callback(images); 158 | } else { 159 | for (var n = 0; n < sources.length; n++) { 160 | var source = sources[n]; 161 | var image = document.createElement('img'); 162 | images[source] = image; 163 | Game.addEvent(image, 'load', function() { 164 | if (--count == 0) callback(images); 165 | }); 166 | image.src = source; 167 | } 168 | } 169 | }, 170 | 171 | random: function(min, max) { 172 | return (min + (Math.random() * (max - min))); 173 | }, 174 | 175 | timestamp: function() { 176 | return new Date().getTime(); 177 | }, 178 | 179 | KEY: { 180 | BACKSPACE: 8, 181 | TAB: 9, 182 | RETURN: 13, 183 | ESC: 27, 184 | SPACE: 32, 185 | LEFT: 37, 186 | UP: 38, 187 | RIGHT: 39, 188 | DOWN: 40, 189 | DELETE: 46, 190 | HOME: 36, 191 | END: 35, 192 | PAGEUP: 33, 193 | PAGEDOWN: 34, 194 | INSERT: 45, 195 | ZERO: 48, 196 | ONE: 49, 197 | TWO: 50, 198 | A: 65, 199 | L: 76, 200 | P: 80, 201 | Q: 81, 202 | TILDA: 192 203 | }, 204 | 205 | //----------------------------------------------------------------------------- 206 | 207 | Runner: { 208 | 209 | initialize: function(id, game, cfg) { 210 | this.cfg = Object.extend( 211 | game.Defaults || {}, cfg || {}); // use game defaults (if any) and 212 | // extend with custom cfg (if any) 213 | this.fps = this.cfg.fps || 20; 214 | this.interval = 1000.0 / this.fps; 215 | this.canvas = document.getElementById(id); 216 | this.width = this.cfg.width || this.canvas.offsetWidth; 217 | this.height = this.cfg.height || this.canvas.offsetHeight; 218 | this.front = this.canvas; 219 | this.front.width = this.width; 220 | this.front.height = this.height; 221 | this.back = Game.createCanvas(); 222 | this.back.width = this.width; 223 | this.back.height = this.height; 224 | this.front2d = this.front.getContext('2d'); 225 | this.back2d = this.back.getContext('2d'); 226 | this.addEvents(); 227 | this.resetStats(); 228 | 229 | this.game = Object.construct( 
230 | game, this, this.cfg); // finally construct the game object itself 231 | }, 232 | 233 | start: function() { // game instance should call runner.start() when its 234 | // finished initializing and is ready to start the game 235 | // loop 236 | this.lastFrame = Game.timestamp(); 237 | this.timer = setInterval(this.loop.bind(this), this.interval); 238 | }, 239 | 240 | stop: function() { 241 | clearInterval(this.timer); 242 | }, 243 | 244 | loop: function() { 245 | var start = Game.timestamp(); 246 | this.update((start - this.lastFrame) / 1000.0); // send dt as seconds 247 | var middle = Game.timestamp(); 248 | this.draw(); 249 | var end = Game.timestamp(); 250 | this.updateStats(middle - start, end - middle); 251 | this.lastFrame = start; 252 | }, 253 | 254 | update: function(dt) { 255 | this.game.update(dt); 256 | }, 257 | 258 | draw: function() { 259 | this.back2d.clearRect(0, 0, this.width, this.height); 260 | this.game.draw(this.back2d); 261 | this.drawStats(this.back2d); 262 | this.front2d.clearRect(0, 0, this.width, this.height); 263 | this.front2d.drawImage(this.back, 0, 0); 264 | }, 265 | 266 | resetStats: function() { 267 | this.stats = { 268 | count: 0, 269 | fps: 0, 270 | update: 0, 271 | draw: 0, 272 | frame: 0 // update + draw 273 | }; 274 | }, 275 | 276 | updateStats: function(update, draw) { 277 | if (this.cfg.stats) { 278 | this.stats.update = Math.max(1, update); 279 | this.stats.draw = Math.max(1, draw); 280 | this.stats.frame = this.stats.update + this.stats.draw; 281 | this.stats.count = 282 | this.stats.count == this.fps ? 
0 : this.stats.count + 1; 283 | this.stats.fps = Math.min(this.fps, 1000 / this.stats.frame); 284 | } 285 | }, 286 | 287 | drawStats: function(ctx) { 288 | if (this.cfg.stats) { 289 | ctx.fillText( 290 | 'frame: ' + this.stats.count, this.width - 100, this.height - 60); 291 | ctx.fillText( 292 | 'fps: ' + this.stats.fps, this.width - 100, this.height - 50); 293 | ctx.fillText( 294 | 'update: ' + this.stats.update + 'ms', this.width - 100, 295 | this.height - 40); 296 | ctx.fillText( 297 | 'draw: ' + this.stats.draw + 'ms', this.width - 100, 298 | this.height - 30); 299 | } 300 | }, 301 | 302 | addEvents: function() { 303 | Game.addEvent(document, 'keydown', this.onkeydown.bind(this)); 304 | Game.addEvent(document, 'keyup', this.onkeyup.bind(this)); 305 | }, 306 | 307 | onkeydown: function(ev) { 308 | if (this.game.onkeydown) this.game.onkeydown(ev.keyCode); 309 | }, 310 | onkeyup: function(ev) { 311 | if (this.game.onkeyup) this.game.onkeyup(ev.keyCode); 312 | }, 313 | 314 | hideCursor: function() { 315 | this.canvas.style.cursor = 'none'; 316 | }, 317 | showCursor: function() { 318 | this.canvas.style.cursor = 'auto'; 319 | }, 320 | 321 | alert: function(msg) { 322 | this.stop(); // alert blocks thread, so need to stop game loop in order 323 | // to avoid sending huge dt values to next update 324 | result = window.alert(msg); 325 | this.start(); 326 | return result; 327 | }, 328 | 329 | confirm: function(msg) { 330 | this.stop(); // alert blocks thread, so need to stop game loop in order 331 | // to avoid sending huge dt values to next update 332 | result = window.confirm(msg); 333 | this.start(); 334 | return result; 335 | } 336 | 337 | //------------------------------------------------------------------------- 338 | 339 | } // Game.Runner 340 | } // Game 341 | -------------------------------------------------------------------------------- /utilities/javascript-pong/static/images/press1.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/utilities/javascript-pong/static/images/press1.png -------------------------------------------------------------------------------- /utilities/javascript-pong/static/images/press2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/utilities/javascript-pong/static/images/press2.png -------------------------------------------------------------------------------- /utilities/javascript-pong/static/images/winner.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/utilities/javascript-pong/static/images/winner.png -------------------------------------------------------------------------------- /utilities/javascript-pong/static/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Pong! 5 | 6 | 7 | 8 | 9 | 10 | 11 | 49 | 50 | 51 |
52 | Sorry, this example cannot be run because your browser does not support the <canvas> element 53 |
54 |
55 | 56 | 57 | 58 | 81 | 82 | 83 | 84 | -------------------------------------------------------------------------------- /utilities/javascript-pong/static/pong.css: -------------------------------------------------------------------------------- 1 | body { background-color: black; color: #AAA; font-size: 12pt; padding: 1em; } 2 | 3 | #unsupported { border: 1px solid yellow; color: black; background-color: #FFFFAD; padding: 2em; margin: 1em; display: inline-block; } 4 | 5 | #sidebar { width: 18em; height: 40em; float: left; font-size: 0.825em; background-color: #333; border: 1px solid white; padding: 1em; } 6 | #sidebar h2 { color: white; text-align: center; margin: 0; } 7 | #sidebar .parts { padding-left: 1em; list-style-type: none; margin-bottom: 2em; text-align: right; } 8 | #sidebar .parts li a { color: white; text-decoration: none; } 9 | #sidebar .parts li a:visited { color: white; } 10 | #sidebar .parts li a:hover { color: white; text-decoration: underline; } 11 | #sidebar .parts li a.selected { color: #F08010; } 12 | #sidebar .parts li a i { color: #AAA; } 13 | #sidebar .parts li a.selected i { color: #F08010; } 14 | #sidebar .settings { line-height: 1.2em; height: 1.2em; text-align: right; } 15 | #sidebar .settings.size { } 16 | #sidebar .settings.speed { margin-bottom: 1em; } 17 | #sidebar .settings label { vertical-align: middle; } 18 | #sidebar .settings input { vertical-align: middle; } 19 | #sidebar .settings select { vertical-align: middle; } 20 | #sidebar .description { margin-bottom: 2em; } 21 | #sidebar .description b { font-weight: normal; color: #FFF; } 22 | 23 | 24 | @media screen and (min-width: 0px) { 25 | #sidebar { display: none; } 26 | #game { display: block; width: 480px; height: 360px; margin: 0 auto; } 27 | } 28 | 29 | @media screen and (min-width: 800px) { 30 | #game { width: 640px; height: 480px; } 31 | } 32 | 33 | @media screen and (min-width: 1000px) { 34 | #sidebar { display: block; } 35 | #game { margin-left: 18em; } 36 | } 37 | 38 
| @media screen and (min-width: 1200px) { 39 | #game { width: 800px; height: 600px; } 40 | } 41 | 42 | @media screen and (min-width: 1600px) { 43 | #game { width: 1024px; height: 768px; } 44 | } 45 | -------------------------------------------------------------------------------- /utilities/javascript-pong/static/sounds/goal.wav: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/utilities/javascript-pong/static/sounds/goal.wav -------------------------------------------------------------------------------- /utilities/javascript-pong/static/sounds/ping.wav: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/utilities/javascript-pong/static/sounds/ping.wav -------------------------------------------------------------------------------- /utilities/javascript-pong/static/sounds/pong.wav: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/utilities/javascript-pong/static/sounds/pong.wav -------------------------------------------------------------------------------- /utilities/javascript-pong/static/sounds/wall.wav: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/utilities/javascript-pong/static/sounds/wall.wav -------------------------------------------------------------------------------- /utilities/pong_py/pong_py/__init__.py: -------------------------------------------------------------------------------- 1 | from pong_py.pongjsenv import PongJSEnv 2 | -------------------------------------------------------------------------------- 
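Before reading the implementation files below, it helps to see the interface they add up to: `PongJSEnv` follows the classic Gym `reset`/`step` protocol with three discrete actions (0 = stop, 1 = down, 2 = up) and an 8-dimensional observation. A sketch of the interaction loop against a stub environment (`StubPong` is hypothetical and stands in for `PongJSEnv`, which requires `gym` and `numpy`):

```python
import random

# StubPong is a hypothetical stand-in for PongJSEnv: it implements the
# same reset/step protocol, but simply runs for a fixed number of steps
# with a constant reward instead of simulating the game.
class StubPong:
    def __init__(self, horizon=10):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0] * 8                 # PongJSEnv observations are 8-dim

    def step(self, action):
        assert action in (0, 1, 2)       # 0 = stop, 1 = down, 2 = up
        self.t += 1
        done = self.t >= self.horizon
        return [0.0] * 8, 1, done, {}    # (state, reward, done, info)

env = StubPong()
state = env.reset()
total, done = 0, False
while not done:
    state, reward, done, info = env.step(random.choice([0, 1, 2]))
    total += reward
print(total)  # 10
```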
/utilities/pong_py/pong_py/ball.py: -------------------------------------------------------------------------------- 1 | import pong_py.helper as helper 2 | import random 3 | 4 | class Ball(): 5 | def __init__(self, pong): 6 | self.radius = 5 7 | self.dt = pong.dt 8 | self.minX = self.radius; 9 | self.maxX = pong.width - self.radius 10 | self.minY = pong.wall_width + self.radius 11 | self.maxY = pong.height - pong.wall_width - self.radius 12 | self.speed = (self.maxX - self.minX) / 4; 13 | self.accel = 8; 14 | self.dx = 0 15 | self.dy = 0 16 | 17 | def set_position(self, x, y): 18 | self.x_prev = x if not hasattr(self, "x") else self.x 19 | self.y_prev = y if not hasattr(self, "y") else self.y 20 | 21 | self.x = x 22 | self.y = y 23 | self.left = self.x - self.radius 24 | self.top = self.y - self.radius 25 | self.right = self.x + self.radius 26 | self.bottom = self.y + self.radius 27 | 28 | def set_direction(self, dx, dy): 29 | self.dx = dx 30 | self.dy = dy 31 | 32 | def update(self, left_pad, right_pad): 33 | 34 | pos = helper.accelerate(self.x, self.y, 35 | self.dx, self.dy, 36 | self.accel, self.dt); 37 | 38 | if ((pos.dy > 0) and (pos.y > self.maxY)): 39 | pos.y = self.maxY 40 | pos.dy = -pos.dy 41 | elif ((pos.dy < 0) and (pos.y < self.minY)): 42 | pos.y = self.minY 43 | pos.dy = -pos.dy 44 | 45 | paddle = left_pad if (pos.dx < 0) else right_pad; 46 | pt = helper.ballIntercept(self, paddle, pos.nx, pos.ny); 47 | 48 | if pt: 49 | if pt.d == 'left' or pt.d == 'right': 50 | pos.x = pt.x 51 | pos.dx = -pos.dx 52 | elif pt.d == 'top' or pt.d == 'bottom': 53 | pos.y = pt.y 54 | pos.dy = -pos.dy 55 | 56 | if paddle.up: 57 | pos.dy = pos.dy * (0.5 if pos.dy < 0 else 1.5) 58 | elif paddle.down: 59 | pos.dy = pos.dy * (0.5 if pos.dy > 0 else 1.5) 60 | 61 | self.set_position(pos.x, pos.y) 62 | self.set_direction(pos.dx, pos.dy) 63 | 64 | def reset(self, playerNo): 65 | self.set_position((self.maxX + self.minX) / 2, random.uniform(self.minY, self.maxY)) 66 | 
self.set_direction(self.speed if playerNo == 1 else -self.speed, self.speed)
67 | 
--------------------------------------------------------------------------------
/utilities/pong_py/pong_py/helper.py:
--------------------------------------------------------------------------------
1 | # NOTE: plain classes are used here instead of collections.namedtuple
2 | # because callers assign extra attributes to these objects after creation
3 | # (e.g. prediction.since in paddle.py), which namedtuples do not allow.
4 | 
5 | 
6 | 
7 | class Position():
8 |     def __init__(self, nx, ny, x, y, dx, dy):
9 |         self.nx = nx
10 |         self.ny = ny
11 |         self.x = x
12 |         self.y = y
13 |         self.dx = dx
14 |         self.dy = dy
15 | 
16 | class Intercept():
17 |     def __init__(self, x, y, d):
18 |         self.x = x
19 |         self.y = y
20 |         self.d = d
21 | 
22 | class Rectangle():
23 |     def __init__(self, left, right, top, bottom):
24 |         self.left = left
25 |         self.right = right
26 |         self.top = top
27 |         self.bottom = bottom
28 | 
29 | def accelerate(x, y, dx, dy, accel, dt):
30 |     x2 = x + (dt * dx) + (accel * dt * dt * 0.5)
31 |     y2 = y + (dt * dy) + (accel * dt * dt * 0.5)
32 |     dx2 = dx + (accel * dt) * (1 if dx > 0 else -1)
33 |     dy2 = dy + (accel * dt) * (1 if dy > 0 else -1)
34 |     return Position((x2-x), (y2-y), x2, y2, dx2, dy2)
35 | 
36 | 
37 | def intercept(x1, y1, x2, y2, x3, y3, x4, y4, d):
38 |     denom = ((y4-y3) * (x2-x1)) - ((x4-x3) * (y2-y1))
39 |     if (denom != 0):
40 |         ua = (((x4-x3) * (y1-y3)) - ((y4-y3) * (x1-x3))) / denom
41 |         if ((ua >= 0) and (ua <= 1)):
42 |             ub = (((x2-x1) * (y1-y3)) - ((y2-y1) * (x1-x3))) / denom
43 |             if ((ub >= 0) and (ub <= 1)):
44 |                 x = x1 + (ua * (x2-x1))
45 |                 y = y1 + (ua * (y2-y1))
46 |                 return Intercept(x, y, d)
47 | 
48 | 
49 | def ballIntercept(ball, rect, nx, ny):
50 |     pt = None
51 |     if (nx < 0):
52 |         pt = intercept(ball.x, ball.y, ball.x + nx, ball.y + ny,
53 |                        rect.right + ball.radius,
54 |                        rect.top - ball.radius,
55 |                        rect.right + ball.radius,
56 |                        rect.bottom +
ball.radius, 57 | "right"); 58 | elif (nx > 0): 59 | pt = intercept(ball.x, ball.y, ball.x + nx, ball.y + ny, 60 | rect.left - ball.radius, 61 | rect.top - ball.radius, 62 | rect.left - ball.radius, 63 | rect.bottom + ball.radius, 64 | "left") 65 | 66 | if (not pt): 67 | if (ny < 0): 68 | pt = intercept(ball.x, ball.y, ball.x + nx, ball.y + ny, 69 | rect.left - ball.radius, 70 | rect.bottom + ball.radius, 71 | rect.right + ball.radius, 72 | rect.bottom + ball.radius, 73 | "bottom"); 74 | elif (ny > 0): 75 | pt = intercept(ball.x, ball.y, ball.x + nx, ball.y + ny, 76 | rect.left - ball.radius, 77 | rect.top - ball.radius, 78 | rect.right + ball.radius, 79 | rect.top - ball.radius, 80 | "top"); 81 | return pt -------------------------------------------------------------------------------- /utilities/pong_py/pong_py/paddle.py: -------------------------------------------------------------------------------- 1 | import random 2 | import pong_py.helper as helper 3 | from pong_py.helper import Rectangle 4 | 5 | class Paddle(): 6 | STOP = 0 7 | DOWN = 1 8 | UP = 2 9 | 10 | 11 | def __init__(self, rhs, pong): 12 | self.pid = rhs 13 | self.width = 12 14 | self.height = 60 15 | self.dt = pong.dt 16 | self.minY = pong.wall_width 17 | self.maxY = pong.height - pong.wall_width - self.height 18 | self.speed = (self.maxY - self.minY) / 2 19 | self.ai_reaction = 0.1 20 | self.ai_error = 120 21 | self.pong = pong 22 | self.set_direction(0) 23 | self.set_position(pong.width - self.width if rhs else 0, 24 | self.minY + (self.maxY - self.minY) / 2) 25 | self.prediction = None 26 | self.ai_prev_action = 0 27 | 28 | def set_position(self, x, y): 29 | self.x = x 30 | self.y = y 31 | self.left = self.x 32 | self.right = self.left + self.width 33 | self.top = self.y 34 | self.bottom = self.y + self.height 35 | 36 | def set_direction(self, dy): 37 | # Needed for spin calculation 38 | self.up = -dy if dy < 0 else 0 39 | self.down = dy if dy > 0 else 0 40 | 41 | def step(self, action): 42 | if 
action == self.STOP: 43 | self.stopMovingDown() 44 | self.stopMovingUp() 45 | elif action == self.DOWN: 46 | self.moveDown() 47 | elif action == self.UP: 48 | self.moveUp() 49 | amt = self.down - self.up 50 | if amt != 0: 51 | y = self.y + (amt * self.dt * self.speed) 52 | if y < self.minY: 53 | y = self.minY 54 | elif y > self.maxY: 55 | y = self.maxY 56 | self.set_position(self.x, y) 57 | 58 | def predict(self, ball, dt): 59 | # only re-predict if the ball changed direction, or its been some amount of time since last prediction 60 | if (self.prediction and ((self.prediction.dx * ball.dx) > 0) and 61 | ((self.prediction.dy * ball.dy) > 0) and 62 | (self.prediction.since < self.ai_reaction)): 63 | self.prediction.since += dt 64 | return 65 | 66 | rect = Rectangle(self.left, self.right, -10000, 10000) 67 | pt = helper.ballIntercept(ball, rect, ball.dx * 10, ball.dy * 10) 68 | 69 | if (pt): 70 | t = self.minY + ball.radius 71 | b = self.maxY + self.height - ball.radius 72 | 73 | while ((pt.y < t) or (pt.y > b)): 74 | if (pt.y < t): 75 | pt.y = t + (t - pt.y) 76 | elif (pt.y > b): 77 | pt.y = t + (b - t) - (pt.y - b) 78 | self.prediction = pt 79 | else: 80 | self.prediction = None 81 | 82 | if self.prediction: 83 | self.prediction.since = 0 84 | self.prediction.dx = ball.dx 85 | self.prediction.dy = ball.dy 86 | self.prediction.radius = ball.radius 87 | self.prediction.exactX = self.prediction.x 88 | self.prediction.exactY = self.prediction.y 89 | closeness = (ball.x - self.right if ball.dx < 0 else self.left - ball.x) / self.pong.width 90 | error = self.ai_error * closeness 91 | self.prediction.y = self.prediction.y + random.uniform(-error, error) 92 | 93 | def ai_step(self, ball): 94 | 95 | if (((ball.x < self.left) and (ball.dx < 0)) or 96 | ((ball.x > self.right) and (ball.dx > 0))): 97 | self.stopMovingUp() 98 | self.stopMovingDown() 99 | return 100 | 101 | self.predict(ball, self.dt) 102 | action = self.ai_prev_action 103 | 104 | if (self.prediction): 105 | # 
print('prediction') 106 | if (self.prediction.y < (self.top + self.height/2 - 5)): 107 | action = self.UP 108 | # print("moved up") 109 | elif (self.prediction.y > (self.bottom - self.height/2 + 5)): 110 | action = self.DOWN 111 | # print("moved down") 112 | 113 | else: 114 | action = self.STOP 115 | # print("nothing") 116 | self.ai_prev_action = action 117 | return self.step(action) 118 | 119 | def moveUp(self): 120 | self.down = 0 121 | self.up = 1 122 | 123 | def moveDown(self): 124 | self.down = 1 125 | self.up = 0 126 | 127 | def stopMovingDown(self): 128 | self.down = 0 129 | 130 | def stopMovingUp(self): 131 | self.up = 0 132 | -------------------------------------------------------------------------------- /utilities/pong_py/pong_py/pongjsenv.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | import gym 4 | import gym.spaces 5 | 6 | from pong_py.ball import Ball 7 | from pong_py.paddle import Paddle 8 | 9 | 10 | def transform_state(state): 11 | return state / 500 12 | 13 | 14 | class PongJS(object): 15 | # MDP to 16 | def __init__(self): 17 | self.width = 640 18 | self.height = 480 19 | self.wall_width = 12 20 | self.dt = 0.05 # seconds 21 | #self.dt = 0.01 # seconds 22 | self.left_pad = Paddle(0, self) 23 | self.right_pad = Paddle(1, self) 24 | self.ball = Ball(self) 25 | 26 | def step(self, action): 27 | # do logic for self 28 | self.left_pad.step(action) 29 | self.right_pad.ai_step(self.ball) 30 | 31 | self.ball.update(self.left_pad, self.right_pad) 32 | term, reward = self.terminate() 33 | if term: 34 | self.reset(0 if reward == 1 else 1) 35 | state = self.get_state() 36 | return state, reward, term 37 | 38 | def init(self): 39 | self.reset(0) 40 | 41 | def terminate(self): 42 | if self.ball.left > self.width: 43 | return True, 1 44 | elif self.ball.right < 0: 45 | return True, -1 46 | else: 47 | return False, 0 48 | 49 | def get_state(self): 50 | return np.array([self.left_pad.y, 0, 51 | 
self.ball.x, self.ball.y, 52 | self.ball.dx, self.ball.dy, 53 | self.ball.x_prev, self.ball.y_prev]) 54 | 55 | def reset(self, player): 56 | self.ball.reset(player) 57 | 58 | 59 | class PongJSEnv(gym.Env): 60 | def __init__(self): 61 | self.env = PongJS() 62 | self.action_space = gym.spaces.Discrete(3) 63 | self.observation_space = gym.spaces.Box(low=0, high=1, shape=(8,)) 64 | 65 | @property 66 | def right_pad(self): 67 | return self.env.right_pad 68 | 69 | @property 70 | def left_pad(self): 71 | return self.env.left_pad 72 | 73 | def reset(self): 74 | self.env.init() 75 | return transform_state(self.env.get_state()) 76 | 77 | def step(self, action): 78 | state, reward, done = self.env.step(action) 79 | return transform_state(state), reward, done, {} 80 | #return state, reward, done, {} 81 | -------------------------------------------------------------------------------- /utilities/pong_py/setup.py: -------------------------------------------------------------------------------- 1 | from setuptools import setup, find_packages 2 | 3 | setup(name='pong_py', 4 | packages=find_packages()) 5 | -------------------------------------------------------------------------------- /week_1/README.md: -------------------------------------------------------------------------------- 1 | # Week 1 2 | Get an introduction to the Ray framework and data parallelism. 
Topics include how to: 3 | 4 | - Run tasks in parallel using remote functions 5 | - Make dependencies between remote tasks through object IDs 6 | - Create nested tasks and remote functions within remote functions 7 | 8 | -------------------------------------------------------------------------------- /week_1/week_1_exercise_1.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Week 1: Exercise 1 - Simple Data Parallel Example\n", 8 | "\n", 9 | "**GOAL:** The goal of this exercise is to show how to run simple tasks in parallel.\n", 10 | "\n", 11 | "This script is too slow, and the computation is embarrassingly parallel. In this exercise, you will use Ray to execute the functions in parallel to speed it up.\n", 12 | "\n", 13 | "### Concept for this Exercise - Remote Functions\n", 14 | "\n", 15 | "The standard way to turn a Python function into a remote function is to add the `@ray.remote` decorator. Here is an example.\n", 16 | "\n", 17 | "```python\n", 18 | "# A regular Python function.\n", 19 | "def regular_function():\n", 20 | " return 1\n", 21 | "\n", 22 | "# A Ray remote function.\n", 23 | "@ray.remote\n", 24 | "def remote_function():\n", 25 | " return 1\n", 26 | "```\n", 27 | "\n", 28 | "The differences are the following:\n", 29 | "\n", 30 | "1. **Invocation:** The regular version is called with `regular_function()`, whereas the remote version is called with `remote_function.remote()`.\n", 31 | "2. **Return values:** `regular_function` immediately executes and returns `1`, whereas `remote_function` immediately returns an object ID (a future) and then creates a task that will be executed on a worker process. 
The result can be obtained with `ray.get`.\n", 32 | " ```python\n", 33 | " >>> regular_function()\n", 34 | " 1\n", 35 | " \n", 36 | " >>> remote_function.remote()\n", 37 | " ObjectID(1c80d6937802cd7786ad25e50caf2f023c95e350)\n", 38 | " \n", 39 | " >>> ray.get(remote_function.remote())\n", 40 | " 1\n", 41 | " ```\n", 42 | "3. **Parallelism:** Invocations of `regular_function` happen **serially**, for example\n", 43 | " ```python\n", 44 | " # These happen serially.\n", 45 | " for _ in range(4):\n", 46 | " regular_function()\n", 47 | " ```\n", 48 | " whereas invocations of `remote_function` happen in **parallel**, for example\n", 49 | " ```python\n", 50 | " # These happen in parallel.\n", 51 | " for _ in range(4):\n", 52 | " remote_function.remote()\n", 53 | " ```" 54 | ] 55 | }, 56 | { 57 | "cell_type": "code", 58 | "execution_count": null, 59 | "metadata": {}, 60 | "outputs": [], 61 | "source": [ 62 | "from __future__ import absolute_import\n", 63 | "from __future__ import division\n", 64 | "from __future__ import print_function\n", 65 | "\n", 66 | "import ray\n", 67 | "import time" 68 | ] 69 | }, 70 | { 71 | "cell_type": "markdown", 72 | "metadata": {}, 73 | "source": [ 74 | "Start Ray. By default, Ray does not schedule more tasks concurrently than there are CPUs. This example requires four tasks to run concurrently, so we tell Ray that there are four CPUs. Usually this is not done and Ray computes the number of CPUs using `psutil.cpu_count()`. The argument `ignore_reinit_error=True` just ignores errors if the cell is run multiple times.\n", 75 | "\n", 76 | "The call to `ray.init` starts a number of processes." 77 | ] 78 | }, 79 | { 80 | "cell_type": "code", 81 | "execution_count": null, 82 | "metadata": {}, 83 | "outputs": [], 84 | "source": [ 85 | "ray.init(num_cpus=4, include_webui=False, ignore_reinit_error=True)" 86 | ] 87 | }, 88 | { 89 | "cell_type": "markdown", 90 | "metadata": {}, 91 | "source": [ 92 | "**EXERCISE:** The function below is slow. 
Turn it into a remote function using the `@ray.remote` decorator." 93 | ] 94 | }, 95 | { 96 | "cell_type": "code", 97 | "execution_count": null, 98 | "metadata": {}, 99 | "outputs": [], 100 | "source": [ 101 | "# This function is a proxy for a more interesting and computationally\n", 102 | "# intensive function.\n", 103 | "def slow_function(i):\n", 104 | " time.sleep(1)\n", 105 | " return i" 106 | ] 107 | }, 108 | { 109 | "cell_type": "markdown", 110 | "metadata": {}, 111 | "source": [ 112 | "**EXERCISE:** The loop below takes too long. The four function calls could be executed in parallel. Instead of four seconds, it should only take one second. Once `slow_function` has been made a remote function, execute these four tasks in parallel by calling `slow_function.remote()`. Then obtain the results by calling `ray.get` on a list of the resulting object IDs." 113 | ] 114 | }, 115 | { 116 | "cell_type": "code", 117 | "execution_count": null, 118 | "metadata": {}, 119 | "outputs": [], 120 | "source": [ 121 | "# Sleep a little to improve the accuracy of the timing measurements below.\n", 122 | "# We do this because workers may still be starting up in the background.\n", 123 | "time.sleep(2.0)\n", 124 | "start_time = time.time()\n", 125 | "\n", 126 | "results = [slow_function(i) for i in range(4)]\n", 127 | "\n", 128 | "end_time = time.time()\n", 129 | "duration = end_time - start_time\n", 130 | "\n", 131 | "print('The results are {}. This took {} seconds. Run the next cell to see '\n", 132 | " 'if the exercise was done correctly.'.format(results, duration))" 133 | ] 134 | }, 135 | { 136 | "cell_type": "markdown", 137 | "metadata": {}, 138 | "source": [ 139 | "**VERIFY:** Run some checks to verify that the changes you made to the code were correct. Some of the checks should fail when you initially run the cells. After completing the exercises, the checks should pass." 
140 | ] 141 | }, 142 | { 143 | "cell_type": "code", 144 | "execution_count": null, 145 | "metadata": {}, 146 | "outputs": [], 147 | "source": [ 148 | "assert results == [0, 1, 2, 3], 'Did you remember to call ray.get?'\n", 149 | "assert duration < 1.1, ('The loop took {} seconds. This is too slow.'\n", 150 | " .format(duration))\n", 151 | "assert duration > 1, ('The loop took {} seconds. This is too fast.'\n", 152 | " .format(duration))\n", 153 | "\n", 154 | "print('Success! The example took {} seconds.'.format(duration))" 155 | ] 156 | }, 157 | { 158 | "cell_type": "markdown", 159 | "metadata": {}, 160 | "source": [ 161 | "**EXERCISE:** Use the UI to view the task timeline and to verify that the four tasks were executed in parallel. After running the cell below, you'll need to click on **View task timeline**.\n", 162 | "- Using the **second** button, you can click and drag to **move** the timeline.\n", 163 | "- Using the **third** button, you can click and drag to **zoom**. You can also zoom by holding \"alt\" and scrolling.\n", 164 | "\n", 165 | "**NOTE:** Normally our UI is used as a separate Jupyter notebook. However, for simplicity we embedded the relevant feature here in this notebook.\n", 166 | "\n", 167 | "**NOTE:** The first time you click **View task timeline** it may take **several minutes** to start up. This will change.\n", 168 | "\n", 169 | "**NOTE:** If you run more tasks and want to regenerate the UI, you need to move the slider bar a little bit and then click **View task timeline** again.\n", 170 | "\n", 171 | "**NOTE:** The timeline visualization may only work in **Chrome**." 
172 | ] 173 | }, 174 | { 175 | "cell_type": "code", 176 | "execution_count": null, 177 | "metadata": {}, 178 | "outputs": [], 179 | "source": [ 180 | "import ray.experimental.ui as ui\n", 181 | "ui.task_timeline()" 182 | ] 183 | } 184 | ], 185 | "metadata": { 186 | "kernelspec": { 187 | "display_name": "Python 3", 188 | "language": "python", 189 | "name": "python3" 190 | }, 191 | "language_info": { 192 | "codemirror_mode": { 193 | "name": "ipython", 194 | "version": 3 195 | }, 196 | "file_extension": ".py", 197 | "mimetype": "text/x-python", 198 | "name": "python", 199 | "nbconvert_exporter": "python", 200 | "pygments_lexer": "ipython3", 201 | "version": "3.6.8" 202 | } 203 | }, 204 | "nbformat": 4, 205 | "nbformat_minor": 2 206 | } 207 | -------------------------------------------------------------------------------- /week_1/week_1_exercise_2.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Week 1: Exercise 2 - Parallel Data Processing with Task Dependencies\n", 8 | "\n", 9 | "**GOAL:** The goal of this exercise is to show how to pass object IDs into remote functions to encode dependencies between tasks.\n", 10 | "\n", 11 | "In this exercise, we construct a sequence of tasks each of which depends on the previous mimicking a data parallel application. 
Within each sequence, tasks are executed serially, but multiple sequences can be executed in parallel.\n", 12 | "\n", 13 | "In this exercise, you will use Ray to parallelize the computation below and speed it up.\n", 14 | "\n", 15 | "### Concept for this Exercise - Task Dependencies\n", 16 | "\n", 17 | "Suppose we have a remote function defined as follows.\n", 18 | "\n", 19 | "```python\n", 20 | "@ray.remote\n", 21 | "def f(x):\n", 22 | " return x\n", 23 | "```\n", 24 | "\n", 25 | "Arguments can be passed into remote functions as usual.\n", 26 | "\n", 27 | "```python\n", 28 | ">>> x1_id = f.remote(1)\n", 29 | ">>> ray.get(x1_id)\n", 30 | "1\n", 31 | "\n", 32 | ">>> x2_id = f.remote([1, 2, 3])\n", 33 | ">>> ray.get(x2_id)\n", 34 | "[1, 2, 3]\n", 35 | "```\n", 36 | "\n", 37 | "**Object IDs** can also be passed into remote functions. When the function actually gets executed, **the argument will be retrieved as a regular Python object**.\n", 38 | "\n", 39 | "```python\n", 40 | ">>> y1_id = f.remote(x1_id)\n", 41 | ">>> ray.get(y1_id)\n", 42 | "1\n", 43 | "\n", 44 | ">>> y2_id = f.remote(x2_id)\n", 45 | ">>> ray.get(y2_id)\n", 46 | "[1, 2, 3]\n", 47 | "```\n", 48 | "\n", 49 | "So when implementing a remote function, the function should expect a regular Python object regardless of whether the caller passes in a regular Python object or an object ID.\n", 50 | "\n", 51 | "**Task dependencies affect scheduling.** In the example above, the task that creates `y1_id` depends on the task that creates `x1_id`. This has the following implications.\n", 52 | "\n", 53 | "- The second task will not be executed until the first task has finished executing.\n", 54 | "- If the two tasks are scheduled on different machines, the output of the first task (the value corresponding to `x1_id`) will be copied over the network to the machine where the second task is scheduled." 
55 | ] 56 | }, 57 | { 58 | "cell_type": "code", 59 | "execution_count": null, 60 | "metadata": {}, 61 | "outputs": [], 62 | "source": [ 63 | "from __future__ import absolute_import\n", 64 | "from __future__ import division\n", 65 | "from __future__ import print_function\n", 66 | "\n", 67 | "import numpy as np\n", 68 | "import ray\n", 69 | "import time" 70 | ] 71 | }, 72 | { 73 | "cell_type": "code", 74 | "execution_count": null, 75 | "metadata": {}, 76 | "outputs": [], 77 | "source": [ 78 | "ray.init(num_cpus=4, include_webui=False, ignore_reinit_error=True)" 79 | ] 80 | }, 81 | { 82 | "cell_type": "markdown", 83 | "metadata": {}, 84 | "source": [ 85 | "These are some helper functions that mimic an example pattern of a data parallel application.\n", 86 | "\n", 87 | "**EXERCISE:** You will need to turn all of these functions into remote functions. When you turn these functions into remote functions, you do not have to worry about whether the caller passes in an object ID or a regular object. In both cases, the arguments will be regular objects when the function executes. This means that even if you pass in an object ID, you **do not need to call `ray.get`** inside of these remote functions." 
88 | ] 89 | }, 90 | { 91 | "cell_type": "code", 92 | "execution_count": null, 93 | "metadata": {}, 94 | "outputs": [], 95 | "source": [ 96 | "def load_data(filename):\n", 97 | " time.sleep(0.1)\n", 98 | " return np.ones((1000, 100))\n", 99 | "\n", 100 | "def normalize_data(data):\n", 101 | " time.sleep(0.1)\n", 102 | " return data - np.mean(data, axis=0)\n", 103 | "\n", 104 | "def extract_features(normalized_data):\n", 105 | " time.sleep(0.1)\n", 106 | " return np.hstack([normalized_data, normalized_data ** 2])\n", 107 | "\n", 108 | "def compute_loss(features):\n", 109 | " num_data, dim = features.shape\n", 110 | " time.sleep(0.1)\n", 111 | " return np.sum((np.dot(features, np.ones(dim)) - np.ones(num_data)) ** 2)\n", 112 | "\n", 113 | "assert hasattr(load_data, 'remote'), 'load_data must be a remote function'\n", 114 | "assert hasattr(normalize_data, 'remote'), 'normalize_data must be a remote function'\n", 115 | "assert hasattr(extract_features, 'remote'), 'extract_features must be a remote function'\n", 116 | "assert hasattr(compute_loss, 'remote'), 'compute_loss must be a remote function'" 117 | ] 118 | }, 119 | { 120 | "cell_type": "markdown", 121 | "metadata": {}, 122 | "source": [ 123 | "**EXERCISE:** The loop below takes too long. Parallelize the four passes through the loop by turning `load_data`, `normalize_data`, `extract_features`, and `compute_loss` into remote functions and then retrieving the losses with `ray.get`.\n", 124 | "\n", 125 | "**NOTE:** You should only use **ONE** call to `ray.get`. For example, the object ID returned by `load_data` should be passed directly into `normalize_data` without needing to be retrieved by the driver." 
126 | ] 127 | }, 128 | { 129 | "cell_type": "code", 130 | "execution_count": null, 131 | "metadata": {}, 132 | "outputs": [], 133 | "source": [ 134 | "# Sleep a little to improve the accuracy of the timing measurements below.\n", 135 | "time.sleep(2.0)\n", 136 | "start_time = time.time()\n", 137 | "\n", 138 | "losses = []\n", 139 | "for filename in ['file1', 'file2', 'file3', 'file4']:\n", 140 | " inner_start = time.time()\n", 141 | "\n", 142 | " data = load_data(filename)\n", 143 | " normalized_data = normalize_data(data)\n", 144 | " features = extract_features(normalized_data)\n", 145 | " loss = compute_loss(features)\n", 146 | " losses.append(loss)\n", 147 | " \n", 148 | " inner_end = time.time()\n", 149 | " \n", 150 | " if inner_end - inner_start >= 0.1:\n", 151 | " raise Exception('You may be calling ray.get inside of the for loop! '\n", 152 | " 'Doing this will prevent parallelism from being exposed. '\n", 153 | " 'Make sure to only call ray.get once outside of the for loop.')\n", 154 | "\n", 155 | "print('The losses are {}.'.format(losses) + '\\n')\n", 156 | "loss = sum(losses)\n", 157 | "\n", 158 | "end_time = time.time()\n", 159 | "duration = end_time - start_time\n", 160 | "\n", 161 | "print('The loss is {}. This took {} seconds. Run the next cell to see '\n", 162 | " 'if the exercise was done correctly.'.format(loss, duration))" 163 | ] 164 | }, 165 | { 166 | "cell_type": "markdown", 167 | "metadata": {}, 168 | "source": [ 169 | "**VERIFY:** Run some checks to verify that the changes you made to the code were correct. Some of the checks should fail when you initially run the cells. After completing the exercises, the checks should pass." 170 | ] 171 | }, 172 | { 173 | "cell_type": "code", 174 | "execution_count": null, 175 | "metadata": {}, 176 | "outputs": [], 177 | "source": [ 178 | "assert loss == 4000\n", 179 | "assert duration < 0.8, ('The loop took {} seconds. 
This is too slow.'\n", 180 | " .format(duration))\n", 181 | "assert duration > 0.4, ('The loop took {} seconds. This is too fast.'\n", 182 | " .format(duration))\n", 183 | "\n", 184 | "print('Success! The example took {} seconds.'.format(duration))" 185 | ] 186 | }, 187 | { 188 | "cell_type": "markdown", 189 | "metadata": {}, 190 | "source": [ 191 | "**EXERCISE:** Use the UI to view the task timeline and to verify that the relevant tasks were executed in parallel. After running the cell below, you'll need to click on **View task timeline**.\n", 192 | "- Using the **second** button, you can click and drag to **move** the timeline.\n", 193 | "- Using the **third** button, you can click and drag to **zoom**. You can also zoom by holding \"alt\" and scrolling.\n", 194 | "\n", 195 | "In the timeline, click on **View Options** and select **Flow Events** to visualize task dependencies." 196 | ] 197 | }, 198 | { 199 | "cell_type": "code", 200 | "execution_count": null, 201 | "metadata": {}, 202 | "outputs": [], 203 | "source": [ 204 | "import ray.experimental.ui as ui\n", 205 | "ui.task_timeline()" 206 | ] 207 | } 208 | ], 209 | "metadata": { 210 | "kernelspec": { 211 | "display_name": "Python 3", 212 | "language": "python", 213 | "name": "python3" 214 | }, 215 | "language_info": { 216 | "codemirror_mode": { 217 | "name": "ipython", 218 | "version": 3 219 | }, 220 | "file_extension": ".py", 221 | "mimetype": "text/x-python", 222 | "name": "python", 223 | "nbconvert_exporter": "python", 224 | "pygments_lexer": "ipython3", 225 | "version": "3.6.8" 226 | } 227 | }, 228 | "nbformat": 4, 229 | "nbformat_minor": 2 230 | } 231 | -------------------------------------------------------------------------------- /week_1/week_1_exercise_3.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Week 1: Exercise 3 - Nested Parallelism\n", 8 | "\n", 9 | 
"**GOAL:** The goal of this exercise is to show how to create nested tasks by calling a remote function inside of another remote function.\n", 10 | "\n", 11 | "In this exercise, you will implement the structure of a parallel hyperparameter sweep which trains a number of models in parallel. Each model will be trained using parallel gradient computations.\n", 12 | "\n", 13 | "### Concepts for this Exercise - Nested Remote Functions\n", 14 | "\n", 15 | "Remote functions can call other functions. For example, consider the following.\n", 16 | "\n", 17 | "```python\n", 18 | "@ray.remote\n", 19 | "def f():\n", 20 | " return 1\n", 21 | "\n", 22 | "@ray.remote\n", 23 | "def g():\n", 24 | " # Call f 4 times and return the resulting object IDs.\n", 25 | " return [f.remote() for _ in range(4)]\n", 26 | "\n", 27 | "@ray.remote\n", 28 | "def h():\n", 29 | " # Call f 4 times, block until those 4 tasks finish,\n", 30 | " # retrieve the results, and return the values.\n", 31 | " return ray.get([f.remote() for _ in range(4)])\n", 32 | "```\n", 33 | "\n", 34 | "Then calling `g` and `h` produces the following behavior.\n", 35 | "\n", 36 | "```python\n", 37 | ">>> ray.get(g.remote())\n", 38 | "[ObjectID(b1457ba0911ae84989aae86f89409e953dd9a80e),\n", 39 | " ObjectID(7c14a1d13a56d8dc01e800761a66f09201104275),\n", 40 | " ObjectID(99763728ffc1a2c0766a2000ebabded52514e9a6),\n", 41 | " ObjectID(9c2f372e1933b04b2936bb6f58161285829b9914)]\n", 42 | "\n", 43 | ">>> ray.get(h.remote())\n", 44 | "[1, 1, 1, 1]\n", 45 | "```\n", 46 | "\n", 47 | "**One limitation** is that the definition of `f` must come before the definitions of `g` and `h` because as soon as `g` is defined, it will be pickled and shipped to the workers, and so if `f` hasn't been defined yet, the definition will be incomplete." 
48 | ] 49 | }, 50 | { 51 | "cell_type": "code", 52 | "execution_count": null, 53 | "metadata": {}, 54 | "outputs": [], 55 | "source": [ 56 | "from __future__ import absolute_import\n", 57 | "from __future__ import division\n", 58 | "from __future__ import print_function\n", 59 | "\n", 60 | "import numpy as np\n", 61 | "import ray\n", 62 | "import time" 63 | ] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "execution_count": null, 68 | "metadata": {}, 69 | "outputs": [], 70 | "source": [ 71 | "ray.init(num_cpus=9, include_webui=False, ignore_reinit_error=True)" 72 | ] 73 | }, 74 | { 75 | "cell_type": "markdown", 76 | "metadata": {}, 77 | "source": [ 78 | "This example represents a hyperparameter sweep in which multiple models are trained in parallel. Each model training task also performs data parallel gradient computations.\n", 79 | "\n", 80 | "**EXERCISE:** Turn `compute_gradient` and `train_model` into remote functions so that they can be executed in parallel. Inside of `train_model`, do the calls to `compute_gradient` in parallel and fetch the results using `ray.get`." 81 | ] 82 | }, 83 | { 84 | "cell_type": "code", 85 | "execution_count": null, 86 | "metadata": {}, 87 | "outputs": [], 88 | "source": [ 89 | "def compute_gradient(data, current_model):\n", 90 | " time.sleep(0.03)\n", 91 | " return 1\n", 92 | "\n", 93 | "def train_model(hyperparameters):\n", 94 | " current_model = 0\n", 95 | " # Iteratively improve the current model. This outer loop cannot be parallelized.\n", 96 | " for _ in range(10):\n", 97 | " # EXERCISE: Parallelize the list comprehension in the line below. After you\n", 98 | " # turn \"compute_gradient\" into a remote function, you will need to call it\n", 99 | " # with \".remote\". 
The results must be retrieved with \"ray.get\" before \"sum\"\n", 100 | " # is called.\n", 101 | " total_gradient = sum([compute_gradient(j, current_model) for j in range(2)])\n", 102 | " current_model += total_gradient\n", 103 | "\n", 104 | " return current_model\n", 105 | "\n", 106 | "assert hasattr(compute_gradient, 'remote'), 'compute_gradient must be a remote function'\n", 107 | "assert hasattr(train_model, 'remote'), 'train_model must be a remote function'" 108 | ] 109 | }, 110 | { 111 | "cell_type": "markdown", 112 | "metadata": {}, 113 | "source": [ 114 | "**EXERCISE:** The code below runs 3 hyperparameter experiments. Change this to run the experiments in parallel." 115 | ] 116 | }, 117 | { 118 | "cell_type": "code", 119 | "execution_count": null, 120 | "metadata": {}, 121 | "outputs": [], 122 | "source": [ 123 | "# Sleep a little to improve the accuracy of the timing measurements below.\n", 124 | "time.sleep(2.0)\n", 125 | "start_time = time.time()\n", 126 | "\n", 127 | "# Run some hyperparameter experiments.\n", 128 | "results = []\n", 129 | "for hyperparameters in [{'learning_rate': 1e-1, 'batch_size': 100},\n", 130 | " {'learning_rate': 1e-2, 'batch_size': 100},\n", 131 | " {'learning_rate': 1e-3, 'batch_size': 100}]:\n", 132 | " results.append(train_model(hyperparameters))\n", 133 | "\n", 134 | "# EXERCISE: Once you've turned \"results\" into a list of Ray ObjectIDs\n", 135 | "# by calling train_model.remote, you will need to turn \"results\" back\n", 136 | "# into a list of integers, e.g., by doing \"results = ray.get(results)\".\n", 137 | "\n", 138 | "end_time = time.time()\n", 139 | "duration = end_time - start_time\n", 140 | "\n", 141 | "assert all([isinstance(x, int) for x in results]), 'Looks like \"results\" is {}. 
You may have forgotten to call ray.get.'.format(results)" 142 | ] 143 | }, 144 | { 145 | "cell_type": "markdown", 146 | "metadata": {}, 147 | "source": [ 148 | "**VERIFY:** Run some checks to verify that the changes you made to the code were correct. Some of the checks should fail when you initially run the cells. After completing the exercises, the checks should pass." 149 | ] 150 | }, 151 | { 152 | "cell_type": "code", 153 | "execution_count": null, 154 | "metadata": {}, 155 | "outputs": [], 156 | "source": [ 157 | "assert results == [20, 20, 20]\n", 158 | "assert duration < 0.5, ('The experiments ran in {} seconds. This is too '\n", 159 | " 'slow.'.format(duration))\n", 160 | "assert duration > 0.3, ('The experiments ran in {} seconds. This is too '\n", 161 | " 'fast.'.format(duration))\n", 162 | "\n", 163 | "print('Success! The example took {} seconds.'.format(duration))" 164 | ] 165 | }, 166 | { 167 | "cell_type": "markdown", 168 | "metadata": {}, 169 | "source": [ 170 | "**EXERCISE:** Use the UI to view the task timeline and to verify that the pattern makes sense." 
171 | ] 172 | }, 173 | { 174 | "cell_type": "code", 175 | "execution_count": null, 176 | "metadata": {}, 177 | "outputs": [], 178 | "source": [ 179 | "import ray.experimental.ui as ui\n", 180 | "ui.task_timeline()" 181 | ] 182 | } 183 | ], 184 | "metadata": { 185 | "kernelspec": { 186 | "display_name": "Python 3", 187 | "language": "python", 188 | "name": "python3" 189 | }, 190 | "language_info": { 191 | "codemirror_mode": { 192 | "name": "ipython", 193 | "version": 3 194 | }, 195 | "file_extension": ".py", 196 | "mimetype": "text/x-python", 197 | "name": "python", 198 | "nbconvert_exporter": "python", 199 | "pygments_lexer": "ipython3", 200 | "version": "3.6.8" 201 | } 202 | }, 203 | "nbformat": 4, 204 | "nbformat_minor": 2 205 | } 206 | -------------------------------------------------------------------------------- /week_2/README.md: -------------------------------------------------------------------------------- 1 | # Week 2 2 | Learn about Ray actors, which are remote functions that have states. Additional topics: 3 | 4 | - How to implement Ray actors using Python* classes 5 | - How to use different hardware resources for various AI tasks, such as training and inference 6 | - The analytics ecosystem, which is made up of toolkits, libraries, solutions, and hardware 7 | -------------------------------------------------------------------------------- /week_2/week_2_exercise_1.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Week 2: Exercise 1 - Introducing Actors\n", 8 | "\n", 9 | "**Goal:** The goal of this exercise is to show how to create an actor and how to call actor methods.\n", 10 | "\n", 11 | "See the documentation on actors at http://ray.readthedocs.io/en/latest/actors.html.\n", 12 | "\n", 13 | "Sometimes you need a \"worker\" process to have \"state\". 
For example, that state might be a neural network, a simulator environment, a counter, or something else entirely. However, remote functions are side-effect free. That is, they operate on inputs and produce outputs, but they don't change the state of the worker they execute on.\n", 14 | "\n", 15 | "Actors are different. When we instantiate an actor, a brand new worker is created, and all methods that are called on that actor are executed on the newly created worker.\n", 16 | "\n", 17 | "This means that with a single actor, no parallelism can be achieved because calls to the actor's methods will be executed one at a time. However, multiple actors can be created and methods can be executed on them in parallel.\n", 18 | "\n", 19 | "### Concepts for this Exercise - Actors\n", 20 | "\n", 21 | "To create an actor, decorate a Python class with the `@ray.remote` decorator.\n", 22 | "\n", 23 | "```python\n", 24 | "@ray.remote\n", 25 | "class Example(object):\n", 26 | " def __init__(self, x):\n", 27 | " self.x = x\n", 28 | " \n", 29 | " def set(self, x):\n", 30 | " self.x = x\n", 31 | " \n", 32 | " def get(self):\n", 33 | " return self.x\n", 34 | "```\n", 35 | "\n", 36 | "Like regular Python classes, **actors encapsulate state that is shared across actor method invocations**.\n", 37 | "\n", 38 | "Actor classes differ from regular Python classes in the following ways.\n", 39 | "1. **Instantiation:** A regular class would be instantiated via `e = Example(1)`. Actors are instantiated via\n", 40 | " ```python\n", 41 | " e = Example.remote(1)\n", 42 | " ```\n", 43 | " When an actor is instantiated, a **new worker process** is created by a local scheduler somewhere in the cluster.\n", 44 | "2. **Method Invocation:** Methods of a regular class would be invoked via `e.set(2)` or `e.get()`. 
Actor methods are invoked differently.\n", 45 | " ```python\n", 46 | " >>> e.set.remote(2)\n", 47 | " ObjectID(d966aa9b6486331dc2257522734a69ff603e5a1c)\n", 48 | " \n", 49 | " >>> e.get.remote()\n", 50 | " ObjectID(7c432c085864ed4c7c18cf112377a608676afbc3)\n", 51 | " ```\n", 52 | "3. **Return Values:** Actor methods are non-blocking. They immediately return an object ID and **they create a task which is scheduled on the actor worker**. The result can be retrieved with `ray.get`.\n", 53 | " ```python\n", 54 | " >>> ray.get(e.set.remote(2))\n", 55 | " None\n", 56 | " \n", 57 | " >>> ray.get(e.get.remote())\n", 58 | " 2\n", 59 | " ```" 60 | ] 61 | }, 62 | { 63 | "cell_type": "code", 64 | "execution_count": null, 65 | "metadata": {}, 66 | "outputs": [], 67 | "source": [ 68 | "from __future__ import absolute_import\n", 69 | "from __future__ import division\n", 70 | "from __future__ import print_function\n", 71 | "\n", 72 | "import numpy as np\n", 73 | "import ray\n", 74 | "import time" 75 | ] 76 | }, 77 | { 78 | "cell_type": "code", 79 | "execution_count": null, 80 | "metadata": {}, 81 | "outputs": [], 82 | "source": [ 83 | "ray.init(num_cpus=4, include_webui=False, ignore_reinit_error=True)" 84 | ] 85 | }, 86 | { 87 | "cell_type": "markdown", 88 | "metadata": {}, 89 | "source": [ 90 | "**EXERCISE:** Change the `Foo` class to be an actor class by using the `@ray.remote` decorator." 
91 | ] 92 | }, 93 | { 94 | "cell_type": "code", 95 | "execution_count": null, 96 | "metadata": {}, 97 | "outputs": [], 98 | "source": [ 99 | "class Foo(object):\n", 100 | " def __init__(self):\n", 101 | " self.counter = 0\n", 102 | "\n", 103 | " def reset(self):\n", 104 | " self.counter = 0\n", 105 | "\n", 106 | " def increment(self):\n", 107 | " time.sleep(0.5)\n", 108 | " self.counter += 1\n", 109 | " return self.counter\n", 110 | "\n", 111 | "assert hasattr(Foo, 'remote'), 'You need to turn \"Foo\" into an actor with @ray.remote.'" 112 | ] 113 | }, 114 | { 115 | "cell_type": "markdown", 116 | "metadata": {}, 117 | "source": [ 118 | "**EXERCISE:** Change the instantiations below to create two actors by calling `Foo.remote()`." 119 | ] 120 | }, 121 | { 122 | "cell_type": "code", 123 | "execution_count": null, 124 | "metadata": {}, 125 | "outputs": [], 126 | "source": [ 127 | "# Create two Foo objects.\n", 128 | "f1 = Foo()\n", 129 | "f2 = Foo()" 130 | ] 131 | }, 132 | { 133 | "cell_type": "markdown", 134 | "metadata": {}, 135 | "source": [ 136 | "**EXERCISE:** Parallelize the code below. The two actors can execute methods in parallel (though each actor can only execute one method at a time)." 137 | ] 138 | }, 139 | { 140 | "cell_type": "code", 141 | "execution_count": null, 142 | "metadata": {}, 143 | "outputs": [], 144 | "source": [ 145 | "# Sleep a little to improve the accuracy of the timing measurements below.\n", 146 | "time.sleep(2.0)\n", 147 | "start_time = time.time()\n", 148 | "\n", 149 | "# Reset the actor state so that we can run this cell multiple times without\n", 150 | "# changing the results.\n", 151 | "f1.reset()\n", 152 | "f2.reset()\n", 153 | "\n", 154 | "# We want to parallelize this code. However, it is not straightforward to\n", 155 | "# make \"increment\" a remote function, because state is shared (the value of\n", 156 | "# \"self.counter\") between subsequent calls to \"increment\". 
In this case, it\n", 157 | "# makes sense to use actors.\n", 158 | "results = []\n", 159 | "for _ in range(5):\n", 160 | " results.append(f1.increment())\n", 161 | " results.append(f2.increment())\n", 162 | "\n", 163 | "end_time = time.time()\n", 164 | "duration = end_time - start_time\n", 165 | "\n", 166 | "assert not any([isinstance(result, ray.ObjectID) for result in results]), 'Looks like \"results\" is {}. You may have forgotten to call ray.get.'.format(results)" 167 | ] 168 | }, 169 | { 170 | "cell_type": "markdown", 171 | "metadata": {}, 172 | "source": [ 173 | "**VERIFY:** Run some checks to verify that the changes you made to the code were correct. Some of the checks should fail when you initially run the cells. After completing the exercises, the checks should pass." 174 | ] 175 | }, 176 | { 177 | "cell_type": "code", 178 | "execution_count": null, 179 | "metadata": {}, 180 | "outputs": [], 181 | "source": [ 182 | "assert results == [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]\n", 183 | "\n", 184 | "assert duration < 3, ('The experiments ran in {} seconds. This is too '\n", 185 | " 'slow.'.format(duration))\n", 186 | "assert duration > 2.5, ('The experiments ran in {} seconds. This is too '\n", 187 | " 'fast.'.format(duration))\n", 188 | "\n", 189 | "print('Success! 
The example took {} seconds.'.format(duration))" 190 | ] 191 | } 192 | ], 193 | "metadata": { 194 | "kernelspec": { 195 | "display_name": "Python 3", 196 | "language": "python", 197 | "name": "python3" 198 | }, 199 | "language_info": { 200 | "codemirror_mode": { 201 | "name": "ipython", 202 | "version": 3 203 | }, 204 | "file_extension": ".py", 205 | "mimetype": "text/x-python", 206 | "name": "python", 207 | "nbconvert_exporter": "python", 208 | "pygments_lexer": "ipython3", 209 | "version": "3.6.8" 210 | } 211 | }, 212 | "nbformat": 4, 213 | "nbformat_minor": 2 214 | } 215 | -------------------------------------------------------------------------------- /week_2/week_2_exercise_2.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Week 2: Exercise 2 - Actor Handles\n", 8 | "\n", 9 | "**GOAL:** The goal of this exercise is to show how to pass around actor handles.\n", 10 | "\n", 11 | "Suppose we wish to have multiple tasks invoke methods on the same actor. For example, we may have a single actor that records logging information from a number of tasks. 
We can achieve this by passing a handle to the actor as an argument into the relevant tasks.\n", 12 | "\n", 13 | "### Concepts for this Exercise - Actor Handles\n", 14 | "\n", 15 | "First of all, suppose we've created an actor as follows.\n", 16 | "\n", 17 | "```python\n", 18 | "@ray.remote\n", 19 | "class Actor(object):\n", 20 | " def method(self):\n", 21 | " pass\n", 22 | "\n", 23 | "# Create the actor\n", 24 | "actor = Actor.remote()\n", 25 | "```\n", 26 | "\n", 27 | "Then we can define a remote function (or another actor) that takes an actor handle as an argument.\n", 28 | "\n", 29 | "```python\n", 30 | "@ray.remote\n", 31 | "def f(actor):\n", 32 | " # We can invoke methods on the actor.\n", 33 | " x_id = actor.method.remote()\n", 34 | " # We can block and get the results.\n", 35 | " return ray.get(x_id)\n", 36 | "```\n", 37 | "\n", 38 | "Then we can invoke the remote function a few times and pass in the actor handle.\n", 39 | "\n", 40 | "```python\n", 41 | "# Each of the three tasks created below will invoke methods on the same actor.\n", 42 | "f.remote(actor)\n", 43 | "f.remote(actor)\n", 44 | "f.remote(actor)\n", 45 | "```" 46 | ] 47 | }, 48 | { 49 | "cell_type": "code", 50 | "execution_count": null, 51 | "metadata": {}, 52 | "outputs": [], 53 | "source": [ 54 | "from __future__ import absolute_import\n", 55 | "from __future__ import division\n", 56 | "from __future__ import print_function\n", 57 | "\n", 58 | "from collections import defaultdict\n", 59 | "import ray\n", 60 | "import time" 61 | ] 62 | }, 63 | { 64 | "cell_type": "code", 65 | "execution_count": null, 66 | "metadata": {}, 67 | "outputs": [], 68 | "source": [ 69 | "ray.init(num_cpus=4, include_webui=False, ignore_reinit_error=True)" 70 | ] 71 | }, 72 | { 73 | "cell_type": "markdown", 74 | "metadata": {}, 75 | "source": [ 76 | "In this exercise, we're going to write some code that runs several \"experiments\" in parallel and has each experiment log its results to an actor. 
The driver script can then periodically pull the results from the logging actor.\n", 77 | "\n", 78 | "**EXERCISE:** Turn this `LoggingActor` class into an actor class." 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": null, 84 | "metadata": {}, 85 | "outputs": [], 86 | "source": [ 87 | "class LoggingActor(object):\n", 88 | " def __init__(self):\n", 89 | " self.logs = defaultdict(lambda: [])\n", 90 | " \n", 91 | " def log(self, index, message):\n", 92 | " self.logs[index].append(message)\n", 93 | " \n", 94 | " def get_logs(self):\n", 95 | " return dict(self.logs)\n", 96 | "\n", 97 | "\n", 98 | "assert hasattr(LoggingActor, 'remote'), ('You need to turn LoggingActor into an '\n", 99 | " 'actor (by using the ray.remote keyword).')" 100 | ] 101 | }, 102 | { 103 | "cell_type": "markdown", 104 | "metadata": {}, 105 | "source": [ 106 | "**EXERCISE:** Instantiate the actor." 107 | ] 108 | }, 109 | { 110 | "cell_type": "code", 111 | "execution_count": null, 112 | "metadata": {}, 113 | "outputs": [], 114 | "source": [ 115 | "logging_actor = LoggingActor()\n", 116 | "\n", 117 | "# Some checks to make sure this was done correctly.\n", 118 | "assert hasattr(logging_actor, 'get_logs')" 119 | ] 120 | }, 121 | { 122 | "cell_type": "markdown", 123 | "metadata": {}, 124 | "source": [ 125 | "Now we define a remote function that runs and pushes its logs to the `LoggingActor`.\n", 126 | "\n", 127 | "**EXERCISE:** Modify this function so that it invokes methods correctly on `logging_actor` (you need to change the way you call the `log` method)." 
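The shape of this exercise, with many workers pushing messages to one shared logger that the driver can query, can be sketched on a single machine with only the standard library. Threads stand in for Ray tasks and a lock-protected object stands in for the logging actor; `SharedLogger` and `run_experiment` here are made-up illustrative names, not part of the exercise solution.

```python
# Stdlib sketch: several worker threads log to one shared object, and the
# main thread reads the accumulated logs -- analogous to passing an actor
# handle into several Ray tasks. Illustration only.
import threading
from collections import defaultdict

class SharedLogger:
    def __init__(self):
        self._lock = threading.Lock()
        self._logs = defaultdict(list)

    def log(self, index, message):
        with self._lock:
            self._logs[index].append(message)

    def get_logs(self):
        # Return a snapshot so callers cannot mutate internal state.
        with self._lock:
            return {k: list(v) for k, v in self._logs.items()}

def run_experiment(index, logger, iterations=5):
    for i in range(iterations):
        logger.log(index, 'On iteration {}'.format(i))

logger = SharedLogger()
threads = [threading.Thread(target=run_experiment, args=(i, logger))
           for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(logger.get_logs()))  # -> [0, 1, 2]
```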
128 | ] 129 | }, 130 | { 131 | "cell_type": "code", 132 | "execution_count": null, 133 | "metadata": {}, 134 | "outputs": [], 135 | "source": [ 136 | "@ray.remote\n", 137 | "def run_experiment(experiment_index, logging_actor):\n", 138 | " for i in range(60):\n", 139 | " time.sleep(1)\n", 140 | " # Push a logging message to the actor.\n", 141 | " logging_actor.log(experiment_index, 'On iteration {}'.format(i))" 142 | ] 143 | }, 144 | { 145 | "cell_type": "markdown", 146 | "metadata": {}, 147 | "source": [ 148 | "Now we create several tasks that use the logging actor." 149 | ] 150 | }, 151 | { 152 | "cell_type": "code", 153 | "execution_count": null, 154 | "metadata": {}, 155 | "outputs": [], 156 | "source": [ 157 | "experiment_ids = [run_experiment.remote(i, logging_actor) for i in range(3)]" 158 | ] 159 | }, 160 | { 161 | "cell_type": "markdown", 162 | "metadata": {}, 163 | "source": [ 164 | "While the experiments are running in the background, the driver process (that is, this Jupyter notebook) can query the actor to read the logs.\n", 165 | "\n", 166 | "**EXERCISE:** Modify the code below to dispatch methods to the `LoggingActor`." 167 | ] 168 | }, 169 | { 170 | "cell_type": "code", 171 | "execution_count": null, 172 | "metadata": {}, 173 | "outputs": [], 174 | "source": [ 175 | "logs = logging_actor.get_logs()\n", 176 | "print(logs)\n", 177 | "\n", 178 | "assert isinstance(logs, dict), (\"Make sure that you dispatch tasks to the \"\n", 179 | " \"actor using the .remote keyword and get the results using ray.get.\")" 180 | ] 181 | }, 182 | { 183 | "cell_type": "markdown", 184 | "metadata": {}, 185 | "source": [ 186 | "**EXERCISE:** Try running the above box multiple times and see how the results change (while the experiments are still running in the background). You can also try running more of the experiment tasks and see what happens." 
187 | ] 188 | } 189 | ], 190 | "metadata": { 191 | "kernelspec": { 192 | "display_name": "Python 3", 193 | "language": "python", 194 | "name": "python3" 195 | }, 196 | "language_info": { 197 | "codemirror_mode": { 198 | "name": "ipython", 199 | "version": 3 200 | }, 201 | "file_extension": ".py", 202 | "mimetype": "text/x-python", 203 | "name": "python", 204 | "nbconvert_exporter": "python", 205 | "pygments_lexer": "ipython3", 206 | "version": "3.6.8" 207 | } 208 | }, 209 | "nbformat": 4, 210 | "nbformat_minor": 2 211 | } 212 | -------------------------------------------------------------------------------- /week_3/README.md: -------------------------------------------------------------------------------- 1 | # Week 3 2 | Understand how to optimize and speed up functions. Topics include how to: 3 | 4 | - Avoid waiting for slow tasks using ray.wait() 5 | - Process remote tasks in a specific order 6 | - Speed up serialization by passing objects to ray.put() 7 | -------------------------------------------------------------------------------- /week_3/week_3_exercise_1.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Week 3: Exercise 1 - Handling Slow Tasks\n", 8 | "\n", 9 | "**GOAL:** The goal of this exercise is to show how to use `ray.wait` to avoid waiting for slow tasks.\n", 10 | "\n", 11 | "See the documentation for ray.wait at https://ray.readthedocs.io/en/latest/api.html#ray.wait.\n", 12 | "\n", 13 | "This script starts 6 tasks, each of which takes a random amount of time to complete. We'd like to process the results in two batches (each of size 3). Change the code so that instead of waiting for a fixed set of 3 tasks to finish, we make the first batch consist of the first 3 tasks that complete. The second batch should consist of the 3 remaining tasks. 
Do this exercise by using `ray.wait`.\n", 14 | "\n", 15 | "### Concepts for this Exercise - ray.wait\n", 16 | "\n", 17 | "After launching a number of tasks, you may want to know which ones have finished executing. This can be done with `ray.wait`. The function works as follows.\n", 18 | "\n", 19 | "```python\n", 20 | "ready_ids, remaining_ids = ray.wait(object_ids, num_returns=1, timeout=None)\n", 21 | "```\n", 22 | "\n", 23 | "**Arguments:**\n", 24 | "- `object_ids`: This is a list of object IDs.\n", 25 | "- `num_returns`: This is the maximum number of object IDs to wait for. The default value is `1`.\n", 26 | "- `timeout`: This is the maximum amount of time in milliseconds to wait for. So `ray.wait` will block until either `num_returns` objects are ready or until `timeout` milliseconds have passed.\n", 27 | "\n", 28 | "**Return values:**\n", 29 | "- `ready_ids`: This is a list of object IDs that are available in the object store.\n", 30 | "- `remaining_ids`: This is a list of the IDs that were in `object_ids` but are not in `ready_ids`, so the IDs in `ready_ids` and `remaining_ids` together make up all the IDs in `object_ids`." 31 | ] 32 | }, 33 | { 34 | "cell_type": "code", 35 | "execution_count": null, 36 | "metadata": { 37 | "collapsed": true 38 | }, 39 | "outputs": [], 40 | "source": [ 41 | "from __future__ import absolute_import\n", 42 | "from __future__ import division\n", 43 | "from __future__ import print_function\n", 44 | "\n", 45 | "import numpy as np\n", 46 | "import ray\n", 47 | "import time" 48 | ] 49 | }, 50 | { 51 | "cell_type": "code", 52 | "execution_count": null, 53 | "metadata": { 54 | "collapsed": true 55 | }, 56 | "outputs": [], 57 | "source": [ 58 | "ray.init(num_cpus=6, include_webui=False, ignore_reinit_error=True)" 59 | ] 60 | }, 61 | { 62 | "cell_type": "markdown", 63 | "metadata": {}, 64 | "source": [ 65 | "Define a remote function that takes a variable amount of time to run." 
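The behavior described above, taking whichever results are ready first rather than a fixed subset, can be mocked up with the standard library's `concurrent.futures`. This is an analogue for intuition only (the exercise itself should still use `ray.wait`), and `task` is a made-up stand-in for the remote function.

```python
# Stdlib analogue of the two-batch pattern: the first batch is the first
# three tasks to *complete*, not the first three submitted.
from concurrent.futures import ThreadPoolExecutor, as_completed
import itertools
import random
import time

def task(i):
    time.sleep(random.uniform(0, 0.2))
    return i

with ThreadPoolExecutor(max_workers=6) as pool:
    futures = [pool.submit(task, i) for i in range(6)]
    done_iter = as_completed(futures)  # yields futures as they finish
    first_batch = [f.result() for f in itertools.islice(done_iter, 3)]
    second_batch = [f.result() for f in done_iter]  # the remaining three

print(len(first_batch), len(second_batch))  # -> 3 3
assert sorted(first_batch + second_batch) == list(range(6))
```

With `ray.wait`, the same split is expressed as `ready_ids, remaining_ids = ray.wait(result_ids, num_returns=3)` followed by `ray.get` on each list.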
66 | ] 67 | }, 68 | { 69 | "cell_type": "code", 70 | "execution_count": null, 71 | "metadata": { 72 | "collapsed": true 73 | }, 74 | "outputs": [], 75 | "source": [ 76 | "@ray.remote\n", 77 | "def f(i):\n", 78 | " np.random.seed(5 + i)\n", 79 | " x = np.random.uniform(0, 4)\n", 80 | " time.sleep(x)\n", 81 | " return i, time.time()" 82 | ] 83 | }, 84 | { 85 | "cell_type": "markdown", 86 | "metadata": {}, 87 | "source": [ 88 | "**EXERCISE:** Using `ray.wait`, change the code below so that `initial_results` consists of the outputs of the first three tasks to complete instead of the first three tasks that were submitted." 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": null, 94 | "metadata": { 95 | "collapsed": true 96 | }, 97 | "outputs": [], 98 | "source": [ 99 | "# Sleep a little to improve the accuracy of the timing measurements below.\n", 100 | "time.sleep(2.0)\n", 101 | "start_time = time.time()\n", 102 | "\n", 103 | "# This launches 6 tasks, each of which takes a random amount of time to\n", 104 | "# complete.\n", 105 | "result_ids = [f.remote(i) for i in range(6)]\n", 106 | "# Get one batch of tasks. Instead of waiting for a fixed subset of tasks, we\n", 107 | "# should instead use the first 3 tasks that finish.\n", 108 | "initial_results = ray.get(result_ids[:3])\n", 109 | "\n", 110 | "end_time = time.time()\n", 111 | "duration = end_time - start_time" 112 | ] 113 | }, 114 | { 115 | "cell_type": "markdown", 116 | "metadata": {}, 117 | "source": [ 118 | "**EXERCISE:** Change the code below so that `remaining_results` consists of the outputs of the last three tasks to complete." 
119 | ] 120 | }, 121 | { 122 | "cell_type": "code", 123 | "execution_count": null, 124 | "metadata": { 125 | "collapsed": true 126 | }, 127 | "outputs": [], 128 | "source": [ 129 | "# Wait for the remaining tasks to complete.\n", 130 | "remaining_results = ray.get(result_ids[3:])" 131 | ] 132 | }, 133 | { 134 | "cell_type": "markdown", 135 | "metadata": {}, 136 | "source": [ 137 | "**VERIFY:** Run some checks to verify that the changes you made to the code were correct. Some of the checks should fail when you initially run the cells. After completing the exercises, the checks should pass." 138 | ] 139 | }, 140 | { 141 | "cell_type": "code", 142 | "execution_count": null, 143 | "metadata": { 144 | "collapsed": true 145 | }, 146 | "outputs": [], 147 | "source": [ 148 | "assert len(initial_results) == 3\n", 149 | "assert len(remaining_results) == 3\n", 150 | "\n", 151 | "initial_indices = [result[0] for result in initial_results]\n", 152 | "initial_times = [result[1] for result in initial_results]\n", 153 | "remaining_indices = [result[0] for result in remaining_results]\n", 154 | "remaining_times = [result[1] for result in remaining_results]\n", 155 | "\n", 156 | "assert set(initial_indices + remaining_indices) == set(range(6))\n", 157 | "\n", 158 | "assert duration < 1.5, ('The initial batch of three tasks was retrieved in '\n", 159 | " '{} seconds. This is too slow.'.format(duration))\n", 160 | "\n", 161 | "assert duration > 0.8, ('The initial batch of three tasks was retrieved in '\n", 162 | " '{} seconds. This is too fast.'.format(duration))\n", 163 | "\n", 164 | "# Make sure the initial results actually completed first.\n", 165 | "assert max(initial_times) < min(remaining_times)\n", 166 | "\n", 167 | "print('Success! 
The example took {} seconds.'.format(duration))" 168 | ] 169 | } 170 | ], 171 | "metadata": { 172 | "kernelspec": { 173 | "display_name": "Python 3", 174 | "language": "python", 175 | "name": "python3" 176 | }, 177 | "language_info": { 178 | "codemirror_mode": { 179 | "name": "ipython", 180 | "version": 3 181 | }, 182 | "file_extension": ".py", 183 | "mimetype": "text/x-python", 184 | "name": "python", 185 | "nbconvert_exporter": "python", 186 | "pygments_lexer": "ipython3", 187 | "version": "3.6.8" 188 | } 189 | }, 190 | "nbformat": 4, 191 | "nbformat_minor": 2 192 | } 193 | -------------------------------------------------------------------------------- /week_3/week_3_exercise_2.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Week 3: Exercise 2 - Process Tasks in Order of Completion\n", 8 | "\n", 9 | "**GOAL:** The goal of this exercise is to show how to use `ray.wait` to process tasks in the order that they finish.\n", 10 | "\n", 11 | "See the documentation for ray.wait at https://ray.readthedocs.io/en/latest/api.html#ray.wait.\n", 12 | "\n", 13 | "The code below runs 10 tasks and retrieves the results in the order that the tasks were launched. However, since each task takes a random amount of time to finish, we could instead process the tasks in the order that they finish." 
14 | ] 15 | }, 16 | { 17 | "cell_type": "code", 18 | "execution_count": null, 19 | "metadata": { 20 | "collapsed": true 21 | }, 22 | "outputs": [], 23 | "source": [ 24 | "from __future__ import absolute_import\n", 25 | "from __future__ import division\n", 26 | "from __future__ import print_function\n", 27 | "\n", 28 | "import numpy as np\n", 29 | "import ray\n", 30 | "import time" 31 | ] 32 | }, 33 | { 34 | "cell_type": "code", 35 | "execution_count": null, 36 | "metadata": {}, 37 | "outputs": [], 38 | "source": [ 39 | "ray.init(num_cpus=5, include_webui=False, ignore_reinit_error=True)" 40 | ] 41 | }, 42 | { 43 | "cell_type": "code", 44 | "execution_count": null, 45 | "metadata": { 46 | "collapsed": true 47 | }, 48 | "outputs": [], 49 | "source": [ 50 | "@ray.remote\n", 51 | "def f():\n", 52 | " time.sleep(np.random.uniform(0, 5))\n", 53 | " return time.time()" 54 | ] 55 | }, 56 | { 57 | "cell_type": "markdown", 58 | "metadata": {}, 59 | "source": [ 60 | "**EXERCISE:** Change the code below to use `ray.wait` to get the results of the tasks in the order that they complete.\n", 61 | "\n", 62 | "**NOTE:** It would be a simple modification to maintain a pool of 10 experiments and to start a new experiment whenever one finishes." 
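For intuition, the process-in-completion-order loop this exercise asks for can be written with the standard library, where `concurrent.futures.wait(..., return_when=FIRST_COMPLETED)` plays the role of `ray.wait`. This is an analogue only, not the exercise solution, and `experiment` is a made-up stand-in for the remote function.

```python
# Stdlib mock-up of the completion-order loop: repeatedly wait for at
# least one pending future to finish, collect it, and continue until none
# remain. With ray.wait you would pop ready IDs from result_ids likewise.
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED
import random
import time

def experiment():
    time.sleep(random.uniform(0, 0.2))
    return time.time()

with ThreadPoolExecutor(max_workers=5) as pool:
    pending = {pool.submit(experiment) for _ in range(10)}
    results = []
    while pending:
        done, pending = wait(pending, return_when=FIRST_COMPLETED)
        # Each batch contains whichever tasks finished first.
        results.extend(f.result() for f in done)

print(len(results))  # -> 10
```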
63 | ] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "execution_count": null, 68 | "metadata": {}, 69 | "outputs": [], 70 | "source": [ 71 | "# Sleep a little to improve the accuracy of the timing measurements below.\n", 72 | "time.sleep(2.0)\n", 73 | "start_time = time.time()\n", 74 | "\n", 75 | "result_ids = [f.remote() for _ in range(10)]\n", 76 | "\n", 77 | "# Get the results.\n", 78 | "results = []\n", 79 | "for result_id in result_ids:\n", 80 | " result = ray.get(result_id)\n", 81 | " results.append(result)\n", 82 | " print('Processing result which finished after {} seconds.'\n", 83 | " .format(result - start_time))\n", 84 | "\n", 85 | "end_time = time.time()\n", 86 | "duration = end_time - start_time" 87 | ] 88 | }, 89 | { 90 | "cell_type": "markdown", 91 | "metadata": {}, 92 | "source": [ 93 | "**VERIFY:** Run some checks to verify that the changes you made to the code were correct. Some of the checks should fail when you initially run the cells. After completing the exercises, the checks should pass." 94 | ] 95 | }, 96 | { 97 | "cell_type": "code", 98 | "execution_count": null, 99 | "metadata": {}, 100 | "outputs": [], 101 | "source": [ 102 | "assert results == sorted(results), ('The results were not processed in the '\n", 103 | " 'order that they finished.')\n", 104 | "\n", 105 | "print('Success! 
The example took {} seconds.'.format(duration))" 106 | ] 107 | } 108 | ], 109 | "metadata": { 110 | "kernelspec": { 111 | "display_name": "Python 3", 112 | "language": "python", 113 | "name": "python3" 114 | }, 115 | "language_info": { 116 | "codemirror_mode": { 117 | "name": "ipython", 118 | "version": 3 119 | }, 120 | "file_extension": ".py", 121 | "mimetype": "text/x-python", 122 | "name": "python", 123 | "nbconvert_exporter": "python", 124 | "pygments_lexer": "ipython3", 125 | "version": "3.6.8" 126 | } 127 | }, 128 | "nbformat": 4, 129 | "nbformat_minor": 2 130 | } 131 | -------------------------------------------------------------------------------- /week_3/week_3_exercise_3.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Week 3: Exercise 3 - Speed up Serialization\n", 8 | "\n", 9 | "**GOAL:** The goal of this exercise is to illustrate how to speed up serialization by using `ray.put`.\n", 10 | "\n", 11 | "### Concepts for this Exercise - ray.put\n", 12 | "\n", 13 | "Object IDs can be created in multiple ways.\n", 14 | "- They are returned by remote function calls.\n", 15 | "- They are returned by actor method calls.\n", 16 | "- They are returned by `ray.put`.\n", 17 | "\n", 18 | "When an object is passed to `ray.put`, the object is serialized using the Apache Arrow format (see https://arrow.apache.org/ for more information about Arrow) and copied into a shared memory object store. This object will then be available to other workers on the same machine via shared memory. 
If it is needed by workers on another machine, it will be shipped under the hood.\n", 19 | "\n", 20 | "**When objects are passed into a remote function, Ray puts them in the object store under the hood.** That is, if `f` is a remote function, the code\n", 21 | "\n", 22 | "```python\n", 23 | "x = np.zeros(1000)\n", 24 | "f.remote(x)\n", 25 | "```\n", 26 | "\n", 27 | "is essentially transformed under the hood to\n", 28 | "\n", 29 | "```python\n", 30 | "x = np.zeros(1000)\n", 31 | "x_id = ray.put(x)\n", 32 | "f.remote(x_id)\n", 33 | "```\n", 34 | "\n", 35 | "The call to `ray.put` copies the numpy array into the shared-memory object store, from where it can be read by all of the worker processes (without additional copying). However, if you do something like\n", 36 | "\n", 37 | "```python\n", 38 | "for i in range(10):\n", 39 | " f.remote(x)\n", 40 | "```\n", 41 | "\n", 42 | "then 10 copies of the array will be placed into the object store. This takes up more memory in the object store than is necessary, and it also takes time to copy the array into the object store over and over. This can be made more efficient by placing the array in the object store only once as follows.\n", 43 | "\n", 44 | "```python\n", 45 | "x_id = ray.put(x)\n", 46 | "for i in range(10):\n", 47 | " f.remote(x_id)\n", 48 | "```\n", 49 | "\n", 50 | "In this exercise, you will speed up the code below and reduce the memory footprint by calling `ray.put` on the neural net weights before passing them into the remote functions.\n", 51 | "\n", 52 | "**WARNING:** This exercise requires a lot of memory to run. If this notebook is running within a Docker container, then the docker container must be started with a large shared-memory file system. This can be done by starting the docker container with the `--shm-size` flag." 
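The saving described above can be seen without Ray at all: serializing a large object once and reusing the serialized form is much cheaper than re-serializing it on every call. The sketch below illustrates only the cost model with stdlib `pickle`, not Ray's actual shared-memory mechanism, and the list stands in for the neural net weights.

```python
# Compare serializing a large object once per "task" versus once total.
import pickle

big = list(range(1_000_000))  # stand-in for large neural net weights

# Naive pattern: the object is serialized again for every one of 10 calls,
# analogous to calling f.remote(x) in a loop.
naive_bytes = sum(len(pickle.dumps(big)) for _ in range(10))

# ray.put-style pattern: serialize once, then hand each call the same
# reference, analogous to x_id = ray.put(x); f.remote(x_id).
blob = pickle.dumps(big)
shared_bytes = len(blob)

print(naive_bytes // shared_bytes)  # -> 10
```

The ratio shows the naive pattern paying the full serialization cost ten times over; in Ray the repeated cost also includes copying each duplicate into the object store.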
53 | ] 54 | }, 55 | { 56 | "cell_type": "code", 57 | "execution_count": null, 58 | "metadata": { 59 | "collapsed": true 60 | }, 61 | "outputs": [], 62 | "source": [ 63 | "from __future__ import absolute_import\n", 64 | "from __future__ import division\n", 65 | "from __future__ import print_function\n", 66 | "\n", 67 | "import pickle\n", 68 | "import numpy as np\n", 69 | "import ray\n", 70 | "import time" 71 | ] 72 | }, 73 | { 74 | "cell_type": "code", 75 | "execution_count": null, 76 | "metadata": { 77 | "collapsed": true 78 | }, 79 | "outputs": [], 80 | "source": [ 81 | "ray.init(num_cpus=4, include_webui=False, ignore_reinit_error=True)" 82 | ] 83 | }, 84 | { 85 | "cell_type": "markdown", 86 | "metadata": {}, 87 | "source": [ 88 | "Define some neural net weights which will be passed into a number of tasks." 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": null, 94 | "metadata": { 95 | "collapsed": true 96 | }, 97 | "outputs": [], 98 | "source": [ 99 | "neural_net_weights = {'variable{}'.format(i): np.random.normal(size=1000000)\n", 100 | " for i in range(50)}" 101 | ] 102 | }, 103 | { 104 | "cell_type": "markdown", 105 | "metadata": {}, 106 | "source": [ 107 | "**EXERCISE:** Compare the time required to serialize the neural net weights and copy them into the object store using Ray versus the time required to pickle and unpickle the weights. The big win should be with the time required for *deserialization*.\n", 108 | "\n", 109 | "Note that when you call `ray.put`, in addition to serializing the object, we are copying it into shared memory where it can be efficiently accessed by other workers on the same machine.\n", 110 | "\n", 111 | "**NOTE:** You don't actually have to do anything here other than run the cell below and read the output.\n", 112 | "\n", 113 | "**NOTE:** Sometimes `ray.put` can be faster than `pickle.dumps`. This is because `ray.put` leverages multiple threads when serializing large objects. 
Note that this is not possible with `pickle`." 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": null, 119 | "metadata": { 120 | "collapsed": true 121 | }, 122 | "outputs": [], 123 | "source": [ 124 | "print('Ray - serializing')\n", 125 | "%time x_id = ray.put(neural_net_weights)\n", 126 | "print('\\nRay - deserializing')\n", 127 | "%time x_val = ray.get(x_id)\n", 128 | "\n", 129 | "print('\\npickle - serializing')\n", 130 | "%time serialized = pickle.dumps(neural_net_weights)\n", 131 | "print('\\npickle - deserializing')\n", 132 | "%time deserialized = pickle.loads(serialized)" 133 | ] 134 | }, 135 | { 136 | "cell_type": "markdown", 137 | "metadata": {}, 138 | "source": [ 139 | "Define a remote function which uses the neural net weights." 140 | ] 141 | }, 142 | { 143 | "cell_type": "code", 144 | "execution_count": null, 145 | "metadata": { 146 | "collapsed": true 147 | }, 148 | "outputs": [], 149 | "source": [ 150 | "@ray.remote\n", 151 | "def use_weights(weights, i):\n", 152 | " return i" 153 | ] 154 | }, 155 | { 156 | "cell_type": "markdown", 157 | "metadata": {}, 158 | "source": [ 159 | "**EXERCISE:** In the code below, use `ray.put` to avoid copying the neural net weights to the object store multiple times." 160 | ] 161 | }, 162 | { 163 | "cell_type": "code", 164 | "execution_count": null, 165 | "metadata": { 166 | "collapsed": true 167 | }, 168 | "outputs": [], 169 | "source": [ 170 | "# Sleep a little to improve the accuracy of the timing measurements below.\n", 171 | "time.sleep(2.0)\n", 172 | "start_time = time.time()\n", 173 | "\n", 174 | "results = ray.get([use_weights.remote(neural_net_weights, i)\n", 175 | " for i in range(20)])\n", 176 | "\n", 177 | "end_time = time.time()\n", 178 | "duration = end_time - start_time" 179 | ] 180 | }, 181 | { 182 | "cell_type": "markdown", 183 | "metadata": {}, 184 | "source": [ 185 | "**VERIFY:** Run some checks to verify that the changes you made to the code were correct. 
Some of the checks should fail when you initially run the cells. After completing the exercises, the checks should pass." 186 | ] 187 | }, 188 | { 189 | "cell_type": "code", 190 | "execution_count": null, 191 | "metadata": { 192 | "collapsed": true 193 | }, 194 | "outputs": [], 195 | "source": [ 196 | "assert results == list(range(20))\n", 197 | "assert duration < 1, ('The experiments ran in {} seconds. This is too '\n", 198 | " 'slow.'.format(duration))\n", 199 | "\n", 200 | "print('Success! The example took {} seconds.'.format(duration))" 201 | ] 202 | } 203 | ], 204 | "metadata": { 205 | "kernelspec": { 206 | "display_name": "Python 3", 207 | "language": "python", 208 | "name": "python3" 209 | }, 210 | "language_info": { 211 | "codemirror_mode": { 212 | "name": "ipython", 213 | "version": 3 214 | }, 215 | "file_extension": ".py", 216 | "mimetype": "text/x-python", 217 | "name": "python", 218 | "nbconvert_exporter": "python", 219 | "pygments_lexer": "ipython3", 220 | "version": "3.6.8" 221 | } 222 | }, 223 | "nbformat": 4, 224 | "nbformat_minor": 2 225 | } 226 | -------------------------------------------------------------------------------- /week_4/README.md: -------------------------------------------------------------------------------- 1 | # Week 4 2 | Explore how to optimize functions, including: 3 | 4 | - Accelerate Pandas* workflows by changing one line of code 5 | - Implement a MapReduce system with Ray 6 | - Use Tree Reduce to execute a tree of dependent function in parallel 7 | -------------------------------------------------------------------------------- /week_4/week_4_exercise_1.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Week 4: Exercise 1 - Modin (Pandas on Ray)\n", 8 | "\n", 9 | "**GOAL:** Learn to increase the speed of Pandas workflows by changing a single line of code.\n", 10 | "\n", 11 | 
"[Modin](https://modin.readthedocs.io/en/latest/?badge=latest) (Pandas on Ray) is a project aimed at speeding up Pandas using Ray.\n", 12 | "\n", 13 | "### Using Modin\n", 14 | "\n", 15 | "To use Modin, only a single line of code must be changed.\n", 16 | "\n", 17 | "Simply change:\n", 18 | "```python\n", 19 | "import pandas as pd\n", 20 | "```\n", 21 | "to\n", 22 | "```python\n", 23 | "import modin.pandas as pd\n", 24 | "```\n", 25 | "\n", 26 | "Changing this line of code will allow you to use all of the cores in your machine to do computation on your data. One of the major performance bottlenecks of Pandas is that it only uses a single core for any given computation. **Modin** exposes an API that is identical to Pandas, allowing you to continue interacting with your data as you would with Pandas. **There are no additional commands required to use Modin locally.** Partitioning, scheduling, data transfer, and other related concerns are all handled by **Modin** and **Ray** under the hood.\n", 27 | "\n", 28 | "### Concept for Exercise: DataFrame Constructor\n", 29 | "\n", 30 | "Often when playing around in Pandas, it is useful to create a DataFrame with the constructor. That is where we will start.\n", 31 | "\n", 32 | "```python\n", 33 | "import numpy as np\n", 34 | "import pandas as pd\n", 35 | "\n", 36 | "frame_data = np.random.randint(0, 100, size=(2**10, 2**5))\n", 37 | "df = pd.DataFrame(frame_data)\n", 38 | "```\n", 39 | "\n", 40 | "The above code creates a Pandas DataFrame full of random integers with 1024 rows and 128 columns." 
41 | ] 42 | }, 43 | { 44 | "cell_type": "code", 45 | "execution_count": null, 46 | "metadata": {}, 47 | "outputs": [], 48 | "source": [ 49 | "from __future__ import absolute_import\n", 50 | "from __future__ import division\n", 51 | "from __future__ import print_function\n", 52 | "\n", 53 | "import numpy as np\n", 54 | "import pandas\n", 55 | "import subprocess\n", 56 | "import sys" 57 | ] 58 | }, 59 | { 60 | "cell_type": "markdown", 61 | "metadata": {}, 62 | "source": [ 63 | "**EXERCISE:** Modify the code below to make the dataframe a `modin.pandas` DataFrame (remember the line of code to change)." 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": null, 69 | "metadata": {}, 70 | "outputs": [], 71 | "source": [ 72 | "# Implement your answer here. You are also free to play with the size\n", 73 | "# and shape of the DataFrame, but beware of exceeding your memory!\n", 74 | "\n", 75 | "import pandas as pd\n", 76 | "\n", 77 | "frame_data = np.random.randint(0, 100, size=(2**10, 2**5))\n", 78 | "df = pd.DataFrame(frame_data)\n", 79 | "\n", 80 | "# ***** Do not change the code below! It verifies that \n", 81 | "# ***** the exercise has been done correctly. *****\n", 82 | "\n", 83 | "try:\n", 84 | " assert df is not None\n", 85 | " assert frame_data is not None\n", 86 | " assert isinstance(frame_data, np.ndarray)\n", 87 | "except:\n", 88 | " raise AssertionError('Don\\'t change too much of the original code!')\n", 89 | "assert 'modin.pandas' in sys.modules, 'Not quite correct. Remember the single line of code change (See above)'\n", 90 | "assert hasattr(df, '_query_compiler'), 'Make sure that df is a modin.pandas DataFrame.'\n", 91 | "\n", 92 | "print(\"Success! 
You only need to change one line of code!\")" 93 | ] 94 | }, 95 | { 96 | "cell_type": "markdown", 97 | "metadata": {}, 98 | "source": [ 99 | "Now that we have created a toy example for playing around with the DataFrame, let's print it out in different ways.\n", 100 | "\n", 101 | "### Concept for Exercise: Data Interaction and Printing\n", 102 | "\n", 103 | "When interacting with data, it is very important to look at different parts of the data (e.g. `df.head()`). Here we will show that you can print the `modin.pandas` DataFrame in the same ways you would Pandas." 104 | ] 105 | }, 106 | { 107 | "cell_type": "code", 108 | "execution_count": null, 109 | "metadata": {}, 110 | "outputs": [], 111 | "source": [ 112 | "# Print the first 10 lines.\n", 113 | "df.head(10)" 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": null, 119 | "metadata": {}, 120 | "outputs": [], 121 | "source": [ 122 | "# Print the DataFrame.\n", 123 | "df" 124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": null, 129 | "metadata": {}, 130 | "outputs": [], 131 | "source": [ 132 | "# Free cell for custom interaction (Play around here!)\n" 133 | ] 134 | }, 135 | { 136 | "cell_type": "markdown", 137 | "metadata": {}, 138 | "source": [ 139 | "`modin.pandas` is using all of the cores in your machine to operate on the DataFrame much faster than Pandas!" 140 | ] 141 | }, 142 | { 143 | "cell_type": "markdown", 144 | "metadata": {}, 145 | "source": [ 146 | "### Concept for Exercise: Identical API\n", 147 | "\n", 148 | "As previously mentioned, `modin.pandas` has an identical API to Pandas. In this section, we will go over some examples of how you can use `modin.pandas` to interact with your data in the same way you would with Pandas.\n", 149 | "\n", 150 | "**Note: `modin.pandas` does not yet have 100% of the Pandas API fully implemented or optimized. Some parameters are not implemented for some methods and some of the more obscure methods are not yet implemented. 
We are continuing to work toward 100% API coverage.**\n", 151 | "\n", 152 | "For a full list of implemented methods, visit the [Modin documentation](https://modin.readthedocs.io/en/latest/pandas_supported.html)." 153 | ] 154 | }, 155 | { 156 | "cell_type": "code", 157 | "execution_count": null, 158 | "metadata": {}, 159 | "outputs": [], 160 | "source": [ 161 | "df.describe()" 162 | ] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": null, 167 | "metadata": {}, 168 | "outputs": [], 169 | "source": [ 170 | "# Transpose the data\n", 171 | "df.T" 172 | ] 173 | }, 174 | { 175 | "cell_type": "code", 176 | "execution_count": null, 177 | "metadata": {}, 178 | "outputs": [], 179 | "source": [ 180 | "# Create a new column, the same as in Pandas\n", 181 | "df['New Column'] = np.nan" 182 | ] 183 | }, 184 | { 185 | "cell_type": "code", 186 | "execution_count": null, 187 | "metadata": {}, 188 | "outputs": [], 189 | "source": [ 190 | "df" 191 | ] 192 | }, 193 | { 194 | "cell_type": "code", 195 | "execution_count": null, 196 | "metadata": {}, 197 | "outputs": [], 198 | "source": [ 199 | "df.columns" 200 | ] 201 | }, 202 | { 203 | "cell_type": "code", 204 | "execution_count": null, 205 | "metadata": {}, 206 | "outputs": [], 207 | "source": [ 208 | "# Delete the first column\n", 209 | "del df[df.columns[0]]" 210 | ] 211 | }, 212 | { 213 | "cell_type": "code", 214 | "execution_count": null, 215 | "metadata": {}, 216 | "outputs": [], 217 | "source": [ 218 | "df.columns" 219 | ] 220 | }, 221 | { 222 | "cell_type": "code", 223 | "execution_count": null, 224 | "metadata": {}, 225 | "outputs": [], 226 | "source": [ 227 | "# Some operations are not yet optimized, but they are implemented.\n", 228 | "df.fillna(value=0, axis=0, limit=100)" 229 | ] 230 | }, 231 | { 232 | "cell_type": "code", 233 | "execution_count": null, 234 | "metadata": {}, 235 | "outputs": [], 236 | "source": [ 237 | "# Some operations are not yet implemented; calling them will raise an error.\n", 238 | 
"df.kurtosis()" 239 | ] 240 | }, 241 | { 242 | "cell_type": "code", 243 | "execution_count": null, 244 | "metadata": {}, 245 | "outputs": [], 246 | "source": [ 247 | "# Free cell for custom interaction (Play around here!).\n" 248 | ] 249 | } 250 | ], 251 | "metadata": { 252 | "kernelspec": { 253 | "display_name": "Python 3", 254 | "language": "python", 255 | "name": "python3" 256 | }, 257 | "language_info": { 258 | "codemirror_mode": { 259 | "name": "ipython", 260 | "version": 3 261 | }, 262 | "file_extension": ".py", 263 | "mimetype": "text/x-python", 264 | "name": "python", 265 | "nbconvert_exporter": "python", 266 | "pygments_lexer": "ipython3", 267 | "version": "3.6.8" 268 | } 269 | }, 270 | "nbformat": 4, 271 | "nbformat_minor": 2 272 | } 273 | -------------------------------------------------------------------------------- /week_4/week_4_exercise_3.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Week 4: Exercise 3 - Tree Reduce\n", 8 | "\n", 9 | "**GOAL:** The goal of this exercise is to show how to implement a tree reduce in Ray by passing object IDs into remote functions to encode dependencies between tasks.\n", 10 | "\n", 11 | "In this exercise, you will use Ray to implement parallel data generation and a parallel tree reduction." 
12 | ] 13 | }, 14 | { 15 | "cell_type": "code", 16 | "execution_count": null, 17 | "metadata": {}, 18 | "outputs": [], 19 | "source": [ 20 | "from __future__ import absolute_import\n", 21 | "from __future__ import division\n", 22 | "from __future__ import print_function\n", 23 | "\n", 24 | "import numpy as np\n", 25 | "import ray\n", 26 | "import time" 27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": null, 32 | "metadata": {}, 33 | "outputs": [], 34 | "source": [ 35 | "ray.init(num_cpus=8, include_webui=False, ignore_reinit_error=True)" 36 | ] 37 | }, 38 | { 39 | "cell_type": "markdown", 40 | "metadata": {}, 41 | "source": [ 42 | "**EXERCISE:** These functions will need to be turned into remote functions so that the tree of tasks can be executed in parallel." 43 | ] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": null, 48 | "metadata": {}, 49 | "outputs": [], 50 | "source": [ 51 | "# This is a proxy for a function which generates some data.\n", 52 | "def create_data(i):\n", 53 | " time.sleep(0.3)\n", 54 | " return i * np.ones(10000)\n", 55 | "\n", 56 | "# This is a proxy for an expensive aggregation step (which is also\n", 57 | "# commutative and associative so it can be used in a tree-reduce).\n", 58 | "def aggregate_data(x, y):\n", 59 | " time.sleep(0.3)\n", 60 | " return x * y" 61 | ] 62 | }, 63 | { 64 | "cell_type": "markdown", 65 | "metadata": {}, 66 | "source": [ 67 | "**EXERCISE:** Make the data creation tasks run in parallel. Also aggregate the vectors in parallel. Note that the `aggregate_data` function must be called 7 times. They cannot all run in parallel because some depend on the outputs of others. However, it is possible to first run 4 in parallel, then 2 in parallel, and then 1." 
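,
 "\n",
 "\n",
 "To see the reduction order in isolation, here is a serial, pure-Python sketch of a tree reduce (no Ray involved; `tree_reduce` is an illustrative helper name, not part of the exercise API):\n",
 "\n",
 "```python\n",
 "def tree_reduce(values, combine):\n",
 "    # Combine the two oldest values and append the partial result to\n",
 "    # the END of the list, so independent pairs are combined before\n",
 "    # partial results are combined with each other.\n",
 "    values = list(values)\n",
 "    while len(values) > 1:\n",
 "        values.append(combine(values.pop(0), values.pop(0)))\n",
 "    return values[0]\n",
 "\n",
 "# Multiplying 1 through 8 in this order still gives 8! = 40320.\n",
 "assert tree_reduce([1, 2, 3, 4, 5, 6, 7, 8], lambda x, y: x * y) == 40320\n",
 "```"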
68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": null, 73 | "metadata": {}, 74 | "outputs": [], 75 | "source": [ 76 | "# Sleep a little to improve the accuracy of the timing measurements below.\n", 77 | "time.sleep(1.0)\n", 78 | "start_time = time.time()\n", 79 | "\n", 80 | "# EXERCISE: Here we generate some data. Do this part in parallel.\n", 81 | "vectors = [create_data(i + 1) for i in range(8)]\n", 82 | "\n", 83 | "# Here we aggregate all of the data by repeatedly calling aggregate_data. This\n", 84 | "# can be sped up using Ray.\n", 85 | "#\n", 86 | "# NOTE: A direct translation of the code below to use Ray will not result in\n", 87 | "# a speedup because each function call uses the output of the previous function\n", 88 | "# call, so the function calls must be executed serially.\n", 89 | "#\n", 90 | "# EXERCISE: Speed up the aggregation below by using Ray. Note that this will\n", 91 | "# require restructuring the code to expose more parallelism. First run 4 tasks\n", 92 | "# aggregating the 8 values in pairs. Then run 2 tasks aggregating the resulting\n", 93 | "# 4 intermediate values in pairs. Then run 1 task aggregating the two resulting\n", 94 | "# values. 
Lastly, you will need to call ray.get to retrieve the final result.\n", 95 | "#\n", 96 | "# Exposing more parallelism means aggregating the vectors in a DIFFERENT ORDER.\n", 97 | "# This can be done because we are simply combining the data and the order in\n", 98 | "# which the values are combined doesn't matter (it's commutative and associative).\n", 99 | "result = aggregate_data(vectors[0], vectors[1])\n", 100 | "result = aggregate_data(result, vectors[2])\n", 101 | "result = aggregate_data(result, vectors[3])\n", 102 | "result = aggregate_data(result, vectors[4])\n", 103 | "result = aggregate_data(result, vectors[5])\n", 104 | "result = aggregate_data(result, vectors[6])\n", 105 | "result = aggregate_data(result, vectors[7])\n", 106 | "\n", 107 | "# NOTE: For clarity, the aggregation above is written out as 7 separate function\n", 108 | "# calls, but this can be done more easily in a while loop via\n", 109 | "#\n", 110 | "# while len(vectors) > 1:\n", 111 | "# vectors = [aggregate_data(vectors[0], vectors[1])] + vectors[2:]\n", 112 | "# result = vectors[0]\n", 113 | "#\n", 114 | "# When expressed this way, the change from serial aggregation to tree-structured\n", 115 | "# aggregation can be made simply by appending the result of aggregate_data to the\n", 116 | "# end of the vectors list as opposed to the beginning.\n", 117 | "#\n", 118 | "# EXERCISE: Think about why this is true.\n", 119 | "\n", 120 | "end_time = time.time()\n", 121 | "duration = end_time - start_time" 122 | ] 123 | }, 124 | { 125 | "cell_type": "markdown", 126 | "metadata": {}, 127 | "source": [ 128 | "**EXERCISE:** Use the UI to view the task timeline and to verify that the vectors were aggregated with a tree of tasks.\n", 129 | "\n", 130 | "You should be able to see the 8 `create_data` tasks running in parallel followed by 4 `aggregate_data` tasks running in parallel followed by 2 more `aggregate_data` tasks followed by 1 more `aggregate_data` task.\n", 131 | "\n", 132 | "In the timeline, click on 
**View Options** and select **Flow Events** to visualize task dependencies." 133 | ] 134 | }, 135 | { 136 | "cell_type": "code", 137 | "execution_count": null, 138 | "metadata": {}, 139 | "outputs": [], 140 | "source": [ 141 | "import ray.experimental.ui as ui\n", 142 | "ui.task_timeline()" 143 | ] 144 | }, 145 | { 146 | "cell_type": "markdown", 147 | "metadata": {}, 148 | "source": [ 149 | "**VERIFY:** Run some checks to verify that the changes you made to the code were correct. Some of the checks should fail when you initially run the cells. After completing the exercises, the checks should pass." 150 | ] 151 | }, 152 | { 153 | "cell_type": "code", 154 | "execution_count": null, 155 | "metadata": {}, 156 | "outputs": [], 157 | "source": [ 158 | "assert np.all(result == 40320 * np.ones(10000)), ('Did you remember to '\n", 159 | " 'call ray.get?')\n", 160 | "assert duration < 0.3 + 0.9 + 0.3, ('FAILURE: The data generation and '\n", 161 | " 'aggregation took {} seconds. This is '\n", 162 | " 'too slow'.format(duration))\n", 163 | "assert duration > 0.3 + 0.9, ('FAILURE: The data generation and '\n", 164 | " 'aggregation took {} seconds. This is '\n", 165 | " 'too fast'.format(duration))\n", 166 | "\n", 167 | "print('Success! 
The example took {} seconds.'.format(duration))" 168 | ] 169 | } 170 | ], 171 | "metadata": { 172 | "kernelspec": { 173 | "display_name": "Python 3", 174 | "language": "python", 175 | "name": "python3" 176 | }, 177 | "language_info": { 178 | "codemirror_mode": { 179 | "name": "ipython", 180 | "version": 3 181 | }, 182 | "file_extension": ".py", 183 | "mimetype": "text/x-python", 184 | "name": "python", 185 | "nbconvert_exporter": "python", 186 | "pygments_lexer": "ipython3", 187 | "version": "3.6.8" 188 | } 189 | }, 190 | "nbformat": 4, 191 | "nbformat_minor": 2 192 | } 193 | -------------------------------------------------------------------------------- /week_5/README.md: -------------------------------------------------------------------------------- 1 | # Week 5 2 | Learn to access different hardware resources, including how to: 3 | 4 | - Send remote tasks to different accelerators and processors 5 | - Use custom resources for tasks that require complex hardware combinations 6 | -------------------------------------------------------------------------------- /week_5/week_5_exercise_1.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Week 5: Exercise 1 - Using the GPU API\n", 8 | "\n", 9 | "**GOAL:** The goal of this exercise is to show how to use GPUs with remote functions and actors.\n", 10 | "\n", 11 | "**NOTE:** These exercises are designed to run on a machine without GPUs.\n", 12 | "\n", 13 | "See the documentation on using Ray with GPUs http://ray.readthedocs.io/en/latest/using-ray-with-gpus.html.\n", 14 | "\n", 15 | "### Concepts for this Exercise - Using Ray with GPUs\n", 16 | "\n", 17 | "We can indicate that a remote function or an actor requires some GPUs using the `num_gpus` keyword.\n", 18 | "\n", 19 | "```python\n", 20 | "@ray.remote(num_gpus=1)\n", 21 | "def f():\n", 22 | " # The command ray.get_gpu_ids() returns 
a list of the indices\n", 23 | " # of the GPUs that this task can use (e.g., [0] or [1]).\n", 24 | " ray.get_gpu_ids()\n", 25 | "\n", 26 | "@ray.remote(num_gpus=2)\n", 27 | "class Foo(object):\n", 28 | " def __init__(self):\n", 29 | " # The command ray.get_gpu_ids() returns a list of the\n", 30 | " # indices of the GPUs that this actor can use\n", 31 | " # (e.g., [0, 1] or [3, 5]).\n", 32 | " ray.get_gpu_ids()\n", 33 | "```\n", 34 | "\n", 35 | "Then inside of the actor constructor and methods, we can get the IDs of the GPUs allocated for that actor with `ray.get_gpu_ids()`." 36 | ] 37 | }, 38 | { 39 | "cell_type": "code", 40 | "execution_count": null, 41 | "metadata": { 42 | "collapsed": true 43 | }, 44 | "outputs": [], 45 | "source": [ 46 | "from __future__ import absolute_import\n", 47 | "from __future__ import division\n", 48 | "from __future__ import print_function\n", 49 | "\n", 50 | "import numpy as np\n", 51 | "import ray\n", 52 | "import time" 53 | ] 54 | }, 55 | { 56 | "cell_type": "markdown", 57 | "metadata": {}, 58 | "source": [ 59 | "Start Ray, noting that we pass in `num_gpus=2`. Ray will assume this machine has 2 GPUs (even if it does not). When a task or actor requests a GPU, it will be assigned a GPU ID from the set `[0, 1]`. It is then the responsibility of the task or actor to make sure that it only uses that specific GPU (e.g., by setting the `CUDA_VISIBLE_DEVICES` environment variable)." 
60 | ] 61 | }, 62 | { 63 | "cell_type": "code", 64 | "execution_count": null, 65 | "metadata": { 66 | "collapsed": true 67 | }, 68 | "outputs": [], 69 | "source": [ 70 | "ray.init(num_cpus=4, num_gpus=2, include_webui=False, ignore_reinit_error=True)" 71 | ] 72 | }, 73 | { 74 | "cell_type": "markdown", 75 | "metadata": {}, 76 | "source": [ 77 | "**EXERCISE:** Change the remote function below to require one GPU.\n", 78 | "\n", 79 | "**NOTE:** This change does not make the remote function actually **use** the GPU; it simply **reserves** the GPU for use by the remote function. To actually use the GPU, the remote function would use a neural net library like TensorFlow or PyTorch after setting the `CUDA_VISIBLE_DEVICES` environment variable properly. This can be done as follows.\n", 80 | "\n", 81 | "```python\n", 82 | "import os\n", 83 | "os.environ['CUDA_VISIBLE_DEVICES'] = ','.join([str(i) for i in ray.get_gpu_ids()])\n", 84 | "```" 85 | ] 86 | }, 87 | { 88 | "cell_type": "code", 89 | "execution_count": null, 90 | "metadata": { 91 | "collapsed": true 92 | }, 93 | "outputs": [], 94 | "source": [ 95 | "@ray.remote\n", 96 | "def f():\n", 97 | " time.sleep(0.5)\n", 98 | " return ray.get_gpu_ids()" 99 | ] 100 | }, 101 | { 102 | "cell_type": "markdown", 103 | "metadata": {}, 104 | "source": [ 105 | "**VERIFY:** This code checks that each task was assigned one GPU and that not more than two tasks are run at the same time (because we told Ray there are only two GPUs)." 106 | ] 107 | }, 108 | { 109 | "cell_type": "code", 110 | "execution_count": null, 111 | "metadata": { 112 | "collapsed": true 113 | }, 114 | "outputs": [], 115 | "source": [ 116 | "start_time = time.time()\n", 117 | "\n", 118 | "gpu_ids = ray.get([f.remote() for _ in range(3)])\n", 119 | "\n", 120 | "end_time = time.time()\n", 121 | "\n", 122 | "for i in range(len(gpu_ids)):\n", 123 | " assert len(gpu_ids[i]) == 1\n", 124 | "\n", 125 | "assert end_time - start_time > 1\n", 126 | "\n", 127 | "print('Success! 
The test passed.')" 128 | ] 129 | }, 130 | { 131 | "cell_type": "markdown", 132 | "metadata": {}, 133 | "source": [ 134 | "**EXERCISE:** The code below defines an actor. Make it require one GPU." 135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": null, 140 | "metadata": { 141 | "collapsed": true 142 | }, 143 | "outputs": [], 144 | "source": [ 145 | "@ray.remote\n", 146 | "class Actor(object):\n", 147 | " def __init__(self):\n", 148 | " pass\n", 149 | "\n", 150 | " def get_gpu_ids(self):\n", 151 | " return ray.get_gpu_ids()" 152 | ] 153 | }, 154 | { 155 | "cell_type": "markdown", 156 | "metadata": {}, 157 | "source": [ 158 | "**VERIFY:** This code checks that the actor was assigned a GPU." 159 | ] 160 | }, 161 | { 162 | "cell_type": "code", 163 | "execution_count": null, 164 | "metadata": { 165 | "collapsed": true 166 | }, 167 | "outputs": [], 168 | "source": [ 169 | "actor = Actor.remote()\n", 170 | "\n", 171 | "gpu_ids = ray.get(actor.get_gpu_ids.remote())\n", 172 | "\n", 173 | "assert len(gpu_ids) == 1\n", 174 | "\n", 175 | "print('Success! 
The test passed.')" 176 | ] 177 | } 178 | ], 179 | "metadata": { 180 | "kernelspec": { 181 | "display_name": "Python 3", 182 | "language": "python", 183 | "name": "python3" 184 | }, 185 | "language_info": { 186 | "codemirror_mode": { 187 | "name": "ipython", 188 | "version": 3 189 | }, 190 | "file_extension": ".py", 191 | "mimetype": "text/x-python", 192 | "name": "python", 193 | "nbconvert_exporter": "python", 194 | "pygments_lexer": "ipython3", 195 | "version": "3.6.8" 196 | } 197 | }, 198 | "nbformat": 4, 199 | "nbformat_minor": 2 200 | } 201 | -------------------------------------------------------------------------------- /week_5/week_5_exercise_2.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Week 5: Exercise 2 - Custom Resources\n", 8 | "\n", 9 | "**GOAL:** The goal of this exercise is to show how to use custom resources\n", 10 | "\n", 11 | "See the documentation on using Ray with custom resources http://ray.readthedocs.io/en/latest/resources.html#custom-resources.\n", 12 | "\n", 13 | "### Concepts for this Exercise - Using Custom Resources\n", 14 | "\n", 15 | "We've discussed how to specify a task's CPU and GPU requirements, but there are many other kinds of resources. For example, a task may require a dataset, which only lives on a few machines, or it may need to be scheduled on a machine with extra memory. These kinds of requirements can be expressed through the use of custom resources.\n", 16 | "\n", 17 | "Custom resources are most useful in the multi-machine setting. 
However, this exercise illustrates their usage in the single-machine setting.\n", 18 | "\n", 19 | "Ray can be started with a dictionary of custom resources (mapping resource name to resource quantity) as follows.\n", 20 | "\n", 21 | "```python\n", 22 | "ray.init(resources={'CustomResource1': 1, 'CustomResource2': 4})\n", 23 | "```\n", 24 | "\n", 25 | "The resource requirements of a remote function or actor can be specified in a similar way.\n", 26 | "\n", 27 | "```python\n", 28 | "@ray.remote(resources={'CustomResource2': 1})\n", 29 | "def f():\n", 30 | " return 1\n", 31 | "```\n", 32 | "\n", 33 | "Even if there are many CPUs on the machine, only 4 copies of `f` can be executed concurrently.\n", 34 | "\n", 35 | "Custom resources give applications a great deal of flexibility. For example, if you wish to control precisely which machine a task gets scheduled on, you can simply start each machine with a different custom resource (e.g., start machine `n` with resource `Custom_n`, and then tasks that should be scheduled on machine `n` can require resource `Custom_n`). However, this usage has drawbacks because it makes the code less portable and less resilient to machine failures." 36 | ] 37 | }, 38 | { 39 | "cell_type": "code", 40 | "execution_count": null, 41 | "metadata": {}, 42 | "outputs": [], 43 | "source": [ 44 | "from __future__ import absolute_import\n", 45 | "from __future__ import division\n", 46 | "from __future__ import print_function\n", 47 | "\n", 48 | "import ray\n", 49 | "import time" 50 | ] 51 | }, 52 | { 53 | "cell_type": "markdown", 54 | "metadata": {}, 55 | "source": [ 56 | "In this exercise, we will start Ray using custom resources." 
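,
 "\n",
 "\n",
 "The concurrency bounds tested in this exercise follow from simple arithmetic: a task can run only while every resource it requires has spare capacity, so the scarcest required resource bounds concurrency. A pure-Python sketch of that accounting (the dictionary values here are illustrative, not a Ray API):\n",
 "\n",
 "```python\n",
 "capacity = {'CPU': 8, 'Custom1': 4}  # what ray.init advertises\n",
 "demand = {'CPU': 1, 'Custom1': 1}    # hypothetical per-task requirement\n",
 "\n",
 "# Concurrency is bounded by the scarcest required resource.\n",
 "max_concurrent = min(capacity[r] // demand[r] for r in demand)\n",
 "assert max_concurrent == 4\n",
 "```"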
57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": null, 62 | "metadata": {}, 63 | "outputs": [], 64 | "source": [ 65 | "ray.init(num_cpus=8, resources={'Custom1': 4}, include_webui=False, ignore_reinit_error=True)" 66 | ] 67 | }, 68 | { 69 | "cell_type": "markdown", 70 | "metadata": {}, 71 | "source": [ 72 | "**EXERCISE:** Modify the resource requirements of the remote functions below so that the following hold.\n", 73 | "- The number of concurrently executing tasks is at most 8 (note that there are 8 CPUs).\n", 74 | "- No more than 4 copies of `g` can execute concurrently.\n", 75 | "- If 4 `g` tasks are executing, then an additional 4 `f` tasks can execute.\n", 76 | "\n", 77 | "You should only need to use the `Custom1` resource." 78 | ] 79 | }, 80 | { 81 | "cell_type": "code", 82 | "execution_count": null, 83 | "metadata": {}, 84 | "outputs": [], 85 | "source": [ 86 | "@ray.remote\n", 87 | "def f():\n", 88 | " time.sleep(0.1)\n", 89 | "\n", 90 | "@ray.remote\n", 91 | "def g():\n", 92 | " time.sleep(0.1)" 93 | ] 94 | }, 95 | { 96 | "cell_type": "markdown", 97 | "metadata": {}, 98 | "source": [ 99 | "If you did the above exercise correctly, the next cell should execute without raising an exception." 
100 | ] 101 | }, 102 | { 103 | "cell_type": "code", 104 | "execution_count": null, 105 | "metadata": {}, 106 | "outputs": [], 107 | "source": [ 108 | "start = time.time()\n", 109 | "ray.get([f.remote() for _ in range(8)])\n", 110 | "duration = time.time() - start \n", 111 | "assert duration >= 0.1 and duration < 0.19, '8 f tasks should be able to execute concurrently.'\n", 112 | "\n", 113 | "start = time.time()\n", 114 | "ray.get([f.remote() for _ in range(9)])\n", 115 | "duration = time.time() - start \n", 116 | "assert duration >= 0.2 and duration < 0.29, 'f tasks should not be able to execute concurrently.'\n", 117 | "\n", 118 | "start = time.time()\n", 119 | "ray.get([g.remote() for _ in range(4)])\n", 120 | "duration = time.time() - start \n", 121 | "assert duration >= 0.1 and duration < 0.19, '4 g tasks should be able to execute concurrently.'\n", 122 | "\n", 123 | "start = time.time()\n", 124 | "ray.get([g.remote() for _ in range(5)])\n", 125 | "duration = time.time() - start \n", 126 | "assert duration >= 0.2 and duration < 0.29, '5 g tasks should not be able to execute concurrently.'\n", 127 | "\n", 128 | "start = time.time()\n", 129 | "ray.get([f.remote() for _ in range(4)] + [g.remote() for _ in range(4)])\n", 130 | "duration = time.time() - start \n", 131 | "assert duration >= 0.1 and duration < 0.19, '4 f and 4 g tasks should be able to execute concurrently.'\n", 132 | "\n", 133 | "start = time.time()\n", 134 | "ray.get([f.remote() for _ in range(5)] + [g.remote() for _ in range(4)])\n", 135 | "duration = time.time() - start \n", 136 | "assert duration >= 0.2 and duration < 0.29, '5 f and 4 g tasks should not be able to execute concurrently.'\n", 137 | "\n", 138 | "print('Success!')" 139 | ] 140 | } 141 | ], 142 | "metadata": { 143 | "kernelspec": { 144 | "display_name": "Python 3", 145 | "language": "python", 146 | "name": "python3" 147 | }, 148 | "language_info": { 149 | "codemirror_mode": { 150 | "name": "ipython", 151 | "version": 3 152 | }, 153 | 
"file_extension": ".py", 154 | "mimetype": "text/x-python", 155 | "name": "python", 156 | "nbconvert_exporter": "python", 157 | "pygments_lexer": "ipython3", 158 | "version": "3.6.8" 159 | } 160 | }, 161 | "nbformat": 4, 162 | "nbformat_minor": 2 163 | } 164 | -------------------------------------------------------------------------------- /week_6/README.md: -------------------------------------------------------------------------------- 1 | # Week 6 2 | Get an introduction to training neural networks across multiple workers. Topics include: 3 | 4 | - An example of how to pass the weights of a TensorFlow* model between workers and drivers 5 | - How to implement a sharded parameter server for distributing parameters across multiple workers 6 | -------------------------------------------------------------------------------- /week_6/week_6_exercise_1.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Week 6: Exercise 1 - Pass Neural Net Weights Between Processes\n", 8 | "\n", 9 | "**GOAL:** The goal of this exercise is to show how to send neural network weights between workers and the driver.\n", 10 | "\n", 11 | "For more details on using Ray with TensorFlow, see the documentation at http://ray.readthedocs.io/en/latest/using-ray-with-tensorflow.html.\n", 12 | "\n", 13 | "### Concepts for this Exercise - Getting and Setting Neural Net Weights\n", 14 | "\n", 15 | "Since pickling and unpickling a TensorFlow graph can be inefficient or may not work at all, it is most efficient to ship the weights between processes as a dictionary of numpy arrays (or as a flattened numpy array).\n", 16 | "\n", 17 | "We provide the helper class `ray.experimental.TensorFlowVariables` to help with getting and setting weights. 
Similar techniques should work with other neural net libraries.\n", 18 | "\n", 19 | "Consider the following neural net definition.\n", 20 | "\n", 21 | "```python\n", 22 | "import tensorflow as tf\n", 23 | "\n", 24 | "x_data = tf.placeholder(tf.float32, shape=[100])\n", 25 | "y_data = tf.placeholder(tf.float32, shape=[100])\n", 26 | "\n", 27 | "w = tf.Variable(tf.random_uniform([1], -1.0, 1.0))\n", 28 | "b = tf.Variable(tf.zeros([1]))\n", 29 | "y = w * x_data + b\n", 30 | "\n", 31 | "loss = tf.reduce_mean(tf.square(y - y_data))\n", 32 | "optimizer = tf.train.GradientDescentOptimizer(0.5)\n", 33 | "grads = optimizer.compute_gradients(loss)\n", 34 | "train = optimizer.apply_gradients(grads)\n", 35 | "\n", 36 | "init = tf.global_variables_initializer()\n", 37 | "sess = tf.Session()\n", 38 | "sess.run(init)\n", 39 | "```\n", 40 | "\n", 41 | "Then we can use the helper class as follows.\n", 42 | "\n", 43 | "```python\n", 44 | "variables = ray.experimental.TensorFlowVariables(loss, sess)\n", 45 | "# Here 'weights' is a dictionary mapping variable names to the associated\n", 46 | "# weights as a numpy array.\n", 47 | "weights = variables.get_weights()\n", 48 | "variables.set_weights(weights)\n", 49 | "```\n", 50 | "\n", 51 | "Note that there are analogous methods `variables.get_flat` and `variables.set_flat`, which concatenate the weights as a single array instead of a dictionary.\n", 52 | "\n", 53 | "```python\n", 54 | "# Here 'weights' is a numpy array of all of the neural net weights\n", 55 | "# concatenated together.\n", 56 | "weights = variables.get_flat()\n", 57 | "variables.set_flat(weights)\n", 58 | "```\n", 59 | "\n", 60 | "In this exercise, we will use an actor containing a neural network and implement methods to extract and set the neural net weights.\n", 61 | "\n", 62 | "**WARNING:** This exercise is more complex than previous exercises." 
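,
 "\n",
 "\n",
 "One of the steps below asks you to average the weights gathered from several actors. With the `get_flat` representation, that is just an element-wise mean, sketched here in plain Python (no Ray or TensorFlow needed; the numbers are made up):\n",
 "\n",
 "```python\n",
 "# Each inner list stands in for one actor's flattened weight vector.\n",
 "worker_weights = [[0.0, 2.0], [1.0, 3.0], [2.0, 4.0], [3.0, 5.0]]\n",
 "\n",
 "n = len(worker_weights)\n",
 "mean_weights = [sum(column) / n for column in zip(*worker_weights)]\n",
 "assert mean_weights == [1.5, 3.5]\n",
 "```"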
63 | ] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "execution_count": null, 68 | "metadata": { 69 | "collapsed": true 70 | }, 71 | "outputs": [], 72 | "source": [ 73 | "from __future__ import absolute_import\n", 74 | "from __future__ import division\n", 75 | "from __future__ import print_function\n", 76 | "\n", 77 | "import numpy as np\n", 78 | "import ray\n", 79 | "import tensorflow as tf\n", 80 | "import time" 81 | ] 82 | }, 83 | { 84 | "cell_type": "code", 85 | "execution_count": null, 86 | "metadata": { 87 | "collapsed": true 88 | }, 89 | "outputs": [], 90 | "source": [ 91 | "ray.init(num_cpus=4, include_webui=False, ignore_reinit_error=True)" 92 | ] 93 | }, 94 | { 95 | "cell_type": "markdown", 96 | "metadata": {}, 97 | "source": [ 98 | "The code below defines a class containing a simple neural network.\n", 99 | "\n", 100 | "**EXERCISE:** Implement the `set_weights` and `get_weights` methods. This should be done using the `ray.experimental.TensorFlowVariables` helper class." 101 | ] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "execution_count": null, 106 | "metadata": { 107 | "collapsed": true 108 | }, 109 | "outputs": [], 110 | "source": [ 111 | "@ray.remote\n", 112 | "class SimpleModel(object):\n", 113 | " def __init__(self):\n", 114 | " x_data = tf.placeholder(tf.float32, shape=[100])\n", 115 | " y_data = tf.placeholder(tf.float32, shape=[100])\n", 116 | "\n", 117 | " w = tf.Variable(tf.random_uniform([1], -1.0, 1.0))\n", 118 | " b = tf.Variable(tf.zeros([1]))\n", 119 | " y = w * x_data + b\n", 120 | "\n", 121 | " self.loss = tf.reduce_mean(tf.square(y - y_data))\n", 122 | " optimizer = tf.train.GradientDescentOptimizer(0.5)\n", 123 | " grads = optimizer.compute_gradients(self.loss)\n", 124 | " self.train = optimizer.apply_gradients(grads)\n", 125 | "\n", 126 | " init = tf.global_variables_initializer()\n", 127 | " self.sess = tf.Session()\n", 128 | "\n", 129 | " # Here we create the TensorFlowVariables object to assist with getting\n", 130 | " # and 
setting weights.\n", 131 | " self.variables = ray.experimental.TensorFlowVariables(self.loss, self.sess)\n", 132 | "\n", 133 | " self.sess.run(init)\n", 134 | "\n", 135 | " def set_weights(self, weights):\n", 136 | " \"\"\"Set the neural net weights.\n", 137 | " \n", 138 | " This method should assign the given weights to the neural net.\n", 139 | " \n", 140 | " Args:\n", 141 | " weights: Either a dict mapping strings (the variable names) to numpy\n", 142 | " arrays or a single flattened numpy array containing all of the\n", 143 | " concatenated weights.\n", 144 | " \"\"\"\n", 145 | " # EXERCISE: You will want to use self.variables here.\n", 146 | " raise NotImplementedError\n", 147 | "\n", 148 | " def get_weights(self):\n", 149 | " \"\"\"Get the neural net weights.\n", 150 | " \n", 151 | " This method should return the current neural net weights.\n", 152 | " \n", 153 | " Returns:\n", 154 | " Either a dict mapping strings (the variable names) to numpy arrays or\n", 155 | " a single flattened numpy array containing all of the concatenated\n", 156 | " weights.\n", 157 | " \"\"\"\n", 158 | " # EXERCISE: You will want to use self.variables here.\n", 159 | " raise NotImplementedError" 160 | ] 161 | }, 162 | { 163 | "cell_type": "markdown", 164 | "metadata": {}, 165 | "source": [ 166 | "Create a few actors." 167 | ] 168 | }, 169 | { 170 | "cell_type": "code", 171 | "execution_count": null, 172 | "metadata": { 173 | "collapsed": true 174 | }, 175 | "outputs": [], 176 | "source": [ 177 | "actors = [SimpleModel.remote() for _ in range(4)]" 178 | ] 179 | }, 180 | { 181 | "cell_type": "markdown", 182 | "metadata": {}, 183 | "source": [ 184 | "**EXERCISE:** Get the neural net weights from all of the actors." 
185 | ] 186 | }, 187 | { 188 | "cell_type": "code", 189 | "execution_count": null, 190 | "metadata": { 191 | "collapsed": true 192 | }, 193 | "outputs": [], 194 | "source": [ 195 | "raise Exception('Implement this.')" 196 | ] 197 | }, 198 | { 199 | "cell_type": "markdown", 200 | "metadata": {}, 201 | "source": [ 202 | "**EXERCISE:** Average all of the neural net weights.\n", 203 | "\n", 204 | "**NOTE:** This will be easier to do if you chose to use `get_flat`/`set_flat` instead of `get_weights`/`set_weights` in the implementation of `SimpleModel.set_weights` and `SimpleModel.get_weights` above." 205 | ] 206 | }, 207 | { 208 | "cell_type": "code", 209 | "execution_count": null, 210 | "metadata": { 211 | "collapsed": true 212 | }, 213 | "outputs": [], 214 | "source": [ 215 | "raise Exception('Implement this.')" 216 | ] 217 | }, 218 | { 219 | "cell_type": "markdown", 220 | "metadata": {}, 221 | "source": [ 222 | "**EXERCISE:** Set the average weights on the actors." 223 | ] 224 | }, 225 | { 226 | "cell_type": "code", 227 | "execution_count": null, 228 | "metadata": { 229 | "collapsed": true 230 | }, 231 | "outputs": [], 232 | "source": [ 233 | "raise Exception('Implement this.')" 234 | ] 235 | }, 236 | { 237 | "cell_type": "markdown", 238 | "metadata": {}, 239 | "source": [ 240 | "**VERIFY:** Check that all of the actors have the same weights." 241 | ] 242 | }, 243 | { 244 | "cell_type": "code", 245 | "execution_count": null, 246 | "metadata": { 247 | "collapsed": true 248 | }, 249 | "outputs": [], 250 | "source": [ 251 | "weights = ray.get([actor.get_weights.remote() for actor in actors])\n", 252 | "\n", 253 | "for i in range(len(weights)):\n", 254 | " np.testing.assert_equal(weights[i], weights[0])\n", 255 | "\n", 256 | "print('Success! 
The test passed.')" 257 | ] 258 | } 259 | ], 260 | "metadata": { 261 | "kernelspec": { 262 | "display_name": "Python 3", 263 | "language": "python", 264 | "name": "python3" 265 | }, 266 | "language_info": { 267 | "codemirror_mode": { 268 | "name": "ipython", 269 | "version": 3 270 | }, 271 | "file_extension": ".py", 272 | "mimetype": "text/x-python", 273 | "name": "python", 274 | "nbconvert_exporter": "python", 275 | "pygments_lexer": "ipython3", 276 | "version": "3.6.8" 277 | } 278 | }, 279 | "nbformat": 4, 280 | "nbformat_minor": 2 281 | } 282 | -------------------------------------------------------------------------------- /week_6/week_6_exercise_2.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Week 6: Exercise 2 - Sharded Parameter Servers\n", 8 | "\n", 9 | "**GOAL:** The goal of this exercise is to use actor handles to implement a sharded parameter server example for **distributed asynchronous stochastic gradient descent**.\n", 10 | "\n", 11 | "Before doing this exercise, make sure you understand the concepts from the exercise on **Actor Handles**.\n", 12 | "\n", 13 | "### Parameter Servers\n", 14 | "\n", 15 | "A parameter server is simply an object that stores the parameters (or \"weights\") of a machine learning model (this could be a neural network, a linear model, or something else). It exposes two methods: one for getting the parameters and one for updating the parameters.\n", 16 | "\n", 17 | "In a typical machine learning training application, worker processes will run in an infinite loop that does the following:\n", 18 | "1. Get the latest parameters from the parameter server.\n", 19 | "2. Compute an update to the parameters (using the current parameters and some data).\n", 20 | "3. 
Send the update to the parameter server.\n", 21 | "\n", 22 | "The workers can operate synchronously (that is, in lock step), in which case distributed training with multiple workers is algorithmically equivalent to serial training with a larger batch of data. Alternatively, workers can operate independently and apply their updates asynchronously. The main benefit of asynchronous training is that a single slow worker will not slow down the other workers. The benefit of synchronous training is that the algorithm behavior is more predictable and reproducible." 23 | ] 24 | }, 25 | { 26 | "cell_type": "code", 27 | "execution_count": null, 28 | "metadata": {}, 29 | "outputs": [], 30 | "source": [ 31 | "from __future__ import absolute_import\n", 32 | "from __future__ import division\n", 33 | "from __future__ import print_function\n", 34 | "\n", 35 | "import numpy as np\n", 36 | "import ray\n", 37 | "import time" 38 | ] 39 | }, 40 | { 41 | "cell_type": "code", 42 | "execution_count": null, 43 | "metadata": {}, 44 | "outputs": [], 45 | "source": [ 46 | "ray.init(num_cpus=30, include_webui=False, ignore_reinit_error=True)" 47 | ] 48 | }, 49 | { 50 | "cell_type": "markdown", 51 | "metadata": {}, 52 | "source": [ 53 | "A simple parameter server can be implemented as a Python class in a few lines of code.\n", 54 | "\n", 55 | "**EXERCISE:** Make the `ParameterServer` class an actor." 
56 | ] 57 | }, 58 | { 59 | "cell_type": "code", 60 | "execution_count": null, 61 | "metadata": {}, 62 | "outputs": [], 63 | "source": [ 64 | "dim = 10\n", 65 | "\n", 66 | "class ParameterServer(object):\n", 67 | " def __init__(self, dim):\n", 68 | " self.parameters = np.zeros(dim)\n", 69 | " \n", 70 | " def get_parameters(self):\n", 71 | " return self.parameters\n", 72 | " \n", 73 | " def update_parameters(self, update):\n", 74 | " self.parameters += update\n", 75 | "\n", 76 | "\n", 77 | "ps = ParameterServer(dim)\n", 78 | "\n", 79 | "assert hasattr(ParameterServer, 'remote'), ('You need to turn ParameterServer into an '\n", 80 | " 'actor (by using the ray.remote keyword).')" 81 | ] 82 | }, 83 | { 84 | "cell_type": "markdown", 85 | "metadata": {}, 86 | "source": [ 87 | "A worker can be implemented as a simple Python function that repeatedly gets the latest parameters, computes an update to the parameters, and sends the update to the parameter server.\n", 88 | "\n", 89 | "**EXERCISE:** Make the `worker` function a remote function. Since you turned `ParameterServer` into an actor class above, you'll also need to modify the `get_parameters` and `update_parameters` function invocations (by adding the `.remote` keyword). You'll also need to call `ray.get` to get the result of `get_parameters` (but not for `update_parameters`)." 90 | ] 91 | }, 92 | { 93 | "cell_type": "code", 94 | "execution_count": null, 95 | "metadata": {}, 96 | "outputs": [], 97 | "source": [ 98 | "def worker(ps, dim, num_iters):\n", 99 | " for _ in range(num_iters):\n", 100 | " # Get the latest parameters.\n", 101 | " parameters = ps.get_parameters()\n", 102 | " # Compute an update.\n", 103 | " update = 1e-3 * parameters + np.ones(dim)\n", 104 | " # Update the parameters.\n", 105 | " ps.update_parameters(update)\n", 106 | " # Sleep a little to simulate a real workload.\n", 107 | " time.sleep(0.5)\n", 108 | "\n", 109 | "# Test that worker is implemented correctly. 
You do not need to change this line.\n", 110 | "ray.get(worker.remote(ps, dim, 1))" 111 | ] 112 | }, 113 | { 114 | "cell_type": "code", 115 | "execution_count": null, 116 | "metadata": {}, 117 | "outputs": [], 118 | "source": [ 119 | "# Start two workers.\n", 120 | "worker_results = [worker(ps, dim, 100) for _ in range(2)]" 121 | ] 122 | }, 123 | { 124 | "cell_type": "markdown", 125 | "metadata": {}, 126 | "source": [ 127 | "As the worker tasks are executing, you can query the parameter server from the driver and see the parameters changing in the background.\n", 128 | "\n", 129 | "**EXERCISE:** Experiment by querying the parameter server below and see how the parameters change. Simply run the next cell a bunch of times and watch the values change." 130 | ] 131 | }, 132 | { 133 | "cell_type": "code", 134 | "execution_count": null, 135 | "metadata": {}, 136 | "outputs": [], 137 | "source": [ 138 | "print(ray.get(ps.get_parameters.remote()))" 139 | ] 140 | }, 141 | { 142 | "cell_type": "markdown", 143 | "metadata": {}, 144 | "source": [ 145 | "## Sharding a Parameter Server\n", 146 | "\n", 147 | "As the number of workers increases, the volume of updates being sent to the parameter server will increase. At some point, the network bandwidth into the parameter server machine or the computation done by the parameter server may be a bottleneck.\n", 148 | "\n", 149 | "Suppose you have $N$ workers and $1$ parameter server, and suppose each of these is an actor that lives on its own machine. Furthermore, suppose the model size is $M$ bytes. Then sending all of the parameters from the workers to the parameter server will mean that $N * M$ bytes in total are sent to the parameter server. If $N = 100$ and $M = 10^8$, then the parameter server must receive ten gigabytes, which, assuming a network bandwidth of 10 giga*bits* per second, would take 8 seconds. 
This would be prohibitive.\n", 150 | "\n", 151 | "On the other hand, if the parameters are sharded (that is, split) across `K` parameter servers, where `K = 100`, and each parameter server lives on a separate machine, then each parameter server needs to receive only 100 megabytes, which can be done in 80 milliseconds. This is much better.\n", 152 | "\n", 153 | "**EXERCISE:** The code below defines a parameter server shard class. Modify this class to make `ParameterServerShard` an actor. We will need to revisit this code soon and increase `num_shards`." 154 | ] 155 | }, 156 | { 157 | "cell_type": "code", 158 | "execution_count": null, 159 | "metadata": {}, 160 | "outputs": [], 161 | "source": [ 162 | "class ParameterServerShard(object):\n", 163 | " def __init__(self, sharded_dim):\n", 164 | " self.parameters = np.zeros(sharded_dim)\n", 165 | " \n", 166 | " def get_parameters(self):\n", 167 | " return self.parameters\n", 168 | " \n", 169 | " def update_parameters(self, update):\n", 170 | " self.parameters += update\n", 171 | "\n", 172 | "\n", 173 | "total_dim = (10 ** 8) // 8 # This works out to 100MB (we have 12.5 million\n", 174 | " # float64 values, which are each 8 bytes).\n", 175 | "num_shards = 1 # The number of parameter server shards.\n", 176 | "\n", 177 | "assert total_dim % num_shards == 0, ('In this exercise, the number of shards must '\n", 178 | " 'perfectly divide the total dimension.')\n", 179 | "\n", 180 | "# Start some parameter servers.\n", 181 | "ps_shards = [ParameterServerShard(total_dim // num_shards) for _ in range(num_shards)]\n", 182 | "\n", 183 | "assert hasattr(ParameterServerShard, 'remote'), ('You need to turn ParameterServerShard into an '\n", 184 | " 'actor (by using the ray.remote keyword).')" 185 | ] 186 | }, 187 | { 188 | "cell_type": "markdown", 189 | "metadata": {}, 190 | "source": [ 191 | "The code below implements a worker that does the following.\n", 192 | "1. 
Gets the latest parameters from all of the parameter server shards.\n", 193 | "2. Concatenates the parameters together to form the full parameter vector.\n", 194 | "3. Computes an update to the parameters.\n", 195 | "4. Partitions the update into one piece for each parameter server.\n", 196 | "5. Applies the right update to each parameter server shard.\n", 197 | "\n", 198 | "**EXERCISE:** Modify the code below to make `worker_task` a remote function. You will also need to modify the parameter server method invocations within the function (e.g., by adding `.remote` where needed and calling `ray.get` when needed)." 199 | ] 200 | }, 201 | { 202 | "cell_type": "code", 203 | "execution_count": null, 204 | "metadata": {}, 205 | "outputs": [], 206 | "source": [ 207 | "def worker_task(total_dim, num_iters, *ps_shards):\n", 208 | " # Note that ps_shards are passed in using Python's variable number\n", 209 | " # of arguments feature. We do this because currently actor handles\n", 210 | " # cannot be passed to tasks inside of lists or other objects.\n", 211 | " for _ in range(num_iters):\n", 212 | " # Get the current parameters from each parameter server.\n", 213 | " parameter_shards = [ps.get_parameters() for ps in ps_shards]\n", 214 | " assert all([isinstance(shard, np.ndarray) for shard in parameter_shards]), (\n", 215 | " 'The parameter shards must be numpy arrays. 
Did you forget to call ray.get?')\n", 216 | " # Concatenate them to form the full parameter vector.\n", 217 | " parameters = np.concatenate(parameter_shards)\n", 218 | " assert parameters.shape == (total_dim,)\n", 219 | "\n", 220 | " # Compute an update.\n", 221 | " update = np.ones(total_dim)\n", 222 | " # Shard the update.\n", 223 | " update_shards = np.split(update, len(ps_shards))\n", 224 | " \n", 225 | " # Apply the updates to the relevant parameter server shards.\n", 226 | " for ps, update_shard in zip(ps_shards, update_shards):\n", 227 | " ps.update_parameters(update_shard)\n", 228 | "\n", 229 | "\n", 230 | "# Test that worker_task is implemented correctly. You do not need to change this line.\n", 231 | "ray.get(worker_task.remote(total_dim, 1, *ps_shards))" 232 | ] 233 | }, 234 | { 235 | "cell_type": "markdown", 236 | "metadata": {}, 237 | "source": [ 238 | "**EXERCISE:** Experiment by changing the number of parameter server shards, the number of workers, and the size of the data.\n", 239 | "\n", 240 | "**NOTE:** Because these processes are all running on the same machine, network bandwidth will not be a limitation and sharding the parameter server will not help. To see the difference, you would need to run the application on multiple machines. There are still regimes where sharding a parameter server can help speed up computation on the same machine (by parallelizing the computation that the parameter server processes have to do). If you want to see this effect, you should implement a synchronous training application. In the asynchronous setting, the computation is staggered and so speeding up the parameter server usually does not matter." 241 | ] 242 | }, 243 | { 244 | "cell_type": "code", 245 | "execution_count": null, 246 | "metadata": {}, 247 | "outputs": [], 248 | "source": [ 249 | "num_workers = 4\n", 250 | "\n", 251 | "# Start some workers. 
Try changing various quantities and see how the\n", 252 | "# duration changes.\n", 253 | "start = time.time()\n", 254 | "ray.get([worker_task(total_dim, 5, *ps_shards) for _ in range(num_workers)])\n", 255 | "print('This took {} seconds.'.format(time.time() - start))" 256 | ] 257 | } 258 | ], 259 | "metadata": { 260 | "kernelspec": { 261 | "display_name": "Python 3", 262 | "language": "python", 263 | "name": "python3" 264 | }, 265 | "language_info": { 266 | "codemirror_mode": { 267 | "name": "ipython", 268 | "version": 3 269 | }, 270 | "file_extension": ".py", 271 | "mimetype": "text/x-python", 272 | "name": "python", 273 | "nbconvert_exporter": "python", 274 | "pygments_lexer": "ipython3", 275 | "version": "3.6.8" 276 | } 277 | }, 278 | "nbformat": 4, 279 | "nbformat_minor": 2 280 | } 281 | -------------------------------------------------------------------------------- /week_7/README.md: -------------------------------------------------------------------------------- 1 | # Week 7 2 | Understand how to use Ray Tune, a scalable framework for searching for hyperparameters. 
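At its core, hyperparameter search is a simple loop: try many configurations, keep the one with the lowest loss. The sketch below shows that loop in plain Python with a made-up quadratic loss standing in for an expensive training run; `toy_loss`, `random_search`, and the pretend optimum at `lr=0.1`, `momentum=0.9` are all illustrative names for this sketch, not part of the Tune API.

```python
import random

def toy_loss(lr, momentum):
    # Stand-in for an expensive training run: configurations are scored
    # by their distance from a pretend optimum at lr=0.1, momentum=0.9.
    return (lr - 0.1) ** 2 + (momentum - 0.9) ** 2

def random_search(num_trials, seed=0):
    # Sample num_trials random configurations and keep the best one.
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(num_trials):
        config = {"lr": rng.uniform(0.001, 1.0),
                  "momentum": rng.uniform(0.0, 1.0)}
        loss = toy_loss(**config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss

best_config, best_loss = random_search(100)
print(best_config, best_loss)
```

Tune replaces this hand-rolled loop with distributed trials plus schedulers (such as HyperBand) and search algorithms (such as HyperOpt) that decide which configurations to try and which to stop early.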
3 | 4 | - Use Tune to speed up hyperparameter search, one of the most expensive parts of machine learning 5 | - Search for the right parameters, such as learning rate and momentum, to train a neural network 6 | - Combine HyperOpt and HyperBand to perform a more powerful search 7 | -------------------------------------------------------------------------------- /week_7/cnn.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/week_7/cnn.png -------------------------------------------------------------------------------- /week_7/helper.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import os 3 | import scipy.ndimage as ndimage 4 | import itertools 5 | import logging 6 | import sys 7 | import keras 8 | from keras.datasets import mnist 9 | from keras.preprocessing.image import ImageDataGenerator 10 | from keras import backend as K 11 | 12 | def limit_threads(num_threads): 13 | K.set_session( 14 | K.tf.Session( 15 | config=K.tf.ConfigProto( 16 | intra_op_parallelism_threads=num_threads, 17 | inter_op_parallelism_threads=num_threads))) 18 | 19 | 20 | def shuffled(x, y): 21 | idx = np.r_[:x.shape[0]] 22 | np.random.shuffle(idx) 23 | return x[idx], y[idx] 24 | 25 | 26 | def load_data(generator=True, num_batches=600): 27 | num_classes = 10 28 | 29 | # input image dimensions 30 | img_rows, img_cols = 28, 28 31 | 32 | # the data, split between train and test sets 33 | (x_train, y_train), (x_test, y_test) = mnist.load_data() 34 | 35 | if K.image_data_format() == 'channels_first': 36 | x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols) 37 | x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols) 38 | input_shape = (1, img_rows, img_cols) 39 | else: 40 | x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1) 41 | x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1) 
42 | input_shape = (img_rows, img_cols, 1) 43 | 44 | x_train = x_train.astype('float32') 45 | x_test = x_test.astype('float32') 46 | x_train /= 255 47 | x_test /= 255 48 | print('x_train shape:', x_train.shape) 49 | print(x_train.shape[0], 'train samples') 50 | print(x_test.shape[0], 'test samples') 51 | x_train, y_train = shuffled(x_train, y_train) 52 | x_test, y_test = shuffled(x_test, y_test) 53 | 54 | # convert class vectors to binary class matrices 55 | y_train = keras.utils.to_categorical(y_train, num_classes) 56 | y_test = keras.utils.to_categorical(y_test, num_classes) 57 | if generator: 58 | datagen = ImageDataGenerator() 59 | return itertools.islice(datagen.flow(x_train, y_train), num_batches) 60 | return x_train, x_test, y_train, y_test 61 | 62 | 63 | def get_best_trial(trial_list, metric): 64 | """Retrieve the best trial.""" 65 | return max(trial_list, key=lambda trial: trial.last_result.get(metric, 0)) 66 | 67 | 68 | def get_sorted_trials(trial_list, metric): 69 | return sorted(trial_list, key=lambda trial: trial.last_result.get(metric, 0), reverse=True) 70 | 71 | 72 | def get_best_result(trial_list, metric): 73 | """Retrieve the last result from the best trial.""" 74 | return {metric: get_best_trial(trial_list, metric).last_result[metric]} 75 | 76 | 77 | def get_best_model(model_creator, trial_list, metric): 78 | """Restore a model from the best trial.""" 79 | sorted_trials = get_sorted_trials(trial_list, metric) 80 | for best_trial in sorted_trials: 81 | try: 82 | print("Creating model...") 83 | model = model_creator(best_trial.config) 84 | weights = os.path.join(best_trial.logdir, best_trial.last_result["checkpoint"]) 85 | print("Loading from", weights) 86 | model.load_weights(weights) 87 | break 88 | except Exception as e: 89 | print(e) 90 | print("Loading failed. 
Trying next model") 91 | return model 92 | 93 | def prepare_data(data): 94 | new_data = np.array(data).reshape((1, 28, 28, 1)).astype(np.float32) 95 | return ndimage.gaussian_filter(new_data, sigma=(0.5)) 96 | 97 | class TuneCallback(keras.callbacks.Callback): 98 | def __init__(self, reporter, logs={}): 99 | self.reporter = reporter 100 | 101 | def on_train_end(self, epoch, logs={}): 102 | self.reporter(done=1, mean_accuracy=logs["acc"]) 103 | 104 | def on_batch_end(self, batch, logs={}): 105 | self.reporter(mean_accuracy=logs["acc"]) 106 | 107 | 108 | class GoodError(Exception): 109 | pass 110 | 111 | 112 | def test_reporter(train_mnist_tune): 113 | def mock_reporter(**kwargs): 114 | assert "mean_accuracy" in kwargs, "Did not report proper metric" 115 | assert "checkpoint" in kwargs, "Accidentally removed `checkpoint`?" 116 | assert "timesteps_total" in kwargs, "Accidentally removed `timesteps_total`?" 117 | assert isinstance(kwargs["mean_accuracy"], float), ( 118 | "Did not report properly. 
Need to report a float!") 119 | raise GoodError("This works.") 120 | try: 121 | train_mnist_tune({}, mock_reporter) 122 | except TypeError as e: 123 | print("Forgot to modify function signature?") 124 | raise e 125 | except GoodError: 126 | print("Works!") 127 | return 1 128 | raise Exception("Didn't call reporter...") 129 | 130 | 131 | def evaluate(model, validation=True): 132 | train_data, val_data, train_labels, val_labels = load_data(generator=False) 133 | data = val_data if validation else train_data 134 | labels = val_labels if validation else train_labels 135 | 136 | res = model.evaluate(data, labels) 137 | print("Model evaluation results:", dict(zip(model.metrics_names, res))) 138 | -------------------------------------------------------------------------------- /week_7/mnist.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/week_7/mnist.png -------------------------------------------------------------------------------- /week_7/tune.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/week_7/tune.png -------------------------------------------------------------------------------- /week_8/README.md: -------------------------------------------------------------------------------- 1 | # Week 8 2 | Learn about RLlib, which is a scalable reinforcement learning library to train AI agents. 
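A Markov Decision Process — states, actions, transition probabilities, rewards, and a discount factor — is small enough to write out by hand before bringing in RLlib. The two-state MDP below is purely illustrative (the states, rewards, and the names `transitions` and `value_iteration` are invented for this sketch, not an RLlib or Gym API); value iteration recovers the optimal state values.

```python
# A toy two-state MDP. transitions[state][action] is a list of
# (probability, next_state, reward) outcomes for taking that action.
transitions = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 2.0)], "go": [(1.0, 0, 0.0)]},
}
gamma = 0.9  # discount factor

def value_iteration(transitions, gamma, num_iters=100):
    # Repeatedly back up the best expected one-step return from each state.
    values = {s: 0.0 for s in transitions}
    for _ in range(num_iters):
        values = {
            s: max(
                sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            for s, actions in transitions.items()
        }
    return values

values = value_iteration(transitions, gamma)
print(values)
```

The greedy policy then falls out by picking, in each state, the action that achieves the maximum. RLlib's algorithms, such as PPO and DQN, learn policies like this from interaction with an environment instead of an explicit transition table.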
3 | 4 | - Get an introduction to the Markov Decision Process and how to use it in Python* 5 | - See an example of how to use the PPO algorithm to train a network to play a simple game with Gym* and visualize the results with TensorBoard* 6 | - Learn to create a deep Q-network (DQN) to play Pong and play against it in a browser 7 | -------------------------------------------------------------------------------- /week_8/client.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/week_8/client.png -------------------------------------------------------------------------------- /week_8/dqn.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/week_8/dqn.png -------------------------------------------------------------------------------- /week_8/learning.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/week_8/learning.png -------------------------------------------------------------------------------- /week_8/log.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/week_8/log.png -------------------------------------------------------------------------------- /week_8/ppo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/week_8/ppo.png -------------------------------------------------------------------------------- /week_8/serving/data_large.gz: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/week_8/serving/data_large.gz -------------------------------------------------------------------------------- /week_8/serving/data_small.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/week_8/serving/data_small.gz -------------------------------------------------------------------------------- /week_8/serving/do_rollouts.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | from __future__ import division 3 | from __future__ import print_function 4 | 5 | import json 6 | import argparse 7 | import gym 8 | 9 | from ray.rllib.utils.policy_client import PolicyClient 10 | 11 | parser = argparse.ArgumentParser() 12 | parser.add_argument( 13 | "--no-train", action="store_true", help="Whether to disable training.") 14 | parser.add_argument( 15 | "--off-policy", 16 | action="store_true", 17 | help="Whether to take random instead of on-policy actions.") 18 | 19 | 20 | if __name__ == "__main__": 21 | args = parser.parse_args() 22 | import pong_py 23 | env = pong_py.PongJSEnv() 24 | client = PolicyClient("http://localhost:8900") 25 | 26 | eid = client.start_episode(training_enabled=not args.no_train) 27 | obs = env.reset() 28 | rewards = 0 29 | episode = [] 30 | f = open("out.txt", "w") 31 | 32 | while True: 33 | if args.off_policy: 34 | action = env.action_space.sample() 35 | client.log_action(eid, obs, action) 36 | else: 37 | action = client.get_action(eid, obs) 38 | next_obs, reward, done, info = env.step(action) 39 | episode.append({ 40 | "obs": obs.tolist(), 41 | "action": float(action), 42 | "reward": reward, 43 | }) 44 | obs = next_obs 45 | rewards += reward 46 | 
client.log_returns(eid, reward, info=info) 47 | if done: 48 | print("Total reward:", rewards) 49 | f.write(json.dumps(episode)) 50 | f.write("\n") 51 | f.flush() 52 | rewards = 0 53 | client.end_episode(eid, obs) 54 | obs = env.reset() 55 | eid = client.start_episode(training_enabled=not args.no_train) 56 | -------------------------------------------------------------------------------- /week_8/serving/javascript-pong/static/game.js: -------------------------------------------------------------------------------- 1 | //============================================================================= 2 | // 3 | // We need some ECMAScript 5 methods but we need to implement them ourselves 4 | // for older browsers (compatibility: 5 | // http://kangax.github.com/es5-compat-table/) 6 | // 7 | // Function.bind: 8 | // https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Function/bind 9 | // Object.create: http://javascript.crockford.com/prototypal.html 10 | // Object.extend: (defacto standard like jquery $.extend or prototype's 11 | // Object.extend) 12 | // 13 | // Object.construct: our own wrapper around Object.create that ALSO calls 14 | // an initialize constructor method if one exists 15 | // 16 | //============================================================================= 17 | 18 | if (!Function.prototype.bind) { 19 | Function.prototype.bind = function(obj) { 20 | var slice = [].slice, args = slice.call(arguments, 1), self = this, 21 | nop = function() {}, bound = function() { 22 | return self.apply( 23 | this instanceof nop ? 
this : (obj || {}), 24 | args.concat(slice.call(arguments))); 25 | }; 26 | nop.prototype = self.prototype; 27 | bound.prototype = new nop(); 28 | return bound; 29 | }; 30 | } 31 | 32 | if (!Object.create) { 33 | Object.create = function(base) { 34 | function F(){}; 35 | F.prototype = base; 36 | return new F(); 37 | } 38 | } 39 | 40 | if (!Object.construct) { 41 | Object.construct = function(base) { 42 | var instance = Object.create(base); 43 | if (instance.initialize) 44 | instance.initialize.apply(instance, [].slice.call(arguments, 1)); 45 | return instance; 46 | } 47 | } 48 | 49 | if (!Object.extend) { 50 | Object.extend = function(destination, source) { 51 | for (var property in source) { 52 | if (source.hasOwnProperty(property)) 53 | destination[property] = source[property]; 54 | } 55 | return destination; 56 | }; 57 | } 58 | 59 | /* NOT READY FOR PRIME TIME 60 | if (!window.requestAnimationFrame) {// 61 | http://paulirish.com/2011/requestanimationframe-for-smart-animating/ 62 | window.requestAnimationFrame = window.webkitRequestAnimationFrame || 63 | window.mozRequestAnimationFrame || 64 | window.oRequestAnimationFrame || 65 | window.msRequestAnimationFrame || 66 | function(callback, element) { 67 | window.setTimeout(callback, 1000 / 60); 68 | } 69 | } 70 | */ 71 | 72 | //============================================================================= 73 | // GAME 74 | //============================================================================= 75 | 76 | Game = { 77 | 78 | compatible: function() { 79 | return Object.create && Object.extend && Function.bind && 80 | document.addEventListener && // HTML5 standard, all modern browsers 81 | // that support canvas should also support 82 | // add/removeEventListener 83 | Game.ua.hasCanvas 84 | }, 85 | 86 | start: function(id, game, cfg) { 87 | if (Game.compatible()) 88 | return Object.construct(Game.Runner, id, game, cfg).game; // return the 89 | // game 90 | // instance, 91 | // not the 92 | // runner 93 | // 
(caller can 94 | // always get 95 | // at the 96 | // runner via 97 | // game.runner) 98 | }, 99 | 100 | ua: function() { // should avoid user agent sniffing... but sometimes you 101 | // just gotta do what you gotta do 102 | var ua = navigator.userAgent.toLowerCase(); 103 | var key = ((ua.indexOf('opera') > -1) ? 'opera' : null); 104 | key = key || ((ua.indexOf('firefox') > -1) ? 'firefox' : null); 105 | key = key || ((ua.indexOf('chrome') > -1) ? 'chrome' : null); 106 | key = key || ((ua.indexOf('safari') > -1) ? 'safari' : null); 107 | key = key || ((ua.indexOf('msie') > -1) ? 'ie' : null); 108 | 109 | try { 110 | var re = (key == 'ie') ? 'msie (\\d)' : key + '\\/(\\d\\.\\d)' 111 | var matches = ua.match(new RegExp(re, 'i')); 112 | var version = matches ? parseFloat(matches[1]) : null; 113 | } catch (e) { 114 | } 115 | 116 | return { 117 | full: ua, name: key + (version ? ' ' + version.toString() : ''), 118 | version: version, isFirefox: (key == 'firefox'), 119 | isChrome: (key == 'chrome'), isSafari: (key == 'safari'), 120 | isOpera: (key == 'opera'), isIE: (key == 'ie'), 121 | hasCanvas: (document.createElement('canvas').getContext), 122 | hasAudio: (typeof(Audio) != 'undefined') 123 | } 124 | }(), 125 | 126 | addEvent: function(obj, type, fn) { 127 | obj.addEventListener(type, fn, false); 128 | }, 129 | removeEvent: function(obj, type, fn) { 130 | obj.removeEventListener(type, fn, false); 131 | }, 132 | 133 | ready: function(fn) { 134 | if (Game.compatible()) Game.addEvent(document, 'DOMContentLoaded', fn); 135 | }, 136 | 137 | createCanvas: function() { 138 | return document.createElement('canvas'); 139 | }, 140 | 141 | createAudio: function(src) { 142 | try { 143 | var a = new Audio(src); 144 | a.volume = 0.1; // lets be real quiet please 145 | return a; 146 | } catch (e) { 147 | return null; 148 | } 149 | }, 150 | 151 | loadImages: function( 152 | sources, callback) { /* load multiple images and callback when ALL have 153 | finished loading */ 154 | var 
images = {}; 155 | var count = sources ? sources.length : 0; 156 | if (count == 0) { 157 | callback(images); 158 | } else { 159 | for (var n = 0; n < sources.length; n++) { 160 | var source = sources[n]; 161 | var image = document.createElement('img'); 162 | images[source] = image; 163 | Game.addEvent(image, 'load', function() { 164 | if (--count == 0) callback(images); 165 | }); 166 | image.src = source; 167 | } 168 | } 169 | }, 170 | 171 | random: function(min, max) { 172 | return (min + (Math.random() * (max - min))); 173 | }, 174 | 175 | timestamp: function() { 176 | return new Date().getTime(); 177 | }, 178 | 179 | KEY: { 180 | BACKSPACE: 8, 181 | TAB: 9, 182 | RETURN: 13, 183 | ESC: 27, 184 | SPACE: 32, 185 | LEFT: 37, 186 | UP: 38, 187 | RIGHT: 39, 188 | DOWN: 40, 189 | DELETE: 46, 190 | HOME: 36, 191 | END: 35, 192 | PAGEUP: 33, 193 | PAGEDOWN: 34, 194 | INSERT: 45, 195 | ZERO: 48, 196 | ONE: 49, 197 | TWO: 50, 198 | A: 65, 199 | L: 76, 200 | P: 80, 201 | Q: 81, 202 | TILDA: 192 203 | }, 204 | 205 | //----------------------------------------------------------------------------- 206 | 207 | Runner: { 208 | 209 | initialize: function(id, game, cfg) { 210 | this.cfg = Object.extend( 211 | game.Defaults || {}, cfg || {}); // use game defaults (if any) and 212 | // extend with custom cfg (if any) 213 | this.fps = this.cfg.fps || 20; 214 | this.interval = 1000.0 / this.fps; 215 | this.canvas = document.getElementById(id); 216 | this.width = this.cfg.width || this.canvas.offsetWidth; 217 | this.height = this.cfg.height || this.canvas.offsetHeight; 218 | this.front = this.canvas; 219 | this.front.width = this.width; 220 | this.front.height = this.height; 221 | this.back = Game.createCanvas(); 222 | this.back.width = this.width; 223 | this.back.height = this.height; 224 | this.front2d = this.front.getContext('2d'); 225 | this.back2d = this.back.getContext('2d'); 226 | this.addEvents(); 227 | this.resetStats(); 228 | 229 | this.game = Object.construct( 230 | game, 
this, this.cfg); // finally construct the game object itself 231 | }, 232 | 233 | start: function() { // game instance should call runner.start() when its 234 | // finished initializing and is ready to start the game 235 | // loop 236 | this.lastFrame = Game.timestamp(); 237 | this.timer = setInterval(this.loop.bind(this), this.interval); 238 | }, 239 | 240 | stop: function() { 241 | clearInterval(this.timer); 242 | }, 243 | 244 | loop: function() { 245 | var start = Game.timestamp(); 246 | this.update((start - this.lastFrame) / 1000.0); // send dt as seconds 247 | var middle = Game.timestamp(); 248 | this.draw(); 249 | var end = Game.timestamp(); 250 | this.updateStats(middle - start, end - middle); 251 | this.lastFrame = start; 252 | }, 253 | 254 | update: function(dt) { 255 | this.game.update(dt); 256 | }, 257 | 258 | draw: function() { 259 | this.back2d.clearRect(0, 0, this.width, this.height); 260 | this.game.draw(this.back2d); 261 | this.drawStats(this.back2d); 262 | this.front2d.clearRect(0, 0, this.width, this.height); 263 | this.front2d.drawImage(this.back, 0, 0); 264 | }, 265 | 266 | resetStats: function() { 267 | this.stats = { 268 | count: 0, 269 | fps: 0, 270 | update: 0, 271 | draw: 0, 272 | frame: 0 // update + draw 273 | }; 274 | }, 275 | 276 | updateStats: function(update, draw) { 277 | if (this.cfg.stats) { 278 | this.stats.update = Math.max(1, update); 279 | this.stats.draw = Math.max(1, draw); 280 | this.stats.frame = this.stats.update + this.stats.draw; 281 | this.stats.count = 282 | this.stats.count == this.fps ? 
0 : this.stats.count + 1; 283 | this.stats.fps = Math.min(this.fps, 1000 / this.stats.frame); 284 | } 285 | }, 286 | 287 | drawStats: function(ctx) { 288 | if (this.cfg.stats) { 289 | ctx.fillText( 290 | 'frame: ' + this.stats.count, this.width - 100, this.height - 60); 291 | ctx.fillText( 292 | 'fps: ' + this.stats.fps, this.width - 100, this.height - 50); 293 | ctx.fillText( 294 | 'update: ' + this.stats.update + 'ms', this.width - 100, 295 | this.height - 40); 296 | ctx.fillText( 297 | 'draw: ' + this.stats.draw + 'ms', this.width - 100, 298 | this.height - 30); 299 | } 300 | }, 301 | 302 | addEvents: function() { 303 | Game.addEvent(document, 'keydown', this.onkeydown.bind(this)); 304 | Game.addEvent(document, 'keyup', this.onkeyup.bind(this)); 305 | }, 306 | 307 | onkeydown: function(ev) { 308 | if (this.game.onkeydown) this.game.onkeydown(ev.keyCode); 309 | }, 310 | onkeyup: function(ev) { 311 | if (this.game.onkeyup) this.game.onkeyup(ev.keyCode); 312 | }, 313 | 314 | hideCursor: function() { 315 | this.canvas.style.cursor = 'none'; 316 | }, 317 | showCursor: function() { 318 | this.canvas.style.cursor = 'auto'; 319 | }, 320 | 321 | alert: function(msg) { 322 | this.stop(); // alert blocks thread, so need to stop game loop in order 323 | // to avoid sending huge dt values to next update 324 | result = window.alert(msg); 325 | this.start(); 326 | return result; 327 | }, 328 | 329 | confirm: function(msg) { 330 | this.stop(); // alert blocks thread, so need to stop game loop in order 331 | // to avoid sending huge dt values to next update 332 | result = window.confirm(msg); 333 | this.start(); 334 | return result; 335 | } 336 | 337 | //------------------------------------------------------------------------- 338 | 339 | } // Game.Runner 340 | } // Game 341 | -------------------------------------------------------------------------------- /week_8/serving/javascript-pong/static/images/press1.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/week_8/serving/javascript-pong/static/images/press1.png -------------------------------------------------------------------------------- /week_8/serving/javascript-pong/static/images/press2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/week_8/serving/javascript-pong/static/images/press2.png -------------------------------------------------------------------------------- /week_8/serving/javascript-pong/static/images/winner.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/week_8/serving/javascript-pong/static/images/winner.png -------------------------------------------------------------------------------- /week_8/serving/javascript-pong/static/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | Pong! 5 | 6 | 7 | 8 | 9 | 10 | 11 | 49 | 50 | 51 |
52 | Sorry, this example cannot be run because your browser does not support the <canvas> element 53 |
54 |
55 | 56 | 57 | 58 | 81 | 82 | 83 | 84 | -------------------------------------------------------------------------------- /week_8/serving/javascript-pong/static/pong.css: -------------------------------------------------------------------------------- 1 | body { background-color: black; color: #AAA; font-size: 12pt; padding: 1em; } 2 | 3 | #unsupported { border: 1px solid yellow; color: black; background-color: #FFFFAD; padding: 2em; margin: 1em; display: inline-block; } 4 | 5 | #sidebar { width: 18em; height: 40em; float: left; font-size: 0.825em; background-color: #333; border: 1px solid white; padding: 1em; } 6 | #sidebar h2 { color: white; text-align: center; margin: 0; } 7 | #sidebar .parts { padding-left: 1em; list-style-type: none; margin-bottom: 2em; text-align: right; } 8 | #sidebar .parts li a { color: white; text-decoration: none; } 9 | #sidebar .parts li a:visited { color: white; } 10 | #sidebar .parts li a:hover { color: white; text-decoration: underline; } 11 | #sidebar .parts li a.selected { color: #F08010; } 12 | #sidebar .parts li a i { color: #AAA; } 13 | #sidebar .parts li a.selected i { color: #F08010; } 14 | #sidebar .settings { line-height: 1.2em; height: 1.2em; text-align: right; } 15 | #sidebar .settings.size { } 16 | #sidebar .settings.speed { margin-bottom: 1em; } 17 | #sidebar .settings label { vertical-align: middle; } 18 | #sidebar .settings input { vertical-align: middle; } 19 | #sidebar .settings select { vertical-align: middle; } 20 | #sidebar .description { margin-bottom: 2em; } 21 | #sidebar .description b { font-weight: normal; color: #FFF; } 22 | 23 | 24 | @media screen and (min-width: 0px) { 25 | #sidebar { display: none; } 26 | #game { display: block; width: 480px; height: 360px; margin: 0 auto; } 27 | } 28 | 29 | @media screen and (min-width: 800px) { 30 | #game { width: 640px; height: 480px; } 31 | } 32 | 33 | @media screen and (min-width: 1000px) { 34 | #sidebar { display: block; } 35 | #game { margin-left: 18em; } 36 | } 37 
| 38 | @media screen and (min-width: 1200px) { 39 | #game { width: 800px; height: 600px; } 40 | } 41 | 42 | @media screen and (min-width: 1600px) { 43 | #game { width: 1024px; height: 768px; } 44 | } 45 | -------------------------------------------------------------------------------- /week_8/serving/javascript-pong/static/sounds/goal.wav: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/week_8/serving/javascript-pong/static/sounds/goal.wav -------------------------------------------------------------------------------- /week_8/serving/javascript-pong/static/sounds/ping.wav: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/week_8/serving/javascript-pong/static/sounds/ping.wav -------------------------------------------------------------------------------- /week_8/serving/javascript-pong/static/sounds/pong.wav: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/week_8/serving/javascript-pong/static/sounds/pong.wav -------------------------------------------------------------------------------- /week_8/serving/javascript-pong/static/sounds/wall.wav: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/week_8/serving/javascript-pong/static/sounds/wall.wav -------------------------------------------------------------------------------- /week_8/serving/pong_py/pong_py.egg-info/PKG-INFO: -------------------------------------------------------------------------------- 1 | Metadata-Version: 1.0 2 | Name: pong-py 3 | Version: 0.0.0 4 | Summary: UNKNOWN 5 | 
Home-page: UNKNOWN 6 | Author: UNKNOWN 7 | Author-email: UNKNOWN 8 | License: UNKNOWN 9 | Description: UNKNOWN 10 | Platform: UNKNOWN 11 | -------------------------------------------------------------------------------- /week_8/serving/pong_py/pong_py.egg-info/SOURCES.txt: -------------------------------------------------------------------------------- 1 | setup.py 2 | pong_py/__init__.py 3 | pong_py/ball.py 4 | pong_py/helper.py 5 | pong_py/paddle.py 6 | pong_py/pongjsenv.py 7 | pong_py.egg-info/PKG-INFO 8 | pong_py.egg-info/SOURCES.txt 9 | pong_py.egg-info/dependency_links.txt 10 | pong_py.egg-info/top_level.txt -------------------------------------------------------------------------------- /week_8/serving/pong_py/pong_py.egg-info/dependency_links.txt: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /week_8/serving/pong_py/pong_py.egg-info/top_level.txt: -------------------------------------------------------------------------------- 1 | pong_py 2 | -------------------------------------------------------------------------------- /week_8/serving/pong_py/pong_py/__init__.py: -------------------------------------------------------------------------------- 1 | from pong_py.pongjsenv import PongJSEnv 2 | -------------------------------------------------------------------------------- /week_8/serving/pong_py/pong_py/ball.py: -------------------------------------------------------------------------------- 1 | import pong_py.helper as helper 2 | import random 3 | 4 | class Ball(): 5 | def __init__(self, pong): 6 | self.radius = 5 7 | self.dt = pong.dt 8 | self.minX = self.radius; 9 | self.maxX = pong.width - self.radius 10 | self.minY = pong.wall_width + self.radius 11 | self.maxY = pong.height - pong.wall_width - self.radius 12 | self.speed = (self.maxX - self.minX) / 4; 13 | self.accel = 8; 14 | self.dx = 0 15 | self.dy = 0 16 | 17 | def 
set_position(self, x, y): 18 | self.x_prev = x if not hasattr(self, "x") else self.x 19 | self.y_prev = y if not hasattr(self, "y") else self.y 20 | 21 | self.x = x 22 | self.y = y 23 | self.left = self.x - self.radius 24 | self.top = self.y - self.radius 25 | self.right = self.x + self.radius 26 | self.bottom = self.y + self.radius 27 | 28 | def set_direction(self, dx, dy): 29 | self.dx = dx 30 | self.dy = dy 31 | 32 | def update(self, left_pad, right_pad): 33 | 34 | pos = helper.accelerate(self.x, self.y, 35 | self.dx, self.dy, 36 | self.accel, self.dt); 37 | 38 | if ((pos.dy > 0) and (pos.y > self.maxY)): 39 | pos.y = self.maxY 40 | pos.dy = -pos.dy 41 | elif ((pos.dy < 0) and (pos.y < self.minY)): 42 | pos.y = self.minY 43 | pos.dy = -pos.dy 44 | 45 | paddle = left_pad if (pos.dx < 0) else right_pad; 46 | pt = helper.ballIntercept(self, paddle, pos.nx, pos.ny); 47 | 48 | if pt: 49 | if pt.d == 'left' or pt.d == 'right': 50 | pos.x = pt.x 51 | pos.dx = -pos.dx 52 | elif pt.d == 'top' or pt.d == 'bottom': 53 | pos.y = pt.y 54 | pos.dy = -pos.dy 55 | 56 | if paddle.up: 57 | pos.dy = pos.dy * (0.5 if pos.dy < 0 else 1.5) 58 | elif paddle.down: 59 | pos.dy = pos.dy * (0.5 if pos.dy > 0 else 1.5) 60 | 61 | self.set_position(pos.x, pos.y) 62 | self.set_direction(pos.dx, pos.dy) 63 | 64 | def reset(self, playerNo): 65 | self.set_position((self.maxX + self.minX) / 2, random.uniform(self.minY, self.maxY)) 66 | self.set_direction(self.speed if playerNo == 1 else -self.speed, self.speed) 67 | -------------------------------------------------------------------------------- /week_8/serving/pong_py/pong_py/helper.py: -------------------------------------------------------------------------------- 1 | from collections import namedtuple 2 | 3 | Position = namedtuple("Position", ["nx", "ny", "x", "y", "dx", "dy"]) 4 | Intercept = namedtuple("Intercept", ["x", "y", "d"]) 5 | Rectangle = namedtuple("Rectangle", ["left", "right", "top", "bottom"]) 6 | 7 | class Position(): 8 | def 
__init__(self, nx, ny, x, y, dx, dy): 9 | self.nx = nx 10 | self.ny = ny 11 | self.x = x 12 | self.y = y 13 | self.dx = dx 14 | self.dy = dy 15 | 16 | class Intercept(): 17 | def __init__(self, x, y, d): 18 | self.x = x 19 | self.y = y 20 | self.d = d 21 | 22 | class Rectangle(): 23 | def __init__(self, left, right, top, bottom): 24 | self.left = left 25 | self.right = right 26 | self.top = top 27 | self.bottom = bottom 28 | 29 | def accelerate(x, y, dx, dy, accel, dt): 30 | x2 = x + (dt * dx) + (accel * dt * dt * 0.5); 31 | y2 = y + (dt * dy) + (accel * dt * dt * 0.5); 32 | dx2 = dx + (accel * dt) * (1 if dx > 0 else -1); 33 | dy2 = dy + (accel * dt) * (1 if dy > 0 else -1); 34 | return Position((x2-x), (y2-y), x2, y2, dx2, dy2 ) 35 | 36 | 37 | def intercept(x1, y1, x2, y2, x3, y3, x4, y4, d): 38 | denom = ((y4-y3) * (x2-x1)) - ((x4-x3) * (y2-y1)) 39 | if (denom != 0): 40 | ua = (((x4-x3) * (y1-y3)) - ((y4-y3) * (x1-x3))) / denom 41 | if ((ua >= 0) and (ua <= 1)): 42 | ub = (((x2-x1) * (y1-y3)) - ((y2-y1) * (x1-x3))) / denom 43 | if ((ub >= 0) and (ub <= 1)): 44 | x = x1 + (ua * (x2-x1)) 45 | y = y1 + (ua * (y2-y1)) 46 | return Intercept(x, y, d) 47 | 48 | 49 | def ballIntercept(ball, rect, nx, ny): 50 | pt = None 51 | if (nx < 0): 52 | pt = intercept(ball.x, ball.y, ball.x + nx, ball.y + ny, 53 | rect.right + ball.radius, 54 | rect.top - ball.radius, 55 | rect.right + ball.radius, 56 | rect.bottom + ball.radius, 57 | "right"); 58 | elif (nx > 0): 59 | pt = intercept(ball.x, ball.y, ball.x + nx, ball.y + ny, 60 | rect.left - ball.radius, 61 | rect.top - ball.radius, 62 | rect.left - ball.radius, 63 | rect.bottom + ball.radius, 64 | "left") 65 | 66 | if (not pt): 67 | if (ny < 0): 68 | pt = intercept(ball.x, ball.y, ball.x + nx, ball.y + ny, 69 | rect.left - ball.radius, 70 | rect.bottom + ball.radius, 71 | rect.right + ball.radius, 72 | rect.bottom + ball.radius, 73 | "bottom"); 74 | elif (ny > 0): 75 | pt = intercept(ball.x, ball.y, ball.x + nx, ball.y + ny, 76 | 
rect.left - ball.radius, 77 | rect.top - ball.radius, 78 | rect.right + ball.radius, 79 | rect.top - ball.radius, 80 | "top"); 81 | return pt -------------------------------------------------------------------------------- /week_8/serving/pong_py/pong_py/paddle.py: -------------------------------------------------------------------------------- 1 | import random 2 | import pong_py.helper as helper 3 | from pong_py.helper import Rectangle 4 | 5 | class Paddle(): 6 | STOP = 0 7 | DOWN = 1 8 | UP = 2 9 | 10 | 11 | def __init__(self, rhs, pong): 12 | self.pid = rhs 13 | self.width = 12 14 | self.height = 60 15 | self.dt = pong.dt 16 | self.minY = pong.wall_width 17 | self.maxY = pong.height - pong.wall_width - self.height 18 | self.speed = (self.maxY - self.minY) / 2 19 | self.ai_reaction = 0.1 20 | self.ai_error = 120 21 | self.pong = pong 22 | self.set_direction(0) 23 | self.set_position(pong.width - self.width if rhs else 0, 24 | self.minY + (self.maxY - self.minY) / 2) 25 | self.prediction = None 26 | self.ai_prev_action = 0 27 | 28 | def set_position(self, x, y): 29 | self.x = x 30 | self.y = y 31 | self.left = self.x 32 | self.right = self.left + self.width 33 | self.top = self.y 34 | self.bottom = self.y + self.height 35 | 36 | def set_direction(self, dy): 37 | # Needed for spin calculation 38 | self.up = -dy if dy < 0 else 0 39 | self.down = dy if dy > 0 else 0 40 | 41 | def step(self, action): 42 | if action == self.STOP: 43 | self.stopMovingDown() 44 | self.stopMovingUp() 45 | elif action == self.DOWN: 46 | self.moveDown() 47 | elif action == self.UP: 48 | self.moveUp() 49 | amt = self.down - self.up 50 | if amt != 0: 51 | y = self.y + (amt * self.dt * self.speed) 52 | if y < self.minY: 53 | y = self.minY 54 | elif y > self.maxY: 55 | y = self.maxY 56 | self.set_position(self.x, y) 57 | 58 | def predict(self, ball, dt): 59 | # only re-predict if the ball changed direction, or its been some amount of time since last prediction 60 | if (self.prediction and 
((self.prediction.dx * ball.dx) > 0) and 61 | ((self.prediction.dy * ball.dy) > 0) and 62 | (self.prediction.since < self.ai_reaction)): 63 | self.prediction.since += dt 64 | return 65 | 66 | rect = Rectangle(self.left, self.right, -10000, 10000) 67 | pt = helper.ballIntercept(ball, rect, ball.dx * 10, ball.dy * 10) 68 | 69 | if (pt): 70 | t = self.minY + ball.radius 71 | b = self.maxY + self.height - ball.radius 72 | 73 | while ((pt.y < t) or (pt.y > b)): 74 | if (pt.y < t): 75 | pt.y = t + (t - pt.y) 76 | elif (pt.y > b): 77 | pt.y = t + (b - t) - (pt.y - b) 78 | self.prediction = pt 79 | else: 80 | self.prediction = None 81 | 82 | if self.prediction: 83 | self.prediction.since = 0 84 | self.prediction.dx = ball.dx 85 | self.prediction.dy = ball.dy 86 | self.prediction.radius = ball.radius 87 | self.prediction.exactX = self.prediction.x 88 | self.prediction.exactY = self.prediction.y 89 | closeness = (ball.x - self.right if ball.dx < 0 else self.left - ball.x) / self.pong.width 90 | error = self.ai_error * closeness 91 | self.prediction.y = self.prediction.y + random.uniform(-error, error) 92 | 93 | def ai_step(self, ball): 94 | 95 | if (((ball.x < self.left) and (ball.dx < 0)) or 96 | ((ball.x > self.right) and (ball.dx > 0))): 97 | self.stopMovingUp() 98 | self.stopMovingDown() 99 | return 100 | 101 | self.predict(ball, self.dt) 102 | action = self.ai_prev_action 103 | 104 | if (self.prediction): 105 | # print('prediction') 106 | if (self.prediction.y < (self.top + self.height/2 - 5)): 107 | action = self.UP 108 | # print("moved up") 109 | elif (self.prediction.y > (self.bottom - self.height/2 + 5)): 110 | action = self.DOWN 111 | # print("moved down") 112 | 113 | else: 114 | action = self.STOP 115 | # print("nothing") 116 | self.ai_prev_action = action 117 | return self.step(action) 118 | 119 | def moveUp(self): 120 | self.down = 0 121 | self.up = 1 122 | 123 | def moveDown(self): 124 | self.down = 1 125 | self.up = 0 126 | 127 | def stopMovingDown(self): 128 
| self.down = 0 129 | 130 | def stopMovingUp(self): 131 | self.up = 0 132 | -------------------------------------------------------------------------------- /week_8/serving/pong_py/pong_py/pongjsenv.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | import gym 4 | import gym.spaces 5 | 6 | from pong_py.ball import Ball 7 | from pong_py.paddle import Paddle 8 | 9 | 10 | def transform_state(state): 11 | return state / 500 12 | 13 | 14 | class PongJS(object): 15 | # MDP to 16 | def __init__(self): 17 | self.width = 640 18 | self.height = 480 19 | self.wall_width = 12 20 | self.dt = 0.05 # seconds 21 | #self.dt = 0.01 # seconds 22 | self.left_pad = Paddle(0, self) 23 | self.right_pad = Paddle(1, self) 24 | self.ball = Ball(self) 25 | 26 | def step(self, action): 27 | # do logic for self 28 | self.left_pad.step(action) 29 | self.right_pad.ai_step(self.ball) 30 | 31 | self.ball.update(self.left_pad, self.right_pad) 32 | term, reward = self.terminate() 33 | if term: 34 | self.reset(0 if reward == 1 else 1) 35 | state = self.get_state() 36 | return state, reward, term 37 | 38 | def init(self): 39 | self.reset(0) 40 | 41 | def terminate(self): 42 | if self.ball.left > self.width: 43 | return True, 1 44 | elif self.ball.right < 0: 45 | return True, -1 46 | else: 47 | return False, 0 48 | 49 | def get_state(self): 50 | return np.array([self.left_pad.y, 0, 51 | self.ball.x, self.ball.y, 52 | self.ball.dx, self.ball.dy, 53 | self.ball.x_prev, self.ball.y_prev]) 54 | 55 | def reset(self, player): 56 | self.ball.reset(player) 57 | 58 | 59 | class PongJSEnv(gym.Env): 60 | def __init__(self): 61 | self.env = PongJS() 62 | self.action_space = gym.spaces.Discrete(3) 63 | self.observation_space = gym.spaces.box.Box(low=0, high=1, shape=(8,)) 64 | 65 | @property 66 | def right_pad(self): 67 | return self.env.right_pad 68 | 69 | @property 70 | def left_pad(self): 71 | return self.env.left_pad 72 | 73 | def reset(self): 74 
| self.env.init() 75 | return transform_state(self.env.get_state()) 76 | 77 | def step(self, action): 78 | state, reward, done = self.env.step(action) 79 | return transform_state(state), 1, done, {} 80 | #return state, reward, done, {} 81 | -------------------------------------------------------------------------------- /week_8/serving/pong_py/setup.py: -------------------------------------------------------------------------------- 1 | from setuptools import setup, find_packages, Distribution 2 | 3 | setup(name='pong_py', 4 | packages=find_packages()) 5 | -------------------------------------------------------------------------------- /week_8/serving/pong_web_server.py: -------------------------------------------------------------------------------- 1 | import cgi 2 | from http.server import BaseHTTPRequestHandler, HTTPServer 3 | import json 4 | import requests 5 | import socketserver 6 | import subprocess 7 | import threading 8 | 9 | from ray.rllib.utils.policy_client import PolicyClient 10 | 11 | 12 | # Check that the required port isn't already in use. 13 | try: 14 | requests.get('http://localhost:3000') 15 | except: 16 | pass 17 | else: 18 | raise Exception('The port 3000 is still in use (perhaps from a previous run of this notebook. 
' 19 | 'You will need to kill that process before proceeding, e.g., by running ' 20 | '"subprocess.call([\'ray\', \'stop\'])" in a new cell and restarting this notebook.') 21 | 22 | 23 | client = PolicyClient("http://localhost:8900") 24 | 25 | 26 | def make_handler_class(agent): 27 | """This function is used to define a custom handler using the policy.""" 28 | 29 | class PolicyHandler(BaseHTTPRequestHandler): 30 | def __init__(self, *args, **kwargs): 31 | BaseHTTPRequestHandler.__init__(self, *args, **kwargs) 32 | 33 | def end_headers(self): 34 | self.send_header('Access-Control-Allow-Origin', '*') 35 | self.send_header('Access-Control-Allow-Methods', '*') 36 | self.send_header('Access-Control-Allow-Headers', 'Content-Type') 37 | BaseHTTPRequestHandler.end_headers(self) 38 | 39 | def do_OPTIONS(self): 40 | self.send_response(200, 'ok') 41 | self.end_headers() 42 | 43 | def do_POST(self): 44 | """This method receives the state of the game and returns an action.""" 45 | length = int(self.headers.get_all('content-length')[0]) 46 | post_body = cgi.parse_qs(self.rfile.read(length), keep_blank_values=1) 47 | print("Processing request", post_body) 48 | req = json.loads(list(post_body.keys())[0].decode("utf-8")) 49 | if "command" in req: 50 | if req["command"] == "start_episode": 51 | resp = client.start_episode(training_enabled=False) 52 | elif req["command"] == "end_episode": 53 | resp = client.end_episode(req["episode_id"], [0] * 8) 54 | elif req["command"] == "log_returns": 55 | if req["playerNo"] == 0: 56 | client.log_returns(req["episode_id"], req["reward"]) 57 | resp = "OK" 58 | else: 59 | raise ValueError("Unknown command") 60 | else: 61 | action = client.get_action(req["episode_id"], req["observation"]) 62 | resp = {"output": int(action)} 63 | 64 | self.send_response(200) 65 | self.send_header('Content-type', 'json') 66 | self.end_headers() 67 | 68 | self.wfile.write(json.dumps(resp).encode('ascii')) 69 | 70 | return PolicyHandler 71 | 72 | 73 | if __name__ == 
"__main__": 74 | handler = make_handler_class(None) 75 | httpd = HTTPServer(('', 3000), handler) 76 | print("Starting web server on port 3000.") 77 | httpd.serve_forever() 78 | -------------------------------------------------------------------------------- /week_8/serving/simple_policy_server.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | from __future__ import division 3 | from __future__ import print_function 4 | 5 | import argparse 6 | import os 7 | 8 | from gym import spaces 9 | import numpy as np 10 | 11 | import ray 12 | from ray.rllib.agents.dqn import DQNAgent 13 | from ray.rllib.agents.pg import PGAgent 14 | from ray.rllib.env.serving_env import ServingEnv 15 | from ray.rllib.utils.policy_server import PolicyServer 16 | from ray.tune.logger import pretty_print 17 | from ray.tune.registry import register_env 18 | 19 | SERVER_ADDRESS = "localhost" 20 | SERVER_PORT = 8900 21 | 22 | parser = argparse.ArgumentParser() 23 | parser.add_argument("--action-size", type=int, required=True) 24 | parser.add_argument("--observation-size", type=int, required=True) 25 | parser.add_argument("--checkpoint-file", type=str, required=True) 26 | parser.add_argument("--run", type=str, required=True) 27 | 28 | 29 | class SimpleServing(ServingEnv): 30 | def __init__(self, config): 31 | ServingEnv.__init__( 32 | self, spaces.Discrete(config["action_size"]), 33 | spaces.Box( 34 | low=-10, high=10, 35 | shape=(config["observation_size"],), 36 | dtype=np.float32)) 37 | 38 | def run(self): 39 | print("Starting policy server at {}:{}".format(SERVER_ADDRESS, 40 | SERVER_PORT)) 41 | server = PolicyServer(self, SERVER_ADDRESS, SERVER_PORT) 42 | server.serve_forever() 43 | 44 | 45 | if __name__ == "__main__": 46 | args = parser.parse_args() 47 | ray.init() 48 | register_env("srv", lambda config: SimpleServing(config)) 49 | 50 | if args.run == "DQN": 51 | agent = DQNAgent( 52 | env="srv", 53 | config={ 54 | 
# Use a single process to avoid needing a load balancer 55 | "num_workers": 0, 56 | # Configure the agent to run short iterations for debugging 57 | "exploration_fraction": 0.01, 58 | "learning_starts": 100, 59 | "timesteps_per_iteration": 200, 60 | "env_config": { 61 | "observation_size": args.observation_size, 62 | "action_size": args.action_size, 63 | }, 64 | }) 65 | elif args.run == "PG": 66 | agent = PGAgent( 67 | env="srv", 68 | config={ 69 | "num_workers": 0, 70 | "env_config": { 71 | "observation_size": args.observation_size, 72 | "action_size": args.action_size, 73 | }, 74 | }) 75 | 76 | # Attempt to restore from checkpoint if possible. 77 | if os.path.exists(args.checkpoint_file): 78 | checkpoint_file = open(args.checkpoint_file).read() 79 | print("Restoring from checkpoint path", checkpoint_file) 80 | agent.restore(checkpoint_file) 81 | 82 | # Serving and training loop 83 | while True: 84 | print(pretty_print(agent.train())) 85 | checkpoint_file = agent.save() 86 | print("Last checkpoint", checkpoint_file) 87 | with open(args.checkpoint_file, "w") as f: 88 | f.write(checkpoint_file) 89 | -------------------------------------------------------------------------------- /week_8/web.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stephenoffer/ray_course/3ad48e41633a83113c91cbd140c48bc675f3b079/week_8/web.png -------------------------------------------------------------------------------- /week_8/week_8_exercise_1.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "hideCode": false, 7 | "hidePrompt": false 8 | }, 9 | "source": [ 10 | "# Week 8: Exercise 1 - Markov Decision Processes\n", 11 | "\n", 12 | "**GOAL:** The goal of the exercise is to introduce the Markov Decision Process abstraction and to show how to use Markov Decision Processes in Python.\n", 13 | "\n", 14 | 
"**The key abstraction in reinforcement learning is the Markov decision process (MDP).** An MDP models sequential interactions with an external environment. It consists of the following:\n", 15 | "- a **state space**\n", 16 | "- a set of **actions**\n", 17 | "- a **transition function** which describes the probability of being in a state $s'$ at time $t+1$ given that the MDP was in state $s$ at time $t$ and action $a$ was taken\n", 18 | "- a **reward function**, which determines the reward received at time $t$\n", 19 | "- a **discount factor** $\\gamma$\n", 20 | "\n", 21 | "More details are available [here](https://en.wikipedia.org/wiki/Markov_decision_process).\n", 22 | "\n", 23 | "**NOTE:** Reinforcement learning algorithms are often applied to problems that don't strictly fit into the MDP framework. In particular, situations in which the state of the environment is not fully observed lead to violations of the MDP assumption. Nevertheless, RL algorithms can be applied anyway.\n", 24 | "\n", 25 | "## Policies\n", 26 | "\n", 27 | "A **policy** is a function that takes in a **state** and returns an **action**. A policy may be stochastic (i.e., it may sample from a probability distribution) or it can be deterministic.\n", 28 | "\n", 29 | "The **goal of reinforcement learning** is to learn a **policy** for maximizing the cumulative reward in an MDP. That is, we wish to find a policy $\\pi$ which solves the following optimization problem\n", 30 | "\n", 31 | "\\begin{equation}\n", 32 | "\\arg\\max_{\\pi} \\sum_{t=1}^T \\gamma^t R_t(\\pi),\n", 33 | "\\end{equation}\n", 34 | "\n", 35 | "where $T$ is the number of steps taken in the MDP (this is a random variable and may depend on $\\pi$) and $R_t$ is the reward received at time $t$ (also a random variable which depends on $\\pi$).\n", 36 | "\n", 37 | "A number of algorithms are available for solving reinforcement learning problems. 
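The discounted objective above is straightforward to compute for a concrete, finite reward sequence. A minimal sketch (not part of the course material), following the text's convention that the exponent starts at $t = 1$:

```python
def discounted_return(rewards, gamma):
    """Compute sum_{t=1}^{T} gamma**t * R_t, matching the objective above.

    `rewards` holds R_1, ..., R_T; `gamma` is the discount factor.
    """
    return sum(gamma ** t * r for t, r in enumerate(rewards, start=1))

# Three rewards of 1 with gamma = 0.5: 0.5 + 0.25 + 0.125 = 0.875
print(discounted_return([1.0, 1.0, 1.0], 0.5))
```

Note that many references instead use $\gamma^{t-1}$ so the first reward is undiscounted; the two conventions differ only by a constant factor of $\gamma$.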
Several of the most widely known are [value iteration](https://en.wikipedia.org/wiki/Markov_decision_process#Value_iteration), [policy iteration](https://en.wikipedia.org/wiki/Markov_decision_process#Policy_iteration), and [Q learning](https://en.wikipedia.org/wiki/Q-learning).\n", 38 | "\n", 39 | "## RL in Python\n", 40 | "\n", 41 | "The `gym` Python module provides MDP interfaces to a variety of simulators. For example, the CartPole environment interfaces with a simple simulator which simulates the physics of balancing a pole on a cart. The CartPole problem is described at https://gym.openai.com/envs/CartPole-v0. This example fits into the MDP framework as follows.\n", 42 | "- The **state** consists of the position and velocity of the cart as well as the angle and angular velocity of the pole that is balancing on the cart.\n", 43 | "- The **actions** are to decrease or increase the cart's velocity by one unit.\n", 44 | "- The **transition function** is deterministic and is determined by simulating physical laws.\n", 45 | "- The **reward function** is a constant 1 as long as the pole is upright, and 0 once the pole has fallen over. Therefore, maximizing the reward means balancing the pole for as long as possible.\n", 46 | "- The **discount factor** in this case can be taken to be 1.\n", 47 | "\n", 48 | "More information about the `gym` Python module is available at https://gym.openai.com/." 49 | ] 50 | }, 51 | { 52 | "cell_type": "code", 53 | "execution_count": null, 54 | "metadata": {}, 55 | "outputs": [], 56 | "source": [ 57 | "from __future__ import absolute_import\n", 58 | "from __future__ import division\n", 59 | "from __future__ import print_function\n", 60 | "\n", 61 | "import gym\n", 62 | "import numpy as np" 63 | ] 64 | }, 65 | { 66 | "cell_type": "markdown", 67 | "metadata": {}, 68 | "source": [ 69 | "The code below illustrates how to create and manipulate MDPs in Python. An MDP can be created by calling `gym.make`. 
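Of the algorithms named above, value iteration is the simplest to sketch for a tabular MDP. The two-state, two-action MDP below is invented purely for illustration; `P[s][a]` lists `(probability, next_state, reward)` triples:

```python
# Toy two-state, two-action MDP (hypothetical, for illustration only).
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
}
GAMMA = 0.9

def q_value(P, V, s, a, gamma):
    # Expected one-step reward plus discounted value of the successor state.
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])

def value_iteration(P, gamma, tol=1e-10):
    V = {s: 0.0 for s in P}
    while True:
        # Bellman optimality backup: take the best action's Q-value in each state.
        V_new = {s: max(q_value(P, V, s, a, gamma) for a in P[s]) for s in P}
        if max(abs(V_new[s] - V[s]) for s in P) < tol:
            return V_new
        V = V_new

V = value_iteration(P, GAMMA)
# The greedy policy picks the action with the highest Q-value in each state.
policy = {s: max(P[s], key=lambda a: q_value(P, V, s, a, GAMMA)) for s in P}
print(policy)  # action 1 is greedy in both states for this toy MDP
```

Here action 1 in state 1 yields reward 1 forever, so $V(1) = 1/(1-\gamma) = 10$, and the greedy policy chooses action 1 in both states.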
Gym environments are identified by names like `CartPole-v0`. A **catalog of built-in environments** can be found at https://gym.openai.com/envs." 70 | ] 71 | }, 72 | { 73 | "cell_type": "code", 74 | "execution_count": null, 75 | "metadata": { 76 | "hideCode": false, 77 | "hidePrompt": false 78 | }, 79 | "outputs": [], 80 | "source": [ 81 | "env = gym.make('CartPole-v0')\n", 82 | "print('Created env:', env)" 83 | ] 84 | }, 85 | { 86 | "cell_type": "markdown", 87 | "metadata": {}, 88 | "source": [ 89 | "Reset the state of the MDP by calling `env.reset()`. This call returns the initial state of the MDP." 90 | ] 91 | }, 92 | { 93 | "cell_type": "code", 94 | "execution_count": null, 95 | "metadata": { 96 | "hideCode": false, 97 | "hidePrompt": false 98 | }, 99 | "outputs": [], 100 | "source": [ 101 | "state = env.reset()\n", 102 | "print('The starting state is:', state)" 103 | ] 104 | }, 105 | { 106 | "cell_type": "markdown", 107 | "metadata": {}, 108 | "source": [ 109 | "The `env.step` method takes an action (in the case of the CartPole environment, the appropriate actions are 0 or 1, for moving left or right). It returns a tuple of four things:\n", 110 | "1. the new state of the environment\n", 111 | "2. a reward\n", 112 | "3. a boolean indicating whether the simulation has finished\n", 113 | "4. a dictionary of miscellaneous extra information" 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": null, 119 | "metadata": { 120 | "hideCode": false, 121 | "hidePrompt": false 122 | }, 123 | "outputs": [], 124 | "source": [ 125 | "# Simulate taking an action in the environment. 
Appropriate actions for\n", 126 | "# the CartPole environment are 0 and 1 (for moving left and right).\n", 127 | "action = 0\n", 128 | "state, reward, done, info = env.step(action)\n", 129 | "print(state, reward, done, info)" 130 | ] 131 | }, 132 | { 133 | "cell_type": "markdown", 134 | "metadata": {}, 135 | "source": [ 136 | "A **rollout** is a simulation of a policy in an environment. It alternates between choosing actions (using some policy) and taking those actions in the environment.\n", 137 | "\n", 138 | "The code below performs a rollout in a given environment. It takes **random actions** until the simulation has finished and returns the cumulative reward." 139 | ] 140 | }, 141 | { 142 | "cell_type": "code", 143 | "execution_count": null, 144 | "metadata": {}, 145 | "outputs": [], 146 | "source": [ 147 | "def random_rollout(env):\n", 148 | " state = env.reset()\n", 149 | " \n", 150 | " done = False\n", 151 | " cumulative_reward = 0\n", 152 | "\n", 153 | " # Keep looping as long as the simulation has not finished.\n", 154 | " while not done:\n", 155 | " # Choose a random action (either 0 or 1).\n", 156 | " action = np.random.choice([0, 1])\n", 157 | " \n", 158 | " # Take the action in the environment.\n", 159 | " state, reward, done, _ = env.step(action)\n", 160 | " \n", 161 | " # Update the cumulative reward.\n", 162 | " cumulative_reward += reward\n", 163 | " \n", 164 | " # Return the cumulative reward.\n", 165 | " return cumulative_reward\n", 166 | " \n", 167 | "reward = random_rollout(env)\n", 168 | "print(reward)\n", 169 | "reward = random_rollout(env)\n", 170 | "print(reward)" 171 | ] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "metadata": {}, 176 | "source": [ 177 | "**EXERCISE:** Finish implementing the `rollout_policy` function below, which should take an environment *and* a policy. The *policy* is a function that takes in a *state* and returns an *action*.
The main difference is that instead of choosing a **random action**, the action should be chosen **with the policy** (as a function of the state)." 178 | ] 179 | }, 180 | { 181 | "cell_type": "code", 182 | "execution_count": null, 183 | "metadata": {}, 184 | "outputs": [], 185 | "source": [ 186 | "def rollout_policy(env, policy):\n", 187 | " state = env.reset()\n", 188 | " \n", 189 | " done = False\n", 190 | " cumulative_reward = 0\n", 191 | "\n", 192 | " # EXERCISE: Fill out this function by copying the 'random_rollout' function\n", 193 | " # and then modifying it to choose the action using the policy.\n", 194 | " raise NotImplementedError\n", 195 | "\n", 196 | " # Return the cumulative reward.\n", 197 | " return cumulative_reward\n", 198 | "\n", 199 | "def sample_policy1(state):\n", 200 | " return 0 if state[0] < 0 else 1\n", 201 | "\n", 202 | "def sample_policy2(state):\n", 203 | " return 1 if state[0] < 0 else 0\n", 204 | "\n", 205 | "reward1 = np.mean([rollout_policy(env, sample_policy1) for _ in range(100)])\n", 206 | "reward2 = np.mean([rollout_policy(env, sample_policy2) for _ in range(100)])\n", 207 | "\n", 208 | "print('The first sample policy got an average reward of {}.'.format(reward1))\n", 209 | "print('The second sample policy got an average reward of {}.'.format(reward2))\n", 210 | "\n", 211 | "assert 5 < reward1 < 15, ('Make sure that rollout_policy computes the action '\n", 212 | " 'by applying the policy to the state.')\n", 213 | "assert 25 < reward2 < 35, ('Make sure that rollout_policy computes the action '\n", 214 | " 'by applying the policy to the state.')" 215 | ] 216 | }, 217 | { 218 | "cell_type": "code", 219 | "execution_count": null, 220 | "metadata": {}, 221 | "outputs": [], 222 | "source": [] 223 | } 224 | ], 225 | "metadata": { 226 | "hide_code_all_hidden": false, 227 | "kernelspec": { 228 | "display_name": "Python 3", 229 | "language": "python", 230 | "name": "python3" 231 | }, 232 | "language_info": { 233 | "codemirror_mode": { 234 
| "name": "ipython", 235 | "version": 3 236 | }, 237 | "file_extension": ".py", 238 | "mimetype": "text/x-python", 239 | "name": "python", 240 | "nbconvert_exporter": "python", 241 | "pygments_lexer": "ipython3", 242 | "version": "3.6.8" 243 | } 244 | }, 245 | "nbformat": 4, 246 | "nbformat_minor": 2 247 | } 248 | -------------------------------------------------------------------------------- /week_8/week_8_exercise_2.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Week 8: Exercise 2 - Proximal Policy Optimization\n", 8 | "\n", 9 | "**GOAL:** The goal of this exercise is to demonstrate how to use the proximal policy optimization (PPO) algorithm.\n", 10 | "\n", 11 | "To understand how to use **RLlib**, see the documentation at http://rllib.io.\n", 12 | "\n", 13 | "PPO is described in detail in https://arxiv.org/abs/1707.06347. It is a variant of Trust Region Policy Optimization (TRPO) described in https://arxiv.org/abs/1502.05477\n", 14 | "\n", 15 | "PPO works in two phases. In one phase, a large number of rollouts are performed (in parallel). The rollouts are then aggregated on the driver and a surrogate optimization objective is defined based on those rollouts. We then use SGD to find the policy that maximizes that objective with a penalty term for diverging too much from the current policy.\n", 16 | "\n", 17 | "![ppo](https://raw.githubusercontent.com/ucbrise/risecamp/risecamp2018/ray/tutorial/rllib_exercises/ppo.png)\n", 18 | "\n", 19 | "**NOTE:** The SGD optimization step is best performed in a data-parallel manner over multiple GPUs. This is exposed through the `num_gpus` field of the `config` dictionary (for this to work, you must be using a machine that has GPUs)." 
20 | ] 21 | }, 22 | { 23 | "cell_type": "code", 24 | "execution_count": null, 25 | "metadata": {}, 26 | "outputs": [], 27 | "source": [ 28 | "from __future__ import absolute_import\n", 29 | "from __future__ import division\n", 30 | "from __future__ import print_function\n", 31 | "\n", 32 | "import gym\n", 33 | "import ray\n", 34 | "from ray.rllib.agents.ppo import PPOAgent, DEFAULT_CONFIG\n", 35 | "from ray.tune.logger import pretty_print" 36 | ] 37 | }, 38 | { 39 | "cell_type": "markdown", 40 | "metadata": {}, 41 | "source": [ 42 | "Start up Ray. This must be done before we instantiate any RL agents." 43 | ] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": null, 48 | "metadata": {}, 49 | "outputs": [], 50 | "source": [ 51 | "ray.init(ignore_reinit_error=True)" 52 | ] 53 | }, 54 | { 55 | "cell_type": "markdown", 56 | "metadata": {}, 57 | "source": [ 58 | "Instantiate a PPOAgent object. We pass in a config object that specifies how the network and training procedure should be configured. Some of the parameters are the following.\n", 59 | "\n", 60 | "- `num_workers` is the number of actors that the agent will create. This determines the degree of parallelism that will be used.\n", 61 | "- `num_sgd_iter` is the number of epochs of SGD (passes through the data) that will be used to optimize the PPO surrogate objective at each iteration of PPO.\n", 62 | "- `sgd_minibatch_size` is the SGD batch size that will be used to optimize the PPO surrogate objective.\n", 63 | "- `model` contains a dictionary of parameters describing the neural net used to parameterize the policy. The `fcnet_hiddens` parameter is a list of the sizes of the hidden layers." 
64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": null, 69 | "metadata": {}, 70 | "outputs": [], 71 | "source": [ 72 | "config = DEFAULT_CONFIG.copy()\n", 73 | "config['num_workers'] = 3\n", 74 | "config['num_sgd_iter'] = 30\n", 75 | "config['sgd_minibatch_size'] = 128\n", 76 | "config['model']['fcnet_hiddens'] = [100, 100]\n", 77 | "config['num_cpus_per_worker'] = 0 # This avoids running out of resources in the notebook environment when this cell is re-executed\n", 78 | "\n", 79 | "agent = PPOAgent(config, 'CartPole-v0')" 80 | ] 81 | }, 82 | { 83 | "cell_type": "markdown", 84 | "metadata": {}, 85 | "source": [ 86 | "Train the policy on the `CartPole-v0` environment for 2 steps. The CartPole problem is described at https://gym.openai.com/envs/CartPole-v0.\n", 87 | "\n", 88 | "**EXERCISE:** Inspect how well the policy is doing by looking for the lines that say something like\n", 89 | "\n", 90 | "```\n", 91 | "total reward is 22.3215974777\n", 92 | "trajectory length mean is 21.3215974777\n", 93 | "```\n", 94 | "\n", 95 | "This indicates how much reward the policy is receiving and how many time steps of the environment the policy ran. The maximum possible reward for this problem is 200. The reward and trajectory length are very close because the agent receives a reward of one for every time step that it survives (however, that is specific to this environment)." 96 | ] 97 | }, 98 | { 99 | "cell_type": "code", 100 | "execution_count": null, 101 | "metadata": {}, 102 | "outputs": [], 103 | "source": [ 104 | "for i in range(2):\n", 105 | " result = agent.train()\n", 106 | " print(pretty_print(result))" 107 | ] 108 | }, 109 | { 110 | "cell_type": "markdown", 111 | "metadata": {}, 112 | "source": [ 113 | "**EXERCISE:** The current network and training configuration are too large and heavy-duty for a simple problem like CartPole. 
Modify the configuration to use a smaller network and to speed up the optimization of the surrogate objective (fewer SGD iterations and a larger batch size should help)." 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": null, 119 | "metadata": {}, 120 | "outputs": [], 121 | "source": [ 122 | "config = DEFAULT_CONFIG.copy()\n", 123 | "config['num_workers'] = 3\n", 124 | "config['num_sgd_iter'] = 30\n", 125 | "config['sgd_minibatch_size'] = 128\n", 126 | "config['model']['fcnet_hiddens'] = [100, 100]\n", 127 | "config['num_cpus_per_worker'] = 0\n", 128 | "\n", 129 | "agent = PPOAgent(config, 'CartPole-v0')" 130 | ] 131 | }, 132 | { 133 | "cell_type": "markdown", 134 | "metadata": {}, 135 | "source": [ 136 | "**EXERCISE:** Train the agent and try to get a reward of 200. If it's training too slowly you may need to modify the config above to use fewer hidden units, a larger `sgd_minibatch_size`, a smaller `num_sgd_iter`, or a larger `num_workers`.\n", 137 | "\n", 138 | "This should take around 20 or 30 training iterations." 139 | ] 140 | }, 141 | { 142 | "cell_type": "code", 143 | "execution_count": null, 144 | "metadata": {}, 145 | "outputs": [], 146 | "source": [ 147 | "for i in range(2):\n", 148 | " result = agent.train()\n", 149 | " print(pretty_print(result))" 150 | ] 151 | }, 152 | { 153 | "cell_type": "markdown", 154 | "metadata": {}, 155 | "source": [ 156 | "Checkpoint the current model. The call to `agent.save()` returns the path to the checkpointed model and can be used later to restore the model." 
157 | ] 158 | }, 159 | { 160 | "cell_type": "code", 161 | "execution_count": null, 162 | "metadata": {}, 163 | "outputs": [], 164 | "source": [ 165 | "checkpoint_path = agent.save()\n", 166 | "print(checkpoint_path)" 167 | ] 168 | }, 169 | { 170 | "cell_type": "markdown", 171 | "metadata": {}, 172 | "source": [ 173 | "Now let's use the trained policy to make predictions.\n", 174 | "\n", 175 | "**NOTE:** Here we are loading the trained policy in the same process, but in practice, this would often be done in a different process (probably on a different machine)." 176 | ] 177 | }, 178 | { 179 | "cell_type": "code", 180 | "execution_count": null, 181 | "metadata": {}, 182 | "outputs": [], 183 | "source": [ 184 | "trained_config = config.copy()\n", 185 | "\n", 186 | "test_agent = PPOAgent(trained_config, 'CartPole-v0')\n", 187 | "test_agent.restore(checkpoint_path)" 188 | ] 189 | }, 190 | { 191 | "cell_type": "markdown", 192 | "metadata": {}, 193 | "source": [ 194 | "Now use the trained policy to act in an environment. The key line is the call to `test_agent.compute_action(state)` which uses the trained policy to choose an action.\n", 195 | "\n", 196 | "**EXERCISE:** Verify that the reward received roughly matches up with the reward printed in the training logs." 
197 | ] 198 | }, 199 | { 200 | "cell_type": "code", 201 | "execution_count": null, 202 | "metadata": {}, 203 | "outputs": [], 204 | "source": [ 205 | "env = gym.make('CartPole-v0')\n", 206 | "state = env.reset()\n", 207 | "done = False\n", 208 | "cumulative_reward = 0\n", 209 | "\n", 210 | "while not done:\n", 211 | " action = test_agent.compute_action(state)\n", 212 | " state, reward, done, _ = env.step(action)\n", 213 | " cumulative_reward += reward\n", 214 | "\n", 215 | "print(cumulative_reward)" 216 | ] 217 | }, 218 | { 219 | "cell_type": "markdown", 220 | "metadata": {}, 221 | "source": [ 222 | "## Visualize results with TensorBoard\n", 223 | "\n", 224 | "**EXERCISE**: Finally, you can visualize your training results using TensorBoard. To do this, open a new terminal in JupyterLab using the \"+\" button, and run:\n", 225 | " \n", 226 | "`$ tensorboard --logdir=~/ray_results --host=0.0.0.0`\n", 227 | "\n", 228 | "Then open your browser to the address printed (or change the current URL to go to port 6006). Check the \"episode_reward_mean\" learning curve of the PPO agent. Toggle the horizontal axis between the \"STEPS\" and \"RELATIVE\" views to compare efficiency in number of timesteps vs. real time." 229 | ] 230 | }, 231 | { 232 | "cell_type": "code", 233 | "execution_count": null, 234 | "metadata": {}, 235 | "outputs": [], 236 | "source": [] 237 | } 238 | ], 239 | "metadata": { 240 | "kernelspec": { 241 | "display_name": "Python 3", 242 | "language": "python", 243 | "name": "python3" 244 | }, 245 | "language_info": { 246 | "codemirror_mode": { 247 | "name": "ipython", 248 | "version": 3 249 | }, 250 | "file_extension": ".py", 251 | "mimetype": "text/x-python", 252 | "name": "python", 253 | "nbconvert_exporter": "python", 254 | "pygments_lexer": "ipython3", 255 | "version": "3.6.8" 256 | } 257 | }, 258 | "nbformat": 4, 259 | "nbformat_minor": 2 260 | } 261 | --------------------------------------------------------------------------------
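Both notebooks above lean on the same Gym-style interaction loop: call `reset()` to get an initial state, then call `step(action)` until `done` is true, accumulating the per-step rewards. The minimal sketch below illustrates that loop outside the notebooks. It is not part of the course code: `ToyEnv` is an invented stand-in for `gym.make('CartPole-v0')` so the sketch needs no Gym install, and `rollout` simply mirrors the structure of `random_rollout` with the policy passed in as a function.

```python
# Illustrative only: a tiny stand-in environment (NOT part of the course code)
# that mimics the Gym reset/step API, plus the rollout loop used throughout
# these notebooks.

class ToyEnv:
    """Counts down from 5; reward of 1.0 per step, episode ends at 0."""

    def reset(self):
        self.steps_left = 5
        return self.steps_left  # the "state" is just the counter

    def step(self, action):
        self.steps_left -= 1
        done = self.steps_left == 0
        # Same 4-tuple shape as Gym: (state, reward, done, info)
        return self.steps_left, 1.0, done, {}

def rollout(env, policy):
    state = env.reset()
    done = False
    cumulative_reward = 0.0
    while not done:
        action = policy(state)  # the policy maps a state to an action
        state, reward, done, _ = env.step(action)
        cumulative_reward += reward
    return cumulative_reward

print(rollout(ToyEnv(), lambda s: 0))  # prints 5.0
```

Any object exposing the same `reset`/`step` signature can be swapped in, which is why one rollout function serves both the random policy and the hand-written `sample_policy1`/`sample_policy2` in the exercises.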