├── LICENSE
├── README.md
├── package.json
└── turbo.js


/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2016 minxomat

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# [![](https://i.imgur.com/rb8oPur.png)](http://turbo.github.io)

turbo.js is a small library that makes it easier to perform complex calculations that can be done in parallel. The actual calculation performed (the *kernel* executed) uses the GPU for execution. This enables you to work on an array of values all at once.

turbo.js is compatible with all browsers (even IE when not using ES6 template strings) and most desktop and mobile GPUs.

**For a live demo and short intro, please visit [turbo.github.io](http://turbo.github.io).**

![](https://i.imgur.com/BiiQSzP.png)

### Example 1

For this example, which can also be found at the aforementioned website, we are going to perform a simple calculation on a big-ish array of values.

First, include turbo.js in your site:

```html

```

or pull [`turbojs`](https://www.npmjs.com/package/turbojs) via npm to use it in your project.

turbo.js exposes only two functions that can be called by your code, `alloc` and `run`. Both are contained within the `turbojs` object. If this object is not initialized, something went wrong. So the first step is to check for turbo.js support. You can optionally check for exceptions thrown by turbo.js, which will provide further details on the error.

```js
if (turbojs) {
  // yay
}
```

Now we need some memory. Because data has to be transferred between system memory and GPU memory, we want to reduce the overhead this copy operation creates. To do this, turbo.js provides the `alloc` function. This will reserve memory on the GPU and in your browser. JavaScript can access and change contents of allocated memory by accessing the `.data` sub-array of a variable that contains allocated memory.

For both turbo.js and JavaScript, the allocated memory is strictly typed and represents a one-dimensional array of 32-bit IEEE floating-point values. Thus, the `.data` sub-array is a standard JavaScript [`Float32Array`](https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Float32Array) object. After allocation, you can interact with this array however you want, except for changing its size. Doing so will result in undefined behavior.

```js
if (turbojs) {
  var foo = turbojs.alloc(1e6);
}
```

We now have room for 1,000,000 elements (`foo.length`); the underlying `.data` array may be larger, because turbo.js rounds the allocation up. Let's fill it with some data.
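
Because `.data` is an ordinary `Float32Array`, the usual typed-array rules apply to it. The sketch below uses a plain `Float32Array` as a stand-in for `foo.data` (an assumption, so that it runs without a GPU or turbo.js loaded; the variable names are illustrative only). It shows two things worth keeping in mind: values are rounded to 32-bit floats on write, and `subarray` returns a view on the same memory rather than a copy.

```js
// Stand-in for foo.data; with turbo.js loaded this would come from turbojs.alloc().
var data = new Float32Array(8);

// Values are rounded to 32-bit floats when written.
data[0] = 0.1;
var storedExactly = (data[0] === 0.1);                 // false: 0.1 has no exact 32-bit form
var storedAsFloat32 = (data[0] === Math.fround(0.1));  // true

// subarray() creates a view that shares memory with the original array.
var head = data.subarray(0, 4);
head[1] = 42;
var sharedWrite = data[1];                             // 42, written through the view
```

This is why printing `foo.data.subarray(0, 5)` is cheap even for large allocations: no data is copied.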

```js
if (turbojs) {
  var foo = turbojs.alloc(1e6);

  for (var i = 0; i < 1e6; i++) foo.data[i] = i;

  // print first five elements
  console.log(foo.data.subarray(0, 5));
}
```

Running this, the console should now display `[0, 1, 2, 3, 4]`. Now for our simple calculation: multiplying each value by `nFactor` and printing the results:

```js
if (turbojs) {
  var foo = turbojs.alloc(1e6);
  var nFactor = 4;

  for (var i = 0; i < 1e6; i++) foo.data[i] = i;

  turbojs.run(foo, `void main(void) {
    commit(read() * ${nFactor}.);
  }`);

  console.log(foo.data.subarray(0, 5));
}
```

The console should now display `[0, 4, 8, 12, 16]`. That was easy, wasn't it? Let's break down what we've done:

- `turbojs.run`'s first parameter is the previously allocated memory. The second parameter is the code that will be executed for each value in the array.
- The code is written in GLSL, a C-like shading language. If you are not familiar with it, there is some good documentation on the internet. If you know C (or JS and know what types are), you'll pick it up in no time.
- The kernel code here consists just of the main function, which takes no parameters. However, kernels can have any number of functions (except zero).
- The `read()` function reads the current input value.
- `${nFactor}` is substituted by the value of `nFactor`. Since GLSL expects numerical constant expressions to be typed, we append a `.` to mark it as a float. Otherwise the GLSL compiler will throw a type error.
- `commit()` writes the result back to memory. You can `commit` from any function, but it is good practice to do so from the last line of the `main` function.

### Example 2: Working with vectors

That's great. But sometimes you need to return more than a single value from each operation.
Well, it might not look like it, but we've been doing that all along. Both `commit` and `read` actually work on 4-dimensional vectors. To break it down:

- `vec4 read()` returns the GLSL data type `vec4`.
- `void commit(vec4)` takes a `vec4` and writes it to memory.

A `vec4` is basically just an array. You could say it's akin to `foobar = {r:0, g:0, b:0, a:0}` in JS, but it's much more similar to JavaScript [`SIMD`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SIMD)'s `Float32x4`.

The nice thing about GLSL is that all operations are overloaded so that they can work with vectors without the need to deal with each element individually, so

```GLSL
commit(vec4(read().r * 4., read().g * 4., read().b * 4., read().a * 4.));
```

is equivalent to

```GLSL
commit(read() * 4.);
```

Neat, huh? Of course there are other types of vectors in GLSL, namely `vec2` and `vec3`. If you create a bigger vector and supply a smaller one as a parameter, GLSL will automatically align the values:

```GLSL
vec2 foo = vec2(1., 2.);

commit(vec4(foo.r, foo.g, 0., 0.));

// is the same as

commit(vec4(foo.rg, 0., 0.));
```

So we'll use that right now. If you visit the website mentioned above, you will get results from a simple benchmark comparing JS to JS + turbo.js. The benchmark calculates random points on a Mandelbrot fractal. Let's break down what happens there, starting with the JavaScript code:

For each run, the first two values of each `vec4` of the allocated memory are filled with random coordinates as the input for the fractal function:

```js
for (var i = 0; i < sampleSize; i += 4) {
  testData.data[i] = Math.random();
  testData.data[i + 1] = Math.random();
}
```

For each operation, the result will be a greyscale color value.
That will be written to the third (i.e. `b`) component of each vector:

```js
function testJS() {
  for (var i = 0; i < sampleSize; i += 4) {
    var x0 = -2.5 + (3.5 * testData.data[i]);
    var y0 = testData.data[i + 1], x = 0, y = 0, xt = 0, c = 0;

    for (var n = 0; n < sampleIterations; n++) {
      if (x * x + y * y >= 2 * 2) break;

      xt = x * x - y * y + x0;
      y = 2 * x * y + y0;
      x = xt;
      c++;
    }

    var col = c / sampleIterations;

    testData.data[i + 2] = col;
  }
}
```

The fractal is calculated to the iteration depth of `sampleIterations`. Now let's take a look at the turbo.js code performing the same task:

```js
function testTurbo() {
  turbojs.run(testData, `void main(void) {
    vec4 ipt = read();

    float x0 = -2.5 + (3.5 * ipt.r);
    // GLSL does not zero-initialize locals, so set them explicitly
    float y0 = ipt.g, x = 0., y = 0., xt = 0., c = 0.;

    for(int i = 0; i < ${sampleIterations}; i++) {
      if (x * x + y * y >= 2. * 2.) break;

      xt = x * x - y * y + x0;
      y = 2. * x * y + y0;
      x = xt;
      c++;
    }

    float col = c / ${sampleIterations}.;

    commit(vec4(ipt.rg, col, 0.));
  }`);
}
```

Notice how easily the JS code can be translated to GLSL and vice versa, as long as no exclusive paradigms are used. Of course this example is not the optimal algorithm in JS or GLSL; it is just for comparison.

### Example 3: Debugging

GLSL code is compiled by your GPU vendor's compiler. Usually these compilers provide verbose error information. You can catch compile-time errors by catching exceptions thrown by turbo.js. As an example, consider this invalid code:

```js
if (turbojs) {
  var foo = turbojs.alloc(1e6);
  var nFactor = 4;

  turbojs.run(foo, `void main(void) {
    commit(${nFactor}. + bar);
  }`);
}
```

This will generate two errors. The first one is `bar` being undefined. The second one is a type mismatch: `commit` expects a vector, but we've just given it a float. Opening your browser's console will reveal the error:

![](https://i.imgur.com/49Z6Fei.png)

### Further considerations

- Always provide a JS fallback if you detect that turbo.js is not supported.
- Use web workers for huge datasets to prevent the page from blocking.
- Always warm up the GPU using dummy data. You won't get the full performance if you don't.
- In addition to error checking, do a sanity check using a small dataset and a simple kernel. If the numbers don't check out, fall back to JS.
- I haven't tried it, but I guess you can adapt [glsl-transpiler](https://github.com/stackgl/glsl-transpiler) to create JS fallback code automatically.
- Consider whether you *really* need turbo.js. Optimize your *algorithm* (not code) first. Consider using JS SIMD. turbo.js can't be used for non-parallel workloads.

Make sure to familiarize yourself with the GLSL standard, which can be found at [OpenGL.org](https://www.opengl.org/registry/doc/GLSLangSpec.4.40.pdf).

Follow best practices to reduce your algorithm complexity. MDN adds:

> Simpler shaders perform better than complex ones. In particular, if you can remove an if statement from a shader, that will make it run faster. Division and math functions like `log()` should be considered expensive too.

Many C shorthands apply to GLSL. Having said that, this also applies:

> However, nowadays even mobile devices possess powerful GPUs that are capable of running even relatively complex shader programs. Moreover, because shaders are compiled, the eventual machine code that actually runs on the hardware may be highly optimized.
What may seem like an expensive function call may in fact compile into only few (or even a single) machine instructions. This is particularly true for GLSL functions that typically operate on vectors, such as `normalize()`, `dot()` and `mix()`. The best advice in that regard is to use the built-in functions, rather than try to implement, for example, one's own version of a dot-product or linear interpolation, which may in fact compile to larger and less optimized machine code. Finally, it is important to keep in mind that GPUs are constructed to do complex mathematical calculations in hardware, and therefore, may support math functions, such as `sin()`, `cos()` and other, through dedicated machine instructions.

--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------
{
  "name": "turbojs",
  "version": "1.0.2",
  "description": "GPGPU for the web.",
  "main": "turbo.js",
  "repository": {
    "type": "git",
    "url": "git+https://github.com/turbo/js.git"
  },
  "keywords": [
    "webgl",
    "gpgpu"
  ],
  "author": "turbo",
  "license": "MIT",
  "bugs": {
    "url": "https://github.com/turbo/js/issues"
  },
  "homepage": "https://github.com/turbo/js#readme"
}

--------------------------------------------------------------------------------
/turbo.js:
--------------------------------------------------------------------------------
(function (root, factory) {
  if (typeof define === 'function' && define.amd) {
    // AMD. Register as an anonymous module.
    define([], factory);
  } else if (typeof module === 'object' && module.exports) {
    // Node. Does not work with strict CommonJS, but
    // only CommonJS-like environments that support module.exports,
    // like Node.
    module.exports = factory();
  } else {
    // Browser globals (root is window)
    root.turbojs = factory();
  }
}(this, function () {

  // turbo.js
  // (c) turbo - github.com/turbo
  // MIT licensed

  "use strict";

  // Mozilla reference init implementation
  var initGLFromCanvas = function(canvas) {
    var gl = null;
    var attr = {alpha : false, antialias : false};

    // Try to grab the standard context. If it fails, fallback to experimental.
    gl = canvas.getContext("webgl", attr) || canvas.getContext("experimental-webgl", attr);

    // If we don't have a GL context, give up now
    if (!gl)
      throw new Error("turbojs: Unable to initialize WebGL. Your browser may not support it.");

    return gl;
  }

  var gl = initGLFromCanvas(document.createElement('canvas'));

  // turbo.js requires a 32bit float vec4 texture. Some systems only provide 8bit/float
  // textures. A workaround is being created, but turbo.js shouldn't be used on those
  // systems anyway.
  if (!gl.getExtension('OES_texture_float'))
    throw new Error('turbojs: Required texture format OES_texture_float not supported.');

  // GPU buffer from JS typed array
  function newBuffer(data, f, e) {
    var buf = gl.createBuffer();

    gl.bindBuffer((e || gl.ARRAY_BUFFER), buf);
    gl.bufferData((e || gl.ARRAY_BUFFER), new (f || Float32Array)(data), gl.STATIC_DRAW);

    return buf;
  }

  var positionBuffer = newBuffer([ -1, -1, 1, -1, 1, 1, -1, 1 ]);
  var textureBuffer  = newBuffer([  0,  0, 1,  0, 1, 1,  0, 1 ]);
  var indexBuffer    = newBuffer([  1,  2, 0,  3, 0, 2 ], Uint16Array, gl.ELEMENT_ARRAY_BUFFER);

  var vertexShaderCode =
  "attribute vec2 position;\n" +
  "varying vec2 pos;\n" +
  "attribute vec2 texture;\n" +
  "\n" +
  "void main(void) {\n" +
  "  pos = texture;\n" +
  "  gl_Position = vec4(position.xy, 0.0, 1.0);\n" +
  "}";

  var stdlib =
  "\n" +
  "precision mediump float;\n" +
  "uniform sampler2D u_texture;\n" +
  "varying vec2 pos;\n" +
  "\n" +
  "vec4 read(void) {\n" +
  "  return texture2D(u_texture, pos);\n" +
  "}\n" +
  "\n" +
  "void commit(vec4 val) {\n" +
  "  gl_FragColor = val;\n" +
  "}\n" +
  "\n" +
  "// user code begins here\n" +
  "\n";

  var vertexShader = gl.createShader(gl.VERTEX_SHADER);

  gl.shaderSource(vertexShader, vertexShaderCode);
  gl.compileShader(vertexShader);

  // This should not fail.
  if (!gl.getShaderParameter(vertexShader, gl.COMPILE_STATUS))
    throw new Error(
      "\nturbojs: Could not build internal vertex shader (fatal).\n" + "\n" +
      "INFO: >REPORT< THIS. That's our fault!\n" + "\n" +
      "--- CODE DUMP ---\n" + vertexShaderCode + "\n\n" +
      "--- ERROR LOG ---\n" + gl.getShaderInfoLog(vertexShader)
    );

  // Transfer data onto clamped texture and turn off any filtering
  function createTexture(data, size) {
    var texture = gl.createTexture();

    gl.bindTexture(gl.TEXTURE_2D, texture);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
    gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, size, size, 0, gl.RGBA, gl.FLOAT, data);
    gl.bindTexture(gl.TEXTURE_2D, null);

    return texture;
  }

  return {
    // run code against a pre-allocated array
    run : function(ipt, code) {
      var fragmentShader = gl.createShader(gl.FRAGMENT_SHADER);

      gl.shaderSource(
        fragmentShader,
        stdlib + code
      );

      gl.compileShader(fragmentShader);

      // Use this output to debug the shader
      // Keep in mind that WebGL GLSL is **much** stricter than e.g. OpenGL GLSL
      if (!gl.getShaderParameter(fragmentShader, gl.COMPILE_STATUS)) {
        var LOC = code.split('\n');
        var dbgMsg = "ERROR: Could not build shader (fatal).\n\n------------------ KERNEL CODE DUMP ------------------\n";

        // Prefix each kernel line with its line number in the combined shader source
        for (var nl = 0; nl < LOC.length; nl++)
          dbgMsg += (stdlib.split('\n').length + nl) + "> " + LOC[nl] + "\n";

        dbgMsg += "\n--------------------- ERROR LOG ---------------------\n" + gl.getShaderInfoLog(fragmentShader);

        throw new Error(dbgMsg);
      }

      var program = gl.createProgram();

      gl.attachShader(program, vertexShader);
      gl.attachShader(program, fragmentShader);
      gl.linkProgram(program);

      if (!gl.getProgramParameter(program, gl.LINK_STATUS))
        throw new Error('turbojs: Failed to link GLSL program code.');

      var uTexture = gl.getUniformLocation(program, 'u_texture');
      var aPosition = gl.getAttribLocation(program, 'position');
      var aTexture = gl.getAttribLocation(program, 'texture');

      gl.useProgram(program);

      var size = Math.sqrt(ipt.data.length) / 4;
      var texture = createTexture(ipt.data, size);

      gl.viewport(0, 0, size, size);
      gl.bindFramebuffer(gl.FRAMEBUFFER, gl.createFramebuffer());

      // Typed arrays speed this up tremendously.
      var nTexture = createTexture(new Float32Array(ipt.data.length), size);

      gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0, gl.TEXTURE_2D, nTexture, 0);

      // Test for mobile bug MDN->WebGL_best_practices, bullet 7
      // Keep the numeric status so it can be reported in the error message
      var frameBufferStatus = gl.checkFramebufferStatus(gl.FRAMEBUFFER);

      if (frameBufferStatus !== gl.FRAMEBUFFER_COMPLETE)
        throw new Error('turbojs: Error attaching float texture to framebuffer. Your device is probably incompatible. Status code: ' + frameBufferStatus);

      gl.bindTexture(gl.TEXTURE_2D, texture);
      gl.activeTexture(gl.TEXTURE0);
      gl.uniform1i(uTexture, 0);
      gl.bindBuffer(gl.ARRAY_BUFFER, textureBuffer);
      gl.enableVertexAttribArray(aTexture);
      gl.vertexAttribPointer(aTexture, 2, gl.FLOAT, false, 0, 0);
      gl.bindBuffer(gl.ARRAY_BUFFER, positionBuffer);
      gl.enableVertexAttribArray(aPosition);
      gl.vertexAttribPointer(aPosition, 2, gl.FLOAT, false, 0, 0);
      gl.bindBuffer(gl.ELEMENT_ARRAY_BUFFER, indexBuffer);
      gl.drawElements(gl.TRIANGLES, 6, gl.UNSIGNED_SHORT, 0);
      gl.readPixels(0, 0, size, size, gl.RGBA, gl.FLOAT, ipt.data);
      //                                 ^ 4 x 32 bit ^

      return ipt.data.subarray(0, ipt.length);
    },
    alloc: function(sz) {
      // A sane limit for most GPUs out there.
      // JS falls apart before GLSL limits could ever be reached.
      if (sz > 16777216)
        throw new Error("turbojs: Whoops, the maximum array size is exceeded!");

      var ns = Math.pow(Math.pow(2, Math.ceil(Math.log(sz) / 1.386) - 1), 2);
      return {
        data : new Float32Array(ns * 16),
        length : sz
      };
    }
  };

}));

--------------------------------------------------------------------------------
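
The sizing arithmetic in `alloc` and `run` above can be traced without a GPU. The sketch below copies those two expressions into standalone helpers; `allocSizes` and `textureSide` are hypothetical names introduced here for illustration and are not part of turbo.js. It assumes the constant `1.386` is meant as an approximation of `Math.log(4)`, so that the exponent works out to `ceil(log4(sz)) - 1`.

```js
// Mirrors the expressions in alloc(); helper name is hypothetical.
function allocSizes(sz) {
  var side = Math.pow(2, Math.ceil(Math.log(sz) / 1.386) - 1);
  var ns = Math.pow(side, 2);
  return { side: side, dataLength: ns * 16, length: sz };
}

// Mirrors the expression in run(): side length of the square texture, in texels.
function textureSide(dataLength) {
  return Math.sqrt(dataLength) / 4;
}

var a = allocSizes(1e6);
var side = textureSide(a.dataLength);
var floatsInTexture = side * side * 4; // RGBA: four floats per texel

console.log(a.dataLength, side, floatsInTexture); // 4194304 512 1048576
```

For a request of 1e6 elements this gives a 512 x 512 texture holding 1,048,576 floats, which covers the request, while `.data` itself is four times that size. Reading only the first `length` entries, as the return value of `run` does via `subarray`, therefore looks like the safe way to consume results.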