├── LICENSE ├── README.md ├── imurmurhash.js ├── imurmurhash.min.js ├── package.json └── simplification.md /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2013 Gary Court, Jens Taylor 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy of 6 | this software and associated documentation files (the "Software"), to deal in 7 | the Software without restriction, including without limitation the rights to 8 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 9 | the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 17 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 18 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 19 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 20 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 21 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | iMurmurHash.js 2 | ============== 3 | 4 | An incremental implementation of the MurmurHash3 (32-bit) hashing algorithm for JavaScript based on [Gary Court's implementation](https://github.com/garycourt/murmurhash-js) with [kazuyukitanimura's modifications](https://github.com/kazuyukitanimura/murmurhash-js). 
5 | 6 | This version works significantly faster than the non-incremental version if you need to hash many small strings into a single hash, since string concatenation (to build the single string to pass to the non-incremental version) is fairly costly. In one case tested, using the incremental version was about 50% faster than concatenating 5-10 strings and then hashing. 7 | 8 | Installation 9 | ------------ 10 | 11 | To use iMurmurHash in the browser, [download the latest version](https://raw.github.com/jensyt/imurmurhash-js/master/imurmurhash.min.js) and include it as a script on your site. 12 | 13 | ```html 14 | 15 | 18 | ``` 19 | 20 | --- 21 | 22 | To use iMurmurHash in Node.js, install the module using npm: 23 | 24 | ```bash 25 | npm install imurmurhash 26 | ``` 27 | 28 | Then simply include it in your scripts: 29 | 30 | ```javascript 31 | var MurmurHash3 = require('imurmurhash'); 32 | ``` 33 | 34 | Quick Example 35 | ------------- 36 | 37 | ```javascript 38 | // Create the initial hash 39 | var hashState = MurmurHash3('string'); 40 | 41 | // Incrementally add text 42 | hashState.hash('more strings'); 43 | hashState.hash('even more strings'); 44 | 45 | // All calls can be chained if desired 46 | hashState.hash('and').hash('some').hash('more'); 47 | 48 | // Get a result 49 | hashState.result(); 50 | // returns 0xe4ccfe6b 51 | ``` 52 | 53 | Functions 54 | --------- 55 | 56 | ### MurmurHash3 ([string], [seed]) 57 | Get a hash state object, optionally initialized with the given _string_ and _seed_. _Seed_ must be a positive integer if provided. Calling this function without the `new` keyword will return a cached state object that has been reset. This is safe to use as long as the object is only used from a single thread and no other hashes are created while operating on this one. If this constraint cannot be met, you can use `new` to create a new state object. 
For example: 58 | 59 | ```javascript 60 | // Use the cached object, calling the function again will return the same 61 | // object (but reset, so the current state would be lost) 62 | hashState = MurmurHash3(); 63 | ... 64 | 65 | // Create a new object that can be safely used however you wish. Calling the 66 | // function again will simply return a new state object, and no state loss 67 | // will occur, at the cost of creating more objects. 68 | hashState = new MurmurHash3(); 69 | ``` 70 | 71 | Both methods can be mixed however you like if you have different use cases. 72 | 73 | --- 74 | 75 | ### MurmurHash3.prototype.hash (string) 76 | Incrementally add _string_ to the hash. This can be called as many times as you want for the hash state object, including after a call to `result()`. Returns `this` so calls can be chained. 77 | 78 | --- 79 | 80 | ### MurmurHash3.prototype.result () 81 | Get the result of the hash as a 32-bit positive integer. This performs the tail and finalizer portions of the algorithm, but does not store the result in the state object. This means that it is perfectly safe to get results and then continue adding strings via `hash`. 82 | 83 | ```javascript 84 | // Do the whole string at once 85 | MurmurHash3('this is a test string').result(); 86 | // 0x70529328 87 | 88 | // Do part of the string, get a result, then the other part 89 | var m = MurmurHash3('this is a'); 90 | m.result(); 91 | // 0xbfc4f834 92 | m.hash(' test string').result(); 93 | // 0x70529328 (same as above) 94 | ``` 95 | 96 | --- 97 | 98 | ### MurmurHash3.prototype.reset ([seed]) 99 | Reset the state object for reuse, optionally using the given _seed_ (defaults to 0 like the constructor). Returns `this` so calls can be chained. 
100 | 101 | --- 102 | 103 | License (MIT) 104 | ------------- 105 | Copyright (c) 2013 Gary Court, Jens Taylor 106 | 107 | Permission is hereby granted, free of charge, to any person obtaining a copy of 108 | this software and associated documentation files (the "Software"), to deal in 109 | the Software without restriction, including without limitation the rights to 110 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 111 | the Software, and to permit persons to whom the Software is furnished to do so, 112 | subject to the following conditions: 113 | 114 | The above copyright notice and this permission notice shall be included in all 115 | copies or substantial portions of the Software. 116 | 117 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 118 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 119 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 120 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 121 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 122 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 123 | -------------------------------------------------------------------------------- /imurmurhash.js: -------------------------------------------------------------------------------- 1 | /** 2 | * @preserve 3 | * JS Implementation of incremental MurmurHash3 (r150) (as of May 10, 2013) 4 | * 5 | * @author Jens Taylor 6 | * @see http://github.com/homebrewing/brauhaus-diff 7 | * @author Gary Court 8 | * @see http://github.com/garycourt/murmurhash-js 9 | * @author Austin Appleby 10 | * @see http://sites.google.com/site/murmurhash/ 11 | */ 12 | (function(){ 13 | var cache; 14 | 15 | // Call this function without `new` to use the cached object (good for 16 | // single-threaded environments), or with `new` to create a new object. 
17 | // 18 | // @param {string} key A UTF-16 or ASCII string 19 | // @param {number} seed An optional positive integer 20 | // @return {object} A MurmurHash3 object for incremental hashing 21 | function MurmurHash3(key, seed) { 22 | var m = this instanceof MurmurHash3 ? this : cache; 23 | m.reset(seed) 24 | if (typeof key === 'string' && key.length > 0) { 25 | m.hash(key); 26 | } 27 | 28 | if (m !== this) { 29 | return m; 30 | } 31 | }; 32 | 33 | // Incrementally add a string to this hash 34 | // 35 | // @param {string} key A UTF-16 or ASCII string 36 | // @return {object} this 37 | MurmurHash3.prototype.hash = function(key) { 38 | var h1, k1, i, top, len; 39 | 40 | len = key.length; 41 | this.len += len; 42 | 43 | k1 = this.k1; 44 | i = 0; 45 | switch (this.rem) { 46 | case 0: k1 ^= len > i ? (key.charCodeAt(i++) & 0xffff) : 0; 47 | case 1: k1 ^= len > i ? (key.charCodeAt(i++) & 0xffff) << 8 : 0; 48 | case 2: k1 ^= len > i ? (key.charCodeAt(i++) & 0xffff) << 16 : 0; 49 | case 3: 50 | k1 ^= len > i ? (key.charCodeAt(i) & 0xff) << 24 : 0; 51 | k1 ^= len > i ? 
(key.charCodeAt(i++) & 0xff00) >> 8 : 0; 52 | } 53 | 54 | this.rem = (len + this.rem) & 3; // & 3 is same as % 4 55 | len -= this.rem; 56 | if (len > 0) { 57 | h1 = this.h1; 58 | while (1) { 59 | k1 = (k1 * 0x2d51 + (k1 & 0xffff) * 0xcc9e0000) & 0xffffffff; 60 | k1 = (k1 << 15) | (k1 >>> 17); 61 | k1 = (k1 * 0x3593 + (k1 & 0xffff) * 0x1b870000) & 0xffffffff; 62 | 63 | h1 ^= k1; 64 | h1 = (h1 << 13) | (h1 >>> 19); 65 | h1 = (h1 * 5 + 0xe6546b64) & 0xffffffff; 66 | 67 | if (i >= len) { 68 | break; 69 | } 70 | 71 | k1 = ((key.charCodeAt(i++) & 0xffff)) ^ 72 | ((key.charCodeAt(i++) & 0xffff) << 8) ^ 73 | ((key.charCodeAt(i++) & 0xffff) << 16); 74 | top = key.charCodeAt(i++); 75 | k1 ^= ((top & 0xff) << 24) ^ 76 | ((top & 0xff00) >> 8); 77 | } 78 | 79 | k1 = 0; 80 | switch (this.rem) { 81 | case 3: k1 ^= (key.charCodeAt(i + 2) & 0xffff) << 16; 82 | case 2: k1 ^= (key.charCodeAt(i + 1) & 0xffff) << 8; 83 | case 1: k1 ^= (key.charCodeAt(i) & 0xffff); 84 | } 85 | 86 | this.h1 = h1; 87 | } 88 | 89 | this.k1 = k1; 90 | return this; 91 | }; 92 | 93 | // Get the result of this hash 94 | // 95 | // @return {number} The 32-bit hash 96 | MurmurHash3.prototype.result = function() { 97 | var k1, h1; 98 | 99 | k1 = this.k1; 100 | h1 = this.h1; 101 | 102 | if (k1 > 0) { 103 | k1 = (k1 * 0x2d51 + (k1 & 0xffff) * 0xcc9e0000) & 0xffffffff; 104 | k1 = (k1 << 15) | (k1 >>> 17); 105 | k1 = (k1 * 0x3593 + (k1 & 0xffff) * 0x1b870000) & 0xffffffff; 106 | h1 ^= k1; 107 | } 108 | 109 | h1 ^= this.len; 110 | 111 | h1 ^= h1 >>> 16; 112 | h1 = (h1 * 0xca6b + (h1 & 0xffff) * 0x85eb0000) & 0xffffffff; 113 | h1 ^= h1 >>> 13; 114 | h1 = (h1 * 0xae35 + (h1 & 0xffff) * 0xc2b20000) & 0xffffffff; 115 | h1 ^= h1 >>> 16; 116 | 117 | return h1 >>> 0; 118 | }; 119 | 120 | // Reset the hash object for reuse 121 | // 122 | // @param {number} seed An optional positive integer 123 | MurmurHash3.prototype.reset = function(seed) { 124 | this.h1 = typeof seed === 'number' ? 
seed : 0; 125 | this.rem = this.k1 = this.len = 0; 126 | return this; 127 | }; 128 | 129 | // A cached object to use. This can be safely used if you're in a single- 130 | // threaded environment, otherwise you need to create new hashes to use. 131 | cache = new MurmurHash3(); 132 | 133 | if (typeof(module) != 'undefined') { 134 | module.exports = MurmurHash3; 135 | } else { 136 | this.MurmurHash3 = MurmurHash3; 137 | } 138 | }()); 139 | -------------------------------------------------------------------------------- /imurmurhash.min.js: -------------------------------------------------------------------------------- 1 | /** 2 | * @preserve 3 | * JS Implementation of incremental MurmurHash3 (r150) (as of May 10, 2013) 4 | * 5 | * @author Jens Taylor 6 | * @see http://github.com/homebrewing/brauhaus-diff 7 | * @author Gary Court 8 | * @see http://github.com/garycourt/murmurhash-js 9 | * @author Austin Appleby 10 | * @see http://sites.google.com/site/murmurhash/ 11 | */ 12 | !function(){function t(h,r){var s=this instanceof t?this:e;return s.reset(r),"string"==typeof h&&h.length>0&&s.hash(h),s!==this?s:void 0}var e;t.prototype.hash=function(t){var e,h,r,s,i;switch(i=t.length,this.len+=i,h=this.k1,r=0,this.rem){case 0:h^=i>r?65535&t.charCodeAt(r++):0;case 1:h^=i>r?(65535&t.charCodeAt(r++))<<8:0;case 2:h^=i>r?(65535&t.charCodeAt(r++))<<16:0;case 3:h^=i>r?(255&t.charCodeAt(r))<<24:0,h^=i>r?(65280&t.charCodeAt(r++))>>8:0}if(this.rem=3&i+this.rem,i-=this.rem,i>0){for(e=this.h1;;){if(h=4294967295&11601*h+3432906752*(65535&h),h=h<<15|h>>>17,h=4294967295&13715*h+461832192*(65535&h),e^=h,e=e<<13|e>>>19,e=4294967295&5*e+3864292196,r>=i)break;h=65535&t.charCodeAt(r++)^(65535&t.charCodeAt(r++))<<8^(65535&t.charCodeAt(r++))<<16,s=t.charCodeAt(r++),h^=(255&s)<<24^(65280&s)>>8}switch(h=0,this.rem){case 3:h^=(65535&t.charCodeAt(r+2))<<16;case 2:h^=(65535&t.charCodeAt(r+1))<<8;case 1:h^=65535&t.charCodeAt(r)}this.h1=e}return this.k1=h,this},t.prototype.result=function(){var t,e;return 
t=this.k1,e=this.h1,t>0&&(t=4294967295&11601*t+3432906752*(65535&t),t=t<<15|t>>>17,t=4294967295&13715*t+461832192*(65535&t),e^=t),e^=this.len,e^=e>>>16,e=4294967295&51819*e+2246770688*(65535&e),e^=e>>>13,e=4294967295&44597*e+3266445312*(65535&e),e^=e>>>16,e>>>0},t.prototype.reset=function(t){return this.h1="number"==typeof t?t:0,this.rem=this.k1=this.len=0,this},e=new t,"undefined"!=typeof module?module.exports=t:this.MurmurHash3=t}(); -------------------------------------------------------------------------------- /package.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "imurmurhash", 3 | "version": "0.1.4", 4 | "description": "An incremental implementation of MurmurHash3", 5 | "homepage": "https://github.com/jensyt/imurmurhash-js", 6 | "main": "imurmurhash.js", 7 | "files": [ 8 | "imurmurhash.js", 9 | "imurmurhash.min.js", 10 | "package.json", 11 | "README.md" 12 | ], 13 | "repository": { 14 | "type": "git", 15 | "url": "https://github.com/jensyt/imurmurhash-js" 16 | }, 17 | "bugs": { 18 | "url": "https://github.com/jensyt/imurmurhash-js/issues" 19 | }, 20 | "keywords": [ 21 | "murmur", 22 | "murmurhash", 23 | "murmurhash3", 24 | "hash", 25 | "incremental" 26 | ], 27 | "author": { 28 | "name": "Jens Taylor", 29 | "email": "jensyt@gmail.com", 30 | "url": "https://github.com/homebrewing" 31 | }, 32 | "license": "MIT", 33 | "dependencies": { 34 | }, 35 | "devDependencies": { 36 | }, 37 | "engines": { 38 | "node": ">=0.8.19" 39 | } 40 | } 41 | -------------------------------------------------------------------------------- /simplification.md: -------------------------------------------------------------------------------- 1 | Proof of modifications 2 | ---------------------- 3 | 4 | Gary Court's version contains the following line: 5 | 6 | ```javascript 7 | k1 = (((k1 & 0xffff) * c1) + ((((k1 >>> 16) * c1) & 0xffff) << 16)) & 0xffffffff 8 | ``` 9 | 10 | which can be simplified to: 11 | 12 | ```javascript 13 | 
k1 = (k1 * (c1 & 0xffff) + (k1 & 0xffff) * (c1 & 0xffff0000)) & 0xffffffff 14 | ``` 15 | 16 | Consider each half of the equation individually, letting `k1 = (a + b) & 0xffffffff`. Starting with `b`, we can simplify: 17 | 18 | ```javascript 19 | b = (((k1 >>> 16) * c1) & 0xffff) << 16 20 | b = (((k1 >>> 16) * (c1 & 0xffff)) & 0xffff) << 16 21 | b = ((k1 & 0xffff0000) * (c1 & 0xffff)) & 0xffff0000 22 | b = ((k1 & 0xffff0000) * (c1 & 0xffff)) & 0xffffffff 23 | b = (k1 & 0xffff0000) * (c1 & 0xffff) 24 | ``` 25 | 26 | The last step is valid because the entire expression `(a + b)` is ANDed with `0xffffffff`, so the mask can be dropped from `b`. Next, `a` can be expanded: 27 | 28 | ```javascript 29 | a = (k1 & 0xffff) * c1 30 | a = (k1 & 0xffff) * ((c1 & 0xffff) + (c1 & 0xffff0000)) 31 | a = (k1 & 0xffff) * (c1 & 0xffff) + (k1 & 0xffff) * (c1 & 0xffff0000) 32 | ``` 33 | 34 | Letting `a = e + f`, we get `k1 = (e + f + b) & 0xffffffff`. Combining `e + b`, we can find: 35 | 36 | ```javascript 37 | e + b = (k1 & 0xffff) * (c1 & 0xffff) + (k1 & 0xffff0000) * (c1 & 0xffff) 38 | e + b = ((k1 & 0xffff) + (k1 & 0xffff0000)) * (c1 & 0xffff) 39 | e + b = k1 * (c1 & 0xffff) 40 | ``` 41 | 42 | Finally, putting it all together: 43 | 44 | ```javascript 45 | k1 = ((e + b) + f) & 0xffffffff 46 | k1 = (k1 * (c1 & 0xffff) + (k1 & 0xffff) * (c1 & 0xffff0000)) & 0xffffffff 47 | ``` 48 | 49 | Overall, all this does is multiply `k1` by `c1` and keep only the low 32 bits of the result. Unfortunately, JavaScript can't perform the multiply directly: all numbers are double-precision floats, so a product that needs more than 53 bits of precision is rounded and loses its lower bits, exactly the ones we want to keep. 50 | 51 | All other modifications to the algorithm are based on this change. 
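The identity derived above can also be spot-checked numerically. The following is a minimal sketch, not part of iMurmurHash: the function names `mulOriginal`, `mulSimplified`, and `countMismatches` are hypothetical, `c1` is taken to be `0xcc9e2d51` (the constant whose halves appear as `0x2d51` and `0xcc9e0000` in `imurmurhash.js`), and `Math.imul` (assumed available, ES2015 or later) serves as the reference 32-bit multiply.

```javascript
// Illustrative check only; these names are hypothetical and are not part of
// iMurmurHash. Math.imul is assumed to be available (ES2015+).
var c1 = 0xcc9e2d51; // first mixing constant, split as 0x2d51 / 0xcc9e0000 in the source

// The original form quoted at the top of this proof.
function mulOriginal(k1) {
  return (((k1 & 0xffff) * c1) + ((((k1 >>> 16) * c1) & 0xffff) << 16)) & 0xffffffff;
}

// The simplified form derived in this proof.
function mulSimplified(k1) {
  return (k1 * (c1 & 0xffff) + (k1 & 0xffff) * (c1 & 0xffff0000)) & 0xffffffff;
}

// Compare both forms against a true 32-bit multiply for a few edge cases plus
// a deterministic pseudo-random sequence. Returns the number of mismatches.
function countMismatches(iterations) {
  var samples = [0, 1, 0xffff, 0x10000, 0x7fffffff, 0x80000000, 0xffffffff, -1];
  var x = 12345, i, k1, expected, bad = 0;

  for (i = 0; i < iterations; i++) {
    x = (Math.imul(x, 1664525) + 1013904223) | 0; // simple LCG step, wrapped to int32
    samples.push(x);
  }

  for (i = 0; i < samples.length; i++) {
    k1 = samples[i];
    expected = Math.imul(k1, c1); // low 32 bits of k1 * c1, as a signed int32
    if (mulOriginal(k1) !== expected || mulSimplified(k1) !== expected) {
      bad++;
    }
  }
  return bad;
}

console.log(countMismatches(10000)); // 0
```

Both forms stay exact because every intermediate product has at most one 32-bit factor and one 16-bit factor, which keeps it well below 2^53 before the final mask is applied.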
52 | 53 | To see the original and modified versions this is based on, check out: 54 | * [Gary Court's original version](https://github.com/garycourt/murmurhash-js) 55 | * [kazuyukitanimura's modified version](https://github.com/kazuyukitanimura/murmurhash-js) 56 | --------------------------------------------------------------------------------