├── LICENSE
├── README.md
├── imurmurhash.js
├── imurmurhash.min.js
├── package.json
└── simplification.md
/LICENSE:
--------------------------------------------------------------------------------
1 | The MIT License (MIT)
2 |
3 | Copyright (c) 2013 Gary Court, Jens Taylor
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy of
6 | this software and associated documentation files (the "Software"), to deal in
7 | the Software without restriction, including without limitation the rights to
8 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
9 | the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
17 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
18 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
19 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
20 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
21 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | iMurmurHash.js
2 | ==============
3 |
4 | An incremental implementation of the MurmurHash3 (32-bit) hashing algorithm for JavaScript based on [Gary Court's implementation](https://github.com/garycourt/murmurhash-js) with [kazuyukitanimura's modifications](https://github.com/kazuyukitanimura/murmurhash-js).
5 |
6 | This version works significantly faster than the non-incremental version if you need to hash many small strings into a single hash, since string concatenation (to build the single string to pass the non-incremental version) is fairly costly. In one case tested, using the incremental version was about 50% faster than concatenating 5-10 strings and then hashing.
7 |
8 | Installation
9 | ------------
10 |
11 | To use iMurmurHash in the browser, [download the latest version](https://raw.github.com/jensyt/imurmurhash-js/master/imurmurhash.min.js) and include it as a script on your site.
12 |
13 | ```html
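14 | <script type="text/javascript" src="imurmurhash.min.js"></script>
15 | <script>
16 | // Your code here; iMurmurHash is available as the global MurmurHash3
17 | </script>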
18 | ```
19 |
20 | ---
21 |
22 | To use iMurmurHash in Node.js, install the module using NPM:
23 |
24 | ```bash
25 | npm install imurmurhash
26 | ```
27 |
28 | Then simply include it in your scripts:
29 |
30 | ```javascript
31 | var MurmurHash3 = require('imurmurhash');
32 | ```
33 |
34 | Quick Example
35 | -------------
36 |
37 | ```javascript
38 | // Create the initial hash
39 | var hashState = MurmurHash3('string');
40 |
41 | // Incrementally add text
42 | hashState.hash('more strings');
43 | hashState.hash('even more strings');
44 |
45 | // All calls can be chained if desired
46 | hashState.hash('and').hash('some').hash('more');
47 |
48 | // Get a result
49 | hashState.result();
50 | // returns 0xe4ccfe6b
51 | ```
52 |
53 | Functions
54 | ---------
55 |
56 | ### MurmurHash3 ([string], [seed])
57 | Get a hash state object, optionally initialized with the given _string_ and _seed_. _Seed_, if provided, should be a non-negative integer; it defaults to 0. Calling this function without the `new` keyword returns a cached state object that has been reset. This is safe as long as the object is used from a single thread and no other hash is started (by calling the function without `new` again) while this one is still in use. If that constraint cannot be met, use `new` to create a separate state object. For example:
58 |
59 | ```javascript
60 | // Use the cached object, calling the function again will return the same
61 | // object (but reset, so the current state would be lost)
62 | var hashState = MurmurHash3();
63 | ...
64 |
65 | // Create a new object that can be safely used however you wish. Calling the
66 | // function again will simply return a new state object, and no state loss
67 | // will occur, at the cost of creating more objects.
68 | hashState = new MurmurHash3();
69 | ```
70 |
71 | Both methods can be mixed however you like if you have different use cases.
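
The difference between the cached object and a `new` object can be shown without the library. The sketch below is a hypothetical stand-in (`State`, `add`, and `parts` are illustrative names, not part of iMurmurHash) that copies only the `this instanceof ... ? this : cache` pattern used by the constructor:

```javascript
// Hypothetical stand-in, not part of iMurmurHash. It mimics only the
// cached-versus-`new` behaviour of the MurmurHash3 constructor.
var cache;

function State(text) {
  // With `new`, `this` is a fresh State; without it, fall back to the cache.
  var s = this instanceof State ? this : cache;
  s.reset();
  if (typeof text === 'string') {
    s.add(text);
  }
  // Only return explicitly when called without `new`.
  if (s !== this) {
    return s;
  }
}

State.prototype.reset = function () { this.parts = []; return this; };
State.prototype.add = function (text) { this.parts.push(text); return this; };

cache = new State();

var a = State('first');      // cached object
var b = State('second');     // same object again, reset: 'first' is lost
var c = new State('third');  // independent object

console.log(a === b);        // true
console.log(a.parts);        // [ 'second' ]
console.log(c === a);        // false
```

The second call without `new` silently discards the state held in `a`, which is the state loss the comments above warn about.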
72 |
73 | ---
74 |
75 | ### MurmurHash3.prototype.hash (string)
76 | Incrementally add _string_ to the hash. This can be called as many times as you want for the hash state object, including after a call to `result()`. Returns `this` so calls can be chained.
77 |
78 | ---
79 |
80 | ### MurmurHash3.prototype.result ()
81 | Get the result of the hash as an unsigned 32-bit integer. This performs the tail and finalizer portions of the algorithm, but does not store the result in the state object, so it is safe to get a result and then continue adding strings via `hash`.
82 |
83 | ```javascript
84 | // Do the whole string at once
85 | MurmurHash3('this is a test string').result();
86 | // 0x70529328
87 |
88 | // Do part of the string, get a result, then the other part
89 | var m = MurmurHash3('this is a');
90 | m.result();
91 | // 0xbfc4f834
92 | m.hash(' test string').result();
93 | // 0x70529328 (same as above)
94 | ```
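
The contract described above (input may be split across calls at any point, and `result()` leaves the state untouched) can be illustrated with a self-contained sketch. It deliberately uses an FNV-1a-style hash over UTF-16 code units instead of MurmurHash3, only because that fits in a few lines; `TinyHash` and its fields are hypothetical and not part of iMurmurHash.

```javascript
// Hypothetical stand-in, not part of iMurmurHash: an FNV-1a-style 32-bit hash
// over UTF-16 code units, used only to illustrate the hash()/result() contract.
function TinyHash() {
  this.h = 0x811c9dc5; // FNV offset basis
}

// Fold each code unit into the running state; returns `this` for chaining.
TinyHash.prototype.hash = function (str) {
  for (var i = 0; i < str.length; i++) {
    this.h ^= str.charCodeAt(i);
    this.h = Math.imul(this.h, 0x01000193); // FNV prime, 32-bit multiply
  }
  return this;
};

// Read the current value as an unsigned integer without modifying the state.
TinyHash.prototype.result = function () {
  return this.h >>> 0;
};

var whole = new TinyHash().hash('this is a test string').result();

var parts = new TinyHash().hash('this is a');
var midway = parts.result();  // reading a result does not change the state
var finished = parts.hash(' test string').result();

console.log(finished === whole); // true
```

MurmurHash3 needs more bookkeeping than this because it consumes input in blocks of four; `imurmurhash.js` carries the leftover code units between calls in the `rem` and `k1` fields.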
95 |
96 | ---
97 |
98 | ### MurmurHash3.prototype.reset ([seed])
99 | Reset the state object for reuse, optionally using the given _seed_ (defaults to 0 like the constructor). Returns `this` so calls can be chained.
100 |
101 | ---
102 |
103 | License (MIT)
104 | -------------
105 | Copyright (c) 2013 Gary Court, Jens Taylor
106 |
107 | Permission is hereby granted, free of charge, to any person obtaining a copy of
108 | this software and associated documentation files (the "Software"), to deal in
109 | the Software without restriction, including without limitation the rights to
110 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
111 | the Software, and to permit persons to whom the Software is furnished to do so,
112 | subject to the following conditions:
113 |
114 | The above copyright notice and this permission notice shall be included in all
115 | copies or substantial portions of the Software.
116 |
117 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
118 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
119 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
120 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
121 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
122 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
123 |
--------------------------------------------------------------------------------
/imurmurhash.js:
--------------------------------------------------------------------------------
1 | /**
2 | * @preserve
3 | * JS Implementation of incremental MurmurHash3 (r150) (as of May 10, 2013)
4 | *
5 | * @author Jens Taylor
6 | * @see http://github.com/homebrewing/brauhaus-diff
7 | * @author Gary Court
8 | * @see http://github.com/garycourt/murmurhash-js
9 | * @author Austin Appleby
10 | * @see http://sites.google.com/site/murmurhash/
11 | */
12 | (function(){
13 |     var cache;
14 |
15 |     // Call this function without `new` to use the cached object (good for
16 |     // single-threaded environments), or with `new` to create a new object.
17 |     //
18 |     // @param {string} key A UTF-16 or ASCII string
19 |     // @param {number} seed An optional positive integer
20 |     // @return {object} A MurmurHash3 object for incremental hashing
21 |     function MurmurHash3(key, seed) {
22 |         var m = this instanceof MurmurHash3 ? this : cache;
23 |         m.reset(seed);
24 |         if (typeof key === 'string' && key.length > 0) {
25 |             m.hash(key);
26 |         }
27 |
28 |         if (m !== this) {
29 |             return m;
30 |         }
31 |     }
32 |
33 |     // Incrementally add a string to this hash
34 |     //
35 |     // @param {string} key A UTF-16 or ASCII string
36 |     // @return {object} this
37 |     MurmurHash3.prototype.hash = function(key) {
38 |         var h1, k1, i, top, len;
39 |
40 |         len = key.length;
41 |         this.len += len;
42 |
43 |         k1 = this.k1;
44 |         i = 0;
45 |         switch (this.rem) {
46 |             case 0: k1 ^= len > i ? (key.charCodeAt(i++) & 0xffff) : 0;
47 |             case 1: k1 ^= len > i ? (key.charCodeAt(i++) & 0xffff) << 8 : 0;
48 |             case 2: k1 ^= len > i ? (key.charCodeAt(i++) & 0xffff) << 16 : 0;
49 |             case 3:
50 |                 k1 ^= len > i ? (key.charCodeAt(i) & 0xff) << 24 : 0;
51 |                 k1 ^= len > i ? (key.charCodeAt(i++) & 0xff00) >> 8 : 0;
52 |         }
53 |
54 |         this.rem = (len + this.rem) & 3; // & 3 is same as % 4
55 |         len -= this.rem;
56 |         if (len > 0) {
57 |             h1 = this.h1;
58 |             while (1) {
59 |                 k1 = (k1 * 0x2d51 + (k1 & 0xffff) * 0xcc9e0000) & 0xffffffff;
60 |                 k1 = (k1 << 15) | (k1 >>> 17);
61 |                 k1 = (k1 * 0x3593 + (k1 & 0xffff) * 0x1b870000) & 0xffffffff;
62 |
63 |                 h1 ^= k1;
64 |                 h1 = (h1 << 13) | (h1 >>> 19);
65 |                 h1 = (h1 * 5 + 0xe6546b64) & 0xffffffff;
66 |
67 |                 if (i >= len) {
68 |                     break;
69 |                 }
70 |
71 |                 k1 = ((key.charCodeAt(i++) & 0xffff)) ^
72 |                      ((key.charCodeAt(i++) & 0xffff) << 8) ^
73 |                      ((key.charCodeAt(i++) & 0xffff) << 16);
74 |                 top = key.charCodeAt(i++);
75 |                 k1 ^= ((top & 0xff) << 24) ^
76 |                       ((top & 0xff00) >> 8);
77 |             }
78 |
79 |             k1 = 0;
80 |             switch (this.rem) {
81 |                 case 3: k1 ^= (key.charCodeAt(i + 2) & 0xffff) << 16;
82 |                 case 2: k1 ^= (key.charCodeAt(i + 1) & 0xffff) << 8;
83 |                 case 1: k1 ^= (key.charCodeAt(i) & 0xffff);
84 |             }
85 |
86 |             this.h1 = h1;
87 |         }
88 |
89 |         this.k1 = k1;
90 |         return this;
91 |     };
92 |
93 |     // Get the result of this hash
94 |     //
95 |     // @return {number} The 32-bit hash
96 |     MurmurHash3.prototype.result = function() {
97 |         var k1, h1;
98 |
99 |         k1 = this.k1;
100 |         h1 = this.h1;
101 |
102 |         if (k1 > 0) {
103 |             k1 = (k1 * 0x2d51 + (k1 & 0xffff) * 0xcc9e0000) & 0xffffffff;
104 |             k1 = (k1 << 15) | (k1 >>> 17);
105 |             k1 = (k1 * 0x3593 + (k1 & 0xffff) * 0x1b870000) & 0xffffffff;
106 |             h1 ^= k1;
107 |         }
108 |
109 |         h1 ^= this.len;
110 |
111 |         h1 ^= h1 >>> 16;
112 |         h1 = (h1 * 0xca6b + (h1 & 0xffff) * 0x85eb0000) & 0xffffffff;
113 |         h1 ^= h1 >>> 13;
114 |         h1 = (h1 * 0xae35 + (h1 & 0xffff) * 0xc2b20000) & 0xffffffff;
115 |         h1 ^= h1 >>> 16;
116 |
117 |         return h1 >>> 0;
118 |     };
119 |
120 |     // Reset the hash object for reuse
121 |     //
122 |     // @param {number} seed An optional positive integer
123 |     MurmurHash3.prototype.reset = function(seed) {
124 |         this.h1 = typeof seed === 'number' ? seed : 0;
125 |         this.rem = this.k1 = this.len = 0;
126 |         return this;
127 |     };
128 |
129 |     // A cached object to use. This can be safely used if you're in a single-
130 |     // threaded environment, otherwise you need to create new hashes to use.
131 |     cache = new MurmurHash3();
132 |
133 |     if (typeof module !== 'undefined') {
134 |         module.exports = MurmurHash3;
135 |     } else {
136 |         this.MurmurHash3 = MurmurHash3;
137 |     }
138 | }());
139 |
--------------------------------------------------------------------------------
/imurmurhash.min.js:
--------------------------------------------------------------------------------
1 | /**
2 | * @preserve
3 | * JS Implementation of incremental MurmurHash3 (r150) (as of May 10, 2013)
4 | *
5 | * @author Jens Taylor
6 | * @see http://github.com/homebrewing/brauhaus-diff
7 | * @author Gary Court
8 | * @see http://github.com/garycourt/murmurhash-js
9 | * @author Austin Appleby
10 | * @see http://sites.google.com/site/murmurhash/
11 | */
12 | !function(){function t(h,r){var s=this instanceof t?this:e;return s.reset(r),"string"==typeof h&&h.length>0&&s.hash(h),s!==this?s:void 0}var e;t.prototype.hash=function(t){var e,h,r,s,i;switch(i=t.length,this.len+=i,h=this.k1,r=0,this.rem){case 0:h^=i>r?65535&t.charCodeAt(r++):0;case 1:h^=i>r?(65535&t.charCodeAt(r++))<<8:0;case 2:h^=i>r?(65535&t.charCodeAt(r++))<<16:0;case 3:h^=i>r?(255&t.charCodeAt(r))<<24:0,h^=i>r?(65280&t.charCodeAt(r++))>>8:0}if(this.rem=3&i+this.rem,i-=this.rem,i>0){for(e=this.h1;;){if(h=4294967295&11601*h+3432906752*(65535&h),h=h<<15|h>>>17,h=4294967295&13715*h+461832192*(65535&h),e^=h,e=e<<13|e>>>19,e=4294967295&5*e+3864292196,r>=i)break;h=65535&t.charCodeAt(r++)^(65535&t.charCodeAt(r++))<<8^(65535&t.charCodeAt(r++))<<16,s=t.charCodeAt(r++),h^=(255&s)<<24^(65280&s)>>8}switch(h=0,this.rem){case 3:h^=(65535&t.charCodeAt(r+2))<<16;case 2:h^=(65535&t.charCodeAt(r+1))<<8;case 1:h^=65535&t.charCodeAt(r)}this.h1=e}return this.k1=h,this},t.prototype.result=function(){var t,e;return t=this.k1,e=this.h1,t>0&&(t=4294967295&11601*t+3432906752*(65535&t),t=t<<15|t>>>17,t=4294967295&13715*t+461832192*(65535&t),e^=t),e^=this.len,e^=e>>>16,e=4294967295&51819*e+2246770688*(65535&e),e^=e>>>13,e=4294967295&44597*e+3266445312*(65535&e),e^=e>>>16,e>>>0},t.prototype.reset=function(t){return this.h1="number"==typeof t?t:0,this.rem=this.k1=this.len=0,this},e=new t,"undefined"!=typeof module?module.exports=t:this.MurmurHash3=t}();
--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------
1 | {
2 | "name": "imurmurhash",
3 | "version": "0.1.4",
4 | "description": "An incremental implementation of MurmurHash3",
5 | "homepage": "https://github.com/jensyt/imurmurhash-js",
6 | "main": "imurmurhash.js",
7 | "files": [
8 | "imurmurhash.js",
9 | "imurmurhash.min.js",
10 | "package.json",
11 | "README.md"
12 | ],
13 | "repository": {
14 | "type": "git",
15 | "url": "https://github.com/jensyt/imurmurhash-js"
16 | },
17 | "bugs": {
18 | "url": "https://github.com/jensyt/imurmurhash-js/issues"
19 | },
20 | "keywords": [
21 | "murmur",
22 | "murmurhash",
23 | "murmurhash3",
24 | "hash",
25 | "incremental"
26 | ],
27 | "author": {
28 | "name": "Jens Taylor",
29 | "email": "jensyt@gmail.com",
30 | "url": "https://github.com/homebrewing"
31 | },
32 | "license": "MIT",
33 | "dependencies": {
34 | },
35 | "devDependencies": {
36 | },
37 | "engines": {
38 | "node": ">=0.8.19"
39 | }
40 | }
41 |
--------------------------------------------------------------------------------
/simplification.md:
--------------------------------------------------------------------------------
1 | Proof of modifications
2 | ----------------------
3 |
4 | Gary Court's version contains the following line:
5 |
6 | ```javascript
7 | k1 = (((k1 & 0xffff) * c1) + ((((k1 >>> 16) * c1) & 0xffff) << 16)) & 0xffffffff
8 | ```
9 |
10 | which can be simplified to:
11 |
12 | ```javascript
13 | k1 = (k1 * (c1 & 0xffff) + (k1 & 0xffff) * (c1 & 0xffff0000)) & 0xffffffff
14 | ```
15 |
16 | Split the original expression into two terms, so that `k1 = (a + b) & 0xffffffff`, and consider each term individually. Starting with `b`, we can simplify:
17 |
18 | ```javascript
19 | b = (((k1 >>> 16) * c1) & 0xffff) << 16
20 | b = (((k1 >>> 16) * (c1 & 0xffff)) & 0xffff) << 16
21 | b = ((k1 & 0xffff0000) * (c1 & 0xffff)) & 0xffff0000
22 | b = ((k1 & 0xffff0000) * (c1 & 0xffff)) & 0xffffffff
23 | b = (k1 & 0xffff0000) * (c1 & 0xffff)
24 | ```
25 |
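26 | In the second line, `c1` can be replaced by `c1 & 0xffff` because the upper 16 bits of `c1` only contribute multiples of `0x10000` to the product, which `& 0xffff` discards. In the last line, the mask can be dropped because the entire expression `(a + b)` is ANDed with `0xffffffff` afterwards, so masking `b` on its own is redundant. Next, `a` can be expanded: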
27 |
28 | ```javascript
29 | a = (k1 & 0xffff) * c1
30 | a = (k1 & 0xffff) * ((c1 & 0xffff) + (c1 & 0xffff0000))
31 | a = (k1 & 0xffff) * (c1 & 0xffff) + (k1 & 0xffff) * (c1 & 0xffff0000)
32 | ```
33 |
34 | Letting `a = e + f`, we get `k1 = (e + f + b) & 0xffffffff`. Combining `e + b`, we can find:
35 |
36 | ```javascript
37 | e + b = (k1 & 0xffff) * (c1 & 0xffff) + (k1 & 0xffff0000) * (c1 & 0xffff)
38 | e + b = ((k1 & 0xffff) + (k1 & 0xffff0000)) * (c1 & 0xffff)
39 | e + b = k1 * (c1 & 0xffff)
40 | ```
41 |
42 | Finally, putting it all together:
43 |
44 | ```javascript
45 | k1 = ((e + b) + f) & 0xffffffff
46 | k1 = (k1 * (c1 & 0xffff) + (k1 & 0xffff) * (c1 & 0xffff0000)) & 0xffffffff
47 | ```
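
The derivation can be spot-checked numerically. The sketch below compares the two forms for a handful of inputs; the function names and the sample values are illustrative choices, and `c1` is assumed to be the constant that `imurmurhash.js` splits into `0x2d51` and `0xcc9e0000`.

```javascript
// Assumed constant: the value imurmurhash.js splits into 0x2d51 and 0xcc9e0000.
var c1 = 0xcc9e2d51;

// Gary Court's form, as quoted at the top of this document.
function mulOriginal(k1) {
  return (((k1 & 0xffff) * c1) + ((((k1 >>> 16) * c1) & 0xffff) << 16)) & 0xffffffff;
}

// The simplified form derived above.
function mulSimplified(k1) {
  return (k1 * (c1 & 0xffff) + (k1 & 0xffff) * (c1 & 0xffff0000)) & 0xffffffff;
}

// Illustrative sample inputs, including values around the sign bit.
var samples = [0, 1, 0xffff, 0x10000, 0x7fffffff, 0x80000000, 0xdeadbeef, 0xffffffff];

// Collect every input for which the two forms disagree.
var mismatches = samples.filter(function (k1) {
  return mulOriginal(k1) !== mulSimplified(k1);
});

console.log(mismatches.length); // 0
```

Every intermediate product here stays below 2^53, so both forms are computed exactly before the final mask is applied.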
48 |
49 | Overall, all this does is multiply `k1` by `c1` and keep only the low 32 bits of the result. JavaScript cannot do this multiplication directly: its numbers are double-precision floats with 53 bits of integer precision, and the product of two 32-bit values can need up to 64 bits, so the low bits (exactly the ones we want to keep) would be rounded away.
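
The precision loss is easy to reproduce. The following sketch uses one illustrative input and compares the naive multiply, the split multiply from this document, and `Math.imul`, which performs a 32-bit integer multiply in environments that provide it (ES2015 and later). The variable names are hypothetical.

```javascript
var k1 = 0x7fffffff; // illustrative input
var c1 = 0xcc9e2d51; // assumed constant, split as 0x2d51 / 0xcc9e0000 in imurmurhash.js

// Naive multiply: the exact product needs about 63 bits, so the double is
// rounded before the mask is applied and the low bits are wrong.
var naive = (k1 * c1) & 0xffffffff;

// Split multiply from the derivation above: every partial product is exact.
var split = (k1 * (c1 & 0xffff) + (k1 & 0xffff) * (c1 & 0xffff0000)) & 0xffffffff;

// Built-in 32-bit multiply, used here only as a reference value.
var builtin = Math.imul(k1, c1);

console.log(split === builtin); // true
console.log(naive === builtin); // false
```

The exact product here is odd, but the rounded double is a multiple of 1024, so the naive result cannot match.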
50 |
51 | All other modifications to the algorithm are based on this change.
52 |
53 | To see the original and modified versions this is based on, check out:
54 | * [Gary Court's original version](https://github.com/garycourt/murmurhash-js)
55 | * [kazuyukitanimura's modified version](https://github.com/kazuyukitanimura/murmurhash-js)
56 |
--------------------------------------------------------------------------------