├── .gitignore ├── README.md ├── THANKS ├── hooks ├── .gitignore ├── post-commit │ └── .gitignore └── pre-commit │ ├── .gitignore │ ├── erlang │ ├── uppercase_text.erl │ ├── validate_integrity.erl │ └── validate_json.erl │ └── js │ └── validate_json.js ├── mapreduce ├── .gitignore ├── erlang │ ├── .gitignore │ ├── delete_keys.erl │ ├── get_keys.erl │ ├── luwak_mr.erl │ ├── mr_kv_counters.erl │ ├── riak_mapreduce_utils.erl │ └── save_reduce.erl └── js │ ├── .gitignore │ ├── count_keys.js │ ├── get_keys.js │ ├── iso8601.js │ ├── regex_key_match.js │ ├── slenderize.js │ ├── sorting-by-field.js │ └── stats.js ├── other ├── erlang │ ├── bucket_exporter.erl │ ├── bucket_importer.erl │ ├── bucket_inspector.erl │ ├── bucket_reloader.erl │ ├── digraph_exporter.erl │ └── digraph_importer.erl └── ruby │ ├── riak_yaml_importer.rb │ └── yaml_importer.rb └── todo.txt /.gitignore: -------------------------------------------------------------------------------- 1 | _site 2 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Riak Function Contrib 2 | 3 | Riak Function Contrib is a community-powered library of MapReduce, Pre-/Post-Commit Hook, and other functions. It exists for several reasons: 4 | 5 | 1. So Riak users can contribute functions they've written back to the community in one centralized, easy-to-manage location 6 | 2. To provide users who are new to Riak with a list of previously-created and tested functions that may suit their usage needs 7 | 3. 
To lower the barrier to entry to using and mastering MapReduce and Pre-/Post-Commit Hook Functions in Riak 8 | 9 | ## Usage 10 | 11 | To use the code in this repo, you can either browse the source files of the functions in the directory of your choice or head over to the [Riak Function Contrib wiki](https://github.com/basho/riak_function_contrib/wiki) to search for useful code and read in-depth descriptions provided by function authors. 12 | 13 | ## Issues or Questions 14 | 15 | If, at any point, you have a question or issue, please post it to the [Riak Mailing List](http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com). 16 | 17 | ## Contributing 18 | 19 | Have a function to share with the rest of the Riak Community? Great. Here is how you do it: 20 | 21 | 1. Fork this repo to your own GitHub account (You can read up on forking [here](http://help.github.com/forking/) if you need a refresher) 22 | 2. Branch from `master` and add your function source file to the appropriate directory (explained in depth below) 23 | 3. Send a pull request against the `master` branch from your branch 24 | 4. Create your docs branch based on the `wiki` upstream branch, and add your overview file (again, explained below) 25 | 5. Send a pull request against the `wiki` branch for your docs contribution 26 | 6. Kick back, smile, and relish in the fact that your functions are helping Riak users everywhere 27 | 28 | ### Your Function Source File 29 | 30 | Step 2 above is "Add your function source file to the appropriate directory." The MapReduce Functions live in the `mapreduce` directory, and are broken up by language (Erlang or JavaScript). Pre- and Post-Commit source code lives in the `hooks` directory and is broken down by type (Pre and Post). For instance, if you have a JavaScript MapReduce Function to contribute, it will [live in this directory](https://github.com/basho/riak_function_contrib/tree/master/mapreduce/js/). 
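As a rough illustration of what a JavaScript MapReduce contribution might look like, here is a minimal sketch of a map function that could sit in `mapreduce/js/`. The function name `mapBucketAndKey` is hypothetical and not part of this repo. The input shape (`bucket`, `key`, `values[0].metadata`) is an assumption based on how the existing JavaScript functions here (for example `validate_json.js`) read Riak objects, and the logic loosely mirrors the Erlang `get_keys` map function.

```javascript
// (Apache 2.0 license boilerplate would go at the top of a real contribution.)

// Hypothetical map function: returns the [bucket, key] pair of each object.
// Assumed input shape, as seen by JavaScript functions in Riak:
//   { bucket: ..., key: ..., values: [ { metadata: {...}, data: ... } ] }
function mapBucketAndKey(value, keyData, arg) {
  // Skip tombstones (deleted objects), similar to the check in validate_json.js.
  var first = value.values && value.values[0];
  if (first && first.metadata && first.metadata['X-Riak-Deleted']) {
    return [];
  }
  // Map functions return a list of results; here, one [bucket, key] pair.
  return [[value.bucket, value.key]];
}

// Local usage example with a hand-built object (no Riak node needed).
var sample = {
  bucket: 'users',
  key: 'alice',
  values: [{ metadata: {}, data: '{"name":"Alice"}' }]
};
console.log(JSON.stringify(mapBucketAndKey(sample, undefined, null)));
// prints [["users","alice"]]
```

Keeping the function free of side effects and runnable against a plain object makes it easy for reviewers to try it out before it is used in a real MapReduce query.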
31 | 32 | **What should the source file contain?** 33 | 34 | 1. _Apache 2.0 License Boilerplate_. All files submitted to the Riak Function Contrib Repo must include the Apache 2.0 boilerplate. If you're unfamiliar with the Apache license and how to include it in your code, please take a moment and read up on it [here](http://www.apache.org/licenses/LICENSE-2.0.html). 35 | 36 | 2. Your code. This is fairly self-explanatory. In addition to the code, make sure to include adequate comments and notation. [Here's a great example](https://github.com/basho/riak_function_contrib/blob/master/mapreduce/js/sorting-by-field.js). 37 | 38 | ### Your Overview File 39 | 40 | In addition to your source, you're encouraged to take a few minutes and put together an overview file. Why? Because your contribution to Riak Function Contrib will be visible in two locations: 1) as part of the actual code repo in the form of your source file and 2) as an overview page on the [Riak Function Contrib wiki](https://github.com/basho/riak_function_contrib/wiki). For example, the overview page for [this function](https://github.com/basho/riak_function_contrib/blob/master/mapreduce/js/sorting-by-field.js) lives [here](https://github.com/basho/riak_function_contrib/wiki/Sorting-By-Field). 41 | 42 | ### A General Note on File Naming 43 | 44 | Try to name your source file in such a way that it describes what the function might be used for. For example, if the code is a JavaScript reduce function that is good for filtering out large objects, you might name it `large-object-filter-reduce.js`. 45 | 46 | Also, be sure to name your overview file with the same name as your source file (save for the extension, of course). 
So, to continue with the example given above, if your source file is named `large-object-filter-reduce.js` you would name your overview file `Large-Object-Filter-Reduce.md` 47 | -------------------------------------------------------------------------------- /THANKS: -------------------------------------------------------------------------------- 1 | The following people have contributed to riak_function_contrib in some capacity: 2 | 3 | Kevin Smith 4 | Mark Phillips 5 | Alexander Sicular 6 | Grant Schofield 7 | Francisco Treacy 8 | Daniel Einspanjer -------------------------------------------------------------------------------- /hooks/.gitignore: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/basho/riak_function_contrib/bad124ac39c72eb6ed85e6f30cfc779515e4e612/hooks/.gitignore -------------------------------------------------------------------------------- /hooks/post-commit/.gitignore: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/basho/riak_function_contrib/bad124ac39c72eb6ed85e6f30cfc779515e4e612/hooks/post-commit/.gitignore -------------------------------------------------------------------------------- /hooks/pre-commit/.gitignore: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/basho/riak_function_contrib/bad124ac39c72eb6ed85e6f30cfc779515e4e612/hooks/pre-commit/.gitignore -------------------------------------------------------------------------------- /hooks/pre-commit/erlang/uppercase_text.erl: -------------------------------------------------------------------------------- 1 | 2 | %% Author: Hal Eisen (hal.eisen@ask.com) 3 | 4 | %% This file is provided to you under the Apache License, 5 | %% Version 2.0 (the "License"); you may not use this file 6 | %% except in compliance with the License. 
You may obtain 7 | %% a copy of the License at 8 | %% 9 | %% http://www.apache.org/licenses/LICENSE-2.0 10 | %% 11 | %% Unless required by applicable law or agreed to in writing, 12 | %% software distributed under the License is distributed on an 13 | %% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 14 | %% KIND, either express or implied. See the License for the 15 | %% specific language governing permissions and limitations 16 | %% under the License. 17 | 18 | 19 | -module(uppercase_text). 20 | 21 | -export([uppertext/1]). 22 | 23 | uppertext(IncomingObject) -> 24 | BinaryContent = riak_object:get_value(IncomingObject), 25 | StringContent = binary:bin_to_list(BinaryContent), 26 | UpperStringContent = string:to_upper(StringContent), 27 | UpperBinaryContent = binary:list_to_bin(UpperStringContent), 28 | riak_object:apply_updates(riak_object:update_value(IncomingObject, UpperBinaryContent)). 29 | 30 | -------------------------------------------------------------------------------- /hooks/pre-commit/erlang/validate_integrity.erl: -------------------------------------------------------------------------------- 1 | -module(validate_integrity). 2 | 3 | -compile(export_all). 4 | 5 | %%% Verify checksum if present. 6 | %%% Note that the checksum is expected to be base64(md5(Body)). 7 | precommit_verify_cksum({fail,_}=Error) -> Error; 8 | precommit_verify_cksum(Object) -> 9 | case get_usermeta(<<"Content-MD5">>, Object) of 10 | error -> Object; % No checksum. 11 | {ok, ExpectedMD5} -> 12 | Contents = riak_object:get_value(Object), 13 | ActualMD5 = base64:encode(erlang:md5(Contents)), 14 | if ActualMD5 =:= ExpectedMD5 -> 15 | %% io:format("{C}"), 16 | Object; % We allow it! 17 | true -> 18 | fail(precommit_verify_cksum, 19 | "Bad checksum - expected ~s, computed ~s", 20 | [ExpectedMD5, ActualMD5], 21 | Object) 22 | end 23 | end. 24 | 25 | %%% Verify body size if present. 
26 | precommit_verify_size({fail,_}=Error) -> Error; 27 | precommit_verify_size(Object) -> 28 | case get_usermeta(<<"Byte-count">>, Object) of 29 | error -> Object; % No size. 30 | {ok, ExpectedSize} -> 31 | Contents = riak_object:get_value(Object), 32 | ActualSize = list_to_binary(integer_to_list(byte_size(Contents))), 33 | if ActualSize =:= ExpectedSize -> 34 | %% io:format("{S}"), 35 | Object; % We allow it! 36 | true -> 37 | fail(precommit_verify_size,"Bad byte-count - expected ~s, computed ~s", [ExpectedSize, ActualSize], 38 | Object) 39 | end 40 | end. 41 | 42 | %%% Verify that payload is an inflatable (zlib/gzip-compressed) blob. 43 | precommit_verify_compression({fail,_}=Error) -> Error; 44 | precommit_verify_compression(Object) -> 45 | case get_usermeta(<<"Verify-Compression">>, Object) of 46 | error -> Object; % No check. 47 | {ok, Method} -> 48 | Contents = riak_object:get_value(Object), 49 | try uncompress(Method, Contents) of 50 | X when is_binary(X) -> 51 | %% io:format("{D}"), 52 | Object; % Looks OK. 53 | _ -> 54 | fail(precommit_verify_compression, "Verify-Compression failed.", [], 55 | Object) 56 | catch 57 | _:Reason -> 58 | fail(precommit_verify_compression, "Verify-Compression failed: ~p", [Reason], 59 | Object) 60 | end 61 | end. 62 | 63 | 64 | %%%==================== Helpers: 65 | 66 | uncompress(<<"deflate">>, CompData) -> inflate(CompData); 67 | uncompress(<<"gzip">>, CompData) -> zlib:gunzip(CompData); 68 | uncompress(Method, _CompData) -> 69 | error(format("unknown compression method: ~s", [Method])). 70 | 71 | inflate(CompData) -> 72 | Z = zlib:open(), 73 | try 74 | zlib:inflateInit(Z, 15), 75 | Data = zlib:inflate(Z, CompData), 76 | zlib:inflateEnd(Z), 77 | Data 78 | after 79 | zlib:close(Z) 80 | end. 
81 | 82 | 83 | fail(Tag, Format, Args, Object) -> 84 | ErrTxt = format(Format, Args), 85 | error_logger:error_msg("~p: ~s rejected write: ~s\n", [?MODULE, Tag, ErrTxt]), 86 | (Object == undefined) orelse 87 | error_logger:error_msg("~p: Dump of rejected object (~s):\n ~p\n", [?MODULE, ErrTxt, Object]), 88 | {fail, ErrTxt}. 89 | 90 | format(Fmt, Args) -> 91 | lists:flatten(io_lib:format(Fmt, Args)). 92 | 93 | get_usermeta(Key, Object) -> 94 | MetaDict = riak_object:get_metadata(Object), 95 | UserMeta = dict:fetch(<<"X-Riak-Meta">>, MetaDict), 96 | UserMetaLC = [{string:to_lower(K), V} || {K,V} <- UserMeta], 97 | KeyLC = string:to_lower("X-Riak-Meta-" ++ binary_to_list(Key)), 98 | case lists:keyfind(KeyLC, 1, UserMetaLC) of 99 | false -> 100 | error; 101 | {_,Value} -> 102 | {ok, list_to_binary(Value)} 103 | end. 104 | -------------------------------------------------------------------------------- /hooks/pre-commit/erlang/validate_json.erl: -------------------------------------------------------------------------------- 1 | -module(validate_json). 2 | -export([validate/1]). 3 | 4 | validate(Object) -> 5 | try 6 | mochijson2:decode(riak_object:get_value(Object)), 7 | Object 8 | catch 9 | throw:invalid_utf8 -> 10 | {fail, "Invalid JSON: Illegal UTF-8 character"}; 11 | error:Error -> 12 | {fail, "Invalid JSON: " ++ binary_to_list(list_to_binary(io_lib:format("~p", [Error])))} 13 | end. 14 | 15 | -------------------------------------------------------------------------------- /hooks/pre-commit/js/validate_json.js: -------------------------------------------------------------------------------- 1 | // ------------------------------------------------------------------- 2 | // 3 | // 4 | // This file is provided to you under the Apache License, 5 | // Version 2.0 (the "License"); you may not use this file 6 | // except in compliance with the License. 
You may obtain 7 | // a copy of the License at 8 | // 9 | // http://www.apache.org/licenses/LICENSE-2.0 10 | // 11 | // Unless required by applicable law or agreed to in writing, 12 | // software distributed under the License is distributed on an 13 | // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 14 | // KIND, either express or implied. See the License for the 15 | // specific language governing permissions and limitations 16 | // under the License. 17 | // 18 | // ------------------------------------------------------------------- 19 | 20 | // Makes sure the object being inserted is valid JSON 21 | function validateJSON(object){ 22 | 23 | // A delete is a type of put in Riak so check and see what this 24 | // operation is doing and pass over objects being deleted 25 | if (object.values[0]['metadata']['X-Riak-Deleted']){ 26 | return object; 27 | } 28 | 29 | try { 30 | Riak.mapValuesJson(object); 31 | return object; 32 | } catch(e) { 33 | return {"fail":"Object is not JSON"}; 34 | } 35 | } 36 | -------------------------------------------------------------------------------- /mapreduce/.gitignore: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/basho/riak_function_contrib/bad124ac39c72eb6ed85e6f30cfc779515e4e612/mapreduce/.gitignore -------------------------------------------------------------------------------- /mapreduce/erlang/.gitignore: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/basho/riak_function_contrib/bad124ac39c72eb6ed85e6f30cfc779515e4e612/mapreduce/erlang/.gitignore -------------------------------------------------------------------------------- /mapreduce/erlang/delete_keys.erl: -------------------------------------------------------------------------------- 1 | %% ------------------------------------------------------------------- 2 | %% 3 | %% 4 | %% This file is provided to you under the Apache License, 5 | 
%% Version 2.0 (the "License"); you may not use this file 6 | %% except in compliance with the License. You may obtain 7 | %% a copy of the License at 8 | %% 9 | %% http://www.apache.org/licenses/LICENSE-2.0 10 | %% 11 | %% Unless required by applicable law or agreed to in writing, 12 | %% software distributed under the License is distributed on an 13 | %% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 14 | %% KIND, either express or implied. See the License for the 15 | %% specific language governing permissions and limitations 16 | %% under the License. 17 | %% 18 | %% ------------------------------------------------------------------- 19 | 20 | 21 | -module(reduce_functions). 22 | 23 | -export([delete/2]). 24 | 25 | % Data is a list of bucket and key pairs, intermixed with the counts of deleted 26 | % objects. Returns a count of deleted objects. 27 | delete(List, _None) -> 28 | {ok, C} = riak:local_client(), 29 | 30 | Delete = fun(Bucket, Key) -> 31 | case C:delete(Bucket, Key, 0) of 32 | ok -> 1; 33 | _ -> 0 34 | end 35 | end, 36 | 37 | F = fun(Elem, Acc) -> 38 | case Elem of 39 | {{Bucket, Key}, _KeyData} -> 40 | Acc + Delete(Bucket, Key); 41 | {Bucket, Key} -> 42 | Acc + Delete(Bucket, Key); 43 | [Bucket, Key] -> 44 | Acc + Delete(Bucket, Key); 45 | _ -> 46 | Acc + Elem 47 | end 48 | end, 49 | 50 | [lists:foldl(F, 0, List)]. 51 | -------------------------------------------------------------------------------- /mapreduce/erlang/get_keys.erl: -------------------------------------------------------------------------------- 1 | %% ------------------------------------------------------------------- 2 | %% 3 | %% 4 | %% This file is provided to you under the Apache License, 5 | %% Version 2.0 (the "License"); you may not use this file 6 | %% except in compliance with the License. 
You may obtain 7 | %% a copy of the License at 8 | %% 9 | %% http://www.apache.org/licenses/LICENSE-2.0 10 | %% 11 | %% Unless required by applicable law or agreed to in writing, 12 | %% software distributed under the License is distributed on an 13 | %% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 14 | %% KIND, either express or implied. See the License for the 15 | %% specific language governing permissions and limitations 16 | %% under the License. 17 | %% 18 | %% ------------------------------------------------------------------- 19 | 20 | 21 | -module(map_functions). 22 | 23 | -export([get_keys/3]). 24 | 25 | %Returns bucket and key pairs from a map phase 26 | get_keys(Value,_Keydata,_Arg) -> 27 | [[riak_object:bucket(Value),riak_object:key(Value)]]. 28 | -------------------------------------------------------------------------------- /mapreduce/erlang/luwak_mr.erl: -------------------------------------------------------------------------------- 1 | %% ------------------------------------------------------------------- 2 | %% luwak_mr: utilities for map/reducing on Luwak data 3 | %% 4 | %% This file is provided to you under the Apache License, 5 | %% Version 2.0 (the "License"); you may not use this file 6 | %% except in compliance with the License. You may obtain 7 | %% a copy of the License at 8 | %% 9 | %% http://www.apache.org/licenses/LICENSE-2.0 10 | %% 11 | %% Unless required by applicable law or agreed to in writing, 12 | %% software distributed under the License is distributed on an 13 | %% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 14 | %% KIND, either express or implied. See the License for the 15 | %% specific language governing permissions and limitations 16 | %% under the License. 17 | %% 18 | %% ------------------------------------------------------------------- 19 | 20 | %% @doc Tools for map/reducing Luwak data. 
21 | %% 22 | %% The primary tool in this module is a function that conforms to 23 | %% the interface for "dynamic map/reduce inputs." This function 24 | %% will allow you to set up a map/reduce process for running a 25 | %% computation across the blocks of a Luwak file. 26 | %% 27 | %% To use the function via the Erlang client: 28 | %%``` 29 | %% C:mapred({modfun, luwak_mr, file, <<"my_file_name">>}, 30 | %% [... your query ...]). 31 | %%''' 32 | %% Over HTTP, structure your JSON query like: 33 | %%``` 34 | %% {"inputs":{"module":"luwak_mr", 35 | %% "function":"file", 36 | %% "arg":"my_file_name"}, 37 | %% "query":[... your query ...]} 38 | %%''' 39 | %% 40 | %% The luwak_mr:file/3 function will send an input to the 41 | %% map/reduce query for each block in the file. The "KeyData" 42 | %% for the block will be its offset in the file. As a trivial 43 | %% example, you might use this to get an ordered list of the 44 | %% first byte of each block like so: 45 | %%``` 46 | %% F = fun(B, O, _) -> 47 | %% <<Y, _/binary>> = luwak_block:data(B), 48 | %% [{Y, O}] 49 | %% end, 50 | %% {ok, Bytes} = C:mapred({modfun,luwak_mr,file,<<"name">>}, 51 | %% [{map, {qfun, F}, none, true}]), 52 | %% OrderedBytes = lists:keysort(2, Bytes), 53 | %% [ Y || {Y, _} <- OrderedBytes ]. 54 | %%''' 55 | 56 | -module(luwak_mr). 57 | 58 | -export([file/3]). 59 | 60 | -include("luwak.hrl"). 61 | 62 | %% @spec file(pid(), binary(), integer()) -> ok 63 | %% @doc Sends the bucket-keys for the blocks of a Luwak file as 64 | %% map/reduce inputs to the specified FlowPid. 
Use it by 65 | %% specifying the map/reduce input as: 66 | %%``` 67 | %% {modfun, luwak_mr, file, <<"file_name">>} 68 | %%''' 69 | file(FlowPid, Filename, _Timeout) when is_binary(Filename) -> 70 | {ok, Client} = riak:local_client(), 71 | 72 | {ok, File} = luwak_file:get(Client, Filename), 73 | V = riak_object:get_value(File), 74 | {block_size, BlockSize} = lists:keyfind(block_size, 1, V), 75 | 76 | case lists:keyfind(root, 1, V) of 77 | {root, RootKey} -> tree(FlowPid, Client, BlockSize, RootKey, 0); 78 | false -> ok 79 | end, 80 | 81 | luke_flow:finish_inputs(FlowPid). 82 | 83 | %% @spec tree(pid(), riak_client(), integer(), binary(), integer()) 84 | %% -> integer() 85 | %% @doc Recursive tree walker used by file/3. This function assumes 86 | %% that a child link in a tree is a data block if the size it 87 | %% lists is less than or equal to the specified BlockSize, and 88 | %% that it is a subtree if the size is greater than BlockSize. 89 | %% 90 | %% The result is the offset of the byte that would immediately 91 | %% follow all of the bytes in this tree. This fact is unused, 92 | %% but *could* be used for testing an invariant. 93 | tree(FlowPid, Client, BlockSize, Key, Offset) -> 94 | {ok, #n{children=Children}} = luwak_tree:get(Client, Key), 95 | lists:foldl( 96 | fun({SubTree, Size}, SubOffset) when Size > BlockSize -> 97 | tree(FlowPid, Client, BlockSize, SubTree, SubOffset), 98 | SubOffset+Size; 99 | ({Leaf, Size}, LeafOffset) -> 100 | luke_flow:add_inputs( 101 | FlowPid, [{{?N_BUCKET, Leaf}, LeafOffset}]), 102 | LeafOffset+Size 103 | end, 104 | Offset, 105 | Children). 
106 | -------------------------------------------------------------------------------- /mapreduce/erlang/mr_kv_counters.erl: -------------------------------------------------------------------------------- 1 | %% ------------------------------------------------------------------- 2 | %% mr_kv_counters: utilities for map/reducing on KV Counters 3 | %% 4 | %% This file is provided to you under the Apache License, 5 | %% Version 2.0 (the "License"); you may not use this file 6 | %% except in compliance with the License. You may obtain 7 | %% a copy of the License at 8 | %% 9 | %% http://www.apache.org/licenses/LICENSE-2.0 10 | %% 11 | %% Unless required by applicable law or agreed to in writing, 12 | %% software distributed under the License is distributed on an 13 | %% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 14 | %% KIND, either express or implied. See the License for the 15 | %% specific language governing permissions and limitations 16 | %% under the License. 17 | %% 18 | %% ------------------------------------------------------------------- 19 | 20 | -module(mr_kv_counters). 21 | 22 | % Map Functions 23 | -export([value/3]). 24 | 25 | % Reduce Functions 26 | -export([sum/2, maximum/2]). 27 | 28 | 29 | % @doc value/3 takes a Riak Object, and returns a proplist containing only 30 | % its key, and the counter value. Use it as the first Map step of an 31 | % operation. 32 | % 33 | % In the case that none of the object's siblings are counters, the 34 | % count returned will be 0. 35 | % 36 | value(RiakObject, _KeyData, _Arg) -> 37 | Key = riak_object:key(RiakObject), 38 | Count = riak_kv_counter:value(RiakObject), 39 | [ {Key, Count} ]. 40 | 41 | % @doc sum/2 takes a list of either counts or {key, count} pairs, and 42 | % adds up all the counts. 43 | % 44 | % The overall result is the total returned as a list 45 | % containing the pair {<<"total">>, integer total}. 46 | % 47 | % If PairList = [], then the total will be 0. 
48 | % 49 | sum(PairList, _Arg) -> 50 | [ {<<"total">>, lists:foldl(fun add_sum/2, 0, PairList)} ]. 51 | 52 | add_sum({_Key, Count}, Acc) -> 53 | Acc + Count; 54 | add_sum(Count, Acc) when is_integer(Count) -> 55 | Acc + Count. 56 | 57 | % @doc maximum/2 takes a list of {key, count} pairs, and returns 58 | % a list of *all* pairs with the maximum count. 59 | % 60 | % If PairList = [], then the result will be [] 61 | % 62 | maximum([], _Arg) -> 63 | []; 64 | maximum(PairList, _Arg) -> 65 | lists:foldl(fun choose_max/2, [], PairList). 66 | 67 | choose_max({Key, Count}, []) -> 68 | [ {Key, Count} ]; 69 | choose_max({Key, Count}, [{_MaxKey, MaxCount}|_] = Maximums) -> 70 | if 71 | Count > MaxCount -> [ {Key, Count} ]; 72 | Count =:= MaxCount -> [ {Key, Count} | Maximums ]; 73 | true -> Maximums 74 | end. 75 | 76 | -------------------------------------------------------------------------------- /mapreduce/erlang/riak_mapreduce_utils.erl: -------------------------------------------------------------------------------- 1 | %% ------------------------------------------------------------------- 2 | %% 3 | %% riak_mapreduce_utils: Utility functions for defining map/reduce processing. 4 | %% 5 | %% Copyright (c)2012, Christian Dahlqvist, WhiteNode Software Ltd. All Rights Reserved. 6 | %% 7 | %% This file is provided to you under the Apache License, 8 | %% Version 2.0 (the "License"); you may not use this file 9 | %% except in compliance with the License. You may obtain 10 | %% a copy of the License at 11 | %% 12 | %% http://www.apache.org/licenses/LICENSE-2.0 13 | %% 14 | %% Unless required by applicable law or agreed to in writing, 15 | %% software distributed under the License is distributed on an 16 | %% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 17 | %% KIND, either express or implied. See the License for the 18 | %% specific language governing permissions and limitations 19 | %% under the License. 
20 | %% 21 | %% ------------------------------------------------------------------- 22 | 23 | -module(riak_mapreduce_utils). 24 | 25 | -export([map_delete/3, 26 | map_indexinclude/3, 27 | map_indexlink/3, 28 | map_metafilter/3, 29 | map_id/3, 30 | map_key/3, 31 | map_datasize/3 32 | ]). 33 | 34 | %% From riak_pb_kv_codec.hrl 35 | -define(MD_USERMETA, <<"X-Riak-Meta">>). 36 | -define(MD_INDEX, <<"index">>). 37 | 38 | %% 39 | %% Map Phases 40 | %% 41 | 42 | %% @spec map_delete(riak_object:riak_object(), term(), term()) -> 43 | %% [integer()] 44 | %% @doc map phase function for deleting records 45 | map_delete({error, notfound}, _, _) -> 46 | []; 47 | map_delete(RiakObject, Props, Arg) when is_list(Arg) -> 48 | map_delete(RiakObject, Props, list_to_binary(Arg)); 49 | map_delete(RiakObject, Props, Arg) when is_atom(Arg) -> 50 | map_delete(RiakObject, Props, <<"">>); 51 | map_delete(RiakObject, _, Arg) when is_binary(Arg) -> 52 | {ok, C} = riak:local_client(), 53 | Bucket = riak_object:bucket(RiakObject), 54 | Key = riak_object:key(RiakObject), 55 | case Arg of 56 | Bucket -> 57 | C:delete(Bucket, Key), 58 | [1]; 59 | <<"">> -> 60 | C:delete(Bucket, Key), 61 | [1]; 62 | _ -> 63 | [] 64 | end; 65 | map_delete(_, _, _) -> 66 | []. 67 | 68 | %% @spec map_indexinclude(riak_object:riak_object(), term(), term()) -> 69 | %% [{{Bucket :: binary(), Key :: binary()}, Props :: term()}] 70 | %% @doc map phase function for including based on secondary index query in a 71 | %% manner similar to links. 
72 | map_indexinclude({error, notfound}, _, _) -> 73 | []; 74 | map_indexinclude(RiakObject, Props, JsonArg) -> 75 | Bucket = riak_object:bucket(RiakObject), 76 | Key = riak_object:key(RiakObject), 77 | InitialList = [{{Bucket, Key}, Props}], 78 | % Parse config arg 79 | %%Args = decode_arguments(JsonArg), 80 | {struct, Args} = mochijson2:decode(JsonArg), 81 | case proplists:get_value(<<"keep">>, Args) of 82 | <<"false">> -> 83 | Keep = false; 84 | _ -> 85 | Keep = true 86 | end, 87 | case {proplists:get_value(<<"source">>, Args), 88 | proplists:get_value(<<"target">>, Args), 89 | proplists:get_value(<<"indexname">>, Args)} of 90 | {undefined, _, _} -> 91 | return_list(InitialList, [], Keep); 92 | {_, undefined, _} -> 93 | return_list(InitialList, [], Keep); 94 | {_, _, undefined} -> 95 | return_list(InitialList, [], Keep); 96 | {Bucket, Target, IndexName} -> 97 | Result = get_index_items(Target, Props, IndexName, Key), 98 | return_list(InitialList, Result, Keep); 99 | _ -> 100 | return_list(InitialList, [], Keep) 101 | end. 
102 | 103 | %% @spec map_indexlink(riak_object:riak_object(), term(), term()) -> 104 | %% [{{Bucket :: binary(), Key :: binary()}, Props :: term()}] 105 | %% @doc map phase function for inclusion based on local secondary index value 106 | map_indexlink({error, notfound}, _, _) -> 107 | []; 108 | map_indexlink(RiakObject, Props, JsonArg) -> 109 | Bucket = riak_object:bucket(RiakObject), 110 | Key = riak_object:key(RiakObject), 111 | InitialList = [{{Bucket, Key}, Props}], 112 | {struct, Args} = mochijson2:decode(JsonArg), 113 | case proplists:get_value(<<"keep">>, Args) of 114 | <<"false">> -> 115 | Keep = false; 116 | _ -> 117 | Keep = true 118 | end, 119 | case {proplists:get_value(<<"source">>, Args), 120 | proplists:get_value(<<"target">>, Args), 121 | proplists:get_value(<<"indexname">>, Args)} of 122 | {undefined, _, _} -> 123 | return_list(InitialList, [], Keep); 124 | {_, undefined, _} -> 125 | return_list(InitialList, [], Keep); 126 | {_, _, undefined} -> 127 | return_list(InitialList, [], Keep); 128 | {Bucket, Target, IndexName} -> 129 | Result = create_indexlink_list(RiakObject, Props, IndexName, Target), 130 | return_list(InitialList, Result, Keep); 131 | _ -> 132 | return_list(InitialList, [], Keep) 133 | end. 
134 | 135 | %% @spec map_metafilter(riak_object:riak_object(), term(), term()) -> 136 | %% [{{Bucket :: binary(), Key :: binary()}, Props :: term()}] 137 | %% @doc map phase function for selectively discarding records from the current set 138 | map_metafilter({error, notfound}, _, _) -> 139 | []; 140 | map_metafilter(RiakObject, Props, JsonArg) -> 141 | Bucket = riak_object:bucket(RiakObject), 142 | Key = riak_object:key(RiakObject), 143 | MetaDataList = riak_object:get_metadatas(RiakObject), 144 | InitialList = [{{Bucket, Key}, Props}], 145 | {struct, Args} = mochijson2:decode(JsonArg), 146 | case {proplists:get_value(<<"source">>, Args), 147 | proplists:get_value(<<"criteria">>, Args)} of 148 | {Bucket, undefined} -> 149 | Result = true; 150 | {Bucket, []} -> 151 | Result = true; 152 | {Bucket, Criteria} when is_list(Criteria)-> 153 | Result = check_criteria(MetaDataList, Criteria); 154 | {undefined, Criteria} when is_list(Criteria) -> 155 | Result = check_criteria(MetaDataList, Criteria); 156 | _ -> 157 | Result = false 158 | end, 159 | case Result of 160 | true -> 161 | []; 162 | _ -> 163 | InitialList 164 | end. 165 | 166 | %% @spec map_id(riak_object:riak_object(), term(), term()) -> 167 | %% [[Bucket :: binary(), Key :: binary()]] 168 | %% @doc map phase function returning bucket name and key in a readable format 169 | map_id({error, notfound}, _, _) -> 170 | []; 171 | map_id(RiakObject, Props, Arg) when is_list(Arg) -> 172 | map_id(RiakObject, Props, list_to_binary(Arg)); 173 | map_id(RiakObject, Props, Arg) when is_atom(Arg) -> 174 | map_id(RiakObject, Props, <<"">>); 175 | map_id(RiakObject, _, Arg) when is_binary(Arg) -> 176 | Bucket = riak_object:bucket(RiakObject), 177 | Key = riak_object:key(RiakObject), 178 | case Arg of 179 | Bucket -> 180 | [[Bucket, Key]]; 181 | <<"">> -> 182 | [[Bucket, Key]]; 183 | _ -> 184 | [] 185 | end; 186 | map_id(_, _, _) -> 187 | []. 
188 | 189 | %% @spec map_key(riak_object:riak_object(), term(), term()) -> 190 | %% [Key :: binary()] 191 | %% @doc map phase function returning object key in a readable format 192 | map_key({error, notfound}, _, _) -> 193 | []; 194 | map_key(RiakObject, Props, Arg) when is_list(Arg) -> 195 | map_key(RiakObject, Props, list_to_binary(Arg)); 196 | map_key(RiakObject, Props, Arg) when is_atom(Arg) -> 197 | map_key(RiakObject, Props, <<"">>); 198 | map_key(RiakObject, _, Arg) when is_binary(Arg) -> 199 | Bucket = riak_object:bucket(RiakObject), 200 | Key = riak_object:key(RiakObject), 201 | case Arg of 202 | Bucket -> 203 | [Key]; 204 | <<"">> -> 205 | [Key]; 206 | _ -> 207 | [] 208 | end; 209 | map_key(_, _, _) -> 210 | []. 211 | 212 | %% @spec map_datasize(riak_object:riak_object(), term(), term()) -> 213 | %% [integer()] 214 | %% @doc map phase function returning size of the data stored in bytes. 215 | %% It does return total size if siblings are found. 216 | map_datasize({error, notfound}, _, _) -> 217 | []; 218 | map_datasize(RiakObject, _, _) -> 219 | DataSize = lists:foldl(fun(V, A) -> 220 | (byte_size(V) + A) 221 | end, 0, riak_object:get_values(RiakObject)), 222 | [DataSize]. 223 | 224 | %% hidden 225 | get_index_items(Bucket, Props, IndexName, Value) -> 226 | {ok, C} = riak:local_client(), 227 | case C:get_index(Bucket, {eq, IndexName, Value}) of 228 | {ok, KeyList} -> 229 | [{{Bucket, K}, Props} || K <- KeyList]; 230 | {error, _} -> 231 | [] 232 | end. 233 | 234 | %% hidden 235 | create_indexlink_list(RiakObject, Props, IndexName, Target) -> 236 | DictList = riak_object:get_metadatas(RiakObject), 237 | Result = create_indexlink_list(DictList, Props, IndexName, Target, []), 238 | sets:to_list(sets:from_list(Result)). 
239 | 240 | %% hidden 241 | create_indexlink_list([], _Props, _IndexName, _Target, List) -> 242 | sets:to_list(sets:from_list(List)); 243 | create_indexlink_list([Dict | DictList], Props, IndexName, Target, List) -> 244 | case dict:find(?MD_INDEX, Dict) of 245 | error -> 246 | create_indexlink_list(DictList, Props, IndexName, Target, List); 247 | {ok, IndexList} -> 248 | case [I || {K, I} <- IndexList, K == IndexName] of 249 | [] -> 250 | create_indexlink_list(DictList, Props, IndexName, Target, List); 251 | [Indexes] when is_list(Indexes) -> 252 | Result = [{{Target, V}, Props} || V <- Indexes], 253 | ResList = lists:append(Result, List), 254 | create_indexlink_list(DictList, Props, IndexName, Target, ResList); 255 | [Index] -> 256 | ResList = lists:append([{{Target, Index}, Props}], List), 257 | create_indexlink_list(DictList, Props, IndexName, Target, ResList) 258 | end 259 | end. 260 | 261 | %% hidden 262 | return_list(Original, Result, true) -> 263 | lists:append([Original, Result]); 264 | return_list(_, Result, _) -> 265 | Result. 266 | 267 | %% hidden 268 | check_criteria(MetaDataList, Criteria) when is_list(Criteria) -> 269 | case parse_criteria(Criteria, []) of 270 | error -> 271 | false; 272 | CList -> 273 | check_parsed_criteria(MetaDataList, CList) 274 | end; 275 | check_criteria(_, _) -> 276 | error. 
277 | 278 | %% hidden 279 | parse_criteria([], CList) -> 280 | CList; 281 | parse_criteria([C | R], CList) -> 282 | case C of 283 | [Op, Field, Val] -> 284 | case {Op, parse_field(Field)} of 285 | {_, error} -> error; 286 | {<<"eq">>, F} -> parse_criteria(R, lists:append([{eq, F, Val}], CList)); 287 | {<<"neq">>, F} -> parse_criteria(R, lists:append([{neq, F, Val}], CList)); 288 | {<<"greater_than">>, F} -> parse_criteria(R, lists:append([{greater_than, F, Val}], CList)); 289 | {<<"greater_than_eq">>, F} -> parse_criteria(R, lists:append([{greater_than_eq, F, Val}], CList)); 290 | {<<"less_than">>, F} -> parse_criteria(R, lists:append([{less_than, F, Val}], CList)); 291 | {<<"less_than_eq">>, F} -> parse_criteria(R, lists:append([{less_than_eq, F, Val}], CList)); 292 | _ -> error 293 | end; 294 | _ -> error 295 | end. 296 | 297 | %% hidden 298 | parse_field(Field) when is_list(Field) -> 299 | parse_field(list_to_binary(Field)); 300 | parse_field(Field) when is_binary(Field) -> 301 | case Field of 302 | <<"meta:", Val/binary>> -> {meta, Val}; 303 | <<"index:", Val/binary>> -> {index, Val}; 304 | _ -> error 305 | end; 306 | parse_field(_) -> 307 | error. 308 | 309 | %% hidden 310 | check_parsed_criteria([], _CList) -> 311 | false; 312 | check_parsed_criteria([MetaData | Rest], CList) -> 313 | case evaluate_criteria(MetaData, CList) of 314 | true -> true; 315 | _ -> check_parsed_criteria(Rest, CList) 316 | end. 317 | 318 | %% hidden 319 | evaluate_criteria(_MetaData, []) -> 320 | true; 321 | evaluate_criteria(MetaData, [{Op, {Type, F}, V} | List]) -> 322 | case get_metadata_value(MetaData, Type, F) of 323 | undefined -> 324 | false; 325 | Value -> 326 | case check_value(Op, Value, V) of 327 | true -> 328 | evaluate_criteria(MetaData, List); 329 | _ -> 330 | false 331 | end 332 | end. 
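
%% Example (an illustrative sketch, traced by hand from the clauses above):
%% how a decoded "criteria" list is turned into internal tuples. The field
%% names are hypothetical. parse_criteria/2 is internal to this module, so the
%% call below is shown only to illustrate the data shapes.
%%
%%   parse_criteria([[<<"eq">>, <<"meta:X-Riak-Meta-Status">>, <<"inactive">>],
%%                   [<<"less_than">>, <<"index:age_int">>, 30]], []).
%%   %% gives [{less_than, {index, <<"age_int">>}, 30},
%%   %%        {eq, {meta, <<"X-Riak-Meta-Status">>}, <<"inactive">>}]
%%
%% Each new tuple is prepended, so the parsed list is in reverse order. That
%% does not affect the outcome, because evaluate_criteria/2 requires every
%% criterion to hold. A field without a "meta:" or "index:" prefix, or an
%% unknown operator, makes the whole parse return error.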
get_metadata_value(MetaData, Type, MetaName) ->
    case Type of
        meta ->
            MetaKey = ?MD_USERMETA,
            MN = binary_to_list(MetaName);
        index ->
            MetaKey = ?MD_INDEX,
            MN = MetaName
    end,
    case dict:find(MetaKey, MetaData) of
        {ok, Value} ->
            case [V || {K, V} <- Value, K == MN] of
                [] -> undefined;
                [V] when is_list(V) -> list_to_binary(V);
                [V] -> V
            end;
        error -> undefined
    end.

%% hidden
check_value(Op, Value, Param) when is_integer(Value) andalso is_binary(Param) ->
    try list_to_integer(binary_to_list(Param)) of
        Integer -> check_value(Op, Value, Integer)
    catch
        _:_ ->
            %% Param is not numeric, so compare both sides as binaries.
            %% The guard guarantees Param is already a binary; passing it
            %% through list_to_binary/1 would raise badarg.
            BValue = list_to_binary(integer_to_list(Value)),
            check_value(Op, BValue, Param)
    end;
check_value(Op, Value, Param) when is_binary(Value) andalso is_integer(Param) ->
    Pbin = list_to_binary(integer_to_list(Param)),
    check_value(Op, Value, Pbin);
check_value(eq, Value, Param) ->
    Value == Param;
check_value(neq, Value, Param) ->
    Value =/= Param;
check_value(greater_than, Value, Param) ->
    Value > Param;
check_value(greater_than_eq, Value, Param) ->
    Value >= Param;
check_value(less_than, Value, Param) ->
    Value < Param;
check_value(less_than_eq, Value, Param) ->
    Value =< Param.

--------------------------------------------------------------------------------
/mapreduce/erlang/save_reduce.erl:
--------------------------------------------------------------------------------
%% -------------------------------------------------------------------
%%
%%
%% This file is provided to you under the Apache License,
%% Version 2.0 (the "License"); you may not use this file
%% except in compliance with the License.
%% You may obtain
%% a copy of the License at
%%
%%   http://www.apache.org/licenses/LICENSE-2.0
%%
%% Unless required by applicable law or agreed to in writing,
%% software distributed under the License is distributed on an
%% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
%% KIND, either express or implied.  See the License for the
%% specific language governing permissions and limitations
%% under the License.
%%
%% -------------------------------------------------------------------


%% Note: the module name differs from the file name (save_reduce.erl).
%% Save this file as reduce_functions.erl, or rename the module, before
%% compiling and loading it.
-module(reduce_functions).

%% Function assumes JSON values
-export([save_reduce/2]).

%% Arg is a [bucket, key] combination
save_reduce([Data | _], [Bucket, Key]) ->
    {ok, C} = riak:local_client(),
    Json = iolist_to_binary(mochijson2:encode(Data)),
    Object = riak_object:new(Bucket, Key, Json, "application/json"),
    C:put(Object, 1),
    [];
save_reduce(_, _) ->
    [].

--------------------------------------------------------------------------------
/mapreduce/js/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/basho/riak_function_contrib/bad124ac39c72eb6ed85e6f30cfc779515e4e612/mapreduce/js/.gitignore
--------------------------------------------------------------------------------
/mapreduce/js/count_keys.js:
--------------------------------------------------------------------------------
// -------------------------------------------------------------------
//
//
// This file is provided to you under the Apache License,
// Version 2.0 (the "License"); you may not use this file
// except in compliance with the License.
You may obtain 7 | // a copy of the License at 8 | // 9 | // http://www.apache.org/licenses/LICENSE-2.0 10 | // 11 | // Unless required by applicable law or agreed to in writing, 12 | // software distributed under the License is distributed on an 13 | // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 14 | // KIND, either express or implied. See the License for the 15 | // specific language governing permissions and limitations 16 | // under the License. 17 | // 18 | // ------------------------------------------------------------------- 19 | // 20 | 21 | //This is the map function to the count map reduce functions 22 | 23 | function mapCount() { 24 | return [1] 25 | } 26 | 27 | //The Riak built in function Riak.reduceSum can be used for the reduce phase -------------------------------------------------------------------------------- /mapreduce/js/get_keys.js: -------------------------------------------------------------------------------- 1 | // ------------------------------------------------------------------- 2 | // 3 | // 4 | // This file is provided to you under the Apache License, 5 | // Version 2.0 (the "License"); you may not use this file 6 | // except in compliance with the License. You may obtain 7 | // a copy of the License at 8 | // 9 | // http://www.apache.org/licenses/LICENSE-2.0 10 | // 11 | // Unless required by applicable law or agreed to in writing, 12 | // software distributed under the License is distributed on an 13 | // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 14 | // KIND, either express or implied. See the License for the 15 | // specific language governing permissions and limitations 16 | // under the License. 
//
// -------------------------------------------------------------------


function get_keys(object, keyData, arg){
    return [[object.bucket, object.key]];
}
--------------------------------------------------------------------------------
/mapreduce/js/iso8601.js:
--------------------------------------------------------------------------------
// Written by Paul Sowden, 2005
// http://delete.me.uk/2005/03/iso8601.html
// Released under the Academic Free License

Date.prototype.setISO8601 = function (string) {
    // The dot before the fractional seconds is double-escaped so that it
    // reaches the RegExp as a literal "." instead of "any character".
    var regexp = "([0-9]{4})(-([0-9]{2})(-([0-9]{2})" +
        "(T([0-9]{2}):([0-9]{2})(:([0-9]{2})(\\.([0-9]+))?)?" +
        "(Z|(([-+])([0-9]{2}):([0-9]{2})))?)?)?)?";
    var d = string.match(new RegExp(regexp));

    var offset = 0;
    var date = new Date(d[1], 0, 1);

    if (d[3]) { date.setMonth(d[3] - 1); }
    if (d[5]) { date.setDate(d[5]); }
    if (d[7]) { date.setHours(d[7]); }
    if (d[8]) { date.setMinutes(d[8]); }
    if (d[10]) { date.setSeconds(d[10]); }
    if (d[12]) { date.setMilliseconds(Number("0." + d[12]) * 1000); }
    if (d[14]) {
        offset = (Number(d[16]) * 60) + Number(d[17]);
        offset *= ((d[15] == '-') ? 1 : -1);
    }

    offset -= date.getTimezoneOffset();
    // Declared with var so it does not leak into the global scope.
    var time = (Number(date) + (offset * 60 * 1000));
    this.setTime(Number(time));
}

Date.iso8601 = function (string) {
    // Declared with var so it does not leak into the global scope.
    var d = new Date;
    d.setISO8601(string);
    return d;
}

Date.prototype.iso8601 = function (format, offset) {
    /* accepted values for the format [1-6]:
     1 Year:
       YYYY (eg 1997)
     2 Year and month:
       YYYY-MM (eg 1997-07)
     3 Complete date:
       YYYY-MM-DD (eg 1997-07-16)
     4 Complete date plus hours and minutes:
       YYYY-MM-DDThh:mmTZD (eg 1997-07-16T19:20+01:00)
     5 Complete date plus hours, minutes and seconds:
       YYYY-MM-DDThh:mm:ssTZD (eg 1997-07-16T19:20:30+01:00)
     6 Complete date plus hours, minutes, seconds and a decimal
       fraction of a second
       YYYY-MM-DDThh:mm:ss.sTZD (eg 1997-07-16T19:20:30.45+01:00)
    */
    if (!format) { var format = 6; }
    if (!offset) {
        var offset = 'Z';
        var date = this;
    } else {
        var d = offset.match(/([-+])([0-9]{2}):([0-9]{2})/);
        var offsetnum = (Number(d[2]) * 60) + Number(d[3]);
        offsetnum *= ((d[1] == '-') ? -1 : 1);
        var date = new Date(Number(Number(this) + (offsetnum * 60000)));
    }

    var zeropad = function (num) { return ((num < 10) ? '0' : '') + num; }

    var str = "";
    str += date.getUTCFullYear();
    if (format > 1) { str += "-" + zeropad(date.getUTCMonth() + 1); }
    if (format > 2) { str += "-" + zeropad(date.getUTCDate()); }
    if (format > 3) {
        str += "T" + zeropad(date.getUTCHours()) +
               ":" + zeropad(date.getUTCMinutes());
    }
    if (format > 5) {
        var secs = Number(date.getUTCSeconds() + "." +
                   ((date.getUTCMilliseconds() < 100) ? '0' : '') +
                   zeropad(date.getUTCMilliseconds()));
        str += ":" + zeropad(secs);
    } else if (format > 4) { str += ":" + zeropad(date.getUTCSeconds()); }

    if (format > 3) { str += offset; }
    return str;
}

--------------------------------------------------------------------------------
/mapreduce/js/regex_key_match.js:
--------------------------------------------------------------------------------
// -------------------------------------------------------------------
//
//
// This file is provided to you under the Apache License,
// Version 2.0 (the "License"); you may not use this file
// except in compliance with the License.  You may obtain
// a copy of the License at
//
//   http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied.  See the License for the
// specific language governing permissions and limitations
// under the License.
//
// -------------------------------------------------------------------


// arg is a regular expression.
// Riak calls map functions with (value, keyData, arg), so keyData must be
// declared for arg to receive the phase argument.
function keyMatch(value, keyData, arg){
    if (value.values[0].data.match(arg)){
        return [value.key];
    }else{
        return [];
    }
}
--------------------------------------------------------------------------------
/mapreduce/js/slenderize.js:
--------------------------------------------------------------------------------
// -------------------------------------------------------------------
//
//
// This file is provided to you under the Apache License,
// Version 2.0 (the "License"); you may not use this file
// except in compliance with the License.
You may obtain 7 | // a copy of the License at 8 | // 9 | // http://www.apache.org/licenses/LICENSE-2.0 10 | // 11 | // Unless required by applicable law or agreed to in writing, 12 | // software distributed under the License is distributed on an 13 | // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 14 | // KIND, either express or implied. See the License for the 15 | // specific language governing permissions and limitations 16 | // under the License. 17 | // 18 | // ------------------------------------------------------------------- 19 | 20 | var slenderize = function (v, kd, arg) { 21 | 22 | // we obviously assume it's a JSON document 23 | v = Riak.mapValuesJson(v)[0]; 24 | 25 | // arg must be an Array 26 | if (arg instanceof Array) { 27 | arg.forEach(function(prop) { 28 | delete v[prop]; 29 | }); 30 | } else { 31 | throw new Error("The provided argument must be an Array"); 32 | } 33 | 34 | return [v]; 35 | 36 | } -------------------------------------------------------------------------------- /mapreduce/js/sorting-by-field.js: -------------------------------------------------------------------------------- 1 | /* ------------------------------------------------------------------- 2 | This file is provided to you under the Apache License, 3 | Version 2.0 (the "License"); you may not use this file 4 | except in compliance with the License. You may obtain 5 | a copy of the License at 6 | 7 | http://www.apache.org/licenses/LICENSE-2.0 8 | 9 | Unless required by applicable law or agreed to in writing, 10 | software distributed under the License is distributed on an 11 | "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 12 | KIND, either express or implied. See the License for the 13 | specific language governing permissions and limitations 14 | under the License. 

-------------------------------------------------------------------*/


// This function was originally written in CoffeeScript (http://jashkenas.github.com/coffee-script/), which compiles down to JavaScript.
// The CoffeeScript code looks like this:

/* sort = (values, arg) ->
     field = arg?.by
     reverse = arg?.order is 'desc'
     values.sort (a, b) ->
       if reverse then [a,b] = [b,a]
       if a?[field] < b?[field] then -1
       else if a?[field] is b?[field] then 0
       else if a?[field] > b?[field] then 1 */

// The code below is what is generated after running it through the compiler.
// The sorted array is returned explicitly, matching CoffeeScript's implicit
// return of the last expression.


var sort = function(values, arg) {
    var field = (typeof arg === "undefined" || arg === null) ? undefined : arg.by;
    var reverse = ((typeof arg === "undefined" || arg === null) ?
        undefined : arg.order) === 'desc';
    return values.sort(function(a, b) {
        if (reverse) {
            var _ref = [b, a];
            a = _ref[0];
            b = _ref[1];
        }
        if (((typeof a === "undefined" || a === null) ? undefined :
            a[field]) < ((typeof b === "undefined" || b === null) ? undefined :
            b[field])) {
            return -1;
        } else if (((typeof a === "undefined" || a === null) ? undefined :
            a[field]) === ((typeof b === "undefined" || b === null) ? undefined :
            b[field])) {
            return 0;
        } else if (((typeof a === "undefined" || a === null) ? undefined :
            a[field]) > ((typeof b === "undefined" || b === null) ? undefined :
            b[field])) {
            return 1;
        }
    });
};
--------------------------------------------------------------------------------
/mapreduce/js/stats.js:
--------------------------------------------------------------------------------
/* -------------------------------------------------------------------
Copyright 2010 Mozilla Foundation

This file is provided to you under the Apache License,
Version 2.0 (the "License"); you may not use this file
except in compliance with the License.  You may obtain
a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License.

Contributor(s):
  Daniel Einspanjer

-------------------------------------------------------------------*/

// Function object that contains the count, sum,
// minimum, percentiles, maximum, mean, variance, and
// standard deviation of the series of numbers stored
// in the specified array. The array is sorted in place.
var Stats = function(data) {
    var result = {};

    data.sort(function(a,b){return a-b;});
    result.count = data.length;

    // Since the data is sorted, the minimum value
    // is at the beginning of the array, the median
    // value is in the middle of the array, and the
    // maximum value is at the end of the array.
    result.min = data[0];
    result.max = data[data.length - 1];

    var ntileFunc = function(percentile){
        if (data.length == 1) return data[0];
        var ntileRank = ((percentile/100) * (data.length - 1)) + 1;
        var integralRank = Math.floor(ntileRank);
        var fractionalRank = ntileRank - integralRank;
        var lowerValue = data[integralRank-1];
        var upperValue = data[integralRank];
        return (fractionalRank * (upperValue - lowerValue)) + lowerValue;
    }

    result.percentile25 = ntileFunc(25);
    result.median = ntileFunc(50);
    result.percentile75 = ntileFunc(75);
    result.percentile99 = ntileFunc(99);

    // Compute the mean and variance using a
    // numerically stable algorithm.
    var sqsum = 0;
    result.mean = data[0];
    result.sum = result.mean * result.count;
    for (var i = 1; i < data.length; ++i) {
        var x = data[i];
        var delta = x - result.mean;
        var sweep = i + 1.0;
        result.mean += delta / sweep;
        sqsum += delta * delta * (i / sweep);
        result.sum += x;
    }
    result.variance = sqsum / result.count;
    result.sdev = Math.sqrt(result.variance);


    return result;
}

--------------------------------------------------------------------------------
/other/erlang/bucket_exporter.erl:
--------------------------------------------------------------------------------
%% -------------------------------------------------------------------
%%
%%
%% This file is provided to you under the Apache License,
%% Version 2.0 (the "License"); you may not use this file
%% except in compliance with the License.
You may obtain 7 | %% a copy of the License at 8 | %% 9 | %% http://www.apache.org/licenses/LICENSE-2.0 10 | %% 11 | %% Unless required by applicable law or agreed to in writing, 12 | %% software distributed under the License is distributed on an 13 | %% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 14 | %% KIND, either express or implied. See the License for the 15 | %% specific language governing permissions and limitations 16 | %% under the License. 17 | %% 18 | %% ------------------------------------------------------------------- 19 | 20 | -module(bucket_exporter). 21 | 22 | -export([export_data/4, 23 | export_data/5]). 24 | 25 | export_data(FromServer, Bucket, Extension, Directory) -> 26 | export_data(FromServer, Bucket, Extension, Directory, 1.0). 27 | 28 | export_data(FromServer, Bucket, Extension, Directory, InputSize) -> 29 | {ok, CFrom} = riak:client_connect(FromServer), 30 | {ok, Keys0} = CFrom:list_keys(Bucket), 31 | Keys = truncate_keys(Keys0, InputSize), 32 | io:format("Got ~p keys~n", [length(Keys)]), 33 | export_data(CFrom, Bucket, Extension, Directory, Keys, 0), 34 | io:format("Data export complete~n"). 35 | 36 | export_data(_CFrom, _Bucket, _Extension, _Directory, [], _) -> 37 | io:format("~n"), 38 | ok; 39 | export_data(CFrom, Bucket, Extension, Directory0, [H|T], Count) when is_binary(H) -> 40 | Owner = self(), 41 | proc_lib:spawn(fun() -> 42 | case CFrom:get(Bucket, H) of 43 | {ok, FromObj} -> 44 | Directory = munge_directory(Directory0, binary_to_list(H)), 45 | FileName = binary_to_list(H) ++ "." ++ Extension, 46 | Path = filename:join([Directory, FileName]), 47 | filelib:ensure_dir(Path), 48 | Obj = riak_object:get_value(FromObj), 49 | ok = file:write_file(Path, Obj), 50 | Owner ! done; 51 | _Error -> 52 | Owner ! done end end), 53 | NewCount = if 54 | Count == 250 -> 55 | let_workers_catch_up(Count), 56 | 0; 57 | true -> 58 | Count + 1 59 | end, 60 | export_data(CFrom, Bucket, Extension, Directory0, T, NewCount). 
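
%% Example (an illustrative sketch, not part of the original source):
%% exporting roughly half of the keys in a bucket from an Erlang shell that
%% can reach the cluster. The node name, bucket, extension, and directory are
%% hypothetical.
%%
%%   bucket_exporter:export_data('riak@127.0.0.1', <<"images">>, "png",
%%                               "/tmp/export", 0.5).
%%
%% Following munge_directory/2, a key such as <<"abcdef">> is written to
%% /tmp/export/a/b/c/abcdef.png. Keys shorter than three characters do not
%% match munge_directory/2, so the worker for such a key exits without
%% writing a file.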
61 | 62 | let_workers_catch_up(0) -> 63 | ok; 64 | let_workers_catch_up(Count) -> 65 | receive 66 | done -> 67 | ok 68 | end, 69 | let_workers_catch_up(Count - 1). 70 | 71 | munge_directory(Directory0, [C1, C2, C3|_]) -> 72 | Directory0 ++ [$/,C1,$/,C2,$/,C3]. 73 | 74 | truncate_keys(Keys, 1.0) -> 75 | Keys; 76 | truncate_keys(Keys, InputSize) -> 77 | TargetSize = erlang:round(length(Keys) * InputSize), 78 | {Keys1, _} = lists:split(TargetSize, Keys), 79 | Keys1. -------------------------------------------------------------------------------- /other/erlang/bucket_importer.erl: -------------------------------------------------------------------------------- 1 | %% ------------------------------------------------------------------- 2 | %% 3 | %% This file is provided to you under the Apache License, 4 | %% Version 2.0 (the "License"); you may not use this file 5 | %% except in compliance with the License. You may obtain 6 | %% a copy of the License at 7 | %% 8 | %% http://www.apache.org/licenses/LICENSE-2.0 9 | %% 10 | %% Unless required by applicable law or agreed to in writing, 11 | %% software distributed under the License is distributed on an 12 | %% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 13 | %% KIND, either express or implied. See the License for the 14 | %% specific language governing permissions and limitations 15 | %% under the License. 16 | %% 17 | %% ------------------------------------------------------------------- 18 | 19 | -module(bucket_importer). 20 | 21 | -export([import_data/4]). 
22 | 23 | 24 | import_data(ToServer, Bucket, Directory, ContentType) when is_list(Bucket) -> 25 | import_data(ToServer, list_to_binary(Bucket), Directory, ContentType); 26 | import_data(ToServer, Bucket, Directory, ContentType) -> 27 | {ok, Client} = riak:client_connect(ToServer), 28 | {ok, StripExtensionRe} = re:compile("\\.[a-z0-9]+$", [caseless]), 29 | DirectoryLen = length(Directory), 30 | 31 | F = fun(Filename_, Acc0_) -> 32 | case file:read_file(Filename_) of 33 | {ok, Data} -> 34 | FilenameRel = lists:nthtail(DirectoryLen, Filename_), 35 | KeyBase = unmunge_directory(FilenameRel), 36 | Key = re:replace(KeyBase, StripExtensionRe, "", [{return,binary}]), 37 | Object = riak_object:new(Bucket, Key, Data, ContentType), 38 | Client:put(Object, 1), 39 | io:format("."); 40 | 41 | {error, Reason} -> 42 | io:format("Error reading ~p:~p~n", [Filename_, Reason]) 43 | end, 44 | Acc0_ 45 | end, 46 | [] = filelib:fold_files(Directory, ".*", true, F, []), 47 | ok. 48 | 49 | 50 | unmunge_directory([$/ | Rest]) -> 51 | unmunge_directory(Rest); 52 | unmunge_directory([C1,$/,C2,$/,C3,$/ | [C1,C2,C3 | _] = Rest]) -> 53 | Rest. 54 | 55 | -------------------------------------------------------------------------------- /other/erlang/bucket_inspector.erl: -------------------------------------------------------------------------------- 1 | %% ------------------------------------------------------------------- 2 | %% 3 | %% 4 | %% This file is provided to you under the Apache License, 5 | %% Version 2.0 (the "License"); you may not use this file 6 | %% except in compliance with the License. You may obtain 7 | %% a copy of the License at 8 | %% 9 | %% http://www.apache.org/licenses/LICENSE-2.0 10 | %% 11 | %% Unless required by applicable law or agreed to in writing, 12 | %% software distributed under the License is distributed on an 13 | %% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 14 | %% KIND, either express or implied. 
See the License for the 15 | %% specific language governing permissions and limitations 16 | %% under the License. 17 | %% 18 | %% ------------------------------------------------------------------- 19 | 20 | -module(bucket_inspector). 21 | 22 | -export([inspect/2]). 23 | 24 | inspect(Bucket, Server) -> 25 | {ok, C} = riak:client_connect(Server), 26 | {ok, Keys} = C:list_keys(Bucket), 27 | inspect_objects(Bucket, Keys, C). 28 | 29 | inspect_objects(_Bucket, [], _Client) -> 30 | ok; 31 | inspect_objects(Bucket, [H|T], Client) -> 32 | Client:get(Bucket, H), 33 | inspect_objects(Bucket, T, Client). -------------------------------------------------------------------------------- /other/erlang/bucket_reloader.erl: -------------------------------------------------------------------------------- 1 | -module(bucket_reloader). 2 | %% ------------------------------------------------------------------- 3 | %% 4 | %% 5 | %% This file is provided to you under the Apache License, 6 | %% Version 2.0 (the "License"); you may not use this file 7 | %% except in compliance with the License. You may obtain 8 | %% a copy of the License at 9 | %% 10 | %% http://www.apache.org/licenses/LICENSE-2.0 11 | %% 12 | %% Unless required by applicable law or agreed to in writing, 13 | %% software distributed under the License is distributed on an 14 | %% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 15 | %% KIND, either express or implied. See the License for the 16 | %% specific language governing permissions and limitations 17 | %% under the License. 18 | %% 19 | %% ------------------------------------------------------------------- 20 | 21 | -export([reload/4, 22 | reload/5]). 23 | 24 | reload(FromServer, ToServer, Bucket, NewBucket) -> 25 | reload(FromServer, ToServer, Bucket, NewBucket, 1.0). 

reload(FromServer, ToServer, Bucket, NewBucket, InputSize) ->
    {ok, CFrom} = riak:client_connect(FromServer),
    {ok, CTo} = riak:client_connect(ToServer),
    {ok, Keys0} = CFrom:list_keys(Bucket),
    Keys = truncate_keys(Keys0, InputSize),
    io:format("Transferring ~p keys~n", [length(Keys)]),
    transfer(CFrom, CTo, Bucket, NewBucket, Keys, 0).

transfer(_CFrom, _CTo, _Bucket, _NewBucket, [], _) ->
    io:format("~n"),
    ok;
transfer(CFrom, CTo, Bucket, NewBucket, [H|T], Count) when is_binary(H) ->
    Owner = self(),
    proc_lib:spawn(fun() ->
        case CFrom:get(Bucket, H) of
            {ok, FromObj} ->
                OldObj = riak_object:get_value(FromObj),
                OldKey = riak_object:key(FromObj),
                %% Carry over the stored content type. The original code
                %% passed the object key here by mistake. The fallback
                %% value is an assumption for objects without one.
                MetaData = riak_object:get_metadata(FromObj),
                OldContentType =
                    case dict:find(<<"content-type">>, MetaData) of
                        {ok, CType} -> CType;
                        error -> "application/octet-stream"
                    end,
                Object = riak_object:new(NewBucket, OldKey, OldObj, OldContentType),
                CTo:put(Object, 1),
                io:format("."),
                Owner ! done;
            Error ->
                error_logger:error_msg("Error fetching ~p/~p: ~p~n", [Bucket, H, Error]),
                Owner ! done
        end end),
    NewCount = if
        Count == 250 ->
            let_workers_catch_up(Count),
            0;
        true ->
            Count + 1
    end,
    transfer(CFrom, CTo, Bucket, NewBucket, T, NewCount).

let_workers_catch_up(0) ->
    ok;
let_workers_catch_up(Count) ->
    receive
        done ->
            ok
    end,
    let_workers_catch_up(Count - 1).

truncate_keys(Keys, 1.0) ->
    Keys;
truncate_keys(Keys, InputSize) ->
    TargetSize = erlang:round(length(Keys) * InputSize),
    {Keys1, _} = lists:split(TargetSize, Keys),
    Keys1.
--------------------------------------------------------------------------------
/other/erlang/digraph_exporter.erl:
--------------------------------------------------------------------------------
%%
%%
%% This file is provided to you under the Apache License,
%% Version 2.0 (the "License"); you may not use this file
%% except in compliance with the License.
You may obtain 6 | %% a copy of the License at 7 | %% 8 | %% http://www.apache.org/licenses/LICENSE-2.0 9 | %% 10 | %% Unless required by applicable law or agreed to in writing, 11 | %% software distributed under the License is distributed on an 12 | %% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 13 | %% KIND, either express or implied. See the License for the 14 | %% specific language governing permissions and limitations 15 | %% under the License. 16 | %% 17 | %% ------------------------------------------------------------------- 18 | 19 | -module(digraph_exporter). 20 | 21 | -export([export_digraph/5, export_digraph/6]). 22 | 23 | %% @spec export_digraph(Server :: ip_address(), 24 | %% Port :: integer(), 25 | %% Bucket :: bucket(), 26 | %% Ref :: digraph(), 27 | %% UseValue :: boolean()) -> ok. 28 | 29 | export_digraph(Server, Port, Bucket, Ref, UseValue) -> 30 | export_digraph(Server, Port, Bucket, undefined, Ref, UseValue). 31 | 32 | %% @spec export_digraph(Server :: ip_address(), 33 | %% Port :: integer(), 34 | %% Bucket :: bucket(), 35 | %% FilterList :: list(), 36 | %% Ref :: digraph(), 37 | %% UseValue :: boolean()) -> ok. 38 | 39 | export_digraph(Server, Port, Bucket, FilterList, Ref, UseValue) -> 40 | {ok, Client} = riakc_pb_socket:start(Server, Port), 41 | 42 | Input = 43 | case FilterList of 44 | undefined -> 45 | Bucket; 46 | _ -> 47 | {Bucket, [FilterList]} 48 | end, 49 | 50 | MapFun = build_map_fun(), 51 | 52 | MapPhase = {map, {qfun, MapFun}, notused, true}, 53 | Query = [MapPhase], 54 | 55 | {ok, [{_, Data}]} = riakc_pb_socket:mapred(Client, Input, Query), 56 | 57 | build_vertices(Ref, UseValue, Data), 58 | add_edges(Ref, Data). 
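
%% Example (an illustrative sketch, not part of the original source): loading
%% a bucket into a fresh digraph from an Erlang shell with the Riak Erlang
%% client on the code path. The host, the port 8087, and the bucket name are
%% hypothetical values for this example.
%%
%%   G = digraph:new(),
%%   ok = digraph_exporter:export_digraph("127.0.0.1", 8087, <<"people">>, G, true),
%%   digraph:vertices(G).
%%
%% Each object key becomes a vertex and each Riak link becomes an edge named
%% after the link tag. With UseValue set to true, the object value is used as
%% the vertex label; with false, the label is the empty string.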
59 | 60 | 61 | build_vertices(_Ref, _UseValue, []) -> ok; 62 | build_vertices(Ref, UseValue, [{Key, Value, ContentType, _}|T]) -> 63 | Label = 64 | case UseValue of 65 | true -> 66 | case ContentType of 67 | "application/x-erlang-binary" -> 68 | binary_to_term(Value); 69 | _ -> 70 | binary_to_list(Value) 71 | end; 72 | false -> "" 73 | end, 74 | 75 | Vertex = binary_to_list(Key), 76 | 77 | digraph:add_vertex(Ref, Vertex, Label), 78 | io:format("vertex: ~s~n", [Vertex]), 79 | build_vertices(Ref, UseValue, T). 80 | 81 | 82 | add_edges(_Ref, []) -> ok; 83 | add_edges(Ref, [{Key, _, _, Links}|T]) -> 84 | lists:foreach(fun({{_, Dest}, Tag}) -> 85 | Edge = binary_to_list(Tag), 86 | VSrc = binary_to_list(Key), 87 | VDest = binary_to_list(Dest), 88 | digraph:add_edge(Ref, Edge, VSrc, VDest, ""), 89 | io:format("edge: ~s~n", [Edge]) 90 | end, Links), 91 | add_edges(Ref, T). 92 | 93 | 94 | build_map_fun() -> 95 | MapFun = "fun(Object, _KeyData, _Args) -> 96 | [{MetaDataDict, Value}] = riak_object:get_contents(Object), 97 | 98 | MetaData = dict:to_list(MetaDataDict), 99 | 100 | ContentType = proplists:get_value(<<\"content-type\">>, MetaData, \"\"), 101 | Links = proplists:get_value(<<\"Links\">>, MetaData, []), 102 | 103 | [{riak_object:key(Object), Value, ContentType, Links}] 104 | end.", 105 | 106 | {ok, Tokens, _} = erl_scan:string(MapFun), 107 | {ok, [Form]} = erl_parse:parse_exprs(Tokens), 108 | Bindings = erl_eval:new_bindings(), 109 | {value, Fun, _} = erl_eval:expr(Form, Bindings), 110 | Fun. 111 | 112 | %%% EOF 113 | -------------------------------------------------------------------------------- /other/erlang/digraph_importer.erl: -------------------------------------------------------------------------------- 1 | %% ------------------------------------------------------------------- 2 | %% 3 | %% 4 | %% This file is provided to you under the Apache License, 5 | %% Version 2.0 (the "License"); you may not use this file 6 | %% except in compliance with the License. 
You may obtain 7 | %% a copy of the License at 8 | %% 9 | %% http://www.apache.org/licenses/LICENSE-2.0 10 | %% 11 | %% Unless required by applicable law or agreed to in writing, 12 | %% software distributed under the License is distributed on an 13 | %% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 14 | %% KIND, either express or implied. See the License for the 15 | %% specific language governing permissions and limitations 16 | %% under the License. 17 | %% 18 | %% ------------------------------------------------------------------- 19 | 20 | -module(digraph_importer). 21 | 22 | -export([import_digraph/5]). 23 | 24 | %% @spec import_digraph(Server :: ip_address(), 25 | %% Port :: integer(), 26 | %% Bucket :: bucket(), 27 | %% Ref :: digraph(), 28 | %% ContentType :: string()) -> ok. 29 | 30 | import_digraph(Server, Port, Bucket, Ref, ContentType) -> 31 | {ok, Client} = riakc_pb_socket:start(Server, Port), 32 | 33 | Vertices = digraph:vertices(Ref), 34 | Edges = digraph:edges(Ref), 35 | 36 | VTupleList = [{Vertex, []} || Vertex <- Vertices], 37 | 38 | MappedVTupleList = map_edges(VTupleList, Edges, Ref, Bucket), 39 | 40 | lists:foreach(fun(VTuple) -> 41 | load_data(Client, VTuple, Ref, Bucket, ContentType) 42 | end, MappedVTupleList). 43 | 44 | 45 | map_edges(VTuple, [], _Ref, _Bucket) -> VTuple; 46 | 47 | map_edges(VTuple, [H|T], Ref, Bucket) -> 48 | {Edge, Src, Dest, _} = digraph:edge(Ref, H), 49 | {value, {_, Links}} = lists:keysearch(Src, 1, VTuple), 50 | 51 | NewLinks = Links++[{{Bucket, to_binary(Dest)}, to_binary(Edge)}], 52 | NewVtuple = lists:keyreplace(Src, 1, VTuple, {Src, NewLinks}), 53 | 54 | map_edges(NewVtuple, T, Ref, Bucket). 
55 | 56 | 57 | load_data(Client, {Vertex, LinkList}, Ref, Bucket, ContentType) -> 58 | {_, Label} = digraph:vertex(Ref, Vertex), 59 | 60 | Key = to_binary(Vertex), 61 | Value = 62 | if ContentType == "application/x-erlang-binary" -> 63 | term_to_binary(Label); 64 | true -> 65 | to_binary(Label) 66 | end, 67 | 68 | Metadata = dict:from_list([{<<"content-type">>, ContentType}, 69 | {<<"Links">>, LinkList}]), 70 | 71 | Object = riakc_obj:new(to_binary(Bucket), Key, Value), 72 | MDObject = riakc_obj:update_metadata(Object, Metadata), 73 | riakc_pb_socket:put(Client, MDObject), 74 | io:format("Vertex: ~p~n", [Key]). 75 | 76 | 77 | to_binary(Item) when is_atom(Item) -> 78 | list_to_binary(atom_to_list(Item)); 79 | to_binary(Item) when is_list(Item) -> 80 | list_to_binary(Item); 81 | to_binary(Item) when is_binary(Item) -> 82 | Item. 83 | 84 | %%% EOF 85 | -------------------------------------------------------------------------------- /other/ruby/riak_yaml_importer.rb: -------------------------------------------------------------------------------- 1 | # Copyright 2011 Jeremiah Peschka 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 
14 | 15 | require 'yaml' 16 | require 'riak' 17 | 18 | def import_folder(host, port, bucket, file_list) 19 | file_list.each do |file| 20 | import_file(host, port, bucket, file) 21 | end 22 | end 23 | 24 | def import_file(host, port, bucket, file) 25 | client = Riak::Client.new(:host => host, 26 | :port => port, 27 | :http_backend => :Excon) 28 | 29 | bucket = client.bucket(bucket) 30 | 31 | w_props = { 32 | :w => 0, 33 | :dw => 0, 34 | :returnbody => false 35 | } 36 | 37 | records = File.open(file) { |f| YAML::load_stream(f) } 38 | 39 | records[0].each do |record| 40 | o = bucket.new(record[0]) 41 | o.data = record[1] 42 | o.content_type = 'application/json' 43 | o.store(w_props) 44 | end 45 | end -------------------------------------------------------------------------------- /other/ruby/yaml_importer.rb: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env ruby 2 | 3 | # Copyright 2011 Jeremiah Peschka 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License.
16 | 17 | require 'rubygems' 18 | require 'riak' 19 | require 'yaml' 20 | require 'optparse' 21 | require_relative 'riak_yaml_importer' 22 | 23 | options = {} 24 | 25 | optparse = OptionParser.new do |opts| 26 | opts.banner = "Usage: yaml_importer.rb [options]" 27 | 28 | options[:file_list] = nil 29 | opts.on('-f', '--file FILE', String, 'YAML to load') do |file| 30 | options[:file_list] = [file] 31 | end 32 | 33 | options[:directory] = nil 34 | opts.on('-d', '--directory DIRECTORY', String, 'Folder to recurse') do |directory| 35 | options[:directory] = directory 36 | end 37 | 38 | options[:bucket] = nil 39 | opts.on('-b', '--bucket BUCKET', String, 'Bucket to store the data in') do |bucket| 40 | options[:bucket] = bucket 41 | end 42 | 43 | options[:host] = 'localhost' 44 | opts.on('-h', '--host HOSTNAME', String, 'IP/hostname for the Riak cluster') do |host| 45 | options[:host] = host 46 | end 47 | 48 | options[:port] = 8091 49 | opts.on('-p', '--port PORT', Integer, 'Port number') do |port| 50 | options[:port] = port 51 | end 52 | 53 | opts.on('--help', 'Display this screen') do 54 | puts opts 55 | exit 56 | end 57 | end 58 | 59 | optparse.parse! 60 | 61 | options[:file_list] = Dir.glob("#{options[:directory]}/**/*.yaml") if !options[:directory].nil? 62 | 63 | puts options[:file_list] 64 | 65 | import_folder(options[:host], 66 | options[:port], 67 | options[:bucket], 68 | options[:file_list]) 69 | -------------------------------------------------------------------------------- /todo.txt: -------------------------------------------------------------------------------- 1 | # Copy 2 | 3 | Proofread all copy for clarity and grammar/spelling 4 | 5 | # Styling 6 | 7 | Pretty up the front page?
8 | 9 | # Nav 10 | 11 | - Look into adding breadcrumb navigation 12 | - Fix home link layout 13 | 14 | # Functions 15 | 16 | Get five functions for each of the categories before we launch this puppy 17 | 18 | # Upgrade nav to use the new nav features in Gollum 19 | --------------------------------------------------------------------------------