├── .gitignore
├── README.md
├── img
│   └── rct_vnode.png
└── simple_crawler
    ├── .edts
    ├── .gitignore
    ├── Makefile
    ├── apps
    │   └── sc
    │       └── src
    │           ├── sc.app.src
    │           ├── sc.erl
    │           ├── sc.hrl
    │           ├── sc_app.erl
    │           ├── sc_console.erl
    │           ├── sc_downloader_vnode.erl
    │           ├── sc_node_event_handler.erl
    │           ├── sc_ring_event_handler.erl
    │           ├── sc_storage_vnode.erl
    │           ├── sc_sup.erl
    │           └── sc_vnode.erl
    ├── links.txt
    ├── rebar
    ├── rebar.config
    └── rel
        ├── files
        │   ├── app.config
        │   ├── erl
        │   ├── nodetool
        │   ├── sc
        │   ├── sc-admin
        │   └── vm.args
        ├── reltool.config
        ├── vars.config
        └── vars
            ├── dev1.config
            ├── dev2.config
            └── dev3.config

/.gitignore:
--------------------------------------------------------------------------------
1 | .eunit
2 | deps
3 | *.o
4 | *.beam
5 | *.plt
6 | erl_crash.dump
7 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | riak_core_tutorial
2 | ==================
3 |
4 | ## Table of Contents ##
5 | 1. [Environment](#environment)
6 | 2. [Multinode Hello World](#multinode-hello-world)
7 | 3. [Installing rebar template for riak_core](#installing-rebar-template-for-riak_core)
8 | 4. [Hello Multinode!!!](#hello-multinode)
9 | 5. [Consistent Hashing](#consistent-hashing)
10 | 6. [Implementing simple crawler](#implementing-simple-crawler)
11 | 7. [Implementing downloader part](#implementing-downloader-part)
12 | 8. [Implementing storage part](#implementing-storage-part)
13 | 9. [Handoff](#handoff)
14 | 10. [What is handoff](#what-is-handoff)
15 | 11. [Handling handoff](#handling-handoff)
16 | 12. [See handoff in action](#see-handoff-in-action)
17 | 13. [Fault tolerance](#fault-tolerance)
18 |
19 | ## Environment ##
20 |
21 | To skip setting up an environment there is already one prepared for this
22 | tutorial: [riak_core_env](https://github.com/mentels/riak_core_env).
23 | In the following chapters I assume that you have the environment
24 | running and do all the work in the `RIAK_CORE_ENV/synced/` directory
25 | mentioned in the link.
26 |
27 | ## Multinode Hello World ##
28 |
29 | > The commands in this chapter have to be invoked from the VM.
30 |
31 | #### Installing rebar template for riak_core ####
32 |
33 | Folks from Basho were kind enough to prepare a rebar template for
34 | creating riak_core apps. Apart from creating an application structure
35 | it also creates a script for administering the cluster.
36 |
37 | Clone the template and install it (under ~/.rebar/templates):
38 | ```bash
39 | git clone https://github.com/basho/rebar_riak_core
40 | cd rebar_riak_core && make install
41 | ```
42 |
43 | > The template provided by Basho is quite old. However, there are
44 | > a lot of forks and
45 | > [this one](https://github.com/marianoguerra/rebar_riak_core) seems
46 | > to be adjusted to the newest stable version of riak_core, which is
47 | > 1.4.10 at the time of writing this tutorial. For other versions see
48 | > [releases](https://github.com/basho/riak_core/releases).
49 |
50 | #### Hello Multinode!!! ####
51 |
52 | Once we have the template, let's use it to generate an Erlang app. Enter
53 | the `multinode` directory and invoke rebar:
54 | ```bash
55 | cd ~/synced/multinode && ./rebar create template=riak_core appid=hwmn nodeid=hwmn
56 | ```
57 |
58 | Next, tweak it a little bit so that we work on the newest stable release
59 | of the beast. We also need a newer lager version.
Go and modify the freshly
60 | created `rebar.config`:
61 | ```erlang
62 | {deps, [
63 |   {lager, "2.0.1", {git, "git://github.com/basho/lager", {tag, "2.0.1"}}},
64 |   {riak_core, "1.4.10", {git, "git://github.com/basho/riak_core", {tag, "1.4.10"}}}
65 | ]}.
66 | ```
67 |
68 | We are ready to generate a release with 4 nodes and play with them:
69 | ```bash
70 | make devrel
71 | for d in dev/dev*; do $d/bin/hwmn start; done
72 | ```
73 |
74 | To make sure that we're up and running do:
75 | `for d in dev/dev*; do $d/bin/hwmn ping; done`
76 |
77 | If you're not getting pongs... well, I'm sorry - it worked for me.
78 | But do our nodes know anything about each other? Let's check it using
79 | an admin utility:
80 | `./dev/dev1/bin/hwmn-admin member_status`
81 |
82 | The output from the above command should look like this:
83 | ```bash
84 | ================================= Membership ==================================
85 | Status     Ring    Pending    Node
86 | -------------------------------------------------------------------------------
87 | valid     100.0%      --      'hwmn1@127.0.0.1'
88 | -------------------------------------------------------------------------------
89 | ```
90 | This simply means that the nodes **ARE NOT** in any relation - node *hwmn1*
91 | knows only about itself. But as you probably already know, the riak_core
92 | machinery was invented to actually help nodes live together. To join
93 | them, issue:
94 | `for d in dev/dev{2,3,4}; do $d/bin/hwmn-admin cluster join hwmn1@127.0.0.1; done`
95 |
96 | But this is not enough. We've just **staged** changes to the ring. Before
97 | they take effect we have to **confirm** the plan and **commit** it. Yeah,
98 | complicated... but move forward:
99 | `dev/dev1/bin/hwmn-admin cluster plan`
100 |
101 | We've just been informed by riak_core about what will happen. Trust me and
102 | agree by committing:
103 | `dev/dev1/bin/hwmn-admin cluster commit`
104 |
105 | And check the nodes' relations again:
106 | `./dev/dev1/bin/hwmn-admin member_status`
107 |
108 | If your output is similar to the following, you managed to make a family:
109 | ```bash
110 | ================================= Membership ==================================
111 | Status     Ring    Pending    Node
112 | -------------------------------------------------------------------------------
113 | valid      75.0%     25.0%    'hwmn1@127.0.0.1'
114 | valid       9.4%     25.0%    'hwmn2@127.0.0.1'
115 | valid       7.8%     25.0%    'hwmn3@127.0.0.1'
116 | valid       7.8%     25.0%    'hwmn4@127.0.0.1'
117 | -------------------------------------------------------------------------------
118 | Valid:4 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
119 | ```
120 | Look at the `Ring` column. It indicates how much of the key
121 | space is allocated to a particular node. Over time, each node should
122 | cover a proportional percentage of the ring.
123 |
124 | #### Consistent hashing ####
125 |
126 | What is it? Quoting the
127 | [Riak Glossary](http://docs.basho.com/riak/latest/theory/concepts/glossary/#Consistent-Hashing):
128 | > Consistent hashing is a technique used to limit the reshuffling of
129 | > keys when a hash-table data structure is rebalanced
130 | > (when slots are added or removed). Riak uses consistent hashing
131 | > to organize its data storage and replication.
132 | > Specifically, the vnodes in the Riak Ring responsible for storing
133 | > each object are determined using the consistent hashing technique.
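Under the hood, with the default bucket properties, `riak_core_util:chash_key/1` boils down to a SHA-1 hash of the `{Bucket, Key}` pair. Here is a minimal sketch of the riak_core 1.4.x internals, assuming the default `chash_keyfun` (see `chash:key_of/1` for the real thing); run it on one of the attached nodes:
```erlang
%% Both expressions should yield the same 160-bit binary - the value
%% that is later mapped onto a partition of the ring:
Hash1 = riak_core_util:chash_key({<<"please">>, <<"bleed">>}),
Hash2 = crypto:sha(term_to_binary({<<"please">>, <<"bleed">>})),
Hash1 =:= Hash2.  %% true
```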
134 |
135 | Basically, if we want to perform an operation in a particular riak_core
136 | virtual node (I'll try to explain the mysterious virtual node
137 | - *vnode* - later) **and** we always want it to be the same vnode for a
138 | particular input, we use consistent hashing. And the resulting hash
139 | value for a given input stays the same regardless of changes to the ring.
140 |
141 | As an exercise we can compute a hash value for some input and make
142 | sure that it's the same over the ring. To do so, attach to one of
143 | the nodes:
144 | `./dev/dev1/bin/hwmn attach`
145 | and run the following snippet:
146 | ```erlang
147 | F = fun() ->
148 |   Hashes = [begin
149 |     Node = "hwmn" ++ integer_to_list(N) ++ "@127.0.0.1",
150 |     rpc:call(list_to_atom(Node), riak_core_util, chash_key, [{<<"please">>, <<"bleed">>}])
151 |   end || N <- [1,2,3]],
152 |   [OneHash] = lists:usort(Hashes),
153 |   OneHash
154 | end.
155 | (Hash = F()) == F().
156 | ```
157 | > **What is a *vnode*?**
158 | >
159 | > ![alt text](/img/rct_vnode.png)
160 | >
161 | > A *vnode* is a virtual node, as opposed to a physical node
162 | > * Each vnode is responsible for one partition on the ring
163 | > * A vnode is an Erlang process
164 | > * A vnode is a behavior written on top of the gen_fsm behavior
165 | > * A vnode handles incoming requests
166 | > * A vnode potentially stores data to be retrieved later
167 | > * A vnode is the unit of concurrency, replication, and fault tolerance
168 | > * Typically many vnodes will run on each physical node
169 | >
170 | > Each machine has a vnode master whose purpose is to keep
171 | > track of all active vnodes on its node.
172 |
173 |
174 | How can we make use of a computed `Hash`? We can get a list of vnodes
175 | on which we can perform/store something.
176 |
177 | ```erlang
178 | riak_core_apl:get_apl(Hash, _N = 2, hwmn).
179 | ```
180 | > `apl` stands for *active preference list*.
181 | >
182 | > The value of `_N` indicates how many vnodes we want to involve in
183 | > performing some operation associated with `Hash`. For example we might
184 | > want to save an object on two vnodes.
185 |
186 |
187 | The output from the call looks like this:
188 | ```erlang
189 | [{1301649895747835411525156804137939564381064921088, 'hwmn2@127.0.0.1'},
190 |  {1324485858831130769622089379649131486563188867072, 'hwmn3@127.0.0.1'}]
191 | ```
192 | How to read this? The first element of each tuple is the partition (remember
193 | that a **vnode** is responsible for one partition in the ring?) which is
194 | dedicated to the `Hash` and, as you guessed, the second element is
195 | the node on which the partition sits!
196 |
197 | When you're done playing with the cluster, stop the nodes:
198 | `for d in dev/dev*; do $d/bin/hwmn stop; done`
199 |
200 | Awesome, congratulations, great, sweet, just fantastic. Hello Multinode
201 | completed!
202 |
203 | ## Implementing simple crawler ##
204 |
205 | The plan for the next step is to implement a distributed internet
206 | crawler that will be able to download websites and store them for
207 | later retrieval. The design is as follows (see the API preview below):
208 | * downloading will take place on random vnodes in the cluster;
209 | * a particular vnode will store the content of a given URL;
210 | * an old version of a website will be replaced by a new one;
211 | * the API will be implemented in an `sc.erl` module.
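To make the goal concrete, this is how the finished crawler is meant to be used from an attached console - a preview of the API we are about to build (`sc:store/2` is called internally by the downloader vnode):
```erlang
sc:download("http://www.erlang.org").    %% async: fetch the page on a random vnode
sc:get_content("http://www.erlang.org"). %% sync: {ok, Content} or not_found
```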
212 |
213 | This part of the tutorial requires you to implement the missing parts of
214 | the `simple_crawler` application that can be found in
215 | `RIAK_CORE_ENV/synced/crawler/simple_crawler`. The
216 | application structure is compliant with Erlang/OTP so all the modules
217 | are in `apps/sc/src`.
218 |
219 | ### Implementing downloader part ###
220 |
221 | The API for the downloader is already implemented in `sc:download/1` so
222 | we only need to add a vnode that will handle the actual download
223 | tasks. A skeleton for the vnode is already there in
224 | `sc_downloader_vnode.erl`. Note that a vnode has to implement the
225 | `riak_core_vnode` behaviour. Here we're focusing on the `handle_command/3`
226 | callback, which will be invoked by `riak_core_vnode_master:command/3`.
227 |
228 | > More information on the `riak_core_vnode` callbacks can be found
229 | > [here](https://github.com/vitormazzi/try-try-try/tree/master/2011/riak-core-the-vnode#life-cycle-callbacks).
230 |
231 | Let's get to coding. First of all, add the asynchronous API to the
232 | vnode. Edit the `sc_downloader_vnode.erl`:
233 | ```erlang
234 | -export([start_vnode/1,
235 |          download/2]).
236 | ...
237 | -define(MASTER, sc_downloader_vnode_master).
238 | ...
239 |
240 | -spec download({chash:index_as_int(), node()}, string()) -> term().
241 | download(IdxNode, URL) ->
242 |     riak_core_vnode_master:command(IdxNode, {download, URL}, ?MASTER).
243 | ```
244 |
245 | `MASTER` indicates the ID of the vnode master for the downloader vnodes.
246 |
247 | Next, implement the command:
248 | ```erlang
249 | ...
250 | handle_command({download, URL} = Req, _Sender, State) ->
251 |     print_request_info(State#state.partition, node(), Req),
252 |     try
253 |         Content = download(URL),
254 |         store(URL, Content)
255 |     catch
256 |         throw:{download_error, Reason} ->
257 |             ?PRINT({request_failed, Req, Reason})
258 |     end,
259 |     {noreply, State};
260 | ...
261 | ```
262 |
263 | In the final step provide a specification for
264 | `sc_downloader_vnode_master` in `sc_sup.erl` and add it to the
265 | supervisor's child list:
266 | ```erlang
267 | ...
268 | VDownloaderMaster =
269 |     {sc_downloader_vnode_master,
270 |      {riak_core_vnode_master, start_link, [sc_downloader_vnode]},
271 |      permanent, 5000, worker, [riak_core_vnode_master]},
272 | ...
273 | {ok, {{one_for_one, 5, 10}, [VMaster, VDownloaderMaster]}}.
274 | ```
275 | Additionally, register the vnode in `sc_app.erl`:
276 | ```erlang
277 | ...
278 | ok = riak_core:register([{vnode_module, sc_downloader_vnode}]),
279 | ...
280 | ```
281 |
282 | To test our new functionality stop the whole cluster, clean the project,
283 | build a devrel and form the cluster:
284 | ```bash
285 | for d in dev/dev*; do $d/bin/sc stop; done
286 | make devclean && make devrel
287 | for d in dev/dev*; do $d/bin/sc start; done
288 | for d in dev/dev{2,3}; do $d/bin/sc-admin join sc1@127.0.0.1; done
289 | ```
290 | > This time, to make things simpler, we won't be **staging** and
291 | > **committing** changes to the riak_core ring.
292 | > [This](https://github.com/rzezeski/rebar_riak_core)
293 | > riak_core rebar template implements such simplified behaviour. And we
294 | > will get by with 3 nodes.
295 |
296 | Once we have the whole setup up and running, attach to one of the nodes
297 | and observe the logs of the other two nodes:
298 | ```bash
299 | dev/dev1/bin/sc attach
300 | tail -f dev/dev2/log/erlang.log.1
301 | tail -f dev/dev3/log/erlang.log.1
302 | ```
303 |
304 | > Run the above commands from separate consoles.
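Before experimenting, it's worth tracing the whole path a download request travels. These are the relevant pieces of `apps/sc/src/sc.erl`, already present in the skeleton:
```erlang
%% Dispatch downloading a URL's content to a random downloader vnode.
download(URL) ->
    DocIdx = get_random_document_index(),
    IdxNode = get_index_node(DocIdx),
    sc_downloader_vnode:download(IdxNode, URL).

%% now() differs on every call, so the hashed key (and thus the chosen
%% partition) is effectively random.
get_random_document_index() ->
    riak_core_util:chash_key({<<"download">>, term_to_binary(now())}).

%% Resolve the hash to a single {Partition, Node} pair via the
%% active preference list.
get_index_node(DocIdx) ->
    [IndexNode] = riak_core_apl:get_apl(DocIdx, 1, sc),
    IndexNode.
```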
305 |
306 | Experiment a bit with the `sc:download/1` API:
307 | `[sc:download("http://www.erlang.org") || _ <- lists:seq(1,10)].`
308 |
309 | Note that the requests are served by random partitions on different
310 | nodes. Effectively it means that requests hit different vnodes (a vnode
311 | is responsible for one partition, right?).
312 |
313 | "The randomness" is achieved by picking a vnode for a random document
314 | index. See `sc:get_random_document_index/0` (shown above) to learn how it works.
315 |
316 | ### Implementing storage part ###
317 |
318 | Let's move to our storage system. As above, the API is already
319 | implemented in `sc:store/2` and `sc:get_content/1` (uncomment
320 | all the lines in these functions). Recall from the design description
321 | that in this case the same vnode will be chosen for storing or
322 | retrieving data for a particular URL.
323 |
324 | As in the previous example, we need a vnode to do our job.
325 | There is already such a vnode implemented in `sc_storage_vnode.erl`.
326 | Please have a look at its `get_content/2` API function. It invokes
327 | the command using `riak_core_vnode_master:sync_spawn_command/3`, which
328 | is synchronous for the caller but **does not** block the vnode master. The difference is also
329 | in the command handler for retrieving the content, as it returns a reply.
330 |
331 | To get it working you have to take care of the vnode master for
332 | storage in `sc_sup.erl` and register the `sc_storage_vnode.erl` in
333 | `sc_app.erl` - analogously to the downloader vnode.
334 |
335 | > The `ID` in the child specification that you need to provide
336 | > in sc_sup.erl (for example sc_storage_vnode_master) must match the
337 | > 3rd argument in the calls to `riak_core_vnode_master:command/3` and `sync_spawn_command/3` (the `?MASTER` macro).
338 |
339 | When you're done, restart the whole machinery, attach to one node and
340 | "tailf" the other nodes' logs:
341 | ```bash
342 | for d in dev/dev*; do $d/bin/sc stop; done
343 | make devclean && make devrel
344 | for d in dev/dev*; do $d/bin/sc start; done
345 | for d in dev/dev{2,3}; do $d/bin/sc-admin join sc1@127.0.0.1; done
346 | dev/dev1/bin/sc attach
347 | tail -f dev/dev2/log/erlang.log.1
348 | tail -f dev/dev3/log/erlang.log.1
349 | ```
350 |
351 | Then download your favorite website and retrieve its content:
352 | ```erlang
353 | sc:download("http://joemonster.org/").
354 | sc:download("http://joemonster.org/").
355 | sc:get_content("http://joemonster.org/").
356 | ```
357 |
358 | You would expect that *download* requests will be served by different
359 | vnodes and each *store* and *get_content* request by the same vnode.
360 | But hey, what if `get_content/1` returns nothing but `not_found` even though
361 | the request matches the right partition?! Well, it's possible...
362 |
363 | The explanation behind this behavior is that when you start your
364 | first node it serves all the partitions, which in practice means that
365 | it runs all the vnodes of each kind (by default 64 partitions are
366 | created). When new nodes join the cluster the partitions are spread
367 | across them, but it happens in the background - strictly speaking: while
368 | the cluster is serving a request it's moving vnodes to other physical
369 | nodes at the same time. But riak_core has no idea how to move our
370 | data, so it's just lost! Terrible, huh?
371 |
372 | To observe the whole system working as expected you need to wait for
373 | the cluster to come into a "stable state".
Just check the status:
374 | `./dev/dev1/bin/sc-admin member_status`
375 |
376 | When there are no pending changes it means that no partitions will be
377 | moved. Now you can experiment again and make sure that requests are
378 | served by the appropriate partitions, vnodes and nodes.
379 |
380 | In the next part I'm going to explain how not to lose data while moving a
381 | vnode to another Erlang node: the so-called *handoff*.
382 |
383 | ## Handoff ##
384 |
385 | ### What is handoff? ###
386 |
387 | A *handoff* occurs when a vnode realizes that it's not on the proper
388 | Erlang node. Such a situation can take place when:
389 | * a node is added to or removed from the ring,
390 | * a node comes alive after it has been down.
391 |
392 | In riak_core there's a periodic "home check" that verifies whether
393 | a vnode runs on the correct physical node. If that's not true for some vnode,
394 | it will go into *handoff mode* and data will be transferred.
395 |
396 | ### Handling handoff ###
397 |
398 | When riak_core decides to perform a handoff it calls two functions:
399 | `Mod:handoff_starting/2` and `Mod:is_empty/1`. Through the first one
400 | a vnode can agree (or not) to proceed with the handoff. The second one
401 | indicates if there's any data to be transferred, and this one is
402 | interesting for us. The vnode process started in `sc_storage_vnode` saves
403 | all the webpages' content into a dict data structure. Thus when a handoff
404 | occurs we want this dict to be transferred.
405 |
406 | So let's code `sc_storage_vnode:is_empty/1`:
407 | ```erlang
408 | is_empty(State) ->
409 |     case dict:size(State#state.store) of
410 |         0 ->
411 |             {true, State};
412 |         _ ->
413 |             {false, State}
414 |     end.
415 | ```
416 |
417 | When the framework decides to start a handoff, it sends a `?FOLD_REQ`
418 | that tells the vnode how to fold over its data. This request is supposed
419 | to be handled in `Mod:handle_handoff_command/3` and contains a "folding
420 | function" along with an initial accumulator. We should implement
421 | `handle_handoff_command/3` as follows:
422 | ```erlang
423 | handle_handoff_command(?FOLD_REQ{foldfun = Fun, acc0 = Acc0},
424 |                        _Sender, State) ->
425 |     Acc = dict:fold(Fun, Acc0, State#state.store),
426 |     {reply, Acc, State}.
427 | ```
428 |
429 | > If you're like me and this `?FOLD_REQ` looks strange to you, have a look
430 | > at
431 | > [riak_core_vnode.hrl](https://github.com/basho/riak_core/blob/master/include/riak_core_vnode.hrl),
432 | > which reveals that the macro expands to a record.
433 |
434 | So, what's next with this magic handoff? Well, at this point things
435 | are simple: each iteration of the "folding function" calls
436 | `Mod:encode_handoff_item/2`, which does just what it is supposed to do:
437 | encode data before sending it to the target vnode. The target vnode
438 | decodes the data in `Mod:handle_handoff_data/2`. In this tutorial we are
439 | using an *extremely complex method of encoding*, so write the following code
440 | in your storage vnode really carefully:
441 | ```erlang
442 | encode_handoff_item(URL, Content) ->
443 |     term_to_binary({URL, Content}).
444 | ...
445 | handle_handoff_data(Data, State) ->
446 |     {URL, Content} = binary_to_term(Data),
447 |     Dict = dict:store(URL, Content, State#state.store),
448 |     {reply, ok, State#state{store = Dict}}.
449 | ```
450 |
451 | And that's it. Our system is now "move-vnode-resistant".
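You can sanity-check this *extremely complex* encoding from any Erlang shell: a handed-off item is just a `{URL, Content}` tuple serialized with `term_to_binary/1` (a quick standalone check, not part of the app):
```erlang
Item = {"http://example.com/", <<"<html>...</html>">>},
Bin = term_to_binary(Item),   %% what encode_handoff_item/2 sends over the wire
Item = binary_to_term(Bin).   %% what handle_handoff_data/2 recovers; the match succeeds
```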
One more thing
452 | worth noting here: after the handoff is completed, `Mod:delete/1`
453 | is called and the vnode is terminated just after this call.
454 | During the termination `Mod:terminate/2` is called too.
455 |
456 | > I did not present all the callbacks related to handoff. For more
457 | > information go to a great tutorial
458 | > [here](https://github.com/vitormazzi/try-try-try/tree/master/2011/riak-core-the-vnode#handoff).
459 | > If you need more details look at the
460 | > [basho wiki](https://github.com/basho/riak_core/wiki/Handoffs).
461 |
462 | ### See handoff in action ###
463 |
464 | Now that we have handoff implemented, build a devrel, start the cluster,
465 | **but only join dev2 to dev1**. We want to observe how the partitions
466 | are moved:
467 | ```bash
468 | for d in dev/dev*; do $d/bin/sc stop; done
469 | make devclean && make devrel
470 | for d in dev/dev*; do $d/bin/sc start; done
471 | dev/dev2/bin/sc-admin join sc1@127.0.0.1
472 | ```
473 | Wait for the ring to get balanced symmetrically across the two nodes.
474 | Use `./dev/dev1/bin/sc-admin member_status` to check the status.
475 |
476 | Next, attach to the console of one node in the cluster and "tailf" the
477 | logs of the other node:
478 | ```bash
479 | dev/dev1/bin/sc attach
480 | tail -f dev/dev2/log/erlang.log.1
481 | ```
482 |
483 | Download some websites using `sc:get_links/0`. This function will return
484 | a list of URLs (the ones in `links.txt`):
485 | ```erlang
486 | [begin sc:download(L), timer:sleep(500) end || L <- sc:get_links()].
487 | ```
488 |
489 | After that, join the 3rd node and "tailf" its log. Wait for the cluster
490 | to get balanced and try to retrieve previously downloaded content using
491 | the attached console:
492 | ```bash
493 | dev/dev3/bin/sc-admin join sc1@127.0.0.1
494 | tail -f dev/dev3/log/erlang.log.1
495 | ```
496 | ```erlang
497 | spawn(fun() -> [begin
498 |                     sc:get_content(L),
499 |                     timer:sleep(500)
500 |                 end || L <- sc:get_links()] end).
501 | ```
502 |
503 | You should see that some URLs' content is served by the 3rd node,
504 | although it joined the cluster after all the sites had been downloaded.
505 |
506 | ## Fault tolerance ##
507 |
508 | Without destroying the previous setup, stop one of the nodes that you know
509 | holds content for some website. Then try to get the content of that website.
510 | You should end up with a `not_found` response:
511 | ```erlang
512 | (sc1@127.0.0.1)9> sc:get_content("http://en.wikipedia.org/").
513 | not_found
514 | ```
515 |
516 | To tackle this problem we need to store our data on more than one vnode.
517 | Let's code it in by changing `sc:get_index_node/1` so that it takes the
518 | number of vnodes and returns the whole preference list:
519 | ```erlang
520 | get_index_node(DocIdx, N) ->
521 |     riak_core_apl:get_apl(DocIdx, N, sc).
522 | ```
523 |
524 | It also requires us to adjust `sc:download/1`, `sc:store/2` and
525 | `sc:get_content/1`. Let's say that we want to store data on 3 vnodes:
526 | ```erlang
527 | download(URL) ->
528 |     DocIdx = get_random_document_index(),
529 |     IdxNodes = get_index_node(DocIdx, 1),
530 |     sc_downloader_vnode:download(IdxNodes, URL).
531 |
532 | store(URL, Content) ->
533 |     DocIdx = get_index_for_url(URL),
534 |     IdxNodes = get_index_node(DocIdx, 3),
535 |     sc_storage_vnode:store(IdxNodes, {URL, Content}).
536 |
537 | get_content(URL) ->
538 |     DocIdx = get_index_for_url(URL),
539 |     IdxNodes = get_index_node(DocIdx, 3),
540 |     R0 = [sc_storage_vnode:get_content(IN, URL) || IN <- IdxNodes],
541 |     R1 = lists:filter(fun(not_found) ->
542 |                               false;
543 |                          (_) ->
544 |                               true
545 |                       end, R0),
546 |     case R1 of
547 |         [] ->
548 |             not_found;
549 |         _ ->
550 |             hd(R1)
551 |     end.
552 | ```
553 |
554 | Check that it really works. First start the cluster, join the nodes,
555 | attach to `sc1` and observe the logs:
556 | ```bash
557 | for d in dev/dev*; do $d/bin/sc stop; done
558 | make devclean && make devrel
559 | for d in dev/dev*; do $d/bin/sc start; done
560 | for d in dev/dev{2,3}; do $d/bin/sc-admin join sc1@127.0.0.1; done
561 | dev/dev1/bin/sc attach
562 | tail -f dev/dev2/log/erlang.log.1
563 | tail -f dev/dev3/log/erlang.log.1
564 | ```
565 |
566 | Then wait for the ring to get synced and download some sites. In the logs
567 | you should see that the same data is duplicated over several vnodes
568 | (for the data to be safe, at least 2 vnodes should be located on different
569 | physical nodes):
570 | ```erlang
571 | [begin sc:download(L), timer:sleep(500) end || L <- sc:get_links()].
572 | ```
573 |
574 | Find a request for one website in a node's log, bring that node down and
575 | try to query that website. It should still be available on another node
576 | that holds a vnode for that website:
577 | ```erlang
578 | sc:get_content("http://en.wikipedia.org/").
579 | ```
580 |
581 |
582 |
--------------------------------------------------------------------------------
/img/rct_vnode.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mentels/riak_core_tutorial/90c115dc7ff5245d278423fa07a207fac5450826/img/rct_vnode.png
--------------------------------------------------------------------------------
/simple_crawler/.edts:
--------------------------------------------------------------------------------
1 | :name "riak_core-tutorial"
2 | :node-sname "rct"
3 | :lib-dirs '("deps" "apps/sc")
4 | :app-include-dirs '("include")
--------------------------------------------------------------------------------
/simple_crawler/.gitignore:
--------------------------------------------------------------------------------
1 | *.beam
2 | .eunit
3 | deps/*
4 | apps/sc/ebin
5 | *~
6 | dev/*
7 | doc/*
8 | rel/sc
--------------------------------------------------------------------------------
/simple_crawler/Makefile:
--------------------------------------------------------------------------------
1 | REBAR = $(shell pwd)/rebar
2 |
3 | .PHONY: deps rel stagedevrel
4 |
5 | all: deps compile
6 |
7 | compile:
8 | 	$(REBAR) compile
9 |
10 | deps:
11 | 	$(REBAR) get-deps
12 |
13 | clean:
14 | 	$(REBAR) clean
15 |
16 | distclean: clean devclean relclean
17 | 	$(REBAR) delete-deps
18 |
19 | test:
20 | 	$(REBAR) skip_deps=true eunit
21 |
22 | rel: all
23 | 	$(REBAR) generate
24 |
25 | relclean:
26 | 	rm -rf rel/sc
27 |
28 | devrel: dev1 dev2 dev3
29 |
30 | ###
31 | ### Docs
32 | ###
33 | docs:
34 | 	$(REBAR) skip_deps=true doc
35 |
36 | ##
37 | ## Developer targets
38 | ##
39 |
40 | stage : rel
41 | 	$(foreach dep,$(wildcard deps/* wildcard apps/*), rm -rf rel/sc/lib/$(shell basename $(dep))-* && ln -sf $(abspath $(dep)) rel/sc/lib;)
42 |
43 |
44 | stagedevrel: dev1 dev2 dev3
45 | 	$(foreach dev,$^,\
46 | 	  $(foreach dep,$(wildcard deps/* wildcard apps/*), rm -rf dev/$(dev)/lib/$(shell basename $(dep))-* && ln -sf $(abspath $(dep)) dev/$(dev)/lib;))
47 |
48 | devrel: dev1 dev2 dev3
49 |
50 |
51 | devclean:
52 | 	rm -rf dev
53 |
54 | dev1 dev2 dev3: all
55 | 	mkdir -p dev
56 | 	(cd rel && $(REBAR) generate target_dir=../dev/$@ overlay_vars=vars/$@.config)
57 |
58 |
59 | ##
60 | ## Dialyzer
61 | ##
62 | APPS = kernel stdlib sasl erts ssl tools os_mon runtime_tools crypto inets \
63 | 	xmerl webtool snmp public_key mnesia eunit syntax_tools compiler
64 | COMBO_PLT = $(HOME)/.sc_combo_dialyzer_plt
65 |
66 | check_plt: deps compile
67 | 	dialyzer --check_plt --plt $(COMBO_PLT) --apps $(APPS) \
68 | 		deps/*/ebin apps/*/ebin
69 |
70 | build_plt: deps compile
71 | 	dialyzer --build_plt --output_plt $(COMBO_PLT) --apps $(APPS) \
72 | 		deps/*/ebin apps/*/ebin
73 |
74 | dialyzer: deps compile
75 | 	@echo
76 | 	@echo Use "'make check_plt'" to check PLT prior to using this target.
77 | 	@echo Use "'make build_plt'" to build PLT prior to using this target.
78 | 	@echo
79 | 	@sleep 1
80 | 	dialyzer -Wno_return --plt $(COMBO_PLT) deps/*/ebin apps/*/ebin
81 |
82 |
83 | cleanplt:
84 | 	@echo
85 | 	@echo "Are you sure? It takes about 1/2 hour to re-build."
86 | 	@echo Deleting $(COMBO_PLT) in 5 seconds.
87 | 	@echo
88 | 	sleep 5
89 | 	rm $(COMBO_PLT)
90 |
--------------------------------------------------------------------------------
/simple_crawler/apps/sc/src/sc.app.src:
--------------------------------------------------------------------------------
1 | %% -*- erlang -*-
2 | {application, sc,
3 |  [
4 |   {description, ""},
5 |   {vsn, "1"},
6 |   {registered, []},
7 |   {applications, [
8 |                   kernel,
9 |                   stdlib,
10 |                   riak_core
11 |                  ]},
12 |   {mod, { sc_app, []}},
13 |   {env, []}
14 |  ]}.
15 |
--------------------------------------------------------------------------------
/simple_crawler/apps/sc/src/sc.erl:
--------------------------------------------------------------------------------
1 | -module(sc).
2 | -include("sc.hrl").
3 | -include_lib("riak_core/include/riak_core_vnode.hrl").
4 |
5 | -export([ping/0,
6 |          download/1,
7 |          store/2,
8 |          get_content/1,
9 |          fill/0]).
10 |
11 | %% Public API
12 |
13 | %% @doc Pings a random vnode to make sure communication is functional
14 | ping() ->
15 |     DocIdx = riak_core_util:chash_key({<<"ping">>, term_to_binary(now())}),
16 |     PrefList = riak_core_apl:get_primary_apl(DocIdx, 1, sc),
17 |     [{IndexNode, _Type}] = PrefList,
18 |     riak_core_vnode_master:sync_spawn_command(IndexNode, ping, sc_vnode_master).
19 |
20 | %% @doc Dispatch downloading URL's content to a random download_vnode.
21 | -spec download(string()) -> term().
22 | download(URL) ->
23 |     DocIdx = get_random_document_index(),
24 |     IdxNode = get_index_node(DocIdx),
25 |     sc_downloader_vnode:download(IdxNode, URL).
26 |
27 | %% @doc Store URL's content in a VNode corresponding to the URL
28 | -spec store(string(), binary()) -> term().
29 | store(URL, Content) ->
30 |     ok.
31 |     %% DocIdx = get_index_for_url(URL),
32 |     %% IdxNode = get_index_node(DocIdx),
33 |     %% sc_storage_vnode:store(IdxNode, {URL, Content}).
34 |
35 | %% @doc Get content for a given URL.
36 | -spec get_content(string()) -> {ok, binary()} | not_found.
37 | get_content(URL) ->
38 |     ok.
39 |     %% DocIdx = get_index_for_url(URL),
40 |     %% IdxNode = get_index_node(DocIdx),
41 |     %% sc_storage_vnode:get_content(IdxNode, URL).
42 |
43 | %% @doc downloads content for all links specified in ../../links.txt
44 | fill() ->
45 |     {ok, File} = file:open("../../links.txt", [read]),
46 |     [download(URL) || URL <- get_links(File, [])].
47 |
48 | %% Helpers
49 |
50 | get_random_document_index() ->
51 |     riak_core_util:chash_key({<<"download">>, term_to_binary(now())}).
52 | 53 | get_index_node(DocIdx) -> 54 | [IndexNode] = riak_core_apl:get_apl(DocIdx, 1, sc), 55 | IndexNode. 56 | 57 | get_index_for_url(URL) -> 58 | riak_core_util:chash_key({<<"url">>, list_to_binary(URL)}). 59 | 60 | get_links(File, Acc) -> 61 | case io:get_line(File, "") of 62 | eof -> 63 | file:close(File), 64 | Acc; 65 | URL -> 66 | get_links(File, [URL | Acc]) 67 | end. 68 | 69 | -------------------------------------------------------------------------------- /simple_crawler/apps/sc/src/sc.hrl: -------------------------------------------------------------------------------- 1 | -define(PRINT(Var), io:format("DEBUG: ~p:~p - ~p~n~n ~p~n~n", [?MODULE, ?LINE, ??Var, Var])). -------------------------------------------------------------------------------- /simple_crawler/apps/sc/src/sc_app.erl: -------------------------------------------------------------------------------- 1 | -module(sc_app). 2 | 3 | -behaviour(application). 4 | 5 | %% Application callbacks 6 | -export([start/2, stop/1]). 7 | 8 | %% =================================================================== 9 | %% Application callbacks 10 | %% =================================================================== 11 | 12 | start(_StartType, _StartArgs) -> 13 | case sc_sup:start_link() of 14 | {ok, Pid} -> 15 | ok = riak_core:register([{vnode_module, sc_vnode}]), 16 | ok = riak_core_ring_events:add_guarded_handler(sc_ring_event_handler, []), 17 | ok = riak_core_node_watcher_events:add_guarded_handler(sc_node_event_handler, []), 18 | ok = riak_core_node_watcher:service_up(sc, self()), 19 | {ok, Pid}; 20 | {error, Reason} -> 21 | {error, Reason} 22 | end. 23 | 24 | stop(_State) -> 25 | ok. 26 | -------------------------------------------------------------------------------- /simple_crawler/apps/sc/src/sc_console.erl: -------------------------------------------------------------------------------- 1 | %% @doc Interface for sc-admin commands. 2 | -module(sc_console). 3 | -export([join/1, 4 | leave/1, 5 | remove/1, 6 | ringready/1]). 7 | 8 | join([NodeStr]) -> 9 | try riak_core:join(NodeStr) of 10 | ok -> 11 | io:format("Sent join request to ~s\n", [NodeStr]), 12 | ok; 13 | {error, not_reachable} -> 14 | io:format("Node ~s is not reachable!\n", [NodeStr]), 15 | error; 16 | {error, different_ring_sizes} -> 17 | io:format("Failed: ~s has a different ring_creation_size~n", 18 | [NodeStr]), 19 | error 20 | catch 21 | Exception:Reason -> 22 | lager:error("Join failed ~p:~p", [Exception, Reason]), 23 | io:format("Join failed, see log for details~n"), 24 | error 25 | end. 26 | 27 | leave([]) -> 28 | remove_node(node()). 29 | 30 | remove([Node]) -> 31 | remove_node(list_to_atom(Node)). 32 | 33 | remove_node(Node) when is_atom(Node) -> 34 | try catch(riak_core:remove_from_cluster(Node)) of 35 | {'EXIT', {badarg, [{erlang, hd, [[]]}|_]}} -> 36 | %% This is a workaround because 37 | %% riak_core_gossip:remove_from_cluster doesn't check if 38 | %% the result of subtracting the current node from the 39 | %% cluster member list results in the empty list. When 40 | %% that code gets refactored this can probably go away. 41 | io:format("Leave failed, this node is the only member.~n"), 42 | error; 43 | Res -> 44 | io:format(" ~p\n", [Res]) 45 | catch 46 | Exception:Reason -> 47 | lager:error("Leave failed ~p:~p", [Exception, Reason]), 48 | io:format("Leave failed, see log for details~n"), 49 | error 50 | end. 51 | 52 | -spec(ringready([]) -> ok | error). 
53 | ringready([]) ->
54 |     try riak_core_status:ringready() of
55 |         {ok, Nodes} ->
56 |             io:format("TRUE All nodes agree on the ring ~p\n", [Nodes]);
57 |         {error, {different_owners, N1, N2}} ->
58 |             io:format("FALSE Node ~p and ~p list different partition owners\n",
59 |                       [N1, N2]),
60 |             error;
61 |         {error, {nodes_down, Down}} ->
62 |             io:format("FALSE ~p down. All nodes need to be up to check.\n",
63 |                       [Down]),
64 |             error
65 |     catch
66 |         Exception:Reason ->
67 |             lager:error("Ringready failed ~p:~p", [Exception, Reason]),
68 |             io:format("Ringready failed, see log for details~n"),
69 |             error
70 |     end.
71 |
--------------------------------------------------------------------------------
/simple_crawler/apps/sc/src/sc_downloader_vnode.erl:
--------------------------------------------------------------------------------
1 | -module(sc_downloader_vnode).
2 | -behaviour(riak_core_vnode).
3 | -include("sc.hrl").
4 |
5 | %% API
6 | -export([start_vnode/1]).
7 |
8 | %% Behaviour API
9 | -export([init/1,
10 |          terminate/2,
11 |          handle_command/3,
12 |          is_empty/1,
13 |          delete/1,
14 |          handle_handoff_command/3,
15 |          handoff_starting/2,
16 |          handoff_cancelled/1,
17 |          handoff_finished/2,
18 |          handle_handoff_data/2,
19 |          encode_handoff_item/2,
20 |          handle_coverage/4,
21 |          handle_exit/3]).
22 |
23 | -record(state, {partition}).
24 |
25 | %% API
26 | start_vnode(I) ->
27 |     riak_core_vnode_master:get_vnode_pid(I, ?MODULE).
28 |
29 | %% Callbacks
30 |
31 | init([Partition]) ->
32 |     {ok, #state {partition = Partition}}.
33 |
34 | handle_command(Message, _Sender, State) ->
35 |     ?PRINT({unhandled_command, Message}),
36 |     {noreply, State}.
37 |
38 | handle_handoff_command(_Message, _Sender, State) ->
39 |     {noreply, State}.
40 |
41 | handoff_starting(_TargetNode, State) ->
42 |     {true, State}.
43 |
44 | handoff_cancelled(State) ->
45 |     {ok, State}.
46 |
47 | handoff_finished(_TargetNode, State) ->
48 |     {ok, State}.
49 |
50 | handle_handoff_data(_Data, State) ->
51 |     {reply, ok, State}.
52 |
53 | encode_handoff_item(_ObjectName, _ObjectValue) ->
54 |     <<>>.
55 |
56 | is_empty(State) ->
57 |     {true, State}.
58 |
59 | delete(State) ->
60 |     {ok, State}.
61 |
62 | handle_coverage(_Req, _KeySpaces, _Sender, State) ->
63 |     {stop, not_implemented, State}.
64 |
65 | handle_exit(_Pid, _Reason, State) ->
66 |     {noreply, State}.
67 |
68 | terminate(_Reason, _State) ->
69 |     ok.
70 |
71 | %% Helpers
72 |
73 | print_request_info(Partition, Node, Request) ->
74 |     io:format("~n"
75 |               "Request: ~p~n"
76 |               "Partition: ~p~n"
77 |               "Node: ~p~n",
78 |               [Request, Partition, Node]).
79 |
80 | download(URL) ->
81 |     case httpc:request(URL) of
82 |         {ok, {{_Version, 200, _ReasonPhrase}, _Headers, Body}} ->
83 |             Body;
84 |         {error, Reason} ->
85 |             throw({download_error, Reason});
86 |         {ok, {{_Version, _, ReasonPhrase}, _Headers, _Body}} ->
87 |             throw({download_error, ReasonPhrase});
88 |         Other ->
89 |             throw({download_error, Other})
90 |     end.
91 |
92 | store(URL, Content) ->
93 |     sc:store(URL, Content).
94 |
--------------------------------------------------------------------------------
/simple_crawler/apps/sc/src/sc_node_event_handler.erl:
--------------------------------------------------------------------------------
1 | %% This file is provided to you under the Apache License,
2 | %% Version 2.0 (the "License"); you may not use this file
3 | %% except in compliance with the License.
You may obtain 4 | %% a copy of the License at 5 | %% 6 | %% http://www.apache.org/licenses/LICENSE-2.0 7 | %% 8 | %% Unless required by applicable law or agreed to in writing, 9 | %% software distributed under the License is distributed on an 10 | %% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 11 | %% KIND, either express or implied. See the License for the 12 | %% specific language governing permissions and limitations 13 | %% under the License. 14 | 15 | %% Copyright (c) 2007-2011 Basho Technologies, Inc. All Rights Reserved. 16 | 17 | -module(sc_node_event_handler). 18 | -behaviour(gen_event). 19 | 20 | %% gen_event callbacks 21 | -export([init/1, handle_event/2, handle_call/2, 22 | handle_info/2, terminate/2, code_change/3]). 23 | -record(state, {}). 24 | 25 | init([]) -> 26 | {ok, #state{}}. 27 | 28 | handle_event({service_update, _Services}, State) -> 29 | {ok, State}. 30 | 31 | handle_call(_Event, State) -> 32 | {ok, ok, State}. 33 | 34 | handle_info(_Info, State) -> 35 | {ok, State}. 36 | 37 | terminate(_Reason, _State) -> 38 | ok. 39 | 40 | code_change(_OldVsn, State, _Extra) -> 41 | {ok, State}. 42 | 43 | -------------------------------------------------------------------------------- /simple_crawler/apps/sc/src/sc_ring_event_handler.erl: -------------------------------------------------------------------------------- 1 | %% This file is provided to you under the Apache License, 2 | %% Version 2.0 (the "License"); you may not use this file 3 | %% except in compliance with the License. You may obtain 4 | %% a copy of the License at 5 | %% 6 | %% http://www.apache.org/licenses/LICENSE-2.0 7 | %% 8 | %% Unless required by applicable law or agreed to in writing, 9 | %% software distributed under the License is distributed on an 10 | %% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY 11 | %% KIND, either express or implied. See the License for the 12 | %% specific language governing permissions and limitations 13 | %% under the License. 14 | 15 | %% Copyright (c) 2007-2011 Basho Technologies, Inc. All Rights Reserved. 16 | 17 | -module(sc_ring_event_handler). 18 | -behaviour(gen_event). 19 | 20 | %% gen_event callbacks 21 | -export([init/1, handle_event/2, handle_call/2, 22 | handle_info/2, terminate/2, code_change/3]). 23 | -record(state, {}). 24 | 25 | init([]) -> 26 | {ok, #state{}}. 27 | 28 | handle_event({ring_update, _Ring}, State) -> 29 | {ok, State}. 30 | 31 | handle_call(_Event, State) -> 32 | {ok, ok, State}. 33 | 34 | handle_info(_Info, State) -> 35 | {ok, State}. 36 | 37 | terminate(_Reason, _State) -> 38 | ok. 39 | 40 | code_change(_OldVsn, State, _Extra) -> 41 | {ok, State}. 42 | 43 | -------------------------------------------------------------------------------- /simple_crawler/apps/sc/src/sc_storage_vnode.erl: -------------------------------------------------------------------------------- 1 | -module(sc_storage_vnode). 2 | -behaviour(riak_core_vnode). 3 | -include("sc.hrl"). 4 | -include_lib("riak_core/include/riak_core_vnode.hrl"). 5 | 6 | %% API 7 | -export([start_vnode/1, 8 | store/2, 9 | get_content/2]). 10 | 11 | %% Behaviour API 12 | -export([init/1, 13 | terminate/2, 14 | handle_command/3, 15 | is_empty/1, 16 | delete/1, 17 | handle_handoff_command/3, 18 | handoff_starting/2, 19 | handoff_cancelled/1, 20 | handoff_finished/2, 21 | handle_handoff_data/2, 22 | encode_handoff_item/2, 23 | handle_coverage/4, 24 | handle_exit/3]). 25 | 26 | -record(state, {partition, 27 | store :: dict()}). 28 | 29 | -define(MASTER, sc_storage_vnode_master). 
30 |
31 | %% API
32 | start_vnode(I) ->
33 |     riak_core_vnode_master:get_vnode_pid(I, ?MODULE).
34 |
35 | -spec store({chash:index_as_int(), node()}, {string(), binary()}) -> term().
36 | store(IdxNode, {URL, Content}) ->
37 |     riak_core_vnode_master:command(IdxNode, {store, URL, Content},
38 |                                    ?MASTER).
39 |
40 | get_content(IdxNode, URL) ->
41 |     riak_core_vnode_master:sync_spawn_command(IdxNode,
42 |                                               {get_content, URL},
43 |                                               ?MASTER).
44 | %% Callbacks
45 |
46 | init([Partition]) ->
47 |     {ok, #state {partition = Partition, store = dict:new()}}.
48 |
49 | handle_command({store, URL, Content}, _Sender, State0) ->
50 |     print_request_info(State0#state.partition, node(), {store, URL}),
51 |     State1 = do_store(URL, Content, State0),
52 |     {noreply, State1};
53 | handle_command({get_content, URL} = Req, _Sender, State) ->
54 |     print_request_info(State#state.partition, node(), Req),
55 |     Reply = case do_get_content(URL, State) of
56 |                 not_found = NF ->
57 |                     NF;
58 |                 Content ->
59 |                     {ok, Content}
60 |             end,
61 |     {reply, Reply, State};
62 | handle_command(Message, _Sender, State) ->
63 |     ?PRINT({unhandled_command, Message}),
64 |     {noreply, State}.
65 |
66 | handle_handoff_command(_Message, _Sender, State) ->
67 |     {noreply, State}.
68 |
69 | handoff_starting(_TargetNode, State) ->
70 |     {true, State}.
71 |
72 | handoff_cancelled(State) ->
73 |     {ok, State}.
74 |
75 | handoff_finished(_TargetNode, State) ->
76 |     {ok, State}.
77 |
78 | handle_handoff_data(_Data, State) ->
79 |     {reply, ok, State}.
80 |
81 | encode_handoff_item(_ObjectName, _ObjectValue) ->
82 |     <<>>.
83 |
84 | is_empty(State) ->
85 |     {true, State}.
86 |
87 | delete(State) ->
88 |     {ok, State}.
89 |
90 | handle_coverage(_Req, _KeySpaces, _Sender, State) ->
91 |     {stop, not_implemented, State}.
92 |
93 | handle_exit(_Pid, _Reason, State) ->
94 |     {noreply, State}.
95 |
96 | terminate(_Reason, _State) ->
97 |     ok.
98 |
99 | %% Helpers
100 |
101 | print_request_info(Partition, Node, Request) ->
102 |     io:format("~n"
103 |               "Request: ~p~n"
104 |               "Partition: ~p~n"
105 |               "Node: ~p~n",
106 |               [Request, Partition, Node]).
107 |
108 | do_store(URL, Content, #state{store = Dict} = State) ->
109 |     State#state{store = dict:store(URL, Content, Dict)}.
110 |
111 | do_get_content(URL, #state{store = Dict}) ->
112 |     case dict:find(URL, Dict) of
113 |         error ->
114 |             not_found;
115 |         {ok, Content} ->
116 |             Content
117 |     end.
118 |
--------------------------------------------------------------------------------
/simple_crawler/apps/sc/src/sc_sup.erl:
--------------------------------------------------------------------------------
1 | -module(sc_sup).
2 |
3 | -behaviour(supervisor).
4 |
5 | %% API
6 | -export([start_link/0]).
7 |
8 | %% Supervisor callbacks
9 | -export([init/1]).
10 |
11 | %% ===================================================================
12 | %% API functions
13 | %% ===================================================================
14 |
15 | start_link() ->
16 |     supervisor:start_link({local, ?MODULE}, ?MODULE, []).
17 |
18 | %% ===================================================================
19 | %% Supervisor callbacks
20 | %% ===================================================================
21 |
22 | init(_Args) ->
23 |     VMaster = { sc_vnode_master,
24 |                 {riak_core_vnode_master, start_link, [sc_vnode]},
25 |                 permanent, 5000, worker, [riak_core_vnode_master]},
26 |
27 |     {ok, {{one_for_one, 5, 10}, [VMaster]}}.
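%% NOTE: the README extends the child list above with
%% sc_downloader_vnode_master and sc_storage_vnode_master
%% (see "Implementing downloader part" and "Implementing storage part").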
28 | -------------------------------------------------------------------------------- /simple_crawler/apps/sc/src/sc_vnode.erl: -------------------------------------------------------------------------------- 1 | -module(sc_vnode). 2 | -behaviour(riak_core_vnode). 3 | -include("sc.hrl"). 4 | 5 | -export([start_vnode/1, 6 | init/1, 7 | terminate/2, 8 | handle_command/3, 9 | is_empty/1, 10 | delete/1, 11 | handle_handoff_command/3, 12 | handoff_starting/2, 13 | handoff_cancelled/1, 14 | handoff_finished/2, 15 | handle_handoff_data/2, 16 | encode_handoff_item/2, 17 | handle_coverage/4, 18 | handle_exit/3]). 19 | 20 | -record(state, {partition}). 21 | 22 | %% API 23 | start_vnode(I) -> 24 | riak_core_vnode_master:get_vnode_pid(I, ?MODULE). 25 | 26 | init([Partition]) -> 27 | {ok, #state { partition=Partition }}. 28 | 29 | %% Sample command: respond to a ping 30 | handle_command(ping, _Sender, State) -> 31 | {reply, {pong, State#state.partition}, State}; 32 | handle_command(Message, _Sender, State) -> 33 | ?PRINT({unhandled_command, Message}), 34 | {noreply, State}. 35 | 36 | handle_handoff_command(_Message, _Sender, State) -> 37 | {noreply, State}. 38 | 39 | handoff_starting(_TargetNode, State) -> 40 | {true, State}. 41 | 42 | handoff_cancelled(State) -> 43 | {ok, State}. 44 | 45 | handoff_finished(_TargetNode, State) -> 46 | {ok, State}. 47 | 48 | handle_handoff_data(_Data, State) -> 49 | {reply, ok, State}. 50 | 51 | encode_handoff_item(_ObjectName, _ObjectValue) -> 52 | <<>>. 53 | 54 | is_empty(State) -> 55 | {true, State}. 56 | 57 | delete(State) -> 58 | {ok, State}. 59 | 60 | handle_coverage(_Req, _KeySpaces, _Sender, State) -> 61 | {stop, not_implemented, State}. 62 | 63 | handle_exit(_Pid, _Reason, State) -> 64 | {noreply, State}. 65 | 66 | terminate(_Reason, _State) -> 67 | ok. 
68 | -------------------------------------------------------------------------------- /simple_crawler/links.txt: -------------------------------------------------------------------------------- 1 | http://en.wikipedia.org/ 2 | http://es.wikipedia.org/ 3 | http://ja.wikipedia.org/ 4 | http://ru.wikipedia.org/ 5 | http://de.wikipedia.org/ 6 | http://fr.wikipedia.org/ 7 | http://it.wikipedia.org/ 8 | http://zh.wikipedia.org/ 9 | http://pl.wikipedia.org/ 10 | http://pt.wikipedia.org/ 11 | http://nl.wikipedia.org/ 12 | http://ceb.wikipedia.org/ 13 | http://sv.wikipedia.org/ 14 | http://vi.wikipedia.org/ 15 | http://war.wikipedia.org/ 16 | http://ar.wikipedia.org/ 17 | http://az.wikipedia.org/ 18 | http://bg.wikipedia.org/ 19 | http://ca.wikipedia.org/ 20 | http://cs.wikipedia.org/ 21 | http://da.wikipedia.org/ 22 | http://et.wikipedia.org/ 23 | http://el.wikipedia.org/ 24 | http://eo.wikipedia.org/ 25 | http://eu.wikipedia.org/ 26 | http://fa.wikipedia.org/ 27 | http://gl.wikipedia.org/ 28 | http://ko.wikipedia.org/ 29 | http://hy.wikipedia.org/ 30 | http://hi.wikipedia.org/ 31 | http://hr.wikipedia.org/ 32 | http://id.wikipedia.org/ 33 | http://he.wikipedia.org/ 34 | http://la.wikipedia.org/ 35 | http://lt.wikipedia.org/ 36 | http://hu.wikipedia.org/ 37 | http://ms.wikipedia.org/ 38 | http://min.wikipedia.org/ 39 | http://no.wikipedia.org/ 40 | http://nn.wikipedia.org/ 41 | http://uz.wikipedia.org/ 42 | http://kk.wikipedia.org/ 43 | http://ro.wikipedia.org/ 44 | http://simple.wikipedia.org/ 45 | http://sk.wikipedia.org/ 46 | http://sl.wikipedia.org/ 47 | http://sr.wikipedia.org/ 48 | http://sh.wikipedia.org/ 49 | http://fi.wikipedia.org/ 50 | -------------------------------------------------------------------------------- /simple_crawler/rebar: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mentels/riak_core_tutorial/90c115dc7ff5245d278423fa07a207fac5450826/simple_crawler/rebar -------------------------------------------------------------------------------- /simple_crawler/rebar.config: -------------------------------------------------------------------------------- 1 | %% -*- erlang -*- 2 | {sub_dirs, ["rel", "apps/sc"]}. 3 | {cover_enabled, true}. 4 | {erl_opts, [debug_info]}. 5 | {edoc_opts, [{dir, "../../doc"}]}. 6 | {deps, [{riak_core, "1.4.10", 7 | {git, "git://github.com/basho/riak_core", {tag, "1.4.10"}}} 8 | ]}. 9 | -------------------------------------------------------------------------------- /simple_crawler/rel/files/app.config: -------------------------------------------------------------------------------- 1 | %% -*- erlang -*- 2 | [ 3 | %% Riak Core config 4 | {riak_core, [ 5 | %% Default location of ringstate 6 | {ring_state_dir, "{{ring_state_dir}}"}, 7 | 8 | %% http is a list of IP addresses and TCP ports that the Riak 9 | %% HTTP interface will bind. 10 | {http, [ {"{{web_ip}}", {{web_port}} } ]}, 11 | 12 | %% https is a list of IP addresses and TCP ports that the Riak 13 | %% HTTPS interface will bind. 14 | %{https, [{ "{{web_ip}}", {{web_port}} }]}, 15 | 16 | %% default cert and key locations for https can be overridden 17 | %% with the ssl config variable 18 | %{ssl, [ 19 | % {certfile, "etc/cert.pem"}, 20 | % {keyfile, "etc/key.pem"} 21 | % ]}, 22 | 23 | %% riak_handoff_port is the TCP port that Riak uses for 24 | %% intra-cluster data handoff. 
25 | {handoff_port, {{handoff_port}} } 26 | ]}, 27 | 28 | %% SASL config 29 | {sasl, [ 30 | {sasl_error_logger, {file, "log/sasl-error.log"}}, 31 | {errlog_type, error}, 32 | {error_logger_mf_dir, "log/sasl"}, % Log directory 33 | {error_logger_mf_maxbytes, 10485760}, % 10 MB max file size 34 | {error_logger_mf_maxfiles, 5} % 5 files max 35 | ]} 36 | ]. 37 | -------------------------------------------------------------------------------- /simple_crawler/rel/files/erl: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | ## This script replaces the default "erl" in erts-VSN/bin. This is necessary 4 | ## as escript depends on erl and in turn, erl depends on having access to a 5 | ## bootscript (start.boot). Note that this script is ONLY invoked as a side-effect 6 | ## of running escript -- the embedded node bypasses erl and uses erlexec directly 7 | ## (as it should). 8 | ## 9 | ## Note that this script makes the assumption that there is a start_clean.boot 10 | ## file available in $ROOTDIR/release/VSN. 11 | 12 | # Determine the abspath of where this script is executing from. 13 | ERTS_BIN_DIR=$(cd ${0%/*} && pwd) 14 | 15 | # Now determine the root directory -- this script runs from erts-VSN/bin, 16 | # so we simply need to strip off two dirs from the end of the ERTS_BIN_DIR 17 | # path. 18 | ROOTDIR=${ERTS_BIN_DIR%/*/*} 19 | 20 | # Parse out release and erts info 21 | START_ERL=`cat $ROOTDIR/releases/start_erl.data` 22 | ERTS_VSN=${START_ERL% *} 23 | APP_VSN=${START_ERL#* } 24 | 25 | BINDIR=$ROOTDIR/erts-$ERTS_VSN/bin 26 | EMU=beam 27 | PROGNAME=`echo $0 | sed 's/.*\\///'` 28 | CMD="$BINDIR/erlexec" 29 | export EMU 30 | export ROOTDIR 31 | export BINDIR 32 | export PROGNAME 33 | 34 | exec $CMD -boot $ROOTDIR/releases/$APP_VSN/start_clean ${1+"$@"} 35 | -------------------------------------------------------------------------------- /simple_crawler/rel/files/nodetool: -------------------------------------------------------------------------------- 1 | %% -*- mode: erlang;erlang-indent-level: 4;indent-tabs-mode: nil -*- 2 | %% ex: ft=erlang ts=4 sw=4 et 3 | %% ------------------------------------------------------------------- 4 | %% 5 | %% nodetool: Helper Script for interacting with live nodes 6 | %% 7 | %% ------------------------------------------------------------------- 8 | 9 | main(Args) -> 10 | ok = start_epmd(), 11 | %% Extract the args 12 | {RestArgs, TargetNode} = process_args(Args, [], undefined), 13 | 14 | %% See if the node is currently running -- if it's not, we'll bail 15 | case {net_kernel:hidden_connect_node(TargetNode), net_adm:ping(TargetNode)} of 16 | {true, pong} -> 17 | ok; 18 | {_, pang} -> 19 | io:format("Node ~p not responding to pings.\n", [TargetNode]), 20 | halt(1) 21 | end, 22 | 23 | case RestArgs of 24 | ["ping"] -> 25 | %% If we got this far, the node already responsed to a ping, so just dump 26 | %% a "pong" 27 | io:format("pong\n"); 28 | ["stop"] -> 29 | io:format("~p\n", [rpc:call(TargetNode, init, stop, [], 60000)]); 30 | ["restart"] -> 31 | io:format("~p\n", [rpc:call(TargetNode, init, restart, [], 60000)]); 32 | ["reboot"] -> 33 | io:format("~p\n", [rpc:call(TargetNode, init, reboot, [], 60000)]); 34 | ["rpc", Module, Function | RpcArgs] -> 35 | case rpc:call(TargetNode, list_to_atom(Module), list_to_atom(Function), 36 | [RpcArgs], 60000) of 37 | ok -> 38 | ok; 39 | {badrpc, Reason} -> 40 | io:format("RPC to ~p failed: ~p\n", [TargetNode, Reason]), 41 | halt(1); 42 | _ -> 43 | halt(1) 44 | end; 45 | 
["rpcterms", Module, Function, ArgsAsString] -> 46 | case rpc:call(TargetNode, list_to_atom(Module), list_to_atom(Function), 47 | consult(ArgsAsString), 60000) of 48 | {badrpc, Reason} -> 49 | io:format("RPC to ~p failed: ~p\n", [TargetNode, Reason]), 50 | halt(1); 51 | Other -> 52 | io:format("~p\n", [Other]) 53 | end; 54 | Other -> 55 | io:format("Other: ~p\n", [Other]), 56 | io:format("Usage: nodetool {ping|stop|restart|reboot}\n") 57 | end, 58 | net_kernel:stop(). 59 | 60 | process_args([], Acc, TargetNode) -> 61 | {lists:reverse(Acc), TargetNode}; 62 | process_args(["-setcookie", Cookie | Rest], Acc, TargetNode) -> 63 | erlang:set_cookie(node(), list_to_atom(Cookie)), 64 | process_args(Rest, Acc, TargetNode); 65 | process_args(["-name", TargetName | Rest], Acc, _) -> 66 | ThisNode = append_node_suffix(TargetName, "_maint_"), 67 | {ok, _} = net_kernel:start([ThisNode, longnames]), 68 | process_args(Rest, Acc, nodename(TargetName)); 69 | process_args(["-sname", TargetName | Rest], Acc, _) -> 70 | ThisNode = append_node_suffix(TargetName, "_maint_"), 71 | {ok, _} = net_kernel:start([ThisNode, shortnames]), 72 | process_args(Rest, Acc, nodename(TargetName)); 73 | process_args([Arg | Rest], Acc, Opts) -> 74 | process_args(Rest, [Arg | Acc], Opts). 75 | 76 | 77 | start_epmd() -> 78 | [] = os:cmd(epmd_path() ++ " -daemon"), 79 | ok. 80 | 81 | epmd_path() -> 82 | ErtsBinDir = filename:dirname(escript:script_name()), 83 | Name = "epmd", 84 | case os:find_executable(Name, ErtsBinDir) of 85 | false -> 86 | case os:find_executable(Name) of 87 | false -> 88 | io:format("Could not find epmd.~n"), 89 | halt(1); 90 | GlobalEpmd -> 91 | GlobalEpmd 92 | end; 93 | Epmd -> 94 | Epmd 95 | end. 96 | 97 | 98 | nodename(Name) -> 99 | case string:tokens(Name, "@") of 100 | [_Node, _Host] -> 101 | list_to_atom(Name); 102 | [Node] -> 103 | [_, Host] = string:tokens(atom_to_list(node()), "@"), 104 | list_to_atom(lists:concat([Node, "@", Host])) 105 | end. 106 | 107 | append_node_suffix(Name, Suffix) -> 108 | case string:tokens(Name, "@") of 109 | [Node, Host] -> 110 | list_to_atom(lists:concat([Node, Suffix, os:getpid(), "@", Host])); 111 | [Node] -> 112 | list_to_atom(lists:concat([Node, Suffix, os:getpid()])) 113 | end. 114 | 115 | 116 | %% 117 | %% Given a string or binary, parse it into a list of terms, ala file:consult/0 118 | %% 119 | consult(Str) when is_list(Str) -> 120 | consult([], Str, []); 121 | consult(Bin) when is_binary(Bin)-> 122 | consult([], binary_to_list(Bin), []). 123 | 124 | consult(Cont, Str, Acc) -> 125 | case erl_scan:tokens(Cont, Str, 0) of 126 | {done, Result, Remaining} -> 127 | case Result of 128 | {ok, Tokens, _} -> 129 | {ok, Term} = erl_parse:parse_term(Tokens), 130 | consult([], Remaining, [Term | Acc]); 131 | {eof, _Other} -> 132 | lists:reverse(Acc); 133 | {error, Info, _} -> 134 | {error, Info} 135 | end; 136 | {more, Cont1} -> 137 | consult(Cont1, eof, Acc) 138 | end. 
139 | -------------------------------------------------------------------------------- /simple_crawler/rel/files/sc: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # -*- tab-width:4;indent-tabs-mode:nil -*- 3 | # ex: ts=4 sw=4 et 4 | 5 | RUNNER_SCRIPT_DIR=$(cd ${0%/*} && pwd) 6 | 7 | RUNNER_BASE_DIR=${RUNNER_SCRIPT_DIR%/*} 8 | RUNNER_ETC_DIR=$RUNNER_BASE_DIR/etc 9 | RUNNER_LOG_DIR=$RUNNER_BASE_DIR/log 10 | # Note the trailing slash on $PIPE_DIR/ 11 | PIPE_DIR=/tmp/$RUNNER_BASE_DIR/ 12 | RUNNER_USER= 13 | 14 | # Make sure this script is running as the appropriate user 15 | if [ ! -z "$RUNNER_USER" ] && [ `whoami` != "$RUNNER_USER" ]; then 16 | exec sudo -u $RUNNER_USER -i $0 $@ 17 | fi 18 | 19 | # Make sure CWD is set to runner base dir 20 | cd $RUNNER_BASE_DIR 21 | 22 | # Make sure log directory exists 23 | mkdir -p $RUNNER_LOG_DIR 24 | 25 | # Extract the target node name from node.args 26 | NAME_ARG=`grep -e '-[s]*name' $RUNNER_ETC_DIR/vm.args` 27 | if [ -z "$NAME_ARG" ]; then 28 | echo "vm.args needs to have either -name or -sname parameter." 29 | exit 1 30 | fi 31 | 32 | # Extract the target cookie 33 | COOKIE_ARG=`grep -e '-setcookie' $RUNNER_ETC_DIR/vm.args` 34 | if [ -z "$COOKIE_ARG" ]; then 35 | echo "vm.args needs to have a -setcookie parameter." 36 | exit 1 37 | fi 38 | 39 | # Identify the script name 40 | SCRIPT=`basename $0` 41 | 42 | # Parse out release and erts info 43 | START_ERL=`cat $RUNNER_BASE_DIR/releases/start_erl.data` 44 | ERTS_VSN=${START_ERL% *} 45 | APP_VSN=${START_ERL#* } 46 | 47 | # Add ERTS bin dir to our path 48 | ERTS_PATH=$RUNNER_BASE_DIR/erts-$ERTS_VSN/bin 49 | 50 | # Setup command to control the node 51 | NODETOOL="$ERTS_PATH/escript $ERTS_PATH/nodetool $NAME_ARG $COOKIE_ARG" 52 | 53 | # Check the first argument for instructions 54 | case "$1" in 55 | start) 56 | # Make sure there is not already a node running 57 | RES=`$NODETOOL ping` 58 | if [ "$RES" = "pong" ]; then 59 | echo "Node is already running!" 60 | exit 1 61 | fi 62 | HEART_COMMAND="$RUNNER_BASE_DIR/bin/$SCRIPT start" 63 | export HEART_COMMAND 64 | mkdir -p $PIPE_DIR 65 | shift # remove $1 66 | $ERTS_PATH/run_erl -daemon $PIPE_DIR $RUNNER_LOG_DIR "exec $RUNNER_BASE_DIR/bin/$SCRIPT console $@" 2>&1 67 | ;; 68 | 69 | stop) 70 | # Wait for the node to completely stop... 71 | case `uname -s` in 72 | Linux|Darwin|FreeBSD|DragonFly|NetBSD|OpenBSD) 73 | # PID COMMAND 74 | PID=`ps ax -o pid= -o command=|\ 75 | grep "$RUNNER_BASE_DIR/.*/[b]eam"|awk '{print $1}'` 76 | ;; 77 | SunOS) 78 | # PID COMMAND 79 | PID=`ps -ef -o pid= -o args=|\ 80 | grep "$RUNNER_BASE_DIR/.*/[b]eam"|awk '{print $1}'` 81 | ;; 82 | CYGWIN*) 83 | # UID PID PPID TTY STIME COMMAND 84 | PID=`ps -efW|grep "$RUNNER_BASE_DIR/.*/[b]eam"|awk '{print $2}'` 85 | ;; 86 | esac 87 | $NODETOOL stop 88 | while `kill -0 $PID 2>/dev/null`; 89 | do 90 | sleep 1 91 | done 92 | ;; 93 | 94 | restart) 95 | ## Restart the VM without exiting the process 96 | $NODETOOL restart 97 | ;; 98 | 99 | reboot) 100 | ## Restart the VM completely (uses heart to restart it) 101 | $NODETOOL reboot 102 | ;; 103 | 104 | ping) 105 | ## See if the VM is alive 106 | $NODETOOL ping 107 | ;; 108 | 109 | attach) 110 | # Make sure a node IS running 111 | RES=`$NODETOOL ping` 112 | if [ "$RES" != "pong" ]; then 113 | echo "Node is not running!" 
            exit 1
        fi

        shift
        $ERTS_PATH/to_erl $PIPE_DIR
        ;;

    console|console_clean)
        # .boot file is typically just $SCRIPT (i.e., the app name);
        # however, for debugging, sometimes start_clean.boot is useful:
        case "$1" in
            console) BOOTFILE=$SCRIPT ;;
            console_clean) BOOTFILE=start_clean ;;
        esac
        # Setup beam-required vars
        ROOTDIR=$RUNNER_BASE_DIR
        BINDIR=$ROOTDIR/erts-$ERTS_VSN/bin
        EMU=beam
        PROGNAME=`echo $0 | sed 's/.*\\///'`
        CMD="$BINDIR/erlexec -boot $RUNNER_BASE_DIR/releases/$APP_VSN/$BOOTFILE -embedded -config $RUNNER_ETC_DIR/app.config -args_file $RUNNER_ETC_DIR/vm.args -- ${1+"$@"}"
        export EMU
        export ROOTDIR
        export BINDIR
        export PROGNAME

        # Dump environment info for logging purposes
        echo "Exec: $CMD"
        echo "Root: $ROOTDIR"

        # Log the startup
        logger -t "$SCRIPT[$$]" "Starting up"

        # Start the VM
        exec $CMD
        ;;

    *)
        echo "Usage: $SCRIPT {start|stop|restart|reboot|ping|console|console_clean|attach}"
        exit 1
        ;;
esac

exit 0
--------------------------------------------------------------------------------
/simple_crawler/rel/files/sc-admin:
--------------------------------------------------------------------------------
#!/bin/bash

RUNNER_SCRIPT_DIR=$(cd ${0%/*} && pwd)
RUNNER_SCRIPT=${0##*/}

RUNNER_BASE_DIR=${RUNNER_SCRIPT_DIR%/*}
RUNNER_ETC_DIR=$RUNNER_BASE_DIR/etc
RUNNER_LOG_DIR=$RUNNER_BASE_DIR/log
RUNNER_USER=

# Make sure this script is running as the appropriate user
if [ "$RUNNER_USER" -a "x$LOGNAME" != "x$RUNNER_USER" ]; then
    type -p sudo > /dev/null 2>&1
    if [ $? -ne 0 ]; then
        echo "sudo doesn't appear to be installed and your EUID isn't $RUNNER_USER" 1>&2
        exit 1
    fi
    echo "Attempting to restart script through sudo -u $RUNNER_USER"
    exec sudo -u $RUNNER_USER -i $RUNNER_SCRIPT_DIR/$RUNNER_SCRIPT "$@"
fi

# Make sure CWD is set to runner base dir
cd $RUNNER_BASE_DIR

# Extract the target node name from vm.args
NAME_ARG=`grep -e '-[s]*name' $RUNNER_ETC_DIR/vm.args`
if [ -z "$NAME_ARG" ]; then
    echo "vm.args needs to have either -name or -sname parameter."
    exit 1
fi

# Learn how to specify node name for connection from remote nodes
echo "$NAME_ARG" | grep '^-sname' > /dev/null 2>&1
if [ "X$?" = "X0" ]; then
    NAME_PARAM="-sname"
    NAME_HOST=""
else
    NAME_PARAM="-name"
    echo "$NAME_ARG" | grep '@.*' > /dev/null 2>&1
    if [ "X$?" = "X0" ]; then
        NAME_HOST=`echo "${NAME_ARG}" | sed -e 's/.*\(@.*\)$/\1/'`
    else
        NAME_HOST=""
    fi
fi

# Extract the target cookie
COOKIE_ARG=`grep '\-setcookie' $RUNNER_ETC_DIR/vm.args`
if [ -z "$COOKIE_ARG" ]; then
    echo "vm.args needs to have a -setcookie parameter."
    exit 1
fi

# Identify the script name
SCRIPT=`basename $0`

# Parse out release and erts info
START_ERL=`cat $RUNNER_BASE_DIR/releases/start_erl.data`
ERTS_VSN=${START_ERL% *}
APP_VSN=${START_ERL#* }

# Add ERTS bin dir to our path
ERTS_PATH=$RUNNER_BASE_DIR/erts-$ERTS_VSN/bin

# Setup command to control the node
NODETOOL="$ERTS_PATH/escript $ERTS_PATH/nodetool $NAME_ARG $COOKIE_ARG"

run()
{
    mod=$1
    shift
    cmd=$1
    shift

    # Make sure the local node IS running
    RES=`$NODETOOL ping`
    if [ "$RES" != "pong" ]; then
        echo "Node is not running!"
        exit 1
    fi

    $NODETOOL rpc $mod $cmd $@
}

# Check the first argument for instructions
case "$1" in
    join)
        shift
        run sc_console join $@
        ;;

    leave)
        shift
        run sc_console leave $@
        ;;

    remove)
        if [ $# -ne 2 ]; then
            echo "Usage: $SCRIPT remove <node>"
            exit 1
        fi

        shift
        run sc_console remove $@
        ;;

    member_status)
        if [ $# -ne 1 ]; then
            echo "Usage: $SCRIPT member_status"
            exit 1
        fi

        shift
        run riak_core_console member_status $@
        ;;

    ring_status)
        if [ $# -ne 1 ]; then
            echo "Usage: $SCRIPT ring_status"
            exit 1
        fi

        shift
        run riak_core_console ring_status $@
        ;;

    services)
        $NODETOOL rpcterms riak_core_node_watcher services ''
        ;;

    wait-for-service)
        SVC=$2
        TARGETNODE=$3
        if [ $# -lt 3 ]; then
            echo "Usage: $SCRIPT wait-for-service <service_name> <target_node>"
            exit 1
        fi

        while (true); do
            # Make sure riak_core_node_watcher is up and running locally before trying to query it
            # to avoid ugly (but harmless) error messages
            NODEWATCHER=`$NODETOOL rpcterms erlang whereis "'riak_core_node_watcher'."`
            if [ "$NODEWATCHER" = "undefined" ]; then
                echo "$SVC is not up: node watcher is not running"
                sleep 3 # back off instead of busy-looping until the watcher registers
                continue
            fi

            # Get the list of services that are available on the requested node
            SERVICES=`$NODETOOL rpcterms riak_core_node_watcher services "'${TARGETNODE}'."`
            echo "$SERVICES" | grep "[[,]$SVC[],]" > /dev/null 2>&1
            if [ "X$?" = "X0" ]; then
                echo "$SVC is up"
                exit 0
            else
                echo "$SVC is not up: $SERVICES"
            fi
            sleep 3
        done
        ;;

    ringready)
        shift
        run sc_console ringready $@
        ;;

    *)
        echo "Usage: $SCRIPT { join | leave | reip | ringready | remove |"
        echo "                 services | wait-for-service | member_status |"
        echo "                 ring_status }"
        exit 1
        ;;
esac
--------------------------------------------------------------------------------
/simple_crawler/rel/files/vm.args:
--------------------------------------------------------------------------------
## Name of the node
-name {{node}}

## Cookie for distributed erlang
-setcookie {{cookie}}

## Heartbeat management; auto-restarts VM if it dies or becomes unresponsive
## (Disabled by default..use with caution!)
##-heart

## Enable kernel poll and a few async threads
+K true
+A 5

## Increase number of concurrent ports/sockets
-env ERL_MAX_PORTS 4096

## Tweak GC to run more often
-env ERL_FULLSWEEP_AFTER 10

--------------------------------------------------------------------------------
/simple_crawler/rel/reltool.config:
--------------------------------------------------------------------------------
{sys, [
       {lib_dirs, ["../apps/", "../deps/"]},
       {rel, "sc", "1",
        [
         kernel,
         stdlib,
         sasl,
         sc
        ]},
       {rel, "start_clean", "",
        [
         kernel,
         stdlib
        ]},
       {boot_rel, "sc"},
       {profile, embedded},
       {excl_sys_filters, ["^bin/.*",
                           "^erts.*/bin/(dialyzer|typer)"]},
       {app, sasl, [{incl_cond, include}]},
       {app, sc, [{incl_cond, include}]}
      ]}.

{target_dir, "sc"}.

{overlay_vars, "vars.config"}.

{overlay, [
           {mkdir, "data/ring"},
           {mkdir, "log/sasl"},
           {copy, "files/erl", "\{\{erts_vsn\}\}/bin/erl"},
           {copy, "files/nodetool", "\{\{erts_vsn\}\}/bin/nodetool"},
           {template, "files/app.config", "etc/app.config"},
           {template, "files/vm.args", "etc/vm.args"},
           {template, "files/sc", "bin/sc"},
           {template, "files/sc-admin", "bin/sc-admin"}
          ]}.
--------------------------------------------------------------------------------
/simple_crawler/rel/vars.config:
--------------------------------------------------------------------------------
%%
%% etc/app.config
%%
{ring_state_dir, "data/ring"}.
{web_ip, "127.0.0.1"}.
{web_port, "8098"}.
{handoff_port, "8099"}.

%%
%% etc/vm.args
%%
{node, "sc@127.0.0.1"}.
{cookie, "sc"}.
--------------------------------------------------------------------------------
/simple_crawler/rel/vars/dev1.config:
--------------------------------------------------------------------------------
%%
%% etc/app.config
%%
{ring_state_dir, "data/ring"}.
{web_ip, "127.0.0.1"}.
{web_port, "8091"}.
{handoff_port, "8101"}.

%%
%% etc/vm.args
%%
{node, "sc1@127.0.0.1"}.
{cookie, "sc"}.
--------------------------------------------------------------------------------
/simple_crawler/rel/vars/dev2.config:
--------------------------------------------------------------------------------
%%
%% etc/app.config
%%
{ring_state_dir, "data/ring"}.
{web_ip, "127.0.0.1"}.
{web_port, "8092"}.
{handoff_port, "8102"}.

%%
%% etc/vm.args
%%
{node, "sc2@127.0.0.1"}.
{cookie, "sc"}.
--------------------------------------------------------------------------------
/simple_crawler/rel/vars/dev3.config:
--------------------------------------------------------------------------------
%%
%% etc/app.config
%%
{ring_state_dir, "data/ring"}.
{web_ip, "127.0.0.1"}.
{web_port, "8093"}.
{handoff_port, "8103"}.

%%
%% etc/vm.args
%%
{node, "sc3@127.0.0.1"}.
{cookie, "sc"}.
--------------------------------------------------------------------------------
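The three `vars/devN.config` files let one `reltool.config` stamp out three releases that differ only in node name and ports (web 8091-8093, handoff 8101-8103), so the nodes can coexist on 127.0.0.1. Note that each dev file repeats every variable rather than overriding only the ones that change. The Makefile itself is not shown here, but the usual rebar template wiring looks roughly like this sketch (directory and target names are illustrative):
```bash
# Hypothetical devrel wiring: generate each dev node from the same
# reltool.config, swapping in that node's overlay variables.
cd rel
../rebar generate target_dir=../dev/dev1 overlay_vars=vars/dev1.config
../rebar generate target_dir=../dev/dev2 overlay_vars=vars/dev2.config
../rebar generate target_dir=../dev/dev3 overlay_vars=vars/dev3.config
```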