├── .eqc_ci ├── EQC_CI_LICENCE.txt ├── LICENSE ├── Makefile ├── README.md ├── doc ├── 5HT.css ├── cr.htm └── images │ ├── log.svg │ ├── merging.svg │ ├── replicas.svg │ └── sup.png ├── include ├── cr.hrl ├── rafter.hrl ├── rafter_consensus_fsm.hrl └── rafter_opts.hrl ├── mad ├── otp.mk ├── rebar.config ├── src ├── backends │ └── cr_kvs.erl ├── consensus │ ├── README.md │ ├── cr_config.erl │ ├── cr_log.erl │ ├── cr_paxon.erl │ ├── cr_rafter.erl │ └── cr_replication.erl ├── cr.app.src ├── cr.erl ├── cr_app.erl ├── cr_hash.erl ├── cr_heart.erl ├── cr_vnode.erl └── tcp │ ├── cr_client.erl │ ├── cr_connection.erl │ ├── cr_interconnect.erl │ ├── cr_ping.erl │ └── cr_tcp.erl ├── sys.config └── vm.args /.eqc_ci: -------------------------------------------------------------------------------- 1 | {build, "./mad dep com pla"}. 2 | {test_path, "ebin"}. 3 | {deps, "deps"}. 4 | {test_root, "test"}. 5 | -------------------------------------------------------------------------------- /EQC_CI_LICENCE.txt: -------------------------------------------------------------------------------- 1 | This file is an agreement between Quviq AB ("Quviq"), Sven Hultins 2 | Gata 9, Gothenburg, Sweden, and the committers to the github 3 | repository in which the file appears ("the owner"). By placing this 4 | file in a github repository, the owner agrees to the terms below. 5 | 6 | The purpose of the agreement is to enable Quviq AB to provide a 7 | continuous integration service to the owner, whereby the code in the 8 | repository ("the source code") is tested using Quviq's test tools, and 9 | the test results are made available on the web. The test results 10 | include test output, generated test cases, and a copy of the source 11 | code in the repository annotated with coverage information ("the test 12 | results"). 13 | 14 | The owner agrees that Quviq may run the tests in the source code and 15 | display the test results on the web, without obligation. 
16 | 17 | The owner warrants that running the tests in the source code and 18 | displaying the test results on the web violates no laws, licences or other 19 | agreements. In the event of such a violation, the owner accepts full 20 | responsibility. 21 | 22 | The owner warrants that the source code is not malicious, and will not 23 | mount an attack on either Quviq's server or any other server--for 24 | example by taking part in a denial of service attack, or by attempting 25 | to send unsolicited emails. 26 | 27 | The owner warrants that the source code does not attempt to reverse 28 | engineer Quviq's code. 29 | 30 | Quviq reserves the right to exclude repositories that break this 31 | agreement from its continuous integration service. 32 | 33 | Any dispute arising from the use of Quviq's service will be resolved 34 | under Swedish law. 35 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2015 Maxim Sokhatsky, Synrc Research Center 2 | 3 | Permission is hereby granted, free of charge, to any person obtaining a copy 4 | of this software and associated documentation files (the "Software"), to deal 5 | in the Software without restriction, including without limitation the rights 6 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 7 | copies of the Software, and to permit persons to whom the Software is 8 | furnished to do so, subject to the following conditions: 9 | 10 | Software may only be used for the great good and the true happiness of all sentient beings. 11 | 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 13 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 14 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 15 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 16 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 17 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 18 | THE SOFTWARE. 19 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | RELEASE := cr 2 | COOKIE := node_runner 3 | VER := 1.0.0 4 | 5 | NAME ?= cr 6 | 7 | default: compile 8 | 9 | include otp.mk 10 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Byzantine Chain Replication Database 2 | ==================================== 3 | 4 | [](https://gitter.im/spawnproc/cr?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) 5 | 6 | In banking systems the demands are very tight. The database 7 | should be replicated at least three times, stand-by nodes should pick up 8 | master reads from the failover node, writes should be 9 | accepted on a reasonable quorum, failover must be followed by recovery, and the database 10 | should be able to scale even under RAM/DISC limitations. 11 | 12 | No data should be treated as written unless it is committed to all replicas. 13 | All these circumstances lead us to the chain replication protocol as a simple and natural 14 | answer to this challenge. 15 | 16 | Different replication techniques exist to satisfy replication demands. 17 | Master-slave replication is the most widely known type of replication, 18 | used in products such as GFS, HDFS, mongodb, etc. Quorum Intersection 19 | is another technique, used in databases like Cassandra or Amazon Dynamo. 20 | They mostly provide a consistent distributed repository 21 | for event tables or for file storage. 
In the banking industry 22 | we synchronize account balances and need a simple and manageable 23 | protocol for storage consistency, one that meets high demands on system integrity. 24 | 25 | There are several classes of failure usually considered when dealing with failure detection. 26 | The weakest class is fail-stop events, when the outage is normal or predictable. 27 | The second class is crash failures: abnormal terminations and outages. The strongest 28 | class is Byzantine failures, which cover bit-flips, 29 | hacked parties or any other way of compromising the transaction objects. 30 | For banking applications Byzantine fault tolerance is desired, 31 | even though it affects latency. 32 | 33 | Features 34 | -------- 35 | 36 | * Highly-available CP database :-) 37 | * 2N+1 nodes tolerate N failures 38 | * Consistent hashing DHT 39 | * RAFT for managing the server configuration timeline 40 | * HMAC signing for Byzantine capabilities 41 | * Various database backends: mnesia, riak, redis, fs, sql 42 | * High-performance non-blocking TCP acceptor 43 | * Separate endpoints for HEART, CLIENT and SERVER protocols 44 | * Pure, clean and understandable codebase 45 | * Article about CR implementation details: http://synrc.space/apps/cr/doc/cr.htm 46 | * Business Processing Erlang book: http://synrc.space/apps/bpe/doc/book.pdf 47 | 48 | Launch 49 | ------ 50 | 51 | ```bash 52 | make console NAME=cr 53 | make console NAME=cr2 54 | make console NAME=cr3 55 | ``` 56 | 57 | You can start all nodes in separate console sessions, or you 58 | can start nodes with `make start NAME=cr2` and later attach to them with `make attach NAME=cr2`. 59 | All nodes can also be started from a single folder without any problems. 60 | 61 | ```erlang 62 | > timer:tc(cr,test,[500]). 
63 | 64 | =INFO REPORT==== 7-Apr-2015::00:56:34 === 65 | cr:Already in Database: 14020 66 | New record will be applied: 500 67 | {214369,{transactions,11510}} 68 | ``` 69 | 70 | To generate sample data, say 500 transactions, you can run `cr:test(500)`. 71 | The measured accepting performance is about `2000 Req/s`. 72 | 73 | ```erlang 74 | > cr:dump(). 75 | 76 | vnode i n top latency 77 | 121791803110908576516973736059690251637994378581 1 1 391 2/198/64 78 | 243583606221817153033947472119380503275988757162 2 1 400 2/183/72 79 | 365375409332725729550921208179070754913983135743 3 1 388 3/195/64 80 | 487167212443634306067894944238761006551977514324 4 1 357 2/183/53 81 | 608959015554542882584868680298451258189971892905 5 2 12994 2/198/67 82 | 730750818665451459101842416358141509827966271486 6 2 13017 3/184/66 83 | 852542621776360035618816152417831761465960650067 7 2 13019 2/201/75 84 | 974334424887268612135789888477522013103955028648 8 2 13020 3/178/62 85 | 1096126227998177188652763624537212264741949407229 9 3 13021 2/190/68 86 | 1217918031109085765169737360596902516379943785810 10 3 13028 3/206/65 87 | 1339709834219994341686711096656592768017938164391 11 3 13030 2/208/55 88 | 1461501637330902918203684832716283019655932542972 12 3 13031 2/185/58 89 | ok 90 | ``` 91 | 92 | The latency in the last column (`~70 ms`) is measured up to the moment the data is stored on all `mnesia` replicas. 93 | The latency in this example is for storing with async_dirty using KVS 94 | chain linking (from `1 to 3` msg per write operation, from `1 to 2` msg for lookups) 95 | on a cluster of `3 nodes` with the same number of replicas. 96 | 97 | Let's say we want to see the whole operation log of a given replica, `391`. 98 | 99 | ```erlang 100 | > cr:dump(391). 
101 | operation id prev i size 102 | transaction:389:feed::false: 391 387 1 480 103 | transaction:399:feed::false: 387 382 1 500 104 | transaction:375:feed::false: 382 379 1 446 105 | transaction:373:feed::false: 379 378 1 446 106 | transaction:383:feed::false: 378 376 1 473 107 | transaction:392:feed::false: 376 374 1 500 108 | transaction:360:feed::false: 374 371 1 446 109 | transaction:366:feed::false: 371 370 1 473 110 | transaction:370:feed::false: 370 369 1 446 111 | transaction:371:feed::false: 369 368 1 446 112 | ok 113 | ``` 114 | 115 | You can also check this from the other side: first retrieve the operation, and then 116 | retrieve the transaction created during that operation. 117 | 118 | ```erlang 119 | > kvs:get(operation,391). 120 | {ok,#operation{id = 391,version = undefined,container = log, 121 | feed_id = {121791803110908576516973736059690251637994378581,1}, % VNODE 122 | prev = 387,next = undefined,feeds = [],guard = false, 123 | etc = undefined, 124 | body = {prepare,{<0.41.0>,{1428,358105,840469}}, 125 | [{121791803110908576516973736059690251637994378581,1}, % SIGNATURES 126 | {608959015554542882584868680298451258189971892905,2}], 127 | #transaction{id = 389,version = undefined,container = feed, 128 | feed_id = undefined,prev = undefined,next = undefined, 129 | feeds = [],guard = false,etc = undefined, 130 | timestamp = undefined,beneficiary = undefined,...}}, 131 | name = prepare,status = pending}} 132 | ``` 133 | 134 | The transaction. To link a transaction to a feed you should use the full XA 135 | protocol with two-stage confirmation: (1) the PUT operation followed 136 | by (2) the LINK operation to some feed, such as a user account or customer admin list. 137 | 138 | ```erlang 139 | > kvs:get(transaction,389). 
140 | {ok,#transaction{id = 389,version = undefined, 141 | container = feed, feed_id = undefined, prev = undefined, 142 | next = undefined, feeds = [], guard = false, etc = undefined, 143 | timestamp = [], beneficiary = [], 144 | subsidiary = [], amount = [],tax = [], 145 | ballance = [], currency = [], 146 | description = [], info = [], 147 | prevdate = [], rate = [], item = []}} 148 | ``` 149 | 150 | The actual Erlang business object, a banking transaction from the `db` schema 151 | application, is stored under id 389, so you can easily grab it unlinked, 152 | as it was stored as an atomic PUT. 153 | 154 | Licenses 155 | -------- 156 | 157 | * consensus protocols 1) raft and 2) paxos are distributed under the terms of Apache 2.0 http://www.apache.org/licenses/LICENSE-2.0.html 158 | * cr itself is distributed under the DHARMA license: http://5ht.co/license.htm 159 | 160 | Credits 161 | ------- 162 | 163 | Copyright (c) 2015 Synrc Research Center s.r.o. 164 | 165 | * Maxim Sokhatsky 166 | * Vladimir Kirillov 167 | * Sergey Klimenko 168 | * Valery Meleshkin 169 | * Victor Sovietov 170 | 171 | OM A HUM 172 | -------------------------------------------------------------------------------- /doc/5HT.css: -------------------------------------------------------------------------------- 1 | pre { padding:4px;white-space:pre;background-color:#F1F1F1;font-family:monospace;font-size:14pt;} 2 | code { padding:4px;white-space:pre;font-family:monospace;font-size:14pt;} 3 | body { font-family: local; font-size: 16pt; color: #888; } 4 | h1 { font-size: 34pt; } 5 | h2 { font-size: 24pt; margin-top: 50px; } 6 | h3 { margin-top: 40px; } 7 | h4 { margin-top: 40px; } 8 | h5 { margin-top: -20px; } 9 | p { margin-top: 10px; } 10 | .note { margin-top: 0px; } 11 | .note p { margin-top: 20px; } 12 | .menu { text-align: right;} 13 | a { margin-top: 10px; padding: 10px; } 14 | .app { margin:100px auto;min-width:300px;max-width:800px; } 15 | .message { align: center; } 16 | .note { 
margin-left:0px;margin-top:0px;background-color:#F1F1F1;padding:4px 10px 4px 24px;color:gray;} 17 | ul {margin-left:70px;} 18 | 19 | a { color: blue; text-decoration: none } 20 | a:hover { color:blue; } 21 | a:hover, a:active { outline: 0 } 22 | 23 | @font-face { 24 | font-family: 'local'; 25 | src: url('Geometria-Light.otf'); 26 | font-weight: normal; 27 | font-style: normal 28 | } 29 | -------------------------------------------------------------------------------- /doc/cr.htm: -------------------------------------------------------------------------------- 1 | 2 |
3 | 4 | 5 | 6 | 7 | 8 |