├── .eqc_ci ├── .gitignore ├── .travis.yml ├── AUTHORS ├── CONTRIBUTING.md ├── Description.md ├── EQC_CI_LICENCE.txt ├── Makefile ├── Proto.md ├── README.md ├── doc └── overview.edoc ├── erlang.mk ├── include ├── etorrent_rate.hrl ├── etorrent_version.hrl └── supervisor.hrl ├── rebar.config ├── rebar.lock ├── rel ├── sys.config └── vm.args ├── relx.config ├── src ├── Makefile ├── dht.app.src ├── dht.erl ├── dht_app.erl ├── dht_constants.hrl ├── dht_metric.erl ├── dht_net.erl ├── dht_par.erl ├── dht_proto.erl ├── dht_rand.erl ├── dht_refresh.erl ├── dht_routing_meta.erl ├── dht_routing_table.erl ├── dht_search.erl ├── dht_socket.erl ├── dht_state.erl ├── dht_store.erl ├── dht_sup.erl ├── dht_time.erl └── dht_track.erl └── test ├── Makefile ├── README.md ├── dht_SUITE.erl ├── dht_cluster.erl ├── dht_eqc.erl ├── dht_eqc.hrl ├── dht_metric_eqc.erl ├── dht_net_eqc.erl ├── dht_par_eqc.erl ├── dht_proto_eqc.erl ├── dht_rand_eqc.erl ├── dht_routing_meta_eqc.erl ├── dht_routing_table_eqc.erl ├── dht_state_eqc.erl ├── dht_store_eqc.erl ├── dht_time_eqc.erl ├── dht_track_eqc.erl ├── eqc_lib.erl ├── meta_cluster.erl ├── net_cluster.erl ├── routing_table.erl ├── state_cluster.erl ├── store_cluster.erl └── track_cluster.erl /.eqc_ci: -------------------------------------------------------------------------------- 1 | {build, "make eqc-ci"}. 
2 | 3 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | *.beam 2 | deps/ 3 | ebin/*.app 4 | .dht_bt.plt 5 | src/dialyzer.out 6 | .dht.plt 7 | src/dialyze.out 8 | dht.state.bin 9 | test/.eqc-info 10 | test/current_counterexample.eqc 11 | _build 12 | .rebar 13 | doc/*.html 14 | doc/*.css 15 | doc/*.png 16 | doc/edoc-info 17 | 18 | current_counterexample.eqc 19 | .eqc-info 20 | _rel 21 | relx 22 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | language: erlang 2 | 3 | sudo: false 4 | 5 | script: "make all" 6 | otp_release: 7 | - 18.0 8 | - 18.1 9 | -------------------------------------------------------------------------------- /AUTHORS: -------------------------------------------------------------------------------- 1 | Original Authors: 2 | 3 | Magnus Klaar 4 | Jesper Louis Andersen 5 | 6 | Other contributions by: 7 | 8 | Josh Adams 9 | Alexander Færøy 10 | Edward Wang 11 | Mads Hartmann Jensen 12 | Maxim Treskin 13 | Michael Uvarov 14 | Benoît Chesneau 15 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Making contributions to this project 2 | 3 | Rule #1: Don't piss Jesper off. 4 | 5 | We have fairly good EQC coverage, so if you write any PR, discuss how you want to test that your change is correct, and how it fits into the model at large. Most bugs will mean one of two things: 6 | 7 | * The model is wrong. 8 | * The model doesn't cover the area you have found a bug in. 9 | 10 | Most features will require at least some model change. 11 | 12 | # Code of Conduct 13 | 14 | There is no code of conduct for this project. 
15 | -------------------------------------------------------------------------------- /Description.md: -------------------------------------------------------------------------------- 1 | # A description of DIDS—Distributed ID Service 2 | 3 | This is a Distributed ID Service, or DIDS. The project started out as a Distributed Hash Table, but thinking about it more made me realize it is not really a DHT, but something else. Therefore it is more apt to rename things to make it clear what kind of service we provide. 4 | 5 | So what is a Distributed Identity Service? It is a service in which we ultimately want to provide a mapping 6 | 7 | Identity -> Location* 8 | 9 | That is, given an identity, we obtain the locations where we can find it. Our system relies on the IP protocol, so we define locations as IP address and port pairs, and we define identities as very large integers. So we have: 10 | 11 | Identity ::= 256 bit integer 12 | Location ::= {inet:ip_address(), inet:port_number()} 13 | 14 | and the DIDS maps Identities into Locations, nothing more. But it does so in a distributed fashion, in order to avoid a single point of failure. 15 | 16 | The *distributed* part is that DIDS nodes are part of a *swarm* of other nodes they know about. Queries which can't be satisfied locally can ask other nodes in the swarm for the locations. This provides the system with robustness and no single point of failure. 17 | 18 | The DIDS system keeps a *partial* routing table of all nodes known in the swarm. The table is constructed such that nodes which are "close" are more likely to be in the routing table, and nodes which are far away are less likely to be there. This allows us to query foreign nodes for a given Identity and its locations, should we not have it in our local cache. Querying will necessarily be limited by network round trips. That is, a query is relatively fast, but it is not blindingly fast. 
We have deliberately built the system for robustness of keeping values rather than being fast. 19 | 20 | ## TL;DR version 21 | 22 | DIDS is essentially a global DNS service with no single point of failure and no hierarchy of machines within the infrastructure. It is a big dynamic swarm of machines covering for each other. It provides a mapping of *identities* into *locations* much like DNS does. The details are different, but the general idea is the same: you can use this from within Erlang to create large naming registries. 23 | 24 | ## Important properties 25 | 26 | A DIDS never stores data back to "Itself". That is, A given node is *never* its own client. This is in contrast with other systems, like the Dynamo ring used by Amazon and Riak, where some queries are to the node "itself" and acts exactly like another node in the ring. When you store Identities in the swarm, you are always storing identities in some other place than in your own node. This is because it is assumed you already store locally in your own cache, so there is no need to handle this. 27 | 28 | (Note: the API should make it easy to handle this at the top-level, so you don't have to go and worry about this. But it is not really fleshed out how to do this in detail, actually). 29 | 30 | # Uses 31 | 32 | Typical use cases are: 33 | 34 | * Data repositories for persistent data. Package repositories. 35 | * World-wide global file systems. 36 | * Very large scale file systems in large distributed data centers. 37 | * Naming services with no central authority. 38 | * Global dynamic DNS registry for Erlang nodes. 39 | 40 | # Non-goals 41 | 42 | A DIDS is different from a DHT. Specifically, we need to address a number of points it doesn't try to solve: 43 | 44 | * We do not map `Key -> Value`. This is to avoid the trap of building a system which also takes care of storage. We don't care about how data are actually stored. 
The only thing we do is to provide a way in which you can get locations of the given Identity. You can use any storage system you want with DIDS. 45 | 46 | * The DIDS does not persist data at all. If an Identity is pushed to the swarm it has to be refreshed at a given interval, or it will drop out of the swarm's global knowledge. This is to make sure the swarm is fairly dynamic in the way it works and to protect it: If you have millions of entries you wish the swarm to manage, you have work to do in order to push out these identities at regular intervals. (In time, we will provide ways to make the DIDS system refresh its kept values). 47 | 48 | * We do not define a protocol for grabbing the actual data. This means you can build this on top of other services. You could for instance define that a hexadecimal representation of the ID is what you are using. And that if the location is of the form `10.0.x.y:8080` then you should run a request 49 | http://10.0.x.y:8080/c/54bad3fa… 50 | 51 | to obtain data. It is entirely up to you how the identity should be interpreted. You can even let them represent Infohashes for torrents if you want. It is the same idea that drives the distributed hash table in the BitTorrent network. 52 | 53 | # API 54 | 55 | We support 4 commands: 56 | 57 | * PING(Peer): Ask another node in the swarm if it is alive 58 | * FIND_NODE(ID): Find a node with a given ID 59 | * FIND_VALUE(ID): Find nodes which have the given ID associated with them. 60 | * STORE(ID, Port): Store the fact that ID can be obtained on this Node by contacting it on the port `Port`. The IP is implicitly derived because we don't want enemies to be able to store an arbitrary IP address in the swarm, but only the address from which the packet originates. When we return that we are willing to handle a value for a peer, we give that peer a unique token back. They have to supply this token to us, or we reject the store at our end of the system. 
This ensures the peer has actually contacted us recently to store a value. Tokens stop being valid a minute in. 61 | 62 | Notes: Values can *never* be deleted but given enough time and no refreshing STORE command, they will automatically dissipate from the cloud. Unless some other node decides to keep the reference alive. 63 | 64 | # Rules 65 | 66 | * Your identities must be drawn from the 256 bit unsigned integer space uniformly. In Erlang, one way to do this is to take the data you wish to store and index it based on its content. Given `Content` run `ID = crypto:hash(sha256, Content)` and store the ID pointing back to you. 67 | 68 | * Nodes in the swarm pick random IDs in the 256 bit space as well. They have to in order to achieve good spread. Otherwise the system can degenerate. 69 | 70 | # Inspiration: 71 | 72 | * The `Kademlia` Distributed Hash Table ( http://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf ). 73 | * The `BitTorrent` DHT implementation documented in `BEP 005`. 74 | -------------------------------------------------------------------------------- /EQC_CI_LICENCE.txt: -------------------------------------------------------------------------------- 1 | This file is an agreement between Quviq AB ("Quviq"), Sven Hultins 2 | Gata 9, Gothenburg, Sweden, and the committers to the github 3 | repository in which the file appears ("the owner"). By placing this 4 | file in a github repository, the owner agrees to the terms below. 5 | 6 | The purpose of the agreement is to enable Quviq AB to provide a 7 | continuous integration service to the owner, whereby the code in the 8 | repository ("the source code") is tested using Quviq's test tools, and 9 | the test results are made available on the web. The test results 10 | include test output, generated test cases, and a copy of the source 11 | code in the repository annotated with coverage information ("the test 12 | results"). 
13 | 14 | The owner agrees that Quviq may run the tests in the source code and 15 | display the test results on the web, without obligation. 16 | 17 | The owner warrants that running the tests in the source code and 18 | displaying the test results on the web violates no laws, licences or other 19 | agreements. In the event of such a violation, the owner accepts full 20 | responsibility. 21 | 22 | The owner warrants that the source code is not malicious, and will not 23 | mount an attack on either Quviq's server or any other server--for 24 | example by taking part in a denial of service attack, or by attempting 25 | to send unsolicited emails. 26 | 27 | The owner warrants that the source code does not attempt to reverse 28 | engineer Quviq's code. 29 | 30 | Quviq reserves the right to exclude repositories that break this 31 | agreement from its continuous integration service. 32 | 33 | Any dispute arising from the use of Quviq's service will be resolved 34 | under Swedish law. 35 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | PROJECT=dht 2 | 3 | LOCAL_DEPS = crypto 4 | .DEFAULT_GOAL := app 5 | 6 | include erlang.mk 7 | -------------------------------------------------------------------------------- /Proto.md: -------------------------------------------------------------------------------- 1 | # DHT Protocol design 2 | 3 | A far-reaching DHT for Erlang nodes needs a protocol which is different from the BitTorrent DHT protocol. In BitTorrent, we utilize the common interchange format for BitTorrent, *bencoding*, in order to convey information between nodes. Messages are exchanged via UDP in a quick Request/Response pattern and there is some cookie-employment in order to protect against rogue nodes going havoc and destroying the DHT cloud in its entirety. 4 | 5 | The problem with a DHT built for world-wide adaption is *trust*. 
We can't in general trust other nodes to produce meaningful inputs. An evil system can easily send us random garbage in order to mess with us. Therefore, the format we propose must be resilient against that. Hence, we propose a simple format, with few moving parts in order to make it harder for untrusted parties to mess with our system. 6 | 7 | This file only contains the parts of the protocol which have to do with sending and receiving messages on the wire. The parts which have to do with the high-level DHT semantics belong elsewhere. This split makes it possible to focus on one thing at a time, and produce better software, hopefully. 8 | 9 | We avoid using the Erlang Term binary format for this reason. It is a format which is excellent between trusted participants, but for an untrusted node, it is not so good. We opt instead for a binary format with a very simple and well-defined tree-like structure we can parse by binary pattern matching in Erlang. Great care has been taken to make parsing as simple as possible so as to avoid parsing ambiguities: 10 | 11 | * The format can be parsed from the head through a simple EBNF-like grammar structure. 12 | * Length fields are kept to a minimum and the grammar is designed to be easy to parse as an LL(1) grammar by recursive descent. 13 | * Great care has been taken to limit the size of various fields such that it is not possible to mis-parse data by reading strings incorrectly. 14 | * The grammar has been written so it is suited for parsing by Erlang binary pattern matching. 15 | 16 | 17 | # Deviations from Kademlia 18 | 19 | We deviate from the Kademlia paper in one very important aspect. In Kademlia, you store pairs of Key/Value. In our distributed network, you store a mapping from a Key to the IP/Port pairs that have the key. It is implicitly expected that the identification is enough to determine what kind of protocol we are speaking. 
That is, if we are given `10.18.19.20` at port `80`, for key ID `0xc0ffecafe`, we can build things on the assumption that requesting `http://10.18.19.20/v/c0ffecafe` will obtain the value for that key. So this protocol doesn't store values themselves, but only a mapping from the world of Key material into an IP world where we can retrieve the given values. 20 | 21 | This design choice is made to keep the DHT as simple as possible. For most systems, this is enough and the only facility that the DHT should provide is a way to identify who has what in a decentralized and distributed fashion. The actual storage of data is left to another system in a typical layered model. 22 | 23 | # Syntax 24 | 25 | Messages are exchanged as packets. The UDP packets have this general framing form: 26 | 27 | Packet ::= <<"EDHT-KDM-", Version:8/integer, Tag:16/integer, ID:256, Msg/binary>> 28 | 29 | The "EDHT-KDM-" header makes it possible to remove spurious messages that accidentally hit the port. The `Version` field allows for 256 versions so we can extend the protocol later. I propose a binary protocol which is not easily extensible indefinitely, although certain simple extensions are possible. It is not our intent that this protocol is to be used by other parties, except for the Erlang DHT cloud. Hence, we keep the format simple in version 0. If we hit extension hell, we can always propose a later version of the protocol, parsing data differently. In that situation, we would probably extend the protocol with a self-describing data set like in ASN.1. 30 | 31 | The `Tag` is a 16 bit value which is selected by the querying entity and reflected back in the message from the responding entity. This means you can match up the values and have multiple outstanding requests to the same node. It also makes it easy to track outstanding requests, and correlate them to waiting processes locally. 32 | 33 | The tag is not meant to be a security feature. 
A random attacker can easily forge reply-packets given the tag size. On the other hand, it would not provide much added security if we extended the tag to a 128 bit random value, say. In this case, an eavesdropper, Eve, can just sniff the query packet and come up with a fake reply to that query. As such, it is possible to steer the replies. 34 | 35 | Finally, each message contains an `ID` field. The ID field encodes the NodeID of the node from which the message originated. It was easier to add this to each and every message rather than trying to handle it on a per-message type basis. It is more often the case that a message will contain an ID than it will be the case that it will not. 36 | 37 | Messages are not length-coded directly. The remainder of the UDP packet is the message. Note that implementations are free to limit the message lengths to 1024 bytes if they want. This is to protect against excessively overloading a node. There are three types of messages: 38 | 39 | Msg ::= 40 | | <<$q, QueryMsg/binary>> 41 | | <<$r, ReplyMsg/binary>> 42 | | <<$e, ErrMsg/binary>> 43 | 44 | For each kind of query, there is a corresponding reply. So if the query type is `K` then `qK` has a reply `rK`. 45 | 46 | It is important to stress there is no bijection between a query and reply. Often, the query and its corresponding reply are vastly different packets with vastly different kinds of data. 47 | 48 | The rule is that for queries Q, replies R and errors E, there are two valid transitions: 49 | 50 | either 51 | Q → R (reply) 52 | or 53 | Q → E (error) 54 | 55 | That is, a query results in either a reply or an error, but never both. We would have liked exactly-once semantics, but since this is impossible, there is a `Tag` in each message to allow (near) idempotent handling of messages. 
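
As a minimal sketch of the framing described above, the following Erlang module splits a received UDP payload into its `Version`, `Tag`, `ID` and message parts, then classifies the message by its first byte. The module name `edht_frame`, the function `decode/1` and the error atoms are hypothetical and not part of the `dht` application. The 1024 byte guard is the optional limit mentioned above, not a requirement of the format.

```erlang
-module(edht_frame).
-export([decode/1]).

%% Hypothetical helper, for illustration only.
%% Packet ::= <<"EDHT-KDM-", Version:8/integer, Tag:16/integer, ID:256, Msg/binary>>
-spec decode(binary()) -> {ok, tuple()} | {error, atom()}.
decode(<<"EDHT-KDM-", Version:8/integer, Tag:16/integer, ID:256, Msg/binary>>)
  when byte_size(Msg) =< 1024 ->
    classify(Version, Tag, ID, Msg);
decode(<<"EDHT-KDM-", _/binary>>) ->
    %% Right header, but truncated fields or an oversized message.
    {error, malformed_or_too_large};
decode(_) ->
    %% Spurious traffic that accidentally hit the port.
    {error, bad_header}.

%% Msg ::= <<$q, QueryMsg/binary>> | <<$r, ReplyMsg/binary>> | <<$e, ErrMsg/binary>>
classify(Version, Tag, ID, <<$q, Query/binary>>) ->
    {ok, {Version, Tag, ID, {query, Query}}};
classify(Version, Tag, ID, <<$r, Reply/binary>>) ->
    {ok, {Version, Tag, ID, {reply, Reply}}};
classify(Version, Tag, ID, <<$e, Err/binary>>) ->
    {ok, {Version, Tag, ID, {error, Err}}};
classify(_Version, _Tag, _ID, _Msg) ->
    {error, unknown_message_type}.
```

Because the header is matched as a literal prefix, spurious packets fall through to `{error, bad_header}` without any further parsing, and the returned `Tag` can be used by the caller to correlate a reply with its outstanding query.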
56 | 57 | # Error handling 58 | 59 | We begin by handling Errors because they are the simplest: 60 | 61 | ErrMsg ::= <> length(ErrString) =< 1024 bytes 62 | ErrString <> 63 | 64 | We limit the error message to 1024 *bytes*. We don't want excessive parses of large messages here, so we keep it short. The `ErrCode` entries are taken from an Error Code table, given below, together with its error message. The list is forward extensible. 65 | 66 | # Security considerations: 67 | 68 | A DHT like Kademlia uses random Identities chosen for nodes. And chooses a cryptographic hash function to represent the identity of content. Given bytes `Bin`, the ID of the binary `Bin` is simple the value `crypto:hash(sha256, Bin)`. Hence, the strength of the integrity guarantee we provide is given by the strength of the hash function we pick. 69 | 70 | * Confidentiality: No confidentiality is provided. Everyone can snoop at what you are requesting at any point in time. 71 | 72 | * Integrity: Node-ID and Key-ID identity is chosen to be SHA-256. This is a change from SHA-1 used in Kademlia back in 2001. SHA-1 has collision problems in numerous ways at the moment so in order to preserve 2nd preimage resistance, and obtain proper integrity, we need at least SHA-256. SHA-3 or SHA-512 are also possible for extending the security margin further, but we can do so in a later version of the protocol. I've opted *not* to make the hash-function negotiable. If an error crops up in SHA-256, we bump the protocol number and rule out any earlier request as being invalid. This also builds in a nice self-destruction mechanism, so safe clients don't accidentally talk to insecure clients. 73 | 74 | * Availability: The protocol is susceptible to several attacks on its availability. The protection against it is a "enough nodes" defense, much like the one posed in BitCoin, but it is somewhat shady. If nodes lie about routing information or if a node is flooded with requests it will cease to operate correctly. 
Hopefully the sought-after value is at multiple nodes, so this doesn't pose a problem. But in itself, the protocol does nothing to protect availability. 75 | 76 | The key take-away from the DHT method is that it provides integrity, but neither confidentiality nor availability. If you receive a key `K`, you must construct a design where you can *derive* the key `K` from a value `V`. The standard way is to take the cryptographic hash of the value, i.e., `K = crypto:hash(sha256, V)`. 77 | 78 | # Common entities 79 | 80 | The format uses a set of common data types which are described here: 81 | 82 | SHA_ID ::= <> 83 | IP4 ::= <> Parse as {ID, {B1, B2, B3, B4}, Port} for an IPv4 socket 84 | 85 | The SHA_ID refers to a 256 bit SHA-256 digest. It is used to uniquely identify nodes and keys in the DHT space. The values IP4 refers to peers given by an IP address and a Port number on which to contact a peer. The two values correspond to IPv4 and IPv6 addressing respectively. 86 | 87 | # Commands 88 | 89 | Each command is a Query/Reply pair. The formats of the query and its reply are usually not the same, but they are connected since each query results in a reply. This means that there is a rule that a query for command K must result in a reply of type K. Otherwise things are wrong. This is easily handled in a parser. Furthermore, it means we can parse replies without having to tell the parser what command to expect before we try to parse the reply. It neatly decouples the syntax of the protocol from its semantics. 90 | 91 | In the following, it is always an exchange between Alice and Bob, where Alice sends a Query-message to Bob, who then replies back with a Reply-message. But of course, in a real peer-to-peer network, the roles can just as easily be reversed. We just pick names here for the purpose of meaningful explanation. 92 | 93 | All commands are 1-byte values. We deliberately pick the values such that the commands have mnemonics. 
In principle it just encodes a one-byte enumeration of the different kinds of message types. 94 | 95 | A peer is free to return an error back if it wants. But clients should be prepared for timeouts. Overloaded clients might limit responses to other parties. 96 | 97 | ## `p`—Ping a node to check its availability 98 | 99 | A `p` command is used to check for availability of a peer: 100 | 101 | QueryMsg ::= <<$p>> 102 | ReplyMsg ::= <<$p>> 103 | 104 | Alice sends her Node-ID SHA to Bob and Bob replies back with his Node-ID. This is used to learn that another node is up and running, or is not responding to pings right now. Note that the enclosing `Packet` frame already carries the ID, so there is no reason to repeat it in the `QueryMsg` or `ReplyMsg` here. 105 | 106 | ## `f`—Find (search) for a node with a given ID 107 | 108 | In the DHT protocol you can find either nodes or values. A node search is always going to return nodes which are close to a given key, whereas the value search may return addresses which host the searched value in question. 109 | 110 | QueryMsg ::= … 111 | | <<$f, $n, SHA_ID>> 112 | | <<$f, $v, SHA_ID>> 113 | ReplyMsg ::= … 114 | | <<$f, $n, L:8, Nodes>> 115 | | <<$f, $v, Token:64, L:8, Values>> 116 | 117 | Nodes = <> 118 | Values = <> 119 | 120 | Replies to find_node commands are always going to be a node-reply (`$f, $n`) with `L` nodes. A parser MUST check that it is given `L` nodes. Likewise, a find_value command (`$f, $v`) can return nodes, but it can also return values, also with a length encoded as `L` which MUST be checked by the parser. 121 | 122 | The given Token is used as a protection against random Storage requests. A store request to a node `X` must supply a `Token` value that was recently received in a reply to a find_value query. This makes sure that before I can store a new ID into the DHT swarm, I need to get close to the area in the swarm where the data will be stored. 
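
The `p` and `f` messages above map directly onto Erlang binary patterns. The sketch below is one hedged way to write them; the module name `edht_query`, its function names and the returned tuples are hypothetical. It assumes a `SHA_ID` is carried as a 256 bit integer, as the text describes, and it leaves the `Nodes` and `Values` payloads unparsed, so the mandatory check that exactly `L` entries are present is deliberately not shown.

```erlang
-module(edht_query).
-export([encode_query/1, decode_query/1, decode_reply/1]).

%% Hypothetical helpers, for illustration only.

%% Build the QueryMsg part of a packet (without the $q prefix or framing).
encode_query(ping)             -> <<$p>>;
encode_query({find_node, ID})  -> <<$f, $n, ID:256>>;
encode_query({find_value, ID}) -> <<$f, $v, ID:256>>.

%% Parse a QueryMsg; anything not in the grammar is rejected.
decode_query(<<$p>>)              -> {ok, ping};
decode_query(<<$f, $n, ID:256>>)  -> {ok, {find_node, ID}};
decode_query(<<$f, $v, ID:256>>)  -> {ok, {find_value, ID}};
decode_query(_)                   -> {error, bad_query}.

%% Parse the fixed-size head of a ReplyMsg. The Nodes/Values entries are
%% returned as a raw binary; a full parser MUST still verify that it
%% holds exactly L entries.
decode_reply(<<$p>>) ->
    {ok, ping};
decode_reply(<<$f, $n, L:8, Nodes/binary>>) ->
    {ok, {nodes, L, Nodes}};
decode_reply(<<$f, $v, Token:64, L:8, Values/binary>>) ->
    {ok, {values, Token, L, Values}};
decode_reply(_) ->
    {error, bad_reply}.
```

Keeping the command byte at the head of each clause means a reply can be parsed without telling the parser which command was sent; the caller then checks that the reply type matches the outstanding query.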
123 | 124 | ## `s`—Store a Key/Value pair in the DHT cloud 125 | 126 | The `s` command stores the availability of a Key in the cloud: 127 | 128 | QueryMsg ::= … 129 | | <<$s, Token:64, KEY, Port:16>> 130 | ReplyMsg ::= … 131 | | <<$s>> 132 | 133 | KEY ::= SHA_ID 134 | 135 | Store a mapping `KEY → {IP, Port}` under this node. The IP is implicit and is obtained from the source IP address of the UDP packet. Each `KEY` is allowed to be stored multiple times, under different locations. A query can therefore be answered with multiple locations. 136 | 137 | # Extensions 138 | 139 | We give extensions as DEPs (Distributed-hash-table Extension Proposals) 140 | 141 | ## DEP001: IPv6 Support 142 | 143 | In the modern Internet, IPv6 support is a necessity. We have already run out of IPv4 addresses in all major and minor regions, and we can do nothing but watch IP addresses being NAT'ed and sold in auctions to the highest bidder. Hence, we need the protocol to naturally extend to the successor, IPv6. The approach we take is to store two lists of nodes, one for IPv4 and one for IPv6 in a specific order. Over time, the list of IPv4 addresses will fall out of the protocol. 144 | 145 | Implementation: TODO. 146 | 147 | ## DEP002: MAC'ed messages 148 | 149 | Let `S` be a secret shared by all peers. Then 150 | 151 | MacPacket ::= <<"EDHT-KDM-M-", Version:8/integer, Tag:16/integer, Msg/binary, MAC:256>> 152 | 153 | is a MAC-encoded packet, which is encoded as a normal packet, except that it has another header and contains a 256 bit MAC (Message Authentication Code). Clients handle this packet by checking the MAC against `S`. If it fails to pass the check, the packet is thrown away on the grounds of a MAC error. This allows people to create "local" DHT clouds which have no participants other than the designated ones. There is no accidental situation which can make this DHT merge with other DHTs or the world at large. 
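
The MAC check can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the text above only says the MAC is 256 bits, so HMAC-SHA-256 keyed with `S` is an assumption, as is computing it over every byte that precedes the MAC. The module name `edht_mac` and its functions are hypothetical. `crypto:mac/4` needs Erlang/OTP 22.1 or later; on older releases `crypto:hmac/3` played the same role.

```erlang
-module(edht_mac).
-export([seal/2, open/2]).

%% Hypothetical helpers, for illustration only.
%% Assumption: MAC = HMAC-SHA-256(S, everything before the MAC).
-define(MAC_BYTES, 32).

%% Append a 256 bit MAC to an already encoded packet body.
seal(S, Body) when is_binary(S), is_binary(Body) ->
    Mac = crypto:mac(hmac, sha256, S, Body),
    <<Body/binary, Mac/binary>>.

%% Check the trailing MAC and only then hand the body to the decoder.
open(S, Packet) when is_binary(S), byte_size(Packet) > ?MAC_BYTES ->
    BodyLen = byte_size(Packet) - ?MAC_BYTES,
    <<Body:BodyLen/binary, Mac:?MAC_BYTES/binary>> = Packet,
    Expected = crypto:mac(hmac, sha256, S, Body),
    case equal(Expected, Mac) of
        true  -> {ok, Body};
        false -> {error, bad_mac}
    end;
open(_S, _Packet) ->
    {error, too_short}.

%% Compare two binaries without stopping at the first differing byte.
equal(A, B) when byte_size(A) =:= byte_size(B) -> equal(A, B, 0);
equal(_, _) -> false.

equal(<<X, RestA/binary>>, <<Y, RestB/binary>>, Acc) ->
    equal(RestA, RestB, Acc bor (X bxor Y));
equal(<<>>, <<>>, Acc) ->
    Acc =:= 0.
```

In this sketch `open/2` never inspects the body before the MAC has been verified, so a packet from a sender without `S` is dropped before any protocol decoding takes place.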
154 | 155 | Strict rule: You *MUST* verify the MAC before you attempt to decode the packet in an implementation. This guards against malicious users trying to inject messages/packets into the cloud you have built and trusted. 156 | 157 | In a large setting, this partially addresses the availability of the DHT. An adversary in the middle, Mallory, can't inject packets which our DHT then tries to handle. Also, unless you have `S`, you can't communicate with the DHT. Note that this doesn't provide confidentiality of packet messages. 158 | 159 | ## DEP003: NaCl encrypted messages 160 | 161 | TODO—NaCl encrypted exchanges with shared secrets. 162 | 163 | # Error Codes and their messages 164 | 165 | TODO 166 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # dht — Distributed Hash Table for Erlang 2 | 3 | The `dht` application implements a Distributed Hash Table for Erlang. It is excised from the etorrent application in order to make it possible to use it without using the rest of etorrent. 4 | 5 | The code has been heavily rewritten by now. There are few traces left of the original code base. The reason for the partial rewrite was to support a full QuickCheck model, so code was changed in order to make certain parts easier to handle from a formal point of view. 6 | 7 | # State of the code 8 | 9 | The code is currently in an early alpha state, and things may not work yet. In particular, many parts of the system have not been thoroughly tested and as such they may contain grave bugs. 10 | 11 | The QuickCheck model currently exercises every part of the system, but avoids the `search` and `refresh` code for the time being. 12 | 13 | Check the issues at github. They may contain current problems. 14 | 15 | ## Flag days 16 | 17 | Since we are at an early alpha, changes will be made which are not backwards compatible. 
This section describes the so-called “flag days” at which we change stuff in ways that are not backwards compatible to earlier versions. Once things are stable and we are looking at a real release, a versioning scheme will be in place to handle this. 18 | 19 | * 2015-08-15: Increased the size of the token from 4 to 8 bytes. Protocol format has changed as a result. The old clients will fail to handle this correctly. Also, started using the Versions more correctly in the protocol. 20 | 21 | # Building 22 | 23 | The code relies on no additional modules but the standard Erlang/OTP implementation in order to ease its portability to other systems and injection into existing code bases. The code does require at least Erlang/OTP release 18 as it makes heavy use of the new time API. To handle older Erlang releases, some work is needed in the module `dht_time` to provide backporting capabilities. 24 | 25 | To build the code: 26 | 27 | make 28 | 29 | This uses `erlang.mk` currently, but you can also execute: 30 | 31 | rebar compile 32 | 33 | or 34 | 35 | rebar3 compile 36 | 37 | which will also work on the project. 38 | 39 | # What is a DHT? 40 | 41 | A distributed hash table is a mapping from *identities* which are 256 bit numbers to a list of pairs, `[{IP, Port}]` on which the identity has been registered. The table is *distributed* by having a large number of nodes store small parts of the mapping each. Also, each node keeps a partial routing table in order to be able to find values. 42 | 43 | Note the DHT, much like DNS doesn't interpret what the identity is. It just maps that identity onto the IP/Port endpoints. That is, you have to write your own protocol once you know where an identity is registered. 44 | 45 | Also, the protocol makes no security guarantees. Anyone can claim to store a given ID. It is recommended clients have some way to verify that indeed, the other end stores the correct value for the ID. 
For instance, one can run `ID = crypto:hash(sha256, Val)` on a claimed value to verify that it is correct, as long as the value does not take too long to transfer. Another option is to store a Public key as the identity and then challenge the target end to verify the correct key, for instance by use of a protocol like CurveCP. 46 | 47 | The implemented DHT has the following properties: 48 | 49 | * Multiple peers can store the same identity. In that case, each peer is reflected in the table. 50 | * Protocol agnostic, you are encouraged to do your own protocol, security checking and blacklisting of peers. 51 | * Insertion is fast, but deletion can take up to one hour because it is based on timeouts. If a node disappears, it can take up to one hour before its entries are removed as well. 52 | * Scalability is in millions of nodes. 53 | * Storage is *probabilistic*. An entry is stored at 8 nodes, and if all 8 nodes disappear, that entry will be lost for up to 45 minutes, before it is reinserted in a refresh. 54 | 55 | # Using the implementation: 56 | 57 | The implementation supports 3 high-level commands, which together with two low-level commands are enough to run the DHT from scratch. Say you have just started a new DHT node: 58 | 59 | application:ensure_all_started(dht) 60 | 61 | Then, in order to make the DHT part of the swarm, you must know at least one node in the swarm. How to obtain the nodes is outside of the scope of the DHT application currently. Once you know about a couple of nodes, you can ping them to insert them into the DHT routing table: 62 | 63 | [dht:ping({IP, Port}) || {IP, Port} <- Peers] 64 | 65 | Then, in order to populate the routing table quickly, execute a find node search on your own randomly generated ID: 66 | 67 | Self = dht:node_id(), 68 | dht_search:run(find_node, Self). 69 | 70 | At this point, we are part of the swarm, and can start using it. Say we want to announce that ID 456 is present on our node, port 3000. 
We then execute: 71 | 72 | dht:enter(456, 3000) 73 | 74 | and now the DHT tracks that association. If we later want to rid ourselves of the association, we use 75 | 76 | dht:delete(456) 77 | 78 | Finally, other nodes can find the association by executing a lookup routine: 79 | 80 | [{IP, 3000}] = dht:lookup(456) 81 | 82 | where IP is the IP address the node is using to send UDP packets. 83 | 84 | ### The internal state of the system 85 | 86 | I believe a lot in the concepts of *transparency* and *discoverability*, in which the system tries to be transparent about what is happening inside it, so you can verify the internal state for correctness. You can obtain information about the internal state with the command: 87 | 88 | dht:info(). 89 | 90 | which will tell you what the internal state of everything is currently. This is useful if you suspect something is wrong, as it gives a top-level snapshot of the internal state. 91 | 92 | # Code layout and how the DHT works 93 | 94 | In general, we follow the [Kademlia](http://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf) hash table implementation. The implementation details of this code are described below. The key take-away is that in order to implement Kademlia in Erlang, you have to come up with a working process model. 95 | 96 | The system supports 4 low-level commands: 97 | 98 | * `PING({IP, Port})`: Ask a node if it is alive. 99 | * `FIND_NODE({IP, Port}, NodeID)`: Ask a peer's routing table for information on `NodeID`. 100 | * `FIND_VALUE({IP, Port}, ID)`: Ask a peer what it knows about a stored value `ID`. May return either information about that value, or a set of nodes who are likely to know more about it. Will also return a temporary token to be used in subsequent `STORE` operations. 101 | * `STORE({IP, Port}, Token, ID, Port)`: If the current node is on IP-address `NodeIP`, then call a peer at `{IP, Port}` to associate `ID => {NodeIP, Port}`.
The `Token` provides proof we recently executed a `FIND_VALUE` call to the peer. 102 | 103 | These are supported by a number of process groups, which we describe in the following: 104 | 105 | ## dht_metric.erl 106 | 107 | The `dht_metric` module is a library, not a process. It implements the metric space in which the DHT operates: 108 | 109 | The DHT operates on an identity-space of 256 bit integers. If `X` and `Y` are IDs, then `X xor Y` forms a [metric](https://en.wikipedia.org/wiki/Metric_(mathematics)), or distance-function, over the space. Each node in the DHT swarm has an ID, and nodes are close to each other if the distance is small. 110 | 111 | Every ID you wish to store is a 256 bit integer. Likewise, they are mapped into the space with the nodes. Nodes which are “close” to a stored ID track who is currently providing service for that ID. 112 | 113 | ## dht_net.erl, dht_proto.erl 114 | 115 | The DHT communicates via a simple binary protocol that has been built loosely on the BitTorrent protocol. However, the protocol has been changed completely from BitTorrent and has no resemblance to the original protocol whatsoever. The protocol encoding/decoding scheme is implemented in `dht_proto` and can be changed later on if necessary by adding a specific 8 byte random header to messages. 116 | 117 | All of the network stack is handled by `dht_net`. This gen_server runs two tasks: incoming requests and outgoing requests. 118 | 119 | For incoming requests, a handler process is spawned to handle that particular query. Once it has an answer, it invokes `dht_net` again to send out the response. For outgoing requests, the network gen_server uses `noreply` to block the caller until either a timeout occurs or a response comes back for the request. It then unblocks the caller. In short, `dht_net` multiplexes the network. 120 | 121 | The network protocol uses UDP.
Currently, we don't believe packets will be large enough to mess with the MTU size of the IP stack, but we may be wrong. 122 | 123 | Storing data at another node is predicated on a random token. Technically, the random token is hashed with the IP address of the peer in order to produce a unique token for that peer. If the peer wants to store data it has to present that token to prove it has recently executed a search and hit the node. It provides some security against forged stores of data at random peers. The random token is cycled once every minute to make sure there is some progress and a peer can't reuse a token forever. 124 | 125 | The network protocol provides no additional security. In particular, there is currently no provision for blacklisting of talkative peers, no storage limit per node, no security handshake, etc. 126 | 127 | ## dht_state.erl, dht_routing_meta.erl, dht_routing_table.erl 128 | 129 | The code in `dht_state` runs the routing table state of the system. To do so, it uses the `dht_routing_table` code to maintain the routing table, and uses the `dht_routing_meta` code to handle meta-data about nodes in the table. 130 | 131 | Imagine a binary tree based on the digital bit-pattern of a 256 bit number. A `0` means “follow the left child” whereas a `1` means “follow the right child” in the tree. Since each node has a 256 bit ID, the NodeID, it sits somewhere in this tree. The full tree is the full routing table for the system, but in general this is a very large tree containing all nodes currently present in the system. Since there can be millions of nodes, it is not feasible to store the full routing table in each node. Hence, they opt for partial storage. 132 | 133 | If you follow the path from the node's NodeID to the root of the tree, you follow a “spine” in the tree. At each node in the spine, there will be a child which leads down to our NodeID. 
The other child would lead to other parts of the routing table, but the routing table stores at most 8 such nodes in a “bucket”. Hence, the partial routing table is a spine of nodes, with buckets hanging on every path we don't take toward our own NodeID. This is what the routing table code stores. 134 | 135 | For each node, we keep meta-data about that node: when we last successfully contacted it, how many times requests failed to reach it, and whether we know it answers our queries. The latter is because many nodes are firewalled, so they can perform requests to other nodes, but you can't contact them yourself. Such nodes are not interesting to the routing table. 136 | 137 | Every time a node is successfully handled by the network code, the network code pings the state code with an update. This can then impact the routing table: 138 | 139 | * If the node belongs to a bucket which has fewer than 8 nodes, it is inserted. 140 | * If the node belongs in a full bucket, and there are nodes in the bucket which we haven't talked with for 15 minutes, we try to ping those nodes. If they fail to respond, we replace the bad node with the new one. This ensures the table tracks stable nodes over time. 141 | * If the node belongs in a full bucket with nodes that we know are bad, we replace the bad one with the new node in the hope it is better. 142 | 143 | Periodically, every 15 minutes, we also check each bucket. If no node in the bucket has had any activity in that time-frame, we pick a random node in the bucket and ask it for its local nodes. For each node it returns, we try to insert it into the table. This ensures the bucket is somewhat full, even in the case the node is behind a firewall. Most nodes in the swarm which talk regularly will never hit these timers in practice. 144 | 145 | We make a distinction between nodes which are reachable because they answered a query from us, and those where we answer a query for them.
The rule is that refreshing of nodes can only happen if we have at some point had a successful reachable query to that node. After that, the node is refreshed even on non-reachable responses back to that node. 146 | 147 | When the system stops, the routing table dumps its state to disk, but the meta-data is thrown out. The system thus starts in a state where it has to reconstruct information about nodes in its table, but since we don't know for how long we have been down, this is a fair trade-off. 148 | 149 | ## dht_store.erl 150 | 151 | The store keeps a table mapping an ID to an IP/Port pair on behalf of other nodes. It is periodically collected for entries which are older than 1 hour. Thus, clients who want a permanent entry need to refresh it before the hour is up. 152 | 153 | ## dht_track.erl 154 | 155 | The tracker is the natural dual counterpart to the store. It keeps mappings present in the DHT for a node by periodically refreshing them every 45 minutes. Thus, users of the DHT don't have to worry about refreshing their mappings. 156 | 157 | ## dht_refresh.erl 158 | 159 | The refresher module handles refreshing work of nodes and ranges. This has to be handled "one level up" from the two work-horse modules: state and net. To avoid deadlocking the two processes running state and net, we spawn 160 | helper functions which block and call into the two underlying systems. 161 | 162 | ## Top level: dht.erl, dht_search.erl 163 | 164 | The code here is not a process, but merely a library which is used by the code to perform recursive searches on top of the system. That is, the modules above implement the 4 major low-level commands, and this library uses them to handle the DHT by combining them into useful high-level commands. 165 | 166 | The search code implements the recursive nature of queries on a partial routing table. A search for a target ID or NodeID starts by using your own partial routing table to find peers close to the target.
You then ask those nodes for the knowledge in their partial routing table. The returned nodes are then queried in turn in order to hop a step closer to the target. This recursion continues until either the target is found, or we get as close as we can. 167 | 168 | 169 | # Testing the DHT (example) 170 | 171 | This section describes how to use the DHT before we have a bootstrap process into the system. The system is early alpha, so things might not work entirely as expected. I'm interested in any kind of error you see when trying to run this. 172 | 173 | DHTs are learning algorithms. They learn about other peers when they communicate with other peers. If they know of no peers, the question is how one manages to start out. Usually one requests a number of bootstrap nodes and injects those blindly into the DHT's routing table at the start of the system. But a smarter way, when toying around, is just to ping a couple of other nodes which run the DHT code. Once the DHT has run for a while, it will remember its routing table on disk, so then it should be able to start from the on-disk state. 174 | 175 | Nodes need to be “reachable”, which means there is no firewall on the peer that makes it unable to respond. 176 | 177 | To start out, set a port and start the DHT code: 178 | 179 | application:load(dht), 180 | application:set_env(dht, port, 1729), 181 | application:ensure_all_started(dht). 182 | 183 | Do this on all nodes for which you want to run the DHT. If testing locally, you can pick different port numbers, but you probably also want to give each node a different `state_file` in that case. 184 | 185 | Ping the other node: 186 | 187 | dht:ping({{192,168,1,123}, 1729}). 188 | 189 | At this point, the two nodes ought to know about each other. Then you can proceed to use the DHT: 190 | 191 | dht:enter(ID, Port). 192 | 193 | will make the DHT remember that the entry identified by `ID` can be obtained on the IP of this node, on `Port`.
A good way 194 | to get at `ID` for a value `Value` is simply to call `ID = crypto:hash(sha256, Value)`. To look up a handler for an ID, use 195 | 196 | 3> dht:lookup(ID). 197 | [{{172,16,1,200},12345}] 198 | 199 | which means `ID` can be found at `{172,16,1,200}` on port `12345`. The DHT does not specify the protocol to use there. But one could just define it to be HTTP/1.1, and you would then request `http://172.16.1.200:12345/ID`. It could also be another protocol, at your leisure. 200 | 201 | ## QuickCheck 202 | 203 | The "research project" which is being done in this project is to provide a full QuickCheck model of every part of the DHT code. This work has already uncovered numerous grave errors and mistakes in the original etorrent code base, to the point where I'm wondering if this code even worked appropriately in the first place. 204 | 205 | Hence, the modeling work continues. It is slowly moving, because you often need to do successive refinement on the knowledge you have as you go along. On the other hand, the parts which have been checked are likely to be formally correct. Far more than any other project. 206 | 207 | The current effort is centered around the construction of the top level binding code that makes everything fit together. This code has not been handled by a QuickCheck model yet, however. What *has* been handled already, though, is all the low-level parts: network code, routing table state code and so on. 208 | -------------------------------------------------------------------------------- /doc/overview.edoc: -------------------------------------------------------------------------------- 1 | @author Magnus Klaar 2 | @author Jesper Louis Andersen 3 | @copyright 2015 Jesper Louis Andersen 4 | @title DHT: A distributed hash table for Erlang 5 | @doc This application implements a service for running as a peer in a DHT swarm. 6 | 7 | All communication is loose and based on sending UDP messages around between 8 | peers.
There is no other connectivity needed between peers in the swarm. 9 | 10 | The implementation is essentially the Kademlia variant used in BitTorrent. This DHT has 11 | been known to scale to several million nodes on the internet with ease. 12 | 13 | The DHT only provides an association between a unique `ID' and an `{IP, Port}' pair. From 14 | here, you will need to specify your own protocol to run. It is advised to pick ID's which are 15 | content addressed to provide integrity. For instance by execution of 16 | `crypto:hash(sha256, Content)' to obtain an ID. 17 | @end -------------------------------------------------------------------------------- /include/etorrent_rate.hrl: -------------------------------------------------------------------------------- 1 | %% The rate record is used for recording information about rates 2 | %% on torrents 3 | -record(peer_rate, { rate = 0.0 :: float(), 4 | total = 0 :: integer(), 5 | next_expected = none :: none | integer(), 6 | last = none :: none | integer(), 7 | rate_since = none :: none | integer()}). 8 | 9 | -define(RATE_UPDATE, 5 * 1000). 10 | -define(RATE_FUDGE, 5). 11 | -------------------------------------------------------------------------------- /include/etorrent_version.hrl: -------------------------------------------------------------------------------- 1 | %% Protocol version. It does not in general follow the versioning of the client itself. 2 | %% Rather the idea is that it is bumped whenever we fix something really grave that 3 | %% might get the client banned from communicating with other clients. 4 | -define(VERSION, "d011"). 5 | -define(AGENT_TRACKER_STRING, <<"etorrent/1.2.1">>). 6 | -------------------------------------------------------------------------------- /include/supervisor.hrl: -------------------------------------------------------------------------------- 1 | %% Helper child specs for supervisors 2 | -define(CHILD(I), {I, {I, start_link, []}, permanent, 5000, worker, [I]}). 
3 | -define(CHILDP(I, P), {I, {I, start_link, P}, permanent, 5000, worker, [I]}). 4 | -define(CHILDW(I, W), {I, {I, start_link, []}, permanent, W, worker, [I]}). 5 | -------------------------------------------------------------------------------- /rebar.config: -------------------------------------------------------------------------------- 1 | {deps, []}. 2 | {erl_opts, [debug_info,warn_export_vars,warn_shadow_vars,warn_obsolete_guard]}. 3 | -------------------------------------------------------------------------------- /rebar.lock: -------------------------------------------------------------------------------- 1 | []. 2 | -------------------------------------------------------------------------------- /rel/sys.config: -------------------------------------------------------------------------------- 1 | [{sasl, 2 | [ {error_logger_mf_dir, "./log/mf"}, 3 | {error_logger_mf_maxbytes, 10485760}, 4 | {error_logger_mf_maxfiles, 10} ] }, 5 | {dht, 6 | [ 7 | %% Port to use 8 | {port, 3739}, 9 | 10 | %% The file in which to store the current DHT application state 11 | {state_file, "./dht_state/dht_state.bin"}, 12 | 13 | %% The "bootstrap nodes to start off the DHT from" 14 | {bootstrap_nodes, []}, 15 | 16 | %% The options to give the listen socket. This is useful to only bind to a specific port 17 | {listen_opts, []} 18 | ]} 19 | ]. 20 | -------------------------------------------------------------------------------- /rel/vm.args: -------------------------------------------------------------------------------- 1 | -name dht@127.0.0.1 2 | -setcookie dht 3 | -heart 4 | 5 | +K true 6 | +A 5 7 | 8 | -------------------------------------------------------------------------------- /relx.config: -------------------------------------------------------------------------------- 1 | {release, {dht_release, "1.0.0"}, [dht, sasl]}. 2 | {extended_start_script, true}. 3 | {sys_config, "rel/sys.config"}. 4 | {vm_args, "rel/vm.args"}. 
5 | {overlay, [ 6 | {mkdir, "log"}, 7 | {mkdir, "log/mf"}, 8 | {mkdir, "dht_state"} 9 | ]}. 10 | -------------------------------------------------------------------------------- /src/Makefile: -------------------------------------------------------------------------------- 1 | app: 2 | $(MAKE) -C .. app 3 | 4 | dialyze: 5 | $(MAKE) -C .. dialyze 6 | -------------------------------------------------------------------------------- /src/dht.app.src: -------------------------------------------------------------------------------- 1 | {application, dht, [ 2 | {description, "Erlang DHT based loosely on BitTorrent/Kademlia"}, 3 | {vsn, "0.9.0"}, 4 | {modules, []}, 5 | {registered, []}, 6 | {applications, [ 7 | kernel, 8 | stdlib, 9 | crypto 10 | ]}, 11 | {mod, {dht_app, []}}, 12 | {env, [ 13 | %% Port to use 14 | {port, 3739}, 15 | 16 | %% The file in which to store the current DHT application state 17 | {state_file, "dht_state.bin"}, 18 | 19 | %% The "bootstrap nodes to start off the DHT from" 20 | {bootstrap_nodes, []}, 21 | 22 | %% The options to give the listen socket. This is useful to only bind to a specific port 23 | {listen_opts, []} 24 | ]} 25 | ]}. 26 | -------------------------------------------------------------------------------- /src/dht.erl: -------------------------------------------------------------------------------- 1 | %%% @doc API for the DHT application. 2 | %% 3 | %% This module provides the API of the DHT application. There are two 4 | %% major groups of calls: Low level DHT code, and high level API which 5 | %% is the one you are going to use, most likely. 6 | %% 7 | %% The high level API is: 8 | %% 9 | %%
<ul> 10 | %% <li>lookup/1</li> 11 | %% <li>enter/2</li> 12 | %% <li>delete/1</li> 13 | %% </ul>
14 | %% 15 | %% The Low-level API exposes the low-level four commands you can execute 16 | %% against the DHT. It is intended for those who wants to build their own 17 | %% subsystems around the DHT. The high-level API uses these to provide the 18 | %% low level implementation: 19 | %% 20 | %%
<ul> 21 | %% <li>ping/1</li> 22 | %% <li>store/4</li> 23 | %% <li>find_node/2</li> 24 | %% <li>find_value/2</li> 25 | %% </ul>
26 | %% 27 | %% @end 28 | 29 | %% @author Magnus Klaar 30 | %% @author Jesper Louis Andersen 31 | -module(dht). 32 | 33 | %% High-level API 34 | -export([ 35 | node_id/0, 36 | lookup/1, 37 | enter/2, 38 | delete/1 39 | ]). 40 | 41 | %% Informative API 42 | -export([ 43 | info/0, info/1 44 | ]). 45 | 46 | %% Low-level API for others to use 47 | -export([ 48 | ping/1, 49 | store/4, 50 | find_node/2, 51 | find_value/2 52 | ]). 53 | 54 | -type id() :: non_neg_integer(). 55 | -type tag() :: binary(). 56 | -type token() :: binary(). 57 | 58 | -type peer() :: {id(), inet:ip_address(), inet:port_number()}. 59 | -type endpoint() :: {inet:ip_address(), inet:port_number()}. 60 | 61 | -type range() :: {id(), id()}. 62 | 63 | -export_type([id/0, tag/0, token/0]). 64 | -export_type([peer/0, range/0, endpoint/0]). 65 | 66 | %% High-level API Functions 67 | %% ------------------------------ 68 | 69 | %% @doc node_id/0 returns the `ID' of the current node 70 | %% @end 71 | node_id() -> 72 | dht_state:node_id(). 73 | 74 | %% @doc delete/1 removes an `ID' inserted by this node 75 | %% 76 | %% Remove the tracking of the `ID' inserted by this node. If no such `ID' exist, 77 | %% this is a no-op. 78 | %% 79 | %% It may take up to an hour before peers stop contacting you for the ID. This is 80 | %% an artifact of the DHT, so you must be prepared to handle the case where you 81 | %% are contacted for an old inexistant ID. 82 | %% @end 83 | delete(ID) -> 84 | dht_track:delete(ID). 85 | 86 | %% @doc enter/2 associates an `ID' with a `Location' on this node. 87 | %% 88 | %% Associate the given `ID' with a `Port'. Lookups to this ID will 89 | %% henceforth contain this node as a possible peer for the ID. The protocol 90 | %% which is used to transfer the data afterwards is not specified by the DHT. 91 | %% 92 | %% Note that the IP address to use is given by the UDP port on which the system 93 | %% is bound. 
This is a current setup in order to make it harder to craft packets 94 | %% where you impersonate someone else. The hope is that egress filtering at ISPs 95 | %% will help to mitigate eventual amplification attacks. 96 | %% 97 | %% @end 98 | -spec enter(ID, Port) -> ok 99 | when 100 | ID :: id(), 101 | Port :: inet:port_number(). 102 | enter(ID, Port) -> 103 | dht_track:store(ID, Port). 104 | 105 | %% @doc lookup/1 searches the DHT for nodes which can give you an `ID' back 106 | %% 107 | %% Perform a lookup operation. It returns a list of pairs {IP, Port} pairs 108 | %% which the DHT knows about that given ID. 109 | %% 110 | %% Assumptions: 111 | %% 112 | %%
<ul> 113 | %% <li>We are never looking up keys which we have locally stored at the service.</li> 114 | %% <li>Before querying the network, we look into our own store. If we have entries 115 | %% locally, we pick those.</li> 116 | %% <li>If we don't have entries ourselves, we look up in the swarm.</li> 117 | %% </ul>
118 | %% @end 119 | lookup(ID) -> 120 | case dht_store:find(ID) of 121 | [] -> 122 | #{ found := Fs } = dht_search:run(find_value, ID), 123 | Fs; 124 | Peers -> Peers 125 | end. 126 | 127 | %% Informative API functions 128 | 129 | %% @doc info/0 retrieves the internal state of the DHT subsystem. 130 | %% This call is intended for debugging purposes. You can use it to query the internal state of 131 | %% the DHT and use that as a basis of a report if something is wrong with the DHT. 132 | %% @end 133 | -spec info() -> [{atom(), term()}]. 134 | info() -> 135 | [info(tracking), info(routing_table), info(store)]. 136 | 137 | %% @doc info/1 queries specific entries in the `info/0' block 138 | %% @end 139 | -spec info(Area) -> term() 140 | when Area :: tracking | store | routing_table. 141 | 142 | info(tracking) -> {tracking, dht_track:info()}; 143 | info(routing_table) -> {routing_table, dht_state:info()}; 144 | info(store) -> {store, dht_store:info()}. 145 | 146 | %% Low-level API Functions 147 | 148 | %% @doc ping/1 queries a peer with a ping message 149 | %% 150 | %% Low level message which allows you to program your own strategies. 151 | %% 152 | %% @end 153 | -spec ping(Location) -> pang | {ok, id()} | {error, Reason} 154 | when 155 | Location :: {inet:ip_address(), inet:port_number()}, 156 | Reason :: any(). 157 | ping(Peer) -> 158 | dht_net:ping(Peer). 159 | 160 | %% @doc store/4 stores a new association at a peer 161 | %% 162 | %% Low level message which allows you to program your own strategies. 163 | %% 164 | %% @end 165 | store(Peer, Token, ID, Port) -> 166 | dht_net:store(Peer, Token, ID, Port). 167 | 168 | %% @doc find_node/2 performs a `find_node' query 169 | %% 170 | %% Low level message which allows you to program your own strategies. 171 | %% 172 | %% @end 173 | find_node({IP, Port}, Node) -> 174 | dht_net:find_node({IP, Port}, Node). 
175 | 176 | %% @doc find_value/2 performs a `find_value' query 177 | %% 178 | %% Low level message which allows you to program your own strategies. 179 | %% 180 | %% @end 181 | find_value({IP, Port}, ID) -> 182 | dht_net:find_value({IP, Port}, ID). 183 | -------------------------------------------------------------------------------- /src/dht_app.erl: -------------------------------------------------------------------------------- 1 | %%% @doc Application behaviour for the DHT 2 | %%% @end 3 | %%% @private 4 | -module(dht_app). 5 | -behaviour(application). 6 | 7 | %% API. 8 | -export([start/2]). 9 | -export([stop/1]). 10 | 11 | %% API. 12 | 13 | start(_Type, _Args) -> 14 | dht_sup:start_link(). 15 | 16 | stop(_State) -> 17 | ok. -------------------------------------------------------------------------------- /src/dht_constants.hrl: -------------------------------------------------------------------------------- 1 | % 2 | % Defines the maximal bucket size 3 | -define(MAX_RANGE_SZ, 8). 4 | -define(MIN_ID, 0). 5 | -define(MAX_ID, 1 bsl 256). 6 | 7 | % 8 | % The bucket refresh timeout is the amount of time that the 9 | % server will tolerate a node to be disconnected before it 10 | % attempts to refresh the bucket. 11 | -define(RANGE_TIMEOUT, 15 * 60 * 1000). 12 | -define(NODE_TIMEOUT, 15 * 60 * 1000). 13 | 14 | % 15 | % For how long will you believe in a stored value? 16 | -define(REFRESH_TIME, 45 * 60 * 1000). 17 | -define(STORE_TIME, 60 * 60 * 1000). 18 | 19 | % 20 | % How many nodes to store at when we find nodes 21 | -define(STORE_COUNT, 8). 22 | -------------------------------------------------------------------------------- /src/dht_metric.erl: -------------------------------------------------------------------------------- 1 | %%% @doc Internal/external representation of DHT IDs 2 | %% In the DHT system, the canonical ID is a 160 bit integer. This ID and its operation 3 | %% induces a metric on which everything computes. 
In this DHT the IDs are 160 bit integers and 4 | %% the XOR operation is used as the composition. 5 | %% 6 | %% We have to pick an internal representation of these in our system, so we have a canonical way of 7 | %% representing these. The internal representation chosen is 160 bit integers, 8 | %% which in Erlang are the integer() type. 9 | %%% @end 10 | %%% @private 11 | -module(dht_metric). 12 | 13 | -export([mk/0, d/2]). 14 | -export([neighborhood/3]). 15 | 16 | %% @doc mk/0 constructs a new random ID 17 | %% @end 18 | -spec mk() -> dht:id(). 19 | mk() -> 20 | <<ID:160>> = dht_rand:crypto_rand_bytes(20), 21 | ID. 22 | 23 | %% @doc d/2 calculates the distance between two IDs 24 | %% @end 25 | -spec d(dht:id(), dht:id()) -> dht:id(). 26 | d(ID1, ID2) -> ID1 bxor ID2. 27 | 28 | %% @doc neighborhood/3 finds known nodes close to an ID 29 | %% neighborhood(ID, Nodes, Limit) searches for Limit nodes in the neighborhood of ID. Nodes is the list of known nodes. 30 | %% @end 31 | -spec neighborhood(ID, Nodes, Limit) -> [dht:peer()] 32 | when 33 | ID :: dht:id(), 34 | Nodes :: [dht:peer()], 35 | Limit :: non_neg_integer(). 36 | 37 | neighborhood(ID, Nodes, Limit) -> 38 | DF = fun({NID, _, _}) -> d(ID, NID) end, 39 | case lists:sort(fun(X, Y) -> DF(X) < DF(Y) end, Nodes) of 40 | Sorted when length(Sorted) =< Limit -> Sorted; 41 | Sorted when length(Sorted) > Limit -> 42 | {H, _T} = lists:split(Limit, Sorted), 43 | H 44 | end. 45 | 46 | -------------------------------------------------------------------------------- /src/dht_net.erl: -------------------------------------------------------------------------------- 1 | %% @author Magnus Klaar 2 | %% @doc DHT networking code 3 | %% @end 4 | %% @private 5 | -module(dht_net). 6 | -behaviour(gen_server). 7 | 8 | %% 9 | %% Implementation notes 10 | %% RPC calls to remote nodes in the DHT are written by use of a gen_server proxy.
11 | %% The proxy maintains an internal correlation table from requests to replies so 12 | %% a given reply can be matched up with the correct requestor. It uses the 13 | %% standard gen_server:call/3 approach to handling calls in the DHT. 14 | %% 15 | %% A timer is used to notify the server of requests that 16 | %% time out, if a request times out {error, timeout} is 17 | %% returned to the client. If a response is received after 18 | %% the timer has fired, the response is dropped. 19 | %% 20 | %% The expected behavior is that the high-level timeout fires 21 | %% before the gen_server call times out, therefore this interval 22 | %% should be shorter then the interval used by gen_server calls. 23 | %% 24 | %% Lifetime interface. Mostly has to do with setup and configuration 25 | -export([start_link/1, start_link/2, node_port/0]). 26 | 27 | %% DHT API 28 | -export([ 29 | store/4, 30 | find_node/2, 31 | find_value/2, 32 | ping/1 33 | ]). 34 | 35 | %% Private internal use 36 | -export([handle_query/5]). 37 | 38 | % gen_server callbacks 39 | -export([init/1, 40 | handle_call/3, 41 | handle_cast/2, 42 | handle_info/2, 43 | terminate/2, 44 | code_change/3]). 45 | 46 | % internal exports 47 | -export([sync/0]). 48 | 49 | -record(state, { 50 | socket :: inet:socket(), 51 | outstanding :: #{ {dht:peer(), binary()} => {pid(), reference()} }, 52 | tokens :: queue:queue() 53 | }). 54 | 55 | % 56 | % Constants and settings 57 | % 58 | -define(TOKEN_LIFETIME, 5 * 60 * 1000). 59 | -define(UDP_MAILBOX_SZ, 16). 60 | -define(QUERY_TIMEOUT, 2000). 61 | 62 | % 63 | % Public interface 64 | % 65 | 66 | %% @doc Start up the DHT networking subsystem 67 | %% @end 68 | start_link(DHTPort) -> 69 | start_link(DHTPort, #{}). 70 | 71 | %% @private 72 | start_link(Port, Opts) -> 73 | gen_server:start_link({local, ?MODULE}, ?MODULE, [Port, Opts], []). 74 | 75 | %% @doc node_port/0 returns the (UDP) port number to which the DHT system is bound. 
76 | %% @end 77 | -spec node_port() -> {inet:ip_address(), inet:port_number()}. 78 | node_port() -> 79 | gen_server:call(?MODULE, node_port). 80 | 81 | %% @private 82 | request(Target, Q) -> 83 | gen_server:call(?MODULE, {request, Target, Q}). 84 | 85 | %% @private 86 | sync() -> 87 | gen_server:call(?MODULE, sync). 88 | 89 | %% @doc ping/1 sends a ping to a node 90 | %% Calling `ping({IP, Port})' will send a ping message to the IP/Port pair 91 | %% and wait for a result to come back. Used to check if the node in the 92 | %% other end is up and running. 93 | %% 94 | %% If running over an IP/Port pair, we can't timeout, so we don't 95 | %% timeout the peer here. We do that in dht_state. 96 | %% @end 97 | -spec ping({inet:ip_address(), inet:port_number()}) -> 98 | pang | {ok, dht:id()} | {error, Reason} 99 | when Reason :: term(). 100 | ping(Peer) -> 101 | case request(Peer, ping) of 102 | {error, timeout} -> pang; 103 | {response, _, ID, ping} -> {ok, ID}; 104 | {error, eagain} -> 105 | timer:sleep(15), 106 | ping(Peer); 107 | {error, Reason} -> 108 | fail = error_common(Reason), 109 | pang 110 | end. 111 | 112 | %% @doc find_node/2 searches in the DHT for a given target NodeID 113 | %% Search at the target IP/Port pair for the NodeID given by `Target'. May time out. 114 | %% @end 115 | -spec find_node({IP, Port}, Target) -> {nodes, ID, Token, Nodes} | {error, Reason} 116 | when 117 | IP :: inet:ip_address(), 118 | Port :: inet:port_number(), 119 | Target :: dht:id(), 120 | ID :: dht:id(), 121 | Token :: dht:token(), 122 | Nodes :: [dht:peer()], 123 | Reason :: any(). 124 | 125 | find_node({IP, Port}, N) -> 126 | case request({IP, Port}, {find, node, N}) of 127 | {error, E} -> {error, E}; 128 | {response, _, _, {find, node, Token, Nodes}} -> 129 | {nodes, N, Token, Nodes} 130 | end.
131 | 132 | -spec find_value(Peer, ID) -> 133 | {nodes, ID, Token, [Node]} 134 | | {values, ID, Token, [Value]} 135 | | {error, Reason} 136 | when 137 | Peer :: {inet:ip_address(), inet:port_number()}, 138 | ID :: dht:id(), 139 | Node :: dht:peer(), 140 | Token :: dht:token(), 141 | Value :: dht:peer(), 142 | Reason :: any(). 143 | 144 | find_value(Peer, IDKey) -> 145 | case request(Peer, {find, value, IDKey}) of 146 | {error, Reason} -> {error, Reason}; 147 | {response, _, ID, {find, node, Token, Nodes}} -> 148 | {nodes, ID, Token, Nodes}; 149 | {response, _, ID, {find, value, Token, Values}} -> 150 | {values, ID, Token, Values} 151 | end. 152 | 153 | -spec store(Peer, Token, ID, Port) -> {error, term()} | {ok, dht:id()} 154 | when 155 | Peer :: {inet:ip_address(), inet:port_number()}, 156 | ID :: dht:id(), 157 | Token :: dht:token(), 158 | Port :: inet:port_number(). 159 | 160 | store(Peer, Token, IDKey, Port) -> 161 | case request(Peer, {store, Token, IDKey, Port}) of 162 | {error, R} -> {error, R}; 163 | {response, _, ID, _} -> 164 | {ok, ID} 165 | end.
166 | 167 | %% @private 168 | handle_query(ping, Peer, Tag, OwnID, _Tokens) -> 169 | return(Peer, {response, Tag, OwnID, ping}); 170 | handle_query({find, node, ID}, Peer, Tag, OwnID, Tokens) -> 171 | TVal = token_value(Peer, queue:last(Tokens)), 172 | Nodes = filter_node(Peer, dht_state:closest_to(ID)), 173 | return(Peer, {response, Tag, OwnID, {find, node, TVal, Nodes}}); 174 | handle_query({find, value, ID}, Peer, Tag, OwnID, Tokens) -> 175 | case dht_store:find(ID) of 176 | [] -> 177 | handle_query({find, node, ID}, Peer, Tag, OwnID, Tokens); 178 | Peers -> 179 | TVal = token_value(Peer, queue:last(Tokens)), 180 | return(Peer, {response, Tag, OwnID, {find, value, TVal, Peers}}) 181 | end; 182 | handle_query({store, Token, ID, Port}, {IP, _Port} = Peer, Tag, OwnID, Tokens) -> 183 | case is_valid_token(Token, Peer, Tokens) of 184 | false -> ok; 185 | true -> dht_store:store(ID, {IP, Port}) 186 | end, 187 | return(Peer, {response, Tag, OwnID, store}). 188 | 189 | -spec return({inet:ip_address(), inet:port_number()}, any()) -> 'ok' | {error, term()}. 190 | return(Peer, Response) -> 191 | case gen_server:call(?MODULE, {return, Peer, Response}) of 192 | ok -> ok; 193 | {error, eagain} -> 194 | %% For now, we just ignore the case where EAGAIN happens 195 | %% in the system, but we could return these packets back 196 | %% to the caller by trying again. lager:warning("return 197 | %% packet to peer responded with EAGAIN"), 198 | {error, eagain}; 199 | {error, Reason} -> 200 | fail = error_common(Reason), 201 | {error, Reason} 202 | end.
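The `store` clause of `handle_query/5` above only accepts a token that was derived from one of the secrets currently held in the token queue. A self-contained sketch of that scheme follows. The module `token_sketch` and its function names are hypothetical, and `crypto:macN/5` (available in newer OTP releases) is swapped in for the `crypto:hmac/4` call that `dht_net` uses, so this is an approximation under those assumptions rather than the module's own code.

```erlang
-module(token_sketch).
-export([new/0, rotate/1, value/2, is_valid/3]).

%% Hypothetical names throughout; a sketch of the two-secret token scheme.

%% Start with two random 16-byte secrets, oldest first.
new() ->
    queue:from_list([crypto:strong_rand_bytes(16),
                     crypto:strong_rand_bytes(16)]).

%% Drop the oldest secret and append a fresh one at the rear.
rotate(Secrets) ->
    queue:in(crypto:strong_rand_bytes(16), queue:drop(Secrets)).

%% An 8-byte truncated HMAC-SHA256 over the peer address, keyed by one secret.
%% Assumption: crypto:macN/5 stands in for the older crypto:hmac/4.
value(Peer, Secret) ->
    crypto:macN(hmac, sha256, Secret, term_to_binary(Peer), 8).

%% A token is valid if any secret still in the queue produces it for this peer.
is_valid(TokenValue, Peer, Secrets) ->
    lists:member(TokenValue, [value(Peer, S) || S <- queue:to_list(Secrets)]).
```

With two secrets in the queue, a token handed out from the newest secret survives one rotation and is rejected after the second, which bounds its lifetime to between one and two rotation intervals.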
203 | 204 | %% CALLBACKS 205 | %% --------------------------------------------------- 206 | 207 | %% @private 208 | init([DHTPort, Opts]) -> 209 | {ok, Base} = application:get_env(dht, listen_opts), 210 | {ok, Socket} = dht_socket:open(DHTPort, [binary, inet, {active, ?UDP_MAILBOX_SZ} | Base]), 211 | dht_time:send_after(?TOKEN_LIFETIME, ?MODULE, renew_token), 212 | {ok, #state{ 213 | socket = Socket, 214 | outstanding = #{}, 215 | tokens = init_tokens(Opts)}}. 216 | 217 | init_tokens(#{ tokens := Toks}) -> queue:from_list(Toks); 218 | init_tokens(#{}) -> queue:from_list([random_token() || _ <- lists:seq(1, 2)]). 219 | 220 | %% @private 221 | handle_call({request, Peer, Request}, From, State) -> 222 | case send_query(Peer, Request, From, State) of 223 | {ok, S} -> {noreply, S}; 224 | {error, Reason} -> {reply, {error, Reason}, State} 225 | end; 226 | handle_call({return, {IP, Port}, Response}, _From, #state { socket = Socket } = State) -> 227 | Packet = dht_proto:encode(Response), 228 | Result = dht_socket:send(Socket, IP, Port, Packet), 229 | {reply, Result, State}; 230 | handle_call(sync, _From, #state{} = State) -> 231 | {reply, ok, State}; 232 | handle_call(node_port, _From, #state { socket = Socket } = State) -> 233 | {ok, SockName} = dht_socket:sockname(Socket), 234 | {reply, SockName, State}. 235 | 236 | %% @private 237 | handle_cast(_Msg, State) -> 238 | {noreply, State}. 
239 | 240 | %% @private 241 | handle_info({request_timeout, Key}, State) -> 242 | HandledState = handle_request_timeout(Key, State), 243 | {noreply, HandledState}; 244 | handle_info(renew_token, State) -> 245 | dht_time:send_after(?TOKEN_LIFETIME, ?MODULE, renew_token), 246 | {noreply, handle_recycle_token(State)}; 247 | handle_info({udp_passive, Socket}, #state { socket = Socket } = State) -> 248 | ok = inet:setopts(Socket, [{active, ?UDP_MAILBOX_SZ}]), 249 | {noreply, State}; 250 | handle_info({udp, _Socket, IP, Port, Packet}, State) when is_binary(Packet) -> 251 | {noreply, handle_packet({IP, Port}, Packet, State)}; 252 | handle_info({stop, Caller}, #state{} = State) -> 253 | Caller ! stopped, 254 | {stop, normal, State}; 255 | handle_info(Msg, State) -> 256 | error_logger:error_msg("Unknown message in handle info: ~p", [Msg]), 257 | {noreply, State}. 258 | 259 | %% @private 260 | terminate(_, _State) -> 261 | ok. 262 | 263 | %% @private 264 | code_change(_, State, _) -> 265 | {ok, State}. 266 | 267 | %% INTERNAL FUNCTIONS 268 | %% --------------------------------------------------- 269 | 270 | %% Handle a request timeout by unblocking the calling process with `{error, timeout}' 271 | handle_request_timeout(Key, #state { outstanding = Outstanding } = State) -> 272 | case maps:get(Key, Outstanding, not_found) of 273 | not_found -> State; 274 | {Client, _Timeout} -> 275 | reply(Client, {error, timeout}), 276 | State#state { outstanding = maps:remove(Key, Outstanding) } 277 | end. 278 | 279 | %% reply/2 handles correlated responses for processes using the `dht_net' framework. 280 | reply(_Client, {query, _, _, _} = M) -> exit({message_to_ourselves, M}); 281 | reply({P, T} = From, M) when is_pid(P), is_reference(T) -> 282 | gen_server:reply(From, M). 283 | 284 | %% 285 | %% Token renewal is called whenever the tokens grow too old. 286 | %% Cycle the tokens to make sure they wither and die over time.
287 | %% 288 | handle_recycle_token(#state { tokens = Tokens } = State) -> 289 | Cycled = queue:in(random_token(), queue:drop(Tokens)), 290 | State#state { tokens = Cycled }. 291 | 292 | %% 293 | %% Handle an incoming UDP message on the socket 294 | %% 295 | handle_packet({IP, Port} = Peer, Packet, 296 | #state { outstanding = Outstanding, tokens = Tokens } = State) -> 297 | Self = dht_state:node_id(), %% @todo cache this locally. It can't change. 298 | case view_packet_decode(Packet) of 299 | invalid_decode -> 300 | State; 301 | {valid_decode, PeerID, Tag, M} -> 302 | Node = {PeerID, IP, Port}, 303 | Key = {Peer, Tag}, 304 | case maps:get(Key, Outstanding, not_found) of 305 | not_found -> 306 | case M of 307 | {query, Tag, PeerID, Query} -> 308 | %% Incoming request 309 | RState = request_success(Node, #{ reachable => false }, State), 310 | spawn_link(fun() -> ?MODULE:handle_query(Query, Peer, Tag, Self, Tokens) end), 311 | RState; 312 | _ -> 313 | State %% No recipient 314 | end; 315 | {Client, TRef} -> 316 | %% Handle blocked client process 317 | _ = dht_time:cancel_timer(TRef), 318 | RState = request_success(Node, #{ reachable => true }, State), 319 | reply(Client, M), 320 | RState#state { outstanding = maps:remove(Key, Outstanding) } 321 | end 322 | end. 323 | 324 | 325 | 326 | %% view_packet_decode/1 is a view on the validity of an incoming packet 327 | view_packet_decode(Packet) -> 328 | try dht_proto:decode(Packet) of 329 | {error, {old_version, <<0,0,0,0,0,0,0,0>>}} -> invalid_decode; 330 | {error, Tag, ID, _Code, _Msg} = E -> {valid_decode, ID, Tag, E}; 331 | {response, Tag, ID, _Reply} = R -> {valid_decode, ID, Tag, R}; 332 | {query, Tag, ID, _Query} = Q -> {valid_decode, ID, Tag, Q} 333 | catch 334 | _Class:_Error -> 335 | invalid_decode 336 | end. 337 | 338 | unique_message_id(Peer, Active) -> 339 | unique_message_id(Peer, Active, 16). 
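`handle_packet/3` above and `handle_request_timeout/2` both consult the same correlation table, keyed by `{Peer, Tag}`, and whichever of the two runs first removes the entry. A reduced, pure sketch of that race follows. The module `correlate_sketch` and its function names are hypothetical, timers and sockets are left out, and `maps:take/2` assumes an OTP release that provides it.

```erlang
-module(correlate_sketch).
-export([track/4, resolve/3, expire/3]).

%% Hypothetical names; Table is a map #{ {Peer, Tag} => Client }.

%% Remember which client waits for a reply from Peer under Tag.
track(Peer, Tag, Client, Table) ->
    Table#{ {Peer, Tag} => Client }.

%% A packet arrived: hand back the waiting client, if any, and forget the entry.
%% A packet with no matching entry has no recipient here.
resolve(Peer, Tag, Table) ->
    case maps:take({Peer, Tag}, Table) of
        {Client, Rest} -> {deliver, Client, Rest};
        error -> {drop, Table}
    end.

%% The request timer fired: same lookup, but the client learns of the timeout.
%% If the reply got there first, the entry is gone and nothing happens.
expire(Peer, Tag, Table) ->
    case maps:take({Peer, Tag}, Table) of
        {Client, Rest} -> {timeout, Client, Rest};
        error -> {noop, Table}
    end.
```

In `dht_net` itself an unmatched packet is not always dropped: if it decodes to a query, it is treated as an incoming request and answered in a spawned process.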
340 | 341 | unique_message_id(Peer, Active, K) when K > 0 -> 342 | IntID = dht_rand:uniform(16#FFFF), 343 | MsgID = <<IntID:16>>, 344 | case maps:is_key({Peer, MsgID}, Active) of 345 | true -> 346 | %% That MsgID is already in use, recurse and try again 347 | unique_message_id(Peer, Active, K-1); 348 | false -> MsgID 349 | end. 350 | 351 | % 352 | % Generate a random token value. A token value is used to filter out bogus store 353 | % requests, or at least store requests from nodes that never send find_value requests. 354 | % 355 | random_token() -> 356 | dht_rand:crypto_rand_bytes(16). 357 | 358 | request_success(Node, Opts, State) -> 359 | case dht_state:request_success(Node, Opts) of 360 | ok -> State; 361 | not_inserted -> State; 362 | already_member -> State 363 | end. 364 | 365 | send_query({IP, Port} = Peer, Query, From, #state { outstanding = Active, socket = Socket } = State) -> 366 | Self = dht_state:node_id(), %% @todo cache this locally. It can't change. 367 | MsgID = unique_message_id(Peer, Active), 368 | Packet = dht_proto:encode({query, MsgID, Self, Query}), 369 | 370 | case dht_socket:send(Socket, IP, Port, Packet) of 371 | ok -> 372 | TRef = dht_time:send_after(?QUERY_TIMEOUT, ?MODULE, {request_timeout, {Peer, MsgID}}), 373 | 374 | Key = {Peer, MsgID}, 375 | Value = {From, TRef}, 376 | {ok, State#state { outstanding = Active#{ Key => Value } } }; 377 | {error, Reason} -> 378 | {error, Reason} 379 | end. 380 | 381 | %% @doc Delete node with `IP' and `Port' from the list. 382 | filter_node({IP, Port}, Nodes) -> 383 | [X || {_NID, NIP, NPort}=X <- Nodes, NIP =/= IP orelse NPort =/= Port]. 384 | 385 | token_value(Peer, Token) -> 386 | X = term_to_binary(Peer), 387 | crypto:hmac(sha256, Token, X, 8). 388 | 389 | is_valid_token(TokenValue, Peer, Tokens) -> 390 | ValidValues = [token_value(Peer, Token) || Token <- queue:to_list(Tokens)], 391 | lists:member(TokenValue, ValidValues). 392 | 393 | %% Error common inhabits the common errors in the network stack.
394 | %% This is used to validate that the error we got was one of the well-known errors. 395 | %% In other words, the assumption is that ALL errors produced by this subsystem 396 | %% fall into this group. 397 | error_common(enobufs) -> fail; 398 | error_common(ehostunreach) -> fail; 399 | error_common(econnrefused) -> fail; 400 | error_common(ehostdown) -> fail; 401 | error_common(enetdown) -> fail. 402 | 403 | -------------------------------------------------------------------------------- /src/dht_par.erl: -------------------------------------------------------------------------------- 1 | %%% @doc Module dht_par runs commands in parallel for the DHT 2 | %%% @end 3 | %%% @private 4 | -module(dht_par). 5 | 6 | -export([pmap/2, partition/1]). 7 | 8 | %% @doc Very very parallel pmap implementation :) 9 | %% @end 10 | %% @todo: fix this parallelism 11 | -spec pmap(fun((A) -> B), [A]) -> [{ok, B} | {error, term()}]. 12 | pmap(F, Es) -> 13 | Parent = self(), 14 | Running = [spawn_monitor(fun() -> Parent ! {self(), F(E)} end) || E <- Es], 15 | collect(Running, 5000). 16 | 17 | collect([], _Timeout) -> []; 18 | collect([{Pid, MRef} | Next], Timeout) -> 19 | receive 20 | {Pid, Res} -> 21 | erlang:demonitor(MRef, [flush]), 22 | [{ok, Res} | collect(Next, Timeout)]; 23 | {'DOWN', MRef, process, Pid, Reason} -> 24 | [{error, Reason} | collect(Next, Timeout)] 25 | after Timeout -> 26 | exit(pmap_timeout) 27 | end. 28 | 29 | partition(Res) -> partition(Res, [], []). 30 | 31 | partition([], OK, Err) -> {lists:reverse(OK), lists:reverse(Err)}; 32 | partition([{ok, R} | Next], OK, Err) -> partition(Next, [R | OK], Err); 33 | partition([{error, E} | Next], OK, Err) -> partition(Next, OK, [E | Err]). 34 | -------------------------------------------------------------------------------- /src/dht_proto.erl: -------------------------------------------------------------------------------- 1 | %%% @doc Module dht_proto handles syntactical DHT protocol encoding/decoding.
2 | %%% @end 3 | %%% @private 4 | -module(dht_proto). 5 | 6 | -export([encode/1, decode/1]). 7 | 8 | -define(VERSION1, <<175,64,13,52,167,136,55,45>>). 9 | 10 | -type query() :: 11 | ping | 12 | {find, node | value, non_neg_integer()} | 13 | {store, dht:token(), dht:id(), inet:port_number()}. 14 | 15 | -type response() :: 16 | ping | 17 | {find, node, dht:token(), [dht:peer()]} | 18 | {find, value, dht:token(), [dht:endpoint()]} | 19 | store. 20 | 21 | -type msg() :: 22 | {query, dht:tag(), dht:id(), query()} | 23 | {response, dht:tag(), dht:id(), response()} | 24 | {error, dht:tag(), dht:id(), integer(), binary()}. 25 | 26 | -export_type([msg/0, query/0, response/0]). 27 | 28 | %% Encoding on the wire 29 | %% ------------------------ 30 | header(Tag, ID) -> <<?VERSION1/binary, Tag/binary, ID:256>>. 31 | 32 | encode_query(ping) -> <<$p>>; 33 | encode_query({find, node, ID}) -> <<$f, $n, ID:256>>; 34 | encode_query({find, value, ID}) -> <<$f, $v, ID:256>>; 35 | encode_query({store, Token, ID, Port}) -> <<$s, Token/binary, ID:256, Port:16>>. 36 | 37 | encode_response(ping) -> $p; 38 | encode_response({find, node, Token, Ns}) -> 39 | L = length(Ns), 40 | [<<$f, $n, Token/binary, L:8>>, encode_nodes(Ns)]; 41 | encode_response({find, value, Token, Vs}) -> 42 | L = length(Vs), 43 | [<<$f, $v, Token/binary, L:8>>, encode_peers(Vs)]; 44 | encode_response(store) -> 45 | $s. 46 | 47 | encode_peers(Vs) -> iolist_to_binary(encode_ps(Vs)). 48 | 49 | encode_ps([]) -> []; 50 | encode_ps([{{B1, B2, B3, B4, B5, B6, B7, B8}, Port} | Ns]) -> 51 | [<<6, 52 | B1:16/integer, B2:16/integer, B3:16/integer, B4:16/integer, 53 | B5:16/integer, B6:16/integer, B7:16/integer, B8:16/integer, 54 | Port:16/integer>> | encode_ps(Ns)]; 55 | encode_ps([{{B1, B2, B3, B4}, Port} | Ns]) -> 56 | [<<4, 57 | B1:8/integer, B2:8/integer, B3:8/integer, B4:8/integer, 58 | Port:16/integer>> | encode_ps(Ns)]. 59 | 60 | encode_nodes(Ns) -> iolist_to_binary(encode_ns(Ns)).
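Every message starts with the 8-byte version prefix, a 2-byte tag and a 256-bit node ID, which is the layout that `dht_proto:decode/1` matches on. A round trip for the smallest message, a ping query, can be sketched as follows. The module `wire_sketch` and its functions are hypothetical and cover only this one message shape.

```erlang
-module(wire_sketch).
-export([encode_ping/2, decode/1]).

%% Hypothetical module; same 8-byte prefix as ?VERSION1 in dht_proto.
-define(VERSION, <<175,64,13,52,167,136,55,45>>).

%% Tag is a 2-byte binary, ID a non-negative integer below 2^256.
%% Layout: version (8 bytes), tag (2), ID (32), $q for query, $p for ping.
encode_ping(Tag, ID) when byte_size(Tag) =:= 2 ->
    <<(?VERSION)/binary, Tag/binary, ID:256, $q, $p>>.

%% Accept exactly the packet produced above; everything else is invalid.
decode(<<175,64,13,52,167,136,55,45, Tag:2/binary, ID:256, $q, $p>>) ->
    {query, Tag, ID, ping};
decode(_) ->
    invalid.
```

A ping query therefore occupies 44 bytes on the wire: 8 for the version, 2 for the tag, 32 for the ID and one byte each for the message kind and the query kind.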
61 | 62 | encode_ns([]) -> []; 63 | encode_ns([{ID, {B1, B2, B3, B4, B5, B6, B7, B8}, Port} | Ns]) -> 64 | [<<6, ID:256, 65 | B1:16/integer, B2:16/integer, B3:16/integer, B4:16/integer, 66 | B5:16/integer, B6:16/integer, B7:16/integer, B8:16/integer, 67 | Port:16/integer>> | encode_ns(Ns)]; 68 | encode_ns([{ID, {B1, B2, B3, B4}, Port} | Ns]) -> 69 | [<<4, ID:256/integer, 70 | B1:8/integer, B2:8/integer, B3:8/integer, B4:8/integer, 71 | Port:16/integer>> | encode_ns(Ns)]. 72 | 73 | -spec encode(msg()) -> iolist(). 74 | encode({query, Tag, ID, Q}) -> [header(Tag, ID), $q, encode_query(Q)]; 75 | encode({response, Tag, ID, R}) -> [header(Tag, ID), $r, encode_response(R)]; 76 | encode({error, Tag, ID, ErrCode, ErrStr}) -> [header(Tag, ID), $e, <<ErrCode:16, ErrStr/binary>>]. 77 | 78 | %% Decoding from the wire 79 | %% ----------------------- 80 | 81 | decode_query(<<$p>>) -> ping; 82 | decode_query(<<$f, $n, ID:256>>) -> {find, node, ID}; 83 | decode_query(<<$f, $v, ID:256>>) -> {find, value, ID}; 84 | decode_query(<<$s, Token:8/binary, ID:256, Port:16>>) -> {store, Token, ID, Port}. 85 | 86 | decode_response(<<$p>>) -> ping; 87 | decode_response(<<$f, $n, Token:8/binary, L:8, Pack/binary>>) -> {find, node, Token, decode_nodes(L, Pack)}; 88 | decode_response(<<$f, $v, Token:8/binary, L:8, Pack/binary>>) -> {find, value, Token, decode_endpoints(L, Pack)}; 89 | decode_response(<<$s>>) -> store. 90 | 91 | %% Force recognition of the correct number of incoming arguments. 92 | decode_nodes(0, <<>>) -> []; 93 | decode_nodes(K, <<4, ID:256, B1, B2, B3, B4, Port:16, Nodes/binary>>) -> 94 | [{ID, {B1, B2, B3, B4}, Port} | decode_nodes(K-1, Nodes)]; 95 | decode_nodes(K, <<6, ID:256, B1:16, B2:16, B3:16, B4:16, B5:16, B6:16, B7:16, B8:16, Port:16, Nodes/binary>>) -> 96 | [{ID, {B1, B2, B3, B4, B5, B6, B7, B8}, Port} | decode_nodes(K-1, Nodes)]. 97 | 98 | %% Force recognition of the correct number of incoming arguments.
99 | decode_endpoints(0, <<>>) -> []; 100 | decode_endpoints(K, <<4, B1, B2, B3, B4, Port:16, Nodes/binary>>) -> 101 | [{{B1, B2, B3, B4}, Port} | decode_endpoints(K-1, Nodes)]; 102 | decode_endpoints(K, <<6, B1:16, B2:16, B3:16, B4:16, B5:16, B6:16, B7:16, B8:16, Port:16, Nodes/binary>>) -> 103 | [{{B1, B2, B3, B4, B5, B6, B7, B8}, Port} | decode_endpoints(K-1, Nodes)]. 104 | 105 | -spec decode(binary()) -> msg(). 106 | decode(<<175,64,13,52,167,136,55,45, Tag:2/binary, ID:256, $q, Query/binary>>) -> 107 | {query, Tag, ID, decode_query(Query)}; 108 | decode(<<175,64,13,52,167,136,55,45, Tag:2/binary, ID:256, $r, Response/binary>>) -> 109 | {response, Tag, ID, decode_response(Response)}; 110 | decode(<<175,64,13,52,167,136,55,45, Tag:2/binary, ID:256, $e, ErrCode:16, ErrorString/binary>>) -> 111 | {error, Tag, ID, ErrCode, ErrorString}; 112 | decode(<<"EDHT-KDM-", 0:8, _Rest/binary>>) -> 113 | {error, {old_version, <<0,0,0,0,0,0,0,0>>}}. 114 | 115 | -------------------------------------------------------------------------------- /src/dht_rand.erl: -------------------------------------------------------------------------------- 1 | %%% @doc Wrappers around random functions 2 | %% This module provides wrappers around the random functions which allows us to 3 | %% mock their behaviour when we are EQC testing. 4 | %%% @end 5 | %%% @private 6 | -module(dht_rand). 7 | 8 | -export([pick/1]). 9 | -export([crypto_rand_bytes/1, uniform/1]). 10 | 11 | pick([]) -> []; 12 | pick(Items) -> 13 | Len = length(Items), 14 | Pos = rand:uniform(Len), 15 | lists:nth(Pos, Items). 16 | 17 | crypto_rand_bytes(N) -> crypto:strong_rand_bytes(N). 18 | 19 | uniform(N) -> rand:uniform(N). 20 | -------------------------------------------------------------------------------- /src/dht_refresh.erl: -------------------------------------------------------------------------------- 1 | %%% @doc This module contains various refreshing tasks 2 | %%% @end 3 | %%% @private 4 | -module(dht_refresh). 
5 | -export([insert_nodes/1, range/1, verify/3]). 6 | 7 | %% @doc insert_nodes/1 inserts a list of nodes into the routing table asynchronously 8 | %% @end 9 | -spec insert_nodes([dht:peer()]) -> ok. 10 | insert_nodes(NodeInfos) -> 11 | [spawn_link(dht_net, ping, [{IP, Port}]) || {_, IP, Port} <- NodeInfos], 12 | ok. 13 | 14 | %% @doc range/1 refreshes a range for the system based on its ID 15 | %% @end 16 | -spec range(dht:peer()) -> ok. 17 | range({ID, IP, Port}) -> 18 | spawn_link(fun() -> 19 | case dht_net:find_node({IP, Port}, ID) of 20 | {error, timeout} -> ok; 21 | {error, Err} -> 22 | error_logger:info_report([{unrecognized_error, Err}]), 23 | ok; 24 | {nodes, _, _Token, Nodes} -> 25 | [spawn_link(fun() -> dht_net:ping({I, P}) end) || {_, I, P} <- Nodes], 26 | ok 27 | end 28 | end), 29 | ok. 30 | 31 | -spec verify(dht:peer(), dht:peer(), map()) -> pid(). 32 | verify({QID, QIP, QPort} = QNode, Node, Opts) -> 33 | spawn_link(fun() -> 34 | case dht_net:ping({QIP, QPort}) of 35 | pang -> dht_state:request_timeout(QNode); 36 | {ok, QID} -> ok; 37 | {ok, _Other} -> dht_state:request_timeout(QNode) 38 | end, 39 | dht_state:request_success(Node, Opts) 40 | end). 41 | -------------------------------------------------------------------------------- /src/dht_routing_meta.erl: -------------------------------------------------------------------------------- 1 | %% @doc Wrap a routing table in timer constructs 2 | %% 3 | %% This module implements a "wrapper" around the routing table code, by adding 4 | %% timer tables for nodes and ranges. The module provides several functions 5 | %% which are used to manipulate not only the routing table, but also the timers 6 | %% for the routing table. The invariant is, roughly, that any node/range in the table 7 | %% has a timer. 8 | %% 9 | %% The module also provides a number of query-facilities for the policy module 10 | %% (dht_state) to query the internal timer state when a timer triggers. 
11 | %% 12 | %% Global TODO: 13 | %% 14 | %% · Rework the EQC model, since it is now in tatters. 15 | %%% @end 16 | %%% @private 17 | -module(dht_routing_meta). 18 | -include("dht_constants.hrl"). 19 | 20 | %% Create/Export 21 | -export([new/1]). 22 | -export([export/1]). 23 | 24 | %% Manipulate the routing table and meta-data 25 | -export([ 26 | insert/2, 27 | replace/3, 28 | remove/2, 29 | node_touch/3, 30 | node_timeout/2, 31 | reset_range_timer/3 32 | ]). 33 | 34 | %% Query the state of the routing table and its meta-data 35 | -export([ 36 | member_state/2, 37 | neighbors/3, 38 | node_list/1, 39 | node_state/2, 40 | range_members/2, 41 | range_state/2, 42 | info/1 43 | ]). 44 | 45 | -record(routing, { 46 | table, 47 | nodes = #{}, 48 | ranges = #{} 49 | }). 50 | 51 | -type node_state() :: good | {questionable, integer()} | bad. 52 | -export_type([node_state/0]). 53 | 54 | info(#routing { table = T, nodes = Ns, ranges = Rs }) -> 55 | M = #{ buckets := Bs } = dht_routing_table:info(T), 56 | Bs2 = info_fill_in(Bs, Ns, Rs), 57 | M#{ buckets := Bs2 }. 58 | 59 | info_fill_in([], _, _) -> []; 60 | info_fill_in([B | Bs], Ns, Rs) -> 61 | #{ low := L, high := H, members := Mems } = B, 62 | #{ last_activity := LA } = maps:get({L, H}, Rs), 63 | MStates = 64 | [begin 65 | NState = maps:get(N, Ns), 66 | NState#{ peer => N } 67 | end || N <- Mems], 68 | [B#{ members := MStates, last_activity => LA } | info_fill_in(Bs, Ns, Rs)]. 69 | 70 | %% API 71 | %% ------------------------------------------------------ 72 | new(Tbl) -> 73 | Now = dht_time:monotonic_time(), 74 | Nodes = dht_routing_table:node_list(Tbl), 75 | ID = dht_routing_table:node_id(Tbl), 76 | RangeTable = init_range_timers(Now, Tbl), 77 | NodeTable = init_nodes(Now, Nodes), 78 | State = #routing { 79 | table = Tbl, 80 | ranges = RangeTable, 81 | nodes = NodeTable 82 | }, 83 | {ok, ID, State}. 84 | 85 | member_state(Node, #routing { table = T }) -> dht_routing_table:member_state(Node, T). 
86 | range_members({_, _, _} = Node, #routing { table = T }) -> dht_routing_table:members({node, Node}, T); 87 | range_members({_, _} = Range, #routing { table = T }) -> dht_routing_table:members({range, Range}, T). 88 | 89 | %% @doc replace/3 substitutes one bad node for a new node 90 | %% Preconditions: 91 | %% • The removed node MUST be bad. 92 | %% • The added node MUST NOT be a member. 93 | %% @end 94 | -spec replace(Old, New, Meta) -> not_inserted | roaming_member | {error, Reason} | {ok, Meta} 95 | when 96 | Old :: dht:peer(), 97 | New :: dht:peer(), 98 | Meta :: #routing{}, 99 | Reason :: atom(). 100 | 101 | replace(Old, New, #routing { nodes = Ns, table = Tbl } = State) -> 102 | bad = timer_state({node, Old}, Ns), 103 | case member_state(New, State) of 104 | unknown -> 105 | Deleted = State#routing { 106 | table = dht_routing_table:delete(Old, Tbl), 107 | nodes = maps:remove(Old, Ns) 108 | }, 109 | insert(New, Deleted); 110 | roaming_member -> 111 | roaming_member; 112 | member -> 113 | {error, member} 114 | end. 115 | 116 | %% @doc insert/2 inserts a new node in the routing table 117 | %% Precondition: The node inserted is not a member 118 | %% Postcondition: The inserted node is now a member 119 | %% @end 120 | -spec insert(Node, #routing{}) -> {ok, #routing{}} | not_inserted 121 | when Node :: dht:peer(). 122 | insert(Node, #routing { table = Tbl, nodes = NT } = Routing) -> 123 | Now = dht_time:monotonic_time(), 124 | PrevRanges = dht_routing_table:ranges(Tbl), 125 | case dht_routing_table:space(Node, Tbl) of 126 | false -> not_inserted; 127 | true -> 128 | NextTbl = dht_routing_table:insert(Node, Tbl), 129 | member = dht_routing_table:member_state(Node, NextTbl), 130 | NewState = Routing#routing { 131 | table = NextTbl, 132 | nodes = node_update({reachable, Node}, Now, NT) }, 133 | {ok, update_ranges(PrevRanges, Now, NewState)} 134 | end. 
135 | 136 | %% @doc remove/2 removes a node from the routing table 137 | %% @end 138 | remove(Node, #routing { table = Tbl, nodes = NT} = State) -> 139 | bad = timer_state({node, Node}, NT), 140 | State#routing { 141 | table = dht_routing_table:delete(Node, Tbl), 142 | nodes = maps:remove(Node, NT) 143 | }. 144 | 145 | %% @doc node_touch/3 marks recent communication with a node 146 | %% We pass `#{ reachable => true/false }' to signify if the touch was part of a reachable 147 | %% communication. That is, if we know for sure the peer node is reachable. 148 | %% @end 149 | -spec node_touch(Node, Opts, Routing) -> Routing 150 | when 151 | Node :: dht:peer(), 152 | Opts :: #{ atom() => boolean() }, 153 | Routing :: #routing{}. 154 | 155 | node_touch(Node, #{ reachable := true }, #routing { nodes = NT} = Routing) -> 156 | Routing#routing { nodes = node_update({reachable, Node}, dht_time:monotonic_time(), NT) }; 157 | node_touch(Node, #{ reachable := false }, #routing { nodes = NT } = Routing) -> 158 | Routing#routing { nodes = node_update({unreachable, Node}, dht_time:monotonic_time(), NT) }. 159 | 160 | %% @doc node_timeout/2 marks a Node communication as timed out 161 | %% Tracks the flakiness of a peer. If this is called too many times, then the peer/node 162 | %% enters the 'bad' state. 163 | %% @end 164 | node_timeout(Node, #routing { nodes = NT } = Routing) -> 165 | #{ timeout_count := TC } = State = maps:get(Node, NT), 166 | NewState = State#{ timeout_count := TC + 1 }, 167 | Routing#routing { nodes = maps:update(Node, NewState, NT) }. 168 | 169 | %% @doc node_list/1 returns the currently known nodes in the routing table. 170 | %% @end 171 | -spec node_list(#routing{}) -> [dht:peer()]. 172 | node_list(#routing { table = Tbl }) -> dht_routing_table:node_list(Tbl). 173 | 174 | %% @doc node_state/2 computes the state of a node list 175 | %% @end 176 | -spec node_state([Peer], Routing) -> [{Peer, node_state()}] 177 | when Peer :: dht:peer(), Routing :: #routing{}. 
178 | 179 | node_state(Nodes, #routing { nodes = NT }) -> 180 | [{N, timer_state({node, N}, NT)} || N <- Nodes]. 181 | 182 | range_state(Range, #routing{ table = Tbl } = Routing) -> 183 | case dht_routing_table:is_range(Range, Tbl) of 184 | false -> {error, not_member}; 185 | true -> 186 | range_state_members(range_members(Range, Routing), Routing) 187 | end. 188 | 189 | reset_range_timer(Range, #{ force := Force }, 190 | #routing { ranges = RT, nodes = Nodes, table = Tbl } = Routing) -> 191 | TS = 192 | case Force of 193 | false -> range_last_activity(Range, Nodes, Tbl); 194 | true -> dht_time:monotonic_time() 195 | end, 196 | 197 | %% Update the range timer to the oldest member 198 | TmpR = timer_delete(Range, RT), 199 | RTRef = mk_timer(TS, ?RANGE_TIMEOUT, {inactive_range, Range}), 200 | NewRT = range_timer_add(Range, TS, RTRef, TmpR), 201 | 202 | Routing#routing { ranges = NewRT }. 203 | 204 | %% @doc export/1 returns the underlying routing table as an Erlang term 205 | %% Note: We DON'T store the timers in between invocations. This decision 206 | %% effectively makes the DHT system Time Warp capable 207 | %% @end 208 | export(#routing { table = Tbl }) -> Tbl. 209 | 210 | %% @doc neighbors/3 returns up to K neighbors around an ID 211 | %% The search returns a list of nodes, where the nodes toward the head 212 | %% are good nodes, and nodes further down are questionable nodes. 
213 | %% @end 214 | neighbors(ID, K, #routing { table = Tbl } = Routing) -> 215 | Nodes = dht_routing_table:closest_to(ID, Tbl), 216 | States = node_state(Nodes, Routing), 217 | {Good, QBNodes} = lists:partition( 218 | fun({_, S}) -> S == good end, States), 219 | case take(K, Good) of 220 | L when length(L) == K -> [N || {N, _} <- L]; 221 | L when length(L) < K -> 222 | Remaining = K - length(L), 223 | {Questionable, _} = lists:partition( 224 | fun 225 | ({_, {questionable, _}}) -> true; 226 | (_) -> false 227 | end, 228 | QBNodes), 229 | [N || {N, _} <- L ++ take(Remaining, Questionable)] 230 | end. 231 | 232 | %% INTERNAL FUNCTIONS 233 | %% ------------------------------------------------------ 234 | 235 | take(0, _) -> []; 236 | take(_, []) -> []; 237 | take(K, [X|Xs]) when K > 0 -> [X | take(K-1, Xs)]. 238 | 239 | range_state_members(Members, Routing) -> 240 | T = dht_time:monotonic_time(), 241 | case last_activity(Members, Routing) of 242 | never -> empty; 243 | A when A =< T -> 244 | Window = dht_time:convert_time_unit(T - A, native, milli_seconds), 245 | case Window =< ?RANGE_TIMEOUT of 246 | true -> ok; 247 | false -> {needs_refresh, dht_rand:pick(Members)} 248 | end 249 | end. 250 | 251 | 252 | %% Insertion may invoke splitting of ranges. If this happens, we need to 253 | %% update the timers for ranges: The old range gets removed. The new 254 | %% ranges get added. 255 | update_ranges(PrevRanges, Now, #routing { ranges = Ranges, nodes = NT, table = NewTbl } = State) -> 256 | NewRanges = dht_routing_table:ranges(NewTbl), 257 | Operations = lists:append( 258 | [{del, R} || R <- ordsets:subtract(PrevRanges, NewRanges)], 259 | [{add, R} || R <- ordsets:subtract(NewRanges, PrevRanges)]), 260 | State#routing { ranges = fold_ranges(lists:sort(Operations), Now, NT, Ranges, NewTbl) }. 261 | 262 | %% Carry out a sequence of operations over the ranges in a fold.
263 | fold_ranges([{del, R} | Ops], Now, Nodes, Ranges, Tbl) -> 264 | fold_ranges(Ops, Now, Nodes, timer_delete(R, Ranges), Tbl); 265 | fold_ranges([{add, R} | Ops], Now, Nodes, Ranges, Tbl) -> 266 | Recent = range_last_activity(R, Nodes, Tbl), 267 | TRef = mk_timer(Recent, ?RANGE_TIMEOUT, {inactive_range, R}), 268 | fold_ranges(Ops, Now, Nodes, range_timer_add(R, Now, TRef, Ranges), Tbl); 269 | fold_ranges([], _Now, _Nodes, Ranges, _Tbl) -> Ranges. 270 | 271 | %% Find the oldest member in the range and use that as the last activity 272 | %% point for the range. 273 | range_last_activity(Range, Nodes, Tbl) -> 274 | Members = dht_routing_table:members({range, Range}, Tbl), 275 | timer_newest(Members, Nodes). 276 | 277 | last_activity(Members, #routing { nodes = NTs } ) -> 278 | NodeTimers = [maps:get(M, NTs) || M <- Members], 279 | Changed = [LA || #{ last_activity := LA } <- NodeTimers], 280 | case Changed of 281 | [] -> never; 282 | Cs -> lists:max(Cs) 283 | end. 284 | 285 | init_range_timers(Now, Tbl) -> 286 | Ranges = dht_routing_table:ranges(Tbl), 287 | F = fun(R, Acc) -> 288 | Ref = mk_timer(Now, ?RANGE_TIMEOUT, {inactive_range, R}), 289 | range_timer_add(R, Now, Ref, Acc) 290 | end, 291 | lists:foldl(F, #{}, Ranges). 292 | 293 | init_nodes(Now, Nodes) -> 294 | Timeout = dht_time:convert_time_unit(?NODE_TIMEOUT, milli_seconds, native), 295 | F = fun(N) -> {N, #{ last_activity => Now - Timeout, timeout_count => 0, reachable => false }} end, 296 | maps:from_list([F(N) || N <- Nodes]). 297 | 298 | 299 | timer_delete(Item, Timers) -> 300 | #{ timer_ref := TRef } = V = maps:get(Item, Timers), 301 | _ = dht_time:cancel_timer(TRef), 302 | maps:update(Item, V#{ timer_ref := undefined }, Timers). 
303 | 304 | node_update({reachable, Item}, Activity, Timers) -> 305 | Timers#{ Item => #{ last_activity => Activity, timeout_count => 0, reachable => true }}; 306 | node_update({unreachable, Item}, Activity, Timers) -> 307 | case maps:get(Item, Timers) of 308 | M = #{ reachable := true } -> 309 | Timers#{ Item => M#{ last_activity => Activity, timeout_count => 0, reachable := true }}; 310 | #{ reachable := false } -> 311 | Timers 312 | end. 313 | 314 | range_timer_add(Item, ActivityTime, TRef, Timers) -> 315 | Timers#{ Item => #{ last_activity => ActivityTime, timer_ref => TRef} }. 316 | 317 | timer_newest([], _) -> dht_time:monotonic_time(); % None available 318 | timer_newest(Items, Timers) -> 319 | Activities = [maps:get(K, Timers) || K <- Items], 320 | lists:max([A || #{ last_activity := A } <- Activities]). 321 | 322 | %% monus/2 is defined on integers in the obvious way (look up the Wikipedia article) 323 | monus(A, B) when A > B -> A - B; 324 | monus(A, B) when A =< B-> 0. 325 | 326 | %% Age returns the time since a point in time T. 327 | %% The age function is not time-warp resistant. 328 | age(T) -> 329 | Now = dht_time:monotonic_time(), 330 | age(T, Now). 331 | 332 | %% Return the age compared to the current point in time 333 | age(T, Now) when T =< Now -> 334 | dht_time:convert_time_unit(Now - T, native, milli_seconds); 335 | age(T, Now) when T > Now -> 336 | exit(time_warp_future). 337 | 338 | %% mk_timer/3 creates a new timer based on starting-point and an interval 339 | %% Given `Start', the point in time when the timer should start, and an interval, 340 | %% construct a timer that triggers at the end of the Start+Interval window. 341 | %% 342 | %% Start is in native time scale, Interval is in milli_seconds. 343 | mk_timer(Start, Interval, Msg) -> 344 | Age = age(Start), 345 | dht_time:send_after(monus(Interval, Age), self(), Msg). 
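`mk_timer/3` above combines `age/1` and `monus/2` so that a timer whose window has already passed fires immediately instead of being scheduled with a negative delay. The arithmetic can be sketched on its own as follows. The module `monus_sketch` and the function `delay/3` are hypothetical, and all values are assumed to be in milliseconds so that no time-unit conversion is needed.

```erlang
-module(monus_sketch).
-export([monus/2, delay/3]).

%% Hypothetical module illustrating the timer arithmetic only.

%% Truncated subtraction on integers: the result never goes below zero.
monus(A, B) when A > B -> A - B;
monus(_A, _B) -> 0.

%% Milliseconds left until a timer that started at StartMs with the given
%% interval should fire, as seen at NowMs. An expired timer yields 0.
%% Like age/2 in the module, a start point in the future is not accepted.
delay(StartMs, IntervalMs, NowMs) when StartMs =< NowMs ->
    monus(IntervalMs, NowMs - StartMs).
```

For example, a one second timer that started 600 ms ago has 400 ms left, while the same timer started 1500 ms ago is due right away.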
346 | 347 | %% @doc timer_state/2 returns the state of a timer, based on BitTorrent Enhancement Proposal 5 348 | %% @end 349 | -spec timer_state({node, N}, Timers) -> 350 | good | {questionable, non_neg_integer()} | bad 351 | when 352 | N :: dht:peer(), 353 | Timers :: #{ atom() => any() }. 354 | 355 | timer_state({node, N}, NTs) -> 356 | case maps:get(N, NTs, undefined) of 357 | #{ timeout_count := K } when K > 1 -> bad; 358 | #{ last_activity := LA } -> 359 | Age = age(LA), 360 | case Age < ?NODE_TIMEOUT of 361 | true -> good; 362 | false -> {questionable, LA} 363 | end 364 | end. 365 | 366 | -------------------------------------------------------------------------------- /src/dht_routing_table.erl: -------------------------------------------------------------------------------- 1 | %% @author Magnus Klaar 2 | %% @author Jesper Louis Andersen 3 | %% @doc module dht_routing_table maintains a Kademlia routing table 4 | %% 5 | %% This module implements a server maintaining the 6 | %% DHT routing table. The nodes in the routing table 7 | %% are distributed across a set of buckets. The bucket 8 | %% set is created incrementally based on the local node id. 9 | %% 10 | %% The set of buckets, id ranges, is used to limit 11 | %% the number of nodes in the routing table. The routing 12 | %% table must only contain ?K nodes that fall within the 13 | %% range of each bucket. 14 | %% 15 | %% @end 16 | %%% @private 17 | -module(dht_routing_table). 18 | -include("dht_constants.hrl"). 19 | 20 | -export([new/1, new/3]). 21 | -export([ 22 | delete/2, 23 | insert/2 24 | ]). 25 | 26 | %% Query 27 | -export([ 28 | closest_to/2, 29 | member_state/2, 30 | is_range/2, 31 | members/2, 32 | node_id/1, 33 | node_list/1, 34 | ranges/1, 35 | space/2, 36 | info/1 37 | ]). 38 | 39 | -define(in_range(Dist, Min, Max), ((Dist >= Min) andalso (Dist < Max))). 40 | 41 | -record(bucket, { 42 | low :: dht:id(), 43 | high :: dht:id(), 44 | members :: [dht:peer()] 45 | }). 
46 | 47 | -record(routing_table, { 48 | self :: dht:id(), 49 | table :: [#bucket{}] 50 | }). 51 | -type t() :: #routing_table{}. 52 | -export_type([t/0]). 53 | 54 | info(#bucket{ low = Low, high = High, members = Members }) -> 55 | #{ low => Low, high => High, members => Members }; 56 | info(#routing_table { self = Self, table = Buckets }) -> 57 | #{ 58 | self => Self, 59 | buckets => [info(B) || B <- Buckets] 60 | }. 61 | 62 | %% 63 | %% Create a new bucket list 64 | %% 65 | -spec new(dht:id()) -> t(). 66 | new(Self) -> new(Self, ?MIN_ID, ?MAX_ID). 67 | 68 | new(Self, Lo, Hi) when is_integer(Self), Self >= 0 -> 69 | #routing_table { 70 | self = Self, 71 | table = [#bucket { low = Lo, high = Hi, members = [] }] 72 | }. 73 | 74 | -spec node_id(t()) -> dht:id(). 75 | node_id(#routing_table { self = ID }) -> ID. 76 | 77 | %% 78 | %% Space - determine if there is space for a node 79 | %% 80 | -spec space(dht:peer(), t()) -> boolean(). 81 | space(N, T) -> 82 | TestTable = insert(N, T), 83 | case member_state(N, TestTable) of 84 | unknown -> false; 85 | member -> true 86 | end. 87 | 88 | %% 89 | %% Insert a new node into a bucket list 90 | %% 91 | %% TODO: Insertion should also provide evidence for what happened to buckets/ranges. 92 | %% 93 | -spec insert(dht:peer(), t()) -> t(). 94 | insert(Node, #routing_table { self = Self, table = Table} = Tbl) -> 95 | Tbl#routing_table { table = insert_node(Self, Node, Table) }. 
96 | 97 | %% The recursive runner for insertion 98 | insert_node(Self, {ID, _, _} = Node, [#bucket{ low = Min, high = Max, members = Members} = B | Next]) 99 | when ?in_range(ID, Min, Max) -> 100 | %% We analyze the numbers of members and either insert or split the bucket 101 | case length(Members) of 102 | L when L < ?MAX_RANGE_SZ -> [B#bucket { members = ordsets:add_element(Node, Members) } | Next]; 103 | L when L == ?MAX_RANGE_SZ -> 104 | case ?in_range(Self, Min, Max) of 105 | true -> insert_node(Self, Node, insert_split_bucket(B) ++ Next); 106 | false -> [B | Next] 107 | end 108 | end; 109 | insert_node(Self, Node, [H|T]) -> [H | insert_node(Self, Node, T)]. 110 | 111 | insert_split_bucket(#bucket{ low = Min, high = Max, members = Members }) -> 112 | Diff = Max - Min, 113 | Half = Max - (Diff div 2), 114 | F = fun({MID, _, _}) -> ?in_range(MID, Min, Half) end, 115 | {Lower, Upper} = lists:partition(F, Members), 116 | [#bucket{ low = Min, high = Half, members = Lower }, 117 | #bucket{ low = Half, high = Max, members = Upper }]. 118 | 119 | %% Get all ranges present in a bucket list 120 | %% 121 | -spec ranges(t()) -> list({dht:id(), dht:id()}). 122 | ranges(#routing_table { table = Entries }) -> 123 | lists:sort([{Min, Max} || #bucket{ low = Min, high = Max } <- Entries]). 124 | 125 | %% 126 | %% Delete a node from a bucket list 127 | %% 128 | -spec delete(dht:peer(), t()) -> t(). 129 | delete(Node, #routing_table { table = Table} = Tbl) -> 130 | Tbl#routing_table { table = delete_node(Node, Table) }. 131 | 132 | delete_node({ID, _, _} = Node, [#bucket { low = Min, high = Max, members = Members } = B|T]) 133 | when ?in_range(ID, Min, Max) -> 134 | [B#bucket { members = ordsets:del_element(Node, Members) }|T]; 135 | delete_node(Node, [H|T]) -> [H | delete_node(Node, T)]; 136 | delete_node(_, []) -> []. 
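%% Worked example for insert_split_bucket/1 (a sketch with small illustrative IDs;
%% real IDs span ?MIN_ID..?MAX_ID). Splitting #bucket{ low = 0, high = 16 }:
%%   Diff = 16 - 0 = 16
%%   Half = 16 - (16 div 2) = 8
%% Members with 0 =< ID < 8 go to the lower bucket {0, 8}, the rest to the upper
%% bucket {8, 16}. insert_node/3 only splits a full bucket whose range contains Self;
%% a full bucket elsewhere is returned unchanged and the new node is not added.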
137 | 138 | %% 139 | %% Return all members of the bucket that this node is a member of 140 | %% 141 | %% TODO: Figure out why we are using the metric here as well. 142 | %% TODO: Call as members({range, Min, Max} | {node, Node}) to make search explicit. 143 | -spec members({range, Range} | {node, Node}, t()) -> [Node] 144 | when 145 | Node :: dht:peer(), 146 | Range :: dht:range(). 147 | members({range, {Min, Max}}, #routing_table { table = Table}) -> 148 | S = fun(#bucket { low = Lo, high = Hi}) -> Lo == Min andalso Hi == Max end, 149 | Target = retrieve(S, Table), 150 | Target#bucket.members; 151 | members({node, {ID, _, _}}, RT) -> 152 | #bucket { members = Members } = retrieve_id(ID, RT), 153 | Members. 154 | 155 | %% 156 | %% Check if a node is a member of a bucket list 157 | %% 158 | -spec member_state(dht:peer(), t()) -> member | roaming_member | unknown. 159 | member_state({ID, IP, Port}, RT) -> 160 | #bucket { members = Members } = retrieve_id(ID, RT), 161 | case lists:keyfind(ID, 1, Members) of 162 | false -> unknown; 163 | {ID, IP, Port} -> member; 164 | {ID, _, _} -> roaming_member 165 | end. 166 | 167 | %% 168 | %% Check if a range exists in a range list 169 | %% 170 | -spec is_range({dht:id(), dht:id()}, t()) -> boolean(). 171 | is_range(Range, RT) -> lists:member(Range, ranges(RT)). 172 | 173 | -spec closest_to(dht:id(), t()) -> list(dht:peer()). 174 | closest_to(ID, #routing_table { table = Buckets }) -> 175 | Nodes = [N || N <- all_nodes(Buckets)], 176 | DF = fun({MID, _, _}) -> dht_metric:d(ID, MID) end, 177 | lists:sort(fun(X, Y) -> DF(X) < DF(Y) end, Nodes). 178 | 179 | %% 180 | %% Return a list of all members, combined, in all buckets. 181 | %% 182 | -spec node_list(t()) -> [dht:peer()]. 183 | node_list(#routing_table { table = Entries }) -> 184 | lists:flatmap(fun(B) -> B#bucket.members end, Entries). 
185 | 186 | %% Retrieve an element from a list 187 | %% Precondition: The element is already in the list 188 | retrieve(F, [X|Xs]) -> 189 | case F(X) of 190 | true -> X; 191 | false -> retrieve(F, Xs) 192 | end. 193 | 194 | %% Given a distance to a target, is it the right bucket? 195 | in_bucket(Dist, #bucket { low = Lo, high = Hi }) -> ?in_range(Dist, Lo, Hi). 196 | 197 | %% Specialized retrieve on an ID 198 | retrieve_id(ID, #routing_table { table = Table }) -> 199 | S = fun(B) -> in_bucket(ID, B) end, 200 | retrieve(S, Table). 201 | 202 | all_nodes([]) -> []; 203 | all_nodes([#bucket { members = Ns } | Bs]) -> 204 | Rest = all_nodes(Bs), 205 | Ns ++ Rest. 206 | -------------------------------------------------------------------------------- /src/dht_search.erl: -------------------------------------------------------------------------------- 1 | %%% @doc Implements recursive searching in the DHT 2 | %%% @end 3 | %%% @private 4 | -module(dht_search). 5 | 6 | %% API for iterative search functions 7 | -export([ 8 | run/2 9 | ]). 10 | 11 | -define(SEARCH_WIDTH, 32). 12 | -define(SEARCH_RETRIES, 3). 13 | 14 | -record(search_state, { 15 | query_type :: find_node | find_value, 16 | done = gb_sets:empty() :: gb_sets:set(dht:peer()), 17 | alive = gb_sets:empty() :: gb_sets:set(dht:peer()), 18 | acc = [] :: [ {dht:peer(), dht:token(), [dht:endpoint()]} ] 19 | }). 20 | 21 | %% SEARCH API 22 | %% --------------------------------------------------- 23 | 24 | -spec run(find_node | find_value, dht:id()) -> FindNodeRes | FindValueRes 25 | when 26 | FindNodeRes :: list ( dht:peer() ), 27 | FindValueRes :: #{ atom() => [term()] }. 28 | run(Type, ID) -> 29 | search_iterate(Type, ID, ?SEARCH_WIDTH, dht_state:closest_to(ID, ?SEARCH_WIDTH)). 
30 | 31 | %% Internal functions 32 | %% ----------------------------- 33 | 34 | search_iterate(QType, Target, Width, Nodes) -> 35 | NodeID = dht_state:node_id(), 36 | dht_iter_search(NodeID, Target, Width, ?SEARCH_RETRIES, Nodes, 37 | #search_state{ query_type = QType }). 38 | 39 | dht_iter_search(_NodeID, _Target, _Width, 0, _Todo, 40 | #search_state { query_type = find_node, alive = A }) -> 41 | gb_sets:to_list(A); 42 | dht_iter_search(_NodeID, _Target, _Width, 0, _Todo, 43 | #search_state { query_type = find_value, alive = A } = State) -> 44 | Res = res(State), 45 | #{ 46 | store => [{N, Token} || {N, Token, _Vs} <- Res], 47 | found => lists:usort([V || {_, _, Vs} <- Res, V <- Vs]), 48 | alive => A 49 | }; 50 | dht_iter_search(NodeID, Target, Width, Retries, Todo, #search_state{ query_type = QType } = State) -> 51 | %% Call the DHT in parallel to speed up the search: 52 | Call = fun({_, IP, Port} = N) -> 53 | {N, apply(dht_net, QType, [{IP, Port}, Target])} 54 | end, 55 | CallRes = dht_par:pmap(Call, Todo), 56 | 57 | %% Assert there are no errors, since workers are not to die here 58 | {Results, [] = _Errors} = dht_par:partition(CallRes), 59 | 60 | %% Maintain invariants for the next round by updating the necessary data structures: 61 | #{ new := New, next := NextState } = track_state(Results, Todo, State), 62 | 63 | WorkQueue = lists:usort(dht_metric:neighborhood(Target, New, Width)), 64 | Retry = update_retries(NodeID, Retries, WorkQueue, NextState), 65 | dht_iter_search(NodeID, Target, Width, Retry, WorkQueue, NextState). 66 | 67 | %% If the work queue contains the closest node, we are converging toward our target. 68 | %% If not, then we decrease the retry-count by one, since it is not likely we will hit 69 | %% a better target. 70 | update_retries(NodeID, K, WorkQueue, State) -> 71 | case view_closest_node(NodeID, alive(State), WorkQueue) of 72 | work_queue -> ?SEARCH_RETRIES; 73 | alive_set -> K - 1 74 | end. 
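%% Worked trace of the retry counter (a sketch, using ?SEARCH_RETRIES = 3 as defined above):
%%   round 1: view_closest_node/3 returns work_queue, so Retry = 3
%%   round 2: it returns alive_set, so Retry = K - 1 = 2
%%   round 3: alive_set again, so Retry = 1
%%   round 4: alive_set again, so Retry = 0 and the next call to dht_iter_search/6
%%            matches a terminating clause and returns the result.
%% Any round that returns work_queue resets the counter to ?SEARCH_RETRIES.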
75 | 76 | %% Once a round completes, track the state of the round in the #search_state{} record. 77 | %% 78 | %% Rules: 79 | %% • Nodes which responded as being alive are added to the alive-set 80 | %% • Track which nodes have been processed 81 | %% • Accumulate the results 82 | %% 83 | %% Returns the next state as well as the set of newly found nodes (for the work-queue) 84 | %% 85 | track_state(Results, Processed, 86 | #search_state { 87 | query_type = QType, 88 | alive = Alive, 89 | done = Done, 90 | acc = Acc } = State) -> 91 | Next = State#search_state { 92 | alive = gb_sets:union(Alive, gb_sets:from_list( alive_nodes(Results)) ), 93 | done = Queried = gb_sets:union(Done, gb_sets:from_list(Processed)), 94 | acc = accum_peers(QType, Acc, Results) 95 | }, 96 | New = [N || N <- all_nodes(Results), not gb_sets:is_member(N, Queried)], 97 | 98 | #{ new => New, next => Next }. 99 | 100 | %% Select nodes from a response 101 | resp_nodes({nodes, _, Ns}) -> Ns; 102 | resp_nodes(_) -> []. 103 | 104 | %% all_nodes/1 gathers nodes from a set of parallel queries 105 | all_nodes(Resp) -> lists:concat([resp_nodes(R) || {_N, R} <- Resp]). 106 | 107 | %% Compute if a response is ok, for use in a predicate 108 | ok({error, _}) -> false; 109 | ok({error, _ID, _Code, _Msg}) -> false; 110 | ok(_) -> true. 111 | 112 | %% alive_nodes/1 returns the nodes that returned positive results 113 | alive_nodes(Resp) -> [N || {N, R} <- Resp, ok(R)]. 114 | 115 | %% accum_peers/3 accumulates new targets with the value present 116 | accum_peers(find_node, [], _Results) -> 117 | %% find_node never has results to accumulate 118 | []; 119 | accum_peers(find_value, Acc, Results) -> 120 | New = accum_results(Results), 121 | New ++ Acc. 
122 | 123 | accum_results([]) -> []; 124 | accum_results([ {_Node, {error, timeout}} | Rs ]) -> 125 | %% Skip if the node timed out 126 | accum_results(Rs); 127 | accum_results([ {Node, {values, _, Token, Vs}} | Rs ]) -> 128 | [{Node, Token, Vs} | accum_results(Rs)]; 129 | accum_results([ {Node, {nodes, _, Token, _}} | Rs ]) -> 130 | [{Node, Token, []} | accum_results(Rs)]. 131 | 132 | %% view_closest_node/3 determines if the closest node is in the work_queue or in the alive set 133 | view_closest_node(ID, AliveNodes, WorkQueue) -> 134 | DistFun = fun({NID, _, _}, Incumbent) -> 135 | min(dht_metric:d(ID, NID), Incumbent) 136 | end, 137 | 138 | MinAlive = gb_sets:fold(DistFun, infinity, AliveNodes), 139 | MinWork = lists:foldl(DistFun, infinity, WorkQueue), 140 | case MinWork < MinAlive of 141 | true -> work_queue; 142 | false -> alive_set 143 | end. 144 | 145 | %% find the nodes/values which are alive 146 | alive(#search_state { alive = A }) -> A. 147 | 148 | %% obtain the final result of the query 149 | res(#search_state { acc = A }) -> A. 150 | -------------------------------------------------------------------------------- /src/dht_socket.erl: -------------------------------------------------------------------------------- 1 | %%% @doc module dht_socket is a callthrough module for gen_udp 2 | %%% @end 3 | %%% @private 4 | -module(dht_socket). 5 | 6 | -export([ 7 | open/2, 8 | send/4, 9 | sockname/1 10 | ]). 11 | 12 | open(Port, Opts) -> 13 | gen_udp:open(Port, Opts). 14 | 15 | send(Socket, IP, Port, Packet) -> 16 | gen_udp:send(Socket, IP, Port, Packet). 17 | 18 | sockname(Socket) -> 19 | inet:sockname(Socket). 
20 | -------------------------------------------------------------------------------- /src/dht_state.erl: -------------------------------------------------------------------------------- 1 | %%% @author Magnus Klaar 2 | %%% @author Jesper Louis Andersen 3 | %%% @doc A Server for maintaining the routing table in DHT 4 | %% 5 | %% @todo Document all exported functions. 6 | %% 7 | %% This module implements the higher-level logic of the DHT 8 | %% routing table. The routing table itself is split over 3 modules: 9 | %% 10 | %% * The routing table itself - In dht_routing_table 11 | %% * The set of timer meta-data for node/range refreshes - In dht_routing_meta 12 | %% * The policy rules for what to do - In dht_state (this file) 13 | %% 14 | %% This module's main responsibility is to call out to helper modules 15 | %% and make sure to maintain consistency of the above three states 16 | %% we maintain. 17 | %% 18 | %% Many of the calls in this module will block for a while if called. The reason is 19 | %% that the calling process handles network blocking calls for the gen_server 20 | %% that runs the dht_state routing table. That is, a response from the routing 21 | %% table may be to execute a network call, and then the caller awaits the 22 | %% completion of that network call. 23 | %% 24 | %%% @end 25 | %%% @private 26 | -module(dht_state). 27 | -behaviour(gen_server). 28 | 29 | -include("dht_constants.hrl"). 30 | -include_lib("kernel/include/inet.hrl"). 31 | 32 | %% Lifetime 33 | -export([ 34 | dump_state/0, dump_state/1, 35 | start_link/2, 36 | start_link/3, 37 | sync/0 38 | ]). 39 | 40 | %% Query 41 | -export([ 42 | closest_to/1, 43 | closest_to/2, 44 | node_id/0 45 | ]). 46 | 47 | %% Manipulation 48 | -export([ 49 | request_success/2, 50 | request_timeout/1 51 | ]). 52 | 53 | %% Information 54 | -export([ 55 | info/0 56 | ]). 57 | 58 | -export([init/1, 59 | handle_call/3, 60 | handle_cast/2, 61 | handle_info/2, 62 | terminate/2, 63 | code_change/3]). 
64 | 65 | -record(state, { 66 | node_id :: dht:id(), % ID of this node 67 | routing = dht_routing_meta:empty() % Routing table and timing structure 68 | }). 69 | 70 | %% LIFETIME MAINTENANCE 71 | %% ---------------------------------------------------------- 72 | start_link(StateFile, BootstrapNodes) -> 73 | start_link(dht_metric:mk(), StateFile, BootstrapNodes). 74 | 75 | start_link(RequestedID, StateFile, BootstrapNodes) -> 76 | gen_server:start_link({local, ?MODULE}, 77 | ?MODULE, 78 | [RequestedID, StateFile, BootstrapNodes], []). 79 | 80 | %% @doc Retrieve routing table information 81 | info() -> 82 | MetaData = call(info), 83 | dht_routing_meta:info(MetaData). 84 | 85 | %% @doc dump_state/0 dumps the routing table state to disk 86 | %% @end 87 | dump_state() -> 88 | call(dump_state). 89 | 90 | %% @doc dump_state/1 dumps the routing table state to disk into a given file 91 | %% @end 92 | dump_state(Filename) -> 93 | call({dump_state, Filename}). 94 | 95 | %% PUBLIC API 96 | %% ---------------------------------------------------------- 97 | %% Helper for calling 98 | call(X) -> gen_server:call(?MODULE, X). 99 | cast(X) -> gen_server:cast(?MODULE, X). 100 | 101 | %% QUERIES 102 | %% ----------- 103 | 104 | %% @equiv closest_to(NodeID, 8) 105 | -spec closest_to(dht:id()) -> list(dht:peer()). 106 | closest_to(NodeID) -> closest_to(NodeID, ?MAX_RANGE_SZ). 107 | 108 | %% @doc closest_to/2 returns the neighborhood around an ID known to the routing table 109 | %% @end 110 | -spec closest_to(dht:id(), pos_integer()) -> list(dht:peer()). 111 | closest_to(NodeID, NumNodes) -> 112 | call({closest_to, NodeID, NumNodes}). 113 | 114 | %% @doc Return this node id as an integer. 115 | %% Node ids are generated in a random manner. 116 | -spec node_id() -> dht:id(). 117 | node_id() -> 118 | gen_server:call(?MODULE, node_id). 
119 | 120 | %% OPERATIONS WHICH CHANGE STATE 121 | %% ------------------------------ 122 | request_success(Node, Opts) -> 123 | case call({insert_node, Node, Opts}) of 124 | ok -> ok; 125 | not_inserted -> not_inserted; 126 | already_member -> 127 | cast({request_success, Node, Opts}), 128 | already_member; 129 | {error, Reason} -> {error, Reason}; 130 | {verify, QNode} -> 131 | dht_refresh:verify(QNode, Node, Opts), 132 | ok 133 | end. 134 | 135 | request_timeout(Node) -> 136 | cast({request_timeout, Node}). 137 | 138 | %% INTERNAL API 139 | %% ------------------------------------------------------------------- 140 | 141 | %% @private 142 | %% sync/0 is used to make sure all async processing in the state process has been done. 143 | %% Its only intended use is when we are testing the code in the process. 144 | sync() -> 145 | gen_server:call(?MODULE, sync). 146 | 147 | %% CALLBACKS 148 | %% ------------------------------------------------------------------- 149 | 150 | %% @private 151 | init([RequestedNodeID, StateFile, BootstrapNodes]) -> 152 | %% For now, we trap exits which ensures the state table is dumped upon termination 153 | %% of the process. 154 | %% @todo lift this restriction. Periodically dump state, but don't do it if an 155 | %% invariant is broken for some reason 156 | %% erlang:process_flag(trap_exit, true), 157 | 158 | RoutingTbl = load_state(RequestedNodeID, StateFile), 159 | 160 | %% @todo, consider just folding over these as well rather than a background insert. 161 | ok = dht_refresh:insert_nodes(BootstrapNodes), 162 | 163 | {ok, ID, Routing} = dht_routing_meta:new(RoutingTbl), 164 | {ok, #state { node_id = ID, routing = Routing}}. 
165 | 166 | %% @private 167 | handle_call({insert_node, Node, _Opts}, _From, #state { routing = R } = State) -> 168 | {Reply, NR} = insert_node_internal(Node, R), 169 | {reply, Reply, State#state { routing = NR }}; 170 | handle_call({closest_to, ID, NumNodes}, _From, #state{routing = Routing } = State) -> 171 | Neighbors = dht_routing_meta:neighbors(ID, NumNodes, Routing), 172 | {reply, Neighbors, State}; 173 | handle_call(dump_state, From, State) -> 174 | handle_call({dump_state, get_current_statefile()}, From, State); 175 | handle_call({dump_state, StateFile}, _From, #state{ routing = Routing } = State) -> 176 | try 177 | Tbl = dht_routing_meta:export(Routing), 178 | dump_state(StateFile, Tbl), 179 | {reply, ok, State} 180 | catch 181 | Class:Err -> 182 | {reply, {error, {dump_state_failed, Class, Err}}, State} 183 | end; 184 | handle_call(node_list, _From, #state { routing = Routing } = State) -> 185 | {reply, dht_routing_meta:node_list(Routing), State}; 186 | handle_call(node_id, _From, #state{ node_id = Self } = State) -> 187 | {reply, Self, State}; 188 | handle_call(info, _From, #state { routing = Routing } = State) -> 189 | {reply, Routing, State}; 190 | handle_call(sync, _From, State) -> 191 | {reply, ok, State}. 
192 | 193 | %% @private 194 | handle_cast({request_timeout, Node}, #state{ routing = Routing } = State) -> 195 | case dht_routing_meta:member_state(Node, Routing) of 196 | unknown -> {noreply, State}; 197 | roaming_member -> {noreply, State}; 198 | member -> 199 | R = dht_routing_meta:node_timeout(Node, Routing), 200 | {noreply, State#state { routing = R }} 201 | end; 202 | handle_cast({request_success, Node, Opts}, #state{ routing = R } = State) -> 203 | case dht_routing_meta:member_state(Node, R) of 204 | unknown -> 205 | {_, NR} = insert_node_internal(Node, R), 206 | {noreply, State#state { routing = NR }}; 207 | roaming_member -> {noreply, State}; 208 | member -> 209 | NR = dht_routing_meta:node_touch(Node, Opts, R), 210 | {noreply, State#state { routing = NR }} 211 | end; 212 | handle_cast(_, State) -> 213 | {noreply, State}. 214 | 215 | %% @private 216 | %% The timer {inactive_range, Range} is set by the dht_routing module 217 | handle_info({inactive_range, Range}, #state{ routing = Routing } = State) -> 218 | case dht_routing_meta:range_state(Range, Routing) of 219 | {error, not_member} -> 220 | {noreply, State}; 221 | ok -> 222 | R = dht_routing_meta:reset_range_timer(Range, #{ force => false}, Routing), 223 | {noreply, State#state { routing = R }}; 224 | empty -> 225 | R = dht_routing_meta:reset_range_timer(Range, #{ force => true}, Routing), 226 | {noreply, State#state { routing = R }}; 227 | {needs_refresh, Member} -> 228 | R = dht_routing_meta:reset_range_timer(Range, #{ force => true }, Routing), 229 | %% Create a monitor on the process, so we can handle the state of our 230 | %% background worker. 231 | ok = dht_refresh:range(Member), 232 | {noreply, State#state { routing = R }} 233 | end; 234 | handle_info({stop, Caller}, #state{} = State) -> 235 | Caller ! stopped, 236 | {stop, normal, State}. 
237 | 238 | %% @private 239 | terminate(_Reason, #state{ routing = _Routing }) -> 240 | %%error_logger:error_report({exiting, Reason}), 241 | %%dump_state(StateFile, dht_routing_meta:export(Routing)) 242 | ok. 243 | 244 | %% @private 245 | code_change(_, State, _) -> 246 | {ok, State}. 247 | 248 | %% 249 | %% INTERNAL FUNCTIONS 250 | %% 251 | 252 | get_current_statefile() -> 253 | {ok, SF} = application:get_env(dht, state_file), 254 | SF. 255 | 256 | %% adjoin/2 attempts to add a new `Node' to the `Routing'-table. 257 | %% It either succeeds in doing so, or fails due to one of the many reasons 258 | %% as to why it can't happen: 259 | %% • The range is already full of nodes 260 | %% • There is room for a new node 261 | %% • We can replace a bad node with the new node 262 | %% • We fail to adjoin the new node, but can point to a neighboring node which 263 | %% needs verification. 264 | adjoin(Node, Routing) -> 265 | Near = dht_routing_meta:range_members(Node, Routing), 266 | case analyze_range(dht_routing_meta:node_state(Near, Routing)) of 267 | {[], [], Gs} when length(Gs) == ?MAX_RANGE_SZ -> 268 | %% Already enough nodes in that range/bucket 269 | case dht_routing_meta:insert(Node, Routing) of 270 | {ok, NewRouting} -> {ok, NewRouting}; 271 | not_inserted -> {not_inserted, Routing} 272 | end; 273 | {Bs, Qs, Gs} when length(Gs) + length(Bs) + length(Qs) < ?MAX_RANGE_SZ -> 274 | %% There is room for the new node, insert it 275 | {ok, _NewRouting} = dht_routing_meta:insert(Node, Routing); 276 | {[Bad | _], _, _} -> 277 | %% There is a bad node present, swap the new node for the bad 278 | {ok, _NewRouting} = dht_routing_meta:replace(Bad, Node, Routing); 279 | {[], [Questionable | _], _} -> 280 | %% Ask the caller to verify the questionable node (they are sorted in order of interestingness) 281 | {{verify, Questionable}, Routing} 282 | end. 283 | 284 | %% Given a list of node states, sort them into good, bad, and questionable buckets. 
285 | %% In the case of the questionable nodes, they are sorted oldest first, so the list head 286 | %% is the one you want to analyze. 287 | %% 288 | %% In addition we sort the bad nodes, so there is a specific deterministic order in which 289 | %% we consume them. 290 | analyze_range(Nodes) when length(Nodes) =< ?MAX_RANGE_SZ -> 291 | analyze_range(lists:sort(Nodes), [], [], []). 292 | 293 | analyze_range([], Bs, Qs, Gs) -> 294 | SortedQs = [N || {N, _} <- lists:keysort(2, lists:reverse(Qs))], 295 | {lists:reverse(Bs), SortedQs, lists:reverse(Gs)}; 296 | analyze_range([{B, bad} | Next], Bs, Qs, Gs) -> analyze_range(Next, [B | Bs], Qs, Gs); 297 | analyze_range([{G, good} | Next], Bs, Qs, Gs) -> analyze_range(Next, Bs, Qs, [G | Gs]); 298 | analyze_range([{_, {questionable, _}} = Q | Next], Bs, Qs, Gs) -> analyze_range(Next, Bs, [Q | Qs], Gs). 299 | 300 | %% 301 | %% DISK STATE 302 | %% ---------------------------------- 303 | 304 | dump_state(no_state_file, _) -> ok; 305 | dump_state(Filename, RoutingTable) -> 306 | ok = file:write_file(Filename, term_to_binary(RoutingTable, [compressed])). 307 | 308 | load_state(RequestedNodeID, {no_state_file, L, H}) -> 309 | dht_routing_table:new(RequestedNodeID, L, H); 310 | load_state(RequestedNodeID, no_state_file) -> 311 | dht_routing_table:new(RequestedNodeID); 312 | load_state(RequestedNodeID, Filename) -> 313 | case file:read_file(Filename) of 314 | {ok, BinState} -> 315 | binary_to_term(BinState); 316 | {error, enoent} -> 317 | dht_routing_table:new(RequestedNodeID) 318 | end. 319 | 320 | insert_node_internal(Node, R) -> 321 | case dht_routing_meta:member_state(Node, R) of 322 | member -> {already_member, R}; 323 | roaming_member -> {already_member, R}; 324 | unknown -> adjoin(Node, R) 325 | end. 
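%% Worked example for analyze_range/1 (a sketch; the atoms a..d stand in for
%% dht:peer() tuples, the numbers for last-activity timestamps, and ?MAX_RANGE_SZ
%% is assumed to be at least 4):
%%   analyze_range([{a, good}, {b, bad}, {c, {questionable, 10}}, {d, {questionable, 5}}])
%% returns
%%   {[b], [d, c], [a]}
%% The result is ordered {Bad, Questionable, Good}, with d ahead of c because its
%% timestamp is older.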
326 | -------------------------------------------------------------------------------- /src/dht_store.erl: -------------------------------------------------------------------------------- 1 | %%% @doc Store a subset of the DHT at this node 2 | %% 3 | %% This module keeps track of our subset of the DHT node in the system. 4 | %% 5 | %%% @end 6 | %%% @private 7 | -module(dht_store). 8 | -behaviour(gen_server). 9 | -include("dht_constants.hrl"). 10 | -include_lib("stdlib/include/ms_transform.hrl"). 11 | 12 | %% lifetime API 13 | -export([start_link/0, sync/0]). 14 | 15 | %% Operational API 16 | -export([store/2, find/1]). 17 | 18 | -export([info/0]). 19 | 20 | %% gen_server API 21 | -export([ 22 | init/1, 23 | handle_cast/2, 24 | handle_call/3, 25 | terminate/2, 26 | code_change/3, 27 | handle_info/2 28 | ]). 29 | 30 | -define(TBL, ?MODULE). 31 | 32 | -record(state, { tbl }). 33 | 34 | %% API 35 | start_link() -> 36 | gen_server:start_link({local, ?MODULE}, ?MODULE, [], []). 37 | 38 | sync() -> 39 | gen_server:call(?MODULE, sync). 40 | 41 | store(ID, {IP, Port}) -> 42 | gen_server:call(?MODULE, {store, ID, {IP, Port}}). 43 | 44 | find(ID) -> 45 | gen_server:call(?MODULE, {find, ID}). 46 | 47 | info() -> 48 | L = ets:tab2list(?TBL), 49 | [#{ id => ID, peer => Peer, inserted => Ins } || {ID, Peer, Ins} <- L]. 50 | 51 | %% Callbacks 52 | init([]) -> 53 | Tbl = ets:new(?TBL, [named_table, protected, bag]), 54 | dht_time:send_after(5 * 60 * 1000, ?MODULE, evict), 55 | {ok, #state { tbl = Tbl }}. 56 | 57 | handle_call({store, ID, Loc}, _From, State) -> 58 | push(ID, Loc), 59 | {reply, ok, State}; 60 | handle_call({find, Key}, _From, State) -> 61 | evict(Key), 62 | Peers = ets:match(?TBL, {Key, '$1', '_'}), 63 | {reply, [Loc || [Loc] <- Peers], State}; 64 | handle_call(sync, _From, State) -> 65 | {reply, ok, State}; 66 | handle_call(_Msg, _From, State) -> 67 | {reply, {error, unknown_msg}, State}. 68 | 69 | handle_cast(_Msg, State) -> 70 | {noreply, State}. 
71 | 72 | handle_info(evict, State) -> 73 | dht_time:send_after(5 * 60 * 1000, ?MODULE, evict), 74 | evict(), 75 | {noreply, State}; 76 | handle_info(_Msg, State) -> 77 | {noreply, State}. 78 | 79 | terminate(_How, _State) -> 80 | ok. 81 | 82 | code_change(_OldVsn, State, _Extra) -> 83 | {ok, State}. 84 | 85 | evict() -> 86 | Now = dht_time:monotonic_time(), 87 | Window = Now - dht_time:convert_time_unit(?STORE_TIME, milli_seconds, native), 88 | MS = ets:fun2ms(fun({_, _, T}) -> T < Window end), 89 | ets:select_delete(?TBL, MS). 90 | 91 | evict(Key) -> 92 | Now = dht_time:monotonic_time(), 93 | Window = Now - dht_time:convert_time_unit(?STORE_TIME, milli_seconds, native), 94 | MS = ets:fun2ms(fun({K, _, T}) -> K == Key andalso T < Window end), 95 | ets:select_delete(?TBL, MS). 96 | 97 | push(ID, Loc) -> 98 | Now = dht_time:monotonic_time(), 99 | %% 100 | %% Only match expected output. We crash if we break the invariant by any means 101 | %% 102 | case ets:match_object(?TBL, {ID, Loc, '_'}) of 103 | [] -> 104 | ets:insert(?TBL, {ID, Loc, Now}), 105 | ok; 106 | [E] -> 107 | ets:delete_object(?TBL, E), 108 | ets:insert(?TBL, {ID, Loc, Now}), 109 | ok 110 | end. 111 | 112 | 113 | 114 | -------------------------------------------------------------------------------- /src/dht_sup.erl: -------------------------------------------------------------------------------- 1 | %%% @doc Main supervisor for the DHT code 2 | %%% @end 3 | %%% @private 4 | -module(dht_sup). 5 | -behaviour(supervisor). 6 | -export([start_link/0]). 7 | 8 | % supervisor callbacks 9 | -export([init/1]). 10 | 11 | -dialyzer({nowarn_function, [init/1]}). 12 | %% ------------------------------------------------------------------ 13 | 14 | start_link() -> 15 | supervisor:start_link({local, dht_sup}, ?MODULE, []). 
16 | 17 | %% ------------------------------------------------------------------ 18 | init([]) -> 19 | {ok, Port} = application:get_env(dht, port), 20 | {ok, StateFile} = application:get_env(dht, state_file), 21 | {ok, BootstrapNodes} = application:get_env(dht, bootstrap_nodes), 22 | 23 | Store = #{ id => store, 24 | start => {dht_store, start_link, []} }, 25 | State = #{ id => state, 26 | start => {dht_state, start_link, [StateFile, BootstrapNodes]} }, 27 | Network = #{ id => network, 28 | start => {dht_net, start_link, [Port]} }, 29 | Tracker = #{ id => tracker, 30 | start => {dht_track, start_link, []} }, 31 | {ok, 32 | {#{ strategy => rest_for_one, 33 | intensity => 5, 34 | period => 900 }, 35 | [Store, State, Network, Tracker]}}. 36 | 37 | %% ------------------------------------------------------------------ 38 | -------------------------------------------------------------------------------- /src/dht_time.erl: -------------------------------------------------------------------------------- 1 | %%% @doc Module dht_time proxies Erlang time calls 2 | %%% 3 | %%% The purpose of this module is to provide a proxy to typical Erlang/OTP 4 | %%% time calls. It allows us to mock time in the test cases and keeps the 5 | %%% interface local. It can also become a backwards-compatibility layer 6 | %%% for release 17 and earlier. 7 | %%% @end 8 | %%% @private 9 | -module(dht_time). 10 | 11 | -export([monotonic_time/0, convert_time_unit/3, time_offset/0]). 12 | -export([send_after/3, read_timer/1, cancel_timer/1]). 13 | 14 | -spec monotonic_time() -> integer(). 15 | monotonic_time() -> 16 | erlang:monotonic_time(). 17 | 18 | time_offset() -> 19 | erlang:time_offset(). 20 | 21 | send_after(Time, Target, Msg) -> 22 | erlang:send_after(Time, Target, Msg). 23 | 24 | read_timer(TRef) -> 25 | erlang:read_timer(TRef). 26 | 27 | cancel_timer(TRef) -> 28 | erlang:cancel_timer(TRef). 29 | 30 | -spec convert_time_unit(integer(), erlang:time_unit(), erlang:time_unit()) -> integer(). 
31 | convert_time_unit(T, From, To) -> 32 | erlang:convert_time_unit(T, From, To). 33 | -------------------------------------------------------------------------------- /src/dht_track.erl: -------------------------------------------------------------------------------- 1 | %%% @doc Track entries in the DHT for an Erlang node 2 | %% 3 | %% Since you have to track entries in the DHT and you have to refresh 4 | %% them at predefined times, we have this module, which makes sure to 5 | %% refresh stored values in the DHT. You can also delete values from 6 | %% the DHT again by calling the `delete/1' method on the stored values 7 | %% 8 | %%% @end 9 | %%% @private 10 | -module(dht_track). 11 | -behaviour(gen_server). 12 | -include("dht_constants.hrl"). 13 | 14 | %% lifetime API 15 | -export([start_link/0, sync/0]). 16 | 17 | %% Operational API 18 | -export([store/2, lookup/1, delete/1]). 19 | -export([info/0]). 20 | 21 | %% gen_server API 22 | -export([ 23 | init/1, 24 | handle_cast/2, 25 | handle_call/3, 26 | terminate/2, 27 | code_change/3, 28 | handle_info/2 29 | ]). 30 | 31 | -record(state, { tbl = #{} }). 32 | 33 | %% LIFETIME & MANAGEMENT 34 | %% ---------------------------- 35 | start_link() -> 36 | gen_server:start_link({local, ?MODULE}, ?MODULE, [], []). 37 | 38 | sync() -> 39 | gen_server:call(?MODULE, sync). 40 | 41 | %% API 42 | %% ------------------------ 43 | 44 | store(ID, Location) -> call({store, ID, Location}). 45 | 46 | delete(ID) -> call({delete, ID}). 47 | 48 | lookup(ID) -> call({lookup, ID}). 49 | 50 | info() ->call(info). 51 | 52 | call(Msg) -> 53 | gen_server:call(?MODULE, Msg). 54 | 55 | 56 | %% CALLBACKS 57 | %% ------------------------- 58 | 59 | init([]) -> 60 | {ok, #state { tbl = #{} }}. 61 | 62 | handle_call({lookup, ID}, _From, #state { tbl = T } = State) -> 63 | {reply, maps:get(ID, T, not_found), State}; 64 | handle_call({store, ID, Location}, _From, #state { tbl = T } = State) -> 65 | self() ! 
{refresh, ID, Location}, 66 | {reply, ok, State#state { tbl = T#{ ID => Location } }}; 67 | handle_call({delete, ID}, _From, #state { tbl = T } = State) -> 68 | {reply, ok, State#state { tbl = maps:remove(ID, T)}}; 69 | handle_call(sync, _From, State) -> 70 | {reply, ok, State}; 71 | handle_call(info, _From, #state { tbl = T } = State) -> 72 | {reply, T, State}; 73 | handle_call(_Msg, _From, State) -> 74 | {reply, ok, State}. 75 | 76 | handle_cast(_Msg, State) -> 77 | {noreply, State}. 78 | 79 | handle_info({refresh, ID, Location}, #state { tbl = T } = State) -> 80 | case maps:get(ID, T, undefined) of 81 | undefined -> 82 | %% Deleted entry, don't do anything 83 | {noreply, State}; 84 | Location -> 85 | %% TODO: When refresh fails, we should track what failed here and then report it 86 | %% back to the process for which we store the entries. But this is for a later extension 87 | %% of the system. 88 | ok = refresh(ID, Location), 89 | dht_time:send_after(?REFRESH_TIME, ?MODULE, {refresh, ID, Location}), 90 | {noreply, State} 91 | end; 92 | handle_info(_Msg, State) -> 93 | {noreply, State}. 94 | 95 | terminate(_How, _State) -> 96 | ok. 97 | 98 | code_change(_OldVsn, State, _Extra) -> 99 | {ok, State}. 100 | 101 | %% INTERNAL FUNCTIONS 102 | %% ------------------------------ 103 | 104 | refresh(ID, Location) -> 105 | #{ store := StoreCandidates } = dht_search:run(find_value, ID), 106 | Stores = pick(?STORE_COUNT, ID, StoreCandidates), 107 | store_at_peers(Stores, ID, Location). 108 | 109 | store_at_peers(STS, ID, Location) -> 110 | _ = [dht_net:store({IP, Port}, Token, ID, Location) || {{_ID, IP, Port}, Token} <- STS], 111 | ok. 112 | 113 | pick(K, ID, Candidates) -> 114 | Ordered = lists:sort(fun({{IDx, _, _}, _}, {{IDy, _, _}, _}) -> dht_metric:d(ID, IDx) < dht_metric:d(ID, IDy) end, Candidates), 115 | take(K, Ordered). 116 | 117 | take(0, _Rest) -> []; 118 | take(_K, []) -> []; 119 | take(K, [C | Cs]) -> [C | take(K-1, Cs)]. 
120 | -------------------------------------------------------------------------------- /test/Makefile: -------------------------------------------------------------------------------- 1 | compile: 2 | erl -make 3 | 4 | clean: 5 | rm *.beam 6 | -------------------------------------------------------------------------------- /test/README.md: -------------------------------------------------------------------------------- 1 | # TODO List of things still needing attention 2 | 3 | * The dht_search code needs an EQC model 4 | * The dht_store code can be joined into a cluster. We better join it into a cluster :) 5 | * It should be possible to build a cluster of the state and net code at the same time. There is nothing which makes this impossible in the system. But we have yet to make this happen. 6 | 7 | # Tests and specification of the DHT 8 | 9 | This directory contains tests for the DHT system. The tests are captured in the module `dht_SUITE` and running tests will run this suite. The way to run these is from the top-level: 10 | 11 | rebar3 ct 12 | 13 | Every test in this project is a QuickCheck test and requires Erlang QuickCheck. 14 | 15 | This file contains most of the prose written in order to understand what the specification of the system is. Each module contains further low-level documentation. Do *NOT* rely too much on the text in this file, but refer to the actual specification when in doubt. It is precise, whereas this text is not. For instance, the tests can make the discrimination between `L < 8` and `L ≤ 8` which is harder to manage with text. 16 | 17 | # Test strategy 18 | 19 | To make sure the code works as expected, we split the tests into QuickCheck components and start by verifying each component on its own. Once we have individual verification, we cluster the components using the Erlang QuickCheck `eqc_cluster` feature. In turn, this allows us to handle the full specification from simpler parts.
20 | 21 | For each Erlang module, we define a *component* which has a *mocking* *specification* for every call it makes to the outside. This allows us to isolate each module on its own and test it without involving the specification of other modules. We also often overspecify what the module accepts to make sure it works even if the underlying modules change a bit. 22 | 23 | Once we have all components, we assemble them into a cluster. Now, the mocking specifications are replaced with actual models. In turn, a callout to a mock can now be verified by the preconditions of the underlying model. This ensures the composition of components is safe. 24 | 25 | # Test Model 26 | 27 | In the real system, the ID-space used is 160 bit (SHA-1) or 256 bit (SHA-256). In our test case, in order to hit limits faster, we pick an ID-space of 7 bit (128 positions). This allows us to find corner cases much faster and it is also more likely we hit counterexamples earlier. 28 | 29 | # Specification 30 | 31 | In order to make sure the code performs as defined in the DHT BitTorrent BEP 0005 spec, this code defines a model of the specification which is executed against the DHT code. Using QuickCheck, random test cases are generated from the specification and are then used as tests. Usually, 100 randomly generated test cases are enough to uncover errors in the system, but before releases we run test cases for much larger counts: hundreds of thousands to millions. 32 | 33 | There are many modules, each covering part of what the DHT is doing. 34 | 35 | The DHT system roughly contains three parts: 36 | 37 | * The Storage, storing values which are being advertised into the DHT 38 | * The Network stack, handling all network communication, protocol handling and request/reply proxying. 39 | * The State system, tracking the routing table of the DHT known to the particular node. 40 | 41 | ## Metric 42 | 43 | The Kademlia DHT relies on a “distance function”, which is `XOR`.
The claim is that this is a metric, which has well-defined mathematical properties. We verify this to be the case for the domain on which the function is defined, while in the original paper, it is just mentioned in passing. By stating that we require a metric, we can later swap the metric for another one. 44 | 45 | ## Randomness 46 | 47 | The specification mentions the need for picking random values. In a QuickCheck test we don't want picked values to be random, as we want to know what value we picked. Thus we provide a model for randomness and mock it through `dht_rand`. 48 | 49 | ## Time 50 | 51 | Time is often a problem in models because the system runs on a different time scale when it is tested. We solve this by requiring all time calls to factor through the module `dht_time`, which we then promptly mock. This means the models can contain their own notion of time and when the SUT requests the time, we can control it. The same is true for setting and cancelling timers. 52 | 53 | *NOTE:* We use a time resolution of `milli_seconds` for the models and we assume this is the “native” time resolution. This is easier to handle than the nano-second resolution which the system uses by default. 54 | 55 | All time in the system is `monotonic_time()`. This makes all of the code time-warp-safe. 56 | 57 | ## Routing table 58 | 59 | Routing IDs are positive integers. They have a digital bit-representation as lists of bits. These bits form a binary tree structure. The routing table is a “path” in such a binary tree, considering a prefix of the bit-string. The rules are: 60 | 61 | * Leaves have at most 8 nodes 62 | * If inserting a new node into a leaf of size 8, there are two cases: 63 | * Our own ID has a common prefix `P` with the leaf: Partition the leaf into nodes with prefix followed by a `0` bit: `P0` and prefix followed by a `1` bit: `P1`. These become the new leaves. One of these leaves contains our own ID, whereas the other does not.
This ensures only one subtree will split later on. 64 | * Our own ID does not share a common prefix with the leaf: reject the insertion. 65 | 66 | In turn, the routing table is a “spine” in the tree only where the path mimics our own ID. As a result, the routing table stores more elements close to ourselves rather than far away from ourselves. 67 | 68 | When there are 3 bits left in the suffix, the tree will not expand further since you can at most have 8 nodes in a leaf. This MUST hold, but it only holds if equality is on the NodeID. Example: In a 160 bit ID space, if you have a prefix of 157 bits in common with our own ID, there can at most be 3 bits which can vary. Our own ID is one of those 8 combinations, so there are 7 nodes left. They will fill the leaf, but it is impossible to insert more nodes into the table. 69 | 70 | ### Modeling the routing table 71 | 72 | We use a simple `map({Lo, Hi}, [Node])` as the model of the routing table in our component. `Lo` and `Hi` are the bounds on the range: all elements `X` in the range have an id such that `Lo ≤ id(X) < Hi`. Most operations are straightforward to define since we can just walk all of the range and filter it by predicate functions which pick the wanted elements. 73 | 74 | The `closest_to` call looks hard to implement, but its formal specification is straightforward: sort all nodes based on the distance to the desired ID, and pick the first K of those nodes. If we canonicalize the output by sorting it, we 75 | obtain a correct specification. 76 | 77 | ### Node reachability 78 | 79 | Nodes in the routing table are either "reachable" or not. Nodes which are "reachable" are nodes which are not thought to be behind a firewall, which means they can be contacted at any time. 80 | 81 | Some commands, touching nodes, inserting nodes into the node table and so on, exist in two variants, one where we know about reachability and one where we don't. Consider a query, originating from us to a peer, and that peer responding.
This gives a valid reachability notion for that node. On the other hand, when we receive a query from a peer, it is not by default reachable, so we update the node with a notion of a non-reachable state. 82 | 83 | ## Routing table metadata 84 | 85 | The routing table is wrapped into a module tracking “meta-data” which is timer information associated with each entry in the routing table. There are two rules: 86 | 87 | * For each node in the routing table, we know the timing state of that node 88 | * For each leaf in the routing table, we know its timing state 89 | 90 | This is verified to be the case, by mandating each insertion also provides a corresponding meta-data entry. 91 | 92 | A node can be in one of three possible states: 93 | 94 | * Good: we have successfully communicated with the node recently. 95 | * Questionable: we have not communicated with the node recently, and we don't know its current state. 96 | * Bad: communication with the node has failed repeatedly. 97 | 98 | Of course, “recently” and “repeatedly” have to be defined. We define the timeout limit to be 15 minutes and we define that nodes which have failed MORE THAN 1 time are bad nodes. These numbers could be changed, but they are not well-defined. 99 | 100 | We don't track the Node state with a timer. Rather, we note the last successful point of communication. This means a simple calculation defines when a node becomes questionable, and thus needs a refresh. 101 | 102 | The reason we do it this way is that all refreshes of nodes are event-driven: the node is refreshed whenever it communicates or when we want to use that node. So if we know the point-in-time for last communication, we can determine its internal state, be it `good`, `questionable`, or `bad`.
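
The state calculation described above can be sketched as a pure function. This is only an illustration under stated assumptions, not the code in `dht_routing_meta`: the module name `node_state_sketch`, the macros `NODE_TIMEOUT_MS` and `MAX_TIMEOUTS`, the use of plain milliseconds, and the choice to check for `bad` before looking at the age are all hypothetical. The 15 minute limit and the "more than 1 failure" rule are taken from the prose.

```erlang
-module(node_state_sketch).
-export([node_state/3]).

%% Hypothetical constants mirroring the prose: 15 minute timeout,
%% and a node is bad once it has failed MORE THAN 1 time.
-define(NODE_TIMEOUT_MS, (15 * 60 * 1000)).
-define(MAX_TIMEOUTS, 1).

%% node_state(LastActivity, TimeoutCount, Now)
%% LastActivity and Now are points in time, here assumed to be in
%% milliseconds on the same monotonic clock.
-spec node_state(integer(), non_neg_integer(), integer()) ->
    good | questionable | bad.
node_state(_LastActivity, TimeoutCount, _Now)
  when TimeoutCount > ?MAX_TIMEOUTS ->
    %% Assumption: repeated failure wins over any recent activity.
    bad;
node_state(LastActivity, _TimeoutCount, Now) ->
    %% age = Now - LastActivity; no timer is needed to derive the state.
    case Now - LastActivity > ?NODE_TIMEOUT_MS of
        true  -> questionable;
        false -> good
    end.
```

Because the function only depends on its arguments, a model that controls time through a mocked `dht_time` could evaluate it at any chosen point in time.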
103 | 104 | The meta-data tracks nodes with the following map: 105 | 106 | Node => #{ last_activity => time_point(), timeout_count => non_neg_integer(), reachability => boolean() } 107 | 108 | Where: 109 | 110 | * `last_activity` encodes when we last had (valid) activity with the node as a point in time. 111 | * `timeout_count` measures how many timeouts we've had since the last successful communication 112 | * `reachability` encodes whether the node has ever replied to one of our queries. If it has, then this field is true. We use the field to track which nodes are behind firewalls and which are not. 113 | 114 | This means that for any point in time τ, we can measure the age(T) = τ-T, and 115 | check if the age is larger than, say, ?NODE_TIMEOUT. This defines when a node is 116 | questionable. The timeout_count defines a bad node, if it grows too large. 117 | 118 | The meta-data tracks ranges as a single map `Range => #{ timer => timer_ref() }`. When this timer triggers, 119 | we can use the current members of the range and their last activity points. The maximal such value defines the current `age(lists:max(LastActivities))` of the range. 120 | 121 | Ranges can exist without any members in them. When this happens, the age of the range is always such that it needs refreshing. We do this by picking the age of the range to be `monotonic_time() - convert_time_unit(?NODE_TIMEOUT, milli_seconds, native)`, which forces the range to refresh. 122 | 123 | ## Nodes 124 | 125 | Nodes are defined as the triple `{id(), ip(), port()}`, where `id()` is the Node ID space (7 bits in test, 160 bits in Kademlia based on SHA-1, 256 bits if based on SHA-256). `port()` is the UDP port number, in the usual 0–65535 range. Currently `ip()` is an `ipv4_address()`, but this needs extension to `ipv6_addresses()`. 126 | 127 | ### Node Equality 128 | 129 | TODO: Discussion needed here. Equality is either on the ID or on the Node.
The important thing to understand is how this affects roaming nodes which change address/port but not the ID. How will this affect the routing table and such? 130 | 131 | • Equality on NodeID means we can handle the deep end of the routing table easily. 132 | 133 | TODO 134 | 135 | ## Ranges 136 | 137 | TODO 138 | 139 | Ranges have an age. Suppose we pick the ages of all nodes in the range and sort them, smallest age first. Then the age of the range is the head of that sorted list. A range is refreshable if its age is larger than 15 minutes. When this happens, we pick a random node from the range and run a FIND_NODE call on it. Once we have a list of nodes near the randomly picked one, we insert all of these into the routing table. This in turn forces out every bad node, and swaps them with new good nodes. 140 | 141 | The assumption is that for normal operation, it is relatively rare that a range sees no traffic of any kind for 15 minutes. In other words, a normal, communicating client will not have to refresh ranges very often. But a client who is behind a firewall will have to rely on refreshes a lot in order to keep its routing table alive. 142 | 143 | Note that refreshing doesn't evict any nodes by itself. So if the network is down, the routing table will not change. This property is desirable, because it means we don't lose the routing table if our system can't connect to the network for a while. 144 | 145 | # Networking 146 | 147 | TODO 148 | -------------------------------------------------------------------------------- /test/dht_SUITE.erl: -------------------------------------------------------------------------------- 1 | -module(dht_SUITE). 2 | -include_lib("common_test/include/ct.hrl"). 3 | -include_lib("eqc/include/eqc_ct.hrl"). 4 | 5 | -compile(export_all). 6 | 7 | suite() -> 8 | [{timetrap, {seconds, 15}}]. 9 | 10 | init_per_group(_Group, Config) -> 11 | Config. 12 | 13 | end_per_group(_Group, _Config) -> 14 | ok.
15 | 16 | init_per_suite(Config) -> 17 | Config. 18 | 19 | end_per_suite(_Config) -> 20 | ok. 21 | 22 | init_per_testcase(_Case, Config) -> 23 | Config. 24 | 25 | end_per_testcase(_Case, _Config) -> 26 | ok. 27 | 28 | metric_group() -> [{metric, [shuffle], [ 29 | check_metric_refl, 30 | check_metric_sym, 31 | check_metric_triangle_ineq 32 | ]}]. 33 | 34 | state_group() -> [{state, [shuffle], [ 35 | check_routing_table, 36 | check_routing_meta, 37 | check_state, 38 | check_state_cluster 39 | ]}]. 40 | 41 | network_group() -> [{network, [shuffle], [ 42 | check_protocol_encoding 43 | ]}]. 44 | 45 | utility_group() -> [{utility, [shuffle], [ 46 | check_rand_pick 47 | ]}]. 48 | 49 | groups() -> 50 | lists:append([ 51 | utility_group(), 52 | metric_group(), 53 | state_group(), 54 | network_group() 55 | ]). 56 | 57 | all() -> 58 | [{group, utility}, 59 | {group, metric}, 60 | {group, state}, 61 | {group, network}]. 62 | 63 | %% TESTS 64 | %% ---------------------------------------------------------------------------- 65 | check_rand_pick(_Config) -> 66 | ?quickcheck((dht_rand_eqc:prop_pick())). 67 | 68 | check_metric_refl(_Config) -> 69 | ?quickcheck((dht_metric_eqc:prop_op_refl())). 70 | 71 | check_metric_sym(_Config) -> 72 | ?quickcheck((dht_metric_eqc:prop_op_sym())). 73 | 74 | check_metric_triangle_ineq(_Config) -> 75 | ?quickcheck((dht_metric_eqc:prop_op_triangle_ineq())). 76 | 77 | check_protocol_encoding(_Config) -> 78 | ?quickcheck((dht_proto_eqc:prop_iso_packet())). 79 | 80 | check_routing_table(_Config) -> 81 | ?quickcheck((dht_routing_table_eqc:prop_component_correct())). 82 | 83 | check_routing_meta(_Config) -> 84 | ?quickcheck((dht_routing_meta_eqc:prop_component_correct())). 85 | 86 | check_state(_Config) -> 87 | ?quickcheck((dht_state_eqc:prop_component_correct())). 88 | 89 | check_state_cluster(_Config) -> 90 | ?quickcheck((dht_state_eqc:prop_cluster_correct())). 
-------------------------------------------------------------------------------- /test/dht_cluster.erl: -------------------------------------------------------------------------------- 1 | -module(dht_cluster). 2 | 3 | -include_lib("eqc/include/eqc.hrl"). 4 | -include_lib("eqc/include/eqc_cluster.hrl"). 5 | 6 | -include("dht_eqc.hrl"). 7 | 8 | -compile(export_all). 9 | -define(DRIVER, dht_routing_tracker). 10 | 11 | components() -> [ 12 | dht_state_eqc, 13 | dht_net_eqc, 14 | dht_store_eqc, 15 | dht_routing_meta_eqc, 16 | dht_routing_table_eqc, 17 | dht_time_eqc 18 | ]. 19 | 20 | api_spec() -> api_spec(?MODULE). 21 | 22 | prop_cluster_correct() -> 23 | ?SETUP(fun() -> 24 | application:load(dht), 25 | eqc_mocking:start_mocking(api_spec(), components()), 26 | fun() -> ok end 27 | end, 28 | ?FORALL(ID, dht_eqc:id(), 29 | ?FORALL({MetaState, TableState, State}, 30 | {dht_routing_meta_eqc:gen_state(ID), 31 | dht_routing_table_eqc:gen_state(ID), 32 | dht_state_eqc:gen_state(ID)}, 33 | ?FORALL(Cmds, eqc_cluster:commands(?MODULE, [ 34 | {dht_state_eqc, State}, 35 | {dht_routing_meta_eqc, MetaState}, 36 | {dht_routing_table_eqc, TableState}]), 37 | begin 38 | ok = dht_state_eqc:reset(), 39 | ok = dht_net_eqc:reset(), 40 | ok = routing_table:reset(ID, ?ID_MIN, ?ID_MAX), 41 | {H,S,R} = run_commands(?MODULE, Cmds), 42 | pretty_commands(?MODULE, Cmds, {H,S,R}, 43 | aggregate(with_title('Commands'), command_names(Cmds), 44 | collect(eqc_lib:summary('Length'), length(Cmds), 45 | collect(eqc_lib:summary('Routing Table Size'), rt_size(S), 46 | aggregate(with_title('Features'), eqc_statem:call_features(H), 47 | features(eqc_statem:call_features(H), 48 | R == ok)))))) 49 | end)))). 50 | 51 | rt_size(Components) -> 52 | V = proplists:get_value(dht_routing_table_eqc, Components), 53 | length(dht_routing_table_eqc:current_nodes(V)). 54 | 55 | t() -> t(15). 56 | 57 | t(Secs) -> 58 | eqc:quickcheck(eqc:testing_time(Secs, eqc_statem:show_states(prop_cluster_correct()))). 
59 | 60 | recheck() -> 61 | eqc:recheck(eqc_statem:show_states(prop_cluster_correct())). 62 | 63 | cmds() -> 64 | ?LET(ID, dht_eqc:id(), 65 | ?LET({MetaState, TableState, State}, 66 | {dht_routing_meta_eqc:gen_state(ID), 67 | dht_routing_table_eqc:gen_state(ID), 68 | dht_state_eqc:gen_state(ID)}, 69 | eqc_cluster:commands(?MODULE, [ 70 | {dht_state_eqc, State}, 71 | {dht_routing_meta_eqc, MetaState}, 72 | {dht_routing_table_eqc, TableState}]))). 73 | 74 | sample() -> 75 | eqc_gen:sample(cmds()). -------------------------------------------------------------------------------- /test/dht_eqc.erl: -------------------------------------------------------------------------------- 1 | -module(dht_eqc). 2 | -compile(export_all). 3 | 4 | -include_lib("eqc/include/eqc.hrl"). 5 | 6 | -include("dht_eqc.hrl"). 7 | 8 | %% Generators 9 | id(Min, Max) -> choose(Min, Max). 10 | 11 | id() -> id(?ID_MIN, ?ID_MAX - 1). 12 | 13 | ip() -> 14 | oneof([ipv4_address(), 15 | ipv6_address()]). 16 | 17 | ipv4_address() -> 18 | ?LET(L, vector(4, choose(0, 255)), 19 | list_to_tuple(L)). 20 | 21 | ipv6_address() -> 22 | ?LET(L, vector(8, choose(0, 65535)), 23 | list_to_tuple(L)). 24 | 25 | port() -> 26 | choose(0, 1024*64 - 1). 27 | 28 | peer() -> 29 | {id(), ip(), port()}. 30 | 31 | endpoint() -> 32 | {ip(), port()}. 33 | 34 | tag() -> 35 | ?LET(ID, choose(0, 16#FFFF), 36 | <<ID:16>>). 37 | 38 | unique_id_pair() -> 39 | ?SUCHTHAT([X, Y], [id(), id()], 40 | X /= Y). 41 | 42 | range() -> ?LET([X, Y], unique_id_pair(), list_to_tuple(lists:sort([X,Y]))). 43 | 44 | token() -> binary(8). 45 | 46 | node_eq({X, _, _}, {Y, _, _}) -> X =:= Y. 47 | 48 | %% closer/2 generates IDs which are closer to a target 49 | %% Generate elements which are closer to the target T than the 50 | %% value X. 51 | closer(X, T) -> 52 | ?LET(Prefix, bit_prefix(<<X:?BITS>>, <<T:?BITS>>), 53 | ?LET(BS, bitstring(?BITS - bit_size(Prefix)), 54 | begin 55 | <<R:?BITS>> = <<Prefix/bitstring, BS/bitstring>>, 56 | R 57 | end)).
58 | 59 | %% bit_prefix/2 finds the prefix of two numbers 60 | %% Given bit_prefix(X, Tgt), find the common prefix amongst 61 | %% the two and extend it with one bit from Tgt. 62 | bit_prefix(<<>>, <<>>) -> <<>>; 63 | bit_prefix(<<0:1, Xs/bitstring>>, <<0:1, Ys/bitstring>>) -> 64 | Rest = bit_prefix(Xs, Ys), 65 | <<0:1, Rest/bitstring>>; 66 | bit_prefix(<<1:1, Xs/bitstring>>, <<1:1, Ys/bitstring>>) -> 67 | Rest = bit_prefix(Xs, Ys), 68 | <<1:1, Rest/bitstring>>; 69 | bit_prefix(_Xs, <<Y:1, _/bitstring>>) -> 70 | <<Y:1>>. 71 | 72 | 73 | -------------------------------------------------------------------------------- /test/dht_eqc.hrl: -------------------------------------------------------------------------------- 1 | %% For simplicity, the test picks a smaller range in which to 2 | %% place nodes. This not only makes it easier to test for corner cases, 3 | %% it also makes it easier to figure out what happens in the routing table. 4 | -define(BITS, 7). 5 | -define(ID_MIN, 0). 6 | -define(ID_MAX, 1 bsl ?BITS). 7 | 8 | %% Maximal size of a range is defined exactly as in the SUT 9 | -define(MAX_RANGE_SZ, 8). 10 | 11 | -------------------------------------------------------------------------------- /test/dht_metric_eqc.erl: -------------------------------------------------------------------------------- 1 | -module(dht_metric_eqc). 2 | 3 | -compile(export_all). 4 | 5 | -include_lib("eqc/include/eqc.hrl"). 6 | 7 | %% Metrics are defined by three rules: 8 | %% • reflexivity 9 | %% • symmetry 10 | %% • triangle inequality 11 | 12 | prop_op_refl() -> 13 | ?FORALL(X, dht_eqc:id(), 14 | dht_metric:d(X, X) == 0). 15 | 16 | prop_op_sym() -> 17 | ?FORALL({X, Y}, {dht_eqc:id(), dht_eqc:id()}, 18 | dht_metric:d(X, Y) == dht_metric:d(Y, X)). 19 | 20 | prop_op_triangle_ineq() -> 21 | ?FORALL({X, Y, Z}, {dht_eqc:id(), dht_eqc:id(), dht_eqc:id()}, 22 | dht_metric:d(X, Y) + dht_metric:d(Y, Z) >= dht_metric:d(X, Z)). 23 | 24 | %% Verify we can generate elements closer to a target in the metric.
25 | prop_closer() -> 26 | ?FORALL({X, T}, {dht_eqc:id(), dht_eqc:id()}, 27 | ?IMPLIES(X /= T, 28 | ?FORALL(Z, dht_eqc:closer(X, T), 29 | dht_metric:d(X, T) >= dht_metric:d(Z, T)))). 30 | -------------------------------------------------------------------------------- /test/dht_par_eqc.erl: -------------------------------------------------------------------------------- 1 | -module(dht_par_eqc). 2 | 3 | -compile(export_all). 4 | 5 | -include_lib("eqc/include/eqc.hrl"). 6 | -include_lib("pulse/include/pulse.hrl"). 7 | 8 | crasher() -> 9 | ?LET(F, function1(int()), 10 | fun 11 | (-1) -> timer:sleep(6000); %% Fail by timing out 12 | (0) -> exit(err); %% Fail by crashing 13 | (X) -> F(X) %% Run normally 14 | end). 15 | 16 | expected_result(F, Xs) -> 17 | case [X || X <- Xs, X == -1] of 18 | [] -> [case X of 0 -> {error, err}; N -> {ok, F(N)} end || X <- Xs]; 19 | [_|_] -> 20 | {'EXIT', pmap_timeout} 21 | end. 22 | 23 | prop_pmap() -> 24 | ?FORALL([F, Xs], [crasher(), list( frequency([ {10,0}, {1, -1}, {1000, nat()} ]) ) ], 25 | ?PULSE(Result, (catch dht_par:pmap(F, Xs)), 26 | begin 27 | Expected = expected_result(F, Xs), 28 | equals(Result, Expected) 29 | end)). 30 | 31 | pulse_instrument() -> 32 | [ pulse_instrument(File) || File <- filelib:wildcard("../src/dht_par.erl") ++ filelib:wildcard("../test/dht_par_eqc.erl") ], 33 | ok. 34 | 35 | pulse_instrument(File) -> 36 | io:format("Compiling: ~p~n", [File]), 37 | {ok, Mod} = compile:file(File, [{d, 'PULSE', true}, {parse_transform, pulse_instrument}]), 38 | code:purge(Mod), 39 | {module, Mod} = code:load_file(Mod), 40 | Mod. 41 | -------------------------------------------------------------------------------- /test/dht_proto_eqc.erl: -------------------------------------------------------------------------------- 1 | -module(dht_proto_eqc). 2 | -compile(export_all). 3 | 4 | -include_lib("eqc/include/eqc.hrl"). 5 | 6 | %% Generators 7 | q_ping() -> 8 | return(ping). 
9 | 10 | q_find_node()-> 11 | ?LET(ID, dht_eqc:id(), 12 | {find, node, ID}). 13 | 14 | q_find_value() -> 15 | ?LET(ID, dht_eqc:id(), 16 | {find, value, ID}). 17 | 18 | q_find() -> 19 | oneof([ 20 | q_find_node(), 21 | q_find_value() 22 | ]). 23 | 24 | q_store(Token) -> 25 | ?LET([ID, Port], [dht_eqc:id(), dht_eqc:port()], 26 | {store, Token, ID, Port}). 27 | 28 | q_store() -> 29 | ?LET(Token, dht_eqc:token(), 30 | q_store(Token)). 31 | 32 | q() -> 33 | q(oneof([q_ping(), q_find(), q_store()])). 34 | 35 | q(G) -> 36 | ?LET({Tag, ID, Query}, {dht_eqc:tag(), dht_eqc:id(), G}, 37 | {query, Tag, ID, Query}). 38 | 39 | r_ping() -> return(ping). 40 | 41 | r_find_node() -> 42 | ?LET({Rs, Token}, {list(dht_eqc:peer()), dht_eqc:token()}, 43 | {find, node, Token, Rs}). 44 | 45 | r_find_value() -> 46 | ?LET({Rs, Token}, {list(dht_eqc:endpoint()), dht_eqc:token()}, 47 | {find, value, Token, Rs}). 48 | 49 | r_find() -> 50 | oneof([r_find_node(), r_find_value()]). 51 | 52 | r_store() -> return(store). 53 | 54 | r_reply() -> 55 | oneof([r_ping(), r_find(), r_store()]). 56 | 57 | r() -> 58 | ?LET({Tag, ID, Reply}, {dht_eqc:tag(), dht_eqc:id(), r_reply()}, 59 | {response, Tag, ID, Reply}). 60 | 61 | e() -> 62 | ?LET({Tag, ID, Code, Msg}, {dht_eqc:tag(), dht_eqc:id(), nat(), binary()}, 63 | {error, Tag, ID, Code, Msg}). 64 | 65 | packet() -> 66 | oneof([q(), r(), e()]). 67 | 68 | %% Properties 69 | prop_iso_packet() -> 70 | ?FORALL(P, packet(), 71 | begin 72 | E = iolist_to_binary(dht_proto:encode(P)), 73 | equals(P, dht_proto:decode(E)) 74 | end). 75 | 76 | t() -> 77 | t(5). 78 | 79 | t(Secs) -> 80 | eqc:quickcheck( 81 | eqc:testing_time(Secs, eqc_statem:show_states( 82 | prop_iso_packet()))). 83 | -------------------------------------------------------------------------------- /test/dht_rand_eqc.erl: -------------------------------------------------------------------------------- 1 | -module(dht_rand_eqc). 2 | -include_lib("eqc/include/eqc.hrl"). 3 | 4 | -compile(export_all). 
5 | 6 | prop_pick() -> 7 | ?FORALL(Specimen, list(int()), 8 | case Specimen of 9 | [] -> equals(dht_rand:pick(Specimen), []); 10 | Is -> lists:member(dht_rand:pick(Specimen), Is) 11 | end). 12 | -------------------------------------------------------------------------------- /test/dht_routing_table_eqc.erl: -------------------------------------------------------------------------------- 1 | -module(dht_routing_table_eqc). 2 | 3 | -compile(export_all). 4 | 5 | -include_lib("eqc/include/eqc.hrl"). 6 | -include_lib("eqc/include/eqc_component.hrl"). 7 | 8 | -include("dht_eqc.hrl"). 9 | 10 | api_spec() -> 11 | #api_spec { 12 | language = erlang, 13 | modules = [] }. 14 | 15 | -record(state, 16 | { self, 17 | init = false, 18 | filter_fun = fun(_X) -> true end, 19 | tree = #{} }). 20 | 21 | %% Generators 22 | %% ---------- 23 | gen_state(ID) -> #state { self = ID }. 24 | initial_state() -> #state {}. 25 | 26 | initial_tree(Low, High) -> 27 | K = {Low, High}, 28 | #{ K => [] }. 29 | 30 | %% Construction of a new table 31 | %% -------------- 32 | 33 | new(Self, Min, Max) -> 34 | routing_table:reset(Self, Min, Max), 35 | 'ROUTING_TABLE'. 36 | 37 | new_pre(S) -> not initialized(S). 38 | 39 | new_args(#state { self = undefined }) -> [dht_eqc:id(), ?ID_MIN, ?ID_MAX]; 40 | new_args(#state { self = Self }) -> [Self, ?ID_MIN, ?ID_MAX]. 41 | 42 | new_pre(_S, _) -> true. 43 | 44 | new_return(_S, [_Self, _, _]) -> 'ROUTING_TABLE'. 45 | 46 | new_next(S, _, [Self, Min, Max]) -> S#state { self = Self, init = true, tree = initial_tree(Min, Max) }. 47 | 48 | new_callers() -> [dht_state_eqc]. 49 | 50 | new_features(_S, _, _) -> [{table, new}]. 51 | 52 | %% Do we have space for insertion of a node 53 | %% ----------------------------------------------- 54 | space(Node, _) -> 55 | routing_table:space(Node). 56 | 57 | space_pre(S) -> initialized(S). 58 | 59 | space_args(#state{}) -> 60 | [dht_eqc:peer(), 'ROUTING_TABLE']. 
61 | 62 | space_pre(#state { self = Self } = S, [{NodeID, _, _} = Node, _]) -> 63 | (not has_node(Node, S)) andalso has_space(Node, S) andalso (Self /= NodeID). 64 | 65 | space_callouts(S, [Node, _]) -> 66 | case may_split(S, Node, 7) of 67 | true -> ?RET(true); 68 | false -> ?RET(false) 69 | end. 70 | 71 | space_callers() -> [dht_routing_meta_eqc]. 72 | 73 | space_features(_State, [_, _], Return) -> [{table, {space, Return}}]. 74 | 75 | %% Insertion of new entries into the routing table 76 | %% ----------------------------------------------- 77 | insert(Node, _) -> 78 | routing_table:insert(Node). 79 | 80 | insert_callers() -> [dht_routing_meta_eqc, dht_state_eqc]. 81 | insert_pre(S) -> initialized(S). 82 | 83 | insert_args(#state {}) -> 84 | [dht_eqc:peer(), 'ROUTING_TABLE']. 85 | 86 | insert_pre(#state { self = Self } = S, [{NodeID, _, _} = Node, _]) -> 87 | (not has_node(Node, S)) andalso has_space(Node, S) andalso (Self /= NodeID). 88 | 89 | has_space(Node, #state { self = Self } = S) -> 90 | {{Lo, Hi}, Members} = find_range({node, Node}, S), 91 | case length(Members) of 92 | L when L < ?MAX_RANGE_SZ -> true; 93 | L when L == ?MAX_RANGE_SZ -> between(Lo, Self, Hi) 94 | end. 95 | 96 | insert_callouts(_S, [Node, _]) -> 97 | ?APPLY(insert_split_range, [Node, 7]), 98 | ?RET('ROUTING_TABLE'). 99 | 100 | insert_features(_State, [Node, _], _Return) -> 101 | Ns = routing_table:node_list(), 102 | case lists:member(Node, Ns) of 103 | true -> [{table, {insert, success}}]; 104 | false -> [{table, {insert, full_bucket}}] 105 | end. 106 | 107 | %% Ask the system for the current state table ranges 108 | %% ------------------------------------------------- 109 | ranges(_) -> 110 | routing_table:ranges(). 111 | 112 | ranges_callers() -> [dht_routing_meta_eqc]. 113 | 114 | ranges_pre(S) -> initialized(S). 115 | 116 | ranges_args(_S) -> ['ROUTING_TABLE']. 117 | 118 | %% Range validation is simple. The set of all ranges should form a contiguous 119 | %% space of split ranges. 
If it doesn't something is wrong. 120 | ranges_return(S, [_Dummy]) -> 121 | lists:sort(current_ranges(S)). 122 | 123 | ranges_features(_S, _A, _Res) -> [{table, ranges}]. 124 | 125 | %% Delete a node from the routing table 126 | %% If the node is not present, this is a no-op. 127 | %% ------------------------------------ 128 | delete(Node, _) -> 129 | routing_table:delete(Node). 130 | 131 | delete_callers() -> [dht_routing_meta_eqc]. 132 | 133 | delete_pre(S) -> initialized(S) andalso has_nodes(S). 134 | 135 | nonexisting_node(S) -> 136 | ?SUCHTHAT(Node, dht_eqc:peer(), not has_node(Node, S)). 137 | 138 | delete_args(S) -> 139 | Ns = current_nodes(S), 140 | Node = frequency( 141 | lists:append( 142 | [{1, nonexisting_node(S)}], 143 | [{10, elements(Ns)} || Ns /= [] ])), 144 | [Node, 'ROUTING_TABLE']. 145 | 146 | delete_next(#state { tree = TR } = S, _, [Node, _]) -> 147 | {Range, Members} = find_range({node, Node}, S), 148 | S#state { tree = TR#{ Range := Members -- [Node] } }. 149 | 150 | %% TODO: Fix this, as we have to return the routing table itself 151 | delete_return(_S, [_, _]) -> 'ROUTING_TABLE'. 152 | 153 | delete_features(S, [Node, _], _R) -> 154 | case has_node(Node, S) of 155 | true -> [{table, {delete, member}}]; 156 | false -> [{table, {delete, non_member}}] 157 | end. 158 | 159 | %% Ask for members of a given ID 160 | %% Currently, we only ask for existing members, but this could also fault-inject 161 | %% ----------------------------- 162 | members(Node, _) -> 163 | lists:sort(routing_table:members(Node)). 164 | 165 | nonexisting_id(IDs) -> 166 | ?SUCHTHAT({ID, _, _}, dht_eqc:peer(), 167 | not lists:member(ID, IDs)). 168 | 169 | members_callers() -> [dht_routing_meta_eqc]. 170 | 171 | members_pre(S) -> initialized(S). 172 | 173 | %% TODO: Ask for ranges here as well! 
174 | members_args(S) -> 175 | Ns = current_nodes(S), 176 | Rs = current_ranges(S), 177 | Arg = frequency( 178 | lists:append([ 179 | [{1, {node, nonexisting_id(ids(Ns))}}], 180 | [{10, {node, elements(Ns)}} || Ns /= [] ], 181 | [{5, {range, elements(Rs)}} || Rs /= [] ] 182 | ])), 183 | [Arg, 'ROUTING_TABLE']. 184 | 185 | members_pre(_S, [{node, _}, _]) -> true; 186 | members_pre(S, [{range, R}, _]) -> has_range(R, S); 187 | members_pre(_S, _) -> false. 188 | 189 | members_return(#state { tree = Tree }, [{range, R}, _]) -> 190 | {ok, Members} = maps:find(R, Tree), 191 | lists:sort(Members); 192 | members_return(S, [{node, Node}, _]) -> 193 | {_, Members} = find_range({node, Node}, S), 194 | lists:sort(Members). 195 | 196 | members_features(S, [{range, R}, _], _Res) -> 197 | case has_range(R, S) of 198 | true -> [{table, {members, existing_range}}]; 199 | false -> [{table, {members, nonexisting_range}}] 200 | end; 201 | members_features(S, [{node, Node}, _], _Res) -> 202 | case has_node(Node, S) of 203 | true -> [{table, {members, existing_node}}]; 204 | false -> [{table, {members, nonexisting_node}}] 205 | end. 206 | 207 | %% Ask for membership of the Routing Table 208 | %% --------------------------------------- 209 | member_state(Node, _) -> 210 | routing_table:member_state(Node). 211 | 212 | member_state_callers() -> [dht_routing_meta_eqc]. 213 | 214 | member_state_pre(S) -> initialized(S). 215 | 216 | member_state_args(S) -> 217 | Node = oneof( 218 | lists:append( 219 | [elements(current_nodes(S)) || has_nodes(S)], 220 | [dht_eqc:peer()]) ), 221 | [Node, 'ROUTING_TABLE']. 222 | 223 | member_state_return(S, [{ID, IP, Port}, _]) -> 224 | Ns = current_nodes(S), 225 | case lists:keyfind(ID, 1, Ns) of 226 | false -> unknown; 227 | {ID, IP, Port} -> member; 228 | {ID, _, _} -> roaming_member 229 | end. 
230 | 231 | member_state_features(_S, [_, _], unknown) -> [{table, {member_state, unknown}}]; 232 | member_state_features(_S, [_, _], member) -> [{table, {member_state, member}}]; 233 | member_state_features(_S, [_, _], roaming_member) -> [{table, {member_state, roaming}}]. 234 | 235 | %% Ask for the node id 236 | %% -------------------------- 237 | node_id(_) -> routing_table:node_id(). 238 | 239 | node_id_callers() -> [dht_routing_meta_eqc]. 240 | 241 | node_id_pre(S) -> initialized(S). 242 | 243 | node_id_args(_S) -> ['ROUTING_TABLE']. 244 | 245 | node_id_return(#state { self = Self }, _) -> Self. 246 | 247 | node_id_features(_S, [_], _R) -> [{table, node_id}]. 248 | 249 | %% Ask for the node list 250 | %% ----------------------- 251 | node_list(_) -> 252 | lists:sort( 253 | routing_table:node_list() ). 254 | 255 | node_list_callers() -> [dht_routing_meta_eqc]. 256 | 257 | node_list_pre(S) -> initialized(S). 258 | 259 | node_list_args(_S) -> ['ROUTING_TABLE']. 260 | 261 | node_list_return(S, [_], _) -> 262 | lists:sort(current_nodes(S)). 263 | 264 | node_list_features(_S, _A, _R) -> [{table, node_list}]. 265 | 266 | %% Ask if the routing table has a bucket 267 | %% ------------------------------------- 268 | is_range(B, _) -> 269 | routing_table:is_range(B). 270 | 271 | is_range_callers() -> [dht_routing_meta_eqc]. 272 | 273 | is_range_pre(S) -> initialized(S). 274 | 275 | is_range_args(S) -> 276 | Rs = current_ranges(S), 277 | Range = oneof([elements(Rs), dht_eqc:range()]), 278 | [Range, 'ROUTING_TABLE']. 279 | 280 | is_range_return(S, [Range, _]) -> 281 | lists:member(Range, current_ranges(S)). 282 | 283 | is_range_features(_S, _, true) -> [{table, {is_range, existing}}]; 284 | is_range_features(_S, _, false) -> [{table, {is_range, nonexisting}}]. 285 | 286 | %% Ask who is closest to a given ID 287 | %% -------------------------------- 288 | closest_to(ID, _) -> 289 | routing_table:closest_to(ID). 290 | 291 | closest_to_callers() -> [dht_routing_meta_eqc].
292 | 293 | closest_to_pre(S) -> initialized(S). 294 | 295 | closest_to_args(#state {}) -> 296 | [dht_eqc:id(), 'ROUTING_TABLE']. 297 | 298 | closest_to_callouts(#state { filter_fun = F } = S, [TargetID, _]) -> 299 | Ns = [N || N <- current_nodes(S), F(N)], 300 | D = fun({ID, _IP, _Port}) -> dht_metric:d(TargetID, ID) end, 301 | Sorted = lists:sort(fun(X, Y) -> D(X) < D(Y) end, Ns), 302 | ?RET(Sorted). 303 | 304 | closest_to_features(_S, [_, _], _R) -> [{table, {closest_to}}]. 305 | 306 | %% Preconditions 307 | %% -------------------------------------- 308 | ensure_started_pre(S) -> initialized(S). 309 | 310 | %% Determine if we may split a range (Internal call) 311 | %% -------------------------------------- 312 | 313 | may_split(#state { self = Self, tree = TR } = S, Node, K) when K > 0 -> 314 | {{Lo, Hi}, Members} = find_range({node, Node}, S), 315 | case length(Members) of 316 | L when L < ?MAX_RANGE_SZ -> true; 317 | L when L == ?MAX_RANGE_SZ -> 318 | case between(Lo, Self, Hi) of 319 | false -> false; 320 | true -> 321 | Half = ((Hi - Lo) bsr 1) + Lo, 322 | {Lower, Upper} = lists:partition(fun({ID, _, _}) -> ID < Half end, Members), 323 | SplitTree = 324 | (maps:remove({Lo, Hi}, TR))#{ {Lo, Half} => Lower, {Half, Hi} => Upper }, 325 | may_split(S#state { tree = SplitTree }, Node, K-1) 326 | end 327 | end. 328 | 329 | %% INSERT_SPLIT_RANGE / SPLIT_RANGE (Internal calls) 330 | %% ---------------------------------------- 331 | 332 | insert_split_range_callouts(_S, [_, 0]) -> 333 | ?FAIL('recursion depth'); 334 | insert_split_range_callouts(#state { self = Self } = S, [Node, K]) -> 335 | {{Lo, Hi}, Members} = find_range({node, Node}, S), 336 | case length(Members) of 337 | L when L < ?MAX_RANGE_SZ -> 338 | ?APPLY(add_node, [Node]); 339 | L when L == ?MAX_RANGE_SZ -> 340 | case between(Lo, Self, Hi) of 341 | true -> 342 | ?APPLY(split_range, [{Lo, Hi}]), 343 | ?APPLY(insert_split_range, [Node, K-1]); 344 | false -> 345 | ?EMPTY 346 | end 347 | end. 
348 | 349 | split_range_next(#state { tree = TR } = S, _, [{Lo, Hi} = Range]) -> 350 | Members = maps:get(Range, TR), 351 | Half = ((Hi - Lo) bsr 1) + Lo, 352 | {Lower, Upper} = lists:partition(fun({ID, _, _}) -> ID < Half end, Members), 353 | SplitTree = (maps:remove(Range, TR))#{ {Lo, Half} => Lower, {Half, Hi} => Upper }, 354 | S#state { tree = SplitTree }. 355 | 356 | %% ADD_NODE (Internal call) 357 | %% ---------------------------------- 358 | 359 | add_node_next(#state { tree = TR } = S, _, [Node]) -> 360 | {Range, Members} = find_range({node, Node}, S), 361 | S#state { tree = TR#{ Range := [Node | Members] } }. 362 | 363 | %% Invariant 364 | %% --------- 365 | %% 366 | %% Initialized routing tables support the following invariants: 367 | %% 368 | %% • No bucket has more than 8 members 369 | %% • Buckets can't overlap 370 | %% • Members of a bucket share a property: a common prefix 371 | %% • The common prefix is given by the depth/width of the bucket 372 | invariant(#state { init = false }) -> true; 373 | invariant(#state { init = true }) -> routing_table:invariant(). 374 | 375 | %% Weights 376 | %% ------- 377 | %% 378 | %% It is more interesting to manipulate the structure than it is to query it: 379 | weight(_S, insert) -> 15; 380 | weight(_S, delete) -> 3; 381 | weight(_S, _Cmd) -> 1. 382 | 383 | %% Properties 384 | %% ---------- 385 | self(#state { self = S }) -> S. 386 | 387 | initialized(#state { init = I }) -> I. 388 | 389 | %% Use a common postcondition for all commands, so we can utilize the valid return 390 | %% of each command. 391 | postcondition_common(S, Call, Res) -> 392 | eq(Res, return_value(S, Call)). 
393 | 394 | prop_component_correct() -> 395 | ?SETUP(fun() -> 396 | eqc_mocking:start_mocking(api_spec()), 397 | fun() -> ok end 398 | end, 399 | ?FORALL(Cmds, commands(?MODULE), 400 | begin 401 | {H, S, R} = run_commands(?MODULE, Cmds), 402 | pretty_commands(?MODULE, Cmds, {H, S, R}, 403 | aggregate(with_title('Commands'), command_names(Cmds), 404 | collect(eqc_lib:summary('Length'), length(Cmds), 405 | collect(eqc_lib:summary('Routing table size'), length(current_nodes(S)), 406 | aggregate(with_title('Features'), eqc_statem:call_features(H), 407 | features(eqc_statem:call_features(H), 408 | R == ok)))))) 409 | end)). 410 | 411 | t() -> t(5). 412 | 413 | t(Time) -> 414 | eqc:quickcheck(eqc:testing_time(Time, eqc_statem:show_states(prop_component_correct()))). 415 | 416 | %% Internal functions 417 | %% ------------------ 418 | 419 | has_node({ID, _, _}, S) -> 420 | Ns = current_nodes(S), 421 | lists:keymember(ID, 1, Ns). 422 | 423 | has_range(R, #state { tree = Tree }) -> maps:is_key(R, Tree). 424 | 425 | has_nodes(S) -> current_nodes(S) /= []. 426 | 427 | ids(Nodes) -> [ID || {ID, _, _} <- Nodes]. 428 | 429 | tree(#state { tree = T }) -> T. 430 | current_nodes(#state { tree = TR }) -> lists:append(maps:values(TR)). 431 | current_ranges(#state { tree = TR }) -> maps:keys(TR). 432 | 433 | find_range({node, {ID, _, _}}, S) -> 434 | [{Range, Members}] = maps:to_list(maps:filter(fun({Lo, Hi}, _) -> between(Lo, ID, Hi) end, tree(S))), 435 | {Range, Members}. 436 | 437 | between(L, X, H) when L =< X, X < H -> true; 438 | between(_, _, _) -> false. 439 | -------------------------------------------------------------------------------- /test/dht_state_eqc.erl: -------------------------------------------------------------------------------- 1 | %% @doc EQC Model for the system state 2 | %% The high-level entry-point for the DHT state. 
This model implements the public 3 | %% interface for the dht_state gen_server which contains the routing table of the DHT. 4 | %% 5 | %% The high-level view is relatively simple to define, since most of the advanced parts 6 | %% pertaining to routing have already been handled in dht_routing_meta and its corresponding 7 | %% EQC model. 8 | %% 9 | %% This model defines the large-scale policy rules of the DHT routing table. It uses dht_routing_meta 10 | %% and dht_routing_table for the hard work at the low level and just delegates necessary work to 11 | %% those parts of the code (and their respective models). 12 | %% 13 | %% The dht_state code is a gen_server which occasionally spawns functions to handle background 14 | %% work. Some states are gen_server internal and are marked as such. They refer to callout 15 | %% specifications which are done inside the gen_server. The code path is usually linear in this 16 | %% case as well, however. 17 | %% 18 | %% @end 19 | %% 20 | %% TODO LIST: 21 | %% • When a range is full, you can maybe insert the node anyway. You have to ask if 22 | %% the range can split, and if affirmative, you can still insert the node as the range 23 | %% will correctly split in this case. This is not currently handled by the code, but it 24 | %% is necessary for correct operation. 25 | 26 | -module(dht_state_eqc). 27 | -compile(export_all). 28 | 29 | -include_lib("eqc/include/eqc.hrl"). 30 | -include_lib("eqc/include/eqc_component.hrl"). 31 | 32 | -include("dht_eqc.hrl"). 33 | 34 | -record(state,{ 35 | init = false, %% true when the model has been initialized 36 | id %% The NodeID the node is currently running under 37 | }). 38 | 39 | -define(K, 8). 40 | 41 | %% API SPEC 42 | %% ----------------- 43 | %% 44 | %% We call out to the networking layer and also the routing meta-data layer.
45 | api_spec() -> 46 | #api_spec { 47 | language = erlang, 48 | modules = 49 | [ 50 | #api_module { 51 | name = dht_refresh, 52 | functions = [ 53 | #api_fun { name = insert_nodes, arity = 1 }, 54 | #api_fun { name = range, arity = 1 }, 55 | #api_fun { name = verify, arity = 3 } 56 | ] 57 | }, 58 | #api_module { 59 | name = dht_routing_table, 60 | functions = [ 61 | #api_fun { name = new, arity = 3, classify = dht_routing_table_eqc } 62 | ] 63 | }, 64 | #api_module { 65 | name = dht_routing_meta, 66 | functions = [ 67 | #api_fun { name = new, arity = 1, classify = dht_routing_meta_eqc }, 68 | #api_fun { name = export, arity = 1, classify = dht_routing_meta_eqc }, 69 | 70 | #api_fun { name = insert, arity = 2, classify = dht_routing_meta_eqc }, 71 | #api_fun { name = replace, arity = 3, classify = dht_routing_meta_eqc }, 72 | #api_fun { name = remove, arity = 2, classify = dht_routing_meta_eqc }, 73 | #api_fun { name = node_touch, arity = 3, classify = dht_routing_meta_eqc }, 74 | #api_fun { name = node_timeout, arity = 2, classify = dht_routing_meta_eqc }, 75 | #api_fun { name = reset_range_timer, arity = 3, classify = dht_routing_meta_eqc }, 76 | 77 | #api_fun { name = member_state, arity = 2, classify = dht_routing_meta_eqc }, 78 | #api_fun { name = neighbors, arity = 3, classify = dht_routing_meta_eqc }, 79 | #api_fun { name = node_list, arity = 1, classify = dht_routing_meta_eqc }, 80 | #api_fun { name = node_state, arity = 2, classify = dht_routing_meta_eqc }, 81 | #api_fun { name = range_members, arity = 2, classify = dht_routing_meta_eqc }, 82 | #api_fun { name = range_state, arity = 2, classify = dht_routing_meta_eqc } 83 | ] 84 | }, 85 | #api_module { 86 | name = dht_net, 87 | functions = [ 88 | #api_fun { name = find_node, arity = 2, classify = dht_net_eqc }, 89 | #api_fun { name = ping, arity = 1, classify = dht_net_eqc } 90 | ] 91 | } 92 | ] 93 | }. 94 | 95 | %% Commands we are skipping: 96 | %% 97 | %% We skip the state load/store functions. 
Mostly due to jlouis@ not thinking this is where the bugs are 98 | %% nor is it the place where interesting interactions happen: 99 | %% 100 | %% * load_state/2 101 | %% * dump_state/0, dump_state/1, dump_state/2 102 | %% 103 | 104 | %% INITIAL STATE 105 | %% ----------------------- 106 | 107 | gen_state(ID) -> #state { id = ID, init = false }. 108 | initial_state() -> #state{}. 109 | 110 | %% START_LINK 111 | %% ----------------------- 112 | 113 | %% Start up a new routing state tracker: 114 | start_link(NodeID, Nodes) -> 115 | {ok, Pid} = dht_state:start_link(NodeID, {no_state_file, ?ID_MIN, ?ID_MAX}, Nodes), 116 | unlink(Pid), 117 | erlang:is_process_alive(Pid). 118 | 119 | start_link_pre(S) -> not initialized(S). 120 | 121 | start_link_args(#state { id = ID }) -> 122 | BootStrapNodes = [], 123 | [ID, BootStrapNodes]. 124 | 125 | %% Starting the routing state tracker amounts to initializing the routing meta-data layer 126 | start_link_callouts(#state { id = ID }, [ID, Nodes]) -> 127 | ?MATCH(Tbl, ?CALLOUT(dht_routing_table, new, [ID, ?ID_MIN, ?ID_MAX], 'ROUTING_TABLE')), 128 | ?CALLOUT(dht_refresh, insert_nodes, [Nodes], ok), 129 | ?CALLOUT(dht_routing_meta, new, [Tbl], {ok, ID, 'META'}), 130 | ?RET(true). 131 | 132 | %% Once started, we can't start the State system again. 133 | start_link_next(State, _, _) -> 134 | State#state { init = true }. 135 | 136 | start_link_features(_S, _A, _R) -> [{state, start_link}]. 137 | 138 | %% CLOSEST TO 139 | %% ------------------------ 140 | 141 | %% Return the `Num` nodes closest to `ID` known to the routing table system. 142 | closest_to(ID, Num) -> 143 | dht_state:closest_to(ID, Num). 144 | 145 | closest_to_callers() -> 146 | [dht_net_eqc]. 147 | 148 | closest_to_pre(S) -> initialized(S). 149 | 150 | closest_to_args(_S) -> 151 | [dht_eqc:id(), nat()]. 
152 | 153 | %% This call is likewise just served by the underlying system 154 | closest_to_callouts(_S, [ID, Num]) -> 155 | ?MATCH(Ns, ?CALLOUT(dht_routing_meta, neighbors, [ID, Num, 'META'], 156 | list(dht_eqc:peer()))), 157 | ?RET(Ns). 158 | 159 | closest_to_features(_S, [_, Num], _) when Num >= 8 -> [{state, {closest_to, '>=8'}}]; 160 | closest_to_features(_S, [_, Num], _) -> [{state, {closest_to, Num}}]. 161 | 162 | %% NODE ID 163 | %% --------------------- 164 | 165 | %% Request the node ID of the system 166 | node_id() -> dht_state:node_id(). 167 | 168 | node_id_callers() -> 169 | [dht_net_eqc]. 170 | 171 | node_id_pre(S) -> initialized(S). 172 | 173 | node_id_args(_S) -> []. 174 | 175 | node_id_callouts(#state { id = ID }, []) -> ?RET(ID). 176 | 177 | node_id_features(_S, _A, _R) -> [{state, node_id}]. 178 | 179 | %% INSERT 180 | %% --------------------- 181 | 182 | %% The rules split into two variants 183 | 184 | %% REQUEST_SUCCESS 185 | %% ---------------- 186 | 187 | %% Tell the routing system a node responded successfully. 188 | %% 189 | %% Once we learn there is a new node, we can request success on the 190 | %% node. Doing so will attempt an insert of the node into the routing table, 191 | %% or will update an already existing node in the routing table. 192 | %% 193 | %% • An IP/Port pair is first pinged to learn the ID of the pair, and 194 | %% then it is inserted. 195 | %% • A node is assumed to be valid already and is 196 | %% just inserted. If you doubt the validity of a node, then supply 197 | %% its IP/Port pair in which case the node will be inserted with a 198 | %% ping. 199 | %% 200 | %% Note that if the gen_server returns `{verify, QuestionableNode}` 201 | %% then that node is being pinged by the call in order to possibly 202 | %% refresh this node. And then we recurse. This means there is a 203 | %% behavior where we end up pinging all the nodes in a bucket/range 204 | %% before insertion succeeds/fails.
205 | %% 206 | request_success(Node, Opts) -> 207 | Res = dht_state:request_success(Node, Opts), 208 | dht_state:sync(), 209 | timer:sleep(5), 210 | Res. 211 | 212 | request_success_callers() -> 213 | [dht_net_eqc]. 214 | 215 | request_success_pre(S) -> initialized(S). 216 | 217 | request_success_args(_S) -> 218 | [dht_eqc:peer(), #{ reachable => bool() }]. 219 | 220 | request_success_callouts(_S, [Node, Opts]) -> 221 | ?MATCH(NodeState, ?APPLY(insert_node_gs, [Node])), 222 | case NodeState of 223 | ok -> ?RET(ok); 224 | already_member -> 225 | ?APPLY(request_success_gs, [Node, Opts]), 226 | ?RET(already_member); 227 | not_inserted -> ?RET(not_inserted); 228 | {error, Reason} -> ?RET({error, Reason}); 229 | {verify, QNode} -> 230 | ?CALLOUT(dht_refresh, verify, [QNode, Node, Opts], ok), 231 | ?RET(ok) 232 | end. 233 | 234 | request_success_gs_callouts(_S, [Node, Opts]) -> 235 | ?MATCH(MState, 236 | ?CALLOUT(dht_routing_meta, member_state, 237 | [Node, 'META'], 238 | oneof([unknown, member]))), 239 | case MState of 240 | unknown -> 241 | ?APPLY(insert_node_gs, [Node]), 242 | ?RET(ok); 243 | roaming_member -> ?RET(ok); 244 | member -> 245 | ?CALLOUT(dht_routing_meta, node_touch, [Node, Opts, 'META'], 'META'), 246 | ?RET(ok) 247 | end. 248 | 249 | request_success_features(_S, [_, #{reachable := true }], _R) -> 250 | [{state, {request_success, reachable}}]; 251 | request_success_features(_S, [_, #{reachable := false }], _R) -> 252 | [{state, {request_success, non_reachable}}]. 253 | 254 | %% REQUEST_TIMEOUT 255 | %% ---------------- 256 | 257 | %% Tell the routing system a node did not respond in a timely fashion 258 | request_timeout(Node) -> 259 | Res = dht_state:request_timeout(Node), 260 | dht_state:sync(), 261 | Res. 262 | 263 | request_timeout_pre(S) -> initialized(S). 264 | 265 | request_timeout_args(_S) -> 266 | [dht_eqc:peer()]. 
267 | 268 | request_timeout_callouts(_S, [Node]) -> 269 | ?MATCH(MState, 270 | ?CALLOUT(dht_routing_meta, member_state, [Node, 'META'], oneof([unknown, member]))), 271 | case MState of 272 | unknown -> ?RET(ok); 273 | member -> 274 | ?CALLOUT(dht_routing_meta, node_timeout, [Node, 'META'], 'META'), 275 | ?RET(ok); 276 | roaming_member -> ?RET(ok) 277 | end. 278 | 279 | request_timeout_features(_S, [_], _) -> [{state, request_timeout}]. 280 | 281 | %% PING (Internal call to the network stack) 282 | %% --------------------- 283 | 284 | %% Ping a node, updating the response correctly on a successful pong message 285 | ping_callouts(_S, [IP, Port]) -> 286 | ?MATCH(R, ?CALLOUT(dht_net, ping, [{IP, Port}], oneof([pang, {ok, dht_eqc:id()}]))), 287 | ?RET(R). 288 | 289 | ping_features(_S, _A, pang) -> [{state, {ping, pang}}]; 290 | ping_features(_S, _A, {ok, _}) -> [{state, {ping, ok}}]. 291 | 292 | %% INACTIVE_RANGE (GenServer Message) 293 | %% -------------------------------- 294 | 295 | %% Refreshing a range proceeds based on the state of that range. 296 | %% • The range is not a member: do nothing 297 | %% • The range is ok: set a new timer for the range. 298 | %% This sets a timer based on the last activity of the range. In turn, it sets a timer 299 | %% somewhere between 0 and ?RANGE_TIMEOUT. Ex: The last activity was 5 minutes 300 | %% ago. So the timer should trigger in 10 minutes rather than 15 minutes. 301 | %% • The range needs refreshing: 302 | %% refreshing the range amounts to executing a FIND_NODE call on a random ID in the range 303 | %% which is supplied by the underlying meta-data code. The timer is always set to 15 minutes 304 | %% in this case (by a forced set), since we use the timer for progress. 305 | 306 | inactive_range(Msg) -> 307 | dht_state ! Msg, 308 | dht_state:sync(). 309 | 310 | inactive_range_pre(S) -> initialized(S). 311 | 312 | inactive_range_args(_S) -> 313 | [{inactive_range, dht_eqc:range()}].
314 | 315 | %% Analyze the state of the range and let the result guide what happens. 316 | inactive_range_callouts(_S, [{inactive_range, Range}]) -> 317 | ?MATCH(RS, ?CALLOUT(dht_routing_meta, range_state, [Range, 'META'], 318 | oneof([{error, not_member}, ok, empty, {needs_refresh, dht_eqc:id()}]))), 319 | case RS of 320 | {error, not_member} -> ?EMPTY; 321 | ok -> 322 | ?CALLOUT(dht_routing_meta, reset_range_timer, [Range, #{ force => false }, 'META'], 'META'); 323 | empty -> 324 | ?CALLOUT(dht_routing_meta, reset_range_timer, [Range, #{ force => true }, 'META'], 'META'); 325 | {needs_refresh, Peer} -> 326 | ?CALLOUT(dht_routing_meta, reset_range_timer, [Range, #{ force => true }, 'META'], 'META'), 327 | ?APPLY(refresh_range, [Peer]) 328 | end, 329 | ?RET(ok). 330 | 331 | inactive_range_features(_S, _A, _R) -> [{state, inactive_range}]. 332 | 333 | %% REFRESH_RANGE (Internal private call) 334 | 335 | %% This encodes the invariant that once we have found nodes close to the refreshing, they are 336 | %% used as a basis for insertion. 337 | refresh_range_callouts(_S, [Peer]) -> 338 | ?CALLOUT(dht_refresh, range, [Peer], ok). 339 | 340 | refresh_range_features(_S, _A, _R) -> [{state, refresh_range}]. 341 | 342 | %% INSERT_NODE (GenServer Internal Call) 343 | %% -------------------------------- 344 | 345 | %% Insertion of a node on the gen_server side amounts to analyzing the state of the 346 | %% range/bucket in which the node would fall. We generate random bucket data by 347 | %% means of the following calls: 348 | g_node_state(L) -> 349 | ?LET(S, vector(length(L), oneof([good, bad, {questionable, nat()}])), 350 | lists:zip(L, S)). 351 | 352 | bucket_members() -> 353 | ?SUCHTHAT(L, list(dht_eqc:peer()), 354 | length(L) =< 8). 355 | 356 | %% Given a set of pairs {Node, NodeState} we can analyze them and sort them into 357 | %% the good, bad, and questionable nodes. The sort order matters.
358 | analyze_node_state(UnsortedNodes) -> 359 | Nodes = lists:sort(UnsortedNodes), %% Force stable order 360 | GoodNodes = [N || {N, good} <- Nodes], 361 | BadNodes = lists:sort([N || {N, bad} <- Nodes]), 362 | QNodes = [{N,T} || {N, {questionable, T}} <- Nodes], 363 | QSorted = [N || {N, _} <- lists:keysort(2, QNodes)], 364 | analyze_node_state(BadNodes, GoodNodes, QSorted). 365 | 366 | %% Provide a view-type for the analyzed node state 367 | analyze_node_state(_Bs, Gs, _Qs) when length(Gs) == ?K -> range_full; 368 | analyze_node_state(Bs ,Gs, Qs) when length(Bs) + length(Gs) + length(Qs) < ?K -> room; 369 | analyze_node_state([B|_], _Gs, _Qs) -> {bad, B}; 370 | analyze_node_state([], _, [Q | _Qs]) -> {questionable, Q}. 371 | 372 | %% Insertion requests the current bucket members, then analyzes the state of the bucket. 373 | %% There are 4 possible cases: 374 | %% • The bucket is full of good nodes—ignore the new node 375 | %% • The bucket has room for another node—insert the new node 376 | %% • The bucket has at least one bad node—swap the new node for the bad node 377 | %% • The bucket has no bad nodes, but a questionable node—verify responsiveness 378 | %% of the questionable node (by means of interaction with the caller of 379 | %% insert_node/1) 380 | %% 381 | insert_node_gs_callouts(_S, [Node]) -> 382 | ?MATCH(MState, ?CALLOUT(dht_routing_meta, member_state, [Node, 'META'], 383 | oneof([unknown, member, roaming_member]))), 384 | case MState of 385 | member -> ?RET(already_member); 386 | roaming_member -> ?RET(already_member); %% TODO: For now. I'm not sure this is right. 387 | unknown -> ?APPLY(adjoin_node, [Node]) 388 | end. 389 | 390 | insert_node_gs_features(_S, _A, _R) -> [{state, insert_node_gs}]. 
391 | 392 | %% Internal helper call for adjoining a new node 393 | adjoin_node_callouts(_S, [Node]) -> 394 | ?MATCH(Near, ?CALLOUT(dht_routing_meta, range_members, [Node, 'META'], 395 | bucket_members())), 396 | ?MATCH(NodeState, ?CALLOUT(dht_routing_meta, node_state, [Near, 'META'], 397 | g_node_state(Near))), 398 | R = analyze_node_state(NodeState), 399 | case R of 400 | range_full -> 401 | ?MATCH(IR, ?CALLOUT(dht_routing_meta, insert, [Node, 'META'], 402 | oneof([ok, not_inserted]))), 403 | case IR of 404 | ok -> ?RET(ok); 405 | not_inserted -> ?RET(not_inserted) 406 | end; 407 | room -> 408 | ?CALLOUT(dht_routing_meta, insert, [Node, 'META'], {ok, 'META'}), 409 | ?RET(ok); 410 | {bad, Bad} -> 411 | ?CALLOUT(dht_routing_meta, replace, [Bad, Node, 'META'], {ok, 'META'}), 412 | ?RET(ok); 413 | {questionable, Q} -> ?RET({verify, Q}) 414 | end. 415 | 416 | %% MODEL CLEANUP 417 | %% ------------------------------ 418 | reset() -> 419 | case whereis(dht_state) of 420 | undefined -> ok; 421 | Pid when is_pid(Pid) -> 422 | exit(Pid, kill), 423 | timer:sleep(1) 424 | end, 425 | ok. 426 | 427 | %% PROPERTY 428 | %% ----------------------- 429 | postcondition_common(S, Call, Res) -> 430 | eq(Res, return_value(S, Call)). 431 | 432 | weight(_S, request_success) -> 15; 433 | weight(_S, _) -> 1. 434 | 435 | prop_component_correct() -> 436 | ?SETUP(fun() -> 437 | eqc_mocking:start_mocking(api_spec()), 438 | fun() -> ok end 439 | end, 440 | ?FORALL(ID, dht_eqc:id(), 441 | ?FORALL(StartState, gen_state(ID), 442 | ?FORALL(Cmds, commands(?MODULE, StartState), 443 | begin 444 | ok = reset(), 445 | {H,S,R} = run_commands(?MODULE, Cmds), 446 | pretty_commands(?MODULE, Cmds, {H,S,R}, 447 | aggregate(with_title('Commands'), command_names(Cmds), 448 | collect(eqc_lib:summary('Length'), length(Cmds), 449 | aggregate(with_title('Features'), eqc_statem:call_features(H), 450 | features(eqc_statem:call_features(H), 451 | R == ok))))) 452 | end)))). 
453 | 454 | %% Helper for showing states of the output: 455 | t() -> t(5). 456 | 457 | t(Secs) -> 458 | eqc:quickcheck(eqc:testing_time(Secs, eqc_statem:show_states(prop_component_correct()))). 459 | 460 | %% INTERNAL MODEL HELPERS 461 | %% ----------------------- 462 | 463 | initialized(#state { init = I }) -> I. 464 | -------------------------------------------------------------------------------- /test/dht_store_eqc.erl: -------------------------------------------------------------------------------- 1 | -module(dht_store_eqc). 2 | 3 | -compile(export_all). 4 | 5 | -include_lib("eqc/include/eqc.hrl"). 6 | -include_lib("eqc/include/eqc_component.hrl"). 7 | 8 | -include("dht_eqc.hrl"). 9 | 10 | api_spec() -> 11 | #api_spec { 12 | language = erlang, 13 | modules = [] }. 14 | 15 | -record(state, { 16 | entries = [], 17 | init = false 18 | }). 19 | 20 | initial_state() -> #state{}. 21 | 22 | %% START A NEW DHT STORE 23 | %% ----------------------------------- 24 | start_link() -> 25 | reset(). 26 | 27 | start_link_pre(S) -> not initialized(S). 28 | 29 | start_link_args(_S) -> []. 30 | 31 | start_link_callouts(_S, []) -> 32 | ?APPLY(dht_time_eqc, send_after, [5 * 60 * 1000, dht_store, evict]), 33 | ?RET(ok). 34 | 35 | start_link_next(S, _, []) -> S#state { init = true }. 36 | 37 | start_link_features(_S, _, _) -> [{dht_store, start_link}]. 38 | 39 | %% STORING AN ENTRY 40 | %% ------------------------------ 41 | store(ID, Loc) -> 42 | dht_store:store(ID, Loc). 43 | 44 | store_callers() -> 45 | [dht_net_eqc]. 46 | 47 | store_pre(S) -> initialized(S). 48 | 49 | store_args(_S) -> 50 | [dht_eqc:id(), {dht_eqc:ip(), dht_eqc:port()}]. 51 | 52 | store_callouts(_S, [ID, Loc]) -> 53 | ?MATCH(Now, ?APPLY(dht_time_eqc, monotonic_time, [])), 54 | ?APPLY(add_store, [ID, Loc, Now]), 55 | ?RET(ok). 56 | 57 | store_features(_S, _, _) -> 58 | [{dht_store, store}]. 
59 | 60 | %% FINDING A POTENTIAL STORED ENTITY 61 | %% -------------------------------------- 62 | find(ID) -> 63 | dht_store:find(ID). 64 | 65 | find_callers() -> 66 | [dht_net_eqc]. 67 | 68 | find_pre(S) -> initialized(S). 69 | find_args(S) -> 70 | StoredIDs = stored_ids(S), 71 | ID = oneof([elements(StoredIDs) || StoredIDs /= []] ++ [dht_eqc:id()]), 72 | [ID]. 73 | 74 | find_callouts(_S, [ID]) -> 75 | ?MATCH(Now, ?APPLY(dht_time_eqc, monotonic_time, [])), 76 | ?MATCH(Sz, ?APPLY(dht_time_eqc, convert_time_unit, [60 * 60 * 1000, milli_seconds, native])), 77 | Flank = Now - Sz, 78 | ?APPLY(evict, [ID, Flank]), 79 | ?MATCH(R, ?APPLY(lookup, [ID])), 80 | ?RET(R). 81 | 82 | find_features(_S, [_], L) -> [{dht_store, find, {size, length(L)}}]. 83 | 84 | %% EVICTING OLD KEYS 85 | evict_timeout() -> 86 | dht_store ! evict, 87 | dht_store:sync(). 88 | 89 | evict_timeout_pre(S) -> initialized(S). 90 | 91 | evict_timeout_args(_S) -> []. 92 | 93 | evict_timeout_callouts(_S, []) -> 94 | ?APPLY(dht_time_eqc, trigger_msg, [evict]), 95 | ?APPLY(dht_time_eqc, send_after, [5 * 60 * 1000, dht_store, evict]), 96 | ?MATCH(Now, ?APPLY(dht_time_eqc, monotonic_time, [])), 97 | ?MATCH(Sz, ?APPLY(dht_time_eqc, convert_time_unit, [60 * 60 * 1000, milli_seconds, native])), 98 | Flank = Now - Sz, 99 | ?APPLY(evict, [Flank]), 100 | ?RET(ok). 101 | 102 | %% ADDING ENTRIES TO THE STORE (Internal Call) 103 | add_store_next(#state { entries = Es } = S, _, [ID, Loc, Now]) -> 104 | {_, NonMatching} = lists:partition( 105 | fun({IDx, Locx, _}) -> IDx == ID andalso Locx == Loc end, 106 | Es), 107 | S#state { entries = NonMatching ++ [{ID, Loc, Now}] }. 
108 | 109 | %% EVICTING OLD ENTRIES (Internal Call) 110 | evict_next(#state { entries = Es } = S, _, [Key, Flank]) -> 111 | S#state { entries = [{ID, Loc, T} 112 | || {ID, Loc, T} <- Es, 113 | ID /= Key orelse T >= Flank] }; 114 | evict_next(#state { entries = Es } = S, _, [Flank]) -> 115 | S#state { entries = [{ID, Loc, T} || {ID, Loc, T} <- Es, T >= Flank] }. 116 | 117 | %% FINDING ENTRIES (Internal Call) 118 | lookup_callouts(#state { entries = Es }, [Target]) -> 119 | ?RET([Loc || {ID, Loc, _} <- Es, ID == Target]). 120 | 121 | %% RESETTING THE STATE 122 | reset() -> 123 | case whereis(dht_store) of 124 | undefined -> 125 | {ok, Pid} = dht_store:start_link(), 126 | unlink(Pid), 127 | ok; 128 | P when is_pid(P) -> 129 | exit(P, kill), 130 | timer:sleep(1), 131 | {ok, Pid} = dht_store:start_link(), 132 | unlink(Pid), 133 | ok 134 | end. 135 | 136 | %% Checking for startup 137 | ensure_started_pre(S, []) -> initialized(S). 138 | 139 | %% Weights 140 | %% ------- 141 | %% 142 | %% It is more interesting to manipulate the structure than it is to query it: 143 | weight(_S, store) -> 100; 144 | weight(_S, _Cmd) -> 10. 145 | 146 | %% Properties 147 | %% ---------- 148 | 149 | %% Use a common postcondition for all commands, so we can utilize the valid return 150 | %% of each command. 151 | postcondition_common(S, Call, Res) -> 152 | eq(Res, return_value(S, Call)). 153 | 154 | %% INTERNALS 155 | %% ------------------------ 156 | initialized(#state { init = Init }) -> Init. 157 | 158 | stored_ids(#state { entries = Es }) -> lists:usort([ID || {ID, _, _} <- Es]). 159 | -------------------------------------------------------------------------------- /test/dht_time_eqc.erl: -------------------------------------------------------------------------------- 1 | -module(dht_time_eqc). 2 | -compile(export_all). 3 | 4 | -include_lib("eqc/include/eqc.hrl"). 5 | -include_lib("eqc/include/eqc_component.hrl"). 6 | 7 | -type time() :: integer(). 8 | -type time_ref() :: integer(). 
9 | 10 | -record(state, { 11 | time = 0 :: time(), 12 | timers = [] :: [{time(), time_ref(), term(), term()}], 13 | time_ref = 0 :: time_ref() 14 | }). 15 | 16 | api_spec() -> 17 | #api_spec { 18 | language = erlang, 19 | modules = [ 20 | #api_module { 21 | name = dht_time, 22 | functions = [ 23 | #api_fun { name = convert_time_unit, arity = 3 }, 24 | #api_fun { name = monotonic_time, arity = 0 }, 25 | #api_fun { name = send_after, arity = 3 }, 26 | #api_fun { name = cancel_timer, arity = 1 } 27 | ]} 28 | ] 29 | }. 30 | 31 | gen_initial_state() -> 32 | #state { time = int() }. 33 | 34 | initial_state() -> #state{}. 35 | 36 | %% ADVANCING TIME 37 | %% ------------------------------------ 38 | 39 | advance_time(_Advance) -> ok. 40 | advance_time_args(_S) -> 41 | T = frequency([ 42 | {10, ?LET(K, nat(), K+1)}, 43 | {10, ?LET({K, N}, {nat(), nat()}, (N+1)*1000 + K)}, 44 | {10, ?LET({K, N, M}, {nat(), nat(), nat()}, (M+1)*60*1000 + N*1000 + K)}, 45 | {1, ?LET({K, N, Q}, {nat(), nat(), nat()}, (Q*17)*60*1000 + N*1000 + K)} 46 | ]), 47 | [T]. 48 | 49 | %% Advancing time transitions the system into a state where the time is incremented 50 | %% by A. 51 | advance_time_next(#state { time = T } = State, _, [A]) -> State#state { time = T+A }. 52 | advance_time_return(_S, [_]) -> ok. 53 | 54 | advance_time_features(_S, _, _) -> [{dht_time, advance_time}]. 55 | 56 | %% TRIGGERING OF TIMERS 57 | %% ------------------------------------ 58 | %% 59 | %% This is to be used by another component as: 60 | %% ?APPLY(dht_time_eqc, trigger, [{tref, Ref}]) in a callout specification. This ensures the given command can 61 | %% only be picked if you can trigger the timer. 62 | 63 | can_fire(#state { time = T, timers = TS }, Ref) -> 64 | case lists:keyfind(Ref, 2, TS) of 65 | false -> false; 66 | {TP, _, _, _} -> T >= TP 67 | end. 68 | 69 | trigger_pre(S, [{tref, Ref}]) -> can_fire(S, Ref).
70 | 71 | trigger_return(#state { timers = TS }, [{tref, Ref}]) -> 72 | case lists:keyfind(Ref, 2, TS) of 73 | {_TP, _Ref, _Pid, Msg} -> Msg 74 | end. 75 | 76 | trigger_next(#state { timers = TS } = S, _, [{tref, Ref}]) -> 77 | S#state{ timers = lists:keydelete(Ref, 2, TS) }. 78 | 79 | can_fire_msg(#state { time = T, timers = TS }, Msg) -> 80 | case lists:keyfind(Msg, 4, TS) of 81 | false -> false; 82 | {TP, _, _, _} -> T >= TP 83 | end. 84 | 85 | trigger_msg_pre(S, [Msg]) -> can_fire_msg(S, Msg). 86 | trigger_msg_return(_S, [Msg]) -> Msg. 87 | 88 | trigger_msg_next(#state { timers = TS } = S, _, [Msg]) -> 89 | {_, Ref, _, _} = lists:keyfind(Msg, 4, TS), 90 | S#state{ timers = lists:keydelete(Ref, 2, TS) }. 91 | 92 | %% INTERNAL CALLS IN THE MODEL 93 | %% ------------------------------------------- 94 | %% 95 | %% All these calls are really "wrappers" such that if you call into the timing model, you obtain 96 | %% faked time. 97 | 98 | monotonic_time_callers() -> [dht_routing_meta_eqc, dht_routing_table_eqc, dht_state_eqc]. 99 | 100 | monotonic_time_callouts(#state {time = T }, []) -> 101 | ?CALLOUT(dht_time, monotonic_time, [], T), 102 | ?RET(T). 103 | 104 | monotonic_time_return(#state { time = T }, []) -> T. 105 | 106 | convert_time_unit_callers() -> [dht_routing_meta_eqc, dht_routing_table_eqc, dht_state_eqc]. 107 | 108 | convert_time_unit_callouts(_S, [T, From, To]) -> 109 | ?CALLOUT(dht_time, convert_time_unit, [T, From, To], T), 110 | case {From, To} of 111 | {native, milli_seconds} -> ?RET(T); 112 | {milli_seconds, native} -> ?RET(T); 113 | FT -> ?FAIL({convert_time_unit, FT}) 114 | end. 115 | 116 | send_after_callers() -> [dht_routing_meta_eqc, dht_routing_table_eqc, dht_state_eqc, dht_net_eqc]. 
117 | 118 | send_after_callouts(#state { time_ref = Ref}, [Timeout, Reg, Msg]) when is_atom(Reg) -> 119 | ?CALLOUT(dht_time, send_after, [Timeout, Reg, Msg], {tref, Ref}), 120 | ?RET({tref, Ref}); 121 | send_after_callouts(#state { time_ref = Ref}, [Timeout, Pid, Msg]) when is_pid(Pid) -> 122 | ?CALLOUT(dht_time, send_after, [Timeout, ?WILDCARD, Msg], {tref, Ref}), 123 | ?RET({tref, Ref}). 124 | 125 | send_after_next(#state { time = T, time_ref = Ref, timers = TS } = S, _, [Timeout, Pid, Msg]) -> 126 | TriggerPoint = T + Timeout, 127 | S#state { time_ref = Ref + 1, timers = TS ++ [{TriggerPoint, Ref, Pid, Msg}] }. 128 | 129 | cancel_timer_callers() -> [dht_routing_meta_eqc, dht_net]. 130 | 131 | cancel_timer_callouts(S, [{tref, TRef}]) -> 132 | Return = cancel_timer_rv(S, TRef), 133 | ?CALLOUT(dht_time, cancel_timer, [{tref, TRef}], Return), 134 | ?RET(Return). 135 | 136 | cancel_timer_rv(#state { time = T, timers = TS }, TRef) -> 137 | case lists:keyfind(TRef, 2, TS) of 138 | false -> false; 139 | {TriggerPoint, TRef, _Pid, _Msg} -> monus(TriggerPoint, T) 140 | end. 141 | 142 | cancel_timer_next(#state { timers = TS } = S, _, [{tref, TRef}]) -> 143 | S#state { timers = lists:keydelete(TRef, 2, TS) }. 144 | 145 | %% HELPER ROUTINES 146 | %% ---------------------------------------- 147 | 148 | %% A monus operation is a subtraction for natural numbers 149 | monus(A, B) when A > B -> A - B; 150 | monus(A, B) when A =< B -> 0. 151 | 152 | %% PROPERTY 153 | %% ---------------------------------- 154 | 155 | %% The property here is a pretty dummy property as we don't need a whole lot for this to work. 156 | 157 | %% Use a common postcondition for all commands, so we can utilize the valid return 158 | %% of each command. 159 | postcondition_common(S, Call, Res) -> 160 | eq(Res, return_value(S, Call)). 161 | 162 | %% Main property, just verify that the commands are in sync with reality. 
163 | prop_component_correct() -> 164 | ?SETUP(fun() -> 165 | eqc_mocking:start_mocking(api_spec()), 166 | fun() -> ok end 167 | end, 168 | ?FORALL(St, gen_initial_state(), 169 | ?FORALL(Cmds, commands(?MODULE, St), 170 | begin 171 | {H,S,R} = run_commands(?MODULE, Cmds), 172 | pretty_commands(?MODULE, Cmds, {H,S,R}, 173 | aggregate(with_title('Commands'), command_names(Cmds), 174 | collect(eqc_lib:summary('Length'), length(Cmds), 175 | aggregate(with_title('Features'), eqc_statem:call_features(H), 176 | features(eqc_statem:call_features(H), 177 | R == ok))))) 178 | end))). 179 | 180 | %% Helper for showing states of the output: 181 | t() -> t(5). 182 | 183 | t(Secs) -> 184 | eqc:quickcheck(eqc:testing_time(Secs, eqc_statem:show_states(prop_component_correct()))). -------------------------------------------------------------------------------- /test/dht_track_eqc.erl: -------------------------------------------------------------------------------- 1 | -module(dht_track_eqc). 2 | 3 | -compile(export_all). 4 | 5 | -include_lib("eqc/include/eqc.hrl"). 6 | -include_lib("eqc/include/eqc_component.hrl"). 7 | 8 | -include("dht_eqc.hrl"). 9 | 10 | api_spec() -> 11 | #api_spec { 12 | language = erlang, 13 | modules = [ 14 | #api_module { 15 | name = dht_search, 16 | functions = [ 17 | #api_fun { name = run, arity = 2, classify = dht_search_eqc } ]}, 18 | #api_module { 19 | name = dht_net, 20 | functions = [ 21 | #api_fun { name = store, arity = 4, classify = dht_net_eqc } ]} 22 | ] }. 23 | 24 | -record(state, { 25 | entries = [], 26 | init = false 27 | }). 28 | 29 | initial_state() -> #state{}. 30 | 31 | -define(STORE_COUNT, 8). 
32 | 33 | %% RESETTING THE STATE 34 | %% ------------------------------------ 35 | 36 | reset() -> 37 | case whereis(dht_track) of 38 | undefined -> 39 | {ok, Pid} = dht_track:start_link(), 40 | unlink(Pid), 41 | ok; 42 | P when is_pid(P) -> 43 | exit(P, kill), 44 | timer:sleep(1), 45 | {ok, Pid} = dht_track:start_link(), 46 | unlink(Pid), 47 | ok 48 | end. 49 | 50 | %% START A NEW DHT TRACKER 51 | %% ----------------------------------- 52 | start_link() -> 53 | reset(). 54 | 55 | start_link_pre(S) -> not initialized(S). 56 | 57 | start_link_args(_S) -> []. 58 | 59 | start_link_callouts(_S, []) -> 60 | ?RET(ok). 61 | 62 | start_link_next(S, _, []) -> S#state { init = true }. 63 | 64 | start_link_features(_S, _, _) -> [{dht_track, start_link}]. 65 | 66 | %% STORING A NEW ENTRY 67 | %% ----------------------------------------- 68 | 69 | store(ID, Location) -> 70 | dht_track:store(ID, Location), 71 | dht_track:sync(). 72 | 73 | store_pre(S) -> initialized(S). 74 | store_args(_S) -> 75 | [dht_eqc:id(), dht_eqc:port()]. 76 | 77 | store_callouts(_S, [ID, Location]) -> 78 | ?APPLY(refresh, [ID, Location]), 79 | ?APPLY(add_entry, [ID, Location]), 80 | ?RET(ok). 81 | 82 | store_features(#state { entries = Es }, [ID, _Location], _) -> 83 | case lists:keymember(ID, 1, Es) of 84 | true -> [{dht_track, store, overwrite}]; 85 | false -> [{dht_track, store, new}] 86 | end. 87 | 88 | %% DELETING AN ENTRY 89 | %% ---------------------------------------------- 90 | delete(ID) -> 91 | dht_track:delete(ID), 92 | dht_track:sync(). 93 | 94 | delete_pre(S) -> initialized(S). 95 | delete_args(S) -> [id(S)]. 96 | 97 | delete_callouts(_S, [ID]) -> 98 | ?APPLY(del_entry, [ID]), 99 | ?RET(ok). 100 | 101 | delete_features(#state { entries = Es }, [ID], _) -> 102 | case lists:keymember(ID, 1, Es) of 103 | true -> [{dht_track, delete, existing}]; 104 | false -> [{dht_track, delete, non_existing}] 105 | end. 
106 | 107 | %% LOOKUP 108 | %% -------------------- 109 | lookup(ID) -> 110 | dht_track:lookup(ID). 111 | 112 | lookup_pre(S) -> initialized(S). 113 | lookup_args(S) -> [id(S)]. 114 | 115 | lookup_return(#state {entries = Es }, [ID]) -> 116 | case lists:keyfind(ID, 1, Es) of 117 | false -> not_found; 118 | {_, V} -> V 119 | end. 120 | 121 | lookup_features(_, _, not_found) -> [{dht_track, lookup, not_found}]; 122 | lookup_features(_, _, Port) when is_integer(Port) -> [{dht_track, lookup, ok}]. 123 | 124 | %% TIMING OUT 125 | %% ------------------------------ 126 | timeout(Msg) -> 127 | dht_track ! Msg, 128 | dht_track:sync(). 129 | 130 | timeout_pre(#state { entries = Es } = S) -> initialized(S) andalso Es /= []. 131 | 132 | timeout_args(#state { entries = Es }) -> 133 | ?LET({ID, Loc}, elements(Es), 134 | [{refresh, ID, Loc}]). 135 | 136 | timeout_pre(#state { entries = Es }, [{refresh, ID, Loc}]) -> 137 | lists:member({ID, Loc}, Es). 138 | 139 | timeout_callouts(_S, [{refresh, ID, Loc}]) -> 140 | ?APPLY(dht_time_eqc, trigger_msg, [{refresh, ID, Loc}]), 141 | ?APPLY(refresh, [ID, Loc]), 142 | ?RET(ok). 143 | 144 | timeout_features(_S, [_], _) -> [{dht_track, timeout, entry}]. 145 | 146 | %% REFRESHING AN ENTRY (Internal Call) 147 | %% ------------------------------------------ 148 | refresh_callouts(_S, [ID, Location]) -> 149 | ?MATCH(Res, ?CALLOUT(dht_search, run, [find_value, ID], dht_search_find_value_ret())), 150 | #{ store := Stores } = Res, 151 | Candidates = order(ID, Stores), 152 | StorePoints = take(?STORE_COUNT, Candidates), 153 | ?APPLY(net_store, [ID, Location, StorePoints]), 154 | ?APPLY(dht_time_eqc, send_after, [45 * 60 * 1000, dht_track, {refresh, ID, Location}]), 155 | ?RET(ok). 156 | 157 | net_store_callouts(_S, [_ID, _Location, []]) -> ?RET(ok); 158 | net_store_callouts(_S, [ID, Location, [{{_, IP, Port}, Token} | SPs]]) -> 159 | ?CALLOUT(dht_net, store, [{IP, Port}, Token, ID, Location], ok), 160 | ?APPLY(net_store, [ID, Location, SPs]).
161 | 162 | %% ADD/DELETE AN ENTRY 163 | %% ------------------------------------- 164 | add_entry_next(#state { entries = Es } = S, _, [ID, Location]) -> 165 | S#state { entries = lists:keystore(ID, 1, Es, {ID, Location}) }. 166 | 167 | del_entry_next(#state { entries = Es } = S, _, [ID]) -> 168 | S#state { entries = lists:keydelete(ID, 1, Es) }. 169 | 170 | %% Weights 171 | %% ------- 172 | %% 173 | %% It is more interesting to manipulate the structure than it is to query it: 174 | weight(_S, _Cmd) -> 10. 175 | 176 | %% Properties 177 | %% ---------- 178 | 179 | %% Use a common postcondition for all commands, so we can utilize the valid return 180 | %% of each command. 181 | postcondition_common(S, Call, Res) -> 182 | eq(Res, return_value(S, Call)). 183 | 184 | %% INTERNALS 185 | %% ------------------------ 186 | initialized(#state { init = Init }) -> Init. 187 | 188 | %% Generate an ID, with strong preference for IDs which exist. 189 | id(#state { entries = Es }) -> 190 | IDs = [ID || {ID, _} <- Es], 191 | frequency( 192 | [{1, dht_eqc:id()}] ++ [{5, elements(IDs)} || IDs /= [] ] ). 193 | 194 | dht_search_find_value_ret() -> 195 | ?LET(Stores, list({dht_eqc:peer(), dht_eqc:token()}), 196 | #{ store => Stores }). 197 | 198 | order(ID, L) -> 199 | lists:sort(fun({{IDx, _, _}, _}, {{IDy, _, _}, _}) -> dht_metric:d(ID, IDx) < dht_metric:d(ID, IDy) end, L). 200 | 201 | take(0, _) -> []; 202 | take(_, []) -> []; 203 | take(K, [X | Xs]) -> [X | take(K-1, Xs)]. 204 | -------------------------------------------------------------------------------- /test/eqc_lib.erl: -------------------------------------------------------------------------------- 1 | %%% @doc Erlang QuickCheck library functions 2 | %%% Kept as one big module for ease of development. 3 | %%% @end 4 | -module(eqc_lib). 5 | -vsn("1.3.0"). 6 | -include_lib("eqc/include/eqc.hrl"). 7 | 8 | -compile(export_all). 
9 | 10 | %%% BIT INTEGERS 11 | %%% --------------------------------------------------------------- 12 | %%% 13 | 14 | %% @doc pow_2_int/0 generates integers close to a power of two 15 | %% It turns out that integers around powers of two are often able to mess up stuff 16 | %% because of their bitwise representation. This generator generates integers close 17 | %% to a power of two deliberately. 18 | %% @end 19 | pow_2_int() -> 20 | ?LET({Sign, Exponent, Perturb}, {sign(), choose(0, 128), choose(-3, 3)}, 21 | Sign * pow(2, Exponent) + Perturb). 22 | 23 | sign() -> elements([1, -1]). 24 | 25 | pow(0, 0) -> 0; 26 | pow(_Base, 0) -> 1; 27 | pow(Base, N) -> Base * pow(Base, N-1). 28 | 29 | %%% HEX STRING 30 | %%% --------------------------------------------------------------- 31 | 32 | %% @doc hex_char() generates a hexadecimal character 33 | %% @end 34 | hex_char() -> 35 | elements([$0, $1, $2, $3, $4, $5, $6, $7, $8, $9, $a, $b, $c, $d, $e, $f]). 36 | 37 | %% @doc hex_string/0 generates a hex string 38 | %% @end 39 | hex_string() -> list(hex_char()). 40 | 41 | %% @doc hex_string/1 generates a hexadecimal string of length `N' 42 | %% @end 43 | hex_string(N) -> 44 | vector(N, hex_char()). 45 | 46 | %%% UUID 47 | %%% --------------------------------------------------------------- 48 | 49 | %% @doc uuid_v4() generates a v4 UUID 50 | %% @end 51 | uuid_v4() -> 52 | ?LET( 53 | {S1, S2, S3, S4, S5}, 54 | {hex_string(8), hex_string(4), hex_string(3), hex_string(3), hex_string(12)}, 55 | iolist_to_binary([S1, $-, S2, $-, $4, S3, $-, $a, S4, $-, S5])). 56 | 57 | %%% SORTING 58 | %%% --------------------------------------------------------------- 59 | %%% 60 | 61 | %% @doc sort/1 is a total sort function 62 | %% The built-in lists:sort/1 is not total, because 0 == 0.0. Since the sort function 63 | %% is also *stable* it can't be used to force a unique order on terms. This variant 64 | %% of sort has the property of total order with INTEGER < FLOAT.
65 | %% @end 66 | sort(L) -> 67 | lists:sort(fun(X, Y) -> erts_internal:cmp_term(X, Y) < 0 end, L). 68 | 69 | prop_sorted() -> 70 | ?FORALL(L, maps_eqc:map_list(), 71 | begin 72 | Sorted = sort(L), 73 | conjunction([ 74 | {size, equals(length(L), length(Sorted))}, 75 | {ordering, ordered(Sorted)} 76 | ]) 77 | end). 78 | 79 | ordered([]) -> true; 80 | ordered([_]) -> true; 81 | ordered([X,Y|T]) -> 82 | case cmp_term(X,Y) of 83 | true -> ordered([X|T]); 84 | false -> false 85 | end. 86 | 87 | %% The following implement term comparison in Erlang to test an alternative implementation 88 | %% of erts_internal:cmp_term/2 89 | cmp_term(T1, T2) when is_integer(T1), is_integer(T2) -> T1 < T2; 90 | cmp_term(T1, _) when is_integer(T1) -> true; 91 | cmp_term(T1, T2) when is_float(T1), is_float(T2) -> T1 < T2; 92 | cmp_term(T1, _) when is_float(T1) -> true; 93 | cmp_term(T1, T2) when is_atom(T1), is_atom(T2) -> T1 < T2; 94 | cmp_term(T1, _) when is_atom(T1) -> true; 95 | cmp_term(T1, T2) when is_reference(T1), is_reference(T2) -> T1 < T2; 96 | cmp_term(T1, _) when is_reference(T1) -> true; 97 | cmp_term(T1, T2) when is_function(T1), is_function(T2) -> T1 < T2; 98 | cmp_term(T1, _) when is_function(T1) -> true; 99 | cmp_term(T1, T2) when is_port(T1), is_port(T2) -> T1 < T2; 100 | cmp_term(T1, _) when is_port(T1) -> true; 101 | cmp_term(T1, T2) when is_pid(T1), is_pid(T2) -> T1 < T2; 102 | cmp_term(T1, _) when is_pid(T1) -> true; 103 | cmp_term(T1, T2) when is_tuple(T1), is_tuple(T2) -> cmp_term(tuple_to_list(T1), tuple_to_list(T2)); 104 | cmp_term(T1, _) when is_tuple(T1) -> true; 105 | cmp_term(T1, T2) when is_list(T1), is_list(T2) -> cmp_term_list(T1, T2); 106 | cmp_term(T1, _) when is_list(T1) -> true; 107 | cmp_term(T1, T2) when is_bitstring(T1), is_bitstring(T2) -> T1 < T2; 108 | cmp_term(_, _) -> false. 
109 | 110 | cmp_term_list([], []) -> false; 111 | cmp_term_list([], _) -> true; 112 | cmp_term_list(_, []) -> false; 113 | cmp_term_list([X|Xs], [Y|Ys]) when X =:= Y -> cmp_term_list(Xs, Ys); 114 | cmp_term_list([X|_], [Y|_]) -> cmp_term(X, Y). 115 | 116 | 117 | %% STEM AND LEAF PLOTS 118 | %% ------------------------------------------------------ 119 | %% 120 | %% If you are collecting lots of values, you may often want to show the distribution of those 121 | %% values. A stem & leaf plot allows you to handle this easily. Use it like you would use the 122 | %% with_title/1 printer: 123 | %% 124 | %% collect(stem_and_leaf('Command Length'), length(Cmds), …) 125 | %% 126 | stem_and_leaf(Title) -> 127 | fun(Counts) -> 128 | io:format("~s", [ 129 | [atom_to_list(Title), $\n, $\n, 130 | "Stem | Leaf\n", 131 | "----------------\n", 132 | (out_stem_and_leaf(stem_and_leaf_collect(Counts, #{})))]]) 133 | end. 134 | 135 | stem_and_leaf_collect([{C, 1}|Cs], Bins) -> 136 | stem_and_leaf_collect(Cs, store_bin(C div 10, C rem 10, Bins)); 137 | stem_and_leaf_collect([{C, K} | Cs], Bins) -> 138 | stem_and_leaf_collect([{C, K-1} | Cs], store_bin(C div 10, C rem 10, Bins)); 139 | stem_and_leaf_collect([], Bins) -> Bins. 140 | 141 | store_bin(D, R, Bins) -> 142 | case maps:find(D, Bins) of 143 | {ok, L} -> maps:put(D, [R | L], Bins); 144 | error -> maps:put(D, [R], Bins) 145 | end. 146 | 147 | out_stem_and_leaf(Bins) -> 148 | out_sl(lists:sort(maps:to_list(Bins))). 149 | 150 | out_sl([]) -> []; 151 | out_sl([{C, Elems} | Next]) -> 152 | Line = io_lib:format("~4.B | ~ts~n", [C, leaves(lists:sort(Elems))]), 153 | [Line | out_sl(Next)]. 154 | 155 | leaves([E | Es] = Elems) when length(Elems) > 66 -> ["*** ", rle(Es, E, 1)]; 156 | leaves(Elems) -> 157 | [E + $0 || E <- Elems]. 158 | 159 | rle([E | Es], E, Cnt) -> 160 | rle(Es, E, Cnt+1); 161 | rle([Ez | Es], E, Cnt) -> 162 | [rle_out(E, Cnt), " " | rle(Es, Ez, 1)]; 163 | rle([], E, Cnt) -> 164 | [rle_out(E, Cnt)]. 
165 | 166 | rle_out(E, Cnt) -> 167 | [integer_to_list(E), <<"·("/utf8>>, integer_to_list(Cnt), ")"]. 168 | 169 | %% SUMMARY PLOTS 170 | %% ------------------------------------------------------ 171 | %% 172 | %% Summarize a data set like in R 173 | %% 174 | summary(Title) -> 175 | fun(Values) -> 176 | Stats = summary_stats(Values), 177 | Out = [atom_to_list(Title), $\n, 178 | "Min. :", summary_stats(min, Stats), $\n, 179 | "1st Qr.:", summary_percentile(25, Stats), $\n, 180 | "Median.:", summary_percentile(50, Stats), $\n, 181 | "Mean. :", summary_stats(mean, Stats), $\n, 182 | "3rd Qr.:", summary_percentile(75, Stats), $\n, 183 | "Max. :", summary_stats(max, Stats), $\n 184 | ], 185 | io:format("~s", [Out]) 186 | end. 187 | 188 | summary_stats(Name, Stats) -> 189 | case maps:get(Name, Stats) of 190 | I when is_integer(I) -> integer_to_list(I); 191 | F when is_float(F) -> float_to_list(F, [{decimals, 6}, compact]) 192 | end. 193 | 194 | summary_percentile(N, Stats) -> 195 | case maps:get({percentile, N}, Stats) of 196 | I when is_integer(I) -> integer_to_list(I); 197 | F when is_float(F) -> float_to_list(F, [{decimals, 6}, compact]) 198 | end. 199 | 200 | summary_expand(Values) -> 201 | lists:flatten([lists:duplicate(N, Elem) || {Elem, N} <- Values]). 202 | 203 | summary_stats(RLEs) -> 204 | summary_stats_(lists:sort(RLEs)). 205 | 206 | summary_stats_([]) -> 207 | #{ min => na, max => na, {percentile, 25} => na, {percentile, 50} => na, 208 | {percentile, 75} => na, mean => na, n => 0 }; 209 | summary_stats_([{E, EC} | RLEs] = Values) -> 210 | {Min, Max, Mean, N} = summary_scan(E, E, EC, E*EC, RLEs), 211 | #{ min => Min, max => Max, n => N, mean => Mean, 212 | {percentile, 25} => percentile(Values, N, 25), 213 | {percentile, 50} => percentile(Values, N, 50), 214 | {percentile, 75} => percentile(Values, N, 75) 215 | }. 
216 | 217 | summary_scan(Min, Max, N, Sum, []) -> {Min, Max, Sum/N, N}; 218 | summary_scan(Min, Max, N, Sum, [{E, Count} | RLEs]) -> 219 | summary_scan( 220 | min(E, Min), 221 | max(E, Max), 222 | N + Count, 223 | Sum + E*Count, 224 | RLEs). 225 | 226 | percentile(RLE, N, Pct) -> 227 | percentile_pick(RLE, perc(Pct, N)). 228 | 229 | percentile_pick([{E, N} | _RLEs], ToSkip) when ToSkip =< N -> E; 230 | percentile_pick([{_E, N} | RLEs], ToSkip) -> 231 | percentile_pick(RLEs, ToSkip - N). 232 | 233 | perc(P, Len) -> 234 | V = round(P * Len / 100), 235 | erlang:max(1, V). 236 | 237 | %% TRACKER PROCESS 238 | %% The tracker process can be used to track a state outside the EQC state 239 | reset(Name) -> 240 | case whereis(Name) of 241 | undefined -> 242 | Pid = spawn_link(fun() -> tracker_loop(undefined) end), 243 | register(Name, Pid), 244 | ok; 245 | P when is_pid(P) -> 246 | P ! reset, 247 | ok 248 | end. 249 | 250 | bind(Name, Fun) -> 251 | Name ! {get_state, self()}, 252 | receive 253 | {state, S} -> 254 | case Fun(S) of 255 | {ok, R, S} -> R; % Optimize the case where there is no change 256 | {ok, R, N} -> 257 | Name ! {set_state, N}, 258 | R 259 | end 260 | after 5000 -> 261 | exit(timeout) 262 | end. 263 | 264 | tracker_loop(S) -> 265 | receive 266 | reset -> ?MODULE:tracker_loop(undefined); 267 | stop -> ok; 268 | {get_state, From} -> 269 | From ! {state, S}, 270 | ?MODULE:tracker_loop(S); 271 | {set_state, N} -> 272 | ?MODULE:tracker_loop(N) 273 | end. 274 | -------------------------------------------------------------------------------- /test/meta_cluster.erl: -------------------------------------------------------------------------------- 1 | -module(meta_cluster). 2 | 3 | -include_lib("eqc/include/eqc.hrl"). 4 | -include_lib("eqc/include/eqc_cluster.hrl"). 5 | 6 | -include("dht_eqc.hrl"). 7 | 8 | -compile(export_all). 9 | -define(DRIVER, dht_routing_tracker). 10 | 11 | components() -> [ 12 | dht_routing_meta_eqc, 13 | dht_time_eqc 14 | ]. 
15 | 16 | api_spec() -> api_spec(?MODULE). 17 | 18 | prop_cluster_correct() -> 19 | ?SETUP(fun() -> 20 | eqc_mocking:start_mocking(api_spec(), components()), 21 | fun() -> ok end 22 | end, 23 | ?FORALL(ID, dht_eqc:id(), 24 | ?FORALL(MetaState, dht_routing_meta_eqc:gen_state(ID), 25 | ?FORALL(Cmds, eqc_cluster:commands(?MODULE, [ 26 | {dht_routing_meta_eqc, MetaState}]), 27 | begin 28 | {H,S,R} = run_commands(?MODULE, Cmds), 29 | pretty_commands(?MODULE, Cmds, {H,S,R}, 30 | aggregate(with_title('Commands'), command_names(Cmds), 31 | collect(eqc_lib:summary('Length'), length(Cmds), 32 | collect(eqc_lib:summary('Routing Table Size'), rt_size(S), 33 | aggregate(with_title('Features'), eqc_statem:call_features(H), 34 | features(eqc_statem:call_features(H), 35 | R == ok)))))) 36 | end)))). 37 | 38 | rt_size(Components) -> 39 | V = proplists:get_value(dht_routing_meta_eqc, Components), 40 | length(dht_routing_meta_eqc:current_nodes(V)). 41 | 42 | t() -> t(5). 43 | 44 | t(Secs) -> 45 | eqc:quickcheck(eqc:testing_time(Secs, eqc_statem:show_states(prop_cluster_correct()))). 46 | 47 | recheck() -> 48 | eqc:recheck(eqc_statem:show_states(prop_cluster_correct())). 49 | 50 | cmds() -> 51 | ?LET(ID, dht_eqc:id(), 52 | ?LET({MetaState, TableState, State}, 53 | {dht_routing_meta_eqc:gen_state(ID), 54 | dht_routing_table_eqc:gen_state(ID), 55 | dht_state_eqc:gen_state(ID)}, 56 | eqc_cluster:commands(?MODULE, [ 57 | {dht_state_eqc, State}, 58 | {dht_routing_meta_eqc, MetaState}, 59 | {dht_routing_table_eqc, TableState}]))). 60 | 61 | sample() -> 62 | eqc_gen:sample(cmds()). -------------------------------------------------------------------------------- /test/net_cluster.erl: -------------------------------------------------------------------------------- 1 | -module(net_cluster). 2 | 3 | -include_lib("eqc/include/eqc.hrl"). 4 | -include_lib("eqc/include/eqc_cluster.hrl"). 5 | 6 | -include("dht_eqc.hrl"). 7 | 8 | -compile(export_all). 
9 | 10 | components() -> [ 11 | dht_net_eqc, 12 | dht_store_eqc, 13 | dht_time_eqc 14 | ]. 15 | 16 | api_spec() -> api_spec(?MODULE). 17 | 18 | prop_cluster_correct() -> 19 | ?SETUP(fun() -> 20 | application:load(dht), 21 | eqc_mocking:start_mocking(api_spec(), components()), 22 | fun() -> ok end 23 | end, 24 | ?FORALL(Cmds, 25 | fault_rate(1,40, eqc_cluster:commands(?MODULE)), 26 | begin 27 | ok = dht_net_eqc:reset(), 28 | {H,S,R} = run_commands(?MODULE, Cmds), 29 | pretty_commands(?MODULE, Cmds, {H,S,R}, 30 | aggregate(with_title('Commands'), command_names(Cmds), 31 | collect(eqc_lib:summary('Length'), length(Cmds), 32 | aggregate(with_title('Features'), eqc_statem:call_features(H), 33 | features(eqc_statem:call_features(H), 34 | R == ok))))) 35 | end)). 36 | 37 | t() -> t(15). 38 | 39 | t(Secs) -> 40 | eqc:quickcheck(eqc:testing_time(Secs, eqc_statem:show_states(prop_cluster_correct()))). 41 | 42 | recheck() -> 43 | eqc:recheck(eqc_statem:show_states(prop_cluster_correct())). 44 | 45 | cmds() -> 46 | eqc_cluster:commands(?MODULE). 47 | 48 | sample() -> 49 | eqc_gen:sample(cmds()). 50 | -------------------------------------------------------------------------------- /test/routing_table.erl: -------------------------------------------------------------------------------- 1 | -module(routing_table). 2 | -behaviour(gen_server). 3 | 4 | -include("dht_eqc.hrl"). 5 | 6 | -export([start_link/0]). 7 | -export([reset/3, grab/0]). 8 | -export([ 9 | closest_to/1, 10 | delete/1, 11 | insert/1, 12 | invariant/0, 13 | is_range/1, 14 | members/1, 15 | member_state/1, 16 | node_id/0, 17 | node_list/0, 18 | ranges/0, 19 | space/1 20 | ]). 21 | 22 | -export([init/1, handle_cast/2, handle_call/3, handle_info/2, terminate/2, code_change/3]). 23 | 24 | -record(state, { 25 | table 26 | }). 27 | 28 | start_link() -> 29 | gen_server:start_link({local, ?MODULE}, ?MODULE, [], []). 30 | 31 | grab() -> 32 | gen_server:call(?MODULE, grab). 
33 | 34 | reset(Self, L, H) -> 35 | case whereis(?MODULE) of 36 | undefined -> {ok, _} = start_link(); 37 | P when is_pid(P) -> ok 38 | end, 39 | gen_server:call(?MODULE, {reset, Self, L, H}). 40 | 41 | insert(Node) -> 42 | gen_server:call(?MODULE, {insert, Node}). 43 | 44 | ranges() -> 45 | gen_server:call(?MODULE, ranges). 46 | 47 | delete(Node) -> 48 | gen_server:call(?MODULE, {delete, Node}). 49 | 50 | members(ID) -> 51 | gen_server:call(?MODULE, {members, ID}). 52 | 53 | member_state(Node) -> 54 | gen_server:call(?MODULE, {member_state, Node}). 55 | 56 | node_list() -> 57 | gen_server:call(?MODULE, node_list). 58 | 59 | node_id() -> 60 | gen_server:call(?MODULE, node_id). 61 | 62 | is_range(B) -> 63 | gen_server:call(?MODULE, {is_range, B}). 64 | 65 | closest_to(ID) -> 66 | gen_server:call(?MODULE, {closest_to, ID}). 67 | 68 | invariant() -> 69 | gen_server:call(?MODULE, invariant). 70 | 71 | space(Node) -> 72 | gen_server:call(?MODULE, {space, Node}). 73 | 74 | %% Callbacks 75 | 76 | init([]) -> 77 | {ok, #state{ table = undefined }}. 78 | 79 | handle_cast(_Msg, State) -> 80 | {noreply, State}. 
81 | 82 | handle_call(grab, _From, #state { table = RT } = State) -> 83 | {reply, RT, State}; 84 | handle_call({space, N}, _From, #state { table = RT } = State) -> 85 | {reply, dht_routing_table:space(N, RT), State}; 86 | handle_call({reset, Self, L, H}, _From, State) -> 87 | {reply, ok, State#state { table = dht_routing_table:new(Self, L, H) }}; 88 | handle_call(ranges, _From, #state { table = RT } = State) -> 89 | {reply, dht_routing_table:ranges(RT), State}; 90 | handle_call({insert, Node}, _From, #state { table = RT } = State) -> 91 | {reply, 'ROUTING_TABLE', State#state { table = dht_routing_table:insert(Node, RT) }}; 92 | handle_call({delete, Node}, _From, #state { table = RT } = State) -> 93 | {reply, 'ROUTING_TABLE', State#state { table = dht_routing_table:delete(Node, RT) }}; 94 | handle_call({members, ID}, _From, #state { table = RT } = State) -> 95 | {reply, dht_routing_table:members(ID, RT), State}; 96 | handle_call({member_state, Node}, _From, #state { table = RT } = State) -> 97 | {reply, dht_routing_table:member_state(Node, RT), State}; 98 | handle_call(node_list, _From, #state { table = RT } = State) -> 99 | {reply, dht_routing_table:node_list(RT), State}; 100 | handle_call(node_id, _From, #state { table = RT} = State) -> 101 | {reply, dht_routing_table:node_id(RT), State}; 102 | handle_call({is_range, B}, _From, #state { table = RT } = State) -> 103 | {reply, dht_routing_table:is_range(B, RT), State}; 104 | handle_call({closest_to, ID}, _From, #state { table = RT } = State) -> 105 | {reply, dht_routing_table:closest_to(ID, RT), State}; 106 | handle_call(invariant, _From, #state { table = RT } = State) -> 107 | {reply, check_invariants(dht_routing_table:node_id(RT), RT), State}; 108 | handle_call(_Msg, _From, State) -> 109 | {reply, {error, unsupported}, State}. 110 | 111 | handle_info(_Msg, State) -> 112 | {noreply, State}. 113 | 114 | code_change(_Vsn, State, _Aux) -> 115 | {ok, State}. 116 | 117 | terminate(_What, _State) -> 118 | ok. 
119 | 120 | check_invariants(ID, RT) -> 121 | check([ 122 | check_member_count(ID, RT), 123 | check_contiguous(RT) 124 | ]). 125 | 126 | check([true | Chks]) -> check(Chks); 127 | check([Err | _]) -> Err; 128 | check([]) -> true. 129 | 130 | check_member_count(ID, {routing_table, _, Table}) -> 131 | check_member_count_(ID, Table). 132 | 133 | check_member_count_(_ID, []) -> true; 134 | check_member_count_(ID, [{bucket, Min, Max, Members } | Buckets ]) -> 135 | %% If our own ID falls into a bucket, then there can't be 8 elements in that bucket 136 | Sz = case Min =< ID andalso ID =< Max of 137 | true -> 8; 138 | false -> 8 139 | end, 140 | case length(Members) =< Sz of 141 | true -> check_member_count_(ID, Buckets); 142 | false -> {error, bucket_length} 143 | end. 144 | 145 | check_contiguous({routing_table, _, Table}) -> 146 | check_contiguous_(Table). 147 | 148 | check_contiguous_([]) -> true; 149 | check_contiguous_([{bucket, _Min, _Max, _Members}]) -> true; 150 | check_contiguous_([{bucket, _Low, M1, _Members1}, {bucket, M2, High, Members2} | T]) when M1 == M2 -> 151 | check_contiguous_([{bucket, M2, High, Members2} | T]); 152 | check_contiguous_([_X, _Y | _T]) -> 153 | {error, contiguous}. 154 | -------------------------------------------------------------------------------- /test/state_cluster.erl: -------------------------------------------------------------------------------- 1 | -module(state_cluster). 2 | 3 | -include_lib("eqc/include/eqc.hrl"). 4 | -include_lib("eqc/include/eqc_cluster.hrl"). 5 | 6 | -include("dht_eqc.hrl"). 7 | 8 | -compile(export_all). 9 | -define(DRIVER, dht_routing_tracker). 10 | 11 | components() -> [ 12 | dht_state_eqc, 13 | dht_routing_meta_eqc, 14 | dht_routing_table_eqc, 15 | dht_time_eqc 16 | ]. 17 | 18 | api_spec() -> api_spec(?MODULE).
19 | 20 | prop_cluster_correct() -> 21 | ?SETUP(fun() -> 22 | eqc_mocking:start_mocking(api_spec(), components()), 23 | fun() -> ok end 24 | end, 25 | ?FORALL(ID, dht_eqc:id(), 26 | ?FORALL({MetaState, TableState, State}, 27 | {dht_routing_meta_eqc:gen_state(ID), 28 | dht_routing_table_eqc:gen_state(ID), 29 | dht_state_eqc:gen_state(ID)}, 30 | ?FORALL(Cmds, eqc_cluster:commands(?MODULE, [ 31 | {dht_state_eqc, State}, 32 | {dht_routing_meta_eqc, MetaState}, 33 | {dht_routing_table_eqc, TableState}]), 34 | begin 35 | ok = dht_state_eqc:reset(), 36 | ok = routing_table:reset(ID, ?ID_MIN, ?ID_MAX), 37 | {H,S,R} = run_commands(?MODULE, Cmds), 38 | pretty_commands(?MODULE, Cmds, {H,S,R}, 39 | aggregate(with_title('Commands'), command_names(Cmds), 40 | collect(eqc_lib:summary('Length'), length(Cmds), 41 | collect(eqc_lib:summary('Routing Table Size'), rt_size(S), 42 | aggregate(with_title('Features'), eqc_statem:call_features(H), 43 | features(eqc_statem:call_features(H), 44 | R == ok)))))) 45 | end)))). 46 | 47 | rt_size(Components) -> 48 | V = proplists:get_value(dht_routing_table_eqc, Components), 49 | length(dht_routing_table_eqc:current_nodes(V)). 50 | 51 | t() -> t(15). 52 | 53 | t(Secs) -> 54 | eqc:quickcheck(eqc:testing_time(Secs, eqc_statem:show_states(prop_cluster_correct()))). 55 | 56 | recheck() -> 57 | eqc:recheck(eqc_statem:show_states(prop_cluster_correct())). 58 | 59 | cmds() -> 60 | ?LET(ID, dht_eqc:id(), 61 | ?LET({MetaState, TableState, State}, 62 | {dht_routing_meta_eqc:gen_state(ID), 63 | dht_routing_table_eqc:gen_state(ID), 64 | dht_state_eqc:gen_state(ID)}, 65 | eqc_cluster:commands(?MODULE, [ 66 | {dht_state_eqc, State}, 67 | {dht_routing_meta_eqc, MetaState}, 68 | {dht_routing_table_eqc, TableState}]))). 69 | 70 | sample() -> 71 | eqc_gen:sample(cmds()). 
72 | -------------------------------------------------------------------------------- /test/store_cluster.erl: -------------------------------------------------------------------------------- 1 | -module(store_cluster). 2 | 3 | -include_lib("eqc/include/eqc.hrl"). 4 | -include_lib("eqc/include/eqc_cluster.hrl"). 5 | 6 | -include("dht_eqc.hrl"). 7 | 8 | -compile(export_all). 9 | 10 | components() -> [ 11 | dht_store_eqc, 12 | dht_time_eqc 13 | ]. 14 | 15 | api_spec() -> api_spec(?MODULE). 16 | 17 | prop_cluster_correct() -> 18 | ?SETUP(fun() -> 19 | eqc_mocking:start_mocking(api_spec(), components()), 20 | fun() -> ok end 21 | end, 22 | ?FORALL(Cmds, 23 | fault_rate(1,40, eqc_cluster:commands(?MODULE)), 24 | begin 25 | {H,S,R} = run_commands(?MODULE, Cmds), 26 | pretty_commands(?MODULE, Cmds, {H,S,R}, 27 | aggregate(with_title('Commands'), command_names(Cmds), 28 | collect(eqc_lib:summary('Length'), length(Cmds), 29 | aggregate(with_title('Features'), eqc_statem:call_features(H), 30 | features(eqc_statem:call_features(H), 31 | R == ok))))) 32 | end)). 33 | 34 | t() -> t(15). 35 | 36 | t(Secs) -> 37 | eqc:quickcheck(eqc:testing_time(Secs, eqc_statem:show_states(prop_cluster_correct()))). 38 | 39 | recheck() -> 40 | eqc:recheck(eqc_statem:show_states(prop_cluster_correct())). 41 | 42 | cmds() -> 43 | eqc_cluster:commands(?MODULE). 44 | 45 | sample() -> 46 | eqc_gen:sample(cmds()). 47 | -------------------------------------------------------------------------------- /test/track_cluster.erl: -------------------------------------------------------------------------------- 1 | -module(track_cluster). 2 | 3 | -include_lib("eqc/include/eqc.hrl"). 4 | -include_lib("eqc/include/eqc_cluster.hrl"). 5 | 6 | -include("dht_eqc.hrl"). 7 | 8 | -compile(export_all). 9 | 10 | components() -> [ 11 | dht_track_eqc, 12 | dht_time_eqc 13 | ]. 14 | 15 | api_spec() -> api_spec(?MODULE). 
16 | 17 | prop_cluster_correct() -> 18 | ?SETUP(fun() -> 19 | eqc_mocking:start_mocking(api_spec(), components()), 20 | fun() -> ok end 21 | end, 22 | ?FORALL(Cmds, 23 | fault_rate(1,40, eqc_cluster:commands(?MODULE)), 24 | begin 25 | {H,S,R} = run_commands(?MODULE, Cmds), 26 | pretty_commands(?MODULE, Cmds, {H,S,R}, 27 | aggregate(with_title('Commands'), command_names(Cmds), 28 | collect(eqc_lib:summary('Length'), length(Cmds), 29 | aggregate(with_title('Features'), eqc_statem:call_features(H), 30 | features(eqc_statem:call_features(H), 31 | R == ok))))) 32 | end)). 33 | 34 | t() -> t(15). 35 | 36 | t(Secs) -> 37 | eqc:quickcheck(eqc:testing_time(Secs, eqc_statem:show_states(prop_cluster_correct()))). 38 | 39 | recheck() -> 40 | eqc:recheck(eqc_statem:show_states(prop_cluster_correct())). 41 | 42 | cmds() -> 43 | eqc_cluster:commands(?MODULE). 44 | 45 | sample() -> 46 | eqc_gen:sample(cmds()). 47 | --------------------------------------------------------------------------------