├── tla
│   └── index.tla
├── README.md
├── SIMULATOR.md
└── SPEC.md

/tla/index.tla:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Summary

Specify a minimal NAT traversal library for UDP.

# Description

Program execution is represented by a `behavior`. A behavior is a sequence of `states`. A state is an assignment of values to variables. A program is modeled by a set of behaviors: the behaviors representing all possible executions. These states are described for implementers in markdown, but the correctness of the described algorithms can only be proven using the more formal TLA+ spec.

# Status

The status of this work is `Unstable`.

--------------------------------------------------------------------------------
/SIMULATOR.md:
--------------------------------------------------------------------------------
# Network Simulator

## Motivation

Developing distributed systems is challenging, because distributed programs are not sequential. A network of nodes operates by sending asynchronous messages to each other, and those messages may be dropped or arrive out of order. This means that the events in a system are at best partially ordered, and correctness must be defined over a set of possible partial orders. A correct distributed system must still achieve its goals given any possible message ordering.

Testing such a system simply by running it is extremely complicated, because it's the diversity of real world networks and devices that produces the unpredictable behavior. An easy to set up test rig would just be a collection of cloud servers, which would likely behave much more reliably, so wouldn't be very representative!

Moreover, there can be subtle bugs that happen infrequently on any given node, but in a network with many nodes, happen to some node frequently. Reproducing such a bug on a developer machine might take months, so it would be extremely difficult to debug manually.

Instead, we perform testing in a _network simulation_. This makes it possible to model network behavior in a deterministic way. A simplified abstract model of the network is used. It would be very difficult to make a very realistic network: difficult to define, and difficult to argue that it is realistic. Instead we make a network model that is arguably _worse_ than realistic. For example, instead of predictable latency, the next message to be delivered is just random, and instead of dropping messages based on router buffer over-capacity, we simply drop packets with a configurable uniform random probability.

By using deterministic random message delivery, we sample the space of possible orderings. It might be better to exhaustively explore the ordering space, but it is much simpler to sample it. By running a test many hundreds or thousands of times we can detect bugs that happen infrequently. Then, by replaying those test runs with a deterministic seed, test failures can be reproduced instantly, and debugged.

## Event

An event is a tuple `(ts, fn)`: some code `fn` to be executed at a specific time `ts` in the simulation's execution.
This is used to model timers scheduled to run at specific times, and also to model message delivery by using random times in the near future.

## Queue

The Events go into a single Queue that represents the current state of the simulation. Events are always removed smallest `.ts` value first, but tuples may be added in any order as long as `ts` is greater than that of the last event that has been processed (see the `Queue.ts` property). Processing an event may result in additional events being inserted into the Queue.
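
As an illustration only, the Queue described in this section might be sketched in C as below. The fixed capacity, the `ctx` pointer, the `Log`/`record` helper, and all function names are assumptions made for this sketch, not part of the simulator. It also accepts an event whose `ts` equals the current `Queue.ts`, which is slightly more permissive than the rule stated above.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAPACITY 64 /* assumption: a small fixed capacity keeps the sketch short */

typedef void (*event_fn)(void *ctx);

typedef struct {
  unsigned ts;  /* time at which the event should run */
  event_fn fn;  /* code to execute at that time */
  void *ctx;    /* hypothetical user data handed to fn */
} Event;

typedef struct {
  Event events[QUEUE_CAPACITY];
  size_t length;
  unsigned ts;  /* ts of the event currently being processed */
} Queue;

/* add(ts, fn): schedule fn at time ts. Returns -1 if the queue is full or
   the event would be scheduled before the event being processed. */
static int queue_add(Queue *q, unsigned ts, event_fn fn, void *ctx) {
  assert(q != NULL && fn != NULL);
  if (q->length == QUEUE_CAPACITY || ts < q->ts) return -1;
  q->events[q->length++] = (Event){ ts, fn, ctx };
  return 0;
}

/* drain(ts): process events in ts order until no event with a scheduled
   time <= ts remains. Returns the number of events processed. */
static size_t queue_drain(Queue *q, unsigned ts) {
  size_t processed = 0;
  for (;;) {
    size_t min = q->length; /* index of the earliest due event, if any */
    for (size_t i = 0; i < q->length; i++) {
      if (q->events[i].ts > ts) continue;
      if (min == q->length || q->events[i].ts < q->events[min].ts) min = i;
    }
    if (min == q->length) break; /* nothing due */
    Event e = q->events[min];
    for (size_t i = min; i + 1 < q->length; i++) q->events[i] = q->events[i + 1];
    q->length--;
    q->ts = e.ts; /* set immediately before the event is processed */
    e.fn(e.ctx);  /* may call queue_add and insert further events */
    processed++;
  }
  return processed;
}

/* Hypothetical callback used to observe the order of execution. */
typedef struct { Queue *q; unsigned seen[8]; size_t n; } Log;

static void record(void *ctx) {
  Log *log = ctx;
  if (log->n < 8) log->seen[log->n++] = log->q->ts;
}
```

A linear scan is used instead of a heap because simulated queues in tests tend to be small; a binary heap would be the usual choice if the queue grew large.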

`add(ts, fn)` adds an event tuple, with `fn` to be processed at the specific time `ts`.

`ts` is a property of the Queue which is set to the `ts` of the event to be processed. It is set immediately before the event is processed.

`drain(ts)` processes events in the queue until there are either no more events, or the remaining events have a scheduled time greater than `ts`. All events are processed in order of their attached `ts`.

## Nodes

The test node represents a device on the network.
Once a node has been added to a network, it may send and receive messages.
Nodes also have a "sleeping" state.
Node sleep is used to model laptops suspending and mobile apps being backgrounded,
where the program state of the node is maintained but it does not respond to events.

`onMessage(msg, addr, port)` is a function that represents receiving a message. This method is overridden and used to model the specific behaviour of this particular node.
It is called with the message object, the addr of the sender, and the port sent to. (Note: because p2p holepunching sometimes requires binding many ports, we don't bother to represent binding each socket as a separate object, but just send to and from ports.)

`send(msg, to_addr, from_port)` sends a message to `to_addr` from `from_port`.

`init(ts)` is a method called by the simulation when the simulation starts. Clients and p2p protocols do not passively wait for connections, but actively seek out peers in the network, so they must be given an opportunity to start doing things.

The `timer(delay, repeat, fn)` method schedules events. If `repeat` is zero, the event runs a single time. If `delay` is zero, the event occurs immediately (before `timer()` returns) and then at the `repeat` interval.

If the node is asleep, the event is not processed, and is added to the `awaken` list.
The awaken list represents events that were not processed because the node was asleep.
When the node comes out of sleep, the events are processed until the list is empty or one of the events causes the node to sleep again. If a repeating interval is scheduled and a node sleeps through several iterations of the interval, just a single event is processed at the time the sleep ends. This reflects the behavior of the javascript `setInterval` method. It is the user's responsibility to detect if several cycles have actually passed, if necessary.

## Address

An IPv4 address: a 4 byte integer written in dotted decimal `a.b.c.d` "ip address" notation.

## Network

A network is a derivative of Node, and has a map `subnet` of type Address->Node. (Because a Network is a Node, a network can map to subnetworks. This is used to model Nat behaviour and private networks, see: Nat.)

`add(address, node)` adds a node to the network at `address`. If the network has been initialized then `init(ts)` is called on the node.

`remove(node)` removes the node from the network's address map.

`send(msg, addr, from_port, source_node)` sends a message `msg` from `from_port` on `source_node` to `addr`. First, look up `addr.address` in the address map. If there is a Node at this address in the map, the destination is another peer in the local network. If not, it must be a public address in the public network. In `Network` this is simply considered an error, but the situation is handled in `Nat`.

## Nat

A derivative of Network used as a base to model Network Address Translation.

A Nat instance must have a policy for keying a port mapping, `getKey(dest, source)`,
and a policy for selecting a port to allocate. This may just be random, or it might be sequential. (Sometimes it is difficult to see what port selection policy a NAT is using, so if a p2p system works with random ports then it will work with sequential ports.)
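
To make the keying policy concrete, here is a minimal sketch of two possible `getKey(dest, source)` policies in C. The `Endpoint` type, the string key format, and the function names are assumptions for illustration; the simulator's own key representation is not shown in this document. The two policies correspond to what the text later refers to as IndependentNat and DependantNat, and the dependent variant here keys on the full destination address and port, which is one possible choice.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
  const char *address;  /* dotted decimal, e.g. "10.0.0.2" */
  unsigned short port;
} Endpoint;

/* Endpoint-independent keying: the destination is ignored, so a local
   address:port pair always maps to the same external port. */
static void independent_key(char *out, size_t size, Endpoint dest, Endpoint source) {
  assert(out != NULL && size > 0);
  (void)dest; /* intentionally unused */
  snprintf(out, size, "%s:%u", source.address, (unsigned)source.port);
}

/* Endpoint-dependent keying: the destination is part of the key, so every
   new destination gets its own mapping (and so its own external port). */
static void dependent_key(char *out, size_t size, Endpoint dest, Endpoint source) {
  assert(out != NULL && size > 0);
  snprintf(out, size, "%s:%u->%s:%u",
           source.address, (unsigned)source.port,
           dest.address, (unsigned)dest.port);
}
```

With the independent policy, two messages from the same local socket to different servers produce the same key and therefore share one external port, which is what makes such a NAT easy to traverse.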

Adds {TTL, map, unmap, hairpinning} properties to Network.
`TTL` is the number of milliseconds before an entry in the firewall should be expired.
`map` is a mapping of internal to external ports.
`unmap` is the reverse of map.
`hairpinning` is a boolean. If true, messages from internal nodes may address the public address of the network and they will be delivered to another internal node based on the port. (This behaviour is not usually supported by most real world NATs.)

IP addresses are 4 bytes, and ports are 2 bytes. This gives 4 billion IP addresses, and 65 thousand ports per address. 4 billion isn't enough to easily assign addresses uniquely to every device. However, 65 thousand ports per address is quite a lot, so a network router can share its address by remapping ports. When a Node on a local network sends a new message to the outside network, the router remembers the local address and port, selects a new external port, and maps that external port to that local address.

Different NATs use different algorithms for assigning ports, and this affects the likelihood of that Nat running out of ports, and also the ability of Nodes to establish p2p connections on that Nat. For example, if a NAT assigns a port to a local address:port combination, irrespective of the remote address the Node is sending to, then p2p connections are easy, but the Nat can only do this 65,536 times before running out of ports (see IndependentNat). If the NAT assigns a port based on both the local address:port _and_ the destination port, then it's possible for the Nat to reuse the same port for different Nodes to communicate with different remote servers (see DependantNat).

`send(msg, addr, from_port, source_node)` as in Network, the destination is checked to see if it's part of the local network, and if so delivery is scheduled.
If it is not local, a new port is assigned, using the NAT's specific keying policy: `key = getKey(addr, {address: source_node.address, port: from_port}); port = getPort(); map[key] = port; unmap[port] = {address: source_node.address, port: from_port}`. When delivering local messages to external Nodes, the source port is mapped, but the external destination port remains the same.

When the NAT receives a message, it uses the `unmap` table to figure out what local address to send it to: `dest = unmap[port]`. If the dest is undefined, the message is dropped.
Otherwise, a delivery of the message is scheduled to the port used by the local Node. When delivering external messages to local nodes, the destination port is unmapped, but the source port remains the same.

--------------------------------------------------------------------------------
/SPEC.md:
--------------------------------------------------------------------------------
# Objectives

Specify a minimal NAT traversal library for UDP.

# Specification

This specification also targets UDP. UDP is a [message-oriented][F0] [transport layer protocol][W1], ideal for talking to NATs because unlike TCP, it doesn't require a handshake to start communicating. It also delegates encryption and security responsibility to a higher level protocol or even the application layer, which allows for the broadest set of use cases.

This document includes essential constants, functions, and program states. This document differentiates very intentionally between SHOULD and MUST. This document tries to be concise, but if something isn't clear enough, please open an issue.

## Constants

### `LOCAL_PORT`

The UDP port to bind to.

```c
const uint LOCAL_PORT = 3456;
```

### `TEST_PORT`

The read only port to bind to that will accept inbound data and help determine if a NAT is static.

```c
const uint TEST_PORT = 3457;
```

### `BDP`

The delay, in milliseconds, between packets sent for a birthday paradox connection. 10ms means 100 packets per second.

```c
const uint BDP = 10;
```

### `BDP_MAX_PACKETS`

The maximum number of packets to use when employing the birthday paradox strategy. On average, about 255 packets are sent per successful connection. Giving up after 1000 packets
means 97% of attempts are successful. (It is necessary to give up at some point because the other side might not have done anything, or might have crashed, etc.)

```c
const uint BDP_MAX_PACKETS = 1000;
```

### `CONNECTING_MAX_TIME`

The time that we expect a new connection to take. Do not start another new connection attempt within this time, even if we haven't received a packet yet.

```c
const uint CONNECTING_MAX_TIME = BDP * BDP_MAX_PACKETS;
```

### `KEEP_ALIVE_TIMEOUT`

We tested several NATs (phone hotspot, wifi routers) and found that the firewall port stayed open for 30 seconds, so the keepalive timeout is 29 seconds.
This is expected to cost one `100 byte` keepalive packet `120 times` an hour, which over `24 hours` is `0.288MB` a day per peer.

```c
const uint KEEP_ALIVE_TIMEOUT = 29000; // milliseconds
```

## Data Structures

### `PeerId`

A high entropy key, for example an ed25519 public key, which is 32 bytes or 256 bits.

```c
typedef unsigned char PeerId[32];
```

### `SwarmId`

A high entropy key, for example an ed25519 public key, which is 32 bytes or 256 bits.

```c
typedef unsigned char SwarmId[32];
```

### `NatType`

```c
enum NatType {
  Easy,
  Hard,
  Static
};
```

### `PeerState`

```c
struct PeerState {
  float Forgotten = 5.0;
  float Missing = 3.0;
  float Inactive = 1.5;
  float Active = 0.0;
};
```

### `PeerIdentity`

```c
struct PeerIdentity {
  PeerId id; // this unique identifier
  string address; // a valid IP address
  uint16_t port; // a valid port number
};
```

### `PeerAddress`

```c
struct PeerAddress {
  string address; // a valid IP address
  uint16_t port; // a valid port number
  NatType nat; // the nat type of the peer
};
```

### `Config`

```c
struct Config {
  uint localPort = LOCAL_PORT;
  uint testPort = TEST_PORT;
  uint bdp = BDP;
  NatType nat; // unset (null) until the nat type has been detected
  uint bdpMaxPackets = BDP_MAX_PACKETS;
  uint connecting = CONNECTING_MAX_TIME;
  uint keepAlive = KEEP_ALIVE_TIMEOUT;
  PeerIdentity introducerA;
  PeerIdentity introducerB;
};
```

### `ArgsMessage`

```c
struct ArgsMessage {
  string message;
  string address;
  uint16_t port;
  uint timestamp;
};
```

### `ArgsAddPeer`

```c
struct ArgsAddPeer {
  PeerId id; // the unique identity of the peer
  string address; // the ip address of the peer
  uint16_t port; // the numeric port of the peer
  NatType nat; // the nat type of the peer
  uint16_t outport; // the outgoing ephemeral port of the peer
  uint restart; // timestamp of the last restart
  uint timestamp;
  bool isIntroducer; // if this peer is static
};
```

### `ArgsIntro`

```c
struct ArgsIntro {
  PeerId id;
  string swarm;
  Peer intro;
};
```
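
The `PeerState` values above read as multipliers of the keep-alive interval, and the States section of this document compares the time since the last pong against them. A minimal sketch of that classification, assuming millisecond timestamps and hypothetical names (`PeerStatus`, `classify_peer`), might look like this:

```c
#include <assert.h>

/* Hypothetical enum mirroring the four PeerState thresholds. */
typedef enum { PEER_ACTIVE, PEER_INACTIVE, PEER_MISSING, PEER_FORGOTTEN } PeerStatus;

/* Classify a peer from the time elapsed since its last pong.
   All three arguments are assumed to be in the same unit (milliseconds). */
static PeerStatus classify_peer(unsigned long now, unsigned long last_pong,
                                unsigned long keep_alive) {
  assert(now >= last_pong);
  double elapsed = (double)(now - last_pong);
  double interval = (double)keep_alive;
  if (elapsed > interval * 5.0) return PEER_FORGOTTEN;
  if (elapsed > interval * 3.0) return PEER_MISSING;
  if (elapsed > interval * 1.5) return PEER_INACTIVE;
  return PEER_ACTIVE;
}
```

The comparisons are strict, so a peer whose last pong is exactly 1.5 intervals old still counts as Active in this sketch.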
### `PongState`

```c
struct PongState {
  uint timestamp;
  string address; // the ip address of the peer
  uint port; // the numeric port of the peer
};
```

## Classes

A peer MUST, in some way, implement at least these methods and properties. Regardless of how you implement a peer, it is impossible to demonstrate the reliability of this solution once it is deployed. So it is necessary to write a [TLA+][3] spec as well as to run the code on top of a network simulation. Tests that demonstrate [`Safety`][0] and [`Liveness`][1] properties will need to override the `[init, onMessage, localAddress, createInterval, send]` methods and properties so that they can be run synchronously in a simulation.

```c
class Peer {
  bool isIntroducer = false; // set to true if peer has a static IP address
  bool notified = false; // ensure a peer is only notified once about another peer being added
  string localAddress; // determined by checking the network interfaces
  uint localPort; // set in the configuration
  string publicAddress; // set when a pong is received
  uint publicPort; // set when a pong is received
  NatType nat; // this NatType
  PongState pong; // the state of the last pong
  map peers; // a map of locally known peers
  map swarms; // a map of locally known swarms
  vector connections; // an array of PeerId that

  void addPeer (ArgsAddPeer args);
  void connect (string fromId, string toId, string swarm, uint port);
  void constructor (Config config, uint timestamp);
  void bind (uint port, bool mustBind = false);
  void calculateNat (ArgsCalculateNat args);
  void requestNat (ArgsRequestNat args);
  void init (uint timestamp);
  void intro (ArgsIntro args);
  void localNetworkConnect ();
  void onMsgIntro (ArgsMessage args);
  void onMsgJoin (ArgsMessage args);
  void onMessage (Any data,
                  string address, uint port, uint timestamp);
  void onMsgPing (ArgsMessage args);
  void onMsgPong (ArgsMessage args);
  void onTest (ArgsMessage args);
  void onWakeup (uint timestamp);
  void ping (ArgsPing args);
  void send (ArgsMessage message, PeerAddress address, uint port);
  void retryPing (PeerId id, PeerAddress address);
  void createInterval (uint delay, uint repeat, function cb);
};
```

## States

This section outlines the states of the program.

### Initial

- An instance of the Peer class is constructed
- The UDP ports defined in `Config.localPort` and `Config.testPort` are bound
- When a message is received it will dispatch the corresponding method
  - TODO what message to call what method
- An interval will poll for network interface changes
- IF there is no `Config.keepAlive` specified in the config the function returns
- An additional interval is started after `Config.keepAlive` and repeats every `Config.keepAlive` milliseconds
  - If `currentTime` - `lastPongReceived` > `keepalive` * `5`, the peer is Forgotten
  - If `currentTime` - `lastPongReceived` > `keepalive` * `3`, the peer is Missing
  - If `currentTime` - `lastPongReceived` > `keepalive` * `1.5`, the peer is Inactive
  - Otherwise the peer is considered Active
  - IF there is an interface change, the NAT type is re-evaluated and the function returns
  - IF the time elapsed is greater than a single cycle of the interval
    - FOR every peer in this `.peers` map, send `MsgPing`
    - FOR every swarm in this `.swarms` map, send `MsgJoin`
  - IF the NAT type is not well defined, it is re-evaluated

### NAT Detection

A router's routing table or firewall may drop "unsolicited" packets. So simply binding a port and waiting for connections won't work.
However, a router (even one with a firewall) can be coerced into accepting packets in a perfectly safe way. There are 3 conditions where a NAT (and Firewall) will allow inbound traffic.

1) The user manually configures port forwarding/mapping.

2) The NAT supports/allows a port mapping protocol (UPnP/PMP/PCP).

3) The inbound traffic looks like the response to some prior outbound traffic. This technique is also known as [hole-punching][W0] and involves something like a [STUN][W3] server.

| NAT Type | Description |
| :--- | :--- |
| Static | The nat has a static IP address and does not drop unsolicited packets. |
| Easy | The nat allows a device to use the same port to communicate with other hosts. If you are on an easy NAT, you just have to find out what port you have been given and then other peers will be able to message you on that port. |
| Hard | The nat assigns different (probably random) ports for every other host you communicate with. Since a port cannot be reused, connecting as a hard nat is more complicated. |

#### Execution

Some NATs provide mechanisms for being configured directly. This SHOULD be the first phase of NAT traversal since it's less complex than the phases that will follow. The mechanisms we want to use are Universal Plug and Play, NAT Port Mapping Protocol, and Port Control Protocol (respectively UPnP, NAT-PMP and PCP), which are UDP based port mapping protocols.


Notes for NAT-PMP/PCP:

In 2005 NAT-PMP (RFC [6886][rfc6886]) was widely implemented, but in 2013 it was superseded by PCP (RFC [6887][rfc6887]). PCP builds on NAT-PMP, using the same UDP ports `5350` and `5351`, and a compatible packet format. PCP allows an IPv6 or IPv4 host to control how incoming IPv6 or IPv4 packets are translated and forwarded by a NAT or firewall, and also allows a host to optimize its outgoing NAT keep-alive messages. This is ideal for reducing infrastructure requirements (no rendezvous servers), saving energy, and reducing network chatter from keep alive requests. PCP is widely supported but NAT-PMP will handle most cases related to connecting peers. There are many liberally licensed open source projects that offer reference implementations, for example [libplum][GH02] or [libpcp][GH01].

UDP packets have an 8 byte header with 4 fields (`Source Port`, `Destination Port`, `Length` and `Checksum`), with a maximum payload of 65,507 bytes over IPv4 (according to RFCs [791][rfc791], [1122][rfc1122], and [2460][rfc2460]). Here are examples of the packets needed to instruct NAT-PMP/PCP on how to map addresses and ports.

Earlier implementations of UPnP gained negative attention for security flaws. Many IT administrators still incorrectly assume all NAT port mapping protocols are unsafe. This is why these features are sometimes disabled.

The first step in communicating with the NAT is to request a port mapping. To do this, send a UDP packet to port `5351` of the gateway's internal IP address with the [following format](https://datatracker.ietf.org/doc/html/rfc6886#section-3.3), on an interval of 250ms until it gets a response.

```c
struct request {
  uint8_t version;
  uint8_t opcode; // 1=UDP, 2=TCP
  uint16_t reserved; // Must be 0 (always)
  uint16_t internal_port;
  uint16_t suggested_external_port; // Avoid (not consistently honored by NATs)
  uint32_t lifetime; // The RECOMMENDED Lifetime is 7200 seconds (two hours)
};
```

> As a security note, your protocol should be aware that some poorly implemented NATs will create both UDP and TCP maps regardless of what you ask for.

```c
struct response {
  uint8_t version;
  uint8_t opcode;
  uint16_t result;
  uint32_t epoch_time;
  uint16_t internal_port;
  uint16_t external_port;
  uint32_t lifetime;
};
```

A mapping renewal packet is formatted identically to an original mapping request; from the point of view of the client, it is a renewal of an existing mapping, but from the point of view of a freshly rebooted NAT gateway, it appears as a new mapping request.

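
The request struct above describes the wire layout, but its multi-byte fields travel in network byte order, so writing the struct to a socket directly is not portable. The following is a rough sketch of encoding the 12 byte mapping request by hand. The function and constant names are assumptions for illustration; the version value `0` and the opcodes follow RFC 6886.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NATPMP_REQUEST_SIZE 12 /* 1 + 1 + 2 + 2 + 2 + 4 bytes */

enum { NATPMP_OP_MAP_UDP = 1, NATPMP_OP_MAP_TCP = 2 };

/* Write a 16-bit value in network (big-endian) byte order. */
static void put_u16(uint8_t *p, uint16_t v) {
  p[0] = (uint8_t)(v >> 8);
  p[1] = (uint8_t)(v & 0xff);
}

/* Write a 32-bit value in network (big-endian) byte order. */
static void put_u32(uint8_t *p, uint32_t v) {
  p[0] = (uint8_t)(v >> 24);
  p[1] = (uint8_t)((v >> 16) & 0xff);
  p[2] = (uint8_t)((v >> 8) & 0xff);
  p[3] = (uint8_t)(v & 0xff);
}

/* Encode a mapping request into `out`; returns the number of bytes written.
   Hypothetical helper: sending the bytes to port 5351 is left to the caller. */
static size_t natpmp_encode_request(uint8_t out[NATPMP_REQUEST_SIZE],
                                    uint8_t opcode,
                                    uint16_t internal_port,
                                    uint16_t suggested_external_port,
                                    uint32_t lifetime) {
  assert(opcode == NATPMP_OP_MAP_UDP || opcode == NATPMP_OP_MAP_TCP);
  out[0] = 0;                   /* version: NAT-PMP is version 0 */
  out[1] = opcode;              /* 1=UDP, 2=TCP */
  put_u16(out + 2, 0);          /* reserved, must be 0 */
  put_u16(out + 4, internal_port);
  put_u16(out + 6, suggested_external_port);
  put_u32(out + 8, lifetime);   /* seconds */
  return NATPMP_REQUEST_SIZE;
}
```

For example, `natpmp_encode_request(buf, NATPMP_OP_MAP_UDP, 3456, 0, 7200)` would request a UDP mapping for `LOCAL_PORT` with the recommended two hour lifetime and no suggested external port.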

In the next phase, the NAT type needs to be discovered. This requires a peer (`P0`) to initially bind two ports, `Config.localPort` and `Config.testPort`. In addition, two introducers (`I0`, `I1`) are required; they should reside on separate static peers outside the NAT being tested.

- The `Peer.publicAddress` and `Peer.nat` properties are set to `null`
- `P0` sends `MsgPing` to `I0` and `I1`.
- `I0` and `I1` should respond by sending `MsgPong` to `P0`. The message includes the NAT type, public IP and ephemeral port.
- `I0` and `I1` also respond by sending a message to `P0` on the `Config.testPort`.
- IF `P0` receives a message on `Config.testPort` we know that our NAT type is Static
- Finally, `P0` must calculate the nat type based on the data collected so far

To guard against dropped packets, peers should retry the nat detection process every second until they know their nat.

### `MsgPing`

Sent as a "request" for a `MsgPong` message.
The other peer is expected to respond with a `MsgPong`.
If they do not respond promptly, they are considered to be down.

```c
struct MsgPing {
  string type = "ping"; // the type of the message
  PeerId id; // the unique id of the sending-peer
  NatType nat;
  uint restart; // a unix timestamp specifying uptime of the sending-peer
};
```

#### Receive `MsgPing`

- A message of type `MsgPing` is received
- Respond with a message of type `MsgPong`
  - `.id` MUST be set to this `.id`
  - `.address` MUST be set to the value of `.address` in rinfo
  - `.port` MUST be set to the value of `.port` in rinfo
  - `.nat` MUST be set to this `.nat` property
  - `.restart` MUST be set to this `.restart` property
  - `.timestamp` MUST be set to the timestamp that the message was received

### `MsgPong`

Sent as a "response" to a `MsgPing` message.

```c
struct MsgPong {
  string type = "pong"; // the type of the message
  PeerId id; // the unique id of the sending-peer
  string address; // a string representation of the ip address of the sending-peer
  uint port; // a numeric representation of the port of the sending-peer
  NatType nat;
  uint restart; // a unix timestamp specifying uptime of the sending-peer
  uint timestamp; // a unix timestamp specifying the time the ping message was received
};
```

#### Receive `MsgPong`

- A message of type `MsgPong` is received
- The message properties are added to an object and placed in a locally stored list representing known peers
- The local properties `Peer.pong.timestamp`, `Peer.pong.address` and `Peer.pong.port` are updated using the data received
- This `.recv` is updated with a current timestamp
- call this `.notify` method
- TODO set

### `MsgIntro`

Sent to request an introduction to a target peer.

```c
struct MsgIntro {
  string type = "intro"; // the type of the message
  PeerId id; // the unique id of the sending-peer
  PeerId target; // the unique id of the target peer
  NatType nat; // nat of the sender
  SwarmId swarm; // optional swarm
};
```

#### Receive `MsgIntro`

- A message of type `MsgIntro` is received
- If the `MsgIntro.target` peer is not known, respond with `MsgIntroError`
  - `id` should be set to our `.config.id`
  - `target` should be `MsgIntro.target`
  - `call` should be `'intro'`
- IF both of the IDs (`MsgIntro.target`) and (`MsgIntro.id`) are known locally,
  let `p` be the peer with `id == MsgIntro.target` and
  let `s` be the sender of the `MsgIntro`
  - send a `MsgConnect` back to the `MsgIntro` sender
    - `.id` to our `.config.id`
    - `.target` to `p.id`
    - `.address` to `p.address`
    - `.port` to `p.port`
    - `.nat` to `p.nat`
  - send a `MsgConnect` to `p`
    - `.id` to our `.config.id`
    - `.target` to `s.id`
    - `.address` to `s.address`
    - `.port` to `s.port`
    - `.nat` to `s.nat`

### `MsgLocal`

Sent to establish a connection to another peer on the local network.

```c
struct MsgLocal {
  string type = "local"; // the type of the message
  PeerId id; // the unique id of the sending-peer
  string address; // this local address
  uint16_t port; // this local port
};
```

#### Receive `MsgLocal`

- Call the `.retryPing` method to send a `MsgPing` to the peer with `MsgLocal.id`

### `MsgJoin`

A `MsgJoin` is sent to request to join a swarm. The remote peer will respond with several `MsgConnect` messages for some other random peers that they know in the swarm, or with a `MsgJoinError`.

The sender should actively maintain their membership in the swarm. They should try to maintain an active connection to at least 3 other peers in the swarm.
If their number of connections is less than 3, they should attempt to rejoin the swarm every `config.keepAlive` interval.

```c
struct MsgJoin {
  string type = "join";
  PeerId id; // the id of the sender
  SwarmId swarm; // the id of the swarm
  NatType nat; // the nat type of the sender
  uint peers; // the number of peers in the swarm that the peer knows about
};
```

#### Receive `MsgJoin`

- A message of type `MsgJoin` is received
- The object with the `SwarmId`, `MsgJoin.swarm`, is found on this `.swarms` property OR it is created
  - The object is an associative array where the key is the `SwarmId` and the value is the timestamp that this message was received
- The sender peer must be updated by calling this `.addPeer` method
  - `.id` MUST be set to the `MsgJoin.id` property
  - `.address` MUST be set to the `.address` property of the rinfo
  - `.port` MUST be set to the `.port` property of the rinfo
  - `.nat` MUST be set to `MsgJoin.nat`
  - `.outport` MUST be set to the port this message was received on. (This may differ from `.config.localPort` if this is a BDP connection.)
  - `.restart` MUST be set to `.restart`
  - `.timestamp` MUST be the timestamp this message was received
- IF there are no other peers in the swarm then respond with a `MsgJoinError` and return
  - `.id` MUST be set to `.config.id`
  - `.swarm` MUST be `MsgJoin.swarm`
  - `.peers` MUST be set to the number of peers in the swarm (including the `MsgJoin` sender)
  - `.call` MUST be set to `join`
- Next, randomly select peers for the sender of the `MsgJoin` to connect to.
  - Take the list of peers in the swarm, but remove the sender of the `MsgJoin`
  - Sort the list randomly.
  - If `MsgJoin.nat` is `Hard` then remove peers with a hard nat from the list, unless they have the same address as the `MsgJoin` sender (note, this would mean they are connected to the same wifi, and may then connect locally)
  - Take the first `MsgJoin.peers` from the list and discard the rest (unless there are fewer than `MsgJoin.peers` in the list, then take the whole list)
  - for each peer `p` in the list,
    - send a `MsgConnect` to `p` with `target` set to the `MsgJoin` sender's `id`, `address`, `port` and `nat`, and `swarm`
    - send a `MsgConnect` back to the `MsgJoin` sender, with `target` set to `p`'s `id`, `address`, `port` and `nat`, and `swarm`.

- Send `MsgConnect` randomly to `min(swarm.length, MsgJoin.peers)` peers
  - IF the peer to receive the message has `.nat` of type `Hard`
    - it MUST connect to peers with `.nat` type `Easy` OR peers with the same `.address` (peers on the same nat)
  - IF peers is `0`, the sender of the `MsgJoin` joins the swarm but doesn't send any `MsgConnect`
  - IF there are no other connectable peers

### `MsgRelay`

```c
struct MsgRelay {
  string type = "relay";
  PeerId target; // the id of the peer we want to connect to
  Any content; // most likely a message of any type
};
```

#### Receive `MsgRelay`

- A message of type `MsgRelay` is received
- IF the object of type `Peer` exists in this `.peers` map
  - call this `.send` method
    - the first argument must be set to `MsgRelay.content`
    - the second argument must be set to the `Peer`
    - the third argument must be set to the `Peer.outport` or this `.localPort`

### `MsgConnect`

```c
struct MsgConnect {
  string type = "connect";
  PeerId id; // the introducer's id. The id in the message is always the sender's id.
  PeerId target; // the id of the peer to connect to
  string address; // the address of the target
  NatType nat; // the nat of the target
  uint16_t port; // the port of the target
  SwarmId swarm; // optional
};
```

#### Receive `MsgConnect`

- A message of type `MsgConnect` is received
- IF the message has a `SwarmId`
  - call this `.addPeer` method
    - set TODO
- IF there is a `Peer` with `MsgConnect.target` id in this `.peers` map property
  - IF the address is not the same, assign the peer the address from the message and set its `PongState` to null
  - IF we have sent a packet within `CONNECTING_MAX_TIME` time, then return
  - IF we have sent or received a message within the `KEEP_ALIVE_TIMEOUT`
    - call this `.retryPing` method to be sure it is already connected
      - the first argument MUST be `Peer`
      - the second argument MUST be `MsgConnect.timestamp`
- IF `MsgConnect.address` is equal to this `.publicAddress` property
  - Both peers are on the same local network, so send a `MsgRelay` containing a `MsgLocal` back to `MsgConnect.id`
- IF this `.nat` is `Easy` AND `MsgConnect.nat` is `Easy` OR `MsgConnect.nat` is `Static`
  - call this `.retryPing` method and then return
- IF this `.nat` is `Easy` AND `MsgConnect.nat` is `Hard`
  commence the _easy_ side of a birthday paradox connection (BDP).
  - send many `MsgPing` messages from our main port **to** _unique random ports_ at `MsgConnect.address` (disregard `MsgConnect.port`). On average it will take about 250 packets to get through; sometimes fewer, sometimes more. Sending only 250 packets would mean that the connection succeeds only about half of the time, so it is recommended to send at least 1000 packets, which gives a success rate of about 97%. It is recommended to give up after that, in case the other peer was actually down or didn't try to connect.
  - It is also recommended to send these `MsgPing` packets at a short interval rather than all at once. This gives time for `MsgPong` responses to travel back. If a `MsgPong` is received from `MsgConnect.target` then stop. If more than 1000 packets have been sent, then stop.
- IF this `.nat` is `Hard` AND `MsgConnect.nat` is `Easy`
  commence the _hard_ side of a birthday paradox connection (BDP).
  - send 256 `MsgPing` messages **from** _unique random ports_ to `MsgConnect.port`. Note this means binding 256 ports. The peer may reuse the same ports for other BDP connections. These packets should be sent immediately without waiting. These outgoing packets will open ports in our firewall, so that packets from the easy side can come through. The nat will remap all these ports. A hard nat assigns new ports for each address, so it does not work to simply send to a port that another address has observed. Instead, the easy peer must try to guess a port, and the hard peer opens many ports to make that guess easier.
- IF this `.nat` is `Hard` AND `MsgConnect.nat` is `Hard`
  Unable to create a connection via nat traversal.
  In future versions of this spec, it may be possible to connect peers in this situation by relaying through another peer with an `easy` or `static` nat.


### `MsgTest`

Sent to the `Config.testPort` of a peer as a response to a `MsgPing` message.

```c
struct MsgTest {
  string type = "test"; // the type of the message
  PeerId id; // the unique id of the sending-peer
  string address; // a string representation of the ip address of the sending-peer
  uint port; // a numeric representation of the port of the sending-peer
  NatType nat;
};
```


#### Receive `MsgTest`

This message is received when an introducer sends a message to a peer's `Config.testPort` as the result of receiving a `MsgPing`.

- A message of type `MsgTest` is received
- update the local `PongState` property
- re-calculate the local nat state

# Credit

This work is derived from the work of Bryan Ford, Pyda Srisuresh and Dan Kegel, who published ["Peer-to-Peer Communication Across Network Address Translators"][2].

[0]:https://lamport.azurewebsites.net/tla/proving-safety.pdf
[1]:https://lamport.azurewebsites.net/pubs/liveness.pdf
[2]:https://pdos.csail.mit.edu/papers/p2pnat.pdf
[3]:https://www.microsoft.com/en-us/research/uploads/prod/2018/05/book-02-08-08.pdf

[W0]:https://en.wikipedia.org/wiki/UDP_hole_punching
[W1]:https://en.wikipedia.org/wiki/Transport_layer
[W2]:https://en.wikipedia.org/wiki/Rendezvous_protocol
[W3]:https://en.wikipedia.org/wiki/STUN

[T0]:https://tailscale.com/blog/how-nat-traversal-works
[F0]:https://fossbytes.com/connection-oriented-vs-connection-less-connection/

[B1]:https://www.bittorrent.org/beps/bep_0055.html
[C0]:https://github.com/clostra/libutp
[GH01]:https://github.com/libpcp/pcp
[GH02]:https://github.com/paullouisageneau/libplum

[rfc3022]:https://datatracker.ietf.org/doc/html/rfc3022
[rfc2663]:https://datatracker.ietf.org/doc/html/rfc2663
[rfc6886]:https://datatracker.ietf.org/doc/html/rfc6886
[rfc6887]:https://datatracker.ietf.org/doc/html/rfc6887
[rfc791]:https://datatracker.ietf.org/doc/html/rfc791
[rfc1122]:https://datatracker.ietf.org/doc/html/rfc1122
[rfc2460]:https://datatracker.ietf.org/doc/html/rfc2460

--------------------------------------------------------------------------------