├── tla
    └── index.tla
├── README.md
├── SIMULATOR.md
└── SPEC.md


/tla/index.tla:
--------------------------------------------------------------------------------
1 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Summary
 2 | 
 3 | Specify a minimal NAT traversal library for UDP.
 4 | 
 5 | # Description
 6 | 
 7 | Program execution is represented by a `behavior`. A behavior is a sequence of `states`. A state is an assignment of values to variables. A program is modeled by a set of behaviors: the behaviors representing all possible executions. These states are described for implementers in markdown, but the correctness of the described algorithms can only be proven using the more formal TLA+ spec.
 8 | 
 9 | # Status
10 | 
11 | The status of this work is `Unstable`.
12 | 


--------------------------------------------------------------------------------
/SIMULATOR.md:
--------------------------------------------------------------------------------
 1 | # Network Simulator
 2 | 
 3 | ## Motivation
 4 | 
 5 | Developing distributed systems is challenging because distributed programs are not sequential. A network of nodes operates by sending asynchronous messages to each other, and those messages may be dropped or arrive out of order. This means that the events in a system are at best partially ordered, and correctness must be defined over a set of possible partial orders. A correct distributed system must still achieve its goals given any possible message ordering.
 6 | 
 7 | Testing such a system simply by running it is extremely complicated because it's the diversity of real world networks and devices that produces the unpredictable behavior, and an easy to set up test rig would just be a collection of cloud servers - which would likely behave much more reliably, so wouldn't be very representative!
 8 | 
 9 | Moreover, there can be subtle bugs that happen infrequently on any given node but, in a network with many nodes, happen to some node frequently. Reproducing such a bug on a developer machine might take months, so it would be extremely difficult to debug manually.
10 | 
11 | Instead, we perform testing in a _network simulation_. This makes it possible to model network behavior in a deterministic way. A simplified abstract model of the network is used. A very realistic network model would be difficult both to define and to justify as realistic. Instead we make a network model that is arguably _worse_ than realistic. For example, instead of predictable latency, the next message to be delivered is chosen at random, and instead of dropping messages when router buffers are over capacity, packets are simply dropped with a configurable uniform random probability.
12 | 
13 | By using deterministic random message delivery, we sample the space of possible orderings. It might be better to exhaustively explore the ordering space, but it is much simpler to sample it. By running a test many hundreds or thousands of times we can detect bugs that happen infrequently. Then, by replaying those test runs with a deterministic seed, test failures can be reproduced instantly and debugged.
14 | 
15 | ## Event
16 | 
17 | An event is a tuple `(ts, fn)`: some code `fn` to be executed at a specific time `ts` in the simulation's execution.
18 | This is used to model timers scheduled to run at specific times, and also to model message delivery by using random times in the near future.
19 |  
20 | ## Queue
21 | 
22 | 
23 | The Events go into a single Queue that holds the current state of the simulation. Events are always removed smallest `.ts` value first, but tuples may be added in random order as long as `ts` is greater than that of the last event that has been processed (see the `Queue.ts` property). Processing an event may result in additional events being inserted into the Queue.
24 | 
25 | `add(ts, fn)` adds an event tuple, with `fn` to be processed at specific time `ts`.
26 | 
27 | `ts` is a property of the Queue which is set to the `ts` of the event to be processed. It is set immediately before the event is processed.
28 | 
29 | `drain(ts)` processes events in the queue, until there are either no more events, or the remaining events have a scheduled time greater than `ts`. All events are processed in order of their attached `ts`.
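
The Queue described above can be sketched in C as follows. This is a minimal illustration, not the simulator's actual implementation: the names `Queue`, `queue_add`, `queue_drain`, `SeenLog`, `record_ts` and the fixed `QUEUE_CAPACITY` are all hypothetical, the event callback receives an extra context pointer, and an event scheduled at exactly the current `ts` is accepted (an assumption made so that zero-delay timers can be modeled).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_CAPACITY 64 /* hypothetical fixed capacity for this sketch */

typedef struct Queue Queue;
typedef void (*EventFn)(Queue *q, void *ctx);

typedef struct { uint64_t ts; EventFn fn; void *ctx; } Event;

struct Queue {
  uint64_t ts;  /* ts of the event currently (or most recently) processed */
  size_t   len; /* number of pending events */
  Event    events[QUEUE_CAPACITY];
};

/* Schedule fn at time ts. Returns 0 on success, or -1 if ts lies before
   the last processed event or the queue is full. */
int queue_add(Queue *q, uint64_t ts, EventFn fn, void *ctx) {
  assert(fn != NULL);
  if (ts < q->ts || q->len == QUEUE_CAPACITY) return -1;
  q->events[q->len++] = (Event){ ts, fn, ctx };
  return 0;
}

/* Process every event with ts <= until, smallest ts first. Handlers may
   call queue_add, so the minimum is searched again after each event.
   Events with equal ts are processed in no particular order. */
void queue_drain(Queue *q, uint64_t until) {
  for (;;) {
    size_t min = q->len; /* sentinel: no eligible event found yet */
    for (size_t i = 0; i < q->len; i++) {
      if (q->events[i].ts > until) continue;
      if (min == q->len || q->events[i].ts < q->events[min].ts) min = i;
    }
    if (min == q->len) return;
    Event e = q->events[min];
    q->events[min] = q->events[--q->len]; /* unordered removal */
    q->ts = e.ts; /* set immediately before the event is processed */
    e.fn(q, e.ctx);
  }
}

/* Example handler: records the queue time at which it ran. */
typedef struct { uint64_t seen[8]; size_t n; } SeenLog;

void record_ts(Queue *q, void *ctx) {
  SeenLog *seen = ctx;
  if (seen->n < 8) seen->seen[seen->n++] = q->ts;
}
```

A linear scan keeps the sketch short; a binary heap would be the usual choice once the number of pending events grows.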
30 | 
31 | ## Nodes
32 | 
33 | The test node represents a device on the network.
34 | Once a node has been added to a network, it may send and receive messages.
35 | Nodes also have a "sleeping" state.
36 | Node sleep is used to model laptops suspending and mobile apps being backgrounded,
37 | where the program state of the node is maintained but they do not respond to events.
38 | 
39 | `onMessage(msg, addr, port)` function to represent receiving a message. This method is overridden and used to model the specific behaviour of this particular node.
40 | It is called with the message object, the addr of the sender, and the port sent to. (note, because p2p holepunching sometimes requires binding many ports we don't bother to represent binding each socket as a separate object, but just sending to and from ports)
41 | 
42 | `send(msg, to_addr, from_port)` send a message to `to_addr` from `from_port`.
43 | 
44 | `init(ts)` a method called by the simulation when the simulation starts. Clients and p2p protocols do not passively wait for connections, but actively seek out peers in the network, so must be given an opportunity to start doing things.
45 | 
46 | The `timer(delay, repeat, fn)` method schedules events. If `repeat` is zero, the event runs a single time. If `delay` is zero, the event occurs immediately (before `timer()` returns) and then at the `repeat` interval.
47 | 
48 | If the node is asleep, the event is not processed and is added to the `awaken` list.
49 | The awaken list represents events that were not processed because the node was asleep.
50 | When the node comes out of sleep, the events are processed until the list is empty or one of the events causes the node to sleep again. If a repeating interval is scheduled and a node sleeps through several iterations of the interval, just a single event is processed at the time the sleep ends. This reflects the behavior of the JavaScript `setInterval` method. It is the user's responsibility to detect whether several cycles have actually elapsed, if necessary.
51 | 
52 | ## Address
53 | 
54 | An IPv4 address: a 4 byte integer written in dotted decimal `a.b.c.d` "ip address" notation.
55 | 
56 | ## Network
57 | 
58 | A network is a derivative of Node, and has a `subnet` map of type Address->Node. (Because a Network is a Node, a network can map to subnetworks. This is used to model NAT behaviour and private networks; see Nat.)
59 | 
60 | `add(address, node)` add an address to the network. If the network has been initialized then call `init(ts)` on the node.
61 | 
62 | `remove(node)` remove node from the network's address map.
63 | 
64 | `send(msg, addr, from_port, source_node)` send a message `msg` from `from_port` on `source_node` to `addr`. First, lookup `addr.address` in the address map - if there is a Node at this address in the map, the destination is another peer in the local network. If not, it must be a public address in the public network. In `Network` this is simply considered an error, but the situation is handled in `Nat`.
65 | 
66 | ## Nat
67 | 
68 | A derivative of Network used as a base to model Network Address Translation.
69 | 
70 | A Nat instance must have a policy for keying a port mapping, `getKey(dest, source)`,
71 | and a policy for selecting a port to allocate. This may just be random, or it might be sequential. (Sometimes it is difficult to see what port selection policy a NAT is using, so if a p2p system works with random ports then it will work with sequential ports.)
72 | 
73 | Adds `{TTL, map, unmap, hairpinning}` properties to Network.
74 | `TTL` is the number of milliseconds before an entry in the firewall should be expired.
75 | `map` is a mapping of internal to external ports.
76 | `unmap` is the reverse of map.
77 | `hairpinning` is a boolean. If true, messages from internal nodes may address the public address of the network and they will be delivered to another internal node based on the port. (This behaviour is not usually supported by most real world NATs.)
78 | 
79 | IP addresses are 4 bytes, and ports are 2 bytes. This gives about 4 billion IP addresses, and about 65 thousand ports per address. 4 billion isn't enough to easily assign addresses uniquely to every device. However, 65 thousand ports per address is quite a lot, so a network router can share its address by remapping ports. When a Node on a local network sends a new message to the outside network, the router remembers the local address and port, selects a new external port, and maps that external port to that local address.
80 | 
81 | Different NATs use different algorithms for assigning ports, and this affects the likelihood of that NAT running out of ports, and also the ability of Nodes to establish p2p connections through that NAT. For example, if a NAT assigns a port to a local address:port combination, irrespective of the remote address the Node is sending to, then p2p connections are easy, but the NAT can only do this 65,536 times before running out of ports (see IndependentNat). If the NAT assigns a port based on both the local address:port _and_ the destination port, then it's possible for the NAT to reuse the same port for different Nodes to communicate with different remote servers (see DependantNat).
82 | 
83 | `send(msg, addr, from_port, source_node)` as in Network, the destination is checked if it's part of the local network and delivery is scheduled. If it is not local, a new port is assigned, using the NAT's specific keying policy: `key = getKey(addr, {address: source_node.address, port: from_port}); port = getPort(); map[key] = port; unmap[port] = {address: source_node.address, port: from_port}`. When delivering local messages to external Nodes, the source port is mapped, but the external destination port remains the same.
84 | 
85 | When the NAT receives a message, it uses the `unmap` table to figure out what local address to send it to. `dest = unmap[port]` if the dest is undefined, the message is dropped.
86 | Otherwise, a delivery of the message is scheduled to the port used by the local Node. When delivering external messages to local nodes, the destination port is unmapped, but the source port remains the same.
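
The `map` and `unmap` behaviour above can be illustrated with a small C sketch. It is written under several assumptions and is not the simulator's code: all names (`Nat`, `NatEntry`, `Endpoint`, `nat_map`, `nat_unmap`, `NAT_MAX_ENTRIES`) are hypothetical, ports are allocated sequentially, an existing mapping is reused when the key has been seen before, the dependent policy keys on the whole destination endpoint, and `TTL` expiry and hairpinning are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NAT_MAX_ENTRIES 32 /* hypothetical table size for this sketch */

typedef struct { uint32_t address; uint16_t port; } Endpoint;

typedef struct {
  Endpoint local;    /* sender inside the NAT */
  Endpoint dest;     /* remote endpoint, only compared when dependent */
  uint16_t external; /* external port allocated for this key */
} NatEntry;

typedef struct {
  bool     dependent; /* true: the key also includes the destination */
  uint16_t next_port; /* sequential port selection policy */
  size_t   len;
  NatEntry entries[NAT_MAX_ENTRIES];
} Nat;

bool same_endpoint(Endpoint a, Endpoint b) {
  return a.address == b.address && a.port == b.port;
}

/* Outbound: return the external port for (local, dest), allocating a
   new one if this key has not been seen. Returns 0 if the table is full. */
uint16_t nat_map(Nat *nat, Endpoint local, Endpoint dest) {
  for (size_t i = 0; i < nat->len; i++) {
    const NatEntry *e = &nat->entries[i];
    if (same_endpoint(e->local, local) &&
        (!nat->dependent || same_endpoint(e->dest, dest)))
      return e->external;
  }
  if (nat->len == NAT_MAX_ENTRIES) return 0;
  NatEntry *e = &nat->entries[nat->len++];
  e->local = local;
  e->dest = dest;
  e->external = nat->next_port++;
  return e->external;
}

/* Inbound: find the local endpoint behind an external port. Returns
   false when there is no mapping, in which case the message is dropped. */
bool nat_unmap(const Nat *nat, uint16_t external, Endpoint *out) {
  assert(out != NULL);
  for (size_t i = 0; i < nat->len; i++) {
    if (nat->entries[i].external == external) {
      *out = nat->entries[i].local;
      return true;
    }
  }
  return false;
}
```

With `dependent` set to false, one local address:port keeps a single external port for every destination, which is the behaviour the spec calls an easy NAT. With `dependent` set to true, each new destination gets a fresh port.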


--------------------------------------------------------------------------------
/SPEC.md:
--------------------------------------------------------------------------------
  1 | # Objectives
  2 | 
  3 | Specify a minimal NAT traversal library for UDP.
  4 | 
  5 | # Specification
  6 | 
  7 | This specification also targets UDP. UDP is a [message-oriented][F0] [transport layer protocol][W1], ideal for talking to NATs because unlike TCP, it doesn't require a handshake to start communicating. It also delegates encryption and security responsibility to a higher level protocol or even the application layer, which allows for the broadest set of use cases.
  8 | 
  9 | This document includes essential constants, functions, and program states. This document differentiates very intentionally between SHOULD and MUST. This document tries to be concise, but if something isn't clear enough, please open an issue.
 10 | 
 11 | ## Constants
 12 | 
 13 | ### `LOCAL_PORT`
 14 | 
 15 | The UDP port to bind to.
 16 | 
 17 | ```c
 18 | const uint LOCAL_PORT = 3456;
 19 | ```
 20 | 
 21 | ### `TEST_PORT`
 22 | 
 23 | The read only port to bind to that will accept inbound data and help determine if a NAT is static.
 24 | 
 25 | ```c
 26 | const uint TEST_PORT = 3457;
 27 | ```
 28 | 
 29 | ### `BDP`
 30 | 
 31 | The delay in milliseconds between packets sent for a birthday paradox connection. A delay of 10ms means 100 packets per second.
 32 | 
 33 | ```c
 34 | const uint BDP = 10;
 35 | ```
 36 | 
 37 | ### `BDP_MAX_PACKETS`
 38 | 
 39 | The maximum number of packets to use when employing the birthday paradox strategy. On average, about 255 packets are sent per successful connection (giving up after 1000 packets
 40 | means 97% of attempts are successful). It is necessary to give up at some point because the other side might not have done anything, or might have crashed, etc.
 41 | 
 42 | ```c
 43 | const uint BDP_MAX_PACKETS = 1000;
 44 | ```
 45 | 
 46 | ### `CONNECTING_MAX_TIME`
 47 | 
 48 | The time that we expect a new connection to take. Do not start another new connection attempt within this time, even if we haven't received a packet yet.
 49 | 
 50 | ```c
 51 | const uint CONNECTING_MAX_TIME = BDP * BDP_MAX_PACKETS;
 52 | ```
 53 | 
 54 | ### `KEEP_ALIVE_TIMEOUT`
 55 | 
 56 | We tested several NATs (phone hotspot, wifi routers) and found that the firewall port stayed open for 30 seconds, so the keepalive timeout is 29 seconds.
 57 | This is expected to cost one `100 byte` keepalive packet about `120 times` an hour; over `24 hours` that is about `0.288MB` a day per peer.
 58 | 
 59 | ```c
 60 | const uint KEEP_ALIVE_TIMEOUT = 29_000;
 61 | ```
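
The keepalive cost estimate can be checked with a few lines of C. This is only a sketch of the arithmetic; the function name `keepalive_bytes_per_day` is hypothetical. The `0.288MB` figure corresponds to rounding the interval to 30 seconds, and the actual 29 second timeout costs slightly more.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per day spent on keepalive packets for a single peer, given the
   packet size in bytes and the keepalive interval in milliseconds. */
uint64_t keepalive_bytes_per_day(uint32_t packet_bytes, uint32_t interval_ms) {
  assert(interval_ms > 0);
  const uint64_t ms_per_day = 24ull * 60 * 60 * 1000;
  return (ms_per_day / interval_ms) * packet_bytes;
}
```

For example, `keepalive_bytes_per_day(100, 30000)` gives 288000 bytes, and `keepalive_bytes_per_day(100, 29000)` gives 297900 bytes.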
 62 | 
 63 | ## Data Structures
 64 | 
 65 | ### `PeerId`
 66 | 
 67 | A high entropy key, for example an Ed25519 public key, which is 32 bytes or 256 bits.
 68 | 
 69 | ```c
 70 | typedef unsigned char PeerId[32];
 71 | ```
 72 | 
 73 | ### `SwarmId`
 74 | 
 75 | A high entropy key, for example an Ed25519 public key, which is 32 bytes or 256 bits.
 76 | 
 77 | ```c
 78 | typedef unsigned char SwarmId[32];
 79 | ```
 80 | 
 81 | ### `NatType`
 82 | 
 83 | ```c
 84 | enum NatType {
 85 |   Easy,
 86 |   Hard,
 87 |   Static
 88 | };
 89 | ```
 90 | 
 91 | ### `PeerState`
 92 | 
 93 | ```c
 94 | struct PeerState {
 95 |   float Forgotten = 5;
 96 |   float Missing = 3.0;
 97 |   float Inactive = 1.5;
 98 |   float Active = 0.0;
 99 | };
100 | ```
101 | 
102 | ### `PeerIdentity`
103 | 
104 | ```c
105 | struct PeerIdentity {
106 |   PeerId id; // this unique identifier
107 |   string address; // a valid IP address
108 |   uint16_t port; // a valid port number
109 | };
110 | ```
111 | 
112 | ### `PeerAddress`
113 | 
114 | ```c
115 | struct PeerAddress {
116 |   string address; // a valid IP address
117 |   uint16_t port; // a valid port number
118 |   NatType nat; // the nat type of the peer
119 | };
120 | ```
121 | 
122 | ### `Config`
123 | 
124 | ```c
125 | struct Config {
126 |   localPort: LOCAL_PORT;
127 |   testPort: TEST_PORT;
128 |   bdp: BDP;
129 |   nat: null;
130 |   bdpMaxPackets: BDP_MAX_PACKETS;
131 |   connecting: CONNECTING_MAX_TIME;
132 |   keepAlive: KEEP_ALIVE_TIMEOUT;
133 |   introducerA: PeerIdentity iA;
134 |   introducerB: PeerIdentity iB;
135 | };
136 | ```
137 | 
138 | ### `ArgsMessage`
139 | 
140 | ```c
141 | struct ArgsMessage {
142 |   string message;
143 |   string address;
144 |   uint16_t port;
145 |   uint timestamp;
146 | };
147 | ```
148 | 
149 | ### `ArgsAddPeer`
150 | 
151 | ```c
152 | struct ArgsAddPeer {
153 |   PeerId id; // the unique identity of the peer
154 |   string address; // the ip address of the peer
155 |   uint16_t port; // the numeric port of the peer
156 |   NatType nat; // the nat type of the peer
157 |   uint16_t outport; // the outgoing ephemeral port of the peer
158 |   uint restart; // timestamp of the last restart
159 |   uint timestamp;
160 |   bool isIntroducer; // if this peer is static
161 | };
162 | ```
163 | 
164 | ### `ArgsIntro`
165 | 
166 | ```c
167 | struct ArgsIntro {
168 |   PeerId id;
169 |   string swarm;
170 |   Peer intro;
171 | };
172 | ```
173 | 
174 | ### `PongState`
175 | 
176 | ```c
177 | struct PongState {
178 |   uint timestamp;
179 |   string address; // the ip address of the peer
180 |   uint port; // the numeric port of the peer
181 | };
182 | ```
183 | 
184 | ## Classes
185 | 
186 | A peer MUST, in some way, implement at least these methods and properties. Regardless of how you implement a peer, it is impossible to demonstrate the reliability of this solution when deployed. So it is necessary to specify a [TLA+][3] spec as well as run the code on top of a network simulation. Tests that demonstrate [`Safety`][0] and [`Liveness`][1] properties will need to override the `[init, onMessage, localAddress, createInterval, send]` methods and properties so that they can be run synchronously in a simulation.
187 | 
188 | ```c
189 | class Peer {
190 |   bool isIntroducer = false; // set to true if peer has a static IP address
191 |   bool notified = false; // ensure a peer is only notified once about another peer being added
192 |   string localAddress; // determined by checking the network interfaces
193 |   uint localPort; // set in the configuration
194 |   string publicAddress; // set when a pong is received
195 |   uint publicPort; // set when a pong is received
196 |   NatType nat; // this NatType
197 |   PongState pong; // the state of the last pong
198 |   map<PeerId, Peer*> peers; // a map of locally known peers
199 |   map<SwarmId, Swarm*> swarms; // a map of locally known swarms
200 |   vector<PeerId> connections; // an array of PeerId that
201 | 
202 |   void addPeer (ArgsAddPeer args);
203 |   void connect (string fromId, string toId, string swarm, uint port);
204 |   void constructor (Config config, uint timestamp);
205 |   void bind (uint port, bool mustBind = false);
206 |   void calculateNat (ArgsCalculateNat args);
207 |   void requestNat (ArgsRequestNat args);
208 |   void init (uint timestamp);
209 |   void intro (ArgsIntro args);
210 |   void localNetworkConnect ();
211 |   void onMsgIntro (ArgsMessage args);
212 |   void onMsgJoin (ArgsMessage args);
213 |   void onMessage (Any data, string address, uint port, uint timestamp);
214 |   void onMsgPing (ArgsMessage args);
215 |   void onMsgPong (ArgsMessage args);
216 |   void onTest (ArgsMessage args);
217 |   void onWakeup (uint timestamp);
218 |   void ping (ArgsPing args);
219 |   void send (ArgsMessage message, PeerAddress address, uint port);
220 |   void retryPing (PeerId id, PeerAddress address);
221 |   void createInterval (uint delay, uint repeat, function<void(uint timestamp)> cb);
222 | };
223 | ```
224 | 
225 | ## States
226 | 
227 | This section outlines the states of the program.
228 | 
229 | ### Initial
230 | 
231 | - An instance of the Peer class is constructed
232 |   - The UDP ports defined in `Config.localPort` and `Config.testPort` are bound
233 |     - When a message is received it will dispatch the corresponding method
234 |       - TODO what message to call what method
235 |   - An interval will poll for network interface changes
236 |     - IF there is no `Config.keepAlive` specified in the config the function returns
237 |     - An additional interval is started after `Config.keepAlive` and repeats every `Config.keepAlive` milliseconds
238 |       - If `currentTime` - `lastPongReceived` > `keepalive` * `5`, the peer is Forgotten
239 |       - If `currentTime` - `lastPongReceived` > `keepalive` * `3`, the peer is Missing
240 |       - If `currentTime` - `lastPongReceived` > `keepalive` * `1.5`, the peer is Inactive
241 |       - Otherwise the peer is considered Active
242 |     - IF there is an interface change, the NAT type is re-evaluated and the function returns
243 |     - IF the time elapsed is greater than a single cycle of the interval
244 |       - FOR every peer in this `.peers` map, send `MsgPing`
245 |       - FOR every swarm in this `.swarms` map, send `MsgJoin`
246 |     - IF the NAT type is not well defined, it is re-evaluated
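
The keepalive thresholds in the list above can be sketched as a small C function. This is an illustration under assumptions: the names `PeerStatus` and `classify_peer` are hypothetical, all times are taken to be in milliseconds, and the `1.5` multiplier is evaluated with integer arithmetic to avoid floating point.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
  PEER_ACTIVE, PEER_INACTIVE, PEER_MISSING, PEER_FORGOTTEN
} PeerStatus;

/* Classify a peer by the time elapsed since its last pong, checking the
   largest threshold first so that each peer gets exactly one status. */
PeerStatus classify_peer(uint64_t now_ms, uint64_t last_pong_ms,
                         uint64_t keep_alive_ms) {
  assert(keep_alive_ms > 0);
  uint64_t elapsed = now_ms > last_pong_ms ? now_ms - last_pong_ms : 0;
  if (elapsed > keep_alive_ms * 5) return PEER_FORGOTTEN;
  if (elapsed > keep_alive_ms * 3) return PEER_MISSING;
  if (elapsed * 2 > keep_alive_ms * 3) return PEER_INACTIVE; /* > 1.5x */
  return PEER_ACTIVE;
}
```

With `KEEP_ALIVE_TIMEOUT` at 29000, a peer becomes Inactive after more than 43.5 seconds without a pong, Missing after more than 87 seconds, and Forgotten after more than 145 seconds.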
247 | 
248 | ### NAT Detection
249 | 
250 | A router's routing table or firewall may drop "unsolicited" packets. So simply binding a port and waiting for connections won't work. However, a router (even one with a firewall) can be coerced into accepting packets in a perfectly safe way. There are 3 conditions where a NAT (and Firewall) will allow inbound traffic.
251 | 
252 | 1) The user manually configures port forwarding/mapping.
253 | 
254 | 2) The NAT supports/allows a port mapping protocol (uPnP/PMP/PCP).
255 | 
256 | 3) The inbound traffic looks like the response to some prior outbound traffic. This technique is also known as [hole-punching][W0] and involves something like a [STUN][W3] server.
257 | 
258 | | NAT Type | Description |
259 | | :---     | :---        |
260 | | Static   | The nat has a static IP address and does not drop unsolicited packets. |
261 | | Easy     | The nat allows a device to use the same port to communicate with other hosts. If you are on an easy NAT, you just have to find out what port you have been given and then other peers will be able to message you on that port. |
262 | | Hard     | The nat assigns different (probably random) ports for every other host you communicate with. Since a port cannot be reused, connecting as a hard nat is more complicated. |
263 | 
264 | #### Execution
265 | 
266 | Some NATs provide mechanisms for being configured directly. This SHOULD be the first phase of NAT traversal since it's less complex than the phases that will follow. The mechanisms we want to use are Universal Plug and Play, NAT-Port Mapping Protocol, and Port Control Protocol (respectively, uPnP, NAT-PMP and PCP), UDP based port mapping protocols.
267 | 
268 | <details>
269 | 
270 | <summary>Notes for NAT-PMP/PCP (Click to Expand)</summary>
271 | 
272 | In 2005 NAT-PMP (RFC [6886][rfc6886]) was widely implemented, but in 2013 it was superseded by PCP (RFC [6887][rfc6887]). PCP builds on NAT-PMP, using the same UDP ports `5350` and `5351`, and a compatible packet format. PCP allows an IPv6 or IPv4 host to control how incoming IPv6 or IPv4 packets are translated and forwarded by a NAT or firewall, and also allows a host to optimize its outgoing NAT keep-alive messages. This is ideal for reducing infrastructure requirements (no rendezvous servers), saving energy, and reducing network chatter from keep alive requests. PCP is widely supported but NAT-PMP will handle most cases related to connecting peers. There are many liberally licensed open source projects that offer reference implementations, for example [libplum][GH02] or [libpcp][GH01].
273 | 
274 | UDP packets have an 8 byte header with 4 fields (`Source Port`, `Destination Port`, `Length` and `Checksum`) and a maximum payload of about 65 KB (65,507 bytes over IPv4), according to RFCs [791][rfc791], [1122][rfc1122], and [2460][rfc2460]. Here are examples of the packets needed to instruct NAT-PMP/PCP on how to map addresses and ports.
275 | 
276 | Earlier implementations of uPnP gained negative attention for security flaws. Many IT administrators still incorrectly assume all NAT port mapping protocols are unsafe. This is why these features are sometimes disabled.
277 | 
278 | The first step in communicating with the NAT is to request a port mapping. To do this, send a UDP packet to port `5351` of the gateway's internal IP address with the [following format](https://datatracker.ietf.org/doc/html/rfc6886#section-3.3), on an interval of 250ms until it gets a response.
279 | 
280 | ```c
281 | struct request {
282 |   uint8_t version;
283 |   uint8_t opcode; // 1=UDP, 2=TCP
284 |   uint16_t reserved; // Must be 0 (always)
285 |   uint16_t internal_port;
286 |   uint16_t suggested_external_port; // Avoid (not consistently honored by NATs)
287 |   uint32_t lifetime; // The RECOMMENDED Lifetime is 7200 seconds (two hours)
288 | };
289 | ```
290 | 
291 | > As a security note, your protocol should be aware that some poorly implemented NATs will create both UDP and TCP maps regardless of what you ask for.
292 | 
293 | ```c
294 | struct response {
295 |   uint8_t version;
296 |   uint8_t opcode;
297 |   uint16_t result;
298 |   uint32_t epoch_time;
299 |   uint16_t internal_port;
300 |   uint16_t external_port;
301 |   uint32_t lifetime;
302 | };
303 | ```
304 | 
305 | A mapping renewal packet is formatted identically to an original mapping request; from the point of view of the client, it is a renewal of an existing mapping, but from the point of view of the freshly rebooted NAT gateway, it appears as a new mapping request.
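
As an illustration of the request layout above, the following C sketch serializes a mapping request into the 12 bytes sent to the gateway. Multi-byte fields are written in network byte order (big-endian) and the version is 0, as NAT-PMP specifies. The function and constant names (`natpmp_pack_request`, `put_u16`, `put_u32`, `NATPMP_OP_MAP_UDP`, and so on) are hypothetical and not taken from any library. Sending the packet and retrying are out of scope here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
  NATPMP_OP_MAP_UDP = 1,
  NATPMP_OP_MAP_TCP = 2,
  NATPMP_REQUEST_SIZE = 12
};

/* Write a 16 bit value in network byte order. */
void put_u16(uint8_t *p, uint16_t v) {
  p[0] = (uint8_t)(v >> 8);
  p[1] = (uint8_t)(v & 0xff);
}

/* Write a 32 bit value in network byte order. */
void put_u32(uint8_t *p, uint32_t v) {
  put_u16(p, (uint16_t)(v >> 16));
  put_u16(p + 2, (uint16_t)(v & 0xffff));
}

/* Fill `out` (at least NATPMP_REQUEST_SIZE bytes) with a mapping
   request and return the number of bytes written. */
size_t natpmp_pack_request(uint8_t *out, uint8_t opcode,
                           uint16_t internal_port,
                           uint16_t suggested_external_port,
                           uint32_t lifetime_seconds) {
  assert(out != NULL);
  out[0] = 0;          /* version: NAT-PMP uses version 0 */
  out[1] = opcode;     /* 1=UDP, 2=TCP */
  put_u16(out + 2, 0); /* reserved, must be 0 */
  put_u16(out + 4, internal_port);
  put_u16(out + 6, suggested_external_port);
  put_u32(out + 8, lifetime_seconds);
  return NATPMP_REQUEST_SIZE;
}
```

Writing the bytes explicitly avoids relying on struct padding or host endianness, which a direct `memcpy` of the `struct request` would depend on.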
306 | 
307 | </details>
308 | 
309 | In the next phase, the NAT type needs to be discovered. This requires a peer (`P0`) to initially bind two ports, `Config.localPort` and `Config.testPort`. In addition, two introducers (`I0`, `I1`) are required, they should reside on separate static peers outside the NAT being tested.
310 | 
311 | - The `Peer.publicAddress` and `Peer.nat` properties are set to `null`
312 | - `P0` sends `MsgPing` to `I0` and `I1`.
313 | - `I0` and `I1` should respond by sending `MsgPong` to `P0` and the message includes the NAT type and public IP and ephemeral port.
314 | - `I0` and `I1` also respond by sending a message to `P0` on the `Config.testPort`.
315 |   - IF `P0` receives a message on `Config.testPort` we know that our NAT type is Static
316 | - Finally, `P0` must calculate the nat type based on the data collected so far
317 | 
318 | 
319 | To guard against dropped packets, peers should retry the nat detection process every second until they know their nat.
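
One possible reading of the final calculation step is sketched below in C. The spec does not spell out the rule, so this is an assumption based on the NAT type table: a message on the test port means Static, two pongs reporting the same public address and port mean Easy, and differing reports mean Hard. The names `NatKind`, `PongReport` and `calculate_nat` are hypothetical, and `NAT_UNKNOWN` is added to cover the case where a pong has not arrived yet and detection must be retried.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { NAT_UNKNOWN, NAT_EASY, NAT_HARD, NAT_STATIC } NatKind;

typedef struct {
  bool     received; /* false until the introducer's pong arrives */
  uint32_t address;  /* public address reported in the pong */
  uint16_t port;     /* public port reported in the pong */
} PongReport;

/* Decide the NAT type from the pongs of two introducers and from
   whether any message arrived on the test port. */
NatKind calculate_nat(const PongReport *a, const PongReport *b,
                      bool test_port_reached) {
  assert(a != NULL && b != NULL);
  if (test_port_reached) return NAT_STATIC;
  if (!a->received || !b->received) return NAT_UNKNOWN;
  if (a->address == b->address && a->port == b->port) return NAT_EASY;
  return NAT_HARD;
}
```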
320 | 
321 | 
322 | ### `MsgPing`
323 | 
324 | Sent as a "request" for a `MsgPong` message.
325 | The other peer is expected to respond with a `MsgPong`.
326 | If they do not respond promptly, they are considered to be down.
327 | 
328 | ```c
329 | struct MsgPing {
330 |   string type = "ping"; // the type of the message
331 |   PeerId id; // the unique id of the sending-peer
332 |   NatType nat;
333 |   uint restart; // a unix timestamp specifying uptime of the sending-peer
334 | };
335 | ```
336 | 
337 | #### Receive `MsgPing`
338 | 
339 | - A message of type `MsgPing` is received
340 |   - Respond with a message of type `MsgPong`
341 |     - `.id` MUST be set to this `.id`
342 |     - `.address` MUST be set to the value of `.address` in rinfo
343 |     - `.port` MUST be set to the value of `.port` in rinfo
344 |     - `.nat` MUST be set to this `.nat` property
345 |     - `.restart` MUST be set to this `.restart` property
346 |     - `.timestamp` MUST be set to the timestamp that the message was received
347 | 
348 | ### `MsgPong`
349 | 
350 | Sent as a "response" to a `MsgPing` message.
351 | 
352 | ```c
353 | struct MsgPong {
354 |   string type = "pong"; // the type of the message
355 |   PeerId id; // the unique id of the sending-peer
356 |   string address; // a string representation of the ip address of the sending-peer
357 |   uint port; // a numeric representation of the port of the sending-peer
358 |   NatType nat;
359 |   uint restart; // a unix timestamp specifying uptime of the sending-peer
360 |   uint timestamp; // a unix timestamp specifying the time the ping message was received
361 | };
362 | ```
363 | 
364 | #### Receive `MsgPong`
365 | 
366 | - A message of type `MsgPong` is received
367 |   - The message properties are added to an object and placed in a locally stored list representing known peers
368 |   - The local properties `Peer.pong.timestamp`, `Peer.pong.address` and `Peer.pong.port` are updated using the data received
369 |   - This `.recv` is updated with a current timestamp
370 |   - call this `.notify` method
371 |     - TODO set
372 | 
373 | 
374 | ### `MsgIntro`
375 | 
376 | Sent to request an introduction to a target peer
377 | 
378 | ```c
379 | struct MsgIntro {
380 |   string type = "intro"; // the type of the message
381 |   PeerId id; // the unique id of the sending-peer
382 |   PeerId target; // the unique id of the target peer
383 |   NatType nat; // nat of the sender
384 |   SwarmId swarm; // optional swarm
385 | };
386 | ```
387 | 
388 | #### Receive `MsgIntro`
389 | 
390 | - A message of type `MsgIntro` is received
391 |   - If the `MsgIntro.target` peer is not known, respond with `MsgIntroError`
392 |     - `id` should be set to our `.config.id`
393 |     - `target` should be `MsgIntro.target`
394 |     - `call` should be `'intro'`
395 |   - IF both IDs (`MsgIntro.target` and `MsgIntro.id`) are known locally
396 |     - let `p` be the peer with `id == MsgIntro.target`
397 |     - let `s` be the sender of the `MsgIntro`
398 |     - send a `MsgConnect` back to the `MsgIntro` sender
399 |       - `.id` to our `.config.id`
400 |       - `.target` to `p.id`
401 |       - `.address` to `p.address`
402 |       - `.port` to `p.port`
403 |       - `.nat` to `p.nat`
404 |     - send a `MsgConnect` to `p`
405 |       - `.id` to our `.config.id`
406 |       - `.target` to `s.id`
407 |       - `.address` to `s.address`
408 |       - `.port` to `s.port`
409 |       - `.nat` to `s.nat`
410 | 
411 | ### `MsgLocal`
412 | 
413 | Sent to establish a connection to another peer on the local network.
414 | 
415 | ```c
416 | struct MsgLocal {
417 |   string type = "local"; // the type of the message
418 |   PeerId id; // the unique id of the sending-peer
419 |   string address; // this local address
420 |   uint16_t port; // this local port
421 | };
422 | ```
423 | 
424 | #### Receive `MsgLocal`
425 | 
426 | - Call the `.retryPing` method to send a `MsgPing` to the peer with `MsgLocal.id`
427 | 
428 | ### `MsgJoin`
429 | 
430 | A `MsgJoin` is sent to request to join a swarm. The remote peer will respond with several `MsgConnect` for some other random peers that they know in the swarm, or a `MsgJoinError`
431 | 
432 | The sender should actively maintain their membership in the swarm. They should try to maintain an active connection to at least 3 other peers in the swarm. If their number of connections is less than 3, they should attempt to rejoin the swarm every `config.keepAlive` interval.
433 | 
434 | ```c
435 | struct MsgJoin {
436 |   string type = "join";
437 |   PeerId id; // the id of the sender
438 |   SwarmId swarm; // the id of the swarm
439 |   NatType nat; // the nat type of the sender
440 |   uint peers; // the number of peers in the swarm that the peer knows about
441 | };
442 | ```
443 | 
444 | #### Receive `MsgJoin`
445 | 
446 | - A message of type `MsgJoin` is received
447 |   - The object with the `SwarmId`, `MsgJoin.swarm` is found on this `.swarms` property OR it is created
448 |   - The object is an associative array where the key is the `SwarmId` and the value is the timestamp that this message was received
449 |   - The sender peer must be updated by calling this `.addPeer` method
450 |     - `.id` MUST be set to the `MsgJoin.id` property
451 |     - `.address` MUST be set to the `.address` property of the rinfo
452 |     - `.port` MUST be set to the `.port` property of the rinfo
453 |     - `.nat` MUST be set to `MsgJoin.nat`
454 |     - `.outport` MUST be set to the port this message was received on (this may differ from `.config.localPort` if this is a BDP connection)
455 |     - `.reset` MUST be set to `.restart`
456 |     - `.timestamp` MUST be the timestamp this message was received
457 |   - IF there are no other peers in the swarm then respond with a `MsgJoinError` and return
458 |       - `.id` MUST be set to `.config.id`
459 |       - `.swarm` must be `MsgJoin.swarm`
460 |       - `.peers` MUST be set to the number of peers in the swarm (including the `MsgJoin` sender)
461 |       - `.call` MUST be set to `join`
462 |   - Next, randomly select peers for the sender of the `MsgJoin` to connect to.
463 |     - Take the list of peers in the swarm, but remove the sender of `MsgJoin`
464 |     - Sort the list randomly.
465 |     - If `MsgJoin.nat` is `hard` then remove peers with hard nat from the list, unless they have the same address as the `MsgJoin` sender (note, this would mean they are connected to the same wifi, and may then connect locally)
466 |     - Take the first `MsgJoin.peers` from the list and discard the rest (unless there are fewer than `MsgJoin.peers` in the list, then take the whole list)
467 |     - for each peer `p` in the list,
468 |       - send a `MsgConnect` to `p` with `target` set to the `MsgJoin` sender's `id`, `address`, `port` and `nat`, and `swarm`
469 |       - send a `MsgConnect` back to the `MsgJoin` sender, with `target` set to `p`'s `id`, `address`, `port` and `nat`, and `swarm`.
470 | 
471 |   - Send `MsgConnect` randomly to `min(swarm.length, MsgJoin.peers)` peers
472 |     - IF the peer to receive the message has `.nat` of type `Hard`
473 |       - it MUST connect to peers with `.nat` type `Easy` OR peers with the same `.address` (peers on the same nat)
474 |     - IF `MsgJoin.peers` is `0`, the sender of the `MsgJoin` joins the swarm but no `MsgConnect` is sent
475 |     - IF there are no other connectable peers
476 | 
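The peer selection steps above can be sketched as follows. This is a minimal illustration, not the reference implementation: the `Peer` dataclass, the function names `select_peers` and `introductions`, and the dictionary layout of the messages are assumptions made for the example, and it assumes the requested peer count arrives as `MsgJoin.peers`.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    # Hypothetical record holding the fields set by `.addPeer`.
    id: str
    address: str
    port: int
    nat: str  # "easy", "hard" or "static"


def select_peers(swarm, joiner, wanted, rng=random):
    """Pick up to `wanted` peers for the MsgJoin sender to connect to."""
    # Everyone in the swarm except the sender of the MsgJoin.
    candidates = [p for p in swarm if p.id != joiner.id]
    # Sort the list randomly.
    rng.shuffle(candidates)
    # A hard nat cannot traverse to another hard nat,
    # unless both peers share the same public address.
    if joiner.nat == "hard":
        candidates = [
            p for p in candidates
            if p.nat != "hard" or p.address == joiner.address
        ]
    # Take the first `wanted` entries (or the whole list if it is shorter).
    return candidates[:max(wanted, 0)]


def connect_msg(introducer_id, target, swarm_id):
    """Build a MsgConnect (as a dict) that describes `target`."""
    return {
        "type": "connect",
        "id": introducer_id,
        "target": target.id,
        "address": target.address,
        "port": target.port,
        "nat": target.nat,
        "swarm": swarm_id,
    }


def introductions(introducer_id, swarm, joiner, wanted, swarm_id, rng=random):
    """Return (recipient, message) pairs the introducer should send."""
    out = []
    for p in select_peers(swarm, joiner, wanted, rng):
        out.append((p, connect_msg(introducer_id, joiner, swarm_id)))
        out.append((joiner, connect_msg(introducer_id, p, swarm_id)))
    return out
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible, which is convenient when replaying a run in a simulator.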
477 | ### `MsgRelay`
478 | 
479 | ```c
480 | struct MsgRelay {
481 |   string type = "relay";
482 |   PeerId target; // the id of the peer we want to connect to
483 |   Any content; // most likely a message of any type
484 | };
485 | ```
486 | 
487 | #### Receive `MsgRelay`
488 | 
489 | - A message of type `MsgRelay` is received
490 |   - IF a `Peer` with the id `MsgRelay.target` exists in this `.peers` map
491 |     - call this `.send` method
492 |       - the first argument MUST be set to `MsgRelay.content`
493 |       - the second argument MUST be set to the `Peer`
494 |       - the third argument MUST be set to the `Peer.outport` or this `.localPort`
495 | 
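The relay rule above amounts to a lookup followed by a forward. The sketch below is only an illustration: `on_relay`, the dictionary-based peer records and the `send` callback signature are assumptions, and it assumes the per-peer port is the `outport` field recorded by `.addPeer`.

```python
def on_relay(msg, peers, send, local_port):
    """Forward `msg["content"]` to the relay target if it is a known peer.

    `peers` maps a peer id to a dict describing that peer, and
    `send(content, peer, port)` is a caller-supplied function.
    Returns True if the content was forwarded, False otherwise.
    """
    peer = peers.get(msg["target"])
    if peer is None:
        # Unknown target: this section defines no action, so nothing is sent.
        return False
    # Prefer the port the peer was last seen on, else our local port.
    port = peer.get("outport") or local_port
    send(msg["content"], peer, port)
    return True
```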
496 | ### `MsgConnect`
497 | 
498 | ```c
499 | struct MsgConnect {
500 |   string type = "connect";
501 |   PeerId id; // the introducer's id. The id in the message is always the sender's id.
502 |   PeerId target; // the id of the peer to connect to
503 |   string address; // the address of the target
504 |   NatType nat; // the nat of the target
505 |   uint16_t port; // the port of the target
506 |   SwarmId swarm; // optional
507 | };
508 | ```
509 | 
510 | #### Receive `MsgConnect`
511 | 
512 | - A message of type `MsgConnect` is received
513 |   - IF the message has a `SwarmId`
514 |     - call this `.addPeer` method
515 |       - set TODO
516 |   - IF there is a `Peer` with `MsgConnect.target` id in this `.peers` map property
517 |     - IF the address is not the same, assign the peer the address from the message and set its `PongState` to null
518 |     - IF we have sent a packet within `CONNECTING_MAX_TIME` time, then return
519 |     - IF we have sent or received a message within the `KEEP_ALIVE_TIMEOUT`
520 |       - call this `.retryPing` method to be sure it is already connected
521 |         - the first argument MUST be `Peer`
522 |         - the second argument MUST be `MsgConnect.timestamp`
523 |   - IF `MsgConnect.address` is equal to this `.publicAddress` property
524 |     - Both peers are on the same local network, send a `MsgRelay` containing a `MsgLocal` back to `MsgConnect.id`
525 |   - IF this `.nat` is `Easy` AND `MsgConnect.nat` is either `Easy` OR `Static`
526 |     - call this `.retryPing` method and then return
527 |   - IF this `.nat` is `Easy` AND `MsgConnect.nat` is `Hard`
528 |     Commence the "easy" side of a birthday paradox connection.
529 |     - send many `MsgPing` messages from our main port **to** _unique random ports_ at `MsgConnect.address` (disregard `MsgConnect.port`). On average it will take about 250 packets to get through; sometimes fewer, sometimes more. Sending only 250 packets would succeed roughly 60% of the time, so it is recommended to send at least 1000 packets, which gives a success rate of about 98%. It is recommended to give up after that, in case the other peer was actually down or didn't try to connect.
530 |     - It is also recommended to send these `MsgPing` packets at a short interval rather than all at once. This gives time for `MsgPong` responses to travel back. If a `MsgPong` is received from `MsgConnect.target` then stop. If more than 1000 packets have been sent, then stop.
531 |   - IF this `.nat` is `Hard` AND `MsgConnect.nat` is `Easy`
532 |     commence the _hard_ side of a birthday paradox connection (BDP).
533 |     - send 256 `MsgPing` **from** _unique random ports_ to `MsgConnect.port`. Note this means binding 256 ports. The peer may reuse the same ports for other BDP connections. These packets should be sent immediately without waiting. These outgoing packets will open ports in our firewall, so that packets from the easy side can come through. The nat will remap all these ports. A hard nat assigns new ports for each address, so it does not work to simply send to a port that another address has observed. Instead, the peer must try to guess a port, but open many ports to make that easier.
534 |   - IF this `.nat` is `Hard` AND `MsgConnect.nat` is `Hard`
535 |     Unable to create a connection via nat traversal.
536 |     In future versions of this spec, it may be possible to connect peers in this situation by relaying through another peer with an `easy` or `static` nat.
537 | 
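The nat-type cases above, and the odds behind the birthday paradox connection, can be sketched as follows. Both functions are illustrative assumptions rather than part of the spec: `connect_strategy` covers only the four nat-type branches (it ignores the earlier checks for an existing peer and for a shared public address), and `bdp_success_probability` assumes the hard nat maps its 256 open ports uniformly at random over a 65536-port space, which real nats need not do.

```python
def connect_strategy(local_nat: str, remote_nat: str) -> str:
    """Map (this `.nat`, `MsgConnect.nat`) to the action described above."""
    local, remote = local_nat.lower(), remote_nat.lower()
    if local == "easy" and remote in ("easy", "static"):
        return "ping"         # call `.retryPing` and return
    if local == "easy" and remote == "hard":
        return "bdp_easy"     # ping many random ports from the main port
    if local == "hard" and remote == "easy":
        return "bdp_hard"     # ping `MsgConnect.port` from 256 random ports
    if local == "hard" and remote == "hard":
        return "unreachable"  # no traversal possible in this version
    return "unspecified"      # combination not covered by the list above


def bdp_success_probability(probes, open_ports=256, port_space=65536):
    """Chance that at least one of `probes` unique random destination
    ports hits one of the `open_ports` mappings on the hard side."""
    miss = 1.0
    for i in range(probes):
        remaining = port_space - i  # ports not yet probed
        if remaining <= open_ports:
            return 1.0              # every remaining port is an open one
        miss *= (remaining - open_ports) / remaining
    return 1.0 - miss


if __name__ == "__main__":
    for n in (250, 1000):
        print(n, round(bdp_success_probability(n), 3))
```

Under this simplified model the success rate climbs quickly with the number of probes, which is why sending well beyond the average number of packets before giving up is worthwhile.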
538 | 
539 | ### `MsgTest`
540 | 
541 | Sent to the `Config.testPort` of a peer as a response to a `MsgPing` message.
542 | 
543 | ```c
544 | struct MsgTest {
545 |   string type = "test"; // the type of the message
546 |   PeerId id; // the unique id of the sending-peer
547 |   string address; // a string representation of the ip address of the sending-peer
548 |   uint port; // a numeric representation of the port of the sending-peer
549 |   NatType nat;
550 | };
551 | ```
552 | 
553 | 
554 | #### Receive `MsgTest`
555 | 
556 | This message is received when an introducer sends a message to a peer's `Config.testPort` as the result of receiving a `MsgPing`.
557 | 
558 | 
559 | - A message of type `MsgTest` is received
560 |   - update the local `PongState` property
561 |   - re-calculate the local nat state
562 | 
563 | # Credit
564 | 
565 | This work is derived from the work of Bryan Ford (<baford@mit.edu>), Pyda Srisuresh (<srisuresh@yahoo.com>) and Dan Kegel (<dank@kegel.com>) who published ["Peer-to-Peer Communication Across Network Address Translators"][2].
566 | 
567 | [0]:https://lamport.azurewebsites.net/tla/proving-safety.pdf
568 | [1]:https://lamport.azurewebsites.net/pubs/liveness.pdf
569 | [2]:https://pdos.csail.mit.edu/papers/p2pnat.pdf
570 | [3]:https://www.microsoft.com/en-us/research/uploads/prod/2018/05/book-02-08-08.pdf
571 | 
572 | [W0]:https://en.wikipedia.org/wiki/UDP_hole_punching
573 | [W1]:https://en.wikipedia.org/wiki/Transport_layer
574 | [W2]:https://en.wikipedia.org/wiki/Rendezvous_protocol
575 | [W3]:https://en.wikipedia.org/wiki/STUN
576 | 
577 | [T0]:https://tailscale.com/blog/how-nat-traversal-works
578 | [F0]:https://fossbytes.com/connection-oriented-vs-connection-less-connection/
579 | 
580 | [B1]:https://www.bittorrent.org/beps/bep_0055.html
581 | [C0]:https://github.com/clostra/libutp
582 | [GH01]:https://github.com/libpcp/pcp
583 | [GH02]:https://github.com/paullouisageneau/libplum
584 | 
585 | [rfc3022]:https://datatracker.ietf.org/doc/html/rfc3022
586 | [rfc2663]:https://datatracker.ietf.org/doc/html/rfc2663
587 | [rfc6886]:https://datatracker.ietf.org/doc/html/rfc6886
588 | [rfc6887]:https://datatracker.ietf.org/doc/html/rfc6887
589 | [rfc791]:https://datatracker.ietf.org/doc/html/rfc791
590 | [rfc1122]:https://datatracker.ietf.org/doc/html/rfc1122
591 | [rfc2460]:https://datatracker.ietf.org/doc/html/rfc2460
592 | 


--------------------------------------------------------------------------------