├── README.md
├── devp2p.md
├── discv4.md
├── etherdog.png
└── rlpx.md


/README.md:
--------------------------------------------------------------------------------

This repository contains specifications for the peer-to-peer networking protocols used by
Ethereum. The issue tracker here is for discussions of protocol changes. You can also get
in touch through our **[gitter channel](https://gitter.im/ethereum/devp2p)**.

### Implementations

devp2p is part of most Ethereum clients. Implementations include:

- [go-ethereum (Go)](https://github.com/ethereum/go-ethereum/)
- [Parity Ethereum (Rust)](https://github.com/paritytech/parity-ethereum)
- [Trinity (Python)](https://github.com/ethereum/py-evm)
- [Aleth (C++)](https://github.com/ethereum/aleth)
- [EthereumJ (Java)](https://github.com/ethereum/ethereumj)
- [EthereumJS (JavaScript)](https://github.com/ethereumjs/ethereumjs-devp2p)
- [ruby-devp2p (Ruby)](https://github.com/cryptape/ruby-devp2p)
- [Exthereum (Elixir)](https://github.com/exthereum/ex_wire)
- [eth_p2p (Nim)](https://github.com/status-im/nim-eth-p2p)
- [Nethermind (.NET)](https://github.com/tkstanczak/nethermind)

Wireshark dissectors are [available here](https://github.com/ConsenSys/ethereum-dissectors).

--------------------------------------------------------------------------------
/devp2p.md:
--------------------------------------------------------------------------------
# devp2p Application Protocol

devp2p is an application-layer networking protocol for communication among nodes in a
peer-to-peer network. Nodes may support any number of sub-protocols. devp2p handles
negotiation of supported sub-protocols on both sides and carries their messages over a
single connection.

### Low-Level

Nodes communicate by sending messages using RLPx. Nodes are free to advertise and accept
connections on any TCP port they wish; however, the default port for listening and for
outgoing connections is 30303.
Though TCP provides a connection-oriented medium,
devp2p nodes communicate in terms of packets. RLPx provides facilities to send and receive
packets. For more information about RLPx, refer to the [RLPx specification][rlpx].

devp2p nodes find peers through the [discovery protocol][discv4] DHT. Peer connections can
also be initiated by supplying the endpoint of a peer to a client-specific RPC API.

### Message Contents

Messages are encoded using the [RLP serialization format][rlp].

There are a number of different types of payload that may be encoded within the message.
This 'type' is always determined by the first entry of the packet RLP, interpreted as an
integer.

devp2p is designed to support arbitrary sub-protocols (aka _capabilities_) over the basic
wire protocol. Each sub-protocol is given as much of the message-ID space as it needs. All
such protocols must statically specify how many message IDs they require. On connection
and reception of the `Hello` message, both peers have equivalent information about what
sub-protocols they share (including versions) and are able to form consensus over the
composition of message ID space.

Message IDs are assumed to be compact from ID 0x10 onwards (0x00-0x0f is reserved for
devp2p messages) and given to each shared (equal-version, equal-name) sub-protocol in
alphabetic order. Sub-protocols that are not shared are ignored. If multiple versions of
the same (equal-name) sub-protocol are shared, the numerically highest wins and the others
are ignored.

### "p2p" Sub-protocol Messages

**Hello** `0x00` [`p2pVersion`: `P`, `clientId`: `B`, [[`cap1`: `B_3`, `capVersion1`:
`P`], [`cap2`: `B_3`, `capVersion2`: `P`], `...`], `listenPort`: `P`, `nodeId`: `B_64`]
First packet sent over the connection, and sent once by both sides. No other messages may
be sent until a Hello is received.

* `p2pVersion` Specifies the implemented version of the P2P protocol. Currently must be 1.
* `clientId` Specifies the client software identity, as a human-readable string (e.g.
  "Ethereum(++)/1.0.0").
* `cap` Specifies a peer capability name as an ASCII string, e.g. "eth" for the eth sub-protocol.
* `capVersion` Specifies a peer capability version as a positive integer.
* `listenPort` Specifies the port that the client is listening on (on the interface that
  the present connection traverses). If 0, it indicates the client is not listening.
* `nodeId` Is the unique identity of the node and specifies a 512-bit secp256k1 public key that identifies this node.

**Disconnect** `0x01` [`reason`: `P`] Inform the peer that a disconnection is imminent; if
received, a peer should disconnect immediately. When sending, well-behaved hosts give
their peers a fighting chance (read: wait 2 seconds) to disconnect before disconnecting
themselves.

* `reason` is an optional integer specifying one of a number of reasons for disconnect:
  * `0x00` Disconnect requested;
  * `0x01` TCP sub-system error;
  * `0x02` Breach of protocol, e.g. a malformed message, bad RLP, incorrect magic number
    etc.;
  * `0x03` Useless peer;
  * `0x04` Too many peers;
  * `0x05` Already connected;
  * `0x06` Incompatible P2P protocol version;
  * `0x07` Null node identity received - this is automatically invalid;
  * `0x08` Client quitting;
  * `0x09` Unexpected identity (i.e. a different identity to a previous connection/what a
    trusted peer told us);
  * `0x0a` Identity is the same as this node (i.e. connected to itself);
  * `0x0b` Timeout on receiving a message (i.e. nothing received since sending last ping);
  * `0x10` Some other reason specific to a sub-protocol.

**Ping** `0x02` [] Requests an immediate reply of `Pong` from the peer.

**Pong** `0x03` [] Reply to peer's `Ping` packet.

### Session Management

Upon connecting, all clients (i.e. both sides of the connection) must send a `Hello`
message. Upon receiving the `Hello` message and verifying compatibility of the network and
versions, a session is active and any other P2P messages may be sent.

At any time, a `Disconnect` message may be sent.

[rlp]: https://github.com/ethereum/wiki/wiki/RLP
[rlpx]: https://github.com/ethereum/devp2p/tree/master/rlpx.md
[discv4]: https://github.com/ethereum/devp2p/tree/master/discv4.md

--------------------------------------------------------------------------------
/discv4.md:
--------------------------------------------------------------------------------
# Node Discovery Protocol v4

This specification defines the Node Discovery protocol version 4, a Kademlia-like DHT that
stores information about Ethereum nodes. The Kademlia structure was chosen because it
yields a topology of low diameter.

## Node Identities

Every node has a cryptographic identity, a key on the elliptic curve secp256k1. The public
key of the node serves as its identifier or 'node ID'.

The 'distance' between two node IDs is the bitwise exclusive or of the hashes of the
public keys, interpreted as a number.

```text
distance(n₁, n₂) = keccak256(n₁) XOR keccak256(n₂)
```

## Node Table

Nodes in the Discovery Protocol keep information about other nodes in their neighborhood.
Neighbor nodes are stored in a routing table consisting of 'k-buckets'. For each `0 ≤ i <
256`, every node keeps a k-bucket for nodes of distance between `2^i` and `2^(i+1)` from
itself.

The Node Discovery Protocol uses `k = 16`, i.e. every k-bucket contains up to 16 node
entries.
The entries are sorted by time last seen: least-recently seen node at the head,
most-recently seen at the tail.

Whenever a new node N₁ is encountered, it can be inserted into the corresponding bucket.
If the bucket contains fewer than `k` entries, N₁ can simply be added as the first entry. If
the bucket already contains `k` entries, the least recently seen node in the bucket, N₂,
needs to be revalidated by sending a ping packet. If no reply is received from N₂, it is
considered dead and removed, and N₁ is added to the front of the bucket.

## Endpoint Proof

To prevent traffic amplification attacks, implementations must verify that the sender of a
query participates in the discovery protocol. The sender of a packet is considered
verified if it has sent a valid pong response with matching ping hash within the last 12
hours.

## Recursive Lookup

A 'lookup' locates the `k` closest nodes to a node ID.

The lookup initiator starts by picking the `α` closest nodes to the target it knows of. The
initiator then sends concurrent FindNode packets to those nodes. `α` is a system-wide
concurrency parameter, such as 3. In the recursive step, the initiator resends FindNode to
nodes it has learned about from previous queries. Of the `k` nodes the initiator has heard
of closest to the target, it picks `α` that it has not yet queried and resends FindNode to
them. Nodes that fail to respond quickly are removed from consideration until and unless
they do respond.

If a round of FindNode queries fails to return a node any closer than the closest already
seen, the initiator resends FindNode to all of the `k` closest nodes it has not
already queried. The lookup terminates when the initiator has queried and gotten responses
from the `k` closest nodes it has seen.
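
The distance metric, the bucket rule and the lookup loop can be illustrated with a small Python sketch. It is not taken from any client: node IDs are assumed to be already hashed and represented as integers, `find_node` is a hypothetical callback standing in for the FindNode/Neighbors exchange, and queries run one after another rather than concurrently. The "no closer node found" rule is simplified into continuing in batches of `α` until all of the `k` closest nodes have been queried.

```python
from typing import Callable, Iterable, List, Optional

K = 16      # bucket size and lookup result size
ALPHA = 3   # concurrency parameter; 3 is the example value from the text


def distance(a: int, b: int) -> int:
    """XOR distance between two already-hashed node IDs."""
    return a ^ b


def bucket_index(own: int, other: int) -> int:
    """Return i such that 2**i <= distance(own, other) < 2**(i + 1)."""
    d = distance(own, other)
    if d == 0:
        raise ValueError("a node does not store itself")
    return d.bit_length() - 1


def lookup(
    target: int,
    seeds: Iterable[int],
    find_node: Callable[[int, int], Optional[List[int]]],
) -> List[int]:
    """Iteratively query nodes until the K closest known nodes have answered.

    find_node(node, target) is a hypothetical callback: it returns the node
    IDs reported by `node`, or None if `node` did not respond.
    """
    seen = set(seeds)
    queried = set()
    failed = set()
    while True:
        # The K closest responsive candidates known so far.
        closest = sorted(seen - failed, key=lambda n: distance(n, target))[:K]
        # Pick up to ALPHA of them that have not been queried yet.
        pending = [n for n in closest if n not in queried][:ALPHA]
        if not pending:
            return closest
        for node in pending:
            queried.add(node)
            reply = find_node(node, target)
            if reply is None:
                failed.add(node)      # drop unresponsive nodes
            else:
                seen.update(reply)    # learn about new candidates
```

In a real implementation the `pending` queries would be sent concurrently over UDP and a timeout would decide which nodes count as unresponsive.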

## Wire Protocol

Node discovery messages are sent as UDP datagrams. The maximum size of any packet is 1280
bytes.

```text
packet = packet-header || packet-data
```

Every packet starts with a header:

```text
packet-header = hash || signature || packet-type
hash = keccak256(signature || packet-type || packet-data)
signature = sign(packet-type || packet-data)
```

The `hash` exists to make the packet format recognizable when running multiple protocols
on the same UDP port. It serves no other purpose.

Every packet is signed by the node's identity key. The `signature` is encoded as a byte
array of length 65 as the concatenation of the signature values `r`, `s` and the 'recovery
id' `v`.

The `packet-type` is a single byte defining the type of message. Valid packet types are
listed below. Data after the header is specific to the packet type and is encoded as an
RLP list. As per EIP-8, implementations should ignore any additional elements in the list
as well as any extra data after the list.

### Ping Packet (0x01)

```text
packet-data = [version, from, to, expiration]
version = 4
from = [sender-ip, sender-udp-port, sender-tcp-port]
to = [recipient-ip, recipient-udp-port, 0]
```

The `expiration` field is an absolute UNIX time stamp. Packets containing a time stamp
that lies in the past are expired and may not be processed.

When a ping packet is received, the recipient should reply with a pong packet. It may also
consider the sender for addition into the node table.

If no communication with the sender has occurred within the last 12h, a ping should be
sent in addition to pong in order to receive an endpoint proof.
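
As an illustration of the packet layout and the expiration rule, the following Python sketch splits a datagram into its header fields and checks the `hash`. The names `split_packet` and `is_expired` are made up for this example. The hash function is passed in as a parameter because keccak256 is not part of the Python standard library (`hashlib.sha3_256` is the standardized SHA-3 and produces different digests), and signature recovery and RLP decoding are left out.

```python
import hashlib
import time
from typing import Callable, NamedTuple, Optional

HASH_LEN = 32                     # length of a keccak256 digest
SIG_LEN = 65                      # r || s || v
HEAD_LEN = HASH_LEN + SIG_LEN + 1 # hash || signature || packet-type
MAX_PACKET = 1280                 # maximum packet size in bytes


class Packet(NamedTuple):
    hash: bytes
    signature: bytes
    packet_type: int
    data: bytes  # RLP-encoded list, left undecoded in this sketch


def split_packet(buf: bytes, hash_fn: Callable[[bytes], bytes]) -> Packet:
    """Split a datagram into header fields and verify the leading hash.

    hash_fn must be keccak256 in a real implementation.
    """
    if len(buf) > MAX_PACKET:
        raise ValueError("packet exceeds 1280 bytes")
    if len(buf) < HEAD_LEN:
        raise ValueError("packet shorter than header")
    digest, rest = buf[:HASH_LEN], buf[HASH_LEN:]
    # hash = keccak256(signature || packet-type || packet-data)
    if hash_fn(rest) != digest:
        raise ValueError("hash mismatch")
    return Packet(digest, rest[:SIG_LEN], rest[SIG_LEN], rest[SIG_LEN + 1:])


def is_expired(expiration: int, now: Optional[float] = None) -> bool:
    """A packet is expired if its absolute UNIX time stamp lies in the past."""
    if now is None:
        now = time.time()
    return expiration < now


# Usage with SHA-256 as a stand-in hash, for demonstration only (NOT keccak256).
def stand_in_hash(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()


body = bytes(SIG_LEN) + b"\x01" + b"\xc0"   # dummy signature, ping type, empty RLP list
demo = split_packet(stand_in_hash(body) + body, stand_in_hash)
```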

### Pong Packet (0x02)

```text
packet-data = [to, ping-hash, expiration]
```

Pong is the reply to ping.

`ping-hash` should be equal to `hash` of the corresponding ping packet. Implementations
should ignore unsolicited pong packets that do not contain the hash of the most recent
ping packet.

### FindNode Packet (0x03)

```text
packet-data = [target, expiration]
```

A FindNode packet requests information about nodes close to `target`. The `target` is a
65-byte secp256k1 public key. When FindNode is received, the recipient should reply with
Neighbors packets containing the closest 16 nodes to target found in its local table.

To guard against traffic amplification attacks, Neighbors replies should only be sent if
the sender of FindNode has been verified by the endpoint proof procedure.

### Neighbors Packet (0x04)

```text
packet-data = [nodes, expiration]
nodes = [[ip, udp-port, tcp-port, node-id], ... ]
```

Neighbors is the reply to FindNode.

## Known Issues & Implementation Advice

The `expiration` field present in all packets is supposed to prevent packet replay. Since
it is an absolute time stamp, the node's clock must be accurate to verify it correctly.
Since the protocol's launch in 2016 we have received countless reports about connectivity
issues related to the user's clock being wrong.

The endpoint proof is imprecise because the sender of FindNode can never be sure whether
the recipient has seen a recent enough pong. Geth handles it as follows: If no
communication with the recipient has occurred within the last 12h, initiate the procedure
by sending a ping. Wait for a ping from the other side, reply to it and then send
FindNode.
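
The bookkeeping behind the endpoint proof can be sketched as follows. This is a hypothetical helper, not Geth's implementation: the class name, method names and the use of plain dictionaries are assumptions made for illustration. It records the hash of the most recent ping sent to each node, accepts a pong only if it carries that hash, and treats a node as verified for 12 hours afterwards.

```python
import time
from typing import Dict, Hashable, Optional

PROOF_TTL = 12 * 60 * 60  # endpoint proof validity in seconds (12 hours)


class EndpointProofs:
    """Tracks which nodes have answered a ping recently (illustrative only)."""

    def __init__(self) -> None:
        self._pending: Dict[Hashable, bytes] = {}    # node -> hash of most recent ping
        self._last_pong: Dict[Hashable, float] = {}  # node -> time of last valid pong

    def ping_sent(self, node_id: Hashable, ping_hash: bytes) -> None:
        # Only the most recent ping counts, so overwrite any earlier hash.
        self._pending[node_id] = ping_hash

    def pong_received(
        self, node_id: Hashable, ping_hash: bytes, now: Optional[float] = None
    ) -> bool:
        """Return True if the pong matches the most recent ping, else ignore it."""
        expected = self._pending.get(node_id)
        if expected is None or expected != ping_hash:
            return False  # unsolicited or stale pong
        del self._pending[node_id]
        self._last_pong[node_id] = time.time() if now is None else now
        return True

    def is_verified(self, node_id: Hashable, now: Optional[float] = None) -> bool:
        """True if a valid pong was received within the last 12 hours."""
        seen = self._last_pong.get(node_id)
        if seen is None:
            return False
        current = time.time() if now is None else now
        return current - seen <= PROOF_TTL
```

A node would call `is_verified` before answering FindNode with Neighbors, and send its own ping first whenever the check fails.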

--------------------------------------------------------------------------------
/etherdog.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IDouble/devp2p/7d5b4f24febd1b474cb7c03973b439e76b81b041/etherdog.png

--------------------------------------------------------------------------------
/rlpx.md:
--------------------------------------------------------------------------------
# The RLPx Transport Protocol

This specification defines the RLPx transport protocol, a TCP-based transport protocol
used for communication among Ethereum nodes. The protocol works with encrypted frames of
arbitrary content, though it is typically used to carry the devp2p application protocol.

## Node Identity

All cryptographic operations are based on the secp256k1 elliptic curve. Each node is
expected to maintain a static private key which is saved and restored between sessions. It
is recommended that the private key can only be reset manually, for example, by deleting a
file or database entry.

## ECIES Encryption

ECIES (Elliptic Curve Integrated Encryption Scheme) is an asymmetric encryption method
used in the RLPx handshake. The cryptosystem used by RLPx is

- The elliptic curve secp256k1 with generator `G`.
- `KDF(k, len)`: the NIST SP 800-56 Concatenation Key Derivation Function.
- `MAC(k, m)`: HMAC using the SHA-256 hash function.
- `AES(k, iv, m)`: the AES-128 encryption function in CTR mode.

Alice wants to send an encrypted message that can be decrypted by Bob's static private key
k<sub>B</sub>. Alice knows about Bob's static public key
K<sub>B</sub>.

To encrypt the message `m`, Alice generates a random number `r` and corresponding elliptic
curve public key `R = r * G` and computes the shared secret S = P<sub>x</sub>
where (P<sub>x</sub>, P<sub>y</sub>) = r * K<sub>B</sub>.
She derives key
material for encryption and authentication as
k<sub>E</sub> || k<sub>M</sub> = KDF(S, 32) as well as a random
initialization vector `iv`. Alice sends the encrypted message `R || iv || c || d` where
c = AES(k<sub>E</sub>, iv, m) and
d = MAC(k<sub>M</sub>, iv || c) to Bob.

For Bob to decrypt the message `R || iv || c || d`, he derives the shared secret
S = P<sub>x</sub> where
(P<sub>x</sub>, P<sub>y</sub>) = k<sub>B</sub> * R as well as the encryption and
authentication keys k<sub>E</sub> || k<sub>M</sub> = KDF(S, 32). Bob verifies
the authenticity of the message by checking whether
d == MAC(k<sub>M</sub>, iv || c), then obtains the plaintext as
m = AES(k<sub>E</sub>, iv, c).

## Handshake

The 'handshake' establishes key material to be used for the duration of the session. It is
carried out between the initiator (the node which opened the TCP connection) and the
recipient (the node which accepted it).

Handshake protocol:

`E` is the ECIES asymmetric encryption function defined above.

```text
auth -> E(remote-pubk, S(ephemeral-privk, static-shared-secret ^ nonce) || H(ephemeral-pubk) || pubk || nonce || 0x0)
auth-ack -> E(remote-pubk, remote-ephemeral-pubk || nonce || 0x0)

static-shared-secret = ecdh.agree(privkey, remote-pubk)
```

Values generated following the handshake (see below for steps):

```text
ephemeral-shared-secret = ecdh.agree(ephemeral-privkey, remote-ephemeral-pubk)
shared-secret = keccak256(ephemeral-shared-secret || keccak256(nonce || initiator-nonce))
aes-secret = keccak256(ephemeral-shared-secret || shared-secret)
# destroy shared-secret
mac-secret = keccak256(ephemeral-shared-secret || aes-secret)
# destroy ephemeral-shared-secret

Initiator:
egress-mac = keccak256.update(mac-secret ^ recipient-nonce || auth-sent-init)
# destroy nonce
ingress-mac = keccak256.update(mac-secret ^ initiator-nonce || auth-recvd-ack)
# destroy remote-nonce

Recipient:
egress-mac = keccak256.update(mac-secret ^ initiator-nonce || auth-sent-ack)
# destroy nonce
ingress-mac = keccak256.update(mac-secret ^ recipient-nonce || auth-recvd-init)
# destroy remote-nonce
```

Creating authenticated connection:

1. initiator connects to recipient and sends `auth` message
2. recipient accepts, decrypts and verifies `auth` (checks that recovery of signature ==
   `keccak256(ephemeral-pubk)`)
3. recipient generates `auth-ack` message from `remote-ephemeral-pubk` and `nonce`
4. recipient derives secrets and sends the first payload frame
5. initiator receives `auth-ack` and derives secrets
6. initiator sends first payload frame
7. recipient receives and authenticates first payload frame
8. initiator receives and authenticates first payload frame
9. cryptographic handshake is complete if MAC of first payload frame is valid on both sides

## Framing

All packets following `auth` are framed. Either side may disconnect if authentication of
the first framed packet fails.

The primary purpose of framing packets is to robustly support multiplexing
multiple protocols over a single connection. Secondarily, as framed packets yield
reasonable demarcation points for message authentication codes, supporting an encrypted
stream becomes straightforward. Frames are authenticated via key material which is
generated during the handshake.

The frame header provides information about the size of the packet and the packet's source
protocol.

```text
frame = header || header-mac || frame-data || frame-mac
header = frame-size || header-data || padding
frame-size = size of frame excluding padding, integer < 2**24, big endian
header-data = rlp.list(protocol-type[, context-id])
protocol-type = integer < 2**16, big endian
context-id = integer < 2**16, big endian
padding = zero-fill to 16-byte boundary
frame-data = any binary data

header-mac = left16(egress-mac.update(aes(mac-secret,egress-mac)) ^ header-ciphertext).digest
frame-mac = left16(egress-mac.update(aes(mac-secret,egress-mac)) ^ left16(egress-mac.update(frame-ciphertext).digest))
egress-mac = keccak256 state, continuously updated with egress bytes
ingress-mac = keccak256 state, continuously updated with ingress bytes

left16(x) is the first 16 bytes of x
|| is concatenate
^ is xor
```

Message authentication is achieved by continuously updating `egress-mac` or `ingress-mac`
with the ciphertext of bytes sent (egress) or received (ingress); for headers the update
is performed by xoring the header with the encrypted output of its corresponding mac (see
header-mac above for example). This is done to ensure uniform operations are performed for
both plaintext mac and ciphertext. All macs are sent in cleartext.

Padding is used to prevent buffer starvation, such that frame components are byte-aligned
to the block size of the cipher.

## Known Issues

- The RLPx handshake is considered 'broken crypto' because `aes-secret` and `mac-secret`
  are reused for both reading and writing. The two sides of an RLPx connection generate two
  CTR streams from the same key, nonce and IV. If an attacker knows one plaintext, they can
  decrypt unknown plaintexts of the reused keystream.
- The frame encoding provides a `protocol-type` field for multiplexing purposes, but this
  field is unused by devp2p.

## References

- Petar Maymounkov and David Mazieres. Kademlia: A Peer-to-peer Information System Based on the XOR Metric. 2002. URL: https://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf
- Victor Shoup. A proposal for an ISO standard for public key encryption, Version 2.1. 2001. URL: http://www.shoup.net/papers/iso-2_1.pdf
- Mike Belshe and Roberto Peon. SPDY Protocol - Draft 3. 2014. URL: http://www.chromium.org/spdy/spdy-protocol/spdy-protocol-draft3

Copyright © 2014 Alex Leverington.
This work is licensed under a
Creative Commons Attribution-NonCommercial-ShareAlike
4.0 International License.

--------------------------------------------------------------------------------