├── README.md
├── devp2p.md
├── discv4.md
├── etherdog.png
└── rlpx.md
/README.md:
--------------------------------------------------------------------------------
1 |

2 |
3 | This repository contains specifications for the peer-to-peer networking protocols used by
4 | Ethereum. The issue tracker here is for discussions of protocol changes. You can also get
5 | in touch through our **[gitter channel](https://gitter.im/ethereum/devp2p)**.
6 |
7 | ### Implementations
8 |
9 | devp2p is part of most Ethereum clients. Implementations include:
10 |
11 | - [go-ethereum (Go)](https://github.com/ethereum/go-ethereum/)
12 | - [Parity Ethereum (Rust)](https://github.com/paritytech/parity-ethereum)
13 | - [Trinity (Python)](https://github.com/ethereum/py-evm)
14 | - [Aleth (C++)](https://github.com/ethereum/aleth)
15 | - [EthereumJ (Java)](https://github.com/ethereum/ethereumj)
16 | - [EthereumJS (JavaScript)](https://github.com/ethereumjs/ethereumjs-devp2p)
17 | - [ruby-devp2p (Ruby)](https://github.com/cryptape/ruby-devp2p)
18 | - [Exthereum (Elixir)](https://github.com/exthereum/ex_wire)
19 | - [eth_p2p (Nim)](https://github.com/status-im/nim-eth-p2p)
20 | - [Nethermind (.NET)](https://github.com/tkstanczak/nethermind)
21 |
22 | Wireshark dissectors are [available here](https://github.com/ConsenSys/ethereum-dissectors).
23 |
--------------------------------------------------------------------------------
/devp2p.md:
--------------------------------------------------------------------------------
1 | # devp2p Application Protocol
2 |
3 | devp2p is an application-layer networking protocol for communication among nodes in a
4 | peer-to-peer network. Nodes may support any number of sub-protocols. devp2p handles
5 | negotiation of supported sub-protocols on both sides and carries their messages over a
6 | single connection.
7 |
8 | ### Low-Level
9 |
10 | Nodes communicate by sending messages using RLPx. Nodes are free to advertise and accept
11 | connections on any TCP port they wish; the default port for listening and dialing is
12 | 30303. Though TCP provides a connection-oriented medium,
13 | devp2p nodes communicate in terms of packets. RLPx provides facilities to send and receive
14 | packets. For more information about RLPx, refer to the [RLPx specification][rlpx].
15 |
16 | devp2p nodes find peers through the [discovery protocol][discv4] DHT. Peer connections can
17 | also be initiated by supplying the endpoint of a peer to a client-specific RPC API.
18 |
19 | ### Message Contents
20 |
21 | Messages are encoded using the [RLP serialization format][rlp].
22 |
23 | There are a number of different types of payload that may be encoded within the message.
24 | This 'type' is always determined by the first entry of the packet RLP, interpreted as an
25 | integer.
26 |
27 | devp2p is designed to support arbitrary sub-protocols (aka _capabilities_) over the basic
28 | wire protocol. Each sub-protocol is given as much of the message-ID space as it needs. All
29 | such protocols must statically specify how many message IDs they require. On connection
30 | and reception of the `Hello` message, both peers have equivalent information about what
31 | subprotocols they share (including versions) and are able to form consensus over the
32 | composition of message ID space.
33 |
34 | Message IDs are assumed to be compact from ID 0x10 onwards (0x00-0x0f is reserved for
35 | devp2p messages) and given to each shared (equal-version, equal name) sub-protocol in
36 | alphabetic order. Sub-protocols that are not shared are ignored. If multiple versions of
37 | the same (equal name) sub-protocol are shared, the numerically highest wins and the others
38 | are ignored.
39 |
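The allocation rule above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names, the `lengths` table (how many message IDs each sub-protocol version needs) and the example values are hypothetical, and Python's default string ordering is assumed to match the "alphabetic order" in the text.

```python
from typing import Dict, List, Tuple

Capability = Tuple[str, int]  # (name, version)

BASE_OFFSET = 0x10  # first message ID available to sub-protocols


def shared_capabilities(local: List[Capability], remote: List[Capability]) -> Dict[str, int]:
    """Return {name: version}, keeping the highest version both sides announce."""
    shared: Dict[str, int] = {}
    for name, version in set(local) & set(remote):
        if version > shared.get(name, 0):
            shared[name] = version
    return shared


def assign_offsets(
    local: List[Capability],
    remote: List[Capability],
    lengths: Dict[Capability, int],
) -> Dict[str, Tuple[int, int]]:
    """Map each shared capability name to (version, first message ID)."""
    offsets: Dict[str, Tuple[int, int]] = {}
    next_id = BASE_OFFSET
    shared = shared_capabilities(local, remote)
    for name in sorted(shared):  # assumed to match the spec's alphabetic order
        version = shared[name]
        offsets[name] = (version, next_id)
        next_id += lengths[(name, version)]  # each protocol states its ID count
    return offsets


if __name__ == "__main__":
    # Hypothetical capability lists and message counts, for illustration only.
    local = [("eth", 62), ("eth", 63), ("shh", 2)]
    remote = [("eth", 63), ("eth", 62), ("bzz", 1), ("shh", 2)]
    lengths = {("eth", 63): 17, ("shh", 2): 8}
    print(assign_offsets(local, remote, lengths))
```

Because both peers run the same deterministic rule over the same `Hello` data, they arrive at the same ID layout without further negotiation.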
40 | ### "p2p" Sub-protocol Messages
41 |
42 | **Hello** `0x00` [`p2pVersion`: `P`, `clientId`: `B`, [[`cap1`: `B_3`, `capVersion1`:
43 | `P`], [`cap2`: `B_3`, `capVersion2`: `P`], `...`], `listenPort`: `P`, `nodeId`: `B_64`]
44 | First packet sent over the connection, and sent once by both sides. No other messages may
45 | be sent until a Hello is received.
46 |
47 | * `p2pVersion` Specifies the implemented version of the P2P protocol. Now must be 1.
48 | * `clientId` Specifies the client software identity, as a human-readable string (e.g.
49 | "Ethereum(++)/1.0.0").
50 | * `cap` Specifies a peer capability name as an ASCII string, e.g. "eth" for the eth subprotocol.
51 | * `capVersion` Specifies a peer capability version as a positive integer.
52 | * `listenPort` specifies the port that the client is listening on (on the interface that
53 | the present connection traverses). If 0 it indicates the client is not listening.
54 | * `nodeId` is the unique identity of the node and specifies a 512-bit secp256k1 public key that identifies this node.
55 |
56 | **Disconnect** `0x01` [`reason`: `P`] Inform the peer that a disconnection is imminent; if
57 | received, a peer should disconnect immediately. When sending, well-behaved hosts give
58 | their peers a fighting chance (read: wait 2 seconds) to disconnect before disconnecting
59 | themselves.
60 |
61 | * `reason` is an optional integer specifying one of a number of reasons for disconnect:
62 | * `0x00` Disconnect requested;
63 | * `0x01` TCP sub-system error;
64 | * `0x02` Breach of protocol, e.g. a malformed message, bad RLP, incorrect magic number
65 | &c.;
66 | * `0x03` Useless peer;
67 | * `0x04` Too many peers;
68 | * `0x05` Already connected;
69 | * `0x06` Incompatible P2P protocol version;
70 | * `0x07` Null node identity received - this is automatically invalid;
71 | * `0x08` Client quitting;
72 | * `0x09` Unexpected identity (i.e. a different identity to a previous connection/what a
73 | trusted peer told us);
74 | * `0x0a` Identity is the same as this node (i.e. connected to itself);
75 | * `0x0b` Timeout on receiving a message (i.e. nothing received since sending last ping);
76 | * `0x10` Some other reason specific to a subprotocol.
77 |
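For implementers, the reason codes listed above map naturally onto an enumeration. The sketch below is one possible encoding; the member names and the `describe_reason` helper are hypothetical and not part of the specification.

```python
from enum import IntEnum


class DisconnectReason(IntEnum):
    """Disconnect reason codes as listed in the text; names are illustrative."""

    REQUESTED = 0x00
    TCP_ERROR = 0x01
    PROTOCOL_BREACH = 0x02
    USELESS_PEER = 0x03
    TOO_MANY_PEERS = 0x04
    ALREADY_CONNECTED = 0x05
    INCOMPATIBLE_VERSION = 0x06
    NULL_IDENTITY = 0x07
    CLIENT_QUITTING = 0x08
    UNEXPECTED_IDENTITY = 0x09
    SELF_CONNECTION = 0x0A
    PING_TIMEOUT = 0x0B
    SUBPROTOCOL_REASON = 0x10


def describe_reason(code: int) -> str:
    """Return a readable name, tolerating codes that are not in the list."""
    try:
        return DisconnectReason(code).name
    except ValueError:
        return f"UNKNOWN_0x{code:02x}"
```

Tolerating unknown codes keeps a receiver from failing on reasons it does not recognise.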
78 | **Ping** `0x02` [] Requests an immediate reply of `Pong` from the peer.
79 |
80 | **Pong** `0x03` [] Reply to peer's `Ping` packet.
81 |
82 | ### Session Management
83 |
84 | Upon connecting, all clients (i.e. both sides of the connection) must send a `Hello`
85 | message. Upon receiving the `Hello` message and verifying compatibility of the network and
86 | versions, a session is active and any other P2P messages may be sent.
87 |
88 | At any time, a `Disconnect` message may be sent.
89 |
90 | [rlp]: https://github.com/ethereum/wiki/wiki/RLP
91 | [rlpx]: https://github.com/ethereum/devp2p/tree/master/rlpx.md
92 | [discv4]: https://github.com/ethereum/devp2p/tree/master/discv4.md
93 |
--------------------------------------------------------------------------------
/discv4.md:
--------------------------------------------------------------------------------
1 | # Node Discovery Protocol v4
2 |
3 | This specification defines the Node Discovery protocol version 4, a Kademlia-like DHT that
4 | stores information about Ethereum nodes. The Kademlia structure was chosen because it
5 | yields a topology of low diameter.
6 |
7 | ## Node Identities
8 |
9 | Every node has a cryptographic identity, a key on the elliptic curve secp256k1. The public
10 | key of the node serves as its identifier or 'node ID'.
11 |
12 | The 'distance' between two node IDs is the bitwise exclusive or of the hashes of the
13 | public keys, interpreted as a number.
14 |
15 | ```text
16 | distance(n₁, n₂) = keccak256(n₁) XOR keccak256(n₂)
17 | ```
18 |
19 | ## Node Table
20 |
21 | Nodes in the Discovery Protocol keep information about other nodes in their neighborhood.
22 | Neighbor nodes are stored in a routing table consisting of 'k-buckets'. For each `0 ≤ i <
23 | 256`, every node keeps a k-bucket for nodes of distance between `2^i` and `2^(i+1)` from
24 | itself.
25 |
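As a rough sketch of how distance and bucket selection fit together, the snippet below works on 32-byte hashes that are assumed to have been computed already (Python's standard library has no Keccak-256, so hashing is left out). It reads the bucket bounds as powers of two, i.e. bucket `i` holds distances from `2^i` up to but excluding `2^(i+1)`. The function names are hypothetical.

```python
from typing import Optional


def distance(hash_a: bytes, hash_b: bytes) -> int:
    """XOR two 32-byte node ID hashes and interpret the result as an integer."""
    if len(hash_a) != 32 or len(hash_b) != 32:
        raise ValueError("expected 32-byte hashes")
    return int.from_bytes(hash_a, "big") ^ int.from_bytes(hash_b, "big")


def bucket_index(dist: int) -> Optional[int]:
    """Return i such that 2**i <= dist < 2**(i+1), or None for distance 0."""
    if dist == 0:
        return None  # a node does not store itself
    return dist.bit_length() - 1
```

With 256-bit hashes the index always falls in the range `0 ≤ i < 256`, matching the number of buckets.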
26 | The Node Discovery Protocol uses `k = 16`, i.e. every k-bucket contains up to 16 node
27 | entries. The entries are sorted by time last seen — least-recently seen node at the head,
28 | most-recently seen at the tail.
29 |
30 | Whenever a new node N₁ is encountered, it can be inserted into the corresponding bucket.
31 | If the bucket contains fewer than `k` entries, N₁ can simply be added as the first entry. If
32 | the bucket already contains `k` entries, the least recently seen node in the bucket, N₂,
33 | needs to be revalidated by sending a ping packet. If no reply is received from N₂ it is
34 | considered dead, removed and N₁ added to the front of the bucket.
35 |
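A minimal sketch of the bucket update described above. Several details are assumptions rather than statements of the specification: the list keeps the least recently seen node at index 0 and appends newly seen nodes at the other end, a node that answers the revalidation ping is kept and the newcomer is dropped (as in classic Kademlia), and `ping` is a hypothetical callback that returns whether a reply arrived.

```python
from typing import Callable, List

K = 16  # bucket size used by the Node Discovery Protocol


class KBucket:
    def __init__(self, k: int = K) -> None:
        self.k = k
        # Assumed ordering: index 0 is least recently seen, last is most recent.
        self.entries: List[str] = []

    def observe(self, node_id: str, ping: Callable[[str], bool]) -> bool:
        """Record a sighting of node_id; return True if it ends up in the bucket."""
        if node_id in self.entries:
            # Already known: refresh its position (assumption).
            self.entries.remove(node_id)
            self.entries.append(node_id)
            return True
        if len(self.entries) < self.k:
            self.entries.append(node_id)
            return True
        oldest = self.entries[0]
        if ping(oldest):
            # The old node is alive: keep it and drop the newcomer (assumption).
            self.entries.remove(oldest)
            self.entries.append(oldest)
            return False
        # No reply: the old node is considered dead and replaced.
        self.entries.remove(oldest)
        self.entries.append(node_id)
        return True
```

Preferring long-lived nodes over newcomers makes it harder to flush a routing table by flooding it with fresh identities.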
36 | ## Endpoint Proof
37 |
38 | To prevent traffic amplification attacks, implementations must verify that the sender of a
39 | query participates in the discovery protocol. The sender of a packet is considered
40 | verified if it has sent a valid pong response with matching ping hash within the last 12
41 | hours.
42 |
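One way to track the endpoint proof is to remember the hash of the last ping sent to each node and the time a matching pong arrived. The class below is a sketch under that assumption; its name, its methods and the use of plain string node IDs are hypothetical.

```python
import time
from typing import Dict, Optional

PROOF_LIFETIME = 12 * 60 * 60  # 12 hours, in seconds


class EndpointProofs:
    def __init__(self) -> None:
        self._pending: Dict[str, bytes] = {}      # node -> hash of last ping sent
        self._verified_at: Dict[str, float] = {}  # node -> time of last valid pong

    def ping_sent(self, node_id: str, ping_hash: bytes) -> None:
        self._pending[node_id] = ping_hash

    def pong_received(self, node_id: str, ping_hash: bytes,
                      now: Optional[float] = None) -> bool:
        """Accept the pong only if it echoes the hash of the outstanding ping."""
        now = time.time() if now is None else now
        if self._pending.get(node_id) != ping_hash:
            return False
        del self._pending[node_id]
        self._verified_at[node_id] = now
        return True

    def is_verified(self, node_id: str, now: Optional[float] = None) -> bool:
        """True if a valid pong was seen within the last 12 hours."""
        now = time.time() if now is None else now
        seen = self._verified_at.get(node_id)
        return seen is not None and now - seen <= PROOF_LIFETIME
```

A handler for incoming queries would call `is_verified` before replying and drop the query otherwise.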
43 | ## Recursive Lookup
44 |
45 | A 'lookup' locates the `k` closest nodes to a node ID.
46 |
47 | The lookup initiator starts by picking the `α` closest nodes to the target it knows of. The
48 | initiator then sends concurrent FindNode packets to those nodes. `α` is a system-wide
49 | concurrency parameter, such as 3. In the recursive step, the initiator resends FindNode to
50 | nodes it has learned about from previous queries. Of the `k` nodes the initiator has heard
51 | of closest to the target, it picks `α` that it has not yet queried and resends FindNode to
52 | them. Nodes that fail to respond quickly are removed from consideration until and unless
53 | they do respond.
54 |
55 | If a round of FindNode queries fails to return a node any closer than the closest already
56 | seen, the initiator resends FindNode to all of the `k` closest nodes it has not
57 | already queried. The lookup terminates when the initiator has queried and gotten responses
58 | from the `k` closest nodes it has seen.
59 |
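The lookup loop can be sketched as below. This is a simplified model, not the behaviour of any particular client: node IDs are small integers standing in for hashed IDs, queries run one after another instead of concurrently, and `find_node` is a hypothetical callback that returns the IDs a node reports, or `None` when it does not respond.

```python
from typing import Callable, List, Optional, Set

K = 16     # number of closest nodes to locate
ALPHA = 3  # concurrency parameter (queries per round in this sketch)


def lookup(
    target: int,
    seeds: List[int],
    find_node: Callable[[int, int], Optional[List[int]]],
    k: int = K,
    alpha: int = ALPHA,
) -> List[int]:
    """Return up to k responsive nodes closest to target by XOR distance."""

    def dist(node: int) -> int:
        return node ^ target

    seen: Set[int] = set(seeds)
    queried: Set[int] = set()
    failed: Set[int] = set()
    progressed = True
    while True:
        candidates = sorted(seen - failed, key=dist)[:k]
        pending = [n for n in candidates if n not in queried]
        if not pending:
            return candidates  # every one of the k closest has answered
        # Normal rounds ask alpha nodes; after a round without progress,
        # ask all remaining unqueried nodes among the k closest.
        batch = pending[:alpha] if progressed else pending
        best = dist(candidates[0])
        for node in batch:
            queried.add(node)
            reply = find_node(node, target)
            if reply is None:
                failed.add(node)  # unresponsive nodes drop out of consideration
                continue
            seen.update(reply)
        new_best = min((dist(n) for n in seen - failed), default=best)
        progressed = new_best < best
```

Each pass queries at least one new node, so the loop ends once the `k` closest known nodes have all been asked.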
60 | ## Wire Protocol
61 |
62 | Node discovery messages are sent as UDP datagrams. The maximum size of any packet is 1280
63 | bytes.
64 |
65 | ```text
66 | packet = packet-header || packet-data
67 | ```
68 |
69 | Every packet starts with a header:
70 |
71 | ```text
72 | packet-header = hash || signature || packet-type
73 | hash = keccak256(signature || packet-type || packet-data)
74 | signature = sign(packet-type || packet-data)
75 | ```
76 |
77 | The `hash` exists to make the packet format recognizable when running multiple protocols
78 | on the same UDP port. It serves no other purpose.
79 |
80 | Every packet is signed by the node's identity key. The `signature` is encoded as a byte
81 | array of length 65 as the concatenation of the signature values `r`, `s` and the 'recovery
82 | id' `v`.
83 |
84 | The `packet-type` is a single byte defining the type of message. Valid packet types are
85 | listed below. Data after the header is specific to the packet type and is encoded as an
86 | RLP list. As per EIP-8, implementations should ignore any additional elements in the list
87 | as well as any extra data after the list.
88 |
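The header layout above (a 32-byte hash, a 65-byte signature and a one-byte type) can be split as in the sketch below. The Keccak-256 function is passed in as a parameter because Python's standard library does not provide it (`hashlib.sha3_256` uses different padding and is not a substitute). The names are hypothetical, and signature recovery and RLP decoding are deliberately left out.

```python
import hashlib
from typing import Callable, NamedTuple

MAX_PACKET_SIZE = 1280
HASH_SIZE = 32
SIGNATURE_SIZE = 65
HEADER_SIZE = HASH_SIZE + SIGNATURE_SIZE + 1  # hash || signature || packet-type


class Packet(NamedTuple):
    packet_hash: bytes
    signature: bytes
    packet_type: int
    data: bytes  # RLP-encoded packet-data, not decoded here


def split_packet(datagram: bytes, keccak256: Callable[[bytes], bytes]) -> Packet:
    """Split a datagram into its header fields and check the leading hash."""
    if len(datagram) > MAX_PACKET_SIZE:
        raise ValueError("packet exceeds 1280 bytes")
    if len(datagram) < HEADER_SIZE:
        raise ValueError("packet shorter than header")
    packet_hash = datagram[:HASH_SIZE]
    # The hash covers everything after itself: signature || type || data.
    if keccak256(datagram[HASH_SIZE:]) != packet_hash:
        raise ValueError("hash mismatch")
    signature = datagram[HASH_SIZE:HASH_SIZE + SIGNATURE_SIZE]
    packet_type = datagram[HASH_SIZE + SIGNATURE_SIZE]
    return Packet(packet_hash, signature, packet_type, datagram[HEADER_SIZE:])


if __name__ == "__main__":
    # SHA-256 is only a stand-in so the demo runs without third-party packages.
    def stand_in(b: bytes) -> bytes:
        return hashlib.sha256(b).digest()

    body = bytes(SIGNATURE_SIZE) + b"\x01" + b"\xc0"
    print(split_packet(stand_in(body) + body, stand_in).packet_type)
```

A real implementation would next recover the sender's public key from `signature` and decode `data` as an RLP list.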
89 | ### Ping Packet (0x01)
90 |
91 | ```text
92 | packet-data = [version, from, to, expiration]
93 | version = 4
94 | from = [sender-ip, sender-udp-port, sender-tcp-port]
95 | to = [recipient-ip, recipient-udp-port, 0]
96 | ```
97 |
98 | The `expiration` field is an absolute UNIX time stamp. Packets containing a time stamp
99 | that lies in the past are expired and may not be processed.
100 |
101 | When a ping packet is received, the recipient should reply with a pong packet. It may also
102 | consider the sender for addition into the node table.
103 |
104 | If no communication with the sender has occurred within the last 12h, a ping should be
105 | sent in addition to pong in order to receive an endpoint proof.
106 |
107 | ### Pong Packet (0x02)
108 |
109 | ```text
110 | packet-data = [to, ping-hash, expiration]
111 | ```
112 |
113 | Pong is the reply to ping.
114 |
115 | `ping-hash` should be equal to `hash` of the corresponding ping packet. Implementations
116 | should ignore unsolicited pong packets that do not contain the hash of the most recent
117 | ping packet.
118 |
119 | ### FindNode Packet (0x03)
120 |
121 | ```text
122 | packet-data = [target, expiration]
123 | ```
124 |
125 | A FindNode packet requests information about nodes close to `target`. The `target` is a
126 | 64-byte secp256k1 public key. When FindNode is received, the recipient should reply with
127 | neighbors packets containing the closest 16 nodes to target found in its local table.
128 |
129 | To guard against traffic amplification attacks, Neighbors replies should only be sent if
130 | the sender of FindNode has been verified by the endpoint proof procedure.
131 |
132 | ### Neighbors Packet (0x04)
133 |
134 | ```text
135 | packet-data = [nodes, expiration]
136 | nodes = [[ip, udp-port, tcp-port, node-id], ... ]
137 | ```
138 |
139 | Neighbors is the reply to FindNode.
140 |
141 | ## Known Issues & Implementation Advice
142 |
143 | The `expiration` field present in all packets is supposed to prevent packet replay. Since
144 | it is an absolute time stamp, the node's clock must be accurate to verify it correctly.
145 | Since the protocol's launch in 2016 we have received countless reports about connectivity
146 | issues related to the user's clock being wrong.
147 |
148 | The endpoint proof is imprecise because the sender of FindNode can never be sure whether
149 | the recipient has seen a recent enough pong. Geth handles it as follows: If no
150 | communication with the recipient has occurred within the last 12h, initiate the procedure
151 | by sending a ping. Wait for a ping from the other side, reply to it and then send
152 | FindNode.
153 |
--------------------------------------------------------------------------------
/etherdog.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/IDouble/devp2p/7d5b4f24febd1b474cb7c03973b439e76b81b041/etherdog.png
--------------------------------------------------------------------------------
/rlpx.md:
--------------------------------------------------------------------------------
1 | # The RLPx Transport Protocol
2 |
3 | This specification defines the RLPx transport protocol, a TCP-based transport protocol
4 | used for communication among Ethereum nodes. The protocol works with encrypted frames of
5 | arbitrary content, though it is typically used to carry the devp2p application protocol.
6 |
7 | ## Node Identity
8 |
9 | All cryptographic operations are based on the secp256k1 elliptic curve. Each node is
10 | expected to maintain a static private key which is saved and restored between sessions. It
11 | is recommended that the private key can only be reset manually, for example, by deleting a
12 | file or database entry.
13 |
14 | ## ECIES Encryption
15 |
16 | ECIES (Elliptic Curve Integrated Encryption Scheme) is an asymmetric encryption method
17 | used in the RLPx handshake. The cryptosystem used by RLPx is
18 |
19 | - The elliptic curve secp256k1 with generator `G`.
20 | - `KDF(k, len)`: the NIST SP 800-56 Concatenation Key Derivation Function
21 | - `MAC(k, m)`: HMAC using the SHA-256 hash function.
22 | - `AES(k, iv, m)`: the AES-128 encryption function in CTR mode.
23 |
24 | Alice wants to send an encrypted message that can be decrypted by Bob's static private key
25 | `kB`. Alice knows Bob's static public key
26 | `KB`.
27 |
28 | To encrypt the message `m`, Alice generates a random number `r` and corresponding elliptic
29 | curve public key `R = r * G` and computes the shared secret `S = Px`
30 | where `(Px, Py) = r * KB`. She derives key
31 | material for encryption and authentication as
32 | `kE || kM = KDF(S, 32)` as well as a random
33 | initialization vector `iv`. Alice sends the encrypted message `R || iv || c || d` where
34 | `c = AES(kE, iv, m)` and
35 | `d = MAC(kM, iv || c)` to Bob.
36 |
37 | For Bob to decrypt the message `R || iv || c || d`, he derives the shared secret
38 | `S = Px` where
39 | `(Px, Py) = kB * R` as well as the encryption and
40 | authentication keys `kE || kM = KDF(S, 32)`. Bob verifies
41 | the authenticity of the message by checking whether
42 | `d == MAC(kM, iv || c)` then obtains the plaintext as
43 | `m = AES(kE, iv, c)`.
44 |
45 | ## Handshake
46 |
47 | The 'handshake' establishes key material to be used for the duration of the session. It is
48 | carried out between the initiator (the node which opened the TCP connection) and the recipient
49 | (the node which accepted it).
50 |
51 | Handshake protocol:
52 |
53 | `E` is the ECIES asymmetric encryption function defined above.
54 |
55 | ```text
56 |
57 | auth -> E(remote-pubk, S(ephemeral-privk, static-shared-secret ^ nonce) || H(ephemeral-pubk) || pubk || nonce || 0x0)
58 | auth-ack -> E(remote-pubk, remote-ephemeral-pubk || nonce || 0x0)
59 |
60 | static-shared-secret = ecdh.agree(privkey, remote-pubk)
61 | ```
62 |
63 | Values generated following the handshake (see below for steps):
64 |
65 | ```text
66 | ephemeral-shared-secret = ecdh.agree(ephemeral-privkey, remote-ephemeral-pubk)
67 | shared-secret = keccak256(ephemeral-shared-secret || keccak256(nonce || initiator-nonce))
68 | aes-secret = keccak256(ephemeral-shared-secret || shared-secret)
69 | # destroy shared-secret
70 | mac-secret = keccak256(ephemeral-shared-secret || aes-secret)
71 | # destroy ephemeral-shared-secret
72 |
73 | Initiator:
74 | egress-mac = keccak256.update(mac-secret ^ recipient-nonce || auth-sent-init)
75 | # destroy nonce
76 | ingress-mac = keccak256.update(mac-secret ^ initiator-nonce || auth-recvd-ack)
77 | # destroy remote-nonce
78 |
79 | Recipient:
80 | egress-mac = keccak256.update(mac-secret ^ initiator-nonce || auth-sent-ack)
81 | # destroy nonce
82 | ingress-mac = keccak256.update(mac-secret ^ recipient-nonce || auth-recvd-init)
83 | # destroy remote-nonce
84 | ```
85 |
86 | Creating authenticated connection:
87 |
88 | 1. initiator connects to recipient and sends `auth` message
89 | 2. recipient accepts, decrypts and verifies `auth` (checks that recovery of signature ==
90 | `keccak256(ephemeral-pubk)`)
91 | 3. recipient generates `auth-ack` message from `remote-ephemeral-pubk` and `nonce`
92 | 4. recipient derives secrets and sends the first payload frame
93 | 5. initiator receives `auth-ack` and derives secrets
94 | 6. initiator sends first payload frame
95 | 7. recipient receives and authenticates first payload frame
96 | 8. initiator receives and authenticates first payload frame
97 | 9. cryptographic handshake is complete if MAC of first payload frame is valid on both sides
98 |
99 | ## Framing
100 |
101 | All packets following `auth` are framed. Either side may disconnect if authentication of
102 | the first framed packet fails.
103 |
104 | The primary purpose of framing packets is to robustly support multiplexing
105 | multiple protocols over a single connection. Secondarily, as framed packets yield
106 | reasonable demarcation points for message authentication codes, supporting an encrypted
107 | stream becomes straightforward. Frames are authenticated via key material which is
108 | generated during the handshake.
109 |
110 | The frame header provides information about the size of the packet and the packet's source
111 | protocol.
112 |
113 | ```text
114 | frame = header || header-mac || frame-data || frame-mac
115 | header = frame-size || header-data || padding
116 | frame-size = size of frame excluding padding, integer < 2**24, big endian
117 | header-data = rlp.list(protocol-type[, context-id])
118 | protocol-type = integer < 2**16, big endian
119 | context-id = integer < 2**16, big endian
120 | padding = zero-fill to 16-byte boundary
121 | frame-content = any binary data
122 |
123 | header-mac = left16(egress-mac.update(aes(mac-secret,egress-mac)) ^ header-ciphertext).digest
124 | frame-mac = left16(egress-mac.update(aes(mac-secret,egress-mac)) ^ left16(egress-mac.update(frame-ciphertext).digest))
125 | egress-mac = keccak256 state, continuously updated with egress bytes
126 | ingress-mac = keccak256 state, continuously updated with ingress bytes
127 |
128 | left16(x) is the first 16 bytes of x
129 | || is concatenate
130 | ^ is xor
131 | ```
132 |
133 | Message authentication is achieved by continuously updating `egress-mac` or `ingress-mac`
134 | with the ciphertext of bytes sent (egress) or received (ingress); for headers the update
135 | is performed by xoring the header with the encrypted output of its corresponding mac (see
136 | header-mac above for example). This is done to ensure uniform operations are performed for
137 | both plaintext mac and ciphertext. All macs are sent cleartext.
138 |
139 | Padding is used to prevent buffer starvation, such that frame components are byte-aligned
140 | to the block size of the cipher.
141 |
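The header layout and padding rule can be sketched as follows. This covers only the plaintext layout of `frame-size || header-data || padding`; encryption and MAC computation are omitted. The function names are hypothetical, and the caller is assumed to supply `header_data` already RLP-encoded.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pad_to_block(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Zero-fill data up to the next multiple of block_size."""
    remainder = len(data) % block_size
    if remainder == 0:
        return data
    return data + bytes(block_size - remainder)


def build_header(frame_size: int, header_data: bytes) -> bytes:
    """Lay out frame-size (3 bytes, big endian) || header-data || padding."""
    if not 0 <= frame_size < 2**24:
        raise ValueError("frame-size must be below 2**24")
    return pad_to_block(frame_size.to_bytes(3, "big") + header_data)


if __name__ == "__main__":
    # b"\xc2\x80\x80" is the RLP encoding of the list [0, 0], used here
    # purely as illustrative header-data.
    print(build_header(5, b"\xc2\x80\x80").hex())
```

The same `pad_to_block` helper applies to the frame body, since `frame-size` excludes padding and the receiver can strip the zero fill using that length.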
142 | ## Known Issues
143 |
144 | - The RLPx handshake is considered 'broken crypto' because `aes-secret` and `mac-secret`
145 | are reused for both reading and writing. The two sides of an RLPx connection generate two
146 | CTR streams from the same key, nonce and IV. If an attacker knows one plaintext, they can
147 | decrypt unknown plaintexts of the reused keystream.
148 | - The frame encoding provides a `protocol-type` field for multiplexing purposes, but this
149 | field is unused by devp2p.
150 |
151 | ## References
152 | - Petar Maymounkov and David Mazieres. Kademlia: A Peer-to-peer Information System Based on the XOR Metric. 2002. URL { https://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf }
153 | - Victor Shoup. A proposal for an ISO standard for public key encryption, Version 2.1. 2001. URL { http://www.shoup.net/papers/iso-2_1.pdf }
154 | - Mike Belshe and Roberto Peon. SPDY Protocol - Draft 3. 2014. URL { http://www.chromium.org/spdy/spdy-protocol/spdy-protocol-draft3 }
155 |
156 | Copyright © 2014 Alex Leverington.
157 | This work is licensed under a
158 | Creative Commons Attribution-NonCommercial-ShareAlike
159 | 4.0 International License.
160 |
--------------------------------------------------------------------------------