├── .gitattributes
├── .pre-commit-config.yaml
├── README.md
├── core
│   ├── merkle-tree.md
│   └── poh.md
├── gossip
│   └── gossip-protocol-spec.md
├── p2p
│   ├── shred.md
│   └── tpu.md
└── solana_specs
    ├── .gitignore
    ├── __init__.py
    ├── consensus
    │   └── leader_schedule.py
    ├── core
    │   ├── base58.py
    │   ├── chacha20.py
    │   ├── chacha20rng.py
    │   ├── poh.py
    │   └── weighted_index.py
    └── fixtures
        ├── epoch-stakes-mainnet-454.csv
        └── leader-schedule-454.txt
/.gitattributes:
--------------------------------------------------------------------------------
1 | solana_specs/fixtures/** filter=lfs diff=lfs merge=lfs -text
2 |
--------------------------------------------------------------------------------
/.pre-commit-config.yaml:
--------------------------------------------------------------------------------
1 | repos:
2 |   - repo: https://github.com/pre-commit/pre-commit-hooks
3 |     rev: v4.4.0
4 |     hooks:
5 |       - id: check-yaml
6 |       - id: end-of-file-fixer
7 |       - id: trailing-whitespace
8 |   - repo: https://github.com/psf/black
9 |     rev: 23.3.0
10 |     hooks:
11 |       - id: black
12 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Solana protocol specs
2 |
3 | This repository hosts protocol specifications of the Solana network, maintained by various protocol teams.
4 |
5 | ## Organization
6 |
7 | | Section | Description |
8 | |--------------:|-----------------------------------------|
9 | | *[core]* | Basic concepts and data structures |
10 | | *[gossip]* | Protocol for network communication |
11 | | *[consensus]* | Blockchain consensus rules |
12 | | *[runtime]* | On-chain runtime environment (Sealevel) |
13 | | *[p2p]* | Validator network protocols |
14 | | *[api]* | Client-facing node APIs (e.g. JSON-RPC) |
15 |
16 | [core]: ./core/
17 | [consensus]: ./consensus/
18 | [runtime]: ./runtime/
19 | [p2p]: ./p2p/
20 | [api]: ./api/
21 | [gossip]: ./gossip/
22 |
23 | ## Community
24 |
25 | This repo exists to define a single source of truth for consensus-critical sections of the protocol,
26 | such as verification and state transition rules.
27 |
28 | The first long-term objective of the specification effort is to produce a complete and unambiguous reference for implementing a Solana validator.
29 |
30 | Other documentation regarding widely adopted protocols may be added at the discretion of the Solana Foundation.
31 |
32 | ## Reference Code
33 |
34 | The `solana_specs` module contains Python 3.10 implementations of various specs.
35 |
36 | Each unit is runnable as a test, like so:
37 |
38 | ```
39 | python3.10 -m solana_specs.consensus.leader_schedule
40 | ```
41 |
42 | The reference code is formatted using [black](https://black.readthedocs.io/en/stable/).
43 |
44 | Certain tests depend on test fixtures which are hosted on [Git LFS](https://git-lfs.com/).
45 | To download test fixtures, run:
46 |
47 | ```shell
48 | git lfs pull
49 | ```
50 |
--------------------------------------------------------------------------------
/core/merkle-tree.md:
--------------------------------------------------------------------------------
1 | # Binary Merkle Tree
2 |
3 | ## Usage
4 |
5 | The binary hash tree facilitates equality checks over a list of arbitrary data blobs.
6 |
7 | Each tree anchors in a 32 byte *root hash* which is constructed by recursively hashing pairs of two tree nodes into one.
8 |
9 | Merkle proofs, which equate arbitrary data against entries in the tree, are of succinct size irrespective of the input data.
10 |
11 | ## Structure
12 |
13 | - Each tree node is identified by a SHA-256 hash.
14 | - Two types of nodes exist: *Leaf* and *intermediate* nodes.
15 | - The pre-image of a *leaf node* is the byte `0x00` followed by an arbitrary amount of data.
16 | - The pre-image of an *intermediate node* is the byte `0x01` followed by two 32 byte hashes, each referring to a node.
17 | - Each node has one or zero parent intermediate nodes.
18 | - Referring to the same node twice in the same intermediate node is permitted.
19 | - The graph of nodes represents a binary tree, thus is acyclic and has one *root node*.
20 |
21 | ## Algorithms
22 |
23 | ### Leaf Order
24 |
25 | The *leaf order* of a tree is defined by depth-first search traversal starting at the root,
26 | counting only leaf nodes.
27 |
28 | **Example**
29 |
30 | The leaf order of the tree in _figure 1_ is `[Lβ, Lα, Lδ]`.
31 |
32 |
33 |
34 | ```
35 | Figure 1: Non-canonical tree with three leaf nodes and two intermediate nodes
36 |
37 |       Iε
38 |      /  \
39 |    Iγ    Lδ
40 |   /  \
41 | Lβ    Lα
42 | ```
43 |
44 | ### Level Order
45 |
46 | The *level order* of a tree is defined by depth-first search starting at the root,
47 | counting only nodes with a given level.
48 |
49 | **Example**
50 |
51 | The tree in [_figure 1_](#figure_1) has the following level orders:
52 | - `0: [Iε]`
53 | - `1: [Iγ, Lδ]`
54 | - `2: [Lβ, Lα]`
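
Both orders can be illustrated with a short traversal. The following is a minimal sketch, not part of the spec: the `Node` class is a hypothetical in-memory representation, and the traversal is assumed to visit the left child before the right child.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical in-memory tree node; a leaf has no children."""

    name: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def leaf_order(node: Node) -> list[str]:
    """Depth-first search from `node`, collecting only leaf nodes."""
    if node.left is None and node.right is None:
        return [node.name]
    order: list[str] = []
    for child in (node.left, node.right):
        if child is not None:
            order.extend(leaf_order(child))
    return order


def level_order(node: Node, level: int, depth: int = 0) -> list[str]:
    """Depth-first search from `node`, collecting only nodes at `level`."""
    if depth == level:
        return [node.name]
    order: list[str] = []
    for child in (node.left, node.right):
        if child is not None:
            order.extend(level_order(child, level, depth + 1))
    return order


# The non-canonical tree from figure 1.
figure_1 = Node("Iε", Node("Iγ", Node("Lβ"), Node("Lα")), Node("Lδ"))

print(leaf_order(figure_1))      # ['Lβ', 'Lα', 'Lδ']
print(level_order(figure_1, 1))  # ['Iγ', 'Lδ']
```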
55 |
56 | ### Canonical Construction
57 |
58 | The construction algorithm deterministically creates a tree structure over a list of items.
59 |
60 | Determinism ensures that independently constructed trees over the same items are identical.
61 | This is required for equality and membership checks.
62 |
63 | The *canonical* tree layout for any arbitrary list of items is defined by the following invariants:
64 | - Each list item corresponds to one leaf node.
65 | - The ordering of list items matches the order of leaf nodes.
66 | - Each leaf node is in the deepest tree level.
67 | - For any level `l` with number of nodes `n(l)`, if `n(l) % 2 == 1 and n(l) > 1`,
68 | then the last node in level `l-1` is an intermediate node that contains the hash of the last node in `l` twice.
69 |
70 | **Example**
71 |
72 | _Figure 2_ shows the canonical construction of items `[L0, L1, L2, L3, L4]`.
73 |
74 |
75 |
76 | ```
77 | Figure 2: Canonical tree with 5 items
78 |
79 |             Iζ
80 |            /  \
81 |           /    \
82 |         Iδ      Iε
83 |        /  \      \\
84 |       /    \      \\
85 |     Iα      Iβ    Iγ
86 |    /  \    /  \   ||
87 |   L0  L1  L2  L3  L4
88 | ```
89 |
90 | Contents of nodes:
91 |
92 | - `L0 := sha256(concat(0x00, data[0]))`
93 | - `L1 := sha256(concat(0x00, data[1]))`
94 | - `L2 := sha256(concat(0x00, data[2]))`
95 | - `L3 := sha256(concat(0x00, data[3]))`
96 | - `L4 := sha256(concat(0x00, data[4]))`
97 | - `Iα := sha256(concat(0x01, hash(L0), hash(L1)))`
98 | - `Iβ := sha256(concat(0x01, hash(L2), hash(L3)))`
99 | - `Iγ := sha256(concat(0x01, hash(L4), hash(L4)))`
100 | - `Iδ := sha256(concat(0x01, hash(Iα), hash(Iβ)))`
101 | - `Iε := sha256(concat(0x01, hash(Iγ), hash(Iγ)))`
102 | - `Iζ := sha256(concat(0x01, hash(Iδ), hash(Iε)))`
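
The invariants above can be turned into a bottom-up construction. The following is a minimal sketch under stated assumptions, not the reference implementation: the function names are hypothetical, and raising on an empty list is a choice made here because the invariants do not define a tree over zero items.

```python
import hashlib

LEAF_PREFIX = b"\x00"
INTERMEDIATE_PREFIX = b"\x01"


def leaf_hash(data: bytes) -> bytes:
    """Hash of a leaf node: sha256(0x00 || data)."""
    return hashlib.sha256(LEAF_PREFIX + data).digest()


def intermediate_hash(left: bytes, right: bytes) -> bytes:
    """Hash of an intermediate node: sha256(0x01 || left || right)."""
    return hashlib.sha256(INTERMEDIATE_PREFIX + left + right).digest()


def canonical_root(items: list[bytes]) -> bytes:
    """Build the canonical tree level by level and return its root hash."""
    if not items:
        # Assumption: the empty list is rejected rather than given a root.
        raise ValueError("canonical tree is not defined for zero items")
    level = [leaf_hash(item) for item in items]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Odd level with more than one node: the last node is paired
            # with itself, as in Iγ and Iε of figure 2.
            level.append(level[-1])
        level = [
            intermediate_hash(level[i], level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


root = canonical_root([b"L0", b"L1", b"L2", b"L3", b"L4"])
print(root.hex())
```

Working on one level at a time keeps only the current level's hashes in memory, which is enough when only the root is needed; producing Merkle proofs would require retaining the intermediate levels.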
103 |
104 | ### List Equality Check
105 |
106 | Checking the equality of two Merkle trees is trivial: compare the root hashes of the two trees.
107 |
108 | ## Security
109 |
110 | No practical collision attacks against SHA-256 are known as of Oct 2022.
111 |
112 | Collision resistance is vital to ensure that the graph of nodes remains acyclic and that each hash unambiguously refers to one logical node.
113 |
114 | ## Test Vectors
115 |
116 |
117 | Canonical root test vectors
118 |
119 | | Items (UTF-8) | Root Hash (Hex) |
120 | |---------------|-----------------|
121 | | `['test']` | `dbebd10e61bc8c28591273feafbbef95d544f874693301d8f7f8e54c6e30058e` |
122 | | `['my', 'very', 'eager', 'mother', 'just', 'served', 'us', 'nine', 'pizzas', 'make', 'prime']` | `b40c847546fdceea166f927fc46c5ca33c3638236a36275c1346d3dffb84e1bc` |
132 |
133 |
134 |
135 |
--------------------------------------------------------------------------------
/core/poh.md:
--------------------------------------------------------------------------------
1 | # Proof-of-History delay function
2 |
3 | ## Usage
4 |
5 | Proof-of-History (PoH) is a recursive SHA-256 hash chain.
6 |
7 | ## Structure
8 |
9 | The state of PoH is sized 256 bits.
10 | The initial state is set to the *seed* value.
11 |
12 | The *append* operation sets the state to the SHA-256 hash of itself.
13 |
14 | The *mixin* operation sets the state to the SHA-256 hash of the concatenation of itself and an arbitrary 32 byte external input.
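
The two operations can be sketched in a few lines. This is a minimal illustration and not the repository's `poh.py`; the class name `PohSketch` and its method signatures are hypothetical.

```python
import hashlib


class PohSketch:
    """Hypothetical minimal PoH state machine with a 256-bit state."""

    def __init__(self, seed: bytes) -> None:
        if len(seed) != 32:
            raise ValueError("seed must be 32 bytes")
        self.state = seed

    def append(self, count: int = 1) -> None:
        """Set the state to the SHA-256 hash of itself, `count` times."""
        for _ in range(count):
            self.state = hashlib.sha256(self.state).digest()

    def mixin(self, external: bytes) -> None:
        """Set the state to sha256(state || external)."""
        if len(external) != 32:
            raise ValueError("mixin input must be 32 bytes")
        self.state = hashlib.sha256(self.state + external).digest()


poh = PohSketch(bytes(32))
poh.append(3)
poh.mixin(b"\x01" * 32)
print(poh.state.hex())
```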
15 |
16 | ## Pseudocode
17 |
18 | [poh.py](../solana_specs/core/poh.py) is a functional Python 3 implementation of PoH.
19 |
20 | ## Test Vectors
21 |
22 | ### Solana mainnet block 0
23 |
24 |
25 |
26 | | Operation | Value |
27 | |------------|-------|
28 | | Pre State | `45296998a6f8e2a784db5d9f95e18fc23f70441a1039446801089879b08c7ef0` |
29 | | Append | 800000x |
30 | | Post State | `3973e330c29b831f3fcb0e49374ed8d0388f410a23e4ebf23328505036efbd03` |
36 |
37 |
38 |
39 | ### Solana mainnet block 1
40 |
41 |
42 |
43 | | Operation | Value |
44 | |------------|-------|
45 | | Pre State | `3973e330c29b831f3fcb0e49374ed8d0388f410a23e4ebf23328505036efbd03` |
46 | | Append | 14612x |
47 | | Mixin | `c95f2f13a9a77f32b1437976c4cffe3029298a49bf37007f8e45d793a520f30b` |
48 | | Append | 210347x |
49 | | Mixin | `1aaeeb36611f484d984683a3db9269f2292dd9bb81bdab82b28c45625d9abd59` |
50 | | Append | 428775x |
51 | | Mixin | `db31e861b310f44954403e345b6beeb3ded34084b90694bccaa2345306d366e1` |
52 | | Append | 146263x |
53 | | Post State | `8ee20607dcf1d9393cf5a2f2c9f7babe167dbdd267491b513c73d2cbf87413f5` |
77 |
78 |
79 |
--------------------------------------------------------------------------------
/gossip/gossip-protocol-spec.md:
--------------------------------------------------------------------------------
1 | # Gossip protocol
2 |
3 | Solana nodes communicate with each other and share data using the gossip protocol. Messages are exchanged in a binary format and need to be deserialized. There are six types of messages:
4 | * pull request
5 | * pull response
6 | * push message
7 | * prune message
8 | * ping
9 | * pong
10 |
11 | Each message contains data specific to its type, such as shared values, filters, pruned nodes, etc. Nodes keep their data in _Cluster Replicated Data Store_ (`crds`), which is synchronized between nodes via pull requests, push messages and pull responses.
12 |
13 | > [!Tip]
14 | > **Naming conventions used in this document**
15 | > - _Node_ - a validator running the gossip
16 | > - _Peer_ - a node sending or receiving messages from the current node we are talking about
17 | > - _Entrypoint_ - the gossip address of the peer the node will initially connect to
18 | > - _Origin_ - node, the original creator of the message
19 | > - _Cluster_ - a network of validators with a leader that produces blocks
20 | > - _Leader_ - node, the leader of the cluster in a given slot
21 | > - _Shred_ - the smallest portion of block produced by a leader
22 | > - _Shred version_ - a cluster identification value
23 | > - _Fork_ - a fork occurs when two different blocks are chained to the same parent block (e.g. next block is created before the previous one was completed)
24 | > - _Epoch_ - a predefined period composed of a specific number of blocks (_slots_) in which the validator schedule is defined
25 | > - _Slot_ - the period of time for which each leader ingests transactions and produces a block
26 | > - _Message_ - the protocol message a node sends to its peers, can be push message, pull request, prune message, etc.
27 |
28 |
29 | ## Message format
30 |
31 | Each message is sent in binary form with a maximum size of 1232 bytes: 1280 bytes is the minimum `IPv6` MTU, from which the 40-byte `IPv6` header and the 8-byte fragment header are subtracted.
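
The size limit follows from simple arithmetic. The sketch below spells it out; the constant and function names are hypothetical and chosen only for illustration.

```python
IPV6_MIN_MTU = 1280        # minimum IPv6 MTU in bytes
IPV6_HEADER_SIZE = 40      # size of the IPv6 header in bytes
FRAGMENT_HEADER_SIZE = 8   # size of the fragment header in bytes

# Maximum size of a serialized gossip message.
PACKET_DATA_SIZE = IPV6_MIN_MTU - IPV6_HEADER_SIZE - FRAGMENT_HEADER_SIZE


def fits_in_packet(payload: bytes) -> bool:
    """Return True if a serialized message respects the size limit."""
    return len(payload) <= PACKET_DATA_SIZE


print(PACKET_DATA_SIZE)  # 1232
```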
32 |
33 | Data sent in each message is serialized from a `Protocol` type, which can be one of:
34 |
35 | | Enum ID | Message | Data | Description |
36 | |:--------:|--------------------------------|-------------------------------|-------------|
37 | | 0 | [pull request](#pull-request) | `CrdsFilter`, `CrdsValue` | sent by node to ask for new information |
38 | | 1 | [pull response](#pull-response) | `SenderPubkey`, `CrdsValuesList` | response to a pull request |
39 | | 2 | [push message](#push-message) | `SenderPubkey`, `CrdsValuesList` | sent by node to share the latest data with the cluster |
40 | | 3 | [prune message](#prune-message) | `SenderPubkey`, `PruneData` | sent to peers with a list of origin nodes that should be pruned |
41 | | 4 | [ping message](#ping-message) | `Ping` | sent by node to check for peers' liveness |
42 | | 5 | [pong message](#pong-message) | `Pong` | response to a ping (confirms liveness) |
43 |
44 |
45 | ```mermaid
46 | block-beta
47 | block
48 | columns 3
49 | block
50 | columns 1
51 | b["Pull request"]
52 | c["Pull response"]
53 | d["Push message"]
54 | e["Prune message"]
55 | f["Ping message"]
56 | g["Pong message"]
57 | end
58 | right<["Serialize"]>(right)
59 | a["Packet\n(1232 bytes)"]
60 | end
61 | ```
62 |
63 |
64 | Solana client Rust implementation
65 |
66 | ``` rust
67 | enum Protocol {
68 | PullRequest(CrdsFilter, CrdsValue),
69 | PullResponse(Pubkey, Vec<CrdsValue>),
70 | PushMessage(Pubkey, Vec<CrdsValue>),
71 | PruneMessage(Pubkey, PruneData),
72 | PingMessage(Ping),
73 | PongMessage(Pong)
74 | }
75 | ```
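
A receiver first has to decide which of the six message types a packet carries. The following is a minimal sketch, assuming the `Protocol` enum ID is encoded as a 4-byte little-endian discriminant at the start of the packet (the byte order is an assumption based on `bincode` defaults). `ProtocolId` and `message_type` are hypothetical names.

```python
import struct
from enum import IntEnum


class ProtocolId(IntEnum):
    """Enum IDs from the message table above."""

    PULL_REQUEST = 0
    PULL_RESPONSE = 1
    PUSH_MESSAGE = 2
    PRUNE_MESSAGE = 3
    PING = 4
    PONG = 5


def message_type(packet: bytes) -> ProtocolId:
    """Read the leading discriminant; raises ValueError if it is unknown."""
    if len(packet) < 4:
        raise ValueError("packet too short to contain a discriminant")
    (discriminant,) = struct.unpack_from("<I", packet, 0)
    return ProtocolId(discriminant)


# A ping is a 4-byte discriminant followed by 128 bytes of `Ping` data.
print(message_type(struct.pack("<I", 4) + bytes(128)).name)  # PING
```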
76 |
77 |
78 |
79 | ### Type definitions
80 | Fields described in the tables below have their types specified using Rust notation:
81 | * `u8` - 8-bit unsigned integer
82 | * `u16` - 16-bit unsigned integer
83 | * `u32` - 32-bit unsigned integer, and so on...
84 | * `[u8]` - dynamic size array of 1-byte elements
85 | * `[u8; 32]` - fixed size array of 32 elements, with each element being 1 byte
86 | * `[[u8; 64]]` - a two-dimensional array containing arrays of 64 1-byte elements
87 | * `b[u8]` - a bit vector containing 1-byte elements
88 | * `u32 | None` - an option type, meaning an element is either `u32` (in this case) or `None`
89 | * `(u32, [u8; 16])` - a tuple that contains two elements - one is a 32-bit integer, the second one is a 16-element array of bytes
90 | * `MyStruct` - a complex type (either defined as a struct or a Rust enum), consisting of
91 | many elements of different basic types
92 |
93 | The **Size** column in the tables below contains the size of data in bytes. The size of dynamic arrays contains an additional _plus_ (`+`) sign, e.g. `32+`, which means the array has at least 32 bytes. Empty dynamic arrays always have 8 bytes which is the size of the array header containing array length.
94 | In case the size of a particular complex data is unknown it is marked with `?`. The limit, however, is always 1232 bytes for the whole data packet (payload within the UDP packet).
95 |
96 | #### Data serialization
97 | In the Rust implementation of the Solana node, the data is serialized into a binary form using the [`bincode` crate][bincode] as follows:
98 | * basic types, e.g. `u8`, `u16`, `u64`, etc. - are serialized as they are present in the memory, e.g. `u8` type is serialized as 1 byte, `u16` as 2 bytes, and so on,
99 | * array elements are serialized as above, e.g. `[u8; 32]` array is serialized as 32 bytes, `[u16; 32]` will be serialized as 32 16-bit elements which are equal to 64 bytes,
100 | * dynamically sized arrays always include an 8-byte header that specifies the array length, followed by the data bytes. Therefore, an empty array occupies 8 bytes,
101 | * bit vectors are serialized similarly to dynamic arrays - their header contains 1 byte that indicates whether there is any data in the vector, followed by an 8-byte array length and the data,
102 | * [enum types](#enum-types) contain a header with a 4-byte discriminant (tells which enum variant is selected) + additional data,
103 | * option types are serialized using a 1-byte discriminant followed by the bytes of data. If a value is `None` discriminant is set to 0 and the data part is empty, otherwise it is set to 1 with data serialized according to its type,
104 | * struct fields are serialized one by one using the rules above,
105 | * tuples are serialized like structs.
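
The rules above can be sketched with Python's `struct` module. This is an illustration under stated assumptions, not a `bincode` replacement: integers are assumed to be little-endian (the `bincode` default), bit vectors are left out, and all helper names are hypothetical.

```python
import struct
from typing import Optional


def ser_u8(value: int) -> bytes:
    return struct.pack("<B", value)


def ser_u16(value: int) -> bytes:
    return struct.pack("<H", value)


def ser_u32(value: int) -> bytes:
    return struct.pack("<I", value)


def ser_u64(value: int) -> bytes:
    return struct.pack("<Q", value)


def ser_vec(items: list[bytes]) -> bytes:
    """Dynamic array: 8-byte length header followed by the elements."""
    return ser_u64(len(items)) + b"".join(items)


def ser_option(item: Optional[bytes]) -> bytes:
    """Option: 1-byte discriminant, then the data if present."""
    if item is None:
        return b"\x00"
    return b"\x01" + item


def ser_enum(discriminant: int, payload: bytes = b"") -> bytes:
    """Enum: 4-byte discriminant followed by the variant's data."""
    return ser_u32(discriminant) + payload


# Structs and tuples are the concatenation of their serialized fields,
# e.g. the tuple type (u32, [u8; 16]):
example_tuple = ser_u32(7) + bytes(16)

print(len(ser_vec([])))                   # 8
print(len(ser_vec([ser_u16(1)] * 32)))    # 72
print(len(example_tuple))                 # 20
```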
106 |
107 | ##### Enum types
108 | Enum types in Rust are more advanced than in other languages. Apart from _classic_ enum types, e.g.:
109 | ```rust
110 | enum CompressionType {
111 | Gzip,
112 | Bzip2
113 | }
114 | ```
115 | it is also possible to create an enum which contains data fields, e.g.:
116 | ```rust
117 | enum SomeEnum {
118 | Variant1(u64),
119 | Variant2(SomeType)
120 | }
121 |
122 | struct SomeType {
123 | x: u32,
124 | y: u16,
125 | }
126 | ```
127 | In the first case, the serialized object of the `CompressionType` enum will only contain a 4-byte header with the discriminant value set to the selected variant (`0 = Gzip`, `1 = Bzip2`). In the latter case, apart from the header, the serialized data will contain additional bytes according to which variant was selected:
128 | * `Variant1`: 8 bytes
129 | * `Variant2`: 6 bytes (the sum of the sizes of the `x` and `y` fields of the `SomeType` struct)
130 |
131 | When deserializing enums, it's important to handle them carefully because the amount of data that follows depends on the specific variant chosen.
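
The variant-dependent length can be handled by returning the next read offset together with the decoded value. Below is a minimal sketch for the `SomeEnum` example, assuming little-endian integers; the function name and the returned tuple shape are hypothetical.

```python
import struct


def deserialize_some_enum(buf: bytes, offset: int = 0):
    """Decode `SomeEnum` at `offset`; return ((variant, value), next_offset)."""
    (discriminant,) = struct.unpack_from("<I", buf, offset)
    offset += 4
    if discriminant == 0:
        # Variant1(u64): 8 bytes of data follow the header.
        (value,) = struct.unpack_from("<Q", buf, offset)
        return ("Variant1", value), offset + 8
    if discriminant == 1:
        # Variant2(SomeType): x (u32) and y (u16), 6 bytes in total.
        x, y = struct.unpack_from("<IH", buf, offset)
        return ("Variant2", (x, y)), offset + 6
    raise ValueError(f"unknown SomeEnum discriminant: {discriminant}")


decoded, next_offset = deserialize_some_enum(struct.pack("<IIH", 1, 7, 9))
print(decoded, next_offset)  # ('Variant2', (7, 9)) 10
```

Returning the offset lets the caller continue parsing whatever field follows the enum without knowing its size in advance.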
132 |
133 | ### Push message
134 | Nodes send push messages to share information with others. They periodically collect data from their `crds` and transmit push messages to their peers.
135 |
136 | A node receiving a set of push messages will:
137 |
138 | * check whether the sending node has replied to a recent [`ping` message](#ping-message)
139 | * check for duplicate `CrdsValue`s and drop them
140 | * insert new `CrdsValue`s into the `crds`
141 | * transmit newly inserted `CrdsValue`s to their peers via push message.
142 |
143 |
144 | | Data | Type | Size | Description |
145 | |------|:----:|:----:|-------------|
146 | | `SenderPubkey` | `[u8; 32]` | 32 | a pubkey belonging to the sender of the push message |
147 | | `CrdsValuesList` | [`[CrdsValue]`](#data-shared-between-nodes) | 8+ | a list of Crds values to share |
148 |
149 |
150 | Solana client Rust implementation
151 |
152 | ```rust
153 | enum Protocol {
154 | //...
155 | PushMessage(Pubkey, Vec<CrdsValue>),
156 | //...
157 | }
158 | ```
159 |
160 |
161 | The node shouldn't process contact infos belonging to an unstaked node that hasn't yet replied to a recent [ping message](#ping-message).
162 |
163 | ### Pull request
164 | A node sends a pull request to ask the cluster for new information. It creates a set of bloom filters populated with the hashes of the `CrdsValue`s in its `crds` table and sends different bloom filters to different peers. The recipients of the pull request use the received bloom filter to identify what information the sender is missing and then construct a [pull response](#pull-response) packed with the missing `CrdsValue` data for the origin of the pull request.
165 |
166 | | Data | Type | Size | Description |
167 | |------|:----:|:----:|-------------|
168 | | `CrdsFilter` | [`CrdsFilter`](#crdsfilter) | 37+ | a bloom filter representing `CrdsValue`s the node already has |
169 | | `ContactInfo` | [`CrdsValue`](#data-shared-between-nodes) | ? | The `crds` value containing contact info of the node that sent the pull request |
170 |
171 | The `CrdsValue` must be a [`ContactInfo`](#contactinfo) or a deprecated [`LegacyContactInfo`](#legacycontactinfo-deprecated) of the node that sent the pull request. The recommended usage of this contact info is as follows:
172 | - Use it to check that the node is not sending a pull request to itself.
173 | - Check whether the sender node has responded to a `Ping` message. If it has not, generate a `Ping` message for the sender node, unless the recipient node is already awaiting a response to an earlier `Ping`.
174 |
175 | The node shouldn't respond with a pull response message to a node that hasn't yet replied to a recent [ping message](#ping-message).
176 |
177 | #### CrdsFilter
178 |
179 | | Data | Type | Size | Description |
180 | |------|:----:|:----:|-------------|
181 | | `filter` | [`Bloom`](#bloom) | 25+ | a bloom filter |
182 | | `mask` | `u64` | 8 | filter mask which defines the data stored in the bloom filter |
183 | | `mask_bits` | `u32` | 4 | number of mask bits, also defines a number of bloom filters as `2^mask_bits` |
184 |
185 | #### Bloom
186 | | Data | Type | Size | Description |
187 | |------|:----:|:----:|-------------|
188 | | `keys` | `[u64]` | 8+ | keys |
189 | | `bits` | `b[u64]` | 9+ | bits |
190 | | `num_bits_set` | `u64` | 8 | number of bits set |
191 |
192 |
193 | Solana client Rust implementation
194 |
195 | ``` rust
196 |
197 | enum Protocol {
198 | PullRequest(CrdsFilter, CrdsValue),
199 | //...
200 | }
201 |
202 | struct CrdsFilter {
203 | filter: Bloom,
204 | mask: u64,
205 | mask_bits: u32,
206 | }
207 |
208 | struct Bloom {
209 | keys: Vec<u64>,
210 | bits: BitVec<u64>,
211 | num_bits_set: u64,
212 | }
213 | ```
214 |
215 |
216 |
217 | ### Pull response
218 | These messages are sent in response to a [pull request](#pull-request). They contain values from the node's `crds` table that the origin of the pull request is missing, as determined by the bloom filters received in the pull request.
219 |
220 | | Data | Type | Size | Description |
221 | |------|:----:|:----:|-------------|
222 | | `SenderPubkey` | `[u8; 32]` | 32 | a pubkey belonging to the sender of the pull response message |
223 | | `CrdsValuesList` | [`[CrdsValue]`](#data-shared-between-nodes) | 8+ | a list of new values |
224 |
225 |
226 | Solana client Rust implementation
227 |
228 | ```rust
229 | enum Protocol {
230 | //...
231 | PullResponse(Pubkey, Vec<CrdsValue>),
232 | //...
233 | }
234 | ```
235 |
236 |
237 |
238 |
239 | ### Prune message
240 | Sent to peers with a list of origin nodes that should be pruned. No more push messages from pruned origin nodes should be sent by the recipient of this prune message to its sender.
241 |
242 | | Data | Type | Size | Description |
243 | |------|:----:|:----:|-------------|
244 | | `SenderPubkey` | `[u8; 32]` | 32 | a pubkey belonging to the sender of the prune message |
245 | | `PruneData` | [`PruneData`](#prunedata) | 144+ | a structure which contains prune details |
246 |
247 |
248 | #### PruneData
249 | | Data | Type | Size | Description |
250 | |------|:----:|:----:|-------------|
251 | | `pubkey` |`[u8; 32]` | 32 | public key of the origin of this message |
252 | | `prunes` | `[[u8; 32]]` | 8+ | public keys of origin nodes that should be pruned |
253 | | `signature` | `[u8; 64]` | 64 | signature of this message |
254 | | `destination` | `[u8; 32]` | 32 | a public key of the destination node for this message |
255 | | `wallclock` | `u64` | 8 | wallclock of the node that generated the message |
256 |
257 |
258 | Solana client Rust implementation
259 |
260 | ```rust
261 | enum Protocol {
262 | //...
263 | PruneMessage(Pubkey, PruneData),
264 | //...
265 | }
266 |
267 | struct PruneData {
268 | pubkey: Pubkey,
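269 | prunes: Vec<Pubkey>,
270 | signature: Signature,
271 | destination: Pubkey,
272 | wallclock: u64,
273 | }
274 | ```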
277 |
278 | **Note**: for signing purposes, before serializing, the `PruneData` struct is prefixed with the byte array `[0xff, 'S', 'O', 'L', 'A', 'N', 'A', '_', 'P', 'R', 'U', 'N', 'E', '_', 'D', 'A', 'T', 'A']`.
279 |
280 |
281 | Solana client Rust implementation
282 |
283 | ```rust
284 | #[derive(Serialize)]
285 | struct SignDataWithPrefix<'a> {
286 | prefix: &'a [u8], // Should be a b"\xffSOLANA_PRUNE_DATA"
287 | pubkey: &'a Pubkey,
288 | prunes: &'a [Pubkey],
289 | destination: &'a Pubkey,
290 | wallclock: u64,
291 | }
292 | ```
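
The bytes that get signed can be assembled by applying the serialization rules to the fields of `SignDataWithPrefix` in order. The following is a minimal sketch, assuming little-endian integers and assuming the `prefix` slice is encoded as a dynamic array with an 8-byte length header, as its `&[u8]` type suggests. The function name is hypothetical, and signing itself is out of scope.

```python
import struct

PRUNE_DATA_PREFIX = b"\xffSOLANA_PRUNE_DATA"


def prune_signable_data(
    pubkey: bytes, prunes: list[bytes], destination: bytes, wallclock: int
) -> bytes:
    """Serialize the prefixed prune data in struct field order."""
    for key in (pubkey, destination, *prunes):
        if len(key) != 32:
            raise ValueError("public keys must be 32 bytes")
    out = bytearray()
    # prefix: dynamic byte array (assumed 8-byte length header + bytes)
    out += struct.pack("<Q", len(PRUNE_DATA_PREFIX)) + PRUNE_DATA_PREFIX
    # pubkey: fixed 32 bytes
    out += pubkey
    # prunes: dynamic array of 32-byte public keys
    out += struct.pack("<Q", len(prunes)) + b"".join(prunes)
    # destination: fixed 32 bytes
    out += destination
    # wallclock: u64
    out += struct.pack("<Q", wallclock)
    return bytes(out)


data = prune_signable_data(bytes(32), [b"\x01" * 32, b"\x02" * 32], bytes(32), 1234)
print(len(data))  # 170
```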
293 |
294 |
295 |
296 |
297 | ### Ping message
298 | Nodes send ping messages frequently to their peers to check whether they are active. The node receiving the ping message should respond with a [pong message](#pong-message).
299 |
300 | | Data | Type | Size | Description |
301 | |------|:----:|:----:|-------------|
302 | | `from` |`[u8; 32]` | 32 | public key of the origin |
303 | | `token` |`[u8; 32]` | 32 | 32 bytes token |
304 | | `signature` |`[u8; 64]` | 64 | signature of the message |
305 |
306 |
307 | Solana client Rust implementation
308 |
309 | ```rust
310 | enum Protocol {
311 | //...
312 | PingMessage(Ping),
313 | //...
314 | }
315 |
316 | struct Ping {
317 | from: Pubkey,
318 | token: [u8; 32],
319 | signature: Signature,
320 | }
321 | ```
322 |
323 |
324 |
325 |
326 | ### Pong message
327 | Sent by node as a response to the [ping message](#ping-message).
328 |
329 | | Data | Type | Size | Description |
330 | |------|:----:|:----:|-------------|
331 | | `from` |`[u8; 32]` | 32 | public key of the origin |
332 | | `hash` |`[u8; 32]` | 32 | hash of the received ping token prefixed by the "SOLANA_PING_PONG" string |
333 | | `signature` |`[u8; 64]` | 64 | signature of the message |
334 |
335 |
336 | Solana client Rust implementation
337 |
338 | ```rust
339 | enum Protocol {
340 | //...
341 | PongMessage(Pong)
342 | }
343 |
344 | struct Pong {
345 | from: Pubkey,
346 | hash: Hash,
347 | signature: Signature,
348 | }
349 | ```
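
The `hash` field of a pong can be derived from the ping token as sketched below. The hash function is an assumption: SHA-256 is used here because it is the hash used elsewhere in these specs, but this section does not name it. The function name is hypothetical.

```python
import hashlib

PING_PONG_PREFIX = b"SOLANA_PING_PONG"


def pong_hash(ping_token: bytes) -> bytes:
    """Hash the received ping token prefixed by "SOLANA_PING_PONG"."""
    if len(ping_token) != 32:
        raise ValueError("ping token must be 32 bytes")
    # Assumption: the hash function is SHA-256.
    return hashlib.sha256(PING_PONG_PREFIX + ping_token).digest()


print(pong_hash(bytes(32)).hex())
```

A node that sent a ping can recompute this value from its own token and compare it with the `hash` field of the pong it receives.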
350 |
351 |
352 |
353 | ## Data shared between nodes
354 |
355 | The `CrdsValue` values that are sent in push messages, pull requests, and pull responses contain the shared data and the signature of the data.
356 |
357 | | Data | Type | Size | Description |
358 | |------|:----:|:----:|-------------|
359 | | `signature` | `[u8; 64]` | 64 | signature of the origin node that created the `CrdsValue` |
360 | | `data` | [`CrdsData`](#crdsdata) | ? | data |
361 |
362 |
363 | Solana client Rust implementation
364 |
365 | ```rust
366 | struct CrdsValue {
367 | signature: Signature,
368 | data: CrdsData,
369 | }
370 | ```
371 |
372 |
373 | ### CrdsData
374 | The `CrdsData` is an enum and can be one of:
375 | | Enum ID | Type |
376 | |:-------:|------|
377 | | 0 | [LegacyContactInfo](#legacycontactinfo-deprecated) (_deprecated_) |
378 | | 1 | [Vote](#vote) |
379 | | 2 | [LowestSlot](#lowestslot) |
380 | | 3 | [LegacySnapshotHashes](#legacysnapshothashes-accountshashes-deprecated) (_deprecated_) |
381 | | 4 | [AccountsHashes](#legacysnapshothashes-accountshashes-deprecated) (_deprecated_) |
382 | | 5 | [EpochSlots](#epochslots) |
383 | | 6 | [LegacyVersion](#legacyversion-deprecated) (_deprecated_) |
384 | | 7 | [Version](#version-deprecated) (_deprecated_) |
385 | | 8 | [NodeInstance](#nodeinstance) (_almost deprecated_) |
386 | | 9 | [DuplicateShred](#duplicateshred) |
387 | | 10 | [SnapshotHashes](#snapshothashes) |
388 | | 11 | [ContactInfo](#contactinfo) |
389 | | 12 | [RestartLastVotedForkSlots](#restartlastvotedforkslots) |
390 | | 13 | [RestartHeaviestFork](#restartheaviestfork) |
391 |
392 |
393 | Solana client Rust implementation
394 |
395 | ```rust
396 | enum CrdsData {
397 | LegacyContactInfo(LegacyContactInfo),
398 | Vote(VoteIndex, Vote),
399 | LowestSlot(LowestSlotIndex, LowestSlot),
400 | LegacySnapshotHashes(LegacySnapshotHashes),
401 | AccountsHashes(AccountsHashes),
402 | EpochSlots(EpochSlotsIndex, EpochSlots),
403 | LegacyVersion(LegacyVersion),
404 | Version(Version),
405 | NodeInstance(NodeInstance),
406 | DuplicateShred(DuplicateShredIndex, DuplicateShred),
407 | SnapshotHashes(SnapshotHashes),
408 | ContactInfo(ContactInfo),
409 | RestartLastVotedForkSlots(RestartLastVotedForkSlots),
410 | RestartHeaviestFork(RestartHeaviestFork),
411 | }
412 | ```
413 |
414 |
415 | #### LegacyContactInfo (Deprecated)
416 |
417 | Basic info about the node. Nodes send this message to introduce themselves to the cluster and provide all addresses and ports that their peers can use to communicate with them.
418 |
419 | | Data | Type | Size | Description |
420 | |------|:----:|:----:|-------------|
421 | | `id` | `[u8; 32]` | 32 | public key of the origin |
422 | | `gossip` | [`SocketAddr`](#socketaddr) | 10 or 22 | gossip protocol address |
423 | | `tvu` | [`SocketAddr`](#socketaddr) | 10 or 22 | address to connect to for replication |
424 | | `tvu_quic` | [`SocketAddr`](#socketaddr) | 10 or 22 | TVU over QUIC protocol |
425 | | `serve_repair_quic` | [`SocketAddr`](#socketaddr) | 10 or 22 | repair service for QUIC protocol |
426 | | `tpu` | [`SocketAddr`](#socketaddr) | 10 or 22 | transactions address |
427 | | `tpu_forwards` | [`SocketAddr`](#socketaddr) | 10 or 22 | address to forward unprocessed transactions |
428 | | `tpu_vote` | [`SocketAddr`](#socketaddr) | 10 or 22 | address for sending votes |
429 | | `rpc` | [`SocketAddr`](#socketaddr) | 10 or 22 | address for JSON-RPC requests |
430 | | `rpc_pubsub` | [`SocketAddr`](#socketaddr) | 10 or 22 | websocket for JSON-RPC push notifications |
431 | | `serve_repair` | [`SocketAddr`](#socketaddr) | 10 or 22 | address for sending repair requests |
432 | | `wallclock` | `u64` | 8 | wallclock of the node that generated the message |
433 | | `shred_version` | `u16` | 2 | the shred version node has been configured to use |
434 |
435 | ##### SocketAddr
436 | An enum, can be either a V4 or V6 socket address.
437 | | Enum ID | Data | Type | Size | Description |
438 | |:-------:|------|:----:|:----:|-------------|
439 | | 0 | `V4` | [`SocketAddrV4`](#socketaddrv4) | 10 | V4 socket address |
440 | | 1 | `V6` | [`SocketAddrV6`](#socketaddrv6) | 22 | V6 socket address |
441 |
442 | ##### SocketAddrV4
443 | | Data | Type | Size | Description |
444 | |------|:----:|:----:|-------------|
445 | | `ip` | `[u8; 4]` | 4 | ip address |
446 | | `port` | `u16` | 2 | port |
447 |
448 | ##### SocketAddrV6
449 | | Data | Type | Size | Description |
450 | |------|:----:|:----:|-------------|
451 | | `ip` | `[u8; 16]` | 16 | ip address |
452 | | `port` | `u16` | 2 | port |
453 |
454 |
455 | Solana client Rust implementation
456 |
457 | ```rust
458 | struct LegacyContactInfo {
459 | id: Pubkey,
460 | gossip: SocketAddr,
461 | tvu: SocketAddr,
462 | tvu_quic: SocketAddr,
463 | serve_repair_quic: SocketAddr,
464 | tpu: SocketAddr,
465 | tpu_forwards: SocketAddr,
466 | tpu_vote: SocketAddr,
467 | rpc: SocketAddr,
468 | rpc_pubsub: SocketAddr,
469 | serve_repair: SocketAddr,
470 | wallclock: u64,
471 | shred_version: u16,
472 | }
473 |
474 | enum SocketAddr {
475 | V4(SocketAddrV4),
476 | V6(SocketAddrV6)
477 | }
478 |
479 | struct SocketAddrV4 {
480 | ip: Ipv4Addr,
481 | port: u16,
482 | }
483 |
484 | struct SocketAddrV6 {
485 | ip: Ipv6Addr,
486 | port: u16
487 | }
488 |
489 | struct Ipv4Addr {
490 | octets: [u8; 4]
491 | }
492 |
493 | struct Ipv6Addr {
494 | octets: [u8; 16]
495 | }
496 | ```
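
Because a `SocketAddr` takes 10 or 22 bytes depending on the variant, a parser has to read the discriminant before it knows how far to advance. The following is a minimal sketch, assuming little-endian integers and IP octets in network order; `parse_socket_addr` is a hypothetical name.

```python
import ipaddress
import struct


def parse_socket_addr(buf: bytes, offset: int = 0):
    """Decode a `SocketAddr` at `offset`; return ((ip, port), next_offset)."""
    (discriminant,) = struct.unpack_from("<I", buf, offset)
    offset += 4
    if discriminant == 0:
        ip_len, ip_type = 4, ipaddress.IPv4Address
    elif discriminant == 1:
        ip_len, ip_type = 16, ipaddress.IPv6Address
    else:
        raise ValueError(f"unknown SocketAddr variant: {discriminant}")
    ip_bytes = buf[offset : offset + ip_len]
    if len(ip_bytes) != ip_len:
        raise ValueError("buffer too short for IP address")
    ip = ip_type(ip_bytes)
    offset += ip_len
    (port,) = struct.unpack_from("<H", buf, offset)
    return (ip, port), offset + 2


v4 = struct.pack("<I", 0) + bytes([127, 0, 0, 1]) + struct.pack("<H", 8001)
(ip, port), consumed = parse_socket_addr(v4)
print(ip, port, consumed)  # 127.0.0.1 8001 10
```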
497 |
498 |
499 |
500 | #### Vote
501 | A validator's vote on a fork. Contains a one-byte index from the vote tower (range 0 to 31) and the vote transaction to be executed by the leader.
502 |
503 | | Data | Type | Size | Description |
504 | |------|:----:|:----:|-------------|
505 | | `index` | `u8` | 1 | vote tower index |
506 | | `from` | `[u8; 32]` | 32 | public key of the origin |
507 | | `transaction` | [`Transaction`](#transaction) | 59+ | a vote transaction, an atomically-committed sequence of instructions |
508 | | `wallclock` | `u64` | 8 | wallclock of the node that generated the message |
509 | | `slot` | `u64` | 8 | slot in which the vote was created |
510 |
511 |
512 | ##### Transaction
513 | Contains a signature and a message with a sequence of instructions.
514 |
515 | | Data | Type | Size | Description |
516 | |------|:----:|:----:|-------------|
517 | | `signature` | `[[u8; 64]]` | 8+ | list of signatures equal to `num_required_signatures` for the message |
518 | | `message` | [`Message`](#message) | 51+ | transaction message containing instructions to invoke |
519 |
520 |
521 |
522 | ##### Message
523 |
524 | | Data | Type | Size | Description |
525 | |------|:----:|:----:|-------------|
526 | | `header` | [`MessageHeader`](#message-header) | 3 | message header |
527 | | `account_keys` | `[[u8; 32]]` | 8+ | all account keys used by this transaction |
528 | | `recent_blockhash` | `[u8; 32]` | 32 | hash of a recent ledger entry |
529 | | `instructions` | [`[CompiledInstruction]`](#compiled-instruction) | 8+ | list of compiled instructions to execute |
530 |
531 | ##### Message header
532 |
533 | | Data | Type | Size | Description |
534 | |------|:----:|:----:|-------------|
535 | | `num_required_signatures` | `u8` | 1 | number of signatures required for this message to be considered valid |
536 | | `num_readonly_signed_accounts` | `u8` | 1 | last `num_readonly_signed_accounts` of the signed keys are read-only accounts |
537 | | `num_readonly_unsigned_accounts` | `u8` | 1 | last `num_readonly_unsigned_accounts` of the unsigned keys are read-only accounts |
538 |
539 | ##### Compiled instruction
540 |
541 | | Data | Type | Size | Description |
542 | |------|:----:|:----:|-------------|
543 | | `program_id_index` | `u8` | 1 | index of the transaction keys array indicating the program account ID that executes the program |
544 | | `accounts` |`[u8]` | 8+ | indices of the transaction keys array indicating the accounts that are passed to a program |
545 | | `data` | `[u8]` | 8+ | program input data |
546 |
547 |
548 | Solana client Rust implementation
549 |
550 | ```rust
551 |
552 | enum CrdsData {
553 |     //...
554 |     Vote(VoteIndex, Vote),
555 |     //...
556 | }
557 |
558 | type VoteIndex = u8;
559 |
560 | struct Vote {
561 |     from: Pubkey,
562 |     transaction: Transaction,
563 |     wallclock: u64,
564 |     slot: Option<Slot>,
565 | }
566 |
567 | type Slot = u64;
568 |
569 | struct Transaction {
570 |     signature: Vec<Signature>,
571 |     message: Message
572 | }
573 |
574 | struct Message {
575 |     header: MessageHeader,
576 |     account_keys: Vec<Pubkey>,
577 |     recent_blockhash: Hash,
578 |     instructions: Vec<CompiledInstruction>,
579 | }
580 |
581 | struct MessageHeader {
582 |     num_required_signatures: u8,
583 |     num_readonly_signed_accounts: u8,
584 |     num_readonly_unsigned_accounts: u8,
585 | }
586 |
587 | struct CompiledInstruction {
588 |     program_id_index: u8,
589 |     accounts: Vec<u8>,
590 |     data: Vec<u8>,
591 | }
592 | ```
593 |
594 |
595 | #### LowestSlot
596 | The first available slot in the Solana [blockstore][blockstore] that contains any data. Contains a one-byte index (deprecated) and the lowest slot number.
597 |
598 | | Data | Type | Size | Description |
599 | |------|:----:|:----:|-------------|
600 | | `index` | `u8` | 1 | _**The only valid value is `0u8`** since this is now a deprecated field_ |
601 | | `from` | `[u8; 32]`| 32 | public key of the origin |
602 | | `root` | `u64` | 8 | _deprecated_ |
603 | | `lowest` | `u64` | 8 | the lowest slot |
604 | | `slots` | `[u64]` | 8+ | _deprecated_ |
605 | | `stash` | [`[EpochIncompleteSlots]`](#epochincompleteslots) | 8+ | _deprecated_ |
606 | | `wallclock` | `u64` | 8 | wallclock of the node that generated the message |
607 |
608 | ##### EpochIncompleteSlots
609 |
610 | | Data | Type | Size | Description |
611 | |------|:----:|:----:|-------------|
612 | | `first` | `u64` | 8 | first slot number |
613 | | `compression` | [`CompressionType`](#compressiontype) | 4 | compression type |
614 | | `compressed_list` | `[u8]` | 8+ | compressed slots list |
615 |
616 | ##### CompressionType
617 | Compression type enum.
618 |
619 | | Enum ID | Data | Description |
620 | |:-------:|------|-------------|
621 | | 0 | `Uncompressed` | uncompressed |
622 | | 1 | `GZip` | gzip |
623 | | 2 | `BZip2`| bzip2 |
624 |
625 |
626 | Solana client Rust implementation
627 |
628 | ```rust
629 |
630 | enum CrdsData {
631 |     //...
632 |     LowestSlot(LowestSlotIndex, LowestSlot),
633 |     //...
634 | }
635 |
636 | type LowestSlotIndex = u8;
637 |
638 | struct LowestSlot {
639 |     from: Pubkey,
640 |     root: Slot,
641 |     lowest: Slot,
642 |     slots: BTreeSet<Slot>,
643 |     stash: Vec<EpochIncompleteSlots>,
644 |     wallclock: u64,
645 | }
646 |
647 | struct EpochIncompleteSlots {
648 |     first: Slot,
649 |     compression: CompressionType,
650 |     compressed_list: Vec<u8>,
651 | }
652 |
653 | enum CompressionType {
654 |     Uncompressed,
655 |     GZip,
656 |     BZip2,
657 | }
658 | ```
659 |
660 |
661 | #### LegacySnapshotHashes, AccountsHashes (Deprecated)
662 |
663 | These two messages share the same message structure.
664 |
665 | | Data | Type | Size | Description |
666 | |------|:----:|:----:|-------------|
667 | | `from` | `[u8; 32]`| 32 | public key of the origin |
668 | | `hashes` | `[(u64, [u8; 32])]`| 8+ | a list of hashes grouped by slots |
669 | | `wallclock` | `u64`| 8 | wallclock of the node that generated the message |
670 |
671 |
672 |
673 | Solana client Rust implementation
674 |
675 | ```rust
676 | struct AccountsHashes {
677 |     from: Pubkey,
678 |     hashes: Vec<(Slot, Hash)>,
679 |     wallclock: u64,
680 | }
681 |
682 | type LegacySnapshotHashes = AccountsHashes;
683 | ```
684 |
685 |
686 | #### EpochSlots
687 | Contains a one-byte index and list of all slots from an epoch (epoch consists of around 432000 slots). There can be 256 epoch slots in total.
688 |
689 | | Data | Type | Size | Description |
690 | |------|:----:|:----:|-------------|
691 | | `index` | `u8` | 1 | index |
692 | | `from` | `[u8; 32]` | 32 | public key of the origin |
693 | | `slots` | [`[CompressedSlots]`](#compressedslots) | 8+ | list of slots |
694 | | `wallclock` | `u64` | 8 | wallclock of the node that generated the message |
695 |
696 | ##### CompressedSlots
697 | | EnumID | Data | Type | Size | Description |
698 | |:------:|------|:----:|:----:|-------------|
699 | | 0 | `Flate2` | [`Flate2`](#flate2) | 24+ | Flate 2 compression |
700 | | 1 | `Uncompressed` | [`Uncompressed`](#uncompressed) | 25+ | no compression |
701 |
702 | ##### Flate2
703 | | Data | Type | Size | Description |
704 | |------|:----:|:----:|-------------|
705 | | `first_slot` | `u64` | 8 | first slot number |
706 | | `num` | `u64` | 8 | number of slots |
707 | | `compressed` | `[u8]` | 8+ | bytes array of compressed slots |
708 |
709 | ##### Uncompressed
710 | | Data | Type | Size | Description |
711 | |------|:----:|:----:|-------------|
712 | | `first_slot` | `u64` | 8 | first slot number |
713 | | `num` | `u64` | 8 | number of slots |
714 | | `slots` | `b[u8]` | 9+ | bits array of slots |
715 |
716 |
717 |
718 | Solana client Rust implementation
719 |
720 | ```rust
721 | enum CrdsData {
722 |     //...
723 |     EpochSlots(EpochSlotsIndex, EpochSlots),
724 |     //...
725 | }
726 |
727 | type EpochSlotsIndex = u8;
728 |
729 | struct EpochSlots {
730 |     from: Pubkey,
731 |     slots: Vec<CompressedSlots>,
732 |     wallclock: u64,
733 | }
734 |
735 | enum CompressedSlots {
736 |     Flate2(Flate2),
737 |     Uncompressed(Uncompressed),
738 | }
739 |
740 | struct Flate2 {
741 |     first_slot: Slot,
742 |     num: usize,
743 |     compressed: Vec<u8>
744 | }
745 |
746 | struct Uncompressed {
747 |     first_slot: Slot,
748 |     num: usize,
749 |     slots: BitVec<u8>,
750 | }
751 | ```
752 |
753 |
754 |
755 | #### LegacyVersion (Deprecated)
756 | The older version of the Solana client the node is using.
757 |
758 | | Data | Type | Size | Description |
759 | |------|:----:|:----:|-------------|
760 | | `from` | `[u8; 32]`| 32 | public key of origin |
761 | | `wallclock` | `u64`| 8 | wallclock of the node that generated the message |
762 | | `version` | [`LegacyVersion1`](#legacyversion1) | 7 or 11 | older version used in 1.3.x and earlier releases |
763 |
764 |
765 | ##### LegacyVersion1
766 | | Data | Type | Size | Description |
767 | |------|:----:|:----:|-------------|
768 | | `major` | `u16`| 2 | major part of version |
769 | | `minor` | `u16`| 2 | minor part of version |
770 | | `patch` | `u16`| 2 | patch |
771 | | `commit` | `u32 \| None`| 5 or 1 | commit |
772 |
773 |
774 | Solana client Rust implementation
775 |
776 | ```rust
777 | struct LegacyVersion {
778 |     from: Pubkey,
779 |     wallclock: u64,
780 |     version: LegacyVersion1,
781 | }
782 |
783 | struct LegacyVersion1 {
784 |     major: u16,
785 |     minor: u16,
786 |     patch: u16,
787 |     commit: Option<u32>
788 | }
789 | ```
790 |
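The two possible sizes of `LegacyVersion1` (7 or 11 bytes) follow from how the optional `commit` field is laid out. The sketch below illustrates this in Python, assuming the fixed-width little-endian [bincode] layout, in which an `Option` is a one-byte tag followed by the value when present. The helper names `encode_option_u32` and `encode_legacy_version1` are hypothetical and exist only for this illustration.

```python
import struct


def encode_option_u32(value):
    """Encode an optional u32: tag byte 0 for None, tag byte 1 plus the value otherwise."""
    if value is None:
        return b"\x00"
    return b"\x01" + struct.pack("<I", value)


def encode_legacy_version1(major, minor, patch, commit=None):
    """Serialize the LegacyVersion1 fields in table order (little-endian integers)."""
    return struct.pack("<HHH", major, minor, patch) + encode_option_u32(commit)


without_commit = encode_legacy_version1(1, 3, 9)
with_commit = encode_legacy_version1(1, 3, 9, commit=0xDEADBEEF)
print(len(without_commit), len(with_commit))  # 7 11
```

The same one-byte tag accounts for the "5 or 1" size of every `u32 \| None` field in the tables of this document.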
791 |
792 | #### Version (Deprecated)
793 | The version of the Solana client the node is using.
794 |
795 | | Data | Type | Size | Description |
796 | |------|:----:|:-----:|-------------|
797 | | `from` | `[u8; 32]` | 32 | public key of origin |
798 | | `wallclock` | `u64` | 8 | wallclock of the node that generated the message |
799 | | `version` | [`LegacyVersion2`](#legacyversion2) | 11 or 15 | version of the Solana client |
800 |
801 |
802 | ##### LegacyVersion2
803 | | Data | Type | Size | Description |
804 | |------|:----:|:----:|-------------|
805 | | `major` | `u16`| 2 | major part of version |
806 | | `minor` | `u16`| 2 | minor part of version |
807 | | `patch` | `u16`| 2 | patch |
808 | | `commit` | `u32 \| None`| 5 or 1 | the first four bytes of the sha1 commit hash |
809 | | `feature_set` | `u32`| 4 | feature set |
810 |
811 |
812 | Solana client Rust implementation
813 |
814 | ```rust
815 | struct Version {
816 |     from: Pubkey,
817 |     wallclock: u64,
818 |     version: LegacyVersion2,
819 | }
820 |
821 | struct LegacyVersion2 {
822 |     major: u16,
823 |     minor: u16,
824 |     patch: u16,
825 |     commit: Option<u32>,
826 |     feature_set: u32
827 | }
828 | ```
829 |
830 |
831 | #### NodeInstance
832 | Contains the node creation timestamp and a randomly generated token.
833 |
834 | | Data | Type | Size | Description |
835 | |------|:----:|:----:|-------------|
836 | | `from` | `[u8; 32]`| 32 | public key of origin |
837 | | `wallclock` | `u64`| 8 | wallclock of the node that generated the message |
838 | | `timestamp` | `u64`| 8 | timestamp when the instance was created |
839 | | `token` | `u64`| 8 | randomly generated value at node instantiation |
840 |
841 |
842 |
843 | Solana client Rust implementation
844 |
845 | ```rust
846 | struct NodeInstance {
847 |     from: Pubkey,
848 |     wallclock: u64,
849 |     timestamp: u64,
850 |     token: u64,
851 | }
852 | ```
853 |
854 |
855 | #### DuplicateShred
856 | A duplicated shred proof. Contains a 2-byte index followed by other data:
857 |
858 | | Data | Type | Size | Description |
859 | |------|:----:|:----:|-------------|
860 | | `index` | `u16` | 2 | index |
861 | | `from` | `[u8; 32]`| 32 | public key of origin |
862 | | `wallclock` | `u64`| 8 | wallclock of the node that generated the message |
863 | | `slot` | `u64`| 8 | slot in which the shreds were created |
864 | | `_unused` | `u32`| 4 | _unused_ |
865 | | `_unused_shred_type` | [`ShredType`](#shredtype) | 1 | _unused_ |
866 | | `num_chunks` | `u8`| 1 | number of chunks available |
867 | | `chunk_index` | `u8`| 1 | index of the chunk |
868 | | `chunk` | `[u8]`| 8+ | shred data |
869 |
870 | ##### ShredType
871 | This enum is serialized as 1-byte data.
872 |
873 | | Enum ID | Data | Description |
874 | |:-------:|------|-------------|
875 | | `0b10100101` | `Data` | data shred |
876 | | `0b01011010` | `Code` | coding shred |
877 |
878 |
879 |
880 | Solana client Rust implementation
881 |
882 | ```rust
883 | enum CrdsData {
884 |     //...
885 |     DuplicateShred(DuplicateShredIndex, DuplicateShred),
886 |     //...
887 | }
888 |
889 | type DuplicateShredIndex = u16;
890 |
891 | struct DuplicateShred {
892 |     from: Pubkey,
893 |     wallclock: u64,
894 |     slot: Slot,
895 |     _unused: u32,
896 |     _unused_shred_type: ShredType,
897 |     num_chunks: u8,
898 |     chunk_index: u8,
899 |     chunk: Vec<u8>,
900 | }
901 |
902 | #[serde(into = "u8", try_from = "u8")]
903 | enum ShredType {
904 |     Data = 0b1010_0101,
905 |     Code = 0b0101_1010,
906 | }
907 | ```
908 |
909 |
910 | #### SnapshotHashes
911 | Contains the hashes of the full and incremental snapshots that the node holds and is ready to share with other nodes via the RPC interface. Snapshots are downloaded by validators that are starting for the first time, or that have fallen too far behind after a restart. To learn more, see the [snapshots] page.
912 |
913 | | Data | Type | Size | Description |
914 | |------|:----:|:----:|-------------|
915 | | `from` | `[u8; 32]`| 32 | public key of origin |
916 | | `full` | `(u64, [u8; 32])`| 40 | slot number and hash of the full snapshot |
917 | | `incremental` | `[(u64, [u8; 32])]`| 8+ | list of slot numbers and hashes of incremental snapshots |
918 | | `wallclock` | `u64`| 8 | wallclock of the node that generated the message |
919 |
920 |
921 |
922 | Solana client Rust implementation
923 |
924 | ```rust
925 | struct SnapshotHashes {
926 |     from: Pubkey,
927 |     full: (Slot, Hash),
928 |     incremental: Vec<(Slot, Hash)>,
929 |     wallclock: u64,
930 | }
931 | ```
932 |
933 |
934 | #### ContactInfo
935 |
936 | Basic info about the node. Nodes send this message to introduce themselves to the cluster and provide all addresses and ports that their peers can use to communicate with them.
937 |
938 | | Data | Type | Size | Description |
939 | |------|:----:|:----:|-------------|
940 | | `pubkey` | `[u8; 32]`| 32 | public key of origin |
941 | | `wallclock` | `u64`| 8 | wallclock of the node that generated the message |
942 | | `outset` | `u64`| 8 | timestamp when node instance was first created, used to identify duplicate running instances |
943 | | `shred_version` | `u16`| 2 | the shred version the node has been configured to use |
944 | | `version` | [`Version`](#version-1) | 13+ | Solana client version |
945 | | `addrs` | [`[IpAddr]`](#ipaddr) | 8+ | list of unique IP addresses |
946 | | `sockets` | [`[SocketEntry]`](#socketentry) | 8+ | list of unique sockets |
947 | | `extensions` | [`[Extension]`](#extension) | 8+ | future additions to `ContactInfo` will be added to `Extensions` instead of modifying `ContactInfo`, currently unused |
948 |
949 | ##### Version
950 | | Data | Type | Size | Description |
951 | |------|:----:|:----:|-------------|
952 | | `major` | `u16`| 2 | major part of version |
953 | | `minor` | `u16`| 2 | minor part of version |
954 | | `patch` | `u16`| 2 | patch |
955 | | `commit` | `u32 \| None`| 5 or 1 | the first four bytes of the sha1 commit hash |
956 | | `feature_set` | `u32`| 4 | the first four bytes of the FeatureSet identifier |
957 | | `client` | `u16`| 2 | client type ID |
958 |
959 | Possible `client` type ID values are:
960 |
961 | | ID | Client |
962 | |:--:|--------|
963 | | `0u16` | `SolanaLabs` |
964 | | `1u16` | `JitoLabs` |
965 | | `2u16` | `Firedancer` |
966 | | `3u16` | `Agave` |
967 |
968 | ##### IpAddr
969 | | Enum ID | Data | Type | Size | Description |
970 | |:-------:|------|:----:|:----:|-------------|
971 | | 0 | `V4` | `[u8; 4]` | 4 | IP v4 addr |
972 | | 1 | `V6` | `[u8; 16]` | 16 | IP v6 addr |
973 |
974 | ##### SocketEntry
975 | | Data | Type | Size | Description |
976 | |------|:----:|:----:|-------------|
977 | | `key` | `u8`| 1 | protocol identifier |
978 | | `index` | `u8`| 1 | [`[IpAddr]`](#ipaddr) index in the addrs list |
979 | | `offset` | `u16`| 2 | port offset with respect to the previous entry |
980 |
981 | The list of `key` identifiers is shown in the table below:
982 | | Interface | Key | Description |
983 | |-----------|:---:|-------------|
984 | | `gossip` | 0 | gossip protocol address |
985 | | `serve_repair_quic` | 1 | `serve_repair` over QUIC |
986 | | `rpc` | 2 | address for JSON-RPC requests |
987 | | `rpc_pubsub` | 3 | websocket for JSON-RPC push notifications |
988 | | `serve_repair` | 4 | address for sending repair requests |
989 | | `tpu` | 5 | transactions address |
990 | | `tpu_forwards` | 6 | address to forward unprocessed transactions |
991 | | `tpu_forwards_quic` | 7 | `tpu_forwards` over QUIC |
992 | | `tpu_quic` | 8 | `tpu` over QUIC |
993 | | `tpu_vote` | 9 | address for sending votes |
994 | | `tvu` | 10 | address to connect to for replication |
995 | | `tvu_quic` | 11 | `tvu` over QUIC |
996 | | `tpu_vote_quic` | 12 | `tpu_vote` over QUIC |
997 |
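Because each `SocketEntry` stores only a port offset relative to the previous entry, a reader has to accumulate the offsets to recover absolute ports. The following Python sketch shows one way to do that. It assumes the first entry's offset is relative to port 0 and that entries are processed in wire order. The function name `resolve_sockets` and the tuple layout are hypothetical.

```python
def resolve_sockets(addrs, sockets):
    """Map each socket key to an (ip, port) pair.

    addrs   -- list of IP addresses (the `addrs` field)
    sockets -- list of (key, index, offset) tuples in wire order
    """
    resolved = {}
    port = 0  # assumption: the first offset is relative to port 0
    for key, index, offset in sockets:
        port += offset
        if port > 0xFFFF:
            raise ValueError("accumulated port does not fit in a u16")
        resolved[key] = (addrs[index], port)
    return resolved


addrs = ["192.0.0.1"]
# keys from the table above: 0 = gossip, 5 = tpu, 10 = tvu
sockets = [(0, 0, 8000), (5, 0, 3), (10, 0, 1)]
print(resolve_sockets(addrs, sockets))
```

In this example the gossip socket resolves to port 8000, `tpu` to 8003, and `tvu` to 8004, all on the single address in `addrs`.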
998 | ##### Extension
999 | _Currently empty (unused)_
1000 |
1001 |
1002 | Solana client Rust implementation
1003 |
1004 | ```rust
1005 | struct ContactInfo {
1006 |     pubkey: Pubkey,
1007 |     wallclock: u64,
1008 |     outset: u64,
1009 |     shred_version: u16,
1010 |     version: Version,
1011 |     addrs: Vec<IpAddr>,
1012 |     sockets: Vec<SocketEntry>,
1013 |     extensions: Vec<Extension>,
1014 | }
1015 |
1016 | enum Extension {}
1017 |
1018 | enum IpAddr {
1019 |     V4(Ipv4Addr),
1020 |     V6(Ipv6Addr)
1021 | }
1022 |
1023 | struct Ipv4Addr {
1024 |     octets: [u8; 4]
1025 | }
1026 |
1027 | struct Ipv6Addr {
1028 |     octets: [u8; 16]
1029 | }
1030 |
1031 | struct SocketEntry {
1032 |     key: u8,
1033 |     index: u8,
1034 |     offset: u16
1035 | }
1036 |
1037 | struct Version {
1038 |     major: u16,
1039 |     minor: u16,
1040 |     patch: u16,
1041 |     commit: Option<u32>,
1042 |     feature_set: u32,
1043 |     client: u16
1044 | }
1045 | ```
1046 |
1047 |
1048 | #### RestartLastVotedForkSlots
1049 | Contains a list of last-voted fork slots. This message is not a common gossip message and should be used only during the [cluster-restart] operation.
1050 |
1051 | | Data | Type | Size | Description |
1052 | |------|:----:|:----:|-------------|
1053 | | `from` | `[u8; 32]`| 32 | public key of origin |
1054 | | `wallclock` | `u64`| 8 | wallclock of the node that generated the message |
1055 | | `offsets` | [`SlotsOffsets`](#slotsoffsets) | 12+ | list of slot offsets |
1056 | | `last_voted_slot` | `u64`| 8 | the last voted slot |
1057 | | `last_voted_hash` | `[u8; 32]`| 32 | the bank hash of the last voted slot |
1058 | | `shred_version` | `u16`| 2 | the shred version the node has been configured to use |
1059 |
1060 | ##### SlotsOffsets
1061 | Offsets are stored either as a raw bit vector (`RawOffsets`) or run-length encoded (`RunLengthEncoding`) as the lengths of consecutive runs of 1's and 0's, e.g. 110001110 is encoded as [2, 3, 3, 1].
1062 | | Enum ID | Data | Type | Size | Description |
1063 | |:-------:|------|:----:|:----:|-------------|
1064 | | 0 | `RunLengthEncoding` | `[u16]` | 8+ | encoded offsets |
1065 | | 1 | `RawOffsets` | `b[u8]` | 9+ | raw offsets |
1066 |
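The run-length form can be illustrated with a short Python sketch. It treats the offsets as a string of `0`/`1` characters and assumes, as in the example above, that the sequence starts with a run of 1's, so that run lengths alternate between 1's and 0's. The function names are hypothetical and not taken from any client.

```python
from itertools import groupby


def run_length_encode(bits):
    """Return the lengths of consecutive runs in a string of '0'/'1' characters."""
    if bits and bits[0] != "1":
        raise ValueError("this sketch assumes the sequence starts with a run of 1's")
    return [len(list(run)) for _, run in groupby(bits)]


def run_length_decode(runs):
    """Rebuild the bit string; runs alternate between 1's and 0's, starting with 1's."""
    return "".join(("1" if i % 2 == 0 else "0") * n for i, n in enumerate(runs))


print(run_length_encode("110001110"))  # [2, 3, 3, 1]
```

Decoding `[2, 3, 3, 1]` with `run_length_decode` yields the original `110001110`, so the two helpers are inverses under the stated assumption.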
1067 |
1068 | Solana client Rust implementation
1069 |
1070 | ```rust
1071 | struct RestartLastVotedForkSlots {
1072 |     from: Pubkey,
1073 |     wallclock: u64,
1074 |     offsets: SlotsOffsets,
1075 |     last_voted_slot: Slot,
1076 |     last_voted_hash: Hash,
1077 |     shred_version: u16,
1078 | }
1079 |
1080 | enum SlotsOffsets {
1081 |     RunLengthEncoding(RunLengthEncoding),
1082 |     RawOffsets(RawOffsets),
1083 | }
1084 |
1085 | struct RunLengthEncoding(Vec<u16>);
1086 | struct RawOffsets(BitVec<u8>);
1087 | ```
1088 |
1089 |
1090 |
1091 | #### RestartHeaviestFork
1092 | Contains the heaviest fork. This message is not a common gossip message and should be used only during the [cluster-restart] operation.
1093 |
1094 | | Data | Type | Size | Description |
1095 | |------|:----:|:----:|-------------|
1096 | | `from` | `[u8; 32]`| 32 | public key of origin |
1097 | | `wallclock` | `u64`| 8 | wallclock of the node that generated the message |
1098 | | `last_slot` | `u64`| 8 | slot of the picked block |
1099 | | `last_slot_hash` | `[u8; 32]`| 32 | bank hash of the picked block |
1100 | | `observed_stake` | `u64`| 8 | |
1101 | | `shred_version` | `u16`| 2 | the shred version the node has been configured to use |
1102 |
1103 |
1104 | Solana client Rust implementation
1105 |
1106 | ```rust
1107 | struct RestartHeaviestFork {
1108 |     from: Pubkey,
1109 |     wallclock: u64,
1110 |     last_slot: Slot,
1111 |     last_slot_hash: Hash,
1112 |     observed_stake: u64,
1113 |     shred_version: u16,
1114 | }
1115 | ```
1116 |
1117 |
1118 | # Addendum
1119 |
1120 | ## IP Echo Server
1121 |
1122 | The IP Echo Server is a TCP server listening on the same address and port as the node's gossip service (e.g. if the node runs the gossip service on UDP socket 192.0.0.1:9000, then the IP Echo Server listens on TCP socket 192.0.0.1:9000).
1123 |
1124 | Before the node starts the gossip service, the node first needs to:
1125 | - find out the shred version of the cluster,
1126 | - discover its public IP address,
1127 | - check TCP/UDP port reachability.
1128 |
1129 | All of these are discovered via the IP echo server running on one of the provided entrypoint nodes. Note: all validators run an IP Echo Server.
1130 | The node should create sockets with ports that need to be checked and then send an IP echo server request message to one of the entrypoint nodes.
1131 | The entrypoint node will then check reachability for all ports listed in the request and then it will respond with an IP echo server response containing the `shred_version` and public IP of the node.
1132 |
1133 | #### IpEchoServerMessage
1134 |
1135 | IP echo server message request containing a list of ports whose reachability should be checked by the server:
1136 | - UDP ports - the server should send a single byte `[0x00]` packet to the socket.
1137 | - TCP ports - the server should establish a TCP connection with every TCP socket.
1138 |
1139 | | Data | Type | Size | Description |
1140 | |------|:----:|:----:|-------------|
1141 | | `tcp_ports` | `[u16; 4]` | 8 | TCP ports that should be checked |
1142 | | `udp_ports` | `[u16; 4]` | 8 | UDP ports that should be checked |
1143 |
1144 |
1145 | Solana client Rust implementation
1146 |
1147 | ```rust
1148 | /// Echo server message request.
1149 | pub struct IpEchoServerMessage {
1150 |     pub tcp_ports: [u16; 4],
1151 |     pub udp_ports: [u16; 4],
1152 | }
1153 | ```
1154 |
1155 |
1156 | #### IpEchoServerResponse
1157 |
1158 | IP echo server message response.
1159 |
1160 | | Data | Type | Size | Description |
1161 | |------|:----:|:----:|-------------|
1162 | | `address` | [`IpAddr`](#ipaddr) | 4 or 16 | public IP of the node that sent the request |
1163 | | `shred_version` | `u16 \| None`| 3 or 1 | shred version of the cluster |
1164 |
1165 |
1166 | Solana client Rust implementation
1167 |
1168 | ```rust
1169 | /// Echo server response.
1170 | pub struct IpEchoServerResponse {
1171 |     /// Public IP address of request echoed back to the node.
1172 |     pub address: IpAddr,
1173 |     /// Cluster shred-version of the node running the server.
1174 |     pub shred_version: Option<u16>,
1175 | }
1176 | ```
1177 |
1178 |
1179 | [bincode]: https://github.com/bincode-org/bincode/blob/trunk/docs/spec.md
1180 | [blockstore]: https://docs.solanalabs.com/validator/blockstore
1181 | [snapshots]: https://docs.anza.xyz/operations/best-practices/general/#snapshots
1182 | [cluster-restart]: https://github.com/solana-foundation/solana-improvement-documents/blob/main/proposals/0046-optimistic-cluster-restart-automation.md
1183 |
--------------------------------------------------------------------------------
/p2p/shred.md:
--------------------------------------------------------------------------------
1 | # Shred packets
2 |
3 | Shred packets are fragments of blocks (transaction data) that facilitate propagation within peer-to-peer networks.
4 |
5 | Data shreds are created by splitting up block data.
6 | Code shreds are created by constructing erasure codes over data shreds grouped into Forward Error Correction (FEC) sets.
7 |
8 | All shreds are signed by the block producer.
9 |
10 | ## Revisions
11 |
12 | Multiple revisions of the shred structures exist.
13 | The shred revision is not explicitly encoded.
14 |
15 | ### v1: Genesis revision
16 |
17 | Solana mainnet-beta genesis launched with support for code and data shreds with the legacy authentication mechanism.
18 |
19 | ### v2: Explicit data sizes
20 |
21 | Breaking change to the data shred header adding a new size field at offset `0x56`.
22 |
23 | The shred payload shifts from offset `0x56` to `0x58`.
24 |
25 | ### v3: Merkle authentication mechanism
26 |
27 | Introduction of two additional shred variants with the Merkle authentication scheme.
28 |
29 | ## Protocol
30 |
31 | ### Data Layout
32 |
33 | Shreds consist of the following sections, in order:
34 |
35 | - [Common Header](#common-header)
36 | - [Data Header](#data-shred-header-rev-v2) or [Code Header](#code-shred-header)
37 | - [Shred Payload](#shred-payload-construction)
38 | - [Zero Padding](#zero-padding) (if any)
39 | - [Merkle Proof](#merkle-proof) (if any)
40 |
41 | Each field is byte-aligned. Integer byte order is little-endian.
42 |
43 | ### Packet Size
44 |
45 | The `SHRED_SZ_MAX` constant is defined as 1228.
46 |
47 | If using Legacy authentication,
48 | each shred occupies `SHRED_SZ_MAX` bytes when serialized.
49 | If using Merkle authentication, coding shreds occupy `SHRED_SZ_MAX` bytes,
50 | and data shreds occupy 1203 bytes.
51 |
52 | This derives from the IPv6 minimum link MTU of 1280 bytes minus 48 bytes reserved for IPv6 and UDP headers.
53 | An additional 4 bytes are reserved for an optional nonce field.
54 |
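The derivation of `SHRED_SZ_MAX` can be written out as a small calculation. The constant names other than `SHRED_SZ_MAX` are chosen for this illustration only. The 48 reserved bytes are taken here to be the 40-byte fixed IPv6 header plus the 8-byte UDP header.

```python
IPV6_MIN_MTU = 1280          # minimum link MTU that every IPv6 link must support
IPV6_HEADER_SZ = 40          # fixed IPv6 header
UDP_HEADER_SZ = 8            # UDP header
NONCE_SZ = 4                 # optional nonce field

SHRED_SZ_MAX = IPV6_MIN_MTU - (IPV6_HEADER_SZ + UDP_HEADER_SZ) - NONCE_SZ
print(SHRED_SZ_MAX)  # 1228
```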
55 | ## Common Header
56 |
57 | The common header has size `0x53` (83 bytes).
58 |
59 | | Offset | Size | Type | Name | Purpose |
60 | |--------|-----:|-------------------|-----------------|--------------------------|
61 | | `0x00` | 64B | Ed25519 signature | `signature` | Block producer signature |
62 | | `0x40` | 1B | `u8` | `variant` | Shred variant |
63 | | `0x41` | 8B | `u64` | `slot` | Slot number |
64 | | `0x49` | 4B | `u32` | `shred_index` | Shred index |
65 | | `0x4d` | 2B | `u16` | `shred_version` | Shred version |
66 | | `0x4f` | 4B | `u32` | `fec_set_index` | FEC Set Index |
67 |
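As an illustration of the layout in the table above, the common header can be decoded with Python's `struct` module. This is a minimal sketch rather than a reference parser. The names `CommonHeader`, `COMMON_HEADER`, and `parse_common_header` are hypothetical, and the sample packet is fabricated purely to exercise the code.

```python
import struct
from collections import namedtuple

CommonHeader = namedtuple(
    "CommonHeader",
    "signature variant slot shred_index shred_version fec_set_index",
)

# "<" selects little-endian with no padding: 64-byte signature, u8, u64, u32, u16, u32
COMMON_HEADER = struct.Struct("<64sBQIHI")


def parse_common_header(packet):
    """Decode the first 0x53 bytes of a shred packet."""
    return CommonHeader._make(COMMON_HEADER.unpack_from(packet, 0))


# Fabricated example: zero signature, legacy data variant, slot 1000, index 7
packet = bytes(64) + bytes([0xA5]) + struct.pack("<QIHI", 1000, 7, 42, 0) + bytes(1145)
header = parse_common_header(packet)
print(COMMON_HEADER.size, header.slot, header.shred_index)  # 83 1000 7
```

The packed size of the format string equals `0x53`, which matches the header size stated above.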
68 | #### Field: Block producer signature
69 |
70 | The signature authenticates the fact that a shred originated from the current block producer at that slot.
71 |
72 | The block producer's public key is sourced externally.
73 |
74 | The content of the signed message used to create the signature depends on the authentication scheme used:
75 |
76 | * In the legacy authentication scheme, the block producer signs the content of each shred.
77 | The message to be signed begins 64 bytes past the beginning of the common shred header (i.e. skipping the signature field) and spans the entire rest of the shred, including any zero padding.
78 | The resulting signature is placed into the _block producer signature_ field.
79 |
80 | * In the Merkle authentication scheme, the block producer signs the Merkle root over the FEC set of shreds. See [Merkle Proof](#merkle-proof) for more details.
81 | Consequently, the signature field carries identical content for all shreds in the same FEC set.
82 | The Merkle node size is truncated to 20 bytes.
83 |
84 | #### Field: Shred Variant
85 |
86 | The shred variant identifies the shred type (data, code) and authentication mechanism (legacy, Merkle).
87 |
88 | The field is encoded as two 4-bit unsigned integers.
89 |
90 | The high 4-bit field is at bit range `4:8`.
91 | The low 4-bit field is at bit range `0:4`.
92 |
93 | | High 4-bit | Low 4-bit | Shred Type | Authentication |
94 | |------------|-----------|------------|----------------|
95 | | `0x5` | `0xa` | Code | Legacy |
96 | | `0xa` | `0x5` | Data | Legacy |
97 | | `0x4` | Any | Code | Merkle |
98 | | `0x8` | Any | Data | Merkle |
99 |
100 | When using Merkle authentication,
101 | the low 4 bits indicate the height of the Merkle tree.
102 | This number is defined below as $h$.
103 |
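The variant table can be turned into a small decoder. The Python sketch below covers only the four rows listed in this document and rejects every other value. The function name `parse_shred_variant` and the tuple it returns are illustrative choices.

```python
def parse_shred_variant(variant):
    """Return (shred_type, authentication, merkle_height) for a variant byte.

    merkle_height is None for legacy shreds.
    """
    high, low = variant >> 4, variant & 0x0F
    if (high, low) == (0x5, 0xA):
        return ("code", "legacy", None)
    if (high, low) == (0xA, 0x5):
        return ("data", "legacy", None)
    if high == 0x4:
        return ("code", "merkle", low)
    if high == 0x8:
        return ("data", "merkle", low)
    raise ValueError(f"unknown shred variant 0x{variant:02x}")


print(parse_shred_variant(0xA5))  # ('data', 'legacy', None)
print(parse_shred_variant(0x46))  # ('code', 'merkle', 6)
```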
104 | #### Field: Slot number
105 |
106 | Set to the slot number of the block that this shred is part of.
107 |
108 | #### Field: Shred index
109 | For data shreds, set to the index of this shred among all data shreds within a slot.
110 | For coding shreds, set to the index of this shred among all coding shreds within a slot.
111 |
112 | #### Field: Shred version
113 |
114 | Identifies the network fork of the block that this shred is part of.
115 |
116 | #### Field: FEC Set Index
117 |
118 | Set to the shred index of the first shred in the same FEC set as this shred.
119 | All shreds with the same FEC Set Index are part of the same FEC set.
120 |
121 | ## Data Shred Header (rev. v2)
122 |
123 | The data shred header immediately follows the common header.
124 |
125 | Offsets are relative to the start of the common header.
126 |
127 | | Offset | Size | Type | Name | Purpose |
128 | |--------|-----:|-------|----------------|-------------------------------|
129 | | `0x53` | 2B | `u16` | `parent_offset`| Slot distance to parent block |
130 | | `0x55` | 1B | `u8` | `data_flags` | Data Flags |
131 | | `0x56` | 2B | `u16` | `size` | Total Size |
132 |
133 | The data payload begins at offset `0x58` from the start of the common header.
134 |
135 | The data flags contain the following fields. (LSB 0 numbering)
136 |
137 | | Bits | Type | Name | Purpose |
138 | |-------|------|------------------|--------------------|
139 | | `7` | bool | `block_complete` | Block complete bit |
140 | | `6` | bool | `batch_complete` | Batch complete bit |
141 | | `0:6` | `u6` | `batch_tick` | Batch tick number |
142 |
143 | #### Field: Block complete bit
144 |
145 | Set to one if this data shred is the last shred for this block.
146 | Otherwise, set to zero.
147 |
148 | The final shred in a block is also the final shred in its respective batch,
149 | so the batch complete bit must also be set to one if the block complete bit is set to one.
150 |
151 | #### Field: Batch complete bit
152 | Set to one if this data shred is the last shred for this entry batch.
153 | Otherwise, set to zero to indicate that the next data shred is part of the same entry batch.
154 |
155 |
156 | #### Field: Total size
157 | The size of this packet including the common shred header, the data shred header, and the data payload.
158 | The size excludes the zero-padding (if any)
159 | and the Merkle proof (if using Merkle authentication).
160 |
161 | ### Data Shred Header (rev. v1)
162 |
163 | Revision v1 data shreds use a shorter header without the `size` field.
164 |
165 | The total size is assumed to be 1228 bytes and is therefore not included in the packet.
166 |
167 | | Offset | Size | Type | Name | Purpose |
168 | |--------|-----:|-------|----------------|--------------------------------|
169 | | `0x53` | 2B | `u16` | `parent_offset`| Slot distance to parent block |
170 | | `0x55` | 1B | `u8` | `data_flags` | Data Flags |
171 |
172 | The data payload begins at offset `0x56` from the start of the common header.
173 |
174 |
175 | ## Code Shred Header
176 |
177 |
178 | | Offset | Size | Type | Name | Purpose |
179 | |--------|-----:|-------|--------------------|--------------------------------|
180 | | `0x53` | 2B | `u16` | `num_data_shreds` | Number of data shreds |
181 | | `0x55` | 2B | `u16` | `num_coding_shreds`| Number of coding shreds |
182 | | `0x57` | 2B | `u16` | `position` | Position of this shred in FEC set |
183 |
184 | The Reed-Solomon coding payload begins at offset `0x59` from the start of the common header.
185 |
186 |
187 | #### Field: Number of data shreds
188 | Set to the number of data shreds in the FEC set to which this packet belongs.
189 | Every coding shred in a set must have the same value in this field.
190 |
191 | #### Field: Number of coding shreds
192 | Set to the number of coding shreds in the FEC set to which this packet belongs.
193 | Every coding shred in a set must have the same value in this field.
194 |
195 | #### Field: FEC set position
196 | Identifies which Reed-Solomon shard this packet contains.
197 | Must be in the range $[0, \texttt{num}\textunderscore\texttt{coding}\textunderscore\texttt{shreds})$.
198 | This field was not present and set to 0 prior to https://github.com/solana-labs/solana/pull/27136.
199 | Every coding shred in a set must have a unique value in this field.
200 |
201 | ## Shred Payload Construction
202 |
203 | Block producers create and broadcast shreds to the validator network while active.
204 |
205 | This occurs as a streaming process wherein shreds are constructed at the earliest possible opportunity.
206 |
207 | The construction of shreds requires the following steps:
208 | 1. Block Entry Batching
209 | 2. Data Shredding
210 | 3. Erasure Coding
211 | 4. Signing
212 |
213 | ## Block Entry Batching
214 |
215 | Batching creates sub-groups of block entries, each of which is serialized into one byte array.
216 |
217 | The serialization of a batch is the concatenation of all serialized entries, prefixed by the entry count as a `u64` integer (8 bytes).
218 |
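A minimal Python sketch of this batch layout is shown below. It assumes the entries have already been serialized to bytes and that the `u64` count is little-endian, consistent with the integer byte order used elsewhere in this document. The function name `serialize_batch` is hypothetical.

```python
import struct


def serialize_batch(serialized_entries):
    """Prefix the concatenated entries with their count as a little-endian u64."""
    return struct.pack("<Q", len(serialized_entries)) + b"".join(serialized_entries)


batch = serialize_batch([b"ab", b"c"])
print(batch)  # b'\x02\x00\x00\x00\x00\x00\x00\x00abc'
```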
219 | ## Shredding
220 |
221 | The shredding procedure converts a serialized entry batch into a vector of data shreds.
222 | In this section, the serialized entry batch is split into fixed-size pieces which form the payload of each data shred,
223 | and the associated metadata is computed.
224 |
225 | First, we must compute $S$, the size of the payload of each data shred.
226 | * For shreds using the Legacy authentication scheme, $S$ has a hardcoded value of 1051.
227 | This is small enough to ensure that the payload of a code shred can cover the data shred's headers while still fitting in 1228 bytes.
228 | * For shreds using the Merkle authentication scheme, $S = 1115 - 20 \lceil \log_2 (N+K) \rceil$,
229 | where $N+K$ is the total number of data and code shreds in the FEC set
230 | (see next section for details.
231 | Unfortunately, this means the definition is somewhat circular,
232 | and $S$ is only constant for a specific FEC set,
233 | not for the entire entry batch.
234 | This implies that when using Merkle authentication,
235 | the shredding and erasure coding steps are not entirely independent.
236 | In order to preserve clarity of presentation, we will describe them separately.)
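As a hedged illustration of the two cases above, the following sketch computes $S$ for a given FEC set. The function name `data_payload_size` and its parameters are hypothetical; the constants 1051, 1115 and 20 come from the text.

```python
def data_payload_size(num_data, num_code, merkle=True):
    """Return S, the payload size in bytes of each data shred in an FEC set."""
    if not merkle:
        return 1051  # Legacy authentication: hardcoded value
    total = num_data + num_code
    if total < 1:
        raise ValueError("an FEC set must contain at least one shred")
    # ceil(log2(total)) in integer arithmetic, avoiding float rounding
    height = (total - 1).bit_length()
    return 1115 - 20 * height
```

With the recommended $N=32$ (and thus $K=32$), the tree height is 6 and this gives $S = 1115 - 120 = 995$.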
237 |
238 | Let $\ell$ be the length in bytes of the serialized entry batch and
239 | $x_0, x_1, \ldots, x_{\ell-1}$ be the serialized entry batch bytes.
240 |
241 | Then the payload of the $i$-th data shred is $x_{iS}, x_{iS+1}, \ldots, x_{(i+1)S-1}$
242 | for $0\le i < \lfloor{\ell/S}\rfloor$.
243 | If $\ell$ is not divisible by $S$, then the final payload is 0-padded to $S$ bytes, i.e.
244 | $x_{\lfloor{\ell/S}\rfloor S}, x_{\lfloor{\ell/S}\rfloor S+1}, \ldots, x_{\ell-1}, \underbrace{0, 0, \; \ldots\;, 0}_{S(\lfloor{\ell/S}\rfloor+1)-\ell \text{ bytes}}$.
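The splitting rule can be sketched in a few lines. This is an illustrative helper (the name `split_payloads` is hypothetical) that takes $S$ as a parameter and ignores all header and metadata handling.

```python
def split_payloads(batch, payload_size):
    """Split a serialized entry batch into fixed-size, zero-padded payloads."""
    if payload_size <= 0:
        raise ValueError("payload size must be positive")
    payloads = []
    for start in range(0, len(batch), payload_size):
        chunk = batch[start : start + payload_size]
        # Only the final chunk can be short; pad it with zero bytes
        payloads.append(chunk + bytes(payload_size - len(chunk)))
    return payloads
```

For a 7-byte batch and $S=3$, the final payload carries one data byte followed by two zero bytes.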
245 |
246 |
247 | The shred index of the first shred of the first batch in the block is 0.
248 | The shred index of each subsequent shred is one larger than the shred index of the previous shred.
249 | The shred index of the first shred in a subsequent batch is one larger than the shred index of the last shred in the previous batch.
250 | That is, shred indices increase monotonically without resetting at each batch.
251 |
252 |
253 | The last data shred of each batch has the "batch complete" bit set.
254 | This field can be extracted using data flags bit `0x40`.
255 |
256 | The last data shred of the block has the "block complete" bit set.
257 | This field can be extracted using data flags bit `0x80`.
258 | Since a block contains an integral number of entry batches,
259 | the last data shred of the block must also be the last data shred of a batch.
260 |
261 | The "batch tick number" of all shreds in a batch is set to the
262 | number of PoH ticks that have passed since the beginning of the slot
263 | for the first entry in the batch.
264 | Since Solana has 64 ticks per slot, this field cannot overflow.
265 | This field can be extracted using data flags bit field with mask `0x3f`.
266 | When the "block complete" flag is set to 1, "batch tick number" may be set to 0.
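The three fields share one data flags byte. The sketch below packs and unpacks it using the masks given above; the helper names are hypothetical, and setting the "batch complete" bit whenever "block complete" is set follows from the statement that the last shred of a block is also the last shred of a batch.

```python
BLOCK_COMPLETE = 0x80  # set on the last data shred of the block
BATCH_COMPLETE = 0x40  # set on the last data shred of each batch
BATCH_TICK_MASK = 0x3F  # low six bits: batch tick number


def make_data_flags(batch_tick, batch_complete=False, block_complete=False):
    """Pack the data flags byte (hypothetical helper)."""
    if not 0 <= batch_tick <= BATCH_TICK_MASK:
        raise ValueError("batch tick number must fit in 6 bits")
    flags = batch_tick
    # The last shred of a block is also the last shred of its batch
    if batch_complete or block_complete:
        flags |= BATCH_COMPLETE
    if block_complete:
        flags |= BLOCK_COMPLETE
    return flags


def parse_data_flags(flags):
    """Unpack the data flags byte into its three fields."""
    return {
        "block_complete": bool(flags & BLOCK_COMPLETE),
        "batch_complete": bool(flags & BATCH_COMPLETE),
        "batch_tick": flags & BATCH_TICK_MASK,
    }
```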
267 |
268 |
269 |
270 | The inverse of the shredding process (deshredding) reconstructs serialized batches from a stream of data shreds.
271 |
272 | ## Erasure Coding
273 | Data shreds are grouped together to form forward error correction (FEC) sets.
274 | The block producer may choose $N$,
275 | the number of contiguous data shreds to include in the FEC set.
276 | However, an FEC set must have at least 1 and no more than 67 data shreds,
277 | and $N=32$ is recommended.
278 |
279 |
280 | Given the chosen value of $N$,
281 | a compliant block producer must produce $K$ code shreds as given by the following table:
282 |
283 | | $N$ | $K$ | Total shreds ($N+K$) | | | $N$ | $K$ | Total shreds ($N+K$) |
284 | |-----|-----|----------------------|--|--|-----|-----|---------------------|
285 | | 1 | 17 | 18 | | | 17 | 26 | 43 |
286 | | 2 | 18 | 20 | | | 18 | 27 | 45 |
287 | | 3 | 19 | 22 | | | 19 | 27 | 46 |
288 | | 4 | 19 | 23 | | | 20 | 28 | 48 |
289 | | 5 | 20 | 25 | | | 21 | 28 | 49 |
290 | | 6 | 21 | 27 | | | 22 | 29 | 51 |
291 | | 7 | 21 | 28 | | | 23 | 29 | 52 |
292 | | 8 | 22 | 30 | | | 24 | 29 | 53 |
293 | | 9 | 23 | 32 | | | 25 | 30 | 55 |
294 | | 10 | 23 | 33 | | | 26 | 30 | 56 |
295 | | 11 | 24 | 35 | | | 27 | 31 | 58 |
296 | | 12 | 24 | 36 | | | 28 | 31 | 59 |
297 | | 13 | 25 | 38 | | | 29 | 31 | 60 |
298 | | 14 | 25 | 39 | | | 30 | 32 | 62 |
299 | | 15 | 26 | 41 | | | 31 | 32 | 63 |
300 | | 16 | 26 | 42 | | | 32 | 32 | 64 |
301 |
302 | For $N>32$, use $K=N$.
303 |
304 | However, a compliant implementation may also accept FEC sets with a different number of code shreds as long as $1\le K, N \le 67$.
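For producers, the table and the $N>32$ rule reduce to a small lookup. The sketch below transcribes the "Total shreds" column of the table above; the names `TOTAL_SHREDS` and `code_shred_count` are hypothetical.

```python
# Total shreds (N + K) for N = 1..32, transcribed from the table above
TOTAL_SHREDS = [
    18, 20, 22, 23, 25, 27, 28, 30, 32, 33, 35, 36, 38, 39, 41, 42,
    43, 45, 46, 48, 49, 51, 52, 53, 55, 56, 58, 59, 60, 62, 63, 64,
]


def code_shred_count(num_data):
    """Return K, the number of code shreds a producer emits for N data shreds."""
    if not 1 <= num_data <= 67:
        raise ValueError("an FEC set must contain 1 to 67 data shreds")
    if num_data > 32:
        return num_data  # For N > 32, K = N
    return TOTAL_SHREDS[num_data - 1] - num_data
```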
305 |
306 | Code shreds are produced from the data shreds using Reed-Solomon encoding.
307 | When using legacy authentication,
308 | the interpretation of "data shred" used for erasure coding
309 | starts with the first byte of the common header of the data shreds
310 | and includes the signature field.
311 | When using Merkle authentication,
312 | the interpretation of "data shred" used for erasure coding begins immediately after the signature field
313 | and ends immediately before the Merkle proof section.
314 |
315 | Let $x_{i,b}$ be the $b$-th byte of the $i$-th data shred of the FEC set (numbered $0, 1, \ldots, N-1$) interpreted as an element of the finite field $GF(2^8)$ (i.e. $\mathbb{F}_2[\gamma] / (\gamma^8 + \gamma^4 + \gamma^3 + \gamma^2 + 1)$ ).
316 |
317 | Taking one $b$ at a time, define the polynomial $P_b(x)$ of degree less than $N$ such that $P_b(i) = x_{i,b}$ for all $0\le i < N$ (interpreting the byte value of $i$ as an element of $GF(2^8)$ ).
318 | This polynomial is unique.
319 |
320 | Then the $b$-th byte of each code shred comes from evaluating $P_b$ at subsequent points.
321 | More precisely, let $y_{j,b}$ be the $b$-th byte of the $j$-th code shred for $0\le j < K$.
322 | Then $y_{j,b} = P_b(N+j)$, where $N+j$ is computed as an integer and then interpreted as an element of $GF(2^8)$.
323 |
324 | Equivalently, this is a linear operation, so it can also be described as a matrix-vector product over $GF(2^8)$:
325 |
326 | $$ M \left( \begin{array}{c}
327 | x_{0,b} \\
328 | x_{1,b} \\
329 | \vdots \\
330 | x_{N-1,b} \end{array} \right) = \left( \begin{array}{c}
331 | y_{0,b} \\
332 | y_{1,b} \\
333 | \vdots \\
334 | y_{K-1,b} \end{array} \right).$$
335 |
336 | The matrix $M$ depends only on $N$ and $K$.
337 | There are various ways to compute $M$, but one description is
338 |
339 | $$
340 | M = \left( \begin{array}{cccc}
341 | N^0 & N^1 & \cdots & N^{N-1} \\
342 | (N+1)^0 & (N+1)^1 & \cdots & (N+1)^{N-1} \\
343 | \vdots & \vdots & \ddots & \vdots \\
344 | (N+K-1)^0 & (N+K-1)^1 & \cdots & (N+K-1)^{N-1} \end{array} \right) *
345 | \left( \begin{array}{ccccc}
346 | 1 & 0 & 0 & \cdots & 0 \\
347 | 1^0 & 1^1 & 1^2 & \cdots & 1^{N-1} \\
348 | 2^0 & 2^1 & 2^2 & \cdots & 2^{N-1} \\
349 | \vdots & \vdots & \vdots & \ddots & \vdots \\
350 | (N-1)^0 & (N-1)^1 & (N-1)^2 & \cdots & (N-1)^{N-1} \end{array} \right)^{-1}
351 | $$
352 |
353 | where the base and exponent computations are integer arithmetic,
354 | but the exponentiation, matrix inverse, and matrix multiplication are finite field operations.
355 | That is, $M$ is the product of a portion of a Vandermonde matrix with the inverse of another Vandermonde matrix.
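The polynomial description can be checked with a direct, unoptimized sketch. It evaluates $P_b$ by Lagrange interpolation instead of building $M$, which yields the same values because the interpolating polynomial is unique. All function names are hypothetical, and production encoders use table-driven arithmetic rather than the loops shown here.

```python
GF_POLY = 0x11D  # gamma^8 + gamma^4 + gamma^3 + gamma^2 + 1


def gf_mul(a, b):
    """Multiply two elements of GF(2^8) by shift-and-add with reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= GF_POLY
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse via a^254, since a^255 = 1 for nonzero a."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def eval_interpolant(values, point):
    """Evaluate at `point` the polynomial P with P(i) = values[i]."""
    total = 0
    for i, value in enumerate(values):
        term = value
        for j in range(len(values)):
            if j != i:
                # Subtraction in GF(2^8) is XOR
                term = gf_mul(term, gf_mul(point ^ j, gf_inv(i ^ j)))
        total ^= term
    return total


def encode_column(data_bytes, num_code):
    """Return the b-th byte of each code shred from the b-th data bytes."""
    n = len(data_bytes)
    return [eval_interpolant(data_bytes, n + j) for j in range(num_code)]
```

For example, with $N=2$ and data bytes `[1, 3]`, the polynomial is $P(x) = 1 + 2x$, so `encode_column([1, 3], 2)` evaluates it at 2 and 3 and returns `[5, 7]`.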
356 |
357 | ## Zero Padding
358 |
359 | When the data produced by the shredding process
360 | does not fill the payload field,
361 | additional zero bytes must be inserted after the shred data.
362 | This ensures all data shreds are the same length.
363 |
364 | The Reed-Solomon encoding process naturally produces coding shreds of the same size,
365 | so coding shreds do not need zero-padding.
366 |
367 | ## Signing
368 |
369 | ### Legacy
370 | When using the legacy authentication method,
371 | the block producer populates the signature field of data shreds and code shreds with the Ed25519 signature of the bytes in the packet that follow the signature field,
372 | including any zero padding.
373 |
374 |
375 | ### Merkle
376 |
377 | When using the Merkle authentication method,
378 | the block producer constructs the [canonical Merkle tree](/core/merkle-tree.md) from each shred in an FEC set,
379 | with the data shreds in sequence followed by the coding shreds in sequence.
380 | The leaf nodes for both shred types are the bytes
381 | from immediately after the signature field
382 | to immediately before the Merkle proof section.
383 | All hashes are truncated to 20 bytes:
384 | the last 12 bytes of each SHA-256 hash are discarded immediately.
385 | Leaf nodes use a prefix of `\x00SOLANA_MERKLE_SHREDS_LEAF` and interior nodes use a prefix of `\x01SOLANA_MERKLE_SHREDS_NODE`.
386 | Both prefixes are 26 bytes and are not `\0`-terminated.
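The hashing rules above can be sketched with the standard library. The prefixes and the 20-byte truncation come from the text; that an interior node hashes the concatenation of its two truncated children is an assumption here (the canonical Merkle tree document is authoritative), and the function names are hypothetical.

```python
import hashlib

LEAF_PREFIX = b"\x00SOLANA_MERKLE_SHREDS_LEAF"
NODE_PREFIX = b"\x01SOLANA_MERKLE_SHREDS_NODE"
HASH_SIZE = 20  # bytes kept from each 32-byte SHA-256 digest


def leaf_hash(leaf_bytes):
    """Truncated hash of a leaf: the shred bytes between signature and proof."""
    return hashlib.sha256(LEAF_PREFIX + leaf_bytes).digest()[:HASH_SIZE]


def node_hash(left, right):
    """Truncated hash of an interior node.

    Assumption: children enter the hash in their 20-byte truncated form,
    left child first.
    """
    return hashlib.sha256(NODE_PREFIX + left + right).digest()[:HASH_SIZE]
```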
387 |
388 | The block producer computes the Ed25519 signature of the root of the Merkle tree
389 | and stores the signature in the signature field of the common header.
390 | Since all packets in the FEC set are part of the same Merkle tree
391 | and thus have the same Merkle root,
392 | all shreds (code and data) in the same FEC set have the same signature in this scheme.
393 |
394 | ## Merkle Proof
395 |
396 | When using Merkle authentication,
397 | the last bytes of the packet contain a Merkle proof that the payload belongs to the Merkle tree covered by the signature field.
398 |
399 | Let $h=\lceil \log_2 (N+K) \rceil$ be the height of the Merkle tree for an FEC set
400 | (including the leaf nodes but not the root).
401 | The Merkle proof section is composed of the following:
402 |
403 | | Offset | Size | Type | Description |
404 | |------------------|-----:|-----------------------|--------------------------------------------------|
405 | | end- $20h$ | 20B | Truncated Merkle hash | Merkle hash of sibling leaf node |
406 | | end- $20h$ + 20 | 20B | Truncated Merkle hash | Merkle hash of sibling of parent of leaf node |
407 | | ... | ... | ... | ... |
408 | | end-20 | 20B | Truncated Merkle hash | Merkle hash of child of root |
409 |
410 | The Merkle proof contains the other information needed to compute the full branch from the leaf in the packet to the root.
411 | For example, in [canonical Merkle tree Figure 2](/core/merkle-tree.md#figure_2),
412 | the proof for `L0` contains the hashes `L1` (sibling leaf node),
413 | `Iβ` and `Iε`.
414 | It does not include `Iα`, `Iδ`, or `Iζ` as those can be computed from the included information.
415 | The proof for `L3` contains `L2`, `Iα`, and `Iε`.
416 |
417 | As described previously, the leaf nodes for both shred types are the bytes
418 | from immediately after the signature field
419 | to immediately before the Merkle proof section.
420 |
421 | A compliant implementation must validate that
422 | the signature is a valid signature of the root hash
423 | and that the Merkle tree is consistent among all shreds in the FEC set.
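The following sketch shows how a receiver might recompute the root from a leaf, its position in the FEC set, and the proof entries, ordered from sibling leaf up to child of root as in the table above. It is illustrative only: the helper names are hypothetical, pairing an unpaired node with itself is an assumption about the canonical tree, and signature verification is omitted.

```python
import hashlib

HASH_SIZE = 20


def _h(prefix, *parts):
    return hashlib.sha256(prefix + b"".join(parts)).digest()[:HASH_SIZE]


def leaf_hash(data):
    return _h(b"\x00SOLANA_MERKLE_SHREDS_LEAF", data)


def node_hash(left, right):
    return _h(b"\x01SOLANA_MERKLE_SHREDS_NODE", left, right)


def build_layers(leaves):
    """Return all tree layers, from leaf hashes up to the single root."""
    layers = [[leaf_hash(leaf) for leaf in leaves]]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        if len(prev) % 2:
            prev = prev + [prev[-1]]  # assumption: unpaired node pairs with itself
        layers.append(
            [node_hash(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)]
        )
    return layers


def make_proof(layers, index):
    """Collect the sibling hash at every level below the root."""
    proof = []
    for layer in layers[:-1]:
        sibling = index ^ 1
        proof.append(layer[sibling] if sibling < len(layer) else layer[index])
        index >>= 1
    return proof


def root_from_proof(leaf, index, proof):
    """Recompute the Merkle root from a leaf, its index, and its proof."""
    node = leaf_hash(leaf)
    for sibling in proof:
        if index % 2 == 0:
            node = node_hash(node, sibling)
        else:
            node = node_hash(sibling, node)
        index >>= 1
    return node
```

For an FEC set with $N+K=18$ shreds, each proof in this sketch has $h=5$ entries and therefore occupies 100 bytes. A receiver would compare the recomputed root across all shreds of the set and check the signature against it.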
424 |
--------------------------------------------------------------------------------
/p2p/tpu.md:
--------------------------------------------------------------------------------
1 | # TPU protocol
2 |
3 | TPU is a publicly available peer-to-peer service to queue transactions for inclusion on the Solana blockchain.
4 |
5 | It is backed by a streaming protocol (UDP datagrams or QUIC) which routes transactions to network leaders.
6 |
7 | ## Topology
8 |
9 | - **Clients** produce new transactions
10 | - **Relayers** forward transactions to leaders or other relayers
11 | - **Leaders** pack transactions into blocks
12 |
13 | The lifecycle of a transaction starts with the signer.
14 | Signers can either submit transactions directly to TPU nodes (requires UDP connectivity) or submit them through RPC servers.
15 |
16 | ```mermaid
17 | flowchart LR
18 |
19 | subgraph Clients
20 | C1[Client]
21 | C2[Client]
22 | C3[Client]
23 | C4[Client]
24 | end
25 |
26 | subgraph Validators
27 | L[Leader]
28 | V1[Validator]
29 | V2[Validator]
30 | end
31 |
32 | R1[Node]
33 |
34 | C1 -- RPC --> R1
35 | C2 -- TPU --> R1
36 | R1 -- TPU --> L
37 | C3 -- TPU --> L
38 | C4 -- TPU ---> V1
39 | V1 -- TPU\nTPUvote --> L
40 | V2 -- TPUvote --> L
41 | ```
42 |
43 | In a Solana cluster, full nodes usually participate as TPU relayers, although this is not required.
44 | All validators included in the leader schedule should operate an open TPU service.
45 |
46 | Nodes attempt to forward transactions to the current leader's TPU endpoints.
47 | The node identities of the leader schedule are well known (runtime intrinsic),
48 | and their TPU endpoints are part of CRDS ContactInfo.
49 |
50 | ## Classes
51 |
52 | The network uses separate classes of TPU traffic for QoS.
53 |
54 | - The **TPUvote** class is Tower BFT consensus messages (highest priority).
55 | - The default **TPU** class is regular user transactions, generated by validators.
56 | - The **TPUfwd** class is user transactions that the previous leader did not fully process.
57 |
58 | ## TPU/QUIC protocol v1
59 |
60 | TPU/QUIC relays transactions over the QUIC transport protocol.
61 |
62 | QUIC is defined by [RFC 9000] and runs on UDP networks.
63 |
64 | Data is encrypted using TLS 1.3, defined by [RFC 8446].
65 |
66 | [RFC 8446]: https://www.rfc-editor.org/rfc/rfc8446.html
67 | [RFC 9000]: https://www.rfc-editor.org/rfc/rfc9000.html
68 |
69 | ### Connection Parameters
70 |
71 | **Compliance**
72 |
73 | The TPU/QUIC protocol does not aim to be fully TLS-compliant ([RFC 8446 Section 9]).
74 |
75 | Peers must support the following cipher suites:
76 | - `TLS_AES_128_GCM_SHA256` (0x1301, TLS 1.3)
77 | - `TLS_CHACHA20_POLY1305_SHA256` (0x1303, TLS 1.3)
78 |
79 | Peers should not negotiate any deprecated TLS 1.2 cipher suites.
80 |
81 | Peers must support the following cryptographic schemes:
82 | - digital signature schemes
83 |   - `ed25519` (0x0807)
84 | - key exchange groups
85 |   - `x25519` (29)
86 |
87 | Peers are not required to support any additional cryptographic schemes.
88 | Contrary to [RFC 8446 Section 9.1], the following cryptographic schemes are optional:
89 | - digital signature schemes
90 |   - `rsa_pkcs1_sha256` (0x0401)
91 |   - `ecdsa_secp256r1_sha256` (0x0403)
92 |   - `rsa_pss_rsae_sha256` (0x0804)
93 | - key exchange groups
94 |   - `secp256r1` (23)
95 |
96 | Refer to the [IANA TLS Parameters](https://www.iana.org/assignments/tls-parameters/tls-parameters.xhtml) for an authoritative list of identifiers.
97 |
98 | [RFC 8446 Section 9]: https://datatracker.ietf.org/doc/html/rfc8446#section-9
99 | [RFC 8446 Section 9.1]: https://datatracker.ietf.org/doc/html/rfc8446#section-9.1
100 |
101 | **Key Exchange**
102 |
103 | Clients must include an X25519 key share in the initial ClientHello message.
104 |
105 | Failure to do so may result in an additional handshake round trip or an aborted connection.
106 |
107 | **Application-Layer Protocol Negotiation**
108 |
109 | The TLS ClientHello and ServerHello messages must include the ALPN protocol ID `solana-tpu`.
110 | The server must reject connections that fail to advertise ALPN accordingly.
111 |
112 | **Transport parameters**
113 |
114 | On connection creation, peers should set appropriate quotas via the QUIC transport parameters TLS extension.
115 |
116 | Recommended server-side parameters:
117 | - `initial_max_stream_data_uni` (0x07): 1232 (Max transaction size)
118 |
119 | The following parameters should be omitted (defaulting to 0) or explicitly set to 0:
120 | - `initial_max_data` (0x04), omit on client only
121 | - `initial_max_stream_data_bidi_local` (0x05)
122 | - `initial_max_stream_data_bidi_remote` (0x06)
123 | - `initial_max_streams_bidi` (0x08)
124 |
125 | Refer to [RFC 9000 Section 18.2](https://www.rfc-editor.org/rfc/rfc9000.html#name-transport-parameter-definit) for transport parameter definitions.
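To make the wire format concrete, the sketch below encodes one integer-valued transport parameter as an identifier, a length, and a value, each a QUIC variable-length integer as defined in RFC 9000. The helper names are hypothetical; a real deployment would normally set these values through its QUIC library's configuration rather than encode them by hand.

```python
def encode_varint(value):
    """Encode a QUIC variable-length integer (RFC 9000 Section 16)."""
    if value < 0:
        raise ValueError("varints are unsigned")
    if value < 1 << 6:
        return value.to_bytes(1, "big")
    if value < 1 << 14:
        return (value | 0x4000).to_bytes(2, "big")
    if value < 1 << 30:
        return (value | 0x8000_0000).to_bytes(4, "big")
    if value < 1 << 62:
        return (value | 0xC000_0000_0000_0000).to_bytes(8, "big")
    raise ValueError("value does not fit in 62 bits")


def encode_transport_param(param_id, value):
    """Encode an integer transport parameter as id, length, value."""
    body = encode_varint(value)
    return encode_varint(param_id) + encode_varint(len(body)) + body


# initial_max_stream_data_uni (0x07) set to the maximum transaction size
param = encode_transport_param(0x07, 1232)
```

Here `param` is the four bytes `07 02 44 d0`: the identifier, a length of 2, and 1232 as a two-byte varint.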
126 |
127 | **Send quota**
128 |
129 | In QUIC, senders are only allowed to transmit as much data as specified via quotas by the receiver.
130 | If any quota is violated, the server should close the connection.
131 |
132 | The [`MAX_DATA`] and [`MAX_STREAMS`] quotas should be continually replenished by the server side.
133 |
134 | [`MAX_DATA`]: https://www.rfc-editor.org/rfc/rfc9000.html#name-max_data-frames
135 | [`MAX_STREAMS`]: https://www.rfc-editor.org/rfc/rfc9000.html#name-max_streams-frames
136 |
137 | ### Streaming Protocol
138 |
139 | Transactions are transmitted via client-to-server unidirectional QUIC streams.
140 | Every stream contains exactly one serialized transaction.
141 |
142 | TPU/QUIC servers should accept stream data of up to the maximum transaction size (1232 bytes).
143 |
144 | #### Single-packet transactions
145 |
146 | One or more small transactions can be transmitted in a single QUIC 1-RTT packet.
147 |
148 | **Example QUIC frame**
149 |
150 | ```
151 | STREAM Frame {
152 | Prefix (5) = 0b00001 # Prefix present on all STREAM frames
153 | Offset Present (1) = 0 # Offset is absent for first (only) fragment
154 | Length Present (1) # Length should be absent if the tx is last in this packet
155 | Fin (1) = 1 # End-of-stream is set for last (only) fragment
156 |
157 | Stream ID (i) # Stream ID specified by server (implied length)
158 | [Length (i)] # Length of the stream data that follows
159 | Stream Data (..) # Contains the transaction
160 | }
161 | ```
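The frame layout above can be assembled byte by byte. The sketch below is illustrative only: the helper names are hypothetical, packet protection and packet headers are out of scope, and the flag bit values (`0x04` offset, `0x02` length, `0x01` fin) are those of the STREAM frame type in RFC 9000.

```python
def encode_varint(value):
    """Encode a QUIC variable-length integer (RFC 9000 Section 16)."""
    for size, prefix in ((1, 0x00), (2, 0x40), (4, 0x80), (8, 0xC0)):
        if 0 <= value < 1 << (8 * size - 2):
            raw = bytearray(value.to_bytes(size, "big"))
            raw[0] |= prefix  # two most significant bits encode the length
            return bytes(raw)
    raise ValueError("value does not fit in 62 bits")


def stream_frame(stream_id, data, offset=0, fin=False, with_length=True):
    """Build an unprotected STREAM frame carrying `data`."""
    frame_type = 0x08  # 0b00001 prefix followed by the three flag bits
    if offset:
        frame_type |= 0x04  # Offset Present
    if with_length:
        frame_type |= 0x02  # Length Present
    if fin:
        frame_type |= 0x01  # Fin
    out = bytes([frame_type]) + encode_varint(stream_id)
    if offset:
        out += encode_varint(offset)
    if with_length:
        out += encode_varint(len(data))
    return out + data
```

A whole transaction sent as the last frame of a packet would use `stream_frame(stream_id, tx, fin=True, with_length=False)`.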
162 |
163 | #### Transaction fragmentation
164 |
165 | Fragmentation is required when the size of a transaction exceeds the packet MTU of the underlying UDP/IP layers.
166 |
167 | TPU/QUIC implements fragmentation by splitting a QUIC stream across multiple packets.
168 | The payload of each QUIC packet should maximize space up to the MTU.
169 |
170 | Peers must avoid sending QUIC traffic using IPv4/IPv6 fragmentation.
171 | Incoming QUIC traffic indicating IP fragmentation may be ignored.
172 |
173 | **Example QUIC frames**
174 |
175 | _First UDP packet_
176 |
177 | ```
178 | STREAM Frame {
179 | Prefix (5) = 0b00001
180 | Offset Present (1) = 0 # Offset is absent for first fragment
181 | Length Present (1) = 0 # Length is absent, as fragment frames always fill packet
182 | Fin (1) = 0
183 |
184 | Stream ID (i)
185 | Stream Data (..)
186 | }
187 | ```
188 |
189 | _Second UDP packet_
190 |
191 | ```
192 | STREAM Frame {
193 | Prefix (5) = 0b00001
194 | Offset Present (1) = 1
195 | Length Present (1) = 0
196 | Fin (1) = 0
197 |
198 | Stream ID (i)
199 | [Offset (i)]
200 | Stream Data (..)
201 | }
202 | ```
203 |
204 | _Last UDP packet_
205 |
206 | ```
207 | STREAM Frame {
208 | Prefix (5) = 0b00001
209 | Offset Present (1) = 1
210 | Length Present (1) = 0
211 | Fin (1) = 1 # Terminate stream
212 |
213 | Stream ID (i)
214 | [Offset (i)]
215 | Stream Data (..)
216 | }
217 | ```
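The three example frames follow a simple pattern, sketched below as a planning step that decides the offset and fin flag of each fragment. The function name and the `max_chunk` parameter are hypothetical; in practice the usable chunk size depends on the path MTU minus the QUIC packet and frame overhead.

```python
def fragment_stream(tx, max_chunk):
    """Plan the STREAM frames needed to carry one transaction."""
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    fragments = []
    for offset in range(0, len(tx), max_chunk):
        chunk = tx[offset : offset + max_chunk]
        fragments.append(
            {
                "offset_present": offset > 0,  # absent on the first fragment
                "offset": offset,
                "fin": offset + len(chunk) == len(tx),  # set on the last one
                "data": chunk,
            }
        )
    return fragments
```

A maximum-size transaction of 1232 bytes with a hypothetical 500-byte chunk limit yields three fragments at offsets 0, 500 and 1000, with only the last one terminating the stream.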
218 |
219 | #### Acknowledgement
220 |
221 | `STREAM` frames sent by clients are ack-eliciting.
222 | Refer to [RFC 9000 Section 13.2.1] for obligations on sending `ACK` frames.
223 |
224 | It is permitted to coalesce multiple `ACK` frames to reduce packet count.
225 |
226 | #### Stream Limits
227 |
228 | The server should send [`MAX_STREAM_DATA`] frames for recently created fragmented tx streams.
229 | The client should expect to receive arbitrary quotas for any stream.
230 |
231 | Note that stream IDs are incremental (ignoring the two least significant bits).
232 | Upon opening a stream with an out-of-order ID, all preceding streams are also created.
233 |
234 | In this case, the server may proactively send multiple [`MAX_STREAM_DATA`] frames for implicitly created streams.
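The stream ID arithmetic can be illustrated briefly. In RFC 9000 the two least significant bits of a stream ID encode its type, with `0x02` denoting client-initiated unidirectional streams; the helper names below are hypothetical.

```python
CLIENT_UNI = 0x02  # stream type bits: client-initiated, unidirectional


def client_uni_stream_id(n):
    """ID of the n-th (0-based) client-initiated unidirectional stream."""
    return (n << 2) | CLIENT_UNI


def opened_by(stream_id, already_open):
    """IDs created when `stream_id` is opened.

    `already_open` is the number of streams of the same type that
    already exist; any skipped lower IDs are created implicitly.
    """
    kind = stream_id & 0x03
    index = stream_id >> 2
    return [(n << 2) | kind for n in range(already_open, index + 1)]
```

For example, if only stream 2 exists and the client opens stream 14, streams 6 and 10 are created along with it.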
235 |
236 | [RFC 9000 Section 13.2.1]: https://www.rfc-editor.org/rfc/rfc9000.html#name-sending-ack-frames
237 | [`MAX_STREAM_DATA`]: https://www.rfc-editor.org/rfc/rfc9000.html#name-max_stream_data-frames
238 |
239 | #### Aborting
240 |
241 | The server must terminate the QUIC connection in case of the following events:
242 |
243 | - Submission of invalid transactions, specifically:
244 |   - Txs that fail deserialization
245 |   - Txs that fail parameter sanitization
246 |   - Txs that fail signature verification
247 | - Transmission of QUIC datagrams, defined by [RFC 9221]
248 | - Attempted creation of QUIC bidirectional streams (already forbidden by quota)
249 | - All other cases where the QUIC RFC dictates connection termination.
250 |
251 | [RFC 9221]: https://www.rfc-editor.org/rfc/rfc9221.html
252 |
253 | ## TPU/UDP protocol
254 |
255 | TPU/UDP is a simple datagram-oriented network protocol.
256 |
257 | Each datagram carries a single serialized transaction without headers or padding.
258 |
259 | The protocol is strictly unidirectional. The receiver should therefore ignore the source IP address and UDP port.
260 |
261 | It is being deprecated in favor of TPU/QUIC, which adds authentication, confidentiality, and congestion control.
262 |
263 | TPU/UDP traffic cannot be multiplexed with other protocols over the same destination port.
264 |
265 | ## Security
266 |
267 | ### Source IP spoofing
268 |
269 | The source address on IP headers is an untrusted field and can be forged by attackers over the Internet.
270 |
271 | A practical defense involves challenging the source to prove that it is privileged to see incoming traffic at the supposed address.
272 | For example, TCP or QUIC in 1-RTT mode is protected by a three-way handshake,
273 | wherein the source returns a pseudorandom nonce that it previously received from the destination.
274 |
275 | The TPU/UDP and TPU/QUIC protocols do not protect against source IP spoofing.
276 | Therefore, the receiver must ignore the source IP address when using these protocols.
277 |
278 | Critically, the receiver must not send significantly more traffic back to the supposed source than what originally arrived.
279 | Failure to do so introduces a traffic amplification vulnerability commonly used in DDoS attacks.
280 |
281 | For more information, refer to [BCP 38](https://www.rfc-editor.org/info/bcp38).
282 |
283 | ### Packet flood incentivization
284 |
285 | When TPU/UDP links become congested, packets start to get dropped arbitrarily.
286 | This incentivizes clients to repeatedly send packets (spam) to increase the chance
287 | of getting transactions confirmed. Such an increase in traffic worsens congestion.
288 |
289 | TPU/QUIC implements retransmit and congestion control to address this vulnerability.
290 |
291 | ### Confidentiality
292 |
293 | TPU traffic should be treated as confidential data.
294 |
295 | The ideal TPU route consists of a single hop from transaction signer to leader.
296 | Using TPU/QUIC for end-to-end encryption further reduces MitM risks.
297 |
--------------------------------------------------------------------------------
/solana_specs/.gitignore:
--------------------------------------------------------------------------------
1 | __pycache__
2 |
--------------------------------------------------------------------------------
/solana_specs/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/solana-foundation/specs/136feee5d5b126dbfd351258ea621200346c84b1/solana_specs/__init__.py
--------------------------------------------------------------------------------
/solana_specs/consensus/leader_schedule.py:
--------------------------------------------------------------------------------
1 | import csv
2 | from pathlib import Path
3 | import struct
4 | from ..core.base58 import b58encode, b58decode
5 | from ..core.chacha20rng import ChaCha20Rng
6 | from ..core.weighted_index import WeightedIndex
7 |
8 |
9 | class LeaderSchedule:
10 |     def __init__(self, schedule, slots_per_rotation):
11 |         self.schedule = schedule
12 |         self.slots_per_rotation = slots_per_rotation
13 |
14 |     @staticmethod
15 |     def derive(epoch, stake_weights, rotations, slots_per_rotation) -> "LeaderSchedule":
16 |         stake_weights = stake_weights.copy()
17 |         stake_weights.sort(key=lambda x: (x[1], x[0]), reverse=True)
18 |         weighted_index = WeightedIndex(map(lambda x: x[1], stake_weights))
19 |
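20 |         rng = ChaCha20Rng(struct.pack("<Q", epoch) + b"\x00" * 24)  # assumed seed: LE epoch, zero-padded to 32 bytes
21 |         schedule = []
22 |         for _ in range(rotations):
23 |             leader = stake_weights[weighted_index.sample(rng)][0]
24 |             schedule.append(leader)
25 |         return LeaderSchedule(schedule, slots_per_rotation)
26 |
27 |     def lookup(self, slot):
28 |         if slot >= len(self.schedule) * self.slots_per_rotation: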
29 |             return None
30 |         return self.schedule[slot // self.slots_per_rotation]
31 |
32 |     def __iter__(self):
33 |         return map(
34 |             lambda x: self.lookup(x),
35 |             range(len(self.schedule) * self.slots_per_rotation),
36 |         )
37 |
38 |
39 | def test():
40 |     fixtures = Path(__file__).parents[1] / "fixtures"
41 |     epoch_stakes_path = fixtures / "epoch-stakes-mainnet-454.csv"
42 |     leader_schedule_path = fixtures / "leader-schedule-454.txt"
43 |
44 |     # Read list of epoch stakes
45 |     with open(epoch_stakes_path) as f:
46 |         rows = csv.reader(f)
47 |         next(rows)  # skip header
48 |         weights = list(map(lambda row: (b58decode(row[0]), int(row[1])), rows))
49 |
50 |     schedule = LeaderSchedule.derive(
51 |         epoch=454,
52 |         stake_weights=weights,
53 |         rotations=108000,
54 |         slots_per_rotation=4,
55 |     )
56 |     # Read expected leader schedule
57 |     with open(leader_schedule_path) as f:
58 |         expected_schedule = map(lambda s: b58decode(s.strip()), iter(f))
59 |         for i, pair in enumerate(zip(schedule, expected_schedule)):
60 |             got, expected = pair
61 |             assert got == expected, f"at {i} got: {got}, expected: {expected}"
62 |
63 |
64 | if __name__ == "__main__":
65 |     test()
66 |
--------------------------------------------------------------------------------
/solana_specs/core/base58.py:
--------------------------------------------------------------------------------
1 | BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
2 |
3 |
4 | def b58encode(buf: bytes) -> str:
5 |     b2 = int.from_bytes(buf, "big")
6 |     b58 = ""
7 |     while b2 > 0:
8 |         b58 = str(BASE58_ALPHABET[b2 % 58]) + b58
9 |         b2 //= 58
10 |     pad = 0
11 |     for b in buf:
12 |         if b == 0:
13 |             pad += 1
14 |         else:
15 |             break
16 |     return "1" * pad + b58
17 |
18 |
19 | def b58decode(s: str) -> bytes:
20 |     b58 = 0
21 |     j = 1
22 |     for c in reversed(s):
23 |         digit = BASE58_ALPHABET.find(c)
24 |         assert digit != -1
25 |         b58 += digit * j
26 |         j *= 58
27 |     b2 = b58.to_bytes((b58.bit_length() + 7) // 8, "big")
28 |     pad = len(s) - len(s.lstrip("1"))
29 |     return b"\x00" * pad + b2
30 |
31 |
32 | def test():
33 |     assert (
34 |         b58decode("11111111111111111111111111111111").hex()
35 |         == "0000000000000000000000000000000000000000000000000000000000000000"
36 |     )
37 |     assert (
38 |         b58decode("Config1111111111111111111111111111111111111").hex()
39 |         == "03064aa3002f74dcc86e43310f0c052af8c5da27f6104019a323efa000000000"
40 |     )
41 |     assert (
42 |         b58decode("Certusm1sa411sMpV9FPqU5dXAYhmmhygvxJ23S6hJ24").hex()
43 |         == "ad23766ddee6e99ca3340ee5beac0884c89ddbc74dfe248fea56135698bafdd1"
44 |     )
45 |
46 |
47 | if __name__ == "__main__":
48 |     test()
49 |
--------------------------------------------------------------------------------
/solana_specs/core/chacha20.py:
--------------------------------------------------------------------------------
1 | import struct
2 |
3 |
4 | def chacha20_quarter_round(state, a, b, c, d):
5 |     state[a] = (state[a] + state[b]) & 0xFFFF_FFFF
6 |     state[d] = state[d] ^ state[a]
7 |     state[d] = ((state[d] << 16) & 0xFFFF_FFFF) | (state[d] >> 16)
8 |     state[c] = (state[c] + state[d]) & 0xFFFF_FFFF
9 |     state[b] = state[b] ^ state[c]
10 |     state[b] = ((state[b] << 12) & 0xFFFF_FFFF) | (state[b] >> 20)
11 |     state[a] = (state[a] + state[b]) & 0xFFFF_FFFF
12 |     state[d] = state[d] ^ state[a]
13 |     state[d] = ((state[d] << 8) & 0xFFFF_FFFF) | (state[d] >> 24)
14 |     state[c] = (state[c] + state[d]) & 0xFFFF_FFFF
15 |     state[b] = state[b] ^ state[c]
16 |     state[b] = ((state[b] << 7) & 0xFFFF_FFFF) | (state[b] >> 25)
17 |
18 |
19 | def chacha20_block(key, idx, nonce):
20 |     key_parts = list(struct.unpack("<8I", key))
21 |     nonce_parts = list(struct.unpack("<3I", nonce))
22 |
23 |     state_pre = [0x61707865, 0x3320646E, 0x79622D32, 0x6B206574]
24 |     state_pre += key_parts
25 |     state_pre += [idx]
26 |     state_pre += nonce_parts
27 |
28 |     state = state_pre.copy()
29 |
30 |     for _ in range(10):
31 |         chacha20_quarter_round(state, 0, 4, 8, 12)
32 |         chacha20_quarter_round(state, 1, 5, 9, 13)
33 |         chacha20_quarter_round(state, 2, 6, 10, 14)
34 |         chacha20_quarter_round(state, 3, 7, 11, 15)
35 |         chacha20_quarter_round(state, 0, 5, 10, 15)
36 |         chacha20_quarter_round(state, 1, 6, 11, 12)
37 |         chacha20_quarter_round(state, 2, 7, 8, 13)
38 |         chacha20_quarter_round(state, 3, 4, 9, 14)
39 |
40 |     for i in range(16):
41 |         state[i] = (state[i] + state_pre[i]) & 0xFFFF_FFFF
42 |     return struct.pack("<16I", *state)
43 |
44 |
45 | def test():
46 |     key = bytes.fromhex(
47 |         "000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f"
48 |     )
49 |     nonce = bytes.fromhex("000000090000004a00000000")
50 |     idx = 1
51 |     expected = bytes.fromhex(
52 |         "10f1e7e4d13b5915500fdd1fa32071c4c7d1f4c733c068030422aa9ac3d46c4ed2826446079faa0914c2d705d98b02a2b5129cd1de164eb9cbd083e8a2503c4e"
53 |     )
54 |     actual = chacha20_block(key, idx, nonce)
55 |     assert actual == expected
56 |
57 |
58 | if __name__ == "__main__":
59 |     test()
60 |
--------------------------------------------------------------------------------
/solana_specs/core/chacha20rng.py:
--------------------------------------------------------------------------------
1 | import struct
2 | from .chacha20 import chacha20_block
3 |
4 |
5 | class ChaCha20Rng:
6 |     def __init__(self, key):
7 |         self.key = key
8 |         self.buf = bytearray()
9 |         self.idx = 0
10 |
11 |     def refill(self):
12 |         self.buf += chacha20_block(self.key, self.idx, b"\x00" * 12)
13 |         self.idx += 1
14 |
15 |     def next_u64(self):
16 |         if len(self.buf) < 8:
17 |             self.refill()
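18 |         ret = struct.unpack("<Q", self.buf[:8])[0]
19 |         self.buf = self.buf[8:]
20 |         return ret
21 |
22 |     def roll_u64(self, n):
23 |         # Assumption: rejection zone as in widening-multiply uniform sampling
24 |         zone = 2**64 - 1 - (2**64 - n) % n
25 |         while True:
26 |             v = self.next_u64()
27 |             res = v * n
28 |             hi = res >> 64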
29 |             lo = res & (2**64 - 1)
30 |             if lo <= zone:
31 |                 return hi
32 |
33 |
34 | def test():
35 |     key = bytes.fromhex(
36 |         "000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f"
37 |     )
38 |     rng = ChaCha20Rng(key)
39 |     assert rng.next_u64() == 0x6A19C5D97D2BFD39
40 |     for _ in range(100000):
41 |         rng.next_u64()
42 |     assert rng.next_u64() == 0xF4682B7E28EAE4A7
43 |
44 |
45 | if __name__ == "__main__":
46 |     test()
47 |
--------------------------------------------------------------------------------
/solana_specs/core/poh.py:
--------------------------------------------------------------------------------
1 | """
2 | Python pseudocode implementation of Solana proof-of-history.
3 | """
4 |
5 | import hashlib
6 |
7 |
8 | class Poh:
9 |     def __init__(self, seed=bytes(32)):
10 |         self.state = bytes(seed)
11 |
12 |     def append(self):
13 |         msg = hashlib.sha256()
14 |         msg.update(self.state)
15 |         self.state = msg.digest()
16 |
17 |     def mixin(self, mixin):
18 |         assert len(mixin) == 32
19 |         msg = hashlib.sha256()
20 |         msg.update(self.state)
21 |         msg.update(mixin)
22 |         self.state = msg.digest()
23 |
24 |
25 | def test():
26 |     poh = Poh()
27 |     for _ in range(42):
28 |         poh.append()
29 |     poh.mixin(b"WAO.............................")
30 |     assert (
31 |         poh.state.hex()
32 |         == "18a244914fc9d21673ed92fc9edfbc4b00a9d630af352e0d8a4cac5846a344ce"
33 |     )
34 |
35 |
36 | if __name__ == "__main__":
37 |     test()
38 |
--------------------------------------------------------------------------------
/solana_specs/core/weighted_index.py:
--------------------------------------------------------------------------------
1 | from bisect import bisect
2 | from .chacha20rng import ChaCha20Rng
3 |
4 |
5 | class WeightedIndex:
6 |     def __init__(self, weights):
7 |         self.cumulative_weight = 0
8 |         self.indexed_weights = []
9 |         for weight in weights:
10 |             self.indexed_weights.append((self.cumulative_weight, weight))
11 |             self.cumulative_weight += weight
12 |
13 |     def lookup(self, x):
14 |         if x >= self.cumulative_weight:
15 |             return None
16 |         return bisect(self.indexed_weights, x, key=lambda x: x[0]) - 1
17 |
18 |     def sample(self, rng):
19 |         return self.lookup(rng.roll_u64(self.cumulative_weight))
20 |
21 |
22 | def test():
23 |     assert WeightedIndex([1]).lookup(0) == 0
24 |     assert WeightedIndex([1]).lookup(1) is None
25 |     assert WeightedIndex([2, 3, 2]).lookup(0) == 0
26 |     assert WeightedIndex([2, 3, 2]).lookup(1) == 0
27 |     assert WeightedIndex([2, 3, 2]).lookup(2) == 1
28 |     assert WeightedIndex([2, 3, 2]).lookup(5) == 2
29 |     assert WeightedIndex([2, 3, 2]).lookup(6) == 2
30 |     assert WeightedIndex([2, 3, 2]).lookup(7) is None
31 |
32 |
33 | if __name__ == "__main__":
34 |     test()
35 |
--------------------------------------------------------------------------------
/solana_specs/fixtures/epoch-stakes-mainnet-454.csv:
--------------------------------------------------------------------------------
1 | version https://git-lfs.github.com/spec/v1
2 | oid sha256:d795fb47993e302f545c179ecaae951d611cca9f3fb6c38ddb9ea7bc7bd0273c
3 | size 193327
4 |
--------------------------------------------------------------------------------
/solana_specs/fixtures/leader-schedule-454.txt:
--------------------------------------------------------------------------------
1 | version https://git-lfs.github.com/spec/v1
2 | oid sha256:abb21c997cd9d0220d6fcdaad166ddd5b0da46a604c8fd85db5c92ea44e8854e
3 | size 19386260
4 |
--------------------------------------------------------------------------------