├── builder-onboarding.md
├── img
│   ├── nonopt.png
│   ├── v1.png
│   ├── v2.png
│   ├── v3.png
│   └── v4.png
├── proposal.md
└── stale
    ├── img
    │   ├── nonopt.png
    │   ├── v1.png
    │   ├── v2.png
    │   └── v3.png
    └── towards-epbs.md
/builder-onboarding.md:
--------------------------------------------------------------------------------
1 | # Optimistic relaying—builder guide
2 |
3 | Thank you for your interest in low-latency optimistic relaying with [the ultra sound relay](https://relay.ultrasound.money)! This document is an onboarding guide for you, Ethereum block builder. Please take the time to understand it :)
4 |
5 | **TLDR**
6 |
7 | >1. Share with us over Telegram or Discord the list of builder pubkeys you want promoted for optimistic relaying. We will manually review recent bid submissions from those pubkeys to ensure a low historical rate of bad bids. A bad bid is one with an invalid block or an insufficient payment to the proposer.
8 | >2. Post a maximum of 1 ETH collateral to `relay.ultrasound.eth` and share the transaction details with us. The transaction sender must be an address publicly associated with one of your builder pubkeys, ideally your primary fee recipient address.
9 | >3. The relay will automatically demote you for submitting a single bad bid to the relay. You will only be re-promoted after the underlying reason for submitting a bad bid is addressed.
10 | >4. A bad bid that wins the auction and is signed by the proposer will cause an on-chain incident, i.e. a missed slot or an insufficient proposer payment. We expect you to directly compensate the proposer with the bid value plus a fixed 0.01 ETH penalty within 24 hours and send us the transaction details.
11 | >5. Without receiving proof the proposer was compensated within 24 hours we may use your collateral to compensate the proposer ourselves.
12 |
13 | ### Purpose
14 |
15 | This document outlines key aspects of optimistic relaying with the ultra sound relay. Optimistic relaying allows builders to significantly reduce the latency of their bid submissions through asynchronous simulation. For more detail see [the proposal](https://github.com/michaelneuder/opt-relay-docs/blob/main/proposal.md), [the implementation](https://github.com/flashbots/mev-boost-relay/pull/285), and the discussion in [MEV community call #0](https://collective.flashbots.net/t/mev-boost-community-call-0-23-feb-2023/1348).
16 |
17 | ### Optimistic logic
18 |
19 | The optimistic relay implementation adds three DB fields to every builder pubkey:
20 |
21 | 1. `is_optimistic`: This boolean, which defaults to `false`, indicates whether or not the pubkey is eligible for optimistic relaying. Promoting a builder pubkey for optimistic relaying is a manual process. When a builder submits a bad bid `is_optimistic` is reset to `false` before moving to the next slot, with demotion details recorded in the relay DB.
22 | 2. `collateral`: This integer reflects the collateral value in wei backstopping the value of optimistic bids. That is, optimistic relaying happens when `is_optimistic` is `true` and `collateral` is at least as large as the bid value.
23 | 3. `builder_id`: This string is used to share collateral across multiple pubkeys. The demotion of a pubkey will result in the simultaneous demotion of all pubkeys sharing the same builder ID.
24 |
25 | Consider the example below:
26 |
27 | ```
28 | builder_pubkey | is_optimistic | collateral | builder_id
29 | ----------------+---------------+--------------------+------------------
30 | 0xaaaaaa... | true | 990000000000000000 | mike
31 | 0xbbbbbb... | true | 990000000000000000 | mike
32 | 0xcccccc... | false | 990000000000000000 | flashbots
33 | 0xdddddd... | false | 0 | bloxroute
34 | ```
35 |
36 | Pubkeys `0xaaaaaa` and `0xbbbbbb` share the same builder ID `mike` and collateral of 0.99 ETH. (0.99 ETH is the maximum 1 ETH collateral minus 0.01 ETH for the fixed penalty.) Since `is_optimistic` is `true` any bid with a value less than or equal to 0.99 ETH will be relayed optimistically. A larger bid, e.g. with 10 ETH of value, will not be relayed optimistically. If either pubkey submits an invalid bid both pubkeys will be demoted before the next slot.
37 |
38 | Pubkey `0xcccccc` also has 0.99 ETH of collateral but `is_optimistic` is `false` so their bids will not be relayed optimistically. Builder `0xdddddd` has no collateral so their bids will also not be relayed optimistically.
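
The eligibility rule and the shared demotion described above can be sketched in a few lines of Python. This is an illustrative model only, not the relay's code: the `BuilderEntry` class and the `is_relayed_optimistically` and `demote` helpers are hypothetical names, and the rows simply mirror the example table.

```python
from dataclasses import dataclass

WEI_PER_ETH = 10**18


@dataclass
class BuilderEntry:
    """Hypothetical in-memory mirror of the three DB fields per pubkey."""
    pubkey: str
    is_optimistic: bool
    collateral: int  # in wei
    builder_id: str


def is_relayed_optimistically(entry: BuilderEntry, bid_value_wei: int) -> bool:
    # Optimistic relaying needs the flag set AND collateral covering the bid.
    return entry.is_optimistic and entry.collateral >= bid_value_wei


def demote(entries: list, offending_pubkey: str) -> None:
    # A bad bid demotes every pubkey that shares the offender's builder ID.
    offender = next(e for e in entries if e.pubkey == offending_pubkey)
    for e in entries:
        if e.builder_id == offender.builder_id:
            e.is_optimistic = False


entries = [
    BuilderEntry("0xaaaaaa", True, 990000000000000000, "mike"),
    BuilderEntry("0xbbbbbb", True, 990000000000000000, "mike"),
    BuilderEntry("0xcccccc", False, 990000000000000000, "flashbots"),
    BuilderEntry("0xdddddd", False, 0, "bloxroute"),
]
```

With these rows, a 0.99 ETH bid from `0xaaaaaa` passes the check, a 10 ETH bid does not, and calling `demote(entries, "0xbbbbbb")` clears `is_optimistic` for both `mike` pubkeys.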
39 |
40 | ### Collateral
41 |
42 | Collateral for optimistic relaying must be posted to `relay.ultrasound.eth` from an address publicly associated with one of your builder pubkeys, ideally your primary fee recipient address. The maximum collateral per pubkey is currently 1 ETH—this value may be increased or decreased from time to time. Please contact us if you wish to stop optimistic relaying and have your collateral returned.
43 |
44 | ### Builder ID
45 |
46 | For collateral efficiency you may reuse the same piece of collateral for multiple builder pubkeys. Let us know which pubkeys you want to share a builder ID. The demotion of a pubkey will result in the demotion of all pubkeys sharing the same builder ID.
47 |
48 | ### Promotions and demotions
49 |
50 | We will manually promote your pubkeys by setting `is_optimistic` to `true` after collateral is posted and you have indicated readiness for optimistic relaying. For transparency we intend to publicly disclose `is_optimistic`, `collateral`, and `builder_id` for every pubkey.
51 |
52 | When a bad bid is submitted, even if the bid does not get signed by the proposer, the demotion logic resets `is_optimistic` to `false` before the next slot. Only after the root cause of a demotion is understood and fixed can we manually reset `is_optimistic` to `true`.
53 |
54 | ### On-chain incidents
55 |
56 | An on-chain incident, i.e. a missed slot or an insufficient proposer payment, will likely occur if a bad bid wins the auction and is signed by the proposer. (There are exceptional edge cases where an on-chain incident may not happen, including reorgs and proposer double signing.) A proposer that suffers an on-chain incident due to a bad bid must be made whole by the builder, who pays the full bid value plus 0.01 ETH. The fixed 0.01 ETH penalty attempts to cover missed consensus rewards as well as accounting hassle from the delayed payment.
57 |
58 | Note that we expect you, within 24 hours of the demotion, to directly send ETH to the proposer's fee recipient to compensate for a bad bid leading to an on-chain incident. Please share with us details of the corresponding transaction. Without proof the proposer was compensated within 24 hours we may use your collateral to compensate the proposer ourselves. For transparency we plan to publish a public post-mortem of every on-chain incident due to a bad bid.
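
As a worked example of the amount owed after an on-chain incident, the sketch below adds the fixed 0.01 ETH penalty to the bid value, working in wei to avoid floating-point rounding. The function name `compensation_owed_wei` is a hypothetical helper for illustration.

```python
WEI_PER_ETH = 10**18
FIXED_PENALTY_WEI = WEI_PER_ETH // 100  # the fixed 0.01 ETH penalty


def compensation_owed_wei(bid_value_wei: int) -> int:
    # The proposer is owed the full bid value plus the fixed penalty.
    return bid_value_wei + FIXED_PENALTY_WEI


# A bad 0.5 ETH bid means 0.51 ETH is owed to the proposer.
half_eth_case = compensation_owed_wei(WEI_PER_ETH // 2)

# The largest optimistic bid under 1 ETH of posted collateral is 0.99 ETH,
# so the amount owed in that case is exactly 1 ETH.
max_case = compensation_owed_wei(99 * WEI_PER_ETH // 100)
```

The second case shows why the usable collateral is 0.99 ETH: the remaining 0.01 ETH is reserved for the penalty.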
59 |
--------------------------------------------------------------------------------
/img/nonopt.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/michaelneuder/optimistic-relay-documentation/4fb032e92080383b7b5d8af5675ef2bf9855adc3/img/nonopt.png
--------------------------------------------------------------------------------
/img/v1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/michaelneuder/optimistic-relay-documentation/4fb032e92080383b7b5d8af5675ef2bf9855adc3/img/v1.png
--------------------------------------------------------------------------------
/img/v2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/michaelneuder/optimistic-relay-documentation/4fb032e92080383b7b5d8af5675ef2bf9855adc3/img/v2.png
--------------------------------------------------------------------------------
/img/v3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/michaelneuder/optimistic-relay-documentation/4fb032e92080383b7b5d8af5675ef2bf9855adc3/img/v3.png
--------------------------------------------------------------------------------
/img/v4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/michaelneuder/optimistic-relay-documentation/4fb032e92080383b7b5d8af5675ef2bf9855adc3/img/v4.png
--------------------------------------------------------------------------------
/proposal.md:
--------------------------------------------------------------------------------
1 | # Optimistic Relay Proposal
2 |
3 | ## Purpose
4 |
5 | This document introduces the concept of the Optimistic Relay to accompany https://github.com/flashbots/mev-boost-relay/pull/285, a small
6 | PR that adds the functionality to Flashbots' `mev-boost-relay`. While the PR gets into the details of *how* the optimistic feature is added,
7 | this document aims at motivating the change in the broader context of `mev-boost` and the existing Ethereum literature on Proposer-Builder Separation (PBS).
8 |
9 | ## mev-boost today
10 |
11 | `mev-boost` has become critical infrastructure since it was introduced by Flashbots. There are a number of excellent data sources to demonstrate this:
12 |
13 | - https://www.mevboost.org/
14 | - https://mevboost.pics/
15 | - https://transparency.flashbots.net/
16 |
17 | tl;dr: >90% of validators are using `mev-boost` to outsource block building.
18 |
19 | The Flashbots team has continued to engage with the community around the future of the software:
20 |
21 | - https://collective.flashbots.net/t/toward-an-open-research-and-development-process-for-mev-boost/464
22 | - https://collective.flashbots.net/t/mev-boost-development-philosophy/505
23 | - https://collective.flashbots.net/t/development-next-steps-for-pbs-roundtable-at-devcon/438
24 |
25 | Part of that discussion is around how the community can continue iterating on the initial architecture to build understanding around what could/should
26 | be enshrined in the protocol.
27 |
28 | ## Proposer-Builder Separation (PBS)
29 |
30 | PBS has been extensively researched. See https://notes.ethereum.org/@domothy/pbs_links and https://github.com/michaelneuder/mev-bibliography for links to the literature and
31 | https://barnabe.substack.com/p/pbs for a thorough overview of the current research landscape.
32 |
33 | ## Optimistic Relay: phase 1
34 |
35 | The Optimistic Relay is an idea from Justin Drake to help bridge the gap between the research and the current `mev-boost` software, with the
36 | goal of building understanding about how the mechanisms proposed in the PBS literature behave in practice.
37 | This aims to be a step towards understanding what (if anything) should be enshrined in the protocol.
38 |
39 | The primary difference between `mev-boost` and in-protocol PBS (IP-PBS) is the presence of the relay. The relay is a trusted intermediary between the block builders and
40 | the proposers, while in IP-PBS other validators enforce the rules of the block building auction through attestations. For this reason, phase 1 of the Optimistic Relay reduces the role of the relay in block production.
41 |
42 | #### Asynchronous block validation
43 | The key idea in phase 1 is to change the builder block submission flow to make block validation an asynchronous process. When the builder submits a block
44 | to the relay, that bid is immediately eligible to win the auction, even if the relay hasn't had a chance to check that the block is valid. This allows builders
45 | to submit more blocks, and importantly to submit blocks later in the slot (because they don't have to wait for the block to be validated for the bid to register).
46 | We call it "optimistic" because the relay assumes that the block is valid in the short-term, while deferring the actual validation to a later time. A
47 | builder must post collateral (or have a guarantor) with the relay to take advantage of this optimistic processing of their blocks, and if they submit an invalid block they are "demoted"
48 | back to the current `mev-boost` implementation where each block is validated before the bid is considered eligible. Critically, if a builder submits an
49 | invalid block that ends up winning the auction and being proposed, the validator who missed their slot is refunded the amount of the winning bid.
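
The difference between the two submission paths can be illustrated with a small, single-threaded Python model. This is a sketch under stated assumptions rather than the relay implementation: `OptimisticRelaySketch` and its methods are hypothetical names, and the `block_is_valid` flag stands in for the result of a real block simulation.

```python
from collections import deque


class OptimisticRelaySketch:
    """Toy model: optimistic bids are eligible first and validated later."""

    def __init__(self, optimistic_builders):
        self.optimistic = set(optimistic_builders)
        self.eligible_bids = []  # bids that can currently win the auction
        self.pending = deque()   # optimistic bids awaiting validation
        self.demoted = []        # builders demoted after a bad bid

    def submit(self, builder, value, block_is_valid):
        bid = (builder, value, block_is_valid)
        if builder in self.optimistic:
            # Optimistic path: eligible immediately, validation is deferred.
            self.eligible_bids.append(bid)
            self.pending.append(bid)
            return True
        # Non-optimistic path: validate before the bid becomes eligible.
        if block_is_valid:
            self.eligible_bids.append(bid)
            return True
        return False

    def run_deferred_validation(self):
        # Runs off the latency-critical path; a bad block demotes the builder.
        while self.pending:
            builder, _value, ok = self.pending.popleft()
            if not ok and builder in self.optimistic:
                self.optimistic.discard(builder)
                self.demoted.append(builder)
```

In this model an optimistic builder's invalid bid is accepted at submission time, the deferred validation later demotes that builder, and its subsequent submissions go through the validate-first path. Refund handling is deliberately left out of the sketch.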
50 |
51 | #### A builder's perspective
52 |
53 | The sequence diagram below shows the block building pipeline of regular `mev-boost`:
54 |
55 |
56 | The builder's block must be validated before the proposer calls `getHeader` on the relay at the beginning of the slot. In this diagram, Δ denotes the amount
57 | of time that it takes the relay to validate the block. Thus the builder's submission must arrive at least Δ before the start of the next slot for the bid to be
58 | valid. Compare this to the optimistic block building pipeline:
59 |
60 |
61 |
62 | Here the builder immediately gets a response indicating that their bid is active and the relay might not simulate the block until the payload has already been
63 | published as a full beacon block. A builder is incentivized to use optimistic relaying because they no longer have to pay the Δ time penalty during their
64 | block submission. MEV is a game of small margins, and speed is critical for successful builders. To demonstrate that consider the following situation:
65 |
66 |
67 | In this case, the builder knows that their submission needs to reach the relay before time 12-Δ. Right after they submit their block, a large MEV opportunity arises.
68 | They can try to submit a new block that captures more MEV and thus has a higher probability of winning the auction, but it is too late as
69 | the simulation will not complete before t=12. Intuitively, the longer the builder can wait before submitting their block, the longer they have to listen for transactions and thus the higher their bid.
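
The deadline argument can be made concrete with a short calculation. The sketch below assumes a 12 second slot, ignores network and decoding latency, and uses a purely hypothetical Δ of 300 ms; `latest_submission_ms` is an illustrative helper.

```python
SLOT_MS = 12_000  # 12 second slot, expressed in milliseconds


def latest_submission_ms(validation_delay_ms: int, optimistic: bool) -> int:
    # Non-optimistic bids must arrive at least Δ before the end of the slot;
    # optimistic bids skip the Δ wait because validation happens later.
    return SLOT_MS if optimistic else SLOT_MS - validation_delay_ms


delta_ms = 300  # hypothetical validation time, not a measured value
regular_deadline = latest_submission_ms(delta_ms, optimistic=False)
optimistic_deadline = latest_submission_ms(delta_ms, optimistic=True)
extra_listening_ms = optimistic_deadline - regular_deadline
```

Under these assumptions the optimistic builder gains exactly Δ of additional time to listen for transactions before submitting.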
70 |
71 | #### A proposer's perspective
72 | The proposer's perspective is almost identical to the status quo of `mev-boost`, with the slight change that they no longer have a guarantee that the header
73 | they sign corresponds to a valid block. They still call `getHeader` at the beginning of the slot and sign the header they receive.
74 | The trust assumptions on the relay remain the same. If the block they end up proposing turns out to be invalid
75 | they must trust the relay to refund them for their lost slot, which is the same as trusting the relay to validate that the block includes a payment corresponding to
76 | the bid amount in the base `mev-boost` case.
77 |
78 | ## Security considerations
79 |
80 | There were a number of concerns raised in https://github.com/michaelneuder/mev-boost-relay/pull/2 that we would like to address. Thanks to Chris Hager, Alex Stokes, and Mateusz Morusiewicz for this initial feedback!
81 |
82 | #### Relay collateralization
83 | Relay operators who manage collateral to enable optimistic processing are accountable for the additional legal and operational complexity of managing these funds. This has been raised as a significant concern by some parties.
84 | The solution we propose for the Ultra Sound Relay is a builder-guarantor approach. The relay acts as an intermediary to determine if/when a builder bug results in a missed slot, but the
85 | expected outcome is that once the issue is brought to the builder, they directly refund the proposer who missed the slot. Since builder reputation far exceeds
86 | the monetary value of the missed slots, we expect the vast majority of refunds to process smoothly in this fashion. In the exceptional case where a builder stops
87 | responding or decides to withhold the refund, the builder-guarantor intervenes and refunds the proposer. The guarantor for the offending builder will likely remove
88 | their remaining collateral for that builder, and thus the builder will be back to non-optimistic processing. Additionally, for each refundable event, we plan
89 | on publishing a postmortem publicly explaining what happened. The format for the postmortem will be: (a) a timeline of events, (b) the builder and proposer involved, (c) the bid that resulted in the missed slot, (d) the error associated with the bid, (e) the refund details.
90 | This information will provide transparency into the relay operation and allow us to monitor common builder failure modes.
91 | An additional benefit of this approach is the ability
92 | for guarantors to be separate entities from the relay itself. For example, the Ultra Sound Relay is willing to act as a guarantor for trusted builders for other smaller relays
93 | to remove any overhead of managing collateral and refunds. To bootstrap this process, the Ultra Sound Relay is willing to act as a 1 ETH guarantor for the following builders:
94 | ```
95 | * builder0x69
96 | * flashbots
97 | * beaver
98 | * bloxroute
99 | * manta
100 | * blocknative
101 | * rsync
102 | * eden
103 | * eth-builder
104 | * buildai
105 | * payload
106 | * nfactorial
107 | * s[0-9].+
108 | * lightspeed
109 | * manifold
110 | * Rick Astley
111 | ```
112 | New builders who are willing to engage with the Ultra Sound Relay team and demonstrate high-fidelity block production can get added to this list!
113 |
114 | #### Missed slots (liveness-attack)
115 | This proposal increases the risk of missed slots, but only marginally. Because the relay doesn't validate the block, the proposer could end up signing a bad header. The proposer is
116 | unable to rectify the situation by signing a valid block, because signing two blocks of the same height is a slashing condition. However, we implemented it so that there will be
117 | at most one missed slot per optimistic builder before a manual intervention. By default, all builders are treated non-optimistically and the process of posting collateral is manual.
118 | If that builder ends up submitting a bad block, a single slot will be missed and we will manually investigate the failure, initiate the refund, and
119 | communicate with the builder. Only once we have high confidence that we understand what went wrong and why it won't happen again will we allow the builder to re-post collateral and have access
120 | to optimistic processing once again. Additionally, we only optimistically process blocks that have a value less than the collateral posted for a builder. The builder is free to submit blocks with huge MEV that exceeds their collateral; we will just validate those bids synchronously because the collateral can't cover the refund amount. Lastly, we have the benefit of rolling this change out slowly. By starting with just the Ultra Sound Relay we can continually monitor the number
121 | of missed slots caused by this change. If we ever determine that it exceeds what we are comfortable with, we can simply turn it off.
122 |
123 | #### Collusion
124 | Consider the case where the proposer and the builder are the same malicious actor. The builder could submit a large bid with an invalid block in order to ensure that they win the auction through the relay. This will result in an invalid block being proposed and the relay accounting system recording that the proposer is owed a large refund.
125 | However, the bid was collateralized by that same builder, so even though we issue a refund, the funds are coming from the same actor, so the actor is not earning any extra profit. The only downside is that the relay team had to manually issue a refund, but we anticipate this to be a rare occurrence, and
126 | by only onboarding trusted builders, we minimize this toil. Additionally, the proposer didn't accomplish anything other than skipping their slot, which they could
127 | do by just turning off their machine in the first place.
128 |
129 | #### Incentive compatibility
130 |
131 | The builder is incentivized to use the optimistic relay honestly because it strictly increases the amount of MEV they are able to produce. Submitting an invalid
132 | bid results in them losing their collateral, which is not a rational choice.
133 |
134 | The proposer is incentivized to use the optimistic relay because it increases the size of the bids against their slot. Larger bids mean more value extracted
135 | for the proposer themselves.
136 |
137 | The relay remains a neutral party that serves as an escrow mechanism, but does not receive any rewards.
138 |
139 | #### Moral hazard
140 | The more subtle argument is that this change introduces a moral hazard. Missed slots impact the entire chain, and while we acknowledge that there is a chance of a few extra missed slots, we plan on approaching
141 | this change conservatively. By only allow-listing trusted builders initially, we ensure that there will be no runaway amount of missed slots (again only one
142 | per-builder per-manual intervention). We will actively collect data around any invalid blocks proposed and work with builders to understand what caused invalid block submissions. Public post-mortems will increase transparency and allow community feedback and engagement around the project as a whole. The builders
143 | are highly incentivized to avoid submitting invalid blocks and thus will want to know what is going wrong with their blocks if something fails.
144 |
145 | ## Learnings from Goerli
146 |
147 | #### Brittle nature of duplicate simulations
148 |
149 | In our first implementation, we simulated optimistic blocks that won the auction twice: (1) in the async call triggered by the block submission (2) in `getPayload` to check
150 | if the winning block was valid. While this works in theory, we found it to be very brittle to simulate the same block twice as a number of errors from race
151 | conditions between the parallel block submissions arose. Therefore we refactored the optimistic relay to only simulate each block a single time. In `getPayload` we make
152 | use of the waitgroup for optimistic blocks. Once we know all the blocks for the slot have been processed, we check the demotions table for the winning block.
153 | If it is there, then we know that an invalid block was delivered to the proposer, and thus a refund is necessary. This triggers an update on the demotions table
154 | where we insert the `SignedBeaconBlock` and `SignedValidatorRegistration` to ensure all the relevant data is in one place.
155 | This new design is presented in https://github.com/flashbots/mev-boost-relay/pull/285.
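
The single-simulation design can be sketched as follows. The relay itself is written in Go and uses a waitgroup; this Python analogue uses threads and `join` instead, so it illustrates the idea rather than the actual implementation. `SlotSimulations`, `simulate_async`, and `get_payload_needs_refund` are hypothetical names, the `is_valid` flag stands in for a real simulation result, and the demotions table is modelled as a dictionary.

```python
import threading


class SlotSimulations:
    """Tracks the asynchronous simulations started during one slot."""

    def __init__(self):
        self._threads = []
        self._lock = threading.Lock()
        self.demotions = {}  # block hash -> error, stand-in for the DB table

    def simulate_async(self, block_hash, is_valid):
        def work():
            # Each block is simulated exactly once; failures are recorded.
            if not is_valid:
                with self._lock:
                    self.demotions[block_hash] = "simulation failed"

        thread = threading.Thread(target=work)
        thread.start()
        self._threads.append(thread)

    def wait(self):
        # Plays the role of waiting on the waitgroup for optimistic blocks.
        for thread in self._threads:
            thread.join()


def get_payload_needs_refund(sims, winning_block_hash):
    # Wait until every block for the slot has been processed, then look up
    # the winning block in the demotions table instead of re-simulating it.
    sims.wait()
    return winning_block_hash in sims.demotions
```

Because the winner is looked up rather than simulated a second time, the check cannot race with the parallel block submissions.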
156 |
157 | #### Payload too large errors
158 | We found that many blocks returned an error of `Payload too large`. After investigation, we realized that the max size of the payload for block submissions to the `prio-load-balancer` was too small. This PR updates the max size to 2MB: https://github.com/flashbots/prio-load-balancer/commit/d4d171c3fcda4e4ca49075031efa44065c4f9a5a.
159 |
160 | #### Performance analysis
161 | Since the optimistic relay is mainly about improving the performance of the block submission flow to allow builders to submit blocks later into the slot, we
162 | collected performance data from the Goerli relay. The table below contains various percentiles of the distribution of the number of microseconds for the three stages of the submit block flow:
163 |
164 |
165 |
166 |
167 | The middle stage, `simulation_duration`, is what the v1 optimistic relay eliminates by removing the simulation of the blocks from the fast path. The first stage, `decode_duration`, is another huge portion of the overall runtime of the block submission flow. This stage is the process of receiving the payload over the network. With an `8 MB/s` connection, we have `8 KB/ms`.
168 | The median decode time is `~80ms` which implies `640KB` blocks. This seems like a reasonable estimate. Additionally, the network latency is high variance.
169 | The following plot shows the correlation between decoding time and the size of the payload. We color the data points by the builder pubkey. Clearly, there
170 | is a positive correlation between the two, and different builders have different networking connections which also show up in the data.
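
The bandwidth estimate earlier in this section can be reproduced with a short calculation. It assumes decimal units (1 MB = 1000 KB) and that decode time is dominated by network transfer; `implied_payload_kb` is an illustrative helper.

```python
def implied_payload_kb(bandwidth_mb_per_s: int, decode_ms: int) -> int:
    # MB/s -> KB/ms: multiply by 1000 KB per MB, divide by 1000 ms per s.
    kb_per_ms = bandwidth_mb_per_s * 1000 // 1000
    return kb_per_ms * decode_ms


# 8 MB/s is 8 KB/ms, so an 80 ms median decode implies a 640 KB payload.
median_payload_kb = implied_payload_kb(8, 80)
```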
171 |
172 |
173 |
174 | We also collected data around the builder bids over the course of a slot. The following figure shows the value of a bid as a function of time for slot 5771427 for 10 different builders:
175 |
176 |
177 |
178 | More broadly, we can observe the probability distributions over the slot duration of when the winning block arrived at the relay.
179 |
180 |
181 | Clearly, the later seconds in the slot have a much higher probability of containing the winning bid. This follows directly from the fact that more MEV will occur
182 | during the slot, allowing later blocks to capture more and thus increase their bid size.
183 |
184 | ## FAQ by Justin Drake
185 |
186 | **Does an optimistic relay simulate blocks?**
187 |
188 | _Yes, an optimistic relay simulates all blocks that could have been forwarded to a proposer. Simulation just happens away from the latency-critical path._
189 |
190 | **Can optimistic relaying lead to mass missed slots?**
191 |
192 | _Assuming no relay bug, the worst case is one missed slot per collateralised builder (prior to reactivation). The beacon chain is designed to handle missed slots._
193 |
194 | **Are builders incentivised to produce invalid blocks?**
195 |
196 | _The builder of an invalid winning block suffers a financial loss. Moreover, a single invalid block will disable optimistic relaying for the builder, yielding a latency disadvantage pending reactivation._
197 |
198 | **What recommendations do you have for optimistic relay operators?**
199 |
200 | _We have several recommendations for optimistic relay operators:_
201 |
202 | * _alerts – set up automatic alerts (e.g. email or phone call) for demotions_
203 | * _refunds – promptly transfer (e.g. within 24 hours) the bid value plus beacon chain penalties and missed rewards to the proposer fee recipient_
204 | * _max collateral – cap the maximum collateral amount per builder (e.g. to 10 ETH) to keep a level playing field_
205 | * _investigation – manually investigate demotions and ask builders to fix block building bugs before reactivating optimistic relaying_
206 | * _cool-off – impose a post-demotion cool-off period (e.g. 24 hours) before reactivating optimistic relaying_
207 | * _penalty – consider a fixed penalty (e.g. 0.1 ETH) per demotion, especially for repeat demotion_
208 |
209 |
--------------------------------------------------------------------------------
/stale/img/nonopt.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/michaelneuder/optimistic-relay-documentation/4fb032e92080383b7b5d8af5675ef2bf9855adc3/stale/img/nonopt.png
--------------------------------------------------------------------------------
/stale/img/v1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/michaelneuder/optimistic-relay-documentation/4fb032e92080383b7b5d8af5675ef2bf9855adc3/stale/img/v1.png
--------------------------------------------------------------------------------
/stale/img/v2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/michaelneuder/optimistic-relay-documentation/4fb032e92080383b7b5d8af5675ef2bf9855adc3/stale/img/v2.png
--------------------------------------------------------------------------------
/stale/img/v3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/michaelneuder/optimistic-relay-documentation/4fb032e92080383b7b5d8af5675ef2bf9855adc3/stale/img/v3.png
--------------------------------------------------------------------------------
/stale/towards-epbs.md:
--------------------------------------------------------------------------------
1 | # Towards Enshrined PBS — An Optimistic Roadmap
2 |
3 | ## Purpose
4 | Present a roadmap towards Enshrined PBS (ePBS) through a series of modifications
5 | to the existing `mev-boost` [relay](https://github.com/flashbots/mev-boost-relay)
6 | functionality. By progressively removing the relay responsibilities,
7 | we aim to converge to a system that looks quite similar to [existing](https://ethresear.ch/t/two-slot-proposer-builder-separation/10980) [proposals](https://ethresear.ch/t/single-slot-pbs-using-attesters-as-distributed-availability-oracle/11877) for ePBS.
8 |
9 | ### Rationale
10 | 1. **Agility** — We aim to approach this protocol upgrade as suggested by Justin in the [Censorship Panel](https://www.youtube.com/watch?v=Z9VCdiSPJEQ&t=2729s) at SBC. To front-load the
11 | R&D effort, we can iterate quickly by experimenting with a portion of existing
12 | relays, builders, and validators to reduce uncertainty and risk around full ePBS.
13 | There are tradeoffs between ePBS and `mev-boost` as highlighted by Barnabé at [Devconnect](https://youtu.be/jQjBNbEv9Mg?t=943) and in [Notes on PBS](https://barnabe.substack.com/i/82304191/market-structure-and-allocation-mechanism). Through this roadmap we explore the
14 | design space between the two ends of the spectrum, without fully committing upfront to ePBS.
15 | 2. **Inevitability** — At [Devconnect](https://www.youtube.com/watch?v=OD54WfVuDWw&t=818s), Vitalik noted that with Danksharding, some builder separation becomes mandatory because bandwidth
16 | requirements for large blocks exceed what is within reach of a home-staker. This roadmap
17 | allows us to progress the research discussion by examining how different mechanisms work in practice.
18 | 3. **Accessibility** — By presenting relay data and designing the optimistic architecture in the open, we increase visibility into the
19 | existing block building market. This helps answer research questions (e.g.,
20 | [ROP-0](https://efdn.notion.site/ROP-0-Timing-games-in-Proof-of-Stake-385f0f6279374a90b52bf380ed76a85b)) and allows
21 | independent relays to stay competitive with vertically integrated Builder-Relays.
22 | Additionally, optimistic relaying actually reduces the hardware and networking
23 | [resources](https://collective.flashbots.net/t/ideas-for-incentivizing-relays/586) required for a relay to be competitive, lowering the barrier of entry.
24 |
25 |
26 |
27 |
28 | ## Block lifecycle
29 | The figure below schematizes the lifecycle of a block in the existing `mev-boost`
30 | architecture.
31 |
32 |
33 |
34 | 1. The builder submits a bid to the relay, which contains a header and an execution
35 | body.
36 | 2. The relay validates the block and asserts that it includes an appropriate payment
37 | to the proposer.
38 | 3. The proposer calls `getHeader` to receive the highest paying bid.
39 | 4. The proposer signs the header and calls `getPayload` which delivers the signed header
40 | to the relay.
41 | 5. The relay publishes the signed block to the p2p network.
42 |
43 |
44 | #### What is the relay doing?
45 | As implemented, the relay aims to be a simple, mutually-trusted, neutral third-party to
46 | connect builders and proposers. The relay duties and the corresponding trust assumptions are
47 |
48 | - Storing the builder header and execution body $\implies$ *Trust assumption 1: the builder trusts relay not to steal their MEV.*
49 | - Validating the body $\implies$ *Trust assumption 2: the proposer trusts the relay
50 | to provide a valid header to sign.*
51 | - Validating the payment $\implies$ *Trust assumption 3: the proposer trusts the relay
52 | to check that the block pays them.*
53 | - Publishing the winning block $\implies$ *Trust assumption 4: the builder trusts the relay to publish the winning block.\**
54 |
55 | > \* Note that the winning block is returned to the proposer,
56 | so even if the relay doesn't publish the block, the proposer should and is incentivized to do so. We still list it as a trust assumption because the builder doesn't trust the proposer,
57 | so in essence they still trust that the relay publishes their block if it wins the auction.
58 |
59 | ## Optimistic Relay v1 — "Efficient builder submissions"
60 | The figure below demonstrates the block lifecycle under Optimistic Relaying v1.
61 | This idea is [proposed](https://github.com/michaelneuder/opt-relay-docs/blob/main/proposal.md) and [implemented](https://github.com/flashbots/mev-boost-relay/pull/285); it
62 | was also discussed in the [MEV community call #0](https://collective.flashbots.net/t/mev-boost-community-call-0-23-feb-2023/1348).
63 |
64 |
65 |
66 |
67 | 1. The builder submits a bid to the relay, which contains a header and an execution
68 | body.
69 | 2. The proposer calls `getHeader` to receive the highest paying bid.
70 | 3. The proposer signs the header and calls `getPayload` which delivers the signed header
71 | to the relay.
72 | 4. The relay publishes the signed block to the p2p network.
73 |
74 |
75 |
76 | #### What is the relay doing?
77 | This flow differs only slightly from the previous; the relay does not immediately
78 | validate the block sent from the builder. This results in a reduction of
79 | the latency between when a builder submits a bid to the relay and when that bid
80 | becomes eligible to win the auction. This benefits both the builders and
81 | the proposers because it allows bids to arrive later in the slot, thus capturing more
82 | MEV for both parties. Under this design, builders can submit high bids for invalid blocks
83 | that end up winning the auction, which results in a missed slot (because the proposer
84 | signed an invalid header). We account for this by requiring builders to post collateral to
85 | the relay which will be used to refund proposers if a slot is missed. See the [proposal](https://github.com/michaelneuder/opt-relay-docs/blob/main/proposal.md), [implementation](https://github.com/flashbots/mev-boost-relay/pull/285), and [community call](https://collective.flashbots.net/t/mev-boost-community-call-0-23-feb-2023/1348) for further details.
86 | This proposal should lower the hardware and networking requirements of running a relay: because block simulation is handled asynchronously in the next slot, the relay no longer needs large amounts of burst compute and bandwidth.
87 |
88 | Beyond the practical benefits mentioned above, we also modified the relay duties and trust assumptions. Now, the relay is responsible for
89 |
90 | - Storing the builder header and execution body $\implies$ *Trust assumption 1: the builder trusts the relay not to steal their MEV.*
91 | - ~~Validating the body $\implies$ *Trust assumption 2: the proposer trusts the relay
92 | to provide a valid header to sign.*~~
93 | - ~~Validating the payment $\implies$ *Trust assumption 3: the proposer trusts the relay
94 | to check that the block pays them.*~~
95 | - Publishing the winning block $\implies$ *Trust assumption 4: the builder trusts the relay to publish the winning block.*
96 | - **[new]** Refunding proposers who signed an invalid header $\implies$ *Trust assumption 5: the proposer trusts the relay to refund them in the case of an invalid block.*
97 |
98 | This demonstrates the main objective of this roadmap — "to reduce the duties and trust assumptions associated with relays".
99 |
100 | > Optimistic Relay v1.5 — "Header-only parsing". One minor modification of v1 is
101 | to further optimize the block submission flow by making a bid eligible to win the
102 | auction immediately upon receipt of the header. Since the relay already asynchronously
103 | validates the body, we save extra critical milliseconds by parsing the stream for just the
104 | bid details and then updating the highest bid accordingly, as opposed to waiting until the full payload is downloaded. This introduces some technical
105 | complexity because it creates a race condition between the block body availability and the
106 | proposer's call to `getPayload`. However, these are implementation details and do
107 | not impact the relay duties and trust assumptions listed above, so we treat it as an
108 | extension of v1.
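
One way to picture the race described in the v1.5 note is a `getPayload` handler that waits briefly for a body whose header has already won the auction. This is only a sketch under assumptions: the `PendingBody` class, its method names, and the timeout are hypothetical and are not taken from the relay implementation.

```python
import threading
from typing import Optional


class PendingBody:
    """Holds an execution body that may still be downloading (hypothetical helper)."""

    def __init__(self) -> None:
        self._arrived = threading.Event()
        self._body: Optional[bytes] = None

    def set_body(self, body: bytes) -> None:
        # Called once the full payload has been received from the builder.
        self._body = body
        self._arrived.set()

    def get_payload(self, timeout_s: float) -> Optional[bytes]:
        # Called when the proposer returns the signed header.
        # Waits up to timeout_s for the body; returns None if it never arrives.
        if self._arrived.wait(timeout_s):
            return self._body
        return None
```

What the relay does when `get_payload` returns `None` is left to the caller here. In the text above, a missing body is handled through the same refund path as an invalid block.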
109 |
110 | ## Optimistic Relay v2 — "Relay as a header proxy"
111 | The figure below demonstrates the block lifecycle under Optimistic Relaying v2.
112 |
113 |
114 |
115 |
116 | 1. The builder submits a bid to the relay, which contains only the header and a value.
117 | 2. The proposer calls `getHeader` to receive the highest paying bid.
118 | 3. The proposer signs the header and calls `getPayload` which delivers the signed header
119 | to the relay.
120 | 4. The relay proxies the signed header to the corresponding builder.
121 | 5. The builder publishes the signed block.
122 |
123 |
124 |
125 | #### What is the relay doing?
126 | Under this design, the relay no longer receives the execution body of the block that the builder constructed.
127 | This removes another piece of the trust between the builder and the relay
128 | because it removes the relay's ability to steal MEV. The block never leaves the builder's
129 | machine until it receives a signed header committing to that block. This leaves the
130 | task of publishing the signed block to the builder, who is incentivized to do so
131 | in order to earn the reward associated with the publication. The relay has one new task under
132 | this paradigm, which is to observe the mempool and await the signed block associated with
133 | the header that won the auction. If the block does not appear on time, then the proposer
134 | must be refunded in the same manner as v1. The relay role has evolved to
135 |
136 | - Storing the builder header ~~and execution body $\implies$ *Trust assumption 1: the builder trusts the relay not to steal their MEV.*~~
137 | - ~~Validating the body $\implies$ *Trust assumption 2: the proposer trusts the relay
138 | to provide a valid header to sign.*~~
139 | - ~~Validating the payment $\implies$ *Trust assumption 3: the proposer trusts the relay
140 | to check that the block pays them.*~~
141 | - ~~Publishing the winning block $\implies$ *Trust assumption 4: the builder trusts the relay to publish the winning block.*~~
142 | - Refunding proposers who signed an invalid header $\implies$ *Trust assumption 5: the proposer trusts the relay to refund them in the case of an invalid block.*
143 | - **[new]** Observing the mempool $\implies$ *Trust assumption 6: the proposer trusts the relay to refund them in the case of a missing block.*
144 |
145 | > This design has 2 additional practical benefits. (1) It completely removes the bandwidth
146 | intensive process of transmitting the block from the builder to the relay. This is a
147 | [zero copy](https://en.wikipedia.org/wiki/Zero-copy) approach as the block only ever
148 | resides on the builder machine until it is published. This becomes increasingly important as the size of blocks grows as a result of the sharding roadmap, and further reduces the hardware and network requirements to run an independent relay. (2) The relay is no longer able to censor.
149 | The relay only proxies the header, and thus has no ability to determine information about
150 | the transactions in the execution body.
151 |
152 |
153 | ## Optimistic Relay v3 — "Relay as an oracle"
154 | The figure below demonstrates the block lifecycle under Optimistic Relaying v3.
155 |
156 |
157 |
158 |
159 | 1. The builder submits a bid to the mempool, which contains only the header and a value.
160 | 2. The proposer listens to the mempool and selects a header.
161 | 3. The proposer signs the header and publishes it to the mempool.
162 | 4. The builder listens to the mempool for the signed header corresponding to their bid.
163 | 5. The builder publishes the signed block.
164 |
165 |
166 |
167 | #### What is the relay doing?
168 | Notice how the relay no longer plays an active role in the block building flow.
169 | The relay becomes an oracle that observes the mempool for the case where
170 | the proposer signed a header on time, but the builder did not
171 | publish a valid block on time. This situation would be treated exactly as in v1 & v2, where
172 | the proposer is refunded for their missed slot using the builder collateral. In this
173 | last iteration, the relay role is reduced to
174 |
175 | - ~~Storing the builder header and execution body $\implies$ *Trust assumption 1: the builder trusts the relay not to steal their MEV.*~~
176 | - ~~Validating the body $\implies$ *Trust assumption 2: the proposer trusts the relay
177 | to provide a valid header to sign.*~~
178 | - ~~Validating the payment $\implies$ *Trust assumption 3: the proposer trusts the relay
179 | to check that the block pays them.*~~
180 | - ~~Publishing the winning block $\implies$ *Trust assumption 4: the builder trusts the relay to publish the winning block.*~~
181 | - Refunding proposers who signed an invalid header $\implies$ *Trust assumption 5: the proposer trusts the relay to refund them in the case of an invalid block.*
182 | - Observing the mempool $\implies$ *Trust assumption 6: the proposer trusts the relay to refund them in the case of a missing block.*
183 |
184 | > Note that the proposer payments could be implemented in an unconditional way. Such a mechanism was [presented](https://github.com/flashbots/mev-boost/issues/109)
185 | by Alex O. and Stephane. This would reduce the trust assumptions to zero. The tradeoff here
186 | is the engineering complexity and inherent risk of using smart contracts to implement this logic.
187 | This may well be worth investing time into in the future, but we estimate that in the short-term, the
188 | relay operators (or [guarantors](https://github.com/michaelneuder/opt-relay-docs/blob/main/proposal.md#relay-collateralization)) should handle the refunds manually.
189 |
190 | ## ePBS — "Replace the relay with a committee"
191 | The final evolution of this roadmap is to replace the v3 relay with a committee of
192 | validators and enshrine PBS into the protocol. The specifics of the mechanism
193 | can vary; Vitalik has proposed [single-slot](https://ethresear.ch/t/single-slot-pbs-using-attesters-as-distributed-availability-oracle/11877) and
194 | [two-slot](https://ethresear.ch/t/two-slot-proposer-builder-separation/10980).
195 | Under the two-slot implementation (which seems to be the most popular currently),
196 | the proposer chooses a header to include in their beacon block.
197 | One committee attests to this block and once a builder is confident that the block
198 | will not be reorged, they publish an intermediate block with the full execution payload that the
199 | remaining committees attest to.
200 |
201 | Note how the first committee attests
202 | to the timely publication of a signed header and the remaining committees attest to the timely publication of a signed block, which is exactly the role that the
203 | relay plays in v3.
204 |
205 |
--------------------------------------------------------------------------------
/towards-epbs.md:
--------------------------------------------------------------------------------
1 | # Towards enshrined PBS — an optimistic roadmap
2 |
3 | ###### _"This has been a latency awakening" - Justin Drake (March 1, 2023)_
4 |
5 | ### Purpose
6 | Present a roadmap towards enshrined PBS (ePBS) through a series of modifications
7 | to the existing `mev-boost` [relay](https://github.com/flashbots/mev-boost-relay). By progressively removing the relay responsibilities and pruning the
8 | critical path of the block-building pipeline,
9 | we aim to converge to a system that looks quite similar to [existing](https://ethresear.ch/t/two-slot-proposer-builder-separation/10980) [proposals](https://ethresear.ch/t/single-slot-pbs-using-attesters-as-distributed-availability-oracle/11877) for ePBS.
10 |
11 | ### Rationale
12 | 1. **Agility** — We aim to approach this protocol upgrade as suggested by Justin in the [Censorship Panel](https://www.youtube.com/watch?v=Z9VCdiSPJEQ&t=2729s) at SBC. To front-load the
13 | R&D effort, we can iterate quickly by experimenting with a portion of existing
14 | relays, builders, and validators to reduce uncertainty and risk around full ePBS.
15 | There are tradeoffs between ePBS and `mev-boost` as highlighted by Barnabé at [Devconnect](https://youtu.be/jQjBNbEv9Mg?t=943) and in [Notes on PBS](https://barnabe.substack.com/i/82304191/market-structure-and-allocation-mechanism). This roadmap explores the
16 | design space between the two ends of the spectrum.
17 | 2. **Inevitability** — At [Devconnect](https://www.youtube.com/watch?v=OD54WfVuDWw&t=818s), Vitalik noted that with Danksharding, some builder separation becomes mandatory because bandwidth
18 | requirements for large (32 MB) blocks exceed what is within reach of a home-staker. This roadmap
19 | allows us to progress the research discussion by examining how different mechanisms work in practice.
20 | 3. **Accessibility** — By presenting relay data and designing the optimistic architecture in the open, we increase visibility into the
21 | existing block-building market. This helps answer research questions (e.g.,
22 | [ROP-0](https://efdn.notion.site/ROP-0-Timing-games-in-Proof-of-Stake-385f0f6279374a90b52bf380ed76a85b) from Barnabé and Caspar) and allows
23 | independent relays to stay competitive with vertically integrated Builder-Relays.
24 | Additionally, optimistic relaying reduces the hardware and networking
25 | [resources](https://collective.flashbots.net/t/ideas-for-incentivizing-relays/586) required for a relay to be competitive, lowering the barrier of entry. From the builder perspective,
26 | optimistic building requires collateral be posted, but it removes the need for them to be in the
27 | high-prio queue of each relay they connect to.
28 |
29 | We define the "critical path" as the set of operations and messages that compose
30 | the block production pipeline beginning when the builder submits a bid.
31 | We present a progression of 3 relay modifications that each reduce the latency associated
32 | with the critical path.
33 |
34 | > Latency is a centralizing force in the competitive block-building market. Since
35 | MEV is constantly being produced and extracted, rational
36 | builders are incentivized to colocate with relays, form trusted connections to relays, or even run
37 | their own relay infrastructure in order to minimize (down to $\mu s$ or $ns$) the time between when they build a block
38 | and when that header is part of the auction. We believe that by simplifying the pipeline, we can maintain a competitive and open builder ecosystem, while working to phase out relays altogether.
39 |
40 |
41 | ## Block production today
42 | The figure below schematizes the critical path under the existing `mev-boost`
43 | architecture.
44 |
45 |
46 |
47 | 1. The builder submits a bid that contains the full execution body
48 | of the block to the relay.
49 | 2. The relay simulates the block; the simulation must complete before the bid can win the auction.
50 | 3. The proposer sends the signed header back to the relay and the relay publishes the full block.
51 |
52 |
53 | #### What is the critical path?
54 | The critical path runs through the relay, which is tasked with
55 | - receiving the full block over the network from the builder (this can be several MB of data and will increase with 4844),
56 | - validating the block against an execution node internally before communicating the winning header to the proposer, and
57 | - receiving the signed header from the proposer and publishing\* the block.
58 |
59 | In practice, this entire process can take hundreds of milliseconds, even when running on
60 | high-performance hardware with good network connectivity.
61 |
62 | > \* Note that the winning block is returned to the proposer,
63 | so even if the relay doesn't publish the block, the proposer should and is incentivized to do so.
64 |
65 | ## Optimistic Relay v1 — "Asynchronous block validation"
66 | The figure below denotes the critical path under Optimistic Relaying v1.
67 | This idea is [proposed](https://github.com/michaelneuder/opt-relay-docs/blob/main/proposal.md) and [implemented](https://github.com/flashbots/mev-boost-relay/pull/285); it
68 | was also discussed in the [MEV community call #0](https://collective.flashbots.net/t/mev-boost-community-call-0-23-feb-2023/1348).
69 |
70 |
71 |
72 |
73 | 1. The builder submits a bid that contains the full execution body
74 | of the block to the relay.
75 | 2. The relay immediately marks the bid as eligible to win the auction.
76 | 3. The proposer sends the signed header back to the relay and the relay publishes the full block.
77 |
78 |
79 |
80 | #### What is the critical path?
81 | This flow differs only slightly from the previous; the relay does not immediately
82 | validate the block sent from the builder. This results in a reduction of
83 | the latency between when a builder submits a bid to the relay and when that bid
84 | becomes eligible to win the auction. This benefits both the builders and
85 | the proposers because it allows bids to arrive later in the slot, thus capturing more
86 | MEV for both parties. Under this design, builders can dishonestly submit high bids for invalid blocks
87 | that end up winning the auction, which results in a missed slot (because the proposer
88 | signed an invalid header). Alternatively, EVM valid blocks could be proposed and included but not pay the proposer
89 | the value of the bid. We account for this by requiring builders to post collateral to
90 | the relay which will be used to refund proposers if a slot or payment is missed. See the [proposal](https://github.com/michaelneuder/opt-relay-docs/blob/main/proposal.md), [implementation](https://github.com/flashbots/mev-boost-relay/pull/285), and [community call](https://collective.flashbots.net/t/mev-boost-community-call-0-23-feb-2023/1348) for further details.
91 | This proposal should lower the hardware and networking requirements of running a relay because now there is no need for large amounts of burst compute and bandwidth right at the end of the slot.
92 |
93 | >The end of the slot is highly congested because all the highest bids are arriving in the final milliseconds before the proposer calls `getHeader`. This exacerbates the issue of simulation latency and further
94 | benefits builders who are vertically integrated with relays.
95 |
96 | Beyond the practical benefits mentioned above, we modified the critical path. Now, the relay is responsible for
97 |
98 | - receiving the full block over the network from the builder (this can be several MB of data and will increase with 4844),
99 | - communicating the winning header to the proposer, and
100 | - receiving the signed header from the proposer and publishing the block.
101 |
102 | The relay still validates each block it receives, but this validation happens asynchronously and thus
103 | is no longer part of the critical path. This demonstrates the main objective of this roadmap — "to reduce the duties of the relay
104 | and the latency of the critical path".
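
The v1 flow described above can be sketched roughly as follows. This is a minimal illustration and not the `mev-boost-relay` implementation: the `OptimisticRelay` class, the `Bid` type, and the `simulate` callback are hypothetical names introduced here. Bids from promoted builders become eligible immediately and are validated later, off the critical path. Bids from other builders are simulated before they can win.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Optional, Set


@dataclass(frozen=True)
class Bid:
    builder: str   # builder pubkey (hypothetical representation)
    value: int     # bid value in wei
    block: bytes   # opaque block contents


class OptimisticRelay:
    def __init__(self, simulate: Callable[[Bid], bool], promoted: Set[str]):
        # simulate() is assumed to return True only if the block is valid
        # and pays the proposer the bid value.
        self.simulate = simulate
        self.promoted = set(promoted)
        self.best: Optional[Bid] = None
        self.pending: Deque[Bid] = deque()

    def submit_bid(self, bid: Bid) -> bool:
        """Returns True if the bid is eligible to win the auction."""
        if bid.builder in self.promoted:
            # Optimistic path: defer validation, no simulation on the critical path.
            self.pending.append(bid)
        elif not self.simulate(bid):
            # Non-optimistic path: validate synchronously and reject bad bids.
            return False
        if self.best is None or bid.value > self.best.value:
            self.best = bid
        return True

    def process_pending(self) -> None:
        """Asynchronous validation; a single bad bid demotes the builder."""
        while self.pending:
            bid = self.pending.popleft()
            if not self.simulate(bid):
                self.promoted.discard(bid.builder)
```

Note that in this sketch a bad optimistic bid can still be the highest bid when the proposer asks for a header. That is exactly the risk the builder collateral is meant to cover.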
105 |
106 | ## Optimistic Relay v2 — "Header-only parsing"
107 | The figure below demonstrates the critical path under Optimistic Relaying v2.
108 |
109 |
110 |
111 |
112 | 1. The builder submits a bid to the relay.
113 | 2. The relay parses the incoming message just for the bid details and then marks the bid as eligible to win the auction.
114 | 3. The proposer sends the signed header back to the relay and the relay publishes the full block.
115 |
116 |
117 |
118 | #### What is the critical path?
119 | The critical path still runs through the relay, which is tasked with
120 | - parsing the incoming message from the builder for just the bid details (this is a few hundred bytes),
121 | - communicating the winning header to the proposer, and
122 | - receiving the signed header from the proposer and publishing the block.
123 |
124 | Here we remove the additional latency of waiting for the whole block execution body to
125 | download to the relay (which will become even more relevant with full Danksharding).
126 | This can result in an invalid or missing block from the builder,
127 | in which case the proposer refund is handled in the same manner as v1.
128 |
129 | > This design has an additional practical benefit. The relay is no longer able to censor because the bid is eligible when the header is parsed. The relay doesn't yet have information about the transactions in the execution body, so it can't censor based on those transactions.
130 |
131 |
132 | ## Optimistic Relay v3 — "Relay as an oracle"
133 | The figure below demonstrates the critical path under Optimistic Relaying v3.
134 |
135 |
136 |
137 |
138 | 1. The builder submits a header-only bid to the mempool (could be referred to as a "bidpool").
139 | 2. The proposer sends the signed header back to the mempool and the builder publishes the full block.
140 |
141 |
142 | #### What is the critical path?
143 | Note that now the relay is no longer in the critical path! The builders
144 | and proposers communicate directly through the p2p layer. We assume that the builders will be well-connected nodes in the p2p network because they are incentivized to have extremely short
145 | paths to the proposers, thus the messages will still be fast.
146 |
147 |
148 | The critical path is now just
149 | - the proposer listening over the p2p network for header-only bids (this is a few hundred bytes), and
150 | - the builder listening over the p2p network for a signed header corresponding to their bid.
151 |
152 | The relay is still present in this architecture, but only serves as an oracle
153 | in the case of a missed slot. The relay observes the mempool and determines if
154 | (a) a signed header was produced on time, but (b) the corresponding signed block was not produced on time. In this situation, the builder is at fault, and their collateral is again used to refund
155 | the proposer as in v1 \& v2.
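
The oracle's decision rule from conditions (a) and (b) above could look something like the sketch below. The function name, its parameters, and the two deadlines are assumptions made for illustration, since the text does not define what "on time" means. Times are taken to be seconds into the slot, and `block_seen_at` is taken to be the time at which a valid block matching the signed header was observed.

```python
from typing import Optional


def builder_at_fault(
    header_seen_at: Optional[float],
    block_seen_at: Optional[float],
    header_deadline: float,
    block_deadline: float,
) -> bool:
    """True when (a) the signed header was on time but (b) the signed block was not."""
    # None means the message was never observed.
    header_on_time = header_seen_at is not None and header_seen_at <= header_deadline
    block_on_time = block_seen_at is not None and block_seen_at <= block_deadline
    return header_on_time and not block_on_time
```

If the signed header itself is late or missing, the function returns `False`: under the rule above, the builder's collateral is only used when the proposer did their part on time.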
156 |
157 | > Note that the proposer payments could be implemented in an unconditional way. Such a mechanism was [presented](https://github.com/flashbots/mev-boost/issues/109)
158 | by Alex O. and Stephane. This would eliminate the need for the relay to directly control the builder collateral, and reduce the relay to a data-availability oracle for the signed block appearing on time. The tradeoff here
159 | is the engineering complexity and inherent risk of using smart contracts to implement this logic.
160 | This may well be worth investing time into in the future, but we estimate that in the short-term, the
161 | relay operators (or [guarantors](https://github.com/michaelneuder/opt-relay-docs/blob/main/proposal.md#relay-collateralization)) should handle the refunds manually.
162 |
163 | ## ePBS — "Replace the relay with a committee"
164 | The figure below demonstrates the critical path under ePBS.
165 |
166 |
167 |
168 |
169 | 1. The builder submits a header-only bid to the mempool.
170 |
171 |
172 | #### What is the critical path?
173 | The final evolution of this roadmap is to replace the v3 relay with a committee of
174 | validators and enshrine PBS into the protocol. The specifics of the mechanism
175 | can vary; Vitalik has proposed [single-slot](https://ethresear.ch/t/single-slot-pbs-using-attesters-as-distributed-availability-oracle/11877) and
176 | [two-slot](https://ethresear.ch/t/two-slot-proposer-builder-separation/10980).
177 | Under the two-slot implementation (which seems to be the most popular currently),
178 | the proposer chooses a header to include in their beacon block and publishes it in their slot.
179 | One committee attests to this block and once a builder is confident that the block
180 | will not be reorged, they publish an intermediate block in the subsequent slot that contains the full execution payload. The remaining committees attest to this intermediate block. The enforcement
181 | of honest behavior is implemented into the fork-choice rule, so that if a builder produces a
182 | block that differs from the winning header, the committee members simply do not attest to it.
183 |
184 | > The first committee attests
185 | to the timely publication of a signed header and the remaining committees attest to the timely publication of a signed block, which is the exact set of actions that the
186 | relay listens for in v3.
187 |
188 |
--------------------------------------------------------------------------------