├── builder-onboarding.md ├── img ├── nonopt.png ├── v1.png ├── v2.png ├── v3.png └── v4.png ├── proposal.md ├── stale ├── img │ ├── nonopt.png │ ├── v1.png │ ├── v2.png │ └── v3.png └── towards-epbs.md └── towards-epbs.md /builder-onboarding.md: -------------------------------------------------------------------------------- 1 | # Optimistic relaying—builder guide 2 | 3 | Thank you for your interest in low-latency optimistic relaying with [the ultra sound relay](https://relay.ultrasound.money)! This document is an onboarding guide for you, Ethereum block builder. Please take the time to understand it :) 4 | 5 | **TLDR** 6 | 7 | >1. Share with us over Telegram or Discord the list of builder pubkeys you want promoted for optimistic relaying. We will manually review recent bid submissions from those pubkeys to ensure a low historical rate of bad bids. A bad bid is one with an invalid block or an insufficient payment to the proposer. 8 | >2. Post a maximum of 1 ETH collateral to `relay.ultrasound.eth` and share the transaction details with us. The transaction sender must be an address publicly associated with one of your builder pubkeys, ideally your primary fee recipient address. 9 | >3. The relay will automatically demote you for submitting a single bad bid to the relay. You will only be re-promoted after the underlying reason for submitting a bad bid is addressed. 10 | >4. A bad bid that wins the auction and is signed by the proposer will cause an on-chain incident, i.e. a missed slot or an insufficient proposer payment. We expect you to directly compensate the proposer the bid value plus a fixed 0.01 ETH penalty within 24 hours and send us the transaction details. 11 | >5. Without receiving proof the proposer was compensated within 24 hours we may use your collateral to compensate the proposer ourselves. 12 | 13 | ### Purpose 14 | 15 | This document outlines key aspects of optimistic relaying with the ultra sound relay. 
Optimistic relaying allows builders to significantly reduce the latency of their bid submissions through asynchronous simulation. For more detail see [the proposal](https://github.com/michaelneuder/opt-relay-docs/blob/main/proposal.md), [the implementation](https://github.com/flashbots/mev-boost-relay/pull/285), and the discussion in [MEV community call #0](https://collective.flashbots.net/t/mev-boost-community-call-0-23-feb-2023/1348). 16 | 17 | ### Optimistic logic 18 | 19 | The optimistic relay implementation adds three DB fields to every builder pubkey: 20 | 21 | 1. `is_optimistic`: This boolean, which defaults to `false`, indicates whether or not the pubkey is eligible for optimistic relaying. Promoting a builder pubkey for optimistic relaying is a manual process. When a builder submits a bad bid `is_optimistic` is reset to `false` before moving to the next slot, with demotion details recorded in the relay DB. 22 | 2. `collateral`: This integer reflects the collateral value in wei backstopping the value of optimistic bids. That is, optimistic relaying happens when `is_optimistic` is `true` and `collateral` is at least as large as the bid value. 23 | 3. `builder_id`: This string is used to share collateral across multiple pubkeys. The demotion of a pubkey will result in the simultaneous demotion of all pubkeys sharing the same builder ID. 24 | 25 | Consider the example below: 26 | 27 | ``` 28 | builder_pubkey | is_optimistic | collateral | builder_id 29 | ----------------+---------------+--------------------+------------------ 30 | 0xaaaaaa... | true | 990000000000000000 | mike 31 | 0xbbbbbb... | true | 990000000000000000 | mike 32 | 0xcccccc... | false | 990000000000000000 | flashbots 33 | 0xdddddd... | false | 0 | bloxroute 34 | ``` 35 | 36 | Pubkeys `0xaaaaaa` and `0xbbbbbb` share the same builder ID `mike` and collateral of 0.99 ETH. (0.99 ETH is the maximum 1 ETH collateral minus 0.01 ETH for the fixed penalty.) 
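The eligibility rule and the shared-collateral demotion described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the relay's actual code: `BuilderEntry`, `is_relayed_optimistically`, `demote`, and the in-memory `db` dict are hypothetical names standing in for the relay DB.

```python
from dataclasses import dataclass

WEI_PER_ETH = 10**18


@dataclass
class BuilderEntry:
    is_optimistic: bool
    collateral: int  # in wei
    builder_id: str


# Hypothetical in-memory stand-in for the relay DB, mirroring the example table.
db = {
    "0xaaaaaa": BuilderEntry(True, 990000000000000000, "mike"),
    "0xbbbbbb": BuilderEntry(True, 990000000000000000, "mike"),
    "0xcccccc": BuilderEntry(False, 990000000000000000, "flashbots"),
    "0xdddddd": BuilderEntry(False, 0, "bloxroute"),
}


def is_relayed_optimistically(pubkey: str, bid_value_wei: int) -> bool:
    """A bid is optimistic only if the pubkey is promoted and collateral covers the bid."""
    entry = db[pubkey]
    return entry.is_optimistic and entry.collateral >= bid_value_wei


def demote(pubkey: str) -> None:
    """Demote the pubkey together with every pubkey sharing its builder ID."""
    builder_id = db[pubkey].builder_id
    for entry in db.values():
        if entry.builder_id == builder_id:
            entry.is_optimistic = False
```

For example, `is_relayed_optimistically("0xaaaaaa", 10 * WEI_PER_ETH)` is `False` because the bid exceeds the collateral, and calling `demote("0xaaaaaa")` also clears `is_optimistic` for `0xbbbbbb`.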
Since `is_optimistic` is `true` any bid with a value less than or equal to 0.99 ETH will be relayed optimistically. A larger bid, e.g. with 10 ETH of value, will not be relayed optimistically. If either pubkey submits an invalid bid both pubkeys will be demoted before the next slot. 37 | 38 | Pubkey `0xcccccc` also has 0.99 ETH of collateral but `is_optimistic` is `false` so their bids will not be relayed optimistically. Builder `0xdddddd` has no collateral so their bids will also not be relayed optimistically. 39 | 40 | ### Collateral 41 | 42 | Collateral for optimistic relaying must be posted to `relay.ultrasound.eth` from an address publicly associated with one of your builder pubkeys, ideally your primary fee recipient address. The maximum collateral per pubkey is currently 1 ETH—this value may be increased or decreased from time to time. Please contact us if you wish to stop optimistic relaying and have your collateral returned. 43 | 44 | ### Builder ID 45 | 46 | For collateral efficiency you may reuse the same piece of collateral for multiple builder pubkeys. Let us know which pubkeys you want to share a builder ID. The demotion of a pubkey will result in the demotion of all pubkeys sharing the same builder ID. 47 | 48 | ### Promotions and demotions 49 | 50 | We will manually promote your pubkeys by setting `is_optimistic` to `true` after collateral is posted and you have indicated readiness for optimistic relaying. For transparency we intend to publicly disclose `is_optimistic`, `collateral`, `builder_id` for every pubkey. 51 | 52 | When a bad bid is submitted, even if the bid does not get signed by the proposer, the demotion logic resets `is_optimistic` to `false` before the next slot. Only after the root cause of a demotion is understood and fixed can we manually reset `is_optimistic` to `true`. 53 | 54 | ### On-chain incidents 55 | 56 | An on-chain incident, i.e. 
a missed slot or an insufficient proposer payment, will likely occur if a bad bid wins the auction and is signed by the proposer. (There are exceptional edge cases where an on-chain incident may not happen, including reorgs and proposer double signing.) A proposer that suffers an on-chain incident due to a bad bid needs to be made whole by the builder for the full bid value plus 0.01 ETH. The fixed 0.01 ETH penalty attempts to cover missed consensus rewards as well as accounting hassle from the delayed payment. 57 | 58 | Note that we expect you, within 24 hours of the demotion, to directly send ETH to the proposer's fee recipient to compensate for a bad bid leading to an on-chain incident. Please share with us details of the corresponding transaction. Without proof the proposer was compensated within 24 hours we may use your collateral to compensate the proposer ourselves. For transparency we plan to publish a public post-mortem of every on-chain incident due to a bad bid. 59 | -------------------------------------------------------------------------------- /img/nonopt.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michaelneuder/optimistic-relay-documentation/4fb032e92080383b7b5d8af5675ef2bf9855adc3/img/nonopt.png -------------------------------------------------------------------------------- /img/v1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michaelneuder/optimistic-relay-documentation/4fb032e92080383b7b5d8af5675ef2bf9855adc3/img/v1.png -------------------------------------------------------------------------------- /img/v2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michaelneuder/optimistic-relay-documentation/4fb032e92080383b7b5d8af5675ef2bf9855adc3/img/v2.png -------------------------------------------------------------------------------- 
/img/v3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michaelneuder/optimistic-relay-documentation/4fb032e92080383b7b5d8af5675ef2bf9855adc3/img/v3.png -------------------------------------------------------------------------------- /img/v4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michaelneuder/optimistic-relay-documentation/4fb032e92080383b7b5d8af5675ef2bf9855adc3/img/v4.png -------------------------------------------------------------------------------- /proposal.md: -------------------------------------------------------------------------------- 1 | # Optimistic Relay Proposal 2 | 3 | ## Purpose 4 | 5 | This document introduces the concept of the Optimistic Relay to accompany https://github.com/flashbots/mev-boost-relay/pull/285, which is a small 6 | PR which adds the functionality to Flashbots' `mev-boost-relay`. While the PR gets into the details of *how* the optimistic feature is added, 7 | this document aims at motivating the change in the broader context of `mev-boost` and the existing Ethereum literature on Proposer-Builder Separation (PBS). 8 | 9 | ## mev-boost today 10 | 11 | `mev-boost` has become critical infrastructure since it was introduced by Flashbots. There are a number of excellent data sources to demonstrate this: 12 | 13 | - https://www.mevboost.org/ 14 | - https://mevboost.pics/ 15 | - https://transparency.flashbots.net/ 16 | 17 | tl;dr: >90% of validators are using `mev-boost` to outsource block building. 
18 | 19 | The Flashbots team has continued to engage with the community around the future of the software: 20 | 21 | - https://collective.flashbots.net/t/toward-an-open-research-and-development-process-for-mev-boost/464 22 | - https://collective.flashbots.net/t/mev-boost-development-philosophy/505 23 | - https://collective.flashbots.net/t/development-next-steps-for-pbs-roundtable-at-devcon/438 24 | 25 | Part of that discussion is around how the community can continue iterating on the initial architecture to build understanding around what could/should 26 | be enshrined in the protocol. 27 | 28 | ## Proposer-Builder Separation (PBS) 29 | 30 | PBS has been extensively researched. See https://notes.ethereum.org/@domothy/pbs_links and https://github.com/michaelneuder/mev-bibliography for links to the literature and 31 | https://barnabe.substack.com/p/pbs for a thorough overview of the current research landscape. 32 | 33 | ## Optimistic Relay: phase 1 34 | 35 | The Optimistic Relay is an idea from Justin Drake to help bridge the gap between the research and the current `mev-boost` software, with the 36 | goal of building understanding about how the mechanisms proposed in the PBS literature behave in practice. 37 | This aims to be a step towards understanding what (if anything) should be enshrined in the protocol. 38 | 39 | The primary difference between `mev-boost` and in-protocol PBS (IP-PBS) is the presence of the relay. The relay is a trusted intermediary between the block builders and 40 | the proposers, while in IP-PBS other validators enforce the rules of the block building auction through attestations. For this reason, phase 1 of the Optimistic Relay reduces the role of the relay in block production. 41 | 42 | #### Asynchronous block validation 43 | The key idea in phase 1 is to change the builder block submission flow to make the block validation an asynchronous process. 
When the builder submits a block 44 | to the relay, that bid is immediately eligible to win the auction, even if the relay hasn't had a chance to check that the block is valid. This allows builders 45 | to submit more blocks, and importantly submit blocks later in the slot (because they don't have to wait for the block to be validated for the bid to register). 46 | We call it "optimistic" because the relay assumes that the block is valid in the short-term, while deferring the actual validation to a later time. A 47 | builder must post collateral (or have a guarantor) with the relay to take advantage of this optimistic processing of their blocks, and if they submit an invalid block they are "demoted" 48 | back to the current `mev-boost` implementation where each block is validated before the bid is considered eligible. Critically, if a builder submits an 49 | invalid block that ends up winning the auction and being proposed, the validator who missed their slot is refunded the amount of the winning bid. 50 | 51 | #### A builder's perspective 52 | 53 | The sequence diagram below shows the block building pipeline of regular `mev-boost`: 54 | Screen Shot 2023-02-09 at 2 41 14 PM 55 | 56 | The builder's block must be validated before the proposer calls `getHeader` on the relay at the beginning of the slot. In this diagram, Δ denotes the amount 57 | of time that it takes the relay to validate the block. Thus the builder's submission must arrive at least Δ before the start of the next slot for the bid to be 58 | valid. Compare this to the optimistic block building pipeline: 59 | 60 | Screen Shot 2023-02-09 at 2 53 41 PM 61 | 62 | Here the builder immediately gets a response indicating that their bid is active and the relay might not simulate the block until the payload has already been 63 | published as a full beacon block. A builder is incentivized to use optimistic relaying because they no longer have to pay the Δ time penalty during their 64 | block submission. 
MEV is a game of small margins, and speed is critical for successful builders. To demonstrate that, consider the following situation: 65 | 66 | Screen Shot 2023-02-09 at 3 36 18 PM 67 | In this case, the builder knows that their submission needs to reach the relay before time 12-Δ. Right after they submit their block, a large MEV opportunity arises. 68 | They can try to submit a new block that captures more MEV and thus has a higher probability of winning the auction, but it is too late as 69 | the simulation will not complete before t=12. Intuitively, the longer the builder can wait before submitting their block, the longer they have to listen for transactions and thus the higher their bid. 70 | 71 | #### A proposer's perspective 72 | The proposer's perspective is almost identical to the status quo of `mev-boost`, with the slight change that they no longer have a guarantee that the header 73 | they sign corresponds to a valid block. They still call `getHeader` at the beginning of the slot and sign the header they receive. 74 | The trust assumptions on the relay remain the same. If the block they end up proposing turns out to be invalid 75 | they must trust the relay to refund them for their lost slot, which is the same as trusting the relay to validate that the block includes a payment corresponding to 76 | the bid amount in the base `mev-boost` case. 77 | 78 | ## Security considerations 79 | 80 | There were a number of concerns raised in https://github.com/michaelneuder/mev-boost-relay/pull/2 that we would like to address. Thanks to Chris Hager, Alex Stokes, and Mateusz Morusiewicz for this initial feedback! 81 | 82 | #### Relay collateralization 83 | Relay operators who manage collateral to enable optimistic processing are accountable for the additional legal and operational complexity of managing these funds. This has been raised as a significant concern by some parties. 84 | The solution we propose for the Ultra Sound Relay is a builder-guarantor approach. 
The relay acts as an intermediary to determine if/when a builder bug results in a missed slot, but the 85 | expected outcome is that once the issue is brought to the builder, they directly refund the proposer who missed the slot. Since builder reputation far exceeds 86 | the monetary value of the missed slots, we expect the vast majority of refunds to process smoothly in this fashion. In the exceptional case where a builder stops 87 | responding or decides to withhold the refund, the builder-guarantor intervenes and refunds the proposer. The guarantor for the offending builder will likely remove 88 | their remaining collateral for that builder, and thus the builder will be back to non-optimistic processing. Additionally, for each refundable event, we plan 89 | on publishing a postmortem publicly explaining what happened. The format for the postmortem will be: (a) a timeline of events, (b) the builder and proposer involved, (c) the bid that resulted in the missed slot, (d) the error associated with the bid, (e) the refund details. 90 | This information will provide transparency into the relay operation and allow us to monitor common builder failure modes. 91 | An additional benefit of this approach is the ability 92 | for guarantors to be separate entities from the relay itself. For example, the Ultra Sound Relay is willing to act as a guarantor for trusted builders for other smaller relays 93 | to remove any overhead of managing collateral and refunds. 
To bootstrap this process, the Ultra Sound Relay is willing to act as a 1 ETH guarantor for the following builders: 94 | ``` 95 | * builder0x69 96 | * flashbots 97 | * beaver 98 | * bloxroute 99 | * manta 100 | * blocknative 101 | * rsync 102 | * eden 103 | * eth-builder 104 | * buildai 105 | * payload 106 | * nfactorial 107 | * s[0-9].+ 108 | * lightspeed 109 | * manifold 110 | * Rick Astley 111 | ``` 112 | New builders who are willing to engage with the Ultra Sound Relay team and demonstrate high-fidelity block production can get added to this list! 113 | 114 | #### Missed slots (liveness-attack) 115 | This proposal increases the risk of missed slots, but only marginally. Because the relay doesn't validate the block, the proposer could end up signing a bad header. The proposer is 116 | unable to rectify the situation by signing a valid block, because signing two blocks of the same height is a slashing condition. However, we implemented it so that there will be 117 | at most one missed slot per optimistic builder before a manual intervention. By default, all builders are treated non-optimistically and the process of posting collateral is manual. 118 | If that builder ends up submitting a bad block, a single slot will be missed and we will manually investigate the failure, initiate the refund, and 119 | communicate with the builder. Only once we have high confidence that we understand what went wrong and why it won't happen again will we allow the builder to re-post collateral and have access 120 | to optimistic processing once again. Additionally, we only optimistically process blocks that have a value less than the collateral posted for a builder. The builder is free to submit blocks with huge MEV that exceeds their collateral; we will just validate those bids synchronously because the collateral can't cover the refund amount. Lastly, we have the benefit of rolling this change out slowly. 
By starting with just the Ultra Sound Relay we can continually monitor the number 121 | of missed slots caused by this change. If we ever determine that it exceeds what we are comfortable with, we can simply turn it off. 122 | 123 | #### Collusion 124 | Consider the case where the proposer and the builder are the same malicious actor. The builder could submit a large bid with an invalid block in order to ensure that they win the auction through the relay. This will result in an invalid block being proposed and the relay accounting system recording that the proposer is owed a large refund. 125 | However, the bid was collateralized by that same builder, so even though we issue a refund, the funds are coming from the same actor, so the actor is not earning any extra profit. The only downside is that the relay team had to manually issue a refund, but we anticipate this to be a rare occurrence, and 126 | by only onboarding trusted builders, we minimize this toil. Additionally, the proposer didn't accomplish anything other than skipping their slot, which they could 127 | do by just turning off their machine in the first place. 128 | 129 | #### Incentive compatibility 130 | 131 | The builder is incentivized to use the optimistic relay honestly because it strictly increases the amount of MEV they are able to produce. Submitting an invalid 132 | bid results in them losing their collateral, which is not a rational choice. 133 | 134 | The proposer is incentivized to use the optimistic relay because it increases the size of the bids against their slot. Larger bids mean more value extracted 135 | for the proposer themselves. 136 | 137 | The relay remains a neutral party that serves as an escrow mechanism, but does not receive any rewards. 138 | 139 | #### Moral hazard 140 | The more subtle argument is that this change introduces a moral hazard. 
Missed slots impact the entire chain, and while we acknowledge that there is a chance of a few extra missed slots, we plan on approaching 141 | this change conservatively. By only allow-listing trusted builders initially, we ensure that there will be no runaway amount of missed slots (again only one 142 | per-builder per-manual intervention). We will actively collect data around any invalid blocks proposed and work with builders to understand what caused invalid block submissions. Public post-mortems will increase transparency and allow community feedback and engagement around the project as a whole. The builders 143 | are highly incentivized to avoid submitting invalid blocks and thus will want to know what is going wrong with their blocks if something fails. 144 | 145 | ## Learnings from Goerli 146 | 147 | #### Brittle nature of duplicate simulations 148 | 149 | In our first implementation, we simulated optimistic blocks that won the auction twice: (1) in the async call triggered by the block submission (2) in `getPayload` to check 150 | if the winning block was valid. While this works in theory, we found it to be very brittle to simulate the same block twice, as a number of errors from race 151 | conditions between the parallel block submissions arose. Therefore we refactored the optimistic relay to only simulate each block a single time. In `getPayload` we make 152 | use of the waitgroup for optimistic blocks. Once we know all the blocks for the slot have been processed, we check the demotions table for the winning block. 153 | If it is there, then we know that an invalid block was delivered to the proposer, and thus a refund is necessary. This triggers an update on the demotions table 154 | where we insert the `SignedBeaconBlock` and `SignedValidatorRegistration` to ensure all the relevant data is in one place. 155 | This new design is presented in https://github.com/flashbots/mev-boost-relay/pull/285. 
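The single-simulation design described above can be sketched as follows. This is a minimal Python illustration, not the code from the PR: `OptimisticSlot`, `simulate_async`, and `get_payload` are hypothetical names, a `threading.Condition` counter stands in for the waitgroup, a dict stands in for the demotions table, and the simulation outcome is passed in as `is_valid` rather than computed.

```python
import threading


class OptimisticSlot:
    """Tracks in-flight simulations for one slot (stand-in for the waitgroup)."""

    def __init__(self):
        self._pending = 0
        self._cond = threading.Condition()
        self.demotions = {}  # block_hash -> simulation error (stand-in for the table)

    def add(self):
        with self._cond:
            self._pending += 1

    def done(self):
        with self._cond:
            self._pending -= 1
            self._cond.notify_all()

    def wait(self):
        # Block until every simulation registered for this slot has finished.
        with self._cond:
            self._cond.wait_for(lambda: self._pending == 0)


def simulate_async(slot, block_hash, is_valid):
    """Simulate each block exactly once, off the submission path."""
    slot.add()  # registered before the thread starts, so wait() cannot miss it

    def run():
        try:
            if not is_valid:  # assumed outcome of the real block simulation
                slot.demotions[block_hash] = "simulation failed"
        finally:
            slot.done()

    thread = threading.Thread(target=run)
    thread.start()
    return thread


def get_payload(slot, winning_hash):
    """Return True if the delivered block was demoted, i.e. a refund is necessary."""
    slot.wait()
    # In the relay, a hit here would also record the SignedBeaconBlock and
    # SignedValidatorRegistration alongside the demotion.
    return winning_hash in slot.demotions
```

The point of the design is that `get_payload` never re-simulates: it only waits for the one simulation per block to finish and then looks up the result.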
156 | 157 | #### Payload too large errors 158 | We found that many blocks returned an error of `Payload too large`. After investigation, we realized that the max size of the payload for block submissions to the `prio-load-balancer` was too small. This PR updates the max size to 2MB: https://github.com/flashbots/prio-load-balancer/commit/d4d171c3fcda4e4ca49075031efa44065c4f9a5a. 159 | 160 | #### Performance analysis 161 | Since the optimistic relay is mainly about improving the performance of the block submission flow to allow builders to submit blocks later into the slot, we 162 | collected performance data from the Goerli relay. The table below contains various percentiles of the distribution of the number of microseconds for the three stages of the submit block flow: 163 | 164 | Screen Shot 2023-02-09 at 3 36 18 PM 165 | 166 | 167 | The middle stage, `simulation_duration`, is what the v1 optimistic relay eliminates by removing the simulation of the blocks from the fast path. The first stage, `decode_duration`, is another huge portion of the overall runtime of the block submission flow. This stage is the process of receiving the payload over the network. With an `8 MB/s` connection, we have `8 KB/ms`. 168 | The median decode time is `~80ms` which implies `640KB` blocks. This seems like a reasonable estimate. Additionally, the network latency is high variance. 169 | The following plot shows the correlation between decoding time and the size of the payload. We color the data points by the builder pubkey. Clearly, there 170 | is a positive correlation between the two, and different builders have different networking connections which also show up in the data. 171 | 172 | Screen Shot 2023-02-09 at 3 36 18 PM 173 | 174 | We also collected data around the builder bids over the course of a slot. 
The following figure shows the value of a bid as a function of time for slot 5771427 for 10 different builders: 175 | 176 | Screen Shot 2023-02-09 at 3 36 18 PM 177 | 178 | More broadly, we can observe the probability distributions over the slot duration of when the winning block arrived at the relay. 179 | Screen Shot 2023-02-09 at 3 36 18 PM 180 | 181 | Clearly, the later seconds in the slot have a much higher probability of containing the winning bid. This follows directly from the fact that more MEV will occur 182 | during the slot, allowing later blocks to capture more and thus increase their bid size. 183 | 184 | ## FAQ by Justin Drake 185 | 186 | **Does an optimistic relay simulate blocks?** 187 | 188 | _Yes, an optimistic relay simulates all blocks that could have been forwarded to a proposer. Simulation just happens away from the latency-critical path._ 189 | 190 | **Can optimistic relaying lead to mass missed slots?** 191 | 192 | _Assuming no relay bug, the worst case is one missed slot per collateralised builder (prior to reactivation). The beacon chain is designed to handle missed slots._ 193 | 194 | **Are builders incentivised to produce invalid blocks?** 195 | 196 | _The builder of an invalid winning block suffers a financial loss. Moreover, a single invalid block will disable optimistic relaying for the builder, yielding a latency disadvantage pending reactivation._ 197 | 198 | **What recommendations do you have for optimistic relay operators?** 199 | 200 | _We have several recommendations for optimistic relay operators:_ 201 | 202 | * _alerts – set up automatic alerts (e.g. email or phone call) for demotions_ 203 | * _refunds – promptly transfer (e.g. within 24 hours) the bid value plus beacon chain penalties and missed rewards to the proposer fee recipient_ 204 | * _max collateral – cap the maximum collateral amount per builder (e.g. 
to 10 ETH) to keep a level playing field_ 205 | * _investigation – manually investigate demotions and ask builders to fix block building bugs before reactivating optimistic relaying_ 206 | * _cool-off – impose a post-demotion cool-off period (e.g. 24 hours) before reactivating optimistic relaying_ 207 | * _penalty – consider a fixed penalty (e.g. 0.1 ETH) per demotion, especially for repeat demotion_ 208 | 209 | -------------------------------------------------------------------------------- /stale/img/nonopt.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michaelneuder/optimistic-relay-documentation/4fb032e92080383b7b5d8af5675ef2bf9855adc3/stale/img/nonopt.png -------------------------------------------------------------------------------- /stale/img/v1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michaelneuder/optimistic-relay-documentation/4fb032e92080383b7b5d8af5675ef2bf9855adc3/stale/img/v1.png -------------------------------------------------------------------------------- /stale/img/v2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michaelneuder/optimistic-relay-documentation/4fb032e92080383b7b5d8af5675ef2bf9855adc3/stale/img/v2.png -------------------------------------------------------------------------------- /stale/img/v3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michaelneuder/optimistic-relay-documentation/4fb032e92080383b7b5d8af5675ef2bf9855adc3/stale/img/v3.png -------------------------------------------------------------------------------- /stale/towards-epbs.md: -------------------------------------------------------------------------------- 1 | # Towards Enshrined PBS — An Optimistic Roadmap 2 | 3 | ## Purpose 4 | Present a roadmap towards Enshrined PBS 
(ePBS) through a series of modifications 5 | to the existing `mev-boost` [relay](https://github.com/flashbots/mev-boost-relay) 6 | functionality. By progressively removing the relay responsibilities, 7 | we aim to converge to a system that looks quite similar to [existing](https://ethresear.ch/t/two-slot-proposer-builder-separation/10980) [proposals](https://ethresear.ch/t/single-slot-pbs-using-attesters-as-distributed-availability-oracle/11877) for ePBS. 8 | 9 | ### Rationale 10 | 1. **Agility** — We aim to approach this protocol upgrade as suggested by Justin in the [Censorship Panel](https://www.youtube.com/watch?v=Z9VCdiSPJEQ&t=2729s) at SBC. To front-load the 11 | R&D effort, we can iterate quickly by experimenting with a portion of existing 12 | relays, builders, and validators to reduce uncertainty and risk around full ePBS. 13 | There are tradeoffs between ePBS and `mev-boost` as highlighted by Barnabé at [Devconnect](https://youtu.be/jQjBNbEv9Mg?t=943) and in [Notes on PBS](https://barnabe.substack.com/i/82304191/market-structure-and-allocation-mechanism). Through this roadmap we explore the 14 | design space between the two ends of the spectrum, without fully committing upfront to ePBS. 15 | 2. **Inevitability** — At [Devconnect](https://www.youtube.com/watch?v=OD54WfVuDWw&t=818s), Vitalik noted that with Danksharding, some builder separation becomes mandatory because bandwidth 16 | requirements for large blocks exceed what is within reach of a home-staker. This roadmap 17 | allows us to progress the research discussion by examining how different mechanisms work in practice. 18 | 3. **Accessibility** — By presenting relay data and designing the optimistic architecture in the open, we increase visibility into the 19 | existing block building market. 
This helps answer research questions (e.g., 20 | [ROP-0](https://efdn.notion.site/ROP-0-Timing-games-in-Proof-of-Stake-385f0f6279374a90b52bf380ed76a85b)) and allows 21 | independent relays to stay competitive with vertically integrated Builder-Relays. 22 | Additionally, optimistic relaying actually reduces the hardware and networking 23 | [resources](https://collective.flashbots.net/t/ideas-for-incentivizing-relays/586) required for a relay to be competitive, lowering the barrier of entry. 24 | 25 | 26 | 27 | 28 | ## Block lifecycle 29 | The figure below schematizes the lifecycle of a block in the existing `mev-boost` 30 | architecture. 31 |
32 | non-optimistic 33 | 34 | 1. The builder submits a bid to the relay, which contains a header and an execution 35 | body. 36 | 2. The relay validates the block and asserts that it includes an appropriate payment 37 | to the proposer. 38 | 3. The proposer calls `getHeader` to receive the highest paying bid. 39 | 4. The proposer signs the header and calls `getPayload` which delivers the signed header 40 | to the relay. 41 | 5. The relay publishes the signed block to the p2p network. 42 |
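The five steps above can be modelled with a toy relay in which validation sits on the submission path. This is a sketch under stated assumptions, not the `mev-boost-relay` API: the `Relay` class, its method names, the dict-shaped bids, and the `validate` callable are all hypothetical, and signature checks and p2p publishing are omitted.

```python
class Relay:
    """Toy model of the non-optimistic flow: validate first, then accept the bid."""

    def __init__(self, validate):
        self.validate = validate  # hypothetical callable: bid -> bool
        self.best_bid = None

    def submit_block(self, bid):
        # Steps 1-2: the block and the proposer payment are checked before
        # the bid can become eligible to win the auction.
        if not self.validate(bid):
            return False
        if self.best_bid is None or bid["value"] > self.best_bid["value"]:
            self.best_bid = bid
        return True

    def get_header(self):
        # Step 3: the proposer asks for the highest paying bid.
        return None if self.best_bid is None else self.best_bid["header"]

    def get_payload(self, signed_header):
        # Steps 4-5: the signed header is exchanged for the body, which the
        # relay would then publish (signature verification is omitted here).
        if self.best_bid is None or signed_header != self.best_bid["header"]:
            raise ValueError("unknown header")
        return self.best_bid["body"]
```

Because `validate` runs inside `submit_block`, an invalid block can never be returned by `get_header`, at the cost of the validation delay on every submission.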
43 | 44 | #### What is the relay doing? 45 | As implemented, the relay aims to be a simple, mutually-trusted, neutral third-party to 46 | connect builders and proposers. The relay duties and the corresponding trust assumptions are 47 | 48 | - Storing the builder header and execution body $\implies$ *Trust assumption 1: the builder trusts the relay not to steal their MEV.* 49 | - Validating the body $\implies$ *Trust assumption 2: the proposer trusts the relay 50 | to provide a valid header to sign.* 51 | - Validating the payment $\implies$ *Trust assumption 3: the proposer trusts the relay 52 | to check that the block pays them.* 53 | - Publishing the winning block $\implies$ *Trust assumption 4: the builder trusts the relay to publish the winning block.\** 54 | 55 | > \* Note that the winning block is returned to the proposer, 56 | so even if the relay doesn't publish the block, the proposer should and is incentivized to do so. We still list it as a trust assumption because the builder doesn't trust the proposer, 57 | so in essence they still trust that the relay publishes their block if it wins the auction. 58 | 59 | ## Optimistic Relay v1 — "Efficient builder submissions" 60 | The figure below demonstrates the block lifecycle under Optimistic Relaying v1. 61 | This idea is [proposed](https://github.com/michaelneuder/opt-relay-docs/blob/main/proposal.md) and [implemented](https://github.com/flashbots/mev-boost-relay/pull/285); it 62 | was also discussed in the [MEV community call #0](https://collective.flashbots.net/t/mev-boost-community-call-0-23-feb-2023/1348). 63 | 64 |
65 | non-optimistic 66 | 67 | 1. The builder submits a bid to the relay, which contains a header and an execution 68 | body. 69 | 2. The proposer calls `getHeader` to receive the highest paying bid. 70 | 3. The proposer signs the header and calls `getPayload` which delivers the signed header 71 | to the relay. 72 | 4. The relay publishes the signed block to the p2p network. 73 |
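A minimal sketch of how the v1 flow could treat an incoming bid, assuming the `is_optimistic` and `collateral` fields described in the builder guide earlier in this repository. The class names, the `validate` callback, and the pending queue are hypothetical and do not mirror the actual relay implementation.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, Optional


@dataclass
class Builder:
    is_optimistic: bool
    collateral_wei: int


@dataclass(frozen=True)
class Bid:
    builder_pubkey: str
    value_wei: int
    header: str


class OptimisticRelay:
    """Hypothetical v1 relay: eligible bids skip synchronous validation."""

    def __init__(self, builders: Dict[str, Builder],
                 validate: Callable[[Bid], bool]) -> None:
        self.builders = builders
        self._validate = validate              # assumed block + payment check
        self._pending: Deque[Bid] = deque()    # bids awaiting async validation
        self.best: Optional[Bid] = None

    def submit_bid(self, bid: Bid) -> bool:
        builder = self.builders[bid.builder_pubkey]
        # Optimistic only if promoted and the collateral covers the bid value.
        optimistic = (builder.is_optimistic
                      and builder.collateral_wei >= bid.value_wei)
        if optimistic:
            self._pending.append(bid)          # validated later
        elif not self._validate(bid):
            return False                       # fall back to the synchronous path
        if self.best is None or bid.value_wei > self.best.value_wei:
            self.best = bid
        return True

    def run_pending_validation(self) -> None:
        # Asynchronous step: a single bad bid demotes the builder.
        while self._pending:
            bid = self._pending.popleft()
            if not self._validate(bid):
                self.builders[bid.builder_pubkey].is_optimistic = False
```

The sketch only shows the eligibility check and the demotion. It does not remove a bad bid that is already leading the auction, and it does not model the refund.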
74 | 75 | 76 | #### What is the relay doing? 77 | This flow differs only slightly from the previous; the relay does not immediately 78 | validate the block sent from the builder. This results in a reduction of 79 | the latency between when a builder submits a bid to the relay and when that bid 80 | becomes eligible to win the auction. This benefits both the builders and 81 | the proposers because it allows bids to arrive later in the slot, thus capturing more 82 | MEV for both parties. Under this design, builders can submit high bids for invalid blocks 83 | that end up winning the auction, which results in a missed slot (because the proposer 84 | signed an invalid header). We account for this by requiring builders to post collateral to 85 | the relay which will be used to refund proposers if a slot is missed. See the [proposal](https://github.com/michaelneuder/opt-relay-docs/blob/main/proposal.md), [implementation](https://github.com/flashbots/mev-boost-relay/pull/285), and [community call](https://collective.flashbots.net/t/mev-boost-community-call-0-23-feb-2023/1348) for further details. 86 | This proposal should lower the hardware and networking requirements of running a relay because now there is no need for large amounts of burst compute and bandwidth due to the block simulation being handled asynchronously in the next slot. 87 | 88 | Beyond the practical benefits mentioned above, we also modified the relay duties and trust assumptions. 
Now, the relay is responsible for 89 | 90 | - Storing the builder header and execution body $\implies$ *Trust assumption 1: the builder trusts the relay not to steal their MEV.* 91 | - ~~Validating the body $\implies$ *Trust assumption 2: the proposer trusts the relay 92 | to provide a valid header to sign.*~~ 93 | - ~~Validating the payment $\implies$ *Trust assumption 3: the proposer trusts the relay 94 | to check that the block pays them.*~~ 95 | - Publishing the winning block $\implies$ *Trust assumption 4: the builder trusts the relay to publish the winning block.* 96 | - **[new]** Refunding proposers who signed an invalid header $\implies$ *Trust assumption 5: the proposer trusts the relay to refund them in the case of an invalid block.* 97 | 98 | This demonstrates the main objective of this roadmap — "to reduce the duties and trust assumptions associated with relays". 99 | 100 | > Optimistic Relay v1.5 — "Header-only parsing". One minor modification of v1 is 101 | to further optimize the block submission flow by making a bid eligible to win the 102 | auction immediately upon receipt of the header. Since the relay already asynchronously 103 | validates the body, we save extra critical milliseconds by parsing the stream for just the 104 | bid details and then updating the highest bid accordingly, as opposed to waiting until the full payload is downloaded. This introduces some technical 105 | complexity because it creates a race condition between the block body availability and the 106 | proposer's call to `getPayload`. However, these are implementation details and do 107 | not impact the relay duties and trust assumptions listed above, so we treat it as an 108 | extension of v1. 109 | 110 | ## Optimistic Relay v2 — "Relay as a header proxy" 111 | The figure below demonstrates the block lifecycle under Optimistic Relaying v2. 112 | 113 |
114 | non-optimistic 115 | 116 | 1. The builder submits a bid to the relay, which contains only the header and a value. 117 | 2. The proposer calls `getHeader` to receive the highest paying bid. 118 | 3. The proposer signs the header and calls `getPayload` which delivers the signed header 119 | to the relay. 120 | 4. The relay proxies the signed header to the corresponding builder. 121 | 5. The builder publishes the signed block. 122 |
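The proxy role in the steps above can be pictured with the following sketch. It assumes a hypothetical callback registered per builder and uses string equality in place of signature checks; none of these names come from an existing relay API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class HeaderBid:
    builder_pubkey: str
    value_wei: int
    header: str          # the relay never sees the execution body


class HeaderProxyRelay:
    """Hypothetical v2 relay: stores headers, forwards signed headers."""

    def __init__(self) -> None:
        self._best: Optional[HeaderBid] = None
        self._callbacks: Dict[str, Callable[[str], None]] = {}

    def register_builder(self, pubkey: str,
                         on_signed_header: Callable[[str], None]) -> None:
        self._callbacks[pubkey] = on_signed_header

    def submit_bid(self, bid: HeaderBid) -> None:
        if self._best is None or bid.value_wei > self._best.value_wei:
            self._best = bid

    def get_header(self) -> Optional[str]:
        return self._best.header if self._best else None

    def get_payload(self, signed_header: str) -> bool:
        # Step 4: proxy the signed header to the winning builder.
        if self._best is None or signed_header != self._best.header:
            return False
        self._callbacks[self._best.builder_pubkey](signed_header)
        return True


class Builder:
    """Hypothetical builder that reveals a block only for a signed header."""

    def __init__(self, pubkey: str, network: List[bytes]) -> None:
        self.pubkey = pubkey
        self._network = network            # stand-in for the p2p network
        self._blocks: Dict[str, bytes] = {}

    def build(self, header: str, body: bytes, value_wei: int) -> HeaderBid:
        self._blocks[header] = body        # the body stays on the builder machine
        return HeaderBid(self.pubkey, value_wei, header)

    def on_signed_header(self, signed_header: str) -> None:
        # Step 5: publish the block that the signed header commits to.
        self._network.append(self._blocks[signed_header])
```

In this model the only data crossing the builder-relay boundary is the header and the bid value, which is the property the v2 design relies on.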
123 | 124 | 125 | #### What is the relay doing? 126 | Under this design, the relay no longer receives the execution body of the block that the builder constructed. 127 | This removes another piece of the trust between the builder and the relay 128 | because it removes the relay's ability to steal MEV. The block never leaves the builder's 129 | machine until the builder receives a signed header committing to that block. This leaves the 130 | task of publishing the signed block to the builder, who is incentivized to do so 131 | in order to earn the reward associated with the publication. The relay has one new task under 132 | this paradigm, which is to observe the mempool and await the signed block associated with 133 | the header that won the auction. If the block does not appear on time, then the proposer 134 | must be refunded in the same manner as v1. The relay role has evolved to 135 | 136 | - Storing the builder header ~~and execution body $\implies$ *Trust assumption 1: the builder trusts the relay not to steal their MEV.*~~ 137 | - ~~Validating the body $\implies$ *Trust assumption 2: the proposer trusts the relay 138 | to provide a valid header to sign.*~~ 139 | - ~~Validating the payment $\implies$ *Trust assumption 3: the proposer trusts the relay 140 | to check that the block pays them.*~~ 141 | - ~~Publishing the winning block $\implies$ *Trust assumption 4: the builder trusts the relay to publish the winning block.*~~ 142 | - Refunding proposers who signed an invalid header $\implies$ *Trust assumption 5: the proposer trusts the relay to refund them in the case of an invalid block.* 143 | - **[new]** Observing the mempool $\implies$ *Trust assumption 6: the proposer trusts the relay to refund them in the case of a missing block.* 144 | 145 | > This design has 2 additional practical benefits. (1) It completely removes the bandwidth 146 | intensive process of transmitting the block from the builder to the relay.
This is a 147 | [zero copy](https://en.wikipedia.org/wiki/Zero-copy) approach as the block only ever 148 | resides on the builder machine until it is published. This becomes increasingly important as the size of blocks grows as a result of the sharding roadmap, and further reduces the hardware and network requirements to run an independent relay. (2) The relay is no longer able to censor. 149 | The relay only proxies the header, and thus has no ability to determine information about 150 | the transactions in the execution body. 151 | 152 | 153 | ## Optimistic Relay v3 — "Relay as an oracle" 154 | The figure below demonstrates the block lifecycle under Optimistic Relaying v3. 155 | 156 |
157 | non-optimistic 158 | 159 | 1. The builder submits a bid to the mempool, which contains only the header and a value. 160 | 2. The proposer listens to the mempool and selects a header. 161 | 3. The proposer signs the header and publishes it to the mempool. 162 | 4. The builder listens to the mempool for the signed header corresponding to their bid. 163 | 5. The builder publishes the signed block. 164 |
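With builders and proposers talking through the mempool as in the steps above, an observer only needs the arrival times of the signed header and the signed block to decide whether the builder is at fault. The function below is a sketch of that decision. The deadline parameters and the convention of passing `None` for a message that was never seen (or a block that was invalid) are assumptions for illustration, not values taken from any specification.

```python
from typing import Optional


def refund_due(signed_header_at: Optional[float],
               block_at: Optional[float],
               header_deadline: float,
               block_deadline: float) -> bool:
    """Return True when the builder's collateral should refund the proposer.

    Times are seconds into the slot. None means the message was never
    observed; an invalid block is treated as never observed.
    """
    header_on_time = (signed_header_at is not None
                      and signed_header_at <= header_deadline)
    block_on_time = block_at is not None and block_at <= block_deadline
    # Builder is at fault only if the proposer did their part on time.
    return header_on_time and not block_on_time
```

If the signed header itself is late or missing, the function returns `False`, since in that case the missed slot is not attributable to the builder.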
165 | 166 | 167 | #### What is the relay doing? 168 | Notice how the relay no longer plays an active role in the block building flow. 169 | The relay becomes an oracle that observes the mempool for the case where 170 | the proposer signed a header on time, but the builder did not 171 | publish a valid block on time. This situation would be treated exactly as in v1 & v2, where 172 | the proposer is refunded for their missed slot using the builder collateral. In this 173 | last iteration, the relay role is reduced to 174 | 175 | - ~~Storing the builder header and execution body $\implies$ *Trust assumption 1: the builder trusts the relay not to steal their MEV.*~~ 176 | - ~~Validating the body $\implies$ *Trust assumption 2: the proposer trusts the relay 177 | to provide a valid header to sign.*~~ 178 | - ~~Validating the payment $\implies$ *Trust assumption 3: the proposer trusts the relay 179 | to check that the block pays them.*~~ 180 | - ~~Publishing the winning block $\implies$ *Trust assumption 4: the builder trusts the relay to publish the winning block.*~~ 181 | - Refunding proposers who signed an invalid header $\implies$ *Trust assumption 5: the proposer trusts the relay to refund them in the case of an invalid block.* 182 | - Observing the mempool $\implies$ *Trust assumption 6: the proposer trusts the relay to refund them in the case of a missing block.* 183 | 184 | > Note that the proposer payments could be implemented in an unconditional way. Such a mechanism was [presented](https://github.com/flashbots/mev-boost/issues/109) 185 | by Alex O. and Stephane. This would reduce the trust assumptions to zero. The tradeoff here 186 | is the engineering complexity and inherent risk of using smart contracts to implement this logic.
187 | This may well be worth investing time into in the future, but we estimate that in the short-term, the 188 | relay operators (or [guarantors](https://github.com/michaelneuder/opt-relay-docs/blob/main/proposal.md#relay-collateralization)) should handle the refunds manually. 189 | 190 | ## ePBS — "Replace the relay with a committee" 191 | The final evolution of this roadmap is to replace the v3 relay with a committee of 192 | validators and enshrine PBS into the protocol. The specifics of the mechanism 193 | can vary; Vitalik has proposed [single-slot](https://ethresear.ch/t/single-slot-pbs-using-attesters-as-distributed-availability-oracle/11877) and 194 | [two-slot](https://ethresear.ch/t/two-slot-proposer-builder-separation/10980). 195 | Under the two-slot implementation (which seems to be the most popular currently), 196 | the proposer chooses a header to include in their beacon block. 197 | One committee attests to this block and once a builder is confident that the block 198 | will not be reorged, they publish an intermediate block with the full execution payload that the 199 | remaining committees attest to. 200 | 201 | Note how the first committee attests 202 | to the timely publication of a signed header and the remaining committees attest to the timely publication of a signed block, which is exactly the role that the 203 | relay plays in v3. 204 | 205 | -------------------------------------------------------------------------------- /towards-epbs.md: -------------------------------------------------------------------------------- 1 | # Towards enshrined PBS — an optimistic roadmap 2 | 3 | ###### _"This has been a latency awakening" - Justin Drake (March 1, 2023)_ 4 | 5 | ### Purpose 6 | Present a roadmap towards enshrined PBS (ePBS) through a series of modifications 7 | to the existing `mev-boost` [relay](https://github.com/flashbots/mev-boost-relay). 
By progressively removing the relay responsibilities and pruning the 8 | critical path of the block-building pipeline, 9 | we aim to converge to a system that looks quite similar to [existing](https://ethresear.ch/t/two-slot-proposer-builder-separation/10980) [proposals](https://ethresear.ch/t/single-slot-pbs-using-attesters-as-distributed-availability-oracle/11877) for ePBS. 10 | 11 | ### Rationale 12 | 1. **Agility** — We aim to approach this protocol upgrade as suggested by Justin in the [Censorship Panel](https://www.youtube.com/watch?v=Z9VCdiSPJEQ&t=2729s) at SBC. To front-load the 13 | R&D effort, we can iterate quickly by experimenting with a portion of existing 14 | relays, builders, and validators to reduce uncertainty and risk around full ePBS. 15 | There are tradeoffs between ePBS and `mev-boost` as highlighted by Barnabé at [Devconnect](https://youtu.be/jQjBNbEv9Mg?t=943) and in [Notes on PBS](https://barnabe.substack.com/i/82304191/market-structure-and-allocation-mechanism). This roadmap explores the 16 | design space between the two ends of the spectrum. 17 | 2. **Inevitability** — At [Devconnect](https://www.youtube.com/watch?v=OD54WfVuDWw&t=818s), Vitalik noted that with Danksharding, some builder separation becomes mandatory because bandwidth 18 | requirements for large (32 MB) blocks exceed what is within reach of a home-staker. This roadmap 19 | allows us to progress the research discussion by examining how different mechanisms work in practice. 20 | 3. **Accessibility** — By presenting relay data and designing the optimistic architecture in the open, we increase visibility into the 21 | existing block-building market. This helps answer research questions (e.g., 22 | [ROP-0](https://efdn.notion.site/ROP-0-Timing-games-in-Proof-of-Stake-385f0f6279374a90b52bf380ed76a85b) from Barnabé and Caspar) and allows 23 | independent relays to stay competitive with vertically integrated Builder-Relays.
24 | Additionally, optimistic relaying reduces the hardware and networking 25 | [resources](https://collective.flashbots.net/t/ideas-for-incentivizing-relays/586) required for a relay to be competitive, lowering the barrier to entry. From the builder perspective, 26 | optimistic building requires collateral to be posted, but it removes the need for them to be in the 27 | high-prio queue of each relay they connect to. 28 | 29 | We define the "critical path" as the set of operations and messages that compose 30 | the block production pipeline beginning when the builder submits a bid. 31 | We present a progression of 3 relay modifications that each reduce the latency associated 32 | with the critical path. 33 | 34 | > Latency is a centralizing force in the competitive block-building market. Since 35 | MEV is constantly being produced and extracted, rational 36 | builders are incentivized to colocate with relays, form trusted connections to relays, or even run 37 | their own relay infrastructure in order to minimize (down to $\mu s$ or $ns$) the time between when they build a block 38 | and when that header is part of the auction. We believe that by simplifying the pipeline, we can maintain a competitive and open builder ecosystem, while working to phase out relays altogether. 39 | 40 | 41 | ## Block production today 42 | The figure below schematizes the critical path under the existing `mev-boost` 43 | architecture. 44 |
45 | non-optimistic 46 | 47 | 1. The builder submits a bid that contains the full execution body 48 | of the block to the relay. 49 | 2. The relay simulates the block; the simulation must complete before the bid can win the auction. 50 | 3. The proposer sends the signed header back to the relay and the relay publishes the full block. 51 |
52 | 53 | #### What is the critical path? 54 | The critical path runs through the relay, which is tasked with 55 | - receiving the full block over the network from the builder (this can be several MB of data and will increase with 4844), 56 | - validating the block against an execution node internally before communicating the winning header to the proposer, and 57 | - receiving the signed header from the proposer and publishing\* the block. 58 | 59 | In practice, this entire process can take hundreds of milliseconds, even when running on 60 | high-performance hardware with good network connectivity. 61 | 62 | > \* Note that the winning block is returned to the proposer, 63 | so even if the relay doesn't publish the block, the proposer should and is incentivized to do so. 64 | 65 | ## Optimistic Relay v1 — "Asynchronous block validation" 66 | The figure below denotes the critical path under Optimistic Relaying v1. 67 | This idea is [proposed](https://github.com/michaelneuder/opt-relay-docs/blob/main/proposal.md) and [implemented](https://github.com/flashbots/mev-boost-relay/pull/285); it 68 | was also discussed in the [MEV community call #0](https://collective.flashbots.net/t/mev-boost-community-call-0-23-feb-2023/1348). 69 | 70 |
71 | non-optimistic 72 | 73 | 1. The builder submits a bid that contains the full execution body 74 | of the block to the relay. 75 | 2. The relay immediately marks the bid as eligible to win the auction. 76 | 3. The proposer sends the signed header back to the relay and the relay publishes the full block. 77 |
78 | 79 | 80 | #### What is the critical path? 81 | This flow differs only slightly from the previous; the relay does not immediately 82 | validate the block sent from the builder. This results in a reduction of 83 | the latency between when a builder submits a bid to the relay and when that bid 84 | becomes eligible to win the auction. This benefits both the builders and 85 | the proposers because it allows bids to arrive later in the slot, thus capturing more 86 | MEV for both parties. Under this design, builders can dishonestly submit high bids for invalid blocks 87 | that end up winning the auction, which results in a missed slot (because the proposer 88 | signed an invalid header). Alternatively, EVM-valid blocks could be proposed and included but not pay the proposer 89 | the value of the bid. We account for this by requiring builders to post collateral to 90 | the relay, which will be used to refund proposers if a slot or payment is missed. See the [proposal](https://github.com/michaelneuder/opt-relay-docs/blob/main/proposal.md), [implementation](https://github.com/flashbots/mev-boost-relay/pull/285), and [community call](https://collective.flashbots.net/t/mev-boost-community-call-0-23-feb-2023/1348) for further details. 91 | This proposal should lower the hardware and networking requirements of running a relay because now there is no need for large amounts of burst compute and bandwidth right at the end of the slot. 92 | 93 | > The end of the slot is highly congested because all the highest bids are arriving in the final milliseconds before the proposer calls `getHeader`. This exacerbates the issue of simulation latency and further 94 | benefits builders who are vertically integrated with relays. 95 | 96 | Beyond the practical benefits mentioned above, we modified the critical path.
Now, the relay is responsible for 97 | 98 | - receiving the full block over the network from the builder (this can be several MB of data and will increase with 4844), 99 | - communicating the winning header to the proposer, and 100 | - receiving the signed header from the proposer and publishing the block. 101 | 102 | The relay still validates each block it receives, but this validation happens asynchronously and 103 | is therefore no longer part of the critical path. This demonstrates the main objective of this roadmap — "to reduce the duties of the relay 104 | and the latency of the critical path". 105 | 106 | ## Optimistic Relay v2 — "Header-only parsing" 107 | The figure below demonstrates the critical path under Optimistic Relaying v2. 108 | 109 |
110 | non-optimistic 111 | 112 | 1. The builder submits a bid to the relay. 113 | 2. The relay parses the incoming message just for the bid details and then marks the bid as eligible to win the auction. 114 | 3. The proposer sends the signed header back to the relay and the relay publishes the full block. 115 |
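To make step 2 above concrete, here is a sketch of reading only the bid details from the front of a submission and leaving the body unread. The wire format (a 4-byte big-endian length prefix, JSON bid details, then the raw body) is invented for this example and is not the format builders actually use; `encode_submission` and `read_bid_details` are hypothetical helpers.

```python
import io
import json
import struct
from typing import Any, BinaryIO, Dict


def encode_submission(bid_details: Dict[str, Any], body: bytes) -> bytes:
    """Assumed framing: [4-byte length][JSON bid details][execution body]."""
    details = json.dumps(bid_details).encode()
    return struct.pack(">I", len(details)) + details + body


def read_bid_details(stream: BinaryIO) -> Dict[str, Any]:
    """Read just enough of the stream to learn the bid details.

    With a real socket, read() may return fewer bytes than requested and
    would need a loop; io.BytesIO always returns the full amount here.
    """
    (length,) = struct.unpack(">I", stream.read(4))
    return json.loads(stream.read(length))


# Usage: the bid can be marked eligible while the body is still in flight.
details_in = {"value_wei": 5, "header": "h1"}
stream = io.BytesIO(encode_submission(details_in, b"\x00" * 1000))
details_out = read_bid_details(stream)
unread_body_bytes = len(stream.read())
```

The point of the design is that the few hundred bytes of bid details arrive well before the rest of the payload, so the auction can be updated without waiting for the body to download.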
116 | 117 | 118 | #### What is the critical path? 119 | The critical path still runs through the relay, which is tasked with 120 | - parsing the incoming message from the builder for just the bid details (this is a few hundred bytes), 121 | - communicating the winning header to the proposer, and 122 | - receiving the signed header from the proposer and publishing the block. 123 | 124 | Here we remove the additional latency of waiting for the whole block execution body to 125 | download to the relay (which will become even more relevant with full Danksharding). 126 | This can result in an invalid or missing block from the builder, 127 | in which case the proposer refund is handled in the same manner as v1. 128 | 129 | > This design has an additional practical benefit. The relay is no longer able to censor because the bid is eligible when the header is parsed. The relay doesn't yet have information about the transactions in the execution body, so it can't censor based on those transactions. 130 | 131 | 132 | ## Optimistic Relay v3 — "Relay as an oracle" 133 | The figure below demonstrates the critical path under Optimistic Relaying v3. 134 | 135 |
136 | non-optimistic 137 | 138 | 1. The builder submits a header-only bid to the mempool (could be referred to as a "bidpool"). 139 | 2. The proposer sends the signed header back to the mempool and the builder publishes the full block. 140 |
141 | 142 | #### What is the critical path? 143 | Note that now the relay is no longer in the critical path! The builders 144 | and proposers communicate directly through the p2p layer. We assume that the builders will be well-connected nodes in the p2p network because they are incentivized to have extremely short 145 | paths to the proposers, thus the messages will still be fast. 146 | 147 | 148 | The critical path is now just 149 | - the proposer listening over the p2p network for header-only bids (this is a few hundred bytes), and 150 | - the builder listening over the p2p network for a signed header corresponding to their bid. 151 | 152 | The relay is still present in this architecture, but only serves as an oracle 153 | in the case of a missed slot. The relay observes the mempool and determines if 154 | (a) a signed header was produced on time, but (b) the corresponding signed block was not produced on time. In this situation, the builder is at fault, and their collateral is again used to refund 155 | the proposer as in v1 \& v2. 156 | 157 | > Note that the proposer payments could be implemented in an unconditional way. Such a mechanism was [presented](https://github.com/flashbots/mev-boost/issues/109) 158 | by Alex O. and Stephane. This would eliminate the need for the relay to directly control the builder collateral, and reduce the relay to a data-availability oracle for the signed block appearing on time. The tradeoff here 159 | is the engineering complexity and inherent risk of using smart contracts to implement this logic. 160 | This may well be worth investing time into in the future, but we estimate that in the short-term, the 161 | relay operators (or [guarantors](https://github.com/michaelneuder/opt-relay-docs/blob/main/proposal.md#relay-collateralization)) should handle the refunds manually. 162 | 163 | ## ePBS — "Replace the relay with a committee" 164 | The figure below demonstrates the critical path under ePBS. 165 | 166 |
167 | non-optimistic 168 | 169 | 1. The builder submits a header-only bid to the mempool. 170 |
171 | 172 | #### What is the critical path? 173 | The final evolution of this roadmap is to replace the v3 relay with a committee of 174 | validators and enshrine PBS into the protocol. The specifics of the mechanism 175 | can vary; Vitalik has proposed [single-slot](https://ethresear.ch/t/single-slot-pbs-using-attesters-as-distributed-availability-oracle/11877) and 176 | [two-slot](https://ethresear.ch/t/two-slot-proposer-builder-separation/10980). 177 | Under the two-slot implementation (which seems to be the most popular currently), 178 | the proposer chooses a header to include in their beacon block and publishes it in their slot. 179 | One committee attests to this block and once a builder is confident that the block 180 | will not be reorged, they publish an intermediate block in the subsequent slot that contains the full execution payload. The remaining committees attest to this intermediate block. The enforcement 181 | of honest behavior is implemented in the fork-choice rule, so that if a builder produces a 182 | block that differs from the winning header, the committee members simply do not attest to it. 183 | 184 | > The first committee attests 185 | to the timely publication of a signed header and the remaining committees attest to the timely publication of a signed block, which is the exact set of actions that the 186 | relay listens for in v3. 187 | 188 | --------------------------------------------------------------------------------