├── .gitignore
├── .nojekyll
├── LICENSE
├── README.md
├── assets
│   └── pdf
│       ├── ethcc2021.pdf
│       ├── notes-georgios.pdf
│       └── rig-ethcc.pdf
├── auctions
│   └── notebooks
│       └── fpa_hybrid_sim.ipynb
├── eip1559
│   ├── combination.md
│   ├── fixedesc.jpeg
│   ├── floatingesc.jpeg
│   ├── floatingescfixedtip.jpeg
│   └── notes-call3.md
├── ethdata
│   ├── extract.sh
│   └── notebooks
│       ├── explore_data.Rmd
│       ├── explore_data.html
│       └── gas_weather_reports
│           ├── exploreJuly21.Rmd
│           └── exploreJuly21.html
├── index.html
├── posdata
│   ├── README.md
│   ├── default.html5
│   ├── index.html
│   ├── notebooks
│   │   ├── lib.R
│   │   ├── mainnet_compare.Rmd
│   │   ├── mainnet_explore.Rmd
│   │   ├── mainnet_explore.html
│   │   ├── medalla_explore.Rmd
│   │   ├── pyrmont_compare.Rmd
│   │   ├── pyrmont_explore.Rmd
│   │   └── uptime_reward_gif.R
│   └── scripts
│       ├── 20210424_plots.R
│       ├── 20210908_slowdown.R
│       └── 20210918_altair_sync.R
├── static
│   ├── authorFormatting.js
│   ├── component-library.js
│   ├── footer.js
│   ├── header.js
│   ├── index.css
│   ├── jupyter.css
│   ├── react-dom.development.js
│   ├── react.development.js
│   ├── referencesFormatting.js
│   ├── rig.png
│   ├── sectionFormatting.js
│   └── theme-light.css
└── supply-chain-health
    └── README.md
/.gitignore:
--------------------------------------------------------------------------------
1 | *.pyc
2 | .DS_Store
3 | __pycache__
4 | econ-review.docx
5 | .ipynb_checkpoints
6 | reviews/
7 | visuals/
8 | temp
9 | .Rproj.user
10 | cadCAD/
11 | ethdata/scripts/default_url.R
12 | venv/
13 | data/
14 |
--------------------------------------------------------------------------------
/.nojekyll:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ethereum/rig/6ad58089cf2c36d2632b0d9e70a02e41ef3b2a28/.nojekyll
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | The Robust Incentives Group is an Ethereum Foundation research team dedicated to the study of protocol mechanisms with the lens of game theory, mechanism design, crypto-economics, formal methods, and data science.
2 |
3 | For a complete directory of our papers, posts, presentations, as well as "RIG's Open Problems (ROPs)", please visit https://rig.ethereum.org.
4 |
--------------------------------------------------------------------------------
/assets/pdf/ethcc2021.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ethereum/rig/6ad58089cf2c36d2632b0d9e70a02e41ef3b2a28/assets/pdf/ethcc2021.pdf
--------------------------------------------------------------------------------
/assets/pdf/notes-georgios.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ethereum/rig/6ad58089cf2c36d2632b0d9e70a02e41ef3b2a28/assets/pdf/notes-georgios.pdf
--------------------------------------------------------------------------------
/assets/pdf/rig-ethcc.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ethereum/rig/6ad58089cf2c36d2632b0d9e70a02e41ef3b2a28/assets/pdf/rig-ethcc.pdf
--------------------------------------------------------------------------------
/eip1559/combination.md:
--------------------------------------------------------------------------------
1 | # Combination EIP1559 / escalator
2 |
3 | **TL;DR:** We present three models for combining EIP1559 and escalator. Of the three, only one really makes sense for us (the _floating escalator_ model), while the other two (_thresholded escalator_ and _fixed escalator_) are presented for the sake of providing a complete exploration of the design space.
4 |
5 | ## Base dynamics and parameters
6 |
7 | ### Base parameters
8 |
9 | - `c` = target gas used
10 | - `1 / d` = max rate of change
11 | - `g[t]` = gas used by block t
12 | - `b[t]` = basefee at block t
13 | - `p[t]` = premium at block t
14 |
15 | ### Dynamics
16 |
17 | **EIP 1559 dynamics**
18 |
19 | - `b[t+1] = b[t] * (1 + (g[t] - c) / c / d)`
20 |
21 | **Linear escalator, given `startblock`, `endblock`, `startpremium` and `maxpremium`**
22 |
23 | - `p[t] = startpremium + (t - startblock) / (endblock - startblock) * (maxpremium - startpremium)`
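
The two update rules above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `next_basefee` and `linear_premium` are hypothetical, and the numeric values (`c = 10`, `d = 8`) are chosen only to make the arithmetic easy to follow.

```python
def next_basefee(basefee, gas_used, target, d):
    """EIP 1559 update: b[t+1] = b[t] * (1 + (g[t] - c) / c / d)."""
    return basefee * (1 + (gas_used - target) / target / d)


def linear_premium(t, startblock, endblock, startpremium, maxpremium):
    """Linear escalator premium p[t], meant for startblock <= t <= endblock."""
    progress = (t - startblock) / (endblock - startblock)
    return startpremium + progress * (maxpremium - startpremium)


# Illustrative values only: a block using twice the target raises basefee by 1/d.
print(next_basefee(100, gas_used=20, target=10, d=8))
# An empty block lowers basefee by 1/d.
print(next_basefee(100, gas_used=0, target=10, d=8))
# Halfway through the escalator, the premium is halfway between its bounds.
print(linear_premium(5, startblock=0, endblock=10, startpremium=1, maxpremium=3))
```

The formula does not say what happens outside `[startblock, endblock]`, so the sketch leaves that case unhandled rather than guessing a clamping rule.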
24 |
25 | ## Thresholded escalator
26 |
27 | **Intuition:** Vanilla escalator with the condition that a bid cannot be included if the `gasprice` is lower than the current `basefee`.
28 |
29 | ### User-specified parameters
30 |
31 | - `startbid`
32 | - `startblock`
33 | - `endblock`
34 | - `maxpremium`
35 |
36 | ### Computed parameters
37 |
38 | - `startpremium = 0`
39 |
40 | ### Gas price
41 |
42 | ```python
43 | gasprice[t] = startbid + p[t]
44 |
45 | # Include only if
46 | assert gasprice[t] >= b[t]
47 | ```
48 |
49 | ### Pros/cons
50 |
51 | #### Pros
52 |
53 | - "Pure" escalator, only modulated by the presence of the basefee which determines inclusion or not.
54 | - Wallets can default to `startbid = b[t]`. This is the _fixed escalator_ model.
55 |
56 | #### Cons
57 |
58 | - Cannot express the simple EIP 1559 strategy (basefee + fixed premium) under this model.
59 |
60 | ## Fixed escalator
61 |
62 | **Intuition:** Vanilla escalator with a reasonable `startbid` parameter provided by the current `basefee`.
63 |
64 | ### User-specified parameters
65 |
66 | - `startblock`
67 | - `endblock`
68 | - `maxfee`
69 | - `startpremium`
70 |
71 | ### Computed parameters
72 |
73 | - `maxpremium = maxfee - b[startblock]`
74 |
75 | ### Gas price
76 |
77 | ```python
78 | gasprice[t] = min(
79 | max(b[startblock] + p[t], b[t]),
80 | b[startblock] + maxpremium
81 | )
82 |
83 | # Include only if
84 | assert gasprice[t] >= b[t]
85 | ```
86 |
87 | - The gas price is set to either the current basefee `b[t]` OR the basefee at the start of the escalator `b[startblock]` + current premium `p[t]`, whichever is higher, bounded above by the maxfee.
88 | - Setting `startpremium = 0` means starting bid = basefee.
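
As a rough sketch of the fixed escalator rule, assuming the basefee history is available as a mapping from block number to basefee (the names `fixed_escalator_gasprice` and `basefees`, and all numbers, are illustrative assumptions):

```python
def linear_premium(t, startblock, endblock, startpremium, maxpremium):
    """Linear escalator premium p[t]."""
    progress = (t - startblock) / (endblock - startblock)
    return startpremium + progress * (maxpremium - startpremium)


def fixed_escalator_gasprice(t, basefees, startblock, endblock, startpremium, maxfee):
    """Gas price at block t under the fixed escalator, or None if not includable.

    `basefees` maps block number -> basefee (hypothetical input format).
    """
    b_start = basefees[startblock]
    maxpremium = maxfee - b_start  # computed parameter
    p = linear_premium(t, startblock, endblock, startpremium, maxpremium)
    price = min(max(b_start + p, basefees[t]), b_start + maxpremium)
    # Include only if the bid covers the current basefee.
    return price if price >= basefees[t] else None


# Illustrative basefee path: stable, then rising past the user's maxfee.
basefees = {0: 10, 1: 10, 2: 14, 3: 30}
for t in range(4):
    print(t, fixed_escalator_gasprice(t, basefees, 0, 4, startpremium=0, maxfee=18))
```

With these numbers the bid starts at the basefee, climbs along the escalator while the basefee is stable, and becomes non-includable once the basefee exceeds `maxfee`.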
89 |
90 | 
91 | _Bid in solid purple line, basefee in blue._
92 |
93 | ### Pros/cons
94 |
95 | #### Pros
96 |
97 | - Respects intuition of `basefee` as good default current price + escalating tip.
98 | - For stable `basefee`, looks like escalator with a well-defined `startbid`.
99 |
100 | #### Cons
101 |
102 | - Gas price can rise faster than the escalator would plan, if basefee increases faster than the escalator slope. Should the premium follow? See "floating escalator started on basefee".
103 | - Cannot express the simple EIP 1559 strategy (basefee + fixed premium) under this model.
104 |
105 | ## Floating escalator
106 |
107 | **Intuition:** The "true" EIP 1559 with escalating tips. The user specifies an escalator for the tip, which is always added to the current basefee, as opposed to the basefee at `startblock` in the fixed escalator. Users specifying a steeper escalator "take off" above other users, expressing their higher time preferences.
108 |
109 | ### User-specified parameters
110 |
111 | - `startblock`
112 | - `endblock`
113 | - `startpremium`
114 | - `maxfee` OR `maxpremium` OR both.
115 |
116 | ### Computed parameters
117 |
118 | - If `maxfee` is given: `maxpremium = maxfee - (b[startblock] + startpremium)`
119 | - If `maxpremium` is given: `maxfee = b[startblock] + maxpremium`
120 | - If both are given, there is nothing to compute.
121 |
122 | ### Gas price
123 |
124 | ```python
125 | gasprice[t] = min(
126 | b[t] + p[t],
127 | maxfee
128 | )
129 |
130 | # Include only if
131 | assert gasprice[t] >= b[t]
132 | ```
133 |
134 | The gas price is set to the current basefee `b[t]` plus the current premium `p[t]`, bounded above by `maxfee`.
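
A minimal runnable version of the floating escalator rule might look as follows. The function name `floating_escalator_gasprice` and the sample values are assumptions made for illustration. The sketch takes `maxpremium` and `maxfee` as both given, which is one of the options listed above.

```python
def linear_premium(t, startblock, endblock, startpremium, maxpremium):
    """Linear escalator premium p[t]."""
    progress = (t - startblock) / (endblock - startblock)
    return startpremium + progress * (maxpremium - startpremium)


def floating_escalator_gasprice(t, basefee_t, startblock, endblock,
                                startpremium, maxpremium, maxfee):
    """Gas price at block t: current basefee plus escalating premium, capped at maxfee.

    Returns None when the capped price no longer covers the current basefee.
    """
    p = linear_premium(t, startblock, endblock, startpremium, maxpremium)
    price = min(basefee_t + p, maxfee)
    return price if price >= basefee_t else None


# Same block and escalator, three different basefee levels (illustrative numbers).
for basefee_t in (10, 19, 25):
    print(basefee_t, floating_escalator_gasprice(
        2, basefee_t, startblock=0, endblock=4,
        startpremium=1, maxpremium=5, maxfee=20))
```

The three calls show the regimes discussed in this section: the premium rides on top of the basefee, then the bid hits `maxfee`, then the transaction stops being includable.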
135 |
136 | 
137 | _Bid in solid purple line, basefee in blue._
138 |
139 | ### Pros/cons
140 |
141 | #### Pros
142 |
143 | - Respects intuition of `basefee` as good default current price + escalating tip.
144 | - For stable `basefee`, looks like escalator with a well-defined `startbid`.
145 | - For unstable `basefee`, escalates tip in excess of the current basefee, unlike the fixed escalator.
146 | - Setting `startpremium = maxpremium` and some `maxfee`, this is equivalent to the EIP 1559 paradigm (with `endblock` far into the future).
147 |
148 | 
149 | _Bid in solid purple line, basefee in blue._
150 |
151 | #### Cons
152 |
153 | - "Double dynamics" of basefee varying + tip varying, maybe hard to reason about.
154 | - You can reach your `maxfee` much faster than you intended if `basefee` increases during the transaction lifetime.
155 |
--------------------------------------------------------------------------------
/eip1559/fixedesc.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ethereum/rig/6ad58089cf2c36d2632b0d9e70a02e41ef3b2a28/eip1559/fixedesc.jpeg
--------------------------------------------------------------------------------
/eip1559/floatingesc.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ethereum/rig/6ad58089cf2c36d2632b0d9e70a02e41ef3b2a28/eip1559/floatingesc.jpeg
--------------------------------------------------------------------------------
/eip1559/floatingescfixedtip.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ethereum/rig/6ad58089cf2c36d2632b0d9e70a02e41ef3b2a28/eip1559/floatingescfixedtip.jpeg
--------------------------------------------------------------------------------
/eip1559/notes-call3.md:
--------------------------------------------------------------------------------
1 | # EIP 1559 implementers' call #3 notes
2 |
3 | ## 1559 and the escalator
4 |
5 | I see two mutually exclusive paths:
6 |
7 | 1. Keeping the current transaction model with the escalator rule for the fee.
8 | 2. Adopting 1559 and then possibly adopting the escalator rule for the premium.
9 |
10 | If it is decided to combine 1559 and the escalator, I believe the [floating escalator](combination.md) is the best way to do so. It is the only option for which it is possible to unbundle the 1559 side and the escalator side, allowing us to implement 1559 first and decide later on (or in concert) whether the escalator rule should be proposed.
11 |
12 | As a reminder, the escalator rule governs the premium:
13 |
14 | - `p[t] = startpremium + (t - startblock) / (endblock - startblock) * (maxpremium - startpremium)`
15 |
16 | In the floating escalator, we simply add the escalating premium to the current basefee `b[t]`.
17 |
18 | ```python
19 | gasprice[t] = min(
20 | b[t] + p[t],
21 | maxfee
22 | )
23 |
24 | # Include only if
25 | assert gasprice[t] >= b[t]
26 | ```
27 |
28 | A user must decide `maxfee`, `startpremium`, `maxpremium`, `startblock` and `endblock`.
29 |
30 | In addition, it is possible even with the escalator rule to emulate the behaviour of a 1559 tx with parameters `gas_premium` and `maxfee`, by setting `startblock` to the current block, `endblock` to an outrageously far away block and `startpremium == maxpremium == gas_premium`. This should help for compatibility and UX if and once the escalator rule is adopted to move the premium value.
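
The emulation described above can be checked with a small sketch. Everything named here (`floating_gasprice`, `emulate_1559_gasprice`, the `FAR_AWAY` constant) is a hypothetical helper, not part of any client or of the EIP.

```python
FAR_AWAY = 10**9  # hypothetical stand-in for an "outrageously far away" endblock


def floating_gasprice(t, basefee_t, startblock, endblock,
                      startpremium, maxpremium, maxfee):
    """Floating escalator: basefee plus linear premium, capped at maxfee."""
    progress = (t - startblock) / (endblock - startblock)
    p = startpremium + progress * (maxpremium - startpremium)
    price = min(basefee_t + p, maxfee)
    return price if price >= basefee_t else None


def emulate_1559_gasprice(t, basefee_t, current_block, gas_premium, maxfee):
    """A plain 1559 tx expressed as a flat escalator (startpremium == maxpremium)."""
    return floating_gasprice(
        t, basefee_t,
        startblock=current_block, endblock=current_block + FAR_AWAY,
        startpremium=gas_premium, maxpremium=gas_premium, maxfee=maxfee)


# The premium stays at gas_premium no matter how long the tx has been pending.
print(emulate_1559_gasprice(100, 10, current_block=100, gas_premium=2, maxfee=20))
print(emulate_1559_gasprice(5000, 10, current_block=100, gas_premium=2, maxfee=20))
```

Because `maxpremium - startpremium` is zero, the slope term vanishes and the bid reduces to `min(b[t] + gas_premium, maxfee)`.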
31 |
32 | ## 2718 and 1559
33 |
34 | I don't have much to say about this. It seems 2718 offers a clean way to upgrade transaction patterns. This is perhaps helpful with the above?
35 |
36 | ## (In progress) Simulations
37 |
38 | We have the beginning of a more robust environment for agent-based simulations [here](abm1559.ipynb). We need to think through how agents should behave but initial tests show basefee converges quickly when the demand is at steady-state (e.g., same expectation of arrivals between two blocks). There is also support for escalator-style transactions but untested so far.
39 |
40 | Currently, agents can have one of two cost functions. In the first, they incur a fixed cost for each extra block they wait, set against some value for having their transaction included. In the second, this value is discounted over time (the later the inclusion, the smaller the value). Agents decide whether to enter based on their estimated profit: if they expect to realise a negative profit, they balk and do not submit their transaction.
41 |
42 | Note that _without the option to cancel their transaction (for free or at some predictable cost)_, an agent may realise a negative profit after all if their estimation was too optimistic. This violates ex post individual rationality.
43 |
44 | The current agent estimation of waiting time is pretty dumb (they simply expect to wait 5 blocks). A better estimator must depend on the submitted transaction parameters (the higher the premium/maxfee, the lower their expected waiting time) and could look like the estimators currently used by wallets. This will also be helpful to test these estimators empirically and decide on good transaction default values.
45 |
46 | ## (Important) Wallet defaults
47 |
48 | How should wallets set `max_fee` and `gas_premium`? We look for good default values to propose to users. In the current UX paradigm, users are presented with 4 options:
49 |
50 | - Three of them suggest values corresponding to "fast", "average" and "slow" inclusions.
51 | - Otherwise, users can set their own transaction values.
52 |
53 | Suppose a wallet offers defaults pegged to the basefee, e.g., three defaults $\rho_1 < \rho_2 < \rho_3$ such that proposed maxfees are $m_i = (1+\rho_i) b(t)$. Assuming users broadly follow wallet defaults (they seem to), miners now make a higher profit when basefee is higher, all else equal.
54 |
55 | It was suggested to default to a fixed premium for users, e.g., 1 Gwei, or the amount of Gwei that would exactly compensate a miner for the extra ommer risk of including the transaction in their block. The tip, however, will likely decide the speed of inclusion of the transaction, given that the tip is received by miners. We prefer high-value or time-sensitive transactions to get in first, and with a fixed premium we may not be able to discriminate between low- and high-value instances. The floating escalator can come in handy to help discriminate between the two.
56 |
57 | ### Pegged premium rule: A naive proposal that doesn't work
58 |
59 | A default that respects this intuition is pegging the premium to the proposed maxfee. We assume then that users only declare their maxfee and the premium is set in protocol, taking e.g. one hundredth of the declared maxfee.
60 |
61 | I value my transaction a lot and am ready to pay 10 Gwei for it. The default sets my premium to 10 / 100 = 0.1 Gwei. Someone else who values theirs less, e.g., is only ready to pay up to 5 Gwei for it, has their premium set to 5 / 100 = 0.05 Gwei. Miners prefer my transaction to theirs. This also collapses the number of parameters to set from 2 to 1.
62 |
63 | When the premium is equal to a fixed fraction of the maxfee, the tip induces a consistent ordering of transactions, in addition to representing exactly the miner profit. Whenever $m_i < m_j$ for the maxfees of two users $i$ and $j$, we _always_ have $p_i < p_j$ (premiums) and $t_i < t_j$ (tips).
64 |
65 | From an incentive-compatibility point of view, a user who wants to "game" the system by inflating their maxfee to inflate their tip exposes themselves to a high transaction fee, in the case where basefee increases before they are included.
66 |
67 | But there is a trivial strategy to defeat this rule: a user could declare a maxfee they would not be ready to pay and monitor the basefee, cancelling their transaction whenever basefee rises above their true (undeclared) maxfee. So the pegged premium rule is not incentive compatible.
68 |
69 | ## (Important) Client strategies
70 |
71 | We need to figure out how clients handle pending transactions. In the current paradigm, clients can simply rank and update their list of pending transactions based on the gasprice. This is *not true* when users can set both the maxfee and the premium! For instance, when basefee is equal to 5, consider these two users:
72 |
73 | | Basefee = 5 | Maxfee | Premium | Tip |
74 | |-|-|-|-|
75 | | **User A** | 10 | 8 | 5 |
76 | | **User B** | 15 | 6 | 6 |
77 |
78 | We like ranking by premiums since these do not vary over time. It means miners can easily update their pending transactions list. But ranking by premiums, a miner would prefer user A to user B, even though the miner would receive a greater payoff from including B.
79 |
80 | So we must rank by tips, in which case B is preferred. But tips are time-varying! Suppose basefee now drops to 2.
81 |
82 | | Basefee = 2 | Maxfee | Premium | Tip |
83 | |-|-|-|-|
84 | | **User A** | 10 | 8 | 8 |
85 | | **User B** | 15 | 6 | 6 |
86 |
87 | Now user A is preferred to B. Miners must re-rank all pending transactions between each block based on the new basefee.
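
The re-ranking problem can be reproduced with a short sketch, assuming the tip is `min(maxfee - basefee, premium)` as in the tables above. The helper names `tip` and `rank_by_tip` and the `pending` mapping are illustrative only.

```python
def tip(basefee, maxfee, premium):
    """Miner tip: min(maxfee - basefee, premium), or None if maxfee < basefee."""
    if maxfee < basefee:
        return None
    return min(maxfee - basefee, premium)


def rank_by_tip(txs, basefee):
    """Return (user, tip) pairs for includable transactions, highest tip first."""
    tips = {user: tip(basefee, maxfee, premium)
            for user, (maxfee, premium) in txs.items()}
    valid = [(user, t) for user, t in tips.items() if t is not None]
    return sorted(valid, key=lambda pair: pair[1], reverse=True)


pending = {"A": (10, 8), "B": (15, 6)}  # user -> (maxfee, premium), from the tables
for basefee in (5, 2):
    print(basefee, rank_by_tip(pending, basefee))
```

Sorting the whole pool at every block is the naive approach. It makes the cost of re-ranking explicit but says nothing about how a client should do it efficiently.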
88 |
89 | This issue compounds with time-varying premiums, as suggested in the [floating escalator](combination.md) for instance.
90 |
91 | Clients must also handle their memory -- by default I believe, clients only keep around the current 8092 most profitable transactions in their transaction pools. Should a client keep around a currently invalid transaction (one where current basefee is higher than maxfee) in the hope that when basefee lowers they will reap a good tip?
92 |
93 | When basefee is high, some high-premium transactions may be submerged.
94 |
95 | | Basefee = 10 | Maxfee | Premium | Tip |
96 | |-|-|-|-|
97 | | **User A** | 9 | 4 | - |
98 | | **User B** | 15 | 3 | 3 |
99 |
100 | But let the tide ebb, and the transaction is now preferred.
101 |
102 | | Basefee = 5 | Maxfee | Premium | Tip |
103 | |-|-|-|-|
104 | | **User A** | 9 | 4 | 4 |
105 | | **User B** | 15 | 3 | 3 |
106 |
107 | With some work, it is likely possible to find a good rule or heuristic that approximates the optimum well. This is something that we should discuss more with the Nethermind team too, as they raised this concern in their 1559 document.
108 |
109 | ## (Nice to have) Equilibrium strategy
110 |
111 | We can take a cue from [Huberman et al.](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3025604) and analyse the transaction fee market as a strategic game of queueing. Assuming all transactions have constant gas requirements, how should we define the game?
112 |
113 | - It is a batched service queue (a round of service includes a maximum of _K_ transactions). Normalise time units such that service happens deterministically each time step ($\mu = 1$).
114 | - There is one server/miner (logically, although practically the server/miner varies between services).
115 | - The server sets a _dynamic_ minimum fee (the basefee $b(t)$), observed by users before deciding whether to enter the queue or balk.
116 | - The dynamic fee depends on the congestion.
117 |
118 | We can use the model of users having some fixed value $v$ for the transaction, and random per-time-unit costs (distributed according to some CDF $F$). A user with per-time-unit cost $c$ served after $w$ time steps at time $t$ who submitted _tip_ $\overline{p}(t) = \min(maxfee - b(t), premium)$ has payoff $v - \overline{p}(t) - c \cdot w$. We look for equilibrium waiting times and strategies. Users come in following a Poisson arrival process of rate $\lambda$ (i.e., during $t$ time units, we expect $t\lambda$ arrivals).
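
The payoff expression above can be written out directly. This is only a sketch of the formula, with hypothetical names (`tip`, `payoff`) and made-up numbers. It does not model the queue or the equilibrium.

```python
def tip(basefee, maxfee, premium):
    """Tip paid to the miner at inclusion: min(maxfee - b(t), premium)."""
    return min(maxfee - basefee, premium)


def payoff(value, cost_per_step, wait, basefee, maxfee, premium):
    """User payoff v - tip - c * w for a transaction included after `wait` steps."""
    return value - tip(basefee, maxfee, premium) - cost_per_step * wait


# Illustrative numbers: v = 20, c = 1 per time step, served after w = 3 steps,
# with basefee 5, maxfee 10 and premium 8 at inclusion time.
print(payoff(20, 1, 3, basefee=5, maxfee=10, premium=8))
```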
119 |
120 | This differs from the Huberman et al. case since we have a time-varying basefee and thus time-varying premiums. In the Huberman et al. setting, there exists an equilibrium distribution of bids $G$ such that a player bids $p$ and expects payoff $v - p - c \cdot w(p|G)$, where $w(p|G)$ denotes that the waiting time $w$ depends on $p$ given $G$. $G$ is entirely determined by $F$ and $\lambda$.
121 |
122 | The equivalent of $G$ in EIP 1559 is the distribution over $\overline{p}$ which is what miners consider for inclusion. We look for the following properties:
123 |
124 | - Users with greater costs always offer greater tips, i.e., whenever $c_i \leq c_j$ for two users $i$ and $j$, $\overline{p}_i(t) \leq \overline{p}_j(t)$ for all $t$. In the case where all users propose the same premium, this is true if players with greater costs choose higher $maxfee$.
125 | - An equilibrium basefee $\overline{b}$ given $\lambda$ and $F$. Demand shocks are interpreted as increasing $\lambda$.
126 |
--------------------------------------------------------------------------------
/ethdata/extract.sh:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env zsh
2 |
3 | startblock=12965000
4 | endblock=12970000
5 |
6 | ethereumetl export_blocks_and_transactions --start-block $startblock --end-block $endblock \
7 | --provider-uri http://192.168.0.120:8545 --blocks-output blocks.csv --transactions-output transactions.csv
8 |
9 | ethereumetl extract_csv_column --input transactions.csv --column hash --output transaction_hashes.txt
10 |
11 | ethereumetl export_receipts_and_logs --transaction-hashes transaction_hashes.txt \
12 | --provider-uri http://192.168.0.120:8545 --receipts-output receipts.csv
13 |
14 | # Blocks
15 | # number,hash,parent_hash,nonce,sha3_uncles,logs_bloom,transactions_root,state_root,receipts_root,miner, (10)
16 | # difficulty,total_difficulty,size,extra_data,gas_limit,gas_used,timestamp,transaction_count,base_fee_per_gas (19)
17 |
18 | # Transactions
19 | # hash,nonce,block_hash,block_number,transaction_index,from_address,to_address,value, (8)
20 | # gas,gas_price,input,block_timestamp,max_fee_per_gas,max_priority_fee_per_gas,transaction_type (15)
21 |
22 | # Receipts
23 | # transaction_hash,transaction_index,block_hash,block_number,cumulative_gas_used,gas_used, (6)
24 | # contract_address,root,status,effective_gas_price (10)
25 |
26 | cut -d , -f 1,13,15,16,17,19 blocks.csv > blocks-cut.csv
27 | cut -d , -f 1,6 receipts.csv > receipts-cut.csv
28 | cut -d , -f 1,4,8,9,10,13,14,15 transactions.csv > transactions-cut.csv
29 |
30 | rm blocks.csv
31 | rm receipts.csv
32 | rm transactions.csv
33 | mv blocks-cut.csv data/bxs-$startblock-$endblock.csv
34 | mv receipts-cut.csv data/rxs-$startblock-$endblock.csv
35 | mv transactions-cut.csv data/txs-$startblock-$endblock.csv
36 |
--------------------------------------------------------------------------------
/ethdata/notebooks/explore_data.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Exploring blocks, gas and transactions"
3 | description: |
4 | A focus on the recent high gas prices, towards understanding high congestion regimes for EIP 1559.
5 | author:
6 | - name: Barnabé Monnot
7 | url: https://twitter.com/barnabemonnot
8 | affiliation: Robust Incentives Group, Ethereum Foundation
9 | affiliation_url: https://github.com/ethereum/rig
10 | date: "`r Sys.Date()`"
11 | output:
12 | distill::distill_article:
13 | toc: true
14 | toc_depth: 3
15 | ---
16 |
17 | While real world gallons of oil went negative, Ethereum gas prices have sustained a long period of high fees since the beginning of May. I wanted to dig in a bit deeper, with a view to understanding the fundamentals of the demand. Some of the charts below retrace steps that are very well-known to a lot of us -- these are mere restatements and updates. The data includes all blocks produced between May 4th, 2020, 13:22:16 UTC and May 19th, 2020, 19:57:17 UTC.
18 |
19 | [Onur Solmaz](https://twitter.com/onurhsolmaz) from Casper Labs wrote [a very nice post](https://solmaz.io/2019/10/21/gas-price-fee-volatility/) arguing that since we observe daily cycles, there must be something more than one-off ICOs and Ponzis at play.
20 |
21 |
24 |
25 | We will see these cycles here too, and a few more questions I thought were interesting (or at least, that I kinda knew the answer to but never had derived or played with myself). This is an excuse to play with my new [DAppNode](https://dappnode.io) full node, using the wonderful [ethereum-etl](https://github.com/blockchain-etl/ethereum-etl) package from Evgeny Medvedev to extract transaction and block details. This data will also be useful to calibrate good simulations for EIP 1559 (more on this soon!)
26 |
27 | ```{r setup, message = FALSE}
28 | library(tidyverse)
29 | library(here)
30 | library(glue)
31 | library(lubridate)
32 | library(forecast)
33 | library(infer)
34 | library(matrixStats)
35 | library(rmarkdown)
36 | library(knitr)
37 | library(skimr)
38 |
39 | options(digits=10)
40 | options(scipen = 999)
41 |
42 | # Make the plots a bit less pixellated
43 | knitr::opts_chunk$set(dpi = 300)
44 |
45 | # A minimal theme I like (zero bonus point for using it though!)
46 | newtheme <- theme_grey() + theme(
47 | axis.text = element_text(size = 9),
48 | axis.title = element_text(size = 12),
49 | axis.line = element_line(colour = "#000000"),
50 | panel.grid.major = element_blank(),
51 | panel.grid.minor = element_blank(),
52 | panel.background = element_blank(),
53 | legend.title = element_text(size = 12),
54 | legend.text = element_text(size = 10),
55 | legend.box.background = element_blank(),
56 | legend.key = element_blank(),
57 | strip.text.x = element_text(size = 10),
58 | strip.background = element_rect(fill = "white")
59 | )
60 | theme_set(newtheme)
61 | ```
62 |
63 | ```{r}
64 | start_block <- 10000001
65 | end_block <- 10100000
66 | suffix <- glue("-", start_block, "-", end_block)
67 | ```
68 |
69 | ```{r message = FALSE, eval=FALSE}
70 | txs <- read_csv(here::here(glue("data/txs", suffix, ".csv")))
71 | txs <- txs %>% select(-block_timestamp)
72 | txs %>% glimpse()
73 | ```
74 |
75 | ```{r message=FALSE, eval=FALSE}
76 | txs_receipts <- txs %>%
77 | left_join(
78 | read_csv(here::here(glue("data/rxs", suffix, ".csv"))),
79 | by = c("hash" = "transaction_hash")) %>%
80 | arrange(block_number)
81 | saveRDS(txs_receipts, here::here(glue("data/txs", suffix, ".rds")))
82 | ```
83 |
84 | ```{r message=FALSE, cache=TRUE}
85 | txs_receipts <- readRDS(here::here(glue("data/txs", suffix, ".rds"))) %>%
86 | mutate(gas_fee = gas_price * gas_used) %>%
87 | mutate(gas_price = gas_price / (10 ^ 9),
88 | gas_fee = gas_fee / (10 ^ 9),
89 | value = value / (10 ^ 18))
90 | ```
91 |
92 | ```{r message=FALSE, cache=TRUE}
93 | blocks <- read_csv(here::here(glue("data/bxs", suffix, ".csv"))) %>%
94 | mutate(block_date = as_datetime(timestamp),
95 | prop_used = gas_used / gas_limit) %>%
96 | rename(block_number = number) %>%
97 | arrange(block_number)
98 |
99 | gas_prices_per_block <- blocks %>%
100 | select(block_number) %>%
101 | left_join(
102 | txs_receipts %>%
103 | group_by(block_number) %>%
104 | summarise(
105 | min_gas_price = min(gas_price),
106 | total_gas_used = sum(gas_used),
107 | avg_gas_price = sum(gas_fee) / total_gas_used,
108 | med_gas_price = weightedMedian(gas_price, w = gas_used),
109 | max_gas_price = max(gas_price)
110 | )
111 | ) %>%
112 | select(-total_gas_used)
113 |
114 | blocks <- blocks %>%
115 | left_join(gas_prices_per_block)
116 | ```
117 |
118 | ```{r message=FALSE, cache=TRUE}
119 | date_sample <- interval(ymd("2020-05-13"), ymd("2020-05-20"))
120 | blocks_sample <- blocks %>%
121 | filter(block_date %within% date_sample)
122 |
123 | txs_sample <- txs_receipts %>%
124 | semi_join(blocks_sample)
125 | ```
126 |
127 | ## Block properties
128 |
129 | ### Gas used by a block
130 |
131 | Miners have some control over the gas limit of a block, but how much gas do blocks generally use?
132 |
133 | ```{r}
134 | blocks %>%
135 | ggplot() +
136 | geom_histogram(aes(x = gas_used), bins = 1000, fill = "steelblue") +
137 | scale_y_log10()
138 | ```
139 |
140 | There are a few peaks, notably at 0 (the amount of gas used by an empty block) and towards the maximum gas limit set at 10,000,000. Let's zoom in on blocks that use more than 9,800,000 gas.
141 |
142 | ```{r}
143 | blocks %>%
144 | filter(gas_used >= 9.8 * 10^6) %>%
145 | ggplot() +
146 | geom_histogram(aes(x = gas_used), fill = "steelblue")
147 | ```
148 |
149 | We can also look at the proportion of gas used, i.e., the amount of gas used by the block divided by the total gas available in that block. Taking a moving average over the last 500 blocks, we obtain the following plot.
150 |
151 | ```{r}
152 | blocks_sample %>%
153 | mutate(ma_prop_used = ma(prop_used, 500)) %>%
154 | ggplot() +
155 | geom_line(aes(x = block_date, y = ma_prop_used), colour = "#FED152") +
156 | xlab("Block timestamp")
157 | ```
158 |
159 | Where does the dip on May 15th come from? Empty blocks? We plot how many empty blocks are found in chunks of 2000 blocks.
160 |
161 | ```{r}
162 | chunk_size <- 2000
163 | blocks_sample %>%
164 | mutate(block_chunk = block_number %/% chunk_size) %>%
165 | filter(gas_used == 0) %>%
166 | group_by(block_chunk) %>%
167 | summarise(block_date = min(block_date),
168 | `Empty blocks` = n()) %>%
169 | ggplot() +
170 | geom_point(aes(x = block_date, y = 1/2, size = `Empty blocks`),
171 | alpha = 0.3, colour = "steelblue") +
172 | scale_size_area(max_size = 12) +
173 | theme(
174 | axis.line.y = element_blank(),
175 | axis.text.y = element_blank(),
176 | axis.title.y = element_blank(),
177 | axis.ticks.y = element_blank(),
178 | ) +
179 | xlab("Block timestamp")
180 | ```
181 |
182 | It doesn't seem so.
183 |
184 | ### Relationship between block size and gas used
185 |
186 | Does the block weight (in _gas_) roughly correlate with the block size (in _bytes_)?
187 |
188 | ```{r}
189 | cor.test(blocks$gas_used, blocks$size)
190 | ```
191 |
192 | It does! But since most blocks have very high `gas_used` anyways, it pays to look a bit more closely.
193 |
194 | ```{r}
195 | blocks %>%
196 | ggplot() +
197 | geom_point(aes(x = gas_used, y = size), alpha = 0.1, colour = "steelblue") +
198 | scale_y_log10() +
199 | xlab("Gas used per block") +
200 | ylab("Block size (in bytes)")
201 | ```
202 |
203 | We use a logarithmic scale for the y-axis. There is definitely a big spread around the 10 million gas limit. Does the block size correlate with the number of transactions instead then?
204 |
205 | ```{r cache=TRUE}
206 | blocks_num_txs <- blocks %>%
207 | left_join(
208 | txs_receipts %>%
209 | group_by(block_number) %>%
210 | summarise(n = n())
211 | ) %>%
212 | replace_na(list(n = 0))
213 | ```
214 |
215 | ```{r}
216 | blocks_num_txs %>%
217 | ggplot() +
218 | geom_point(aes(x = n, y = size), alpha = 0.2, colour = "steelblue") +
219 | xlab("Number of transactions per block") +
220 | ylab("Block size (in bytes)")
221 | ```
222 |
223 | A transaction has a minimum size, if only to include things like the sender and receiver addresses and the other necessary fields. This is why we pretty much only observe values above some diagonal. The largest blocks (in bytes) are not the ones with the most transactions.
224 |
225 | ## Gas prices
226 |
227 | ### Distribution of gas prices
228 |
229 | First, some descriptive stats for the distribution of gas prices.
230 |
231 | ```{r}
232 | quarts = c(0, 0.25, 0.5, 0.75, 1)
233 | tibble(
234 | `Quartile` = quarts,
235 | ) %>%
236 | add_column(`Value` = quantile(txs_receipts$gas_price, quarts)) %>%
237 | kable()
238 | ```
239 |
240 | 75% of included transactions post a gas price less than or equal to 31 Gwei! Plotting the distribution of gas prices under 2000 Gwei:
241 |
242 | ```{r}
243 | txs_receipts %>%
244 | filter(gas_price <= 2000) %>%
245 | ggplot() +
246 | geom_histogram(aes(x = gas_price), bins = 100, fill = "#F05431") +
247 | scale_y_log10() +
248 | xlab("Gas price (Gwei)")
249 | ```
250 |
251 | The y-axis is in logarithmic scale. Notice these curious, regular peaks? Turns out people love round numbers (or their wallets do). Let's dig into this.
252 |
253 | ### Do users like default prices?
254 |
255 | How do users set their gas prices? We can make the hypothesis that most rely on some oracle (e.g., the Eth Gas Station or their values appearing as Metamask defaults). We show next the 50 most frequent gas prices (in Gwei) and their frequency among included transactions.
256 |
257 | ```{r cache=TRUE}
258 | gas_price_freqs <- txs_receipts %>%
259 | group_by(gas_price) %>%
260 | summarise(count = n()) %>%
261 | arrange(-count) %>%
262 | mutate(freq = count / nrow(txs_receipts), cumfreq = cumsum(freq),
263 | `Gas price (Gwei)` = gas_price,
264 | `Gas price (wei)` = round(gas_price * (10 ^ 9), 10)) %>%
265 | mutate(frequency = str_c(round(freq * 100), "%"), cum_freq = str_c(round(cumfreq * 100), "%")) %>%
266 | select(-freq, -cumfreq, -gas_price)
267 | ```
268 |
269 | ```{r}
270 | paged_table(gas_price_freqs %>%
271 | select(`Gas price (Gwei)`, `Gas price (wei)`, count, frequency, cum_freq) %>%
272 | filter(row_number() <= 50))
273 | ```
274 |
275 | Clearly round numbers dominate here!
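
One way to put a number on this claim is to measure the share of transactions priced at a whole number of Gwei. The sketch below is only an illustration in base R: the vector `gas_price_wei` holds made-up values, not data from this notebook, and the check assumes prices are available in wei (integers), where a whole-Gwei price is simply a multiple of 10^9.

```r
# Hypothetical gas prices in wei (illustrative values, not notebook data).
gas_price_wei <- c(1e9, 20e9, 20e9, 31e9, 31.5e9, 40e9 + 1, 50e9, 12.1e9)

# A price is a whole number of Gwei when it is a multiple of 10^9 wei.
# Working in wei keeps the test exact: these values are integers well
# below 2^53, so the modulo is not affected by floating-point rounding.
is_whole_gwei <- gas_price_wei %% 1e9 == 0

# Share of transactions posting a whole-Gwei price.
share_whole_gwei <- mean(is_whole_gwei)
share_whole_gwei
```

Applied to the real data, the same test could be run on the wei-denominated price column before it is converted to Gwei.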
276 |
277 | ### Evolution of gas prices
278 |
279 | I wanted to see how the gas prices evolve over time. To compute the average gas price in a block, I do a weighted mean using `gas_used` as weight. I then compute the average gas price over 100 blocks by doing another weighted mean using the total gas used in the blocks.
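
As a sanity check on this two-step averaging, the toy example below (base R, with made-up numbers) shows that weighting each block's average price by the block's gas used gives the same result as dividing total fees by total gas over the whole chunk. The data frame `txs` and its values are assumptions for illustration only.

```r
# Toy data (hypothetical values): two blocks with two transactions each.
txs <- data.frame(
  block_number = c(1, 1, 2, 2),
  gas_price    = c(10, 30, 20, 40),              # in Gwei
  gas_used     = c(21000, 63000, 50000, 50000)
)

# Step 1: per-block average gas price, weighted by each transaction's gas used.
per_block <- do.call(rbind, lapply(split(txs, txs$block_number), function(b) {
  data.frame(
    block_number  = b$block_number[1],
    gas_used      = sum(b$gas_used),
    avg_gas_price = weighted.mean(b$gas_price, w = b$gas_used)
  )
}))

# Step 2: chunk-level average, weighting each block by its total gas used.
chunk_avg <- weighted.mean(per_block$avg_gas_price, w = per_block$gas_used)

# Equivalent one-shot computation: total fees paid divided by total gas used.
direct_avg <- sum(txs$gas_price * txs$gas_used) / sum(txs$gas_used)

all.equal(chunk_avg, direct_avg)
```

The equality is why the chunk below can sum `gas_used * avg_gas_price` per block and divide by the summed gas used, without going back to individual transactions.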
280 |
281 | ```{r}
282 | chunk_size <- 100
283 | blocks_sample %>%
284 | mutate(block_chunk = block_number %/% chunk_size) %>%
285 | replace_na(list(
286 | avg_gas_price = 0, gas_used = 0)) %>%
287 | mutate(block_num = gas_used * avg_gas_price) %>%
288 | group_by(block_chunk) %>%
289 | summarise(avg_prop_used = mean(prop_used),
290 | gas_used_chunk = sum(gas_used),
291 | num_chunk = sum(block_num),
292 | avg_gas_price = num_chunk / gas_used_chunk,
293 | block_date = min(block_date)) %>%
294 | ggplot() +
295 | geom_line(aes(x = block_date, y = avg_gas_price), colour = "#F05431") +
296 | xlab("Block timestamp")
297 | ```
298 |
299 | We see a daily seasonality, with peaks and troughs corresponding to high-congestion and low-congestion hours of the day. How does this jibe with the other series we saw before? We now average over 200 blocks and present a comparison with the series of block proportion used.
300 |
301 | ```{r}
302 | chunk_size <- 200
303 | blocks_sample %>%
304 | mutate(block_chunk = block_number %/% chunk_size) %>%
305 | replace_na(list(
306 | avg_gas_price = 0, gas_used = 0)) %>%
307 | mutate(block_num = gas_used * avg_gas_price) %>%
308 | group_by(block_chunk) %>%
309 | summarise(gas_limit_chunk = sum(gas_limit),
310 | gas_used_chunk = sum(gas_used),
311 | num_chunk = sum(block_num),
312 | avg_gas_price = num_chunk / gas_used_chunk,
313 | block_date = min(block_date),
314 | prop_used = gas_used_chunk / gas_limit_chunk) %>%
315 | select(block_date, `Proportion used` = prop_used, `Average gas price` = avg_gas_price) %>%
316 | pivot_longer(-block_date, names_to = "Series") %>%
317 | ggplot() +
318 | geom_line(aes(x = block_date, y = value, color = Series)) +
319 | scale_color_manual(values = c("#F05431", "#FED152")) +
320 | facet_grid(rows = vars(Series), scales = "free") +
321 | xlab("Block timestamp")
322 | ```
323 |
324 | Blocks massively unused right after a price peak? The mystery deepens.
325 |
326 | ### Timestamp difference between blocks
327 |
328 | How much time elapses between two consecutive blocks? Miners are responsible for setting the timestamp, so it's not a perfectly objective value, but good enough!
329 |
330 | ```{r}
331 | blocks %>%
332 | mutate(time_difference = timestamp - lag(timestamp)) %>%
333 | ggplot() +
334 | geom_histogram(aes(x = time_difference), binwidth = 1, fill = "#BFCE80")
335 | ```
336 |
337 | ```{r cache=TRUE}
338 | late_blocks <- blocks %>%
339 | mutate(time_difference = timestamp - lag(timestamp),
340 | late_block = time_difference >= 20) %>%
341 | replace_na(list(gas_used = 0, avg_gas_price = 0)) %>%
342 | drop_na()
343 |
344 | mean_diff <- late_blocks %>%
345 | specify(formula = avg_gas_price ~ late_block) %>%
346 | calculate(stat = "diff in means", order = c(TRUE, FALSE))
347 | ```
348 |
349 | ```{r cache=TRUE}
350 | null_distribution <- late_blocks %>%
351 | specify(formula = avg_gas_price ~ late_block) %>%
352 | hypothesize(null = "independence") %>%
353 | generate(reps = 500, type = "permute") %>%
354 | calculate(stat = "diff in means", order = c(TRUE, FALSE))
355 | ```
356 |
357 | We can do a simple difference-in-means test to check whether the difference between the average gas price of late blocks (with a timestamp difference of at least 20 seconds) and early blocks (less than 20 seconds) is significant.
358 |
359 | ```{r fig.cap="Mean gas price in \"late\" and \"early\" blocks"}
360 | kable(late_blocks %>%
361 | group_by(late_block) %>%
362 | summarise(avg_gas_price = mean(avg_gas_price)))
363 | ```
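
For readers unfamiliar with `infer`, the permutation logic behind the `hypothesize(null = "independence")` and `generate(type = "permute")` steps can be sketched in base R. This is a minimal illustration under stated assumptions: the data are simulated (not the notebook's blocks), the seed is arbitrary, and the two-sided p-value at the end is one common way to summarize the result, not necessarily the one used in this analysis.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

# Simulated data (hypothetical): an average gas price per block and a flag
# marking "late" blocks. The real notebook uses `late_blocks` instead.
avg_gas_price <- c(rnorm(200, mean = 30, sd = 5), rnorm(50, mean = 32, sd = 5))
late_block    <- c(rep(FALSE, 200), rep(TRUE, 50))

# Difference in means, late minus early (same order as c(TRUE, FALSE)).
diff_in_means <- function(x, flag) mean(x[flag]) - mean(x[!flag])

observed <- diff_in_means(avg_gas_price, late_block)

# Null distribution: shuffle the late/early labels 500 times, which breaks
# any association between lateness and price, and recompute the statistic.
null_diffs <- replicate(500, diff_in_means(avg_gas_price, sample(late_block)))

# Two-sided p-value: share of permuted differences at least as extreme
# as the observed one.
p_value <- mean(abs(null_diffs) >= abs(observed))
p_value
```

A small p-value would suggest that the observed gap between late and early blocks is unlikely under the independence hypothesis.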
364 |
365 |
374 |
--------------------------------------------------------------------------------
/ethdata/notebooks/gas_weather_reports/exploreJuly21.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Gas weather report: July 21st - July 27th"
3 | description: |
4 | Do gas limit increases decrease gas prices?
5 | author:
6 | - name: Barnabé Monnot
7 | url: https://twitter.com/barnabemonnot
8 | affiliation: Robust Incentives Group, Ethereum Foundation
9 | affiliation_url: https://github.com/ethereum/rig
10 | date: "`r Sys.Date()`"
11 | output:
12 | distill::distill_article:
13 | toc: true
14 | toc_depth: 3
15 | ---
16 |
17 | The data includes all blocks produced between July 21st, 2020, 02:00:17 UTC (block 10500001) and July 27th, 2020, 06:37:09 UTC (block 10540000). It was obtained from [Geth](https://geth.ethereum.org/) using a [DAppNode](https://dappnode.io) full node with the wonderful [ethereum-etl](https://github.com/blockchain-etl/ethereum-etl) package from Evgeny Medvedev to extract transaction and block details.
18 |
19 |
22 |
23 | ```{r setup, message = FALSE}
24 | library(tidyverse)
25 | library(here)
26 | library(glue)
27 | library(lubridate)
28 | library(forecast)
29 | library(infer)
30 | library(matrixStats)
31 | library(rmarkdown)
32 | library(knitr)
33 | library(skimr)
34 |
35 | options(digits=10)
36 | options(scipen = 999)
37 |
38 | # Make the plots a bit less pixellated
39 | knitr::opts_chunk$set(dpi = 300)
40 |
41 | # A minimal theme I like (zero bonus point for using it though!)
42 | newtheme <- theme_grey() + theme(
43 | axis.text = element_text(size = 9),
44 | axis.title = element_text(size = 12),
45 | axis.line = element_line(colour = "#000000"),
46 | panel.grid.major = element_blank(),
47 | panel.grid.minor = element_blank(),
48 | panel.background = element_blank(),
49 | legend.title = element_text(size = 12),
50 | legend.text = element_text(size = 10),
51 | legend.box.background = element_blank(),
52 | legend.key = element_blank(),
53 | strip.text.x = element_text(size = 10),
54 | strip.background = element_rect(fill = "white")
55 | )
56 | theme_set(newtheme)
57 | ```
58 |
59 | ```{r}
60 | start_block <- 10500001
61 | end_block <- 10540000
62 | suffix <- glue("-", start_block, "-", end_block)
63 | ```
64 |
65 | ```{r message = FALSE, eval=FALSE}
66 | txs <- read_csv(here::here(glue("data/txs", suffix, ".csv")))
67 | txs %>% glimpse()
68 | ```
69 |
70 | ```{r message=FALSE, eval=FALSE}
71 | txs_receipts <- txs %>%
72 | left_join(
73 | read_csv(here::here(glue("data/rxs", suffix, ".csv"))),
74 | by = c("hash" = "transaction_hash")) %>%
75 | arrange(block_number)
76 | saveRDS(txs_receipts, here::here(glue("data/txs", suffix, ".rds")))
77 | ```
78 |
79 | ```{r message=FALSE, cache=TRUE}
80 | txs_receipts <- readRDS(here::here(glue("data/txs", suffix, ".rds"))) %>%
81 | mutate(gas_fee = gas_price * gas_used) %>%
82 | mutate(gas_price = gas_price / (10 ^ 9),
83 | gas_fee = gas_fee / (10 ^ 9),
84 | value = value / (10 ^ 18))
85 | ```
86 |
87 | ```{r message=FALSE, cache=TRUE}
88 | blocks <- read_csv(here::here(glue("data/bxs", suffix, ".csv"))) %>%
89 | mutate(block_date = as_datetime(timestamp),
90 | prop_used = gas_used / gas_limit) %>%
91 | rename(block_number = number) %>%
92 | arrange(block_number)
93 |
94 | gas_prices_per_block <- blocks %>%
95 | select(block_number) %>%
96 | left_join(
97 | txs_receipts %>%
98 | group_by(block_number) %>%
99 | summarise(
100 | min_gas_price = min(gas_price),
101 | total_gas_used = sum(gas_used),
102 | avg_gas_price = sum(gas_fee) / total_gas_used,
103 | med_gas_price = weightedMedian(gas_price, w = gas_used),
104 | max_gas_price = max(gas_price)
105 | )
106 | ) %>%
107 | select(-total_gas_used)
108 |
109 | blocks <- blocks %>%
110 | left_join(gas_prices_per_block)
111 | ```
112 |
113 | ```{r message=FALSE, cache=TRUE}
114 | # To get all blocks
115 | date_sample <- interval(min(blocks$block_date), max(blocks$block_date))
116 |
117 | # To get a sample
118 | # date_sample <- interval(ymd("2020-05-13"), ymd("2020-05-20"))
119 |
120 | blocks_sample <- blocks %>%
121 | filter(block_date %within% date_sample)
122 |
123 | txs_sample <- txs_receipts %>%
124 | semi_join(blocks_sample)
125 | ```
126 |
127 | ## Block properties
128 |
129 | ### Gas used by a block
130 |
131 | Miners have some control over the gas limit of a block, but how much gas do blocks generally use?
132 |
133 | ```{r}
134 | blocks %>%
135 | ggplot() +
136 | geom_histogram(aes(x = gas_used), bins = 1000, fill = "steelblue") +
137 | scale_y_log10() +
138 | xlab("Gas used") +
139 | ylab("Number of blocks")
140 | ```
141 |
142 | The gas limit was increased from the previous limit of 10M gas, and gas used in blocks soon followed. We notice two peaks. Let's zoom in.
143 |
144 | ```{r}
145 | blocks %>%
146 | filter(gas_used >= 10.05 * 10^6) %>%
147 | ggplot() +
148 | geom_histogram(aes(x = gas_used), fill = "steelblue", bins = 60) +
149 | xlab("Gas used") +
150 | ylab("Number of blocks")
151 | ```
152 |
153 | How did the gas limit evolve over time?
154 |
155 | ```{r}
156 | blocks %>%
157 | ggplot() +
158 | geom_line(aes(x = block_date, y = gas_limit), color = "#FED152") +
159 | xlab("Block date") +
160 | ylab("Gas limit")
161 | ```
162 |
163 | We have a shift in the middle of the week from about 12M gas limit to 12.5M. Did this release some pressure from transaction fees?
164 |
165 | ## Gas prices
166 |
167 | ### Distribution of gas prices
168 |
169 | First, some descriptive stats for the distribution of gas prices.
170 |
171 | ```{r}
172 | quarts <- c(0, 0.25, 0.5, 0.75, 1)
173 | tibble(
174 | `Quartile` = quarts,
175 | ) %>%
176 | add_column(`Value` = quantile(txs_receipts$gas_price, quarts)) %>%
177 | kable()
178 | ```
179 |
180 | 75% of included transactions post a gas price less than or equal to 90 Gwei. This is much higher than in our last [gas weather report in May](https://ethereum.github.io/rig/ethdata/notebooks/explore_data.html).
181 |
182 | ### Evolution of gas prices
183 |
184 | To compute the average gas price in a block, I do a weighted mean using `gas_used` as weight. I then compute the average gas price over 100 blocks by doing another weighted mean using the total gas used in the blocks.
185 |
186 | ```{r}
187 | chunk_size <- 100
188 | blocks_sample %>%
189 | mutate(block_chunk = block_number %/% chunk_size) %>%
190 | replace_na(list(
191 | avg_gas_price = 0, gas_used = 0)) %>%
192 | mutate(block_num = gas_used * avg_gas_price) %>%
193 | group_by(block_chunk) %>%
194 | summarise(avg_prop_used = mean(prop_used),
195 | gas_used_chunk = sum(gas_used),
196 | num_chunk = sum(block_num),
197 | avg_gas_price = num_chunk / gas_used_chunk,
198 | block_date = min(block_date)) %>%
199 | ggplot() +
200 | geom_line(aes(x = block_date, y = avg_gas_price), colour = "#F05431") +
201 | xlab("Block timestamp") +
202 | ylab("Average gas price")
203 | ```
204 |
205 | We see a daily seasonality, with peaks and troughs corresponding to high congestion and low congestion hours of the day.
206 |
207 | Did increasing the gas limit reduce the prices overall? We can take a look visually.
208 |
209 | ```{r}
210 | chunk_size <- 200
211 | blocks_sample %>%
212 | mutate(block_chunk = block_number %/% chunk_size) %>%
213 | replace_na(list(
214 | avg_gas_price = 0, gas_used = 0)) %>%
215 | mutate(block_num = gas_used * avg_gas_price) %>%
216 | group_by(block_chunk) %>%
217 | summarise(gas_limit_chunk = sum(gas_limit),
218 | gas_used_chunk = sum(gas_used),
219 | num_chunk = sum(block_num),
220 | avg_gas_price = num_chunk / gas_used_chunk,
221 | block_date = min(block_date),
222 | prop_used = gas_used_chunk / gas_limit_chunk,
223 | avg_gas_limit_chunk = mean(gas_limit),
224 | avg_gas_used_chunk = mean(gas_used)) %>%
225 | select(block_date, `Gas limit` = avg_gas_limit_chunk, `Average gas price` = avg_gas_price, `Gas used` = avg_gas_used_chunk) %>%
226 | pivot_longer(-block_date, names_to = "Series") %>%
227 | ggplot() +
228 | geom_line(aes(x = block_date, y = value, color = Series)) +
229 | scale_color_manual(values = c("#F05431", "#FED152", "steelblue")) +
230 | facet_grid(rows = vars(Series), scales = "free") +
231 | xlab("Block timestamp")
232 | ```
233 |
234 | It doesn't seem like it did to me, even though the average gas used in blocks increased in concert with the gas limit. We can look at average prices for transactions in blocks with a gas limit of at most 12.25M gas ("small blocks") vs. blocks with a gas limit greater than 12.25M ("big blocks").
235 |
236 | ```{r cache=TRUE}
237 | big_blocks <- blocks %>%
238 | mutate(big_block = if_else(gas_limit > 12.25 * 10^6, "Big block", "Small block")) %>%
239 | replace_na(list(gas_used = 0, avg_gas_price = 0)) %>%
240 | drop_na()
241 | ```
242 |
243 | ```{r fig.cap="Mean gas price in \"big\" and \"small\" blocks"}
244 | kable(big_blocks %>%
245 | group_by(big_block) %>%
246 | summarise(avg_gas_price = mean(avg_gas_price)))
247 | ```
248 |
249 | The two averages are mighty close to each other, with big blocks posting even slightly higher gas prices than small ones (a negligible difference, however).
250 |
251 |
254 |
255 |
256 |
257 |
258 |
259 |
260 |
--------------------------------------------------------------------------------
/index.html:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
6 | Redirecting...
7 |
8 |
9 |
If you are not redirected automatically, click here.