├── bitcoin-difficulty-adjustment ├── README.md └── simul.cpp ├── elligator-square-for-bn ├── README.md └── test.sage ├── minimizing-golomb-filters └── README.md ├── private-authentication-protocols └── README.md ├── uniform-range-extraction ├── README.md └── test.py └── von-neumann-debias-tables ├── README.md ├── coin_extract_8rolls_15states.txt ├── dice_extract_1rolls_116states.txt └── dice_extract_4rolls_15states.txt /bitcoin-difficulty-adjustment/README.md: -------------------------------------------------------------------------------- 1 | # Bitcoin's steady-state difficulty adjustment 2 | 3 | ## Abstract 4 | 5 | In this document, we analyze the probabilistic behavior of Bitcoin's difficulty adjustment 6 | rules under the following assumptions: 7 | * The hashrate does not change during the period of time we're studying. 8 | * Blocks are produced by a pure [Poisson process](https://en.wikipedia.org/wiki/Poisson_point_process), and carry the exact timestamp 9 | they were mined at. 10 | * All timestamps and difficulties are arbitrary precision (treating them as real numbers, without rounding). 11 | * There is no restriction on the per-window difficulty adjustment (the real protocol restricts these to a factor 4× or 0.25×). 12 | 13 | The real-world hashrate is of course not constant, but making this simplification 14 | does give us some insights into how stable the difficulty adjustment process can get. 15 | The other assumptions do not affect the outcome of our analysis much, as long as the 16 | [timewarp bug](https://bitcointalk.org/index.php?topic=43692.msg521772#msg521772) is not exploited. 17 | 18 | Specifically, we wonder what the mean and standard deviation are for the durations of (sequences of) 19 | blocks and windows, for randomly picked windows in the future, taking into account that the 20 | difficulties for those windows are also subject to randomness. 
21 | 22 | ## Introduction 23 | 24 | The exact process studied is this: 25 | * There is a fixed hash rate of $h$ hashes per time unit. 26 | * A window is a sequence of $n$ blocks 27 | (with $n>2$). 28 | Block number $i$ is 29 | part of window number $j = {\lfloor}i/n{\rfloor}$. 30 | In Bitcoin $n$ is *2016*. 31 | * Every window $j$ (and every block in that window) has an associated difficulty 32 | $D_j$. 33 | * Every block $i$ has a length 34 | $L_i$, which is how long it took to 35 | produce the block. $L_i$ follows an [exponential distribution](https://en.wikipedia.org/wiki/Exponential_distribution): 36 | $L_i \sim \mathit{Exp}(\lambda = h/D_{\lfloor i/n \rfloor})$, independent from the lengths 37 | of all other blocks. Since the mean of an exponential distribution is $\lambda^{-1}$, 38 | the expected length of a block is $D_{\lfloor i/n \rfloor}/h$, which means the 39 | expected number of hashes for a block is $D_{\lfloor i/n \rfloor}$. 40 | The notion of difficulty in actual 41 | Bitcoin has an additional constant factor (roughly $2^{32}$), but that factor is not relevant for this 42 | discussion, so we ignore it here. 43 | * The total length of a window is defined as the sum of the lengths of the blocks in it, 44 | $$W_j = \sum_{i=jn}^{jn+n-1} L_i$$ 45 | Because the length of a window excluding its last block is relevant as well, we also define 46 | $$W^\prime_j = W_j - L_{jn+n-1} = \sum_{i=jn}^{jn+n-2} L_i$$ 47 | * The difficulty adjusts every $n$ blocks, aiming for roughly one block every 48 | $t$ time units. 49 | Specifically, $D_{j+1} = D_j tn / W^\prime_j$, or the difficulty 50 | gets multiplied by $tn$ (the expected time a window takes) divided by 51 | $W^\prime_j$ (the time between the first and last block in the window, but not 52 | including the time until the first block of the next window). 53 | In Bitcoin $t$ is *600 seconds*, and thus 54 | $tn$ is *2 weeks*. 55 | * The initial difficulty $D_0$ is a given constant value $d_0$. 
All future difficulties are 56 | necessarily random variables (as they depend on $W^\prime_j$, which depend on 57 | $L_i$, which are random). 58 | 59 | ## Probability distribution of difficulties $D_j$ 60 | 61 | The description above uses a random variable ($D_{\lfloor i/n \rfloor}$) in the parameterization 62 | of $L_i$'s probability distribution. This makes reasoning complicated. To address this, 63 | we introduce simpler "base" random variables: $B_i = hL_i / D_{\lfloor i/n \rfloor}$, representing how 64 | "stretched" the number of hashes needed was compared to the difficulty. Because the 65 | $\lambda$ parameter of the exponential distribution is an 66 | inverse [scale parameter](https://en.wikipedia.org/wiki/Scale_parameter#Rate_parameter), 67 | this means 68 | $$B_i \sim \mathit{Exp}(\lambda = 1)$$ 69 | for all $i$. All these 70 | $B_i$ variables, even across windows, are identically distributed. Furthermore, they are 71 | all independent from one another. Analogous to the $W_j$ 72 | and $W^\prime_j$ variables, we also introduce variables to express how many hashes each window 73 | required, relative to the number of hashes expected based on their difficulty: 74 | $$A_j = hW_j/D_j = \sum_{i=nj}^{n(j+1)-1} B_i$$ 75 | $$A^\prime_j = hW^\prime_j/D_j = \sum_{i=nj}^{n(j+1)-2} B_i$$ 76 | These variables are i.i.d. (independent and identically distributed) 77 | within their sets, and more generally independent from one another 78 | unless they are based on overlapping $B_i$ variables. Furthermore, 79 | together, the $B_i$ variables are *all* the randomness in the system; 80 | as we'll see later, all other random variables can be written as 81 | deterministic functions of them. 82 | 83 | Given $A^\prime_j = hW^\prime_j/D_j$ and 84 | $D_{j+1} = D_j tn / W^\prime_j$, we learn 85 | that $D_{j+1} = htn / A^\prime_j$. 
86 | Surprisingly, this means that the distribution of all $D_j$ 87 | (except for $j=0$) only depends on what happened in the window before it, 88 | and not the windows before that (and thus by extension, also not on 89 | previous difficulties). Despite the fact 90 | that the difficulty of the next window is computed as the difficulty of the previous window 91 | multiplied by a correction factor, the actual previous value does not matter. This is because 92 | by whatever factor the previous difficulty might have been "off", the same factor will appear in the rate 93 | of the next window, exactly undoing it for that next window's difficulty. 94 | 95 | As all the $A^\prime_j$ are i.i.d., 96 | so are the $D_j$ variables. But what is that distribution? 97 | We start with the distribution of $A^\prime_j$. 98 | These variables are the sum of $n-1$ 99 | distinct $B_i$ variables, which are all i.i.d. exponentially distributed. 100 | The result of that is an [Erlang distribution](https://en.wikipedia.org/wiki/Erlang_distribution), and 101 | $$A^\prime_j \sim \mathit{Erlang}(k=n-1, \lambda=1)$$ 102 | $$A_j \sim \mathit{Erlang}(k=n, \lambda=1)$$ 103 | 104 | As the Erlang distribution is a special case of a [gamma distribution](https://en.wikipedia.org/wiki/Gamma_distribution), we can also say that 105 | $$A^\prime_j \sim \Gamma(\alpha=n-1, \beta=1)$$ 106 | Again exploiting the fact that 107 | $\beta$ is an inverse scale parameter, that means that 108 | $$A^\prime_j/htn \sim \Gamma(\alpha=n-1, \beta=htn)$$ 109 | Since the inverse of a gamma distribution is an 110 | [inverse-gamma distribution](https://en.wikipedia.org/wiki/Inverse-gamma_distribution), 111 | we conclude 112 | $$D_{j+1} = htn/A^\prime_j \sim \mathit{InvGamma}(\alpha=n-1, \beta=htn)$$ 113 | 114 | Note again that $t$ and 115 | $n$ are protocol constants 116 | ($tn$ is *2 weeks* in Bitcoin), and 117 | we have assumed that the hashrate $h$ is a constant for the duration of our 118 | analysis. 
The $\mathit{InvGamma}$ distribution has mean 119 | $\beta/(\alpha-1)$, so 120 | $$E[D_{j+1}] = \frac{htn}{n-2}$$ 121 | For Bitcoin, this means the 122 | average difficulty corresponds to $2016/2014 \approx 1.000993$ times *600 seconds* per block at 123 | the given hashrate. Does this translate to blocks longer on average than *600 seconds* as well? 124 | 125 | ## Probability distribution of block lengths $L_i$ 126 | 127 | Remember that $B_i = hL_i/D_{\lfloor i/n \rfloor}$, and thus 128 | $L_i = {B_i}{D_{\lfloor i/n \rfloor}}/h$. We know that 129 | $D_j = htn/A^\prime_{j-1}$ as well, and thus 130 | $L_i = tn{B_i}/A^\prime_{\lfloor i/n \rfloor -1}$. 131 | $B_i$ is exponentially distributed, which is also a special case of a gamma distribution, 132 | like $A^\prime_j$. Or 133 | $$B_i \sim \Gamma(\alpha=1, \beta=1)$$ 134 | $$A^\prime_{\lfloor i/n \rfloor -1} \sim \Gamma(\alpha=n-1, \beta=1)$$ 135 | Thus, $L_i/tn = B_i / A^\prime_{\lfloor i/n \rfloor - 1}$ is distributed as the ratio of two independent gamma distributions with the 136 | same $\beta$. Such a ratio is a 137 | [beta prime distribution](https://en.wikipedia.org/wiki/Beta_prime_distribution) and 138 | $$L_i/tn = B_i/A^\prime_{\lfloor i/n \rfloor-1} \sim \mathit{Beta'}(\alpha=1, \beta=n-1)$$ 139 | 140 | This distribution has mean $\alpha/(\beta-1)$, and thus the expected time per block is 141 | $$E[L_i] = \frac{tn}{n-2}$$ 142 | Surprisingly, this is not the $t$ we were aiming for, or even the 143 | $tn/(n-1)$ we might have 144 | expected given that the last block's length is ignored in every window, but a factor $n/(n-2)$ times 145 | $t$. The implication for Bitcoin, 146 | if the assumptions made here hold, is that 147 | the expected time per block under constant hashrate is not 10 minutes, but that amount multiplied by the factor $1.000993$ predicted in the previous section: 148 | ***10 minutes and 0.5958 seconds***. 
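As a sanity check, this mean can be reproduced with a short Monte Carlo simulation. The sketch below (stdlib Python, with illustrative parameters $n=10$ and $t=1$ rather than Bitcoin's) draws windows directly via the $A^\prime_j$ formulation from the previous section:

```python
import random

def mean_block_length(n=10, t=1.0, windows=100_000, seed=42):
    """Estimate E[L_i] by simulating the difficulty adjustment process."""
    rng = random.Random(seed)
    total_time = 0.0
    blocks = 0
    a_prime_prev = None
    for _ in range(windows):
        # A'_j: sum of n-1 i.i.d. Exp(1) "base" variables B_i.
        a_prime = sum(rng.expovariate(1.0) for _ in range(n - 1))
        b_last = rng.expovariate(1.0)
        if a_prime_prev is not None:
            # W_j = t*n*(A'_j + B_last)/A'_{j-1}; window 0 (fixed D_0) is skipped.
            total_time += t * n * (a_prime + b_last) / a_prime_prev
            blocks += n
        a_prime_prev = a_prime
    return total_time / blocks
```

With these parameters the estimate converges to $tn/(n-2) = 1.25$ rather than the targeted $t = 1$.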
149 | 150 | The variance for this distribution is $\alpha(\alpha+\beta-1)/((\beta-2)(\beta-1)^2)$, and thus 151 | the standard deviation for the time per block is 152 | $$\mathit{StdDev}(L_i) = \frac{tn}{n-2}\sqrt{\frac{n-1}{n-3}} \approx t\sqrt{1+\frac{6}{n}}$$ 153 | For Bitcoin this is ***10 minutes and 0.8941 seconds***. 154 | 155 | ## Probability distribution of multiple blocks in a window 156 | 157 | These block lengths $L_i$ are however **not independent** if they belong to the same 158 | window. That is because they share a common, but random, scaling factor: that window's difficulty. 159 | Blocks from subsequent windows are not independent either, because they are related through the 160 | first window's duration. This means we cannot just multiply the variance with $n$ to 161 | obtain the variance for multiple blocks. Instead, we need to determine its probability distribution. 162 | 163 | Let's look at the length of $r$ consecutive blocks in the same window, where 164 | $0 \leq r \leq n$. The distribution of any $r$ distinct blocks in any single window (excluding the first window) 165 | is the same, so for determining its distribution, assume without loss of generality the range 166 | $n \ldots n+r-1$. Call the sum of those lengths 167 | $$Y_r = \sum_{i=n}^{n+r-1} L_i$$ 168 | Given that $L_i = {B_i}{D_{\lfloor i/n \rfloor}}/h$, we get 169 | $$Y_r = \sum_{i=n}^{n+r-1} \frac{{B_i}{D_1}}{h} = \frac{D_1}{h} \sum_{i=n}^{n+r-1} B_i$$ 170 | Substituting $D_j = htn/A^\prime_{j-1}$ we get 171 | $$Y_r = \frac{tn}{A^\prime_0} \sum_{i=n}^{n+r-1} B_i$$ 172 | The sum in this expression is again a sum of $r$ i.i.d. 
exponential distributions, for which 173 | $$\sum_{i=n}^{n+r-1} B_i \sim \mathit{Erlang}(k=r, \lambda=1) \sim \Gamma(\alpha=r, \beta=1)$$ 174 | Thus $Y_r$ is again the ratio of two independent gamma distributions with the same 175 | $\beta$, and we obtain 176 | $$Y_r/tn \sim \mathit{Beta'}(\alpha=r, \beta=n-1)$$ 177 | 178 | Using the formula for the mean of a beta prime distribution we get 179 | $$E[Y_r] = tn\frac{\alpha}{\beta-1} = \frac{rtn}{n-2}$$ 180 | And using the formula for the variance we get 181 | $$\mathit{StdDev}(Y_r) = tn\sqrt{\frac{\alpha(\alpha+\beta-1)}{(\beta-2)(\beta-1)^2}} = \frac{nt}{n-2}\sqrt{\frac{r(n+r-2)}{n-3}} \approx t\sqrt{r\left(1 + \frac{r+5}{n} + \frac{7r}{n^2}\right)}$$ 182 | 183 | This standard deviation is the same as what we'd get for the sum of $r(n+r-2)/(n-3)$ independent exponentially 184 | distributed block lengths, each with mean $tn/(n-2)$. This expression grows quadratically, ranging from 185 | $(n-1)/(n-3) \approx 1$ at 186 | $r=1$, to 187 | $2n(n-1)/(n-3) \approx 2n$ at 188 | $r=n$. 189 | If the block lengths were independent, we'd expect this to grow linearly. But because there is 190 | a shared random contribution to all of them (the difficulty), it grows faster. 191 | 192 | When looking at the length of the whole window $W_1 = Y_n$ (or any other window 193 | $W_j$ except the first, where 194 | $j=0$), 195 | these expressions simplify using $r=n$ to: 196 | $$E[W_j] = \frac{tn^2}{n-2}$$ 197 | $$\mathit{StdDev}(W_j) = \frac{tn}{n-2}\sqrt{\frac{2n(n-1)}{n-3}} \approx t\sqrt{2n+12}$$ 198 | For sufficiently large $n$, this approximates the standard deviation we'd expect from a Poisson process for 199 | $2n$ blocks, if they were all mined at "exact" difficulty (the difficulty corresponding to the hashrate). 200 | Why is this? 201 | * $n$ blocks' worth of standard deviation come from the difficulty 202 | $D_j$, contributed by the randomness in the 203 | durations of the blocks in the previous window ($A^\prime_{j-1}$). 
204 | * $n$ blocks' worth of standard deviation come from the duration of the blocks in the window 205 | $W_j$ itself 206 | ($A_j$). 207 | 208 | For Bitcoin this yields an average window duration of ***2 weeks, 20 minutes, 1.19 seconds*** with a 209 | standard deviation of ***10 hours, 35 minutes, 55.59 seconds***. 210 | 211 | ## Properties of the sum of consecutive windows 212 | 213 | Next, we investigate properties of the probability distribution of the sum of multiple consecutive 214 | windows. This cannot be expressed as a well-studied probability distribution anymore, but we can compute 215 | its mean and standard deviation. 216 | 217 | Let's look at the sum $X_c$ of the lengths of 218 | $c$ consecutive windows, each consisting of $n$ blocks. The first window is special, as it has a 219 | different difficulty distribution than the others, so let's exclude it and look at windows $1$ 220 | through $c$. As far as the distribution is concerned, this is without loss of generality: 221 | the distributions of any $c$ consecutive windows starting at window 222 | $1$ or later are identical. 223 | $$X_c = \sum_{j=1}^c W_j = tn \cdot \sum_{j=1}^c \frac{A_j}{A^\prime_{j-1}} = tn \cdot \sum_{j=1}^c \frac{A^\prime_j + B_{jn+n-1}}{A^\prime_{j-1}}$$ 224 | 225 | The terms of this summation are not independent, because all the "inner" $A^\prime_j$ variables occur in two terms: 226 | once in the numerator and once in the denominator of the next one. This does not matter for the mean 227 | $E[X_c]$ however: the expected value of a sum is the sum of the expected values, even when 228 | they are not independent. Thus we find that 229 | $$E[X_c] = \sum_{j=1}^c E[W_j] = ctn\cdot E\left[\frac{A_1}{A^\prime_0}\right] = \frac{ctn^2}{n-2}$$ 230 | This is just $2016/2014$ times *2c weeks* in Bitcoin's case. 231 | 232 | To analyze the variance and standard deviation, we first introduce a new variable $T_j = B_{jn+n-1}$, the base variable of the terminal 233 | block of window $j$. 
This simplifies the expression to: 234 | $$X_c = tn \cdot \sum_{j=1}^c \frac{A^\prime_j + T_j}{A^\prime_{j-1}}$$ 235 | where all the $A^\prime_j$ and 236 | $T_j$ variables are independent, as they do not derive from overlapping $B_i$ variables. Furthermore, we 237 | know the distribution of all of them: 238 | $$A^\prime_j \sim \Gamma(\alpha=n-1, \beta=1)$$ 239 | $$T_j \sim \Gamma(\alpha=1, \beta=1)$$ 240 | Let $\sigma_c^2$ be the variance of this expression: 241 | $$\sigma_c^2 = Var(X_c) = E[X_c^2] - E^2[X_c] = (tn)^2 E\left[\left(\sum_{j=1}^c \frac{A^\prime_j + T_j}{A^\prime_{j-1}}\right)^2\right] - \left(\frac{ctn^2}{n-2}\right)^2$$ 242 | Working on the second moment $E[X_c^2]$ divided by the constant factor 243 | $(tn)^2$, we get: 244 | $$\frac{E[X_c^2]}{(tn)^2} = \sum_{j=1}^{c} \sum_{m=1}^c E\left[\left(\frac{A^\prime_j + T_j}{A^\prime_{j-1}}\right)\left(\frac{A^\prime_m + T_m}{A^\prime_{m-1}}\right)\right] $$ 245 | By grouping the products into sums where $|j-m|$ is either 246 | $0$, 247 | $1$, 248 | or more than $1$, and reverting 249 | $A^\prime_j + T_j$ to 250 | $A_j$ again in several places, 251 | we get 252 | $$\frac{E[X_c^2]}{(tn)^2} = \sum_{j=1}^{c} E\left[\frac{A_j^2}{{A^\prime_{j-1}}^2}\right] + 2\sum_{j=1}^{c-1} E\left[\left(\frac{A^\prime_j + T_j}{A^\prime_{j-1}}\right)\frac{A_{j+1}}{A^\prime_j}\right] + 2\sum_{j=1}^{c-2} \sum_{m=j+2}^{c} E\left[\frac{A_j A_m}{A^\prime_{j-1} A^\prime_{m-1}}\right] $$ 253 | Expanding the middle term further, we get 254 | $$\frac{E[X_c^2]}{(tn)^2} = \sum_{j=1}^{c} E\left[\frac{A_j^2}{{A^\prime_{j-1}}^2}\right] + 2\sum_{j=1}^{c-1} E\left[\frac{A_{j+1}}{A^\prime_{j-1}} + \frac{T_j A_{j+1}}{A^\prime_{j-1}A^\prime_j}\right] + 2\sum_{j=1}^{c-2} \sum_{m=j+2}^{c} E\left[\frac{A_j A_m}{A^\prime_{j-1} A^\prime_{m-1}}\right] $$ 255 | Using the fact that the expectation of the product of independent variables is the product of their expectations, we can drop the indices and write 256 | everything as a sum of the products of 
powers of independent variables. 257 | $$\frac{E[X_c^2]}{(tn)^2} = c E[A^2] E[{A^\prime}^{-2}] + 2(c-1)\left(E[A]E[{A^\prime}^{-1}] + E[T]E[A]E^2[{A^\prime}^{-1}]\right) + (c-1)(c-2) E^2[A] E^2[{A^\prime}^{-1}] $$ 258 | Knowing the distribution of all $A_j$, 259 | $A^\prime_j$, and 260 | $T_j$, we can calculate that 261 | $E[A] = n$, 262 | $E[A^2] = n(n+1)$, 263 | $E[{A^\prime}^{-1}] = (n-2)^{-1}$, 264 | $E[{A^\prime}^{-2}] = ((n-2)(n-3))^{-1}$, 265 | $E[T] = 1$. Using those we get 266 | $$\frac{E[X_c^2]}{(tn)^2} = \frac{cn(n+1)}{(n-2)(n-3)} + 2(c-1)\left(\frac{n}{n-2} + \frac{n}{(n-2)^2}\right) + \frac{(c-1)(c-2)n^2}{(n-2)^2} = n\frac{c^2 n^2 - 3c^2 n + 4c + 2n - 6}{(n-3)(n-2)^2}$$ 267 | So, we get 268 | $$\sigma_c^2 = (tn)^2 \left( \frac{E[X_c^2]}{(tn)^2} - \left(\frac{cn}{n-2}\right)^2\right) = (tn)^2 \frac{2n(2c+n-3)}{(n-3)(n-2)^2}$$ 269 | and the standard deviation we're looking for is 270 | $$\mathit{StdDev}(X_c) = \sigma_c = \frac{tn}{n-2} \sqrt{\frac{2n(2c+n-3)}{n-3}} \approx t \sqrt{2n + 4c + 8} $$ 271 | 272 | In Bitcoin's case where $n=2016$, this value increases *extremely* slowly with 273 | $c$. For 274 | $c=1$ (2016 blocks, or roughly 2 weeks) that gives the same *10 hours, 35 minutes, 55.59 seconds* as we found in the previous 275 | section, but for $c=104$ (209664 blocks, or just under 1 halvening period) it is barely more: 276 | ***11 hours, 7 minutes, 38.53 seconds***. The explanation for this is that most randomness is compensated for: 277 | when an overly-long block randomly occurs in any window but the last one, the difficulty of the next window 278 | will be increased, resulting in a shorter next window, which mostly compensates for the increase. 279 | 280 | Because of this cancellation, the standard deviation for the length of multiple consecutive windows 281 | grows *slower* than what would be expected for independent events. 
This is in sharp contrast with 282 | the evolution of the standard deviation for the length of multiple blocks within one window, where 283 | the growth is *faster* than what would be expected for independent events. 284 | For sufficiently large $n$, the standard deviation approaches 285 | $t\sqrt{2n + 4c}$, which is the same as the standard deviation we'd expect from a Poisson process with 286 | $2n + 4c$ blocks, all at exact difficulty. This can again be explained by looking at the sources of 287 | randomness: 288 | * The $2n$ blocks' worth are the result of the 289 | $n$ blocks of the window preceding the ones considered, by contributing to the initial difficulty, and 290 | $n$ blocks in the last window considered, as these are not compensated for. 291 | This is the same as in the previous section, where we talked about one full window. 292 | * $3c$ blocks' worth of standard deviation come from the imperfection of the compensation through difficulty adjustment: 293 | random increases/decreases in one of the intermediary blocks aren't compensated perfectly. Interestingly, 294 | the extent of this imperfection does not scale with the number of blocks in a window, but is just roughly 295 | $3$ blocks per window worth. 296 | * $c$ blocks' worth is due to the one last block in every window whose length does not affect the difficulty 297 | computation at all, so it just passes through into our final expression. A modified formula that did take 298 | all durations into account would only result in a standard deviation of approximately $t\sqrt{2n + 3c}$, 299 | as well as fixing the time-warp vulnerability. 300 | 301 | All the formulas in this section have been verified using simulations. 
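A minimal version of such a simulation (stdlib Python, with deliberately small illustrative parameters $n=10$ and $c=3$ so the estimates converge quickly) compares the empirical mean and standard deviation of $X_c$ against the formulas above:

```python
import math
import random

def xc_stats(n=10, c=3, t=1.0, samples=50_000, seed=1):
    """Estimate the mean and standard deviation of X_c by simulation."""
    rng = random.Random(seed)
    total = total_sq = 0.0
    for _ in range(samples):
        # A'_0 for the window preceding the c windows being measured.
        a_prime_prev = sum(rng.expovariate(1.0) for _ in range(n - 1))
        x = 0.0
        for _ in range(c):
            a_prime = sum(rng.expovariate(1.0) for _ in range(n - 1))
            b_last = rng.expovariate(1.0)
            x += t * n * (a_prime + b_last) / a_prime_prev  # W_j
            a_prime_prev = a_prime
        total += x
        total_sq += x * x
    mean = total / samples
    return mean, math.sqrt(total_sq / samples - mean * mean)

def predicted(n=10, c=3, t=1.0):
    """The closed-form mean c*t*n^2/(n-2) and standard deviation from above."""
    mean = c * t * n * n / (n - 2)
    sd = t * n / (n - 2) * math.sqrt(2 * n * (2 * c + n - 3) / (n - 3))
    return mean, sd
```

For $n=10$, $c=3$, $t=1$ the predictions are $37.5$ and roughly $7.62$, and the simulated values match within sampling error.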
302 | 303 | ## Conclusion 304 | 305 | The main findings discussed above, under the assumption of constant hashrate and the chosen simplifications of the difficulty adjustment algorithm, are: 306 | * All future difficulties are independently and identically distributed. Specifically, 307 | they all follow $\mathit{InvGamma}(\alpha=n-1, \beta=htn)$. Additionally, this independence implies that, surprisingly, 308 | the difficulty of one window does not affect the difficulty of future ones. 309 | * The expected sum of the durations of $r$ distinct blocks is $rtn/(n-2)$. This holds regardless of whether these blocks belong to the same difficulty 310 | window or not. 311 | * The standard deviation on the sum of the duration of consecutive blocks depends on whether they belong to the same window or not: 312 | * If they do, the duration is distributed as $tn \cdot \mathit{Beta'}(\alpha=r, \beta=n-1)$, whose standard deviation is $nt/(n-2)\cdot\sqrt{r(n+r-2)/(n-3)}$; 313 | growing faster with $r$ than what would be expected for independent events. 314 | * If they don't, but we're looking at $c = r/n$ consecutive full windows, 315 | the standard deviation is $nt/(n-2)\cdot\sqrt{2n(2c+n-3)/(n-3)}$; 316 | growing significantly slower with $c$ than what would be expected for independent events. 317 | Specifically, for large $n$, that is roughly the standard deviation we'd expect for 318 | $2n+4c$ independent blocks, despite being about the duration of 319 | $cn$ blocks. 320 | 321 | For Bitcoin, with $n=2016$ and 322 | $t$ *10 minutes*, this means: 323 | * Blocks take on average *10 minutes, 0.5958 seconds*, which translates to 324 | windows taking *2 weeks, 20 minutes, 1.19 seconds*, a factor 325 | $2016/2014 \approx 1.000993$ more than what was targeted. 326 | * The standard deviation of a block is close to its mean, as would be expected for 327 | Poisson processes, namely *10 minutes, 0.8941 seconds*. 
328 | * The standard deviation for a window is *10 hours, 35 minutes, 55.59 seconds*, approximately $\sqrt{2}$ times what would 329 | be expected for independent Poisson processes. 330 | * The standard deviation for 104 windows (~4 years) is only barely larger: *11 hours, 7 minutes, 38.53 seconds*. 331 | If the window durations were independent, we'd expect this to roughly be $\sqrt{104} \approx 10.198$ times the standard deviation for one window. 332 | It is much less than that due to randomness in the intermediary windows being mostly compensated for by the difficulty of the successor window. 333 | 334 | The mean and standard deviation formulas listed here were verified by [simulations](simul.cpp). In fact, the results 335 | seem to mostly hold even when the hashrate is not constant but exponentially growing, in which 336 | case the mean and standard deviation roughly get divided by the growth rate per window. 337 | 338 | ## Acknowledgments 339 | 340 | Thanks to Clara Shikhelman for the discussion and many comments that led to this writeup. 
341 | -------------------------------------------------------------------------------- /bitcoin-difficulty-adjustment/simul.cpp: -------------------------------------------------------------------------------- 1 | #include <cstdint> 2 | #include <cstdio> 3 | #include <cmath> 4 | 5 | 6 | 7 | 8 | namespace { 9 | 10 | inline uint64_t RdRand() noexcept 11 | { 12 | uint8_t ok; 13 | uint64_t r; 14 | while (true) { 15 | __asm__ volatile (".byte 0x48, 0x0f, 0xc7, 0xf0; setc %1" : "=a"(r), "=q"(ok) :: "cc"); // rdrand %rax 16 | if (ok) break; 17 | __asm__ volatile ("pause" :); 18 | } 19 | return r; 20 | } 21 | 22 | static inline uint64_t Rotl(const uint64_t x, int k) { 23 | return (x << k) | (x >> (64 - k)); 24 | } 25 | 26 | /** Xoshiro256++ 1.0 */ 27 | class RNG { 28 | uint64_t s0, s1, s2, s3; 29 | 30 | public: 31 | RNG() : s0(RdRand()), s1(RdRand()), s2(RdRand()), s3(RdRand()) {} 32 | 33 | uint64_t operator()() { 34 | uint64_t t0 = s0, t1 = s1, t2 = s2, t3 = s3; 35 | const uint64_t result = Rotl(t0 + t3, 23) + t0; 36 | const uint64_t t = t1 << 17; 37 | t2 ^= t0; 38 | t3 ^= t1; 39 | t1 ^= t2; 40 | t0 ^= t3; 41 | t2 ^= t; 42 | t3 = Rotl(t3, 45); 43 | s0 = t0; 44 | s1 = t1; 45 | s2 = t2; 46 | s3 = t3; 47 | return result; 48 | } 49 | }; 50 | 51 | class StatRNG { 52 | RNG rng; 53 | 54 | public: 55 | long double Exp() { 56 | return -::logl((static_cast<long double>(rng()) + 0.5L) * 0.0000000000000000000542101086242752217L); // multiply by 2^-64 57 | } 58 | 59 | long double Erlang(int k) { 60 | long double ret = 0.0L; 61 | for (int i = 0; i < k; ++i) { 62 | ret += Exp(); 63 | } 64 | return ret; 65 | } 66 | }; 67 | 68 | template<typename F> 69 | void Simul(F f, int retarget) { 70 | long double diff = 1.0L; 71 | StatRNG rng; 72 | while (true) { 73 | long double most = rng.Erlang(retarget - 1) * diff; 74 | long double last = rng.Exp() * diff; 75 | diff /= most; 76 | f(most + last); 77 | } 78 | } 79 | 80 | static constexpr int RETARGET = 10; 81 | 82 | #ifndef KVAL 83 | static constexpr int K = 1; 84 | #else 85 | static 
constexpr int K = (KVAL); 86 | #endif 87 | 88 | static constexpr int SLACK = 3; 89 | static constexpr int PRINTFREQ = 40000000 / (RETARGET * (K + SLACK)); 90 | 91 | } // namespace 92 | 93 | int main(void) { 94 | int iter = 0; 95 | uint64_t cnt = 0; 96 | long double acc = 0.0L; 97 | long double sum = 0.0L; 98 | long double sum2 = 0.0L; 99 | long double sum3 = 0.0L; 100 | long double sum4 = 0.0L; 101 | constexpr long double CR = RETARGET; 102 | constexpr long double CEx = K * CR / (CR - 2.0L); 103 | constexpr long double CVar = 2.0L * CR * (CR + (2*K - 3)) / ((CR - 3.0L)*(CR - 2.0L)*(CR - 2.0L)); 104 | auto proc = [&](long double winlen) { 105 | iter += 1; 106 | if (iter >= SLACK) { 107 | acc += winlen; 108 | if (iter == K + SLACK - 1) { 109 | cnt += 1; 110 | long double acc2 = acc*acc; 111 | sum += acc; 112 | sum2 += acc2; 113 | sum3 += acc*acc2; 114 | sum4 += acc2*acc2; 115 | acc = 0.0L; 116 | iter = 0; 117 | if ((cnt % PRINTFREQ) == 0) { 118 | long double mu = sum / cnt; 119 | long double mu2p = sum2 / cnt; 120 | long double mu3p = sum3 / cnt; 121 | long double mu4p = sum4 / cnt; 122 | long double mu2 = mu2p - mu*mu; 123 | long double mu4 = mu4p + mu*(-4.0L*mu3p + mu*(6.0L*mu2p + -3.0L*mu*mu)); 124 | long double var = (mu2 * cnt) / (cnt - 1); 125 | long double smu = sqrtl(CVar/cnt); 126 | long double svar = sqrtl((mu4 - (cnt - 3)*CVar*CVar/(cnt - 1))/cnt); 127 | printf("%lu: avg=%.15Lf(+-%Lf; E%Lf) var=%.15Lf(+-%Lf, E%Lf)\n", (unsigned long)cnt, mu, smu, (mu - CEx) / smu, var, svar, (var - CVar) / svar); 128 | } 129 | } 130 | } 131 | }; 132 | Simul(proc, RETARGET); 133 | } 134 | -------------------------------------------------------------------------------- /elligator-square-for-bn/README.md: -------------------------------------------------------------------------------- 1 | # Elligator Squared for BN-like curves 2 | 3 | This document explains how to efficiently implement the Elligator Squared 4 | algorithm for BN curves and BN-like curves like `secp256k1`. 
5 | 6 | ## 1 Introduction 7 | 8 | ### 1.1 Elligator 9 | 10 | Sometimes it is desirable to be able to encode elliptic curve public keys as 11 | uniform byte strings. In particular, a Diffie-Hellman key exchange requires 12 | sending a group element in both directions, which for elliptic curve based 13 | variants implies sending a curve point. As the coordinates of such points 14 | satisfy the curve equation, this results in a detectable relation between 15 | the sent bytes if those coordinates are sent naively. Even if just the X 16 | coordinates of the points are sent, the knowledge that only around 50% of 17 | X coordinates have corresponding points on the curve means that an attacker 18 | who observes many connections can distinguish these from random: the probability 19 | that 30 randomly observed transmissions would all be valid X coordinates is less than 20 | one in a billion. 21 | 22 | Various approaches have been proposed to address this problem, including 23 | [Elligator](https://elligator.cr.yp.to/) and Elligator 2 by Bernstein et al., which both 24 | define a mapping between a subset of points on the curve and byte 25 | arrays, in such a way that the encoding of uniformly generated 26 | curve points within that set is indistinguishable from random bytes. 27 | This permits running ECDH more *covertly*: instead of permitting 28 | any public key as ECDH ephemeral, restrict the choice to those which 29 | have an Elligator mapping, and send the encoding. This requires on 30 | average 2 ECDH ephemeral keys to be generated, but it results 31 | in performing an ECDH negotiation with each party sending just 32 32 | uniform bytes to each other (for *256*-bit curves). 33 | 34 | Unfortunately, Elligator and Elligator 2 have requirements that make 35 | them incompatible with curves of odd order, such as BN curves. 
36 | 37 | ### 1.2 Elligator Squared 38 | 39 | In [this paper](https://eprint.iacr.org/2014/043.pdf), Tibouchi describes a more 40 | generic solution that works for any elliptic curve, 41 | and for any point on it rather than a subset, called Elligator Squared. The downside is that 42 | the encoding output is twice the size: for 256-bit curves, the encoding is 64 uniformly random bytes. 43 | On the upside, it's generally also faster, as it doesn't require generating multiple ECDH keys. 44 | 45 | It relies on a function *f* that maps field elements to points, with the following 46 | properties: 47 | * Every field element is mapped to a single valid point on the curve by *f*. 48 | * A significant fraction of the points on the curve have to be reachable (but not all). 49 | * The number of field elements that map to any given point must have a small upper bound *d*. 50 | * These preimages (the field elements that map to a given point) must be efficiently computable. 51 | 52 | The Elligator and Elligator 2 mapping functions can be used as *f* for curves where they exist, 53 | but because these requirements are less restrictive, adequate mapping functions exist for all elliptic curves, 54 | including ones with odd order. 55 | 56 | The Elligator Squared encoding then consists of **two** field elements, and together they 57 | represent the sum (elliptic curve group operation) of the points obtained by applying *f* to those two field elements. 58 | To decode such a pair *(u,v)*, just compute *f(u) + f(v)*. 59 | 60 | To find an encoding for a given point *P*, the following random sampling algorithm is used: 61 | * Loop: 62 | * Generate a uniformly random field element *u*. 63 | * Compute the point *Q = P - f(u)*. 64 | * Compute the set of preimages *t* of *Q* (so for every *v* in *t* it holds that *f(v) = Q*). This set can have any size in *[0,d]*. 65 | * Generate a random number *j* in range *[0,d)*. 66 | * If *j < len(t)*: return *(u,t[j])*, otherwise start over. 
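The structure of this sampling loop can be demonstrated end to end with a deliberately trivial stand-in for the curve: below, the additive group of integers mod *m* plays the role of the curve, and a toy map (plain reduction mod *m*) plays the role of *f*. All parameters are illustrative; the point of the sketch is the rejection step, which removes the bias between points with different preimage counts:

```python
import random

q, m = 103, 23          # toy "field" size and "curve" group order

def f(u):               # toy stand-in for the mapping function
    return u % m

# Precompute preimages by brute force. Here points have 4 or 5 preimages
# (103 = 4*23 + 11), so d = 5, and the rejection below corrects the skew.
preimages = {}
for u in range(q):
    preimages.setdefault(f(u), []).append(u)
d = max(len(v) for v in preimages.values())

def encode(P, rng):
    while True:
        u = rng.randrange(q)          # uniformly random field element
        Q = (P - f(u)) % m            # Q = P - f(u) (group operation)
        t = preimages.get(Q, [])
        j = rng.randrange(d)
        if j < len(t):                # keep with probability len(t)/d
            return u, t[j]

def decode(u, v):
    return (f(u) + f(v)) % m
```

Decoding any sampled pair returns the original point. With the real mapping function of Section 1.3 substituted for this toy *f*, the same loop produces the actual Elligator Squared encodings.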
67 | 68 | Effectively, this algorithm uniformly randomly picks one pair of field elements from the set of 69 | all those that encode *P*. It can be shown that the number of preimage pairs differs only 70 | negligibly between points, and thus this sampling algorithm results 71 | in a uniformly random pair of field elements, given uniformly random points *P*. 72 | 73 | ### 1.3 Mapping function for BN-like curves 74 | 75 | In [this paper](https://www.di.ens.fr/~fouque/pub/latincrypt12.pdf), Fouque and Tibouchi describe a so-called Shallue-van de Woestijne mapping function *f* 76 | that meets all the requirements above, for BN-like curves. 77 | 78 | Specifically, given a prime *p*, the field *F = GF(p)*, and the elliptic 79 | curve *E* over it defined by *y² = g(x) = x³ + b*, where *p mod 12 = 7* and *1+b* is a nonzero square in *F*, define 80 | the following constants (in *F*): 81 | * *c₁ = √(-3)* 82 | * *c₂ = (c₁ - 1) / 2* 83 | * *c₃ = (-c₁ - 1) / 2* 84 | 85 | And define the following 3 functions: 86 | * *q₁(s) = c₂ - c₁s / (1+b+s)* 87 | * *q₂(s) = c₃ + c₁s / (1+b+s)* 88 | * *q₃(s) = 1 - (1+b+s)² / (3s)* 89 | 90 | Then the paper shows that given a nonzero square *s* in *F*, *g(q₁(s))×g(q₂(s))×g(q₃(s))* 91 | will also be square, or in other words, either exactly one of *{q₁(s), q₂(s), q₃(s)}*, or 92 | all three of them, are valid X coordinates on *E*. For *s=0*, *q₁(0)* and *q₂(0)* map to valid points 93 | on the curve, while *q₃(0)* is not defined (division by zero). 94 | 95 | With that, the function *f(u)* can be defined as follows: 96 | * Compute *x₁ = q₁(u²)*, *x₂ = q₂(u²)*, *x₃ = q₃(u²)* 97 | * Let *x* be the first of *{x₁,x₂,x₃}* that's a valid X coordinate on *E* (i.e., *g(x)* is square). 98 | * Let *y* be the square root of *g(x)* whose parity equals that of *u* (every nonzero square mod *p* has two distinct square roots, negations of each other, of which one is even and one is odd). 99 | * Return *(x,y)*. 
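As a concrete illustration, the following sketch implements *f* over a toy curve *y² = x³ + 4* over *GF(31)* (note *31 mod 12 = 7*, and *1+b = 5* is a nonzero square mod 31). The parameters are illustrative stand-ins chosen small enough to check exhaustively; a real implementation would use the field of a full-size BN-like curve such as secp256k1:

```python
p, b = 31, 4            # toy parameters: p % 12 == 7, 1+b a nonzero square

def g(x):
    return (x * x * x + b) % p

def is_square(a):
    return pow(a, (p - 1) // 2, p) in (0, 1)   # Euler's criterion

def sqrt_mod(a):
    return pow(a, (p + 1) // 4, p)             # works because p % 4 == 3

c1 = sqrt_mod(-3 % p)                # c1 = sqrt(-3)
c2 = (c1 - 1) * pow(2, -1, p) % p    # c2 = (c1 - 1)/2
c3 = (-c1 - 1) * pow(2, -1, p) % p   # c3 = (-c1 - 1)/2

def f(u):
    s = u * u % p
    w = c1 * s * pow(1 + b + s, -1, p) % p     # 1+b+s is nonzero for square s
    xs = [(c2 - w) % p, (c3 + w) % p]          # q1(s), q2(s)
    if s:                                      # q3 is undefined at s = 0
        xs.append((1 - (1 + b + s) ** 2 * pow(3 * s, -1, p)) % p)
    for x in xs:                               # first valid X coordinate wins
        if is_square(g(x)):
            y = sqrt_mod(g(x))
            if y % 2 != u % 2:                 # pick the root matching u's parity
                y = -y % p
            return x, y
```

For these toy parameters, evaluating *f* on all 31 field elements lands on the curve every time, and no point ends up with more than 4 preimages.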
100 | 101 | The function *f* meets all our requirements. It maps every field element to a curve point, around *56.25%* of curve points are reached, no point has more 102 | than *d=4* preimages, and those preimages can be efficiently computed. Furthermore, when implemented this way, divisions by zero are not 103 | a concern. The *1+b+s* in *q1* and *q2* is never zero for square *s*: it would require *s = -1-b*, but *-1-b* is never square, as *1+b* is a nonzero square and *-1* is a non-square (because *p mod 4 = 3*). 104 | The *3s* in *q3* can be *0*, but that case won't be reached, as *q1(0)* already lands on the curve. 105 | 106 | ## 2 Specializing Elligator Squared 107 | 108 | ### 2.1 Inverting *f* 109 | 110 | Elligator Squared needs an efficient way to find the field elements *v* for which *f(v) = Q*, given *Q*. 111 | We start by defining 4 partial inverse functions *r1..4* for *f*. 112 | Given an *(x,y)* coordinate pair on the curve, each of these either returns *⊥*, or returns a field element *t* such that *f(t) = (x,y)*. 113 | 114 | ***ri(x,y)***: 115 | * Compute *s*, a candidate preimage of *x* under the corresponding *q* function: 116 | * If *i=1*: *q1⁻¹(x)*: *s = (1+b)(c1-z) / (c1+z)* where *z = 2x+1* 117 | * If *i=2*: *q2⁻¹(x)*: *s = (1+b)(c1+z) / (c1-z)* where *z = 2x+1* 118 | * If *i=3*: *q3⁻¹(x)*: *s = (z + √(z² - 16(b+1)²))/4* where *z = 2-4b-6x* 119 | * If *i=4*: *q3⁻¹(x)*: *s = (z - √(z² - 16(b+1)²))/4* where *z = 2-4b-6x* 120 | * If *s* does not exist (because of division by zero, or a non-existing square root), return *⊥*. 121 | * If *s* is not square: return *⊥* 122 | * For all *j* in *1..min(i-1,2)*: 123 | * If *g(qj(s))* is square: return *⊥*. This guarantees that the constructed preimage round-trips back through the corresponding forward *qi*, and not through a lower-numbered one.
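As a sanity check of these properties, *f* can be evaluated exhaustively over a small field. The following plain-Python sketch uses the same illustrative toy parameters (*p = 31*, *b = 1*; since *p ≡ 3 mod 4*, a square root is a single exponentiation) and verifies totality, that all outputs lie on the curve, and the *d = 4* preimage bound:

```python
from collections import Counter

p, b = 31, 1                                  # illustrative small parameters
def is_sq(a): return pow(a % p, (p - 1) // 2, p) <= 1
def sqrt_mod(a): return pow(a % p, (p + 1) // 4, p)   # valid since p % 4 == 3
inv = lambda a: pow(a, p - 2, p)
c1 = next(t for t in range(1, p) if t * t % p == p - 3)
c2, c3 = (c1 - 1) * inv(2) % p, (-c1 - 1) * inv(2) % p

def f(u):
    s = u * u % p
    xs = [(c2 - c1 * s * inv(1 + b + s)) % p,
          (c3 + c1 * s * inv(1 + b + s)) % p]
    if s:                                     # q3 is undefined at s = 0
        xs.append((1 - (1 + b + s) ** 2 * inv(3 * s)) % p)
    x = next(x for x in xs if is_sq(x ** 3 + b))   # first valid X coordinate
    y = sqrt_mod(x ** 3 + b)
    return (x, y) if y % 2 == u % 2 else (x, (-y) % p)

images = Counter(f(u) for u in range(p))      # f(u) for every field element
assert all((y * y - x ** 3 - b) % p == 0 for x, y in images)  # on the curve
assert max(images.values()) <= 4              # no point has > 4 preimages
```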
124 | * Compute *u = √s* 125 | * If *is_odd(u) = is_odd(y)*: 126 | * Return *u* 127 | * Else: 128 | * If *u=0*: return *⊥* (it would require an odd *0*, but negating *0* doesn't change its parity) 129 | * Return *-u* 130 | 131 | The (multi-valued) *f⁻¹(x,y)* function can be defined as the set of non-*⊥* 132 | values in *{r1(x,y),r2(x,y),r3(x,y),r4(x,y)}*, as every preimage of *(x,y)* under *f* is one of these four. 133 | 134 | ### 2.2 Avoiding computation of all inverses 135 | 136 | It turns out that we don't actually need to evaluate *f⁻¹(x,y)* in full. 137 | The Elligator Squared sampling loop is effectively the following: 138 | * Loop: 139 | * Generate a uniformly random field element *u*. 140 | * Compute the point *Q = P - f(u)*. 141 | * Compute the list of distinct preimages *t = f⁻¹(Q)*. 142 | * Pad *t* with *⊥* elements to size *4* (where *d=4* is the maximum number of preimages for any given point). 143 | * Select a uniformly random *v* in *t*. 144 | * If *v* is not *⊥*, return *(u,v)*; otherwise start over. 145 | 146 | In this loop, an alternative list *t' = [r1(x,y),r2(x,y),r3(x,y),r4(x,y)]* 147 | can be used. It has exactly the same elements as the padded *t* above, except potentially in a different order. This is a valid 148 | alternative because if all we're going to do is select a uniformly random element from it, the order of this list is irrelevant. 149 | To be precise, we do need to deal with the edge case where multiple *ri(x,y)* for distinct *i* values are the same, as this would 150 | introduce a bias. To do so, we add to the definition of *ri(x,y)* that *⊥* is returned if *rj(x,y) = ri(x,y)* for *j < i*. 151 | 152 | Selecting a uniformly random element from *t'* is easy: just select one of the four *ri* functions, 153 | evaluate it in *Q*, and start over if the result is *⊥*: 154 | * Loop: 155 | * Generate a uniformly random field element *u*. 156 | * Compute the point *Q = P - f(u)*. 157 | * Generate a uniformly random *j* in *1..4*.
158 | * Compute *v = rj(Q)*. 159 | * If *v* is not *⊥*, return *(u,v)*; otherwise start over. 160 | 161 | This avoids the need to compute *t* or *t'* in their entirety. 162 | 163 | ### 2.3 Simplifying the round-trip checks 164 | 165 | As explained in Paragraph 2.1, the *ri>1* partial reverse functions must check that the value obtained through the *qi⁻¹* formula 166 | doesn't map to the curve through a lower-numbered forward *qi* function. 167 | 168 | For *r2*, this means checking that *q1(q2⁻¹(x))* isn't on the curve. 169 | Thankfully, *q1(q2⁻¹(x))* is just *-x-1*, which simplifies the check. 170 | 171 | For *r3,4* it is actually unnecessary to check forward mappings through both *q1* and *q2*. 172 | The Shallue-van de Woestijne construction of *f* guarantees that either exactly one, or all three, of the *qi* 173 | functions land on the curve. When computing an inverse through *q3⁻¹*, and that inverse exists, 174 | then we know it certainly lands on the curve when forward mapping through *q3*. That implies that either 175 | both *q1* and *q2* also land on the curve, or neither of them does. Thus for *r3,4* it 176 | suffices to check that *q1* doesn't land on the curve. 177 | 178 | ### 2.4 Simplifying the duplicate preimage checks 179 | 180 | As explained in Paragraph 2.2, we need to deal with the edge case where multiple *ri(x,y)* with distinct *i* map to the same value. 181 | 182 | Most of these checks are actually unnecessary. For example, yes, it is possible that *q1⁻¹(x)* and *q2⁻¹(x)* 183 | are equal (when *x = -1/2*), but when that is the case, it is obvious that *q1(q2⁻¹(x))* will be on the curve as 184 | well (as it is equal to *x*), and thus the round-trip check will already cause *⊥* to be returned. 185 | 186 | There is only one case that isn't covered by the round-trip check already: 187 | *r4(x,y)* may match *r3(x,y)*, which isn't covered because they both use the same forward *q3(x)*. 188 | This happens when either *x = (-1-4b)/3* or *x = 1*.
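The identity behind *r2*'s simplified round-trip check can be spot-checked numerically. The sketch below (plain Python, with the same illustrative toy parameters *p = 31*, *b = 1* as before) verifies that *q1(q2⁻¹(x)) = -x-1* for every *x* where *q2⁻¹* is defined:

```python
p, b = 31, 1                                  # illustrative toy parameters
inv = lambda a: pow(a, p - 2, p)
c1 = next(t for t in range(1, p) if t * t % p == p - 3)
c2 = (c1 - 1) * inv(2) % p
q1 = lambda s: (c2 - c1 * s * inv(1 + b + s)) % p
for x in range(p):
    z = (2 * x + 1) % p
    if (c1 - z) % p == 0:                     # q2^-1 undefined here
        continue
    s = (1 + b) * (c1 + z) * inv((c1 - z) % p) % p   # s = q2^-1(x)
    assert q1(s) == (-x - 1) % p              # q1(q2^-1(x)) = -x-1
```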
189 | 190 | Note that failing to implement these checks will only introduce a negligible bias, as these cases are too rare to occur in practice for cryptographically-sized curves 191 | when, as in Elligator Squared, only random inputs are used. 192 | They are mentioned here for completeness, as it helps when writing small-curve tests where correctness can be verified exhaustively. 193 | 194 | ### 2.5 Dealing with infinity 195 | 196 | The point at infinity is not a valid public key, but it is possible to construct field elements *u* and *v* such that 197 | *f(u)+f(v)=∞*. To make sure every input *(u,v)* can be decoded, it is preferable to remap this special case 198 | to another point on the curve. A good choice is mapping this case to *f(u)* instead: it's easy to implement, 199 | not terribly non-uniform, and even easy to make the encoder target this case (though with actually randomly generated *u*, 200 | the bias from not doing so is negligible). 201 | 202 | On the decoder side, one simply needs to remember *f(u)*, and if *f(u)+f(v)* is the point at infinity, return that instead. 203 | 204 | On the encoder side, one can detect the case in the loop where *Q=P-f(u)=∞*; this corresponds to the situation where 205 | *P=f(u)*. No *v* exists for which *f(v)=Q* in that case, but due to the special rule on the decoder side, it is possible 206 | to target *f(v)=-f(u)* in that case. As *-f(u)* is computed already in the process of finding *Q*, it suffices to try to 207 | find preimages for that. 208 | 209 | ### 2.6 Putting it all together 210 | 211 | The full algorithm can be written as follows in Python-like pseudocode. Note that the variables 212 | (except the indices *i* and *j*) represent field elements and are not normal Python integers (with some helper functions it is valid [Sage code](test.sage), though).
213 | 214 | ```python 215 | def f(u): 216 | s = u**2 # Turn u into a square to be fed to the q_i functions 217 | x1 = c2 - c1*s / (1+b+s) # x1 = q_1(s) 218 | g1 = x1**3 + b # g1 = g(x1) 219 | if is_square(g1): # x1 is on the curve 220 | x, g = x1, g1 221 | else: 222 | x2 = -x1-1 # x2 = q_2(s) 223 | g2 = x2**3 + b 224 | if is_square(g2): # x2 is on the curve 225 | x, g = x2, g2 226 | else: # Neither x1 or x2 is on the curve, so x3 is 227 | x3 = 1 - (1+b+s)**2 / (3*s) # x3 = q3(s) 228 | g3 = x3**3 + b # g3 = g(x3) 229 | x, g = x3, g3 230 | y = sqrt(g) 231 | if is_odd(y) == is_odd(u): 232 | return (x, y) 233 | else: 234 | return (x, -y) 235 | ``` 236 | 237 | Note that the above forward-mapping function *f* differs from the one in the paper from Paragraph 1.3. That's because 238 | the version there aims for constant-time operation. That matters for certain applications, but not for Elligator Squared 239 | which is both inherently variable-time due to the sampling loop, and at least in the context of ECDH, does not operate 240 | on secret data that must be protected from side-channel attacks. 241 | 242 | ```python 243 | def r(Q,i): 244 | x, y = Q 245 | if i == 1 or i == 2: 246 | z = 2*x + 1 247 | t1 = c1 - z 248 | t2 = c1 + z 249 | if not is_square(t1*t2): 250 | # If t1*t2 is not square, then q1^-1(x)=(1+b)*t1/t2 or 251 | # q2^-1(x)=(1+b)*t2/t1 aren't either. 252 | return None 253 | if i == 1: 254 | if t2 == 0: 255 | return None # Would be division by 0. 256 | if t1 == 0 and is_odd(y): 257 | return None # Would require odd 0. 258 | u = sqrt((1+b)*t1/t2) 259 | else: 260 | x1 = -x-1 # q1(q2^-1(x)) = -x-1 261 | if is_square(x1**3 + b): 262 | return None # Would roundtrip through q1 instead of q2. 263 | # On the next line, t1 cannot be 0, because in that case z = c1, or 264 | # x = c2, or x1 == c3, and c3 is a valid X coordinate on the curve 265 | # (c3**3 + b == 1+b which is square), so the roundtrip check above 266 | # already catches this. 
267 | u = sqrt((1+b)*t2/t1) 268 | else: # i == 3 or i == 4 269 | z = 2 - 4*b - 6*x 270 | if not is_square(z**2 - 16*(b+1)**2): 271 | return None # Inner square root in q3^-1 doesn't exist. 272 | if i == 3: 273 | s = (z + sqrt(z**2 - 16*(b+1)**2)) / 4 # s = q3^-1(x) 274 | else: 275 | if z**2 == 16*(b+1)**2: 276 | return None # r_3(x,y) == r_4(x,y) 277 | s = (z - sqrt(z**2 - 16*(b+1)**2)) / 4 # s = q3^-1(x) 278 | if not is_square(s): 279 | return None # q3^-1(x) is not square. 280 | # On the next line, (1+b+s) cannot be 0, because both (b+1) and 281 | # s are square, and 1+b is nonzero. 282 | x1 = c2 - c1*s / (1+b+s) 283 | if is_square(x1**3 + b): 284 | return None # Would roundtrip through q1 instead of q3. 285 | u = sqrt(s) 286 | if is_odd(y) == is_odd(u): 287 | return u 288 | else: 289 | return -u 290 | ``` 291 | 292 | ```python 293 | def encode(P): 294 | while True: 295 | u = field_random() 296 | T = curve_negate(f(u)) 297 | Q = curve_add(T, P) 298 | if is_infinity(Q): Q = T 299 | j = secrets.choice([1,2,3,4]) 300 | v = r(Q, j) 301 | if v is not None: return (u, v) 302 | ``` 303 | 304 | ```python 305 | def decode(u, v): 306 | T = f(u) 307 | P = curve_add(T, f(v)) 308 | if is_infinity(P): P = T 309 | return P 310 | ``` 311 | 312 | ### 2.7 Encoding to bytes 313 | 314 | The code above permits encoding group elements into uniform pairs of field elements, and back. However, 315 | our actual goal is encoding and decoding to/from *bytes*. How to do that depends on how close the 316 | field size *p* is to a power of *2*, and to a power of *256*. 317 | 318 | First, in case *p* is close to a power of two (*(2^⌈log2(p)⌉ - p)/√p* is close to *1* or smaller), the 319 | field elements can be encoded as bytes directly, and concatenated, possibly after padding with random bits. In this case, 320 | directly encoding field elements as bits is indistinguishable from uniform. 321 | 322 | Note that in this section, the variables represent integers again, and not field elements.
323 | 324 | ```python 325 | P = ... # field size 326 | FIELD_BITS = P.bit_length() 327 | FIELD_BYTES = (FIELD_BITS + 7) // 8 328 | PAD_BITS = FIELD_BYTES*8 - FIELD_BITS 329 | 330 | def encode_bytes(pt): 331 | u, v = encode(pt) 332 | up = u + (secrets.randbits(PAD_BITS) << FIELD_BITS) 333 | vp = v + (secrets.randbits(PAD_BITS) << FIELD_BITS) 334 | return up.to_bytes(FIELD_BYTES, 'big') + vp.to_bytes(FIELD_BYTES, 'big') 335 | 336 | def decode_bytes(enc): 337 | u = (int.from_bytes(enc[:FIELD_BYTES], 'big') & ((1 << FIELD_BITS) - 1)) % P 338 | v = (int.from_bytes(enc[FIELD_BYTES:], 'big') & ((1 << FIELD_BITS) - 1)) % P 339 | return decode(u, v) 340 | ``` 341 | 342 | Of course, in case `PAD_BITS` is *0*, the padding and masking can be left out. If `encode` is inlined 343 | into `encode_bytes`, an additional optimization is possible where *u* is not generated as a random 344 | field element, but as a random padded number directly. 345 | 346 | ```python 347 | def encode_bytes(pt): 348 | while True: 349 | up = secrets.randbits(FIELD_BYTES * 8) 350 | u = (up & ((1 << FIELD_BITS) - 1)) % P 351 | T = curve_negate(f(u)) 352 | Q = curve_add(T, pt) 353 | if is_infinity(Q): Q = T 354 | j = secrets.choice([1,2,3,4]) 355 | v = r(Q, j) 356 | if v is not None: 357 | vp = v + (secrets.randbits(PAD_BITS) << FIELD_BITS) 358 | return up.to_bytes(FIELD_BYTES, 'big') + vp.to_bytes(FIELD_BYTES, 'big') 359 | ``` 360 | 361 | In case *p* is **not** close to a power of two, a different approach is needed. The code below implements the 362 | algorithm suggested in the Elligator Squared paper: 363 | 364 | ```python 365 | P = ...
# field size 366 | P2 = P**2 # field size squared 367 | ENC_BYTES = (P2.bit_length() * 5 + 31) // 32 368 | ADD_RANGE = (256**ENC_BYTES) // P2 369 | THRESH = (256**ENC_BYTES) % P2 370 | 371 | def encode_bytes(pt): 372 | u, v = encode(pt) 373 | w = u*P + v 374 | w += secrets.randbelow(ADD_RANGE + (w < THRESH))*P2 375 | return w.to_bytes(ENC_BYTES, 'big') 376 | 377 | def decode_bytes(enc): 378 | w = int.from_bytes(enc, 'big') % P2 379 | u, v = w // P, w % P 380 | return decode(u, v) 381 | ``` 382 | 383 | ## 3 Optimizations 384 | 385 | Next we'll convert these algorithms to a shape that's more easily mapped to low-level implementations. 386 | 387 | ### 3.1 Delaying/avoiding inversions and square roots 388 | 389 | Techniques: 390 | * Delay inversions where possible by operating on fractions until the exact result is actually required. 391 | * Avoid inversions inside *is_square*: *is_square(n / d) = is_square(nd)* (multiplication with *d²*), and *is_square((n/d)³ + b) = is_square(n³d + bd⁴)* (multiplication with *d⁴*). 392 | * Make arguments to *is_square* and *sqrt* identical if the latter follows the former. This means that implementations without a fast Jacobi symbol can just compute and use the square root directly instead. 393 | * Multiply with small constants to avoid divisions. 394 | * Avoid recomputing common subexpressions. 395 | 396 | ```python 397 | def f(u): 398 | s = u**2 399 | # Write x1 as fraction: x1 = n / d 400 | d = 1+b+s 401 | n = c2*d - c1*s 402 | # Compute h = g1*d**4, avoiding a division. 403 | h = d*n**3 + b*d**4 404 | if is_square(h): 405 | # If h is square, then so is g1. 406 | i = 1/d 407 | x = n*i # x = n/d 408 | y = sqrt(h)*i**2 # y = sqrt(g) = sqrt(h)/d**2 409 | else: 410 | # Update n so that x2 = n / d 411 | n = -n-d 412 | # And update h so that h = g2*d**4 413 | h = d*n**3 + b*d**4 414 | if is_square(h): 415 | # If h is square, then so is g2.
416 | i = 1/d 417 | x = n*i # x = n/d 418 | y = sqrt(h)*i**2 # y = sqrt(g) = sqrt(h)/d**2 419 | else: 420 | x = 1 - d**2 / (3*s) 421 | y = sqrt(x**3 + b) 422 | if is_odd(y) != is_odd(u): 423 | y = -y 424 | return (x, y) 425 | ``` 426 | 427 | ```python 428 | def r(x,y,i): 429 | if i == 1 or i == 2: 430 | z = 2*x + 1 431 | t1 = c1 - z 432 | t2 = c1 + z 433 | t3 = (1+b) * t1 * t2 434 | if not is_square(t3): 435 | return None 436 | if i == 1: 437 | if t2 == 0: 438 | return None 439 | if t1 == 0 and is_odd(y): 440 | return None 441 | u = sqrt(t3) / t2 442 | else: 443 | if is_square((-x-1)**3 + b): 444 | return None 445 | u = sqrt(t3) / t1 446 | else: 447 | z = 2 - 4*b - 6*x 448 | t1 = z**2 - 16*(b+1)**2 449 | if not is_square(t1): 450 | return None 451 | r = sqrt(t1) 452 | if i == 3: 453 | t2 = z + r 454 | else: 455 | if r == 0: 456 | return None 457 | t2 = z - r 458 | # t2 is now 4*s (delaying the division by 4) 459 | if not is_square(t2): 460 | return None 461 | # Write x1 as a fraction: x1 = n / d 462 | d = 4*(b+1) + t2 463 | n = (2*(b+1)*(c1-1)) + c3*t2 464 | # Compute h = g1*d**4 465 | h = n**3*d + b*d**4 466 | if is_square(h): 467 | # If h is square then so is g1. 468 | return None 469 | u = sqrt(t2) / 2 470 | if is_odd(y) != is_odd(u): 471 | u = -u 472 | return u 473 | ``` 474 | 475 | ### 3.2 Broken down implementation 476 | 477 | The [test.sage](test.sage) script contains both implementations above, plus an additional 478 | one with one individual field operation per line, reusing variables where possible. 479 | This format can be translated more directly to a low-level implementation.
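As a numeric sanity check of the inversion-avoiding identities from Paragraph 3.1, the snippet below (plain Python; the Mersenne prime *p = 2⁶¹-1*, the constant *b = 7*, and the trial count are arbitrary illustrative choices) verifies both *is_square* rewrites on random inputs:

```python
import random

p = 2 ** 61 - 1                               # an arbitrary Mersenne prime
b = 7                                         # arbitrary curve constant
def is_sq(a): return pow(a % p, (p - 1) // 2, p) <= 1   # Euler's criterion
rng = random.Random(1)
for _ in range(100):
    n, d = rng.randrange(1, p), rng.randrange(1, p)
    x = n * pow(d, -1, p) % p                 # x = n/d
    assert is_sq(x) == is_sq(n * d)           # multiply by the square d**2
    assert is_sq(x ** 3 + b) == is_sq(n ** 3 * d + b * d ** 4)  # by d**4
```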
480 | 481 | ## 4 Performance 482 | 483 | ### 4.1 Operation counts 484 | 485 | The implementation above requires the following average operation counts: 486 | * For ***f()***: *1* inversion, *1.5* square tests, *1* square root 487 | * For ***r()***: *0.1875* inversions, *1.5* square tests, *0.5* square roots 488 | * For ***encode()***: *8.75* inversions, *12* square tests, *6* square roots 489 | * For ***decode()***: *3* inversions, *3* square tests, *2* square roots 490 | 491 | When no square test function is available, or only one that takes more than half 492 | the time of a square root, it is better to replace the square tests with square roots. 493 | As many square tests are followed by an actual square root of the same argument 494 | when successful, these square roots don't need to be repeated. The operation counts 495 | for the resulting algorithm then become: 496 | * For ***f()***: *1* inversion, *1.75* square roots 497 | * For ***r()***: *0.1875* inversions, *1.5* square roots 498 | * For ***encode()***: *8.75* inversions, *13* square roots 499 | * For ***decode()***: *3* inversions, *3.5* square roots 500 | 501 | ### 4.2 Benchmarks 502 | 503 | On a Ryzen 5950X system, an implementation following the procedure described in 504 | this document using the [libsecp256k1](https://github.com/bitcoin-core/secp256k1/pull/982) 505 | library takes *47.8 µs* for encoding and *14.3 µs* for decoding. 506 | For reference, the same system needs *0.86 µs* for a (variable-time) 507 | modular inverse, *1.1 µs* for a (variable-time) quadratic character, 508 | *3.8 µs* for a square root, and *37.6 µs* for an ECDH evaluation.
509 | 510 | ## 5 Alternatives 511 | 512 | In the [draft RFC](https://datatracker.ietf.org/doc/draft-irtf-cfrg-hash-to-curve/) 513 | on hashing to elliptic curves, an alternative approach is discussed: transform points 514 | on a *y² = x³ + b* curve to an isogenous *y² = x³ + a'x + b'* 515 | curve, with *a'b' ≠ 0*, and then use the mapping function by [Brier et al.](https://eprint.iacr.org/2009/340.pdf) 516 | on that curve. This has the advantage 517 | of having a simpler (and computationally cheaper) forward mapping function *f*. However, the reverse mapping 518 | function *r* in this case is computationally more expensive due to higher-degree equations. Overall, for 519 | Elligator Squared encoding and decoding, these two roughly cancel out, while making the algorithm 520 | significantly more complex. There is some ugliness too in that the conversion to the isogenous curve and 521 | back does not round-trip exactly, but maps points to a (fixed) multiple of themselves. This isn't 522 | exactly a problem in the ECDH setting, but it means the Elligator Squared routines can't be treated as 523 | a black-box encoder and decoder.
524 | -------------------------------------------------------------------------------- /elligator-square-for-bn/test.sage: -------------------------------------------------------------------------------- 1 | def is_odd(n): 2 | return (int(n) & 1) != 0 3 | 4 | def f1(u): 5 | """Forward mapping function, naively.""" 6 | s = u**2 # Turn u into a square to be fed to the q_i functions 7 | x1 = c2 - c1*s / (1+b+s) # x1 = q_1(s) 8 | g1 = x1**3 + b # g1 = g(x1) 9 | if is_square(g1): # x1 is on the curve 10 | x, g = x1, g1 11 | else: 12 | x2 = -x1-1 # x2 = q_2(s) 13 | g2 = x2**3 + b 14 | if is_square(g2): # x2 is on the curve 15 | x, g = x2, g2 16 | else: # Neither x1 nor x2 is on the curve, so x3 is 17 | x3 = 1 - (1+b+s)**2 / (3*s) # x3 = q3(s) 18 | g3 = x3**3 + b # g3 = g(x3) 19 | x, g = x3, g3 20 | y = sqrt(g) 21 | if is_odd(y) == is_odd(u): 22 | return (x, y) 23 | else: 24 | return (x, -y) 25 | 26 | def r1(x,y,i): 27 | """Reverse mapping function, naively.""" 28 | if i == 1 or i == 2: 29 | z = 2*x + 1 30 | t1 = c1 - z 31 | t2 = c1 + z 32 | if not is_square(t1*t2): 33 | # If t1*t2 is not square, then q1^-1(x)=(1+b)*t1/t2 or 34 | # q2^-1(x)=(1+b)*t2/t1 aren't either. 35 | return None 36 | if i == 1: 37 | if t2 == 0: 38 | return None # Would be division by 0. 39 | if t1 == 0 and is_odd(y): 40 | return None # Would require odd 0. 41 | u = sqrt((1+b)*t1/t2) 42 | else: 43 | x1 = -x-1 # q1(q2^-1(x)) = -x-1 44 | if is_square(x1**3 + b): 45 | return None # Would roundtrip through q1 instead of q2. 46 | # On the next line, t1 cannot be 0, because in that case z = c1, or 47 | # x = c2, or x1 == c3, and c3 is a valid X coordinate on the curve 48 | # (c3**3 + b == 1+b which is square), so the roundtrip check above 49 | # already catches this. 50 | u = sqrt((1+b)*t2/t1) 51 | else: # i == 3 or i == 4 52 | z = 2 - 4*b - 6*x 53 | if not is_square(z**2 - 16*(b+1)**2): 54 | return None # Inner square root in q3^-1 doesn't exist.
55 | if i == 3: 56 | s = (z + sqrt(z**2 - 16*(b+1)**2)) / 4 # s = q3^-1(x) 57 | else: 58 | if z**2 == 16*(b+1)**2: 59 | return None # r_3(x,y) == r_4(x,y) 60 | s = (z - sqrt(z**2 - 16*(b+1)**2)) / 4 # s = q3^-1(x) 61 | if not is_square(s): 62 | return None # q3^-1(x) is not square. 63 | # On the next line, (1+b+s) cannot be 0, because both (b+1) and 64 | # s are square, and 1+b is nonzero. 65 | x1 = c2 - c1*s / (1+b+s) 66 | if is_square(x1**3 + b): 67 | return None # Would roundtrip through q1 instead of q3. 68 | u = sqrt(s) 69 | if is_odd(y) == is_odd(u): 70 | return u 71 | else: 72 | return -u 73 | 74 | def f2(u): 75 | """Forward mapping function, optimized.""" 76 | s = u**2 77 | # Write x1 as fraction: x1 = n / d 78 | d = 1+b+s 79 | n = c2*d - c1*s 80 | # Compute h = g1*d**4, avoiding a division. 81 | h = d*n**3 + b*d**4 82 | if is_square(h): 83 | # If h is square, then so is g1. 84 | i = 1/d 85 | x = n*i # x = n/d 86 | y = sqrt(h)*i**2 # y = sqrt(g) = sqrt(h)/d**2 87 | else: 88 | # Update n so that x2 = n / d 89 | n = -n-d 90 | # And update h so that h = g2*d**4 91 | h = d*n**3 + b*d**4 92 | if is_square(h): 93 | # If h is square, then so is g2. 
94 | i = 1/d 95 | x = n*i # x = n/d 96 | y = sqrt(h)*i**2 # y = sqrt(g) = sqrt(h)/d**2 97 | else: 98 | x = 1 - d**2 / (3*s) 99 | y = sqrt(x**3 + b) 100 | if is_odd(y) != is_odd(u): 101 | y = -y 102 | return (x, y) 103 | 104 | def r2(x,y,i): 105 | """Reverse mapping function, optimized.""" 106 | if i == 1 or i == 2: 107 | z = 2*x + 1 108 | t1 = c1 - z 109 | t2 = c1 + z 110 | t3 = (1+b) * t1 * t2 111 | if not is_square(t3): 112 | return None 113 | if i == 1: 114 | if t2 == 0: 115 | return None 116 | if t1 == 0 and is_odd(y): 117 | return None 118 | u = sqrt(t3) / t2 119 | else: 120 | if is_square((-x-1)**3 + b): 121 | return None 122 | u = sqrt(t3) / t1 123 | else: 124 | z = 2 - 4*b - 6*x 125 | t1 = z**2 - 16*(b+1)**2 126 | if not is_square(t1): 127 | return None 128 | r = sqrt(t1) 129 | if i == 3: 130 | t2 = z + r 131 | else: 132 | if r == 0: 133 | return None 134 | t2 = z - r 135 | # t2 is now 4*s (delaying the division by 4) 136 | if not is_square(t2): 137 | return None 138 | # Write x1 as a fraction: x1 = n / d 139 | d = 4*(b+1) + t2 140 | n = (2*(b+1)*(c1-1)) + c3*t2 141 | # Compute h = g1*d**4 142 | h = n**3*d + b*d**4 143 | if is_square(h): 144 | # If h is square then so is g1.
145 | return None 146 | u = sqrt(t2) / 2 147 | if is_odd(y) != is_odd(u): 148 | u = -u 149 | return u 150 | 151 | def f3(u): 152 | """Forward mapping function, broken down.""" 153 | t0 = u**2 # t0 = s = u**2 154 | t1 = (1+b) + t0 # t1 = d = 1+b+s 155 | t3 = (-c1) * t0 # t3 = -c1*s 156 | t2 = c2 * t1 # t2 = c2*d 157 | t2 = t2 + t3 # t2 = n = c2*d - c1*s 158 | t4 = t1**2 # t4 = d**2 159 | t4 = t4**2 # t4 = d**4 160 | t4 = b * t4 # t4 = b*d**4 161 | t3 = t2**2 # t3 = n**2 162 | t3 = t2 * t3 # t3 = n**3 163 | t3 = t1 * t3 # t3 = d*n**3 164 | t3 = t3 + t4 # t3 = h = d*n**3 + b*d**4 165 | if is_square(t3): 166 | t3 = sqrt(t3) # t3 = sqrt(h) 167 | t1 = 1/t1 # t1 = i = 1/d 168 | x = t2 * t1 # x = n*i 169 | t1 = t1**2 # t1 = i**2 170 | y = t3 * t1 # y = sqrt(h)*i**2 171 | else: 172 | t2 = t1 + t2 # t2 = n+d 173 | t2 = -t2 # t2 = n = -n-d 174 | t3 = t2**2 # t3 = n**2 175 | t3 = t2 * t3 # t3 = n**3 176 | t3 = t1 * t3 # t3 = d*n**3 177 | t3 = t3 + t4 # t3 = h = d*n**3 + b*d**4 178 | if is_square(t3): 179 | t3 = sqrt(t3) # t3 = sqrt(h) 180 | t1 = 1/t1 # t1 = i = 1/d 181 | x = t2*t1 # x = n*i 182 | t1 = t1**2 # t1 = i**2 183 | y = t3*t1 # y = sqrt(g)*i**2 184 | else: 185 | t0 = 3*t0 # t0 = 3*s 186 | t0 = 1/t0 # t0 = 1/(3*s) 187 | t1 = t1**2 # t1 = d**2 188 | t0 = t1 * t0 # t0 = d**2 / (3*s) 189 | t0 = -t0 # t0 = -d**2 / (3*s) 190 | x = 1 + t0 # x = 1 - d**2 / (3*s) 191 | t0 = x**2 # t0 = x**2 192 | t0 = t0*x # t0 = x**3 193 | t0 = t0 + b # t0 = x**3 + b 194 | y = sqrt(t0) # y = sqrt(x**3 + b) 195 | if is_odd(y) != is_odd(u): 196 | y = -y 197 | return (x, y) 198 | 199 | def r3(x,y,i): 200 | """Reverse mapping function, broken down.""" 201 | if i == 1 or i == 2: 202 | t0 = 2*x # t0 = 2x 203 | t0 = t0 + 1 # t0 = z = 2x+1 204 | t1 = t0 + (-c1) # t1 = z-c1 205 | t1 = -t1 # t1 = c1-z 206 | t0 = c1 + t0 # t0 = c1+z 207 | t2 = t0 * t1 # t2 = (c1-z)*(c1+z) 208 | t2 = (1+b) * t2 # t2 = (1+b)*(c1-z)*(c1+z) 209 | if not is_square(t2): 210 | return None 211 | if i == 1: 212 | if t0 == 0: 213 
| return None 214 | if t1 == 0 and is_odd(y): 215 | return None 216 | t2 = sqrt(t2) # t2 = sqrt((1+b)*(c1-z)*(c1+z)) 217 | t0 = 1/t0 # t0 = 1/(c1+z) 218 | u = t0 * t2 # u = sqrt((1+b)*(c1-z)/(c1+z)) 219 | else: 220 | t0 = x + 1 # t0 = x+1 221 | t0 = -t0 # t0 = -x-1 222 | t3 = t0**2 # t3 = (-x-1)**2 223 | t0 = t0 * t3 # t0 = (-x-1)**3 224 | t0 = t0 + b # t0 = (-x-1)**3 + b 225 | if is_square(t0): 226 | return None 227 | t2 = sqrt(t2) # t2 = sqrt((1+b)*(c1-z)*(c1+z)) 228 | t1 = 1/t1 # t1 = 1/(c1-z) 229 | u = t1 * t2 # u = sqrt((1+b)*(c1+z)/(c1-z)) 230 | else: 231 | t0 = 6*x # t0 = 6x 232 | t0 = t0 + (4*b - 2) # t0 = -z = 6x + 4B - 2 233 | t1 = t0**2 # t1 = z**2 234 | t1 = t1 + (-16*(b+1)**2) # t1 = z**2 - 16*(b+1)**2 235 | if not is_square(t1): 236 | return None 237 | t1 = sqrt(t1) # t1 = r = sqrt(z**2 - 16*(b+1)**2) 238 | if i == 4: 239 | if t1 == 0: 240 | return None 241 | t1 = -t1 # t1 = -r 242 | t0 = -t0 # t0 = 2-4B-6x 243 | t0 = t0 + t1 # t0 = 4s = 2-4B-6x +- r 244 | if not is_square(t0): 245 | return None 246 | t1 = t0 + (4*(b+1)) # t1 = d = 4s + 4(b+1) 247 | t2 = c3 * t0 # t2 = c3*(2-4B-6x +- r) 248 | t2 = t2 + (2*(b+1)*(c1-1)) # t2 = n = c3(2-4B-6x +- r) + 2(b+1)(c1-1) 249 | t3 = t2**2 # t3 = n**2 250 | t3 = t2 * t3 # t3 = n**3 251 | t3 = t1 * t3 # t3 = d*n**3 252 | t1 = t1**2 # t1 = d**2 253 | t1 = t1**2 # t1 = d**4 254 | t1 = b * t1 # t1 = b*d**4 255 | t3 = t3 + t1 # t3 = h = d*n**3 + b*d**4 256 | if is_square(t3): 257 | return None 258 | t0 = sqrt(t0) # t0 = sqrt(4s) 259 | u = t0 / 2 # u = sqrt(s) 260 | if is_odd(y) != is_odd(u): 261 | u = -u 262 | return u 263 | 264 | # Iterate over field sizes. 265 | for p in range(7, 32768, 12): 266 | if not is_prime(p): 267 | continue 268 | # Set of curve orders encountered so far for field size p. 269 | orders = set() 270 | # Compute field F and c_i constants. 271 | F = GF(p) 272 | c1 = F(-3).sqrt() 273 | c2 = (c1 - 1) / 2 274 | c3 = (-c1 - 1) / 2 275 | # Randomly try b constants in y^2 = x^3 + b equations. 
276 | while True: 277 | b = F.random_element() 278 | if 27*b**2 == 0: 279 | # Not an elliptic curve 280 | continue 281 | # There can only be 6 distinct (up to isomorphism) curves y^2 = x^3 + b for a given field size. 282 | if len(orders) == (2 if p == 7 else 5 if p == 19 else 6): 283 | break 284 | # b+1 must be a square in the field. 285 | if jacobi_symbol(b+1, p) != 1: 286 | continue 287 | # Define elliptic curve E and compute its order. 288 | E = EllipticCurve(F,[0,b]) 289 | order = E.order() 290 | # Skip orders we've seen so far. 291 | if order in orders: 292 | continue 293 | orders.add(order) 294 | # Only operate on prime-ordered curves. 295 | if not order.is_prime(): 296 | continue 297 | # Compute forward mapping tables according to f1, f2, f3, and compare them. 298 | FM = [f1(F(uval)) for uval in range(0, p)] 299 | assert FM == [f2(F(uval)) for uval in range(0, p)] 300 | assert FM == [f3(F(uval)) for uval in range(0, p)] 301 | # Verify that all resulting points are on the curve. 302 | for x, y in FM: 303 | assert y**2 == x**3 + b 304 | cnt = 0 305 | reached = 0 306 | G = E.gen(0) 307 | # The number of preimages every multiple of G has 308 | PC = [0 for _ in range(order)] 309 | # Iterate over all points on the curve. 310 | for m in range(1,order): 311 | A = m*G 312 | x, y, _ = A 313 | # Compute the list of all preimages of the point. 314 | preimages = [] 315 | for i in range(1,5): 316 | # Compute preimages using r_i (3 different implementation). 317 | u1, u2, u3 = r1(x, y, i), r2(x, y, i), r3(x, y, i) 318 | # Compare the results of the 3 implementations. 319 | if u1 is not None: 320 | assert u1 == u2 321 | assert u1 == u3 322 | preimages.append(u1) 323 | else: 324 | assert u2 is None 325 | assert u3 is None 326 | # Verify all preimages are distinct 327 | assert len(set(preimages)) == len(preimages) 328 | # Verify all preimages round-trip correctly. 
329 | for u in preimages: 330 | assert FM[int(u)] == (x, y) 331 | cnt += len(preimages) 332 | PC[m] = len(preimages) 333 | reached += (len(preimages) > 0) 334 | # Verify that all preimages are reached. 335 | assert cnt == len(FM) 336 | # Verify that the point at infinity cannot be reached. 337 | assert PC[0] == 0 338 | # Compute number of preimages for each multiple of G (except 0), under Elligator Squared 339 | PC2 = [0 for _ in range(order-1)] 340 | for i in range(1,order): 341 | if PC[i] == 0: continue 342 | for j in range(1, order - i): PC2[i + j - 1] += PC[i] * PC[j] 343 | PC2[i - 1] += PC[i] * PC[order - i] 344 | for j in range(order - i + 1, order): PC2[i + j - order - 1] += PC[i] * PC[j] 345 | mn, mx, s = min(PC2), max(PC2), sum(PC2) 346 | d1 = sum(abs(x / s - 1 / (order - 1)) for x in PC2) 347 | d2 = sqrt(sum((x / s - 1 / (order - 1))**2 for x in PC2)) 348 | print("y^2 = x^3 + (b=%i) mod (p=%i): order %i, %.2f%% reached, range=%i..%i, Delta1 = %.4f/sqrt(p), Delta2 = %.4f/p" % (b, p, order, reached * 100 / (order-1), mn, mx, d1 * sqrt(p), d2 * p)) 349 | -------------------------------------------------------------------------------- /minimizing-golomb-filters/README.md: -------------------------------------------------------------------------------- 1 | # Minimizing the redundancy in Golomb Coded Sets 2 | 3 | A Golomb Coded Set (GCS) is a set of *N* distinct integers within the range *[0..MN-1]*, whose order does not matter, and stored by applying a Golomb-Rice coder with parameter *B* to the differences between subsequent elements after sorting. When the integers are hashes of elements from a set, this is an efficient encoding of a probabilistic data structure with false positive rate *1/M*. It is asymptotically *1 / log(2)* (around 1.44) times more compact than Bloom filters, but harder to update or query. 
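A minimal sketch of the coding just described may help make it concrete. The plain-Python helper below (names, seed, and tolerance are arbitrary illustrative choices) applies a Golomb-Rice code with parameter *B* to the sorted differences of a random set, and checks that the average cost per element comes out around 21.6 bits for *B = 20*, *M = 2^20*:

```python
import math, random

def gr_encode(d, B):                     # unary quotient, then B low bits
    return "1" * (d >> B) + "0" + format(d & ((1 << B) - 1), "0%db" % B)

B, M, N = 20, 2 ** 20, 10000
rng = random.Random(42)
elems = sorted(rng.sample(range(M * N), N))
diffs = [elems[0]] + [y - x for x, y in zip(elems, elems[1:])]
bits = sum(len(gr_encode(d, B)) for d in diffs)
# Close to the steady-state estimate B + 1/(1 - e^(-2^B/M)) ~ 21.58:
assert abs(bits / N - (B + 1 / (1 - math.exp(-2 ** B / M)))) < 0.1
```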
4 | 5 | The question we try to answer in this document is which combination of *B* and *M* minimizes the resulting size of the filter; we find that the common suggestion *B = log2(M)* is not optimal. 6 | 7 | ## Size of a Golomb Coded Set 8 | 9 | To determine the size of a Golomb coding of a set, we model the differences between subsequent elements after sorting as a geometric distribution with *p = 1 / M*. This is a good approximation if the size of the set is large enough. 10 | 11 | The Golomb-Rice encoding of a single difference *d* consists of: 12 | * A unary encoding of *l = floor(d / 2^B)*, so *l* 1 bits plus a 0 bit. 13 | * The lower *B* bits of *d*. 14 | 15 | In other words, its total length is *B + 1 + floor(d / 2^B)*. To compute the expected value of this expression, we start with *B + 1*, and add *1* for each *k* for which *d ≥ 2^B·k*. In a geometric distribution with *p = 1 / M*, *P(d ≥ 2^B·k) = (1 - 1/M)^(2^B·k)*. Thus, the total expected size becomes *B + 1 + ∑((1 - 1/M)^(2^B·k) for k=1...∞)*. This sum is a geometric series, and its limit is *B + 1 / (1 - (1 - 1/M)^(2^B))*. It can be further approximated by *B + 1 / (1 - e^(-2^B/M))*. 16 | 17 | For *M = 2^20* and *B = 20*, it is ***21.58197***, while a simulation of a GCS with *N=10000* gives us ***21.58187***. Most of the inaccuracy is due to the fact that the differences between subsequent elements in a sorted set are not exactly independent samples from a single distribution. 18 | 19 | ## Minimizing the GCS size 20 | 21 | For a given value *M* (so a given false positive rate), we want to minimize the size of the GCS. 22 | 23 | In other words, we need to see where the derivative of the expression above is 0. That derivative is *1 - log(2)·e^(2^B/M)·2^B / (M·(e^(2^B/M)-1)²)*, and it is zero when *log(2)·e^(2^B/M)·2^B = M·(e^(2^B/M)-1)²*. If we substitute *r = 2^B/M*, we find that *log(2)·e^r·r = (e^r-1)²* must hold, or *1 + log(√2)·r = cosh(r)*, leading to the solution *r = 2^B/M = 0.6679416*.
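This root, and the constants derived from it, can be confirmed numerically. The sketch below (plain Python; the bisection bracket is an arbitrary choice, and only the nonzero root is sought) solves *cosh(r) = 1 + log(√2)·r*:

```python
import math

lo, hi = 0.1, 2.0                         # bracket for the nonzero root
for _ in range(100):
    mid = (lo + hi) / 2
    if math.cosh(mid) < 1 + math.log(math.sqrt(2)) * mid:
        lo = mid
    else:
        hi = mid
r = (lo + hi) / 2
assert abs(r - 0.6679416) < 1e-6          # r = 2^B/M at the optimum
assert abs(1 / r - 1.497137) < 1e-5       # the optimal M is 1.497137 * 2^B
assert abs(math.log2(1 / r) - 0.5822061) < 1e-6
phi = (1 + math.sqrt(5)) / 2              # switchover point M = 2^B / log(phi)
assert abs(1 / math.log(phi) - 2.078087) < 1e-5
assert abs(1 / (2 * math.log(phi)) - 1.039043) < 1e-5
```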
In other words, we find that the set size is minimized when *B = log<sub>2</sub>(M) - 0.5822061*, or *M = 1.497137·2<sup>B</sup>*. These numbers are only exact for the approximation made above, but simulating the size for actual random sets confirms that they are close to optimal.

Of course, *B* can only be chosen to be an integer. To find the range of *M* values for which a given *B* value is optimal, we need to find the switchover point. At this switchover point, for a given *M*, *B* and *B+1* result in the same set size. If we solve *B + 1 / (1 - exp(-2<sup>B</sup>/M)) = B + 1 + 1 / (1 - exp(-2<sup>B+1</sup>/M))*, we find *M = 2<sup>B</sup> / log((1 + √5)/2)*. This means a given *B* value is optimal in the range *M = 1.039043·2<sup>B</sup> ... 2.078087·2<sup>B</sup>*.

Surprisingly, *2<sup>B</sup>* itself is outside that range. This means that if *M* is chosen as *2<sup>Q</sup>* with integer *Q*, the optimal value for *B* is **not** *Q* but *Q-1*.

## Compared with entropy

A next question is how close we are to optimal.

To answer this, we must find out how much entropy there is in a set of *N* uniformly random integers in range *[0..MN-1]*.

The total number of such possible sets, taking into account that the order does not matter, is simply *(MN choose N)*. This can be written as *((MN-N+1)(MN-N+2)...(MN)) / N!*. When *M* is large enough, this can be approximated by *(MN)<sup>N</sup> / N!*. Using Stirling's approximation for *N!* gives us *(eM)<sup>N</sup> / √(2πN)*.

As each of these sets is equally likely, information theory tells us an encoding for a randomly selected set must use at least *log<sub>2</sub>((eM)<sup>N</sup> / √(2πN))* bits of data, or at least *log<sub>2</sub>((eM)<sup>N</sup> / √(2πN))/N* per element. This equals *log<sub>2</sub>(eM) - log<sub>2</sub>(2πN)/(2N)*. For large *N*, this expression approaches *log<sub>2</sub>(eM)*.

For practical numbers, this approximation is pretty good. When picking *M = 2<sup>20</sup>*, it gives us ***21.44270*** bits per element, while the exact value of *log<sub>2</sub>(MN choose N)/N* for *N = 10000* is ***21.44190***.
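Both entropy figures can be reproduced numerically, using the log-gamma function to evaluate the binomial coefficient without constructing a huge integer:

```python
import math

M, N = 2**20, 10000

# Exact entropy per element: log2(MN choose N) / N, via lgamma.
log2_comb = (math.lgamma(M * N + 1) - math.lgamma(M * N - N + 1)
             - math.lgamma(N + 1)) / math.log(2)
exact = log2_comb / N

approx = math.log2(math.e * M)        # the large-N limit log2(eM)

assert abs(exact - 21.44190) < 1e-3   # value quoted above
assert abs(approx - 21.44270) < 1e-4  # value quoted above
```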
If we look at the ratio between our size formula *B + 1 / (1 - e<sup>-2<sup>B</sup>/M</sup>)* and the entropy *log<sub>2</sub>(eM)*, evaluated for *M = 1.497137·2<sup>B</sup>*, we get *(B + 2.052389)/(B + 2.024901)*. For *B = 20* that value is *1.001248*, which means the approximate GCS size is less than 0.125% larger than theoretically possible. In contrast, if *M = 2<sup>B</sup>*, the redundancy is around 0.65%.

## Conclusion

When you have the freedom to vary the false positive rate for your GCS, picking *M = 1.497137·2<sup>Q</sup>* for an integer *Q* will give you the best bang for the buck. In this case, use *B = Q* and you end up with a set only slightly larger than theoretically possible for that false positive rate.

When you don't have that freedom, use *B = floor(log<sub>2</sub>(M) - 0.055256)*.
--------------------------------------------------------------------------------
/private-authentication-protocols/README.md:
--------------------------------------------------------------------------------
# Private authentication protocols

* [Introduction](#introduction)
* [Definition](#definition)
* [Discussion](#discussion)
  + [Selective interception attack](#selective-interception-attack)
  + [Reducing surveillance](#reducing-surveillance)
* [Example protocols](#example-protocols)
  + [Assumptions and notation](#assumptions-and-notation)
  + [Partial solutions](#partial-solutions)
    - [Example 1: no responder privacy](#example-1-no-responder-privacy)
    - [Example 2: only responder privacy when failing, no challenger privacy](#example-2-only-responder-privacy-when-failing--no-challenger-privacy)
    - [Example 3: single key, no challenger privacy](#example-3-single-key-no-challenger-privacy)
  + [Private authentication protocols](#private-authentication-protocols-1)
    - [Example 4: simple multi-key PAP](#example-4-simple-multi-key-pap)
    - [Example 5: Countersign](#example-5-countersign)
    - [Example 6: constructing a PAP from private set
intersection](#example-6-constructing-a-pap-from-private-set-intersection) 18 | + [Comparison of applicability](#comparison-of-applicability) 19 | * [Acknowledgments](#acknowledgments) 20 | 21 | ## Introduction 22 | 23 | Authentication protocols are used to verify that network connections are 24 | not being monitored through a man-in-the-middle attack (MitM). But the 25 | commonly used constructions for authentication—often some framework 26 | surrounding a digital signature or key exchange protocol—reveal considerable amounts of 27 | identifying information to the participants (and MitMs). This information can 28 | potentially be used to track otherwise anonymous users around the network 29 | and correlate users across multiple services, if keys are reused. 30 | 31 | Ultimately, key-based authentication protocols are trying to answer the 32 | question, "Does the remote party know the corresponding private key for 33 | an identity key we accept?" A protocol which answers this question and 34 | *nothing* else would naturally provide for 35 | authentication that is undetectable by MitMs: just make its usage mandatory and use random keys 36 | when no identity is expected. Such a protocol would also provide no 37 | avenue for leaking any identifying information beyond the absolute 38 | minimum needed to achieve authentication. 39 | 40 | This is a work in progress to explore the possibilities and properties 41 | such protocols have. It lacks formal definitions and security 42 | proofs for the example protocols we have in mind, but these are being 43 | worked on. 44 | 45 | ## Definition 46 | 47 | We define a **private authentication protocol** (PAP) as a protocol that allows a 48 | challenger to establish whether their peer 49 | possesses a private key corresponding to one of (potentially) several acceptable public keys. 50 | It is intended to be used over an encrypted but unauthenticated connection. 51 | 52 | There are two parties: the challenger and the responder. 
The challenger has a set of acceptable 53 | public keys ***Q***, with a publicly-known upper bound *n*. ***Q*** can be empty if no authentication is desired. 54 | The responder has a set of private keys ***p*** (possibly with a known upper bound *m*), with corresponding 55 | set of public keys ***P***. These sets may be empty if the responder has no key material. The challenger at 56 | the end outputs a Boolean: success or failure. 57 | 58 | A PAP must have the following properties: 59 | * **Correctness** When the intersection between ***P*** and ***Q*** is non-empty, the challenger returns true. This is obviously 60 | necessary for the scheme to work at all. 61 | * **Security against impersonation** When the intersection between ***P*** and ***Q*** is empty, the challenger 62 | returns false. This means it must be impossible for someone without access to an acceptable private key 63 | to spoof authentication, even if they know the acceptable public key(s). 64 | * **Challenger privacy** Responders learn nothing about the keys in ***Q*** except possibly the intersection 65 | with its ***P***. Specifically, responders which have a database of public keys (without corresponding private keys) cannot know 66 | which of these, or how many of them, are in ***Q***. Responders can also not determine whether multiple 67 | PAP sessions have overlapping ***Q*** sets (excluding public keys the responder has private keys for). 68 | This prevents unsuccessful responders (including MitMs) from knowing whether authentication is desired at all or for whom, 69 | or following challengers around. 70 | * **Responder privacy** Challengers learn nothing about ***P*** apart from whether its intersection with ***Q*** is non-empty. 71 | Specifically, a challenger with a set of public keys (without corresponding private keys) trying to learn ***P*** cannot do 72 | better than what the correctness property permits when choosing ***Q*** equal to that set. 
Furthermore, challengers cannot determine whether there is overlap between the ***P*** sets in multiple failing PAP sessions, or even overlap between the non-winning keys in ***P*** in successful PAP sessions. This property prevents a MitM from identifying failing responders or following them around. In addition, it rate-limits the ability of challengers to learn information about the responder across multiple protocol runs to one guess per protocol run. Note that if *n>1*, this property implies an inability for challengers to know which acceptable key(s) a successful responder used.

Two additional properties may be provided:
* **Forward challenger privacy**: challenger privacy is maintained even against responders who have access to a database with private keys. This prevents responders (including MitMs) that record network traffic from correlating protocol runs with the keys used, even after the private keys are leaked. As this definition includes honest responders, forward challenger privacy is equivalent to stating that responders do not learn anything at all.
* **Forward responder privacy**: responder privacy is maintained even against challengers who have a database of private keys.

Note that while a PAP does not require the responder to output whether they are successful, doing so is also not in conflict with any of the required properties above. When a PAP has forward challenger privacy, however, it is actually impossible for a responder to know whether they are successful.

A PAP as defined here is unidirectional. If the responder also wants to authenticate the challenger, it can be run a second time with the roles reversed. If done sequentially, the outcome of the protocol in one direction can be used as input for the next one, i.e. if the first run fails the first challenger can act as second responder with empty ***p*** (no private keys).
## Discussion

These properties are primarily useful in the context of *optional authentication*. Imagine a setting where ephemeral encryption is automatic and mandatory but authentication is opportunistic: if an end-point expects a known identity it will authenticate; otherwise it won't, and only gets resistance against passive observation.

### Selective interception attack

In this configuration, when the attempt at authentication is observable to an active attacker, a **selective interception** attack is possible that evades detection:
* When no authentication is requested on a connection, the MitM maintains the connection and intercepts it.
* When authentication is requested, the MitM innocuously terminates the connection, and blacklists the network address involved so it will discontinue intercepting retried connections.

Challenger privacy allows mitigating this vulnerability: due to it, MitMs cannot distinguish PAP runs which do and don't desire authentication. Thus if all connections (even those that don't seek authentication) use a PAP, the MitM is forced to either drop all connections (becoming powerless while causing collateral damage) or risk being detected on every connection (as every PAP-employing connection could be an attempt to authenticate).

### Reducing surveillance

As unauthenticated connections are an explicit use case, private authentication protocols assure the responder's privacy in the unauthenticated case. Responder privacy implies that the challenger cannot learn whether two separate protocol runs (in separate connections) were with peers that possess the same private key, effectively preventing the challenger from surveilling its unauthenticated peers and following them around.
Responder privacy also implies that the challenger does not learn which of its acceptable public keys the responder's private key corresponded to, in case there are multiple. To see why this may be useful, note that the anti-surveillance property from the previous paragraph breaks down whenever the challenger can run many instances of the protocol with separate acceptable keys, for a large set of (e.g. leaked) keys that may include the responder's public key. In order to combat this, the responder can limit the number of independent protocol runs it is willing to participate in. If the challenger could learn which acceptable public key the responder's private key corresponded to, this would need to be a limit on the total number of keys in all protocol runs combined, rather than the total number of protocol runs. If the challenger has hundreds of acceptable public keys, and the responder is one of them, the responder must support participating in a protocol with hundreds of acceptable keys—but doesn't have to accept participating in more than one protocol run.

## Example protocols

### Assumptions and notation

We assume an encrypted but unauthenticated connection already exists between the two participants. We also assume a unique session id *s* exists, only known to the participants. Both could for example be set up via a Diffie-Hellman negotiation.

*G* and *M* are two generators of an additively-denoted cyclic group in which the discrete logarithm problem is hard, and *M*'s discrete logarithm w.r.t. *G* is not known to anyone. The *⋅* symbol denotes scalar multiplication (repeated application of the group operation). Lowercase variables refer to integers modulo the group order, and uppercase variables refer to group elements.
*h* refers to a hash function onto the integers modulo the group order, modeled as a random oracle. Sets are denoted in **bold**, and considered serialized by concatenating their elements in sorted order.

The set of acceptable public keys ***Q*** consists of group elements *Q<sub>0</sub>*, *Q<sub>1</sub>*, ..., *Q<sub>n-1</sub>*. The set of the responder's private keys is ***p***, consisting of integers *p<sub>0</sub>*, *p<sub>1</sub>*, ..., *p<sub>m-1</sub>*. The corresponding set of public keys is ***P***, consisting of *P<sub>0</sub> = p<sub>0</sub>⋅G*, *P<sub>1</sub> = p<sub>1</sub>⋅G*, ..., *P<sub>m-1</sub> = p<sub>m-1</sub>⋅G*.

In case fewer than *n* acceptable public keys exist, the *Q<sub>i</sub>* values are padded with randomly generated ones. In case no authentication is desired, all of them are randomly generated. Similarly, if a protocol has an upper bound on the number of private keys *m*, and fewer keys than that are present, it is padded with randomly generated ones.

Terms used in the security properties:
* Unconditionally: the stated property holds against adversaries with unbounded computation.
* ROM (random oracle model): *h* is indistinguishable from a random function.
* The DL (discrete logarithm) assumption: given group elements *(P, a⋅P)*, it is hard to compute *a*.
* The CDH (computational Diffie-Hellman) assumption: given group elements *(P, a⋅P, b⋅P)*, it is hard to compute *ab⋅P*.
* The DDH (decisional Diffie-Hellman) assumption: group element tuples *(P, a⋅P, b⋅P, ab⋅P)* are hard to distinguish from random tuples.

### Partial solutions

Here we give a few examples of near solutions which don't provide all desired properties simultaneously. This demonstrates how easy some of the properties are to achieve in isolation, while combining them is nontrivial.

#### Example 1: no responder privacy

If we do not care about responder privacy, it is very simple.
The responder just reports their public key and a signature with it. For a single private key version (*m=1*) that is:

* The responder:
  * Computes a digital signature *d* on *s* using key *p<sub>0</sub>*.
  * Sends *(P<sub>0</sub>, d)*.
* The challenger:
  * Returns whether *P<sub>0</sub>* is in ***Q***, and whether *d* is a valid signature on *s* using *P<sub>0</sub>*.

This clearly provides challenger privacy, as the challenger does not send anything at all. It however does not provide responder privacy, as the responder unconditionally reveals their public key.

#### Example 2: only responder privacy when failing, no challenger privacy

It seems worthwhile to try to only have the responder reveal their key in case of success. In order to do that, it must know whether its key is acceptable. In this first attempt, we compromise by giving up challenger privacy.

* The challenger:
  * Sends ***Q***.
* The responder:
  * If any *P<sub>i</sub>* is in ***Q***:
    * Computes a digital signature *d* on *s* using key *p<sub>i</sub>*.
    * Sends *(P<sub>i</sub>, d)*.
  * Otherwise, if there is no overlap between ***P*** and ***Q***:
    * Sends a zero message with the same size as a public key and a signature.
* The challenger:
  * Returns whether *P<sub>i</sub>* is in ***Q***, and whether *d* is a valid signature on *s* using *P<sub>i</sub>*.

Now there is obviously no challenger privacy as ***Q*** is revealed directly. Yet, we have not recovered responder privacy either, as the matching public key is revealed in case of success.

#### Example 3: single key, no challenger privacy

In case there is at most a single acceptable public key *Q<sub>0</sub>* (*n=1*), responder privacy can be recovered:

* The challenger:
  * Sends *Q<sub>0</sub>*.
* The responder:
  * If *Q<sub>0</sub>* equals any *P<sub>i</sub>*:
    * Computes a digital signature *d* on *s* using key *p<sub>i</sub>*.
    * Sends *d*.
  * Otherwise, if *Q<sub>0</sub>* is not in ***P***:
    * Sends a zero message with the same size as a digital signature.
* The challenger:
  * Returns whether *d* is a valid signature on *s* using *Q<sub>0</sub>*.

Yet, challenger privacy is still obviously lacking. It is possible to improve upon this somewhat by e.g. sending *h(Q<sub>0</sub> || s)* instead of *Q<sub>0</sub>* directly. While that indeed means the key is not revealed directly anymore, an attacker who has a list of candidate keys can still test for matches, and challenger privacy requires that no information about the key can be inferred at all.

### Private authentication protocols

We conjecture that the following protocols do provide all the properties needed for a PAP under reasonable assumptions.

#### Example 4: simple multi-key PAP

To achieve responder privacy in the multi-key case, while simultaneously retaining challenger privacy, we need a different approach. The idea is to perform a Diffie-Hellman key exchange between an ephemeral key (*d* below) and the acceptable public keys, and use the results to blind a secret value *y*, whose hash is revealed to the responder. If the responder can recover *y*, they must have one of the corresponding private keys.

The result is a multi-acceptable-key (*n≥1*), unbounded-private-keys (*m* need not be publicly known), single-roundtrip PAP with *O(n)* communication cost. Scalar and hash operations scale with *O(mn)*, but group operations only with *O(n)*.

* The challenger:
  * Generates random integers *d* and *y*.
  * Computes *D = d⋅G*.
  * Computes *w = h(y)*.
  * Constructs the set ***c***, consisting of the values of *y - h(d⋅Q<sub>i</sub> || s)* for each *Q<sub>i</sub>* in ***Q***.
  * Sends *(D, w, **c**)*.
  * Remembers *w* (or *y*).
* The responder:
  * Constructs the set ***f***, consisting of the values of *h(p<sub>j</sub>⋅D || s)* for each *p<sub>j</sub>* in ***p***.
  * If for any *c<sub>i</sub>* in ***c*** and any *f<sub>j</sub>* in ***f*** it holds that *h(c<sub>i</sub> + f<sub>j</sub>) = w*:
    * Sets *z = c<sub>i</sub> + f<sub>j</sub>*.
  * Otherwise, if this does not hold for any *i,j*:
    * Sets *z = 0*.
  * Sends *z*.
* The challenger:
  * Returns whether or not *h(z) = w* (or equivalently, *z = y*).

Conjectured properties:
* Correctness: unconditionally
* Security against impersonation: under ROM+CDH
* Challenger privacy: under ROM+CDH, or under DDH
* (Forward) responder privacy: unconditionally

It has no forward challenger privacy, as responders learn which of their private key(s) was acceptable.

#### Example 5: Countersign

If we want forward challenger privacy, we must go even further. We again perform a Diffie-Hellman exchange between an ephemeral key and the acceptable public key (now restricted to just a single one), but then use a (unidirectional) variant of the [socialist millionaire](https://en.wikipedia.org/wiki/Socialist_millionaire_problem) protocol to verify both sides reached the same shared secret. This is similar to the technique used in the [Off-the-Record](https://en.wikipedia.org/wiki/Off-the-Record_Messaging) protocol for authentication, as well as in [SPAKE2](https://tools.ietf.org/id/draft-irtf-cfrg-spake2-10.html) for verifying passwords.

The result is a protocol we call Countersign: a single-acceptable-key (*n=1*), multi-private-key (*m≥1*), single-roundtrip PAP with both forward challenger privacy and forward responder privacy. It has *O(m)* communication and computational cost.

* The challenger:
  * Generates random integers *d* and *y*.
  * Computes *D = d⋅G*.
  * Computes *C = y⋅G - h(d⋅Q<sub>0</sub> || s)⋅M*.
  * Sends *(D, C)*.
  * Remembers *y*.
* The responder:
  * Generates random integer *k*.
  * Computes *R = k⋅G*.
  * Constructs the set ***w***, consisting of the values *h(k⋅(C + h(p<sub>j</sub>⋅D || s)⋅M))* for each *p<sub>j</sub>* in ***p***.
  * Sends *(R, **w**)*.
* The challenger:
  * Returns whether or not *w<sub>j</sub> = h(y⋅R)* for any *w<sub>j</sub>* in ***w***.

Conjectured properties:
* Correctness: unconditionally
* Security against impersonation: under CDH
* (Forward) challenger privacy: unconditionally
* (Forward) responder privacy: under ROM+CDH, or under DDH

Because of forward challenger privacy, this protocol does not let the responder learn whether they are successful themselves.

Note that one cannot simply run this protocol multiple times with different acceptable keys to obtain an *n>1* PAP, because the challenger would learn which acceptable key succeeded, violating responder privacy.

Also interesting is that there appears to be an inherent trade-off between unconditional challenger privacy and unconditional responder privacy, and both cannot exist simultaneously. Responder privacy seems to imply that the messages sent by the challenger must form a secure commitment to the set ***Q***. If this wasn't the case and challengers could "open" it to other keys, then nothing would prevent them from learning about intersections between ***P*** and these other keys as well. Thus responder privacy seems to imply that the challenger must bind to their keys, while challenger privacy requires hiding the keys. Binding to and hiding the same data cannot both be achieved unconditionally.
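To make the message flow concrete, here is a toy Python sketch of Countersign. The group is written multiplicatively (the text's *d⋅G* becomes `pow(G, d, p)`), instantiated purely for illustration in the multiplicative group modulo the Mersenne prime *2<sup>127</sup>-1* with arbitrarily picked generators; none of this has the care real parameters would need:

```python
import hashlib
import secrets

# Toy parameters: fine for illustrating the mechanics, useless for security.
p = 2**127 - 1      # a Mersenne prime
G, M = 3, 5         # stand-ins for the two generators with unknown relation

def hs(*parts):
    """Hash onto integers (the text's h), modeled as a random oracle."""
    data = b'|'.join(str(x).encode() for x in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), 'big')

def H(x):
    """A second hash, used for the revealed images."""
    return hashlib.sha256(b'img|' + str(x).encode()).digest()

def countersign(Q0, priv, s):
    """Run one Countersign session; return the challenger's verdict."""
    # Challenger -> responder: (D, C), keeping y secret.
    d, y = secrets.randbits(128), secrets.randbits(128)
    D = pow(G, d, p)
    C = pow(G, y, p) * pow(M, -hs(pow(Q0, d, p), s), p) % p
    # Responder -> challenger: (R, w), one entry per private key.
    k = secrets.randbits(128)
    R = pow(G, k, p)
    w = {H(pow(C * pow(M, hs(pow(D, pj, p), s), p) % p, k, p)) for pj in priv}
    # Challenger checks whether any entry matches h(y*R).
    return H(pow(R, y, p)) in w

s = secrets.randbits(128)                         # session id
priv = [secrets.randbits(128) for _ in range(3)]  # responder's keys (m = 3)
Q0 = pow(G, priv[1], p)                           # the acceptable public key
assert countersign(Q0, priv, s)                   # responder holds the key
assert not countersign(Q0, [secrets.randbits(128)], s)  # responder doesn't
```

When the key matches, *C + h(p<sub>j</sub>⋅D || s)⋅M* collapses to *y⋅G*, so both sides compute a hash of *yk⋅G*; otherwise an *M*-multiple remains and the hashes disagree.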
#### Example 6: constructing a PAP from private set intersection

There is a striking similarity between PAPs and [private set intersection](https://en.wikipedia.org/wiki/Private_set_intersection) protocols (PSIs); both are related to finding properties of the intersection between two sets of elements in a private way. The differences are:
* In PAPs, the elements being compared are asymmetric (private and public keys), while PSI is about finding the intersection of identical elements on both sides.
* In PAPs, the challenger only learns whether the intersection is non-empty, whereas in PSIs the intersection itself is revealed.
* In PAPs, it is unnecessary for the responder to learn anything, and with forward challenger privacy it is even forbidden.

With certain restrictions, it is possible to exploit this similarity and build PAPs out of (variations of) PSIs. We first need to convert the private keys ***p*** and public keys ***Q*** to sets of symmetric elements that can be compared. To do so, we repeat the Diffie-Hellman trick from the previous protocols:
* The challenger:
  * Generates a random integer *d*.
  * Computes *D = d⋅G*.
  * Constructs the set ***e***, consisting of values *h(d⋅Q<sub>i</sub> || s)* for each *Q<sub>i</sub>* in ***Q***.
  * Sends *D*.
* The responder:
  * Constructs the set ***f***, consisting of values *h(p<sub>i</sub>⋅D || s)* for each *p<sub>i</sub>* in ***p***.

The PAP problem is now reduced to the challenger learning whether the intersection between ***e*** and ***f*** is non-empty. Depending on the conditions there are various ways to accomplish this with PSIs. In all cases, the PAP properties for correctness, security against impersonation, challenger privacy, and (forward) responder privacy then follow from CDH plus the PSI's security and privacy properties.
* If *n=1*, a one-to-many PSI can be used directly.
As in this case ***e*** is a singleton, learning its intersection with ***f*** 363 | is equivalent to learning whether the intersection is non-empty. 364 | * Also for *n=1*, such a one-to-many PSI may be constructed from a single-round one-to-one PSI. The challenger sends their 365 | PSI message first, and the responder sends the union of PSI responses (one for each private key in ***p***). This is not the 366 | same as running multiple independent one-to-one PSIs, as the shared challenger message prevents the challenger from changing 367 | the choice of ***Q*** between runs, which would permit the challenger to break responder privacy. This approach is effectively 368 | what Countersign is using, with the socialist millionaire problem taking the role of a one-to-one PSI. 369 | * If *n>1*, a PSI algorithm cannot be used in a black-box fashion, and it needs to be modified to not reveal which element 370 | of ***Q*** matched. If *m>1* in addition to *n>1*, the size of the intersection would need to be hidden as well. 371 | * To achieve forward challenger privacy, a one-sided PSI that does not reveal anything to the responder needs to be used. 372 | 373 | ### Comparison of applicability 374 | 375 | Countersign is the most private protocol in the single-acceptable-key setting. It appears most useful 376 | in situations where a connection initiator knows who they are trying to connect to (implicitly 377 | limiting to *n=1*). 378 | 379 | The multi-key protocol is significantly more flexible, but lacks forward challenger privacy, and 380 | thus more strongly relies on keeping private key material private, even after decommissioning. 381 | 382 | A potentially useful composition of the two is using Countersign for the connection initiator 383 | trying to authenticate the connection acceptor, but then using the multi-key protocol for the 384 | other direction. In case the first protocol fails, the second direction can run without 385 | private keys. 
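As an illustration of the multi-key protocol's message flow, here is a toy Python sketch of the simple multi-key PAP (Example 4), again over the multiplicative group modulo a Mersenne prime; all parameters are illustrative only and far too weak for real use:

```python
import hashlib
import secrets

# Toy parameters: fine for illustrating the mechanics, useless for security.
p = 2**127 - 1     # a Mersenne prime
G = 3              # stand-in generator
BLIND = 2**256     # c and z values are transmitted reduced modulo this

def hs(*parts):
    """Hash onto integers (the text's h), modeled as a random oracle."""
    data = b'|'.join(str(x).encode() for x in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), 'big')

def H(x):
    """A second hash, used for the revealed image w = h(y)."""
    return hashlib.sha256(b'img|' + str(x).encode()).digest()

def challenge(Q, s):
    """Challenger's message (D, w, c); y is remembered for the final check."""
    d, y = secrets.randbits(128), secrets.randbits(128)
    D = pow(G, d, p)
    c = [(y - hs(pow(Qi, d, p), s)) % BLIND for Qi in Q]
    return (D, H(y), c), y

def respond(priv, D, w, c, s):
    """Responder recovers y if it holds an acceptable key, else sends 0."""
    f = [hs(pow(D, pj, p), s) for pj in priv]
    for ci in c:
        for fj in f:
            if H((ci + fj) % BLIND) == w:
                return (ci + fj) % BLIND
    return 0

s = secrets.randbits(128)                         # session id
priv = [secrets.randbits(128) for _ in range(2)]  # responder's keys
# n = 3 acceptable keys, one of which the responder holds.
Q = [pow(G, secrets.randbits(128), p), pow(G, priv[0], p),
     pow(G, secrets.randbits(128), p)]
(D, w, c), y = challenge(Q, s)
assert respond(priv, D, w, c, s) == y             # challenger accepts

(D, w, c), y = challenge([pow(G, secrets.randbits(128), p)], s)
assert respond(priv, D, w, c, s) != y             # no acceptable key
```

Note how the responder only ever sends a single scalar *z*, matching the rate-limiting argument above: each run lets the challenger test one set ***Q***, regardless of *m*.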

## Acknowledgments

Thanks to Greg Maxwell for the idea behind Countersign, and the rationale for private authentication protocols and optional authentication.
Thanks to Tim Ruffing for the simple multi-key PAP, discussions, and feedback.
Thanks to Mark Erhardt for proofreading.
--------------------------------------------------------------------------------
/uniform-range-extraction/README.md:
--------------------------------------------------------------------------------
# Extracting multiple uniform numbers from a hash

This document introduces a technique for extracting multiple numbers in any range from a single hash function result, while optimizing for various uniformity properties.

* [Introduction](#introduction)
  + [The fast range reduction](#the-fast-range-reduction)
  + [Maximally uniform distributions](#maximally-uniform-distributions)
* [Generalizing to multiple outputs](#generalizing-to-multiple-outputs)
  + [Splitting the hash in two](#splitting-the-hash-in-two)
  + [Transforming the hash](#transforming-the-hash)
  + [Extracting and updating the state](#extracting-and-updating-the-state)
  + [Fixing individual uniformity](#fixing-individual-uniformity)
  + [Avoiding the need to decompose *n*](#avoiding-the-need-to-decompose--n-)
  + [C version](#c-version)
* [Use as a random number generator?](#use-as-a-random-number-generator-)
* [Conclusion](#conclusion)
* [Acknowledgments](#acknowledgments)

## Introduction

### The fast range reduction

In [this post](https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/) by Daniel Lemire, a fast method is described to map a *B*-bit hash *x* to a number in range *[0,n)*. Such an operation is needed for example to convert the output of a hash function to a hash table index.
The technique is primarily aimed at low-level languages where the cost of this operation may already matter, but for presentation purposes I'm going to use Python here.

```python
B = 32
MASK = 2**B - 1

def fastrange(x, n):
    """Map x in [0,2**B) to output in [0,n)."""
    assert 0 <= x <= MASK
    assert 0 < n
    return (x * n) >> B
```

This function has an interesting property: if *x* is uniformly distributed in *[0,2<sup>B</sup>)*, then *fastrange(x,n)* (for any non-zero *n*) will be as close to uniform as *(x mod n)* is. The latter is often used in hash table implementations, but is relatively slow on modern CPUs. As *fastrange(x,n)* is just as uniform, it's a great drop-in replacement for the modulo reduction. Note that it doesn't behave **identically** to that; it just has a similar distribution, and that's all we need.

### Maximally uniform distributions

We can state this property a bit more formally. When starting from an input that is uniformly distributed over *2<sup>B</sup>* possible values, and obtaining our output by applying a deterministic function mapping onto *n* outputs, the probability of every output must be a multiple of *2<sup>-B</sup>*. With that constraint, the distribution closest to uniform is one that has *2<sup>B</sup> mod n* values with probability *⌈2<sup>B</sup>/n⌉/2<sup>B</sup>* each, and *n - (2<sup>B</sup> mod n)* values with probability *⌊2<sup>B</sup>/n⌋/2<sup>B</sup>* each. We will call such distributions **maximally uniform**, with the parameters *B* and *n* implicit from context. If *n* is a power of two not larger than *2<sup>B</sup>*, only the uniform distribution itself is maximally uniform.

To reach such a maximally uniform distribution, it suffices that the function applied to the hash has the property that every output can be reached from either exactly *⌊2<sup>B</sup>/n⌋* inputs, or exactly *⌈2<sup>B</sup>/n⌉*. This is the case for both *x mod n* and *fastrange(x,n)*.
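This preimage-count property can be checked exhaustively for a small *B* (using a standalone re-definition of *fastrange* for self-containedness):

```python
B = 8
TOTAL = 2**B

def fastrange(x, n):
    return (x * n) >> B

for n in range(1, TOTAL + 1):
    counts = [0] * n
    for x in range(TOTAL):
        counts[fastrange(x, n)] += 1
    # Every output has either floor(2^B/n) or ceil(2^B/n) preimages.
    assert set(counts) <= {TOTAL // n, -(-TOTAL // n)}
```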
66 | 
67 | ## Generalizing to multiple outputs
68 | 
69 | But what if we want multiple independent outputs, say both a number in range *[0,n<sub>1</sub>)*
70 | and a number in range *[0,n<sub>2</sub>)*? This problem appears in certain hash table
71 | variations (such as [Cuckoo Filters](https://en.wikipedia.org/wiki/Cuckoo_filter) and
72 | [Ribbon Filters](https://arxiv.org/abs/2103.02515)), where both a table position and another
73 | hash of each data element are needed.
74 | 
75 | It's of course possible to compute multiple hashes,
76 | for example using prefixes like *x<sub>i</sub> = H(i || input)*, and using *fastrange* on each.
77 | Increasing the number of hash function invocations comes with computational cost, however, and
78 | furthermore **feels** unnecessary, especially when *n<sub>1</sub>n<sub>2</sub> ≤ 2<sup>B</sup>*.
79 | Can we avoid it?
80 | 
81 | ### Splitting the hash in two
82 | 
83 | Another possibility is simply splitting the hash into two smaller hashes, and applying *fastrange*
84 | to each. Here is the resulting distribution you'd get, starting from a (for demonstration
85 | purposes very small) *8*-bit hash, extracting numbers in ranges *n<sub>1</sub> = 6* and
86 | *n<sub>2</sub> = 10* from the bottom and top *4* bits using *fastrange*:
87 | 
88 | ```python
89 | x = hash(...)
90 | B = 4
91 | out1 = fastrange(x & 15, 6)
92 | out2 = fastrange(x >> 4, 10)
93 | ```
94 | 
95 | Probability (as a multiple of *1/2<sup>8</sup>*) of each value:
96 | 
97 | | Variable | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
98 | |---|---|---|---|---|---|---|---|---|---|---|
99 | | out<sub>1</sub> | 48 | 48 | 32 | 48 | 48 | 32 | - | - | - | - |
100 | | out<sub>2</sub> | 32 | 32 | 16 | 32 | 16 | 32 | 32 | 16 | 32 | 16 |
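The counts in this table can be reproduced by simply enumerating all *256* inputs and tallying. A small sketch (with the two *4*-bit extractions written out; `c1`/`c2` are just throwaway counter names):

```python
from collections import Counter

def fastrange4bit(x, n):
    # Fast range reduction on a 4-bit input (B = 4).
    return (x * n) >> 4

c1, c2 = Counter(), Counter()
for x in range(2**8):
    c1[fastrange4bit(x & 15, 6)] += 1   # bottom 4 bits -> range 6
    c2[fastrange4bit(x >> 4, 10)] += 1  # top 4 bits -> range 10

# Counts (out of 256) match the rows of the table above.
assert [c1[v] for v in range(6)] == [48, 48, 32, 48, 48, 32]
assert [c2[v] for v in range(10)] == [32, 32, 16, 32, 16, 32, 32, 16, 32, 16]
```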
139 | 
140 | It is no surprise that all probabilities are a multiple of *16 (/ 2<sup>8</sup>)*, as each is based
141 | on just a *4*-bit hash fragment. It does however show that with respect to the original
142 | *8*-bit hash, these results are very far from maximally uniform: for that, the values
143 | in each row could differ by at most one.
144 | 
145 | ### Transforming the hash
146 | 
147 | Alternatively, it is possible to apply a transformation to the output of the hash function, and then feed
148 | both the transformed and untransformed hash to *fastrange*. The Knuth multiplicative hash (multiplying
149 | by a large randomish odd constant modulo *2<sup>B</sup>*) is a popular choice. Redoing our example, we get:
150 | 
151 | ```python
152 | x = hash(...)
153 | out1 = fastrange(x, 6)
154 | out2 = fastrange((x * 173) & 0xFF, 10)
155 | ```
156 | 
157 | Probability (as a multiple of *1/2<sup>8</sup>*) of each value:
158 | 
159 | | Variable | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
160 | |---|---|---|---|---|---|---|---|---|---|---|
161 | | out<sub>1</sub> | 43 | 43 | 42 | 43 | 43 | 42 | - | - | - | - |
162 | | out<sub>2</sub> | 26 | 26 | 25 | 26 | 25 | 26 | 26 | 25 | 26 | 25 |
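These counts too can be reproduced by exhaustive tallying. A small sketch, confirming that both marginals are maximally uniform (the counts within each row differ by at most one):

```python
from collections import Counter

B = 8

def fastrange(x, n):
    return (x * n) >> B

c1, c2 = Counter(), Counter()
for x in range(2**B):
    c1[fastrange(x, 6)] += 1
    c2[fastrange((x * 173) & 0xFF, 10)] += 1

assert max(c1.values()) - min(c1.values()) <= 1  # 43s and 42s
assert max(c2.values()) - min(c2.values()) <= 1  # 26s and 25s
```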
201 | 
202 | This gives decent results, as now both *out<sub>1</sub>* and *out<sub>2</sub>* are maximally uniform. This in
203 | fact holds with this approach regardless of what values of *n<sub>1</sub>* and *n<sub>2</sub>*
204 | are used. If we look at the **joint** distribution, however, the result is suboptimal:
205 | 
206 | | out<sub>1</sub> \ out<sub>2</sub> | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Total |
207 | |---|---|---|---|---|---|---|---|---|---|---|---|
208 | | 0 | 6 | 4 | 3 | 6 | 4 | 4 | 4 | 5 | 4 | 3 | 43 |
209 | | 1 | 6 | 4 | 3 | 4 | 6 | 3 | 4 | 6 | 4 | 3 | 43 |
210 | | 2 | 4 | 6 | 3 | 4 | 5 | 3 | 4 | 5 | 4 | 4 | 42 |
211 | | 3 | 4 | 4 | 5 | 4 | 3 | 6 | 4 | 3 | 6 | 4 | 43 |
212 | | 4 | 3 | 4 | 6 | 4 | 3 | 6 | 4 | 3 | 4 | 6 | 43 |
213 | | 5 | 3 | 4 | 5 | 4 | 4 | 4 | 6 | 3 | 4 | 5 | 42 |
214 | | Total | 26 | 26 | 25 | 26 | 25 | 26 | 26 | 25 | 26 | 25 | 256 |
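The joint table is obtained the same way, by tallying pairs. A sketch (again exhaustive over all *256* inputs):

```python
from collections import Counter

B = 8

def fastrange(x, n):
    return (x * n) >> B

joint = Counter()
for x in range(2**B):
    out1 = fastrange(x, 6)
    out2 = fastrange((x * 173) & 0xFF, 10)
    joint[(out1, out2)] += 1

# A maximally uniform joint distribution over 60 outcomes would only have
# counts floor(256/60) = 4 and ceil(256/60) = 5; here they range from 3 to 6.
assert min(joint.values()) == 3
assert max(joint.values()) == 6
```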
248 | 
249 | While *out<sub>1</sub>* and *out<sub>2</sub>* are now individually maximally uniform, the distribution of the combination
250 | of their values is **not**. This matters, because for example in a Cuckoo Filter, one doesn't just care about
251 | the uniformity of the data hashes, but the uniformity of the data hashes **in each individual cell** of the table.
252 | If we use *out<sub>1</sub>* as table index, and *out<sub>2</sub>* as the hash to place in that table cell, this
253 | joint distribution shows that the per-cell hash distribution isn't maximally uniform.
254 | 
255 | ### Extracting and updating the state
256 | 
257 | It turns out that it is easy to make the joint distribution maximally uniform. It is possible to create a variant
258 | of *fastrange* that doesn't just return a number in the desired range, but also returns an updated
259 | "hash" (which I'll call **state** in what follows) which is ready to be used for further extractions. The idea is that this update should
260 | move the unused portion of the entropy in the hash to the top bits, so that the next extraction
261 | will primarily use those. And we effectively already have that: the bottom bits of the product `tmp = x * n`, which
262 | aren't returned as output, are exactly that.
263 | 
264 | ```python
265 | def fastrange2(x, n):
266 |     """Given x in [0,2**B), return out in [0,n) and new x."""
267 |     assert 0 <= x <= MASK
268 |     assert 0 < n
269 |     tmp = x * n
270 |     new_x = tmp & MASK
271 |     out = tmp >> B
272 |     return out, new_x
273 | 
274 | # Usage
275 | x1 = hash(...)
276 | out1, x2 = fastrange2(x1, n1)
277 | out2, _ = fastrange2(x2, n2)
278 | ```
279 | 
280 | I don't have a proof, but it can be verified exhaustively for small values of *B*, *n<sub>1</sub>*, and *n<sub>2</sub>*
281 | that the resulting joint distribution of *(out<sub>1</sub>,out<sub>2</sub>)* is maximally uniform.
282 | In fact, this property remains true regardless of how many values are extracted.
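That exhaustive verification is straightforward to write down. A minimal sketch for *B=8*, checking the joint distribution for all range pairs up to *16*:

```python
from collections import Counter

B = 8
MASK = 2**B - 1

def fastrange2(x, n):
    # Fast range reduction that also returns the updated state.
    tmp = x * n
    return tmp >> B, tmp & MASK

for n1 in range(1, 17):
    for n2 in range(1, 17):
        joint = Counter()
        for x in range(2**B):
            out1, x2 = fastrange2(x, n1)
            out2, _ = fastrange2(x2, n2)
            joint[(out1, out2)] += 1
        # The joint distribution over n1*n2 outcomes is maximally uniform.
        assert max(joint.values()) - min(joint.values()) <= 1
```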
283 | 
284 | Repeating the earlier experiment to extract a range *n<sub>1</sub> = 6* value and a range *n<sub>2</sub> = 10*
285 | value from a *B=8*-bit hash, we get:
286 | 
287 | | out<sub>1</sub> \ out<sub>2</sub> | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Total |
288 | |---|---|---|---|---|---|---|---|---|---|---|---|
289 | | 0 | 5 | 4 | 4 | 5 | 4 | 4 | 4 | 5 | 4 | 4 | 43 |
290 | | 1 | 4 | 5 | 4 | 4 | 4 | 5 | 4 | 4 | 5 | 4 | 43 |
291 | | 2 | 4 | 4 | 5 | 4 | 4 | 4 | 5 | 4 | 4 | 4 | 42 |
292 | | 3 | 5 | 4 | 4 | 5 | 4 | 4 | 4 | 5 | 4 | 4 | 43 |
293 | | 4 | 4 | 5 | 4 | 4 | 4 | 5 | 4 | 4 | 5 | 4 | 43 |
294 | | 5 | 4 | 4 | 5 | 4 | 4 | 4 | 5 | 4 | 4 | 4 | 42 |
295 | | Total | 26 | 26 | 26 | 26 | 24 | 26 | 26 | 26 | 26 | 24 | 256 |
329 | 
330 | Indeed, both *out<sub>1</sub>* individually, and *(out<sub>1</sub>,out<sub>2</sub>)* jointly, now look maximally uniform.
331 | However, *out<sub>2</sub>* individually **lost** its maximal uniformity: *out<sub>2</sub> = 4* (and *9*) have
332 | probability *24/256*, while the others have probability *26/256*. Can we fix that?
333 | 
334 | ### Fixing individual uniformity
335 | 
336 | The cause is simple: given an input state *x<sub>i</sub>*, the next state is *x<sub>i+1</sub> = x<sub>i</sub>n<sub>i</sub> mod 2<sup>B</sup>*.
337 | When *n<sub>i</sub>* is even, this increases the number of consecutive zero bits at the bottom of the state by at least
338 | one. When *n<sub>i</sub>* is divisible by a large power of two, multiple zeroes get introduced. Those zeroes mean the *x<sub>2</sub>*
339 | variable has less entropy than *x<sub>1</sub>*, which in turn results in non-maximal uniformity in *out<sub>2</sub>*.
340 | 
341 | To address that, we must prevent this degradation of the state variables (each *x<sub>i</sub>*).
342 | We already know that even ranges cause a loss of entropy in the state, and that is in fact the only cause.
343 | Whenever the range *n<sub>i</sub>* is odd, the operation *x<sub>i+1</sub> = x<sub>i</sub>n<sub>i</sub> mod 2<sup>B</sup>* is
344 | a bijection. Because the [GCD](https://en.wikipedia.org/wiki/Greatest_common_divisor) of an odd number and
345 | *2<sup>B</sup>* is *1*, every odd number has a [modular inverse](https://en.wikipedia.org/wiki/Modular_multiplicative_inverse) modulo *2<sup>B</sup>*, and thus the
346 | multiplication can be undone by multiplying with this inverse. That implies no entropy can be lost by this operation.
347 | 
348 | Let's restrict the problem to **just** powers of two: what if *n<sub>i</sub> = 2<sup>r<sub>i</sub></sup>*? In this case,
349 | *fastrange2* is equivalent to returning the top *r<sub>i</sub>* bits of *x<sub>i</sub>* as output, and the bottom *(B-r<sub>i</sub>)* bits of it
350 | (left shifted by *r<sub>i</sub>* positions) as new state.
This left shift destroys information, but we can simply replace it
351 | by a left rotation: that brings the same bits to the top, and thus maintains the maximal joint uniformity
352 | property, but also leaves all the entropy in the state intact.
353 | 
354 | Now we need to combine this rotation with something that supports non-powers-of-two, while retaining all the
355 | uniformity properties we desire. Every non-zero integer *n* can be written as *2<sup>r</sup>k*, where *k* is odd.
356 | Multiplication by odd numbers preserves entropy. Multiplication by powers of 2 (other than 1) does not, but those
357 | can be replaced by bitwise rotations. Composing these two operations yields a solution:
358 | 
359 | ```python
360 | def rotl(x, n):
361 |     """Bitwise left rotate x by n positions."""
362 |     return ((x << n) | (x >> (B - n))) & MASK
363 | 
364 | def fastrange3(x, k, r):
365 |     """Given x in [0,2**B), return out in [0,k*2**r) and new x."""
366 |     assert 0 <= x <= MASK
367 |     assert k & 1
368 |     assert k > 0
369 |     assert r >= 0
370 |     out = (x * k << r) >> B
371 |     new_x = rotl((x * k) & MASK, r)
372 |     return out, new_x
373 | ```
374 | 
375 | The output is the same as with *fastrange* and *fastrange2*, but the new state differs: instead of having *r* 0-bits
376 | at the bottom, those bits are now obtained by rotating *kx*. The top bits are unchanged.
377 | 
378 | This state update is clearly a bijection, as both multiplying with *k* (mod *2<sup>B</sup>*) and rotating are bijections.
379 | So when repeating the earlier example, we get:
380 | 
381 | | out<sub>1</sub> \ out<sub>2</sub> | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Total |
382 | |---|---|---|---|---|---|---|---|---|---|---|---|
383 | | 0 | 5 | 4 | 4 | 5 | 4 | 4 | 4 | 5 | 4 | 4 | 43 |
384 | | 1 | 4 | 5 | 4 | 4 | 4 | 5 | 4 | 4 | 4 | 5 | 43 |
385 | | 2 | 4 | 4 | 5 | 4 | 4 | 4 | 5 | 4 | 4 | 4 | 42 |
386 | | 3 | 5 | 4 | 4 | 4 | 5 | 4 | 4 | 4 | 5 | 4 | 43 |
387 | | 4 | 4 | 5 | 4 | 4 | 4 | 5 | 4 | 4 | 5 | 4 | 43 |
388 | | 5 | 4 | 4 | 4 | 5 | 4 | 4 | 5 | 4 | 4 | 4 | 42 |
389 | | Total | 26 | 26 | 25 | 26 | 25 | 26 | 26 | 25 | 26 | 25 | 256 |
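As before, this table can be checked exhaustively. A self-contained sketch for the *n<sub>1</sub> = 6 = 3·2* and *n<sub>2</sub> = 10 = 5·2* example, with the `rotl`/`fastrange3` definitions repeated so it runs standalone:

```python
from collections import Counter

B = 8
MASK = 2**B - 1

def rotl(x, n):
    """Bitwise left rotate x by n positions (within B bits)."""
    return ((x << n) | (x >> (B - n))) & MASK

def fastrange3(x, k, r):
    """Given x in [0,2**B), return out in [0,k*2**r) and new x."""
    out = (x * k << r) >> B
    new_x = rotl((x * k) & MASK, r)
    return out, new_x

c1, c2, joint = Counter(), Counter(), Counter()
for x in range(2**B):
    out1, x2 = fastrange3(x, 3, 1)  # n1 = 6 = 3 * 2**1
    out2, _ = fastrange3(x2, 5, 1)  # n2 = 10 = 5 * 2**1
    c1[out1] += 1
    c2[out2] += 1
    joint[(out1, out2)] += 1

# Both marginals and the joint distribution are maximally uniform.
for c in (c1, c2, joint):
    assert max(c.values()) - min(c.values()) <= 1
```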
423 | 
424 | This is great. We now get that each individual *out<sub>i</sub>* is maximally uniform, and so is the joint distribution.
425 | Again, this is only experimentally verified, but the property appears to generalize rather strongly to arbitrary
426 | numbers of extractions: the individual distribution of any extracted value, as well as the joint distribution of
427 | any **consecutively** extracted values, appear to be maximally uniform.
428 | 
429 | ### Avoiding the need to decompose *n*
430 | 
431 | The above *fastrange3* function works great, but it requires passing in the desired range of outputs in decomposed *2<sup>r</sup>k*
432 | form. This is slightly annoying, and it can be avoided.
433 | 
434 | Consider what is happening. *fastrange3* behaves the same as *fastrange2* (with *n = 2<sup>r</sup>k*), except that the bottom *r*
435 | bits of the new state are filled in, and not (necessarily) 0. Those bits are the result of rotating *xk* by *r* positions,
436 | or put otherwise, the top *r* bits of *xk mod 2<sup>B</sup>*. These are bits *B…(B+r-1)* of *xk·2<sup>r</sup> = xn*, or,
437 | equivalently, the bottom *r* bits of *⌊xn / 2<sup>B</sup>⌋ = out*.
438 | In other words, the bottom *r* bits of the output are copied into the new state. So we could write an identical *fastrange3* as:
439 | 
440 | ```python
441 | def fastrange3(x, k, r):
442 |     """Given x in [0,2**B), return output in [0,k*2**r) and new x."""
443 |     assert 0 <= x <= MASK
444 |     assert k & 1
445 |     assert k > 0
446 |     assert r >= 0
447 |     n = k << r
448 |     tmp = x * n
449 |     out = tmp >> B
450 |     new_x = (tmp & MASK) | (out & ((1 << r) - 1))
451 |     return out, new_x
452 | ```
453 | 
454 | This almost avoids the need to know *r*, except for the need to construct the bitmask `(1 << r) - 1` = *2<sup>r</sup> - 1*.
455 | This can be done very efficiently using a bit fiddling hack: `(n-1) & ~n`. That expression is identical to *(1 << r) - 1*, where *r* is
456 | the number of consecutive zero bits in *n*, starting at the bottom.
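A quick exhaustive check of that identity (`low_zero_mask` is just a throwaway helper name; note that Python's `~n` on arbitrary-precision ints behaves like infinite two's complement, so the low bits match the fixed-width C behavior):

```python
def low_zero_mask(n):
    """(n-1) & ~n: a mask covering the trailing zero bits of n."""
    return (n - 1) & ~n

# For every n = k * 2**r with k odd, the mask equals 2**r - 1.
for r in range(9):
    for k in range(1, 32, 2):
        n = k << r
        assert low_zero_mask(n) == (1 << r) - 1
```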
Putting it all together, we can write a new *fastrange4* function
457 | that behaves exactly like *fastrange3*, but just takes *n* as input directly:
458 | 
459 | ```python
460 | def fastrange4(x, n):
461 |     """Given x in [0,2**B), return output in [0,n) and new x."""
462 |     assert 0 <= x <= MASK
463 |     assert 0 < n <= MASK
464 |     tmp = x * n
465 |     out = tmp >> B
466 |     new_x = (tmp & MASK) | (out & (n-1) & ~n)
467 |     return out, new_x
468 | ```
469 | 
470 | ### C version
471 | 
472 | I've used Python above for demonstration purposes, but this is of course easily translated to C or similar low-level languages:
473 | 
474 | ```c
475 | uint32_t fastrange4(uint32_t *x, uint32_t n) {
476 |     uint64_t tmp = (uint64_t)*x * (uint64_t)n;
477 |     uint32_t out = tmp >> 32;
478 |     *x = tmp | (out & (n-1) & ~n);
479 |     return out;
480 | }
481 | ```
482 | 
483 | for *B=32*. A version supporting *B=64* hashes but restricted to 32-bit ranges can be written as:
484 | 
485 | ```c
486 | uint32_t fastrange4(uint64_t *x, uint32_t n) {
487 | #if defined(UINT128_MAX) || defined(__SIZEOF_INT128__)
488 |     unsigned __int128 tmp = (unsigned __int128)(*x) * n;
489 |     uint32_t out = tmp >> 64;
490 |     *x = tmp | (out & (n-1) & ~n);
491 |     return out;
492 | #else
493 |     uint64_t x_val = *x;
494 |     uint64_t t_lo = (x_val & 0xffffffff) * (uint64_t)n;
495 |     uint64_t t_hi = (x_val >> 32) * (uint64_t)n + (t_lo >> 32);
496 |     uint32_t upper32 = t_hi >> 32;
497 |     uint64_t lower64 = (t_hi << 32) | (t_lo & 0xffffffff);
498 |     *x = lower64 | (upper32 & (n-1) & ~n);
499 |     return upper32;
500 | #endif
501 | }
502 | ```
503 | 
504 | which makes use of a 64×64→128 multiplication if the platform supports `__int128`. If 64-bit ranges are
505 | needed, a full double-limb multiplication is required.
The code is based on [this snippet](https://stackoverflow.com/a/26855440):
506 | 
507 | ```c
508 | uint64_t fastrange4(uint64_t *x, uint64_t n) {
509 | #if defined(UINT128_MAX) || defined(__SIZEOF_INT128__)
510 |     unsigned __int128 tmp = (unsigned __int128)(*x) * n;
511 |     uint64_t out = tmp >> 64;
512 |     *x = tmp | (out & (n-1) & ~n);
513 |     return out;
514 | #else
515 |     uint64_t x_val = *x;
516 |     uint64_t x_hi = x_val >> 32, x_lo = x_val & 0xffffffff;
517 |     uint64_t n_hi = n >> 32, n_lo = n & 0xffffffff;
518 |     uint64_t hh = x_hi * n_hi;
519 |     uint64_t hl = x_hi * n_lo;
520 |     uint64_t lh = x_lo * n_hi;
521 |     uint64_t ll = x_lo * n_lo;
522 |     uint64_t mid34 = (ll >> 32) + (lh & 0xffffffff) + (hl & 0xffffffff);
523 |     uint64_t upper64 = hh + (lh >> 32) + (hl >> 32) + (mid34 >> 32);
524 |     uint64_t lower64 = (mid34 << 32) | (ll & 0xffffffff);
525 |     *x = lower64 | (upper64 & (n-1) & ~n);
526 |     return upper64;
527 | #endif
528 | }
529 | ```
530 | 
531 | Note that for the final extraction it is unnecessary to update the state further, and the normal fast range
532 | reduction function *fastrange* can be used instead. It is identical to the above routines, but with the `*x =` line
533 | and (if present) the `uint64_t lower64 =` line removed.
534 | 
535 | ## Use as a random number generator?
536 | 
537 | It is appealing to use this as the basis for a random-number-generator-like interface, to produce numbers in any range extremely fast:
538 | 
539 | ```python
540 | class FastRangeExtractor:
541 |     def __init__(self, x):
542 |         assert 0 <= x <= MASK
543 |         self.x = x
544 |     def randrange(self, n):
545 |         assert 0 < n <= MASK
546 |         tmp = self.x * n
547 |         out = tmp >> B
548 |         self.x = (tmp & MASK) | (out & (n-1) & ~n)
549 |         return out
550 | ```
551 | 
552 | However, a word of caution: the extraction schemes presented here only **extract** information efficiently
553 | and uniformly from the provided entropy; they don't introduce any unpredictability themselves.
The simple
554 | structure implies that if someone observes a number of consecutive ranges and their corresponding outputs, where the
555 | product of those ranges exceeds *2<sup>B</sup>*, they can easily compute the state. This is easy to see, as
556 | the maximal uniformity property in this case implies no output can be reached by more than one input (every
557 | output must be reached by exactly 0 or exactly 1 input, since *⌊2<sup>B</sup>/n⌋ = 0*).
558 | It also means that in this case, structure will remain in the produced numbers, as this can be seen as an
559 | attempt to extract more entropy than was originally present. This is similar to the
560 | [hyperplane structure](https://stats.stackexchange.com/questions/38328/hyperplane-problem-in-linear-congruent-generator)
561 | present in the output of [linear congruential generators](https://en.wikipedia.org/wiki/Linear_congruential_generator).
562 | 
563 | Because of this, it is inadvisable to do extractions whose ranges multiply to a number larger than *2<sup>B</sup>*.
564 | 
565 | ## Conclusion
566 | 
567 | We've constructed a simple and efficient generalization of the fast range reduction method to multiple outputs,
568 | in a way that maximizes uniformity properties both for the individually extracted numbers and for their joint distribution.
569 | 
570 | ## Acknowledgments
571 | 
572 | Thanks to Greg Maxwell for several discussions that led to this idea, as well as for proofreading and suggesting
573 | improvements to the text. Thanks to Russell O'Connor for improving the mixed 64/32-bit C function.
574 | 
--------------------------------------------------------------------------------
/uniform-range-extraction/test.py:
--------------------------------------------------------------------------------
1 | import secrets
2 | 
3 | BITS = 8
4 | MASK = (1 << BITS) - 1
5 | 
6 | def extract2(x, n):
7 |     assert 0 <= x <= MASK
8 |     assert 0 < n
9 |     assert n <= MASK
10 |     tmp = x * n
11 |     return (tmp >> BITS, tmp & MASK)
12 | 
13 | def extract4(x, n):
14 |     assert 0 <= x <= MASK
15 |     assert 0 < n
16 |     assert n <= MASK
17 |     low_mask = (n - 1) & ~n
18 |     tmp = x * n
19 |     out = tmp >> BITS
20 |     return (out, (tmp | (out & low_mask)) & MASK)
21 | 
22 | for BITS in range(0, 32):
23 |     print("%i BITS" % BITS)
24 |     MASK = (1 << BITS) - 1
25 |     # Iterate over various products of N1*N2*N3*N4.
26 |     for P in range(16, 1<<(2*BITS)):
27 |         # Iterate over individual N1,N2,N3,N4 values whose product is P.
28 |         for N1 in range(2, min(1<<BITS, P>>3 + 1)):
29 |             for N2 in range(2, min(1<<BITS, P>>2 + 1)):
30 |                 N12 = N1*N2
31 |                 for N3 in range(2, min(1<<BITS, P>>1 + 1)):
32 |                     N123 = N12*N3
33 |                     N4 = P // N123
34 |                     if N4 > 1 and N4 < 1<<BITS:
97 | assert min(e1) + 1 >= max(e1)
98 | assert min(e2) + 1 >= max(e2)
99 | assert min(e3) + 1 >= max(e3)
100 | assert min(e4) + 1 >= max(e4)
101 | assert min(d1) + 1 >= max(d1)
102 | assert min(d2) + 1 >= max(d2)
103 | assert min(d3) + 1 >= max(d3)
104 | assert min(d4) + 1 >= max(d4)
105 | assert min(d5) + 1 >= max(d5)
106 | assert min(d6) + 1 >= max(d6)
107 | assert min(d7) + 1 >= max(d7)
108 | assert min(d8) + 1 >= max(d8)
109 | assert min(d9) + 1 >= max(d9)
110 | assert min(d0) + 1 >= max(d0)
111 | # Verify that all intermediary states are reached exactly once.
112 | assert all(v == 1 for v in da)
113 | assert all(v == 1 for v in db)
114 | assert all(v == 1 for v in dc)
115 | 
--------------------------------------------------------------------------------
/von-neumann-debias-tables/README.md:
--------------------------------------------------------------------------------
1 | This directory contains tables for doing Von Neumann debiasing by hand, based on dice rolls and coin tosses.
2 | 
3 | Unfortunately I've lost the code to generate these.
--------------------------------------------------------------------------------
/von-neumann-debias-tables/coin_extract_8rolls_15states.txt:
--------------------------------------------------------------------------------
1 | Entropy extraction table for possibly-biased coin flips.
2 | 
3 | * Start in state S00
4 | * Flip 1 coin 8 times (the same coin every time)
5 | * Find the row with those 8 flips on the left, in the column for your current state
6 | * Go to the new state listed there, and output the bits listed after the comma (if any)
7 | * Repeat
8 | 
9 | Produces 4.5 bits of entropy per 8 flips if the coin is unbiased.
10 | 11 | In | Prev S00 S01 S02 S03 S04 S05 S06 S07 S08 S09 S10 S11 S12 S13 S14 12 | 00000000: S00 S01 S01,1 S05,1 S05,111 S05 S05,11 S07 S07,1 S07,10 S07,1 S07,1 S07,101 S07,11 S07,11 13 | 10000000: S07,111 S05,111 S05,1111 S00,1111 S00,111111 S00,111 S00,11111 S01,111 S01,1111 S01,11101 S01,1111 S01,1111 S01,111011 S01,11111 S01,11111 14 | 01000000: S07,011 S05,011 S05,0111 S00,0111 S00,011111 S00,011 S00,01111 S01,011 S01,0111 S01,01101 S01,0111 S01,0111 S01,011011 S01,01111 S01,01111 15 | 11000000: S00,11 S01,10 S01,1 S05,011 S05,1 S05,01 S05 S07,00 S07,0 S07 S07,1 S07,001 S07,1 S07,01 S07,11 16 | 00100000: S07,101 S05,101 S05,1011 S00,1011 S00,110111 S00,101 S00,11011 S01,101 S01,1101 S01,10101 S01,1101 S01,1011 S01,101011 S01,11011 S01,11011 17 | 10100000: S01,111 S00,111 S00,1111 S07,11 S08,111 S07,110 S08,1101 S05,110 S05,1101 S05,11010 S05,1101 S05,11 S05,1110 S05,111 S05,111 18 | 01100000: S01,011 S00,011 S00,0111 S07,01 S08,011 S07,010 S08,0101 S05,010 S05,0101 S05,01010 S05,0101 S05,01 S05,0110 S05,011 S05,011 19 | 11100000: S08,1011 S05,10101 S05,1011 S00,010111 S00,1011 S00,01011 S00,101 S01,00101 S01,0101 S01,101 S01,1101 S01,001011 S01,1011 S01,01011 S01,11011 20 | 00010000: S07,001 S05,001 S05,0011 S00,0011 S00,100111 S00,001 S00,10011 S01,001 S01,1001 S01,10001 S01,1001 S01,0011 S01,100011 S01,10011 S01,10011 21 | 10010000: S01,101 S00,101 S00,1011 S07,10 S08,101 S07,100 S08,1001 S05,100 S05,1001 S05,10010 S05,1001 S05,10 S05,1010 S05,101 S05,101 22 | 01010000: S01,001 S00,001 S00,0011 S07,00 S08,001 S07,000 S08,0001 S05,000 S05,0001 S05,00010 S05,0001 S05,00 S05,0010 S05,001 S05,001 23 | 11010000: S08,0011 S05,00101 S05,0011 S00,000111 S00,0011 S00,00011 S00,001 S01,00001 S01,0001 S01,001 S01,1001 S01,000011 S01,0011 S01,00011 S01,10011 24 | 00110000: S00,01 S01,00 S01,0 S05,110 S05,111 S05,10 S05,11 S08,1 S08,11 S08,101 S08,11 S08,11 S08,1011 S08,111 S08,111 25 | 10110000: S08,1111 S05,11101 S05,1111 S00,110111 S00,1111 S00,11011 S00,111 
S01,11001 S01,1101 S01,111 S01,1111 S01,110011 S01,1111 S01,11011 S01,11111 26 | 01110000: S08,0111 S05,01101 S05,0111 S00,010111 S00,0111 S00,01011 S00,011 S01,01001 S01,0101 S01,011 S01,0111 S01,010011 S01,0111 S01,01011 S01,01111 27 | 11110000: S00 S01 S01,1 S05,011 S05,110 S05,01 S05,10 S08,001 S08,01 S08,1 S08,11 S08,0011 S08,11 S08,011 S08,111 28 | 00001000: S07,110 S05,110 S05,1110 S00,1110 S00,111101 S00,110 S00,11101 S01,110 S01,1110 S01,10110 S01,1110 S01,1110 S01,101110 S01,11110 S01,11110 29 | 10001000: S01,1111 S00,1111 S00,11111 S11,1111 S13,11111 S07,1111 S08,11111 S05,1111 S05,11111 S05,111011 S05,11111 S03,1111 S03,111011 S03,11111 S03,11111 30 | 01001000: S01,0111 S00,0111 S00,01111 S11,0111 S13,01111 S07,0111 S08,01111 S05,0111 S05,01111 S05,011011 S05,01111 S03,0111 S03,011011 S03,01111 S03,01111 31 | 11001000: S08,1110 S05,10110 S05,1110 S00,011101 S00,1110 S00,01101 S00,110 S01,00110 S01,0110 S01,110 S01,1110 S01,001110 S01,1110 S01,01110 S01,11110 32 | 00101000: S01,1011 S00,1011 S00,10111 S11,1011 S13,11011 S07,1011 S08,11011 S05,1011 S05,11011 S05,101011 S05,11011 S03,1011 S03,101011 S03,11011 S03,11011 33 | 10101000: S03,11110 S11,11110 S13,11110 S01,11110 S02,111101 S01,110110 S02,1101101 S00,110110 S00,1101101 S00,11011010 S00,1101101 S00,11110 S00,1111010 S00,111101 S00,111101 34 | 01101000: S03,01110 S11,01110 S13,01110 S01,01110 S02,011101 S01,010110 S02,0101101 S00,010110 S00,0101101 S00,01011010 S00,0101101 S00,01110 S00,0111010 S00,011101 S00,011101 35 | 11101000: S02,10111 S00,101110 S00,10111 S13,01011 S11,1011 S08,01011 S07,1011 S05,001011 S05,01011 S05,1011 S05,11011 S03,001011 S03,1011 S03,01011 S03,11011 36 | 00011000: S01,0011 S00,0011 S00,00111 S11,0011 S13,10011 S07,0011 S08,10011 S05,0011 S05,10011 S05,100011 S05,10011 S03,0011 S03,100011 S03,10011 S03,10011 37 | 10011000: S03,10110 S11,10110 S13,10110 S01,10110 S02,101101 S01,100110 S02,1001101 S00,100110 S00,1001101 S00,10011010 S00,1001101 S00,10110 S00,1011010 
S00,101101 S00,101101 38 | 01011000: S03,00110 S11,00110 S13,00110 S01,00110 S02,001101 S01,000110 S02,0001101 S00,000110 S00,0001101 S00,00011010 S00,0001101 S00,00110 S00,0011010 S00,001101 S00,001101 39 | 11011000: S02,00111 S00,001110 S00,00111 S13,00011 S11,0011 S08,00011 S07,0011 S05,000011 S05,00011 S05,0011 S05,10011 S03,000011 S03,0011 S03,00011 S03,10011 40 | 00111000: S08,0110 S05,00110 S05,0110 S00,111010 S00,111101 S00,11010 S00,11101 S02,1101 S02,11101 S02,101101 S02,11101 S02,11101 S02,1011101 S02,111101 S02,111101 41 | 10111000: S02,11111 S00,111110 S00,11111 S13,11011 S11,1111 S08,11011 S07,1111 S05,110011 S05,11011 S05,1111 S05,11111 S03,110011 S03,1111 S03,11011 S03,11111 42 | 01111000: S02,01111 S00,011110 S00,01111 S13,01011 S11,0111 S08,01011 S07,0111 S05,010011 S05,01011 S05,0111 S05,01111 S03,010011 S03,0111 S03,01011 S03,01111 43 | 11111000: S07,110 S05,110 S05,1110 S00,011101 S00,111010 S00,01101 S00,11010 S02,001101 S02,01101 S02,1101 S02,11101 S02,0011101 S02,11101 S02,011101 S02,111101 44 | 00000100: S07,010 S05,010 S05,1010 S00,1010 S00,110101 S00,010 S00,10101 S01,010 S01,1010 S01,10010 S01,1010 S01,1010 S01,101010 S01,11010 S01,11010 45 | 10000100: S01,1101 S00,1101 S00,11011 S11,1101 S13,11101 S07,1101 S08,11101 S05,1101 S05,11101 S05,111001 S05,11101 S03,1101 S03,111001 S03,11101 S03,11101 46 | 01000100: S01,0101 S00,0101 S00,01011 S11,0101 S13,01101 S07,0101 S08,01101 S05,0101 S05,01101 S05,011001 S05,01101 S03,0101 S03,011001 S03,01101 S03,01101 47 | 11000100: S08,1010 S05,10010 S05,1010 S00,010101 S00,1010 S00,00101 S00,010 S01,00010 S01,0010 S01,010 S01,1010 S01,001010 S01,1010 S01,01010 S01,11010 48 | 00100100: S01,1001 S00,1001 S00,10011 S11,1001 S13,11001 S07,1001 S08,11001 S05,1001 S05,11001 S05,101001 S05,11001 S03,1001 S03,101001 S03,11001 S03,11001 49 | 10100100: S03,11010 S11,11010 S13,11010 S01,11010 S02,110101 S01,110010 S02,1100101 S00,110010 S00,1100101 S00,11001010 S00,1100101 S00,11010 S00,1101010 S00,110101 
S00,110101 50 | 01100100: S03,01010 S11,01010 S13,01010 S01,01010 S02,010101 S01,010010 S02,0100101 S00,010010 S00,0100101 S00,01001010 S00,0100101 S00,01010 S00,0101010 S00,010101 S00,010101 51 | 11100100: S02,10011 S00,100110 S00,10011 S13,01001 S11,1001 S08,01001 S07,1001 S05,001001 S05,01001 S05,1001 S05,11001 S03,001001 S03,1001 S03,01001 S03,11001 52 | 00010100: S01,0001 S00,0001 S00,00011 S11,0001 S13,10001 S07,0001 S08,10001 S05,0001 S05,10001 S05,100001 S05,10001 S03,0001 S03,100001 S03,10001 S03,10001 53 | 10010100: S03,10010 S11,10010 S13,10010 S01,10010 S02,100101 S01,100010 S02,1000101 S00,100010 S00,1000101 S00,10001010 S00,1000101 S00,10010 S00,1001010 S00,100101 S00,100101 54 | 01010100: S03,00010 S11,00010 S13,00010 S01,00010 S02,000101 S01,000010 S02,0000101 S00,000010 S00,0000101 S00,00001010 S00,0000101 S00,00010 S00,0001010 S00,000101 S00,000101 55 | 11010100: S02,00011 S00,000110 S00,00011 S13,00001 S11,0001 S08,00001 S07,0001 S05,000001 S05,00001 S05,0001 S05,10001 S03,000001 S03,0001 S03,00001 S03,10001 56 | 00110100: S08,0010 S05,00010 S05,0010 S00,101010 S00,110101 S00,01010 S00,10101 S02,0101 S02,10101 S02,100101 S02,10101 S02,10101 S02,1010101 S02,110101 S02,110101 57 | 10110100: S02,11011 S00,110110 S00,11011 S13,11001 S11,1101 S08,11001 S07,1101 S05,110001 S05,11001 S05,1101 S05,11101 S03,110001 S03,1101 S03,11001 S03,11101 58 | 01110100: S02,01011 S00,010110 S00,01011 S13,01001 S11,0101 S08,01001 S07,0101 S05,010001 S05,01001 S05,0101 S05,01101 S03,010001 S03,0101 S03,01001 S03,01101 59 | 11110100: S07,010 S05,010 S05,1010 S00,010101 S00,101010 S00,00101 S00,01010 S02,000101 S02,00101 S02,0101 S02,10101 S02,0010101 S02,10101 S02,010101 S02,110101 60 | 00001100: S00,10 S02,1 S02,11 S05,100 S05,110 S05,00 S05,10 S08,0 S08,10 S08,100 S08,10 S08,10 S08,1010 S08,110 S08,110 61 | 10001100: S08,1101 S05,11001 S05,1101 S00,111110 S00,111111 S00,11110 S00,11111 S02,1111 S02,11111 S02,111011 S02,11111 S02,11111 S02,1110111 S02,111111 S02,111111 
62 | 01001100: S08,0101 S05,01001 S05,0101 S00,011110 S00,011111 S00,01110 S00,01111 S02,0111 S02,01111 S02,011011 S02,01111 S02,01111 S02,0110111 S02,011111 S02,011111 63 | 11001100: S00,11 S02,101 S02,11 S05,010 S05,100 S05,00 S05,00 S08,000 S08,00 S08,0 S08,10 S08,0010 S08,10 S08,010 S08,110 64 | 00101100: S08,1001 S05,10001 S05,1001 S00,101110 S00,110111 S00,10110 S00,11011 S02,1011 S02,11011 S02,101011 S02,11011 S02,10111 S02,1010111 S02,110111 S02,110111 65 | 10101100: S02,1111 S00,11110 S00,1111 S08,110 S07,11 S08,1100 S07,110 S05,11000 S05,1100 S05,110 S05,1101 S05,1100 S05,11 S05,110 S05,111 66 | 01101100: S02,0111 S00,01110 S00,0111 S08,010 S07,01 S08,0100 S07,010 S05,01000 S05,0100 S05,010 S05,0101 S05,0100 S05,01 S05,010 S05,011 67 | 11101100: S07,101 S05,101 S05,1011 S00,010111 S00,101110 S00,01011 S00,10110 S02,001011 S02,01011 S02,1011 S02,11011 S02,0010111 S02,10111 S02,010111 S02,110111 68 | 00011100: S08,0001 S05,00001 S05,0001 S00,001110 S00,100111 S00,00110 S00,10011 S02,0011 S02,10011 S02,100011 S02,10011 S02,00111 S02,1000111 S02,100111 S02,100111 69 | 10011100: S02,1011 S00,10110 S00,1011 S08,100 S07,10 S08,1000 S07,100 S05,10000 S05,1000 S05,100 S05,1001 S05,1000 S05,10 S05,100 S05,101 70 | 01011100: S02,0011 S00,00110 S00,0011 S08,000 S07,00 S08,0000 S07,000 S05,00000 S05,0000 S05,000 S05,0001 S05,0000 S05,00 S05,000 S05,001 71 | 11011100: S07,001 S05,001 S05,0011 S00,000111 S00,001110 S00,00011 S00,00110 S02,000011 S02,00011 S02,0011 S02,10011 S02,0000111 S02,00111 S02,000111 S02,100111 72 | 00111100: S00,01 S02,001 S02,01 S05,1 S05,111 S05 S05,11 S07 S07,1 S07,10 S07,1 S07,1 S07,101 S07,11 S07,11 73 | 10111100: S07,111 S05,111 S05,1111 S00,110111 S00,111110 S00,11011 S00,11110 S02,110011 S02,11011 S02,1111 S02,11111 S02,1100111 S02,11111 S02,110111 S02,111111 74 | 01111100: S07,011 S05,011 S05,0111 S00,010111 S00,011110 S00,01011 S00,01110 S02,010011 S02,01011 S02,0111 S02,01111 S02,0100111 S02,01111 S02,010111 S02,011111 75 | 11111100: 
S00,10 S02,1 S02,11 S05,011 S05,1 S05,01 S05 S07,00 S07,0 S07 S07,1 S07,001 S07,1 S07,01 S07,11 76 | 00000010: S07,100 S05,100 S05,1100 S00,1100 S00,111100 S00,100 S00,11100 S01,100 S01,1100 S01,10100 S01,1100 S01,1100 S01,101100 S01,11100 S01,11100 77 | 10000010: S01,1110 S00,1110 S00,11110 S11,1110 S13,11110 S07,1110 S08,11110 S05,1110 S05,11110 S05,111010 S05,11110 S03,1110 S03,111010 S03,11110 S03,11110 78 | 01000010: S01,0110 S00,0110 S00,01110 S11,0110 S13,01110 S07,0110 S08,01110 S05,0110 S05,01110 S05,011010 S05,01110 S03,0110 S03,011010 S03,01110 S03,01110 79 | 11000010: S08,1100 S05,10100 S05,1100 S00,011100 S00,1100 S00,01100 S00,100 S01,00100 S01,0100 S01,100 S01,1100 S01,001100 S01,1100 S01,01100 S01,11100 80 | 00100010: S01,1010 S00,1010 S00,10110 S11,1010 S13,11010 S07,1010 S08,11010 S05,1010 S05,11010 S05,101010 S05,11010 S03,1010 S03,101010 S03,11010 S03,11010 81 | 10100010: S03,11100 S11,11100 S13,11100 S01,11100 S02,111100 S01,110100 S02,1101100 S00,110100 S00,1101100 S00,11010100 S00,1101100 S00,11100 S00,1110100 S00,111100 S00,111100 82 | 01100010: S03,01100 S11,01100 S13,01100 S01,01100 S02,011100 S01,010100 S02,0101100 S00,010100 S00,0101100 S00,01010100 S00,0101100 S00,01100 S00,0110100 S00,011100 S00,011100 83 | 11100010: S02,10110 S00,101010 S00,10110 S13,01010 S11,1010 S08,01010 S07,1010 S05,001010 S05,01010 S05,1010 S05,11010 S03,001010 S03,1010 S03,01010 S03,11010 84 | 00010010: S01,0010 S00,0010 S00,00110 S11,0010 S13,10010 S07,0010 S08,10010 S05,0010 S05,10010 S05,100010 S05,10010 S03,0010 S03,100010 S03,10010 S03,10010 85 | 10010010: S03,10100 S11,10100 S13,10100 S01,10100 S02,101100 S01,100100 S02,1001100 S00,100100 S00,1001100 S00,10010100 S00,1001100 S00,10100 S00,1010100 S00,101100 S00,101100 86 | 01010010: S03,00100 S11,00100 S13,00100 S01,00100 S02,001100 S01,000100 S02,0001100 S00,000100 S00,0001100 S00,00010100 S00,0001100 S00,00100 S00,0010100 S00,001100 S00,001100 87 | 11010010: S02,00110 S00,001010 S00,00110 S13,00010 
S11,0010 S08,00010 S07,0010 S05,000010 S05,00010 S05,0010 S05,10010 S03,000010 S03,0010 S03,00010 S03,10010 88 | 00110010: S08,0100 S05,00100 S05,0100 S00,110100 S00,111100 S00,10100 S00,11100 S02,1100 S02,11100 S02,101100 S02,11100 S02,11100 S02,1011100 S02,111100 S02,111100 89 | 10110010: S02,11110 S00,111010 S00,11110 S13,11010 S11,1110 S08,11010 S07,1110 S05,110010 S05,11010 S05,1110 S05,11110 S03,110010 S03,1110 S03,11010 S03,11110 90 | 01110010: S02,01110 S00,011010 S00,01110 S13,01010 S11,0110 S08,01010 S07,0110 S05,010010 S05,01010 S05,0110 S05,01110 S03,010010 S03,0110 S03,01010 S03,01110 91 | 11110010: S07,100 S05,100 S05,1100 S00,011100 S00,110100 S00,01100 S00,10100 S02,001100 S02,01100 S02,1100 S02,11100 S02,0011100 S02,11100 S02,011100 S02,111100 92 | 00001010: S01,110 S00,110 S00,1110 S11,111 S13,1111 S11,11 S13,111 S03,11 S03,111 S03,1011 S03,111 S03,111 S03,10111 S03,1111 S03,1111 93 | 10001010: S03,11111 S11,11111 S13,11111 S01,11111 S02,111111 S01,111110 S02,1111110 S00,111110 S00,1111110 S00,11101110 S00,1111110 S00,11111 S00,1110111 S00,111111 S00,111111 94 | 01001010: S03,01111 S11,01111 S13,01111 S01,01111 S02,011111 S01,011110 S02,0111110 S00,011110 S00,0111110 S00,01101110 S00,0111110 S00,01111 S00,0110111 S00,011111 S00,011111 95 | 11001010: S02,1110 S00,10110 S00,1110 S13,0111 S11,111 S13,011 S11,11 S03,0011 S03,011 S03,11 S03,111 S03,00111 S03,111 S03,0111 S03,1111 96 | 00101010: S03,10111 S11,10111 S13,10111 S01,10111 S02,110111 S01,101110 S02,1101110 S00,101110 S00,1101110 S00,10101110 S00,1101110 S00,10111 S00,1010111 S00,110111 S00,110111 97 | 10101010: S00,1111 S01,1111 S02,1111 S03,1111 S04,1111 S03,11011 S04,11011 S11,11011 S13,11011 S12,11011 S14,11011 S11,1111 S12,1111 S13,1111 S14,1111 98 | 01101010: S00,0111 S01,0111 S02,0111 S03,0111 S04,0111 S03,01011 S04,01011 S11,01011 S13,01011 S12,01011 S14,01011 S11,0111 S12,0111 S13,0111 S14,0111 99 | 11101010: S04,10111 S12,10111 S14,10111 S02,010111 S01,10111 S02,0101110 S01,101110 
S00,00101110 S00,0101110 S00,101110 S00,1101110 S00,0010111 S00,10111 S00,010111 S00,110111 100 | 00011010: S03,00111 S11,00111 S13,00111 S01,00111 S02,100111 S01,001110 S02,1001110 S00,001110 S00,1001110 S00,10001110 S00,1001110 S00,00111 S00,1000111 S00,100111 S00,100111 101 | 10011010: S00,1011 S01,1011 S02,1011 S03,1011 S04,1011 S03,10011 S04,10011 S11,10011 S13,10011 S12,10011 S14,10011 S11,1011 S12,1011 S13,1011 S14,1011 102 | 01011010: S00,0011 S01,0011 S02,0011 S03,0011 S04,0011 S03,00011 S04,00011 S11,00011 S13,00011 S12,00011 S14,00011 S11,0011 S12,0011 S13,0011 S14,0011 103 | 11011010: S04,00111 S12,00111 S14,00111 S02,000111 S01,00111 S02,0001110 S01,001110 S00,00001110 S00,0001110 S00,001110 S00,1001110 S00,0000111 S00,00111 S00,000111 S00,100111 104 | 00111010: S02,0110 S00,00110 S00,0110 S12,111 S14,1111 S12,11 S14,111 S04,11 S04,111 S04,1011 S04,111 S04,111 S04,10111 S04,1111 S04,1111 105 | 10111010: S04,11111 S12,11111 S14,11111 S02,110111 S01,11111 S02,1101110 S01,111110 S00,11001110 S00,1101110 S00,111110 S00,1111110 S00,1100111 S00,11111 S00,110111 S00,111111 106 | 01111010: S04,01111 S12,01111 S14,01111 S02,010111 S01,01111 S02,0101110 S01,011110 S00,01001110 S00,0101110 S00,011110 S00,0111110 S00,0100111 S00,01111 S00,010111 S00,011111 107 | 11111010: S01,110 S00,110 S00,1110 S14,0111 S12,111 S14,011 S12,11 S04,0011 S04,011 S04,11 S04,111 S04,00111 S04,111 S04,0111 S04,1111 108 | 00000110: S01,010 S00,010 S00,1010 S11,101 S13,1101 S11,01 S13,101 S03,01 S03,101 S03,1001 S03,101 S03,101 S03,10101 S03,1101 S03,1101 109 | 10000110: S03,11011 S11,11011 S13,11011 S01,11011 S02,111011 S01,110110 S02,1110110 S00,110110 S00,1110110 S00,11100110 S00,1110110 S00,11011 S00,1110011 S00,111011 S00,111011 110 | 01000110: S03,01011 S11,01011 S13,01011 S01,01011 S02,011011 S01,010110 S02,0110110 S00,010110 S00,0110110 S00,01100110 S00,0110110 S00,01011 S00,0110011 S00,011011 S00,011011 111 | 11000110: S02,1010 S00,10010 S00,1010 S13,0101 S11,101 S13,001 S11,01 
S03,0001 S03,001 S03,01 S03,101 S03,00101 S03,101 S03,0101 S03,1101 112 | 00100110: S03,10011 S11,10011 S13,10011 S01,10011 S02,110011 S01,100110 S02,1100110 S00,100110 S00,1100110 S00,10100110 S00,1100110 S00,10011 S00,1010011 S00,110011 S00,110011 113 | 10100110: S00,1101 S01,1101 S02,1101 S03,1101 S04,1101 S03,11001 S04,11001 S11,11001 S13,11001 S12,11001 S14,11001 S11,1101 S12,1101 S13,1101 S14,1101 114 | 01100110: S00,0101 S01,0101 S02,0101 S03,0101 S04,0101 S03,01001 S04,01001 S11,01001 S13,01001 S12,01001 S14,01001 S11,0101 S12,0101 S13,0101 S14,0101 115 | 11100110: S04,10011 S12,10011 S14,10011 S02,010011 S01,10011 S02,0100110 S01,100110 S00,00100110 S00,0100110 S00,100110 S00,1100110 S00,0010011 S00,10011 S00,010011 S00,110011 116 | 00010110: S03,00011 S11,00011 S13,00011 S01,00011 S02,100011 S01,000110 S02,1000110 S00,000110 S00,1000110 S00,10000110 S00,1000110 S00,00011 S00,1000011 S00,100011 S00,100011 117 | 10010110: S00,1001 S01,1001 S02,1001 S03,1001 S04,1001 S03,10001 S04,10001 S11,10001 S13,10001 S12,10001 S14,10001 S11,1001 S12,1001 S13,1001 S14,1001 118 | 01010110: S00,0001 S01,0001 S02,0001 S03,0001 S04,0001 S03,00001 S04,00001 S11,00001 S13,00001 S12,00001 S14,00001 S11,0001 S12,0001 S13,0001 S14,0001 119 | 11010110: S04,00011 S12,00011 S14,00011 S02,000011 S01,00011 S02,0000110 S01,000110 S00,00000110 S00,0000110 S00,000110 S00,1000110 S00,0000011 S00,00011 S00,000011 S00,100011 120 | 00110110: S02,0010 S00,00010 S00,0010 S12,101 S14,1101 S12,01 S14,101 S04,01 S04,101 S04,1001 S04,101 S04,101 S04,10101 S04,1101 S04,1101 121 | 10110110: S04,11011 S12,11011 S14,11011 S02,110011 S01,11011 S02,1100110 S01,110110 S00,11000110 S00,1100110 S00,110110 S00,1110110 S00,1100011 S00,11011 S00,110011 S00,111011 122 | 01110110: S04,01011 S12,01011 S14,01011 S02,010011 S01,01011 S02,0100110 S01,010110 S00,01000110 S00,0100110 S00,010110 S00,0110110 S00,0100011 S00,01011 S00,010011 S00,011011 123 | 11110110: S01,010 S00,010 S00,1010 S14,0101 S12,101 S14,001 
S12,01 S04,0001 S04,001 S04,01 S04,101 S04,00101 S04,101 S04,0101 S04,1101 124 | 00001110: S09,100 S06,100 S06,1100 S00,100100 S00,110100 S00,00100 S00,10100 S02,0100 S02,10100 S02,100100 S02,10100 S02,10100 S02,1010100 S02,110100 S02,110100 125 | 10001110: S02,11010 S00,110010 S00,11010 S12,1110 S14,11110 S09,1110 S10,11110 S06,1110 S06,11110 S06,111010 S06,11110 S04,1110 S04,111010 S04,11110 S04,11110 126 | 01001110: S02,01010 S00,010010 S00,01010 S12,0110 S14,01110 S09,0110 S10,01110 S06,0110 S06,01110 S06,011010 S06,01110 S04,0110 S04,011010 S04,01110 S04,01110 127 | 11001110: S10,1100 S06,10100 S06,1100 S00,010100 S00,100100 S00,00100 S00,00100 S02,000100 S02,00100 S02,0100 S02,10100 S02,0010100 S02,10100 S02,010100 S02,110100 128 | 00101110: S02,10010 S00,100010 S00,10010 S12,1010 S14,11010 S09,1010 S10,11010 S06,1010 S06,11010 S06,101010 S06,11010 S04,1010 S04,101010 S04,11010 S04,11010 129 | 10101110: S04,11100 S12,11100 S14,11100 S02,110100 S01,11100 S02,1100100 S01,110100 S00,11000100 S00,1100100 S00,110100 S00,1101100 S00,1100100 S00,11100 S00,110100 S00,111100 130 | 01101110: S04,01100 S12,01100 S14,01100 S02,010100 S01,01100 S02,0100100 S01,010100 S00,01000100 S00,0100100 S00,010100 S00,0101100 S00,0100100 S00,01100 S00,010100 S00,011100 131 | 11101110: S01,1010 S00,1010 S00,10110 S14,01010 S12,1010 S10,01010 S09,1010 S06,001010 S06,01010 S06,1010 S06,11010 S04,001010 S04,1010 S04,01010 S04,11010 132 | 00011110: S02,00010 S00,000010 S00,00010 S12,0010 S14,10010 S09,0010 S10,10010 S06,0010 S06,10010 S06,100010 S06,10010 S04,0010 S04,100010 S04,10010 S04,10010 133 | 10011110: S04,10100 S12,10100 S14,10100 S02,100100 S01,10100 S02,1000100 S01,100100 S00,10000100 S00,1000100 S00,100100 S00,1001100 S00,1000100 S00,10100 S00,100100 S00,101100 134 | 01011110: S04,00100 S12,00100 S14,00100 S02,000100 S01,00100 S02,0000100 S01,000100 S00,00000100 S00,0000100 S00,000100 S00,0001100 S00,0000100 S00,00100 S00,000100 S00,001100 135 | 11011110: S01,0010 S00,0010 
S00,00110 S14,00010 S12,0010 S10,00010 S09,0010 S06,000010 S06,00010 S06,0010 S06,10010 S04,000010 S04,0010 S04,00010 S04,10010 136 | 00111110: S10,0100 S06,00100 S06,0100 S00,1100 S00,111100 S00,100 S00,11100 S01,100 S01,1100 S01,10100 S01,1100 S01,1100 S01,101100 S01,11100 S01,11100 137 | 10111110: S01,1110 S00,1110 S00,11110 S14,11010 S12,1110 S10,11010 S09,1110 S06,110010 S06,11010 S06,1110 S06,11110 S04,110010 S04,1110 S04,11010 S04,11110 138 | 01111110: S01,0110 S00,0110 S00,01110 S14,01010 S12,0110 S10,01010 S09,0110 S06,010010 S06,01010 S06,0110 S06,01110 S04,010010 S04,0110 S04,01010 S04,01110 139 | 11111110: S09,100 S06,100 S06,1100 S00,011100 S00,1100 S00,01100 S00,100 S01,00100 S01,0100 S01,100 S01,1100 S01,001100 S01,1100 S01,01100 S01,11100 140 | 00000001: S07,000 S05,000 S05,1000 S00,1000 S00,111000 S00,000 S00,11000 S01,000 S01,1000 S01,10000 S01,1000 S01,1000 S01,101000 S01,11000 S01,11000 141 | 10000001: S01,1100 S00,1100 S00,11100 S11,1100 S13,11100 S07,1100 S08,11100 S05,1100 S05,11100 S05,111000 S05,11100 S03,1100 S03,111000 S03,11100 S03,11100 142 | 01000001: S01,0100 S00,0100 S00,01100 S11,0100 S13,01100 S07,0100 S08,01100 S05,0100 S05,01100 S05,011000 S05,01100 S03,0100 S03,011000 S03,01100 S03,01100 143 | 11000001: S08,1000 S05,10000 S05,1000 S00,011000 S00,1000 S00,01000 S00,000 S01,00000 S01,0000 S01,000 S01,1000 S01,001000 S01,1000 S01,01000 S01,11000 144 | 00100001: S01,1000 S00,1000 S00,10100 S11,1000 S13,11000 S07,1000 S08,11000 S05,1000 S05,11000 S05,101000 S05,11000 S03,1000 S03,101000 S03,11000 S03,11000 145 | 10100001: S03,11000 S11,11000 S13,11000 S01,11000 S02,111000 S01,110000 S02,1101000 S00,110000 S00,1101000 S00,11010000 S00,1101000 S00,11000 S00,1110000 S00,111000 S00,111000 146 | 01100001: S03,01000 S11,01000 S13,01000 S01,01000 S02,011000 S01,010000 S02,0101000 S00,010000 S00,0101000 S00,01010000 S00,0101000 S00,01000 S00,0110000 S00,011000 S00,011000 147 | 11100001: S02,10100 S00,101000 S00,10100 S13,01000 S11,1000 
S08,01000 S07,1000 S05,001000 S05,01000 S05,1000 S05,11000 S03,001000 S03,1000 S03,01000 S03,11000 148 | 00010001: S01,0000 S00,0000 S00,00100 S11,0000 S13,10000 S07,0000 S08,10000 S05,0000 S05,10000 S05,100000 S05,10000 S03,0000 S03,100000 S03,10000 S03,10000 149 | 10010001: S03,10000 S11,10000 S13,10000 S01,10000 S02,101000 S01,100000 S02,1001000 S00,100000 S00,1001000 S00,10010000 S00,1001000 S00,10000 S00,1010000 S00,101000 S00,101000 150 | 01010001: S03,00000 S11,00000 S13,00000 S01,00000 S02,001000 S01,000000 S02,0001000 S00,000000 S00,0001000 S00,00010000 S00,0001000 S00,00000 S00,0010000 S00,001000 S00,001000 151 | 11010001: S02,00100 S00,001000 S00,00100 S13,00000 S11,0000 S08,00000 S07,0000 S05,000000 S05,00000 S05,0000 S05,10000 S03,000000 S03,0000 S03,00000 S03,10000 152 | 00110001: S08,0000 S05,00000 S05,0000 S00,110000 S00,111000 S00,10000 S00,11000 S02,1000 S02,11000 S02,101000 S02,11000 S02,11000 S02,1011000 S02,111000 S02,111000 153 | 10110001: S02,11100 S00,111000 S00,11100 S13,11000 S11,1100 S08,11000 S07,1100 S05,110000 S05,11000 S05,1100 S05,11100 S03,110000 S03,1100 S03,11000 S03,11100 154 | 01110001: S02,01100 S00,011000 S00,01100 S13,01000 S11,0100 S08,01000 S07,0100 S05,010000 S05,01000 S05,0100 S05,01100 S03,010000 S03,0100 S03,01000 S03,01100 155 | 11110001: S07,000 S05,000 S05,1000 S00,011000 S00,110000 S00,01000 S00,10000 S02,001000 S02,01000 S02,1000 S02,11000 S02,0011000 S02,11000 S02,011000 S02,111000 156 | 00001001: S01,100 S00,100 S00,1100 S11,110 S13,1110 S11,10 S13,110 S03,10 S03,110 S03,1010 S03,110 S03,110 S03,10110 S03,1110 S03,1110 157 | 10001001: S03,11101 S11,11101 S13,11101 S01,11101 S02,111101 S01,111010 S02,1111010 S00,111010 S00,1111010 S00,11101010 S00,1111010 S00,11101 S00,1110101 S00,111101 S00,111101 158 | 01001001: S03,01101 S11,01101 S13,01101 S01,01101 S02,011101 S01,011010 S02,0111010 S00,011010 S00,0111010 S00,01101010 S00,0111010 S00,01101 S00,0110101 S00,011101 S00,011101 159 | 11001001: S02,1100 S00,10100 
S00,1100 S13,0110 S11,110 S13,010 S11,10 S03,0010 S03,010 S03,10 S03,110 S03,00110 S03,110 S03,0110 S03,1110 160 | 00101001: S03,10101 S11,10101 S13,10101 S01,10101 S02,110101 S01,101010 S02,1101010 S00,101010 S00,1101010 S00,10101010 S00,1101010 S00,10101 S00,1010101 S00,110101 S00,110101 161 | 10101001: S00,1110 S01,1110 S02,1110 S03,1110 S04,1110 S03,11010 S04,11010 S11,11010 S13,11010 S12,11010 S14,11010 S11,1110 S12,1110 S13,1110 S14,1110 162 | 01101001: S00,0110 S01,0110 S02,0110 S03,0110 S04,0110 S03,01010 S04,01010 S11,01010 S13,01010 S12,01010 S14,01010 S11,0110 S12,0110 S13,0110 S14,0110 163 | 11101001: S04,10101 S12,10101 S14,10101 S02,010101 S01,10101 S02,0101010 S01,101010 S00,00101010 S00,0101010 S00,101010 S00,1101010 S00,0010101 S00,10101 S00,010101 S00,110101 164 | 00011001: S03,00101 S11,00101 S13,00101 S01,00101 S02,100101 S01,001010 S02,1001010 S00,001010 S00,1001010 S00,10001010 S00,1001010 S00,00101 S00,1000101 S00,100101 S00,100101 165 | 10011001: S00,1010 S01,1010 S02,1010 S03,1010 S04,1010 S03,10010 S04,10010 S11,10010 S13,10010 S12,10010 S14,10010 S11,1010 S12,1010 S13,1010 S14,1010 166 | 01011001: S00,0010 S01,0010 S02,0010 S03,0010 S04,0010 S03,00010 S04,00010 S11,00010 S13,00010 S12,00010 S14,00010 S11,0010 S12,0010 S13,0010 S14,0010 167 | 11011001: S04,00101 S12,00101 S14,00101 S02,000101 S01,00101 S02,0001010 S01,001010 S00,00001010 S00,0001010 S00,001010 S00,1001010 S00,0000101 S00,00101 S00,000101 S00,100101 168 | 00111001: S02,0100 S00,00100 S00,0100 S12,110 S14,1110 S12,10 S14,110 S04,10 S04,110 S04,1010 S04,110 S04,110 S04,10110 S04,1110 S04,1110 169 | 10111001: S04,11101 S12,11101 S14,11101 S02,110101 S01,11101 S02,1101010 S01,111010 S00,11001010 S00,1101010 S00,111010 S00,1111010 S00,1100101 S00,11101 S00,110101 S00,111101 170 | 01111001: S04,01101 S12,01101 S14,01101 S02,010101 S01,01101 S02,0101010 S01,011010 S00,01001010 S00,0101010 S00,011010 S00,0111010 S00,0100101 S00,01101 S00,010101 S00,011101 171 | 11111001: S01,100 
S00,100 S00,1100 S14,0110 S12,110 S14,010 S12,10 S04,0010 S04,010 S04,10 S04,110 S04,00110 S04,110 S04,0110 S04,1110 172 | 00000101: S01,000 S00,000 S00,1000 S11,100 S13,1100 S11,00 S13,100 S03,00 S03,100 S03,1000 S03,100 S03,100 S03,10100 S03,1100 S03,1100 173 | 10000101: S03,11001 S11,11001 S13,11001 S01,11001 S02,111001 S01,110010 S02,1110010 S00,110010 S00,1110010 S00,11100010 S00,1110010 S00,11001 S00,1110001 S00,111001 S00,111001 174 | 01000101: S03,01001 S11,01001 S13,01001 S01,01001 S02,011001 S01,010010 S02,0110010 S00,010010 S00,0110010 S00,01100010 S00,0110010 S00,01001 S00,0110001 S00,011001 S00,011001 175 | 11000101: S02,1000 S00,10000 S00,1000 S13,0100 S11,100 S13,000 S11,00 S03,0000 S03,000 S03,00 S03,100 S03,00100 S03,100 S03,0100 S03,1100 176 | 00100101: S03,10001 S11,10001 S13,10001 S01,10001 S02,110001 S01,100010 S02,1100010 S00,100010 S00,1100010 S00,10100010 S00,1100010 S00,10001 S00,1010001 S00,110001 S00,110001 177 | 10100101: S00,1100 S01,1100 S02,1100 S03,1100 S04,1100 S03,11000 S04,11000 S11,11000 S13,11000 S12,11000 S14,11000 S11,1100 S12,1100 S13,1100 S14,1100 178 | 01100101: S00,0100 S01,0100 S02,0100 S03,0100 S04,0100 S03,01000 S04,01000 S11,01000 S13,01000 S12,01000 S14,01000 S11,0100 S12,0100 S13,0100 S14,0100 179 | 11100101: S04,10001 S12,10001 S14,10001 S02,010001 S01,10001 S02,0100010 S01,100010 S00,00100010 S00,0100010 S00,100010 S00,1100010 S00,0010001 S00,10001 S00,010001 S00,110001 180 | 00010101: S03,00001 S11,00001 S13,00001 S01,00001 S02,100001 S01,000010 S02,1000010 S00,000010 S00,1000010 S00,10000010 S00,1000010 S00,00001 S00,1000001 S00,100001 S00,100001 181 | 10010101: S00,1000 S01,1000 S02,1000 S03,1000 S04,1000 S03,10000 S04,10000 S11,10000 S13,10000 S12,10000 S14,10000 S11,1000 S12,1000 S13,1000 S14,1000 182 | 01010101: S00,0000 S01,0000 S02,0000 S03,0000 S04,0000 S03,00000 S04,00000 S11,00000 S13,00000 S12,00000 S14,00000 S11,0000 S12,0000 S13,0000 S14,0000 183 | 11010101: S04,00001 S12,00001 S14,00001 S02,000001 
S01,00001 S02,0000010 S01,000010 S00,00000010 S00,0000010 S00,000010 S00,1000010 S00,0000001 S00,00001 S00,000001 S00,100001 184 | 00110101: S02,0000 S00,00000 S00,0000 S12,100 S14,1100 S12,00 S14,100 S04,00 S04,100 S04,1000 S04,100 S04,100 S04,10100 S04,1100 S04,1100 185 | 10110101: S04,11001 S12,11001 S14,11001 S02,110001 S01,11001 S02,1100010 S01,110010 S00,11000010 S00,1100010 S00,110010 S00,1110010 S00,1100001 S00,11001 S00,110001 S00,111001 186 | 01110101: S04,01001 S12,01001 S14,01001 S02,010001 S01,01001 S02,0100010 S01,010010 S00,01000010 S00,0100010 S00,010010 S00,0110010 S00,0100001 S00,01001 S00,010001 S00,011001 187 | 11110101: S01,000 S00,000 S00,1000 S14,0100 S12,100 S14,000 S12,00 S04,0000 S04,000 S04,00 S04,100 S04,00100 S04,100 S04,0100 S04,1100 188 | 00001101: S09,000 S06,000 S06,1000 S00,100000 S00,110000 S00,00000 S00,10000 S02,0000 S02,10000 S02,100000 S02,10000 S02,10000 S02,1010000 S02,110000 S02,110000 189 | 10001101: S02,11000 S00,110000 S00,11000 S12,1100 S14,11100 S09,1100 S10,11100 S06,1100 S06,11100 S06,111000 S06,11100 S04,1100 S04,111000 S04,11100 S04,11100 190 | 01001101: S02,01000 S00,010000 S00,01000 S12,0100 S14,01100 S09,0100 S10,01100 S06,0100 S06,01100 S06,011000 S06,01100 S04,0100 S04,011000 S04,01100 S04,01100 191 | 11001101: S10,1000 S06,10000 S06,1000 S00,010000 S00,100000 S00,00000 S00,00000 S02,000000 S02,00000 S02,0000 S02,10000 S02,0010000 S02,10000 S02,010000 S02,110000 192 | 00101101: S02,10000 S00,100000 S00,10000 S12,1000 S14,11000 S09,1000 S10,11000 S06,1000 S06,11000 S06,101000 S06,11000 S04,1000 S04,101000 S04,11000 S04,11000 193 | 10101101: S04,11000 S12,11000 S14,11000 S02,110000 S01,11000 S02,1100000 S01,110000 S00,11000000 S00,1100000 S00,110000 S00,1101000 S00,1100000 S00,11000 S00,110000 S00,111000 194 | 01101101: S04,01000 S12,01000 S14,01000 S02,010000 S01,01000 S02,0100000 S01,010000 S00,01000000 S00,0100000 S00,010000 S00,0101000 S00,0100000 S00,01000 S00,010000 S00,011000 195 | 11101101: S01,1000 
S00,1000 S00,10100 S14,01000 S12,1000 S10,01000 S09,1000 S06,001000 S06,01000 S06,1000 S06,11000 S04,001000 S04,1000 S04,01000 S04,11000 196 | 00011101: S02,00000 S00,000000 S00,00000 S12,0000 S14,10000 S09,0000 S10,10000 S06,0000 S06,10000 S06,100000 S06,10000 S04,0000 S04,100000 S04,10000 S04,10000 197 | 10011101: S04,10000 S12,10000 S14,10000 S02,100000 S01,10000 S02,1000000 S01,100000 S00,10000000 S00,1000000 S00,100000 S00,1001000 S00,1000000 S00,10000 S00,100000 S00,101000 198 | 01011101: S04,00000 S12,00000 S14,00000 S02,000000 S01,00000 S02,0000000 S01,000000 S00,00000000 S00,0000000 S00,000000 S00,0001000 S00,0000000 S00,00000 S00,000000 S00,001000 199 | 11011101: S01,0000 S00,0000 S00,00100 S14,00000 S12,0000 S10,00000 S09,0000 S06,000000 S06,00000 S06,0000 S06,10000 S04,000000 S04,0000 S04,00000 S04,10000 200 | 00111101: S10,0000 S06,00000 S06,0000 S00,1000 S00,111000 S00,000 S00,11000 S01,000 S01,1000 S01,10000 S01,1000 S01,1000 S01,101000 S01,11000 S01,11000 201 | 10111101: S01,1100 S00,1100 S00,11100 S14,11000 S12,1100 S10,11000 S09,1100 S06,110000 S06,11000 S06,1100 S06,11100 S04,110000 S04,1100 S04,11000 S04,11100 202 | 01111101: S01,0100 S00,0100 S00,01100 S14,01000 S12,0100 S10,01000 S09,0100 S06,010000 S06,01000 S06,0100 S06,01100 S04,010000 S04,0100 S04,01000 S04,01100 203 | 11111101: S09,000 S06,000 S06,1000 S00,011000 S00,1000 S00,01000 S00,000 S01,00000 S01,0000 S01,000 S01,1000 S01,001000 S01,1000 S01,01000 S01,11000 204 | 00000011: S00,00 S02,0 S02,10 S06,1 S06,111 S06 S06,11 S09 S09,1 S09,10 S09,1 S09,1 S09,101 S09,11 S09,11 205 | 10000011: S09,111 S06,111 S06,1111 S00,111100 S00,111110 S00,11100 S00,11110 S02,1110 S02,11110 S02,111010 S02,11110 S02,11110 S02,1110110 S02,111110 S02,111110 206 | 01000011: S09,011 S06,011 S06,0111 S00,011100 S00,011110 S00,01100 S00,01110 S02,0110 S02,01110 S02,011010 S02,01110 S02,01110 S02,0110110 S02,011110 S02,011110 207 | 11000011: S00,10 S02,100 S02,10 S06,011 S06,1 S06,01 S06 S09,00 S09,0 S09 S09,1 
S09,001 S09,1 S09,01 S09,11 208 | 00100011: S09,101 S06,101 S06,1011 S00,101100 S00,110110 S00,10100 S00,11010 S02,1010 S02,11010 S02,101010 S02,11010 S02,10110 S02,1010110 S02,110110 S02,110110 209 | 10100011: S02,1110 S00,11100 S00,1110 S09,11 S10,111 S09,110 S10,1101 S06,110 S06,1101 S06,11010 S06,1101 S06,11 S06,1110 S06,111 S06,111 210 | 01100011: S02,0110 S00,01100 S00,0110 S09,01 S10,011 S09,010 S10,0101 S06,010 S06,0101 S06,01010 S06,0101 S06,01 S06,0110 S06,011 S06,011 211 | 11100011: S10,1011 S06,10101 S06,1011 S00,010110 S00,101100 S00,01010 S00,10100 S02,001010 S02,01010 S02,1010 S02,11010 S02,0010110 S02,10110 S02,010110 S02,110110 212 | 00010011: S09,001 S06,001 S06,0011 S00,001100 S00,100110 S00,00100 S00,10010 S02,0010 S02,10010 S02,100010 S02,10010 S02,00110 S02,1000110 S02,100110 S02,100110 213 | 10010011: S02,1010 S00,10100 S00,1010 S09,10 S10,101 S09,100 S10,1001 S06,100 S06,1001 S06,10010 S06,1001 S06,10 S06,1010 S06,101 S06,101 214 | 01010011: S02,0010 S00,00100 S00,0010 S09,00 S10,001 S09,000 S10,0001 S06,000 S06,0001 S06,00010 S06,0001 S06,00 S06,0010 S06,001 S06,001 215 | 11010011: S10,0011 S06,00101 S06,0011 S00,000110 S00,001100 S00,00010 S00,00100 S02,000010 S02,00010 S02,0010 S02,10010 S02,0000110 S02,00110 S02,000110 S02,100110 216 | 00110011: S00,00 S02,000 S02,00 S06,110 S06,111 S06,10 S06,11 S10,1 S10,11 S10,101 S10,11 S10,11 S10,1011 S10,111 S10,111 217 | 10110011: S10,1111 S06,11101 S06,1111 S00,110110 S00,111100 S00,11010 S00,11100 S02,110010 S02,11010 S02,1110 S02,11110 S02,1100110 S02,11110 S02,110110 S02,111110 218 | 01110011: S10,0111 S06,01101 S06,0111 S00,010110 S00,011100 S00,01010 S00,01100 S02,010010 S02,01010 S02,0110 S02,01110 S02,0100110 S02,01110 S02,010110 S02,011110 219 | 11110011: S00,00 S02,0 S02,10 S06,011 S06,110 S06,01 S06,10 S10,001 S10,01 S10,1 S10,11 S10,0011 S10,11 S10,011 S10,111 220 | 00001011: S09,110 S06,110 S06,1110 S00,111000 S00,111100 S00,11000 S00,11100 S02,1100 S02,11100 S02,101100 S02,11100 
S02,11100 S02,1011100 S02,111100 S02,111100 221 | 10001011: S02,11110 S00,111100 S00,11110 S12,1111 S14,11111 S09,1111 S10,11111 S06,1111 S06,11111 S06,111011 S06,11111 S04,1111 S04,111011 S04,11111 S04,11111 222 | 01001011: S02,01110 S00,011100 S00,01110 S12,0111 S14,01111 S09,0111 S10,01111 S06,0111 S06,01111 S06,011011 S06,01111 S04,0111 S04,011011 S04,01111 S04,01111 223 | 11001011: S10,1110 S06,10110 S06,1110 S00,011100 S00,111000 S00,01100 S00,11000 S02,001100 S02,01100 S02,1100 S02,11100 S02,0011100 S02,11100 S02,011100 S02,111100 224 | 00101011: S02,10110 S00,101100 S00,10110 S12,1011 S14,11011 S09,1011 S10,11011 S06,1011 S06,11011 S06,101011 S06,11011 S04,1011 S04,101011 S04,11011 S04,11011 225 | 10101011: S04,11110 S12,11110 S14,11110 S02,111100 S01,11110 S02,1101100 S01,110110 S00,11011000 S00,1101100 S00,110110 S00,1101101 S00,1111000 S00,11110 S00,111100 S00,111101 226 | 01101011: S04,01110 S12,01110 S14,01110 S02,011100 S01,01110 S02,0101100 S01,010110 S00,01011000 S00,0101100 S00,010110 S00,0101101 S00,0111000 S00,01110 S00,011100 S00,011101 227 | 11101011: S01,1011 S00,1011 S00,10111 S14,01011 S12,1011 S10,01011 S09,1011 S06,001011 S06,01011 S06,1011 S06,11011 S04,001011 S04,1011 S04,01011 S04,11011 228 | 00011011: S02,00110 S00,001100 S00,00110 S12,0011 S14,10011 S09,0011 S10,10011 S06,0011 S06,10011 S06,100011 S06,10011 S04,0011 S04,100011 S04,10011 S04,10011 229 | 10011011: S04,10110 S12,10110 S14,10110 S02,101100 S01,10110 S02,1001100 S01,100110 S00,10011000 S00,1001100 S00,100110 S00,1001101 S00,1011000 S00,10110 S00,101100 S00,101101 230 | 01011011: S04,00110 S12,00110 S14,00110 S02,001100 S01,00110 S02,0001100 S01,000110 S00,00011000 S00,0001100 S00,000110 S00,0001101 S00,0011000 S00,00110 S00,001100 S00,001101 231 | 11011011: S01,0011 S00,0011 S00,00111 S14,00011 S12,0011 S10,00011 S09,0011 S06,000011 S06,00011 S06,0011 S06,10011 S04,000011 S04,0011 S04,00011 S04,10011 232 | 00111011: S10,0110 S06,00110 S06,0110 S00,1110 S00,111101 S00,110 
S00,11101 S01,110 S01,1110 S01,10110 S01,1110 S01,1110 S01,101110 S01,11110 S01,11110 233 | 10111011: S01,1111 S00,1111 S00,11111 S14,11011 S12,1111 S10,11011 S09,1111 S06,110011 S06,11011 S06,1111 S06,11111 S04,110011 S04,1111 S04,11011 S04,11111 234 | 01111011: S01,0111 S00,0111 S00,01111 S14,01011 S12,0111 S10,01011 S09,0111 S06,010011 S06,01011 S06,0111 S06,01111 S04,010011 S04,0111 S04,01011 S04,01111 235 | 11111011: S09,110 S06,110 S06,1110 S00,011101 S00,1110 S00,01101 S00,110 S01,00110 S01,0110 S01,110 S01,1110 S01,001110 S01,1110 S01,01110 S01,11110 236 | 00000111: S09,010 S06,010 S06,1010 S00,101000 S00,110100 S00,01000 S00,10100 S02,0100 S02,10100 S02,100100 S02,10100 S02,10100 S02,1010100 S02,110100 S02,110100 237 | 10000111: S02,11010 S00,110100 S00,11010 S12,1101 S14,11101 S09,1101 S10,11101 S06,1101 S06,11101 S06,111001 S06,11101 S04,1101 S04,111001 S04,11101 S04,11101 238 | 01000111: S02,01010 S00,010100 S00,01010 S12,0101 S14,01101 S09,0101 S10,01101 S06,0101 S06,01101 S06,011001 S06,01101 S04,0101 S04,011001 S04,01101 S04,01101 239 | 11000111: S10,1010 S06,10010 S06,1010 S00,010100 S00,101000 S00,00100 S00,01000 S02,000100 S02,00100 S02,0100 S02,10100 S02,0010100 S02,10100 S02,010100 S02,110100 240 | 00100111: S02,10010 S00,100100 S00,10010 S12,1001 S14,11001 S09,1001 S10,11001 S06,1001 S06,11001 S06,101001 S06,11001 S04,1001 S04,101001 S04,11001 S04,11001 241 | 10100111: S04,11010 S12,11010 S14,11010 S02,110100 S01,11010 S02,1100100 S01,110010 S00,11001000 S00,1100100 S00,110010 S00,1100101 S00,1101000 S00,11010 S00,110100 S00,110101 242 | 01100111: S04,01010 S12,01010 S14,01010 S02,010100 S01,01010 S02,0100100 S01,010010 S00,01001000 S00,0100100 S00,010010 S00,0100101 S00,0101000 S00,01010 S00,010100 S00,010101 243 | 11100111: S01,1001 S00,1001 S00,10011 S14,01001 S12,1001 S10,01001 S09,1001 S06,001001 S06,01001 S06,1001 S06,11001 S04,001001 S04,1001 S04,01001 S04,11001 244 | 00010111: S02,00010 S00,000100 S00,00010 S12,0001 S14,10001 S09,0001 
S10,10001 S06,0001 S06,10001 S06,100001 S06,10001 S04,0001 S04,100001 S04,10001 S04,10001 245 | 10010111: S04,10010 S12,10010 S14,10010 S02,100100 S01,10010 S02,1000100 S01,100010 S00,10001000 S00,1000100 S00,100010 S00,1000101 S00,1001000 S00,10010 S00,100100 S00,100101 246 | 01010111: S04,00010 S12,00010 S14,00010 S02,000100 S01,00010 S02,0000100 S01,000010 S00,00001000 S00,0000100 S00,000010 S00,0000101 S00,0001000 S00,00010 S00,000100 S00,000101 247 | 11010111: S01,0001 S00,0001 S00,00011 S14,00001 S12,0001 S10,00001 S09,0001 S06,000001 S06,00001 S06,0001 S06,10001 S04,000001 S04,0001 S04,00001 S04,10001 248 | 00110111: S10,0010 S06,00010 S06,0010 S00,1010 S00,110101 S00,010 S00,10101 S01,010 S01,1010 S01,10010 S01,1010 S01,1010 S01,101010 S01,11010 S01,11010 249 | 10110111: S01,1101 S00,1101 S00,11011 S14,11001 S12,1101 S10,11001 S09,1101 S06,110001 S06,11001 S06,1101 S06,11101 S04,110001 S04,1101 S04,11001 S04,11101 250 | 01110111: S01,0101 S00,0101 S00,01011 S14,01001 S12,0101 S10,01001 S09,0101 S06,010001 S06,01001 S06,0101 S06,01101 S04,010001 S04,0101 S04,01001 S04,01101 251 | 11110111: S09,010 S06,010 S06,1010 S00,010101 S00,1010 S00,00101 S00,010 S01,00010 S01,0010 S01,010 S01,1010 S01,001010 S01,1010 S01,01010 S01,11010 252 | 00001111: S00 S01 S01,1 S06,100 S06,110 S06,00 S06,10 S10,0 S10,10 S10,100 S10,10 S10,10 S10,1010 S10,110 S10,110 253 | 10001111: S10,1101 S06,11001 S06,1101 S00,1111 S00,111111 S00,111 S00,11111 S01,111 S01,1111 S01,11101 S01,1111 S01,1111 S01,111011 S01,11111 S01,11111 254 | 01001111: S10,0101 S06,01001 S06,0101 S00,0111 S00,011111 S00,011 S00,01111 S01,011 S01,0111 S01,01101 S01,0111 S01,0111 S01,011011 S01,01111 S01,01111 255 | 11001111: S00,11 S01,10 S01,1 S06,010 S06,100 S06,00 S06,00 S10,000 S10,00 S10,0 S10,10 S10,0010 S10,10 S10,010 S10,110 256 | 00101111: S10,1001 S06,10001 S06,1001 S00,1011 S00,110111 S00,101 S00,11011 S01,101 S01,1101 S01,10101 S01,1101 S01,1011 S01,101011 S01,11011 S01,11011 257 | 10101111: S01,111 
S00,111 S00,1111 S10,110 S09,11 S10,1100 S09,110 S06,11000 S06,1100 S06,110 S06,1101 S06,1100 S06,11 S06,110 S06,111 258 | 01101111: S01,011 S00,011 S00,0111 S10,010 S09,01 S10,0100 S09,010 S06,01000 S06,0100 S06,010 S06,0101 S06,0100 S06,01 S06,010 S06,011 259 | 11101111: S09,101 S06,101 S06,1011 S00,010111 S00,1011 S00,01011 S00,101 S01,00101 S01,0101 S01,101 S01,1101 S01,001011 S01,1011 S01,01011 S01,11011 260 | 00011111: S10,0001 S06,00001 S06,0001 S00,0011 S00,100111 S00,001 S00,10011 S01,001 S01,1001 S01,10001 S01,1001 S01,0011 S01,100011 S01,10011 S01,10011 261 | 10011111: S01,101 S00,101 S00,1011 S10,100 S09,10 S10,1000 S09,100 S06,10000 S06,1000 S06,100 S06,1001 S06,1000 S06,10 S06,100 S06,101 262 | 01011111: S01,001 S00,001 S00,0011 S10,000 S09,00 S10,0000 S09,000 S06,00000 S06,0000 S06,000 S06,0001 S06,0000 S06,00 S06,000 S06,001 263 | 11011111: S09,001 S06,001 S06,0011 S00,000111 S00,0011 S00,00011 S00,001 S01,00001 S01,0001 S01,001 S01,1001 S01,000011 S01,0011 S01,00011 S01,10011 264 | 00111111: S00,01 S01,00 S01,0 S06,1 S06,111 S06 S06,11 S09 S09,1 S09,10 S09,1 S09,1 S09,101 S09,11 S09,11 265 | 10111111: S09,111 S06,111 S06,1111 S00,110111 S00,1111 S00,11011 S00,111 S01,11001 S01,1101 S01,111 S01,1111 S01,110011 S01,1111 S01,11011 S01,11111 266 | 01111111: S09,011 S06,011 S06,0111 S00,010111 S00,0111 S00,01011 S00,011 S01,01001 S01,0101 S01,011 S01,0111 S01,010011 S01,0111 S01,01011 S01,01111 267 | 11111111: S00 S01 S01,1 S06,011 S06,1 S06,01 S06 S09,00 S09,0 S09 S09,1 S09,001 S09,1 S09,01 S09,11 268 | -------------------------------------------------------------------------------- /von-neumann-debias-tables/dice_extract_1rolls_116states.txt: -------------------------------------------------------------------------------- 1 | Randomness extraction table for dice rolls 2 | 3 | * Start in state S000 4 | * Roll one die (the same die every time) 5 | * Find the state you are in in the column corresponding to the roll you made. 
6 | * Go to the new state listed there, and output the bits after the comma, if any 7 | * Repeat 8 | 9 | Produces 0.680556 bits per roll. 10 | 11 | Roll: 1 2 3 4 5 6 12 | S000: S022 S023 S024 S007 S007 S025 13 | S001: S065 S066 S067 S030 S030 S068 14 | S002: S055 S056 S057 S026 S026 S058 15 | S003: S008 S008,0 S061 S028 S028 S062 16 | S004: S008,1 S008 S063 S029 S029 S064 17 | S005: S031 S032 S006,0 S009 S009 S009,0 18 | S006: S049 S050 S051 S001,0 S001,0 S001,00 19 | S007: S040,1 S041,1 S042,1 S002 S002 S002,0 20 | S008: S010 S011 S002,0 S017,0 S017,0 S048,0 21 | S009: S012,1 S013,1 S001,10 S019 S019 S052 22 | S010: S027 S027,0 S097 S059 S059 S098 23 | S011: S027,1 S027 S099 S060 S060 S100 24 | S012: S038 S038,0 S104 S078 S078 S105 25 | S013: S038,1 S038 S109 S079 S079 S110 26 | S014: S006 S006,0 S031,0 S034 S034 S034,0 27 | S015: S006,1 S006 S032,0 S035 S035 S035,0 28 | S016: S031,1 S032,1 S006 S036 S036 S036,0 29 | S017: S069 S070 S020,0 S037 S037 S037,0 30 | S018: S074 S075 S021,0 S009,1 S009,1 S009 31 | S019: S080 S082 S033,0 S039 S039 S039,0 32 | S020: S014,1 S015,1 S016,1 S000,0 S000,0 S000,00 33 | S021: S089 S090 S091 S001,01 S001,01 S001,0 34 | S022: S002 S002,0 S010,0 S040,0 S040,0 S085,0 35 | S023: S002,1 S002 S011,0 S041,0 S041,0 S086,0 36 | S024: S010,1 S011,1 S002 S042,0 S042,0 S087,0 37 | S025: S085,1 S086,1 S087,1 S002,1 S002,1 S002 38 | S026: S014,10 S015,10 S016,10 S000 S000 S000,0 39 | S027: S003 S004 S000,0 S005,00 S005,00 S018,00 40 | S028: S017,1 S017,10 S088,1 S010 S010 S010,0 41 | S029: S017,11 S017,1 S092,1 S011 S011 S011,0 42 | S030: S014,1 S015,1 S016,1 S000,1 S000,1 S000,10 43 | S031: S019 S019,0 S095 S012,0 S012,0 S012,00 44 | S032: S019,1 S019 S096 S013,0 S013,0 S013,00 45 | S033: S014 S015 S016 S000,00 S000,00 S000,000 46 | S034: S001,1 S001,10 S012,10 S049 S049 S089 47 | S035: S001,11 S001,1 S013,10 S050 S050 S090 48 | S036: S012,11 S013,11 S001,1 S051 S051 S091 49 | S037: S003,1 S004,1 S000,10 S005,1 S005,1 S018,1 50 | S038: S003,1 
S004,1 S000,10 S005,0 S005,0 S018,0 51 | S039: S003,10 S004,10 S000,100 S005 S005 S018 52 | S040: S020 S020,0 S069,0 S071 S071 S071,0 53 | S041: S020,1 S020 S070,0 S072 S072 S072,0 54 | S042: S069,1 S070,1 S020 S073 S073 S073,0 55 | S043: S021 S021,0 S074,0 S034,1 S034,1 S034 56 | S044: S021,1 S021 S075,0 S035,1 S035,1 S035 57 | S045: S074,1 S075,1 S021 S036,1 S036,1 S036 58 | S046: S006,1 S006,10 S031 S076 S076 S076,0 59 | S047: S006,11 S006,1 S032 S077 S077 S077,0 60 | S048: S101 S102 S053,0 S037,1 S037,1 S037 61 | S049: S033 S033,0 S080,0 S081 S081 S081,0 62 | S050: S033,1 S033 S082,0 S083 S083 S083,0 63 | S051: S080,1 S082,1 S033 S084 S084 S084,0 64 | S052: S106 S107 S054,0 S039,1 S039,1 S039 65 | S053: S043,1 S044,1 S045,1 S000,01 S000,01 S000,0 66 | S054: S043 S044 S045 S000,001 S000,001 S000,00 67 | S055: S000 S000,0 S003,0 S014,00 S014,00 S043,00 68 | S056: S000,1 S000 S004,0 S015,00 S015,00 S044,00 69 | S057: S003,1 S004,1 S000 S016,00 S016,00 S045,00 70 | S058: S043,10 S044,10 S045,10 S000,1 S000,1 S000 71 | S059: S005,10 S005,100 S046,10 S003 S003 S003,0 72 | S060: S005,101 S005,10 S047,10 S004 S004 S004,0 73 | S061: S002,1 S002,10 S010 S088,0 S088,0 S113,0 74 | S062: S048,1 S048,10 S113,1 S010,1 S010,1 S010 75 | S063: S002,11 S002,1 S011 S092,0 S092,0 S114,0 76 | S064: S048,11 S048,1 S114,1 S011,1 S011,1 S011 77 | S065: S000,1 S000,10 S003,10 S014,0 S014,0 S043,0 78 | S066: S000,11 S000,1 S004,10 S015,0 S015,0 S044,0 79 | S067: S003,11 S004,11 S000,1 S016,0 S016,0 S045,0 80 | S068: S043,1 S044,1 S045,1 S000,11 S000,11 S000,1 81 | S069: S005,1 S005,10 S046,1 S003,0 S003,0 S003,00 82 | S070: S005,11 S005,1 S047,1 S004,0 S004,0 S004,00 83 | S071: S000,1 S000,10 S003,10 S014,1 S014,1 S043,1 84 | S072: S000,11 S000,1 S004,10 S015,1 S015,1 S044,1 85 | S073: S003,11 S004,11 S000,1 S016,1 S016,1 S045,1 86 | S074: S052 S052,0 S115 S012,01 S012,01 S012,0 87 | S075: S052,1 S052 S116 S013,01 S013,01 S013,0 88 | S076: S001,11 S001,110 S012,1 S095 S095 S115 89 | 
S077: S001,111 S001,11 S013,1 S096 S096 S116 90 | S078: S005,1 S005,10 S046,1 S003,1 S003,1 S003,10 91 | S079: S005,11 S005,1 S047,1 S004,1 S004,1 S004,10 92 | S080: S005 S005,0 S046 S003,00 S003,00 S003,000 93 | S081: S000,10 S000,100 S003,100 S014 S014 S043 94 | S082: S005,1 S005 S047 S004,00 S004,00 S004,000 95 | S083: S000,101 S000,10 S004,100 S015 S015 S044 96 | S084: S003,101 S004,101 S000,10 S016 S016 S045 97 | S085: S053 S053,0 S101,0 S071,1 S071,1 S071 98 | S086: S053,1 S053 S102,0 S072,1 S072,1 S072 99 | S087: S101,1 S102,1 S053 S073,1 S073,1 S073 100 | S088: S020,1 S020,10 S069 S103 S103 S103,0 101 | S089: S054 S054,0 S106,0 S081,1 S081,1 S081 102 | S090: S054,1 S054 S107,0 S083,1 S083,1 S083 103 | S091: S106,1 S107,1 S054 S084,1 S084,1 S084 104 | S092: S020,11 S020,1 S070 S108 S108 S108,0 105 | S093: S021,1 S021,10 S074 S076,1 S076,1 S076 106 | S094: S021,11 S021,1 S075 S077,1 S077,1 S077 107 | S095: S033,1 S033,10 S080 S111 S111 S111,0 108 | S096: S033,11 S033,1 S082 S112 S112 S112,0 109 | S097: S000,1 S000,10 S003 S046,00 S046,00 S093,00 110 | S098: S018,10 S018,100 S093,10 S003,1 S003,1 S003 111 | S099: S000,11 S000,1 S004 S047,00 S047,00 S094,00 112 | S100: S018,101 S018,10 S094,10 S004,1 S004,1 S004 113 | S101: S018,1 S018,10 S093,1 S003,01 S003,01 S003,0 114 | S102: S018,11 S018,1 S094,1 S004,01 S004,01 S004,0 115 | S103: S000,11 S000,110 S003,1 S046,1 S046,1 S093,1 116 | S104: S000,11 S000,110 S003,1 S046,0 S046,0 S093,0 117 | S105: S018,1 S018,10 S093,1 S003,11 S003,11 S003,1 118 | S106: S018 S018,0 S093 S003,001 S003,001 S003,00 119 | S107: S018,1 S018 S094 S004,001 S004,001 S004,00 120 | S108: S000,111 S000,11 S004,1 S047,1 S047,1 S094,1 121 | S109: S000,111 S000,11 S004,1 S047,0 S047,0 S094,0 122 | S110: S018,11 S018,1 S094,1 S004,11 S004,11 S004,1 123 | S111: S000,101 S000,1010 S003,10 S046 S046 S093 124 | S112: S000,1011 S000,101 S004,10 S047 S047 S094 125 | S113: S053,1 S053,10 S101 S103,1 S103,1 S103 126 | S114: S053,11 S053,1 S102 S108,1 
S108,1 S108 127 | S115: S054,1 S054,10 S106 S111,1 S111,1 S111 128 | S116: S054,11 S054,1 S107 S112,1 S112,1 S112 129 | --------------------------------------------------------------------------------
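The lookup procedure described in the preamble of these table files (start in the initial state, roll, find the next state in the column for that roll, emit the bits after the comma, repeat) is mechanical enough to automate. Below is a minimal Python sketch, not part of the repository: `parse_table` and `extract` are hypothetical helper names, and the parser assumes the table has been cleaned into plain rows of the form `S000: S022 S023 S024,0 …` with one column per outcome. For brevity it is demonstrated on a tiny 3-state coin-debiasing table (the classic von Neumann pairing, HT → 0, TH → 1, HH/TT discarded) rather than the full 116-state dice table.

```python
def parse_table(text):
    """Parse rows like 'S000: S022 S023,0 ...' into a dict
    {state: [(next_state, output_bits), ...]}, one entry per outcome."""
    table = {}
    for line in text.strip().splitlines():
        state, entries = line.split(":")
        row = []
        for entry in entries.split():
            # An entry is 'Sxxx' or 'Sxxx,bits'; missing bits mean no output.
            nxt, _, bits = entry.partition(",")
            row.append((nxt, bits))
        table[state.strip()] = row
    return table

def extract(table, rolls, start):
    """Run the state machine over 1-based rolls, concatenating output bits."""
    state, out = start, []
    for roll in rolls:
        state, bits = table[state][roll - 1]
        out.append(bits)
    return "".join(out)

# Toy 2-outcome (coin) table: column 1 = heads, column 2 = tails.
COIN_TABLE = parse_table("""
S0: SH ST
SH: S0 S0,0
ST: S0,1 S0
""")

# H,T then T,H: the first pair yields 0, the second yields 1.
print(extract(COIN_TABLE, [1, 2, 2, 1], start="S0"))  # -> 01
```

The same `extract` loop works unchanged on the dice tables in this directory once their rows are parsed; only the start state (`S000`) and the number of columns (6) differ.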