├── README.md
└── utxo-diff.py
/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 | ## Project Overview
4 |
5 | This project extends the utxo-live project by visualizing the differences between two states of the Bitcoin blockchain. Bitcoin Core 0.20 added the `dumptxoutset` command, which lets users dump the state of the blockchain (the UTXO set) to a file. While utxo-live visualizes individual UTXO sets, this project visualizes the difference (the coins spent) between two UTXO sets.
6 |
7 |
8 |
9 |
10 | Figure description: The image above displays all coins spent between blocks 678336 and 680188 of the Bitcoin blockchain. The heatmap is a two-dimensional histogram showing the output date (x-axis), BTC amount (y-axis), and number of coins spent (color map) in each histogram bin. The histogram resolution is 200 bins (y-axis) by 400 bins (x-axis). Zooming in to the image usually reveals more detail. A daily updating version of this image is running at utxo.live.
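
The binning described above can be sketched as follows. This is a minimal illustration rather than the script's exact code: the function name `spent_histogram`, the edge ranges, and the sample coins are assumptions made for the example, while the log-scaled amounts and the 400 by 200 resolution follow the figure description.

```python
import numpy as np

def spent_histogram(heights, amounts_sat, xedges, yedges):
    """Bin spent coins by output height (x) and log10 of the BTC amount (y)."""
    x = np.asarray(heights, dtype=float)
    y = np.asarray(amounts_sat, dtype=float) * 1e-8  # satoshis -> BTC
    y[y == 0] = 1e-9          # avoid log10(0) for zero-value outputs
    y = np.log10(y)
    hist, _, _ = np.histogram2d(x, y, bins=(xedges, yedges))
    return hist

# Assumed edges: 400 bins over block height, 200 bins over log10(BTC).
xedges = np.linspace(0, 700000, 401)
yedges = np.linspace(-10, 5, 201)

# Three made-up coins: 50 BTC, 10k sat, and a zero-value output.
hist = spent_histogram([100, 650000, 650000],
                       [5_000_000_000, 10_000, 0],
                       xedges, yedges)
print(hist.shape, int(hist.sum()))
```

Each cell of `hist` counts the coins that fall into one (height, amount) bin; the heatmap colors the log of those counts.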
11 |
12 |
13 | ## Privacy & Security
14 |
15 | Because the user executes the `dumptxoutset` command inside Bitcoin Core, the Python script does not interact with Core directly. The script only reads the dump files after they are complete. No private keys, passwords, xpubs, or wallet addresses are exchanged between Core and the Python script.
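
To illustrate that the script only reads bytes from a file, here is a small sketch of parsing the dump header. It mirrors the field sizes that `read_fileheader` in utxo-diff.py reads (32 bytes, then 8, then 4). The function name `read_dump_header`, the name `extra` for the third field, and the little-endian byte order are assumptions for this example, and the header layout may differ in other Core versions.

```python
import io
import struct

def read_dump_header(stream):
    """Read the header fields from an open binary stream."""
    block_hash_b = stream.read(32)                        # base block hash
    coin_count = struct.unpack('<Q', stream.read(8))[0]   # number of coins
    extra = struct.unpack('<I', stream.read(4))[0]        # third field (assumed name)
    block_hash = block_hash_b[::-1].hex()                 # shown byte-reversed as hex
    return block_hash, coin_count, extra

# A made-up header in memory stands in for a real dump file here.
fake = io.BytesIO(bytes(range(32)) + struct.pack('<Q', 12345) + struct.pack('<I', 7))
block_hash, coin_count, extra = read_dump_header(fake)
print(block_hash, coin_count, extra)
```

With a real dump, the same function could be called on `open('678505.dat', 'rb')`; nothing is ever sent back to Core.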
16 |
17 |
18 | ## Requirements
19 | * Bitcoin Core version 0.20 or higher
20 | * Python 3 (preinstalled on many systems; install it from python.org if the `python3` command is not found)
21 |
22 |
23 | ## Instructions for experienced users
24 | * Create a folder called `utxo-live` in a familiar location
25 | * Dump the first utxo set `bitcoin-cli dumptxoutset <PATH>/xxxxxx.dat` where `<PATH>` is the full path to the `utxo-live` folder and `xxxxxx` is the current block height (10-20 min)
26 | (Note: `bitcoin-cli` doesn't ship with Core on macOS; use Window->Console instead)
27 | * Wait until a later block height, and dump the second utxo set by repeating the step above at a larger block number
28 | * Install two python dependencies `python3 -m pip install numpy matplotlib`
29 | * Download utxo-diff.py to your `utxo-live` folder and run it `python3 utxo-diff.py` (20 min)
30 |
31 | ## Step by step instructions
32 | 1. Make sure Bitcoin Core (version 0.20 or higher) is running and synchronized.
33 |
34 | 2. Create a new folder called `utxo-live` in a familiar location on your machine (e.g. in your Documents folder).
35 |
36 | 3. Open a terminal window and display the current folder path. Do this by:
37 |
38 | * Windows: open a terminal (Start -> Command Prompt) and type:
39 | ```sh
40 | echo %cd%
41 | ```
42 |
43 | * Mac/Linux: open a terminal (Mac: Applications -> Utilities -> Terminal) and type:
44 | ```sh
45 | pwd
46 | ```
47 |
48 | 4. Navigate to the `utxo-live` folder using the change directory `cd` command. For example if you're currently in `Users/Steve/` (or on Windows `C:\Users\Steve\`) and you've created the folder `Steve/Documents/bitcoin-tools/utxo-live/` then type:
49 |
50 | ```sh
51 | cd Documents/bitcoin-tools/utxo-live/
52 | ```
53 | Note: Windows sometimes requires forward slashes `/` instead of back slashes `\`.
54 |
55 | 5. Again display the current folder (Step 3) and copy to your clipboard the full path to the `utxo-live` folder. We will be pasting this path into Bitcoin Core soon.
56 |
57 | 6. Leave the terminal window momentarily, and open the Bitcoin Core console window. (Alternatively for bitcoin-cli users, open another terminal window and type the console commands in the next steps as `bitcoin-cli` commands.)
58 |
59 |
60 |
61 | 7. Get the current block count by typing in the console window:
62 |
63 | ```sh
64 | getblockcount
65 | ```
66 | and hitting enter. The output will look like:
67 |
68 |
69 |
70 |
71 | 8. Dump the first utxo set by typing in the console window:
72 |
73 | ```sh
74 | dumptxoutset <PATH>/<BLOCKCOUNT>.dat
75 | ```
76 | where `<PATH>` is copy-pasted from Step 5, and `<BLOCKCOUNT>` is the block count from Step 7. For example, if the block count is 678505, the command (for my path) is:
77 |
78 | ```sh
79 | dumptxoutset /Users/Steve/Documents/bitcoin-tools/utxo-live/678505.dat
80 | ```
81 | If there are no error messages after hitting enter, then it's working. It will take 10-20 minutes. Look in your `utxo-live` folder and you should see the file being created as `xxxxxx.dat.incomplete`.
82 |
83 | 9. While the utxo file is dumping, download utxo-diff.py and install two python dependencies. To do this:
84 |
85 | * Right click on utxo-diff.py, choose "Save Link As" and select the `utxo-live` folder.
86 |
87 | * In the terminal window (not the Bitcoin console), type the following command to install two python dependencies:
88 | ```sh
89 | python3 -m pip install numpy matplotlib
90 | ```
91 |
92 | Note: you might already have these installed, but running the command won't hurt anything.
93 |
94 | 10. If 10-20 minutes have passed, check that the utxo dump is completed. Do this in two ways:
95 |
96 | * Check that the file no longer has `.incomplete` after `xxxxxx.dat`
97 | * Check that the Bitcoin Core console displays the results of the dump as something like:
98 |
99 |
100 |
101 | 11. Decide how long you'd like to wait between block heights, and repeat steps 7-8 at the later height. The minimum height difference is one block. I have yet to find any maximum height difference.
102 |
103 | 12. If both utxo sets have finished dumping and Step 9 is also completed (utxo-diff.py is downloaded and python dependencies were installed), then run utxo-diff.py by typing in the terminal:
104 |
105 | ```sh
106 | python3 utxo-diff.py
107 | ```
108 |
109 | 13. The program will take 30-40 minutes to complete and it will update you on the progress. If there are several `xxxxxx.dat` files in the folder, it will ask you which two you'd like to process. When finished, the image is stored in the folder as `utxo_diff_xxxxxx_to_yyyyyy.png`, where `xxxxxx` and `yyyyyy` are the two block heights.
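
Before running the script, it can help to confirm which dumps in the folder are finished. The sketch below is not part of utxo-diff.py. It is a hypothetical helper, `completed_dumps`, that lists files named `<height>.dat` and skips anything still ending in `.incomplete`.

```python
from pathlib import Path

def completed_dumps(folder):
    """Return sorted (height, filename) pairs for finished dump files."""
    dumps = []
    for p in Path(folder).iterdir():
        # '678505.dat.incomplete' has suffix '.incomplete', so it is skipped.
        if p.is_file() and p.suffix == '.dat' and p.stem.isdigit():
            dumps.append((int(p.stem), p.name))
    return sorted(dumps)

# List finished dumps in the current folder, oldest height first.
for height, name in completed_dumps('.'):
    print(height, name)
```

If fewer than two files are listed, at least one dump is still missing or in progress.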
110 |
111 |
112 |
113 |
--------------------------------------------------------------------------------
/utxo-diff.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | """
4 | Created on Mon Apr 5 13:12:34 2021
5 |
6 | This script takes two utxo sets dumped by Bitcoin Core and visualizes
7 | the differences (coins spent) between the older and newer set.
8 |
9 | @author: jeffrocks
10 | """
11 |
12 |
13 | #imports for runtime
14 | import matplotlib.pyplot as plt
15 | import matplotlib.cm as cm
16 | import numpy as np
17 | import sys
18 | import time
19 | import copy
20 | import struct
21 | import subprocess
22 | from os import walk
23 |
24 |
25 | class UTXO:
26 |
27 | def __init__(self, t_b, o, h, a):
28 | self.txid_b = t_b #keeping txid binary for comparison
29 | self.outnum = o
30 | self.height = h
31 | self.amount = a
32 |
33 | #decide if two utxos are the same
34 | def __eq__(self, other):
35 | if (self.txid_b != other.txid_b):
36 | return False
37 | elif (self.outnum != other.outnum):
38 | return False
39 | else:
40 | return True
41 |
42 | def print(self):
43 | txid = decode_hex256(self.txid_b)
44 | pstr = str(self.outnum)+"\t"+str(txid)
45 | print(pstr)
46 | return(pstr)
47 |
48 |
49 | #decode the binary best block hash (256 bit = 32 bytes)
50 | def decode_hex256(bbhash_b):
51 |
52 | #make sure byte length is 32
53 | assert(len(bbhash_b)==32)
54 |
55 | # call in as 8 uints then convert to 1 hex string
56 | bbhash = ''
57 | a = struct.unpack('>8I', bbhash_b[::-1])
58 | for ai in a:
59 | bbhash += '{0:08x}'.format(ai)
60 |
61 | return bbhash
62 |
63 |
64 | #parse the varint
65 | #code modified from https://github.com/sr-gi/bitcoin_tools/blob/0f6ea45b6368200e481982982822f0416e0c438d/bitcoin_tools/analysis/status/utils.py
66 | def parse_b128(fin):
67 | data = fin.read(1).hex()
68 | more_bytes = int(data, 16) & 0x80
69 | while more_bytes:
70 | tmp = fin.read(1).hex()
71 | data += tmp
72 | more_bytes = int(tmp, 16) & 0x80
73 |
74 | return data
75 |
76 |
77 | #decode the varint
78 | #code modified from https://github.com/sr-gi/bitcoin_tools/blob/0f6ea45b6368200e481982982822f0416e0c438d/bitcoin_tools/analysis/status/utils.py
79 | def b128_decode(data):
80 | n = 0
81 | i = 0
82 | while True:
83 | d = int(data[2 * i:2 * i + 2], 16)
84 | n = n << 7 | d & 0x7F
85 | if d & 0x80:
86 | n += 1
87 | i += 1
88 | else:
89 | return n
90 |
91 |
92 | #decompress a bitcoin amount
93 | #code modified from https://github.com/sr-gi/bitcoin_tools/blob/0f6ea45b6368200e481982982822f0416e0c438d/bitcoin_tools/analysis/status/utils.py
94 | def txout_decompress(x):
95 | if x == 0:
96 | return 0
97 | x -= 1
98 | e = x % 10
99 | x = x // 10
100 | if e < 9:
101 | d = (x % 9) + 1
102 | x = x // 9
103 | n = x * 10 + d
104 | else:
105 | n = x + 1
106 | while e > 0:
107 | n *= 10
108 | e -= 1
109 | return n
110 |
111 | #parse the script portion of the utxo
112 | def parse_script(data_b, data_size, first_byte):
113 |
114 | data = data_b.hex()
115 | if first_byte:
116 | data = first_byte+data
117 | return data
118 |
119 | #to decompress script in future look at
120 | #modified from https://github.com/sr-gi/bitcoin_tools/blob/0f6ea45b6368200e481982982822f0416e0c438d/bitcoin_tools/analysis/status/utils.py#L259
121 |
122 |
123 | # read the utxo dump file header
124 | def read_fileheader(fin):
125 | #get binary base block_hash, coin count, and ? from header info
126 | bbhash_b = fin.read(32)
127 | ccount_b = fin.read(8)
128 | txcount_b = fin.read(4)
129 |
130 | #decode binary bbhash, ccount
131 | bbhash = decode_hex256(bbhash_b)
132 | ccount = struct.unpack('<Q', ccount_b)[0]  # little-endian uint64
133 | txcount = struct.unpack('<I', txcount_b)[0]  # little-endian uint32
134 |
135 | return ccount
136 |
137 |
138 | #read a single UTXO
139 | def get_UTXO(fin):
140 |
141 | ### Read in bytes of utxo
142 |
143 | #read in txid, outnum
144 | txid_b = fin.read(32)
145 | outnum_b = fin.read(4)
146 |
147 | #read the binary stream until stop given by Varint
148 | code = parse_b128(fin)
149 |
150 | #next varint is the utxo amount
151 | amount_v = parse_b128(fin)
152 |
153 | #next varint is the script type
154 | out_type_v = parse_b128(fin)
155 |
156 | #script type must be decoded now because it has variable length
157 | out_type = b128_decode(out_type_v)
158 |
159 | #get data size based on script type
160 | NSPECIALSCRIPTS = 6
161 | first_byte = None
162 | if out_type in [0, 1]:
163 | data_size = 20 # 20 bytes
164 | elif out_type in [2, 3, 4, 5]:
165 | data_size = 32
166 | first_byte = out_type_v[-1] # need previous byte from stream
167 | else:
168 | data_size = (out_type - NSPECIALSCRIPTS) * 1
169 |
170 | #parse script
171 | script_b = fin.read(data_size)
172 |
173 |
174 |
175 | # ### decode txid, outnum, height, coinbase and btc amount
176 |
177 | # decode txid, outnum
178 | # txid = decode_hex256(txid_b)
179 | outnum = struct.unpack('<I', outnum_b)[0]  # little-endian uint32
180 |
181 | # decode the varint to get coinbase and height
182 | code = b128_decode(code)
183 | height = code >> 1
184 | #coinbase = code & 0x01
185 |
186 | # #decode btc amount of utxo
187 | amount = txout_decompress(b128_decode(amount_v))
188 |
189 | utxo = UTXO(txid_b, outnum, height, amount)
190 |
191 | return utxo
192 |
193 | #decode script
194 | #script = parse_script(script_b, data_size, first_byte)
195 |
196 |
197 | #generate histogram of a batch of utxos
198 | def get_Histogram(x,y, xedges, yedges):
199 |
200 | #place batch into histogram
201 | x = np.array(x)
202 | y = np.array(y)*1e-8
203 |
204 | #take log of amounts
205 | y[np.where(y==0)]=1e-9
206 | y = np.log10(y)
207 |
208 | tmp_hist, xedges, yedges = np.histogram2d(x, y, (xedges, yedges))
209 | return tmp_hist, xedges, yedges
210 |
211 |
212 | #ask user to select file name if multiple
213 | def get_filename(path, old_new):
214 |
215 | #get file list of directory
216 | _, _, filenames = next(walk(path))
217 | dat_files = [f for f in filenames if '.dat' in f]
218 |
219 | #check for zero dat files
220 | if not dat_files:
221 | print('\nError, no utxo .dat files found in this directory. '
222 | 'Make sure the utxo dump file from Core is in this directory.')
223 | sys.exit()
224 |
225 | #Need two dat files in directory in order to diff
226 | utxo_fn = './'+dat_files[0]
227 | if len(dat_files)==1:
228 | print('\nError, only one .dat file was found. Need two to compare.')
229 | sys.exit()
230 | else:
231 | print('\nSelect the '+old_new+' utxo file:\n')
232 | for n in range(len(dat_files)):
233 | print(str(n+1)+") "+dat_files[n])
234 | try:
235 | fnum = int(input("Enter a number: "))
236 | utxo_fn = './'+dat_files[fnum-1]
237 | except (ValueError, IndexError):
238 | print('\nError, could not open file. Type the right number?')
239 | sys.exit()
240 |
241 |
242 | #check for incomplete
243 | if 'incomplete' in utxo_fn:
244 | print('\nError, core has not finished dumping the file')
245 | sys.exit()
246 |
247 | #get block height from file name
248 | block_height = 0
249 | try:
250 | block_height = int(utxo_fn[2:-4])
251 | except ValueError:
252 | print('\nError: the file name is not a valid block height')
253 | sys.exit()
254 |
255 | #check for a reasonable block height
256 | if block_height < 600000 or block_height > 6000000: #100 years from now
257 | print('\nError: the file name is not a valid block height')
258 | sys.exit()
259 |
260 | return utxo_fn, block_height
261 |
262 | # %%
263 |
264 | def openImage(path):
265 | imageViewerFromCommandLine = {'linux':'xdg-open',
266 | 'win32':'explorer',
267 | 'darwin':'open'}[sys.platform]
268 | subprocess.run([imageViewerFromCommandLine, path])
269 |
270 |
271 | # %%
272 |
273 | def utxo_lessthan(u2,u1):
274 |
275 | #first check integer segments of the txid
276 | a = struct.unpack('>32B', (u2.txid_b))
277 | b = struct.unpack('>32B', (u1.txid_b))
278 |
279 | #print(a)
280 | #print(b)
281 |
282 | #loop over each integer to check for less than
283 | k = False
284 | r = False
285 | n = 0 #len(a)
286 | while not k and n batch_size:
400 |
401 | #add to histogram
402 | tmphist,xedges,yedges = get_Histogram(spent_heights, spent_amounts, xedges, yedges)
403 | hist += tmphist
404 |
405 | #start fresh spent lists
406 | spent_heights = []
407 | spent_amounts = []
408 | batch_count = 0
409 |
410 | #update user on status
411 | perc_done = (100*u1_count) // coin_count1
412 | if perc_done > last_per_done:
413 | print("Percent completed: "+str(perc_done)+"%")
414 | last_per_done += 1
415 |
416 |
417 | #after loop, add last batch to histogram
418 | if batch_count > 0:
419 | tmphist,xedges,yedges = get_Histogram(spent_heights, spent_amounts, xedges, yedges)
420 | hist += tmphist
421 |
422 |
423 | #close file streams
424 | fin1.close()
425 | fin2.close()
426 |
427 | print("\nFinal step - rendering the heatmap (1 - 2 min)...")
428 |
429 | #plotting histogram
430 | phist = hist
431 |
432 | #non-zero, take logs and rotate hist matrix
433 | phist[np.where(phist==0)]=.01
434 | phist = np.log10(phist)
435 | phist = np.rot90(phist)
436 | phist = np.flipud(phist)
437 |
438 | # get max values
439 | hmax = phist.max()
440 | hmin = phist.min()
441 |
442 | # insert nan from zero value bins
443 | phist[np.where(phist==hmin)]=np.nan
444 |
445 | # get figure handles
446 | plt.clf()
447 | fig = plt.figure(figsize=(8, 6), facecolor='black')
448 | ax = fig.add_axes([.11,.37,.8,.55])
449 |
450 | #color maps for pcolor
451 | my_cmap = copy.copy(cm.gnuplot2)
452 | my_cmap.set_bad(color='black')
453 |
454 | # render the heatmap
455 | im = ax.pcolormesh(phist, vmin=-1, vmax=np.floor(hmax*.8), cmap=my_cmap, label='UTXO Histogram')
456 |
457 | #yaxis format
458 | plt.yticks(np.linspace(0, yres, num=14))
459 | labels = ["100k","10k","1k",'100','10',
460 | "1",".1",'.01','.001','10k sat',
461 | "1k sat","100 sat",'10 sat','0 sat',]
462 | labels.reverse()
463 | ax.set_yticklabels(labels, fontsize=8)
464 | ax.yaxis.set_ticks_position('both')
465 | ax.tick_params(labelright=True)
466 |
467 | #xaxis format
468 | ticks_year=['2009','2010','2011','2012',
469 | '2013','2014','2015','2016',
470 | '2017','2018','2019','2020','2021']
471 | ticks_height = [1,32500,100400,160400,
472 | 214500,278200,336700,391300,
473 | 446200,502100,556500,610800,664100]
474 | ticks_x = []
475 | label_x = []
476 | for n in range(len(ticks_height)):
477 | th = ticks_height[n]
478 | ticks_x.append(np.argmin(np.abs(np.array(xedges)-th)))
479 | label_x.append(ticks_year[n]+"\n"+str(th))
480 |
481 | plt.xticks(ticks_x)
482 | ax.set_xticklabels(label_x, rotation=0, fontsize=6)
483 |
484 | #title and labels
485 | tick_color = "white"
486 | fig_title = " Coins spent from height "+utxo_fn1[2:-4]+" to "+utxo_fn2[2:-4]
487 | tobj = plt.title(fig_title, fontsize=12, loc='left')
488 | plt.setp(tobj, color=tick_color)
489 | ax.set_ylabel('Amount (BTC)', fontsize=8)
490 | ax.spines['bottom'].set_color(tick_color)
491 | ax.spines['top'].set_color(tick_color)
492 | ax.tick_params(axis='x', colors=tick_color)
493 | ax.xaxis.label.set_color(tick_color)
494 | ax.spines['right'].set_color(tick_color)
495 | ax.spines['left'].set_color(tick_color)
496 | ax.tick_params(axis='y', colors=tick_color)
497 | ax.yaxis.label.set_color(tick_color)
498 | ax.set_xlabel("Output time (year, block height)", fontsize=8)
499 |
500 | # Color bar
501 | cbaxes = fig.add_axes([0.72, .925, 0.18, 0.015])
502 | cb = plt.colorbar(im, orientation="horizontal", cax=cbaxes)
503 | cbaxes.set_xlim(-0.01,np.floor(hmax*.8)+.1)
504 | cbaxes.xaxis.set_ticks_position('top')
505 | cbticks = np.arange(int(np.floor(hmax*.8))+1)
506 | cb.set_ticks(cbticks)
507 | clabels = ['1','10','100','1k','10k','100k','1M']
508 | cbaxes.set_xticklabels(clabels[0:len(cbticks)], fontsize=6)
509 | cbaxes.set_ylabel("Number of \noutputs spent", rotation=0, fontsize=6)
510 | cbaxes.yaxis.set_label_coords(-.24,0)
511 | cbaxes.tick_params('both', length=0, width=0, which='major')
512 | cb.outline.set_visible(False)
513 | cbaxes.spines['bottom'].set_color(tick_color)
514 | cbaxes.tick_params(axis='x', colors=tick_color)
515 | cbaxes.yaxis.label.set_color(tick_color)
516 |
517 | # save the image
518 | fig_name = "./utxo_diff_"+str(block_height1)+"_to_"+str(block_height2)+".png"
519 | plt.savefig(fig_name, dpi=1200, bbox_inches='tight', facecolor=fig.get_facecolor(), transparent=True)
520 |
521 | print("\nImage saved as "+fig_name)
522 | print("\nDone")
523 |
524 | #print("\nrun time: ", (time.time() - start_time)/60)
525 |
526 |
527 | # try to open the image automatically
528 | try:
529 | openImage(fig_name)
530 | except Exception:
531 | sys.exit()
532 |
533 |
534 |
535 |
--------------------------------------------------------------------------------