├── .gitignore ├── README.md ├── download.md ├── paper-results ├── Paralellism.ipynb ├── cache-hitrates.ipynb ├── classification.ipynb ├── intdata-characteristics.ipynb ├── io-distribution.ipynb └── submission-times.ipynb ├── profile.md └── schema.md /.gitignore: -------------------------------------------------------------------------------- 1 | .ipynb_checkpoints 2 | 3 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Snowflake Dataset 2 | 3 | This repository contains documentation for the dataset that accompanies our NSDI 2020 paper, "Building an Elastic Query Engine on Disaggregated Storage". It also includes scripts to aid with processing of the data and to reproduce the analysis results in the paper. 4 | 5 | ## Main Dataset 6 | 7 | The main dataset contains several statistics (timing, I/O, resource usage, etc.) pertaining to ~70 million queries from all customers that ran on [Snowflake](https://www.snowflake.com/) over a 14-day period from Feb 21st 2018 to March 7th 2018. This dataset is available in both CSV and Parquet formats and can be obtained from the [Downloads](download.md) page. 8 | 9 | ### Schema 10 | 11 | Each row corresponds to one unique query, with the columns representing various characteristics of that query. The **queryId** column contains a unique 64-bit identifier for each query. For a detailed description of all of the columns, please refer to the [Schema](schema.md) page. 12 | 13 | ## Auxiliary Time-series Explosion 14 | 15 | We also provide some auxiliary data to make it easier to perform time-series analysis (e.g., how resource utilization varies over time). This data is also available in both CSV and Parquet formats and can be obtained from the [Downloads](download.md) page. 
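As a rough illustration of the kind of time-series analysis this auxiliary data is meant to support, the sketch below counts how many queries are active at each timestamp. It assumes the CSV has a header row with `timestamp` and `queryId` columns, and the sample rows and timestamp format are invented for the example; check the [Schema](schema.md) page for the real layout before relying on it.

```python
import csv
import io
from collections import Counter

# Tiny made-up sample in an assumed (timestamp, queryId) layout.
# The real file's header names and timestamp format may differ.
SAMPLE_CSV = """timestamp,queryId
2018-02-21 00:00:00,101
2018-02-21 00:00:00,102
2018-02-21 00:00:01,102
2018-02-21 00:00:02,102
2018-02-21 00:00:02,103
2018-02-21 00:00:02,104
"""


def active_query_counts(csv_file):
    """Count rows per timestamp, i.e. the number of queries active at each one."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        counts[row["timestamp"]] += 1
    # Sort by timestamp string so the result reads as a time series.
    return dict(sorted(counts.items()))


counts = active_query_counts(io.StringIO(SAMPLE_CSV))
for ts, n in counts.items():
    print(ts, n)
```

With pandas, the same aggregation over a loaded DataFrame `df` would be along the lines of `df.groupby('timestamp')['queryId'].count()`; the standard-library version is used here only to keep the sketch dependency-free.
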
Note that this data can be computed from the main dataset; we are just providing a pre-computed version for convenience. 16 | 17 | ### Schema 18 | 19 | Each row consists of a **timestamp**, **queryId** pair indicating that the query with identifier **queryId** was running/active at timestamp **timestamp**. For every query that is running/active at **timestamp** there will be one such row. So to compute a time-series of how many queries were active at a given timestamp, one could simply do the equivalent of `SELECT COUNT(*) AS queryCount GROUP BY timestamp`. To bring in other query statistics, one can join this data with the main dataset on the **queryId** column. For more details, please refer to the [Schema](schema.md) page. 20 | 21 | ## Scripts 22 | 23 | The [scripts/](scripts/) directory has some helper scripts to aid with dataset manipulation. This includes AWK headers and some sample pandas scripts. In addition, the [paper-results/](paper-results/) directory contains a set of IPython notebooks (written mostly using pandas) that can reproduce all of the results in our NSDI 2020 paper. 24 | 25 | ## Limitations 26 | 27 | ## Privacy Concerns 28 | 29 | All identifiers in the dataset that could potentially reveal a customer's identity have been replaced by pseudo-random numbers to preserve anonymity. Public access to the information in this dataset does not lead to any privacy or other ethical concerns. 30 | 31 | ## Contact 32 | 33 | Midhul Vuppalapati ([midhul@cs.cornell.edu](mailto:midhul@cs.cornell.edu)) 34 | 35 | ## Usage 36 | 37 | Information in this dataset is open to the public for use in research and education purposes. 
Kindly cite the following publication if you are using our dataset: 38 | 39 | ``` 40 | @inproceedings {snowflake-nsdi20, 41 | author = {Midhul Vuppalapati and Justin Miron and Rachit Agarwal and Dan Truong and Ashish Motivala and Thierry Cruanes}, 42 | title = {Building An Elastic Query Engine on Disaggregated Storage }, 43 | booktitle = {17th {USENIX} Symposium on Networked Systems Design and Implementation ({NSDI} 20)}, 44 | year = {2020}, 45 | isbn = {978-1-939133-13-7}, 46 | address = {Santa Clara, CA}, 47 | pages = {449--462}, 48 | url = {https://www.usenix.org/conference/nsdi20/presentation/vuppalapati}, 49 | publisher = {{USENIX} Association}, 50 | month = feb, 51 | } 52 | ``` 53 | 54 | -------------------------------------------------------------------------------- /download.md: -------------------------------------------------------------------------------- 1 | The dataset is available for download from the following links. 2 | 3 | ## Main Dataset 4 | 5 | - Main Dataset CSV (gzip compressed): http://www.cs.cornell.edu/~midhul/snowset/snowset-main.csv.gz 6 | - Main Dataset Parquet (tar + gzip compressed): http://www.cs.cornell.edu/~midhul/snowset/snowset-main.parquet.tar.gz 7 | 8 | ## Auxiliary Time-series Explosion 9 | 10 | - Auxiliary Time-series Explosion CSV (gzip compressed): http://www.cs.cornell.edu/~midhul/snowset/ts-explosion.csv.gz 11 | - Auxiliary Time-series Explosion Parquet (tar + gzip compressed): http://www.cs.cornell.edu/~midhul/snowset/ts-explosion.parquet.tar.gz 12 | -------------------------------------------------------------------------------- /paper-results/cache-hitrates.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "# Cache hit-rate distributions\n", 10 | "import pandas as pd\n", 11 | "import matplotlib.pyplot as plt\n", 12 | "from matplotlib import 
colors\n", 13 | "import numpy as np\n", 14 | "%matplotlib inline" 15 | ] 16 | }, 17 | { 18 | "cell_type": "code", 19 | "execution_count": 2, 20 | "metadata": {}, 21 | "outputs": [], 22 | "source": [ 23 | "# Load persistent data I/O stats of queries\n", 24 | "df = pd.read_parquet('~/snowflake-analysis/snowset-main.parquet', columns = ['persistentReadBytesCache', \n", 25 | " 'persistentReadBytesS3', \n", 26 | " 'persistentWriteBytesS3',\n", 27 | " 'persistentWriteBytesCache'],\n", 28 | " engine='fastparquet')" 29 | ] 30 | }, 31 | { 32 | "cell_type": "code", 33 | "execution_count": 5, 34 | "metadata": {}, 35 | "outputs": [], 36 | "source": [ 37 | "# Calculate total amount of persistent data bytes read and written\n", 38 | "df['persistentReadBytes'] = df['persistentReadBytesCache'] + df['persistentReadBytesS3']\n", 39 | "# Cache is write-through\n", 40 | "df['persistentWriteBytes'] = df['persistentWriteBytesS3']\n", 41 | "# Assign query classes\n", 42 | "df['ronly'] = (df['persistentWriteBytes'] == 0)\n", 43 | "df['wonly'] = (df['persistentReadBytes'] == 0)\n", 44 | "df['rw'] = ((df['persistentReadBytes'] > 0) & (df['persistentWriteBytes'] > 0))" 45 | ] 46 | }, 47 | { 48 | "cell_type": "code", 49 | "execution_count": 7, 50 | "metadata": {}, 51 | "outputs": [], 52 | "source": [ 53 | "# Eliminate W-only queries\n", 54 | "df.query('ronly or rw', inplace=True)" 55 | ] 56 | }, 57 | { 58 | "cell_type": "code", 59 | "execution_count": 9, 60 | "metadata": {}, 61 | "outputs": [], 62 | "source": [ 63 | "# Calculate hitrate\n", 64 | "df['hitrate'] = df['persistentReadBytesCache'] / df['persistentReadBytes']" 65 | ] 66 | }, 67 | { 68 | "cell_type": "code", 69 | "execution_count": 20, 70 | "metadata": {}, 71 | "outputs": [], 72 | "source": [ 73 | "def ecdf(x):\n", 74 | " xs = np.sort(x)\n", 75 | " ys = np.arange(1, len(xs)+1)/float(len(xs))\n", 76 | " return xs, ys\n", 77 | "\n", 78 | "def weighted_ecdf(x, w):\n", 79 | " d = pd.DataFrame({'x': x, 'w': w})\n", 80 | " 
d.sort_values(by='x', inplace=True)\n", 81 | " xs = d['x']\n", 82 | " ys = np.cumsum(d['w']) / np.sum(d['w'])\n", 83 | " return xs,ys\n" 84 | ] 85 | }, 86 | { 87 | "cell_type": "code", 88 | "execution_count": 23, 89 | "metadata": {}, 90 | "outputs": [ 91 | { 92 | "data": { 93 | "text/plain": [ 94 | "Text(0.5, 1.0, 'R-only queries')" 95 | ] 96 | }, 97 | "execution_count": 23, 98 | "metadata": {}, 99 | "output_type": "execute_result" 100 | }, 101 | { 102 | "data": { 103 | "image/png": "\n", 104 | "text/plain": [ 105 | "
" 106 | ] 107 | }, 108 | "metadata": { 109 | "needs_background": "light" 110 | }, 111 | "output_type": "display_data" 112 | } 113 | ], 114 | "source": [ 115 | "# Plot per-class hit-rate CDFs\n", 116 | "plt.plot(*ecdf(df[df['ronly']]['hitrate']), label='Fraction of queries')\n", 117 | "plt.plot(*weighted_ecdf(df[df['ronly']]['hitrate'], df[df['ronly']]['persistentReadBytes']), label='Fraction of bytes')\n", 118 | "plt.gca().set_ylim((0,1))\n", 119 | "plt.gca().legend(bbox_to_anchor=(1.04,1), loc=\"upper left\")\n", 120 | "plt.gca().set_xlabel('Cache hit rate')\n", 121 | "plt.gca().set_ylabel('Fraction')\n", 122 | "plt.gca().set_title('R-only queries')" 123 | ] 124 | }, 125 | { 126 | "cell_type": "code", 127 | "execution_count": 24, 128 | "metadata": {}, 129 | "outputs": [ 130 | { 131 | "data": { 132 | "text/plain": [ 133 | "Text(0.5, 1.0, 'RW queries')" 134 | ] 135 | }, 136 | "execution_count": 24, 137 | "metadata": {}, 138 | "output_type": "execute_result" 139 | }, 140 | { 141 | "data": { 142 | "image/png": "\n", 143 | "text/plain": [ 144 | "
" 145 | ] 146 | }, 147 | "metadata": { 148 | "needs_background": "light" 149 | }, 150 | "output_type": "display_data" 151 | } 152 | ], 153 | "source": [ 154 | "plt.plot(*ecdf(df[df['rw']]['hitrate']), label='Fraction of queries')\n", 155 | "plt.plot(*weighted_ecdf(df[df['rw']]['hitrate'], df[df['rw']]['persistentReadBytes']), label='Fraction of bytes')\n", 156 | "plt.gca().set_ylim((0,1))\n", 157 | "plt.gca().legend(bbox_to_anchor=(1.04,1), loc=\"upper left\")\n", 158 | "plt.gca().set_xlabel('Cache hit rate')\n", 159 | "plt.gca().set_ylabel('Fraction')\n", 160 | "plt.gca().set_title('RW queries')" 161 | ] 162 | }, 163 | { 164 | "cell_type": "code", 165 | "execution_count": null, 166 | "metadata": {}, 167 | "outputs": [], 168 | "source": [] 169 | } 170 | ], 171 | "metadata": { 172 | "kernelspec": { 173 | "display_name": "Python 3 (ipykernel)", 174 | "language": "python", 175 | "name": "python3" 176 | }, 177 | "language_info": { 178 | "codemirror_mode": { 179 | "name": "ipython", 180 | "version": 3 181 | }, 182 | "file_extension": ".py", 183 | "mimetype": "text/x-python", 184 | "name": "python", 185 | "nbconvert_exporter": "python", 186 | "pygments_lexer": "ipython3", 187 | "version": "3.7.5" 188 | } 189 | }, 190 | "nbformat": 4, 191 | "nbformat_minor": 4 192 | } 193 | -------------------------------------------------------------------------------- /paper-results/classification.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "import pandas as pd\n", 10 | "import matplotlib.pyplot as plt\n", 11 | "from matplotlib import colors\n", 12 | "import numpy as np\n", 13 | "%matplotlib inline" 14 | ] 15 | }, 16 | { 17 | "cell_type": "code", 18 | "execution_count": 2, 19 | "metadata": {}, 20 | "outputs": [], 21 | "source": [ 22 | "# Load persistent data I/O stats of queries\n", 23 | "df = 
pd.read_parquet('../snowset-main.parquet', columns = ['persistentReadBytesCache', \n", 24 | " 'persistentReadBytesS3', \n", 25 | " 'persistentWriteBytesS3'],\n", 26 | " engine='fastparquet')" 27 | ] 28 | }, 29 | { 30 | "cell_type": "code", 31 | "execution_count": 3, 32 | "metadata": {}, 33 | "outputs": [], 34 | "source": [ 35 | "# Calculate total amount of persistent data bytes read and written\n", 36 | "df['persistentReadBytes'] = df['persistentReadBytesCache'] + df['persistentReadBytesS3']\n", 37 | "# Cache is write-through\n", 38 | "df['persistentWriteBytes'] = df['persistentWriteBytesS3'] " 39 | ] 40 | }, 41 | { 42 | "cell_type": "code", 43 | "execution_count": 4, 44 | "metadata": {}, 45 | "outputs": [ 46 | { 47 | "data": { 48 | "text/plain": [ 49 | "Text(0.5, 1.0, 'Persistent Data I/O scatter-plot')" 50 | ] 51 | }, 52 | "execution_count": 4, 53 | "metadata": {}, 54 | "output_type": "execute_result" 55 | }, 56 | { 57 | "data": { 58 | "image/png": "\n", 59 | "text/plain": [ 60 | "
" 61 | ] 62 | }, 63 | "metadata": { 64 | "needs_background": "light" 65 | }, 66 | "output_type": "display_data" 67 | } 68 | ], 69 | "source": [ 70 | "# Scatter-plot for persistent read/write bytes\n", 71 | "def logScatterPlot(x, y):\n", 72 | " x_bins = np.logspace(np.log10(min(x)), np.log10(max(x)), 150)\n", 73 | " y_bins = np.logspace(np.log10(min(y)), np.log10(max(y)), 150)\n", 74 | " \n", 75 | " Z, xedges, yedges = np.histogram2d(x,y,[x_bins,y_bins])\n", 76 | " p = plt.pcolormesh(xedges, yedges, Z.T, norm=colors.LogNorm(vmin=1, vmax=Z.max()),\n", 77 | " cmap='PuBu')\n", 78 | " plt.gcf().colorbar(p)\n", 79 | "\n", 80 | " plt.yscale('log')\n", 81 | " plt.xscale('log')\n", 82 | " \n", 83 | " return plt.gca()\n", 84 | "\n", 85 | "ax = logScatterPlot(df['persistentReadBytes'] + 1, df['persistentWriteBytes'] + 1)\n", 86 | "ax.set_xlabel('Persistent Bytes Read + 1')\n", 87 | "ax.set_ylabel('Persistent Bytes Written + 1')\n", 88 | "ax.set_title('Persistent Data I/O scatter-plot')" 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 5, 94 | "metadata": {}, 95 | "outputs": [], 96 | "source": [ 97 | "# Assign query classes\n", 98 | "df['ronly'] = (df['persistentWriteBytes'] == 0).astype('int')\n", 99 | "df['wonly'] = (df['persistentReadBytes'] == 0).astype('int')\n", 100 | "df['rw'] = ((df['persistentReadBytes'] > 0) & (df['persistentWriteBytes'] > 0)).astype('int')" 101 | ] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "execution_count": 6, 106 | "metadata": {}, 107 | "outputs": [ 108 | { 109 | "data": { 110 | "text/plain": [ 111 | "ronly 20253279\n", 112 | "wonly 10394740\n", 113 | "rw 39814996\n", 114 | "dtype: int64" 115 | ] 116 | }, 117 | "execution_count": 6, 118 | "metadata": {}, 119 | "output_type": "execute_result" 120 | } 121 | ], 122 | "source": [ 123 | "# Compute per-class query counts\n", 124 | "df[['ronly', 'wonly', 'rw']].sum()" 125 | ] 126 | }, 127 | { 128 | "cell_type": "code", 129 | "execution_count": null, 130 | "metadata": {}, 
131 | "outputs": [], 132 | "source": [] 133 | } 134 | ], 135 | "metadata": { 136 | "kernelspec": { 137 | "display_name": "Python 3 (ipykernel)", 138 | "language": "python", 139 | "name": "python3" 140 | }, 141 | "language_info": { 142 | "codemirror_mode": { 143 | "name": "ipython", 144 | "version": 3 145 | }, 146 | "file_extension": ".py", 147 | "mimetype": "text/x-python", 148 | "name": "python", 149 | "nbconvert_exporter": "python", 150 | "pygments_lexer": "ipython3", 151 | "version": "3.7.5" 152 | } 153 | }, 154 | "nbformat": 4, 155 | "nbformat_minor": 4 156 | } 157 | -------------------------------------------------------------------------------- /paper-results/intdata-characteristics.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "# Intermediate data characteristics\n", 10 | "import pandas as pd\n", 11 | "import matplotlib.pyplot as plt\n", 12 | "from matplotlib import colors\n", 13 | "import numpy as np\n", 14 | "%matplotlib inline" 15 | ] 16 | }, 17 | { 18 | "cell_type": "code", 19 | "execution_count": 2, 20 | "metadata": {}, 21 | "outputs": [], 22 | "source": [ 23 | "# Load intermediate data I/O characteristics\n", 24 | "df = pd.read_parquet('~/snowflake-analysis/snowset-main.parquet', columns = ['intDataNetSentBytes', \n", 25 | " 'intDataNetSentRequests',\n", 26 | " 'persistentReadBytesCache', \n", 27 | " 'persistentReadBytesS3', \n", 28 | " 'persistentWriteBytesS3',\n", 29 | " 'userCpuTime',\n", 30 | " 'systemCpuTime'],\n", 31 | " engine='fastparquet')" 32 | ] 33 | }, 34 | { 35 | "cell_type": "code", 36 | "execution_count": 3, 37 | "metadata": {}, 38 | "outputs": [], 39 | "source": [ 40 | "# Calculate total amount of persistent data bytes read and written\n", 41 | "df['persistentReadBytes'] = df['persistentReadBytesCache'] + df['persistentReadBytesS3']\n", 42 | "# Cache is 
write-through\n", 43 | "df['persistentWriteBytes'] = df['persistentWriteBytesS3'] \n", 44 | "# Assign query classes\n", 45 | "df['ronly'] = (df['persistentWriteBytes'] == 0)\n", 46 | "df['wonly'] = (df['persistentReadBytes'] == 0)\n", 47 | "df['rw'] = ((df['persistentReadBytes'] > 0) & (df['persistentWriteBytes'] > 0))" 48 | ] 49 | }, 50 | { 51 | "cell_type": "code", 52 | "execution_count": 4, 53 | "metadata": {}, 54 | "outputs": [], 55 | "source": [ 56 | "def ecdf(x):\n", 57 | " xs = np.sort(x)\n", 58 | " ys = np.arange(1, len(xs)+1)/float(len(xs))\n", 59 | " return xs, ys" 60 | ] 61 | }, 62 | { 63 | "cell_type": "code", 64 | "execution_count": 5, 65 | "metadata": {}, 66 | "outputs": [ 67 | { 68 | "data": { 69 | "text/plain": [ 70 | "Text(0, 0.5, 'Fraction of Queries')" 71 | ] 72 | }, 73 | "execution_count": 5, 74 | "metadata": {}, 75 | "output_type": "execute_result" 76 | }, 77 | { 78 | "data": { 79 | "image/png": "\n", 80 | "text/plain": [ 81 | "
" 82 | ] 83 | }, 84 | "metadata": { 85 | "needs_background": "light" 86 | }, 87 | "output_type": "display_data" 88 | } 89 | ], 90 | "source": [ 91 | "# Plot per-class CDFs of intermediate data exchanged over network \n", 92 | "plt.plot(*ecdf(df[df['ronly']]['intDataNetSentBytes']), label='R-only')\n", 93 | "plt.plot(*ecdf(df[df['wonly']]['intDataNetSentBytes']), label='W-only')\n", 94 | "plt.plot(*ecdf(df[df['rw']]['intDataNetSentBytes']), label='RW')\n", 95 | "plt.gca().set_xscale('log')\n", 96 | "plt.gca().set_xlim((1,10**14))\n", 97 | "plt.gca().legend(bbox_to_anchor=(1.04,1), loc=\"upper left\")\n", 98 | "plt.gca().set_xlabel('Intermediate Data Exchanged (Bytes)')\n", 99 | "plt.gca().set_ylabel('Fraction of Queries')" 100 | ] 101 | }, 102 | { 103 | "cell_type": "code", 104 | "execution_count": 6, 105 | "metadata": {}, 106 | "outputs": [ 107 | { 108 | "data": { 109 | "text/plain": [ 110 | "Text(0, 0.5, 'Fraction of Queries')" 111 | ] 112 | }, 113 | "execution_count": 6, 114 | "metadata": {}, 115 | "output_type": "execute_result" 116 | }, 117 | { 118 | "data": { 119 | "image/png": "\n", 120 | "text/plain": [ 121 | "
" 122 | ] 123 | }, 124 | "metadata": { 125 | "needs_background": "light" 126 | }, 127 | "output_type": "display_data" 128 | } 129 | ], 130 | "source": [ 131 | "# Plot per-class CDFs of number of network requests for intermediate data exchange \n", 132 | "plt.plot(*ecdf(df[df['ronly']]['intDataNetSentRequests']), label='R-only')\n", 133 | "plt.plot(*ecdf(df[df['wonly']]['intDataNetSentRequests']), label='W-only')\n", 134 | "plt.plot(*ecdf(df[df['rw']]['intDataNetSentRequests']), label='RW')\n", 135 | "plt.gca().set_xscale('log')\n", 136 | "plt.gca().set_xlim((1,10**12))\n", 137 | "plt.gca().legend(bbox_to_anchor=(1.04,1), loc=\"upper left\")\n", 138 | "plt.gca().set_xlabel('Number of requests')\n", 139 | "plt.gca().set_ylabel('Fraction of Queries')" 140 | ] 141 | }, 142 | { 143 | "cell_type": "code", 144 | "execution_count": 7, 145 | "metadata": {}, 146 | "outputs": [ 147 | { 148 | "data": { 149 | "text/plain": [ 150 | "Text(0, 0.5, 'Fraction of Queries')" 151 | ] 152 | }, 153 | "execution_count": 7, 154 | "metadata": {}, 155 | "output_type": "execute_result" 156 | }, 157 | { 158 | "data": { 159 | "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAdsAAAEGCAYAAAAt2j/FAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8li6FKAAAgAElEQVR4nO3deZwU1bn/8c/Ty+wDMwPDvssmoqggqKAiiWsMRs1iNIkao8aoN9uNV69er4k3/jRmNRoVvYmSq4ka17hvaBI3VkGQbdhB1plh9q27z++PqoFhnKWBqWkGvu/Xq19dderUqacLZp6pqtPnmHMOERERCU4o1QGIiIgc7JRsRUREAqZkKyIiEjAlWxERkYAp2YqIiAQskuoA9lbPnj3dkCFDUh2GiEiXMm/evB3OucJUx3Go6nLJdsiQIcydOzfVYYiIdClmti7VMRzKdBtZREQkYEq2IiIiAVOyFRERCViXe2YrIiKpM2/evF6RSOQhYCy6YGuUABbHYrHvjB8/fltLFZRsRUQkaZFI5KE+ffocXlhYWBoKhTS4PpBIJGz79u1jtmzZ8hAwvaU6gf1VYmZ/NLNtZra4le1mZnebWZGZLTKzY4OKRUREOszYwsLCciXa3UKhkCssLCzDu9pvuU6Ax38YOLON7WcBI/zXlcB9AcYiIiIdI6RE+1n+OWk1pwZ2G9k59w8zG9JGlXOBmc6b4+8DM8szs77Ouc1BxSTSIuegoQbqKqC+EuINkIiBi0MiDi7hv8ebvMcgkWhW1qRuIuYt7z7InsdrtWxfyjuiDecXOZxzxEmQaPruEiRwJJwjQYI4joRLeOXOedtIEPeXG8vjOG+LgwQO5xIk2HPZOYfzyxIu0WTZAbvbiyUcsUSCWKLxmJBIePUdzjuG31Zx/2ne8RMJ7/P4n8vtiqOxfXYds/Gze8vNy/wYnQP/M3q7O/+Y3rrz26axLruPc+PkKziiT7/P/rvJISGVz2z7AxuarG/0yz6TbM3sSryrXwYNGtQpwUmKOQcN1VBfDbFaiNd777G6Jsv1EK/bXRav9xJlvN4va/DqNVR7SbS+CuoqvaS661Xuvbv43ocI1BvUmlFnIerMqDWj3n+vM6PBoN4vi5kRAxr85QaDGOavQwP+uxkxfzlmRhyI+/vG/fXd5RD36yYw4ub11IhjJPzlPcrNdm13BnH2LE+Ydcy/X6qtWpjqCPbgnLGqZLqSbQcJh8PjR4wYUROPx23gwIF1TzzxxJqePXvu/Q9xM8uXL08755xzRqxcuXJJR8TZVJfoIOWcmwHMAJgwYYJuXxwo4jEvkTW+6hvfq5q8/CTXWN5Q7V1FNn9v3LehZvd+dMA/tYUhLQfSsne/MrpB9lBIz4X0bpCes/s9LYdYKMT6+jLW1pWyvaGS7Q2V7IhVUNxQyY6GCnbUl7OzoYraRP3+x+eLWJhIKNLkPUIktOdyiBAhC2OE/FcYXAjDvG0uRMSFcBg4wzkDDJfwlhOEcAkj4Yx4HGIJiMeNWBwaEhCLQUPce8XjACGvHULgQt56Y9v+NgjhGrc1lmN+fSNEmGg4RDgUJhoKEw2HiYTDRCxEyEKEwyHCFiIcihCyEJFQiHAoRCQcJisaISstQkY0SkY0QmY0QmZahIxwhGg0TMjMr2/ePiEjFAoRDYUIhcxb39WmETavnhmEzDCMcMgwM0JmhC2EARbyz24ohJl59QkRDrGrbshC/rsRMnatm78cNq/dxvalY6WnpyeWLVv2CcD5558/5K677iq88847t6Q6rrakMtluAgY2WR/gl0lnScShtgxqSlt57fxsWX3l7sSYaNi740UyIJrlvzIhzV9Oz4Wc3k3KmyXHSDqE0733SDqE07y2di3777uWo7vXQ+E9QnDOUdFQQWltKaW1pRTXFlNcU8zW6q1srFjJqs2rWF22moYmn80w8jPyKcwspGfuAIZl9iA/PZ+MSAYZkQzSw+m7X5F0MsK
7y9LCaURDUcJEqY8Z9XGjrt6oazBqG6CiJkF5dZyy2jjlNTHKahoor22gvKaB8tqY/95ATX2cWGL//vhIj4RIi4RIj4TJTg+TlRahW1qYrPQI2WneenZ6mMy0MJnRMOmRMOmREOnREGnhxn13t5EW2V2+x7ZweFdZOKREI8E6/vjjqxYtWpTZ0rbly5enXXLJJUNKSkoiPXr0iM2cOXPtiBEj6i+44IIhubm58YULF2Zv3749etttt2287LLLSpvuO2HChFF33333+hNPPLEGYPz48aPuueee9SeccELNvsSZymT7PHCtmf0VmASU6XntPorVJZcom2+vK2u73fRukJkHmfmQkQfd+npl0azdiTKa6b+yd5el5XjLTZNmNOszia8jxBNxyurLKKkpobTmU4priymrLaOsvoyddTspqyvblVR31OygpKaEmIt9pp2QheiV1YvhecM5od8JjMgfwbDuw+id1Zv8jHwioQjOOcprYmyrqGVbRR07Kusoq2yaFGOU1zZQURujsraByroKKvxtVfXt3+HKSY/QPTNKbkaEbplR+udlcnjfXLplRMnyE2BmWpj0aJgsfzkzLUxGJLwrIaY3TYSNr3CIaFhXWNLxfvK3hQNXbKnI6sg2R/bJrb7ry+M2tF8TYrEYs2bNyr388st3tLT96quvHnTxxRcXX3fddcW//e1ve1x99dUD33jjjVUAW7dujc6dO3fZRx99lHHeeecNb55sL7nkkh0PPfRQzxNPPHHDokWL0uvq6kL7mmghwGRrZn8BpgI9zWwj8N9AFMA5dz/wEnA2UARUA5cFFUuXVVsO5Z9CxWao2AIVn3rv5f571Tao2uFdZbbGQl6ybEyY2YXQc+Tusj1eeU3qdveuEFPAOUdxbTGbKjexrXobJTUllNSVeAm1rpSS2hJKa733nXU7SezREWm3zEgm+en5dE/vTo/MHozMH0nPzJ4UZBSQl55HbjSPCLlYPJdELIeymjglVfXsKK3nww11vFxVR3FVEaVV9ZRWN7Czur7Vq8v0SIjcjCjdMiLkZkbJTY/QKzeDbD+Bds+MkpMRISc9TE56lOz0MDnpEfKyohRkp9MtI0IkrPEBRJJRV1cXGj169JitW7dGDzvssNovfelL5S3VW7BgQfbLL7+8CuDqq68u+elPfzqgcdv06dN3hsNhxo8fX1tcXPyZX3aXXnpp6V133dW3rq5u4/3339/zoosuajGhJyvI3shfb2e7A64J6vhd1vYV8MG9sPINKN/42e0Z3SG3L+T2gR6HecmztYSZmQ9puRA68H6J18Xr2FSxifUV61lfvp4NFRvYULGBjZUb2Vy5mfoWnod2S+tGQUYBBRkFDOk2hGN6HUNBRgH5Gfn0yOhB97Q8opaLi2dTXZtGWTUUV9ZRXFlPcXkdWz+t55Oqenb4ZZV15cBnf0bNIC8zSo+cdAqy0zisMIf87Cj5WWkUZKfRq1sGhTnpFOamk5flXYmmRzr+ql3kQJfsFWhHa3xmW1FREZo6deqIO+64o9fNN9+87brrruv/+uuvdwdofKbbmoyMjF1/OTf2IG8qNzc3cdJJJ5U/9thjec8//3zBggUL2myvPV2ig9QhoWoHzPo5zHvEe9Y46kzocznkDfISa2OCTctOdaRJa7xCXVGygpU7V7KmbA0bKjawvmI9W6u2+l+m8ORGcxnYbSAj80dy6sBT6Zvdl95ZfcmwAlw8h4a6LEqr4xRX1rOzpp6ysga2bW5gRXUDpdX1FFfWs6OyhFii+DNxhAwKstPpmZNGj5w0jhqQ5y1np5GfnUZBVhp5fiLtkZNGXmZUV5kiXUBubm7i7rvvXv+Vr3xl+H/8x39s+/3vf7+JJn1/jjnmmKqHHnoo/5prril54IEHCiZMmFC5N+1/97vf3XHBBRcMP+644yoLCwv3q7ezkm2qlW2COQ/C7IcgVgMTvg1Tb4DsnqmObK/UxGpYtXMVK0tXsqJ0xa730rrdj0EKMgo
YkDOAI/KPYXKvvmRZb8KJnrj6nlRUp7GjvJ5Nm+r4yL/yLKupxnvCsKdwyOieGSUvM0r3rCi9ctM5vG83euWm0zMnnR45afTMSadXrndlmp+VRkgddUQOSpMnT64ZPXp0zYwZMwquueaakqbb7r///vXf+ta3hvzud7/r09hBam/aPumkk6qzs7Pjl1122X7dQgawli6fD2QTJkxwXX7y+LJNsOJlWPoCrHnHKzt8Oky7GXqOSG1sSXDOsbFiI3O3zmXe1nl8vONj1pSt2XWlGrV08iKDyLIBRGL9aKjtTVVFISXlUcprP9s5CSA3PULP3HQKc9Lpmesly4Js772nnzwbb+t2y4ios4/IXjKzec65CfvbzsKFC9eOGzduv5NPV7B27dro1KlTR61atWpxONz+o6KFCxf2HDdu3JCWtunKtrPUlMLip2HhX2HjbK+s4DA46cdwzDcgf0hKw2tPaW0Zf1/5Ju9s+CdLSuZTFff+gAwlsnG1g6mtmkairi/x2j64hgJKCNEtI+Iny3QG906j8DBvuad/BVqYuzuRZkT1zFNEDhz33HNPj//5n//pf/vtt29IJtG2R8k2aIk4vPd7+Mdd3ndUCw+Hz90Co8/xegUfQFdo5bUNbCipZkNJNetLqllXXMni0rlsjL1NXfRjLBQnEcsmXjWccP1UekUPZ1C3IfTvk03/vEzvlZ9Jv7xMCnPSSYvouaeIdE3XXntt8bXXXvvZTiD7SMk2SLVl8Pg3YM0/YNQX4OR/h37HpDzBVtQ2sOTTcpZvqWDF1gqWfFrOuuIqSqu9gRwsuoNo3jzS8+ZDpIxoOJfDs87k+F7TmNTvaEb26UZhTrpu5YqIJEnJNijxBnjsQu+W8bn3wtEXd3qSTSQc60qqdyXV5VsrWPppOat3VO2q0y0jwug+3Th9bAF16QtYXTuLNZUfE7IQU/pP4bzh53HKgFOIpug7tyIiBwMl2yDUVcLTV8D69+D8h+CorwR+yMbEumjjTj7asJNlmytY8mnZHh2SBhZkMrpPN847pj9jB3RnTN9uRKPVPL7icf6y9C+UVpQyuNtgvn/s95l+2HR6ZfUKPG4RkUOBkm1HK98Mj30Vti6Gs38ZWKJ1zrFqeyVvLt3Gv4p2sHDDzl2JNTMaZmSfXL5wVD/GDejOmH7dGN4rh6y03f/ca8vWMmPJAzy36jnq4nWcPOBkLj3iUib0nqDbwyIiHUzJtiPVlMLM6d5wil9/HEae3mFNF1fWsXDjTj7aUMbCDTtZvKmM4ipvlKXRfXI5Z5yXWI/sn8fI3jktDsrgnGPBtgU8vORh3t7wNpFQhOmHTedbY77FsLxhHRariEiQLr/88oGDBw+uu+WWW7YBTJkyZUT//v3rH3/88XUAV1xxxYD+/fs33HrrrVv39RgdPd2ekm1HicfgqSugZDV86zkYMmW/m1xfXM3D763l7RXbWL3de84aMhjRK5dpo3tx9KA8Th3Vi355LU54sTu0RJw317/JI0seYdGORXRP786VR13JhaMvpGdm1xo8Q0RkypQplU8++WQ+sC0ej1NaWhqprKzc9f2cOXPm5Fx44YUpGUqyNUq2HSGRgBd/BEWvwzm/6ZBEC3DHK0t56eMtnDKykK9OGMgxA/MY27872enJ/bM1xBt4btVz/HHxH9lQsYFBuYO4edLNTB8+ncxI2wlaRORAdeqpp1beeOONAwHmzZuXOWrUqJqtW7dGt2/fHs7JyUmsWrUqY/LkyXsMP5fq6faUbDvCazfD/Edgyo+84RY7SEVtjHED83jk2xP3ar+ES/Di6he596N72VS5iSN6HMGvp/6aaQOnEQ5gmjsROUQ9e81Atn3SoVPs0WtMNV+6t82r0iFDhjSEw2G3cuXKtHfeeSf7+OOPr9q0aVP0rbfeysnPz4+NHDmypulEA5D66faUbPfXJ897s/RMvMobrKID1dTHydqLkZUSLsFb69/iDwv/wMrSlRxecDg3f/5mJvebrE5
PInJQGT9+fOWsWbOy33///Zyf/OQnW9evX5/27rvvZnfv3j0+adKkz0w4kOrp9pRs90dtGbz4Y+g7Ds74+X5/j9Y5x9LNFbyyeDMvL97Cym2VnH1kn3b3S7gEszbM4r6P7mN56XKGdBvCHSfdwVlDzyJkGsVJRALSzhVokE488cTK9957L2fZsmWZxx13XM2wYcPqf/vb3/bOycmJX3rppTsOtOn2lGz3xzu/gKrtcNHj+zzReiLhmL++lBc/3swri7ewuayWkMFxQwq49YtjOPfo/q3uG0vEeGnNS/xp8Z8o2lnE4G6DuX3K7Zw99GzdLhaRg9rJJ59cec899/QZNGhQXSQSoXfv3vHy8vLwypUrM2fOnLnu4osvLuMAmm5PyXZ/bF8GfY6E/sfu1W6xeILZa0t4bclWXlm8hS3ltaSFQ5w8spAffn4kp47uRWFueqv7l9WV8fyq53l06aNsqtzE8Lzh3D7lds4aehaRkP5JReTgN3HixJqdO3dGzj///F3jF48ePbqmqqoq3Ldv389ML5bq6fb0m3l/NNRAerekq6/aXsmf3l3DC4s2s7O6gfSIl2BvOHI0nzu8F7kZrV8dJ1yCOVvm8FzRc7y27jXq4nUcXXg0N0y8gVMGnKJnsiJySIlEIlRWVi5oWvbUU0+tba3+yJEj6z/44IMVzcub71NdXb0AYNSoUfVNv2O7du3aqHPOzjvvvPJ9indfdhJfbTnk9m632ieflvObN1bw+idbSY+EOOOIPpx9ZB9OHlm4x6hOLVlXvo7nip7jhdUvsLlqMznRHM497Fy+MuorjC4Y3VGfREREWtER0+0p2e6r5a/A1o/hyAtarVJW3cCvX1/Onz9YR25GlH+bNpxvnjCkzVvEAA2JBt5c/yZ/WfoX5m+bT8hCnND3BH5w7A+YNmgaGZGMjv40IiLSio6Ybk/Jdl+UfwrPXg29x8Lx32uxyssfb+a/nltCSVUdF04cxPVnjCIvK63NZjdWbOTplU/zbNGzbK/ZzsDcgZoUQETkIKBku7ecg5d+Ag3V8JWHIbLnVWpNfZzrn1rE3xd+yhH9uvHwZccxtn/3NppzzN82n5lLZjJrwyzMjMn9JnPLqFs4qf9J6lUsInIQULLdW0uehmUvwGk/g54j9ti0o7KO7zwyl4Ubd/Lj00Zy9dTDWpwQALzxit/a8BYPL35413jF3znyO3x11Ffpk93+d2tFRKTrULLdW9uWeu8nXLtH8artlVz6p9lsr6jjvovHc+bYlhNmVUMVz6x8hkeXPsrGyo0MyBmg8YpFRA5yGl5ob9TshCXPQv4QaHJ79+ONZVxw33tU18X5yxXHt5ho6+J1PLjoQU578jTunHMnhVmF/OqUX/HCeS/wtdFfU6IVEUlSOBweP3r06DEjRow4Ytq0acN37NgRBjjttNMO+/Of/5zXWG/IkCFjr7/++r6N62ecccZhjzzySF5LbQZNyTZZ8Rg8dTmUroUv3r2reH1xNd/844dkp0V4+nsncsyg/D12c87x5ro3OffZc7l7wd1M6DOBx85+jJlnzeT0IafrmayIyF5KT09PLFu27JOVK1cuycvLi911112FACeccELlu+++mwOwZcuWcFZWVnz27NnZjfstWLAg+9RTT92rkaM6ipJtsl6/BYregLN/AcNOAbyRoL7/+ALiCcdjV0xicI/sPXYpKi3iitev4Adv/4DMSCYPnv4gd0+7myMLj0zFJxAROej4M/6kAZx00kmVc+bMyQZ46623cs4444yy4uLiaCKRYNmyZWnp6emJQYMGfWZ0qc6gZ7bJcA5WvgZDTtpjCr2H/rWGBet38vuvH7NHoq1uqOb+hfcz85OZZEezuXHijXx11Fc1lKKIHFT+693/GlhUWtShU+wNzx9efdvk25Ka4CAWizFr1qzcyy+/fAfAlClTqlesWJFZW1tr7777bs6pp55asWbNmvQFCxZkzJ49O2v8+PFVHRnr3tBv//Y4B2/cCsU
r4Zhv7CpeV1zFb99YweljevPFcf12lf9r07+47f3b+LTqU84fcT4/OPYH5Gfkt9CwiIjsi7q6utDo0aPHbN26NXrYYYfVfulLXyoHyMzMdCNGjKh99913s+bOnZt96623bikqKkp/5513chYsWJB14oknpuQWMijZtu8fv4R3f+td0U7+/q7i215YStiMn557BAA7anbwizm/4OU1LzO0+1AePvNhxvcen6qoRUQCl+wVaEdrfGZbUVERmjp16og77rij180337wNYOLEiZWzZs3KqaqqChcWFsanTJlS9bvf/a7X4sWLs6699trtqYgX9My2bfVVsGAm9DsGvvDrXfPVzl1bwhtLt/K9U4fTPcvx4KIH+eIzX+SNdW9w9bir+dsX/6ZEKyISsNzc3MTdd9+9/g9/+EPvhoYGACZPnlz5yCOPFI4ZM6YaYNKkSdXz58/P3rx5c9qECRNqUhVroMnWzM40s+VmVmRmN7SwfZCZzTKzBWa2yMzODjKevVJTCo98Eco2wkk/3mNi+D+9t5aC7DSGDV7DOc+c4/Uy7j2Bp6c/zfeO/h5p4baHZRQRkY4xefLkmtGjR9fMmDGjAGDatGmVGzduTD/++OOrAKLRKD169IiNHTu2al8nEegIgd1GNrMwcC9wGrARmGNmzzvnms5wfzPwhHPuPjMbA7wEDAkqpr3ywf2waR58+U9w+Bd3FW8rr+X1FYsZNOINrv/nAkYXjOZXU3/FMb2OSWGwIiKHjsZp8Bq99dZbRY3L/fv3jznn5jXdPnv27OWdFVtrgnxmOxEocs6tBjCzvwLnAk2TrQMaJ4TtDnwaYDzJm/MQvHMnDJ4MR5y3x6ZfvPN30gb/hnLS+PH4H3Px4RcTDbc+D62IiEiQybY/0PTh+UZgUrM6twKvmdl1QDbw+ZYaMrMrgSsBBg0a1OGB7uHTBfDqTTBwElz85B63j19f+wav7riTNMvhxfOepjCrMNhYRETkoJDqDlJfBx52zg0Azgb+bGafick5N8M5N8E5N6GwMMAEV7wK/vQFyOoJ590Pabu/O/vgogf50Ts/JF5XyI/H3qNEKyKHqkQikbD2qx1a/HOSaG17kMl2EzCwyfoAv6ypy4EnAJxz7wMZQM8AY2pd6Vr445ngEt4VbcHQXZseXfoody+4m542gbRt1/GVozUClIgcshZv3769uxLubolEwrZv394dWNxanSBvI88BRpjZULwkeyFwUbM664HPAQ+b2eF4ybbzvwdVthEePgfqyuGyl6D3mF2bXlj9AnfMvoMp/U7hzXdO56KJg0mLpPqGgIhIasRise9s2bLloS1btowl9XdHDxQJYHEsFvtOaxUCS7bOuZiZXQu8CoSBPzrnlpjZz4C5zrnngR8DD5rZD/E6S13qnHNBxdSi2jJ4ZDpUboNLX4T+u78fW9VQxT0L7mFE/ghO6f7vvBxbwgXHDujU8EREDiTjx4/fBkxPdRxdTaAjSDnnXsL7Ok/TsluaLH8CTA4yhjbFY/DY17xbyN98BgYet2tTTayGq16/ii1VW/jD5//AjFe3MbAgk7H9u7XenoiISAsO7VsAr9wA69+Hc+/ZNZNPo0eXPsrC7Qu565S7OCLvON4t2sHZY/tipscUIiKydw7dZLvoCZjzIEz6Lhy956PkDzd/yH0f3cfEPhM5bfBpvPrJFhrijrOP7NtKYyIiIq07NJNtyRr4+/dhwEQ4/X/22FQbq+Xmd2+mX04/fnXKrwB4YdFmBuRnctSA7qmIVkREurhDM9m+fD1gcP4D0GT0p4RL8KO3f8TWqq38x8T/IC8jj53V9bxbtINzjuqnW8giIrJPDr1ku3WJNxH8iddBwbA9Nr24+kX+uemf/HD8D5nSfwoA/yraQTzhOG1Mr1REKyIiB4FDL9l+8jxgMOmqPYrXla/j5x/+nKN6HsXFh1+8q3zWsu3kZUU5eqAmgBcRkX1zaCVb5+DjJ7xxj7MKdhXXxmr58ds/JhKK8Mt
TfrlrirxYPMHby7dxyshCwiHdQhYRkX3TbrI1s8lmlu0vf8PMfm1mg4MPLQAbZkPJahh/yR7Fd8y+g+Wly7l9yu30zdnd43jO2lKKq+o544g+nR2piIgcRJK5sr0PqDazcXgjPq0CZgYaVVBWvgoWhpFn7ip6/9P3eWrlU1w+9nJOHnDyHtVfXbKFtEiIU0Zq0gEREdl3ySTbmD+E4rnAPc65e4HcYMMKyOaF0GvMrlvI9fF6bv/wdvrn9Oe74767R1XnHK9/spWTR/QkOz3QgbZEROQgl0yyrTCzG4FvAi/6U+B1zdnSty2DXofvWn14ycOsLV/LTZNuIiOSsUfVNTuq2LSzhlNGqReyiIjsn2SS7deAOuDbzrkteFPl3RVoVEGo3AblG3cl2w0VG5ixaAanDT6Nkwac9Jnqs5Z7kw9N1S1kERHZT+0mWz/BPgWk+0U7gGeCDCoQ79/rvY84HYD7F96Pc47rj7u+xepvL9/GsMJsBhZkdVaEIiJykEqmN/IVwN+AB/yi/sCzQQbV4WpKYc5DMOZL0Gcsy0uW8/dVf+eiwy+iT/ZnexpX1cX4cHUJ03QLWUREOkAyt5GvwZsGrxzAObcS6FpZaPnLUF/pjRoFzFg0g8xIJt85suV5ft8t2kF9PMG00V3rY4qIyIEpmWRb55yrb1wxswjeRO9dx5JnIKc39DuWtWVreW3da1x8+MV0T295YoEPVpeQHgkxYUhBi9tFRET2RjLJ9h0z+08g08xOA54E/h5sWB1o0zxvLOTjroBQiJmfzCQainLR4Re1ust7q3YwfnA+aZFDa4AtEREJRjLZ5AZgO/AxcBXwEnBzkEF1qLfvgIw8mHQV26u382zRs5w7/Fx6ZvZssfqOyjqWbalg8vCWt4uIiOytdkdrcM4lgAf9V9eyca53VTvtvyCjG/837zfEXZxvH/HtVnd5b1UxgJKtiIh0mFaTrZk94Zz7qpl9TAvPaJ1zRwUaWUd4507IzIdJV1HdUM2Ty5/kc4M+x8BuA1vd5b2iHeRmRDiyvyaKFxGRjtHWle33/fdzOiOQDrdtmXdVO/VGSM/l+WV/paKhgm+N+Vabu72/uphJQ3tolh8REekwrSZb5wMmlg4AABrvSURBVNxmMwsDDzvnTu3EmDrG/EcgFIXjvoNzjidWPMHhBYczrnBcq7tsK69lXXE135jUNSc1EhGRA1ObHaScc3EgYWZd655qvAEWPQGjzoLsnqwoXcHK0pVcMOICzFq/Yp27rhSACUM0UbyIiHScZKazqQQ+NrPXgarGQufcvwUW1f766DGo3gFHe1/veXnNy4QtzGlDTmtztzlrS8iIhjiiX9f620JERA5sySTbp/3XgaH8U3j1JkjEwcUhEfOWG9crt8HqWTBwEow4A+ccr6x9heP7Hk9BRtuDVMxZW8LRA/P0/VoREelQyXz15xEzywQGOeeWd0JMbavcCnP/BKEIhELeu4Uh5L8imTD5B97QjKEQS3YsZlPlJq466qo2m62qi7F0cwXfm3pYJ30QERE5VLSbbM3si8AvgTRgqJkdDfzMOTc96OBaFM2Emz5Nuvora14hEoowbdC0Nust3LiTeMJx7CA9rxURkY6VzP3SW4GJwE4A59xHwLAAY+owCZfglbWvMKXflFbHQW403+8cdcygvM4ITUREDiHJJNsG51xZs7JEEMF0tPlb57O1eitnDT2r3bpz1pYyolcOeVlpnRCZiIgcSpJJtkvM7CIgbGYjzOz3wHsBx9Uhnil6huxoNlMHTm2zXjzhmL+uVLP8iIhIIJJJttcBRwB1wF/w5rX9QTKNm9mZZrbczIrM7IZW6nzVzD4xsyVm9liygbentLaUV9e+ytlDzyYrmtVm3aJtlVTUxRg/WM9rRUSk4yXTG7kauMl/Jc0ffepe4DRgIzDHzJ53zn3SpM4I4EZgsnOu1Mw6bLb2x5Y9Rl28josPv7jduvPXe89rlWxFRCQIyfRGnkXLExG03b3
[base64-encoded PNG plot output omitted]\n", 160 | "text/plain": [ 161 | "
" 162 | ] 163 | }, 164 | "metadata": { 165 | "needs_background": "light" 166 | }, 167 | "output_type": "display_data" 168 | } 169 | ], 170 | "source": [ 171 | "# Plot per-class CDFs of network request sizes for intermediate data exchange\n", 172 | "# Plot per-class CDFs of number of network requests for intermediate data exchange \n", 173 | "plt.plot(*ecdf((df[df['ronly']]['intDataNetSentBytes'] / df[df['ronly']]['intDataNetSentRequests'])\n", 174 | " .replace([np.inf, -np.inf, np.nan], 0)), label='R-only')\n", 175 | "plt.plot(*ecdf((df[df['wonly']]['intDataNetSentBytes'] / df[df['wonly']]['intDataNetSentRequests'])\n", 176 | " .replace([np.inf, -np.inf, np.nan], 0)), label='W-only')\n", 177 | "plt.plot(*ecdf((df[df['rw']]['intDataNetSentBytes'] / df[df['rw']]['intDataNetSentRequests'])\n", 178 | " .replace([np.inf, -np.inf, np.nan], 0)), label='RW')\n", 179 | "plt.gca().set_xlim((0,18000))\n", 180 | "plt.gca().legend(bbox_to_anchor=(1.04,1), loc=\"upper left\")\n", 181 | "plt.gca().set_xlabel('Bytes per request')\n", 182 | "plt.gca().set_ylabel('Fraction of Queries')" 183 | ] 184 | }, 185 | { 186 | "cell_type": "code", 187 | "execution_count": 8, 188 | "metadata": {}, 189 | "outputs": [ 190 | { 191 | "data": { 192 | "text/plain": [ 193 | "(1, 100000000000000)" 194 | ] 195 | }, 196 | "execution_count": 8, 197 | "metadata": {}, 198 | "output_type": "execute_result" 199 | }, 200 | { 201 | "data": { 202 | "image/png": "\n", 203 | "text/plain": [ 204 | "
" 205 | ] 206 | }, 207 | "metadata": { 208 | "needs_background": "light" 209 | }, 210 | "output_type": "display_data" 211 | } 212 | ], 213 | "source": [ 214 | "# Scatter-plot of CPU-time vs Intermediate data exchanged\n", 215 | "def logScatterPlot(x, y):\n", 216 | " x_bins = np.logspace(np.log10(min(x)), np.log10(max(x)), 150)\n", 217 | " y_bins = np.logspace(np.log10(min(y)), np.log10(max(y)), 150)\n", 218 | " \n", 219 | " Z, xedges, yedges = np.histogram2d(x,y,[x_bins,y_bins])\n", 220 | " p = plt.pcolormesh(xedges, yedges, Z.T, norm=colors.LogNorm(vmin=1, vmax=Z.max()),\n", 221 | " cmap='Greens')\n", 222 | " plt.gcf().colorbar(p)\n", 223 | "\n", 224 | " plt.yscale('log')\n", 225 | " plt.xscale('log')\n", 226 | " \n", 227 | " return plt.gca()\n", 228 | "\n", 229 | "ax = logScatterPlot(df['userCpuTime'] + df['systemCpuTime'] + 1, df['intDataNetSentBytes'] + 1)\n", 230 | "ax.set_xlabel('Total CPU Time')\n", 231 | "ax.set_ylabel('Intermediate Data Exchanged')\n", 232 | "ax.set_title('CPU time vs Intermediate data scatter-plot')\n", 233 | "ax.set_ylim((1,10**14))\n", 234 | "ax.set_ylim((1,10**14))" 235 | ] 236 | }, 237 | { 238 | "cell_type": "code", 239 | "execution_count": null, 240 | "metadata": {}, 241 | "outputs": [], 242 | "source": [] 243 | } 244 | ], 245 | "metadata": { 246 | "kernelspec": { 247 | "display_name": "Python 3 (ipykernel)", 248 | "language": "python", 249 | "name": "python3" 250 | }, 251 | "language_info": { 252 | "codemirror_mode": { 253 | "name": "ipython", 254 | "version": 3 255 | }, 256 | "file_extension": ".py", 257 | "mimetype": "text/x-python", 258 | "name": "python", 259 | "nbconvert_exporter": "python", 260 | "pygments_lexer": "ipython3", 261 | "version": "3.7.5" 262 | } 263 | }, 264 | "nbformat": 4, 265 | "nbformat_minor": 4 266 | } 267 | -------------------------------------------------------------------------------- /profile.md: -------------------------------------------------------------------------------- 1 | ## Profiling Stats 2 | 3 
| The dataset contains high-fidelity, coarse-grained profiling statistics at a per-query level, which can give insight into where queries spend their time. 4 | 5 | ### Overview 6 | 7 | A profiler thread is forked in every worker process to monitor the activity of all threads of that process. Every 10 ms, the profiler thread wakes up, scans all active worker threads to read their states, and records various statistics and metrics. 8 | 9 | The profiling time breakdowns are available along two dimensions: 1.) by resource and 2.) by operator. 10 | 11 | ### By Resource (prof\*) 12 | 13 | These statistics provide a breakdown of CPU time into the following: 1.) Time during which the CPU is busy (**profCpu**), 2.) Time during which the CPU is idle (**profIdle**), and 3.) Time during which the CPU is blocked on various resources/activities (**prof\*** other than **profCpu** and **profIdle**). The accounting for these stats is done as follows: 14 | 15 | Every time the profiler thread wakes up (every 10 ms), it detects how many threads are running versus waiting for a resource (such as disk I/O) and computes metrics such as CPU usage, CPU time wasted waiting on resources, and CPU idle time. Every active worker thread updates its state to report whether it is __running__ or __waiting for a resource__ (and if so, which resource it is waiting on). Because this state is self-reported, it does not say whether a thread is currently scheduled on a CPU: the OS may have context-switched it out even if the thread thinks it is running. Wait time is therefore estimated, assuming the worker process is given 100% of the CPUs of its DOP (the maximum number of cores a worker process of the query can use; this number is available in the **perServerCores** field in the dataset). 16 | 17 | A worker forks more threads than its DOP. A thread waiting for a resource is not necessarily counted as a blocked state if all the CPUs are in use by other worker threads. 
In that case the workload is still CPU bound. 18 | 19 | * If more threads claim to be __running__ than the DOP, we assume they are sharing the CPUs and simply context switching. In that case we report 100% busy (profCpu is incremented to account for the full time slice). 20 | * If fewer threads claim to be __running__ than the DOP, we assume some CPUs are not used. In that case, the threads in the __waiting__ state are examined, all the wait types (prof\* other than profCpu/profIdle) are tallied, and equally assigned a portion of the unused CPUs. The logic is that if these wait states did not exist, these threads would grab a portion of that unused CPU. 21 | * If fewer threads are __waiting__ or __running__ than the DOP, there are more CPUs than can be used; these are reported as idle CPU time (profIdle). 22 | 23 | The full list of __by resource__ breakdown fields is given below: 24 | 25 | Column | Description | Units | Datatype 26 | :------:|:-----:|:-----:|:-----: 27 | **profIdle**|CPU Idle time|Milliseconds|int64 28 | **profCpu**|CPU busy time|Milliseconds|int64 29 | **profPersistentReadCache**|Time blocked on persistent data read from cache|Milliseconds|int64 30 | **profPersistentWriteCache**|Time blocked on persistent data write to cache|Milliseconds|int64 31 | **profPersistentReadS3**|Time blocked on persistent data read from S3|Milliseconds|int64 32 | **profPersistentWriteS3**|Time blocked on persistent data write to S3|Milliseconds|int64 33 | **profIntDataReadLocalSSD**|Time blocked on intermediate data read from local SSD|Milliseconds|int64 34 | **profIntDataWriteLocalSSD**|Time blocked on intermediate data write to local SSD|Milliseconds|int64 35 | **profIntDataReadS3**|Time blocked on intermediate data read from S3|Milliseconds|int64 36 | **profIntDataWriteS3**|Time blocked on intermediate data write to S3|Milliseconds|int64 37 | **profRemoteExtRead**|?|Milliseconds|int64 38 | **profRemoteExtWrite**|?|Milliseconds|int64 39 | 
**profResWriteS3**|Time blocked on writing result to S3|Milliseconds|int64 40 | **profFsMeta**|Time blocked on filesystem metadata operations|Milliseconds|int64 41 | **profDataExchangeNet**|Time blocked on network for intermediate data exchange|Milliseconds|int64 42 | **profDataExchangeMsg**|Time blocked on network for intermediate data exchange|Milliseconds|int64 43 | **profControlPlaneMsg**|Time blocked on network for communication with control plane (cloud services)|Milliseconds|int64 44 | **profOs**|CPU time spent in kernel mode processing|Milliseconds|int64 45 | **profMutex**|CPU time spent contending for mutexes|Milliseconds|int64 46 | **profSetup**|Time spent in worker process setup|Milliseconds|int64 47 | **profSetupMesh**|Time spent in setting up the mesh of network connections among worker processes|Milliseconds|int64 48 | **profTeardown**|Time spent in teardown operations|Milliseconds|int64 49 | 50 | 51 | ### By Operator (prof\*Rso) 52 | 53 | These statistics give a breakdown of time based on the operator being processed (scan, filter, join, etc.). The full list of these fields is given in the table below. Note that within each operator different resources may be consumed (for example, within profScanRso, some of the time may be spent on CPU and some of it blocked on disk I/O). Unfortunately, an internal breakdown of how much time is spent on what __within each operator__ is not available in the dataset. 
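
As a rough illustration of how the per-operator fields might be used, the sketch below turns the `prof*Rso` values of a single query row into fractions of that row's total operator time. The helper name `operator_shares` and the sample values are hypothetical (they do not come from the dataset or its scripts), and the row is assumed to be a plain dict keyed by column name.

```python
def operator_shares(row):
    """Return each prof*Rso value as a fraction of the row's total operator time.

    Hypothetical helper for illustration; `row` is assumed to be a dict that
    maps column names to integer millisecond values (or None when missing).
    """
    # Keep only the by-operator columns, i.e. names of the form prof<Something>Rso.
    rso = {
        name: value
        for name, value in row.items()
        if name.startswith("prof") and name.endswith("Rso") and value is not None
    }
    total = sum(rso.values())
    if total == 0:
        # No profiled operator time recorded: report all-zero shares.
        return {name: 0.0 for name in rso}
    return {name: value / total for name, value in rso.items()}


# Made-up values in milliseconds, for illustration only.
example_row = {
    "queryId": 1,
    "profScanRso": 600,
    "profHjRso": 300,
    "profAggRso": 100,
    "profCpu": 450,  # by-resource field, ignored by operator_shares
}
shares = operator_shares(example_row)
```

Because the by-resource and by-operator breakdowns are separate dimensions, the sketch deliberately ignores fields such as `profCpu`; it says nothing about which resources were consumed inside each operator.
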
54 | 55 | Column | Description | Units | Datatype 56 | :------:|:-----:|:-----:|:-----: 57 | **profScanRso**|Scan operators profiled time|Milliseconds|int64 58 | **profXtScanRso**|External scan operators profiled time|Milliseconds|int64 59 | **profProjRso**|Projection operators profiled time|Milliseconds|int64 60 | **profSortRso**|Sort operators profiled time|Milliseconds|int64 61 | **profFilterRso**|Filter operators profiled time|Milliseconds|int64 62 | **profResRso**|Result operators profiled time|Milliseconds|int64 63 | **profDmlRso**|DML operators profiled time|Milliseconds|int64 64 | **profHjRso**|Hash-join operators profiled time|Milliseconds|int64 65 | **profBufRso**|Buffer operators profiled time|Milliseconds|int64 66 | **profFlatRso**|Flatten operators profiled time|Milliseconds|int64 67 | **profBloomRso**|Bloom filter operators profiled time|Milliseconds|int64 68 | **profAggRso**|Aggregate operators profiled time|Milliseconds|int64 69 | **profBandRso**|Band-join operators profiled time|Milliseconds|int64 70 | **profPercentileRso**|Percentile operators profiled time|Milliseconds|int64 71 | **profUdtfRso**|User defined table operators profiled time|Milliseconds|int64 72 | **profOtherRso**|Other operators profiled time|Milliseconds|int64 73 | 74 | 75 | 76 | -------------------------------------------------------------------------------- /schema.md: -------------------------------------------------------------------------------- 1 | ## Main dataset 2 | 3 | _Note: Some of the columns in the dataset have missing values which are denoted by "\N" in the CSV version_ 4 | 5 | Each row corresponds to one unique query with the columns representing various characteristics pertaining to that query. The queryId column contains a unique 64-bit identifier for each query. 
For descriptions of the various columns along with details like units and datatypes, please refer to the following table: 6 | 7 | Column | Description | Units | Datatype 8 | :------:|:-----:|:-----:|:-----: 9 | **queryId**|Query identifier (anonymized). Uniquely identifies a given query. ||int64 10 | **warehouseId**|Identifier of warehouse (anonymized) in which the query ran. Queries that ran in the same warehouse will have the same warehouseId||int64 11 | **databaseId**|Unique identifier of database that this query is associated with. (Note: missing values "\N" possible)||string 12 | **createdTime**|Timestamp at which query is created / enters the system||UTC Timestamp 13 | **endTime**|Timestamp at which query is fully complete||UTC Timestamp 14 | **durationTotal**|Total end-to-end duration of query|Milliseconds|int64 15 | **durationExec**|Time spent for actual query execution (worker processes running)|Milliseconds|int64 16 | **durationControlPlane**|Time spent in control plane operations|Milliseconds|int64 17 | **durationCompiling**|Time spent for query compilation|Milliseconds|int64 18 | **execTime**|Query compute execution duration.||int64 19 | **scheduleTime**|Time spent for query to start executing after it entered the system||int64 20 | **serverCount**|Number of servers used for query execution.||int32 21 | **warehouseSize**|Size of warehouse in which this query is executing||int32 22 | **perServerCores**|Maximum number of cores this query is allowed to use for compute||int32 23 | **persistentReadBytesS3**|Persistent data bytes read from S3 (size post-compression)|Bytes|int64 24 | **persistentReadRequestsS3**|Total number of Persistent data read requests to S3||int64 25 | **persistentReadBytesCache**|Persistent data bytes read from cache (persistent data is opportunistically cached in the ephemeral 
storage system) (size post-compression)|Bytes|int64 26 | **persistentReadRequestsCache**|Total number of Persistent data read requests from cache (persistent data is opportunistically cached in the ephemeral storage system)||int64 27 | **persistentWriteBytesCache**|Persistent data bytes written to cache (persistent data is opportunistically cached in the ephemeral storage system) (size post-compression)|Bytes|int64 28 | **persistentWriteRequestsCache**|Total number of Persistent data write requests to cache (persistent data is opportunistically cached in the ephemeral storage system)||int64 29 | **persistentWriteBytesS3**|Persistent data bytes written to S3 (size post-compression)|Bytes|int64 30 | **persistentWriteRequestsS3**|Total number of Persistent data write requests to S3 (persistent data is opportunistically cached in the ephemeral storage system)||int64 31 | **intDataWriteBytesLocalSSD**|Intermediate data bytes spilled to Local SSD (size post-compression)|Bytes|int64 32 | **intDataWriteRequestsLocalSSD**|Total number of intermediate data write requests to Local SSD||int64 33 | **intDataReadBytesLocalSSD**|Intermediate data bytes read from Local SSD (size post-compression)|Bytes|int64 34 | **intDataReadRequestsLocalSSD**|Total number of intermediate data read requests from Local SSD||int64 35 | **intDataWriteBytesS3**|Intermediate data bytes spilled to S3 (size post-compression)|Bytes|int64 36 | **intDataWriteRequestsS3**|Total number of intermediate data write requests to S3||int64 37 | **intDataReadBytesS3**|Intermediate data bytes read from S3 (size post-compression)|Bytes|int64 38 | **intDataReadRequestsS3**|Total number of intermediate data read requests from S3||int64 39 | **intDataWriteBytesUncompressed**|Intermediate data bytes spilled to Local SSD / S3 (size pre-compression)|Bytes|int64 40 | **readBytesRemoteExternal**|?||int64 41 | **readRequestsRemoteExternal**|?||int64 42 | **intDataNetReceivedBytes**|Intermediate data exchange over network received bytes 
(size post-compression)|Bytes|int64 43 | **intDataNetSentBytes**|Intermediate data exchange over network sent bytes (size post-compression)|Bytes|int64 44 | **intDataNetSentRequests**|Intermediate data exchange total number of network requests||int64 45 | **intDataNetSentBytesUncompressed**|Intermediate data exchange over network sent bytes (size pre-compression)|Bytes|int64 46 | **producedRows**|Number of produced rows in output set.||int64 47 | **returnedRows**|Number of rows returned by this job||int64 48 | **fileStolenCount**|Total number of files stolen by worker nodes (work stealing).||int64 49 | **remoteSeqScanFileOps**|?|| 50 | **localSeqScanFileOps**|Sequential scan file operations.||int64 51 | **localWriteFileOps**|Local write file operations.||int64 52 | **remoteSkipScanFileOps**|Number of times fdn files were scanned||int64 53 | **remoteWriteFileOps**|?|| 54 | **filesCreated**|Number of files created by query.||int64 55 | **scanAssignedBytes**|Should be equivalent to scanBytes|Bytes|int64 56 | **scanAssignedFiles**|Total number of files to be scanned after pruning||int64 57 | **scanBytes**|Total number of bytes scanned from files|Bytes|int64 58 | **scanFiles**|Same as scanAssignedFiles||int64 59 | **scanOriginalFiles**|Total number of files in the table before pruning||int64 60 | **userCpuTime**|User CPU time summed across all worker processes|Microseconds|int64 61 | **systemCpuTime**|Kernel CPU time summed across all worker processes|Microseconds|int64 62 | **memoryUsed**|Memory used across all worker processes|Bytes|int64 63 | **profIdle**|CPU Idle time|Milliseconds|int64 64 | **profCpu**|CPU busy time|Milliseconds|int64 65 | **profPersistentReadCache**|Time blocked on persistent data read from cache|Milliseconds|int64 66 | **profPersistentWriteCache**|Time blocked on persistent data write to cache|Milliseconds|int64 67 | **profPersistentReadS3**|Time blocked on persistent data read from S3|Milliseconds|int64 68 | 
**profPersistentWriteS3**|Time blocked on persistent data write to S3|Milliseconds|int64 69 | **profIntDataReadLocalSSD**|Time blocked on intermediate data read from local SSD|Milliseconds|int64 70 | **profIntDataWriteLocalSSD**|Time blocked on intermediate data write to local SSD|Milliseconds|int64 71 | **profIntDataReadS3**|Time blocked on intermediate data read from S3|Milliseconds|int64 72 | **profIntDataWriteS3**|Time blocked on intermediate data write to S3|Milliseconds|int64 73 | **profRemoteExtRead**|?|Milliseconds|int64 74 | **profRemoteExtWrite**|?|Milliseconds|int64 75 | **profResWriteS3**|Time blocked on writing result to S3|Milliseconds|int64 76 | **profFsMeta**|Time blocked on filesystem metadata operations|Milliseconds|int64 77 | **profDataExchangeNet**|Time blocked on network for intermediate data exchange|Milliseconds|int64 78 | **profDataExchangeMsg**|Time blocked on network for intermediate data exchange|Milliseconds|int64 79 | **profControlPlaneMsg**|Time blocked on network for communication with control plane (cloud services)|Milliseconds|int64 80 | **profOs**|CPU time spent in kernel mode processing|Milliseconds|int64 81 | **profMutex**|CPU time spent contending for mutexes|Milliseconds|int64 82 | **profSetup**|Time spent in worker process setup|Milliseconds|int64 83 | **profSetupMesh**|Time spent in setting up the mesh of network connections among worker processes|Milliseconds|int64 84 | **profTeardown**|Time spent in teardown operations|Milliseconds|int64 85 | **profScanRso**|Scan operators profiled time|Milliseconds|int64 86 | **profXtScanRso**|External scan operators profiled time|Milliseconds|int64 87 | **profProjRso**|Projection operators profiled time|Milliseconds|int64 88 | **profSortRso**|Sort operators profiled time|Milliseconds|int64 89 | **profFilterRso**|Filter operators profiled time|Milliseconds|int64 90 | **profResRso**|Result operators profiled time|Milliseconds|int64 91 | **profDmlRso**|DML operators profiled 
time|Milliseconds|int64 92 | **profHjRso**|Hash-join operators profiled time|Milliseconds|int64 93 | **profBufRso**|Buffer operators profiled time|Milliseconds|int64 94 | **profFlatRso**|Flatten operators profiled time|Milliseconds|int64 95 | **profBloomRso**|Bloom filter operators profiled time|Milliseconds|int64 96 | **profAggRso**|Aggregate operators profiled time|Milliseconds|int64 97 | **profBandRso**|Band-join operators profiled time|Milliseconds|int64 98 | **profPercentileRso**|Percentile operators profiled time|Milliseconds|int64 99 | **profUdtfRso**|User defined table operators profiled time|Milliseconds|int64 100 | **profOtherRso**|Other operators profiled time|Milliseconds|int64 101 | --------------------------------------------------------------------------------
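
Two details from the schema table are easy to trip over when loading the CSV version: missing values appear as the literal string `\N`, and `userCpuTime` / `systemCpuTime` are in microseconds while the `prof*` fields are in milliseconds. The sketch below shows one possible way to handle both with Python's standard `csv` module. It assumes the file has been given a header row with the column names above (the real files may need the header supplied separately), and the helper names `clean_row` and `total_cpu_ms`, as well as the sample values, are hypothetical.

```python
import csv
import io

MISSING = "\\N"  # literal backslash-N: the missing-value marker in the CSV version

# Illustrative subset of the integer columns; extend as needed.
INT_COLUMNS = ("userCpuTime", "systemCpuTime", "profCpu")


def clean_row(raw):
    """Map the "\\N" marker to None and parse a few integer columns."""
    row = {name: (None if value == MISSING else value) for name, value in raw.items()}
    for name in INT_COLUMNS:
        if row.get(name) is not None:
            row[name] = int(row[name])
    return row


def total_cpu_ms(row):
    """User + kernel CPU time converted from microseconds to milliseconds."""
    return (row["userCpuTime"] + row["systemCpuTime"]) / 1000


# Made-up sample with an assumed header row, for illustration only.
sample_csv = io.StringIO(
    "queryId,databaseId,userCpuTime,systemCpuTime,profCpu\n"
    "1,\\N,2500000,500000,3000\n"
)
rows = [clean_row(raw) for raw in csv.DictReader(sample_csv)]
```

Converting CPU time to milliseconds before comparing it with `profCpu` keeps the two measurements on the same scale; they come from different mechanisms (process CPU counters versus the sampling profiler), so they should not be expected to match exactly.
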