├── missing-data-slides.pdf ├── .gitignore ├── README.md ├── 01_unit_missingness.ipynb ├── solutions ├── 01_unit_missingness.ipynb └── 00_interactive_plot.ipynb ├── 00_interactive_plot.ipynb └── 02_item_missingness.ipynb /missing-data-slides.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/matthewbrems/missing-data-workshop/HEAD/missing-data-slides.pdf -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # OSX DS Store 2 | .DS_Store 3 | 4 | # Byte-compiled / optimized / DLL files 5 | __pycache__/ 6 | *.py[cod] 7 | *$py.class 8 | 9 | # C extensions 10 | *.so 11 | 12 | # Distribution / packaging 13 | .Python 14 | env/ 15 | build/ 16 | develop-eggs/ 17 | dist/ 18 | downloads/ 19 | eggs/ 20 | .eggs/ 21 | lib/ 22 | lib64/ 23 | parts/ 24 | sdist/ 25 | var/ 26 | *.egg-info/ 27 | .installed.cfg 28 | *.egg 29 | 30 | # PyInstaller 31 | # Usually these files are written by a python script from a template 32 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 
33 | *.manifest 34 | *.spec 35 | 36 | # Installer logs 37 | pip-log.txt 38 | pip-delete-this-directory.txt 39 | 40 | # Unit test / coverage reports 41 | htmlcov/ 42 | .tox/ 43 | .coverage 44 | .coverage.* 45 | .cache 46 | nosetests.xml 47 | coverage.xml 48 | *,cover 49 | .hypothesis/ 50 | 51 | # Translations 52 | *.mo 53 | *.pot 54 | 55 | # Django stuff: 56 | *.log 57 | local_settings.py 58 | 59 | # Flask stuff: 60 | instance/ 61 | .webassets-cache 62 | 63 | # Scrapy stuff: 64 | .scrapy 65 | 66 | # Sphinx documentation 67 | docs/_build/ 68 | 69 | # PyBuilder 70 | target/ 71 | 72 | # IPython Notebook 73 | *.ipynb_checkpoints 74 | 75 | # pyenv 76 | .python-version 77 | 78 | # celery beat schedule file 79 | celerybeat-schedule 80 | 81 | # dotenv 82 | .env 83 | 84 | # virtualenv 85 | venv/ 86 | ENV/ 87 | 88 | # Spyder project settings 89 | .spyderproject 90 | 91 | # Rope project settings 92 | .ropeproject 93 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Good, Fast, Cheap: How to do Data Science with Missing Data 2 | 3 | ## Resources 4 | 5 | #### Academic Content (roughly sorted from least technical to most) 6 | - [Good summary of single vs. 
multiple imputation](https://scikit-learn.org/stable/modules/impute.html#multiple-vs-single-imputation) 7 | - [UTexas Slides on Missing Data](https://liberalarts.utexas.edu/prc/_files/cs/Missing-Data.pdf) 8 | - [Flexible Imputation of Missing Data, 2nd ed.](https://stefvanbuuren.name/fimd/) 9 | - [Pattern Submodel Paper from "Biostatistics"](https://academic.oup.com/biostatistics/advance-article/doi/10.1093/biostatistics/kxy040/5092384) 10 | - [The prevention and handling of missing data](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3668100/) 11 | - [Should you use a missing data indicator?](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3414599/) 12 | - [Are all biases missing data problems?](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4643276/) 13 | - [Accounting for missing data in statistical analyses: multiple imputation is not always the answer](https://academic.oup.com/ije/article/48/4/1294/5382162?login=false#.XVpWZLg4jrU.twitter) 14 | - [The proportion of missing data should not be used to guide decisions on multiple imputation](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6547017/) 15 | - [Boston University Technical Report on Missing Data, Assumptions, and Applications](http://www.bu.edu/sph/files/2014/05/Marina-tech-report.pdf) 16 | - [Andrew Gelman Chapter on Missing Data - thorough and very good, but academic](http://www.stat.columbia.edu/~gelman/arm/missing.pdf) 17 | 18 | #### Python 19 | - [Scikit-Learn Write-Up of Imputation](https://scikit-learn.org/stable/modules/impute.html) 20 | - [Scikit-Learn SimpleImputer Class](https://scikit-learn.org/stable/modules/generated/sklearn.impute.SimpleImputer.html) 21 | - [Scikit-Learn IterativeImputer Class](https://scikit-learn.org/stable/modules/generated/sklearn.impute.IterativeImputer.html) **note: this is experimental** 22 | - [Scikit-Learn KNNImputer Class](https://scikit-learn.org/stable/modules/generated/sklearn.impute.KNNImputer.html) 23 | - [Scikit-Learn Mean 
Imputation](http://scikit-learn.org/stable/auto_examples/plot_missing_values.html#sphx-glr-auto-examples-plot-missing-values-py) 24 | - [MissingNo in Python](https://github.com/ResidentMario/missingno) 25 | - [MissingNo Paper](http://joss.theoj.org/papers/52b4115d6c03864b884fbf3334851322) 26 | 27 | ##### Rubin's Rules (combining parameter estimates across multiple imputations) 28 | - [Article about Rubin's Rules](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2727536/) 29 | - [Rubin's Rules Formulas](https://bookdown.org/mwheymans/bookmi/rubins-rules.html) 30 | - [R Code for Pooling Estimates](http://finzi.psych.upenn.edu/R/library/mice/html/pool.html) 31 | 32 | #### Non-Statistical References 33 | - [How One 19-Year-Old Illinois Man Is Distorting National Polling Averages - NYTimes](https://www.nytimes.com/2016/10/13/upshot/how-one-19-year-old-illinois-man-is-distorting-national-polling-averages.html) 34 | 35 | 36 | ### Feel free to contact me afterward! 37 | - [LinkedIn](https://www.linkedin.com/in/matthewbrems) 38 | - [Twitter](https://www.twitter.com/matthewbrems) 39 | - [Medium](https://www.medium.com/@matthew.w.brems) 40 | -------------------------------------------------------------------------------- /01_unit_missingness.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Unit Missingness Demo\n", 8 | "\n", 9 | "When handling unit missingness, the most common method is to do **weight class adjustments**. This requires us to break our observations into classes and weight them before doing our analysis." 
10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": null, 15 | "metadata": {}, 16 | "outputs": [], 17 | "source": [ 18 | "# Import libraries.\n", 19 | "import pandas as pd\n", 20 | "import numpy as np\n", 21 | "\n", 22 | "# Set random seed.\n", 23 | "np.random.seed(42)\n", 24 | "\n", 25 | "# Generate dataframe.\n", 26 | "value_score = [min(np.random.poisson(5), 10) if i % 2 == 0 else min(np.random.poisson(6), 10) for i in range(10_000)]\n", 27 | "value_score = [value_score[i] if (i % 8 == 0 or (i % 7 != 0 and i % 2 == 1)) else np.nan for i in range(10_000)]\n", 28 | "departments = ['finance' if i % 2 == 0 else 'accounting' for i in range(10_000)]\n", 29 | "df = pd.DataFrame({\n", 30 | " 'dept': departments,\n", 31 | " 'score': value_score\n", 32 | "})\n", 33 | "\n", 34 | "# Check first five rows.\n", 35 | "df.head()" 36 | ] 37 | }, 38 | { 39 | "cell_type": "code", 40 | "execution_count": null, 41 | "metadata": {}, 42 | "outputs": [], 43 | "source": [ 44 | "# What is the distribution of department?\n", 45 | "df['dept'].value_counts(normalize = True)" 46 | ] 47 | }, 48 | { 49 | "cell_type": "code", 50 | "execution_count": null, 51 | "metadata": {}, 52 | "outputs": [], 53 | "source": [ 54 | "# Check for nulls.\n", 55 | "df.isnull().sum()" 56 | ] 57 | }, 58 | { 59 | "cell_type": "code", 60 | "execution_count": null, 61 | "metadata": {}, 62 | "outputs": [], 63 | "source": [ 64 | "# Drop NAs.\n", 65 | "df.dropna(inplace = True)" 66 | ] 67 | }, 68 | { 69 | "cell_type": "code", 70 | "execution_count": null, 71 | "metadata": {}, 72 | "outputs": [], 73 | "source": [ 74 | "# What proportion of our responses came from accounting?\n", 75 | "df['dept'].value_counts(normalize = True)['accounting']" 76 | ] 77 | }, 78 | { 79 | "cell_type": "code", 80 | "execution_count": null, 81 | "metadata": {}, 82 | "outputs": [], 83 | "source": [ 84 | "df['dept'].value_counts(normalize = True)" 85 | ] 86 | }, 87 | { 88 | "cell_type": "markdown", 89 | "metadata": {}, 90 | 
"source": [ 91 | "1. Take the full sample (observed and missing) and break them into subgroups based on characteristics we know.\n", 92 | "2. Calculate a weight for each observation:\n", 93 | "\n", 94 | "$$\n", 95 | "\\text{weight}_i = \\frac{\\text{true proportion in group }i}{\\text{proportion of observed values in group }i}\n", 96 | "$$" 97 | ] 98 | }, 99 | { 100 | "cell_type": "code", 101 | "execution_count": null, 102 | "metadata": {}, 103 | "outputs": [], 104 | "source": [ 105 | "# Calculate and print the weight for accounting.\n", 106 | "w_accounting = (1/2) / df['dept'].value_counts(normalize = True)['accounting']\n", 107 | "\n", 108 | "print(f'The weight for each accounting vote is: {w_accounting}.')" 109 | ] 110 | }, 111 | { 112 | "cell_type": "code", 113 | "execution_count": null, 114 | "metadata": {}, 115 | "outputs": [], 116 | "source": [ 117 | "# Calculate the and print weight for finance.\n", 118 | "w_finance = (1/2) / df['dept'].value_counts(normalize = True)['finance']\n", 119 | "\n", 120 | "print(f'The weight for each finance vote is: {w_finance}.')" 121 | ] 122 | }, 123 | { 124 | "cell_type": "code", 125 | "execution_count": null, 126 | "metadata": {}, 127 | "outputs": [], 128 | "source": [ 129 | "# Let's confirm that the weights times the counts\n", 130 | "# yields a 50/50 split.\n", 131 | "print(w_accounting * df['dept'].value_counts()['accounting'])\n", 132 | "print(w_finance * df['dept'].value_counts()['finance'])" 133 | ] 134 | }, 135 | { 136 | "cell_type": "code", 137 | "execution_count": null, 138 | "metadata": {}, 139 | "outputs": [], 140 | "source": [ 141 | "# Create column that stores the weights.\n", 142 | "\n", 143 | "df['weights'] = [w_accounting if i == 'accounting' else w_finance for i in df['dept']]" 144 | ] 145 | }, 146 | { 147 | "cell_type": "code", 148 | "execution_count": null, 149 | "metadata": {}, 150 | "outputs": [], 151 | "source": [ 152 | "# Confirm counts.\n", 153 | "\n", 154 | "df['weights'].value_counts()" 155 | ] 156 
| }, 157 | { 158 | "cell_type": "code", 159 | "execution_count": null, 160 | "metadata": {}, 161 | "outputs": [], 162 | "source": [ 163 | "# Calculate raw mean of my employee satisfaction score.\n", 164 | "\n", 165 | "np.mean(df['score'])" 166 | ] 167 | }, 168 | { 169 | "cell_type": "code", 170 | "execution_count": null, 171 | "metadata": {}, 172 | "outputs": [], 173 | "source": [ 174 | "# Calculate weighted mean of my employee satisfaction score.\n", 175 | "\n", 176 | "np.mean(df['score'] * df['weights'])" 177 | ] 178 | }, 179 | { 180 | "cell_type": "markdown", 181 | "metadata": {}, 182 | "source": [ 183 | "
Our goal with post-weighting is to decrease bias. What should we be concerned about?\n", 184 | " \n", 185 | "- Due to the bias-variance tradeoff, as we decrease bias, we may cause an increase in variance.\n", 186 | "- This can be a really big deal, [said the New York Times in 2016](https://www.nytimes.com/2016/10/13/upshot/how-one-19-year-old-illinois-man-is-distorting-national-polling-averages.html).\n", 187 | "
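One rough way to see how much variance the weights add is Kish's effective sample size, $n_{\text{eff}} = (\sum_i w_i)^2 / \sum_i w_i^2$. The sketch below is an illustration, not part of the original demo: the helper name `kish_effective_sample_size` is made up here, and the respondent counts (4,286 accounting, 1,250 finance) are assumed from the output shown in the solutions notebook.

```python
import numpy as np

def kish_effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / (w ** 2).sum()

# Assumed respondent counts, taken from the solutions notebook output.
n_accounting, n_finance = 4286, 1250
n_total = n_accounting + n_finance

# Same weight formula as the demo: true proportion / observed proportion.
w_accounting = 0.5 / (n_accounting / n_total)
w_finance = 0.5 / (n_finance / n_total)

weights = np.concatenate([
    np.full(n_accounting, w_accounting),
    np.full(n_finance, w_finance),
])

n_eff = kish_effective_sample_size(weights)
design_effect = n_total / n_eff  # values above 1 mean the weighting inflates variance

print(f'Responses: {n_total}, effective sample size: {n_eff:.0f}, design effect: {design_effect:.2f}')
```

If every weight were equal, the effective sample size would match the number of responses; the more uneven the weights, the further it drops.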
" 188 | ] 189 | }, 190 | { 191 | "cell_type": "markdown", 192 | "metadata": {}, 193 | "source": [ 194 | "
What might be a situation where we may not be able to use weight class adjustments?\n", 195 | " \n", 196 | "- If we don't know the true distribution of our classes.\n", 197 | "- For example, if I didn't know that half of our team was in accounting and half in finance.\n", 198 | "- Another example, let's say I wanted to apply this weighting method to understand the percentage of voters supporting the Democratic candidate in the upcoming election. I don't know how many people will be in each of the age groups 18-34, 35-54, and 55+. I'll have to make a guess. (Hopefully an educated one!)\n", 199 | "
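When the true shares have to be guessed, it can help to rerun the weighting under a few different guesses and see how far the answer moves. The sketch below is a hypothetical illustration: `post_stratified_mean` is a name invented here, and the `toy` DataFrame holds made-up scores, not the workshop data.

```python
import pandas as pd

def post_stratified_mean(df, group_col, value_col, assumed_props):
    """Weighted mean of value_col, assuming each group's true share is assumed_props[group]."""
    observed_props = df[group_col].value_counts(normalize=True)
    # Weight = assumed true proportion / observed proportion, as in the demo.
    weights = df[group_col].map(lambda g: assumed_props[g] / observed_props[g])
    return (df[value_col] * weights).sum() / weights.sum()

# Toy respondents (hypothetical values): accounting is over-represented.
toy = pd.DataFrame({
    'dept': ['accounting'] * 6 + ['finance'] * 2,
    'score': [6, 6, 7, 5, 6, 6, 4, 4],
})

# Try several guesses for accounting's true share of the team.
for share in (0.4, 0.5, 0.6):
    assumed = {'accounting': share, 'finance': 1 - share}
    estimate = post_stratified_mean(toy, 'dept', 'score', assumed)
    print(f'Assumed accounting share {share:.1f}: weighted mean {estimate:.2f}')
```

If the estimate barely changes across plausible guesses, the guess matters little; if it swings widely, the conclusion depends heavily on an assumption we cannot check.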
" 200 | ] 201 | }, 202 | { 203 | "cell_type": "markdown", 204 | "metadata": {}, 205 | "source": [ 206 | "#### Have more variables and want to build a sophisticated model?\n", 207 | "Pass `df['weight']` into `sklearn` when fitting your model. [Source](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html#sklearn.ensemble.RandomForestRegressor.fit).\n", 208 | "> `model.fit(X_train, y_train, X_train['weight'])`" 209 | ] 210 | }, 211 | { 212 | "cell_type": "markdown", 213 | "metadata": {}, 214 | "source": [ 215 | "In R, I am using `wtd.chi.sq`." 216 | ] 217 | } 218 | ], 219 | "metadata": { 220 | "kernelspec": { 221 | "display_name": "Python 3", 222 | "language": "python", 223 | "name": "python3" 224 | }, 225 | "language_info": { 226 | "codemirror_mode": { 227 | "name": "ipython", 228 | "version": 3 229 | }, 230 | "file_extension": ".py", 231 | "mimetype": "text/x-python", 232 | "name": "python", 233 | "nbconvert_exporter": "python", 234 | "pygments_lexer": "ipython3", 235 | "version": "3.8.3" 236 | } 237 | }, 238 | "nbformat": 4, 239 | "nbformat_minor": 4 240 | } 241 | -------------------------------------------------------------------------------- /solutions/01_unit_missingness.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Unit Missingness Demo\n", 8 | "\n", 9 | "When handling unit missingness, the most common method is to do **weight class adjustments**. This requires us to break our observations into classes and weight them before doing our analysis." 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": 1, 15 | "metadata": {}, 16 | "outputs": [ 17 | { 18 | "data": { 19 | "text/html": [ 20 | "
" 71 | ], 72 | "text/plain": [ 73 | " dept score\n", 74 | "0 finance 5.0\n", 75 | "1 accounting 4.0\n", 76 | "2 finance NaN\n", 77 | "3 accounting 5.0\n", 78 | "4 finance NaN" 79 | ] 80 | }, 81 | "execution_count": 1, 82 | "metadata": {}, 83 | "output_type": "execute_result" 84 | } 85 | ], 86 | "source": [ 87 | "# Import libraries.\n", 88 | "import pandas as pd\n", 89 | "import numpy as np\n", 90 | "\n", 91 | "# Set random seed.\n", 92 | "np.random.seed(42)\n", 93 | "\n", 94 | "# Generate dataframe.\n", 95 | "value_score = [min(np.random.poisson(5), 10) if i % 2 == 0 else min(np.random.poisson(6), 10) for i in range(10_000)]\n", 96 | "value_score = [value_score[i] if (i % 8 == 0 or (i % 7 != 0 and i % 2 == 1)) else np.nan for i in range(10_000)]\n", 97 | "departments = ['finance' if i % 2 == 0 else 'accounting' for i in range(10_000)]\n", 98 | "df = pd.DataFrame({\n", 99 | " 'dept': departments,\n", 100 | " 'score': value_score\n", 101 | "})\n", 102 | "\n", 103 | "# Check first five rows.\n", 104 | "df.head()" 105 | ] 106 | }, 107 | { 108 | "cell_type": "code", 109 | "execution_count": 2, 110 | "metadata": {}, 111 | "outputs": [ 112 | { 113 | "data": { 114 | "text/plain": [ 115 | "accounting 0.5\n", 116 | "finance 0.5\n", 117 | "Name: dept, dtype: float64" 118 | ] 119 | }, 120 | "execution_count": 2, 121 | "metadata": {}, 122 | "output_type": "execute_result" 123 | } 124 | ], 125 | "source": [ 126 | "# What is the distribution of department?\n", 127 | "df['dept'].value_counts(normalize = True)" 128 | ] 129 | }, 130 | { 131 | "cell_type": "code", 132 | "execution_count": 3, 133 | "metadata": {}, 134 | "outputs": [ 135 | { 136 | "data": { 137 | "text/plain": [ 138 | "dept 0\n", 139 | "score 4464\n", 140 | "dtype: int64" 141 | ] 142 | }, 143 | "execution_count": 3, 144 | "metadata": {}, 145 | "output_type": "execute_result" 146 | } 147 | ], 148 | "source": [ 149 | "# Check for nulls.\n", 150 | "df.isnull().sum()" 151 | ] 152 | }, 153 | { 154 | "cell_type": "code", 155 
| "execution_count": 4, 156 | "metadata": {}, 157 | "outputs": [], 158 | "source": [ 159 | "# Drop NAs.\n", 160 | "df.dropna(inplace = True)" 161 | ] 162 | }, 163 | { 164 | "cell_type": "code", 165 | "execution_count": 5, 166 | "metadata": {}, 167 | "outputs": [ 168 | { 169 | "data": { 170 | "text/plain": [ 171 | "0.7742052023121387" 172 | ] 173 | }, 174 | "execution_count": 5, 175 | "metadata": {}, 176 | "output_type": "execute_result" 177 | } 178 | ], 179 | "source": [ 180 | "# What proportion of our responses came from accounting?\n", 181 | "df['dept'].value_counts(normalize = True)['accounting']" 182 | ] 183 | }, 184 | { 185 | "cell_type": "markdown", 186 | "metadata": {}, 187 | "source": [ 188 | "1. Take the full sample (observed and missing) and break it into subgroups based on characteristics we know.\n", 189 | "2. Calculate a weight for each observation:\n", 190 | "\n", 191 | "$$\n", 192 | "\\text{weight}_i = \\frac{\\text{true proportion in group }i}{\\text{proportion of observed values in group }i}\n", 193 | "$$" 194 | ] 195 | }, 196 | { 197 | "cell_type": "code", 198 | "execution_count": 6, 199 | "metadata": {}, 200 | "outputs": [ 201 | { 202 | "name": "stdout", 203 | "output_type": "stream", 204 | "text": [ 205 | "The weight for each accounting vote is: 0.645823611759216.\n" 206 | ] 207 | } 208 | ], 209 | "source": [ 210 | "# Calculate and print the weight for accounting.\n", 211 | "w_accounting = (1/2) / df['dept'].value_counts(normalize = True)['accounting']\n", 212 | "\n", 213 | "print(f'The weight for each accounting vote is: {w_accounting}.')" 214 | ] 215 | }, 216 | { 217 | "cell_type": "code", 218 | "execution_count": 7, 219 | "metadata": {}, 220 | "outputs": [ 221 | { 222 | "name": "stdout", 223 | "output_type": "stream", 224 | "text": [ 225 | "The weight for each finance vote is: 2.2144.\n" 226 | ] 227 | } 228 | ], 229 | "source": [ 230 | "# Calculate and print the weight for finance.\n", 231 | "w_finance = (1/2) / 
df['dept'].value_counts(normalize = True)['finance']\n", 232 | "\n", 233 | "print(f'The weight for each finance vote is: {w_finance}.')" 234 | ] 235 | }, 236 | { 237 | "cell_type": "code", 238 | "execution_count": 8, 239 | "metadata": {}, 240 | "outputs": [ 241 | { 242 | "name": "stdout", 243 | "output_type": "stream", 244 | "text": [ 245 | "2767.9999999999995\n", 246 | "2768.0\n" 247 | ] 248 | } 249 | ], 250 | "source": [ 251 | "# Let's confirm that the weights times the counts\n", 252 | "# yields a 50/50 split.\n", 253 | "print(w_accounting * df['dept'].value_counts()['accounting'])\n", 254 | "print(w_finance * df['dept'].value_counts()['finance'])" 255 | ] 256 | }, 257 | { 258 | "cell_type": "code", 259 | "execution_count": 9, 260 | "metadata": {}, 261 | "outputs": [], 262 | "source": [ 263 | "# Create column that stores the weights.\n", 264 | "\n", 265 | "df['weights'] = [w_accounting if i == 'accounting' else w_finance for i in df['dept']]" 266 | ] 267 | }, 268 | { 269 | "cell_type": "code", 270 | "execution_count": 10, 271 | "metadata": {}, 272 | "outputs": [ 273 | { 274 | "data": { 275 | "text/plain": [ 276 | "0.645824 4286\n", 277 | "2.214400 1250\n", 278 | "Name: weights, dtype: int64" 279 | ] 280 | }, 281 | "execution_count": 10, 282 | "metadata": {}, 283 | "output_type": "execute_result" 284 | } 285 | ], 286 | "source": [ 287 | "# Confirm counts.\n", 288 | "\n", 289 | "df['weights'].value_counts()" 290 | ] 291 | }, 292 | { 293 | "cell_type": "code", 294 | "execution_count": 11, 295 | "metadata": {}, 296 | "outputs": [ 297 | { 298 | "data": { 299 | "text/plain": [ 300 | "5.724530346820809" 301 | ] 302 | }, 303 | "execution_count": 11, 304 | "metadata": {}, 305 | "output_type": "execute_result" 306 | } 307 | ], 308 | "source": [ 309 | "# Calculate raw mean of my employee satisfaction score.\n", 310 | "\n", 311 | "np.mean(df['score'])" 312 | ] 313 | }, 314 | { 315 | "cell_type": "code", 316 | "execution_count": 12, 317 | "metadata": {}, 318 | "outputs": [ 
319 | { 320 | "data": { 321 | "text/plain": [ 322 | "5.450634997666867" 323 | ] 324 | }, 325 | "execution_count": 12, 326 | "metadata": {}, 327 | "output_type": "execute_result" 328 | } 329 | ], 330 | "source": [ 331 | "# Calculate weighted mean of my employee satisfaction score.\n", 332 | "\n", 333 | "np.mean(df['score'] * df['weights'])" 334 | ] 335 | }, 336 | { 337 | "cell_type": "markdown", 338 | "metadata": {}, 339 | "source": [ 340 | "
Our goal with post-weighting is to decrease bias. What should we be concerned about?\n", 341 | " \n", 342 | "- Due to the bias-variance tradeoff, as we decrease bias, we may cause an increase in variance.\n", 343 | "- This can be a really big deal, [said the New York Times in 2016](https://www.nytimes.com/2016/10/13/upshot/how-one-19-year-old-illinois-man-is-distorting-national-polling-averages.html).\n", 344 | "
" 345 | ] 346 | }, 347 | { 348 | "cell_type": "markdown", 349 | "metadata": {}, 350 | "source": [ 351 | "
What might be a situation where we may not be able to use weight class adjustments?\n", 352 | " \n", 353 | "- If we don't know the true distribution of our classes.\n", 354 | "- For example, if I didn't know that half of our team was in accounting and half in finance.\n", 355 | "- Another example, let's say I wanted to apply this weighting method to understand the percentage of voters supporting the Democratic candidate in the upcoming election. I don't know how many people will be in each of the age groups 18-34, 35-54, and 55+. I'll have to make a guess. (Hopefully an educated one!)\n", 356 | "
" 357 | ] 358 | }, 359 | { 360 | "cell_type": "markdown", 361 | "metadata": {}, 362 | "source": [ 363 | "#### Have more variables and want to build a sophisticated model?\n", 364 | "Pass `df['weight']` into `sklearn` when fitting your model. [Source](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html#sklearn.ensemble.RandomForestRegressor.fit).\n", 365 | "> `model.fit(X_train, y_train, X_train['weight'])`" 366 | ] 367 | } 368 | ], 369 | "metadata": { 370 | "kernelspec": { 371 | "display_name": "Python 3", 372 | "language": "python", 373 | "name": "python3" 374 | }, 375 | "language_info": { 376 | "codemirror_mode": { 377 | "name": "ipython", 378 | "version": 3 379 | }, 380 | "file_extension": ".py", 381 | "mimetype": "text/x-python", 382 | "name": "python", 383 | "nbconvert_exporter": "python", 384 | "pygments_lexer": "ipython3", 385 | "version": "3.8.3" 386 | } 387 | }, 388 | "nbformat": 4, 389 | "nbformat_minor": 4 390 | } 391 | -------------------------------------------------------------------------------- /00_interactive_plot.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "#### To confirm that you have the latest versions of these packages, uncomment and run the following command.\n", 10 | "# !pip install numpy pandas matplotlib sklearn ipywidgets IPython missingno --upgrade\n", 11 | "\n", 12 | "# To generate and store data.\n", 13 | "import numpy as np\n", 14 | "import pandas as pd\n", 15 | "\n", 16 | "# To visualize data.\n", 17 | "import matplotlib.pyplot as plt\n", 18 | "\n", 19 | "# To fit linear regression model.\n", 20 | "from sklearn.linear_model import LinearRegression\n", 21 | "\n", 22 | "# To allow interactive plot.\n", 23 | "from ipywidgets import *\n", 24 | "from IPython.display import display\n", 25 | "\n", 26 | "# There is a 
SciPy issue that won't affect our work, but a warning exists\n", 27 | "# and an update is not imminent.\n", 28 | "import warnings\n", 29 | "warnings.filterwarnings(action=\"ignore\")\n", 30 | "\n", 31 | "# To render plots in the notebook.\n", 32 | "%matplotlib inline" 33 | ] 34 | }, 35 | { 36 | "cell_type": "code", 37 | "execution_count": null, 38 | "metadata": {}, 39 | "outputs": [], 40 | "source": [ 41 | "# Generate data and store in a dataframe.\n", 42 | "\n", 43 | "np.random.seed(42)\n", 44 | "\n", 45 | "age = np.random.uniform(20, 60, size = 100)\n", 46 | "income = 15000 + 750 * age + np.random.normal(0, 20000, size = 100)\n", 47 | "income = [i if i >= 0 else 0 for i in income]\n", 48 | "\n", 49 | "df = pd.DataFrame({'income':income,\n", 50 | " 'age': age})" 51 | ] 52 | }, 53 | { 54 | "cell_type": "code", 55 | "execution_count": null, 56 | "metadata": {}, 57 | "outputs": [], 58 | "source": [ 59 | "# Create three functions to model missingness according to certain patterns.\n", 60 | "\n", 61 | "def create_mcar_column(df, missing_column = 'income', p_missing = 0.01, random_state = 42):\n", 62 | " \"\"\"\n", 63 | " Creates missingness indicator column, where data are MCAR (missing completely at random).\n", 64 | " \n", 65 | " User must specify:\n", 66 | " df = the pandas DataFrame the user wants to read in for analysis\n", 67 | " missing_column = the name of the column in df that is missing\n", 68 | " p_missing = the proportion of observations that are missing\n", 69 | " \n", 70 | " Function returns:\n", 71 | " mcar_column = a column that indicates whether data are missing, assuming MCAR\n", 72 | " \"\"\"\n", 73 | " np.random.seed(random_state)\n", 74 | " \n", 75 | " mcar_indices = [df.sample(n = 1).index[0] for i in range(round(p_missing * df.shape[0]))]\n", 76 | " \n", 77 | " while len(set(mcar_indices)) < round(p_missing * df.shape[0]):\n", 78 | " mcar_indices.append(df.sample(n = 1).index[0])\n", 79 | " \n", 80 | " mcar_column = [1 if i in mcar_indices else 0 for i 
in range(df.shape[0])]\n", 81 | " \n", 82 | " return mcar_column\n", 83 | "\n", 84 | "def create_mar_column(df, missing_column = 'income', depends_on = 'age', method = 'linear', p_missing = 0.01, random_state = 42):\n", 85 | " \"\"\"\n", 86 | " Creates missingness indicator column, where data are MAR (missing at random).\n", 87 | " \n", 88 | " User must specify:\n", 89 | " df = the pandas DataFrame the user wants to read in for analysis\n", 90 | " missing_column = the name of the column in df that is missing\n", 91 | " depends_on = the name of the column in df which affects the missingness\n", 92 | " method = 'linear' or 'quadratic'\n", 93 | " - 'linear' means the probability of missingness is linearly related to the depends_on variable\n", 94 | " - 'quadratic' means the probability of missingness is quadratically related to the depends_on variable\n", 95 | " p_missing = the proportion of observations that are missing\n", 96 | " \n", 97 | " Function returns:\n", 98 | " mar_column = a column that indicates whether data are missing, assuming MAR\n", 99 | " \"\"\"\n", 100 | " np.random.seed(random_state)\n", 101 | " \n", 102 | " if method == 'linear':\n", 103 | " mar_indices = [df.sample(n = 1, weights = df[depends_on] ** -1).index[0] for i in range(round(p_missing * df.shape[0]))]\n", 104 | "\n", 105 | " while len(set(mar_indices)) < round(p_missing * df.shape[0]):\n", 106 | " mar_indices.append(df.sample(n = 1, weights = df[depends_on] ** -1).index[0])\n", 107 | " \n", 108 | " elif method == 'quadratic':\n", 109 | " mar_indices = [df.sample(n = 1, weights = df[depends_on] ** -2).index[0] for i in range(round(p_missing * df.shape[0]))]\n", 110 | "\n", 111 | " while len(set(mar_indices)) < round(p_missing * df.shape[0]):\n", 112 | " mar_indices.append(df.sample(n = 1, weights = df[depends_on] ** -2).index[0])\n", 113 | "\n", 114 | " mar_column = [1 if i in mar_indices else 0 for i in range(df.shape[0])]\n", 115 | " \n", 116 | " return mar_column\n", 117 | "\n", 118 | 
"def create_nmar_column(df, missing_column = 'income', method = 'linear', p_missing = 0.01, random_state = 42):\n", 119 | " \"\"\"\n", 120 | " Creates missingness indicator column, where data are NMAR (not missing at random).\n", 121 | " \n", 122 | " User must specify:\n", 123 | " df = the pandas DataFrame the user wants to read in for analysis\n", 124 | " missing_column = the name of the column in df that is missing\n", 125 | " method = 'linear' or 'quadratic'\n", 126 | " - 'linear' means the probability of missingness is linearly related to the depends_on variable\n", 127 | " - 'quadratic' means the probability of missingness is quadratically related to the depends_on variable\n", 128 | " p_missing = the proportion of observations that are missing\n", 129 | " \n", 130 | " Function returns:\n", 131 | " nmar_column = a column that indicates whether data are missing, assuming NMAR\n", 132 | " \"\"\"\n", 133 | " np.random.seed(random_state)\n", 134 | " \n", 135 | " if method == 'linear':\n", 136 | " nmar_indices = [df.sample(n = 1, weights = df[missing_column] ** -1).index[0] for i in range(round(p_missing * df.shape[0]))]\n", 137 | "\n", 138 | " while len(set(nmar_indices)) < round(p_missing * df.shape[0]):\n", 139 | " nmar_indices.append(df.sample(n = 1, weights = df[missing_column] ** -1).index[0])\n", 140 | " \n", 141 | " elif method == 'quadratic':\n", 142 | " nmar_indices = [df.sample(n = 1, weights = df[missing_column] ** -2).index[0] for i in range(round(p_missing * df.shape[0]))]\n", 143 | "\n", 144 | " while len(set(nmar_indices)) < round(p_missing * df.shape[0]):\n", 145 | " nmar_indices.append(df.sample(n = 1, weights = df[missing_column] ** -2).index[0])\n", 146 | " \n", 147 | " nmar_column = [1 if i in nmar_indices else 0 for i in range(df.shape[0])]\n", 148 | " \n", 149 | " return nmar_column" 150 | ] 151 | }, 152 | { 153 | "cell_type": "code", 154 | "execution_count": null, 155 | "metadata": {}, 156 | "outputs": [], 157 | "source": [ 158 | "def 
generate_scatterplot(p_missing, missing_type, method = 'linear', missing_column = 'income', depends_on = 'age'):\n", 159 | " # Generate one plot.\n", 160 | " fig, ax = plt.subplots(nrows = 1, ncols = 1, figsize = (16,9))\n", 161 | "\n", 162 | " # Set labels and axes.\n", 163 | " ax.set_xlabel(\"Age\", position = (0,0), ha = 'left', fontsize = 25, color = 'grey', alpha = 0.85)\n", 164 | " ax.set_ylabel(\"Income\", position = (0,1), ha = 'right', va = 'top', fontsize = 25, rotation = 0, color = 'grey', alpha = 0.85)\n", 165 | " \n", 166 | " ax.set_ylim([-1000, 100000])\n", 167 | " \n", 168 | " # Generate data with proportion p missing.\n", 169 | " if missing_type == 'MCAR':\n", 170 | " df['missingness'] = create_mcar_column(df,\n", 171 | " missing_column = missing_column,\n", 172 | " p_missing = p_missing)\n", 173 | " elif missing_type == 'MAR':\n", 174 | " df['missingness'] = create_mar_column(df,\n", 175 | " missing_column = missing_column,\n", 176 | " depends_on = depends_on,\n", 177 | " method = method,\n", 178 | " p_missing = p_missing)\n", 179 | " \n", 180 | " elif missing_type == 'NMAR':\n", 181 | " df['missingness'] = create_nmar_column(df,\n", 182 | " missing_column = missing_column,\n", 183 | " method = method,\n", 184 | " p_missing = p_missing)\n", 185 | " \n", 186 | " # Generate scatterplot.\n", 187 | " ax.scatter(df['age'][df['missingness'] == 0], df['income'][df['missingness'] == 0], s = 35, color = '#185fad', alpha = 0.75, label = 'Observed')\n", 188 | " ax.scatter(df['age'][df['missingness'] == 1], df['income'][df['missingness'] == 1], s = 35, color = 'grey', alpha = 0.25, label = '')\n", 189 | " \n", 190 | " # Generate lines of best fit based on observed and missing values.\n", 191 | " x = np.linspace(20, 60)\n", 192 | " ax.plot(x, 15000 + 750 * x, c = 'orange', alpha = 0.7, label = '\"True\" Line', lw = 3)\n", 193 | " model = LinearRegression().fit(df[['age']][df['missingness'] == 0], df['income'][df['missingness'] == 0])\n", 194 | " ax.plot(x, 
model.intercept_ + model.coef_ * x, c = '#185fad', alpha = 0.7, label='Observed Line', lw = 3)\n", 195 | "\n", 196 | " # Generate title and legend.\n", 197 | " ax.set_title(f'Type of Missing Data: {missing_type} \\nProportion Missing: {p_missing}', position = (0,1), ha = 'left', fontsize = 25)\n", 198 | " ax.legend(prop={'size': 20}, loc = 2)\n", 199 | " \n", 200 | " ax.set_xticks([])\n", 201 | " ax.set_yticks([])\n", 202 | " plt.show();" 203 | ] 204 | }, 205 | { 206 | "cell_type": "code", 207 | "execution_count": null, 208 | "metadata": {}, 209 | "outputs": [], 210 | "source": [ 211 | "generate_scatterplot(p_missing=0.1,\n", 212 | " missing_type = 'MCAR',\n", 213 | " method = 'linear')" 214 | ] 215 | }, 216 | { 217 | "cell_type": "code", 218 | "execution_count": null, 219 | "metadata": { 220 | "scrolled": false 221 | }, 222 | "outputs": [], 223 | "source": [ 224 | "def plot_interact(p_missing = 0.8, missing_type = 'MCAR', method = 'linear'):\n", 225 | " generate_scatterplot(p_missing, missing_type, method, missing_column = 'income', depends_on = 'age')\n", 226 | " \n", 227 | "interact(plot_interact, p_missing = (0, 0.99, 0.05), missing_type = ['MCAR','MAR','NMAR'], method = ['linear','quadratic']);" 228 | ] 229 | } 230 | ], 231 | "metadata": { 232 | "kernelspec": { 233 | "display_name": "Python 3", 234 | "language": "python", 235 | "name": "python3" 236 | }, 237 | "language_info": { 238 | "codemirror_mode": { 239 | "name": "ipython", 240 | "version": 3 241 | }, 242 | "file_extension": ".py", 243 | "mimetype": "text/x-python", 244 | "name": "python", 245 | "nbconvert_exporter": "python", 246 | "pygments_lexer": "ipython3", 247 | "version": "3.8.3" 248 | } 249 | }, 250 | "nbformat": 4, 251 | "nbformat_minor": 2 252 | } 253 | -------------------------------------------------------------------------------- /02_item_missingness.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 
5 | "metadata": {}, 6 | "source": [ 7 | "# Item Missingness Demo" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": null, 13 | "metadata": {}, 14 | "outputs": [], 15 | "source": [ 16 | "# To generate and store data.\n", 17 | "import numpy as np\n", 18 | "import pandas as pd\n", 19 | "import scipy.stats as stats\n", 20 | "\n", 21 | "# To visualize data.\n", 22 | "import matplotlib.pyplot as plt\n", 23 | "\n", 24 | "# To fit linear regression model.\n", 25 | "from sklearn.linear_model import LinearRegression, LogisticRegression\n", 26 | "\n", 27 | "# Install and import missingno to visualize missingness patterns. (Uncomment first line to install missingno.)\n", 28 | "# !pip3 install missingno\n", 29 | "import missingno as msno\n", 30 | "\n", 31 | "# # There is a SciPy issue that won't affect our work, but a warning exists\n", 32 | "# # and an update is not imminent.\n", 33 | "import warnings\n", 34 | "warnings.filterwarnings(action=\"ignore\")\n", 35 | "\n", 36 | "# To render plots in the notebook.\n", 37 | "%matplotlib inline" 38 | ] 39 | }, 40 | { 41 | "cell_type": "markdown", 42 | "metadata": {}, 43 | "source": [ 44 | "### Let's generate some data. Specifically, we'll generate age, partnered, children, and income data, where income is linearly related to age, partnered, and children." 
45 | ] 46 | }, 47 | { 48 | "cell_type": "code", 49 | "execution_count": null, 50 | "metadata": {}, 51 | "outputs": [], 52 | "source": [ 53 | "# To ensure we get the same results.\n", 54 | "np.random.seed(42)\n", 55 | "\n", 56 | "# Generate data.\n", 57 | "age = np.round(np.random.uniform(20, 60, size = 100))\n", 58 | "partnered = np.random.binomial(1, 0.8, size = 100)\n", 59 | "children = np.random.poisson(2, size = 100)\n", 60 | "income = 15000 + 750 * age + 20000 * partnered - 2500 * children + np.random.normal(0, 20000, size = 100)\n", 61 | "\n", 62 | "# Ensure income is not negative!\n", 63 | "income = [i if i >= 0 else 0 for i in income]\n", 64 | "\n", 65 | "# Combine our results into one dataframe.\n", 66 | "df = pd.DataFrame({'age': age,\n", 67 | " 'partnered': partnered,\n", 68 | " 'children': children,\n", 69 | " 'income': income})\n", 70 | "\n", 71 | "# Check the first five rows of df to make sure we did this properly.\n", 72 | "df.head()" 73 | ] 74 | }, 75 | { 76 | "cell_type": "markdown", 77 | "metadata": {}, 78 | "source": [ 79 | "### Run this cell. These are functions that will generate missing values according to MCAR, MAR, or NMAR." 
80 | ] 81 | }, 82 | { 83 | "cell_type": "code", 84 | "execution_count": null, 85 | "metadata": {}, 86 | "outputs": [], 87 | "source": [ 88 | "def create_mcar_column(df, missing_column = 'income', p_missing = 0.01, random_state = 42):\n", 89 | " \"\"\"\n", 90 | " Creates missingness indicator column, where data are MCAR (missing completely at random).\n", 91 | " \n", 92 | " User must specify:\n", 93 | " df = the pandas DataFrame the user wants to read in for analysis\n", 94 | " column = the name of the column in df that is missing\n", 95 | " p_missing = the proportion of observations that are missing\n", 96 | " \n", 97 | " Function returns:\n", 98 | " mcar_column = a column that indicates whether data are missing, assuming MCAR\n", 99 | " \"\"\"\n", 100 | " np.random.seed(random_state)\n", 101 | " \n", 102 | " mcar_indices = [df.sample(n = 1).index[0] for i in range(round(p_missing * df.shape[0]))]\n", 103 | " \n", 104 | " while len(set(mcar_indices)) < round(p_missing * df.shape[0]):\n", 105 | " mcar_indices.append(df.sample(n = 1).index[0])\n", 106 | " \n", 107 | " mcar_column = [1 if i in mcar_indices else 0 for i in range(df.shape[0])]\n", 108 | " \n", 109 | " return mcar_column\n", 110 | "\n", 111 | "def create_mar_column(df, missing_column = 'income', depends_on = 'age', method = 'linear', p_missing = 0.01, random_state = 42):\n", 112 | " \"\"\"\n", 113 | " Creates missingness indicator column, where data are MAR (missing at random).\n", 114 | " \n", 115 | " User must specify:\n", 116 | " df = the pandas DataFrame the user wants to read in for analysis\n", 117 | " missing_column = the name of the column in df that is missing\n", 118 | " depends_on = the name of the column in df which affects the missingness\n", 119 | " method = 'linear' or 'quadratic'\n", 120 | " - 'linear' means the probability of missingness is linearly related to the depends_on variable\n", 121 | " - 'quadratic' means the probability of missingness is quadratically related to the depends_on 
variable\n", 122 | " p_missing = the proportion of observations that are missing\n", 123 | " \n", 124 | " Function returns:\n", 125 | " mar_column = a column that indicates whether data are missing, assuming MAR\n", 126 | " \"\"\"\n", 127 | " np.random.seed(random_state)\n", 128 | " \n", 129 | " if method == 'linear':\n", 130 | " mar_indices = [df.sample(n = 1, weights = depends_on).index[0] for i in range(round(p_missing * df.shape[0]))]\n", 131 | "\n", 132 | " while len(set(mar_indices)) < round(p_missing * df.shape[0]):\n", 133 | " mar_indices.append(df.sample(n = 1, weights = depends_on).index[0])\n", 134 | " \n", 135 | " elif method == 'quadratic':\n", 136 | " mar_indices = [df.sample(n = 1, weights = df[depends_on] ** 2).index[0] for i in range(round(p_missing * df.shape[0]))]\n", 137 | "\n", 138 | " while len(set(mar_indices)) < round(p_missing * df.shape[0]):\n", 139 | " mar_indices.append(df.sample(n = 1, weights = df[depends_on] ** 2).index[0])\n", 140 | "\n", 141 | " mar_column = [1 if i in mar_indices else 0 for i in range(df.shape[0])]\n", 142 | " \n", 143 | " return mar_column\n", 144 | "\n", 145 | "def create_nmar_column(df, missing_column = 'income', method = 'linear', p_missing = 0.01, random_state = 42):\n", 146 | " \"\"\"\n", 147 | " Creates missingness indicator column, where data are NMAR (not missing at random).\n", 148 | " \n", 149 | " User must specify:\n", 150 | " df = the pandas DataFrame the user wants to read in for analysis\n", 151 | " missing_column = the name of the column in df that is missing\n", 152 | " method = 'linear' or 'quadratic'\n", 153 | " - 'linear' means the probability of missingness is linearly related to the depends_on variable\n", 154 | " - 'quadratic' means the probability of missingness is quadratically related to the depends_on variable\n", 155 | " p_missing = the proportion of observations that are missing\n", 156 | " \n", 157 | " Function returns:\n", 158 | " nmar_column = a column that indicates whether data are 
missing, assuming NMAR\n", 159 | " \"\"\"\n", 160 | " np.random.seed(random_state)\n", 161 | " \n", 162 | " if method == 'linear':\n", 163 | " nmar_indices = [df.sample(n = 1, weights = missing_column).index[0] for i in range(round(p_missing * df.shape[0]))]\n", 164 | "\n", 165 | " while len(set(nmar_indices)) < round(p_missing * df.shape[0]):\n", 166 | " nmar_indices.append(df.sample(n = 1, weights = missing_column).index[0])\n", 167 | " \n", 168 | " elif method == 'quadratic':\n", 169 | " nmar_indices = [df.sample(n = 1, weights = df[missing_column] ** 2).index[0] for i in range(round(p_missing * df.shape[0]))]\n", 170 | "\n", 171 | " while len(set(nmar_indices)) < round(p_missing * df.shape[0]):\n", 172 | " nmar_indices.append(df.sample(n = 1, weights = df[missing_column] ** 2).index[0])\n", 173 | " \n", 174 | " nmar_column = [1 if i in nmar_indices else 0 for i in range(df.shape[0])]\n", 175 | " \n", 176 | " return nmar_column" 177 | ] 178 | }, 179 | { 180 | "cell_type": "markdown", 181 | "metadata": {}, 182 | "source": [ 183 | "### Let's generate some missing data!" 
184 | ] 185 | }, 186 | { 187 | "cell_type": "code", 188 | "execution_count": null, 189 | "metadata": {}, 190 | "outputs": [], 191 | "source": [ 192 | "df['age_missingness'] = create_mcar_column(df,\n", 193 | " missing_column = 'age', \n", 194 | " p_missing = 0.3,\n", 195 | " random_state = 42)\n", 196 | "\n", 197 | "df['partnered_missingness'] = create_mar_column(df,\n", 198 | " missing_column = 'partnered',\n", 199 | " method = 'linear',\n", 200 | " p_missing = 0.2,\n", 201 | " random_state = 42)\n", 202 | "\n", 203 | "df['income_missingness'] = create_nmar_column(df,\n", 204 | " missing_column = 'income',\n", 205 | " method = 'quadratic',\n", 206 | " p_missing = 0.2,\n", 207 | " random_state = 42)\n", 208 | "\n", 209 | "print(df['age_missingness'].value_counts())\n", 210 | "print(df['partnered_missingness'].value_counts())\n", 211 | "print(df['income_missingness'].value_counts())" 212 | ] 213 | }, 214 | { 215 | "cell_type": "code", 216 | "execution_count": null, 217 | "metadata": {}, 218 | "outputs": [], 219 | "source": [ 220 | "df.head()" 221 | ] 222 | }, 223 | { 224 | "cell_type": "markdown", 225 | "metadata": {}, 226 | "source": [ 227 | "### Let's create a new dataframe with the values actually missing." 
228 | ] 229 | }, 230 | { 231 | "cell_type": "code", 232 | "execution_count": null, 233 | "metadata": {}, 234 | "outputs": [], 235 | "source": [ 236 | "df_missing = pd.DataFrame(df['children'])\n", 237 | "\n", 238 | "df_missing['age'] = [df.loc[i,'age'] if df.loc[i,'age_missingness'] == 0 else np.nan for i in range(100)]\n", 239 | "df_missing['partnered'] = [df.loc[i,'partnered'] if df.loc[i,'partnered_missingness'] == 0 else np.nan for i in range(100)]\n", 240 | "df_missing['income'] = [df.loc[i,'income'] if df.loc[i,'income_missingness'] == 0 else np.nan for i in range(100)]\n", 241 | "\n", 242 | "df_missing.head()" 243 | ] 244 | }, 245 | { 246 | "cell_type": "markdown", 247 | "metadata": {}, 248 | "source": [ 249 | "### Let's visualize our missing data.\n", 250 | "- Children is 100% observed.\n", 251 | "- Age is missing completely at random and is missing 30% of its observations.\n", 252 | "- Partnered is missing at random (its missingness depends on age) and is missing 20% of its observations.\n", 253 | "- Income is not missing at random (its missingness depends on income itself) and is missing 20% of its observations." 254 | ] 255 | }, 256 | { 257 | "cell_type": "code", 258 | "execution_count": null, 259 | "metadata": {}, 260 | "outputs": [], 261 | "source": [ 262 | "msno.matrix(df_missing);" 263 | ] 264 | }, 265 | { 266 | "cell_type": "markdown", 267 | "metadata": {}, 268 | "source": [ 269 | "### Generate histograms."
270 | ] 271 | }, 272 | { 273 | "cell_type": "code", 274 | "execution_count": null, 275 | "metadata": {}, 276 | "outputs": [], 277 | "source": [ 278 | "def compare_histograms(df, imputed_column, original_column, missingness_column, x_label, y_label = 'Frequency'):\n", 279 | " fig, (ax0, ax1) = plt.subplots(nrows = 2, ncols = 1, figsize = (16,9))\n", 280 | "\n", 281 | " # Set axes of histograms.\n", 282 | " mode = stats.mode(df[imputed_column])\n", 283 | " rnge = max(df[original_column]) - min(df[original_column])\n", 284 | " xmin = min(df[original_column]) - 0.02 * rnge\n", 285 | " xmax = max(df[original_column]) + 0.02 * rnge\n", 286 | " ymax = 1.3 * (mode[1][0] + df[df[original_column] == mode[0][0]].shape[0])\n", 287 | "\n", 288 | " ax0.set_xlim(xmin, xmax)\n", 289 | " ax0.set_ylim(0, ymax)\n", 290 | " ax1.set_xlim(xmin, xmax)\n", 291 | " ax1.set_ylim(0, ymax)\n", 292 | "\n", 293 | " # Set top labels.\n", 294 | " ax0.set_title('Real Histogram', position = (0,1), ha = 'left', fontsize = 25)\n", 295 | " ax0.set_xlabel(x_label, position = (0,0), ha = 'left', fontsize = 25, color = 'grey', alpha = 0.85)\n", 296 | " ax0.set_ylabel(y_label, position = (0,1), ha = 'right', va = 'top', fontsize = 25, rotation = 0, color = 'grey', alpha = 0.85)\n", 297 | " ax0.set_xticks([])\n", 298 | " ax0.set_yticks([])\n", 299 | "\n", 300 | " # Generate top histogram.\n", 301 | " ax0.hist(df[original_column], bins = 15, color = '#185fad', alpha = 0.75, label = '')\n", 302 | " ax0.axvline(np.mean(df[original_column]), color = '#185fad', lw = 5, label = 'True Mean')\n", 303 | " ax0.legend(prop={'size': 15}, loc = 1)\n", 304 | "\n", 305 | " # Set bottom labels.\n", 306 | " ax1.set_title('Observed + Imputed Histogram', position = (0,1), ha = 'left', fontsize = 25)\n", 307 | " ax1.set_xlabel(x_label, position = (0,0), ha = 'left', fontsize = 25, color = 'grey', alpha = 0.85)\n", 308 | " ax1.set_ylabel(y_label, position = (0,1), ha = 'right', va = 'top', fontsize = 25, rotation = 0, color = 
'grey', alpha = 0.85)\n", 309 | "\n", 310 | " # Generate bottom histogram.\n", 311 | " ax1.hist([df[imputed_column][df[missingness_column] == 0], df[imputed_column][df[missingness_column] == 1]], bins = 15, color = ['#185fad','orange'], alpha = 0.75, label = '', stacked = True)\n", 312 | " ax1.axvline(np.mean(df[original_column]), color = '#185fad', lw = 5, label = 'True Mean')\n", 313 | " ax1.axvline(np.mean(df[original_column][df[missingness_column] == 0]), color = 'grey', alpha = 0.5, lw = 5, label = 'Observed Mean')\n", 314 | " ax1.axvline(np.mean(df[imputed_column]), color = 'orange', lw = 5, label = 'Observed and Imputed Mean')\n", 315 | " ax1.legend(prop={'size': 15}, loc = 1)\n", 316 | " \n", 317 | " plt.tight_layout()\n", 318 | "\n", 319 | " plt.show();" 320 | ] 321 | }, 322 | { 323 | "cell_type": "markdown", 324 | "metadata": {}, 325 | "source": [ 326 | "### Examine various imputation methods." 327 | ] 328 | }, 329 | { 330 | "cell_type": "markdown", 331 | "metadata": {}, 332 | "source": [ 333 | "##### Mean Imputation" 334 | ] 335 | }, 336 | { 337 | "cell_type": "code", 338 | "execution_count": null, 339 | "metadata": {}, 340 | "outputs": [], 341 | "source": [ 342 | "def impute_mean(df, impute_column, missingness_column):\n", 343 | " \"\"\"\n", 344 | " Imputes the mean of the observed values for any value where data is marked missing.\n", 345 | " \n", 346 | " User must specify:\n", 347 | " df = the pandas DataFrame the user wants to read in for analysis\n", 348 | " impute_column = the name of the column in df that is missing\n", 349 | " missingness_column = the name of the missingness indicator column\n", 350 | " \n", 351 | " Function returns:\n", 352 | " mean_impute = a column with the observed mean imputed for any missing value.\n", 353 | " \"\"\"\n", 354 | " mean_impute = [df.loc[i,impute_column] if df.loc[i,missingness_column] == 0 else np.mean(df[impute_column][df[missingness_column] == 0]) for i in range(df.shape[0])]\n", 355 | " \n", 356 | " return mean_impute" 357 | ] 358 | }, 359 | { 360 | "cell_type": "code",
361 | "execution_count": null, 362 | "metadata": {}, 363 | "outputs": [], 364 | "source": [ 365 | "df['age_mean_imputed'] = impute_mean(df, 'age', 'age_missingness')" 366 | ] 367 | }, 368 | { 369 | "cell_type": "code", 370 | "execution_count": null, 371 | "metadata": {}, 372 | "outputs": [], 373 | "source": [ 374 | "compare_histograms(df = df,\n", 375 | " imputed_column = 'age_mean_imputed',\n", 376 | " original_column = 'age',\n", 377 | " missingness_column = 'age_missingness',\n", 378 | " x_label = 'Age',\n", 379 | " y_label = 'Frequency')" 380 | ] 381 | }, 382 | { 383 | "cell_type": "markdown", 384 | "metadata": {}, 385 | "source": [ 386 | "How to read the above chart:\n", 387 | "- The blue line is the true mean of all data (observed and unobserved).\n", 388 | "- The grey line is the mean of just the observed data. (i.e. no imputation)\n", 389 | "- The orange line is the mean of the observed and imputed data." 390 | ] 391 | }, 392 | { 393 | "cell_type": "markdown", 394 | "metadata": {}, 395 | "source": [ 396 | "$$\n", 397 | "\\begin{eqnarray*}\n", 398 | "s &=& \\sqrt{\\frac{\\sum_{i=1}^n(x_i - \\bar{x})^2}{n-1}} \\\\\n", 399 | "\\text{impute mean for values } k+1 \\text{ through } n \\Rightarrow s &=& \\sqrt{\\frac{\\sum_{i=1}^k(x_i - \\bar{x})^2}{n-1} + \\frac{\\sum_{i=k+1}^n(\\bar{x} - \\bar{x})^2}{n-1}} \\\\\n", 400 | "&=& \\sqrt{\\frac{\\sum_{i=1}^k(x_i - \\bar{x})^2}{n-1}} \\\\\n", 401 | "&\\Rightarrow& \\text{the denominator increases but numerator remains fixed} \\\\\n", 402 | "&\\Rightarrow& \\text{the sample standard deviation is underestimated} \\\\\n", 403 | "&\\Rightarrow& \\text{confidence intervals relying on the mean are narrower than they should be}\n", 404 | "\\end{eqnarray*}\n", 405 | "$$" 406 | ] 407 | }, 408 | { 409 | "cell_type": "markdown", 410 | "metadata": {}, 411 | "source": [ 412 | "##### Median Imputation" 413 | ] 414 | }, 415 | { 416 | "cell_type": "code", 417 | "execution_count": null, 418 | "metadata": {}, 419 | "outputs": [], 420 | 
"source": [ 421 | "def impute_median(df, impute_column, missingness_column):\n", 422 | " \"\"\"\n", 423 | " Imputes the median of the observed values for any value where data is marked missing.\n", 424 | " \n", 425 | " User must specify:\n", 426 | " df = the pandas DataFrame the user wants to read in for analysis\n", 427 | " impute_column = the name of the column in df that is missing\n", 428 | " missingness_column = the name of the missingness indicator column\n", 429 | " \n", 430 | " Function returns:\n", 431 | " median_impute = a column with the observed median imputed for any missing value.\n", 432 | " \"\"\"\n", 433 | " median_impute = [df.loc[i,impute_column] if df.loc[i,missingness_column] == 0 else np.median(df[impute_column][df[missingness_column] == 0]) for i in range(df.shape[0])]\n", 434 | " \n", 435 | " return median_impute" 436 | ] 437 | }, 438 | { 439 | "cell_type": "code", 440 | "execution_count": null, 441 | "metadata": {}, 442 | "outputs": [], 443 | "source": [ 444 | "df['age_median_imputed'] = impute_median(df, 'age', 'age_missingness')" 445 | ] 446 | }, 447 | { 448 | "cell_type": "code", 449 | "execution_count": null, 450 | "metadata": {}, 451 | "outputs": [], 452 | "source": [ 453 | "compare_histograms(df = df,\n", 454 | " imputed_column = 'age_median_imputed',\n", 455 | " original_column = 'age',\n", 456 | " missingness_column = 'age_missingness',\n", 457 | " x_label = 'Age',\n", 458 | " y_label = 'Frequency')" 459 | ] 460 | }, 461 | { 462 | "cell_type": "markdown", 463 | "metadata": {}, 464 | "source": [ 465 | "##### Mode Imputation" 466 | ] 467 | }, 468 | { 469 | "cell_type": "code", 470 | "execution_count": null, 471 | "metadata": {}, 472 | "outputs": [], 473 | "source": [ 474 | "def impute_mode(df, impute_column, missingness_column):\n", 475 | " \"\"\"\n", 476 | " Imputes mode for any value where data is marked missing.\n", 477 | " \n", 478 | " User must specify:\n", 479 | " df = the pandas DataFrame the user wants to read in for analysis\n", 480 | " impute_column = the name of the column in df that is
missing\n", 481 | " missingness_column = the name of the missingness indicator column\n", 482 | " \n", 483 | " Function returns:\n", 484 | " mode_impute = a column with the mode of the observed values imputed for any missing value.\n", 485 | " \"\"\"\n", 486 | " mode_impute = [df.loc[i,impute_column] if df.loc[i,missingness_column] == 0 else stats.mode(df[impute_column][df[missingness_column] == 0])[0][0] for i in range(df.shape[0])]\n", 487 | " \n", 488 | " return mode_impute" 489 | ] 490 | }, 491 | { 492 | "cell_type": "code", 493 | "execution_count": null, 494 | "metadata": {}, 495 | "outputs": [], 496 | "source": [ 497 | "df['age_mode_imputed'] = impute_mode(df, 'age', 'age_missingness')" 498 | ] 499 | }, 500 | { 501 | "cell_type": "code", 502 | "execution_count": null, 503 | "metadata": {}, 504 | "outputs": [], 505 | "source": [ 506 | "compare_histograms(df = df,\n", 507 | " imputed_column = 'age_mode_imputed',\n", 508 | " original_column = 'age',\n", 509 | " missingness_column = 'age_missingness',\n", 510 | " x_label = 'Age',\n", 511 | " y_label = 'Frequency')" 512 | ] 513 | }, 514 | { 515 | "cell_type": "markdown", 516 | "metadata": {}, 517 | "source": [ 518 | "##### Regression Imputation" 519 | ] 520 | }, 521 | { 522 | "cell_type": "code", 523 | "execution_count": null, 524 | "metadata": {}, 525 | "outputs": [], 526 | "source": [ 527 | "def regression_imputation(df, impute_column, X_columns, missingness_column, regression = 'linear'):\n", 528 | " \"\"\"\n", 529 | " Fits regression line to observed data, then imputes regression prediction\n", 530 | " for any value where data is marked missing.\n", 531 | " \n", 532 | " User must specify:\n", 533 | " df = the pandas DataFrame the user wants to read in for analysis\n", 534 | " impute_column = the name of the column in df that is missing\n", 535 | " X_columns = the names of the columns used as independent variables\n", 536 | " to impute the missing value\n", 537 | " missingness_column = the name of the missingness indicator column\n", 538 | " regression = the
type of regression to run; only supports 'linear'\n", 539 | " for LinearRegression and 'logistic' for LogisticRegression\n", 540 | " \n", 541 | " Function returns:\n", 542 | " regression_impute = a column with the regression value imputed for any missing value.\n", 543 | " \n", 544 | " NOTE: Only set up to do linear or logistic regression.\n", 545 | " \"\"\"\n", 546 | " \n", 547 | " if regression == 'linear':\n", 548 | " model = LinearRegression()\n", 549 | " elif regression == 'logistic':\n", 550 | " model = LogisticRegression()\n", 551 | " \n", 552 | " model.fit(df.loc[df[missingness_column] == 0, X_columns], df.loc[df[missingness_column] == 0, impute_column])\n", 553 | " \n", 554 | " regression_impute = [df.loc[i,impute_column] if df.loc[i,missingness_column] == 0\n", 555 | " else model.predict(df.loc[[i], X_columns])[0] \n", 556 | " for i in range(df.shape[0])]\n", 557 | " \n", 558 | " return regression_impute" 559 | ] 560 | }, 561 | { 562 | "cell_type": "code", 563 | "execution_count": null, 564 | "metadata": {}, 565 | "outputs": [], 566 | "source": [ 567 | "df['age_regression_imputed'] = regression_imputation(df, 'age', ['children', 'partnered', 'income'], 'age_missingness')" 568 | ] 569 | }, 570 | { 571 | "cell_type": "code", 572 | "execution_count": null, 573 | "metadata": {}, 574 | "outputs": [], 575 | "source": [ 576 | "compare_histograms(df = df,\n", 577 | " imputed_column = 'age_regression_imputed',\n", 578 | " original_column = 'age',\n", 579 | " missingness_column = 'age_missingness',\n", 580 | " x_label = 'Age',\n", 581 | " y_label = 'Frequency')" 582 | ] 583 | }, 584 | { 585 | "cell_type": "code", 586 | "execution_count": null, 587 | "metadata": {}, 588 | "outputs": [], 589 | "source": [ 590 | "np.std(df['age_regression_imputed'], ddof = 1)" 591 | ] 592 | }, 593 | { 594 | "cell_type": "code", 595 | "execution_count": null, 596 | "metadata": {}, 597 | "outputs": [], 598 | "source": [ 599 | "np.std(df['age'], ddof = 1)" 600 | ] 601 | }, 602 | { 603 | "cell_type": "markdown", 604 |
"metadata": {}, 605 | "source": [ 606 | "### Work in progress:" 607 | ] 608 | }, 609 | { 610 | "cell_type": "code", 611 | "execution_count": null, 612 | "metadata": {}, 613 | "outputs": [], 614 | "source": [ 615 | "def compare_scatterplots(df, imputed_column, original_X_column, original_Y_column, missingness_column, x_label, y_label):\n", 616 | " fig, (ax0, ax1) = plt.subplots(nrows = 1, ncols = 2, figsize = (20,8))\n", 617 | "\n", 618 | " # Set axes of scatterplots.\n", 619 | " x_rnge = max(df[original_X_column]) - min(df[original_X_column])\n", 620 | " xmin = min(df[original_X_column]) - 0.1 * x_rnge\n", 621 | " xmax = max(df[original_X_column]) + 0.1 * x_rnge\n", 622 | " y_rnge = max(df[original_Y_column]) - min(df[original_Y_column])\n", 623 | " ymin = min(df[original_Y_column]) - 0.1 * y_rnge\n", 624 | " ymax = max(df[original_Y_column]) + 0.1 * y_rnge\n", 625 | "\n", 626 | " ax0.set_xlim(xmin, xmax)\n", 627 | " ax0.set_ylim(ymin, ymax)\n", 628 | " ax1.set_xlim(xmin, xmax)\n", 629 | " ax1.set_ylim(ymin, ymax)\n", 630 | "\n", 631 | " # Set left labels.\n", 632 | " ax0.set_title('Real Scatterplot', position = (0,1), ha = 'left', fontsize = 25)\n", 633 | " ax0.set_xlabel(x_label, position = (0,0), ha = 'left', fontsize = 25, color = 'grey', alpha = 0.85)\n", 634 | " ax0.set_ylabel(y_label, position = (0,1), ha = 'right', va = 'top', fontsize = 25, rotation = 0, color = 'grey', alpha = 0.85)\n", 635 | " ax0.set_xticks([])\n", 636 | " ax0.set_yticks([])\n", 637 | "\n", 638 | " # Generate left scatterplot.\n", 639 | " ax0.scatter(df[original_X_column], df[original_Y_column], color = '#185fad', alpha = 0.5, label = 'True Values')\n", 640 | " ax0.legend(prop={'size': 15}, loc = 1)\n", 641 | " \n", 642 | " # Set right labels.\n", 643 | " ax1.set_title('Observed + Imputed Scatterplot', position = (0,1), ha = 'left', fontsize = 25)\n", 644 | " ax1.set_xlabel(x_label, position = (0,0), ha = 'left', fontsize = 25, color = 'grey', alpha = 0.85)\n", 645 | " 
ax1.set_ylabel(y_label, position = (0,1), ha = 'right', va = 'top', fontsize = 25, rotation = 0, color = 'grey', alpha = 0.85)\n", 646 | " ax1.set_xticks([])\n", 647 | " ax1.set_yticks([])\n", 648 | "\n", 649 | " # Generate right scatterplot: imputed values where data are missing, observed values elsewhere.\n", 650 | " ax1.scatter(df[original_X_column][df[missingness_column] == 1], df[imputed_column][df[missingness_column] == 1], color = 'orange', alpha = 0.5, label = 'Imputed Values')\n", 651 | " ax1.scatter(df[original_X_column][df[missingness_column] == 0], df[imputed_column][df[missingness_column] == 0], color = '#185fad', alpha = 0.5, label = 'Observed Values')\n", 652 | "\n", 653 | " ax1.legend(prop={'size': 15}, loc = 1)\n", 654 | " \n", 655 | " plt.show();" 656 | ] 657 | }, 658 | { 659 | "cell_type": "code", 660 | "execution_count": null, 661 | "metadata": {}, 662 | "outputs": [], 663 | "source": [ 664 | "compare_scatterplots(df = df,\n", 665 | " imputed_column = 'age_regression_imputed',\n", 666 | " original_X_column = 'children',\n", 667 | " original_Y_column = 'age',\n", 668 | " missingness_column = 'age_missingness',\n", 669 | " x_label = 'Children',\n", 670 | " y_label = 'Age')" 671 | ] 672 | } 673 | ], 674 | "metadata": { 675 | "kernelspec": { 676 | "display_name": "Python 3", 677 | "language": "python", 678 | "name": "python3" 679 | }, 680 | "language_info": { 681 | "codemirror_mode": { 682 | "name": "ipython", 683 | "version": 3 684 | }, 685 | "file_extension": ".py", 686 | "mimetype": "text/x-python", 687 | "name": "python", 688 | "nbconvert_exporter": "python", 689 | "pygments_lexer": "ipython3", 690 | "version": "3.8.3" 691 | } 692 | }, 693 | "nbformat": 4, 694 | "nbformat_minor": 2 695 | } 696 | -------------------------------------------------------------------------------- /solutions/00_interactive_plot.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8
| "source": [ 9 | "#### To confirm that you have the latest versions of these packages, uncomment and run the following command.\n", 10 | "# !pip install numpy pandas matplotlib scikit-learn ipywidgets IPython missingno --upgrade\n", 11 | "\n", 12 | "# To generate and store data.\n", 13 | "import numpy as np\n", 14 | "import pandas as pd\n", 15 | "\n", 16 | "# To visualize data.\n", 17 | "import matplotlib.pyplot as plt\n", 18 | "\n", 19 | "# To fit linear regression model.\n", 20 | "from sklearn.linear_model import LinearRegression\n", 21 | "\n", 22 | "# To allow interactive plot.\n", 23 | "from ipywidgets import *\n", 24 | "from IPython.display import display\n", 25 | "\n", 26 | "# There is a SciPy issue that won't affect our work, but a warning exists\n", 27 | "# and an update is not imminent.\n", 28 | "import warnings\n", 29 | "warnings.filterwarnings(action=\"ignore\")\n", 30 | "\n", 31 | "# To render plots in the notebook.\n", 32 | "%matplotlib inline" 33 | ] 34 | }, 35 | { 36 | "cell_type": "code", 37 | "execution_count": 2, 38 | "metadata": {}, 39 | "outputs": [], 40 | "source": [ 41 | "# Generate data and store in a dataframe.\n", 42 | "\n", 43 | "np.random.seed(42)\n", 44 | "\n", 45 | "age = np.random.uniform(20, 60, size = 100)\n", 46 | "income = 15000 + 750 * age + np.random.normal(0, 20000, size = 100)\n", 47 | "income = [i if i >= 0 else 0 for i in income]\n", 48 | "\n", 49 | "df = pd.DataFrame({'income':income,\n", 50 | " 'age': age})" 51 | ] 52 | }, 53 | { 54 | "cell_type": "code", 55 | "execution_count": 3, 56 | "metadata": {}, 57 | "outputs": [], 58 | "source": [ 59 | "# Create three functions to model missingness according to certain patterns.\n", 60 | "\n", 61 | "def create_mcar_column(df, missing_column = 'income', p_missing = 0.01, random_state = 42):\n", 62 | " \"\"\"\n", 63 | " Creates missingness indicator column, where data are MCAR (missing completely at random).\n", 64 | " \n", 65 | " User must specify:\n", 66 | " df = the pandas DataFrame
the user wants to read in for analysis\n", 67 | " column = the name of the column in df that is missing\n", 68 | " p_missing = the proportion of observations that are missing\n", 69 | " \n", 70 | " Function returns:\n", 71 | " mcar_column = a column that indicates whether data are missing, assuming MCAR\n", 72 | " \"\"\"\n", 73 | " np.random.seed(random_state)\n", 74 | " \n", 75 | " mcar_indices = [df.sample(n = 1).index[0] for i in range(round(p_missing * df.shape[0]))]\n", 76 | " \n", 77 | " while len(set(mcar_indices)) < round(p_missing * df.shape[0]):\n", 78 | " mcar_indices.append(df.sample(n = 1).index[0])\n", 79 | " \n", 80 | " mcar_column = [1 if i in mcar_indices else 0 for i in range(df.shape[0])]\n", 81 | " \n", 82 | " return mcar_column\n", 83 | "\n", 84 | "def create_mar_column(df, missing_column = 'income', depends_on = 'age', method = 'linear', p_missing = 0.01, random_state = 42):\n", 85 | " \"\"\"\n", 86 | " Creates missingness indicator column, where data are MAR (missing at random).\n", 87 | " \n", 88 | " User must specify:\n", 89 | " df = the pandas DataFrame the user wants to read in for analysis\n", 90 | " missing_column = the name of the column in df that is missing\n", 91 | " depends_on = the name of the column in df which affects the missingness\n", 92 | " method = 'linear' or 'quadratic'\n", 93 | " - 'linear' means the probability of missingness is linearly related to the depends_on variable\n", 94 | " - 'quadratic' means the probability of missingness is quadratically related to the depends_on variable\n", 95 | " p_missing = the proportion of observations that are missing\n", 96 | " \n", 97 | " Function returns:\n", 98 | " mar_column = a column that indicates whether data are missing, assuming MAR\n", 99 | " \"\"\"\n", 100 | " np.random.seed(random_state)\n", 101 | " \n", 102 | " if method == 'linear':\n", 103 | " mar_indices = [df.sample(n = 1, weights = df[depends_on] ** -1).index[0] for i in range(round(p_missing * df.shape[0]))]\n", 
104 | "\n", 105 | " while len(set(mar_indices)) < round(p_missing * df.shape[0]):\n", 106 | " mar_indices.append(df.sample(n = 1, weights = df[depends_on] ** -1).index[0])\n", 107 | " \n", 108 | " elif method == 'quadratic':\n", 109 | " mar_indices = [df.sample(n = 1, weights = df[depends_on] ** -2).index[0] for i in range(round(p_missing * df.shape[0]))]\n", 110 | "\n", 111 | " while len(set(mar_indices)) < round(p_missing * df.shape[0]):\n", 112 | " mar_indices.append(df.sample(n = 1, weights = df[depends_on] ** -2).index[0])\n", 113 | "\n", 114 | " mar_column = [1 if i in mar_indices else 0 for i in range(df.shape[0])]\n", 115 | " \n", 116 | " return mar_column\n", 117 | "\n", 118 | "def create_nmar_column(df, missing_column = 'income', method = 'linear', p_missing = 0.01, random_state = 42):\n", 119 | " \"\"\"\n", 120 | " Creates missingness indicator column, where data are NMAR (not missing at random).\n", 121 | " \n", 122 | " User must specify:\n", 123 | " df = the pandas DataFrame the user wants to read in for analysis\n", 124 | " missing_column = the name of the column in df that is missing\n", 125 | " method = 'linear' or 'quadratic'\n", 126 | " - 'linear' means the probability of missingness is linearly related to the depends_on variable\n", 127 | " - 'quadratic' means the probability of missingness is quadratically related to the depends_on variable\n", 128 | " p_missing = the proportion of observations that are missing\n", 129 | " \n", 130 | " Function returns:\n", 131 | " nmar_column = a column that indicates whether data are missing, assuming NMAR\n", 132 | " \"\"\"\n", 133 | " np.random.seed(random_state)\n", 134 | " \n", 135 | " if method == 'linear':\n", 136 | " nmar_indices = [df.sample(n = 1, weights = df[missing_column] ** -1).index[0] for i in range(round(p_missing * df.shape[0]))]\n", 137 | "\n", 138 | " while len(set(nmar_indices)) < round(p_missing * df.shape[0]):\n", 139 | " nmar_indices.append(df.sample(n = 1, weights = df[missing_column] 
** -1).index[0])\n", 140 | " \n", 141 | " elif method == 'quadratic':\n", 142 | " nmar_indices = [df.sample(n = 1, weights = df[missing_column] ** -2).index[0] for i in range(round(p_missing * df.shape[0]))]\n", 143 | "\n", 144 | " while len(set(nmar_indices)) < round(p_missing * df.shape[0]):\n", 145 | " nmar_indices.append(df.sample(n = 1, weights = df[missing_column] ** -2).index[0])\n", 146 | " \n", 147 | " nmar_column = [1 if i in nmar_indices else 0 for i in range(df.shape[0])]\n", 148 | " \n", 149 | " return nmar_column" 150 | ] 151 | }, 152 | { 153 | "cell_type": "code", 154 | "execution_count": 4, 155 | "metadata": {}, 156 | "outputs": [], 157 | "source": [ 158 | "def generate_scatterplot(p_missing, missing_type, method = 'linear', missing_column = 'income', depends_on = 'age'):\n", 159 | " # Generate one plot.\n", 160 | " fig, ax = plt.subplots(nrows = 1, ncols = 1, figsize = (16,9))\n", 161 | "\n", 162 | " # Set labels and axes.\n", 163 | " ax.set_xlabel(\"Age\", position = (0,0), ha = 'left', fontsize = 25, color = 'grey', alpha = 0.85)\n", 164 | " ax.set_ylabel(\"Income\", position = (0,1), ha = 'right', va = 'top', fontsize = 25, rotation = 0, color = 'grey', alpha = 0.85)\n", 165 | " \n", 166 | " ax.set_ylim([-1000, 100000])\n", 167 | " \n", 168 | " # Generate data with proportion p missing.\n", 169 | " if missing_type == 'MCAR':\n", 170 | " df['missingness'] = create_mcar_column(df,\n", 171 | " missing_column = missing_column,\n", 172 | " p_missing = p_missing)\n", 173 | " elif missing_type == 'MAR':\n", 174 | " df['missingness'] = create_mar_column(df,\n", 175 | " missing_column = missing_column,\n", 176 | " depends_on = depends_on,\n", 177 | " method = method,\n", 178 | " p_missing = p_missing)\n", 179 | " \n", 180 | " elif missing_type == 'NMAR':\n", 181 | " df['missingness'] = create_nmar_column(df,\n", 182 | " missing_column = missing_column,\n", 183 | " method = method,\n", 184 | " p_missing = p_missing)\n", 185 | " \n", 186 | " # Generate 
scatterplot.\n", 187 | " ax.scatter(df['age'][df['missingness'] == 0], df['income'][df['missingness'] == 0], s = 35, color = '#185fad', alpha = 0.75, label = 'Observed')\n", 188 | " ax.scatter(df['age'][df['missingness'] == 1], df['income'][df['missingness'] == 1], s = 35, color = 'grey', alpha = 0.25, label = '')\n", 189 | " \n", 190 | " # Plot the true line and the line of best fit using observed values only.\n", 191 | " x = np.linspace(20, 60)\n", 192 | " ax.plot(x, 15000 + 750 * x, c = 'orange', alpha = 0.7, label = '\"True\" Line', lw = 3)\n", 193 | " model = LinearRegression().fit(df[['age']][df['missingness'] == 0], df['income'][df['missingness'] == 0])\n", 194 | " ax.plot(x, model.intercept_ + model.coef_ * x, c = '#185fad', alpha = 0.7, label='Observed Line', lw = 3)\n", 195 | "\n", 196 | " # Generate title and legend.\n", 197 | " ax.set_title(f'Type of Missing Data: {missing_type} \\nProportion Missing: {p_missing}', position = (0,1), ha = 'left', fontsize = 25)\n", 198 | " ax.legend(prop={'size': 20}, loc = 2)\n", 199 | " \n", 200 | " ax.set_xticks([])\n", 201 | " ax.set_yticks([])\n", 202 | " plt.show();" 203 | ] 204 | }, 205 | { 206 | "cell_type": "code", 207 | "execution_count": 5, 208 | "metadata": {}, 209 | "outputs": [ 210 | { 211 | "data": { 212 | "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAA+sAAAJICAYAAAAZ/3W9AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAACVfklEQVR4nOzdfXzbZb3/8feVNL1Jum7tRrfB2NrdcCOgODbl7hxgCHrwFkEOgscNBY43yI3H6Q8VSkWPeFBRBNSBUJSb4wGUGw8CU24UBdwYggJHt7kOxqBlbbeuSdOmyfX74/tNmrRpm7ZJkyav5+ORR5rv7ZU0bfL5Xp/rcxlrrQAAAAAAQOHw5LsBAAAAAAAgFcE6AAAAAAAFhmAdAAAAAIACQ7AOAAAAAECBIVgHAAAAAKDAEKwDAAAAAFBgCNZRMowxdcaYa40xW40xfcYY695m5bFNa9w2tOaxDQ1Jr0VDvtoBAAAAYBDBegFLCqAmcluT7/YXEmOMV9JvJX1O0mJJ/ZLa3Fssw2O0Jr2+XcaYyjG2n2eMiSTt8/jkngWGMsYcP8L7P2SMecMY87wx5qfGmE8bY+py3JaLjTFXGGMOz+V5JiPN6/WjDPZZO57/LcaYCmPMJ4wxdxlj/mGM2eteHHvdGPNbY8xXjTGNGbb3maTznpfB9g0m/fshaozZbYzZaIz5ljFmYSbnBwAAyCeC9cLWNsItmME2vVPa0sJ3kqTDJUUk/ZO1doa1dp57657A8WZJOnWMbVZLKhtjmz2S/iZp6wTakC0Rtw1/c3+erro0+P4PSZot6a2S/k3SDZJ2upkVgRyd/2JJTXLeZ9PFmcaYqjG2OSfTgxlj3ifpH5J+Iul0SY2SfHJ+H3MlrZJ0paTNxpgbxjjWoZLekbToE5m2w9WtwffDbkkzJR0h6YuSXjTG/Ms4jwcAADClCNYLWFIwmXKT9O2xtrHW/jyPTS9Eh7n3L1hrn5zksVrd+7GCmDVDth/GWvtLa+1B1toTJ9mmCbPWvua24SBr7Wv5akcWfDjp/T/HWuuTEyz+m6Q/SqqQk1nxjDGmNp8NLRCtcgLYES86GWOOlHSwRnkPJ23775Luk7SvpFclfVbSQmttpbW2Vs7r/8+Srpc0IOmsMQ75Sfe+RdJeSUcaY94yVjuSXJT0fpgtKSDnb3a3pGpJd+Y62wIAAGAyCNZRKvzufU8WjnWPnOyGE0dKpzXGHC3pIEnbJP0uC+fEBFhrW621t1lrj5H0eXfxIZL+O4/NKhS3uvej9VjH17WMdiBjzDGSrpPzmfI7SYdZa2+w1r4a38ZaG7HW/t5ae4GkAySNeNHMGFMu6WPuwx/J+ZuTBgP4cbPWhqy1LZIudBfNlNP7DwAAUJAI1ouIMeYqd3zmi2NsV2OM6Rk6/tQdb5sYW22MOcMY84QxptMYEzTGPGuMucAd/z3a8fcxxnzdGPOcMWaPMSbsjl39iTHmkEk+x5nGmMuNMZuMMd3GmF5jzGZjzA+NMYvTbN9ijLGSrnAXHTdkLOsVQ/fJQI+ku+T8/aweYZvkIMeO8nxGLTBnjHmnMeZ2Y8w293UMGmO2u7+Xy4wxC9Lsc5AxZp0x5u/u2O2wMeZVY8zTxpj/NMYcNGT7EQvMJY9xdh8vNcbc7B6vzxizwxhzozFmv5Geo7vfYcaYnxtnHHn8/fADY0z90HPkirX2Gjm9upJ0sjFmWDaDMeZQ9+/gUeMUIux132fPue/pOWn2ucJt+yJ30S1Dx0xP9hw5cpec9/IqY8yioSuNMX5J/yrn/Xvr0PVDfEfOkI92SadZa/eMtrG19hVJHxhlkw9KmiPpb9baZ5LO/zFjjG+MtozloaSfJ/X/CAAAIJcI1ovLj+V8sX6LMebYUbY7S05K6B5JadPljTHfctf9k7uoUtJyST+Q9L/GmIoR9nuXpL9L+oqcsbtVclJeG+UEsJuMMR8f17MaPPYhkv4qqVnS2+WMhY1IWirpU5JeMsacNmS3PUo
d5x9R6tj+ifa03+LerzHGmCHt9Es6Q07hurGCnBEZY1ZLekrO76vBXTwgaaGcdOKvSXrXkH1OkvRnSedJWiYngOqVtEDSOyVdKunMCbbnBEnPyUklninn/8d+ks6V9KeRAnZjzKmSnpXzmsyV8zuYL+kCt60N6fbLkW9ocFx+ugstv5Iz7vwEOencITkp04fLeU//2Rhz4JB9epRaqDB5rHT8NtlzSBpWIG7NmM92dEE5AbtR+tfiNEk1kh6z1raOdBBjzEo57y1J+oG1dlcmJ7fWjlbYMd6D/lP3/glJ2yXVS3p/JscfRfLf66gXHgEAAPKJYL2IWGu3SXrYfTha5eT4up9Za9MVojtcThGm6yTNtdbWSaqVdJmciwHvlvTNoTsZYw6TdL+c4ms3SnqLpCprbbWcXscbJJVL+okxZsV4npsxZoakB+QEna9Jeq+kgLW2xm3v03LGxN5ujHlbfD9r7UVDxvn/ccjY/m9rAqy1v5O0RU5l+eOGrP6IpBmSHrXWbp/I8d2A/wdyAovbJC11x/7OlBPYrZB0tZyezGQ/lPM6PCInFbncHS9cJelQOUFi60TaJCcV+VFJB7uve0BOz+teOUFnuvfEYrf9PkmbJK2w1s6QMyzhJDlV+b87wfaMm7X2dTkXHKThvzfJCQrXSFpkra1yxzpXyrko8ic5FyfuGHLMb7vvsXjK90V2eJ2JSZ0jh25274dddNJgdsjNGl1yhsIvJ9sgY8z+ct4bVs57R9Zaq8HAfcKp8K73JP38j0keCwAAIGcI1otPfCqmj5g084cbY46Q00MuSetGOMZMOYH856y1b0qStbbbWvt1OT2TkvQ5Y8y+Q/b7npyg8JvW2vOttS9ba6Pu/q9Yaz8r6Vo5vb1fHefz+oyc3vmIpPdYax+M98xZa5+XdLKcILQiqY251uLeDy00Fw9ybtHEHSon4A9KOsdam6gWb60NWmuftdZ+0Vr7YHy5MaZe0hL34Rpr7V+T9glba1+01n7NHbc7EX+WdKq19v/cY/Zba/9HTm+wJJ1ujBla/f7LcgLzdkknWWufdfe11trfyLnw49fUet69Xzi0vdba1dbaW9007fiyfmvtb+UEpW2Slo+RuTKqqTjHONrypKTNcv62jo8vdy+yHCcnM+UXYxwmnkreJ+nlLDTrHDmfTY8lv0YaDNbfneZ/z5iMMX43W+X77qI+SXdOqqUAAAA5RLBefH4laYecoPnf0qyP96o/Za39yyjH+doIy6+Wk1ZdJidNVpIz7lnOtEwDSqpWn0b8C/e7zBhj34f4V/f+7uQgNM5au1fSf7kP/8UYM3Mcx56oW+WkPp/u9vzLGLNEztCB3Ro7yBnNbve+XM4UZJnYq8FU7PmTOPdI/nOE1OX73PsqOan3kiS3pzb+HvmhtbZz6I7W2r9J+p9sN3QMye3IuBq4tbZHTq+4JOUkkB7rHNbax621xr21ZOm08YtKyRed1sjJ6vjvEbJvksXfn7vHSG0fk/ueibfjp8nrrLVb5FT192pwpoXRfN+tkfCGMaZDzoWvFjmZPxFJq91MCwAAgIJEsF5k3J7sG92HKanwxplfOj5d0ki96pL0qvvFON3xu+WMP5acVOy4Y9x7j5yx42+ku2mwuFNAGQahxqkM/Vb34W9G2XR9UhuWj7JdVlhrd7jnjBfikpxAIx7khCdx+K2S/k9O+vgzxpgvGWMOH+0ChxtU/dZ9+JAx5mvGKVBXPol2JHtmhOU7k35ODn4XywmMpMEANJ3HJ96k7DPGvM8thvcP4xT0Sy4Ud4a72bDCfoV2jnH4qaSopNOMU3wyuXDiZLJDJmKVnBoGQQ1WgE8WrwGRydzvNXJqJMxV6vvyFUlvtUxvCQAAChzBenG6SU4P92HGmSc57kw5qdW7NUJhOddYc23H19cnLYunpXo0+AU53S250nWm6c91GiwENVrbdiT9XD/iVtkVH897jhvkfHzI8glxL7qcKWfqt0WSrpIz1rrbGLP
eGPNpd1z7UOfKSfPeR06Ngacl7TXGPGmMWWsmMa+0m72QbvlA0sPkSt37JP2cHNAPNdVzuye/BoledmOMxxhzh5zaCGfISQ0vl9SlwUJx8QswgYmceCrOMV7W2tfk1DiIX3Q6UU4Rw5fcSuxj6XDvZ7l/A5MRH4/+SzfLYKj/kfP6LDXGpKs5kOyceBaCnKE9J0j6g5zndosxpnqSbQUAAMgpgvUiZK3dKafQmySdn7Qq3tN+WwapreMVD6bbktJ0x7q1ZrkN+XCfnEDraEmflbS/pBettRsme2B3LP5BclLJ18mphF8lpxDZDZL+zy3ql7zPK3KyCt4jpz7As3L+zo+RM0xgizFm1WTbNgE5nZZtnOIFCLcPudDwSUkfldPL/DU5Kf0V1tq6pEJxd7vbDi3GlqmpOMdEJKfCj7fmQnyqyApJB0+0AcaYWkmnug8/NnT6OzfroEtOMT5pHIXm3Jobj8upbfGipCPlFNAEAAAoWATrxSteaO4MN7X1MA1Or/TjMfYddc7spPXJlcjfcO/nuOn22dQpJ7iRRk8NTl43tEp6Tlhr+zRYuTs+Vj9rqcNu4bFfWGv/3Vp7mJze6k/JeU32V5qp4ay1MWvtw9aphL9CTk/y2XLSf2sl3ZHF1PjRvJn082gFwcZ6v2WNMWa+nGn/pOHp9/Ep7W6y1jZZa7ekGYM9tLL7eE3FOSbifjnvqaPkXBwakPSzDPf9bdLPp4641djO1mAgnonTjTE14zmBtTYk6XPuw9XGmKPHsz8AAMBUIlgvXr+RM7VYQM6X4OTCcsMKtA2xv1sobRi3kNoR7sONSav+4N57Jf3LhFo8Amttv6QX3IcnjrJpfM7xmJxpwqZKPOW9XOMLcsbNWtthrf2xpC+5i95ujBl17L+1dq+19g4N9kTOlXTYKLtkyz80WCjv+FG2G21dtn1Fg6n6LUPW7e/eP6c03LTpd6Zb54oH3aP1iE/2HDnhXnS63X3ok/SgtXbo/PAj7btBzpRzknSBMWbOaNvHpUmZj78/vy9nuM5It5lyLgRVyclSGBdr7WMarKFw1Xj3BwAAmCoE60XKWms12IP+GUkfc38erbBcsstGWP4fcr4kDyipAJS1drMGeyq/MVY19gmMnf5v9/50Y8yhaY5XLWdueMkJNPaM8/gTZq3dJGf+8u9IusRaO+lefWNMxRibJA9jiLn7jNVbPmyfXHLfg/GK+J9y05xTGGOWabCgWk4ZYy6WM1RBkh5y06KTxd8zb1N6l8kJFkfS7d7PGmWbyZ4jl66T8x7+jqRvjnPfL8jJfpkr6Z4M/v4XSLo36fFySYe7D++01vaMcuvW4PtqonOux6d3/CdjzEkTPAYAAEBOEawXt1vkzCV8qJz0590avbBc3B45KaLfj/eSGWNmGGO+LOlyd5vr3bHxyT4nqUfSAZKeNsZ80BiTSGs1xuxnjPk3Y8xvJX1rnM/lh3KKrfkk/doY8y/xnjk3xf9hOcW6+jT+OdwnzTrzl3/BWputcbBnGmP+YIz5d3fOa0mSMcZrjHm3BnsEn7LWdrk/H22MecEYc4kx5uCk18e46b4/dLfbocFMhVz7ppyLBHMlPWKMeXtSm1bJ+b2FcnVyY8xCY8zZxpgnJV3jLv6L0vfIxmcqOM8Yc3784ocxZp4x5ho5F4M60uwXF89YOT3dhYlsnMMYc3zSGO41o7Rl3Ky1f3ffw1+w1j49zn1/L+kiObUJ/lnSC8aYz7hBebztPmPM0caY70n6u7tdXDzo3p5hUbv4dH8r0128y6C96yXF60qMNE0lAABAXhGsFzFrbYeku5IWZVpY7s9yipFdKKndGNMpp7DTN+Sk+P5G0v9Lc76/yils9oacwmj3SuoxxuwyxoTkBIk/lTM903ify15JH5BTOXyBpAclBY0xe+QEnkfLCdQ/5hZmm+6MnOf0I0lbjTFhY8wuSf1yAr4Fciqsf2LIfodJ+q6klyQl7/MHd123pLPcavM5504B+HE
5mRgrJG0yxnTLuajzWzlDBz7vbt43ydP9wgxOE7jLGNMvabuk2+QU2OuTk2J9pLV2d5r9vyNnurwyOVkpvcaYLjmv88Xusl+Ncv51coLVoyW9aYzZaYxpNca0ZvEcBctae72cMeuvy6m4fr2kV40xve7/kD4578OL5Fx0u0WS3At68Skl7x563BE8ocG6FBPtXf9P9/5IY8x7J3gMAACAnCFYL37JwfpYheUSrLVfklMM60k5gWO/nCD+IknvGWkOcWvtH+T0rH9B0u/k9ObPkpMi+7KcwOlsOYHJuLgXAw6RdIXblgE5Fai3yglqD7HWZvplv9DdLyfIvUXOVGx75IzV3StnfPBlcp7v/yXts0FOSvkP5VSB3yVnrumwBi/AHOz2gk4Z93eyQs578U05v7M2OYHz2zWYGr57kqeq1eAUgQE5F5hekFND4NOS5ltrL3aLjKVr5245gfb3JLXKec8OyBne8VFr7adGO7m19neS3ivnYtZutx2L3FtWzlHorLX3SVosp0bGLzT4HANyguvfyqkbsNhae4m722kaHDrwP8qAe7Epngr/sQkWTLxPg5Xs6V0HAAAFxzjDSlGsjDE/kHSBnHTpUSsfG2OukDP2+glr7fG5bx0gGWO+IenLkh611o5WQBAAAAAoGfSsFzF3WqOPuw9/ONq2QD4YY/aRdK778KHRtgUAAABKCcF6kXKriX9fThr0q8qssByQdcaYC40x/88Ys9QYU+YuqzDGnCJnqES9nPT4m0c7DgAAAFBKyvLdAGSXOz3VxXICoCp38efducqBfFgsp9bBNyVF3aKANRr8/7NH0hluQUQAAAAAIlgvRrPkFLSKFxX7ZhEVXcP0dKucImP/LGk/SbPlTOe2Tc7Ubd+31r6Wv+YBAAAAhYcCcwAAAAAAFBjGrAMAAAAAUGAI1gEAAAAAKDAE6wAAAAAAFBiCdSCJMabFGGONMS35bkshMMa0uq/Hmjy2gd8JAAAASg7B+jRgjLnCDVaG3sLGmB3GmPuNMWcYY0y+21qojDHHu6/jmny3JReMMWuGvDf+Xwb7XD9kn+Nz31KkY4wpN8Z83hizwRizxxjTY4z5izGm2RgzYxLH9Rtj/sUY81VjzC+MMduTft9XZPEpAAAAIMuYum36aUv6eaacqbD2k/R+SWuMMadaa/vy0rLCdrykJklPSGoZZbvXJf3NvZ/O1ki6aqSVxphKSR/N4Dhb5UwDuCc7zZqQYvmdpGWMqZX0W0lvdxf1yZnq7lD3ttoYc5y1dvsEDv8OSQ9mpaEAAACYUvSsTzPW2nnxm6SAnC/z693V/yLp63lrXBGw1l5qrT3IWntpvtsyCa2SDjTGHD3KNh+SVOtuOyJr7Ynu6/HLrLVunIrkdzKa2+UE6t2S/lWS31obkHSynAsUiyQ9YIzxTvD4XXIuBlwt5wLNG5NuMQAAAHKOYH0as9bGrLUvSvqApC3u4n83xpAxUdpude8/Mco28XUtuW0KRmOMOVHORTZJ+ndr7f9Ya2OSZK1dL+k0d91hcrIlxuv31to6a+27rLVftNb+t5yeewAAABQ4gvUiYK0NS7rLfThD0kGSZIxpSBqf2mCMWWKMWWeM2WaM6TPGtCYfxxgz0xhzuTFmkzGm2xjTa4zZbIz5oTFm8UjnTx7zbIyZZ4y5zj1H2BjzhjHmdmPMQaM9B2NMpTHmYmPMH40xXe6+240xPzXGHD7KfokCaMaYamPM19yxvnuTnreVkwIvScelGfu/Jul4YxYzc5/nXcaY19zXcZcx5rfGmHNG6v1MqjvwuPv4RGPM/xpj3nSf68vGmCY3PX2ybpVkJZ1hjPGnactCSSdK6pF092gHMqMUmDPGVBljvmCMecr9nUXc5/OSMeZWY8xpafYpM8acb4x53H3dIsaYDmPM34wxPzfGfDLNPiP+TtzjWPf1NcaY84wxz7jv371u2z42xnP0GWe8+J+NMUFjTKd73NOHnmO040zQavf+H5J+PnS
ltfYpSY+7Dz8+3oNba6MTbhkAAADyih7Y4rEj6eeaNOuPlvRjSdWSQpIiySuNMYdIekjSAndR2N1mqXs7xxhztrX2nlHa0CjpTknzJPW6+8+VdJakDxtnPP1DQ3cyxuznnvtQd1HEbeNCSf8m6WxjzMXW2h+Mcu7Zkp6VdICkfnd/Sdpfzjj/ajnDBiKSOofs2zvKcYe29buSLnEfWjljuWdJWuXePmaM+ZC1du8ox1gr6Vvuwz2SyuVcYLlCzsWEkyYTZFlrt7kXBU6QdLqknw7ZZLWcC3X/Iyk4kXMYp+jZ7yW9LX5aDb4WcyQdLOk4Sfck7eOVM376pKRD7ZHze6mT87s7Q9JPJtAkr6RfSvqgpAE5v/8Zko6UdKQxZpm1tmnoTsaYgNumf3YXReX0PP+znN/FN0c7qXsR4xb34QnW2sfH2e74a/GQtdaOsM2v5dRcONYYU2Wtzfj9CgAAgOmLnvXi0ZD089BgVHIC9RclrbTWBqy11XLGxMYDrwfkBOqvSXqvpIC1tkbS4ZKellQh6XZjzNvSHDvuGjmB8snu/jMkvVPSXyRVSvq5MWZB8g5uAHePnEB9j6SPSaq21s6StETSr+S8T79vjPkXjewKORcpTnX3r5UTqD/rju//trvdH5PH/bu3YT2a6RhjLtBgoL5O0r7ueWa6ywfkBOw3jnKYt8kp/HaVpHp3/1mSvuauP0GDva2TcbN7n5IKb4wxGkynvlkTd5Gc59IpJ1W7yn0uFXIKHn5c0iND9vmonOA0LOlcSTPc33OVnIs6H9YYPf2j+KycgHaNpBpr7Uw5v/8H3PVfNcYsS7Pfd+QE5jFJX5I0y1pbJ6le0rWSLtXgBYmsMsbMlnNhS5L+Osqm8XUeORdBAAAAUAII1ouAMaZG0tnuw05Jf0+zWYekd1lrN8YXWGvj231GTq94RNJ7rLUPJo2bfV5O8N0qJxD7xihNqXL3Xx/vJbTW/knSu9x21cgJfpKdLiegl6QzrLW3W2v73X3/ISf4fkaSkfRfY5z7FGvtvdbaiLv/DmttaJR9MmaMqZLU7D6801r779baN9zzBK2135P0eXf9vxpjjhjhULMkXWmt/bK1dpe7f7fb6/sLd5tMqrSP5R45Bcv+2aQOYThe0mJJf7fW/mESx48Xr/u2tfYX8RkI3DoKO621P7PWnj/CPj+11v7EWtvj7mOtte3W2l9aaz8ywfbUSjrVWntrvOfZWrtD0kck7ZTzv+6M5B3c4QDnuQ+brLX/ldSmXdbai+QMKZg1wTaNZd+kn18bZbvkdfuOuBUAAACKCsH6NGaMmWWcAlWPavBL/PfjgfYQ18UDkTT+1b2/21o7rIfPTemOB8r/YoyZOcJx7rLWvpxm/3ZJPxpyrqHnfspaO7QnVtbaAQ0GyYcaYw4b4dwPWWufG2FdNpwkJ1Vbcnrx07lBg9OLnTXCNn0a7OUf6j73/q3jbdxQbsD633IucqxJWnWOe3/L0H3Gabd7P38C+8wbbaMJ+oO19rGhC92LCA+7D4e+rqfJ+R8YkpMVks6Vo53UWttirTXu7fHxNVnJ86ePdlEped2E51wHAADA9EKwPs0kF0aTMyXTbyTFe3Fv08g932l7UY0x5RoMYn4zyqnj08N5JC0fYZtHR9k/vm62MaYxafmKDM79mJyxxMnbDzWZXuJMxM/7alJGQgp3nPmjQ7Yf6sVRLprsdO/rRlg/XvGAfLUxxuNmYJwm57UcOo59vH7l3l9gjLnTGPMhY8ycMfZ5UM7Y9g8YY35tjPmoMSZbPcXPjLJupNc1/j7eaK1NO3bfWrtV0quTbBsAAAAwbgTr009b0u0VSZvkFORaZa39t1EKk7WPsLxOTnEuafRU3OQCdvUjbJNpKm99mp9H3Netdr9rjHOP9PyyZcx2uuKv00jtHLHwnJwx71KWCj9aa5+W9LKcQn0nyp3DW9LD1tqdo+2bwbHvkPR9OcH3mXKKu71pnNkDrk83DMBa+6ScceH
9kt4j6Q5JrxljXjXG3GKMOWESTcrkdfUNWb6Pez/WazHW73yikts8rGr/COtGe54AAAAoIgTr08yQwmiLrLVHWGvPTZcCPESxT+FU7M9vouK96+dosNjcZFPgJUnW2oslHSjpy3Iqlu+WM3PAZyRtNMZ8L80+V8upj3CJpHvlXGRZICdV/1F3SryhQXWujVSFPdeSLxLsN8p2yesmdZEFAAAA0wfBOjo1GOguGGW75HUj9WJnGnC0p/l5xHO7c4/PHuPcuTZmO4esz1c7h/qZnJ7l0+VMY9Yh6f5sHdxau8Va+01r7SlyfkdHyQnCJekiY8wH0uyz01r7PWvtqdbauXKGYdzkrj5d0qez1b4xvOnej5WKP9r7esKstR2S3nAfHjrKpvF1MTmZEgAAACgBBOslzq28/oL78MRRNn2Xex+Tk3qfzmhpzPF1ndbabUnL49XpRzv38RpMDd8wynajiRfdMxPcP97OBcaYA9Jt4E5DF3+eE21nVrkV63+twRTwRLX9HJwr5qbeny5niIaUOqf6SPv9xVp7ngbrDoy5T5bE38cr3PnWh3Er6e+fwzbEa0G8251WL533uPdPMsc6AABA6SBYh+RUDZek040xw3r4jDHVkr7oPnzQWrtnhON8xBhzYJr950j6d/fh0DnN4+c+yhhzcpp9yyRd7j78a7pq9Rnqdu9nTXD/9XJ6paWRq8H/uwZ7ae+c4Hly4T/lzCf+HUnXZ+OAxpiKkda5dRPiFwQSMxOMto8rHoimm80gF37hnisgZ974dL6S4zbc6t4vkTPNXApjzDs1eAFoskUBAQAAMI0QrEOSfihpm5ze118bY/7FGOORJHeqtIfljDPuk/TVUY4TlvSQMeZd8V5CY8xKOZXe58gpjnXVkH3u0WAl7/8xxpwVH7PsVo2/R05qtTR4wWAi4kH+IcaYo0fdMg23R/MK9+FHjTE/MsbMddvpN8ZcKOl77vqfW2ufnURbs8pa+7S19gvuLW0l+wl4xhhzrTHm+OReaWPMvsaYH8gZuy45FeDj7jXG3Oy+v2Yl7VNnjPmqBrMr/jdLbRyVtXa7nOKMkvQ1Y8wX3AtTMsbMNsZ8V844/90jHcMYsyZphobjJ9CG38rJfJCkdcaYjyT97Z0o54KCJP1FUstEzm+MqTXGzInfNPh/35+8PP7cAQAAUBiyUnUa05u1dq87tvghOWOuH5QUNsb0S6pxN+uT9DFr7fOjHOoSOb246yWFjDExSfEAoE/SR621ryTvYK2NGmNOk3NB4BBJt0u6xRgT0mAveEzSJdbaX2viHpf0NzkF0f5gjOnSYG/7F6y1d491AGvtdW5a9CVyetHPN8bsljP3dfxv6TFJ502indPFLEmfc2/WGLNHzsWe5HTya6y1Dyc9rpJT6O4cSTLGxF//mqRt7tbg+PWp8HlJB0s6VtLVkq5y2zVLzpCJr0v6Z/cWzlEbzpb0W0lvl/Q/cv72YhqsAr9d0vtHmelhLM9JWpRm+Vr3FnernEJ/AAAAKAD0rEOS5KaXHyKn9/jPcoqSVUjaKulHkg7JIKDdJifguF5O8a5yOYXW7pT0dmtt2h5Ta+1rcuYl/7ykp+WkQ/vlzG/9M0lHWGuvnfizk6y1A3J6bm9y2xmQE8As0uAFhUyO83lJq+T0+Le5++6VE6R/QtJJ1tpSmF7rTElNcoLMbXJ+1z45geXPJZ3ovlbJPidn6rYHJW2WEwxXyalwfr+k06y1H7HWTlUavNw570+UE7S+ICd930h6QtKHrbWXafCi0e4ctaFLTvG/L0h6VlJEToX6v0r6mqS3ulkAAAAAKCHG2nzNWoRiYYyJv4lOsNY+ns+2ANnkpoZ3yLkY8c/W2t/nuUkAAAAoEfSsA8DIPi8nUO9UgVT4BwAAQGkgWAdQsowxM4wx/22Mec+QoneLjDFXa7Co4Pestbkasw4AAAAMQxo8Jo00eExXboDelbQoXm9gRtKyeySd6dY9AAAAAKYE1eABlLI
eSRdIOknSoZL2kVP07nVJG+XMbX6P5aomAAAAphg96wAAAAAAFBh61gvYnDlzbENDQ76bAQAAAADIgWeffXaXtXafdOsyDtabm5s/JyddVE1NTQdmqW0YRUNDgzZu3JjvZgAAAAAAcsAYs32kdVSDBwAAAACgwBCsAwAAAABQYAjWAQAAAAAoMJMuMNfc3PwOST+TnLHszc3NiyT9u6RjJM2W1Cnpd5J+0NTU1DbKcXySPiTpPZIOllQjaY+kHe7+9zU1Ne1Is99bJK2RtFLSHElhSVsl/VrSnU1NTf1p9vmwpG9K2tnU1HRCc3PzCknnSnqbnGmbWiXd1tTUdHfSPsdLWu22rUrSFkk/aWpqenCM12e5pLMkHeG2r1/SNkkPS7q9qakpNNr+AAAAAIDSk9We9ebm5ndKulfSaZJmuMefK+kjku5ubm6eO8J+CyT9UtLXJR0rqU5Sr6SApMMlXSgnUB663xpJv5D0QUn7SuqT5Jf0dklfds+ZtrJe0jE+Iudiw/GSfHIC8YMlfaO5ufk/3G0ulPRjSUfKucBRKWdO5muam5s/OsJxPc3NzV+RdKek97vtG3CPf5ikL0j6RXNz876jtQ8AAAAAUHqynQZ/naSnJf1LU1PTcjk91RdLCkqql/QfQ3dobm6ulnSzpGWSuiVdJmllU1PTyqampsMlvUvSVZJeG7LfCZIulWQk/VbSiU1NTSvkBPdfdM95oKQfNDc3e0dob62kJkm3STra3f8dci4cSNK5zc3N50r6lKTvSXqHu82xkn7vbvOl5ubmGWmO/TlJH5fUIanZ3fftkt4q6d8kvSSpUdJ1zc3NDEcAAAAAACRke571lyV9tqmpKSZJTU1NEUm/bm5uniPpq5Le3dzcfGlTU1M0aZ9PSlokJz18dVNT00vJB2xqanpV0i1pzrXWvd8o6XPxY7rnvK+5uXmvpB/K6WV/l5y086GqJN3V1NT0jaTz7XF7xFdKWuCe53tNTU0/TNrmzebm5oslPeke4wRJ98fXNzc37ydnKEBY0ieampr+L2nfAUl/am5u/pikByUdImmVpN+kaR8AAAAAoARlu0f3R/FAfYjfuveVkhqGrDvdvb9raKA+kubm5gMlLXEf/nBI8C9JampqelTSC+7D941yuHVp9o1Kesp92CepJc02PZKecx8OnXf+w5K8kn6fHKgP2T+owQD92FHaBwAAAAAoMdnuWX9+hOXJheVmxn9wx2vXuw8fHcd5DnXvo5L+NMp2f5CTdn7oCOv3NDU1vTLCul3u/ZampqbeEbbpcO9nDlm+3L0/trm5+Q+jtC/g3u83yjYZ6evrU2dnp/bu3atodNi1CwDj5PV6NWPGDNXV1amioiLfzQEAAECJyWqw7vYWp1sebW5uTnfO5OJvO8dxqtnufVe6au9J4hcJZo+wPm17XdEMthlw74e+jvELEFXubSyVGWwzor6+Pr3yyiuqra1VQ0ODfD6fjDGTOSRQ0qy1ikQi6u7u1iuvvKKFCxcSsAMAAGBKZbtnfbxsns+fK/GCdjc2NTV9O9cn6+zsVG1trebMmZPrUwElwRij8vLyxN9UZ2en5s+fn+dWAQAAoJTkuwr5rqSfxzOFWTz9vLa5ubl8lO3iU8V1jLJNLrzp3k/JtGx79+5VTU3NVJwKKDk1NTXau3dvvpsBAACAEpPXYL2pqWmnBlPVV41j17+69145U62N5Gj3/i/jbNpkbYqfv7m5Oee5s9FoVD6fL9enAUqSz+ejDgQAAACmXL571iXpbvf+I83NzW/JZIempqa/SdriPvx0unnUm5ubj5Mzz7sk/e+kWzk+98gZ814r6cLRNmxubvY1Nzf7J3tCxqgDucHfFgAAAPIh32PWJekncqZWWyTp1ubm5qslPehOjabm5uaFkt4vKdzU1PSTpP2+LelHklZIura5ufmbTU1NO5qbm8sk/YukK9ztntMUz2He1NT0SnNz8/VyAvVzm5ub95Ezfn2zJLkXFw6
Qk03wEUn/IenZqWwjAAAAAKBw5T1Yb2pqCjY3N39STuC9VNKVkpqbm5v3SirXYDX1nw7Z77Hm5uZvSvp/kt4l6V3Nzc3d7vbxnPC/S7ow3TzsU+AGOa/vpyV9UNIHm5ubw5LCkmZosAidVLyF9gAAAAAAE1AIafBqamp6VdKH5PSG/0lSt5w5yPfK6Rn/vqRb0uzXIuk0SfdLel1OoB6W9GdJ/ynptKampvYcNz+tpqYm29TU9H05WQF3SNoqKSYnUO+W87xuknRmU1PTphEPBOTYFVdcIWOMHn/88Xw3BQAAAECctZZbgd6OOOIIm4mXXnopo+2K2bZt26wk29TUZK219pZbbrGS7GOPPWattXbRokVWTgZDRrf4cfJlrOczlqampoyfR3zbTI9divgbAwAAQC5I2mhHiAfzngYPTIWLL75Yu3fvTlnW0tKi7du3a/Xq1WpoaEhZd/zxx09Z2/Ltggsu0JlnnqmFCxfmuykAAAAAXATrKAkXX3zxsGWPP/64tm/frjVr1pRUcD7UnDlzNGfOnHw3AwAAAECSghizDhSSNWvWyBijf/zjH/rBD36gt771raqqqkoE9C0tLTLGqKWlJe3+xpi0wf/AwIBuuOEGHXnkkaqpqZHf79fb3/52XXfddYrFYrl7QmMYacx6/Hns2rVL559/vubPn6+KigodcsghuuWWYSUkEh5++GGdcsopmjNnjioqKrRkyRKtXbt2WGYDAAAAgJHRsw6M4KKLLtLvf/97vfe979Upp5wir9c79k4jiEQiev/736+HH35YBx54oM466yxVVlbqscce0+c+9zk988wz+tnPfpbF1mfH7t27dcwxx6i8vFynn366+vr6dNddd+kTn/iEPB6PVq9enbJ9c3OzrrjiCtXV1el973uf6uvr9cILL+jb3/62HnzwQT311FOqqanJ07MBAAAApg+CdRSFhoYGOfUZHGvWrNGaNWsmdcxNmzbpueeeU2Nj4yRbJ33jG9/Qww8/rAsuuEDf+973EoF/NBrV+eefr5tvvlmnn366PvjBD0rKzfOZiOeff16f/OQn9eMf/zjR5osvvlhvfetb9a1vfSslWH/sscd0xRVX6KijjtKDDz6oWbNmJda1tLTonHPOUVNTk6655pqpfhoAAADAtEOwXuwef3++W5C54x/IdwtSfPGLX8xKoB6LxfSDH/xA8+bN0zXXXJPSQ+/1evWd73xHt9xyi26//fZEsF4o/H6/vvvd76a0+S1veYuOOeYY/e53v1NPT4+qq6slSddee60k6cYbb0wJ1CXnYsP3v/993X777QTrAAAAQAYI1oERvOMd78jKcf7+97+rs7NTy5Yt09e//vW021RVVenll1/OyvmyadmyZWnT1vfff39JUldXVyJYf+qpp+Tz+XTXXXfprrvuGrZPf3+/3nzzTXV0dGj27Nm5bTgAAAAwzRGsAyOYN29eVo7T0dEhSdq8ebOam5tH3K6npycr58umoT3kcWVlzr+OaDSaWNbR0aGBgYFRn6PkPE+CdQAAAGB0BOvFrsBSy6cTY0za5R6PM4nCwMDAsHXpKp7PnDlTknTqqafqF7/4RfYaWGBmzpypWCymzs7OfDcFAAAAmPaYug0Yp9raWknSq6++Omzdxo0bhy076KCDNGvWLD399NOKRCI5b1++HHnkkerq6tKLL76Y76YAAABMWigUUmtrq1566SW1trYqFArlu0koMQTrwDitWLFCHo9Hd9xxR8o/7c7OTn3xi18ctn1ZWZk+97nP6fXXX9eFF16o3t7eYdu8/vrreumll3La7ly75JJLJEnnnXeedu7cOWx9MBjU008/PdXNAgAAGLdQKKStW7cqHA6rsrJS4XBYW7duJWDHlCINHhin+fPn6+yzz9bPfvYzHX744Xrve9+r7u5uPfjgg/rnf/5nPffcc8P2ueyyy/T888/rRz/6kR544AGtWrVK++23n9rb27V582b94Q9/0De+8Q295S1
vyWpb7733XrW2tqZdd/LJJ+uss87K2rlOPPFEXXXVVbr00ku1bNkynXLKKWpsbFRPT4+2b9+uJ554Qscee6weeuihrJ0TAAAgF9rb2+Xz+VRZWSlJifv29nY1NDTksWUoJQTrwATceOONmjt3ru68805df/31WrhwoS688EKtXbtW//M//zNse5/Pp3vvvVe33XabWlpa9Ktf/Uo9PT3aZ5991NjYqCuvvFJnn3121tv5/PPP6/nnn0+7btasWVkN1iXpS1/6ko455hhde+21evLJJ3Xfffdp5syZ2m+//XT++edn/XwAAAC5EAqFEgF6XHl5OT3rmFLGWpvvNmAEK1assOnGQA/18ssv6+CDD56CFgGlib8xAABKS2trayIFPi7+mJ51ZJMx5llr7Yp06xizDgAAAABJ6uvrFYlEFA6HFYvFFA6HFYlEVF9fn++moYQQrAMAAABAEr/fryVLliSKy1VWVmrJkiXy+/35bhpKCGPWAQAAAGAIv99Pyjvyip51AAAAAAAKDME6AAAAAAAFhmAdAAAAAIACQ7AOAAAAAECBIVgHAAAAAKDAEKwDAAAAAFBgCNYBAAAAACgwBOsAAAAAABQYgnUAAAAAAAoMwToAAAAAAAWGYB2Q1NLSImOMWlpa8t2UacEYo+OPP37aHRsAAACYLgjWUXQ2btyoc845R4sXL1ZVVZVqamp02GGHae3atXrttdfy3bySZIyRMSbfzQAAIGs2bO7Qedc/o5Mu/63Ov/4Zbdjcke8mASgyBOsoGtZafelLX9LKlSt122236aCDDtKFF16oT37yk/L7/fr2t7+tAw44QHfffXe+m4pRvPzyy/rpT3+a72YAADCiDZs7tLZlk1rbehSoLNO2th6tbdlEwA4gq8ry3QAgW6688kr913/9lxoaGvSrX/1KhxxySMr6e+65Rx/72Md05plnav369TrhhBPy1FKM5qCDDsp3EwAAGNW6R7bI5/Woxu+TJNX4feoORXTT+i1auWx2nlsHoFjQs46i0NraqiuvvFI+n0/333//sEBdkk477TRdc801ikaj+vSnP61YLJb2WP/7v/+ro48+WoFAQLW1tTr99NO1efPmYdu1tbXpC1/4gg488EAFAgHNmjVLBx54oNasWaN//OMfw7Z/+OGHdcopp2jOnDmqqKjQkiVLtHbtWu3evXvYtg0NDWpoaFB3d7c+//nPq6GhQT6fT1dccYU+9alPyRij++67L237n3nmGRljdPrpp6csD4VC+uY3v6nDDz9cgUBA1dXVOuqoo3TnnXemPU5/f7+uvPJKLVmyRBUVFWpsbNRXv/pV9fX1pd0+W9KNWb/iiitkjNHjjz+uu+++W+94xzvk9/tVV1enM888c8ThDZ2dnbr00kt18MEHq6qqSjNnztSJJ56oRx55JKfPAQBQ3OI96smcHvZgnloEoBgRrKMo3HLLLRoYGNCpp56qww47bMTtzj33XM2fP19/+9vf9MQTTwxb/4tf/EIf+tCHtGDBAl100UU66qijdM899+jII4/U3/72t8R2oVBIxxxzjL7zne9o0aJF+vSnP61PfvKTOuyww3TffffppZdeSjluc3Oz3vOe9+iZZ57Re9/7Xl144YVaunSpvv3tb+uYY45Rd3f3sLb09/dr1apVuvfee3XyySfroosuUmNjo1avXi1JI6aK33rrrZKkNWvWJJbt3r1bxx57rL785S/L6/XqE5/4hFavXq0333xTZ511lr761a+mHMNaqzPOOEOXX365jDG64IIL9L73vU8333yzzjjjjBFf31y74YYb9LGPfUwNDQ367Gc/q0MPPVQ///nP9a53vWvYRYTt27friCOO0FVXXaV99tlHn/rUp/Sv//qvevnll/We97xHN954Y56eBQBgumuYW61geCBlWTA8oMa5gTy1CEBRstZyK9DbEUccYTPx0ksvZbRdMVu1apWVZNetWzfmtmeddZaVZK+88srEsltuucVKspL
sAw88kLL99773PSvJrlq1KrHs/vvvt5LsxRdfPOz4fX19tru7O/H40UcftZLsUUcdZbu6ulK2jZ936HEWLVpkJdkTTzzR9vT0DDvHAQccYMvLy21HR0fK8nA4bGtra219fb2NRCKJ5atXr7aS7Le+9a2U7Xt7e+273/1ua4yxzz33XGL57bffbiXZI4880vb29iaWd3R02MWLF1tJ9rjjjhvWrpHEX9tMtx167KamJivJzpgxw77wwgsp6z760Y9aSfbnP/95yvLjjjvOGmPsnXfembK8q6vLvu1tb7OVlZX2jTfeyKhN/I0BAJL96e+77HFffsS+67Lf2A9+43H7rst+Y4/78iP2T3/fle+mAZhmJG20I8SDjFkvch+/5o/5bkLGfnrJ0RPe9/XXX5ck7b///mNuG99m586dw9atWrVK73vf+1KWXXDBBfrBD36gRx99VNu3b9eiRYsS66qqqoYdo7y8XOXl5YnH1157rSTpxhtv1KxZs1K2XbNmjb7//e/r9ttv1zXXXDPsWN/5zncUCAy/Sr969Wp95Stf0Z133qnPfvazieUPPPCAurq6dMkll6iszPnz7ujo0G233aYVK1boi1/8YspxKisr9a1vfUsPP/yw7rjjDh1++OGSnEwFSfrP//xPVVZWJravq6vTZZddpnPOOWdYm6bChRdeOCxz4rzzztOdd96pP/3pT4le/+eff15PPPGETj/9dJ155pkp28+aNUvNzc360Ic+pHvuuUef+cxnpqz9AIDisHLZbF29ZrluWr9F29qCWjyvWueetJTx6gCyimAdSHLccccNW+b1enXsscdq69ateu6557Ro0SIdd9xx2m+//XTVVVdp06ZNOuWUU3TMMcfo8MMPl9frTdn/qaeeks/n01133aW77rpr2PH7+/v15ptvqqOjQ7NnD37IV1ZW6q1vfWvadn784x/XZZddpltvvTUlWE+XAr9hwwZFo1EZY3TFFVcMO1YkEpHkVGGP27Rpkzwej4499thh2+dzDvQVK1YMWxa/+NLV1ZVY9tRTT0mS9uzZk/Y5v/nmm5JSnzMAAOOxctlsgnMAOUWwjqIwb948vfzyy3r11VfH3Da+zb777jts3dy5c0c8vuQEf5JUU1Ojp59+Wk1NTbr//vv18MMPS5LmzJmjz3zmM/rqV78qn8+pENvR0aGBgQE1NzeP2q6enp6UYL2+vn7EuckXLFigE088UevXr9fLL7+sgw8+WO3t7XrooYd0+OGHpwT5HR3ONDIbNmzQhg0bRj1/3J49e1RXV5d4Dulei3wYmpkgKZFBEI1GE8viz3n9+vVav379iMdLfs4AAABAISFYL3KTSS2fTo499lg99thj+s1vfqPzzjtvxO2i0agef/xxSdIxxxwzbH1bW1va/d544w1J0syZMxPLFixYoJ/85Cey1uqll17So48+quuvv15f+9rXFIvFdOWVVyb2icVi6uzsHNdzGilQj1u9erXWr1+vW2+9VVdddZVuv/12DQwMJArQxcXbfMkll+i73/1uRueeOXOmOjs7FYlEhgXs8deikMWf8/e//31deOGFeW4NAAAAMH5Ug0dRWLNmjbxer375y1/qxRdfHHG7m2++WTt37tSBBx6YNuU9XYX4aDSqJ598UpL09re/fdh6Y4wOOeQQfe5zn0v04t57772J9UceeaS6urpGbddEfPjDH1ZNTY1uu+02xWIx3XrrrSorK9NZZ52Vst073vEOeTwe/f73v8/42MuXL1csFks872Txix2F7Mgjj5SkcT1nAAAAoJAQrKMoLF68WF/+8pcViUT0gQ98YNjUaZITQF900UXyer364Q9/KI9n+Nv/0Ucf1a9+9auUZdddd522bt2qE044IVFc7sUXX0zbCx9f5vf7E8suueQSSU4htHRF7YLBoJ5++ulxPFtHVVWVzjjjDL322mu65ppr9Pzzz+uUU05RfX19ynb19fU6++yztXHjRl155ZUp6eJxW7d
u1bZt2xKP4wXkvvKVrygcDieWd3Z26utf//q42zrVVqxYoX/6p3/SL37xC918881pt/nLX/6i9vb2KW4ZAAAAkBnS4FE0rrjiCgWDQX33u9/V2972Nr373e/WIYccokgkoj/+8Y965plnVFVVpTvvvFMnnHBC2mO8//3v16mnnqpTTz1VS5cu1Z///Gf9+te/Vl1dnW644YbEduvXr9fatWt11FFH6YADDlB9fb127Nih++67Tx6PR2vXrk1se+KJJ+qqq67SpZdeqmXLlumUU05RY2Ojenp6tH37dj3xxBM69thj9dBDD437Oa9evVo33XSTLr300sTjdK677jpt3rxZl19+uX72s5/p2GOP1dy5c7Vz5069/PLL2rBhg+688041NjZKkj760Y/q5z//ue6//34deuih+uAHP6hIJKK7775bK1eu1NatW8fdVim18N1QN9xwQ8pFjsm64447tGrVKn3yk5/Utddeq3e+852aNWuWduzYoRdeeEF//etf9dRTTw27uAEAAAAUhJHmdOOW/xvzrE/MM888Yz/+8Y/bhoYGW1lZaQOBgD3kkEPsf/zHf9hXX3017T7x+c5vueUW+8ADD9gjjzzS+v1+O3PmTPvhD3/Y/u1vf0vZ/qWXXrKXXHKJPeKII+ycOXNseXm5XbRokT3ttNPsH/7wh7Tn+P3vf28/8pGP2Pnz51ufz2fnzJlj3/a2t9lLLrnEbtiwIWXbRYsW2UWLFmX0fJcuXWol2bq6OtvX1zfidn19ffYHP/iBPeqoo2xNTY0tLy+3+++/v121apW95ppr7K5du4Zt39zcbBsbGxPP78tf/rINh8MTnmd9tFt8Dvp0x47Ps/7YY48NO/a2bdusJLt69eph67q7u+03vvENu3z5chsIBGxlZaVtaGiwp5xyiv3xj3+cdg77dPgbAwAAQC5olHnWjbMehWjFihV248aNY24XrwYOIDf4GwMAAEAuGGOetdYOn59YjFkHAAAAAKDgEKwDAAAAAFBgKDAHAABQxDZs7tC6R7aota1HjXOrdd7JS7Vy2ex8NwsAMAZ61gEAAIrUhs0dWtuySa1tPQpUlmlbW4/WtmzShs0d+W4aAGAMBOsAAABFat0jW+TzelTj98nrMarx++TzenTT+i35bhoAYAwE6wAAAEUq3qOezOlhD+apRQCATBGsAwAAFKmGudUKhgdSlgXDA2qcG8hTiwAAmSJYBwAAKFLnn7xUkWhM3aGIojGr7lBEkWhM5560NN9NAwCMgWAdAACgSK1cNltXr1muxfOqFeqLavG8al29ZjnV4AFgGmDqNgAAgCK2ctlsgnMAmIboWQcAAAAAoMAQrAMAAAAAUGAI1gEAAAAAKDAE6yh5LS0tMsaopaUl302ZFowxOv744/PdDAAAAKCoEayj6GzcuFHnnHOOFi9erKqqKtXU1Oiwww7T2rVr9dprr+W7eQAAAAAwJoJ1FA1rrb70pS9p5cqVuu2223TQQQfpwgsv1Cc/+Un5/X59+9vf1gEHHKC77747300FAAAAgFExdRuKxpVXXqn/+q//UkNDg371q1/pkEMOSVl/zz336GMf+5jOPPNMrV+/XieccEKeWgoAAAAAo6NnHUWhtbVVV155pXw+n+6///5hgboknXbaabrmmmsUjUb16U9/WrFYbNg2//u//6ujjz5agUBAtbW1Ov3007V58+Zh27W1tekLX/iCDjzwQAUCAc2aNUsHHnig1qxZo3/84x/Dtn/44Yd1yimnaM6cOaqoqNCSJUu0du1a7d69e9i2DQ0NamhoUHd3tz7/+c+roaFBPp9PV1xxhT71qU/JGKP77rsv7evwzDPPyBij008/PWV5KBTSN7/5TR1++OEKBAKqrq7WUUcdpTvvvDPtcfr7+3XllVdqyZIlqqioUGNjo7761a+qr68v7fYAAAAAsouedRSFW265RQMDAzrjjDN02GGHjbjdueeeq6997Wv629/+pieeeCKld/0
Xv/iFfv3rX+vUU0/V8ccfrz//+c+655579Nhjj+mPf/yjDjzwQElO4HvMMcdo69atOumkk/T+979f1lpt375d9913n04//XQtXrw4cdzm5mZdccUVqqur0/ve9z7V19frhRde0Le//W09+OCDeuqpp1RTU5PSzv7+fq1atUqdnZ06+eSTVVNTo8bGRr373e/Wj3/8Y/30pz/VBz/4wWHP79Zbb5UkrVmzJrFs9+7dWrVqlZ577jktX75cn/jEJxSLxfTwww/rrLPO0osvvqivf/3rie2ttTrjjDN03333acmSJbrgggvU39+vm2++WX/5y1/G94sBAAAAMDHWWm4FejviiCNsJl566aWMtpusP/19lz33uqftuy77jT3vuqftn/6+a0rOm4lVq1ZZSXbdunVjbnvWWWdZSfbKK6+01lp7yy23WElWkn3ggQdStv3e975nJdlVq1Yllt1///1Wkr344ouHHbuvr892d3cnHj/66KNWkj3qqKNsV1dXyrbx8w49zqJFi6wke+KJJ9qenp5h5zjggANseXm57ejoSFkeDodtbW2tra+vt5FIJLF89erVVpL91re+lbJ9b2+vffe7322NMfa5555LLL/99tutJHvkkUfa3t7exPKOjg67ePFiK8ked9xxw9pVzKbqbwwAAAClRdJGO0I8SBo8MrJhc4fWtmxSa1uPApVl2tbWo7Utm7Rhc0e+myZJev311yVJ+++//5jbxrfZuXNnyvJVq1bpfe97X8qyCy64QEuWLNGjjz6q7du3p6yrqqoaduzy8nLNmDEj8fjaa6+VJN14442aNWuWJCkYHlBre4+OPunDesuhb9Vtt9+etp3f+c53FAgEhi1fvXq1+vv7h6WwP/DAA+rq6tLZZ5+tsjInaaajo0O33XabVqxYoS9+8Ysp21dWVupb3/qWrLW64447EstvueUWSdJ//ud/qrKyMrG8rq5Ol112Wdq2AgAAAMgu0uCRkXWPbJHP61GN3ydJqvH71B2K6Kb1W7Ry2ew8ty47jjvuuGHLvF6vjj32WG3dulXPPfecFi1apOOOO0777befrrrqKm3atEmnnHKKjjnmGB1++OHyer0p+z/11FPy+Xy66667dNddd6l/IKbuUERGkjFOuvuuN9/UK6+1aeF+cxP7VVZW6q1vfWvadn784x/XZZddpltvvVWf/exnE8vTpcBv2LBB0WhUxhhdccUVw44ViUQkSS+//HJi2aZNm+TxeHTssccO25751QEAAICpQbCOjMR71JM5PezBPLUo1bx58/Tyyy/r1VdfHXPb+Db77rtvyvK5c+em21zz5s2TJO3Zs0eSVFNTo6efflpNTU26//779fDDD0uS5syZo8985jP66le/Kp/PuajR0dGhgYEBNTc3j96mNzpSgvX6+noZY9Juu2DBAp144olav369Xn75ZR188MFqb2/XQw89pMMPPzwlyO/ocDIfNmzYoA0bNox4/p6ensTPe/bsUV1dXeI5pHstAAAAAOQWafDISMPcagXDAynLguEBNc4dnqadD/Fe4N/85jejbheNRvX4449Lko455piUdW1tbWn3eeONNyRJM2fOTCxbsGCBfvKTn6i9vV1//etfde2112r27Nn62te+pq997WuJ7WbOnKna2trEuJO/7dijzTu7h932mbcg5ZwjBepxq1evljTYm3777bdrYGAgsTz5/JJ0ySWXjFof4bHHHkvZp7OzM9Hrnu61AAAAAJBbBOvIyPknL1Uk6qRwR2NW3aGIItGYzj1pab6bJslJ/fZ6vfrlL3+pF198ccTtbr75Zu3cuVMHHnjgsLT3J554Ytj20WhUTz75pCTp7W9/+7D1xhgdcsgh+tznPqf169dLku69997E+iOPPFJdXV2JNpX7PIrFbMoxYjGrCt/4/hQ//OEPq6amRrfddptisZhuvfVWlZWV6ayzzkrZ7h3veIc8Ho9+//vfZ3zs5cuXKxaLJZ53sviFDgAAAAC
5RbCOjKxcNltXr1muxfOqFeqLavG8al29ZnnBjFdfvHixvvzlLysSiegDH/iAXnrppWHb3Hvvvbrooovk9Xr1wx/+UB5P6tv/0Ucf1a9+9auUZdddd522bt2qE044QYsWLZIkvfjii2l74ePL/H5/Ytkll1wiSTrvvPO0c+dO7VNTKSslAvaenh499+wGzampHHa80VRVVemMM87Qa6+9pmuuuUbPP/+8TjnlFNXX16dsV19fr7PPPlsbN27UlVdeqWg0OuxYW7du1bZt2xKPzznnHEnSV77yFYXD4cTyzs7OlCneAAAAAOQOY9aRsZXLZhdMcJ7OFVdcoWAwqO9+97t629vepne/+9065JBDFIlE9Mc//lHPPPOMqqqqdOedd6bMrx73/ve/X6eeeqpOPfVULV26VH/+85/161//WnV1dbrhhhsS261fv15r167VUUcdpQMOOED19fXasWOH7rvvPnk8Hq1duzax7YknnqirrrpKl156qZYtW6ZTTjlFC/ZfpF1de/TqK69ow9N/0NHHHKN//cCJ436+q1ev1k033aRLL7008Tid6667Tps3b9bll1+un/3sZzr22GM1d+5c7dy5Uy+//LI2bNigO++8U42NjZKkj370o/r5z3+u+++/X4ceeqg++MEPKhKJ6O6779bKlSu1devWcbcVAAAAwPgYZ2o3FKIVK1bYjRs3jrldvMgYHH/60590/fXX63e/+53eeOMNeb1eNTQ06D3veY8uvvhiLViQOj68paVF55xzjm655RbNmTNH3/jGN/TCCy/I5/PpxBNP1De/+U0dcMABie1ffvll3Xjjjfrd736n7du3q7u7W/Pnz9eKFSv0+c9/XkcfffSwNj355JO69tpr9eSTT2rXrl2aOXOm9ttvP61atUpnnXWWVqxYkdi2oaFBktTa2jrmc122bJm2bNmiuro6vf766yovL0+7XX9/v9atW6c77rhDL774osLhsObOnatly5bp/e9/v/7t3/5Ns2fPTtn+qquuUktLi1577TXNnz9fZ599ti6//HJVVlbquOOOK6mUeP7GAAAAkAvGmGettSvSriNYL1wE60Bh4G8MAAAAuTBasM6YdQAAAAAACgzBOgAAAAAABYZgHQAAAACAAkOwDgAAAABAgSFYBwAAAACgwDDPOoCCEQwP6M3usPojMZX7PNqnplKBSv5NAQAAoPTQsw6gIATDA9rREVJ/JCaPx6g/EtOOjpCC4YF8Nw0AAACYcgTrRcJam+8mAJPyZndYRpLHYyT33kja1R3Oa7v42wIAAEA+EKwXAa/Xq0gkku9mAJMS71FP5vEY9UVieWqRIxKJyOv15rUNAAAAKD0E60VgxowZ6u7uznczgEkp93kUi6X2YsdiVhW+/P6b6u7u1owZM/LaBgAAAJQegvUiUFdXp66uLu3atUv9/f2k7WJa2qemUlZKBOyxmJWVNKemcsrbYq1Vf3+/du3apa6uLtXV1U15GwAAAFDaKLNcBCoqKrRw4UJ1dnaqtbVV0Wg0300CJqQvElVPeEADUasyr1F1ZZle6c5PCrrX69WMGTO0cOFCVVRU5KUNAAAAKF0E60WioqJC8+fP1/z58/PdFAAAAADAJJEGDwAAAABAgSFYBwAAAACgwBCsAwAAAABQYAjWAQAAAAAoMATrAAAAAAAUGIJ1AAAAAAAKDME6AAAAAAAFhmAdAAAAAIACU5bvBgAAAABAsduwuUPrHtmi1rYeNc6t1nknL9XKZbPz3SwUMHrWAQAAACCHNmzu0NqWTWpt61Ggskzb2nq0tmWTNmzuyHfTUMAI1gEAAAAgh9Y9skU+r0c1fp+8HqMav08+r0c3rd+S76ahgJEGDwAAAAA5FO9RT+b0sAfz1KLpqdSGEtCzDgAAAEyBDZs7dN71z+iky3+r869/hhToEtIwt1rB8EDKsmB4QI1zA3lq0fRTikMJCNYBAACAHCvFQAODzj95qSLRmLpDEUVjVt2hiCL
RmM49aWm+mzZtlOJQAoJ1AAAAIMdKMdDAoJXLZuvqNcu1eF61Qn1RLZ5XravXLC/qFO5sK8WhBIxZBwAAAHKsFAMNpFq5bDbB+SQ0zK1Wa1uPavy+xLJgeECL51XnsVW5Rc86AAAAkGOMWQYmpxSHEhCsAwAAADlWioEGkE2lOJTAWGvz3QaMYMWKFXbjxo35bgYAAACyYMPmDt20fou2tQXVODegc08q7mmnAIzNGPOstXZFunWMWQcAAACmAGOWAYwHafAAAAAAABQYgnUAAAAAAAoMwToAAAAAAAWGYB0AAAAAgAJDsA4AAAAAQIGhGjwAAIBrw+YOrXtki1rbetQ4t1rnnczUWgCA/CBYBwAAkBOor23ZJJ/Xo0Blmba19WhtyyZdvWY5ATuAgsFFxdJBGjwAAICkdY9skc/rUY3fJ6/HqMbvk8/r0U3rt+S7aQAgafCiYmtbT8pFxQ2bO/LdNOQAwToAAICU+PKbzPkyHMxTiwAgFRcVSwvBOgAAgKSGudUKhgdSlgXDA2qcG8hTiwAgFRcVSwvBOgAAgKTzT16qSDSm7lBE0ZhVdyiiSDSmc09amu+mAYAkLiqWGoJ1AAAASSuXzdbVa5Zr8bxqhfqiWjyvmuJyQIHZsLlD513/jE66/Lc6//pnSm6sNhcVS4ux1ua7DRjBihUr7MaNG/PdDAAAACDvhs7YEAwPKBKNldxFtQ2bO3TT+i3a1hZU49yAzj2JavDTmTHmWWvtinTrmLoNAJBVTCkDAMiF5OJqklTj96k7FNFN67eU1OfMymWzS+r5ljLS4AEAWcOUMgCAXKG4GkoNwToAIGuYUgYAkCsUV0OpIVgHAGQNvR4AgFyhuBpKDcE6ACBr6PUAAOQKMzag1FBgDgCQNeefvFRrWzapOxRJqdRLrwcAIBsoroZSQs86ACBr6PUAAADIDnrWAQBZRa8HAADA5NGzDgAAAABAgSFYBwAAAACgwBCsAwAAAABQYAjWAQAAAAAoMATrAAAAAAAUGIJ1AAAAAAAKDME6AAAAAAAFhmAdAAAAAIACU5bvBgAACt+GzR1a98gWtbb1qHFutc47ealWLpud72YBAAAULXrWAQCj2rC5Q2tbNqm1rUeByjJta+vR2pZN2rC5I99NAwAAKFoE6wCAUa17ZIt8Xo9q/D55PUY1fp98Xo9uWr8l300DAAAoWgTrAIBRxXvUkzk97ME8tQgAAKD4EawDAEbVMLdawfBAyrJgeECNcwN5ahEAAEDxI1gHAIzq/JOXKhKNqTsUUTRm1R2KKBKN6dyTlua7aQAAAEWLYB0AMKqVy2br6jXLtXhetUJ9US2eV62r1yynGjwAAEAOMXUbAGBMK5fNJjgHAACYQvSsAwAAAABQYOhZBwAAAABMmQ2bO7TukS1qbetR49xqnXfyUjL40qBnHQAAAAAwJTZs7tDalk2JqWG3tfVobcsmbdjcke+mFRyCdQAAAADAlFj3yBb5vB7V+H3yeoxq/D75vB7dtH5LvptWcAjWAQAAAABTIt6jnszpYQ/mqUWFi2AdAAAAADAlGuZWKxgeSFkWDA+ocW4gTy0qXATrAAAAAIApcf7JSxWJxtQdiigas+oORRSJxnTuSUvz3bSCQ7AOAAAAAJgSK5fN1tVrlmvxvGqF+qJaPK9aV69ZTjX4NJi6DQAAAAAwZVYum01wngF61gEAAAAAKDAE6wAAAAAAFBiCdQAAAAAACgzBOgAAAAAABYZgHQAAAACAAkOwDgAAAABAgSFYBwAAAACgwBCsAwAAAABQYAjWAQAAAAAoMATrAAAAAAAUGIJ1AAAAAAAKDME6AAAAAAAFhmAdAAAAAIACU5bvBgBTbcPmDq17ZIta23rUOLda5528VCuXzc53swAAAAAggZ51lJQNmzu0tmWTWtt6FKgs07a2Hq1t2aQNmzvy3TQAAAAASCBYR0lZ98gW+bwe1fh98nqMavw++bwe3bR+S76bBgAAAAAJBOsoKfE
[remaining base64-encoded PNG image data omitted]\n",
      "text/plain": [
       "
" 215 | ] 216 | }, 217 | "metadata": {}, 218 | "output_type": "display_data" 219 | } 220 | ], 221 | "source": [ 222 | "generate_scatterplot(p_missing=0.1, missing_type = 'MCAR', method = 'linear')" 223 | ] 224 | }, 225 | { 226 | "cell_type": "code", 227 | "execution_count": 6, 228 | "metadata": { 229 | "scrolled": false 230 | }, 231 | "outputs": [ 232 | { 233 | "data": { 234 | "application/vnd.jupyter.widget-view+json": { 235 | "model_id": "2370061c73b749058d6d52b202778572", 236 | "version_major": 2, 237 | "version_minor": 0 238 | }, 239 | "text/plain": [ 240 | "interactive(children=(FloatSlider(value=0.0, description='p_missing', max=0.99, step=0.05), Dropdown(descripti…" 241 | ] 242 | }, 243 | "metadata": {}, 244 | "output_type": "display_data" 245 | } 246 | ], 247 | "source": [ 248 | "def plot_interact(p_missing = 0, missing_type = 'MCAR', method = 'linear'):\n", 249 | " generate_scatterplot(p_missing, missing_type, method, missing_column = 'income', depends_on = 'age')\n", 250 | " \n", 251 | "interact(plot_interact, p_missing = (0, 0.99, 0.05), missing_type = ['MCAR','MAR','NMAR'], method = ['linear','quadratic']);" 252 | ] 253 | } 254 | ], 255 | "metadata": { 256 | "kernelspec": { 257 | "display_name": "Python 3", 258 | "language": "python", 259 | "name": "python3" 260 | }, 261 | "language_info": { 262 | "codemirror_mode": { 263 | "name": "ipython", 264 | "version": 3 265 | }, 266 | "file_extension": ".py", 267 | "mimetype": "text/x-python", 268 | "name": "python", 269 | "nbconvert_exporter": "python", 270 | "pygments_lexer": "ipython3", 271 | "version": "3.8.3" 272 | } 273 | }, 274 | "nbformat": 4, 275 | "nbformat_minor": 2 276 | } 277 | --------------------------------------------------------------------------------