├── README.md └── Arrythmia Project.ipynb /README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | This repo contains the Jupyter Notebook for predicting heart arrhythmias with deep learning, using the MIT-BIH Arrhythmia dataset from https://physionet.org/content/mitdb/1.0.0/ . 4 | 5 | The methodology behind this notebook is explained in my Medium blog post: https://towardsdatascience.com/detecting-heart-arrhythmias-with-deep-learning-in-keras-with-dense-cnn-and-lstm-add337d9e41f 6 | -------------------------------------------------------------------------------- /Arrythmia Project.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Introduction" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "Recently, I was reviewing the work of Andrew Ng's team (https://stanfordmlgroup.github.io/projects/ecg/) on a heart arrhythmia detector built with convolutional neural networks (CNNs). I found this fascinating, especially with the emergence of wearable products (e.g. Apple Watch and portable EKG machines) that can monitor your heart at home. As such, I was curious how to build a machine learning algorithm that could detect abnormal heart beats. Here we will use an ECG signal (a continuous electrical measurement of the heart) and train three neural networks to predict heart arrhythmias: a dense neural network, a CNN, and an LSTM. " 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "# Dataset" 22 | ] 23 | }, 24 | { 25 | "cell_type": "markdown", 26 | "metadata": {}, 27 | "source": [ 28 | "We will use the MIT-BIH Arrhythmia dataset from https://physionet.org/content/mitdb/1.0.0/. This dataset contains 48 half-hour, two-channel ECG recordings sampled at 360 Hz. 
The recordings are annotated by cardiologists for each heart beat. The annotation symbols are listed at https://archive.physionet.org/physiobank/annotations.shtml" 29 | ] 30 | }, 31 | { 32 | "cell_type": "markdown", 33 | "metadata": {}, 34 | "source": [ 35 | "# Project Definition" 36 | ] 37 | }, 38 | { 39 | "cell_type": "markdown", 40 | "metadata": {}, 41 | "source": [ 42 | "Predict whether a heart beat from the first ECG signal has an arrhythmia, using a 6 second window centered on the peak of the heart beat. " 43 | ] 44 | }, 45 | { 46 | "cell_type": "markdown", 47 | "metadata": {}, 48 | "source": [ 49 | "To simplify the problem, we will assume that a QRS detector can automatically identify the peak of each heart beat. We will ignore any non-beat annotations, as well as any heart beats in the first or last 3 seconds of the recording, because a full 6 second window is not available for them. We will use a window of 6 seconds so we can compare the current beat to the beats just before and after. This decision was made after talking to a physician, who said an abnormal beat is easier to identify when you have something to compare it to. 
" 50 | ] 51 | }, 52 | { 53 | "cell_type": "markdown", 54 | "metadata": {}, 55 | "source": [ 56 | "# Data Preparation" 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": 1, 62 | "metadata": {}, 63 | "outputs": [], 64 | "source": [ 65 | "import pandas as pd\n", 66 | "import numpy as np\n", 67 | "import matplotlib.pyplot as plt\n", 68 | "from os import listdir\n" 69 | ] 70 | }, 71 | { 72 | "cell_type": "code", 73 | "execution_count": 2, 74 | "metadata": {}, 75 | "outputs": [], 76 | "source": [ 77 | "# data must be downloaded and path provided\n", 78 | "data_path = 'mit-bih-arrhythmia-database-1.0.0/'\n" 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": 3, 84 | "metadata": {}, 85 | "outputs": [], 86 | "source": [ 87 | "# list of patients\n", 88 | "pts = ['100','101','102','103','104','105','106','107',\n", 89 | " '108','109','111','112','113','114','115','116',\n", 90 | " '117','118','119','121','122','123','124','200',\n", 91 | " '201','202','203','205','207','208','209','210',\n", 92 | " '212','213','214','215','217','219','220','221',\n", 93 | " '222','223','228','230','231','232','233','234']" 94 | ] 95 | }, 96 | { 97 | "cell_type": "markdown", 98 | "metadata": {}, 99 | "source": [ 100 | "Here we will use a pypi package wfdb for loading the ecg and annotations. " 101 | ] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "execution_count": 4, 106 | "metadata": {}, 107 | "outputs": [], 108 | "source": [ 109 | "import wfdb" 110 | ] 111 | }, 112 | { 113 | "cell_type": "markdown", 114 | "metadata": {}, 115 | "source": [ 116 | "Let's load all the annotations and see the distribution of heart beat types across all files. 
" 117 | ] 118 | }, 119 | { 120 | "cell_type": "code", 121 | "execution_count": 5, 122 | "metadata": {}, 123 | "outputs": [], 124 | "source": [ 125 | "df = pd.DataFrame()\n", 126 | "\n", 127 | "for pt in pts:\n", 128 | " file = data_path + pt\n", 129 | " annotation = wfdb.rdann(file, 'atr')\n", 130 | " sym = annotation.symbol\n", 131 | " \n", 132 | " values, counts = np.unique(sym, return_counts=True)\n", 133 | " df_sub = pd.DataFrame({'sym':values, 'val':counts, 'pt':[pt]*len(counts)})\n", 134 | " df = pd.concat([df, df_sub],axis = 0)" 135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": 6, 140 | "metadata": {}, 141 | "outputs": [ 142 | { 143 | "data": { 144 | "text/plain": [ 145 | "sym\n", 146 | "N 75052\n", 147 | "L 8075\n", 148 | "R 7259\n", 149 | "V 7130\n", 150 | "/ 7028\n", 151 | "A 2546\n", 152 | "+ 1291\n", 153 | "f 982\n", 154 | "F 803\n", 155 | "~ 616\n", 156 | "! 472\n", 157 | "\" 437\n", 158 | "j 229\n", 159 | "x 193\n", 160 | "a 150\n", 161 | "| 132\n", 162 | "E 106\n", 163 | "J 83\n", 164 | "Q 33\n", 165 | "e 16\n", 166 | "[ 6\n", 167 | "] 6\n", 168 | "S 2\n", 169 | "Name: val, dtype: int64" 170 | ] 171 | }, 172 | "execution_count": 6, 173 | "metadata": {}, 174 | "output_type": "execute_result" 175 | } 176 | ], 177 | "source": [ 178 | "df.groupby('sym').val.sum().sort_values(ascending = False)" 179 | ] 180 | }, 181 | { 182 | "cell_type": "code", 183 | "execution_count": 7, 184 | "metadata": {}, 185 | "outputs": [], 186 | "source": [ 187 | "# list of nonbeat and abnormal\n", 188 | "nonbeat = ['[','!',']','x','(',')','p','t','u','`',\n", 189 | " '\\'','^','|','~','+','s','T','*','D','=','\"','@','Q','?']\n", 190 | "abnormal = ['L','R','V','/','A','f','F','j','a','E','J','e','S']" 191 | ] 192 | }, 193 | { 194 | "cell_type": "code", 195 | "execution_count": 8, 196 | "metadata": {}, 197 | "outputs": [], 198 | "source": [ 199 | "# break into normal, abnormal or nonbeat\n", 200 | "df['cat'] = -1\n", 201 | "df.loc[df.sym == 'N','cat'] 
= 0\n", 202 | "df.loc[df.sym.isin(abnormal), 'cat'] = 1" 203 | ] 204 | }, 205 | { 206 | "cell_type": "code", 207 | "execution_count": 9, 208 | "metadata": {}, 209 | "outputs": [ 210 | { 211 | "data": { 212 | "text/plain": [ 213 | "cat\n", 214 | "-1 3186\n", 215 | " 0 75052\n", 216 | " 1 34409\n", 217 | "Name: val, dtype: int64" 218 | ] 219 | }, 220 | "execution_count": 9, 221 | "metadata": {}, 222 | "output_type": "execute_result" 223 | } 224 | ], 225 | "source": [ 226 | "df.groupby('cat').val.sum()" 227 | ] 228 | }, 229 | { 230 | "cell_type": "markdown", 231 | "metadata": {}, 232 | "source": [ 233 | "Let's write a function for loading a single patient's signals and annotations. Note that the annotation sample values are indices into the signal array. " 234 | ] 235 | }, 236 | { 237 | "cell_type": "code", 238 | "execution_count": 10, 239 | "metadata": {}, 240 | "outputs": [], 241 | "source": [ 242 | "def load_ecg(file):\n", 243 | " # load the ECG signal and beat annotations for one record\n", 244 | " # example file: 'mit-bih-arrhythmia-database-1.0.0/101'\n", 245 | " \n", 246 | " # load the signal record\n", 247 | " record = wfdb.rdrecord(file)\n", 248 | " # load the annotation\n", 249 | " annotation = wfdb.rdann(file, 'atr')\n", 250 | " \n", 251 | " # extract the signal\n", 252 | " p_signal = record.p_signal\n", 253 | " \n", 254 | " # verify the sampling frequency is 360 Hz\n", 255 | " assert record.fs == 360, 'sample freq is not 360'\n", 256 | " \n", 257 | " # extract symbols and annotation index\n", 258 | " atr_sym = annotation.symbol\n", 259 | " atr_sample = annotation.sample\n", 260 | " \n", 261 | " return p_signal, atr_sym, atr_sample " 262 | ] 263 | }, 264 | { 265 | "cell_type": "markdown", 266 | "metadata": {}, 267 | "source": [ 268 | "Let's check which beat types appear in one patient's ECG:" 269 | ] 270 | }, 271 | { 272 | "cell_type": "code", 273 | "execution_count": 11, 274 | "metadata": {}, 275 | "outputs": [], 276 | "source": [ 277 | "file = data_path + pts[0]" 278 | ] 279 | }, 280 | { 281 | "cell_type": "code", 282 | 
"execution_count": 12, 283 | "metadata": {}, 284 | "outputs": [], 285 | "source": [ 286 | "p_signal, atr_sym, atr_sample = load_ecg(file)" 287 | ] 288 | }, 289 | { 290 | "cell_type": "code", 291 | "execution_count": 13, 292 | "metadata": {}, 293 | "outputs": [ 294 | { 295 | "name": "stdout", 296 | "output_type": "stream", 297 | "text": [ 298 | "+ 3\n", 299 | "J 50\n", 300 | "N 2700\n", 301 | "V 3\n", 302 | "~ 8\n" 303 | ] 304 | } 305 | ], 306 | "source": [ 307 | "values, counts = np.unique(sym, return_counts=True)\n", 308 | "for v,c in zip(values, counts):\n", 309 | " print(v,c)" 310 | ] 311 | }, 312 | { 313 | "cell_type": "markdown", 314 | "metadata": {}, 315 | "source": [ 316 | "Let's make a plot of these, zooming in on one of the abnormal beats" 317 | ] 318 | }, 319 | { 320 | "cell_type": "code", 321 | "execution_count": 14, 322 | "metadata": {}, 323 | "outputs": [ 324 | { 325 | "data": { 326 | "text/plain": [ 327 | "[2044, 66792, 74986, 99579, 128085, 170719, 279576, 305709, 307745, 312825]" 328 | ] 329 | }, 330 | "execution_count": 14, 331 | "metadata": {}, 332 | "output_type": "execute_result" 333 | } 334 | ], 335 | "source": [ 336 | "# get abnormal beat index\n", 337 | "ab_index = [b for a,b in zip(atr_sym,atr_sample) if a in abnormal][:10]\n", 338 | "ab_index" 339 | ] 340 | }, 341 | { 342 | "cell_type": "code", 343 | "execution_count": 15, 344 | "metadata": {}, 345 | "outputs": [], 346 | "source": [ 347 | "x = np.arange(len(p_signal))" 348 | ] 349 | }, 350 | { 351 | "cell_type": "code", 352 | "execution_count": 16, 353 | "metadata": {}, 354 | "outputs": [ 355 | { 356 | "data": { 357 | "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAfAAAAEGCAYAAACaZ8fiAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nOydd3xUVdrHf89MekgCgVBCCwIBAgGREAWxYQNRsWLBXlBx13fdXXdxcXHFVdG17OtrA3WxobsW7AUEBVZUIPReRDqBUBIC6TPn/WPuvXPvzJ1kMu3eZ3K+nw9k5s6dOc9t5zlPOc8hIQQkEolEIpHwwmG1ABKJRCKRSJqPVOASiUQikTBEKnCJRCKRSBgiFbhEIpFIJAyRClwikUgkEoYkWC1ApGnXrp3Iy8uzWgyJRCJhxfLlyw8JIXKslkMSPHGnwPPy8lBSUmK1GBKJRMIKItpptQyS5iFd6BKJRCKRMEQqcIlEIpFIGCIVuEQSA36YNhF7shPgJsKe7AT8MG2i1SJJJBLmxF0MPB44eqIOyYkOpCXJyxMP/DBtIgZPeRnp9Z73XY660GbKy/gBwIhJL1kqm0QSDZYvX94+ISHhNQADIA3FUHEDWNfQ0HDHkCFDDprtYKmGIKJ/AbgYwEEhxACTzwnA/wK4CEAVgFuEECtiK2XsGfzot+jcOhWLJ420WhRJBMh7aoamvFXS6z3bIRW4JA5JSEh4rWPHjv1ycnKOOhwOueBGCLjdbiorKysoLS19DcClZvtYPTJ6A8CoRj4fDaC38m8CgJdjIJMt2FtebbUIkgiRe9TVrO0SSRwwICcn55hU3qHjcDhETk5OBTxeDPN9YiiPH0KIRQCONLLLWABvCQ8/A2hNRJ1iI51EEhn2tXE2a7tEEgc4pPIOH+UcBtTTVlvgTdEZwG7d+z3KNgNENIGISoiopKysLGbCSSTBsONPE3Ai0bjtRKJnu0QikYSK3RU4mWzzG9UJIWYIIYqEEEU5ObKQkMRejJj0ElZOvQc7sxxwA9jTxomVU++RCWwSiSQs7J7mvAdAV937LgD2WSSLRBIyIya9hLzyMQCAHdPGoIvF8kgkEv7Y3QL/DMBN5OE0ABVCiP1WCxVNZq2dhT3Jt2JnyiXI+2ceZq2dZbVIEolEwo6XXnopu7CwsF/fvn0Lrr/++u4NDQ348MMPMwsKCvr16dOnYNiwYfkAsG/fvoThw4f3Ligo6Hf99dd3z83NLdy/f7/djVsA1k8jew/A2QDaEdEeAA8DSAQAIcQrAL6CZwrZNnimkd1qjaSxYdbaWZjw+QS4HFUAgJ0VOzHhc0+cdHzheCtFk0jihllrZ2Hy/MnYVbEL3bK64bFzH5PPVxR54MPVXbeUVqZF8jfzO2ZU/eOqQbsDfb5ixYqUDz/8MLukpGRTcnKyuOGGG7q9/PLLbR977LHOCxYs2NS3b9+6AwcOOAFg0qRJuWeddVblE088Ufrhhx9mvvfee+0iKWs0sVSBCyGua+JzAeDeGIljOZPnT0ZVfZVhW1V9FSbPnyw7GIkkAqiDZPU5k4Pk+OSbb77JWLduXdqgQYP6AUBNTY1j5cqV6cXFxZV9+/atA4AOHTq4AGDp0qWtPvnkk20AcNVVVx3LzMxkM7+ThZugpbCrYleztkskkuYhB8mxpzFLOVoIIejqq68+/OKLL+5Vt82aNSvr/fffzzbZN7bCRRC7x8BbFN2yujVru0QiaR5ykNwyGDVq1LEvvviizd69exMA4MCBA86hQ4dWL1myJGPTpk1J6jYAKC4uPv72229nA8Ds2bMzjx07xqZAg1TgNuKxcx9DWoIxVJSWmIbHzn3MIokkkvhCDpJbBkOGDKl56KGH9p577rn5+fn5BSNHjszfvXt34vPPP7/j8ssv79WnT5+Cyy+//CQAmDZt2r7vvvsus6CgoN+XX36ZlZOTU9+6dWsWbnTpQrcR4wvHo7bejQmf/gEuOoT
urWWCjUQSSR479zFDDByQg+R45c477zx65513HvXdPm7cuA3699nZ2a5FixZtSUxMxLx589IXL16ckZqaysKvLhW4zbiy77WYqoRpdvxujMXSSCSBKausxWs/bMefLuwLp8Os5pL9UAfD93z2ACrrS5GTlovnRj8pB8ktmG3btiWNGzeup9vtRmJiopg+ffoOq2UKFqnAbUaD2221CBJJUPz5ozX4btNBnNU7B8N7sZl5g/GF4/HT2n74Ys1+PH/5YFxamGu1SBILKSwsrN24ceOGpve0HzIGbjMa3Cw8NwA8U3K6PtcdjkccsuhMC2PW2ln49+5LsDPlElz5yWB2115NPGbiOJBITJEWuM3gosDlfNqWi3rtq91VAAGlJ/awu/ZuRYM7SGpwCV+kBW4zXC4eCryx+bQSfzjPNfUlHq69O46uh6TlIhW4zeASA5fzaZsHE8dKUMTDtVevh1TkEs5IBW4zXEx6ejmftnnEk6KIh2uvekSYPG4ShhQXF/dZtGhRRGvA+yIVuM3g0qE8du5jSJVFZ4ImjvS3p+BQIu9rr16PeAptcOaVZa9k5z6TW+h4xDEk95ncwleWveJX8jSW1NfXW9l80MgkNpvBxVIbXzgeR0/U4Xdf/1kWnQkCLtc1GNRrfPvHf0Ct+yA6ZXTBPy54gtW1V3PXuHi84plXlr2Sff/c+7vXNNQ4AGD/8f1J98+9vzsA3D307iOh/u7mzZuTRo8e3bu4uPh4SUlJqw4dOtTNmTNn25o1a1Luueee7tXV1Y7u3bvXvvvuuztycnJcxcXFfYqLi48vWbKk1UUXXVS+bt261JSUFPe2bdtS9u7dmzx9+vRf33jjjXbLly9PHzx48ImPPvpoBwCMHz++2+rVq9Nramocl1xyydHnnntuX0ROTBBIC9xmcOrox+Zfgy61M9G95nPs+N0OVh14rGF0WYNifOF4nJ7+b3Sv+RzfXLOG4bX3aHCpv61n6qKpnVXlrVLTUOOYumhq53B/e9euXSn33XffwW3btq3PyspyvfXWW21uueWWHo8//vieLVu2bOjfv3/1n//8Z60QQHl5uXPZsmWbH3nkkQMAUFFRkfDTTz9tmTZt2u5rrrmm9wMPPHBg69at6zdt2pT6448/pgLAs88+u3fdunUbN23atH7x4sUZS5YsSQ1X7mCRCtxmcOro1cGGnInTNAKMLmwz4awEOQ2Y45XS46VJzdneHDp37lw7fPjwagAYPHhw1S+//JJcWVnpHDNmzHEAuPPOOw///PPPrdT9r7vuOoPFP2bMmHKHw4FTTjmlqm3btvXFxcXVTqcT+fn51b/88ksyALz55pvZBQUF/QoKCgq2bt2asnr16pRw5Q4WSxU4EY0ios1EtI2IJpl83o2IvieilUS0hoguam4bs9bOQt4/89gUG+HUn6juR6m/m4azkmsKznFkzrLHCx1bdaxrzvbmkJSUpF1gp9MpysvLGw0bZ2RkGKYBpaSkCOW7ht9yOBxoaGigTZs2Jb3wwgsdFi5cuGXLli0bRo4cWVFTUxMzvWqZAiciJ4AXAYwGUADgOiIq8NntIQDvCyEGA7gWwEvNaUMtOLGzYicEhFZsxM5KnJNF4LXApQpvCk7XNVjU685xcKLeshxljzemnDllb0pCilFxJqS4p5w5ZW+g74RKVlaWKzMz0/XNN9+0AoDXX3+97bBhw46H+ntHjx51pqamurOzs127d+9OWLBgQVbkpG0aKy3wYgDbhBDbhRB1AP4NYKzPPgJApvI6C0CzkgM4Fpzg1NG7lEdOlqNsGkaXtdlwtGLVW1YmsVnP3UPvPvLcBc/t7NSqUx2B0KlVp7rnLnhuZzgJbI0xc+bMX//85z93yc/PL1izZk3qtGnTQk46GzZsWPWAAQOqevfu3f/GG2/MGzJkSMiDgVCwMgu9M4Dduvd7AJzqs8/fAMwlot8CSAdwXnMa4FhwglN/4nWh89Lgs9bOwuT5k7GrYhe6ZcUme56
jkmsK9apzumdVVAuc03Wx4r6NFXcPvftIpBV2nz596rZu3bpefT916tQD6uvVq1dv8t1/6dKlm/Xv1Sxzs9/Sf6Z/3djvRQMrLXCzXt/3aboOwBtCiC4ALgLwNhH5yUxEE4iohIhKysrKtO0cC05w6lA0bwEj/W1VWIXRZW02nLxGKqrIXAYfHMOBkuhjpQLfA6Cr7n0X+LvIbwfwPgAIIX4CkALAb91CIcQMIUSREKIoJydH286x4AST/gQAzyQ2q8IqHJVcsHA+NC4DZo7hQEn0sVKBLwPQm4h6EFESPElqn/nsswvAuQBARP3gUeBlCJLxheMx45IZSKEOgCB0Su+KGZfMsLXbyc3FJADgYjiNzKqwCqPL2my4KEEzuFwXjuFASfSxTIELIRoA/AbAHAAb4ck2X09EU4noUmW3PwC4k4hWA3gPwC2imb3F+MLxGNn6A0/BiWvtX3CCS4cCeAcbnGLgVoVV4nEeeDxkcnMRnWM4UBJ9LJ0HLoT4SgiRL4ToKYR4TNk2RQjxmfJ6gxDidCHEICHEyUKIuaG0o675y8FQ4GTNqB03JwvcqhrujC5rs+EYHlAl5vK8cQwHSqJPi6jE5rUU7P+wcrJm1M6PwWnVGF84Hs9e8BKc7hxAELpndY9JWIXDvRcq8XxsdkENB8b6vpXYmxaxmIlacKKBgXbk5GrVrBhGMgPAFX2uwbSPPLmQO343JiZtxrOO43xonAYf4wvHY/Ks1gBid9/GG2lpaYOrqqpWWi2HL88//3zbkpKS9LfeeqtZSQ0twgJXaXDZ/2HVjzHs7t6zuXgBcVkgOCdF0Vzsfp+a4V1O1Fo5JAqvvJKN3NxCOBxDkJtbiFesXU60OVi59GiLUOBqshUHS1Hf0du9c+HYcQPWVN9ieqqCwu1ueh/7IXT/SyzllVeycf/93bF/fxKEAPbvT8L993cPV4mfd955Pfv379+vV69e/Z9++mlt+vGdd97ZpaCgoN+wYcPy9+3blwAAxcXFfe65557OhYWF/fLy8gaopVarqqroqquuysvPzy/o169fweeff54BeCzm0aNHnzRy5MheZ5xxRv4XX3yRMXTo0D4XXXTRSXl5eQMmTpzY+eWXX84uLCzsl5+fX7B+/fpkAHj33XezBg4c2Ldfv34Fw4cPz9+9e3dYXvCWocC14ba1cgSDXinaXVxvIpClYjQbKxQOt3PUHDgfWjxfFzZMndoZvguA1NQ4MDW85URnzZq1Y/369RtXrVq1Yfr06R1KS0ud1dXVjlNOOaVqw4YNG08//fTKSZMmaUuJNjQ00Nq1azc++eSTu6dOnZoLAE8++WR7ANiyZcuGd999d/uECRPyqqqqCABWrFjR6r333vv1559/3gIAmzZtSn355Zd3b9y4cf2HH37YdsuWLSlr167deOONNx565pln2gPA+eeff3zVqlWbNm7cuOGqq646MnXq1I7hHGOLiIG7+ehvQ4fiUeb2TfFmNC4yIF3okYWrJwaI7+vChtIAy4YG2h4kTz75ZIcvv/yyteenShPXr1+f4nA4cMcddxwBgNtuu+3wFVdc0Uvd/+qrrz4KAMOHDz/xwAMPJAHAjz/+2Oq3v/3tQQAYPHhwTW5ubt3atWtTAOCMM8441qFDB5f6/cLCwhPdu3evB4Bu3brVjh49ugIABg0aVL1w4cIMAPj111+TLrvssi5lZWWJdXV1jq5du9aGc4wtwgJX4fCwGmLg1okRFBxCEmZY4kKPeYuxg0FuqMTOdAywbGig7UHwxRdfZCxcuDCjpKRk0+bNmzf069evurq62qwMt/ZaXTo0ISEBLpeLgMYHp2lpaQZfXnJysmG5UfX3HA6H9nu/+c1vuk2cOPHgli1bNrzwwgs7a2trw9LBLUqBM9DfrGLg2jxwa8VoNlYM5DgMHpuL95D4HZs3iY2f7HHHlCl7kWJcThQpKW5MCX050fLycmdWVpYrIyPDvXLlypTVq1enA4Db7cbMmTPbAMA
bb7zRtri4uLKx3xkxYsTxd955JxsA1qxZk7x///6kgQMH1oQqV2VlpbNbt271avuh/o5Ki3Chq3B4VI0xcHtLzLXzsyaJjee5CgbOh8ZZ9rjhbmUVsqlTO6O0NAkdO9ZhypS92vYQuPLKKytmzJiRk5+fX9CzZ8+aQYMGnQCA1NRU9/r161P79+/fMSMjwzV79uztjf3On/70p4M33nhj9/z8/AKn04np06fvSE1NDfmumTx58r7rrruuZ4cOHeqKiopO7Nq1KznU3wJamgJn8LQap5FZJ0cwqOJxqsQGyCz0SCF8/nJEuv9twt13HwlHYfuSmpoqFi1atNV3u24OuGHhLP3Sn506dWrYu3fvWgBIS0sTZsuF3nfffYcBHFbfX3zxxZUXX3yxZs3rf0//2Q033FB+ww03lDf1e8EiXeg2g4OMGpxk1WHFOY5nRcHqnlXgWoRIItHTshQ4g4eVVwyc32ImgIyBRwqtlC6D5yoQcXhZJC2IlqXAGTysblYxcKslCA0rxOZ6roKB47F5Bx+SKOF2u928RvY2RDmHAStXSAVuM4zzwK2TIxhsLl5ApAUeGTjHwLXaEIyuCydZAawrKyvLkko8dNxuN5WVlWUBWBdonxaVxMahEzVa4PaGWYeiYYXYTE9VUHC8D9TnjJPonGRtaGi4o7S09LXS0tIBaGGGYgRxA1jX0NBwR6AdWoQC5+Qu86/EZl/4JmZZUciF7clqEpvfpo3C6brwkRQYMmTIQQCXWi1HvGPpyIiIRhHRZiLaRkSTAuwzjog2ENF6Ino3nPY4dDScLHBVQm7TyKwYePAd7DQNJyWowtECNya4MhJcEjUss8CJyAngRQDnA9gDYBkRfSaE2KDbpzeABwGcLoQ4SkTtw2vV/jc9qxi4zeULhDXTyJierMbQ4sjWihEK6oI2nAZWvn0Dt4GzJPJYaYEXA9gmhNguhKgD8G8AY332uRPAi0KIowAghDgYToMcHlZDR29zebVCLpZK0XyssF44Krlg4XhsqteAk/eAl3dOEgusVOCdAezWvd+jbNOTDyCfiBYT0c9ENMrsh4hoAhGVEFFJWVlZwAY5dDTGxUzsLTCH82mGFQO5eHR5xkMWOkvhEZ/3k6T5WKnAzQw337syAUBvAGcDuA7Aa0TU2u9LQswQQhQJIYpycnICNmh3hQgYZbT7M6oVcmHmy7PiPrD5pQwLlspEGP6wgJFzThIjrFTgewB01b3vAp/6tMo+nwoh6oUQvwLYDI9CDwkO/Qyv5USZYkUMnEP8pplwmt3hizr45HRdOFVplMQGKxX4MgC9iagHESUBuBbAZz77fALgHAAgonbwuNQbXT2mMTgkEglGmaZ2ly8QMgs9wjA8NjfDwYcwvOYkuSRaWKbAhRANAH4DYA6AjQDeF0KsJ6KpRKTOH5wD4DARbQDwPYAHhBDNXrGFE3qLQD6i0cEaF3r8Xk2Ox6aFwBmJLi1wiS+WFnIRQnwF4CufbVN0rwWA3yv/ItBeJH4lunBaTtS7mAkvZCW2yMLx2LRSqowGH5ymmEpiQ4sqccfhYeXkJuPaicha6JFBPSKW4QGGhVz03YHd+wZJbGhRCtwdcE0X+yAYpZqy6vx0WCE213MVDByVCcfFTKQLXeJLi1LgHO55TsUaNPnY+dBj32RcWuCcK7GxT2KTSFqaAmfQ08gYePSxQpna/VqGA8dD4zj4kLXQJb60LAVutQBBYPSg21xim4sXCEuS2LierEbQjomhMtHmgTOSnVF0TRIjWpYCZ/CwcopzcVVKVkjNIf8iVDjeBZoFbq0YzYJTlUZJbGgRCpyTu0xwioHbXcAAWOJCj3mL0YfTc+WLtpgJI9kNsjKSWxI9WoQCV+Fwzxtj4PaWWJWVXS10mcQWUex+n5rhfc74yK4/zfF8P0mCp2UpcAb3vHShRx9rlhPlea6CgeORqdeDU2iD0wwVSWxoUQqcw6iVgYganGTVI+eBRxaOhVy8MXA+whs86PF
8Q0mCpkUpcA63vGBlgXtg5kG3yIUe+zajjTcGzu/gtHngjETnlB8jiQ0tQoFzmu5iXE7U3vKqHQoz/S1LqUq8i5lYKkXzkLXQJb60CAWuwuGeZxUDt7l8gbDEhW5Bm7GC433Afx44H7kl0aNFKHBSbEQ3Az+m0QK3Nxxdp4BMYos0HJWJYGiCuwWjzkESE1qEAtfmfFosRzBw6gz5SGpETiOLDIJhHFmFZyEX89eSlkuLUOAqHDoaY5zL3gKznQduQffH6bo2F45H401i4yM9pwRXSWywVIET0Sgi2kxE24hoUiP7XUVEgoiKwmmPwz2vd/PbXV5OnZ8eq7PQmZ62gHA8HlVmBlE1Db2s8ejRkTQfyxQ4ETkBvAhgNIACANcRUYHJfhkA7gOwJNw2OSiceO7o7YIVnXY8TwHiFPZR4bicqF5aXnJLooWVFngxgG1CiO1CiDoA/wYw1mS/RwE8BaAm3AY5KETjyNreAnM4n2ZYk8RmbfvRQMsDY3g4Xtn5CM+pzLIkNlipwDsD2K17v0fZpkFEgwF0FUJ80dgPEdEEIiohopKysjK/zzlWXQLs3zFyXQ/citMaz2UwOSoTwdACl/PAJb5YqcDN+n3ttiQiB4DnAPyhqR8SQswQQhQJIYpycnIa2S8UMWMLp46ebyU2CyxwQ/sxbz4qcF6NzK3Jzkd4bgaIJPokBPqAiCphrkMIgBBCZIbZ9h4AXXXvuwDYp3ufAWAAgAVKlnNHAJ8R0aVCiJLmNKQqGA4JK7KQS/SxehpZvHXEHI+G4xQ4/cIrnOSWRI+AClwIkRHltpcB6E1EPQDsBXAtgOt17VcAaKe+J6IFAP7YXOXt+S3lL4OuhlUpVZvLFwhrXOi69nmeNj84rqmt4mboPRCI30GgJDQCKnBfiKg9gBT1vRBiVzgNCyEaiOg3AOYAcAL4lxBiPRFNBVAihPgsnN83bzPSvxh5OM319MrHy4duyRQcu1/MMOCoTLxZ6HxklzFwiS9NKnAiuhTAMwByARwE0B3ARgD9w21cCPEVgK98tk0JsO/ZEWgv3J+IOpweUg7n0ww5DzwycI6Bq3qbQ1hNRVZSlfgSTBLbowBOA7BFCNEDwLkAFkdVqijBoaPhFCvlcD7NsGLgwem6Bovw+csJlsuJ6s60LOQiAYJT4PVCiMMAHETkEEJ8D+DkKMsVFTjc8pwsNZuLFxAr5Lb7tQwF74Ig/A6Oz1wPL5z6BklsCCYGXk5ErQAsAjCLiA4CaIiuWNGBw03PaWStzQPnFQK3PgudzyVuFI5zqVVYWuCMijxJYkMwFvhYANUA7gfwDYBfAFwSTaGiBQflyCsGbrUEoWH1fcD0tPmhHofV5zMUVGuWk+zSApf40qQFLoQ4oXv7ZhRliRqcYnWCUazU3tIFxnoLnOuZM8LRitUQhj9M4Oj4l0STJi1wIrqCiLYSUQURHSOiSiI6FgvhIoXWYTLoaViNspl24JbPA7eg/WigrpzH8Xg4Dj44eecksSGYGPhTAC4RQmyMtjDRQjAabXMqpepVSnaX1Ij1i5nEvPmowDiHjaX7n1ORJ0lsCCYGfoCz8gZ4PaycVq3iWonLahd6vPS9nCoc+sKhL/CFU5EnSWwIxgIvIaL/APgEQK26UQgxO2pSRRhO7jLBKM7FybOhxwqFwym3IVgE1xsAPIvQsAqvSWJCMAo8E0AVgAt02wQANgqcUz/DacECrwvV5oL6YEX1rXh0obsZPVd6uA6mZCEXiS/BZKHfGgtBogknV6/xwbS3wN560rywvJRq7JuPCt7nitcR6a+FfsBse3idZkkMCKYW+vMmmyvgWXDk08iLFHnUh5RDR8PKTcbQDQlYY3XF5zQyz19uh8O1rC2rvkESE4JJYkuBp3TqVuXfQADZAG4non9GUbaIw+GeN7r37A1XF7oli5EFeM0ati5089d2Ry4nKvElmBh4LwAjhRANAEBELwOYC+B8AGujKFvE0Eo+MnhaDR29zcXlcD7
NsGYaGc9z1RiqJcstHstpqqYeaYFLfAnGAu8MIF33Ph1ArhDCBV1Wup3h5Orj5Grlm8RkbZs2v6xBw3keuIrdnzE9nLxzktgQjAJ/CsAqIppJRG8AWAngaSJKBzAvnMaJaBQRbSaibUQ0yeTz3xPRBiJaQ0Tziah7KO2o7iYOa/9ySnbSBht2F9QHoyUTG+G5xl0bg5vlrcJ1YRmjd46R4JKoEUwW+utE9BWAYgAE4C9CiH3Kxw+E2jAROQG8CI8rfg+AZUT0mRBig263lQCKhBBVRHQPPIOJa5rbltdStP9Nz6lYA1P9bYwlitispmYYPHI7YSYIIXRzqXkdEKdBsh5pgUt8CWiBE1Ff5e8pADoB2A1gF4COyrZwKQawTQixXQhRB+Df8Kx8piGE+F4IUaW8/RlAl1Aa4lS0gZOlxim3QI8VRdE4FegJBs6F5QSjMJWeeAzDSMKjMQv89wAmAHjG5DMBYGSYbXeGZ1CgsgfAqY3sfzuAr0Nris/dzmgaOKtV3vT4d+DRN8HjrfPl6oYGfOaBM5LditCPxN4EVOBCiAnK33Oi1LZZr2l6VxLRDQCKAJwV4PMJ8Aw20K1bN7/POa39yylDllOJWj16cWPVgbvd+uvK7ISZwHphDabeA+lCl/gSzHKiVxNRhvL6ISKaTUSDI9D2HgBdde+7ANjnuxMRnQdgMoBLhRCmWe9CiBlCiCIhRFFOTo7Z58rfCEgdZdwCcDo8Yxu7y8spt0CPFWEKTtMDg4G3Bc7THRJv95AkfILJQv+rEKKSiEYAuBDAmwBeiUDbywD0JqIeRJQE4FoAn+l3UAYK0+FR3gdDbcjr6rX/XS+EgFPJqrK7vJxyC/RY0X9z8qwEA+cYONdrwTV2L4kewShwl/J3DICXlfKpSeE2rBSG+Q2AOQA2AnhfCLGeiKYS0aXKbv8A0ArAB0S0iog+C/BzjaK6Lznc824BOJSrYnd5uXYiVkjNaZnYYOBUr8AXYwiFj+ycB02S6BBMJba9RDQdwHkAniSiZASn+JtECPEVgK98tk3RvT4vIu0ofzkkrAgh4NAscHvDdRqZFe5fbkquKTi7c7m6/2UlNokvwSjicfBYyaOEEOXw1EEPeblXvdIAACAASURBVP63FXhvdvvf9W4Brwvd5k8p10IuenljZYHFW+fLVQkCMCaxMZJd1kKX+BJMIZcq6Nb+FkLsB7A/mkJFGk5JbEIIOBw8LHCuSWxWzNTjdo6aQujXrWd2bHwLuejfWCaGxEZExBVud7QkNgY3vUsIJCgK3O4PKad11vUYpnRJCzwkOFvgXOP3XJPvJNGjRShwTqsmudzQWeD2ljcOPOixs8AZVdgLBs7KhHP8XoVDXyaJPi1CgXNSNEJngdv9GY2LUqoxS2KLfZvRhLNHgWtRHc5eD0l0aKwW+u1E9IDu/V4iOkZElcrCImxg5UJ3C4aFXHhhhQuVs8VqBmePAtfBlJxGJvGlMQv8bgD/0r0/KITIBJAD4LqoShVhNEuRwW3vFjoFbrEsTcG1lKqe2BVy0bfJ+IQpcF5dTd8PcHJFx1stAUn4NKbAHUKIw7r3HwCAEKIGQGpUpYownCqGGUup2ltge0sXGCtqSseb9cTZo8A1C53zOZdEh8YUeJb+jRDicQAgIgeAttEUKtJ4LUX73/Yuty4GbrEsTcHhfJphhTXMaZ33YNArE05WLOBzzRmJLgK+kbRUGlPgc4no7ybbpwKYGyV5ooLw+Wtn3PpKbDYXmKtLz4q1ubkpuabgGkcGvAM4Ih59ggrnvANJdGiskMsDAF4jom0AVivbBgEoAXBHtAWLJKxc6G6BBKe60qq9BfbNiqXoL6sdEfT3QawUq2jkHUc4u3NVRegkYjWw4jxokkSHxtYDPwHgOiI6CUB/ZfMGIcQvMZEsQghmrj5jKVWLhWkCvrFE3RtLkthi02Y04ep9AbzXwkHE6lroReWwroMk+gRU4ER0IYAMIcSHALbrto+HJyP92xj
IFzbcwl0uRlno/p04ExPcAhe6FYlz0YSzBe5StF+Ck1i5orlWkJNEj8Zi4I8AWGiyfT48cXAWuJlpcLdbIEFZT9TuzyhXpeTW1/GOkeDx5v7kPI1M7ROcDmYWOK+uTBIDGlPgaUKIMt+NQohSAOnREymyGPsZ+9/2biG864HbXF7DubW3qAasmAdstFgZnawAcE6o0ixwdgo8vmYySMKnMQWeQkR+LnYiSgSjeeCGhCV34P3sgouRBc5VKVlhycSzBc7teLwWuIOVKzreEiEl4dOYAp8N4FUi0qxt5fUr0C0vGg5ENIqINhPRNiKaZPJ5MhH9R/l8CRHlNbcNbkpGCLBbTpQbVswDj7c61pyPR5U9wUG2f8b0xNsgUBI+jSnwhwAcALCTiJYT0XIAOwCUKZ+FBRE5AbwIYDSAAngy3gt8drsdwFEhRC8AzwF4Mpw2Odz0+uVE7W4dcHXpGeaBW5GFzkptmMO5kItL8cRxi4FzThyURIfGppE1AJhERI8A6KVs3iaEqI5Q28XK720HACL6N4CxADbo9hkL4G/K6w8BvEBEJJqh2bjd9C63t5CL3eHU+RmwRG6eg51AcE6oUmPgiU5CvYtBXE1BWuASXxpbjexPAKAo7L5CiLWq8iaixyPQdmcAu3Xv9yjbTPdRBhQVMCnjSkQTiKiEiErKyox5d9zmqwoBOB3e13bGCks2ElhhPXINNwQiHlzoTmYudG7hQEn0acyFfq3u9YM+n42KQNtmZqbvXRnMPhBCzBBCFAkhinJycgyfcetoPC50Hsu0G6ZjMepQrMie5+ZmbgrjgITXsXmz0B0s+gQz4m1AKAmNxjQFBXht9j4U9gDoqnvfBcC+QPsoGfFZAI40pxFu1cI808jUJDZ7S8xtcKQis9DDh+u1BzyDZEBd9Y+P8LKQi8SXxhR4Y6UaInH3LAPQm4h6EFESPBb/Zz77fAbgZuX1VQC+a078G+BVSlUIASGgS2KzWKAm4FrLw4qOMN7cn1yL+ABe2ROcxMqS5SSrJDY0tpjJICI6Bo+1naq8hvI+JdyGhRANRPQbAHMAOAH8SwixnoimAigRQnwG4HUAbysLqhyB0a0fFJzmq6quPScXBc7UIrB64MHoVAWEW26JHmMWOh/ZOXs9JNGhsSx0Z7QbF0J8BeArn21TdK9rAFwdZhve1+H8UAxQBxvaYiYWyhIMXDORrRh4cJsN0RTcQlN6DJXYLJalORifN06SS6IFj2ypMLCiaEeoaNmxTh7zwO0ekgiEFbXQjW3yPG96OFuDXGuhu918z7kkOsS9AudUbETrWJhY4JzCE3qssIYNU+5i1GY04exR8M4Dd7AahLoY9WWS2BD3CpxTBSzfGLjNxTWKZ3NZ9VhhPXId7ASCcwxcb4Hzum+9rxmJLYkiLUCB8xm1qm7WBCbTyLiuSKXvCGNlgRmVHJ9zFQhOz5UvfGPgPJNGJdEj7hW4odu0+T1vsAxgf3m5zm12WRBL5HquAqGeQo8Ry+uA9J4uToqQ86BJEh3iXoHrEz/sHu9y+SpwK4UJAq5xUCvmZNv93msuXBPBAK/yS3A4WM2t5hQOlMSGuFfgnKY6uZnNA+eU4a/HCmvYLTzWKmD/+zAY1OvtIH4KXD9Q5qQIpQUu8SXuFbibkQbX5oGzjIHzwWWB2SUAbZW5eOh83fpiKKyuvk8MnJHojLoySYyIewVuTB2y922vWgZsSqkyjetasRqZ0Ne453SyAqCf8sjtcLiuRmZF7obE3sS9Ajd21hYKEgSqC93BJAZu9wFRIKwYeIh4c6Erf4nXTCwAOgvc6WA1mIq3evqS8Il7Bc5p6oXbxwK3+zCba1KNywLXv1sIb4EePqcqIEJnxTK69ADixIXOSG5J9Ih7Bc6p+IF3eovnsthdXk75BXqsWY0MbJaJDQZ9vga3DHuuLnSusz4k0aMFKHA+cSPvYiaev3aXFzz1t7GmdIzaFEJoSWysTlYA1OfKQbyUIOB9zhIYzwO
3f+cgiQVxr8A5lXz0LmaiWOBM5AV49SduoU8UjFUSG5/5/cHAVQkCOk8Xw/XA1egaJ7kl0SPuFTgntxO3xUw4ZfjrcRsywmPYJhfPShBo88CZuaEBrwdGyzVhghBCVyOC21mXRIO4V+CcvE4un47F7vJytcBdbqGrNx+jNnWdbzxg10psa/aU49Dx2kb38RZy4eHpUnG5vWEYHhJLoo0lCpyIsonoWyLaqvxtY7LPyUT0ExGtJ6I1RHRNKG1xmnqhL44B2P8hNaxxbZ0YGm63wLGa+ib3EwIxzwg3tGmLsxUe2r1qsxj4pS8sxrnPLAQAHKupN1XOWsVDZrMCjKEfi4WR2AKrLPBJAOYLIXoDmK+896UKwE1CiP4ARgH4JxG1bm5D6o3uIKPCsSP+i5nweUrtIOsL32/DwL/NRXlVXaP7uYSA0xnbc+xy69uMSZNRxa1zodvlgOpdnge8oroevx46gYF/m4v3S3b77ad6Q0iLJ9tD/qYwhH4slkViD6xS4GMBvKm8fhPAZb47CCG2CCG2Kq/3ATgIIKe5DdlxykhFVT12H6ny2+67mIld2HqgEjX1Lr/tduv4/rPM01nvr6jB8p1HUVFtbo27hdeFHqtkII/bnsf0wGBQL72dLPBjuut9ztMLAAALNpf57edye+RWnzK7yN8UhkRImz17EmuwSoF3EELsBwDlb/vGdiaiYgBJAH4J8PkEIiohopKyMuMD61320PpsWSEEhBAY/b+LcMZT35t+DthjMRNV1orqepz/3CJM+miNyT6wVWJWerITALBsxxFc+fKPePzLjab7eeSOrTvbzSgBSQiBmnoXyqvq8ODsNThe2+C3j94Ct8vhlJsM2BKd/l2cx5KFZoHbRf6m0BcDkkgAICFaP0xE8wB0NPlocjN/pxOAtwHcLIQwdYILIWYAmAEARUVFwuczANZnnO6vqMawJ77Dlad0wb6KGtN9XH4xcOt6lhFPfo+95dX48r4RAIAffznst4+qlNwue/SAbdKSAABTPl0PAPhh2yHT/fRJbLE6xYZKbLFpMmSembsF0xf9gl7tM7Bx/zEUdc/GlUO6GPbxFnKxT0x/c2ml3zYzL5FbSQYjZjkJhmJAPESWRJmoKXAhxHmBPiOiA0TUSQixX1HQBwPslwngSwAPCSF+DkUOzQK3uGLUZS8uBgB8tGKPtq3e5TZYCHbJQhdCYG95NQDg7neWA/BaK3rcAkonKGzRoWSmJhreqzFRX9z6GHjUpfLgcutCIzY4V4E4cqIOL3y/DQCwcf8xAEBigr8Vqyo9BxEabDKAmzhrhd82M8lcymDKLha42y2wr6IaXdqkNb6f0M+esMc5l1iLVS70zwDcrLy+GcCnvjsQURKAjwG8JYT4INSG7LJu8YFj/lNbKmuMrkn93FrAun7+RJ033r37SHUjewpbZVb7LhNaF0iBu71yx2pQ5xYCCU77nKtAzNYNMFUqTJIC9aVUI3E0q3aX44EPVuP+/6zCja8vwY+/mHtPgiFZN+BoMLkH3G5PMpgaBbdagb+3bBdGPPk9Hv50HXYfqcLfv9iAqjr/sIW+mp8s5CIBrFPg0wCcT0RbAZyvvAcRFRHRa8o+4wCcCeAWIlql/Du5uQ1FuqMJlZ456X7bKn2mPNllOdGSHUf8tnlTfry4hT3i9QBQU+/Cd5uMjhxXAMswXLm/WVeKvElfNpntbpDFLWxzrhqj3uScHa3yjy1r+RoRyi257MXF+GD5Hny8ci/+u/UQrn91SbO+r5dh4QPnaK/N4ve+WehWD6g27fe4/t/8aSeenrsZr/3wKxZv8w9Zudy6QaCN7yFJ7LBEgQshDgshzhVC9Fb+HlG2lwgh7lBevyOESBRCnKz7tyqEtgCoBSesu+vdAmidZnTx+lrg3lro1lpqt8xc5rfNzIUuhIBdyntvUNy9Z+Z7JypU1jYEzJ4PZ6799EWeXMpfyk4E/R291W/Xznf17nI8+c0m7f0L1w8GAJSbKHD
fpW/DIdBz+c95W1Bh0rYZ6vWfcOZJ6JiVgjN6twMAHKs2UeBuj0fOLgmY6cneSOanq/YBCOA5iLNaApLwiftKbHqlaNWDunZPBX49dAL57TMM24/5ZM1qBSYstNTcAXxzDhMNbrRkre1QSpXEwAdH98W2x0bjscsHAIBpVS6PAg+9Cpf3K8F/N9xBQyz4YLl3zvQ7t5+KiwfmIjcrxbQ4TiSfK7O52gDwz3lbMWjqXMxZX9rkbzz37VYAwBWndAYA/OuWobiwfwdz2d0CTofXq2T1dMjDJveoPoylIiwoASyxN3GvwL3JNtZ1nPe/73EcdG+bplk1AHDMxwLXktic4Vs1KptLK3HgmHnWuxlHFLfwgM6Z+N9rvRELs7np+pic1f3JYiXjvGNmChKcDnTKSgEAHDpuHr8NpZRqTb0Lo/65CKt2lyvvg68MxKGUamqiZxrevN+fiRGKBZuRkoi5JgrUWF8hvKv/8cq9AICHxvTDy+NP8fv8g5LdqGto/FzvOeqpq9Cng2eQnOh0YHvZCew8XIUTPm50vyS2sKQPDyEE/rv1ELpmpxq2Hw8waJLTyCR64l6B67PQrbIStx08DgCornfh4oG5WKTE6Hxj4PolGoHwrdqv1+7Hhf9chNve8HeJB0JV9vee3QtjT+6MSwflAjB36ennU1uJyy0wa8kuAN4wRbtWyQCAskpzC9wRgudgz9EqbNJNVTKLrwbC7baPt8IMt1vg1f/+CgDopfMUpSQ5kWSWhW54rsJr++ftnpyLO844CaMLO2HN3y4wfD5v40E8M3dzo+ft8Ik6XDu0qzY1DPDG7n8pO27YV3/9gchYs8drGzDl03VBu/xV/rNsN0qP1fgli/qG14DQ71tJ/NICFLg3McyqWz473TM/+c+j+gIAMlI8MS//GLgqq+reDb3N7WXHcY8yrWb9vmPIm/Ql/vLx2ia/d1BReO0zPRbss+MGYVT/jthXUYNqH7deNFbYem/pLvzvvK3N8hrok+7UDjwnI7ACNyxm0gy5fZO5jpt0soFwidgvoNIc9lWYzzY47aTsgMoE8KxdH87xHDS5zpkpidjy99FY/bBXkU9ftB09HvwKkz9e6xfmcbsFjp6o054zlaevHgjA31OizgOPxPrs328+iAPHajBx1gq89dNOPDdvS7O+v3qPx5vTva1xClmlafEcz7x7QLrQJR7iXoHrp2ZZFeuqb3Dj1tPz0DXb85CqCvy/WwNUjVMf0jDaHKks6KDn3SW7sPNw44lX25XErPaKAkxwOjCoq6cE/X6fTl5Ab4GHd27X7qlA3qQv8eDstXhu3hY8OHtt0FZGmRJD/PtlA7RtbdMDK/BQFzPxta5OmEz1CYSxElvwbcaKgybnCfAo09oGt18yoGF2RxgHtFtxfY/o1c6wPSnBgSyfef0AMGvJLixTBmxfrtmPm/+1FDuPVKHBLdBW8bqoqPeAb66JS8ndCLcWekV1PW6duQynPj4fi7Z4nuU3ftwRsP6AGTmKzHN+dyY+vHuYNsgzy90QjIoBSWJD3Ctww6pJFtz19S43Kmsb0DrVax0kOB1wENDgY0l4C7mEb4EH4qx/LMCAh+cE/PzRLzYAANpnejvD/rmZAPzjyZFSSkdP1OGSF34wbPtu00G88eMO/P4/qzB94S+NKgnVEh7Z11uRNynBgTZpiaYdoWFKVzPk9E2IMrNMzfCUpdXnEdiv+z2o1Cn43Xm9DdvV4jiBvEXhLidapXh1/senXZV/3VKEq32qwH22eh8+Wr4H9767Agu3lGl1z9u1MlrgmamegbLvdfNY4Ai7Fvreo+Zeiwv/uShgMqgv5dX1aJ2WiJREJ4rysrHt8YswqGvrgJ4jOw8CJbEn7hW4ep9HspDL3W8vx/WvBlcYTp2C4zuF7Ow+7XHkhL9CBMIvpfre0l3a60/vPd0/Qaa2Ae8t3YXF2w4FVIzJCU7ttRpP9lWGQiAiiUB7AnSEj3y+AbNX7sUTX28ynbJ16Hg
t8iZ9iWe+9bgtW6UYCwt2y04LWF4zIYTVyFRLbvlD5yHBQUHHwGMxMAuVFbuOorSiRsv0Hn9qd8Pnmco5VReGWbilDL+UHTdOzwyj/R2HPRZ4RyVk48vIvh3wj6sHGbbNWrILf/hgtd++vi70zJREg+wqqiKkMHNNDp8w91psLzuBfy8zz6z35WhVvVYCWCWnVXLA3A31HrI6c15iD+JegfsqxUjwzfpS/PjLYdTUuwIkdwlU1TVg5a6juOvtEgBAxyxjB9U2PQn7yqsN3/dfTjQ0+R6c7Yl1//GCfAzq2hrPXO1f/+bB2Wsx/rUl+HjlXo+XoKYeh47XonPrVIweYCxh3y7D08GYKfBIWATPf7dVe/2fCafhvpG9/PY579mFqG1woa7BDSEEHvl8PYr+Pg+A103eKsmowId0z8bqPeV+1pA++a45Fa12KivIZaYmolVKgl92cyB8V5mLZNdbVdeAqroGgyxmVbx8qa5z4e63l+OKl37EaU/M1zLB2wZQgtdM/wkPzl6Dm/+1FNdM/0k7bxRmIZc9R6uQ5HSgS5vURvf76r4zcO85PRvdx1eBqy74Rz7fgCXbD2vnxSWEcR54yLIHrlL4l4/X+lUGNKO8qs4vVNA+M5ACj8yAWRI/RK0Wul3QWwrhjlq3lx3HJ0pHBwB9//oNAOD7P56NHu28ldY+KNmDP/ms3tWvY6bhfXarJBytqsfd7yzHazcPBaBbzCTMOFduVgr2VdTgpuF5AIDiHtnY/PdRWLunAle98pNh39+/vxq/f99ozZzT17hqa3ZaEoiAQz6dinGBjtDP7bcbDgAAfpw0ErmtU3HqSW3x/Hfb/Pbr85DnfJ92UraWuazHt6hIXrs01Da4ceh4rZaUBxgTyoI9y3UNbsxcvAOJTkKi04H0pISgk9jU266pwc4PWw+hTXoi+udmads+XbUXw05qa5BfzxUv/WjIjL/n7J54ecEv+PTe07FmbwWuL+6GE3UNeHbuFqzeUw4HEbJSE/2q1qn4nsNaZfrW4RN1eG+px6o8dLxOl8QWngV+qLIObVslGbLHzSjIzURBbiZuPb2HNnADgK7ZqVoGtzqFzOxYrpnh8ZitnnKBMg+cNG0Yar+gDpSfvLIQlTUNuPX0Hnjqm02Yvmg7AOCut5fjuuKuOLdfB9Pv1zW48d+th3B+gfHznFbJOFJVZ7pWgjrVz1ZuHIllxL0CN04jC++3zBLDAM/aw/9zbm/ce04vJCU4MHeD/7zZbj5Zpqq1OG/jQRT+bQ6mXTFQm+uqJrGFKnBORjJSk5ya9QR4XOJFedm495yeePF701VZNbr6LKqQ4HQgOy0JZT4xcIHwl2RUrZQh3dsgt7XXCvv2/jOxsbQSRd3b4NxnFqJal0Rlpry1jk2HatUt2FyGcUO7atuN03GCk3OfsriL6mLeW16N2Sv34tlrPN6N295YhjZpSXhm3CC/76rH2FRo5IbXPeVD1/ztArRKSsCfPlqDD5fvQfe2aYbyoIBn0ZEHZ68xKG8AeHmB59qOVRbP+esn6+Cg0Gtnn9K9tel2NWkr3Bj4oeO1WogmGNq1Ssac352JC/+5CIAn+WvDvmM4pVuboKrCrdh91FtKVd0YZr9w9ZCuWtuTRvfVFPi8jQcwb+MB7Jg2BqUVNUh0kpZot+VAJS54znMMnVsbvQ85GckQwnONO+gGbg1ubwlYqb4lQAtyoXsKuUTvtv/f+VuR/9DXuP7VnzFvo9G68XXtAUB3ncVeWdOAe99doU1BcVB4D2llbQP6dso0/eyBC/s26a40WxWpzuXGe0t34WCld9pPJJLYSpVpRFf5JCr17pCBSwflIrd1KlZOOb/J31k6+Vy/bepA5E8frcE360oVOYUxCz1IOVV3qW94AQDW7a3Ad5sO4qMVe/De0l2obXBBCIHXf/gVt7+xTCvzqZ+61pjbedwrP+HUJ+bjw+WehUV2KnFiPcWPzcOc9QeCkj1Y5f3RPcP9trXPSMEDF/b
\n", 358 | "text/plain": [ 359 | "
" 360 | ] 361 | }, 362 | "metadata": { 363 | "needs_background": "light" 364 | }, 365 | "output_type": "display_data" 366 | } 367 | ], 368 | "source": [ 369 | "left = ab_index[1]-1080\n", 370 | "right = ab_index[1]+1080\n", 371 | "\n", 372 | "plt.plot(x[left:right],p_signal[left:right,0],'-',label='ecg',)\n", 373 | "plt.plot(x[atr_sample],p_signal[atr_sample,0],'go',label ='normal')\n", 374 | "plt.plot(x[ab_index],p_signal[ab_index,0],'ro',label='abnormal')\n", 375 | "\n", 376 | "plt.xlim(left,right)\n", 377 | "plt.ylim(p_signal[left:right,0].min()-0.05,p_signal[left:right,0].max()+0.05)\n", 378 | "plt.xlabel('time index')\n", 379 | "plt.ylabel('ECG signal')\n", 380 | "plt.legend(bbox_to_anchor = (1.04,1), loc = 'upper left')\n", 381 | "plt.show()" 382 | ] 383 | }, 384 | { 385 | "cell_type": "markdown", 386 | "metadata": {}, 387 | "source": [ 388 | "# Make a dataset" 389 | ] 390 | }, 391 | { 392 | "cell_type": "markdown", 393 | "metadata": {}, 394 | "source": [ 395 | "Let's make a dataset of windows centered on each beat, spanning 3 seconds before and 3 seconds after it. 
" 396 | ] 397 | }, 398 | { 399 | "cell_type": "code", 400 | "execution_count": 17, 401 | "metadata": {}, 402 | "outputs": [], 403 | "source": [ 404 | "def make_dataset(pts, num_sec, fs, abnormal):\n", 405 | " # function for making dataset ignoring non-beats\n", 406 | " # input:\n", 407 | " # pts = list of patients\n", 408 | " # num_sec = number of seconds to include before and after the beat\n", 409 | " # fs = sampling frequency (Hz)\n", 410 | " # output: \n", 411 | " # X_all = signal (nbeats, 2 * num_sec * fs columns)\n", 412 | " # Y_all = binary is abnormal (nbeats, 1)\n", 413 | " # sym_all = beat annotation symbols (list of length nbeats)\n", 414 | " \n", 415 | " # initialize numpy arrays\n", 416 | " num_cols = 2*num_sec * fs\n", 417 | " X_all = np.zeros((1,num_cols))\n", 418 | " Y_all = np.zeros((1,1))\n", 419 | " sym_all = []\n", 420 | " \n", 421 | " # list to keep track of number of beats across patients\n", 422 | " max_rows = []\n", 423 | " \n", 424 | " for pt in pts:\n", 425 | " file = data_path + pt\n", 426 | " \n", 427 | " p_signal, atr_sym, atr_sample = load_ecg(file)\n", 428 | " \n", 429 | " # grab the first signal\n", 430 | " p_signal = p_signal[:,0]\n", 431 | " \n", 432 | " # make df to exclude the nonbeats\n", 433 | " df_ann = pd.DataFrame({'atr_sym':atr_sym,\n", 434 | " 'atr_sample':atr_sample})\n", 435 | " df_ann = df_ann.loc[df_ann.atr_sym.isin(abnormal + ['N'])]\n", 436 | " \n", 437 | " X,Y,sym = build_XY(p_signal,df_ann, num_cols, abnormal)\n", 438 | " sym_all = sym_all+sym\n", 439 | " max_rows.append(X.shape[0])\n", 440 | " X_all = np.append(X_all,X,axis = 0)\n", 441 | " Y_all = np.append(Y_all,Y,axis = 0)\n", 442 | " # drop the first zero row\n", 443 | " X_all = X_all[1:,:]\n", 444 | " Y_all = Y_all[1:,:]\n", 445 | " \n", 446 | " # check sizes make sense\n", 447 | " assert np.sum(max_rows) == X_all.shape[0], 'number of X, max_rows rows messed up'\n", 448 | " assert Y_all.shape[0] == X_all.shape[0], 'number of X, Y rows messed up'\n", 449 | " assert Y_all.shape[0] == 
len(sym_all), 'number of Y, sym rows messed up'\n", 450 | "\n", 451 | " return X_all, Y_all, sym_all\n", 452 | "\n", 453 | "\n", 454 | "\n", 455 | "def build_XY(p_signal, df_ann, num_cols, abnormal):\n", 456 | " # this function builds the X,Y matrices for each beat\n", 457 | " # it also returns the original symbols for Y\n", 458 | " \n", 459 | " num_rows = len(df_ann)\n", 460 | "\n", 461 | " X = np.zeros((num_rows, num_cols))\n", 462 | " Y = np.zeros((num_rows,1))\n", 463 | " sym = []\n", 464 | " \n", 465 | " # keep track of rows\n", 466 | " max_row = 0\n", 467 | "\n", 468 | " for atr_sample, atr_sym in zip(df_ann.atr_sample.values,df_ann.atr_sym.values):\n", 469 | " # half-window is num_cols//2 samples, so we do not rely on the globals num_sec and fs\n", 470 | " left = max([0,(atr_sample - num_cols//2) ])\n", 471 | " right = min([len(p_signal),(atr_sample + num_cols//2) ])\n", 472 | " x = p_signal[left: right]\n", 473 | " if len(x) == num_cols:\n", 474 | " X[max_row,:] = x\n", 475 | " Y[max_row,:] = int(atr_sym in abnormal)\n", 476 | " sym.append(atr_sym)\n", 477 | " max_row += 1\n", 478 | " X = X[:max_row,:]\n", 479 | " Y = Y[:max_row,:]\n", 480 | " return X,Y,sym\n", 481 | " " 482 | ] 483 | }, 484 | { 485 | "cell_type": "markdown", 486 | "metadata": {}, 487 | "source": [ 488 | "# Lesson 1: split on patients, not on samples" 489 | ] 490 | }, 491 | { 492 | "cell_type": "markdown", 493 | "metadata": {}, 494 | "source": [ 495 | "Let's start by processing all of our patients." 
496 | ] 497 | }, 498 | { 499 | "cell_type": "code", 500 | "execution_count": 18, 501 | "metadata": {}, 502 | "outputs": [], 503 | "source": [ 504 | "num_sec = 3\n", 505 | "fs = 360" 506 | ] 507 | }, 508 | { 509 | "cell_type": "code", 510 | "execution_count": 19, 511 | "metadata": {}, 512 | "outputs": [], 513 | "source": [ 514 | "X_all, Y_all, sym_all = make_dataset(pts, num_sec, fs, abnormal)" 515 | ] 516 | }, 517 | { 518 | "cell_type": "markdown", 519 | "metadata": {}, 520 | "source": [ 521 | "Imagine we naively just decided to randomly split our data by samples into a train and validation set. " 522 | ] 523 | }, 524 | { 525 | "cell_type": "code", 526 | "execution_count": 20, 527 | "metadata": {}, 528 | "outputs": [], 529 | "source": [ 530 | "from sklearn.model_selection import train_test_split\n", 531 | "\n", 532 | "X_train, X_valid, y_train, y_valid = train_test_split(X_all, Y_all, test_size=0.33, random_state=42)" 533 | ] 534 | }, 535 | { 536 | "cell_type": "markdown", 537 | "metadata": {}, 538 | "source": [ 539 | "Now we are ready to build our first dense NN. We will do this in Keras for simplicity. 
" 540 | ] 541 | }, 542 | { 543 | "cell_type": "code", 544 | "execution_count": 21, 545 | "metadata": {}, 546 | "outputs": [ 547 | { 548 | "name": "stderr", 549 | "output_type": "stream", 550 | "text": [ 551 | "Using TensorFlow backend.\n", 552 | "/anaconda3/envs/tutorials/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.\n", 553 | " _np_qint8 = np.dtype([(\"qint8\", np.int8, 1)])\n", 554 | "/anaconda3/envs/tutorials/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.\n", 555 | " _np_quint8 = np.dtype([(\"quint8\", np.uint8, 1)])\n", 556 | "/anaconda3/envs/tutorials/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.\n", 557 | " _np_qint16 = np.dtype([(\"qint16\", np.int16, 1)])\n", 558 | "/anaconda3/envs/tutorials/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.\n", 559 | " _np_quint16 = np.dtype([(\"quint16\", np.uint16, 1)])\n", 560 | "/anaconda3/envs/tutorials/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.\n", 561 | " _np_qint32 = np.dtype([(\"qint32\", np.int32, 1)])\n", 562 | 
"/anaconda3/envs/tutorials/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.\n", 563 | " np_resource = np.dtype([(\"resource\", np.ubyte, 1)])\n" 564 | ] 565 | } 566 | ], 567 | "source": [ 568 | "from keras.models import Sequential\n", 569 | "from keras.layers import Dense, Flatten, Dropout\n", 570 | "from keras.utils import to_categorical" 571 | ] 572 | }, 573 | { 574 | "cell_type": "code", 575 | "execution_count": 22, 576 | "metadata": {}, 577 | "outputs": [ 578 | { 579 | "name": "stdout", 580 | "output_type": "stream", 581 | "text": [ 582 | "WARNING:tensorflow:From /anaconda3/envs/tutorials/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.\n", 583 | "Instructions for updating:\n", 584 | "Colocations handled automatically by placer.\n", 585 | "WARNING:tensorflow:From /anaconda3/envs/tutorials/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:3445: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.\n", 586 | "Instructions for updating:\n", 587 | "Please use `rate` instead of `keep_prob`. 
Rate should be set to `rate = 1 - keep_prob`.\n" 588 | ] 589 | } 590 | ], 591 | "source": [ 592 | "# build a small dense network\n", 593 | "# one hidden relu layer with drop out (for regularization), then a sigmoid output\n", 594 | "model = Sequential()\n", 595 | "model.add(Dense(32, activation = 'relu', input_dim = X_train.shape[1]))\n", 596 | "model.add(Dropout(rate = 0.25))\n", 597 | "model.add(Dense(1, activation = 'sigmoid'))" 598 | ] 599 | }, 600 | { 601 | "cell_type": "code", 602 | "execution_count": 23, 603 | "metadata": {}, 604 | "outputs": [], 605 | "source": [ 606 | "# compile the model - use binary crossentropy (the labels are binary), and the adam optimizer\n", 607 | "model.compile(\n", 608 | " loss = 'binary_crossentropy',\n", 609 | " optimizer = 'adam',\n", 610 | " metrics = ['accuracy'])" 611 | ] 612 | }, 613 | { 614 | "cell_type": "code", 615 | "execution_count": 24, 616 | "metadata": {}, 617 | "outputs": [ 618 | { 619 | "name": "stdout", 620 | "output_type": "stream", 621 | "text": [ 622 | "WARNING:tensorflow:From /anaconda3/envs/tutorials/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.\n", 623 | "Instructions for updating:\n", 624 | "Use tf.cast instead.\n", 625 | "Epoch 1/5\n", 626 | "73096/73096 [==============================] - 9s 118us/step - loss: 0.2498 - acc: 0.9069\n", 627 | "Epoch 2/5\n", 628 | "73096/73096 [==============================] - 8s 116us/step - loss: 0.1640 - acc: 0.9469\n", 629 | "Epoch 3/5\n", 630 | "73096/73096 [==============================] - 9s 117us/step - loss: 0.1414 - acc: 0.9558\n", 631 | "Epoch 4/5\n", 632 | "73096/73096 [==============================] - 9s 118us/step - loss: 0.1299 - acc: 0.9592\n", 633 | "Epoch 5/5\n", 634 | "73096/73096 [==============================] - 9s 118us/step - loss: 0.1228 - acc: 0.9622\n" 635 | ] 636 | }, 637 | { 638 | "data": { 639 | "text/plain": [ 640 | "" 641 | ] 642 | 
}, 643 | "execution_count": 24, 644 | "metadata": {}, 645 | "output_type": "execute_result" 646 | } 647 | ], 648 | "source": [ 649 | "model.fit(X_train, y_train, batch_size = 32, epochs= 5, verbose = 1)" 650 | ] 651 | }, 652 | { 653 | "cell_type": "code", 654 | "execution_count": 25, 655 | "metadata": {}, 656 | "outputs": [], 657 | "source": [ 658 | "from sklearn.metrics import roc_auc_score, accuracy_score, precision_score, recall_score\n", 659 | "def calc_prevalence(y_actual):\n", 660 | " return (sum(y_actual)/len(y_actual))\n", 661 | "def calc_specificity(y_actual, y_pred, thresh):\n", 662 | " # calculates specificity\n", 663 | " return sum((y_pred < thresh) & (y_actual == 0)) /sum(y_actual ==0)\n", 664 | "def print_report(y_actual, y_pred, thresh):\n", 665 | " \n", 666 | " auc = roc_auc_score(y_actual, y_pred)\n", 667 | " accuracy = accuracy_score(y_actual, (y_pred > thresh))\n", 668 | " recall = recall_score(y_actual, (y_pred > thresh))\n", 669 | " precision = precision_score(y_actual, (y_pred > thresh))\n", 670 | " specificity = calc_specificity(y_actual, y_pred, thresh)\n", 671 | " print('AUC:%.3f'%auc)\n", 672 | " print('accuracy:%.3f'%accuracy)\n", 673 | " print('recall:%.3f'%recall)\n", 674 | " print('precision:%.3f'%precision)\n", 675 | " print('specificity:%.3f'%specificity)\n", 676 | " print('prevalence:%.3f'%calc_prevalence(y_actual))\n", 677 | " print(' ')\n", 678 | " return auc, accuracy, recall, precision, specificity" 679 | ] 680 | }, 681 | { 682 | "cell_type": "code", 683 | "execution_count": 26, 684 | "metadata": {}, 685 | "outputs": [ 686 | { 687 | "name": "stdout", 688 | "output_type": "stream", 689 | "text": [ 690 | "73096/73096 [==============================] - 3s 39us/step\n", 691 | "36003/36003 [==============================] - 1s 36us/step\n" 692 | ] 693 | } 694 | ], 695 | "source": [ 696 | "y_train_preds_dense = model.predict_proba(X_train,verbose = 1)\n", 697 | "y_valid_preds_dense = model.predict_proba(X_valid,verbose = 1)" 698 | ] 
699 | }, 700 | { 701 | "cell_type": "code", 702 | "execution_count": 27, 703 | "metadata": {}, 704 | "outputs": [ 705 | { 706 | "data": { 707 | "text/plain": [ 708 | "0.3147641457808909" 709 | ] 710 | }, 711 | "execution_count": 27, 712 | "metadata": {}, 713 | "output_type": "execute_result" 714 | } 715 | ], 716 | "source": [ 717 | "thresh = (sum(y_train)/len(y_train))[0]\n", 718 | "thresh" 719 | ] 720 | }, 721 | { 722 | "cell_type": "code", 723 | "execution_count": 28, 724 | "metadata": {}, 725 | "outputs": [ 726 | { 727 | "name": "stdout", 728 | "output_type": "stream", 729 | "text": [ 730 | "Train\n", 731 | "AUC:0.992\n", 732 | "accuracy:0.968\n", 733 | "recall:0.962\n", 734 | "precision:0.937\n", 735 | "specificity:0.970\n", 736 | "prevalence:0.315\n", 737 | " \n", 738 | "Valid\n", 739 | "AUC:0.988\n", 740 | "accuracy:0.962\n", 741 | "recall:0.951\n", 742 | "precision:0.929\n", 743 | "specificity:0.967\n", 744 | "prevalence:0.314\n", 745 | " \n" 746 | ] 747 | } 748 | ], 749 | "source": [ 750 | "print('Train');\n", 751 | "print_report(y_train, y_train_preds_dense, thresh)\n", 752 | "print('Valid');\n", 753 | "print_report(y_valid, y_valid_preds_dense, thresh);" 754 | ] 755 | }, 756 | { 757 | "cell_type": "markdown", 758 | "metadata": {}, 759 | "source": [ 760 | "Amazing! Not that hard! But wait, will this work on new patients? Perhaps not if each patient has a unique heart signature. Technically the same patient can show up in both the training and validation sets. This means that we may have accidentally leaked information across the datasets. " 761 | ] 762 | }, 763 | { 764 | "cell_type": "markdown", 765 | "metadata": {}, 766 | "source": [ 767 | "We can try this again by splitting on patients instead of samples. 
" 768 | ] 769 | }, 770 | { 771 | "cell_type": "code", 772 | "execution_count": 29, 773 | "metadata": {}, 774 | "outputs": [ 775 | { 776 | "name": "stdout", 777 | "output_type": "stream", 778 | "text": [ 779 | "36 12\n" 780 | ] 781 | } 782 | ], 783 | "source": [ 784 | "import random\n", 785 | "random.seed( 42 )\n", 786 | "pts_train = random.sample(pts, 36)\n", 787 | "pts_valid = [pt for pt in pts if pt not in pts_train]\n", 788 | "print(len(pts_train), len(pts_valid))" 789 | ] 790 | }, 791 | { 792 | "cell_type": "code", 793 | "execution_count": 31, 794 | "metadata": {}, 795 | "outputs": [ 796 | { 797 | "name": "stdout", 798 | "output_type": "stream", 799 | "text": [ 800 | "(80614, 2160) (80614, 1) 80614\n", 801 | "(28485, 2160) (28485, 1) 28485\n" 802 | ] 803 | } 804 | ], 805 | "source": [ 806 | "X_train, y_train, sym_train = make_dataset(pts_train, num_sec, fs, abnormal)\n", 807 | "X_valid, y_valid, sym_valid = make_dataset(pts_valid, num_sec, fs, abnormal)\n", 808 | "print(X_train.shape, y_train.shape, len(sym_train))\n", 809 | "print(X_valid.shape, y_valid.shape, len(sym_valid))" 810 | ] 811 | }, 812 | { 813 | "cell_type": "code", 814 | "execution_count": 32, 815 | "metadata": {}, 816 | "outputs": [ 817 | { 818 | "name": "stdout", 819 | "output_type": "stream", 820 | "text": [ 821 | "Epoch 1/5\n", 822 | "80614/80614 [==============================] - 10s 130us/step - loss: 0.2345 - acc: 0.9143\n", 823 | "Epoch 2/5\n", 824 | "80614/80614 [==============================] - 10s 122us/step - loss: 0.1441 - acc: 0.9567\n", 825 | "Epoch 3/5\n", 826 | "80614/80614 [==============================] - 8s 95us/step - loss: 0.1272 - acc: 0.9619\n", 827 | "Epoch 4/5\n", 828 | "80614/80614 [==============================] - 7s 89us/step - loss: 0.1205 - acc: 0.9647\n", 829 | "Epoch 5/5\n", 830 | "80614/80614 [==============================] - 9s 114us/step - loss: 0.1111 - acc: 0.9665\n" 831 | ] 832 | }, 833 | { 834 | "data": { 835 | "text/plain": [ 836 | "" 837 | ] 838 | }, 
839 | "execution_count": 32, 840 | "metadata": {}, 841 | "output_type": "execute_result" 842 | } 843 | ], 844 | "source": [ 845 | "# build the same model as before\n", 846 | "# one hidden relu layer with drop out (for regularization), then a sigmoid output\n", 847 | "model = Sequential()\n", 848 | "model.add(Dense(32, activation = 'relu', input_dim = X_train.shape[1]))\n", 849 | "model.add(Dropout(rate = 0.25))\n", 850 | "model.add(Dense(1, activation = 'sigmoid'))\n", 851 | "\n", 852 | "# compile the model - use binary crossentropy (the labels are binary), and the adam optimizer\n", 853 | "model.compile(\n", 854 | " loss = 'binary_crossentropy',\n", 855 | " optimizer = 'adam',\n", 856 | " metrics = ['accuracy'])\n", 857 | "\n", 858 | "model.fit(X_train, y_train, batch_size = 32, epochs= 5, verbose = 1)" 859 | ] 860 | }, 861 | { 862 | "cell_type": "code", 863 | "execution_count": 33, 864 | "metadata": {}, 865 | "outputs": [ 866 | { 867 | "name": "stdout", 868 | "output_type": "stream", 869 | "text": [ 870 | "80614/80614 [==============================] - 3s 34us/step\n", 871 | "28485/28485 [==============================] - 1s 29us/step\n" 872 | ] 873 | } 874 | ], 875 | "source": [ 876 | "y_train_preds_dense = model.predict_proba(X_train,verbose = 1)\n", 877 | "y_valid_preds_dense = model.predict_proba(X_valid,verbose = 1)" 878 | ] 879 | }, 880 | { 881 | "cell_type": "code", 882 | "execution_count": 34, 883 | "metadata": {}, 884 | "outputs": [ 885 | { 886 | "data": { 887 | "text/plain": [ 888 | "0.29906715955045027" 889 | ] 890 | }, 891 | "execution_count": 34, 892 | "metadata": {}, 893 | "output_type": "execute_result" 894 | } 895 | ], 896 | "source": [ 897 | "thresh = (sum(y_train)/len(y_train))[0]\n", 898 | "thresh" 899 | ] 900 | }, 901 | { 902 | "cell_type": "code", 903 | "execution_count": 35, 904 | "metadata": {}, 905 | "outputs": [ 906 | { 907 | "name": "stdout", 908 | "output_type": "stream", 909 | "text": [ 910 | "Train\n", 911 | "AUC:0.993\n", 912 | "accuracy:0.977\n", 913 
| "recall:0.956\n", 914 | "precision:0.966\n", 915 | "specificity:0.986\n", 916 | "prevalence:0.299\n", 917 | " \n", 918 | "Valid\n", 919 | "AUC:0.854\n", 920 | "accuracy:0.751\n", 921 | "recall:0.392\n", 922 | "precision:0.816\n", 923 | "specificity:0.951\n", 924 | "prevalence:0.358\n", 925 | " \n" 926 | ] 927 | } 928 | ], 929 | "source": [ 930 | "print('Train');\n", 931 | "print_report(y_train, y_train_preds_dense, thresh)\n", 932 | "print('Valid');\n", 933 | "print_report(y_valid, y_valid_preds_dense, thresh);" 934 | ] 935 | }, 936 | { 937 | "cell_type": "markdown", 938 | "metadata": {}, 939 | "source": [ 940 | "The validation scores are much worse now! Validation AUC drops from 0.988 to 0.854 and recall from 0.951 to 0.392. That makes sense, since the sample-level split leaked patient information into the validation set. " 941 | ] 942 | }, 943 | { 944 | "cell_type": "markdown", 945 | "metadata": {}, 946 | "source": [ 947 | "# Lesson 2: a learning curve can tell us whether we should get more data! " 948 | ] 949 | }, 950 | { 951 | "cell_type": "markdown", 952 | "metadata": {}, 953 | "source": [ 954 | "Given the gap between training and validation performance (a sign of overfitting), let's make a simple learning curve to see if we should go collect more data. 
" 955 | ] 956 | }, 957 | { 958 | "cell_type": "code", 959 | "execution_count": 37, 960 | "metadata": {}, 961 | "outputs": [ 962 | { 963 | "name": "stdout", 964 | "output_type": "stream", 965 | "text": [ 966 | "1\n", 967 | "- 0.9240772729323752 0.70463376675334\n", 968 | "18\n", 969 | "- 0.9950558990711743 0.8059944826565408\n", 970 | "36\n", 971 | "- 0.9935767231154906 0.8606843559877093\n" 972 | ] 973 | } 974 | ], 975 | "source": [ 976 | "aucs_train = []\n", 977 | "aucs_valid = []\n", 978 | "\n", 979 | "n_pts = [1,18,36]\n", 980 | "for n_pt in n_pts:\n", 981 | " \n", 982 | " print(n_pt)\n", 983 | " pts_sub = pts_train[:n_pt]\n", 984 | " X_sub, y_sub, sym_sub = make_dataset(pts_sub, num_sec, fs,abnormal)\n", 985 | "\n", 986 | " # build the same model as before\n", 987 | " # one hidden relu layer with drop out (for regularization), then a sigmoid output\n", 988 | " model = Sequential()\n", 989 | " model.add(Dense(32, activation = 'relu', input_dim = X_sub.shape[1]))\n", 990 | " model.add(Dropout(rate = 0.25))\n", 991 | " model.add(Dense(1, activation = 'sigmoid'))\n", 992 | "\n", 993 | " # compile the model - use binary crossentropy (the labels are binary), and the adam optimizer\n", 994 | " model.compile(\n", 995 | " loss = 'binary_crossentropy',\n", 996 | " optimizer = 'adam',\n", 997 | " metrics = ['accuracy'])\n", 998 | "\n", 999 | " model.fit(X_sub, y_sub, batch_size = 32, epochs= 5, verbose = 0)\n", 1000 | " y_sub_preds_dense = model.predict_proba(X_sub,verbose = 0)\n", 1001 | " y_valid_preds_dense = model.predict_proba(X_valid,verbose = 0)\n", 1002 | " \n", 1003 | " auc_train = roc_auc_score(y_sub, y_sub_preds_dense)\n", 1004 | " auc_valid = roc_auc_score(y_valid, y_valid_preds_dense)\n", 1005 | " print('-',auc_train, auc_valid)\n", 1006 | " aucs_train.append(auc_train)\n", 1007 | " aucs_valid.append(auc_valid)" 1008 | ] 1009 | }, 1010 | { 1011 | "cell_type": "code", 1012 | "execution_count": 38, 1013 | "metadata": {}, 1014 | "outputs": [ 1015 | { 1016 | "data": { 1017 | 
"image/png": "\n", 1018 | "text/plain": [ 1019 | "
" 1020 | ] 1021 | }, 1022 | "metadata": { 1023 | "needs_background": "light" 1024 | }, 1025 | "output_type": "display_data" 1026 | } 1027 | ], 1028 | "source": [ 1029 | "plt.plot(n_pts, aucs_train, 'o-',label = 'Train')\n", 1030 | "plt.plot(n_pts, aucs_valid, 'o-',label = 'Valid')\n", 1031 | "plt.xlabel('Number Training Pts')\n", 1032 | "plt.ylabel('AUC')\n", 1033 | "plt.legend(bbox_to_anchor = (1.04,1), loc = 'upper left')\n", 1034 | "plt.show()" 1035 | ] 1036 | }, 1037 | { 1038 | "cell_type": "markdown", 1039 | "metadata": {}, 1040 | "source": [ 1041 | "More data appears to add extra value to the model. " 1042 | ] 1043 | }, 1044 | { 1045 | "cell_type": "markdown", 1046 | "metadata": {}, 1047 | "source": [ 1048 | "# Lesson 3: test multiple types of deep learning models" 1049 | ] 1050 | }, 1051 | { 1052 | "cell_type": "markdown", 1053 | "metadata": {}, 1054 | "source": [ 1055 | "## CNN" 1056 | ] 1057 | }, 1058 | { 1059 | "cell_type": "markdown", 1060 | "metadata": {}, 1061 | "source": [ 1062 | "Let's start by making a CNN. Here we will use a 1 dimensional CNN (as opposed to the 2D CNN for images). " 1063 | ] 1064 | }, 1065 | { 1066 | "cell_type": "markdown", 1067 | "metadata": {}, 1068 | "source": [ 1069 | "A CNN is a special type of deep learning algorithm which uses a set of filters and the convolution operator to reduce the number of parameters. This algorithm sparked the state-of-the-art techniques for image classification. Essentially, the way this works for 1D CNN is to take a filter (kernel) of size `kernel_size` starting with the first time stamp. The convolution operator takes the filter and multiplies each element against the first `kernel_size` time steps. These products are then summed for the first cell in the next layer of the neural network. The filter then moves over by `stride` time steps and repeats. The default `stride` in Keras is 1, which we will use. 
In image classification, most people use `padding`, which lets you pick up features on the edges of the image by adding 'extra' cells. We will use the Keras default, `padding = 'valid'`, which adds no padding. The filter values are the learned weights W: each weighted sum is added to a bias b and then passed through a non-linear activation function, as in a dense neural network. You can then repeat this with additional CNN layers if desired. Here we will use Dropout, which is a technique for reducing overfitting by randomly dropping some nodes during training. " 1070 | ] 1071 | }, 1072 | { 1073 | "cell_type": "code", 1074 | "execution_count": 39, 1075 | "metadata": {}, 1076 | "outputs": [ 1077 | { 1078 | "name": "stdout", 1079 | "output_type": "stream", 1080 | "text": [ 1081 | "(80614, 2160, 1)\n", 1082 | "(28485, 2160, 1)\n" 1083 | ] 1084 | } 1085 | ], 1086 | "source": [ 1087 | "# reshape input to be [samples, time steps, features = 1]\n", 1088 | "X_train_cnn = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))\n", 1089 | "X_valid_cnn = np.reshape(X_valid, (X_valid.shape[0], X_valid.shape[1], 1))\n", 1090 | "\n", 1091 | "print(X_train_cnn.shape)\n", 1092 | "print(X_valid_cnn.shape)\n" 1093 | ] 1094 | }, 1095 | { 1096 | "cell_type": "code", 1097 | "execution_count": 40, 1098 | "metadata": {}, 1099 | "outputs": [], 1100 | "source": [ 1101 | "from keras.layers import Conv1D" 1102 | ] 1103 | }, 1104 | { 1105 | "cell_type": "code", 1106 | "execution_count": 41, 1107 | "metadata": {}, 1108 | "outputs": [], 1109 | "source": [ 1110 | "model = Sequential()\n", 1111 | "model.add(Conv1D(filters = 128, kernel_size = 5, activation = 'relu', input_shape = (2160,1)))\n", 1112 | "model.add(Dropout(rate = 0.25))\n", 1113 | "model.add(Flatten())\n", 1114 | "model.add(Dense(1, activation = 'sigmoid'))\n", 1115 | "\n", 1116 | "# compile the model - use binary crossentropy (single sigmoid output), and the adam optimizer\n", 1117 | "model.compile(\n", 1118 | " loss = 'binary_crossentropy',\n", 1119 | " optimizer = 'adam',\n", 1120
| " metrics = ['accuracy'])" 1121 | ] 1122 | }, 1123 | { 1124 | "cell_type": "code", 1125 | "execution_count": 42, 1126 | "metadata": {}, 1127 | "outputs": [ 1128 | { 1129 | "name": "stdout", 1130 | "output_type": "stream", 1131 | "text": [ 1132 | "Epoch 1/2\n", 1133 | "80614/80614 [==============================] - 462s 6ms/step - loss: 0.1963 - acc: 0.9338\n", 1134 | "Epoch 2/2\n", 1135 | "80614/80614 [==============================] - 407s 5ms/step - loss: 0.1204 - acc: 0.9641\n" 1136 | ] 1137 | }, 1138 | { 1139 | "data": { 1140 | "text/plain": [ 1141 | "" 1142 | ] 1143 | }, 1144 | "execution_count": 42, 1145 | "metadata": {}, 1146 | "output_type": "execute_result" 1147 | } 1148 | ], 1149 | "source": [ 1150 | "model.fit(X_train_cnn, y_train, batch_size = 32, epochs= 2, verbose = 1)" 1151 | ] 1152 | }, 1153 | { 1154 | "cell_type": "code", 1155 | "execution_count": 43, 1156 | "metadata": {}, 1157 | "outputs": [ 1158 | { 1159 | "name": "stdout", 1160 | "output_type": "stream", 1161 | "text": [ 1162 | "80614/80614 [==============================] - 74s 913us/step\n", 1163 | "28485/28485 [==============================] - 26s 912us/step\n" 1164 | ] 1165 | } 1166 | ], 1167 | "source": [ 1168 | "y_train_preds_cnn = model.predict_proba(X_train_cnn,verbose = 1)\n", 1169 | "y_valid_preds_cnn = model.predict_proba(X_valid_cnn,verbose = 1)" 1170 | ] 1171 | }, 1172 | { 1173 | "cell_type": "code", 1174 | "execution_count": 44, 1175 | "metadata": {}, 1176 | "outputs": [ 1177 | { 1178 | "name": "stdout", 1179 | "output_type": "stream", 1180 | "text": [ 1181 | "Train\n", 1182 | "AUC:0.993\n", 1183 | "accuracy:0.966\n", 1184 | "recall:0.967\n", 1185 | "precision:0.923\n", 1186 | "specificity:0.966\n", 1187 | "prevalence:0.299\n", 1188 | " \n", 1189 | "Valid\n", 1190 | "AUC:0.907\n", 1191 | "accuracy:0.821\n", 1192 | "recall:0.809\n", 1193 | "precision:0.723\n", 1194 | "specificity:0.827\n", 1195 | "prevalence:0.358\n", 1196 | " \n" 1197 | ] 1198 | } 1199 | ], 1200 | "source": [ 
1201 | "print('Train');\n", 1202 | "print_report(y_train, y_train_preds_cnn, thresh)\n", 1203 | "print('Valid');\n", 1204 | "print_report(y_valid, y_valid_preds_cnn, thresh);" 1205 | ] 1206 | }, 1207 | { 1208 | "cell_type": "markdown", 1209 | "metadata": {}, 1210 | "source": [ 1211 | "## LSTM" 1212 | ] 1213 | }, 1214 | { 1215 | "cell_type": "code", 1216 | "execution_count": null, 1217 | "metadata": {}, 1218 | "outputs": [], 1219 | "source": [] 1220 | }, 1221 | { 1222 | "cell_type": "code", 1223 | "execution_count": 45, 1224 | "metadata": {}, 1225 | "outputs": [], 1226 | "source": [ 1227 | "from keras.layers import Bidirectional, LSTM" 1228 | ] 1229 | }, 1230 | { 1231 | "cell_type": "code", 1232 | "execution_count": 46, 1233 | "metadata": {}, 1234 | "outputs": [], 1235 | "source": [ 1236 | "model = Sequential()\n", 1237 | "model.add(Bidirectional(LSTM(64, input_shape=(X_train_cnn.shape[1], X_train_cnn.shape[2]))))\n", 1238 | "model.add(Dropout(rate = 0.25))\n", 1239 | "model.add(Dense(1, activation = 'sigmoid'))\n", 1240 | "model.compile(\n", 1241 | " loss = 'binary_crossentropy',\n", 1242 | " optimizer = 'adam',\n", 1243 | " metrics = ['accuracy'])" 1244 | ] 1245 | }, 1246 | { 1247 | "cell_type": "markdown", 1248 | "metadata": {}, 1249 | "source": [ 1250 | "Reduce the dataset to make this feasible for a weekend project: train on only the first 10,000 samples for a single epoch." 1251 | ] 1252 | }, 1253 | { 1254 | "cell_type": "code", 1255 | "execution_count": 47, 1256 | "metadata": {}, 1257 | "outputs": [ 1258 | { 1259 | "name": "stdout", 1260 | "output_type": "stream", 1261 | "text": [ 1262 | "Epoch 1/1\n", 1263 | "10000/10000 [==============================] - 787s 79ms/step - loss: 0.4999 - acc: 0.7680\n" 1264 | ] 1265 | }, 1266 | { 1267 | "data": { 1268 | "text/plain": [ 1269 | "" 1270 | ] 1271 | }, 1272 | "execution_count": 47, 1273 | "metadata": {}, 1274 | "output_type": "execute_result" 1275 | } 1276 | ], 1277 | "source": [ 1278 | "model.fit(X_train_cnn[:10000], y_train[:10000], batch_size = 32, epochs= 1, verbose = 1)" 
1279 | ] 1280 | }, 1281 | { 1282 | "cell_type": "code", 1283 | "execution_count": 48, 1284 | "metadata": {}, 1285 | "outputs": [ 1286 | { 1287 | "name": "stdout", 1288 | "output_type": "stream", 1289 | "text": [ 1290 | "10000/10000 [==============================] - 109s 11ms/step\n", 1291 | "28485/28485 [==============================] - 307s 11ms/step\n" 1292 | ] 1293 | } 1294 | ], 1295 | "source": [ 1296 | "y_train_preds_lstm = model.predict_proba(X_train_cnn[:10000],verbose = 1)\n", 1297 | "y_valid_preds_lstm = model.predict_proba(X_valid_cnn,verbose = 1)" 1298 | ] 1299 | }, 1300 | { 1301 | "cell_type": "code", 1302 | "execution_count": 49, 1303 | "metadata": {}, 1304 | "outputs": [ 1305 | { 1306 | "name": "stdout", 1307 | "output_type": "stream", 1308 | "text": [ 1309 | "Train\n", 1310 | "AUC:0.911\n", 1311 | "accuracy:0.817\n", 1312 | "recall:0.901\n", 1313 | "precision:0.766\n", 1314 | "specificity:0.737\n", 1315 | "prevalence:0.489\n", 1316 | " \n", 1317 | "Valid\n", 1318 | "AUC:0.554\n", 1319 | "accuracy:0.634\n", 1320 | "recall:0.281\n", 1321 | "precision:0.481\n", 1322 | "specificity:0.831\n", 1323 | "prevalence:0.358\n", 1324 | " \n" 1325 | ] 1326 | } 1327 | ], 1328 | "source": [ 1329 | "print('Train');\n", 1330 | "print_report(y_train[:10000], y_train_preds_lstm, thresh)\n", 1331 | "print('Valid');\n", 1332 | "print_report(y_valid, y_valid_preds_lstm, thresh);" 1333 | ] 1334 | }, 1335 | { 1336 | "cell_type": "code", 1337 | "execution_count": 50, 1338 | "metadata": {}, 1339 | "outputs": [ 1340 | { 1341 | "data": { 1342 | "image/png": "\n", 1343 | "text/plain": [ 1344 | "
" 1345 | ] 1346 | }, 1347 | "metadata": { 1348 | "needs_background": "light" 1349 | }, 1350 | "output_type": "display_data" 1351 | } 1352 | ], 1353 | "source": [ 1354 | "from sklearn.metrics import roc_curve, roc_auc_score\n", 1355 | "\n", 1356 | "\n", 1357 | "fpr_valid_cnn, tpr_valid_cnn, t_valid_cnn = roc_curve(y_valid, y_valid_preds_cnn)\n", 1358 | "auc_valid_cnn = roc_auc_score(y_valid, y_valid_preds_cnn)\n", 1359 | "\n", 1360 | "fpr_valid_dense, tpr_valid_dense, t_valid_dense = roc_curve(y_valid, y_valid_preds_dense)\n", 1361 | "auc_valid_dense = roc_auc_score(y_valid, y_valid_preds_dense)\n", 1362 | "\n", 1363 | "fpr_valid_lstm, tpr_valid_lstm, t_valid_lstm = roc_curve(y_valid, y_valid_preds_lstm)\n", 1364 | "auc_valid_lstm = roc_auc_score(y_valid, y_valid_preds_lstm)\n", 1365 | "\n", 1366 | "plt.plot(fpr_valid_cnn, tpr_valid_cnn, 'g-', label = 'CNN AUC:%.3f'%auc_valid_cnn)\n", 1367 | "plt.plot(fpr_valid_dense, tpr_valid_dense, 'r-', label = 'Dense AUC:%.3f'%auc_valid_dense)\n", 1368 | "plt.plot(fpr_valid_lstm, tpr_valid_lstm, 'b-', label = 'LSTM AUC:%.3f'%auc_valid_lstm)\n", 1369 | "\n", 1370 | "plt.plot([0,1],[0,1], 'k--')\n", 1371 | "plt.xlabel('FPR')\n", 1372 | "plt.ylabel('TPR')\n", 1373 | "plt.legend(bbox_to_anchor = (1.04,1), loc = 'upper left')\n", 1374 | "plt.title('Validation Set')\n", 1375 | "plt.show()" 1376 | ] 1377 | }, 1378 | { 1379 | "cell_type": "code", 1380 | "execution_count": null, 1381 | "metadata": {}, 1382 | "outputs": [], 1383 | "source": [] 1384 | }, 1385 | { 1386 | "cell_type": "code", 1387 | "execution_count": null, 1388 | "metadata": {}, 1389 | "outputs": [], 1390 | "source": [] 1391 | } 1392 | ], 1393 | "metadata": { 1394 | "kernelspec": { 1395 | "display_name": "tutorials", 1396 | "language": "python", 1397 | "name": "tutorials" 1398 | }, 1399 | "language_info": { 1400 | "codemirror_mode": { 1401 | "name": "ipython", 1402 | "version": 3 1403 | }, 1404 | "file_extension": ".py", 1405 | "mimetype": "text/x-python", 1406 | "name": 
"python", 1407 | "nbconvert_exporter": "python", 1408 | "pygments_lexer": "ipython3", 1409 | "version": "3.6.9" 1410 | } 1411 | }, 1412 | "nbformat": 4, 1413 | "nbformat_minor": 2 1414 | } 1415 | --------------------------------------------------------------------------------
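The sliding-filter arithmetic described in the notebook's CNN section can be illustrated outside Keras. The following is a minimal NumPy sketch, not the notebook's code and not how Keras implements `Conv1D` internally: the helper `conv1d_valid` and the moving-average filter are hypothetical names chosen for illustration, and the bias and activation are left out. It assumes no padding, so the output has `(n - kernel_size) // stride + 1` steps.

```python
import numpy as np


def conv1d_valid(signal, kernel, stride=1):
    """Slide `kernel` over `signal` with no padding and return the weighted sums.

    Hypothetical helper for illustration only; bias and activation are omitted.
    """
    kernel_size = len(kernel)
    # Number of positions where the whole kernel fits inside the signal.
    n_out = (len(signal) - kernel_size) // stride + 1
    out = np.empty(n_out)
    for i in range(n_out):
        start = i * stride
        # Multiply the filter element-wise against the current window, then sum.
        out[i] = np.dot(signal[start:start + kernel_size], kernel)
    return out


signal = np.arange(10, dtype=float)  # toy stand-in for an ECG window
kernel = np.ones(5) / 5.0            # hypothetical moving-average filter
result = conv1d_valid(signal, kernel)
print(result.shape)
```

Under the same assumptions, a 2160-sample window with `kernel_size = 5` and `stride = 1` gives 2156 output steps per filter, which is the length the `Flatten` layer in the notebook's CNN would see for each of its 128 filters.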