├── EDA ├── README.md └── NaNs_updated_to_medians.ipynb ├── Models ├── README.md ├── sample_output_decisiontree.png ├── decision trees accuracy comparison.PNG └── KimsLogReg2.ipynb ├── .DS_Store ├── image_16_sample_tree.PNG ├── image_2_sample_tree.PNG ├── image_4_sample_tree.PNG ├── sample_output_decisiontree.png ├── docker-compose.yml ├── README.md └── data └── toy_test300.txt /EDA/README.md: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /Models/README.md: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cpapadimitriou/Click-Through-Rate-prediction/HEAD/.DS_Store -------------------------------------------------------------------------------- /image_16_sample_tree.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cpapadimitriou/Click-Through-Rate-prediction/HEAD/image_16_sample_tree.PNG -------------------------------------------------------------------------------- /image_2_sample_tree.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cpapadimitriou/Click-Through-Rate-prediction/HEAD/image_2_sample_tree.PNG -------------------------------------------------------------------------------- /image_4_sample_tree.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cpapadimitriou/Click-Through-Rate-prediction/HEAD/image_4_sample_tree.PNG -------------------------------------------------------------------------------- /sample_output_decisiontree.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/cpapadimitriou/Click-Through-Rate-prediction/HEAD/sample_output_decisiontree.png -------------------------------------------------------------------------------- /Models/sample_output_decisiontree.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cpapadimitriou/Click-Through-Rate-prediction/HEAD/Models/sample_output_decisiontree.png -------------------------------------------------------------------------------- /Models/decision trees accuracy comparison.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cpapadimitriou/Click-Through-Rate-prediction/HEAD/Models/decision trees accuracy comparison.PNG -------------------------------------------------------------------------------- /docker-compose.yml: -------------------------------------------------------------------------------- 1 | version: '3' 2 | services: 3 | quickstart.cloudera: 4 | image: w261/w261-environment:latest 5 | hostname: docker.w261 6 | privileged: true 7 | command: bash -c "/root/start-notebook.sh;/usr/bin/docker-quickstart" 8 | ports: 9 | - "8887:8888" # Hue server 10 | - "8889:8889" # jupyter 11 | - "10020:10020" # mapreduce job history server 12 | - "8022:22" # ssh 13 | - "7180:7180" # Cloudera Manager 14 | - "11000:11000" # Oozie 15 | - "50070:50070" # HDFS REST Namenode 16 | - "50075:50075" # hdfs REST Datanode 17 | - "8088:8088" # yarn resource manager webapp address 18 | - "19888:19888" # mapreduce job history webapp address 19 | - "8983:8983" # Solr console 20 | - "8032:8032" # yarn resource manager access 21 | - "8042:8042" # yarn node manager 22 | - "60010:60010" # hbase 23 | - "4040:4040" # Spark UI 24 | - "8080:8080" # Hadoop Job Tracker 25 | tty: true 26 | stdin_open: true 27 | volumes: 28 | - .:/media/notebooks 29 | 
-------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Click-Through Rate (CTR) prediction with PySpark on Criteo's advertising data 2 | 3 | ## Background 4 | 5 | The following analysis is based on a Kaggle dataset from Criteo, an internet advertising company focused on retargeting. Criteo's goal is to increase online clickthrough rates among consumers who have previously visited an advertiser's website. This information will be used by Criteo to serve the right ads to the right people more efficiently. Optimizing the retargeting process not only helps advertisers spend their budgets more efficiently, but also reduces clutter for consumers who do not want to be "followed" by ads for irrelevant products (or ones they may have already purchased!). Our goal is to create a model that predicts clickthroughs (label = 1) as accurately as possible. Because the output label is binary (0, 1), we are exploring classification models. 6 | 7 | Features in the data set most likely represent characteristics of consumer behavior (history of clickthroughs, site visitation, etc.), the ads themselves (product, creative approach, placement, etc.), and general metrics such as the date the ad was published. However, since there is no visibility into what each feature represents, our challenge is to make predictions based on the data alone. With over 6 million records to train on each day (~45 million per week), this will require a scalable approach. 
8 | 9 | ## Dataset 10 | The data for this project is available here: 11 | http://labs.criteo.com/2014/09/kaggle-contest-dataset-now-available-academic-use/ 12 | 13 | Read more about the data at the Kaggle competition website here: 14 | https://www.kaggle.com/c/criteo-display-ad-challenge 15 | 16 | ## Dataset Introduction 17 | 18 | The training dataset consists of a portion of Criteo's traffic over a period of 7 days. Each row corresponds to a display ad served by Criteo, and the first column indicates whether the ad was clicked. The positive (clicked) and negative (non-clicked) examples have both been subsampled (but at different rates: 75% class 0, 25% class 1) in order to reduce the dataset size. 19 | 20 | There are 13 numerical features (mostly count features) and 26 categorical features in this dataset. The values of the categorical features have been hashed onto 32 bits for anonymization purposes. The semantics of these features are undisclosed. Some features may have missing values. All the rows are chronologically ordered. The test set is computed in the same way as the training set, but it corresponds to events on the day following the training period and does not have the label column. Since no time data is available, we do not treat this dataset as a time series. 21 | 22 | ## Key Questions: Features and Model 23 | 24 | **1. Which features are most important in predicting clickthroughs?** 25 | 26 | Having this information can help Criteo focus on the metrics that are most critical to their product. With 39 features, there is a high risk of overfitting. We should identify a model that provides an optimal tradeoff between bias and variance. Since we did not receive any metadata about the features, we rely on EDA and regularization techniques to help us determine the important features and reduce the dimensionality of the feature space. 27 | 28 | **2. 
Which machine learning approach not only provides the highest accuracy in predicting clickthroughs, but is also scalable enough to be useful in a production environment?** 29 | 30 | As internet patterns and product choices change rapidly, the ideal model should be retrained daily to update the following day's retargeting model. Distributing the training would give us shorter training times than processing records sequentially. Any ML algorithm that can be trained using associative and commutative operations (e.g. simple addition, with no state dependencies), such as batch logistic regression or tree-based algorithms, can be scaled out this way. 31 | 32 | ## Resources 33 | Note that ‘Click Through Rate Prediction’ is not a single algorithm like ‘Naive Bayes’ but rather a goal which can be achieved through a number of different methods. There is a lot of literature out there about binary classification, ensemble methods, factorization machines, collaborative filtering and about the original Kaggle Competition. Do not feel pressured to implement any one approach -- instead try to get a sense for the space and then quickly narrow down to an approach you can wrap your head around. Here are some reading materials to get you started. 
34 | * https://www.dropbox.com/s/s4x7wp8gjsh021d/TISTRespPredAds-Chappelle-CTR-Prediction-2014Paper.pdf?dl+=0 35 | * https://www.csie.ntu.edu.tw/~r01922136/slides/ffm.pd 36 | * https://www.dropbox.com/s/iozods194twg2pv/MLParis2015-excellent-Sldies.pdf?dl=0 37 | * http://statweb.stanford.edu/~jhf/ftp/trebst.pdf 38 | * https://www.dropbox.com/s/2n8uekjwpaur3bj/Deep-Learning-for-Criteo-Documentation.pdf?dl=0 39 | * https://arxiv.org/pdf/1711.01377.pdf 40 | * https://arxiv.org/pdf/1701.04099.pdf 41 | * https://research.fb.com/publications/practical-lessons-from-predicting-clicks-on-ads-at-facebook/ 42 | * https://www.csie.ntu.edu.tw/~r01922136/slides/kaggle-avazu.pdf 43 | 44 | 45 | 46 | ## How to run Jupyter on a GCP cluster: 47 | https://cloud.google.com/dataproc/docs/tutorials/jupyter-notebook 48 | -------------------------------------------------------------------------------- /EDA/NaNs_updated_to_medians.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# W261 Final Project\n", 8 | "\n", 9 | "#### *Anusha Munjuluri, Arvindh Ganesan, Kim Vignola, Christina Papadimitriou*" 10 | ] 11 | }, 12 | { 13 | "cell_type": "markdown", 14 | "metadata": {}, 15 | "source": [ 16 | "### Notebook Set-up" 17 | ] 18 | }, 19 | { 20 | "cell_type": "code", 21 | "execution_count": 166, 22 | "metadata": {}, 23 | "outputs": [], 24 | "source": [ 25 | "# imports\n", 26 | "import re\n", 27 | "import time\n", 28 | "import numpy as np\n", 29 | "import pandas as pd\n", 30 | "import seaborn as sns\n", 31 | "import matplotlib.pyplot as plt" 32 | ] 33 | }, 34 | { 35 | "cell_type": "code", 36 | "execution_count": 167, 37 | "metadata": {}, 38 | "outputs": [], 39 | "source": [ 40 | "# store path to notebook\n", 41 | "PWD = !pwd\n", 42 | "PWD = PWD[0]" 43 | ] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": 168, 48 | "metadata": {}, 49 | "outputs": [], 50 | 
"source": [ 51 | "# start Spark Session\n", 52 | "from pyspark.sql import SparkSession\n", 53 | "app_name = \"final_project\"\n", 54 | "master = \"local[*]\"\n", 55 | "spark = SparkSession\\\n", 56 | " .builder\\\n", 57 | " .appName(app_name)\\\n", 58 | " .master(master)\\\n", 59 | " .getOrCreate()\n", 60 | "sc = spark.sparkContext" 61 | ] 62 | }, 63 | { 64 | "cell_type": "markdown", 65 | "metadata": {}, 66 | "source": [ 67 | "## 1. Question Formulation" 68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": null, 73 | "metadata": {}, 74 | "outputs": [], 75 | "source": [] 76 | }, 77 | { 78 | "cell_type": "markdown", 79 | "metadata": {}, 80 | "source": [ 81 | "## 2. Algorithm Explanation" 82 | ] 83 | }, 84 | { 85 | "cell_type": "code", 86 | "execution_count": null, 87 | "metadata": {}, 88 | "outputs": [], 89 | "source": [] 90 | }, 91 | { 92 | "cell_type": "markdown", 93 | "metadata": {}, 94 | "source": [ 95 | "### Data Loading and Pre-Processing" 96 | ] 97 | }, 98 | { 99 | "cell_type": "code", 100 | "execution_count": 169, 101 | "metadata": {}, 102 | "outputs": [ 103 | { 104 | "name": "stdout", 105 | "output_type": "stream", 106 | "text": [ 107 | "0\t1\t1\t5\t0\t1382\t4\t15\t2\t181\t1\t2\t\t2\t68fd1e64\t80e26c9b\tfb936136\t7b4723c4\t25c83c98\t7e0ccccf\tde7995b8\t1f89b562\ta73ee510\ta8cd5504\tb2cb9c98\t37c9c164\t2824a5f6\t1adce6ef\t8ba8b39a\t891b62e7\te5ba7672\tf54016b9\t21ddcdc9\tb1252a9d\t07b5194c\t\t3a171ecb\tc5c50484\te8b83407\t9727dd16\n" 108 | ] 109 | } 110 | ], 111 | "source": [ 112 | "# take a look at the data\n", 113 | "!head -n 1 data/train.txt" 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": 170, 119 | "metadata": {}, 120 | "outputs": [], 121 | "source": [ 122 | "# load the data\n", 123 | "fullTrainRDD = sc.textFile('data/train.txt')\n", 124 | "testRDD = sc.textFile('data/test.txt')\n", 125 | "\n", 126 | "FIELDS = ['I1','I2','I3','I4','I5','I6','I7','I8','I9','I10','I11','I12','I13',\n", 127 | " 
'C1','C2','C3','C4','C5','C6','C7','C8','C9','C10','C11','C12','C13','C14',\n", 128 | " 'C15','C16','C17','C18','C19','C20','C21','C22','C23','C24','C25','C26','Label']" 129 | ] 130 | }, 131 | { 132 | "cell_type": "code", 133 | "execution_count": null, 134 | "metadata": {}, 135 | "outputs": [], 136 | "source": [ 137 | "# number of rows in train/test data\n", 138 | "print(f\"Number of records in train data: {fullTrainRDD.count()} ...\")\n", 139 | "print(f\"Number of records in test data: {testRDD.count()} ...\")" 140 | ] 141 | }, 142 | { 143 | "cell_type": "code", 144 | "execution_count": null, 145 | "metadata": {}, 146 | "outputs": [], 147 | "source": [ 148 | "# Generate 80/20 (pseudo)random train/test split \n", 149 | "trainRDD, heldOutRDD = fullTrainRDD.randomSplit([0.8,0.2], seed = 1)\n", 150 | "print(f\"... held out {heldOutRDD.count()} records for evaluation and assigned {trainRDD.count()} for training.\")" 151 | ] 152 | }, 153 | { 154 | "cell_type": "code", 155 | "execution_count": 171, 156 | "metadata": {}, 157 | "outputs": [], 158 | "source": [ 159 | "toyRDD, trainRDD2 = trainRDD.randomSplit([0.001,0.999], seed = 2)" 160 | ] 161 | }, 162 | { 163 | "cell_type": "code", 164 | "execution_count": 172, 165 | "metadata": {}, 166 | "outputs": [ 167 | { 168 | "data": { 169 | "text/plain": [ 170 | "['1\\t5\\t2\\t\\t\\t1382\\t17\\t78\\t25\\t76\\t0\\t9\\t\\t\\t05db9164\\t942f9a8d\\t56472604\\t53a5d493\\t25c83c98\\t\\t49b74ebc\\t6c41e35e\\ta73ee510\\te113fc4b\\tc4adf918\\t08531bcb\\t85dbe138\\t1adce6ef\\tae97ecc3\\t76b06ec3\\te5ba7672\\t1f868fdd\\t9437f62f\\ta458ea53\\tff4c70b8\\t\\t32c7478e\\tda89b7d5\\t7a402766\\tc7beb94e',\n", 171 | " 
'1\\t2\\t12\\t8\\t3\\t937\\t7\\t36\\t6\\t73\\t1\\t10\\t\\t4\\t05db9164\\tbce95927\\t02391f51\\tb9c629a9\\t25c83c98\\t7e0ccccf\\t9971a939\\t6a698541\\ta73ee510\\t2124a520\\t3ac87d37\\t2397259a\\tcde3ec68\\t07d13a8f\\tfec218c0\\td37efe8c\\te5ba7672\\t04d863d5\\t21ddcdc9\\t5840adea\\tb6119319\\t\\t423fab69\\t45ab94c8\\te8b83407\\tb13f4ade',\n", 172 | " '0\\t\\t51\\t74\\t\\t39039\\t65\\t1\\t0\\t5\\t\\t0\\t\\t\\t05db9164\\t0468d672\\t1c74e7a5\\ta98187d1\\t25c83c98\\tfe6b92e5\\t306a1d05\\tf504a6f4\\ta73ee510\\t3b08e48b\\t788bd9f4\\t6f0b856c\\t71b17693\\t07d13a8f\\tb4512bcd\\t6b56c939\\t07c540c4\\t0c4e94df\\t21ddcdc9\\t5840adea\\tf7d23965\\t\\t93bad2c0\\t61e3864e\\tea9a246c\\tfcd456fa',\n", 173 | " '0\\t\\t0\\t2\\t4\\t3288\\t101\\t3\\t23\\t37\\t\\t1\\t\\t21\\t05db9164\\td833535f\\tb00d1501\\td16679b9\\t4cf72387\\t7e0ccccf\\t9eec359f\\t0b153874\\ta73ee510\\tb8fa4771\\t636405ac\\te0d76380\\t31b42deb\\tb28479f6\\ta733d362\\t1203a270\\td4bb7bd8\\t281769c2\\t\\t\\t73d06dde\\t\\t3a171ecb\\taee52b6f\\t\\t',\n", 174 | " '0\\t3\\t28\\t3\\t19\\t799\\t33\\t3\\t30\\t33\\t1\\t1\\t\\t29\\t05db9164\\td833535f\\td032c263\\tc18be181\\t25c83c98\\t7e0ccccf\\t705feb79\\t1f89b562\\ta73ee510\\tefea433b\\tcff871dc\\tdfbb09fb\\t64533206\\tb28479f6\\te2502ec9\\t84898b2a\\te5ba7672\\t42a2edb9\\t\\t\\t0014c32a\\t\\t32c7478e\\t3b183c5c\\t\\t']" 175 | ] 176 | }, 177 | "execution_count": 172, 178 | "metadata": {}, 179 | "output_type": "execute_result" 180 | } 181 | ], 182 | "source": [ 183 | "toyRDD.take(5)" 184 | ] 185 | }, 186 | { 187 | "cell_type": "code", 188 | "execution_count": 173, 189 | "metadata": {}, 190 | "outputs": [], 191 | "source": [ 192 | "# helper functions\n", 193 | "def parse(line):\n", 194 | " \"\"\"\n", 195 | " Map line --> tuple of (features, label)\n", 196 | " \"\"\"\n", 197 | " fields = np.array(line.split('\\t'))\n", 198 | " features,label = fields[1:14], fields[0]\n", 199 | " return(features, label)\n", 200 | "\n", 201 | "def edit_data_types(line):\n", 202 | " \"\"\"\n", 203 
| " Map tuple of (features, label) --> tuple of (formatted features, label)\n", 204 | " \n", 205 | " * '' is replaced with NaN (np.nan)\n", 206 | " * numerical fields are converted to floats\n", 207 | " \"\"\"\n", 208 | " features, label = line[0], line[1]\n", 209 | " formated_features = []\n", 210 | " for i, value in enumerate(features):\n", 211 | " if value == '':\n", 212 | " formated_features.append(np.nan)\n", 213 | " else:\n", 214 | " if i < 13:\n", 215 | " formated_features.append(float(value)) \n", 216 | " else:\n", 217 | " formated_features.append(value)\n", 218 | " return (formated_features, label)" 219 | ] 220 | }, 221 | { 222 | "cell_type": "code", 223 | "execution_count": 174, 224 | "metadata": {}, 225 | "outputs": [], 226 | "source": [ 227 | "#trainRDDCached = trainRDD.map(parse).map(edit_data_types).cache()\n", 228 | "toyRDDCached1 = toyRDD.map(parse).map(edit_data_types).cache()" 229 | ] 230 | }, 231 | { 232 | "cell_type": "code", 233 | "execution_count": 175, 234 | "metadata": {}, 235 | "outputs": [ 236 | { 237 | "name": "stdout", 238 | "output_type": "stream", 239 | "text": [ 240 | "[([5.0, 2.0, nan, nan, 1382.0, 17.0, 78.0, 25.0, 76.0, 0.0, 9.0, nan, nan], '1')]\n" 241 | ] 242 | } 243 | ], 244 | "source": [ 245 | "print(toyRDDCached1.take(1))" 246 | ] 247 | }, 248 | { 249 | "cell_type": "code", 250 | "execution_count": 176, 251 | "metadata": {}, 252 | "outputs": [], 253 | "source": [ 254 | "sample = np.array(toyRDDCached1.map(lambda x: np.append(x[0], [x[1]])).takeSample(False, 1000))\n", 255 | "sample_df = pd.DataFrame(np.array(sample), columns = ['I1','I2','I3','I4','I5','I6','I7','I8','I9','I10','I11','I12','I13', 'Label'])" 256 | ] 257 | }, 258 | { 259 | "cell_type": "code", 260 | "execution_count": 177, 261 | "metadata": {}, 262 | "outputs": [], 263 | "source": [ 264 | "columns = (['I1','I2','I3','I4','I5','I6','I7','I8','I9','I10','I11','I12','I13', 'Label'])\n", 265 | "#columns = ['I1', 'I2', 'I3', 'I4', 'I5', 'I6', 'I7', 'I8', 'I9', 'I10', 
'I11', 'I12', 'I13']\n", 266 | "sample_numeric = sample_df.reindex(columns=columns)\n", 267 | "sample_numeric[columns] = sample_numeric[columns].astype(float)" 268 | ] 269 | }, 270 | { 271 | "cell_type": "code", 272 | "execution_count": 178, 273 | "metadata": {}, 274 | "outputs": [ 275 | { 276 | "name": "stdout", 277 | "output_type": "stream", 278 | "text": [ 279 | "[3.6761061946902656, 129.768, 19.083969465648856, 7.478535353535354, 24031.09182643794, 113.30208333333333, 16.173821989528797, 12.457, 102.717277486911, 0.6495575221238938, 2.8293193717277485, 1.1415929203539823, 8.308080808080808]\n", 280 | "[9.04753693892924, 460.60738615007034, 35.28575565553717, 8.70109965628185, 92782.65943349876, 343.9023789473689, 49.37351119685013, 12.67218019916068, 232.37267788885407, 0.7196513282846247, 5.737100628411079, 5.262771665595769, 11.208584163174447]\n" 281 | ] 282 | } 283 | ], 284 | "source": [ 285 | "\"\"\"Get means and standard deviations. Ideally we should do this in the RDD vs. pandas\"\"\"\n", 286 | "\n", 287 | "means = []\n", 288 | "stdevs = []\n", 289 | "\n", 290 | "for i in sample_numeric.columns[0:13]:\n", 291 | " mean = np.nanmean(sample_numeric[i])\n", 292 | " means.append(mean)\n", 293 | " std = np.nanstd(sample_numeric[i])\n", 294 | " stdevs.append(std)\n", 295 | " \n", 296 | "print(means)\n", 297 | "print(stdevs)\n" 298 | ] 299 | }, 300 | { 301 | "cell_type": "code", 302 | "execution_count": 179, 303 | "metadata": {}, 304 | "outputs": [ 305 | { 306 | "name": "stdout", 307 | "output_type": "stream", 308 | "text": [ 309 | "[2.0000e+00 4.0000e+00 6.0000e+00 4.0000e+00 1.1855e+03 1.4000e+01\n", 310 | " 7.0000e+00 8.0000e+00 3.2000e+01 1.0000e+00 2.0000e+00 0.0000e+00\n", 311 | " 3.0000e+00 1.0000e+00]\n", 312 | "[2.0000e+00 4.0000e+00 6.0000e+00 4.0000e+00 1.1855e+03 1.4000e+01\n", 313 | " 7.0000e+00 8.0000e+00 3.2000e+01 1.0000e+00 2.0000e+00 0.0000e+00\n", 314 | " 3.0000e+00 1.0000e+00]\n" 315 | ] 316 | } 317 | ], 318 | "source": [ 319 | "\"\"\"Get 
medians for each class. Ideally we should do this in the RDD vs. pandas\"\"\"\n", 320 | "\n", 321 | "median1 = np.array(sample_numeric[sample_numeric['Label'] == 1.0].median().tolist())\n", 322 | "print(median1)\n", 323 | "\n", 324 | "median0 = np.array(sample_numeric[sample_numeric['Label'] == 0.0].median().tolist())\n", 325 | "print(median0)" 326 | ] 327 | }, 328 | { 329 | "cell_type": "code", 330 | "execution_count": 180, 331 | "metadata": {}, 332 | "outputs": [], 333 | "source": [ 334 | "# helper functions\n", 335 | "def parse(line):\n", 336 | " \"\"\"\n", 337 | " Map line --> tuple of (features, label)\n", 338 | " \"\"\"\n", 339 | " fields = np.array(line.split('\\t'))\n", 340 | " features,label = fields[1:14], fields[0]\n", 341 | " return(features, label)\n", 342 | "\n", 343 | "\n", 344 | "def update_nans(line):\n", 345 | " \"\"\"\n", 346 | " Map tuple of (features, label) --> tuple of (formatted features, label)\n", 347 | " \n", 348 | " * '' is replaced with that feature's median for the record's class (median1 or median0)\n", 349 | " * numerical fields are converted to floats\n", 350 | " \"\"\"\n", 351 | " \n", 352 | " #median1 = np.array([2.0, 3.5, 4.0, 4.0, 1362.0, 13.5, 8.0, 7.0, 42.5, 1.0, 2.0, 0.0, 3.0, 1.0])\n", 353 | " #median0 = np.array([0.0, 2.0, 7.0, 5.0, 3539.0, 46.5, 2.0, 8.0, 38.5, 0.0, 1.0, 0.0, 5.0, 0.0])\n", 354 | " \n", 355 | " features, label = line[0], float(line[1])\n", 356 | " formated_features = []\n", 357 | " for i, value in enumerate(features):\n", 358 | " if value == '' and label == 1.0:\n", 359 | " formated_features.append(float(median1[i]))\n", 360 | " elif value == '' and label == 0.0:\n", 361 | " formated_features.append(float(median0[i]))\n", 362 | " else:\n", 363 | " if i < 13:\n", 364 | " formated_features.append(float(value)) \n", 365 | " else:\n", 366 | " formated_features.append(value)\n", 367 | " return (formated_features, label)" 368 | ] 369 | }, 370 | { 371 | "cell_type": "code", 372 | "execution_count": 181, 373 | "metadata": {}, 374 | "outputs": [], 375 | "source": [ 376 | 
"toyRDDCached = toyRDD.map(parse).map(update_nans)" 377 | ] 378 | }, 379 | { 380 | "cell_type": "code", 381 | "execution_count": 182, 382 | "metadata": {}, 383 | "outputs": [], 384 | "source": [ 385 | "# part d - helper function to normalize the data (FILL IN THE MISSING CODE BELOW)\n", 386 | "def normalize(dataRDD):\n", 387 | " \n", 388 | " featureMeans = np.array(means)\n", 389 | " featureStdev = np.array(stdevs)\n", 390 | " \n", 391 | " #sc.broadcast(featureMeans)\n", 392 | " #sc.broadcast(featureStdevs)\n", 393 | " \n", 394 | " ################ YOUR CODE HERE #############\n", 395 | " \n", 396 | " normedRDD = dataRDD.map(lambda x: ((x[0]-featureMeans)/featureStdev, x[1]))\n", 397 | " \n", 398 | " ################ FILL IN YOUR CODE HERE #############\n", 399 | " \n", 400 | " return normedRDD" 401 | ] 402 | }, 403 | { 404 | "cell_type": "code", 405 | "execution_count": 184, 406 | "metadata": {}, 407 | "outputs": [ 408 | { 409 | "data": { 410 | "text/plain": [ 411 | "[(array([ 0.27080439, -0.26654119, -0.32657669, -0.41851167, -0.24265445,\n", 412 | " -0.38302342, 0.86688289, 0.89503667, -0.15117633, -0.84217242,\n", 413 | " 1.05136664, -0.40703114, -0.54504306]), 1.0),\n", 414 | " (array([-0.12509599, -0.24048226, -0.24962123, -0.54366042, -0.24878794,\n", 415 | " -0.42130147, 0.26723367, -0.5235247 , -0.16401474, 0.60118154,\n", 416 | " 1.22386501, -0.40703114, -0.43805518]), 1.0),\n", 417 | " (array([-0.38902958, -0.13885242, 1.02014377, -0.29336292, 0.27637699,\n", 418 | " -0.19928876, -0.23247401, -0.97149145, -0.45501867, -0.84217242,\n", 419 | " -0.50111868, -0.40703114, -0.3310673 ]), 0.0)]" 420 | ] 421 | }, 422 | "execution_count": 184, 423 | "metadata": {}, 424 | "output_type": "execute_result" 425 | } 426 | ], 427 | "source": [ 428 | "normedRDD.take(3)" 429 | ] 430 | }, 431 | { 432 | "cell_type": "code", 433 | "execution_count": null, 434 | "metadata": {}, 435 | "outputs": [], 436 | "source": [] 437 | }, 438 | { 439 | "cell_type": "code", 440 | 
"execution_count": null, 441 | "metadata": {}, 442 | "outputs": [], 443 | "source": [] 444 | }, 445 | { 446 | "cell_type": "markdown", 447 | "metadata": {}, 448 | "source": [ 449 | "## 3. EDA & Discussion of Challenges" 450 | ] 451 | }, 452 | { 453 | "cell_type": "code", 454 | "execution_count": null, 455 | "metadata": {}, 456 | "outputs": [], 457 | "source": [ 458 | "# sample = np.array(trainRDDCached.map(lambda x: np.append(x[0], [x[1]])).takeSample(False, 1000))\n", 459 | "# sample_df = pd.DataFrame(np.array(sample), columns = FIELDS)" 460 | ] 461 | }, 462 | { 463 | "cell_type": "code", 464 | "execution_count": null, 465 | "metadata": {}, 466 | "outputs": [], 467 | "source": [ 468 | "sample_df.iloc[:,0:21].describe(include = \"all\")" 469 | ] 470 | }, 471 | { 472 | "cell_type": "code", 473 | "execution_count": null, 474 | "metadata": {}, 475 | "outputs": [], 476 | "source": [ 477 | "sample_df.iloc[:,21:39].describe(include = \"all\")" 478 | ] 479 | }, 480 | { 481 | "cell_type": "code", 482 | "execution_count": null, 483 | "metadata": {}, 484 | "outputs": [], 485 | "source": [ 486 | "# # Take a subset of the dataframe with only numeric features\n", 487 | "# sample_numeric = sample_df[FIELDS[0:13]]\n", 488 | "# columns = ['I1', 'I2', 'I3', 'I4', 'I5', 'I6', 'I7', 'I8', 'I9', 'I10', 'I11', 'I12', 'I13']\n", 489 | "# sample_numeric = sample_num.reindex(columns=columns)\n", 490 | "# sample_numeric[columns] = sample_numeric[columns].astype(np.float)" 491 | ] 492 | }, 493 | { 494 | "cell_type": "code", 495 | "execution_count": null, 496 | "metadata": {}, 497 | "outputs": [], 498 | "source": [] 499 | }, 500 | { 501 | "cell_type": "code", 502 | "execution_count": null, 503 | "metadata": {}, 504 | "outputs": [], 505 | "source": [ 506 | "# Take a look at histograms for each feature (RUN THIS CELL AS IS)\n", 507 | "sample_numeric.hist(figsize=(23,15), bins=15)\n", 508 | "#sample_numeric[FIELDS[:-1]].hist(figsize=(15,15), bins=15)\n", 509 | "plt.show()" 510 | ] 511 | }, 512 
| { 513 | "cell_type": "code", 514 | "execution_count": null, 515 | "metadata": {}, 516 | "outputs": [], 517 | "source": [ 518 | "# part b - plot boxplots of each feature vs. the outcome (RUN THIS CELL AS IS)\n", 519 | "\n", 520 | "fig, ax_grid = plt.subplots(5, 3, figsize=(23,15))\n", 521 | "y = sample_df['Label']\n", 522 | "for idx, feature in enumerate(FIELDS[0:13]):\n", 523 | " x = sample_numeric[feature]\n", 524 | " sns.boxplot(x=x, y=y, ax=ax_grid[idx//3][idx%3], orient='h', linewidth=.5)\n", 525 | " ax_grid[idx//3][idx%3].invert_yaxis()\n", 526 | "fig.suptitle(\"BoxPlots by Label\", fontsize=15, y=0.9)\n", 527 | "plt.show()" 528 | ] 529 | }, 530 | { 531 | "cell_type": "code", 532 | "execution_count": null, 533 | "metadata": {}, 534 | "outputs": [], 535 | "source": [ 536 | "corr = sample_numeric[FIELDS[:13]].corr()\n", 537 | "fig, ax = plt.subplots(figsize=(15, 13))\n", 538 | "mask = np.zeros_like(corr, dtype=bool)\n", 539 | "mask[np.triu_indices_from(mask)] = True\n", 540 | "cmap = sns.diverging_palette(10, 240, as_cmap=True)\n", 541 | "sns.heatmap(corr, mask=mask, cmap=cmap, center=0, linewidths=.5)\n", 542 | "plt.title(\"Correlations between features\")\n", 543 | "plt.show()" 544 | ] 545 | }, 546 | { 547 | "cell_type": "markdown", 548 | "metadata": {}, 549 | "source": [ 550 | "## 4. 
Algorithm Implementation" 551 | ] 552 | }, 553 | { 554 | "cell_type": "code", 555 | "execution_count": null, 556 | "metadata": {}, 557 | "outputs": [], 558 | "source": [ 559 | "# part e - define your baseline model here\n", 560 | "BASELINE = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])" 561 | ] 562 | }, 563 | { 564 | "cell_type": "code", 565 | "execution_count": null, 566 | "metadata": {}, 567 | "outputs": [], 568 | "source": [ 569 | "# part d - write function to compute loss (FILL IN MISSING CODE BELOW)\n", 570 | "def LRLoss(cachedRDD, W):\n", 571 | " \"\"\"\n", 572 | " Compute the mean log loss (logistic loss).\n", 573 | " Args:\n", 574 | " cachedRDD - each record is a tuple of (features_array, y) with y in {0, 1}\n", 575 | " W - (array) model coefficients with bias at index 0\n", 576 | " \"\"\"\n", 577 | " augmentedData = cachedRDD.map(lambda x: (np.append([1.0], x[0]), 2.0 * x[1] - 1.0)) # bias at index 0; labels {0, 1} -> {-1, +1}, as this form of the loss expects\n", 578 | " ################## YOUR CODE HERE ##################\n", 579 | " \n", 580 | " \n", 581 | " loss = augmentedData.map(lambda x: np.log(1.0 + np.exp(np.multiply(-x[1], (np.dot(W, x[0])))))).mean()\n", 582 | "\n", 583 | " #loss = augmentedData.map(lambda x: np.log(1.0 + np.exp(np.multiply(-x[1], (np.dot(W, x[0][1:]) + x[0][0]))))).sum()\n", 584 | "\n", 585 | " ################## (END) YOUR CODE ##################\n", 586 | " return loss" 587 | ] 588 | }, 589 | { 590 | "cell_type": "code", 591 | "execution_count": null, 592 | "metadata": {}, 593 | "outputs": [], 594 | "source": [ 595 | "LRLoss(normedRDD, BASELINE)" 596 | ] 597 | }, 598 | { 599 | "cell_type": "code", 600 | "execution_count": null, 601 | "metadata": {}, 602 | "outputs": [], 603 | "source": [ 604 | "W = np.array([meanQuality, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])\n", 605 | "learningRate = 0.1\n", 606 | "#augmentedData = toyRDDnumeric.map(lambda x: (np.append([1.0], x[0]), x[1])).cache()\n", 607 | "#grad = augmentedData.map(lambda x: np.dot(np.multiply(-x[1], (1.0 - 1.0/(1.0 + 
np.exp(np.multiply(-x[1], (np.dot(W, x[0][1:]) + x[0][0]))))), x[0]))).sum()\n", 608 | "#new_model = W - (learningRate * grad)\n", 609 | "\n", 610 | "\n", 611 | "grad = augmentedData.map(lambda x: (-x[1] * (1.0-(1.0/(1.0 + (np.exp(np.multiply(-x[1], (np.dot(W, x[0][1:]) + x[0][0])))))))) * x[0][1:]).sum()\n", 612 | "print(grad)\n", 613 | "\n", 614 | "new_model = W - (learningRate * grad)\n", 615 | "print(new_model)\n", 616 | "#grad = np.dot(loss_2, x[0]).sum()" 617 | ] 618 | }, 619 | { 620 | "cell_type": "code", 621 | "execution_count": null, 622 | "metadata": {}, 623 | "outputs": [], 624 | "source": [ 625 | "# part b - function to perform a single GD step\n", 626 | "def GDUpdate(dataRDD, W, learningRate):\n", 627 | " \"\"\"\n", 628 | " Perform one logistic regression gradient descent step/update.\n", 629 | " Args:\n", 630 | " dataRDD - records are tuples of (features_array, y) with y in {0, 1}\n", 631 | " W - (array) model coefficients with bias at index 0\n", 632 | " Returns:\n", 633 | " new_model - (array) updated coefficients, bias at index 0\n", 634 | " \"\"\"\n", 635 | " # add a bias 'feature' of 1 at index 0 and map labels {0, 1} -> {-1, +1}\n", 636 | " augmentedData = dataRDD.map(lambda x: (np.append([1.0], x[0]), 2.0 * x[1] - 1.0)).cache()\n", 637 | " \n", 638 | " ################## YOUR CODE HERE ################# \n", 639 | " \n", 640 | " grad = augmentedData.map(lambda x: (-x[1] * (1.0-(1.0/(1.0 + (np.exp(np.multiply(-x[1], (np.dot(W, x[0]))))))))) * x[0]).mean()\n", 641 | "\n", 642 | " new_model = W - (learningRate * grad)\n", 643 | "\n", 644 | " ################## (END) YOUR CODE ################# \n", 645 | "\n", 646 | " return new_model" 647 | ] 648 | }, 649 | { 650 | "cell_type": "code", 651 | "execution_count": null, 652 | "metadata": {}, 653 | "outputs": [], 654 | "source": [ 655 | "GDUpdate(normedRDD, BASELINE, learningRate=learningRate)" 656 | ] 657 | }, 658 | { 659 | "cell_type": "code", 660 | "execution_count": null, 661 | "metadata": {}, 662 | "outputs": [], 663 | "source": [ 664 | "# part c - take a look at a few 
Gradient Descent steps (RUN THIS CELL AS IS)\n", 665 | "nSteps = 5\n", 666 | "model = BASELINE\n", 667 | "print(f\"BASELINE: Loss = {LRLoss(normedRDD,model)}\")\n", 668 | "for idx in range(nSteps):\n", 669 | " print(\"----------\")\n", 670 | " print(f\"STEP: {idx+1}\")\n", 671 | " model = GDUpdate(normedRDD, model, learningRate=0.1)\n", 672 | " loss = LRLoss(normedRDD, model)\n", 673 | " print(f\"Loss: {loss}\")\n", 674 | " print(f\"Model: {[round(w,3) for w in model]}\")" 675 | ] 676 | }, 677 | { 678 | "cell_type": "code", 679 | "execution_count": null, 680 | "metadata": {}, 681 | "outputs": [], 682 | "source": [ 683 | "# linear_weights = nb.feature_log_prob_[1] - nb.feature_log_prob_[0]\n", 684 | "\n", 685 | "# top_negative_features = np.argsort(linear_weights)[0:10]\n", 686 | "\n", 687 | "# top_positive_features = np.flip(np.argsort(linear_weights)[-10:],0)" 688 | ] 689 | }, 690 | { 691 | "cell_type": "code", 692 | "execution_count": null, 693 | "metadata": {}, 694 | "outputs": [], 695 | "source": [] 696 | }, 697 | { 698 | "cell_type": "code", 699 | "execution_count": null, 700 | "metadata": {}, 701 | "outputs": [], 702 | "source": [ 703 | "# part b - logistic regression gradient descent function\n", 704 | "def GradientDescent(trainRDD, testRDD, wInit, nSteps = 20, \n", 705 | " learningRate = 0.1, verbose = False):\n", 706 | " \"\"\"\n", 707 | " Perform nSteps iterations of logistic regression gradient descent and \n", 708 | " track loss on a test and train set. 
Return lists of\n", 709 | " test/train loss and the models themselves.\n", 710 | " \"\"\"\n", 711 | " # initialize lists to track model performance\n", 712 | " train_history, test_history, model_history = [], [], []\n", 713 | " \n", 714 | " # perform n updates & compute test and train loss after each\n", 715 | " model = wInit\n", 716 | " for idx in range(nSteps): \n", 717 | " \n", 718 | " ############## YOUR CODE HERE #############\n", 719 | " \n", 720 | " model = GDUpdate(trainRDD, model)\n", 721 | " training_loss = OLSLoss(trainRDD, model)\n", 722 | " test_loss = OLSLoss(testRDD, model)\n", 723 | " \n", 724 | " ############## (END) YOUR CODE #############\n", 725 | " \n", 726 | " # keep track of test/train loss for plotting\n", 727 | " train_history.append(training_loss)\n", 728 | " test_history.append(test_loss)\n", 729 | " model_history.append(model)\n", 730 | " \n", 731 | " # console output if desired\n", 732 | " if verbose:\n", 733 | " print(\"----------\")\n", 734 | " print(f\"STEP: {idx+1}\")\n", 735 | " print(f\"training loss: {training_loss}\")\n", 736 | " print(f\"test loss: {test_loss}\")\n", 737 | " print(f\"Model: {[round(w,3) for w in model]}\")\n", 738 | " return train_history, test_history, model_history" 739 | ] 740 | }, 741 | { 742 | "cell_type": "code", 743 | "execution_count": null, 744 | "metadata": {}, 745 | "outputs": [], 746 | "source": [ 747 | "# plot error curves - RUN THIS CELL AS IS\n", 748 | "def plotErrorCurves(trainLoss, testLoss, title = None):\n", 749 | " \"\"\"\n", 750 | " Helper function for plotting.\n", 751 | " Args: trainLoss (list of MSE) , testLoss (list of MSE)\n", 752 | " \"\"\"\n", 753 | " fig, ax = plt.subplots(1,1,figsize = (16,8))\n", 754 | " x = list(range(len(trainLoss)))[1:]\n", 755 | " ax.plot(x, trainLoss[1:], 'k--', label='Training Loss')\n", 756 | " ax.plot(x, testLoss[1:], 'r--', label='Test Loss')\n", 757 | " ax.legend(loc='upper right', fontsize='x-large')\n", 758 | " plt.xlabel('Number of Iterations')\n", 
759 | " plt.ylabel('Mean Squared Error')\n", 760 | " if title:\n", 761 | " plt.title(title)\n", 762 | " plt.show()" 763 | ] 764 | }, 765 | { 766 | "cell_type": "code", 767 | "execution_count": null, 768 | "metadata": {}, 769 | "outputs": [], 770 | "source": [ 771 | "# run 50 iterations (RUN THIS CELL AS IS)\n", 772 | "wInit = BASELINE\n", 773 | "trainRDD, testRDD = normedRDD.randomSplit([0.8,0.2], seed = 2018)\n", 774 | "start = time.time()\n", 775 | "MSEtrain, MSEtest, models = GradientDescent(trainRDD, testRDD, wInit, nSteps = 50)\n", 776 | "print(f\"\\n... trained {len(models)} iterations in {time.time() - start} seconds\")" 777 | ] 778 | }, 779 | { 780 | "cell_type": "code", 781 | "execution_count": null, 782 | "metadata": {}, 783 | "outputs": [], 784 | "source": [] 785 | }, 786 | { 787 | "cell_type": "code", 788 | "execution_count": null, 789 | "metadata": {}, 790 | "outputs": [], 791 | "source": [ 792 | "# code from async\n", 793 | "\n", 794 | "def logisticReg_GD_Spark(data,y,w=None,eta=0.05,iter_num=500,regPara=0.01, stopCriteria=0.0001,reg=\"Lasso\"): \n", 795 | " #eta learning rate \n", 796 | " #regPara \n", 797 | " dataRDD = sc.parallelize(np.append(y[:,None],data,axis=1)).cache() \n", 798 | " if w is None: \n", 799 | " w = np.random.normal(size=data.shape[1]+1) \n", 800 | " for i in range(iter_num): \n", 801 | " w_broadcast = sc.broadcast(w) \n", 802 | " g = dataRDD.map(lambda x: −x[0]*{1−1/(1+np.exp(−x[0]) \n", 803 | " *np.dot(w_broadcast.value,np.append(x[1:],1))))) \\ \n", 804 | " *np.append(x[1:],1)).reduce[lambda x,y:x+y)/data.shape[0] \n", 805 | " # Gradient of logloss \n", 806 | " if reg == \"Ridge\": \n", 807 | " wreg = w*1 \n", 808 | " wreg[−1] = 0 #last value of weight vector is bias term; \n", 809 | " ignore in regularization \n", 810 | " elif reg == \"Lasso\": \n", 811 | " wreg = w*1 \n", 812 | " wreg[−1] = 0 #last value of weight vector is bias term; \n", 813 | " ignore in regularization \n", 814 | " wreg = (wreg>0).astype(int)*2−1 \n", 815 
| " else: \n", 816 | " wreg = np.zeros(w.shape[0]) \n", 817 | " wdelta = eta*(g+regPara*wreg) #gradient: hinge loss + regularized term \n", 818 | " if sum(abs(wdelta))<=stopCriteria*sum(abs(w)): # converged as updates \n", 819 | " to weight vector are small \n", 820 | " break \n", 821 | " w = w − wdelta \n", 822 | " return w" 823 | ] 824 | }, 825 | { 826 | "cell_type": "code", 827 | "execution_count": null, 828 | "metadata": {}, 829 | "outputs": [], 830 | "source": [] 831 | }, 832 | { 833 | "cell_type": "code", 834 | "execution_count": null, 835 | "metadata": {}, 836 | "outputs": [], 837 | "source": [] 838 | }, 839 | { 840 | "cell_type": "code", 841 | "execution_count": null, 842 | "metadata": {}, 843 | "outputs": [], 844 | "source": [] 845 | }, 846 | { 847 | "cell_type": "code", 848 | "execution_count": null, 849 | "metadata": {}, 850 | "outputs": [], 851 | "source": [] 852 | }, 853 | { 854 | "cell_type": "code", 855 | "execution_count": null, 856 | "metadata": {}, 857 | "outputs": [], 858 | "source": [ 859 | "# Generate 80/20 (pseudo)random train/test split \n", 860 | "trainRDD, heldOutRDD = fullTrainRDD.randomSplit([0.8,0.2], seed = 1)\n", 861 | "print(f\"... 
held out {heldOutRDD.count()} records for evaluation and assigned {trainRDD.count()} for training.\")" 862 | ] 863 | }, 864 | { 865 | "cell_type": "code", 866 | "execution_count": null, 867 | "metadata": {}, 868 | "outputs": [], 869 | "source": [ 870 | "from pyspark.mllib.tree import DecisionTree, DecisionTreeModel\n", 871 | "from pyspark.mllib.util import MLUtils\n", 872 | "\n", 873 | "# Load and parse the data file into an RDD of LabeledPoint.\n", 874 | "data = MLUtils.loadLibSVMFile(sc, 'data/mllib/sample_libsvm_data.txt')\n", 875 | "# Split the data into training and test sets (30% held out for testing)\n", 876 | "(trainingData, testData) = data.randomSplit([0.7, 0.3])\n", 877 | "\n", 878 | "# Train a DecisionTree model.\n", 879 | "# Empty categoricalFeaturesInfo indicates all features are continuous.\n", 880 | "model = DecisionTree.trainClassifier(trainingData, numClasses=2, categoricalFeaturesInfo={},\n", 881 | " impurity='gini', maxDepth=5, maxBins=32)\n", 882 | "\n", 883 | "# Evaluate model on test instances and compute test error\n", 884 | "predictions = model.predict(testData.map(lambda x: x.features))\n", 885 | "labelsAndPredictions = testData.map(lambda lp: lp.label).zip(predictions)\n", 886 | "testErr = labelsAndPredictions.filter(\n", 887 | " lambda lp: lp[0] != lp[1]).count() / float(testData.count())\n", 888 | "print('Test Error = ' + str(testErr))\n", 889 | "print('Learned classification tree model:')\n", 890 | "print(model.toDebugString())\n", 891 | "\n", 892 | "# Save and load model\n", 893 | "model.save(sc, \"target/tmp/myDecisionTreeClassificationModel\")\n", 894 | "sameModel = DecisionTreeModel.load(sc, \"target/tmp/myDecisionTreeClassificationModel\")" 895 | ] 896 | }, 897 | { 898 | "cell_type": "code", 899 | "execution_count": null, 900 | "metadata": {}, 901 | "outputs": [], 902 | "source": [ 903 | "from pyspark.mllib.tree import DecisionTree, DecisionTreeModel\n", 904 | "from pyspark.mllib.util import MLUtils\n", 905 | "\n", 906 | "# Load 
and parse the data file into an RDD of LabeledPoint.\n", 907 | "data = MLUtils.loadLibSVMFile(sc, 'data/mllib/sample_libsvm_data.txt')\n", 908 | "# Split the data into training and test sets (30% held out for testing)\n", 909 | "(trainingData, testData) = data.randomSplit([0.7, 0.3])\n", 910 | "\n", 911 | "# Train a DecisionTree model.\n", 912 | "# Empty categoricalFeaturesInfo indicates all features are continuous.\n", 913 | "model = DecisionTree.trainRegressor(trainingData, categoricalFeaturesInfo={},\n", 914 | " impurity='variance', maxDepth=5, maxBins=32)\n", 915 | "\n", 916 | "# Evaluate model on test instances and compute test error\n", 917 | "predictions = model.predict(testData.map(lambda x: x.features))\n", 918 | "labelsAndPredictions = testData.map(lambda lp: lp.label).zip(predictions)\n", 919 | "testMSE = labelsAndPredictions.map(lambda lp: (lp[0] - lp[1]) * (lp[0] - lp[1])).sum() /\\\n", 920 | " float(testData.count())\n", 921 | "print('Test Mean Squared Error = ' + str(testMSE))\n", 922 | "print('Learned regression tree model:')\n", 923 | "print(model.toDebugString())\n", 924 | "\n", 925 | "# Save and load model\n", 926 | "model.save(sc, \"target/tmp/myDecisionTreeRegressionModel\")\n", 927 | "sameModel = DecisionTreeModel.load(sc, \"target/tmp/myDecisionTreeRegressionModel\")" 928 | ] 929 | }, 930 | { 931 | "cell_type": "code", 932 | "execution_count": null, 933 | "metadata": {}, 934 | "outputs": [], 935 | "source": [] 936 | }, 937 | { 938 | "cell_type": "markdown", 939 | "metadata": {}, 940 | "source": [ 941 | "## 5. 
Application of Course Concepts" 942 | ] 943 | }, 944 | { 945 | "cell_type": "code", 946 | "execution_count": null, 947 | "metadata": {}, 948 | "outputs": [], 949 | "source": [] 950 | } 951 | ], 952 | "metadata": { 953 | "kernelspec": { 954 | "display_name": "Python 3", 955 | "language": "python", 956 | "name": "python3" 957 | }, 958 | "language_info": { 959 | "codemirror_mode": { 960 | "name": "ipython", 961 | "version": 3 962 | }, 963 | "file_extension": ".py", 964 | "mimetype": "text/x-python", 965 | "name": "python", 966 | "nbconvert_exporter": "python", 967 | "pygments_lexer": "ipython3", 968 | "version": "3.6.6" 969 | } 970 | }, 971 | "nbformat": 4, 972 | "nbformat_minor": 2 973 | } 974 | -------------------------------------------------------------------------------- /data/toy_test300.txt: -------------------------------------------------------------------------------- 1 | 0 1 32 8 7 684 7 1 7 7 1 1 7 05db9164 95e2d337 31d5f78e ebc42d91 307e775a 7e0ccccf 3965ff35 5b392875 a73ee510 e53bceb4 95eaf7a0 cf681365 ccfd4002 f862f261 fe83a0f3 12daa519 d4bb7bd8 7b06fafe 21ddcdc9 5840adea a921d7b8 ad3062eb bcdee96c 1b256e61 b9266ff0 3ff1af9e 2 | 0 1 1 0 2 14 0 65aada8c 38a947a1 46a2cf53 f9aff643 384874ce e3520422 3ffb655b 0b153874 7cc72ec2 3b08e48b d3246de4 c886a342 31078fbf 07d13a8f f8999309 ba92dd6b 2005abd1 fe623d4e ee91f72a c7dc6720 d052bbbf 3 | 0 0 2 2 2 5931 76 100 4 375 0 15 0 2 5bfa8ab5 c5fe64d9 e5ac16aa f7cffaa7 25c83c98 13718bbd efe2d2a2 5b392875 a73ee510 81cb5a77 9a660f03 611ef73a 3dffcbec 07d13a8f 52b49730 90f3244e e5ba7672 c235abed 158cdff9 5840adea d406b2c3 423fab69 ce327ac7 e8b83407 b34f6f11 4 | 0 0 4 25037 245 1 3 88 1 05db9164 09e68b86 f68c9239 644cb012 25c83c98 7e0ccccf a972360e 0b153874 7cc72ec2 45a2d21a 9e511730 1fe01f1a 04e4a7e0 07d13a8f 36721ddc dfd18889 e5ba7672 5aed7436 21ddcdc9 5840adea e525ffea 3a171ecb 3ff8e180 e8b83407 5a205e8e 5 | 0 0 3 0 68fd1e64 8084ee93 02cf9876 c18be181 25c83c98 7e0ccccf 1af5fecb 0b153874 7cc72ec2 3b08e48b 43b7cf5e 
8fe001f4 9c7a975e 07d13a8f 422c8577 36103458 2005abd1 52e44668 e587c466 3a171ecb 3b183c5c 6 | 0 137 4 1 5892 0 6 3 0 1 be589b51 b56822db 7da86e4b b733e495 25c83c98 13718bbd 959a03b7 37e4aa92 a73ee510 3b08e48b c19f03c7 ed397d6b d3442f6b b28479f6 a9d1ba1a 056d8866 27c07bd6 38dce391 21ddcdc9 b1252a9d deaf6b52 c9d4222a 32c7478e d9556584 001f3601 6c27a535 7 | 1 -1 83 12 1 7 12 1 05db9164 9b25e48b 3ce38945 13508380 25c83c98 fe6b92e5 896d0c90 5b392875 a73ee510 f71fd33d c5734ebc 6ebb32c0 ec644921 07d13a8f 054ebda1 b9b3b47a 1e88c74f 7d8c03aa 21ddcdc9 b1252a9d 2214c11c ad3062eb 32c7478e 45ab94c8 2bf691b1 c84c4aec 8 | 0 0 0 2 2 4567 107 7 26 82 0 1 2 46300ee3 8dbd550a 4283a094 e5d4c5ff 4cf72387 fe6b92e5 0d273fab 5b392875 a73ee510 c9e11adf 1be1371e 90e10837 76b6f478 1adce6ef 15cdc5a1 f327e411 e5ba7672 ed39765c 0f5b1a96 32c7478e 6c1cdd05 9 | 1 0 16 2 8 1615 11 6 11 24 0 2 0 8 c35dc981 39dfaa0d 645aef0f bd66646c 43b19349 7e0ccccf cef56ca6 0b153874 a73ee510 814278fa 62aedd5c ec42fcc9 e65a5fc3 1adce6ef 80afa366 12ccd83c 27c07bd6 df4fffb7 21ddcdc9 5840adea 3c07ff4c 3a171ecb bc016d83 010f6491 aa5f0a15 10 | 0 0 4 6 4 3375 8 6 6 14 0 3 4 8cf07265 39dfaa0d f1d57af7 5f3e1806 25c83c98 7e0ccccf 5ce839a9 c8ddd494 a73ee510 3b08e48b 52105669 d19e10db d2b8af4a b28479f6 2223bbe1 0ff23d24 e5ba7672 df4fffb7 21ddcdc9 5840adea 5ac15b45 be7c41b4 52f412c3 010f6491 c986348f 11 | 0 -1 40 11494 37 4 0 16 1 05db9164 c66fca21 704fe5b6 f4eea829 25c83c98 7e0ccccf 95402f9a 0b153874 a73ee510 5162b19c b5a9f90e b3e118e3 949ea585 b28479f6 5507c0e6 687aedf2 e5ba7672 1304f63b 21ddcdc9 a458ea53 75f133b6 32c7478e f54aa12a 010f6491 13a184d8 12 | 1 6 14 6 6 636 6 6 6 6 1 1 0 6 be589b51 421b43cd 064c8897 29998ed1 25c83c98 fbad5c96 4aa938fc 0b153874 a73ee510 299aecf1 7e40f08a 6aaba33c 1aa94af3 b28479f6 e1ac77f7 b041b04a 3486227d 2804effd 723b4dfd c9d4222a 3a171ecb b34f3128 13 | 0 4 1 2 4404 67 6 34 143 2 0 2 05db9164 0468d672 cedcacac 7967fcf5 b2241560 bab49c91 5b392875 a73ee510 1d041004 5f0f014d 553e02c3 f849e1ee 
1adce6ef 4f3b3616 91a6eec5 e5ba7672 9880032b 21ddcdc9 5840adea a97b62ca 32c7478e 727a7cc7 e8b83407 6935065e 14 | 0 42 4 2 61400 0 3 60 0 2 68fd1e64 8ab240be 805cf7a7 37ff1477 384874ce 7e0ccccf ff08f605 0b153874 7cc72ec2 185d7db8 4a77ddca 757ebc3e dc1d72e4 b28479f6 55ea1fa2 eb093091 e5ba7672 ca533012 21ddcdc9 5840adea 3d8de405 ad3062eb 3a171ecb 8220aced 445bbe3b cfd96da1 15 | 1 23 1 3 15595 37 4 5 22 1 3 05db9164 2c16a946 363deefb e4dbea90 25c83c98 13718bbd 3c0bbea8 5b392875 a73ee510 9eaa24a1 ef8b8995 ea230307 2afe74a2 07d13a8f 18231224 8c04a1d2 e5ba7672 74ef3502 01776ac5 c7dc6720 9117a34a 16 | 0 20 14 5247 51 12 14 53 2 14 05db9164 38d50e09 c86b2d8d 657dc3b9 43b19349 7e0ccccf 5e64ce5f 0b153874 a73ee510 88a43e6d 8b94178b 1ca7a526 025225f2 b28479f6 06373944 ba46c3a1 e5ba7672 fffe2a63 21ddcdc9 b1252a9d eb0fc6f8 32c7478e df487a73 001f3601 c27f155b 17 | 0 290 3 3 0 4 8 0 3 9a89b36c 38a947a1 ee792829 6a14f9b9 25c83c98 fbad5c96 88002ee1 0b153874 7cc72ec2 3b08e48b f1b78ab4 80506efe 6e5da64f 07d13a8f 586a2aab f8b34416 2005abd1 e5f8f18f f3ddd519 ad3062eb 32c7478e b34f3128 18 | 0 139 1 1 118277 0 4 2 0 1 68fd1e64 4f25e98b 401cd7ca ad1e1b58 25c83c98 13718bbd 4b3c7cfe 0b153874 7cc72ec2 9d2a2291 8b94178b 69a457c2 025225f2 b28479f6 8ab5b746 03300014 d4bb7bd8 7ef5affa 9437f62f b1252a9d 6ade3cae 32c7478e 3fdb382b e8b83407 49d68486 19 | 0 1 0 2 0 0 0 10 0 59 1 4 0 05db9164 f8c8e8f8 396df967 328b42c3 25c83c98 7e0ccccf 162c77bf 0b153874 a73ee510 f1cd00b0 fdd3a1fa 8481d649 3f9a68d9 b28479f6 b15b8172 b2f2a0c7 e5ba7672 d2f0bce2 21ddcdc9 5840adea f0bb1194 32c7478e a5ce2d0d f55c04b6 984e0db0 20 | 1 357 4 4 2855 10 13 6 55 2 0 4 05db9164 09e68b86 233378be cb2156cb 25c83c98 7e0ccccf b87f4a4a 5b392875 a73ee510 2124a520 319687c9 44776637 62036f49 b28479f6 52baadf5 54dd60b2 e5ba7672 5aed7436 21ddcdc9 b1252a9d b39b1608 32c7478e 3fdb382b e8b83407 eb9a9610 21 | 1 2 8 18 33 1137 46 37 45 64 1 4 41 05db9164 73a46ff0 dd1e882e f34203e3 25c83c98 7e0ccccf d5e0ab97 25239412 a73ee510 85258115 20b05825 
0a48382f c28589ee b28479f6 4f648a87 4e669708 3486227d da507f45 21ddcdc9 5840adea 81f8acaf bcdee96c 15b4ecfa ea9a246c 9636866f 22 | 0 0 9 1 1 4633 53 94 1 129 0 22 1 ae82ea21 3f0d3f28 db7c9717 c24ec7b4 25c83c98 fbad5c96 1c86e0eb 5b392875 a73ee510 213889cd 755e4a50 8892f523 5978055e 1adce6ef 14108df6 0bae3723 e5ba7672 88416823 daf49d49 32c7478e e5fca70a 23 | 0 2 0 1 12 1 8 7 12 84 1 4 8 05db9164 80e26c9b ecad0523 b6951e6b 25c83c98 7e0ccccf cd175af1 0b153874 a73ee510 3b08e48b ba395776 7fed13da e2bab9cb 07d13a8f f3635baf 0cf975bf 27c07bd6 f54016b9 21ddcdc9 a458ea53 7c75c518 ad3062eb c7dc6720 1793a828 e8b83407 66045105 24 | 0 0 4 1 578 3 5 1 3 0 1 05db9164 58e67aaf e691cc3b 99f78345 25c83c98 6f6d9be8 45e063a0 0b153874 a73ee510 6d432468 da89cb9b 411f6354 165642be 051219e6 d83fb924 9626f91a e5ba7672 c21c3e4c 21ddcdc9 a458ea53 ca4baed6 32c7478e 9f0d87bf 9b3e8820 bde577f6 25 | 0 60 5 3 96516 2624 0 4 743 0 0 3 75ac2fe6 dd8c896e 466f4ffb 13508380 f281d2a7 7e0ccccf 1d2ec7e0 0b153874 7cc72ec2 d7cb1343 2872a4bd eac95d2c f8320f48 07d13a8f 95275a51 7399f581 3486227d 3182300e 1d1eb838 b1252a9d a4e7381c c7dc6720 45ab94c8 010f6491 c84c4aec 26 | 0 0 0 14 930 550 101 16 1763 0 2 0 23 05db9164 f0cf0024 8e4f2214 66a6471d 4cf72387 7e0ccccf fed3cb1d 5b392875 a73ee510 5e4e70b2 b7094596 ba135b9b 1f9d2c38 b28479f6 fdb1071f 1bea1c10 8efede7f cc693e93 21ddcdc9 a458ea53 270b8643 32c7478e 5db13492 ea9a246c f17fe973 27 | 0 57 15 22456 240 1 0 83 1 68fd1e64 1cfdf714 3b56dde6 09b3ff2f 384874ce 7e0ccccf c86e8c6b 0b153874 7cc72ec2 f51aaa12 4d8549da 7b8b2d7a 51b97b8f 07d13a8f f775a6d5 69065c1b e5ba7672 e88ffc9d af1445c4 a458ea53 1903974e 3a171ecb d48ac163 cb079c2d 2a3d0743 28 | 1 1 68 22 12 492 13 34 32 145 1 8 1 12 68fd1e64 bce95927 90c10c84 13508380 25c83c98 7e0ccccf d4f2e777 0b153874 a73ee510 17f648e3 2dc2a28b 8f086db7 24e01e63 07d13a8f fec218c0 228df465 3486227d 04d863d5 1d1eb838 b1252a9d c0b9dbcf 423fab69 45ab94c8 e8b83407 c84c4aec 29 | 0 15 -1 21 103 30 50 43 390 3 13 30 b0938bf6 76c475b1 
b30d5393 8100472a 25c83c98 7e0ccccf 9f525672 6c41e35e a73ee510 1e2ab9fa 843d8639 58d90787 9cab1003 07d13a8f d15f4d76 c62b15ac e5ba7672 ae09efbe 7c2dd7b3 423fab69 c4803cb9 30 | 0 0 303 50 5 1702 56 8 25 25 0 1 4 15 5a9ed9b0 c1384774 2ed12532 6af132a7 25c83c98 fe6b92e5 94aa68fb 0b153874 a73ee510 3b76bfa9 577aa337 716265a1 f405e2e8 cfef1c29 29c6cf21 be9937f2 e5ba7672 658dca4c 21ddcdc9 b1252a9d 345e27b5 ad3062eb 32c7478e fb55adff ea9a246c 2702fec5 31 | 0 1 16 471657 0 7 42 0 17 05db9164 38a947a1 16da655e e3679b95 25c83c98 fbad5c96 100f342c 0b153874 7cc72ec2 6f3d6efc 3d566bf6 74906f90 c4b75451 07d13a8f 750d1068 925e59a8 e5ba7672 a866c2a1 30ec1dcf c7dc6720 5a3afc28 32 | 1 3 9 10 4 8 1 13 36 366 1 5 1 b455c6d7 bce95927 873127b3 13508380 25c83c98 7e0ccccf 50370160 0b153874 a73ee510 ba06e67a c11709dc 2a99e062 499d401f 07d13a8f fec218c0 3f666a9d e5ba7672 04d863d5 1d1eb838 b1252a9d 5afba0b9 423fab69 45ab94c8 e8b83407 c84c4aec 33 | 0 0 5 3 7040 89 13 9 21 2 3 05db9164 287130e0 b858524e a520a4f6 4cf72387 fbad5c96 5e4454aa 0b153874 a73ee510 000e2f4b 680bf6cb 6caea246 7db6a946 1adce6ef 6a805a0e 7af2892f 27c07bd6 acd948bb b85fd9dc 5840adea 1b54e4e0 32c7478e b1e73d52 ea9a246c fffa8e76 34 | 1 1 17 16 8 17 8 15 11 113 1 5 8 05db9164 c66fca21 4af5e055 b54fc341 4cf72387 fe6b92e5 ed0714a0 c8ddd494 a73ee510 593c3061 b85b416c 6873ee9c c3f71b59 b28479f6 5507c0e6 6a4b6f44 e5ba7672 1304f63b 21ddcdc9 b1252a9d 80e5f125 ad3062eb 3a171ecb ea6b65e1 010f6491 6235fd85 35 | 0 5 4 5 0 704 130 18 17 350 1 10 9 05db9164 4f25e98b af0e4475 cc475129 4cf72387 fbad5c96 09e42cac 0b153874 a73ee510 3753b9eb 30b2a438 6b3d7744 aebdb575 07d13a8f dfab705f a6eed839 e5ba7672 7ef5affa 1439e83a b1252a9d e1ee5df1 3a171ecb 8407b107 001f3601 fe2b97df 36 | 1 10 0 7 2 1242 36 43 48 97 0 6 0 6 05db9164 52e9ecfc 20009f96 73fec7fb 4cf72387 fbad5c96 1c86e0eb 062b5529 a73ee510 7688173f 755e4a50 57c08194 5978055e b28479f6 e2dd9a77 054b386f e5ba7672 1e42ba17 21ddcdc9 b1252a9d 0dd41d11 32c7478e f9f7eb22 f0f449dd a3a8e8f4 37 | 0 14 
1 15 0 0 171 15 33 1 11 2 0 05db9164 2607540a fc8245d5 f922efad f281d2a7 fbad5c96 95667a0f 0b153874 a73ee510 ca937c4f 6fcaf9ee 24d68740 7aab7990 07d13a8f 4bce7416 e2e2fcd9 27c07bd6 c6fa25f8 5c6c011e 423fab69 b34f3128 38 | 0 -1 12319 48 2 1 139 2 68fd1e64 a40adb47 b64ac9a3 1df4d824 0942e0a7 13718bbd fc6b47d9 5b392875 a73ee510 5080de78 b3410e99 71d55d49 0d2cad4c b28479f6 4587d8cc ed6d847a e5ba7672 3c59b550 1d0aeb7a bcdee96c 786a0db5 39 | 0 2054 4 12041 644 75 0 205 4 0 05db9164 dd8c896e 06916baf 13508380 25c83c98 7e0ccccf d89d88cb 5b392875 a73ee510 e8438e24 2872a4bd 11a14379 f8320f48 07d13a8f 95275a51 627304e5 e5ba7672 3182300e 1d1eb838 b1252a9d 11f2a5c4 c7dc6720 45ab94c8 010f6491 c84c4aec 40 | 0 3 5 5 328681 0 7 2 0 6 8cf07265 a796837e 42db3232 e3cc371a 25c83c98 7e0ccccf 22b00980 5b392875 7cc72ec2 75d43ad1 fb7a7aa9 c9669737 138498c6 cfef1c29 85e5b07c c4de5bba 07c540c4 4e6b896a 288eaded 3a171ecb 8fc66e78 41 | 0 718 17 8752 22 1 68fd1e64 403ea497 2cbec47f 3e2bfbda 25c83c98 fe6b92e5 be5072bf 0b153874 a73ee510 3b08e48b 058b2e38 21a23bfe 35323fda b28479f6 91f74a64 587267a3 27c07bd6 a78bd508 21ddcdc9 a458ea53 c2a93b37 be7c41b4 1793a828 e8b83407 2fede552 42 | 1 14 107 15 28 8 4 18 25 228 1 3 2 f473b8dc 38a947a1 b1b6f323 be4cb064 25c83c98 7e0ccccf d5141a06 0b153874 a73ee510 cc7a7a5c f2a5d7d2 d28c687a a3b89afc b28479f6 90af1d37 f2a191bd e5ba7672 c9da8737 5911ddcb ad3062eb 32c7478e 1335030a 43 | 0 -1 3360 8 3 0 8 2 05db9164 38d50e09 01a0648b 657dc3b9 25c83c98 7e0ccccf cffe7bdc 0b153874 a73ee510 474773a7 99177493 11fcf7fa 6969cd35 07d13a8f fa321567 5e1b6b9d 07c540c4 52b872ed 21ddcdc9 a458ea53 bfeb50f6 ad3062eb 32c7478e df487a73 001f3601 c27f155b 44 | 1 1 841 2 2 1249 16 46 21 434 1 19 2 05db9164 1cfdf714 ee895a32 13508380 25c83c98 7195046d 51d76abe a73ee510 eab78bab 4d8549da 34f21a64 51b97b8f 07d13a8f e439dd9b 63e124c6 e5ba7672 9df49ecd 55dd3565 b1252a9d 6082367b c7dc6720 45ab94c8 cb079c2d c84c4aec 45 | 1 3 -1 892 0 3 0 36 1 1 05db9164 287130e0 b605a719 84f6e2f1 25c83c98 
13718bbd 03dda6cf 0b5a4776 a73ee510 87237511 3289d3e1 8776b6ad 5a777c73 1adce6ef 310d155b a9c158d4 e5ba7672 891589e7 04de9d96 a458ea53 3bcd3ce5 bcdee96c 63a6b65a ea9a246c c10c7587 46 | 0 0 0 13 8 1447 31 1 31 31 0 1 8 05db9164 ea3a5818 18ca1c45 3885240f 25c83c98 fbad5c96 7a3855e9 0b153874 a73ee510 a0b1ba23 13f669ec 073be304 3a6d18b1 b28479f6 0a069322 aac14e93 e5ba7672 a1d0cc4f 9437f62f a458ea53 eb28ed42 32c7478e 96a4b72c 1575c75f 5727ce8a 47 | 0 2 1220 2 2 1221 17 3 17 51 0 1 2 be589b51 876465ad ffe810c0 d2941349 25c83c98 fe6b92e5 609dda94 5b392875 a73ee510 3b08e48b f1c7dc3f 63470841 969e14fd b28479f6 9c382f7a 1478197a e5ba7672 08154af3 21ddcdc9 5840adea 58cd488e bcdee96c 3b425be9 f55c04b6 2fede552 48 | 0 2 9 1 469 28 2 27 26 1 1 1 05db9164 d833535f 51a1ba8e 499ac831 25c83c98 7e0ccccf 8f8a62c3 0b153874 a73ee510 e11f8843 75a64bb4 34645d5b 873349e8 b28479f6 a66dcf27 6a841a1b 07c540c4 7b49e3d2 9cfdbc70 bcdee96c aee52b6f 49 | 0 0 7 3 33020 225 7 3 3 1 3 8cf07265 38a947a1 25c83c98 dc90e471 0b153874 7cc72ec2 3b08e48b ef379d27 a22b1944 07d13a8f 65ccc4ef e5ba7672 c196a249 32c7478e 50 | 0 1 2 1 10 1455 33 4 33 95 0 3 12 17f69355 38a947a1 0d600a8b 42b65a87 25c83c98 7e0ccccf 421b4ccc 0b153874 a73ee510 7318b5eb 59da2976 71aa3a3e a1354fa8 b28479f6 f62a318f cdf5f8e5 e5ba7672 4108f042 de2b7350 c9d4222a bcdee96c d10706d8 51 | 1 80 272 9 8 0 0 81 8 8 2 3 0 68fd1e64 52e9ecfc 3211244b 954706e9 25c83c98 7e0ccccf 1c86e0eb 0b153874 a73ee510 b50b81d2 755e4a50 d818b769 5978055e b28479f6 e2dd9a77 61f91c70 e5ba7672 1e42ba17 21ddcdc9 b1252a9d 673eb227 ad3062eb 3a171ecb b94c3f93 f0f449dd 3293d45f 52 | 0 6 10 5 27187 167 2 11 123 2 5 05db9164 a0e12995 003f419b ff852091 4cf72387 023e5061 0b153874 a73ee510 3b08e48b 384f39a8 349450bc ca427ae3 1adce6ef 78c64a1d 6615ffe6 07c540c4 1616f155 21ddcdc9 5840adea 3dd38d65 423fab69 c2fe6ca4 b9266ff0 b3606536 53 | 0 0 24 1 0 11535 256 3 16 357 0 2 0 1 05db9164 2c16a946 c6ee519a aa5b3c1b 25c83c98 6f6d9be8 f610c0f7 0b153874 a73ee510 b0242980 dfb981f6 34156ace 
4fa1154e b28479f6 f77a8d3d dce552fc e5ba7672 ea3d03ad 20c3a4f0 32c7478e 9117a34a 54 | 1 4 1 2 5257 0 2 56 0 0 2 05db9164 4c2bc594 d032c263 c18be181 25c83c98 7e0ccccf 02751f38 0b153874 a73ee510 c5a5caa6 230f3fdd dfbb09fb 89a6da90 1adce6ef ae0c3875 84898b2a d4bb7bd8 15a36060 0014c32a 3a171ecb 3b183c5c 55 | 1 10 37 4 16 7 5 23 18 210 2 6 2 5 4265881a a796837e 08de7b18 97ce69e9 4cf72387 7e0ccccf 7e72e3e1 5b392875 a73ee510 451bd4e4 0741a7f1 c5011072 8916f815 cfef1c29 98eddd86 5a9431f3 27c07bd6 e90118d1 e754c5e1 c9d4222a 3a171ecb 8fc66e78 56 | 1 7 4 4 2 316 104 8 40 116 1 2 4 05db9164 71ca0a25 ae06bf90 b125f81c 25c83c98 fe6b92e5 d7ea84dc 5b392875 a73ee510 62e331b2 4a77ddca c15e7f3e dc1d72e4 1adce6ef ae3a9888 98334731 e5ba7672 9bf8ffef 21ddcdc9 5840adea 845453b6 dbb486d7 ce327ac7 445bbe3b 09dd415c 57 | 0 0 2 27606 1 4 0 4 1 05db9164 d7988e72 59b82a36 1e5e2162 25c83c98 de17ce0b 0b153874 a73ee510 14af46ca b071da68 fba2af10 66e6836c 07d13a8f 194c42a4 5614787a e5ba7672 0f2f9850 712d530c a458ea53 377dd8df 32c7478e 471f55fb 445bbe3b cb1956a3 58 | 1 5 2 23 24 1 2 18 41 173 1 9 2 05db9164 90081f33 6ea6baf6 cb158ef7 25c83c98 7e0ccccf b4ac002c 5b392875 a73ee510 72ea60f7 a602cdb0 e994659b ff4d50db 64c94865 98995c3b 81da0044 e5ba7672 7181ccc8 466a2d06 423fab69 aea3714c 59 | 0 1 167 1 228 8 10 5 31 1 2 1 1 68fd1e64 2c16a946 777a8124 e4dbea90 25c83c98 7e0ccccf 3b549cb8 0b153874 a73ee510 e1af44fa 6685ea28 4be7cd5a 7edc047a b28479f6 3628a186 1656bd81 3486227d e4ca448c 9d45285b c7dc6720 9117a34a 60 | 0 3 27 85 11 845 27 3 7 15 1 1 15 05db9164 287130e0 9c6f773a 8f23ee93 4cf72387 fbad5c96 319a3bc4 0b153874 a73ee510 3b08e48b 5ba8ac16 75d14077 262ac3d4 b28479f6 9efd8b77 9c6e97c6 07c540c4 891589e7 21ddcdc9 b1252a9d 43e43c64 ad3062eb bcdee96c eae7a1b0 ea9a246c d5b1a8d2 61 | 0 8 360 8 1382 50 17 49 148 0 4 05db9164 b56822db 7da86e4b b733e495 25c83c98 7e0ccccf 38eb9cf4 0b153874 a73ee510 547c0ffe bc8c9f21 ed397d6b 46f42a63 b28479f6 a9d1ba1a 056d8866 e5ba7672 38dce391 21ddcdc9 b1252a9d deaf6b52 
c9d4222a 32c7478e d9556584 e8b83407 6c27a535 62 | 0 331 15799 0 0 148 0 fc9c62bb 90081f33 c47701be 9ff76ed7 25c83c98 7e0ccccf 01348ea8 51d76abe a73ee510 3b08e48b dd0c7036 9da25024 57960a15 b28479f6 1c51f095 52b66428 e5ba7672 1b884e69 ef486571 be7c41b4 8e84e831 63 | 1 2 1 1 468 0 1 1 0 1 1bc1bea8 04e09220 71947b86 bf9e41b6 25c83c98 7e0ccccf 2d5cd5c1 0b153874 a73ee510 3b08e48b fcdbe63b 8529d3b4 cfb4d38e 07d13a8f cae64906 33a1f420 1e88c74f e161d23a f22e0924 32c7478e ded4aac9 64 | 1 72 0 50 8 2 1 7 1 0 05db9164 287130e0 36f9c8b0 83a201b6 30903e74 fbad5c96 7d48c0ae 0b153874 a73ee510 622fc8eb 5874c9c9 bbe239c9 740c210d 07d13a8f 10040656 5cd2db3b e5ba7672 891589e7 7839a083 5840adea 3cfc77bd ad3062eb 32c7478e e29de47e ea9a246c 164d3259 65 | 0 1981 0 11865 0 35 5 0 4 05db9164 90081f33 0f19c4dc 9ef1a527 25c83c98 7e0ccccf d130cbd1 0b153874 a73ee510 4effc25c 9a422971 a46413a8 174e4cac b28479f6 1c51f095 08820ac3 e5ba7672 1b884e69 ba5e72fe 32c7478e 8e84e831 66 | 0 -1 8 12435 0 3 148 0 be589b51 71ca0a25 003f419b ff852091 4cf72387 7e0ccccf ed54b715 0b153874 a73ee510 e5cadd10 da28c392 349450bc 7165d9e8 1adce6ef f3400fc5 6615ffe6 d4bb7bd8 4e024bd7 21ddcdc9 b1252a9d 3dd38d65 32c7478e c2fe6ca4 445bbe3b 0015d4de 67 | 0 1 6 1 21 16 19 1 5 6 1 1 ae82ea21 2607540a 8d0ef88a f922efad 25c83c98 fe6b92e5 95667a0f 5b392875 a73ee510 beb8cbb0 07c67b85 b5b8de53 7aab7990 07d13a8f 4bce7416 3f8c9229 3486227d c6fa25f8 98040c74 423fab69 b34f3128 68 | 1 64 7 4 0 7 0 146 3 3 4 11 0 0 68fd1e64 58e67aaf 666f8565 65866ff6 25c83c98 f5047e31 1f89b562 a73ee510 2860ede1 1aa6cf31 469b314f 3b03d76e 051219e6 d83fb924 f14b3819 e5ba7672 c21c3e4c 21ddcdc9 a458ea53 697f6fd4 32c7478e ada925b2 9b3e8820 40cd9f3b 69 | 0 0 2984 9316 217 7 11 34 0 3 09ca0b81 6e638bbc 859e1d36 f24d4506 43b19349 7e0ccccf 987da766 0b153874 a73ee510 bbbd7e2b f37be5c0 d3f30591 a59ea816 07d13a8f d4525f76 2210df91 e5ba7672 f6a2fc70 21ddcdc9 b1252a9d d1cd4aac c9d4222a c7dc6720 971d53a5 445bbe3b b502d7f9 70 | 0 0 585 24577 1 0 05db9164 bc478804 
bb964b3b 13508380 25c83c98 fbad5c96 dc2b40a4 0b153874 a73ee510 6229d2b1 6685ea28 98bea2fa 7edc047a 07d13a8f 0af7c64c a8e2fc38 e5ba7672 65a2ac26 55dd3565 a458ea53 95430263 c7dc6720 45ab94c8 001f3601 c84c4aec 71 | 0 9 17 3 5830 0 3 25 0 3 05db9164 d7988e72 4a6325f3 2ed050c7 25c83c98 fe6b92e5 815cb136 66f29b89 a73ee510 3b08e48b a8e564a4 88681055 8e6dec32 051219e6 28aed80d 52f67c28 776ce399 0f2f9850 21ddcdc9 a458ea53 2ee96dde 3a171ecb 3fdb382b 3d2bedd7 49d68486 72 | 0 237 1 38934 115 0 0 14 0 9a89b36c f6f4fe4b d9076860 cba35ce7 4cf72387 fe6b92e5 36e461bd 5b392875 a73ee510 687c5a5b 026fa900 4d36f75a 540f3254 b28479f6 df360709 677226fb 1e88c74f c587eafc f54e9fc1 32c7478e e2aabffd 73 | 0 0 4 2 26095 0 2 7 0 2 05db9164 6e638bbc e3a92241 c771bf5c 25c83c98 7e0ccccf dcdd8d42 0b153874 a73ee510 e1a2ef0f c1700682 4f7b022c 0c66bf77 b28479f6 e63b30de c2807520 e5ba7672 3cb7e3f0 21ddcdc9 b1252a9d 59a15e58 32c7478e 8d653a3e 445bbe3b 8e1ae331 74 | 0 4 2 41 0 0 27 17 2 36 2 12 0 05db9164 38d50e09 835556c7 ded6a29a 4cf72387 13718bbd df5c2d18 37e4aa92 a73ee510 95b6ef60 a7b606c4 acd1070b eae197fd 07d13a8f e24ff4c6 cd13746a e5ba7672 f855e3f0 21ddcdc9 5840adea 0f350316 3a171ecb b2f178a3 001f3601 fab6966b 75 | 1 -1 7096 120 20 0 214 4 be589b51 a796837e 093ca40b 0fa0d423 25c83c98 fe6b92e5 e0386b63 5b392875 a73ee510 6a69ea7f 371dae82 64e10296 18fc2b1e 07d13a8f 23a9a76d 6bb29970 e5ba7672 c4ba79ab 69c7ea42 32c7478e 8fc66e78 76 | 1 0 -1 4226 18 10 3 30 0 2 0 05db9164 14a8f8d6 59b00bbe 13bc59b7 25c83c98 7e0ccccf a3579031 37e4aa92 a73ee510 eb21a142 1054ae5c 8bc53fee d7ce3abd d2dfe871 9dc1156f 69f18a97 e5ba7672 f64c5d3a 8a1d6051 423fab69 ad21a9fc 77 | 0 0 11 11 4605 0 22 11 0 11 68fd1e64 38a947a1 a49963ea 2f51bef4 25c83c98 7e0ccccf e702f4b9 0b153874 a73ee510 5115abf9 2bcfb78f 1eaff9a5 e6fc496d 1adce6ef c3cd9e9c cd1cedda d4bb7bd8 6391198f de3a2c86 32c7478e b3b23293 78 | 1 1 151 2 4 977 4 1 2 8 1 1 4 05db9164 e77e5e6e 30b266b7 a96f3796 25c83c98 7e0ccccf 04716793 37e4aa92 a73ee510 cd8f34fb 35d24a2f 
a24ac991 7165d9e8 051219e6 a6b876ce cecd67e7 e5ba7672 449d6705 21ddcdc9 b1252a9d 7b76a712 32c7478e 0f1ac786 445bbe3b 81c448e7 79 | 0 0 0 1 7983 48 1 1 0 0 1 05db9164 26ece8a8 8026a7ea 1e0ec6a2 25c83c98 7e0ccccf 3965ff35 1f89b562 a73ee510 e53bceb4 95eaf7a0 dd03210c ccfd4002 07d13a8f 102fc449 d0b4477d d4bb7bd8 87fd936e b193bbca 423fab69 5a456be6 80 | 0 0 6 6 3072 66 75 33 126 8 0 7 68fd1e64 d7988e72 f6b535f9 69fa4a82 25c83c98 13718bbd b0e3882e 0b153874 a73ee510 a9271c40 f37be5c0 2c6b7128 a59ea816 b28479f6 c8389df7 142b0f20 3486227d 0f2f9850 21ddcdc9 a458ea53 86c7b7e1 c7dc6720 6c4e776a 445bbe3b 06a75e35 81 | 0 1 8451 116 20 0 6 4 0 c2a5852e dc1def19 d3ae3ce1 13718bbd 6fadbb76 5b392875 a73ee510 5ba575e7 b5939c49 377af8aa ad1cc976 71d7ea29 3486227d 5681c2c1 c3dc6cef 82 | 0 1 3 12 9 1284 15 2 16 15 0 1 15 05db9164 333137d9 25c83c98 7e0ccccf 1ac10733 5b392875 a73ee510 434f9ed9 1b215789 da9a4095 07d13a8f 6cfa4ac6 07c540c4 c61e82d7 21ddcdc9 5840adea 32c7478e 445bbe3b 00ed90d0 83 | 0 4 1 5554 5 14 1 4 1 1 05db9164 f6f4fe4b 8314e9bc 14c1de76 25c83c98 3bf701e7 8422994c 64523cfa a73ee510 3dc0afb7 6263d404 038588ac aa1eb12e 07d13a8f 9ca59173 fa989b0b e5ba7672 ca5a75f3 9346764d c7dc6720 2d3e3ac8 84 | 1 5 181 14 15 7 2 5 28 148 1 1 2 05db9164 421b43cd a9103739 29998ed1 25c83c98 13718bbd fe4dce68 0b153874 a73ee510 6a002f59 68357db6 6aaba33c 768f6658 b28479f6 2d0bb053 b041b04a e5ba7672 2804effd 723b4dfd bcdee96c b34f3128 85 | 0 -1 14938 29 5 0 26 1 68fd1e64 287130e0 8a70fdb2 7b83af38 25c83c98 7e0ccccf 97a1686a 0b153874 a73ee510 07359bbb 6cb28f31 e3db5660 907c7bec 1adce6ef 310d155b 364e0dd2 e5ba7672 891589e7 95b99b23 a458ea53 24d0785a bcdee96c 804f40cc ea9a246c 46c722f2 86 | 0 0 34 9 10 1167 365 34 22 1058 0 10 0 40 68fd1e64 0468d672 a79173fb 34449f52 25c83c98 fbad5c96 fed3cb1d 5b392875 a73ee510 2ec6a85f b7094596 c3fb6371 1f9d2c38 b28479f6 60a23d23 0d9b39f7 8efede7f 124c6b00 21ddcdc9 5840adea 3f8a0a58 32c7478e c48364d9 ea9a246c 984e0db0 87 | 0 0 1 26 10 1663 14 0 0 14 05db9164 
90081f33 ff01cbf9 b5e60c7f 25c83c98 fbad5c96 4aa938fc 0b153874 a73ee510 efea433b 7e40f08a 0495def1 1aa94af3 64c94865 98995c3b ce90c07d 27c07bd6 7181ccc8 df693fdc ad3062eb 3a171ecb cd7bf52a 88 | 0 0 9 4 3 1666 18 96 8 222 0 9 0 3 be589b51 fc67db1d d4279004 d3c8a2de 25c83c98 fe6b92e5 f2f89ef6 0b153874 a73ee510 fa7d0797 a35dd4d8 c37cc291 9ded12ab 64c94865 53175e74 2d050459 e5ba7672 e24773fa 65cc6f7c 3a171ecb be673bbf 89 | 1 -1 2 7 17 0 7 7 0 7 05db9164 dde11b16 63f98b5e 8cc7b33b 25c83c98 fbad5c96 f00bddf8 5b392875 a73ee510 16a81a6c 55795b33 7becd6e7 39795005 07d13a8f e28388cc eca39129 1e88c74f 43dfe9bd cf158609 ad3062eb 3a171ecb 10b3e56d 90 | 0 1 206 6 8 675 26 1 22 22 1 1 8 05db9164 a796837e 42db3232 e3cc371a 4cf72387 fe6b92e5 7e72e3e1 5b392875 a73ee510 8a097529 0741a7f1 c9669737 8916f815 cfef1c29 f0bf9094 c4de5bba d4bb7bd8 1cdbd1c5 288eaded 78e2e389 3a171ecb 8fc66e78 91 | 0 1 45 93 8 99 10 33 13 43 1 8 10 05db9164 404660bb 437a8d13 5c4b1b3c 30903e74 fbad5c96 1c86e0eb 5b392875 a73ee510 213889cd 755e4a50 7cb0dae0 5978055e b28479f6 f0449815 2ebfc5f9 e5ba7672 997cd4bd 21ddcdc9 5840adea 7678f116 32c7478e ecbda592 f0f449dd f3737bd0 92 | 1 2 305 3 8 3 2 3 3 1 1 3 05db9164 2705da39 dd5c67c2 2b30dc64 4cf72387 fe6b92e5 5fb2890d 0b153874 a73ee510 4effc25c 6944f67f 7d86ca65 57d7b6a5 07d13a8f d582d840 f41863d7 07c540c4 66c3058a 8432b36e 32c7478e 043ce596 93 | 0 168 17 5 8642 0 18 30 0 5 68fd1e64 4f25e98b 6b7d7a80 1917d405 4cf72387 fe6b92e5 eac6dc30 5b392875 a73ee510 32390b96 df29f7bb 8305bfff 67b031b4 b28479f6 8ab5b746 20f01128 d4bb7bd8 7ef5affa 21ddcdc9 a458ea53 b486aa6e 32c7478e 0d9b76e9 001f3601 77ac015d 94 | 0 4 -1 0 2 0 4 0 0 1 1 0 05db9164 287130e0 4cbd5fea 74d817cd 384874ce fbad5c96 6855ef53 37e4aa92 a73ee510 3fb38a44 b7094596 53a95ff6 1f9d2c38 1adce6ef 310d155b 2ed99f6d 8efede7f 891589e7 18884750 a458ea53 afe27167 ad3062eb 32c7478e fb696847 ea9a246c de0cb3a5 95 | 0 2 1 2 24 158 89 2 29 28 1 1 28 9684fd4d bccb7a1a 29ffa33c c95cee83 25c83c98 7e0ccccf 5fbd9170 0b153874 
a73ee510 257c4f93 2bcfb78f bafda429 e6fc496d b28479f6 1302f720 01e777d6 e5ba7672 d51975d7 21ddcdc9 5840adea 16907e69 c9d4222a 32c7478e 9fa3e01a e8b83407 1d59c203 96 | 1 -1 1 21669 0 0 1 0 05db9164 6496eea0 951e87cb f4d07bea 25c83c98 fe6b92e5 86583230 0b153874 a73ee510 1b02929c 5f9d9fd9 e1e758de 406194ca 07d13a8f 08cec6f8 ea14a4c0 d4bb7bd8 fba30a05 21ddcdc9 b1252a9d 0a071455 32c7478e 1c3ed002 ea9a246c 98279775 97 | 1 6 -1 1120 36 18 35 243 1 7 05db9164 9b25e48b 99de6644 494b921c 25c83c98 7e0ccccf 1a30bbac 51d76abe a73ee510 38b45203 b388c135 2eb230f9 0defc36f f862f261 fa7b7e5c 5d724c2a e5ba7672 7d8c03aa 34fce22d a458ea53 2799e297 bcdee96c 3fdb382b 33d94071 49d68486 98 | 0 0 9 1 1 2368 420 1 22 161 0 1 1 9a89b36c 5ca60b73 03f5e595 bdc301e2 25c83c98 7e0ccccf ce4f7f55 0b153874 a73ee510 ebb8102f 38f692a7 870771b1 6e5da64f b28479f6 d7b9be02 bb01ab0a d4bb7bd8 04c62c3d 92259097 32c7478e 8ea22d26 99 | 0 0 11 4 4 14044 480 2 11 254 0 1 4 05db9164 421b43cd bc58b495 29998ed1 30903e74 fe6b92e5 840c9106 0b153874 a73ee510 f7394a63 45ca0a69 6aaba33c 17bb51e8 b28479f6 2d0bb053 b041b04a 07c540c4 2804effd 723b4dfd c9d4222a bcdee96c b34f3128 100 | 0 1 0 4 5 404 10 115 7 227 1 18 0 5 5a9ed9b0 3f0d3f28 56151406 b724d6df 25c83c98 fbad5c96 eb8a735c 0b153874 a73ee510 0c03b22b bc256141 337d46e3 43fc2d8c 07d13a8f 075f843b 1a1231dc 27c07bd6 744ad4a0 26798225 ad3062eb 423fab69 e5fca70a 101 | 1 0 2 1 1179 7 2 7 7 0 2 05db9164 95e2d337 549a314a 1f543f94 25c83c98 6f6d9be8 b097f76f 0b153874 a73ee510 00cdaa7d cb755bc4 bc831d5e 3df169e6 07d13a8f 4e505ea3 0f6e4eaa 07c540c4 7b06fafe 21ddcdc9 b1252a9d 36333f93 32c7478e df5fa281 2bf691b1 0d463698 102 | 1 2 7 11 13 1359 14 13 15 24 1 4 13 8c6ba407 95e2d337 458f7039 69040d07 25c83c98 fbad5c96 fd5c0ed6 0b153874 a73ee510 267caf03 95eaf7a0 7afbfbde ccfd4002 64c94865 7de4908b 4efcb04a e5ba7672 701d695d 712d530c a458ea53 1728db22 423fab69 4921c033 2bf691b1 80b0aeb9 103 | 1 10 50 2 2 1326 2 51 2 4 0 4 0 2 be589b51 421b43cd 0d000043 29998ed1 25c83c98 0832c824 
0b153874 a73ee510 efea433b 0983d89c 6aaba33c 1aa94af3 b28479f6 2d0bb053 b041b04a 3486227d 2804effd 723b4dfd 32c7478e b34f3128 104 | 0 596 5 10127 534 65 1 18 7 0 8cf07265 78ccd99e c34140f1 f65372fb 25c83c98 7e0ccccf ea8446e3 0b153874 a73ee510 38405c01 ecf21575 49236ada 5e400308 64c94865 7cfbc0ba e3efa0ff 3486227d e7e991cb c10eb73c a458ea53 1958f1f8 423fab69 37761cb9 e8b83407 46c4348b 105 | 0 0 38 5 6 13569 182 1 48 50 0 1 6 05db9164 38d50e09 e7290a18 10ee14f5 25c83c98 fbad5c96 fdc48e74 5b392875 a73ee510 3b08e48b e0e79bd6 394b6320 96fa211f b28479f6 42b3012c c336971a 07c540c4 582152eb 21ddcdc9 5840adea f226d6a0 3a171ecb e4f45003 001f3601 632dae8a 106 | 0 13 1647 0 10 0 13 0 0 2 2 0 05db9164 a0e12995 622d2ce8 51c64c6d 25c83c98 fe6b92e5 33656cc3 0b153874 a73ee510 91b5a423 88030678 e9521d94 d87c8d9c b28479f6 83763c20 ab8b968d e5ba7672 1616f155 21ddcdc9 5840adea ee4fa92e 32c7478e d61a7d0a 9b3e8820 b29c74dc 107 | 0 4 4 2 72228 0 2 2 0 2 be589b51 537e899b 5037b88e 9dde01fd 25c83c98 fe6b92e5 3ce91fbf 0b153874 7cc72ec2 711ec2bc b9ec9192 680d7261 df5886ca 07d13a8f 6d68e99c c0673b44 e5ba7672 b34aa802 e049c839 c7dc6720 6095f986 108 | 1 33 4 10740 0 3 54 0 0 4 05db9164 e112a9de 54ada4ae adef24d7 25c83c98 3bf701e7 15055a89 0b153874 a73ee510 d2b8ac2b 6743b177 63f0bdb9 12c11b9c 1adce6ef a4e2caab 22afbfdb 3486227d c342ea0e a4ba8455 ad3062eb 32c7478e 8ca3a6b7 109 | 1 1 0 22 8 1 0 3 9 73 1 2 0 5a9ed9b0 4e8d18ed 2b88c272 a8e2ad42 25c83c98 fe6b92e5 87e1825e 1f89b562 a73ee510 3b76bfa9 9fd640ca f4f25591 56b1ebb5 1adce6ef bbee52f9 eb99650f 07c540c4 47e4d79e a153cea2 b1252a9d 9fe328aa 55dd3565 25988df2 c9f3bea7 aa130fe0 110 | 0 0 0 1 1 2845 138 24 9 152 0 9 0 1 05db9164 9819deea 18ad1173 f922efad 25c83c98 7e0ccccf 136f2270 0b153874 a73ee510 c484657f 7bbe6c06 b99ddbc8 ea1f21b7 b28479f6 1150f5ed 87acb535 3486227d 7e32f7a4 a4b7004c 423fab69 b34f3128 111 | 0 1 66 105 39 43 37 25 42 51 1 2 37 68fd1e64 89ddfee8 686c186b 5873e917 25c83c98 fbad5c96 1c86e0eb 1f89b562 a73ee510 73af8251 755e4a50 
0adee066 5978055e 1adce6ef 34cce7d2 73a80605 e5ba7672 5bb2ec8e fbce4e45 b1252a9d c465b9c6 32c7478e 107b4351 f0f449dd cc6213d5 112 | 0 3 23 2 298304 0 2 28 0 2 05db9164 38d50e09 ae06bf90 b125f81c 25c83c98 7e0ccccf 25be271f 0b153874 7cc72ec2 3b08e48b 7a6f163c c15e7f3e b3ab85b2 07d13a8f e24ff4c6 98334731 776ce399 f855e3f0 21ddcdc9 5840adea 845453b6 3a171ecb ce327ac7 001f3601 09dd415c 113 | 1 2 15 3 38367 211 2 7 53 0 0 3 68fd1e64 95e2d337 3a853f26 69040d07 25c83c98 7e0ccccf a0e559da 0b153874 a73ee510 3b08e48b 418037d7 c4b67d3f b0bfed6d 64c94865 7de4908b a94cc9c6 27c07bd6 701d695d b795e71a a458ea53 91a7ca0f ad3062eb 423fab69 4921c033 2bf691b1 80b0aeb9 114 | 0 87 3 5490 3 5 0 3 1 0 8cf07265 09e68b86 3610d762 eba0cf7f 25c83c98 fe6b92e5 33cca6fa 0b153874 a73ee510 401ced54 cb70bc55 9a70d980 2b9fb512 b28479f6 52baadf5 66852857 e5ba7672 5aed7436 21ddcdc9 5840adea c641fc64 ad3062eb 3a171ecb 1793a828 e8b83407 691a6530 115 | 0 15 78 1 266 10 239 10 18 2 18 1 f473b8dc 0bf920cf 01c9c942 aff83244 25c83c98 fbad5c96 1c86e0eb 0b153874 a73ee510 e7ba2569 755e4a50 df63e04e 5978055e 32813e21 e5655411 0ad97eb2 e5ba7672 5705c078 f3dc5d51 3a171ecb 8f647f94 116 | 0 2 0 16 15 1308 46 2 42 46 1 1 25 05db9164 207b2d81 a0c17e97 397f8653 25c83c98 fe6b92e5 880b0bef 0b153874 a73ee510 3b08e48b 8784df70 21048d29 5b7c8640 1adce6ef dbbde166 75d8f997 07c540c4 25c88e42 21ddcdc9 b1252a9d 8844f7da 32c7478e edd9746a 001f3601 03764a6b 117 | 0 1 12 5 8720 197 11 7 94 3 7 8cf07265 38a947a1 7acbdeb5 bc5f5426 f3474129 fbad5c96 75dcaaca 0b153874 a73ee510 3b08e48b 8aabdae8 b6afd7f7 edcf17ce 1adce6ef 85eb15b5 04f520e6 e5ba7672 e46d32b4 a8c1f79b c9d4222a 3a171ecb 6efc9bbd 118 | 0 1 10 0 0 0 0 0 0 0 1fbb71a1 0468d672 77e1f21a b65189a7 43b19349 7e0ccccf 5fd3419b 5b392875 a73ee510 3b08e48b efc5e2cf 894114c7 c7176043 07d13a8f a888f201 65b35bf5 776ce399 9880032b 21ddcdc9 b1252a9d 8558b0a0 3a171ecb 42bab436 ea9a246c 779ff446 119 | 0 2 125 14 7 28 7 45 19 76 2 8 7 68fd1e64 1cfdf714 3c894c9a 9fe8df54 4cf72387 fbad5c96 
15ebc248 5b392875 a73ee510 a5f77a53 8fa5dd63 37376386 70728bf9 687dfaf4 a54fca2b 48381557 27c07bd6 e88ffc9d 21ddcdc9 a458ea53 ef6da73e c7dc6720 12b11aed cb079c2d d6febfc0 120 | 1 8 268 1 1 4 0 10 23 110 2 3 0 0 05db9164 1cfdf714 ec880e19 28222836 43b19349 7e46c875 5b392875 a73ee510 8332a54f bbbf282d 2e8ead89 917e9bf3 07d13a8f f775a6d5 27af74ee 3486227d e88ffc9d 21ddcdc9 b1252a9d f7765884 423fab69 e927d3a5 cb079c2d ec8c1a11 121 | 0 -1 10 3 3309 64 1 8 60 1 3 68fd1e64 287130e0 4eadebf9 695010b5 25c83c98 fe19892c 316074ea a73ee510 6feef489 bbeade31 c11464ba 103a86a6 1adce6ef 310d155b 64c4e7db d4bb7bd8 891589e7 315ba0e1 a458ea53 0f494a3f c7dc6720 7f9a8e66 ea9a246c d4060883 122 | 0 8 61 3 1 1334 1 46 1 88 0 4 0 1 05db9164 80e26c9b 04c7ead5 7a10bc51 43b19349 7e0ccccf 5270a50f 0b153874 a73ee510 cfa5dd13 d6ea7935 44af655a 74838342 b28479f6 4c1df281 4f0ca259 8efede7f f54016b9 21ddcdc9 5840adea 46ff4ead 3a171ecb 82c1b580 e8b83407 4361c4a9 123 | 0 1 7 25812 1449 1 1 704 1 7 05db9164 e112a9de 9db30a48 b3dbc908 25c83c98 fbad5c96 2220ee44 5b392875 a73ee510 04a6cddd 40862c01 2598d8eb 0f39538f ad1cc976 f1e1df0a 9ab4d6b1 d4bb7bd8 fdbdefe6 bbf96cac c3dc6cef 8f079aa5 124 | 0 0 51 1 1 4561 24 1 1 14 0 1 1 68fd1e64 95e2d337 d5763291 330311a9 25c83c98 7e0ccccf 7ff5559f 0b153874 a73ee510 3b08e48b bb40a095 4266b7b2 4af5f8c2 64c94865 8ec16239 1e2a56f9 d4bb7bd8 7b06fafe 68c36492 a458ea53 50048a6e 8ec974f4 c7dc6720 3c5502ba 2bf691b1 518d725f 125 | 0 4 0 85 7 4 3 14 9 79 1 3 0 3 39af2607 0468d672 db2e7684 932860b1 25c83c98 7e0ccccf 368e23c1 51d76abe a73ee510 616ed314 bd3d9389 2f914cd6 c95e66fd b28479f6 234191d3 6057c1bb e5ba7672 9880032b 21ddcdc9 5840adea aee90379 3a171ecb e2c5f195 ea9a246c 4e7af834 126 | 1 4 38 1020 16 4 3 3 1 1 5a9ed9b0 e6203a55 6a094101 c5455c5f 25c83c98 7e0ccccf f417bf96 1f89b562 a73ee510 3b08e48b 488e8c8d 0ebcbbef 44af41ef 07d13a8f 68909f00 2cde4ca7 e5ba7672 8ecf282f 77d96671 3a171ecb 16bb3de8 127 | 0 0 -1 3 4151 138 5 0 3 0 1 4 3 5a9ed9b0 80e26c9b efbdff40 66096e43 
25c83c98 f31511c2 5b392875 a73ee510 3b08e48b 6af26531 52813a23 5fcbc3ef 64c94865 df97efd0 66d6d022 8efede7f f54016b9 21ddcdc9 a458ea53 5542e314 32c7478e 45e9df56 e8b83407 74e8755f 128 | 0 6 1 1 1 14 1 6 1 1 1 1 0 1 68fd1e64 38a947a1 5a51fff9 492a96c3 25c83c98 7e0ccccf f1ff45d6 64523cfa a73ee510 9a1250bd acc758fc 044b253e 9bbdb8bd 1adce6ef 8da726c0 addff437 e5ba7672 355bd67f 508cd4cd ad3062eb 3a171ecb 1793a828 129 | 0 1 181 114 20 76 23 1 24 24 1 1 23 05db9164 f0cf0024 74e1a23a 9a6888fb 25c83c98 7e0ccccf aaa26caf 5b392875 a73ee510 c2824056 26879515 fb8fab62 2ef8dcdd b28479f6 e6c5b5cd c6b1e1b2 07c540c4 b04e4670 21ddcdc9 5840adea 99c09e97 32c7478e 335a6a1e ea9a246c 8d8eb391 130 | 0 457 5 2 119 0 2 2 0 2 5a9ed9b0 287130e0 982257df 74d1aace 25c83c98 fe6b92e5 28639f10 0b153874 a73ee510 3b08e48b 3a5bf2d6 93754964 155ff7d9 b28479f6 9efd8b77 df486640 776ce399 891589e7 21ddcdc9 b1252a9d 117fbd04 3a171ecb 3fdb382b ea9a246c 49d68486 131 | 1 0 16 3 3994 1280 0 10 167 0 3 05db9164 4f25e98b c6605a87 83b59776 25c83c98 fbad5c96 36b21dc8 0b153874 a73ee510 451bd4e4 0f1fa8b8 71a61057 e4e9ce3a b28479f6 8ab5b746 066c0065 e5ba7672 7ef5affa 21ddcdc9 a458ea53 dac4f09f bcdee96c 3fdb382b 9b3e8820 f7924e53 132 | 0 0 13 11 4546 0 10 11 0 11 68fd1e64 c44e8a72 16935f2b 0f73fbed 25c83c98 7e0ccccf 945ff49e 0b153874 a73ee510 3b08e48b 6f71ad27 bed5f3d8 78b1bfe4 07d13a8f b88d2fea 91d227ed 1e88c74f 456d734d 49c3ca73 b1252a9d 77bcd7f1 ad3062eb 32c7478e efbfe1e3 001f3601 d9509875 133 | 0 0 41 34 2 8317 9 0 9 be589b51 b56822db 7da86e4b b733e495 25c83c98 fbad5c96 67a76773 0b153874 a73ee510 beb6ee2c cfd02f48 ed397d6b 37bfaf8b b28479f6 a9d1ba1a 056d8866 07c540c4 38dce391 21ddcdc9 b1252a9d deaf6b52 32c7478e d9556584 001f3601 6c27a535 134 | 0 0 1 1 3612 3 34 0 54 0 1 0 68fd1e64 a0e12995 622d2ce8 51c64c6d 25c83c98 7e0ccccf 8fb5446a 5b392875 a73ee510 5c88b319 59cd5ae7 e9521d94 8b216f7b 07d13a8f 73e2709e ab8b968d e5ba7672 1616f155 21ddcdc9 5840adea ee4fa92e 32c7478e d61a7d0a 9b3e8820 b29c74dc 135 | 0 0 35827 145 
0 0 1 0 5a9ed9b0 68aede49 28be62e7 b5dda8e5 25c83c98 fe6b92e5 301e3e06 0b153874 7cc72ec2 d023b3d2 df29f7bb 03dff1a5 67b031b4 07d13a8f 8dbc001a a8912661 d4bb7bd8 262c8681 1751f2e0 32c7478e 55dea74e 136 | 0 2 1 3 1 1404 46 2 7 7 1 1 1 05db9164 207b2d81 e48e5552 9ddc492e 25c83c98 fe6b92e5 4aa938fc 0b153874 a73ee510 71c30a63 7e40f08a e5e1ca92 1aa94af3 b28479f6 3c767806 29fd6b7b 07c540c4 395856b0 21ddcdc9 a458ea53 e5191f27 3a171ecb c23c2e19 001f3601 3464ae5c 137 | 0 0 7 27793 7 3 3 13 1 87552397 c5fe64d9 0d0ad712 87682a54 25c83c98 6f6d9be8 e69622a0 0b153874 a73ee510 3b08e48b ee6f573d 9a087a82 c9014cdb 07d13a8f 52b49730 2262b20c e5ba7672 c235abed 315ba0e1 a458ea53 29327913 32c7478e b116c772 ea9a246c 43e4411a 138 | 0 0 5 1 1 1896 122 7 29 253 0 4 1 06584483 421b43cd 9ebc75fa 29998ed1 4cf72387 fbad5c96 007a7704 5b392875 a73ee510 3ff5f6c9 9fc87b07 6aaba33c bf596cfe b28479f6 2d0bb053 b041b04a e5ba7672 2804effd 723b4dfd bcdee96c b34f3128 139 | 0 0 70 26 61 946 0 28 39 0 27 05db9164 1cfdf714 096f63a6 6c92a8ae 25c83c98 7e0ccccf 7195046d 5b392875 a73ee510 5a2dbfb2 4d8549da 7cf22eb7 51b97b8f b28479f6 d345b1a0 c047873a e5ba7672 e88ffc9d 5b885066 a458ea53 11aa8d43 ad3062eb bcdee96c 23a91d68 e8b83407 4450e3bb 140 | 1 1 1 6 15681 39 8 6 35 1 6 5a9ed9b0 9bcd4a15 d032c263 c18be181 4cf72387 7e0ccccf 276c620c 37e4aa92 7cc72ec2 3b08e48b 9ffb3655 dfbb09fb a0874a81 cfef1c29 f3bbc114 84898b2a e5ba7672 be7bc2c7 0014c32a ad3062eb 32c7478e 3b183c5c 141 | 0 49 3 7 28982 0 39 2 0 15 68fd1e64 a796837e 08de7b18 97ce69e9 25c83c98 7e0ccccf 9a8798e6 5b392875 a73ee510 3b08e48b fd0dad89 c5011072 f171ba63 cfef1c29 98eddd86 5a9431f3 776ce399 e90118d1 e754c5e1 be7c41b4 8fc66e78 142 | 1 13 -2 23 26 1 0 463 47 792 1 40 0 0 05db9164 89ddfee8 7edab412 f1d06e8a 4cf72387 7e0ccccf c96de117 0b153874 a73ee510 e533d2ab ad757a5a 0a02e48e 93b18cb5 051219e6 d5223973 e2bc04da 8efede7f 5bb2ec8e 2221c689 a458ea53 1de5dd94 c9d4222a 423fab69 43fe299c f0f449dd f3b1f00d 143 | 0 0 1 3 5528 127 19 3 251 0 3 3 87552397 efb7db0e 
db44bc96 5b392af8 25c83c98 7e0ccccf afa309bd 0b153874 a73ee510 b2d2f56a 77212bd7 a7ee0769 7203f04e b28479f6 5ab7247d 15469308 e5ba7672 a863ac26 31c11f6b c7dc6720 abe3a684 144 | 0 13 1 2077 5 2 1 10 2 1 5a9ed9b0 0c4bf847 512ccf08 78cdb539 25c83c98 fe6b92e5 a477a3ab 1f89b562 a73ee510 b36542f1 ef969cb2 3e26a0ba ac249cd4 64c94865 7255c07d 95b7b60e 07c540c4 a99375ce c989b5ec 32c7478e 0deb67ef 145 | 0 0 1 10568 202 5 2 74 1 8cf07265 482fe41f c0ac520f cbb4c12e 30903e74 fe6b92e5 2773eaab 0b153874 a73ee510 d66365e9 06474f17 1d6bfcff 2ec4b007 1adce6ef 62092ade 44f81422 07c540c4 b182a697 21ddcdc9 5840adea cb3e5850 c9d4222a 423fab69 42998020 e8b83407 b91c0f78 146 | 1 0 17 4 4 1638 83 27 31 161 0 6 4 05db9164 38a947a1 4d3664a7 d070b865 25c83c98 1c86e0eb 5b392875 a73ee510 f991ddf2 755e4a50 3e63d512 5978055e b28479f6 46ed0b3c 03c7b791 e5ba7672 2c6cb693 94e4b014 32c7478e b258af68 147 | 0 1960 3 2329 11 1 5 5 1 4 05db9164 38a947a1 783d0d56 04600d6d 25c83c98 fe6b92e5 52283d1c 0b153874 a73ee510 efea433b e51ddf94 71f24952 3516f6e6 b28479f6 be9e837b cd68975a d4bb7bd8 5ea889df adb2e690 ad3062eb 3a171ecb 3d759b8c 148 | 0 10 1 0 41edac3d 38a947a1 e3818eb2 a4456f7e 25c83c98 88002ee1 1f89b562 7cc72ec2 3b08e48b f1b78ab4 3bec5d45 6e5da64f 07d13a8f b3b12b81 72d1790f 2005abd1 f03e8b05 950d91c1 32c7478e 2f7e98de 149 | 1 0 136 2 10450 306 23 0 89 0 4 8cf07265 71ca0a25 8dbba1ce 03138509 384874ce e24d7cb8 5b392875 a73ee510 64738e1e 03458ded dc85594f 8019075f 07d13a8f a8e0f0c6 386da210 e5ba7672 9bf8ffef 21ddcdc9 5840adea fcbe92e8 32c7478e 696cc26c e8b83407 01774abe 150 | 1 41 6 7 13992 8 2 7 8 1 7 05db9164 90081f33 641e04f4 2c3cb82a 25c83c98 fe6b92e5 6ad82e7a 5b392875 a73ee510 010e2c68 c1ee56d0 9bb536b3 ebd756bd b28479f6 90cfbb67 5f32239f e5ba7672 622ba7f1 96e51f48 3a171ecb abf08f1b 151 | 0 0 349 10 8 2812 24 7 15 96 0 2 16 05db9164 287130e0 28eea01d b4e3a4d6 25c83c98 fbad5c96 4a230e59 0b153874 a73ee510 7fc0c2b2 922bbb91 8991cd8c ad61640d 07d13a8f 10040656 2aed4df7 27c07bd6 891589e7 82368759 
b1252a9d 1157327f c9d4222a 32c7478e 1a507348 ea9a246c db1bd5f0 152 | 0 1 300 2 4 780 5 2 6 48 1 1 4 05db9164 2fe85f57 d4b34111 230aba50 25c83c98 7e0ccccf ac652f55 51d76abe a73ee510 3b08e48b 5a0a81aa fa2a702c 99b38f98 b28479f6 3f488e91 b542b634 e5ba7672 c23adf9f 7fd2eed0 32c7478e fab2a151 153 | 0 0 5 21 25 1225 25 1 25 25 0 1 25 241546e0 207b2d81 d9e96c5e 0fa0d423 25c83c98 fbad5c96 3c1fa46c 0b153874 a73ee510 7eea998b 499e9dcb 8c51950e a2158803 cfef1c29 f2237c1b 6bb29970 d4bb7bd8 99cb5912 21ddcdc9 b1252a9d 8f18d8d5 ad3062eb 3a171ecb 8fc66e78 001f3601 f37f3967 154 | 0 0 33 3 1588 140 1 27 58 0 1 0 3 05db9164 064c8f31 dc0d65af 55c7c029 4cf72387 fbad5c96 6fb62f1a 0b153874 a73ee510 efea433b e51ddf94 c5d0e605 3516f6e6 b28479f6 2f8d7add 7b274b5b e5ba7672 795b8402 21ddcdc9 b1252a9d 6cce8b6e 32c7478e 4997082e 001f3601 2fede552 155 | 1 3 69 2 1 11 1 3 1 1 1 1 1 5a9ed9b0 9e5ce894 12cb69e0 13508380 25c83c98 fbad5c96 2e62d414 0b153874 a73ee510 fa303997 258875ea 9bac6752 dcc8f90a 07d13a8f 8cf98699 d6dccedb d4bb7bd8 a5bb7b8a 55dd3565 b1252a9d 12ed5c2b 423fab69 45ab94c8 ea9a246c c84c4aec 156 | 0 0 0 2 2 40263 0 1 1 0 0 2 5a9ed9b0 6887a43c 310a2ebb dac3e8ea 25c83c98 894562a9 c8ddd494 7cc72ec2 00f2b452 41b3f655 530062c6 ce5114a2 07d13a8f db12b98c f65409a3 07c540c4 1e4afada 21ddcdc9 a458ea53 31fe1057 32c7478e 897b70cf 445bbe3b 60143f0c 157 | 1 44 2 2 5745 40 6 8 78 1 2 be589b51 603e7ec9 b94d59c7 913bea4c 25c83c98 fe6b92e5 6b157a36 37e4aa92 a73ee510 8d9d3025 5097afa1 306a092b a4456f70 07d13a8f 5d6805b7 39b8bb21 e5ba7672 e8bc4abf 94902917 423fab69 eff6c82f 158 | 0 1 0 28 4 465 49 3 9 14 1 2 4 05db9164 8947f767 bc118fbb 16fe249c 25c83c98 fbad5c96 afa309bd 0b153874 a73ee510 0042ccac 77212bd7 425c3693 7203f04e 07d13a8f 2c14c412 86a29fb8 e5ba7672 bd17c3da 6f3756eb 5840adea 7dc891cc c7dc6720 1793a828 e8b83407 a475662f 159 | 0 54 1 1 12769 0 6 72 0 1 05db9164 38a947a1 1071438d de02465f 25c83c98 fe6b92e5 9e36bb88 5b392875 a73ee510 3b08e48b 37deccff ba5dfa0d e1fb2a68 07d13a8f 6a807ea3 2727665f 
776ce399 1d1986db f5511ad8 3a171ecb 69d3499a 160 | 1 0 2 7 1363 33 1 5 9 0 1 7 05db9164 207b2d81 69866e1c 1a4b1eb1 25c83c98 7e0ccccf d130cbd1 0b153874 a73ee510 2ed68727 9a422971 3ae289b3 174e4cac b28479f6 899da9d5 233426cd d4bb7bd8 25c88e42 21ddcdc9 b1252a9d 3229ec62 3a171ecb 73e86bff 001f3601 98f2cb71 161 | 0 15 1 7191 108 13 5 116 2 0 1 241546e0 e112a9de a8d6c8b0 65e97d66 25c83c98 fbad5c96 2f294c2b 0b153874 a73ee510 c9ae79d8 f53090e9 bd7eec69 4abc6315 b28479f6 5ff145a2 332f7099 e5ba7672 ca5a79fa 40acbeb3 32c7478e 8f079aa5 162 | 0 0 1 14 23 1486 65 1 30 58 0 1 40 05db9164 38a947a1 e9c3a8e9 fb7b61aa 25c83c98 7e0ccccf 776ecf80 0b153874 a73ee510 2ce2764d d93e6010 f0261606 4e8bba73 1adce6ef 1b9ea990 c8719c2b d4bb7bd8 a30a18f8 3bbda4e7 423fab69 11cfe898 163 | 0 5 4 4 15540 241 1 7 177 1 4 05db9164 0a519c5c 77f2f2e5 d16679b9 25c83c98 7e0ccccf fe4e75fa 0b153874 a73ee510 6aea41c7 8f4f8f83 9f32b866 8828a59c 07d13a8f b812f9f2 31ca40b6 07c540c4 2efa89c6 dfcfc3fa 93bad2c0 aee52b6f 164 | 1 26 -1 29 11 12 3 83 20 57 2 9 3 05db9164 c01b42bc 9270e53b a800f41a 25c83c98 bd9a3e0c 5b392875 a73ee510 54dee4bc 980e6880 f3ad49a9 aef750b7 b28479f6 4a610ec9 5766f62c e5ba7672 45e58044 21ddcdc9 5840adea c6008d7a 32c7478e 0243a2a0 f0f449dd 2ae73226 165 | 0 1 40 13 8 28 6 23 7 411 1 11 3 6 05db9164 89ddfee8 0b8a7a69 a6cb5175 4cf72387 fbad5c96 407438c8 0b153874 a73ee510 6c3c1307 755e4a50 325bab59 5978055e 1adce6ef 34cce7d2 b83077e4 8efede7f 5bb2ec8e 07995af6 a458ea53 51554e41 32c7478e 0c3013ca f0f449dd d885a38e 166 | 0 0 23 2 14 8079 311 10 26 245 0 3 14 05db9164 2c16a946 769c1166 f0ee1171 25c83c98 fe6b92e5 5e64ce5f 0b153874 a73ee510 2db0d980 8b94178b 54dffe71 025225f2 07d13a8f 18231224 8629304e e5ba7672 74ef3502 011fcea7 32c7478e 9117a34a 167 | 0 2 31 29 9512 42 74 31 150 9 0 29 05db9164 38a947a1 effd1151 f0e0a867 25c83c98 a4bbd4f4 0b153874 a73ee510 3b08e48b 8d5ad79c faaf9806 4809d853 07d13a8f 42d81087 bda48c80 27c07bd6 be5810bd 57763185 be7c41b4 043a382b 168 | 0 14 3 2 74835 0 8 2 0 3 
39af2607 a796837e dffca8ba 0fa0d423 4cf72387 fbad5c96 6e31fb64 5b392875 7cc72ec2 b491e8c1 4f114f24 93bab460 7f533b9e cfef1c29 85e5b07c 6bb29970 07c540c4 4e6b896a d9d9202f ad3062eb 3a171ecb 8fc66e78 169 | 0 0 12 4 2 2350 81 1 11 23 0 1 2 8cf07265 558b4efb b009d929 c7043c4b 25c83c98 fe6b92e5 81bb0302 0b153874 a73ee510 7f314591 b7094596 3563ab62 1f9d2c38 07d13a8f c1ddc990 b688c8cc d4bb7bd8 c68ebaa0 21ddcdc9 5840adea 2754aaf1 c7dc6720 3b183c5c ea9a246c 2f44e540 170 | 0 9 0 2 16 1 3 36 19 130 1 9 0 2 05db9164 38d50e09 ca5b9347 36f6f194 43b19349 7e0ccccf dda1fed2 0b153874 a73ee510 2462946f 7f8ffe57 9603ccaa 46f42a63 07d13a8f e24ff4c6 46973e83 27c07bd6 f855e3f0 21ddcdc9 5840adea 1f06bc3d ad3062eb 3a171ecb b2f178a3 001f3601 09929967 171 | 1 0 516 11 6 7155 0 26 4 0 0 0 26 05db9164 26ece8a8 2c1222d0 d1baf1d4 25c83c98 fbad5c96 ac2d4799 0b153874 a73ee510 0d0ca0e7 434d6c13 d24c389e 7301027a cfef1c29 60538e81 c77cde77 e5ba7672 cc464611 7d24f355 ad3062eb 55dd3565 6138b4a7 172 | 0 9 163 5 2 5 2 27 4 8 1 7 2 2 68fd1e64 89ddfee8 de5fe7aa d382d770 25c83c98 fbad5c96 1c86e0eb 0b153874 a73ee510 67eea4ef 755e4a50 56f2ed36 5978055e 051219e6 d5223973 3329de85 e5ba7672 5bb2ec8e d0289910 b1252a9d e0157b0a 32c7478e d4f7b3a7 f0f449dd 2f89515c 173 | 1 8 -1 681 4 75 17 274 3 20 05db9164 38a947a1 e058fc3c 2192038e 4cf72387 fbad5c96 81bb0302 0b153874 a73ee510 e9995d97 b7094596 f7cbe917 1f9d2c38 b28479f6 ccb85bf3 0c6b4ad6 e5ba7672 4427594e f6e3bd9c 32c7478e 9e07eb4a 174 | 1 0 239 1 13317 0 0 5a9ed9b0 2c16a946 bd24ac03 9f43a1b5 25c83c98 fbad5c96 bcb858e8 0b153874 a73ee510 66cff97b 59da2976 17b9e35b a1354fa8 b28479f6 3628a186 87140baa 07c540c4 e4ca448c a62fa482 3a171ecb 9117a34a 175 | 0 5 1 7145 0 1 28 0 1 05db9164 c1384774 ad4b77ff d16679b9 4cf72387 fe6b92e5 81bb0302 5b392875 a73ee510 7f314591 b7094596 a2f4e8b5 1f9d2c38 b28479f6 916e9a2c 89052618 1e88c74f 8e8b535e 21ddcdc9 b1252a9d d4703ebd 32c7478e aee52b6f ea9a246c f8d62db8 176 | 0 0 8 3 8 32538 379 1 16 193 0 1 8 05db9164 0a519c5c b00d1501 
d16679b9 25c83c98 7e0ccccf 670e4ae1 0b153874 a73ee510 3b08e48b e671204e e0d76380 e325e0dd 1adce6ef 123b2f29 1203a270 27c07bd6 2efa89c6 73d06dde 3a171ecb aee52b6f 177 | 0 7 0 12 5 325 55 36 45 178 1 2 2 5 05db9164 80e26c9b 707816c9 85dd697c 4cf72387 fe6b92e5 c3e9476d 0b153874 a73ee510 7d3f8124 35b0e946 78371173 8c2b39b2 8ceecbc8 8d015bd8 da441c7e 27c07bd6 005c6740 21ddcdc9 5840adea b2f79ba1 55dd3565 1793a828 e8b83407 9904c656 178 | 0 7 0 10 1 0 0 7 1 1 1 1 0 0 05db9164 58e67aaf 1341e526 79cd1475 25c83c98 fbad5c96 1faa5ca1 0b153874 a73ee510 a2df597b 1aa6cf31 e5e0738f 3b03d76e 07d13a8f 10935a85 1ab5b517 3486227d c21c3e4c 3014a4b1 a458ea53 ab58841e 32c7478e 95c16b66 9b3e8820 0e9e67aa 179 | 0 -1 0 0 4 0 05db9164 38a947a1 c4b406e2 bf536db3 4cf72387 7e0ccccf 1af5fecb 0b153874 7cc72ec2 3b08e48b 2e3884d7 e8a073ad 9c7a975e b28479f6 1db94996 489b1305 2005abd1 e8623312 e106ec2a ad3062eb 3a171ecb b889075b 180 | 0 2 464 1 5 1272 70 43 49 165 0 6 6 5 05db9164 38a947a1 7e863b36 8cb1ea25 4cf72387 fbad5c96 29ed3d08 0b153874 a73ee510 dec405d0 d74aabe6 5638862c 0fb78b80 07d13a8f 61866b4e 7cfd6756 8efede7f 74a21973 578de23c ad3062eb 55dd3565 786101ea 181 | 0 300 12 5 6649 0 2 19 0 5 5a9ed9b0 bfdcfc4a 003f419b ff852091 4cf72387 7e0ccccf 57b4bd89 0b153874 a73ee510 3b08e48b 71fd20d9 349450bc ddd66ce1 07d13a8f aa18cf81 6615ffe6 1e88c74f ffd53157 21ddcdc9 b1252a9d 3dd38d65 ad3062eb 3a171ecb c2fe6ca4 e8b83407 0015d4de 182 | 0 3 1 19 17 30 11 3 39 39 2 2 11 68fd1e64 38d50e09 a1f7e30b 8d8188a0 25c83c98 7e0ccccf 7d733ece 0b153874 a73ee510 7e535333 30b2a438 ba2755e8 aebdb575 07d13a8f e24ff4c6 2d816edd 07c540c4 f855e3f0 21ddcdc9 5840adea ee6cb673 32c7478e 9117a34a 001f3601 9f6a34e7 183 | 0 0 89 1 2 1029 264 5 44 53 0 1 2 05db9164 8084ee93 02cf9876 c18be181 25c83c98 7e0ccccf a86d9649 0b153874 a73ee510 147a12f7 aadb87b9 8fe001f4 e9332a03 b28479f6 16d2748c 36103458 07c540c4 003d4f4f e587c466 3a171ecb 3b183c5c 184 | 0 1 8 1 629377 0 3 1 0 1 39af2607 73a46ff0 fd6dd7f4 1f140fb3 f281d2a7 fe6b92e5 
26fc9d19 0b153874 7cc72ec2 7a81165f 4be981e7 09aac274 d77a196a 07d13a8f 376a23f2 0f976152 e5ba7672 da507f45 21ddcdc9 5840adea f4ae7256 78e2e389 32c7478e c17719b6 ea9a246c 0e78bd29 185 | 0 1 1 1 4 14 4 1 4 4 1 1 4 8cf07265 d833535f ad4b77ff d16679b9 4cf72387 fe6b92e5 f4b9d7ad 0b153874 a73ee510 80f90445 c1ee56d0 a2f4e8b5 ebd756bd 07d13a8f 943169c2 89052618 d4bb7bd8 281769c2 d4703ebd 3a171ecb aee52b6f 186 | 1 0 -1 1647 0 3 0 0 0 1 98237733 38a947a1 3be104a1 6a14f9b9 25c83c98 fe6b92e5 38e70f12 062b5529 a73ee510 3b08e48b 7bb631ec fcee9999 2f5ddbed 07d13a8f 4dad26dd f8b34416 07c540c4 e5f8f18f f3ddd519 32c7478e b34f3128 187 | 0 0 0 27 7 26700 0 20 8 0 0 8 68fd1e64 9819deea 3662b866 f922efad 25c83c98 fe6b92e5 cc5ed2f1 0b153874 a73ee510 3b08e48b facf05cc b99ddbc8 9f16a973 b28479f6 1150f5ed 87acb535 e5ba7672 7e32f7a4 a4b7004c 32c7478e b34f3128 188 | 0 15 4 25 22 2 2 90 33 210 1 3 2 be589b51 71ca0a25 396df967 328b42c3 25c83c98 15ce37bc 0b153874 a73ee510 a9271c40 ff78732c 8481d649 9b656adc b28479f6 a67c19b7 b2f2a0c7 e5ba7672 9bf8ffef 21ddcdc9 5840adea f0bb1194 c7dc6720 a5ce2d0d 445bbe3b 984e0db0 189 | 0 48 5 1262842 0 0 1 0 86295adc 09e68b86 aa8c1539 85dd697c 25c83c98 7e0ccccf 0536e816 37e4aa92 7cc72ec2 3b08e48b b3bbdaaa d8c29807 89334965 8ceecbc8 d2f03b75 c64d548f d4bb7bd8 63cdbb21 cf99e5de 5840adea 5f957280 ad3062eb 3a171ecb 1793a828 e8b83407 b7d9c3bc 190 | 0 1 1 2 25624 184 2 2 130 1 2 05db9164 09e68b86 7e2f1224 b1155a69 4cf72387 fbad5c96 e746fe19 1f89b562 a73ee510 d6803e49 0bc63bd0 8228f535 ef007ecc b28479f6 6f73304a 2714bc71 07c540c4 479030a6 0b8cd6e5 5840adea a09ef9fc 32c7478e 41be4766 e8b83407 d8a062c4 191 | 0 1137 3 4 20049 54 1 3 49 1 4 39af2607 2705da39 5f6bc3b3 dabaa4f7 25c83c98 7e0ccccf 33cca6fa 0b153874 a73ee510 fb999b75 9f7c4fc1 60e9c48f 2b9fb512 b28479f6 d420da19 b151f39c d4bb7bd8 66c3058a 1952b3e9 3a171ecb 043ce596 192 | 0 0 0 12911 180 31 0 121 0 7 0 49807078 4f25e98b 9b1fe698 ddff38ca 43b19349 7e0ccccf 424fbb9a 361384ce a73ee510 48317e70 2386466b 84b8515e 
45db6793 07d13a8f 5be89da3 fab8da65 3486227d bc5a0ff7 1d04f4a4 a458ea53 22ced0e8 3a171ecb 7a8e7ed6 001f3601 f159b6cb 193 | 1 10 240 6 5 1372 39 91 37 297 1 10 1 29 5a9ed9b0 bc478804 425eac0f 13508380 25c83c98 7e0ccccf d4ab9344 062b5529 a73ee510 2134f605 2bcfb78f e4bed286 e6fc496d 07d13a8f 0af7c64c 37c190a2 27c07bd6 65a2ac26 8733cf72 b1252a9d 0855f75a ad3062eb 423fab69 45ab94c8 001f3601 c84c4aec 194 | 1 3 1 15765 7 4 1 7 1 1 68fd1e64 a7ee4473 d24efce5 0baf92bf 25c83c98 fe6b92e5 dfc6e241 5b392875 a73ee510 5fe250bc 3547565f dcdb4766 12880350 07d13a8f 5e1280bd 9ed97a5a 07c540c4 475fc404 dd6a31f6 ad3062eb 3a171ecb 1be1c512 195 | 0 0 2 1 55931 0 5 92 0 2 7e5c2ff4 89ddfee8 2b62fba5 dad66aab 4cf72387 fbad5c96 66acf824 0b153874 7cc72ec2 0ed4b00d e192b186 82e69b07 7df3a6c1 1adce6ef 34cce7d2 77cd3f5f 27c07bd6 5bb2ec8e dd2d8ada a458ea53 aa881645 ad3062eb 423fab69 1793a828 f0f449dd c3cdeed4 196 | 1 3 26 6 1 692 5 7 100 3 6 05db9164 0468d672 3002a1ea e0e44130 4cf72387 7e0ccccf 81bb0302 0b153874 a73ee510 8b7e21f6 b7094596 3616c0b4 1f9d2c38 07d13a8f 7da7b9b2 48d4a2e5 e5ba7672 124c6b00 21ddcdc9 5840adea 20a0f0d8 32c7478e 4dca7569 ea9a246c aa5f0a15 197 | 0 12 92 378125 0 05db9164 38d50e09 e7290a18 10ee14f5 25c83c98 7e0ccccf 10d51d22 0b153874 7cc72ec2 3b08e48b e73bed86 394b6320 4a6575c1 b28479f6 42b3012c c336971a 776ce399 582152eb 21ddcdc9 5840adea f226d6a0 32c7478e e4f45003 001f3601 632dae8a 198 | 0 0 0 48 27 0 2363 2 26 344 0 1 0 0 8cf07265 38d50e09 981f35b1 d13a0547 4cf72387 7e0ccccf a9087a98 64523cfa a73ee510 fe687d88 30b2a438 c98d496e aebdb575 07d13a8f e24ff4c6 19301ae1 8efede7f f855e3f0 21ddcdc9 5840adea 8935d929 32c7478e 21c9d296 001f3601 984e0db0 199 | 1 56 19 2 9770 172 4 10 46 1 2 05db9164 0468d672 66203274 ae51ba90 4cf72387 fbad5c96 22fd2464 0b153874 a73ee510 7d8a9f62 d9085127 6606305d ef7e2c01 07d13a8f a888f201 0d38dc27 e5ba7672 9880032b 21ddcdc9 5840adea 0e14bf61 32c7478e 1487d019 ea9a246c 4e7af834 200 | 0 90 2 3749 0 19 30 0 68fd1e64 6c713117 6d0ceb43 8d164e53 384874ce 
fbad5c96 b328dad0 0b153874 a73ee510 000e2f4b c2e51649 9fa694f3 cfbf9eaa b28479f6 73b98472 a54711b4 e5ba7672 bf6b118a 21ddcdc9 b1252a9d 78766d37 c9d4222a c7dc6720 9e0bee34 445bbe3b f0d381cf 201 | 0 0 1 1 1956 69 3 12 41 0 1 8cf07265 403ea497 2cbec47f 3e2bfbda 25c83c98 bdf75bb3 0b153874 a73ee510 398f1dda 27465d16 21a23bfe db4c1ae9 07d13a8f e3209fc2 587267a3 07c540c4 a78bd508 21ddcdc9 5840adea c2a93b37 3a171ecb 1793a828 e8b83407 2fede552 202 | 1 3 3 3 5240 48 3 16 47 1 3 5bfa8ab5 b06f9574 48a29f29 85bbe3d4 25c83c98 7e0ccccf d9268037 1296137f a73ee510 9474ac23 14515151 a90d7d9f 2342c398 cfef1c29 5f1cd77c dfd505b1 07c540c4 f7be65bb 21ddcdc9 b1252a9d a1fe1c97 8ec974f4 423fab69 c67ffc42 001f3601 c84ddabc 203 | 0 0 0 1 2898 7 3 1 35 0 1 05db9164 68b3edbf ad4b77ff d16679b9 25c83c98 9f525672 5b392875 a73ee510 c731d31c 843d8639 a2f4e8b5 9cab1003 b28479f6 f511c49f 89052618 e5ba7672 752d8b8a d4703ebd bcdee96c aee52b6f 204 | 0 0 1697 10 3 79884 6022 0 17 501 0 0 10 5bfa8ab5 38a947a1 42a04a35 7031bb66 4cf72387 6f6d9be8 1de2c2da 6c41e35e 7cc72ec2 808e1dff d407af40 7127a7d6 a05d649e b28479f6 bcf15323 51a8930b d4bb7bd8 2a5c1ed3 53cb3c2b 32c7478e b258af68 205 | 0 0 1 3 47 31678 600 1 47 173 0 1 47 5a9ed9b0 0a519c5c 02cf9876 c18be181 25c83c98 7e0ccccf fe4e75fa 0b153874 a73ee510 6aea41c7 8f4f8f83 8fe001f4 8828a59c 07d13a8f 4ac81a35 36103458 07c540c4 416e8695 e587c466 ad3062eb 93bad2c0 3b183c5c 206 | 1 2 -1 1059 0 26 2 279 1 7 05db9164 537e899b 5037b88e 9dde01fd 4cf72387 fbad5c96 42695288 5b392875 a73ee510 d3587737 44482217 680d7261 0f3d4e02 07d13a8f 14be02cc c0673b44 e5ba7672 65979fb7 e049c839 ad3062eb 3a171ecb 6095f986 207 | 0 2 1 94 11 21 1 2 12 10 1 1 1 68fd1e64 4f25e98b 2079d226 cd92d9d5 43b19349 fbad5c96 b3ddf65a 51d76abe a73ee510 bde51b15 e973bfd7 bf8491f3 439cd4cc 07d13a8f dfab705f a021e18c e5ba7672 7ef5affa 21ddcdc9 a458ea53 3089c312 ad3062eb c7dc6720 04936173 e8b83407 c9b738a7 208 | 1 0 181 32 19 9250 783 24 37 614 0 4 32 05db9164 421b43cd 8aa6cae5 29998ed1 25c83c98 7e0ccccf 
6fb62f1a 0b153874 a73ee510 efea433b e51ddf94 6aaba33c 3516f6e6 b28479f6 2d0bb053 b041b04a e5ba7672 2804effd 723b4dfd 32c7478e b34f3128 209 | 0 1 0 2 3 46 3 1 3 3 1 1 3 05db9164 80e26c9b cdb9e6df 80d8555a 4cf72387 7e0ccccf 26a81064 0b153874 a73ee510 768fb82c 9e511730 2cd99b9b 04e4a7e0 64c94865 df97efd0 146a70fd d4bb7bd8 f54016b9 21ddcdc9 b1252a9d cbec39db 3a171ecb cedad179 e8b83407 dbd4e512 210 | 0 1 2 9714 194 1 5 19 1 3 05db9164 80e26c9b e346a5fd 85dd697c f3474129 fe6b92e5 28cd7a0b 0b153874 a73ee510 3b08e48b 7158b034 539c5644 67414029 1adce6ef 0f942372 aafa191e 27c07bd6 005c6740 21ddcdc9 5840adea 7e5b7cc4 32c7478e 1793a828 e8b83407 b9809574 211 | 0 2 362 17 2 47 9 23 2 137 2 7 4 2 5a9ed9b0 85af3139 d032c263 c18be181 25c83c98 7e0ccccf b8a6d129 0b153874 a73ee510 6c47047a 2c3edb8b dfbb09fb 03d0ba0e b28479f6 af8db00e 84898b2a 8efede7f d4328054 0014c32a 3a171ecb 3b183c5c 212 | 0 1 13 75 22 67 52 1 25 24 1 1 24 05db9164 4f25e98b 04c3d10d 649ff81e 43b19349 7d733ece 0b153874 a73ee510 01281f02 30b2a438 504fb83b aebdb575 1adce6ef 17d9b759 dddbb8b0 07c540c4 7ef5affa 5fd56cf9 a458ea53 2424784a bcdee96c da008840 001f3601 ccd1f8e0 213 | 0 8 8 8 42838 0 21 39 0 8 5a9ed9b0 00ac063c f483d069 a089e42f 4cf72387 dc63c936 0b153874 a73ee510 3b08e48b 53b6a492 8287dc29 d1019a93 b28479f6 bc566d75 ddfabc04 776ce399 15a4e6dd 21ddcdc9 5840adea 87e212d4 be7c41b4 c2fe6ca4 001f3601 602f0609 214 | 1 17 1 2 2 1323 2 17 3 2 1 1 2 5952476a 2ae0a573 e608d148 3095f5c8 25c83c98 7e0ccccf 36a43a36 0b153874 a73ee510 3b08e48b 7b9241f1 8b4f5089 64e1ea33 07d13a8f 413cc8c6 7568f52e e5ba7672 f2fc99b1 ce653524 32c7478e 46485de8 215 | 0 0 0 19 3 2897 133 2 6 55 0 1 3 05db9164 4f25e98b 5aa574c1 a136ae32 25c83c98 fbad5c96 1478592d 0b153874 a73ee510 16a6e853 a89c45cb 2a745035 a4fafa5b b28479f6 df2f73e9 59400ee9 07c540c4 bc5a0ff7 9437f62f a458ea53 035a30f3 ad3062eb 32c7478e eb251029 001f3601 8a657198 216 | 1 1 13 1 1 869 6 127 25 654 1 25 1 1 17f69355 064c8f31 dc0d65af 55c7c029 26eb6185 fbad5c96 3c2cc096 0b153874 
a73ee510 82c227f2 2a6cc1e3 c5d0e605 b029ebab 1adce6ef 323bf482 7b274b5b 27c07bd6 795b8402 21ddcdc9 b1252a9d 6cce8b6e 423fab69 4997082e 001f3601 2fede552 217 | 0 6 3 1 31946 0 1 1 0 0 1 be589b51 fc67db1d 7fd95270 d3c8a2de 25c83c98 3bf701e7 85c9b6a4 5b392875 7cc72ec2 c9e11adf 1be1371e 08b7730a 76b6f478 64c94865 53175e74 493452ea 27c07bd6 e24773fa 0bf95501 3a171ecb be673bbf 218 | 0 2 53 1 1 1425 5 4 6 8 0 1 1 05db9164 e3ce8d54 25c83c98 7e0ccccf 90a2c015 51d76abe a73ee510 89f9878d 0f736a0c 3af886ff d2dfe871 4b0401e8 e5ba7672 d9942b4c ad3062eb 3a171ecb 219 | 0 186 11 9 35922 149 1 9 149 0 9 05db9164 287130e0 9fb47509 9ad60ebf 25c83c98 f2916259 1f89b562 a73ee510 dae28d52 3e60f064 ee5316d5 ce08fcbf 64c94865 f8526149 6dee9e76 07c540c4 891589e7 0053530c a458ea53 87aa71d6 32c7478e a6e96657 ea9a246c 3dddc622 220 | 0 0 4 34 8 8566 241 6 9 48 0 1 8 05db9164 8ab240be 02bd7bb3 b4b00886 25c83c98 7e0ccccf 938d6619 0b153874 a73ee510 c510044d e6c92dd9 a9ecf335 16659efe 1adce6ef 28883800 dc1edaf3 e5ba7672 ca533012 21ddcdc9 5840adea ad69ce75 32c7478e 6e311859 e8b83407 9f6a34e7 221 | 0 154 1 6 2544 11 161 6 214 3 6 68fd1e64 8b57fabc de79547d e8930818 25c83c98 fe6b92e5 73eca724 0b153874 a73ee510 ea23b002 f66047e5 c1b177b0 13c89cc4 b28479f6 f6c3c685 601256eb 3486227d 6695336c 03b323fb bcdee96c 382cd1f1 222 | 1 4 4 7 5 5 7 4 6 5 1 1 5 41edac3d 4f25e98b 206a933c 10a8cee8 25c83c98 fe6b92e5 9b98e9fc 5b392875 a73ee510 547c0ffe 7f8ffe57 9a117b9a 46f42a63 64c94865 40e29d2a 38357a38 e5ba7672 7ef5affa 4b1cf52e a458ea53 e14d0325 c9d4222a 32c7478e df739657 001f3601 1954e885 223 | 0 2 1 14267 17 14 2 47 2 0 f473b8dc 58e67aaf 4e5ec66e ecb7ef0c 4cf72387 964d1fdd 5b392875 a73ee510 7d48ae56 59cd5ae7 36b22bfa 8b216f7b b28479f6 62eca3c0 c8fd56f0 27c07bd6 c21c3e4c e3b5ceb7 a458ea53 0f16f783 32c7478e a052b1ed 9b3e8820 64cb0863 224 | 0 5 4 6 9 1326 52 7 46 250 1 2 0 9 05db9164 38a947a1 8482926c 68012d41 43b19349 7e0ccccf cd846c62 ebf5afb4 a73ee510 3b08e48b e9f63a4d 5c2dcede 44af41ef 07d13a8f 75b12e6a 5c705520 
3486227d c08a0c90 888c0b39 3a171ecb b72bb56a 225 | 1 1 5 0 05db9164 8084ee93 d032c263 c18be181 25c83c98 7e0ccccf af0809a5 a674580f 7cc72ec2 3b08e48b 9e12e146 dfbb09fb 025225f2 b28479f6 b2ff8c6b 84898b2a 2005abd1 52e44668 0014c32a be7c41b4 3b183c5c 226 | 0 25 1 3 10 96 10 25 10 10 1 1 0 10 05db9164 38a947a1 07a16775 16b068f6 25c83c98 fe6b92e5 e98b8fca 0b153874 a73ee510 80254878 73413f68 255e43a4 1ffceffc 07d13a8f 4290cfe5 244f1551 e5ba7672 b2e570f5 8fe057a5 c9d4222a 3a171ecb 4904c5a1 227 | 0 3 46 7 4494 17 1 8 9 1 8 8cf07265 1cfdf714 f7b9956b adc7cd76 25c83c98 3bf701e7 7415122a 0b153874 a73ee510 8924e76a 98579192 75587697 779f824b 07d13a8f f775a6d5 e26d4d55 07c540c4 e88ffc9d 5c787ae3 a458ea53 9780a253 ad3062eb 423fab69 bd635379 cb079c2d 90ea1eda 228 | 0 0 11 7 22324 115 10 6 36 8 7 05db9164 9819deea 0e613582 f922efad 25c83c98 13718bbd 968a6688 0b153874 a73ee510 c10eaf60 f25fe7e9 b99ddbc8 dd183b4c b28479f6 1150f5ed 87acb535 e5ba7672 7e32f7a4 a4b7004c 32c7478e b34f3128 229 | 0 1 110 1 30287 1627 0 6 68 0 1 f473b8dc 1cfdf714 5586e849 ace68113 4cf72387 7e0ccccf 6f441cf5 0b153874 7cc72ec2 30040193 1054ae5c 485a241e d7ce3abd 1adce6ef f3002fbd d174d2e6 e5ba7672 e88ffc9d afd8410c a458ea53 1d5407dd ad3062eb bcdee96c e9b5731b cb079c2d 6077e801 230 | 0 0 2 10 1 1632 39 1 6 39 0 1 1 05db9164 80e26c9b f9267cc1 85dd697c 4cf72387 fbad5c96 6951cde4 0b153874 a73ee510 3b08e48b b3bbdaaa e0ed00c3 89334965 07d13a8f e8f4b767 2d0bbe92 d4bb7bd8 005c6740 21ddcdc9 b1252a9d daa7197b ad3062eb 32c7478e 1793a828 e8b83407 9904c656 231 | 0 4 299 24 9 574 77 11 29 2302 1 5 0 17 68fd1e64 38d50e09 7ce53da5 e205af91 25c83c98 7e0ccccf 661e458d 5b392875 a73ee510 dad22fce 34c909fe 250014ce ef9686d6 07d13a8f ee569ce2 fe344361 27c07bd6 582152eb 21ddcdc9 5840adea 3aafc9a9 ad3062eb 423fab69 c825baa9 001f3601 aa5f0a15 232 | 0 1 9 7 13423 187 7 10 216 1 7 05db9164 9b25e48b 5699a6d2 cf413dfc 25c83c98 fbad5c96 f92ced46 9168e94b a73ee510 fc6ac8b3 a803a2ad 08e21a42 12ae6811 1adce6ef cf995508 45cfe3e8 e5ba7672 
7d8c03aa a34d2cf6 5840adea 31a5135b 32c7478e eb8d4cc5 ea9a246c 2ddaef64 233 | 0 0 -1 2786 26 1 2 4 0 1 68fd1e64 09e68b86 5dfa1cbb 46e7e2d7 25c83c98 b87f4a4a 0b153874 a73ee510 e70742b0 319687c9 31d55110 62036f49 64c94865 91126f30 0682997a 07c540c4 5aed7436 bb71621a a458ea53 00191bf0 32c7478e 1dbb3aa6 e8b83407 b3b95afb 234 | 0 1 1 8 6552 28 22 16 27 2 8 8cf07265 0c0567c2 48a9e269 f48c449b 25c83c98 3a6d4c08 5b392875 a73ee510 3bc8bb1a 41656eae bc042ed5 66815d59 07d13a8f aa39dd42 bb2c7f0d e5ba7672 bb983d97 91d39925 32c7478e 996f5a43 235 | 0 0 8 1 28877 469 9 4 55 1 1 87552397 09e68b86 00ab53e4 8e5eaeda 25c83c98 50b436c9 6a698541 a73ee510 0a3a2cb6 a0a5e9d7 65164cdd ee79db7b b28479f6 52baadf5 b81981c7 e5ba7672 5aed7436 a84a01e1 b1252a9d 149bf48c 32c7478e c440aa3c e8b83407 b11e5179 236 | 0 27 59 0 9912 67 3 2 39 1 2 05db9164 d97d4ce8 c09bfc63 e5d4c5ff 25c83c98 fe6b92e5 0e6e482d 0b153874 a73ee510 7092e087 db9587fe 5859f863 9b537a5e 1adce6ef 7c1f836c f327e411 07c540c4 bc6f1d9c 21ddcdc9 a458ea53 65a56438 32c7478e 6c1cdd05 ea9a246c 2a4caf7c 237 | 1 1 13 7 11190 68 1 7 56 1 7 05db9164 6496eea0 b8642174 145a8e60 25c83c98 7e0ccccf 112b9a9c 0b153874 a73ee510 550727c0 81cef21c f4f72f21 48a932f6 07d13a8f 08cec6f8 14cce9a9 d4bb7bd8 fba30a05 21ddcdc9 b1252a9d 061e6cda be7c41b4 0b95f1c5 ea9a246c d4d7b05b 238 | 0 0 1 7 5675 0 4 280 0 25 68fd1e64 3e4b7926 ad76f339 16fe249c 25c83c98 fe6b92e5 df1c3c63 0b153874 a73ee510 3b08e48b c3516644 79e8482a 9f24464b 07d13a8f e6863a8e 405a2e1c 776ce399 e261f8d8 21ddcdc9 a458ea53 9bef9249 3a171ecb 1793a828 f7839e21 d25a68df 239 | 1 0 85 6 5 580 582 1 26 110 0 1 5 68fd1e64 bce95927 02391f51 b9c629a9 25c83c98 7e0ccccf 136f2270 1f89b562 a73ee510 05c23ec7 7bbe6c06 2397259a ea1f21b7 07d13a8f fec218c0 d37efe8c 3486227d 04d863d5 21ddcdc9 5840adea b6119319 c9d4222a 423fab69 45ab94c8 e8b83407 b13f4ade 240 | 1 1 0 3 4 159 7 4 22 97 1 2 4 05db9164 f0cf0024 5d3a12c5 ee36060f 43b19349 fe6b92e5 98f4456a 0b153874 a73ee510 672660cf 5b225578 03afd96d d1be539d 1adce6ef 
99b06d74 fab4dda8 e5ba7672 88b0e440 21ddcdc9 5840adea f6e47def 32c7478e 6c1cdd05 ea9a246c c5116a1c 241 | 0 0 2 766 10 5388 192 1 9 11 0 1 10 5a9ed9b0 9f7e1d07 d7cd1ddc 820d1556 25c83c98 d459fe47 0b153874 a73ee510 713e3670 d059cd92 9eac3074 552e5180 1adce6ef df3426f3 55300130 d4bb7bd8 6a58e423 21ddcdc9 5840adea 233efd24 32c7478e f1b99840 ea9a246c e7ecb821 242 | 1 14 20 1 562137 1 1 05db9164 b06f9574 17f7f08a fefe54dc 4cf72387 7e0ccccf aa1c94e4 0b153874 7cc72ec2 03e48276 2b9c7071 73c0ce33 1aa94af3 1adce6ef b62ec7c9 4e9e2d54 3486227d e9a3d86d 21ddcdc9 b1252a9d 80be3d43 be7c41b4 8921c1da e8b83407 9b26b69e 243 | 0 18 36 34 2920 49 65 38 181 5 1 34 05db9164 89ddfee8 4c582c78 186eba2c 25c83c98 fbad5c96 c96de117 37e4aa92 a73ee510 a9966b7b ad757a5a 1a54862c 93b18cb5 07d13a8f 4df3da6b 91568f56 27c07bd6 5bb2ec8e 0053530c a458ea53 24a520b4 c7dc6720 e1523a9d f0f449dd 991e522d 244 | 0 55 1 1 285017 0 3 1 0 1 be589b51 38a947a1 9abba94d 6a14f9b9 25c83c98 fbad5c96 27c3579d 0b153874 7cc72ec2 3b08e48b 5368b8e5 59939db5 6b8887cc 07d13a8f 5bcd6532 f8b34416 d4bb7bd8 c9ac134a f3ddd519 3a171ecb b34f3128 245 | 0 0 1 1 2 6785 260 7 16 557 0 2 2 05db9164 d833535f 77f2f2e5 d16679b9 25c83c98 7e0ccccf a012896e 0b153874 a73ee510 54e99be5 2d6f299a 9f32b866 f0e0f335 07d13a8f 2e7bc615 31ca40b6 e5ba7672 7b49e3d2 dfcfc3fa 32c7478e aee52b6f 246 | 0 18 2 2 450195 0 6 2 0 2 05db9164 8084ee93 d032c263 c18be181 25c83c98 7e0ccccf 484cb3fb 0b153874 7cc72ec2 3b08e48b 3d5fb018 dfbb09fb 94172618 b28479f6 16d2748c 84898b2a e5ba7672 003d4f4f 0014c32a 3a171ecb 3b183c5c 247 | 0 3 89 21 11 242 2 24 148 1 14 be589b51 71ca0a25 4d564a1c c3dd08e6 25c83c98 7e0ccccf 1347b5cf 0b153874 a73ee510 3b08e48b 2efe2214 ea734b1a 4350b107 07d13a8f a8e0f0c6 db0cf8e1 e5ba7672 9bf8ffef 21ddcdc9 5840adea 8d9a161e 32c7478e b34f3128 445bbe3b 984e0db0 248 | 1 0 2 2 2 2736 131 1 25 330 0 1 2 be589b51 270cc1b8 82be0f99 f922efad 25c83c98 7e0ccccf 6ccbd984 5b392875 a73ee510 3b08e48b 149238ed 90a33af5 decd9980 1adce6ef 732eea45 e2e2fcd9 
d4bb7bd8 86b4c7aa 9ae3e892 93bad2c0 b34f3128 249 | 0 5 0 9 8 1126 45 230 11 72 1 16 0 8 8cf07265 c1384774 850da519 202fdeac 4ea20c7d fe6b92e5 a4ca1fb6 0b153874 a73ee510 3a523fc8 4a31f431 d9abdf34 740c210d cfef1c29 b1b648b9 e9fdbb77 27c07bd6 5cd35b65 21ddcdc9 b1252a9d cabe4591 423fab69 72669222 ea9a246c 6f3c2551 250 | 0 2 34 37 0 283 36 41 38 76 1 8 2 17f69355 33728ce9 4b0ad917 4e353d3a 25c83c98 fe6b92e5 ee47b323 1f89b562 a73ee510 3b08e48b c1bfba9c ce2957ad 44af41ef b28479f6 468d6259 5f704016 e5ba7672 07070d63 21ddcdc9 5840adea e5195a68 3a171ecb 9be5c7a4 2bf691b1 2fede552 251 | 1 0 10 3 3 1509 30 1 32 88 0 1 3 8cf07265 421b43cd 40d6330c 29998ed1 4cf72387 7e0ccccf ca280131 1f89b562 a73ee510 3b08e48b 868a9e47 6aaba33c fc5dea81 b28479f6 e1ac77f7 b041b04a e5ba7672 2804effd 723b4dfd bcdee96c b34f3128 252 | 0 14 1678 2 1285 71 17 43 74 0 1 0 2 05db9164 4f25e98b 02eb682b 175186ab 4cf72387 fbad5c96 ade953a9 062b5529 a73ee510 4072f40f 29e4ad33 9b33da7f 80467802 f862f261 4595ddb7 ce24abf6 e5ba7672 7ef5affa 712d530c b1252a9d 365420f4 32c7478e 93aed340 001f3601 92baee4b 253 | 0 0 1 1 2612 39 4 9 5 0 1 05db9164 04e09220 b1ecc6c4 5dff9b29 25c83c98 fe6b92e5 8f572b5e 37e4aa92 a73ee510 1452b512 434d6c13 2436ff75 7301027a 07d13a8f cae64906 f4ead43c e5ba7672 e161d23a 4f1aa25f 55dd3565 ded4aac9 254 | 0 4 303 8 20 37 20 4 41 41 3 3 1 20 68fd1e64 38d50e09 c86b2d8d 657dc3b9 25c83c98 7e0ccccf f18d89c7 0b153874 a73ee510 fcd76f02 910afbbb 1ca7a526 bd727667 b28479f6 06373944 ba46c3a1 3486227d fffe2a63 21ddcdc9 b1252a9d eb0fc6f8 32c7478e df487a73 001f3601 c27f155b 255 | 0 1 38 1 1 0 1 1 0 1 68fd1e64 287130e0 484efd3b b09fbd40 307e775a 7e0ccccf c31847f5 0b153874 a73ee510 3b08e48b a12fca95 1ea22877 9b9e44d2 07d13a8f 10040656 41b2fd79 776ce399 891589e7 b04164d1 b1252a9d 274eaf90 be7c41b4 762d01e1 ea9a246c fd6ccd1e 256 | 1 28 0 7 2 46 0 191 39 709 5 29 1 0 68fd1e64 876465ad ffe810c0 d2941349 25c83c98 fbad5c96 744033ad 0b153874 a73ee510 3b08e48b a04db730 63470841 c66b30f8 07d13a8f e23edeb4 1478197a 
3486227d 08154af3 21ddcdc9 5840adea 58cd488e c9d4222a 32c7478e 3b425be9 f55c04b6 2fede552 257 | 1 5 2 1 5 99 5 5 5 5 3 3 1 5 05db9164 bccb7a1a a07196a2 00b3f0d8 4cf72387 6f6d9be8 d4ab9344 1f89b562 a73ee510 67cebf16 2bcfb78f 549524c2 e6fc496d cfef1c29 c99f9716 ee293b0f 3486227d d51975d7 21ddcdc9 5840adea 15f80464 32c7478e 340d03c3 e8b83407 96911ece 258 | 0 39 1 1 145614 0 2 1 0 1 68fd1e64 38a947a1 2d13554b ee04a671 25c83c98 7e0ccccf bd6afa2b 5b392875 7cc72ec2 b72eaafe c1ee56d0 9a7f26fa ebd756bd b28479f6 bcf15323 a3b72634 d4bb7bd8 2a5c1ed3 ad32a9e4 3a171ecb b258af68 259 | 1 13 342 1 3 1169 3 15 3 3 0 1 1 3 68fd1e64 fc67db1d bfb712f9 4e0c8817 25c83c98 fbad5c96 80d2263e 0b153874 a73ee510 eff5602f 9d7e66c3 c76d4dd3 df957573 1adce6ef a89936e1 8f24e8fd 3486227d 5e5b3998 51e4e855 ad3062eb 423fab69 be673bbf 260 | 1 0 1 1 4213 56 2 0 0 0 1 a03bd830 38a947a1 179fb479 d1dd3326 307e775a 7e0ccccf c62d54af c8ddd494 a73ee510 985ac50f d0069af4 feec2593 e920b070 64c94865 cfe46459 923d561f 07c540c4 e150b589 89c4021c c9d4222a 3a171ecb c3e08d94 261 | 0 90 6 1 8 0 1 1 0 1 05db9164 207b2d81 dc4a19de 65d430f4 25c83c98 fe6b92e5 22ff0182 062b5529 a73ee510 3b08e48b 48e01e3c b16b9881 613b2c28 b28479f6 899da9d5 540d868b 776ce399 25c88e42 21ddcdc9 b1252a9d b5ff8ad8 c9d4222a 32c7478e 1815fb74 001f3601 1bdbe1df 262 | 0 1 4 5 8 751 30 1 26 26 1 1 8 68fd1e64 39dfaa0d b00d1501 d16679b9 25c83c98 7e0ccccf 6641b1eb 5b392875 a73ee510 3b08e48b 52105669 e0d76380 d2b8af4a b28479f6 2223bbe1 1203a270 d4bb7bd8 df4fffb7 21ddcdc9 5840adea 73d06dde be7c41b4 aee52b6f 010f6491 cfd96da1 263 | 0 0 2 2 7 1550 9 8 7 19 0 3 0 7 05db9164 e3ce8d54 a763a2ac c57235ae 25c83c98 fbad5c96 7195046d 51d76abe a73ee510 acccca1c 4d8549da 8a11f111 51b97b8f d2dfe871 4b0401e8 d2173eba 27c07bd6 d9942b4c 2ce387f5 3a171ecb e47759f6 264 | 0 0 132 2 0 6 7 0 5 5a9ed9b0 207b2d81 74e1a23a 9a6888fb 25c83c98 7e0ccccf 1af5fecb 5b392875 7cc72ec2 3b08e48b 2e3884d7 fb8fab62 9c7a975e b28479f6 231f3923 c6b1e1b2 2005abd1 25935396 21ddcdc9 5840adea 
99c09e97 3a171ecb 335a6a1e 001f3601 877c5de5 265 | 0 1 9 3 1 2 1 143 1 470 1 76 1 05db9164 8947f767 100a3803 ad1b5124 30903e74 13718bbd afa309bd 0b153874 a73ee510 ce32fa82 77212bd7 d377c333 7203f04e b28479f6 a473257f 68d2c2b9 e5ba7672 bd17c3da e51f040f a458ea53 79c3f011 c7dc6720 fe35ffe2 010f6491 987ea0be 266 | 0 1586 14 12 13511 160 5 16 24 1 12 68fd1e64 38a947a1 26ae3a27 fe76ec5b 25c83c98 fbad5c96 dc7659bd 0b153874 a73ee510 03e48276 e51ddf94 64938e6a 3516f6e6 07d13a8f 86870efc a16ff857 e5ba7672 c0f46015 f473e44e ad3062eb 3a171ecb 64536fec 267 | 0 3 1 12 10 1193 72 6 34 143 1 2 10 68fd1e64 09e68b86 eb677595 29a1571c 25c83c98 7e0ccccf 622305e6 0b153874 a73ee510 e70742b0 319687c9 7f25e652 62036f49 07d13a8f 36721ddc 7ccbf225 e5ba7672 5aed7436 dbc91356 b1252a9d 065906f0 3a171ecb 89fa735f e8b83407 e0671437 268 | 0 2 28 9 1929 0 24 52 0 18 68fd1e64 c1384774 b00d1501 d16679b9 25c83c98 fbad5c96 05254e29 5b392875 a73ee510 3b08e48b 26800aa4 e0d76380 434b8eb7 07d13a8f 022c81dc 1203a270 776ce399 8e8b535e 21ddcdc9 b1252a9d 73d06dde be7c41b4 aee52b6f ea9a246c a4e8e846 269 | 0 4 1 358570 0 1 1 0 1 05db9164 38d50e09 01a0648b 657dc3b9 25c83c98 fbad5c96 4b3c7cfe 1f89b562 7cc72ec2 88a43e6d 8b94178b 11fcf7fa 025225f2 07d13a8f fa321567 5e1b6b9d e5ba7672 52b872ed 21ddcdc9 a458ea53 bfeb50f6 c9d4222a 32c7478e df487a73 001f3601 c27f155b 270 | 0 0 6 6 6 1464 56 2 42 56 0 1 6 05db9164 e5fb1af3 82f183af 0b0afa22 30903e74 fbad5c96 372a0c4c 5b392875 a73ee510 757f1081 2e15139e e01cd7c9 94881fc3 f862f261 2a079683 f727f8ff 07c540c4 13145934 7c629f16 a458ea53 9f9e7184 55dd3565 af656e5b 3a6f6b59 625b1867 271 | 0 0 38 29 19 13247 209 4 25 481 0 3 0 19 05db9164 39dfaa0d 396df967 328b42c3 30903e74 7e0ccccf 9b953fd9 5b392875 a73ee510 3b08e48b 566a7713 8481d649 e325e0dd b28479f6 2223bbe1 b2f2a0c7 3486227d df4fffb7 21ddcdc9 5840adea f0bb1194 32c7478e a5ce2d0d 010f6491 984e0db0 272 | 0 27 19 6 10524 5 6 f473b8dc 38d50e09 25c83c98 fe6b92e5 22ff0182 5b392875 a73ee510 189460b0 48e01e3c 613b2c28 07d13a8f 
ee569ce2 07c540c4 582152eb 21ddcdc9 5840adea 32c7478e 001f3601 56be3401 273 | 0 0 15 1 2783 33 1 16 27 0 1 1 05db9164 b26462db dad8b3db 06b1cf6e 25c83c98 fbad5c96 e0835d8c 5b392875 a73ee510 3b08e48b a35ba7a0 422e8212 628738d3 07d13a8f 72fbc65c 25b075e4 d4bb7bd8 35ee3e9e a13bd40d be7c41b4 0ff91809 274 | 0 67 2 73411 759 0 3 58 0 2 05db9164 b961056b b267c0d0 a2fc2c1b 25c83c98 fbad5c96 f1be50b4 25239412 7cc72ec2 8fda7cf8 b1a5e8a6 66638e27 b017b046 1adce6ef 8187184a cf28e926 d4bb7bd8 5a6878f5 72e08cf6 3a171ecb 540a7116 275 | 0 3 77 23 10 84 13 3 11 11 1 1 11 5a9ed9b0 09e68b86 aa8c1539 85dd697c 25c83c98 89391314 1f89b562 a73ee510 65089c81 608452cc d8c29807 cbb8fa8b 8ceecbc8 d2f03b75 c64d548f d4bb7bd8 63cdbb21 21ddcdc9 5840adea 5f957280 32c7478e 1793a828 e8b83407 b7d9c3bc 276 | 0 13 336 5 2 269 20 14 15 21 1 2 0 3 05db9164 86d4fccc b009d929 c7043c4b 25c83c98 fbad5c96 1771cc97 0b153874 a73ee510 efea433b 0983d89c 3563ab62 1aa94af3 1adce6ef b27dd6c7 b688c8cc e5ba7672 be645006 21ddcdc9 5840adea 2754aaf1 ad3062eb 3a171ecb 3b183c5c 001f3601 aa86a675 277 | 1 12 -1 269 0 24 0 69 1 3 05db9164 4950f9dd b5bc5d62 ad20b269 43b19349 fe6b92e5 938d6619 51d76abe a73ee510 c510044d e6c92dd9 07eda2ad 16659efe 07d13a8f dd42a670 1b6d3536 e5ba7672 45e764d2 21ddcdc9 b1252a9d 19191d02 32c7478e a5725134 445bbe3b 2fede552 278 | 0 6 2 5 5 1 5 85 5 21 1 8 0 5 68fd1e64 1cfdf714 cd6da797 86b5f635 25c83c98 fbad5c96 fb056459 37e4aa92 a73ee510 8a8cd8fb dd542e6d 171a7823 779f824b 07d13a8f f775a6d5 40f8cbdd 8efede7f e88ffc9d 9b19e0d9 a458ea53 690adc37 423fab69 f14d6396 cb079c2d b9baa3d9 279 | 0 0 -1 597 13 18 4 57 0 6 3 5bfa8ab5 b56822db 7da86e4b b733e495 4cf72387 fe6b92e5 b45db245 0b153874 a73ee510 d87d491f aadb87b9 ed397d6b e9332a03 b28479f6 a9d1ba1a 056d8866 27c07bd6 38dce391 21ddcdc9 b1252a9d deaf6b52 3a171ecb d9556584 001f3601 6c27a535 280 | 0 0 10 48355 6 ae82ea21 95e2d337 0a1435c1 bdcfffba 25c83c98 13718bbd 4b46e434 5b392875 7cc72ec2 bfa3bdc4 95eaf7a0 5a276398 ccfd4002 07d13a8f 8ebf193d 4da40ea2 
e5ba7672 e1b6ea80 21ddcdc9 5840adea 290c14f6 32c7478e ded4aac9 2bf691b1 bdf46dce 281 | 1 0 5 2 21601 103 1 0 0 0 1 68fd1e64 39dfaa0d 1f5dca21 938875ca 25c83c98 7e0ccccf 8324f342 0b153874 a73ee510 3b08e48b 10465598 4c2d07ae f8362c26 07d13a8f 60fa10e5 444e37fd d4bb7bd8 df4fffb7 21ddcdc9 5840adea 24025110 32c7478e a90c7c8c 010f6491 13e48a90 282 | 0 0 11 393 0 0 1 0 05db9164 287130e0 0fffb4b1 994033a1 384874ce f4ae27b8 37e4aa92 a73ee510 dc790dda e3205ff0 ce197608 b688506c 1adce6ef 6a805a0e e6b81728 e5ba7672 acd948bb d9aa05dc 5840adea 11c700df 32c7478e c673354c ea9a246c fffa8e76 283 | 1 0 10 2704 0 8 12 0 68fd1e64 d4be07ad c426f824 36f6f194 43b19349 13718bbd db8511f6 0b153874 a73ee510 663eefea c1ee56d0 4c434508 ebd756bd b28479f6 98fca9df 46973e83 e5ba7672 cbae5931 0cc116d5 a458ea53 26e618c9 32c7478e b2f178a3 001f3601 938732a0 284 | 1 23 2 2 61 0 2 2 0 2 68fd1e64 2c8c5f5d 270bb7c1 079955f5 25c83c98 fbad5c96 d6bf922f 0b153874 a73ee510 3b08e48b 6af26531 bad986f0 5fcbc3ef 07d13a8f 67958662 93bbc93a 776ce399 1179d22c cf2fcf40 c9d4222a 3a171ecb 1793a828 285 | 0 2 0 365 22 2 12 2 25 23 1 1 12 5a9ed9b0 38d50e09 7a68aab4 9db85f7e 25c83c98 7e0ccccf a0f3f4b3 0b153874 a73ee510 3c982956 2b31063f a799dd01 bcf6a386 07d13a8f ee569ce2 5e2f0151 e5ba7672 582152eb 21ddcdc9 5840adea bb9b969b 32c7478e 379d044f 001f3601 d67a6f5b 286 | 0 12 11 6 424870 0 10 6 0 6 05db9164 0468d672 322e2c43 b52edb8e 4cf72387 7e0ccccf 100f342c cb69809d 7cc72ec2 6f3d6efc 3d566bf6 37ea1dc9 c4b75451 1adce6ef 4f3b3616 6be69da3 07c540c4 9880032b 21ddcdc9 5840adea 500dfcdd 423fab69 5695c26c ea9a246c aa5f0a15 287 | 0 0 39 10 1 8896 109 1 1 14 0 1 1 5a9ed9b0 c1384774 24c93e37 d772d0ec 4cf72387 fbad5c96 2a8c42b0 1f89b562 a73ee510 3b08e48b 7fd08581 5662d3e8 f9cef5cc 1adce6ef d82fb770 3f6a5fd0 776ce399 658dca4c 21ddcdc9 b1252a9d 3ba1c760 32c7478e ecc32110 ea9a246c 1f6efbe9 288 | 1 2 9 32 19 10 17 2 26 22 2 2 17 05db9164 80e26c9b d1b1bd5b 85dd697c 30903e74 fbad5c96 124131fa a61cc0ef a73ee510 28c6ef79 9ba53fcc 38e8a2f3 
42156eb4 07d13a8f e8f4b767 2d0bbe92 e5ba7672 005c6740 21ddcdc9 5840adea edb51f3c c9d4222a 3a171ecb 1793a828 e8b83407 b9809574 289 | 0 2 8 10 16875 0 37 273 0 0 10 8cf07265 ad4527a2 c02372d0 d34ebbaa 25c83c98 7e0ccccf fd78c7c1 5b392875 a73ee510 3b08e48b b73fc93f 14d63538 570a3ead 07d13a8f f9d1382e b00d3dc9 3486227d cdfa8259 20062612 be7c41b4 1b256e61 290 | 0 9 16 13 9681 67 1 17 19 1 13 05db9164 421b43cd e14a0b7d 29998ed1 25c83c98 fbad5c96 dc7659bd 062b5529 a73ee510 efea433b e51ddf94 6aaba33c 3516f6e6 b28479f6 2d0bb053 b041b04a 27c07bd6 2804effd 723b4dfd ad3062eb 3a171ecb b34f3128 291 | 0 26 9 20 54 21 31 57 38 2302 3 6 0 31 05db9164 0b8e9caf b34374a6 748540db 25c83c98 7e0ccccf ac2d4799 062b5529 a73ee510 ce92c282 434d6c13 3f300c89 7301027a b28479f6 7d4df900 30235be8 27c07bd6 ca6a63cf 3c2b5ca0 ad3062eb c7dc6720 08b0ce98 292 | 0 0 1 3 18802 2 0 05db9164 207b2d81 0903cffe 12f4ddf4 25c83c98 b5e1898d 1f89b562 a73ee510 3b08e48b 5d1b7285 e6095f68 cb2e33ed b28479f6 899da9d5 6619471b 07c540c4 25c88e42 21ddcdc9 a458ea53 52e653bd be7c41b4 8539ffbf 001f3601 c38250b6 293 | 0 7 -1 1 4 1 1 33 8 12 2 13 0 0 05db9164 5b7b33dc 25c83c98 7e0ccccf 5cc3d947 0b153874 a73ee510 d5d892f4 da89cb9b 165642be 64c94865 5a9e81d6 e5ba7672 3cbc29b4 423fab69 294 | 0 60 36 7 5304 16 1 7 10 1 7 5bfa8ab5 73a46ff0 9115d05d 103cb1e2 25c83c98 95c3fea9 5b392875 a73ee510 5c7c893d b9ec9192 19dcc6cc df5886ca b28479f6 4f648a87 7a6f6a8e e5ba7672 da507f45 21ddcdc9 b1252a9d a45f6cbf 32c7478e 293634f8 ea9a246c f0a1eacc 295 | 0 0 1 18 3 1728 66 7 15 136 0 1 3 5a9ed9b0 09e68b86 aa8c1539 85dd697c 25c83c98 13718bbd f1ff45d6 0b153874 a73ee510 9a1250bd acc758fc d8c29807 9bbdb8bd 8ceecbc8 d2f03b75 c64d548f e5ba7672 63cdbb21 cf99e5de 5840adea 5f957280 3a171ecb 1793a828 e8b83407 b7d9c3bc 296 | 0 1 1 4502 4 7 5 140 2 9a89b36c 2705da39 2240a2e4 20716e3a 25c83c98 fbad5c96 d2d741ca 1f89b562 a73ee510 d61cc293 ea4adb47 5524ab2a 05781932 64c94865 dbf47116 62135136 e5ba7672 66c3058a 7e7f04b8 3a171ecb 043ce596 297 | 1 0 4 1 5781 5 1 
1 5 1 1 05db9164 09e68b86 6ad141cb 6a18d06e 25c83c98 fe6b92e5 a255dd63 0b153874 a73ee510 4c1928d3 d13e1160 23132a94 45820f61 1adce6ef dbc5e126 4885b757 07c540c4 5aed7436 47e67bca a458ea53 6f274da6 78e2e389 3a171ecb f0c37e57 e8b83407 7b272ffa 298 | 0 1021 2 9489 311 1 24 75 1 24 68fd1e64 8084ee93 d032c263 c18be181 4cf72387 7e0ccccf 44fb02c7 1f89b562 a73ee510 48317e70 2386466b dfbb09fb 45db6793 b28479f6 16d2748c 84898b2a d4bb7bd8 003d4f4f 0014c32a 3a171ecb 3b183c5c 299 | 0 0 1 3 11531 60 16 0 141 0 2 0 05db9164 cc8e236e f609f039 2858b104 25c83c98 fbad5c96 1919941b 0b153874 a73ee510 6c47047a 5fb649d8 c1248678 2ecea536 07d13a8f 3a8c68b7 704ee0fb 8efede7f 775e80fe 21ddcdc9 5840adea 08cbbdc8 3a171ecb d3b901a5 e8b83407 2fede552 300 | 0 1 1 1 19909 0 3 1 0 1 05db9164 ae46a29d 8108298e f922efad 4cf72387 61f2f170 0b153874 a73ee510 5612701e a72ac67d 66a76a26 c0d12152 1adce6ef c94a9d2c 01adbab4 e5ba7672 ab194a92 21c9516a 32c7478e b34f3128 301 | -------------------------------------------------------------------------------- /Models/KimsLogReg2.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# W261 Final Project\n", 8 | "\n", 9 | "#### *Anusha Munjuluri, Arvindh Ganesan, Kim Vignola, Christina Papadimitriou*" 10 | ] 11 | }, 12 | { 13 | "cell_type": "markdown", 14 | "metadata": {}, 15 | "source": [ 16 | "### Notebook Set-up" 17 | ] 18 | }, 19 | { 20 | "cell_type": "code", 21 | "execution_count": 1, 22 | "metadata": {}, 23 | "outputs": [], 24 | "source": [ 25 | "# imports\n", 26 | "import re\n", 27 | "import time\n", 28 | "import numpy as np\n", 29 | "import pandas as pd\n", 30 | "import seaborn as sns\n", 31 | "import matplotlib.pyplot as plt" 32 | ] 33 | }, 34 | { 35 | "cell_type": "code", 36 | "execution_count": 2, 37 | "metadata": {}, 38 | "outputs": [], 39 | "source": [ 40 | "# store path to notebook\n", 41 | "PWD = !pwd\n", 42 | "PWD = 
PWD[0]" 43 | ] 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": 3, 48 | "metadata": {}, 49 | "outputs": [], 50 | "source": [ 51 | "# start Spark Session\n", 52 | "from pyspark.sql import SparkSession\n", 53 | "app_name = \"final_project\"\n", 54 | "master = \"local[*]\"\n", 55 | "spark = SparkSession\\\n", 56 | " .builder\\\n", 57 | " .appName(app_name)\\\n", 58 | " .master(master)\\\n", 59 | " .getOrCreate()\n", 60 | "sc = spark.sparkContext" 61 | ] 62 | }, 63 | { 64 | "cell_type": "markdown", 65 | "metadata": {}, 66 | "source": [ 67 | "## 1. Question Formulation" 68 | ] 69 | }, 70 | { 71 | "cell_type": "markdown", 72 | "metadata": {}, 73 | "source": [ 74 | "\n", 75 | "### Background \n", 76 | "\n", 77 | "The following analysis is based on a Kaggle dataset from Criteo, an internet advertising company focused on retargeting. Criteo's goal is to increase online clickthrough rates among consumers who have previously visited an advertiser's website. This information will be used by Criteo to more efficiently provide the right ads to the right people. Optimizing the retargeting process not only helps advertisers spend their dollars more efficiently, but also reduces clutter for consumers who do not want to be \"followed\" by ads for irrelevant products (or ones they may have already purchased!).\n", 78 | "\n", 79 | "There are 13 numerical features and 26 categorical features in this dataset. Our goal is to create a model that will most accurately predict clickthroughs (label = 1). It is likely that these features represent characteristics of consumer behavior (history of clickthroughs, site visitation, etc.), the ads themselves (product, creative approach, placement, etc.) and general metrics such as the date the ad was published. Since there is no visibility into what each feature represents, however, our challenge is to make our predictions based on the data alone. 
With over 45 million training records, this will require a scalable approach.\n", 80 | "\n", 81 | "### Key Questions\n", 82 | "\n", 83 | "* Which machine learning approach not only provides the highest accuracy in predicting clickthroughs, but is also scalable enough to be useful in a production environment? As internet patterns and product choices change rapidly, the ideal model should be retrained daily to inform the following day's retargeting. - Note: not sure I fully answered this piece: *** Preview what level of performance your model would need to achieve to be practically useful ***\n", 84 | "\n", 85 | "* Which features are most important in predicting clickthroughs? Having this information can help Criteo focus on the metrics that are most critical to their product.\n", 86 | "\n", 87 | "* With 39 features, there is a high risk of overfitting. We should identify a model that provides an optimal tradeoff between bias and variance.\n" 88 | ] 89 | }, 90 | { 91 | "cell_type": "markdown", 92 | "metadata": {}, 93 | "source": [ 94 | "## 2. Algorithm Explanation" 95 | ] 96 | }, 97 | { 98 | "cell_type": "markdown", 99 | "metadata": {}, 100 | "source": [ 101 | "Given scalability concerns and the need for feature selection, we decided to explore two independent models and compare their performance. We also wanted to identify algorithms that have a history of success in the Spark framework and with binary classification. Logistic Regression and Decision Trees met all of these criteria. Logistic Regression is highly scalable and, combined with regularization, can aid in feature selection. Decision trees have similar benefits but also require little pre-processing and no direct feature selection. We continued down these parallel paths to compare the performance of these models." 
102 | ] 103 | }, 104 | { 105 | "cell_type": "markdown", 106 | "metadata": {}, 107 | "source": [ 108 | "### Data Loading and Pre-Processing" 109 | ] 110 | }, 111 | { 112 | "cell_type": "code", 113 | "execution_count": 4, 114 | "metadata": {}, 115 | "outputs": [ 116 | { 117 | "name": "stdout", 118 | "output_type": "stream", 119 | "text": [ 120 | "0\t1\t1\t5\t0\t1382\t4\t15\t2\t181\t1\t2\t\t2\t68fd1e64\t80e26c9b\tfb936136\t7b4723c4\t25c83c98\t7e0ccccf\tde7995b8\t1f89b562\ta73ee510\ta8cd5504\tb2cb9c98\t37c9c164\t2824a5f6\t1adce6ef\t8ba8b39a\t891b62e7\te5ba7672\tf54016b9\t21ddcdc9\tb1252a9d\t07b5194c\t\t3a171ecb\tc5c50484\te8b83407\t9727dd16\n" 121 | ] 122 | } 123 | ], 124 | "source": [ 125 | "# take a look at the data\n", 126 | "!head -n 1 data/train.txt" 127 | ] 128 | }, 129 | { 130 | "cell_type": "code", 131 | "execution_count": 5, 132 | "metadata": {}, 133 | "outputs": [], 134 | "source": [ 135 | "# load the data\n", 136 | "fullTrainRDD = sc.textFile('data/train.txt')\n", 137 | "testRDD = sc.textFile('data/test.txt')\n", 138 | "\n", 139 | "FIELDS = ['I1','I2','I3','I4','I5','I6','I7','I8','I9','I10','I11','I12','I13',\n", 140 | " 'C1','C2','C3','C4','C5','C6','C7','C8','C9','C10','C11','C12','C13','C14',\n", 141 | " 'C15','C16','C17','C18','C19','C20','C21','C22','C23','C24','C25','C26','Label']" 142 | ] 143 | }, 144 | { 145 | "cell_type": "code", 146 | "execution_count": 6, 147 | "metadata": {}, 148 | "outputs": [ 149 | { 150 | "name": "stdout", 151 | "output_type": "stream", 152 | "text": [ 153 | "Number of records in train data: 45840617 ...\n", 154 | "Number of records in test data: 6042135 ...\n" 155 | ] 156 | } 157 | ], 158 | "source": [ 159 | "# number of rows in train/test data\n", 160 | "print(f\"Number of records in train data: {fullTrainRDD.count()} ...\")\n", 161 | "print(f\"Number of records in test data: {testRDD.count()} ...\")" 162 | ] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": 7, 167 | "metadata": {}, 168 | "outputs": [ 169 
| { 170 | "name": "stdout", 171 | "output_type": "stream", 172 | "text": [ 173 | "... held out 9167871 records for evaluation and assigned 36672746 for training.\n" 174 | ] 175 | } 176 | ], 177 | "source": [ 178 | "# Generate 80/20 (pseudo)random train/test split \n", 179 | "trainRDD, heldOutRDD = fullTrainRDD.randomSplit([0.8,0.2], seed = 1)\n", 180 | "print(f\"... held out {heldOutRDD.count()} records for evaluation and assigned {trainRDD.count()} for training.\")" 181 | ] 182 | }, 183 | { 184 | "cell_type": "code", 185 | "execution_count": 8, 186 | "metadata": {}, 187 | "outputs": [], 188 | "source": [ 189 | "toyRDD_train, trainRDD2 = trainRDD.randomSplit([0.001,0.999], seed = 2)\n", 190 | "toyRDD_test, mainRDD_test = heldOutRDD.randomSplit([0.001,0.999], seed = 2)" 191 | ] 192 | }, 193 | { 194 | "cell_type": "code", 195 | "execution_count": 9, 196 | "metadata": {}, 197 | "outputs": [ 198 | { 199 | "data": { 200 | "text/plain": [ 201 | "['1\\t5\\t2\\t\\t\\t1382\\t17\\t78\\t25\\t76\\t0\\t9\\t\\t\\t05db9164\\t942f9a8d\\t56472604\\t53a5d493\\t25c83c98\\t\\t49b74ebc\\t6c41e35e\\ta73ee510\\te113fc4b\\tc4adf918\\t08531bcb\\t85dbe138\\t1adce6ef\\tae97ecc3\\t76b06ec3\\te5ba7672\\t1f868fdd\\t9437f62f\\ta458ea53\\tff4c70b8\\t\\t32c7478e\\tda89b7d5\\t7a402766\\tc7beb94e']" 202 | ] 203 | }, 204 | "execution_count": 9, 205 | "metadata": {}, 206 | "output_type": "execute_result" 207 | } 208 | ], 209 | "source": [ 210 | "#toyRDD_train.take(1)\n", 211 | "toyRDD_train.take(1)" 212 | ] 213 | }, 214 | { 215 | "cell_type": "code", 216 | "execution_count": 10, 217 | "metadata": {}, 218 | "outputs": [], 219 | "source": [ 220 | "# helper functions\n", 221 | "def parse(line):\n", 222 | " \"\"\"\n", 223 | " Map line --> tuple of (features, label)\n", 224 | " \"\"\"\n", 225 | " fields = np.array(line.split('\\t'))\n", 226 | " features,label = fields[1:14], fields[0]\n", 227 | " return(features, label)\n", 228 | "\n", 229 | "def edit_data_types(line):\n", 230 | " \"\"\"\n", 231 | " Map 
tuple of (features, label) --> tuple of (formatted features, label)\n", 232 | " \n", 233 | " * '' is replaced with NaN\n", 234 | " * numerical fields are converted to floats\n", 235 | " \"\"\"\n", 236 | " features, label = line[0], line[1]\n", 237 | " formated_features = []\n", 238 | " for i, value in enumerate(features):\n", 239 | " if value == '':\n", 240 | " formated_features.append(np.nan)\n", 241 | " else:\n", 242 | " if i < 13:\n", 243 | " formated_features.append(float(value)) \n", 244 | " else:\n", 245 | " formated_features.append(float(value))\n", 246 | " return (formated_features, label)" 247 | ] 248 | }, 249 | { 250 | "cell_type": "code", 251 | "execution_count": 11, 252 | "metadata": {}, 253 | "outputs": [], 254 | "source": [ 255 | "#trainRDDCached = trainRDD.map(parse).map(edit_data_types).cache()\n", 256 | "toyRDD_train1 = toyRDD_train.map(parse).map(edit_data_types)\n", 257 | "toyRDD_test2 = toyRDD_test.map(parse).map(edit_data_types)" 258 | ] 259 | }, 260 | { 261 | "cell_type": "code", 262 | "execution_count": 12, 263 | "metadata": {}, 264 | "outputs": [ 265 | { 266 | "name": "stdout", 267 | "output_type": "stream", 268 | "text": [ 269 | "36451\n" 270 | ] 271 | } 272 | ], 273 | "source": [ 274 | "print(toyRDD_train1.count())\n", 275 | "#print(toyRDD_test1.count())" 276 | ] 277 | }, 278 | { 279 | "cell_type": "code", 280 | "execution_count": 13, 281 | "metadata": {}, 282 | "outputs": [], 283 | "source": [ 284 | "sample = np.array(toyRDD_train1.map(lambda x: np.append(x[0], [x[1]])).takeSample(False, 1000))\n", 285 | "sample_df = pd.DataFrame(np.array(sample), columns = ['I1','I2','I3','I4','I5','I6','I7','I8','I9','I10','I11','I12','I13', 'Label'])" 286 | ] 287 | }, 288 | { 289 | "cell_type": "code", 290 | "execution_count": 14, 291 | "metadata": {}, 292 | "outputs": [], 293 | "source": [ 294 | "columns = (['I1','I2','I3','I4','I5','I6','I7','I8','I9','I10','I11','I12','I13', 'Label'])\n", 295 | "#columns = ['I1', 'I2', 'I3', 'I4', 'I5', 'I6', 
'I7', 'I8', 'I9', 'I10', 'I11', 'I12', 'I13']\n", 296 | "sample_numeric = sample_df.reindex(columns=columns)\n", 297 | "sample_numeric[columns] = sample_numeric[columns].astype(float)" 298 | ] 299 | }, 300 | { 301 | "cell_type": "code", 302 | "execution_count": 15, 303 | "metadata": {}, 304 | "outputs": [ 305 | { 306 | "name": "stdout", 307 | "output_type": "stream", 308 | "text": [ 309 | "[4.164150943396226, 102.271, 17.273936170212767, 6.96516129032258, 17504.025484199796, 96.59664948453609, 19.74367088607595, 12.574574574574575, 104.00316455696202, 0.6679245283018868, 2.919831223628692, 1.2378854625550662, 8.225806451612904]\n", 310 | "[9.509351789102883, 322.7148703716641, 31.64668584436759, 8.408602304540327, 65951.30770633514, 201.85445899865584, 76.14245692688245, 13.381227576184473, 221.92516649954453, 0.7543017725786832, 5.536216317193093, 4.336686134602433, 12.056184806628144]\n" 311 | ] 312 | } 313 | ], 314 | "source": [ 315 | "\"\"\"Get means and standard deviations. Ideally we should do this in the RDD vs. 
pandas\"\"\"\n", 316 | "\n", 317 | "means = []\n", 318 | "stdevs = []\n", 319 | "\n", 320 | "for i in sample_numeric.columns[0:13]:\n", 321 | " mean = np.nanmean(sample_numeric[i])\n", 322 | " means.append(mean)\n", 323 | " std = np.nanstd(sample_numeric[i])\n", 324 | " stdevs.append(std)\n", 325 | " \n", 326 | "print(means)\n", 327 | "print(stdevs)\n" 328 | ] 329 | }, 330 | { 331 | "cell_type": "code", 332 | "execution_count": 16, 333 | "metadata": {}, 334 | "outputs": [ 335 | { 336 | "name": "stdout", 337 | "output_type": "stream", 338 | "text": [ 339 | "[2.000e+00 3.000e+00 5.000e+00 3.000e+00 1.414e+03 1.500e+01 6.000e+00\n", 340 | " 6.500e+00 3.200e+01 1.000e+00 2.000e+00 0.000e+00 3.000e+00 1.000e+00]\n", 341 | "[1.0000e+00 3.0000e+00 6.0000e+00 4.0000e+00 3.7095e+03 3.8000e+01\n", 342 | " 2.0000e+00 8.0000e+00 3.6000e+01 0.0000e+00 1.0000e+00 0.0000e+00\n", 343 | " 5.0000e+00 0.0000e+00]\n" 344 | ] 345 | } 346 | ], 347 | "source": [ 348 | "\"\"\"Get medians for each class. Ideally we should do this in the RDD vs. 
pandas\"\"\"\n", 349 | "\n", 350 | "median1 = np.array(sample_numeric[sample_numeric['Label'] == 1.0].median().tolist())\n", 351 | "print(median1)\n", 352 | "\n", 353 | "median0 = np.array(sample_numeric[sample_numeric['Label'] == 0.0].median().tolist())\n", 354 | "print(median0)" 355 | ] 356 | }, 357 | { 358 | "cell_type": "code", 359 | "execution_count": 17, 360 | "metadata": {}, 361 | "outputs": [], 362 | "source": [ 363 | "# helper functions\n", 364 | "def parse(line):\n", 365 | " \"\"\"\n", 366 | " Map line --> tuple of (features, label)\n", 367 | " \"\"\"\n", 368 | " fields = np.array(line.split('\\t'))\n", 369 | " features,label = fields[1:14], fields[0]\n", 370 | " return(features, label)\n", 371 | "\n", 372 | "def update_nans(line):\n", 373 | " \"\"\"\n", 374 | " Replace missing values with the per-label medians\n", 375 | " \"\"\"\n", 376 | " \n", 377 | " #median1 = np.array([2.0, 3.5, 4.0, 4.0, 1362.0, 13.5, 8.0, 7.0, 42.5, 1.0, 2.0, 0.0, 3.0, 1.0])\n", 378 | " #median0 = np.array([0.0, 2.0, 7.0, 5.0, 3539.0, 46.5, 2.0, 8.0, 38.5, 0.0, 1.0, 0.0, 5.0, 0.0])\n", 379 | " \n", 380 | " features, label = line[0], float(line[1])\n", 381 | " formated_features = []\n", 382 | " for i, value in enumerate(features):\n", 383 | " if value == '' and label == 1.0:\n", 384 | " formated_features.append(float(median1[i]))\n", 385 | " elif value == '' and label == 0.0:\n", 386 | " formated_features.append(float(median0[i]))\n", 387 | " else:\n", 388 | " if i < 13:\n", 389 | " formated_features.append(float(value)) \n", 390 | " else:\n", 391 | " formated_features.append(value)\n", 392 | " return (formated_features, label)" 393 | ] 394 | }, 395 | { 396 | "cell_type": "code", 397 | "execution_count": 18, 398 | "metadata": {}, 399 | "outputs": [], 400 | "source": [ 401 | "toyRDDCached_train = toyRDD_train.map(parse).map(update_nans)\n", 402 | "toyRDDCached_test = toyRDD_test.map(parse).map(update_nans)" 403 | ] 404 | }, 405 | { 406 | "cell_type": "code", 407 | 
"execution_count": 19, 408 | "metadata": {}, 409 | "outputs": [], 410 | "source": [ 411 | "meanClickthroughs = toyRDDCached_train.values().mean()" 412 | ] 413 | }, 414 | { 415 | "cell_type": "code", 416 | "execution_count": 20, 417 | "metadata": {}, 418 | "outputs": [ 419 | { 420 | "data": { 421 | "text/plain": [ 422 | "0.25475295602315423" 423 | ] 424 | }, 425 | "execution_count": 20, 426 | "metadata": {}, 427 | "output_type": "execute_result" 428 | } 429 | ], 430 | "source": [ 431 | "meanClickthroughs" 432 | ] 433 | }, 434 | { 435 | "cell_type": "code", 436 | "execution_count": 21, 437 | "metadata": {}, 438 | "outputs": [ 439 | { 440 | "data": { 441 | "text/plain": [ 442 | "[([1.0, 3.0, 5.0, 2.0, 2.0, 0.0, 55.0, 15.0, 23.0, 1.0, 11.0, 0.0, 0.0], 1.0)]" 443 | ] 444 | }, 445 | "execution_count": 21, 446 | "metadata": {}, 447 | "output_type": "execute_result" 448 | } 449 | ], 450 | "source": [ 451 | "toyRDDCached_test.take(1)" 452 | ] 453 | }, 454 | { 455 | "cell_type": "code", 456 | "execution_count": null, 457 | "metadata": {}, 458 | "outputs": [], 459 | "source": [ 460 | "# this is what I was working on to flatten outliers..\n", 461 | "\n", 462 | "\n", 463 | "featureMeans = np.array(means)\n", 464 | "featureStdev = np.array(stdevs)\n", 465 | "upper_bounds = np.array(featureMeans + (featureStdev * 3))\n", 466 | "lower_bounds = np.array(featureMeans - (featureStdev * 3))\n", 467 | "# print(\"feature means\", featureMeans)\n", 468 | "# print(\"feature std devs\", featureStdev)\n", 469 | "# print(\"upper_bounds\", upper_bounds)\n", 470 | "# print(\"lower_bounds\", lower_bounds)\n", 471 | "\n", 472 | "def flatten_outliers(line):\n", 473 | " features = line[0]\n", 474 | " labels = line[1]\n", 475 | " for i in range(features):\n", 476 | " for j in range(upper_bounds):\n", 477 | " if i > j:\n", 478 | " features[i] = j\n", 479 | " for i in range(features):\n", 480 | " for k in range(lower_bounds):\n", 481 | " if i < k:\n", 482 | " features[i] = k\n", 483 | " return 
features, labels\n", 484 | " \n", 485 | "# testRDD = toyRDDCached_train.mapValues(lambda x: x == ).take(1)\n", 486 | "#testRDD.collect()\n" 487 | ] 488 | }, 489 | { 490 | "cell_type": "code", 491 | "execution_count": 22, 492 | "metadata": {}, 493 | "outputs": [], 494 | "source": [ 495 | "def standardize(dataRDD):\n", 496 | " \"\"\"Standardize the data\"\"\"\n", 497 | "\n", 498 | " sc.broadcast(means)\n", 499 | " sc.broadcast(stdevs)\n", 500 | " \n", 501 | " featureMeans = np.array(means)\n", 502 | " featureStdev = np.array(stdevs)\n", 503 | " \n", 504 | " normedRDD = dataRDD.map(lambda x: ((x[0]-featureMeans)/featureStdev, x[1]))\n", 505 | " \n", 506 | " return normedRDD" 507 | ] 508 | }, 509 | { 510 | "cell_type": "code", 511 | "execution_count": 23, 512 | "metadata": {}, 513 | "outputs": [], 514 | "source": [ 515 | "# Normalize both the test and training sets\n", 516 | "normedRDD_train = standardize(toyRDDCached_train).cache()\n", 517 | "normedRDD_test = standardize(toyRDDCached_test).cache()" 518 | ] 519 | }, 520 | { 521 | "cell_type": "code", 522 | "execution_count": 24, 523 | "metadata": {}, 524 | "outputs": [ 525 | { 526 | "data": { 527 | "text/plain": [ 528 | "[(array([ 0.08789758, -0.31071081, -0.3878427 , -0.47156009, -0.24445346,\n", 529 | " -0.39432693, 0.76509652, 0.92857142, -0.12618292, -0.88548715,\n", 530 | " 1.09825347, -0.28544502, -0.43345441]), 1.0)]" 531 | ] 532 | }, 533 | "execution_count": 24, 534 | "metadata": {}, 535 | "output_type": "execute_result" 536 | } 537 | ], 538 | "source": [ 539 | "normedRDD_train.take(1)" 540 | ] 541 | }, 542 | { 543 | "cell_type": "markdown", 544 | "metadata": {}, 545 | "source": [ 546 | "## 3. 
EDA & Discussion of Challenges" 547 | ] 548 | }, 549 | { 550 | "cell_type": "code", 551 | "execution_count": null, 552 | "metadata": {}, 553 | "outputs": [], 554 | "source": [ 555 | "# Get summary statistics (columns 0-20)\n", 556 | "sample_df.iloc[:,0:21].describe(include = \"all\")" 557 | ] 558 | }, 559 | { 560 | "cell_type": "code", 561 | "execution_count": null, 562 | "metadata": {}, 563 | "outputs": [], 564 | "source": [ 565 | "# Get summary statistics (columns 21-38)\n", 566 | "sample_df.iloc[:,21:39].describe(include = \"all\")" 567 | ] 568 | }, 569 | { 570 | "cell_type": "code", 571 | "execution_count": null, 572 | "metadata": {}, 573 | "outputs": [], 574 | "source": [ 575 | "# Take a look at histograms for each feature \n", 576 | "sample_numeric.hist(figsize=(23,15), bins=15)\n", 577 | "#sample_numeric[FIELDS[:-1]].hist(figsize=(15,15), bins=15)\n", 578 | "plt.show()" 579 | ] 580 | }, 581 | { 582 | "cell_type": "code", 583 | "execution_count": null, 584 | "metadata": {}, 585 | "outputs": [], 586 | "source": [ 587 | "# part b - plot boxplots of each feature vs. 
the outcome\n", 588 | "\n", 589 | "fig, ax_grid = plt.subplots(5, 3, figsize=(23,15))\n", 590 | "y = sample_df['Label']\n", 591 | "for idx, feature in enumerate(FIELDS[0:13]):\n", 592 | "    x = sample_numeric[feature]\n", 593 | "    sns.boxplot(x=x, y=y, ax=ax_grid[idx//3][idx%3], orient='h', linewidth=.5)\n", 594 | "    ax_grid[idx//3][idx%3].invert_yaxis()\n", 595 | "fig.suptitle(\"BoxPlots by Label\", fontsize=15, y=0.9)\n", 596 | "plt.show()" 597 | ] 598 | }, 599 | { 600 | "cell_type": "code", 601 | "execution_count": null, 602 | "metadata": {}, 603 | "outputs": [], 604 | "source": [ 605 | "# Look at correlations across features\n", 606 | "\n", 607 | "corr = sample_numeric[FIELDS[:13]].corr()\n", 608 | "fig, ax = plt.subplots(figsize=(15, 13))\n", 609 | "mask = np.zeros_like(corr, dtype=bool)\n", 610 | "mask[np.triu_indices_from(mask)] = True\n", 611 | "cmap = sns.diverging_palette(10, 240, as_cmap=True)\n", 612 | "sns.heatmap(corr, mask=mask, cmap=cmap, center=0, linewidths=.5)\n", 613 | "plt.title(\"Correlations between features\")\n", 614 | "plt.show()" 615 | ] 616 | }, 617 | { 618 | "cell_type": "markdown", 619 | "metadata": {}, 620 | "source": [ 621 | "## 4. 
Algorithm Implementation" 622 | ] 623 | }, 624 | { 625 | "cell_type": "markdown", 626 | "metadata": {}, 627 | "source": [ 628 | "### Following is a homegrown implementation of Logistic Regression" 629 | ] 630 | }, 631 | { 632 | "cell_type": "code", 633 | "execution_count": 25, 634 | "metadata": {}, 635 | "outputs": [], 636 | "source": [ 637 | "# Define the baseline model\n", 638 | "BASELINE_1 = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])\n", 639 | "BASELINE_2 = np.array([meanClickthroughs, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])" 640 | ] 641 | }, 642 | { 643 | "cell_type": "code", 644 | "execution_count": 26, 645 | "metadata": {}, 646 | "outputs": [], 647 | "source": [ 648 | "# Compute log loss for Logistic Regression\n", 649 | "def LRLoss(cachedRDD, W):\n", 650 | "    \n", 651 | "    augmentedData = cachedRDD.map(lambda x: (np.append([1.0], x[0]), x[1]))\n", 652 | "\n", 653 | "    loss = augmentedData.map(lambda x: -x[1] * np.log(1/ (1 + np.exp(-(np.dot(x[0],W))))) - (1-x[1]) * np.log(1-(1/ (1 + np.exp(-(np.dot(x[0],W))))))).mean()\n", 654 | "    \n", 655 | "    return loss" 656 | ] 657 | }, 658 | { 659 | "cell_type": "code", 660 | "execution_count": 27, 661 | "metadata": {}, 662 | "outputs": [ 663 | { 664 | "name": "stdout", 665 | "output_type": "stream", 666 | "text": [ 667 | "Train Loss using 0 for baseline 1.0585087314950672\n", 668 | "Train Loss using meanClickthroughs as baseline 0.7637151310692348\n" 669 | ] 670 | } 671 | ], 672 | "source": [ 673 | "#Using meanClickthroughs as our bias term improves loss significantly (it will be used in our baseline)\n", 674 | "\n", 675 | "train_loss_baseline1 = LRLoss(normedRDD_train, BASELINE_1)\n", 676 | "print(\"Train Loss using 0 for baseline\", train_loss_baseline1)\n", 677 | "\n", 678 | "train_loss_baseline2 = LRLoss(normedRDD_train, BASELINE_2)\n", 679 | "print(\"Train Loss using meanClickthroughs as baseline\", train_loss_baseline2)\n" 680 | ] 681 | }, 682 | { 683 | 
"cell_type": "code", 684 | "execution_count": 28, 685 | "metadata": {}, 686 | "outputs": [], 687 | "source": [ 688 | "BASELINE = np.array([meanClickthroughs, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])" 689 | ] 690 | }, 691 | { 692 | "cell_type": "code", 693 | "execution_count": 29, 694 | "metadata": {}, 695 | "outputs": [], 696 | "source": [ 697 | "# Compute the gradient for logistic regression (one step)\n", 698 | "\n", 699 | "def GDUpdate(dataRDD, W, learningRate):\n", 700 | " \n", 701 | " augmentedData = dataRDD.map(lambda x: (np.append([1.0], x[0]), x[1])).cache()\n", 702 | " \n", 703 | " grad = augmentedData.map(lambda x: np.dot(x[0], ((1/ (1 + np.exp(-(np.dot(x[0],W))))) - x[1]))).mean()\n", 704 | "\n", 705 | " new_model = W - (learningRate * grad)\n", 706 | "\n", 707 | " return new_model" 708 | ] 709 | }, 710 | { 711 | "cell_type": "code", 712 | "execution_count": 30, 713 | "metadata": {}, 714 | "outputs": [ 715 | { 716 | "name": "stdout", 717 | "output_type": "stream", 718 | "text": [ 719 | "BASELINE: Loss = 0.7637151310692348\n", 720 | "----------\n", 721 | "STEP: 1\n", 722 | "Loss: 0.7449910292499055\n", 723 | "Model: [0.224, 0.01, 0.003, 0.001, -0.001, -0.004, -0.005, 0.005, -0.002, 0.001, 0.025, 0.008, 0.008, -0.003]\n", 724 | "----------\n", 725 | "STEP: 2\n", 726 | "Loss: 0.7277617913614016\n", 727 | "Model: [0.194, 0.019, 0.005, -0.001, -0.002, -0.008, -0.01, 0.009, -0.004, 0.002, 0.05, 0.015, 0.017, -0.006]\n", 728 | "----------\n", 729 | "STEP: 3\n", 730 | "Loss: 0.7119153299126009\n", 731 | "Model: [0.165, 0.027, 0.007, 0.003, -0.003, -0.012, -0.014, 0.013, -0.006, 0.002, 0.073, 0.022, 0.024, -0.008]\n" 732 | ] 733 | } 734 | ], 735 | "source": [ 736 | "# Review Gradient Descent steps\n", 737 | "nSteps = 3\n", 738 | "model = BASELINE\n", 739 | "#learningRate = .1\n", 740 | "print(f\"BASELINE: Loss = {LRLoss(normedRDD_train,model)}\")\n", 741 | "for idx in range(nSteps):\n", 742 | " print(\"----------\")\n", 743 | " 
print(f\"STEP: {idx+1}\")\n", 744 | " model = GDUpdate(normedRDD_train, model, .1)\n", 745 | " loss = LRLoss(normedRDD_train, model)\n", 746 | " print(f\"Loss: {loss}\")\n", 747 | " print(f\"Model: {[round(w,3) for w in model]}\")" 748 | ] 749 | }, 750 | { 751 | "cell_type": "code", 752 | "execution_count": 32, 753 | "metadata": {}, 754 | "outputs": [], 755 | "source": [ 756 | "# Perform one Gradient Descent update with lasso regularization\n", 757 | "\n", 758 | "def GDUpdate_Lasso(dataRDD, W, learningRate, regParam):\n", 759 | "\n", 760 | " augmentedData = dataRDD.map(lambda x: (np.append([1.0], x[0]), x[1])).cache()\n", 761 | " #new_model = None \n", 762 | " wReg = np.sign(W)\n", 763 | " wReg[0] = 0 #set bias to zero\n", 764 | " \n", 765 | " grad = augmentedData.map(lambda x: np.dot(x[0], ((1/ (1 + np.exp(-(np.dot(x[0],W))))) - x[1]))).mean() + (wReg * regParam)\n", 766 | " new_model = W - (learningRate * grad)\n", 767 | " \n", 768 | " return new_model" 769 | ] 770 | }, 771 | { 772 | "cell_type": "code", 773 | "execution_count": 37, 774 | "metadata": {}, 775 | "outputs": [ 776 | { 777 | "name": "stdout", 778 | "output_type": "stream", 779 | "text": [ 780 | "[ 0.22389365 0.00956181 0.00257687 0.00099639 -0.00086452 -0.00429818\n", 781 | " -0.00505806 0.00486502 -0.00234793 0.000844 0.02535733 0.00779211\n", 782 | " 0.00847686 -0.00335935]\n" 783 | ] 784 | } 785 | ], 786 | "source": [ 787 | "new_model = GDUpdate_Lasso(normedRDD_train, BASELINE, .1, .1)\n", 788 | "print(new_model)" 789 | ] 790 | }, 791 | { 792 | "cell_type": "code", 793 | "execution_count": 33, 794 | "metadata": {}, 795 | "outputs": [ 796 | { 797 | "name": "stdout", 798 | "output_type": "stream", 799 | "text": [ 800 | "BASELINE: Loss = 0.7637151310692348\n", 801 | "----------\n", 802 | "STEP: 1\n", 803 | "Loss: 0.7449910292499055\n", 804 | "Model: [0.224, 0.01, 0.003, 0.001, -0.001, -0.004, -0.005, 0.005, -0.002, 0.001, 0.025, 0.008, 0.008, -0.003]\n", 805 | "----------\n", 806 | "STEP: 2\n", 807 | 
"Loss: 0.7358405527549117\n", 808 | "Model: [0.194, 0.009, -0.005, -0.011, 0.008, 0.002, 0.0, -0.001, 0.006, -0.008, 0.04, 0.005, 0.007, 0.004]\n", 809 | "----------\n", 810 | "STEP: 3\n", 811 | "Loss: 0.7212600800705432\n", 812 | "Model: [0.165, 0.007, 0.008, 0.012, -0.003, -0.012, -0.014, 0.014, -0.007, 0.002, 0.054, 0.002, 0.005, -0.01]\n", 813 | "----------\n", 814 | "STEP: 4\n", 815 | "Loss: 0.7121642555729325\n", 816 | "Model: [0.137, 0.006, -0.0, -0.006, 0.006, -0.006, -0.009, 0.008, 0.001, -0.007, 0.067, -0.001, 0.002, -0.003]\n", 817 | "----------\n", 818 | "STEP: 5\n", 819 | "Loss: 0.7016324537410166\n", 820 | "Model: [0.11, 0.004, 0.012, 0.015, -0.005, -0.0, -0.003, 0.002, -0.011, 0.004, 0.08, 0.017, -0.0, 0.004]\n" 821 | ] 822 | } 823 | ], 824 | "source": [ 825 | "# Take a look at a few Gradient Descent steps\n", 826 | "nSteps = 5\n", 827 | "model = BASELINE\n", 828 | "#learningRate = .1\n", 829 | "#regParam = .1\n", 830 | "print(f\"BASELINE: Loss = {LRLoss(normedRDD_train,model)}\")\n", 831 | "for idx in range(nSteps):\n", 832 | " print(\"----------\")\n", 833 | " print(f\"STEP: {idx+1}\")\n", 834 | " model = GDUpdate_Lasso(normedRDD_train, model, .1, .1)\n", 835 | " loss = LRLoss(normedRDD_train, model)\n", 836 | " print(f\"Loss: {loss}\")\n", 837 | " print(f\"Model: {[round(w,3) for w in model]}\")" 838 | ] 839 | }, 840 | { 841 | "cell_type": "code", 842 | "execution_count": 45, 843 | "metadata": {}, 844 | "outputs": [], 845 | "source": [ 846 | "W = new_model\n", 847 | "threshold = .55\n", 848 | "augmentedData = normedRDD_train.map(lambda x: (np.append([1.0], x[0]), x[1]))\n", 849 | "augmentedData.take(5)\n", 850 | "\n", 851 | "def predict_label(line):\n", 852 | " features = line[0]\n", 853 | " label = line[1]\n", 854 | " pred = None\n", 855 | " z = np.dot(features,W)\n", 856 | " prob = np.divide(1.0, (1.0 + np.exp(-z)))\n", 857 | " if prob > threshold:\n", 858 | " pred = float(1.0)\n", 859 | " else:\n", 860 | " pred = float(0.0)\n", 861 | " return 
z, prob, label, pred\n", 862 | "\n", 863 | "def map_accuracy(line):\n", 864 | "    label_actual = line[2]  # predict_label above returns (z, prob, label, pred)\n", 865 | "    label_pred = line[3]\n", 866 | "\n", 867 | "    if label_actual == 1.0 and label_pred == 1.0:\n", 868 | "        return \"TP\", 1.0\n", 869 | "    elif label_actual == 1.0 and label_pred == 0.0:\n", 870 | "        return \"FN\", 1.0\n", 871 | "    elif label_actual == 0.0 and label_pred == 0.0:\n", 872 | "        return \"TN\", 1.0\n", 873 | "    else:\n", 874 | "        if label_actual == 0.0 and label_pred == 1.0:\n", 875 | "            return \"FP\", 1.0\n", 876 | "\n", 877 | "#scores = augmentedData.map(predict_label)\n", 878 | "scores = augmentedData.map(predict_label).map(map_accuracy).reduceByKey(lambda x, y: x + y)" 879 | ] 880 | }, 881 | { 882 | "cell_type": "code", 883 | "execution_count": 42, 884 | "metadata": {}, 885 | "outputs": [], 886 | "source": [ 887 | "def get_scores(dataRDD, W, threshold):\n", 888 | "    \"\"\"Assess performance of current models\"\"\"\n", 889 | "    augmentedData = dataRDD.map(lambda x: (np.append([1.0], x[0]), x[1]))\n", 890 | "    \n", 891 | "    sc.broadcast(W)\n", 892 | "    sc.broadcast(threshold)\n", 893 | "\n", 894 | "    def predict_label(line):\n", 895 | "        features = line[0]\n", 896 | "        label = line[1]\n", 897 | "        pred = None\n", 898 | "        z = np.dot(features,W)\n", 899 | "        prob = np.divide(1.0, (1.0 + np.exp(-z)))\n", 900 | "        if prob > threshold:\n", 901 | "            pred = float(1.0)\n", 902 | "        else:\n", 903 | "            pred = float(0.0)\n", 904 | "        return label, pred\n", 905 | "    \n", 906 | "    def map_accuracy(line):\n", 907 | "        label_actual = line[0] \n", 908 | "        label_pred = line[1]\n", 909 | "        \n", 910 | "        if label_actual == 1.0 and label_pred == 1.0:\n", 911 | "            return \"TP\", 1.0\n", 912 | "        elif label_actual == 1.0 and label_pred == 0.0:\n", 913 | "            return \"FN\", 1.0\n", 914 | "        elif label_actual == 0.0 and label_pred == 0.0:\n", 915 | "            return \"TN\", 1.0\n", 916 | "        else:\n", 917 | "            if label_actual == 0.0 and label_pred == 1.0:\n", 918 | "                return \"FP\", 1.0\n", 919 | "    \n", 920 | 
" scores = augmentedData.map(predict_label).map(map_accuracy).reduceByKey(lambda x, y: x + y)\n", 921 | " \n", 922 | " return scores" 923 | ] 924 | }, 925 | { 926 | "cell_type": "code", 927 | "execution_count": 44, 928 | "metadata": {}, 929 | "outputs": [], 930 | "source": [ 931 | "#new_model = [0.773, 0.006, 0.008, 0.008, 0.005, -0.009, 0.006, 0.005, -0.007, -0.007, 0.105, -0.001, 0.011, -0.011]\n", 932 | "# train_scores_test = get_scores(normedRDD_train, new_model, .5).collect()\n", 933 | "# print(train_scores)\n", 934 | "updated_model = GDUpdate_Lasso(normedRDD_train, BASELINE, .1, .1)" 935 | ] 936 | }, 937 | { 938 | "cell_type": "code", 939 | "execution_count": 46, 940 | "metadata": {}, 941 | "outputs": [ 942 | { 943 | "data": { 944 | "text/plain": [ 945 | "[('FP', 8269.0), ('FN', 1585.0), ('TN', 18896.0), ('TP', 7701.0)]" 946 | ] 947 | }, 948 | "execution_count": 46, 949 | "metadata": {}, 950 | "output_type": "execute_result" 951 | } 952 | ], 953 | "source": [ 954 | "Get_Scores = get_scores(normedRDD_train, updated_model, .55).collect()\n", 955 | "Get_Scores" 956 | ] 957 | }, 958 | { 959 | "cell_type": "code", 960 | "execution_count": 50, 961 | "metadata": {}, 962 | "outputs": [], 963 | "source": [ 964 | "# Assess performance of current models. 
Takes \"train_scores\" as input.\n", 965 | "\n", 966 | "def assess_performance(scores):\n", 967 | "\n", 968 | "    TP, FN, TN, FP = 0.0, 0.0, 0.0, 0.0\n", 969 | "\n", 970 | "    for i in scores:\n", 971 | "        if i[0] == 'TP':\n", 972 | "            TP = i[1]\n", 973 | "        elif i[0] == 'FN':\n", 974 | "            FN = i[1]\n", 975 | "        elif i[0] == 'TN':\n", 976 | "            TN = i[1]\n", 977 | "        else:\n", 978 | "            if i[0] == 'FP':\n", 979 | "                FP = i[1]\n", 980 | "\n", 981 | "    total = TP + TN + FP + FN\n", 982 | "    accuracy = (TP+TN)/total if total else 0.0  # defined even when TP == 0\n", 983 | "    if TP != 0.0:\n", 984 | "        precision = TP/(TP + FP)\n", 985 | "        recall = TP/(TP + FN)\n", 986 | "        f1_score = 2*((precision*recall)/(precision+recall))\n", 987 | "    else:\n", 988 | "        precision = 0.0\n", 989 | "        recall = 0.0\n", 990 | "        f1_score = 0.0\n", 991 | "\n", 992 | "    print(\"True Positives=\", TP)\n", 993 | "    print(\"False Negatives=\", FN)\n", 994 | "    print(\"True Negatives=\", TN)\n", 995 | "    print(\"False Positives=\", FP)\n", 996 | "    print(\"Precision=\", precision)\n", 997 | "    print(\"Recall=\", recall)\n", 998 | "    print(\"F1_score=\", f1_score)\n", 999 | "    print('Accuracy=', accuracy)\n", 1000 | "    \n", 1001 | "    return accuracy, precision, recall, f1_score\n", 1002 | " " 1003 | ] 1004 | }, 1005 | { 1006 | "cell_type": "code", 1007 | "execution_count": 53, 1008 | "metadata": {}, 1009 | "outputs": [ 1010 | { 1011 | "name": "stdout", 1012 | "output_type": "stream", 1013 | "text": [ 1014 | "True Positives= 7701.0\n", 1015 | "False Negatives= 1585.0\n", 1016 | "True Negatives= 18896.0\n", 1017 | "False Positives= 8269.0\n", 1018 | "Precision= 0.4822166562304321\n", 1019 | "Recall= 0.829312944217101\n", 1020 | "F1_score= 0.6098352866645549\n", 1021 | "Accuracy= 0.7296644810841952\n" 1022 | ] 1023 | }, 1024 | { 1025 | "data": { 1026 | "text/plain": [ 1027 | "(0.7296644810841952, 0.4822166562304321, 0.829312944217101, 0.6098352866645549)" 1028 | ] 1029 | }, 1030 | "execution_count": 53, 1031 | "metadata": {}, 1032 | "output_type": "execute_result" 1033 | } 1034 
| ], 1035 | "source": [ 1036 | "Accuracy = assess_performance(Get_Scores)\n", 1037 | "Accuracy" 1038 | ] 1039 | }, 1040 | { 1041 | "cell_type": "code", 1042 | "execution_count": 66, 1043 | "metadata": {}, 1044 | "outputs": [], 1045 | "source": [ 1046 | "def GradientDescent(trainRDD, testRDD, wInit, nSteps = 20, \n", 1047 | "                    learningRate = 0.1, verbose = False):\n", 1048 | "    \"\"\"\n", 1049 | "    Perform nSteps iterations of lasso-regularized logistic regression gradient descent and \n", 1050 | "    track log loss on a test and train set. Return lists of\n", 1051 | "    test/train loss and the models themselves.\n", 1052 | "    \"\"\"\n", 1053 | "    # initialize lists to track model performance\n", 1054 | "    train_history, test_history, model_history = [], [], []\n", 1055 | "    \n", 1056 | "    # perform n updates & compute test and train loss after each\n", 1057 | "    model = wInit\n", 1058 | "    for idx in range(nSteps): \n", 1059 | "        \n", 1060 | "        ############## YOUR CODE HERE #############\n", 1061 | "        \n", 1062 | "        model = GDUpdate_Lasso(trainRDD, model, learningRate, .1)\n", 1063 | "        training_loss = LRLoss(trainRDD, model)\n", 1064 | "        test_loss = LRLoss(testRDD, model)\n", 1065 | "        \n", 1066 | "        ############## (END) YOUR CODE #############\n", 1067 | "        \n", 1068 | "        # keep track of test/train loss for plotting\n", 1069 | "        train_history.append(training_loss)\n", 1070 | "        test_history.append(test_loss)\n", 1071 | "        model_history.append(model)\n", 1072 | "        \n", 1073 | "        # console output if desired\n", 1074 | "        if verbose:\n", 1075 | "            print(\"----------\")\n", 1076 | "            print(f\"STEP: {idx+1}\")\n", 1077 | "            print(f\"training loss: {training_loss}\")\n", 1078 | "            print(f\"test loss: {test_loss}\")\n", 1079 | "            print(f\"Model: {[round(w,3) for w in model]}\")\n", 1080 | "    return train_history, test_history, model_history" 1081 | ] 1082 | }, 1083 | { 1084 | "cell_type": "code", 1085 | "execution_count": 67, 1086 | "metadata": {}, 1087 | "outputs": [], 1088 | "source": [ 1089 | "def plotErrorCurves(trainLoss, testLoss, title = 
None):\n", 1090 | "    \"\"\"\n", 1091 | "    Helper function for plotting.\n", 1092 | "    Args: trainLoss (list of log loss values), testLoss (list of log loss values)\n", 1093 | "    \"\"\"\n", 1094 | "    fig, ax = plt.subplots(1,1,figsize = (16,8))\n", 1095 | "    x = list(range(len(trainLoss)))[1:]\n", 1096 | "    ax.plot(x, trainLoss[1:], 'k--', label='Training Loss')\n", 1097 | "    ax.plot(x, testLoss[1:], 'r--', label='Test Loss')\n", 1098 | "    ax.legend(loc='upper right', fontsize='x-large')\n", 1099 | "    plt.xlabel('Number of Iterations')\n", 1100 | "    plt.ylabel('Log Loss')\n", 1101 | "    if title:\n", 1102 | "        plt.title(title)\n", 1103 | "    plt.show()" 1104 | ] 1105 | }, 1106 | { 1107 | "cell_type": "code", 1108 | "execution_count": 80, 1109 | "metadata": {}, 1110 | "outputs": [ 1111 | { 1112 | "name": "stdout", 1113 | "output_type": "stream", 1114 | "text": [ 1115 | "\n", 1116 | "... trained 30 iterations in 556.9607865810394 seconds\n" 1117 | ] 1118 | } 1119 | ], 1120 | "source": [ 1121 | "# run 30 iterations (RUN THIS CELL AS IS)\n", 1122 | "wInit = BASELINE\n", 1123 | "#trainRDD, testRDD = normedRDD.randomSplit([0.8,0.2], seed = 2018)\n", 1124 | "start = time.time()\n", 1125 | "MSEtrain, MSEtest, models = GradientDescent(normedRDD_train, normedRDD_test, wInit, nSteps = 30)\n", 1126 | "print(f\"\\n... trained {len(models)} iterations in {time.time() - start} seconds\")" 1127 | ] 1128 | }, 1129 | { 1130 | "cell_type": "code", 1131 | "execution_count": 72, 1132 | "metadata": {}, 1133 | "outputs": [ 1134 | { 1135 | "data": { 1136 | "image/png": "\n", 1137 | "text/plain": [ 1138 | "
" 1139 | ] 1140 | }, 1141 | "metadata": {}, 1142 | "output_type": "display_data" 1143 | } 1144 | ], 1145 | "source": [ 1146 | "plotErrorCurves(MSEtrain, MSEtest, title = 'Logistic Regression with Lasso Regularization' )" 1147 | ] 1148 | }, 1149 | { 1150 | "cell_type": "code", 1151 | "execution_count": 75, 1152 | "metadata": {}, 1153 | "outputs": [], 1154 | "source": [ 1155 | "def assessment(trainRDD, new_models, threshold, verbose=True):\n", 1156 | "\n", 1157 | "    model_list = []\n", 1158 | "    score_list = []  # not named get_scores, which would shadow the get_scores function\n", 1159 | "    accuracies = []\n", 1160 | "    f1s = []\n", 1161 | "    precisions =[]\n", 1162 | "    recalls = []\n", 1163 | "    \n", 1164 | "    for w in new_models: \n", 1165 | "        new_scores = get_scores(trainRDD, w, threshold).collect()\n", 1166 | "        accuracy, precision, recall, f1_score = assess_performance(new_scores)\n", 1167 | "        model_list.append(w)\n", 1168 | "        score_list.append(new_scores)\n", 1169 | "        accuracies.append(accuracy)\n", 1170 | "        f1s.append(f1_score)\n", 1171 | "        precisions.append(precision)\n", 1172 | "        recalls.append(recall)\n", 1173 | "\n", 1174 | "        if verbose:\n", 1175 | "            print(\"----------\")\n", 1176 | "            print(f\"STEP: {len(model_list)}\")\n", 1177 | "            print(f\"accuracy: {accuracy}\")\n", 1178 | "            print(f\"F1 score: {f1_score}\")\n", 1179 | "            print(f\"Model: {[round(x,3) for x in w]}\")\n", 1180 | "\n", 1181 | " \n", 1182 | "    return model_list, score_list, accuracies, f1s, precisions, recalls\n", 1183 | " \n", 1184 | " " 1185 | ] 1186 | }, 1187 | { 1188 | "cell_type": "code", 1189 | "execution_count": 79, 1190 | "metadata": {}, 1191 | "outputs": [ 1192 | { 1193 | "ename": "TypeError", 1194 | "evalue": "'list' object is not callable", 1195 | "output_type": "error", 1196 | "traceback": [ 1197 | "\u001b[0;31m\u001b[0m", 1198 | "\u001b[0;31mTypeError\u001b[0mTraceback (most recent call last)", 1199 | "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0massess\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0massessment\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mnormedRDD_train\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmodels\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mthreshold\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m.55\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", 1200 | "\u001b[0;32m\u001b[0m in \u001b[0;36massessment\u001b[0;34m(trainRDD, new_models, threshold, verbose)\u001b[0m\n\u001b[1;32m 9\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 10\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mw\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mnew_models\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 11\u001b[0;31m \u001b[0mnew_scores\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mget_scores\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtrainRDD\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mw\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mthreshold\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcollect\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 12\u001b[0m \u001b[0maccuracy\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mprecision\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mrecall\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mf1_score\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0massess_performance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mnew_scores\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 13\u001b[0m \u001b[0mmodel_list\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mappend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mw\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 1201 | "\u001b[0;31mTypeError\u001b[0m: 'list' object is not callable" 1202 | ] 1203 | } 1204 | ], 1205 | "source": [ 1206 | "assess = assessment(normedRDD_train, models, threshold=.55)\n" 1207 | ] 1208 | }, 1209 | { 1210 | "cell_type": "code", 1211 | "execution_count": 57, 1212 | "metadata": {}, 1213 | "outputs": [], 1214 | "source": [ 1215 | "# Run Logistic Regression gradient descent 
function and capture accuracy for each model\n", 1216 | "def GradientDescent(trainRDD, testRDD, wInit, nSteps = 10, \n", 1217 | "                    learningRate = 0.1, verbose = True):\n", 1218 | "    \"\"\"\n", 1219 | "    Perform nSteps iterations of Logistic Regression gradient descent and \n", 1220 | "    track accuracy on a test and train set. Return lists of\n", 1221 | "    test/train accuracy and the models themselves.\n", 1222 | "    \"\"\"\n", 1223 | "    # initialize lists to track model performance\n", 1224 | "    train_history, test_history, model_history = [], [], []\n", 1225 | "    \n", 1226 | "    # perform n updates & compute test and train loss after each\n", 1227 | "    model = wInit\n", 1228 | "    \n", 1229 | "    for idx in range(nSteps): \n", 1230 | "        \n", 1231 | "        new_model = GDUpdate_Lasso(trainRDD, model, learningRate, .1)\n", 1232 | "        training_loss = LRLoss(trainRDD, new_model)\n", 1233 | "        test_loss = LRLoss(testRDD, new_model)\n", 1234 | "        \n", 1235 | "        training_scores = get_scores(trainRDD, new_model, .55).collect()  # list, so the driver can iterate it\n", 1236 | "        testing_scores = get_scores(testRDD, new_model, .55).collect()\n", 1237 | "        \n", 1238 | "        train_accuracy = assess_performance(training_scores)[0]  # first element is accuracy\n", 1239 | "        test_accuracy = assess_performance(testing_scores)[0]\n", 1240 | "        \n", 1241 | "        model = new_model\n", 1242 | "        \n", 1243 | "        # keep track of accuracy for plotting\n", 1244 | "        train_history.append(train_accuracy)\n", 1245 | "        test_history.append(test_accuracy)\n", 1246 | "        model_history.append(model)\n", 1247 | "        \n", 1248 | "        # console output \n", 1249 | "        if verbose:\n", 1250 | "            print(\"----------\")\n", 1251 | "            print(f\"STEP: {idx+1}\")\n", 1252 | "            print(f\"training accuracy: {train_accuracy}\")\n", 1253 | "            print(f\"test accuracy: {test_accuracy}\")\n", 1254 | "            print(f\"Model: {[round(w,3) for w in model]}\")\n", 1255 | "    return train_history, test_history, model_history" 1256 | ] 1257 | }, 1258 | { 1259 | "cell_type": "code", 1260 | "execution_count": 58, 1261 | "metadata": {}, 1262 | "outputs": [ 1263 | { 1264 | "ename": "TypeError", 1265 | 
"evalue": "'PipelinedRDD' object is not iterable", 1266 | "output_type": "error", 1267 | "traceback": [ 1268 | "\u001b[0;31m\u001b[0m", 1269 | "\u001b[0;31mTypeError\u001b[0mTraceback (most recent call last)", 1270 | "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;31m#trainRDD, testRDD = normedRDD.randomSplit([0.8,0.2], seed = 2018)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0mstart\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtime\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtime\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 5\u001b[0;31m \u001b[0mTrain_Loss\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mTest_Loss\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmodels\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mGradientDescent\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mnormedRDD_train\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnormedRDD_test\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mwInit\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnSteps\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m3\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 6\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34mf\"\\n... 
trained {len(models)} iterations in {time.time() - start} seconds\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 1271 | "\u001b[0;32m\u001b[0m in \u001b[0;36mGradientDescent\u001b[0;34m(trainRDD, testRDD, wInit, nSteps, learningRate, verbose)\u001b[0m\n\u001b[1;32m 22\u001b[0m \u001b[0mtesting_scores\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mget_scores\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtestRDD\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnew_model\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m.55\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 23\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 24\u001b[0;31m \u001b[0mtrain_accuracy\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0massess_performance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtraining_scores\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 25\u001b[0m \u001b[0mtest_accuracy\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0massess_performance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtesting_scores\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 26\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n", 1272 | "\u001b[0;32m\u001b[0m in \u001b[0;36massess_performance\u001b[0;34m(scores)\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mTP\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mFN\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mTN\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mFP\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m0.0\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m0.0\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m0.0\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m0.0\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 6\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 7\u001b[0;31m \u001b[0;32mfor\u001b[0m \u001b[0mi\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mscores\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 8\u001b[0m \u001b[0;32mif\u001b[0m 
\u001b[0mi\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;34m'TP'\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 9\u001b[0m \u001b[0mTP\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 1273 | "\u001b[0;31mTypeError\u001b[0m: 'PipelinedRDD' object is not iterable" 1274 | ] 1275 | } 1276 | ], 1277 | "source": [ 1278 | "# run 3 iterations (RUN THIS CELL AS IS)\n", 1279 | "wInit = BASELINE\n", 1280 | "#trainRDD, testRDD = normedRDD.randomSplit([0.8,0.2], seed = 2018)\n", 1281 | "start = time.time()\n", 1282 | "Train_Loss, Test_Loss, models = GradientDescent(normedRDD_train, normedRDD_test, wInit, nSteps = 3)\n", 1283 | "print(f\"\\n... trained {len(models)} iterations in {time.time() - start} seconds\")" 1284 | ] 1285 | }, 1286 | { 1287 | "cell_type": "code", 1288 | "execution_count": null, 1289 | "metadata": {}, 1290 | "outputs": [], 1291 | "source": [ 1292 | "# plot accuracy curves - RUN THIS CELL AS IS\n", 1293 | "def plotErrorCurves(train_Accuracy, test_Accuracy, title = None):\n", 1294 | "    \"\"\"\n", 1295 | "    Helper function for plotting.\n", 1296 | "    Args: train_Accuracy (list of accuracies), test_Accuracy (list of accuracies)\n", 1297 | "    \"\"\"\n", 1298 | "    fig, ax = plt.subplots(1,1,figsize = (16,8))\n", 1299 | "    x = list(range(len(train_Accuracy)))[1:]\n", 1300 | "    ax.plot(x, train_Accuracy[1:], 'k--', label='Training Accuracy')\n", 1301 | "    ax.plot(x, test_Accuracy[1:], 'r--', label='Test Accuracy')\n", 1302 | "    ax.legend(loc='upper right', fontsize='x-large')\n", 1303 | "    plt.xlabel('Number of Iterations')\n", 1304 | "    plt.ylabel('Accuracy')\n", 1305 | "    if title:\n", 1306 | "        plt.title(title)\n", 1307 | "    plt.show()" 1308 | ] 1309 | }, 1310 | { 1311 | "cell_type": "code", 1312 | "execution_count": null, 1313 | "metadata": {}, 1314 | "outputs": [], 1315 | "source": [] 1316 | }, 1317 | { 1318 | 
"cell_type": "code", 1319 | "execution_count": null, 1320 | "metadata": {}, 1321 | "outputs": [], 1322 | "source": [] 1323 | }, 1324 | { 1325 | "cell_type": "code", 1326 | "execution_count": null, 1327 | "metadata": {}, 1328 | "outputs": [], 1329 | "source": [] 1330 | }, 1331 | { 1332 | "cell_type": "code", 1333 | "execution_count": null, 1334 | "metadata": {}, 1335 | "outputs": [], 1336 | "source": [] 1337 | }, 1338 | { 1339 | "cell_type": "markdown", 1340 | "metadata": {}, 1341 | "source": [ 1342 | "## 5. Application of Course Concepts" 1343 | ] 1344 | }, 1345 | { 1346 | "cell_type": "code", 1347 | "execution_count": null, 1348 | "metadata": {}, 1349 | "outputs": [], 1350 | "source": [] 1351 | } 1352 | ], 1353 | "metadata": { 1354 | "kernelspec": { 1355 | "display_name": "Python 3", 1356 | "language": "python", 1357 | "name": "python3" 1358 | }, 1359 | "language_info": { 1360 | "codemirror_mode": { 1361 | "name": "ipython", 1362 | "version": 3 1363 | }, 1364 | "file_extension": ".py", 1365 | "mimetype": "text/x-python", 1366 | "name": "python", 1367 | "nbconvert_exporter": "python", 1368 | "pygments_lexer": "ipython3", 1369 | "version": "3.6.6" 1370 | } 1371 | }, 1372 | "nbformat": 4, 1373 | "nbformat_minor": 2 1374 | } 1375 | --------------------------------------------------------------------------------