├── images
│   └── output_45_0.png
├── README.md
├── intro_to_sparse_data_and_embeddings.md
└── intro_to_sparse_data_and_embeddings.ipynb


/images/output_45_0.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hatemr/Intro-to-sparse-data-and-embeddings/master/images/output_45_0.png


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Intro to sparse data and embeddings
I used TensorFlow to make some word embeddings.

## View my work
My code can be found here: [intro_to_sparse_data_and_embeddings.md](intro_to_sparse_data_and_embeddings.md). I created the Markdown version of the `.ipynb` notebook because `.ipynb` files often don't render on GitHub.

## Summary
* The model performs sentiment analysis on a movie review dataset (positive vs. negative).
* The words are first represented as one-hot vectors from a limited vocabulary of 50 terms.
* A simple logistic regression is applied (`tf.estimator.LinearClassifier`) (test AUC: 0.87036055), then a feed-forward neural net (`tf.estimator.DNNClassifier`) (test AUC: 0.8653846).
* The sparse features are then mapped through an embedding layer of dimension 2, which is far lower-dimensional than the 50 indicator columns, and the model is re-run with different hyperparameters.

How does the 2d embedding get assigned?
> That is, the model learns the best way to map your input numeric categorical values to the embeddings vector value in order to solve your problem.

From [here](https://developers.googleblog.com/2017/11/introducing-tensorflow-feature-columns.html)

## Results
| Hidden units | Batch norm | Steps | Optimizer | Learning rate | Test AUC |
| --- | --- | --- | --- | --- | --- |
| [20,20] | False | 1000 | adagrad | 0.1 | 0.8666485 |
| [20,20] | False | 100 | adagrad | 0.1 | 0.71556914 |
| [40,20] | False | 1000 | adagrad | 0.1 | 0.86758274 |
| [40,20] | False | 5000 | adagrad | 0.1 | 0.8707501 |
| [40,20] | False | 10000 | adagrad | 0.1 | 0.8708241 |
| [40,20] | False | 10000 | adagrad | 0.08 | 0.8709081 |
| [40,20] | True | 10000 | adagrad | 0.08 | __0.8709133__ |
| [40,20] | NA | 1000 | adam | 0.001 | 0.7138001 |
| [40,20] | NA | 1000 | adam | 0.0001 | 0.5597429 |
| [40,20] | NA | 1000 | adam | 0.01 | 0.86710733 |

I did not get into regularization on the optimizer.


--------------------------------------------------------------------------------
/intro_to_sparse_data_and_embeddings.md:
--------------------------------------------------------------------------------

I followed this tutorial from Google Colab. I tried to show which cells I wrote by adding my name at the top of the cell.

#### Copyright 2017 Google LLC.


```python
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
19 | ``` 20 | 21 | # Intro to Sparse Data and Embeddings 22 | 23 | **Learning Objectives:** 24 | * Convert movie-review string data to a sparse feature vector 25 | * Implement a sentiment-analysis linear model using a sparse feature vector 26 | * Implement a sentiment-analysis DNN model using an embedding that projects data into two dimensions 27 | * Visualize the embedding to see what the model has learned about the relationships between words 28 | 29 | In this exercise, we'll explore sparse data and work with embeddings using text data from movie reviews (from the [ACL 2011 IMDB dataset](http://ai.stanford.edu/~amaas/data/sentiment/)). This data has already been processed into `tf.Example` format. 30 | 31 | ## Setup 32 | 33 | Let's import our dependencies and download the training and test data. [`tf.keras`](https://www.tensorflow.org/api_docs/python/tf/keras) includes a file download and caching tool that we can use to retrieve the data sets. 34 | 35 | 36 | ```python 37 | from __future__ import print_function 38 | 39 | import collections 40 | import io 41 | import math 42 | 43 | import matplotlib.pyplot as plt 44 | import numpy as np 45 | import pandas as pd 46 | import tensorflow as tf 47 | from IPython import display 48 | from sklearn import metrics 49 | 50 | tf.logging.set_verbosity(tf.logging.ERROR) 51 | train_url = 'https://download.mlcc.google.com/mledu-datasets/sparse-data-embedding/train.tfrecord' 52 | train_path = tf.keras.utils.get_file(train_url.split('/')[-1], train_url) 53 | test_url = 'https://download.mlcc.google.com/mledu-datasets/sparse-data-embedding/test.tfrecord' 54 | test_path = tf.keras.utils.get_file(test_url.split('/')[-1], test_url) 55 | ``` 56 | 57 | ## Building a Sentiment Analysis Model 58 | 59 | Let's train a sentiment-analysis model on this data that predicts if a review is generally *favorable* (label of 1) or *unfavorable* (label of 0). 
60 | 61 | To do so, we'll turn our string-value `terms` into feature vectors by using a *vocabulary*, a list of each term we expect to see in our data. For the purposes of this exercise, we've created a small vocabulary that focuses on a limited set of terms. Most of these terms were found to be strongly indicative of *favorable* or *unfavorable*, but some were just added because they're interesting. 62 | 63 | Each term in the vocabulary is mapped to a coordinate in our feature vector. To convert the string-value `terms` for an example into this vector format, we encode such that each coordinate gets a value of 0 if the vocabulary term does not appear in the example string, and a value of 1 if it does. Terms in an example that don't appear in the vocabulary are thrown away. 64 | 65 | **NOTE:** *We could of course use a larger vocabulary, and there are special tools for creating these. In addition, instead of just dropping terms that are not in the vocabulary, we can introduce a small number of OOV (out-of-vocabulary) buckets to which you can hash the terms not in the vocabulary. We can also use a __feature hashing__ approach that hashes each term, instead of creating an explicit vocabulary. This works well in practice, but loses interpretability, which is useful for this exercise. See the tf.feature_column module for tools handling this.* 66 | 67 | ## Building the Input Pipeline 68 | 69 | First, let's configure the input pipeline to import our data into a TensorFlow model. We can use the following function to parse the training and test data (which is in [TFRecord](https://www.tensorflow.org/guide/datasets#consuming_tfrecord_data) format) and return a dict of the features and the corresponding labels. 70 | 71 | 72 | ```python 73 | def _parse_function(record): 74 | """Extracts features and labels. 

  Args:
    record: A scalar string tensor holding one serialized `tf.Example`
      read from a TFRecord file
  Returns:
    A `tuple` `(features, labels)`:
      features: A dict of tensors representing the features
      labels: A tensor with the corresponding labels.
  """
  features = {
    "terms": tf.VarLenFeature(dtype=tf.string), # terms are strings of varying lengths
    "labels": tf.FixedLenFeature(shape=[1], dtype=tf.float32) # labels are 0 or 1
  }

  parsed_features = tf.parse_single_example(record, features)

  terms = parsed_features['terms'].values
  labels = parsed_features['labels']

  return {'terms':terms}, labels
```

To confirm our function is working as expected, let's construct a `TFRecordDataset` for the training data, and map the data to features and labels using the function above.


```python
# Create the Dataset object.
ds = tf.data.TFRecordDataset(train_path)
# Map features and labels with the parse function.
ds = ds.map(_parse_function)

ds
```


Run the following cell to retrieve the first example from the training data set.
116 | 117 | 118 | ```python 119 | n = ds.make_one_shot_iterator().get_next() 120 | sess = tf.Session() 121 | sess.run(n) 122 | ``` 123 | 124 | 125 | 126 | 127 | ({'terms': array([b'but', b'it', b'does', b'have', b'some', b'good', b'action', 128 | b'and', b'a', b'plot', b'that', b'is', b'somewhat', b'interesting', 129 | b'.', b'nevsky', b'acts', b'like', b'a', b'body', b'builder', 130 | b'and', b'he', b'isn', b"'", b't', b'all', b'that', b'attractive', 131 | b',', b'in', b'fact', b',', b'imo', b',', b'he', b'is', b'ugly', 132 | b'.', b'(', b'his', b'acting', b'skills', b'lack', b'everything', 133 | b'!', b')', b'sascha', b'is', b'played', b'very', b'well', b'by', 134 | b'joanna', b'pacula', b',', b'but', b'she', b'needed', b'more', 135 | b'lines', b'than', b'she', b'was', b'given', b',', b'her', 136 | b'character', b'needed', b'to', b'be', b'developed', b'.', 137 | b'there', b'are', b'way', b'too', b'many', b'men', b'in', b'this', 138 | b'story', b',', b'there', b'is', b'zero', b'romance', b',', b'too', 139 | b'much', b'action', b',', b'and', b'way', b'too', b'dumb', b'of', 140 | b'an', b'ending', b'.', b'it', b'is', b'very', b'violent', b'.', 141 | b'i', b'did', b'however', b'love', b'the', b'scenery', b',', 142 | b'this', b'movie', b'takes', b'you', b'all', b'over', b'the', 143 | b'world', b',', b'and', b'that', b'is', b'a', b'bonus', b'.', b'i', 144 | b'also', b'liked', b'how', b'it', b'had', b'some', b'stuff', 145 | b'about', b'the', b'mafia', b'in', b'it', b',', b'not', b'too', 146 | b'much', b'or', b'too', b'little', b',', b'but', b'enough', 147 | b'that', b'it', b'got', b'my', b'attention', b'.', b'the', 148 | b'actors', b'needed', b'to', b'be', b'more', b'handsome', b'.', 149 | b'.', b'.', b'the', b'biggest', b'problem', b'i', b'had', b'was', 150 | b'that', b'nevsky', b'was', b'just', b'too', b'normal', b',', 151 | b'not', b'sexy', b'enough', b'.', b'i', b'think', b'for', b'most', 152 | b'guys', b',', b'sascha', b'will', b'be', b'hot', b'enough', b',', 153 | 
b'but', b'for', b'us', b'ladies', b'that', b'are', b'fans', b'of', 154 | b'action', b',', b'nevsky', b'just', b'doesn', b"'", b't', b'cut', 155 | b'it', b'.', b'overall', b',', b'this', b'movie', b'was', b'fine', 156 | b',', b'i', b'didn', b"'", b't', b'love', b'it', b'nor', b'did', 157 | b'i', b'hate', b'it', b',', b'just', b'found', b'it', b'to', b'be', 158 | b'another', b'normal', b'action', b'flick', b'.'], dtype=object)}, 159 | array([0.], dtype=float32)) 160 | 161 | 162 | 163 | Now, let's build a formal input function that we can pass to the `train()` method of a TensorFlow Estimator object. 164 | 165 | 166 | ```python 167 | # Create an input_fn that parses the tf.Examples from the given files, 168 | # and split them into features and targets. 169 | def _input_fn(input_filenames, num_epochs=None, shuffle=True): 170 | 171 | # Same code as above; create a dataset and map features and labels. 172 | ds = tf.data.TFRecordDataset(input_filenames) 173 | ds = ds.map(_parse_function) 174 | 175 | if shuffle: 176 | ds = ds.shuffle(10000) 177 | 178 | # Our feature data is variable-length, so we pad and batch 179 | # each field of the dataset structure to whatever size is necessary. 180 | ds = ds.padded_batch(25, ds.output_shapes) 181 | 182 | ds = ds.repeat(num_epochs) 183 | 184 | 185 | # Return the next batch of data. 186 | features, labels = ds.make_one_shot_iterator().get_next() 187 | return features, labels 188 | ``` 189 | 190 | ## Task 1: Use a Linear Model with Sparse Inputs and an Explicit Vocabulary 191 | 192 | For our first model, we'll build a [`LinearClassifier`](https://www.tensorflow.org/api_docs/python/tf/estimator/LinearClassifier) model using 50 informative terms; always start simple! 193 | 194 | The following code constructs the feature column for our terms. 
The [`categorical_column_with_vocabulary_list`](https://www.tensorflow.org/api_docs/python/tf/feature_column/categorical_column_with_vocabulary_list) function creates a feature column with the string-to-feature-vector mapping. 195 | 196 | 197 | ```python 198 | # 50 informative terms that compose our model vocabulary 199 | informative_terms = ("bad", "great", "best", "worst", "fun", "beautiful", 200 | "excellent", "poor", "boring", "awful", "terrible", 201 | "definitely", "perfect", "liked", "worse", "waste", 202 | "entertaining", "loved", "unfortunately", "amazing", 203 | "enjoyed", "favorite", "horrible", "brilliant", "highly", 204 | "simple", "annoying", "today", "hilarious", "enjoyable", 205 | "dull", "fantastic", "poorly", "fails", "disappointing", 206 | "disappointment", "not", "him", "her", "good", "time", 207 | "?", ".", "!", "movie", "film", "action", "comedy", 208 | "drama", "family") 209 | 210 | terms_feature_column = tf.feature_column.categorical_column_with_vocabulary_list(key="terms", vocabulary_list=informative_terms) 211 | ``` 212 | 213 | Next, we'll construct the `LinearClassifier`, train it on the training set, and evaluate it on the evaluation set. After you read through the code, run it and see how you do. 
214 | 215 | 216 | ```python 217 | my_optimizer = tf.train.AdagradOptimizer(learning_rate=0.1) 218 | my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0) 219 | 220 | feature_columns = [ terms_feature_column ] 221 | 222 | 223 | classifier = tf.estimator.LinearClassifier( 224 | feature_columns=feature_columns, 225 | optimizer=my_optimizer, 226 | ) 227 | 228 | classifier.train( 229 | input_fn=lambda: _input_fn([train_path]), 230 | steps=1000) 231 | 232 | evaluation_metrics = classifier.evaluate( 233 | input_fn=lambda: _input_fn([train_path]), 234 | steps=1000) 235 | print("Training set metrics:") 236 | for m in evaluation_metrics: 237 | print(m, evaluation_metrics[m]) 238 | print("---") 239 | 240 | evaluation_metrics = classifier.evaluate( 241 | input_fn=lambda: _input_fn([test_path]), 242 | steps=1000) 243 | 244 | print("Test set metrics:") 245 | for m in evaluation_metrics: 246 | print(m, evaluation_metrics[m]) 247 | print("---") 248 | ``` 249 | 250 | Training set metrics: 251 | accuracy 0.78832 252 | accuracy_baseline 0.5 253 | auc 0.8720737 254 | auc_precision_recall 0.8626951 255 | average_loss 0.4504532 256 | label/mean 0.5 257 | loss 11.26133 258 | precision 0.7594299 259 | prediction/mean 0.50741524 260 | recall 0.844 261 | global_step 1000 262 | --- 263 | Test set metrics: 264 | accuracy 0.78552 265 | accuracy_baseline 0.5 266 | auc 0.87036055 267 | auc_precision_recall 0.860967 268 | average_loss 0.4511889 269 | label/mean 0.5 270 | loss 11.279722 271 | precision 0.7580995 272 | prediction/mean 0.5062276 273 | recall 0.83864 274 | global_step 1000 275 | --- 276 | 277 | 278 | ## Task 2: Use a Deep Neural Network (DNN) Model 279 | 280 | The above model is a linear model. It works quite well. But can we do better with a DNN model? 281 | 282 | Let's swap in a [`DNNClassifier`](https://www.tensorflow.org/api_docs/python/tf/estimator/DNNClassifier) for the `LinearClassifier`. Run the following cell, and see how you do. 
283 | 284 | 285 | ```python 286 | ##################### Here's what we changed ################################## 287 | classifier = tf.estimator.DNNClassifier( # 288 | feature_columns=[tf.feature_column.indicator_column(terms_feature_column)], # 289 | hidden_units=[20,20], # 290 | optimizer=my_optimizer, # 291 | ) # 292 | ############################################################################### 293 | 294 | try: 295 | classifier.train( 296 | input_fn=lambda: _input_fn([train_path]), 297 | steps=1000) 298 | 299 | evaluation_metrics = classifier.evaluate( 300 | input_fn=lambda: _input_fn([train_path]), 301 | steps=1) 302 | print("Training set metrics:") 303 | for m in evaluation_metrics: 304 | print(m, evaluation_metrics[m]) 305 | print("---") 306 | 307 | evaluation_metrics = classifier.evaluate( 308 | input_fn=lambda: _input_fn([test_path]), 309 | steps=1) 310 | 311 | print("Test set metrics:") 312 | for m in evaluation_metrics: 313 | print(m, evaluation_metrics[m]) 314 | print("---") 315 | except ValueError as err: 316 | print(err) 317 | ``` 318 | 319 | Training set metrics: 320 | accuracy 0.8 321 | accuracy_baseline 0.52 322 | auc 0.92307687 323 | auc_precision_recall 0.925045 324 | average_loss 0.39349946 325 | label/mean 0.52 326 | loss 9.837486 327 | precision 0.9 328 | prediction/mean 0.42836806 329 | recall 0.6923077 330 | global_step 1000 331 | --- 332 | Test set metrics: 333 | accuracy 0.8 334 | accuracy_baseline 0.52 335 | auc 0.8653846 336 | auc_precision_recall 0.8871613 337 | average_loss 0.45752084 338 | label/mean 0.48 339 | loss 11.438021 340 | precision 0.7692308 341 | prediction/mean 0.54468846 342 | recall 0.8333333 343 | global_step 1000 344 | --- 345 | 346 | 347 | ## Task 3: Use an Embedding with a DNN Model 348 | 349 | In this task, we'll implement our DNN model using an embedding column. An embedding column takes sparse data as input and returns a lower-dimensional dense vector as output. 
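
As a rough illustration of that input/output contract, here is a minimal pure-Python sketch (not TensorFlow's implementation). The four-term vocabulary, the hand-picked table values, and the helper name `embed_terms` are all made up for illustration. The averaging step mirrors the `combiner='mean'` default of `tf.feature_column.embedding_column`; in the real model the table entries are learned during training rather than fixed.

```python
# Illustrative sketch only: a hypothetical 4-term vocabulary and a hand-written
# 4 x 2 embedding table. In the real model the table is a trainable variable.
vocabulary = ["bad", "great", "boring", "fun"]
embedding_table = [
    [-1.0, 0.5],   # "bad"
    [1.0, 0.5],    # "great"
    [-0.5, -1.0],  # "boring"
    [0.5, -1.0],   # "fun"
]


def embed_terms(terms):
    """Maps a variable-length list of terms to one dense 2-d vector."""
    # Sparse input: keep only the ids of in-vocabulary terms (others are dropped).
    ids = [vocabulary.index(t) for t in terms if t in vocabulary]
    if not ids:
        # Returning zeros for an empty input is a choice made for this sketch.
        return [0.0, 0.0]
    # Look up one table row per id and average the rows coordinate-wise.
    rows = [embedding_table[i] for i in ids]
    return [sum(column) / len(rows) for column in zip(*rows)]


dense = embed_terms(["a", "great", "fun", "movie"])
print(dense)
```

However many terms a review contains, the output always has 2 coordinates, which is what lets a fixed-size hidden layer consume it.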
350 | 351 | **NOTE:** *An embedding_column is usually the computationally most efficient option to use for training a model on sparse data. In an [optional section](#scrollTo=XDMlGgRfKSVz) at the end of this exercise, we'll discuss in more depth the implementational differences between using an `embedding_column` and an `indicator_column`, and the tradeoffs of selecting one over the other.* 352 | 353 | In the following code, do the following: 354 | 355 | * Define the feature columns for the model using an `embedding_column` that projects the data into 2 dimensions (see the [TF docs](https://www.tensorflow.org/api_docs/python/tf/feature_column/embedding_column) for more details on the function signature for `embedding_column`). 356 | * Define a `DNNClassifier` with the following specifications: 357 | * Two hidden layers of 20 units each 358 | * Adagrad optimization with a learning rate of 0.1 359 | * A `gradient_clip_norm` of 5.0 360 | 361 | **NOTE:** *In practice, we might project to dimensions higher than 2, like 50 or 100. 
But for now, 2 dimensions are easy to visualize.*

### Hint


```python
# Here's an example code snippet you might use to define the feature columns:

terms_embedding_column = tf.feature_column.embedding_column(terms_feature_column, dimension=2)
feature_columns = [ terms_embedding_column ]
```

### Complete the Code Below


```python
########################## YOUR CODE HERE ######################################
terms_embedding_column = tf.feature_column.embedding_column(categorical_column=terms_feature_column,
                                                            dimension=2) # Define the embedding column
feature_columns = [ terms_embedding_column ] # Define the feature columns

my_optimizer = tf.train.AdagradOptimizer(learning_rate=0.08) # 0.1
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)

# try adam (this replaces the clipped Adagrad optimizer defined just above)
my_optimizer = tf.train.AdamOptimizer(learning_rate=0.01,
                                      beta1=0.9,
                                      beta2=0.999,
                                      epsilon=1e-08)

classifier = tf.estimator.DNNClassifier( # Define the DNNClassifier
  feature_columns=feature_columns,
  hidden_units=[40,20],
  optimizer=my_optimizer,
  batch_norm=True, # was False
)
################################################################################

classifier.train(
  input_fn=lambda: _input_fn([train_path]),
  steps=1000) # was 1000

evaluation_metrics = classifier.evaluate(
  input_fn=lambda: _input_fn([train_path]),
  steps=1000)
print("Training set metrics:")
for m in evaluation_metrics:
  print(m, evaluation_metrics[m])
print("---")

evaluation_metrics = classifier.evaluate(
  input_fn=lambda: _input_fn([test_path]),
  steps=1000)

print("Test set metrics:")
for m in evaluation_metrics:
  print(m, evaluation_metrics[m])
print("---")
```

    Training
set metrics: 422 | accuracy 0.65308 423 | accuracy_baseline 0.5 424 | auc 0.8681754 425 | auc_precision_recall 0.8571652 426 | average_loss 0.6069626 427 | label/mean 0.5 428 | loss 15.174066 429 | precision 0.91588783 430 | prediction/mean 0.42483702 431 | recall 0.33712 432 | global_step 1000 433 | --- 434 | Test set metrics: 435 | accuracy 0.65704 436 | accuracy_baseline 0.5 437 | auc 0.86710733 438 | auc_precision_recall 0.8548789 439 | average_loss 0.6060262 440 | label/mean 0.5 441 | loss 15.150654 442 | precision 0.9151861 443 | prediction/mean 0.42522198 444 | recall 0.34616 445 | global_step 1000 446 | --- 447 | 448 | 449 | ### Solution 450 | 451 | Click below for a solution. 452 | 453 | 454 | ```python 455 | ########################## SOLUTION CODE ######################################## 456 | terms_embedding_column = tf.feature_column.embedding_column(terms_feature_column, dimension=2) 457 | feature_columns = [ terms_embedding_column ] 458 | 459 | my_optimizer = tf.train.AdagradOptimizer(learning_rate=0.1) 460 | my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0) 461 | 462 | classifier = tf.estimator.DNNClassifier( 463 | feature_columns=feature_columns, 464 | hidden_units=[20,20], 465 | optimizer=my_optimizer 466 | ) 467 | ################################################################################# 468 | 469 | classifier.train( 470 | input_fn=lambda: _input_fn([train_path]), 471 | steps=1000) 472 | 473 | evaluation_metrics = classifier.evaluate( 474 | input_fn=lambda: _input_fn([train_path]), 475 | steps=1000) 476 | print("Training set metrics:") 477 | for m in evaluation_metrics: 478 | print(m, evaluation_metrics[m]) 479 | print("---") 480 | 481 | evaluation_metrics = classifier.evaluate( 482 | input_fn=lambda: _input_fn([test_path]), 483 | steps=1000) 484 | 485 | print("Test set metrics:") 486 | for m in evaluation_metrics: 487 | print(m, evaluation_metrics[m]) 488 | print("---") 489 | ``` 490 | 491 | ## Task 4: Convince 
yourself there's actually an embedding in there 492 | 493 | The above model used an `embedding_column`, and it seemed to work, but this doesn't tell us much about what's going on internally. How can we check that the model is actually using an embedding inside? 494 | 495 | To start, let's look at the tensors in the model: 496 | 497 | 498 | ```python 499 | classifier.get_variable_names() 500 | ``` 501 | 502 | 503 | 504 | 505 | ['dnn/hiddenlayer_0/bias', 506 | 'dnn/hiddenlayer_0/bias/t_0/Adagrad', 507 | 'dnn/hiddenlayer_0/kernel', 508 | 'dnn/hiddenlayer_0/kernel/t_0/Adagrad', 509 | 'dnn/hiddenlayer_1/bias', 510 | 'dnn/hiddenlayer_1/bias/t_0/Adagrad', 511 | 'dnn/hiddenlayer_1/kernel', 512 | 'dnn/hiddenlayer_1/kernel/t_0/Adagrad', 513 | 'dnn/input_from_feature_columns/input_layer/terms_embedding/embedding_weights', 514 | 'dnn/input_from_feature_columns/input_layer/terms_embedding/embedding_weights/t_0/Adagrad', 515 | 'dnn/logits/bias', 516 | 'dnn/logits/bias/t_0/Adagrad', 517 | 'dnn/logits/kernel', 518 | 'dnn/logits/kernel/t_0/Adagrad', 519 | 'global_step'] 520 | 521 | 522 | 523 | Okay, we can see that there is an embedding layer in there: `'dnn/input_from_feature_columns/input_layer/terms_embedding/...'`. (What's interesting here, by the way, is that this layer is trainable along with the rest of the model just as any hidden layer is.) 524 | 525 | Is the embedding layer the correct shape? Run the following code to find out. 526 | 527 | **NOTE:** *Remember, in our case, the embedding is a matrix that allows us to project a 50-dimensional vector down to 2 dimensions.* 528 | 529 | 530 | ```python 531 | classifier.get_variable_value('dnn/input_from_feature_columns/input_layer/terms_embedding/embedding_weights').shape 532 | ``` 533 | 534 | 535 | 536 | 537 | (50, 2) 538 | 539 | 540 | 541 | Spend some time manually checking the various layers and shapes to make sure everything is connected the way you would expect it would be. 


```python
# first hidden layer takes 2d to 20d
classifier.get_variable_value('dnn/hiddenlayer_0/kernel').shape
```




    (2, 20)




```python
# second hidden layer takes 20d to 20d
classifier.get_variable_value('dnn/hiddenlayer_1/kernel').shape
```




    (20, 20)




```python
# final layer takes 20d to 1d
classifier.get_variable_value('dnn/logits/kernel').shape
```




    (20, 1)



Robert Hatem:

The shapes make sense, but I'm not sure what the extra `/t_0/Adagrad` means. My best guess is that these are the Adagrad optimizer's per-variable accumulators (its own state, with the same shape as the variable) rather than extra model weights:

```python
'dnn/hiddenlayer_0/kernel',
'dnn/hiddenlayer_0/kernel/t_0/Adagrad',
```

## Task 5: Examine the Embedding

Let's now take a look at the actual embedding space, and see where the terms end up in it. Do the following:
1. Run the following code to see the embedding we trained in **Task 3**. Do things end up where you'd expect?

2. Re-train the model by rerunning the code in **Task 3**, and then run the embedding visualization below again. What stays the same? What changes?

3. Finally, re-train the model again using only 10 steps (which will yield a terrible model). Run the embedding visualization below again. What do you see now, and why?


```python
import numpy as np
import matplotlib.pyplot as plt

embedding_matrix = classifier.get_variable_value('dnn/input_from_feature_columns/input_layer/terms_embedding/embedding_weights')

for term_index in range(len(informative_terms)):
  # Create a one-hot encoding for our term. It has 0s everywhere, except for
  # a single 1 in the coordinate that corresponds to that term.
  term_vector = np.zeros(len(informative_terms))
  term_vector[term_index] = 1
  # We'll now project that one-hot vector into the embedding space.
  embedding_xy = np.matmul(term_vector, embedding_matrix)
  plt.text(embedding_xy[0],
           embedding_xy[1],
           informative_terms[term_index])

# Do a little setup to make sure the plot displays nicely.
plt.rcParams["figure.figsize"] = (15, 15)
plt.xlim(1.2 * embedding_matrix.min(), 1.2 * embedding_matrix.max())
plt.ylim(1.2 * embedding_matrix.min(), 1.2 * embedding_matrix.max())
plt.show()
```


![png](images/output_45_0.png)


Robert Hatem:
### Observations
* Positive words are near each other (e.g. 'excellent' and 'amazing').
* Similarly, negative words are near each other (e.g. 'terrible' and 'word').
* This clustering by meaning is intuitive and desirable for our embeddings.

### After re-training model from Task 3:
* The words still cluster based on meaning (e.g. 'excellent' is near 'perfect').
* The clusters separate along a different direction than before (now toward the top-left and bottom-right corners).
* The clusters are more spread out. I don't know if this is because the word vectors are truly farther apart or if it's just an artifact of the visualization.

### After re-training with only 10 steps:
* Now the embeddings are not clustered nicely. For example, 'excellent' is near 'disappointment' and 'boring' is near 'beautiful'.

## Task 6: Try to improve the model's performance

See if you can refine the model to improve performance. A couple things you may want to try:

* **Changing hyperparameters**, or **using a different optimizer** like Adam (you may only gain one or two accuracy percentage points following these strategies).
* **Adding additional terms to `informative_terms`.** There's a full vocabulary file with all 30,716 terms for this data set that you can use at: https://download.mlcc.google.com/mledu-datasets/sparse-data-embedding/terms.txt You can pick out additional terms from this vocabulary file, or use the whole thing via the `categorical_column_with_vocabulary_file` feature column.

Robert Hatem:

I performed a simple grid search:

| Hidden units | Batch norm | Steps | Optimizer | Learning rate | Test AUC |
| --- | --- | --- | --- | --- | --- |
| [20,20] | False | 1000 | adagrad | 0.1 | 0.8666485 |
| [20,20] | False | 100 | adagrad | 0.1 | 0.71556914 |
| [40,20] | False | 1000 | adagrad | 0.1 | 0.86758274 |
| [40,20] | False | 5000 | adagrad | 0.1 | 0.8707501 |
| [40,20] | False | 10000 | adagrad | 0.1 | 0.8708241 |
| [40,20] | False | 10000 | adagrad | 0.08 | 0.8709081 |
| [40,20] | True | 10000 | adagrad | 0.08 | __0.8709133__ |
| [40,20] | NA | 1000 | adam | 0.001 | 0.7138001 |
| [40,20] | NA | 1000 | adam | 0.0001 | 0.5597429 |
| [40,20] | NA | 1000 | adam | 0.01 | 0.86710733 |

I'm not going to get into regularization on the optimizer.


```python
# Download the vocabulary file.
terms_url = 'https://download.mlcc.google.com/mledu-datasets/sparse-data-embedding/terms.txt'
terms_path = tf.keras.utils.get_file(terms_url.split('/')[-1], terms_url)
```

    Downloading data from https://download.mlcc.google.com/mledu-datasets/sparse-data-embedding/terms.txt
    253952/253538 [==============================] - 0s 0us/step



```python
# Create a feature column from "terms", using a full vocabulary file.
informative_terms = None
with io.open(terms_path, 'r', encoding='utf8') as f:
  # Convert it to a set first to remove duplicates.
686 | informative_terms = list(set(f.read().split())) 687 | 688 | terms_feature_column = tf.feature_column.categorical_column_with_vocabulary_list(key="terms", 689 | vocabulary_list=informative_terms) 690 | 691 | terms_embedding_column = tf.feature_column.embedding_column(terms_feature_column, dimension=2) 692 | feature_columns = [ terms_embedding_column ] 693 | 694 | my_optimizer = tf.train.AdagradOptimizer(learning_rate=0.1) 695 | my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0) 696 | 697 | classifier = tf.estimator.DNNClassifier( 698 | feature_columns=feature_columns, 699 | hidden_units=[10,10], 700 | optimizer=my_optimizer 701 | ) 702 | 703 | classifier.train( 704 | input_fn=lambda: _input_fn([train_path]), 705 | steps=1000) 706 | 707 | evaluation_metrics = classifier.evaluate( 708 | input_fn=lambda: _input_fn([train_path]), 709 | steps=1000) 710 | print("Training set metrics:") 711 | for m in evaluation_metrics: 712 | print(m, evaluation_metrics[m]) 713 | print("---") 714 | 715 | evaluation_metrics = classifier.evaluate( 716 | input_fn=lambda: _input_fn([test_path]), 717 | steps=1000) 718 | 719 | print("Test set metrics:") 720 | for m in evaluation_metrics: 721 | print(m, evaluation_metrics[m]) 722 | print("---") 723 | ``` 724 | 725 | Training set metrics: 726 | accuracy 0.80692 727 | accuracy_baseline 0.5 728 | auc 0.8792404 729 | auc_precision_recall 0.88184667 730 | average_loss 0.45094785 731 | label/mean 0.5 732 | loss 11.273696 733 | precision 0.7966901 734 | prediction/mean 0.5484024 735 | recall 0.82416 736 | global_step 1000 737 | --- 738 | Test set metrics: 739 | accuracy 0.79 740 | accuracy_baseline 0.5 741 | auc 0.8636051 742 | auc_precision_recall 0.86517555 743 | average_loss 0.47325826 744 | label/mean 0.5 745 | loss 11.831456 746 | precision 0.7802691 747 | prediction/mean 0.54887104 748 | recall 0.80736 749 | global_step 1000 750 | --- 751 | 752 | 753 | ## A Final Word 754 | 755 | We may have gotten a DNN solution 
with an embedding that was better than our original linear model, but the linear model was also pretty good and was quite a bit faster to train. Linear models train more quickly because they do not have nearly as many parameters to update or layers to backpropagate through. 756 | 757 | In some applications, the speed of linear models may be a game changer, or linear models may be perfectly sufficient from a quality standpoint. In other areas, the additional model complexity and capacity provided by DNNs might be more important. When defining your model architecture, remember to explore your problem sufficiently so that you know which space you're in. 758 | 759 | ### *Optional Discussion:* Trade-offs between `embedding_column` and `indicator_column` 760 | 761 | Conceptually, when training a `LinearClassifier` or a `DNNClassifier`, there is an adapter needed to use a sparse column. TF provides two options: `embedding_column` or `indicator_column`. 762 | 763 | When training a `LinearClassifier` (as in **Task 1**), an `embedding_column` is used under the hood. As seen in **Task 2**, when training a `DNNClassifier`, you must explicitly choose either `embedding_column` or `indicator_column`. This section discusses the distinction between the two, and the trade-offs of using one over the other, by looking at a simple example. 764 | 765 | Suppose we have sparse data containing the values `"great"`, `"beautiful"`, `"excellent"`. Since the vocabulary size we're using here is $V = 50$, each unit (neuron) in the first layer will have 50 weights. We denote the number of terms in a sparse input using $s$. So for this example sparse data, $s = 3$. For an input layer with $V$ possible values, a hidden layer with $d$ units needs to do a vector-matrix multiply: $(1 \times V) * (V \times d)$. This has $O(V * d)$ computational cost. Note that this cost is proportional to the number of weights in that hidden layer and independent of $s$. 
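The cost argument above can be sketched in plain Python. This is only an illustration, not part of the original tutorial: the value of `d`, the random weights, and the vocabulary indices chosen for the three terms are all assumptions made for the example. It counts the operations of the full $(1 \times V) * (V \times d)$ multiply and compares them with adding up only the rows of the terms that are present.

```python
import random

V = 50   # vocabulary size, as in the text
d = 4    # number of hidden units (hypothetical value chosen for illustration)

random.seed(0)
# Hypothetical first-layer weights: one row of d weights per vocabulary term.
weights = [[random.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(V)]

# Assumed vocabulary indices for "great", "beautiful", "excellent" (s = 3).
present = [1, 5, 6]

# Input vector of length V: 1 for the terms present, 0 for the rest.
x = [0.0] * V
for i in present:
    x[i] = 1.0

def dense_multiply(x, weights):
    """(1 x V) * (V x d) product; returns the result and the multiply-add count."""
    out = [0.0] * len(weights[0])
    ops = 0
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            out[j] += x[i] * w
            ops += 1
    return out, ops

def row_sum(present, weights):
    """Adds only the rows of the terms present; returns the result and the add count."""
    out = [0.0] * len(weights[0])
    ops = 0
    for i in present:
        for j, w in enumerate(weights[i]):
            out[j] += w
            ops += 1
    return out, ops

dense_out, dense_ops = dense_multiply(x, weights)
sparse_out, sparse_ops = row_sum(present, weights)

print(dense_ops)   # V * d multiply-adds, no matter how many terms are present
print(sparse_ops)  # s * d additions
print(all(abs(a - b) < 1e-9 for a, b in zip(dense_out, sparse_out)))
```

The dense path always touches all $V \times d$ weights, while the row-sum path touches only $s \times d$ of them, yet both produce the same vector because every absent term contributes a row multiplied by zero.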
766 | 767 | If the inputs are one-hot encoded (a Boolean vector of length $V$ with a 1 for the terms present and a 0 for the rest) using an [`indicator_column`](https://www.tensorflow.org/api_docs/python/tf/feature_column/indicator_column), this means multiplying and adding a lot of zeros. 768 | 769 | To achieve the exact same result with an [`embedding_column`](https://www.tensorflow.org/api_docs/python/tf/feature_column/embedding_column) of size $d$, we look up and add up just the embeddings corresponding to the three features present in our example input of `"great"`, `"beautiful"`, `"excellent"`: $(1 \times d) + (1 \times d) + (1 \times d)$. Since the weights for the features that are absent are multiplied by zero in the vector-matrix multiply, they do not contribute to the result. Weights for the features that are present are multiplied by 1 in the vector-matrix multiply. Thus, adding the weights obtained via the embedding lookup will lead to the same result as the vector-matrix multiply. 770 | 771 | When using an embedding, computing the embedding lookup is an $O(s * d)$ computation, which is computationally much more efficient than the $O(V * d)$ cost for the `indicator_column` when the data is sparse, that is, when $s$ is much smaller than $V$. (Remember, these embeddings are being learned. In any given training iteration it is the current weights that are being looked up.) 772 | 773 | As we saw in **Task 3**, by using an `embedding_column` in training the `DNNClassifier`, our model learns a low-dimensional representation for the features, where the dot product defines a similarity metric tailored to the desired task. 
In this example, terms that are used similarly in the context of movie reviews (e.g., `"great"` and `"excellent"`) will be closer to each other in the embedding space (i.e., have a large dot product), and terms that are dissimilar (e.g., `"great"` and `"bad"`) will be farther away from each other in the embedding space (i.e., have a small dot product). 774 | -------------------------------------------------------------------------------- /intro_to_sparse_data_and_embeddings.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "colab_type": "text", 7 | "id": "Vvb49tqt7jxm" 8 | }, 9 | "source": [ 10 | "I followed this tutorial from Google Colab. I tried to show which cells I wrote by adding my name at the top of the cell." 11 | ] 12 | }, 13 | { 14 | "cell_type": "markdown", 15 | "metadata": { 16 | "colab_type": "text", 17 | "id": "JndnmDMp66FL" 18 | }, 19 | "source": [ 20 | "#### Copyright 2017 Google LLC." 21 | ] 22 | }, 23 | { 24 | "cell_type": "code", 25 | "execution_count": null, 26 | "metadata": { 27 | "cellView": "both", 28 | "colab": {}, 29 | "colab_type": "code", 30 | "id": "hMqWDc_m6rUC" 31 | }, 32 | "outputs": [], 33 | "source": [ 34 | "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", 35 | "# you may not use this file except in compliance with the License.\n", 36 | "# You may obtain a copy of the License at\n", 37 | "#\n", 38 | "# https://www.apache.org/licenses/LICENSE-2.0\n", 39 | "#\n", 40 | "# Unless required by applicable law or agreed to in writing, software\n", 41 | "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", 42 | "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", 43 | "# See the License for the specific language governing permissions and\n", 44 | "# limitations under the License." 
45 | ] 46 | }, 47 | { 48 | "cell_type": "markdown", 49 | "metadata": { 50 | "colab_type": "text", 51 | "id": "PTaAdgy3LS8W" 52 | }, 53 | "source": [ 54 | "# Intro to Sparse Data and Embeddings\n", 55 | "\n", 56 | "**Learning Objectives:**\n", 57 | "* Convert movie-review string data to a sparse feature vector\n", 58 | "* Implement a sentiment-analysis linear model using a sparse feature vector\n", 59 | "* Implement a sentiment-analysis DNN model using an embedding that projects data into two dimensions\n", 60 | "* Visualize the embedding to see what the model has learned about the relationships between words\n", 61 | "\n", 62 | "In this exercise, we'll explore sparse data and work with embeddings using text data from movie reviews (from the [ACL 2011 IMDB dataset](http://ai.stanford.edu/~amaas/data/sentiment/)). This data has already been processed into `tf.Example` format. " 63 | ] 64 | }, 65 | { 66 | "cell_type": "markdown", 67 | "metadata": { 68 | "colab_type": "text", 69 | "id": "2AKGtmwNosU8" 70 | }, 71 | "source": [ 72 | "## Setup\n", 73 | "\n", 74 | "Let's import our dependencies and download the training and test data. [`tf.keras`](https://www.tensorflow.org/api_docs/python/tf/keras) includes a file download and caching tool that we can use to retrieve the data sets." 
75 | ] 76 | }, 77 | { 78 | "cell_type": "code", 79 | "execution_count": null, 80 | "metadata": { 81 | "colab": {}, 82 | "colab_type": "code", 83 | "id": "jGWqDqFFL_NZ" 84 | }, 85 | "outputs": [], 86 | "source": [ 87 | "from __future__ import print_function\n", 88 | "\n", 89 | "import collections\n", 90 | "import io\n", 91 | "import math\n", 92 | "\n", 93 | "import matplotlib.pyplot as plt\n", 94 | "import numpy as np\n", 95 | "import pandas as pd\n", 96 | "import tensorflow as tf\n", 97 | "from IPython import display\n", 98 | "from sklearn import metrics\n", 99 | "\n", 100 | "tf.logging.set_verbosity(tf.logging.ERROR)\n", 101 | "train_url = 'https://download.mlcc.google.com/mledu-datasets/sparse-data-embedding/train.tfrecord'\n", 102 | "train_path = tf.keras.utils.get_file(train_url.split('/')[-1], train_url)\n", 103 | "test_url = 'https://download.mlcc.google.com/mledu-datasets/sparse-data-embedding/test.tfrecord'\n", 104 | "test_path = tf.keras.utils.get_file(test_url.split('/')[-1], test_url)" 105 | ] 106 | }, 107 | { 108 | "cell_type": "markdown", 109 | "metadata": { 110 | "colab_type": "text", 111 | "id": "6W7aZ9qspZVj" 112 | }, 113 | "source": [ 114 | "## Building a Sentiment Analysis Model" 115 | ] 116 | }, 117 | { 118 | "cell_type": "markdown", 119 | "metadata": { 120 | "colab_type": "text", 121 | "id": "jieA0k_NLS8a" 122 | }, 123 | "source": [ 124 | "Let's train a sentiment-analysis model on this data that predicts if a review is generally *favorable* (label of 1) or *unfavorable* (label of 0).\n", 125 | "\n", 126 | "To do so, we'll turn our string-value `terms` into feature vectors by using a *vocabulary*, a list of each term we expect to see in our data. For the purposes of this exercise, we've created a small vocabulary that focuses on a limited set of terms. 
Most of these terms were found to be strongly indicative of *favorable* or *unfavorable*, but some were just added because they're interesting.\n", 127 | "\n", 128 | "Each term in the vocabulary is mapped to a coordinate in our feature vector. To convert the string-value `terms` for an example into this vector format, we encode such that each coordinate gets a value of 0 if the vocabulary term does not appear in the example string, and a value of 1 if it does. Terms in an example that don't appear in the vocabulary are thrown away." 129 | ] 130 | }, 131 | { 132 | "cell_type": "markdown", 133 | "metadata": { 134 | "colab_type": "text", 135 | "id": "2HSfklfnLS8b" 136 | }, 137 | "source": [ 138 | "**NOTE:** *We could of course use a larger vocabulary, and there are special tools for creating these. In addition, instead of just dropping terms that are not in the vocabulary, we can introduce a small number of OOV (out-of-vocabulary) buckets to which you can hash the terms not in the vocabulary. We can also use a __feature hashing__ approach that hashes each term, instead of creating an explicit vocabulary. This works well in practice, but loses interpretability, which is useful for this exercise. See the tf.feature_column module for tools handling this.*" 139 | ] 140 | }, 141 | { 142 | "cell_type": "markdown", 143 | "metadata": { 144 | "colab_type": "text", 145 | "id": "Uvoa2HyDtgqe" 146 | }, 147 | "source": [ 148 | "## Building the Input Pipeline" 149 | ] 150 | }, 151 | { 152 | "cell_type": "markdown", 153 | "metadata": { 154 | "colab_type": "text", 155 | "id": "O20vMEOurDol" 156 | }, 157 | "source": [ 158 | "First, let's configure the input pipeline to import our data into a TensorFlow model. We can use the following function to parse the training and test data (which is in [TFRecord](https://www.tensorflow.org/guide/datasets#consuming_tfrecord_data) format) and return a dict of the features and the corresponding labels." 
159 | ] 160 | }, 161 | { 162 | "cell_type": "code", 163 | "execution_count": null, 164 | "metadata": { 165 | "colab": {}, 166 | "colab_type": "code", 167 | "id": "SxxNIEniPq2z" 168 | }, 169 | "outputs": [], 170 | "source": [ 171 | "def _parse_function(record):\n", 172 | " \"\"\"Extracts features and labels.\n", 173 | " \n", 174 | " Args:\n", 175 | " record: A serialized `tf.Example` read from a TFRecord file\n", 176 | " Returns:\n", 177 | " A `tuple` `(features, labels)`:\n", 178 | " features: A dict of tensors representing the features\n", 179 | " labels: A tensor with the corresponding labels.\n", 180 | " \"\"\"\n", 181 | " features = {\n", 182 | " \"terms\": tf.VarLenFeature(dtype=tf.string), # terms are strings of varying lengths\n", 183 | " \"labels\": tf.FixedLenFeature(shape=[1], dtype=tf.float32) # labels are 0 or 1\n", 184 | " }\n", 185 | " \n", 186 | " parsed_features = tf.parse_single_example(record, features)\n", 187 | " \n", 188 | " terms = parsed_features['terms'].values\n", 189 | " labels = parsed_features['labels']\n", 190 | "\n", 191 | " return {'terms':terms}, labels" 192 | ] 193 | }, 194 | { 195 | "cell_type": "markdown", 196 | "metadata": { 197 | "colab_type": "text", 198 | "id": "SXhTeeYMrp-l" 199 | }, 200 | "source": [ 201 | "To confirm our function is working as expected, let's construct a `TFRecordDataset` for the training data, and map the data to features and labels using the function above." 
202 | ] 203 | }, 204 | { 205 | "cell_type": "code", 206 | "execution_count": null, 207 | "metadata": { 208 | "colab": { 209 | "base_uri": "https://localhost:8080/", 210 | "height": 284 211 | }, 212 | "colab_type": "code", 213 | "id": "oF4YWXR0Omt0", 214 | "outputId": "5e243004-2d48-4463-9370-2e78bb74215f" 215 | }, 216 | "outputs": [ 217 | { 218 | "data": { 219 | "text/plain": [ 220 | "" 221 | ] 222 | }, 223 | "execution_count": 9, 224 | "metadata": { 225 | "tags": [] 226 | }, 227 | "output_type": "execute_result" 228 | } 229 | ], 230 | "source": [ 231 | "# Create the Dataset object.\n", 232 | "ds = tf.data.TFRecordDataset(train_path)\n", 233 | "# Map features and labels with the parse function.\n", 234 | "ds = ds.map(_parse_function)\n", 235 | "\n", 236 | "ds" 237 | ] 238 | }, 239 | { 240 | "cell_type": "markdown", 241 | "metadata": { 242 | "colab_type": "text", 243 | "id": "bUoMvK-9tVXP" 244 | }, 245 | "source": [ 246 | "Run the following cell to retrieve the first example from the training data set." 
247 | ] 248 | }, 249 | { 250 | "cell_type": "code", 251 | "execution_count": null, 252 | "metadata": { 253 | "colab": { 254 | "base_uri": "https://localhost:8080/", 255 | "height": 840 256 | }, 257 | "colab_type": "code", 258 | "id": "Z6QE2DWRUc4E", 259 | "outputId": "86a033b1-e02a-44f1-96a5-06cce0037f9e" 260 | }, 261 | "outputs": [ 262 | { 263 | "data": { 264 | "text/plain": [ 265 | "({'terms': array([b'but', b'it', b'does', b'have', b'some', b'good', b'action',\n", 266 | " b'and', b'a', b'plot', b'that', b'is', b'somewhat', b'interesting',\n", 267 | " b'.', b'nevsky', b'acts', b'like', b'a', b'body', b'builder',\n", 268 | " b'and', b'he', b'isn', b\"'\", b't', b'all', b'that', b'attractive',\n", 269 | " b',', b'in', b'fact', b',', b'imo', b',', b'he', b'is', b'ugly',\n", 270 | " b'.', b'(', b'his', b'acting', b'skills', b'lack', b'everything',\n", 271 | " b'!', b')', b'sascha', b'is', b'played', b'very', b'well', b'by',\n", 272 | " b'joanna', b'pacula', b',', b'but', b'she', b'needed', b'more',\n", 273 | " b'lines', b'than', b'she', b'was', b'given', b',', b'her',\n", 274 | " b'character', b'needed', b'to', b'be', b'developed', b'.',\n", 275 | " b'there', b'are', b'way', b'too', b'many', b'men', b'in', b'this',\n", 276 | " b'story', b',', b'there', b'is', b'zero', b'romance', b',', b'too',\n", 277 | " b'much', b'action', b',', b'and', b'way', b'too', b'dumb', b'of',\n", 278 | " b'an', b'ending', b'.', b'it', b'is', b'very', b'violent', b'.',\n", 279 | " b'i', b'did', b'however', b'love', b'the', b'scenery', b',',\n", 280 | " b'this', b'movie', b'takes', b'you', b'all', b'over', b'the',\n", 281 | " b'world', b',', b'and', b'that', b'is', b'a', b'bonus', b'.', b'i',\n", 282 | " b'also', b'liked', b'how', b'it', b'had', b'some', b'stuff',\n", 283 | " b'about', b'the', b'mafia', b'in', b'it', b',', b'not', b'too',\n", 284 | " b'much', b'or', b'too', b'little', b',', b'but', b'enough',\n", 285 | " b'that', b'it', b'got', b'my', b'attention', b'.', b'the',\n", 286 | " 
b'actors', b'needed', b'to', b'be', b'more', b'handsome', b'.',\n", 287 | " b'.', b'.', b'the', b'biggest', b'problem', b'i', b'had', b'was',\n", 288 | " b'that', b'nevsky', b'was', b'just', b'too', b'normal', b',',\n", 289 | " b'not', b'sexy', b'enough', b'.', b'i', b'think', b'for', b'most',\n", 290 | " b'guys', b',', b'sascha', b'will', b'be', b'hot', b'enough', b',',\n", 291 | " b'but', b'for', b'us', b'ladies', b'that', b'are', b'fans', b'of',\n", 292 | " b'action', b',', b'nevsky', b'just', b'doesn', b\"'\", b't', b'cut',\n", 293 | " b'it', b'.', b'overall', b',', b'this', b'movie', b'was', b'fine',\n", 294 | " b',', b'i', b'didn', b\"'\", b't', b'love', b'it', b'nor', b'did',\n", 295 | " b'i', b'hate', b'it', b',', b'just', b'found', b'it', b'to', b'be',\n", 296 | " b'another', b'normal', b'action', b'flick', b'.'], dtype=object)},\n", 297 | " array([0.], dtype=float32))" 298 | ] 299 | }, 300 | "execution_count": 10, 301 | "metadata": { 302 | "tags": [] 303 | }, 304 | "output_type": "execute_result" 305 | } 306 | ], 307 | "source": [ 308 | "n = ds.make_one_shot_iterator().get_next()\n", 309 | "sess = tf.Session()\n", 310 | "sess.run(n)" 311 | ] 312 | }, 313 | { 314 | "cell_type": "markdown", 315 | "metadata": { 316 | "colab_type": "text", 317 | "id": "jBU39UeFty9S" 318 | }, 319 | "source": [ 320 | "Now, let's build a formal input function that we can pass to the `train()` method of a TensorFlow Estimator object." 
321 | ] 322 | }, 323 | { 324 | "cell_type": "code", 325 | "execution_count": null, 326 | "metadata": { 327 | "colab": {}, 328 | "colab_type": "code", 329 | "id": "5_C5-ueNYIn_" 330 | }, 331 | "outputs": [], 332 | "source": [ 333 | "# Create an input_fn that parses the tf.Examples from the given files,\n", 334 | "# and split them into features and targets.\n", 335 | "def _input_fn(input_filenames, num_epochs=None, shuffle=True):\n", 336 | " \n", 337 | " # Same code as above; create a dataset and map features and labels.\n", 338 | " ds = tf.data.TFRecordDataset(input_filenames)\n", 339 | " ds = ds.map(_parse_function)\n", 340 | "\n", 341 | " if shuffle:\n", 342 | " ds = ds.shuffle(10000)\n", 343 | "\n", 344 | " # Our feature data is variable-length, so we pad and batch\n", 345 | " # each field of the dataset structure to whatever size is necessary.\n", 346 | " ds = ds.padded_batch(25, ds.output_shapes)\n", 347 | " \n", 348 | " ds = ds.repeat(num_epochs)\n", 349 | "\n", 350 | " \n", 351 | " # Return the next batch of data.\n", 352 | " features, labels = ds.make_one_shot_iterator().get_next()\n", 353 | " return features, labels" 354 | ] 355 | }, 356 | { 357 | "cell_type": "markdown", 358 | "metadata": { 359 | "colab_type": "text", 360 | "id": "Y170tVlrLS8c" 361 | }, 362 | "source": [ 363 | "## Task 1: Use a Linear Model with Sparse Inputs and an Explicit Vocabulary\n", 364 | "\n", 365 | "For our first model, we'll build a [`LinearClassifier`](https://www.tensorflow.org/api_docs/python/tf/estimator/LinearClassifier) model using 50 informative terms; always start simple!\n", 366 | "\n", 367 | "The following code constructs the feature column for our terms. The [`categorical_column_with_vocabulary_list`](https://www.tensorflow.org/api_docs/python/tf/feature_column/categorical_column_with_vocabulary_list) function creates a feature column with the string-to-feature-vector mapping." 
368 | ] 369 | }, 370 | { 371 | "cell_type": "code", 372 | "execution_count": null, 373 | "metadata": { 374 | "colab": {}, 375 | "colab_type": "code", 376 | "id": "B5gdxuWsvPcx" 377 | }, 378 | "outputs": [], 379 | "source": [ 380 | "# 50 informative terms that compose our model vocabulary \n", 381 | "informative_terms = (\"bad\", \"great\", \"best\", \"worst\", \"fun\", \"beautiful\",\n", 382 | " \"excellent\", \"poor\", \"boring\", \"awful\", \"terrible\",\n", 383 | " \"definitely\", \"perfect\", \"liked\", \"worse\", \"waste\",\n", 384 | " \"entertaining\", \"loved\", \"unfortunately\", \"amazing\",\n", 385 | " \"enjoyed\", \"favorite\", \"horrible\", \"brilliant\", \"highly\",\n", 386 | " \"simple\", \"annoying\", \"today\", \"hilarious\", \"enjoyable\",\n", 387 | " \"dull\", \"fantastic\", \"poorly\", \"fails\", \"disappointing\",\n", 388 | " \"disappointment\", \"not\", \"him\", \"her\", \"good\", \"time\",\n", 389 | " \"?\", \".\", \"!\", \"movie\", \"film\", \"action\", \"comedy\",\n", 390 | " \"drama\", \"family\")\n", 391 | "\n", 392 | "terms_feature_column = tf.feature_column.categorical_column_with_vocabulary_list(key=\"terms\", vocabulary_list=informative_terms)" 393 | ] 394 | }, 395 | { 396 | "cell_type": "markdown", 397 | "metadata": { 398 | "colab_type": "text", 399 | "id": "eTiDwyorwd3P" 400 | }, 401 | "source": [ 402 | "Next, we'll construct the `LinearClassifier`, train it on the training set, and evaluate it on the evaluation set. After you read through the code, run it and see how you do." 
403 | ] 404 | }, 405 | { 406 | "cell_type": "code", 407 | "execution_count": null, 408 | "metadata": { 409 | "colab": { 410 | "base_uri": "https://localhost:8080/", 411 | "height": 736 412 | }, 413 | "colab_type": "code", 414 | "id": "HYKKpGLqLS8d", 415 | "outputId": "0c015429-952c-41d9-b5d8-126415e19e86" 416 | }, 417 | "outputs": [ 418 | { 419 | "name": "stdout", 420 | "output_type": "stream", 421 | "text": [ 422 | "Training set metrics:\n", 423 | "accuracy 0.78832\n", 424 | "accuracy_baseline 0.5\n", 425 | "auc 0.8720737\n", 426 | "auc_precision_recall 0.8626951\n", 427 | "average_loss 0.4504532\n", 428 | "label/mean 0.5\n", 429 | "loss 11.26133\n", 430 | "precision 0.7594299\n", 431 | "prediction/mean 0.50741524\n", 432 | "recall 0.844\n", 433 | "global_step 1000\n", 434 | "---\n", 435 | "Test set metrics:\n", 436 | "accuracy 0.78552\n", 437 | "accuracy_baseline 0.5\n", 438 | "auc 0.87036055\n", 439 | "auc_precision_recall 0.860967\n", 440 | "average_loss 0.4511889\n", 441 | "label/mean 0.5\n", 442 | "loss 11.279722\n", 443 | "precision 0.7580995\n", 444 | "prediction/mean 0.5062276\n", 445 | "recall 0.83864\n", 446 | "global_step 1000\n", 447 | "---\n" 448 | ] 449 | } 450 | ], 451 | "source": [ 452 | "my_optimizer = tf.train.AdagradOptimizer(learning_rate=0.1)\n", 453 | "my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)\n", 454 | "\n", 455 | "feature_columns = [ terms_feature_column ]\n", 456 | "\n", 457 | "\n", 458 | "classifier = tf.estimator.LinearClassifier(\n", 459 | " feature_columns=feature_columns,\n", 460 | " optimizer=my_optimizer,\n", 461 | ")\n", 462 | "\n", 463 | "classifier.train(\n", 464 | " input_fn=lambda: _input_fn([train_path]),\n", 465 | " steps=1000)\n", 466 | "\n", 467 | "evaluation_metrics = classifier.evaluate(\n", 468 | " input_fn=lambda: _input_fn([train_path]),\n", 469 | " steps=1000)\n", 470 | "print(\"Training set metrics:\")\n", 471 | "for m in evaluation_metrics:\n", 472 | " print(m, 
evaluation_metrics[m])\n", 473 | "print(\"---\")\n", 474 | "\n", 475 | "evaluation_metrics = classifier.evaluate(\n", 476 | " input_fn=lambda: _input_fn([test_path]),\n", 477 | " steps=1000)\n", 478 | "\n", 479 | "print(\"Test set metrics:\")\n", 480 | "for m in evaluation_metrics:\n", 481 | " print(m, evaluation_metrics[m])\n", 482 | "print(\"---\")" 483 | ] 484 | }, 485 | { 486 | "cell_type": "markdown", 487 | "metadata": { 488 | "colab_type": "text", 489 | "id": "J0ubn9gULS8g" 490 | }, 491 | "source": [ 492 | "## Task 2: Use a Deep Neural Network (DNN) Model\n", 493 | "\n", 494 | "The above model is a linear model. It works quite well. But can we do better with a DNN model?\n", 495 | "\n", 496 | "Let's swap in a [`DNNClassifier`](https://www.tensorflow.org/api_docs/python/tf/estimator/DNNClassifier) for the `LinearClassifier`. Run the following cell, and see how you do." 497 | ] 498 | }, 499 | { 500 | "cell_type": "code", 501 | "execution_count": null, 502 | "metadata": { 503 | "colab": { 504 | "base_uri": "https://localhost:8080/", 505 | "height": 736 506 | }, 507 | "colab_type": "code", 508 | "id": "jcgOPfEALS8h", 509 | "outputId": "89b825e4-2978-4413-9bd9-6e680f7910be" 510 | }, 511 | "outputs": [ 512 | { 513 | "name": "stdout", 514 | "output_type": "stream", 515 | "text": [ 516 | "Training set metrics:\n", 517 | "accuracy 0.8\n", 518 | "accuracy_baseline 0.52\n", 519 | "auc 0.92307687\n", 520 | "auc_precision_recall 0.925045\n", 521 | "average_loss 0.39349946\n", 522 | "label/mean 0.52\n", 523 | "loss 9.837486\n", 524 | "precision 0.9\n", 525 | "prediction/mean 0.42836806\n", 526 | "recall 0.6923077\n", 527 | "global_step 1000\n", 528 | "---\n", 529 | "Test set metrics:\n", 530 | "accuracy 0.8\n", 531 | "accuracy_baseline 0.52\n", 532 | "auc 0.8653846\n", 533 | "auc_precision_recall 0.8871613\n", 534 | "average_loss 0.45752084\n", 535 | "label/mean 0.48\n", 536 | "loss 11.438021\n", 537 | "precision 0.7692308\n", 538 | "prediction/mean 0.54468846\n", 539 | 
"recall 0.8333333\n", 540 | "global_step 1000\n", 541 | "---\n" 542 | ] 543 | } 544 | ], 545 | "source": [ 546 | "##################### Here's what we changed ##################################\n", 547 | "classifier = tf.estimator.DNNClassifier( #\n", 548 | " feature_columns=[tf.feature_column.indicator_column(terms_feature_column)], #\n", 549 | " hidden_units=[20,20], #\n", 550 | " optimizer=my_optimizer, #\n", 551 | ") #\n", 552 | "###############################################################################\n", 553 | "\n", 554 | "try:\n", 555 | " classifier.train(\n", 556 | " input_fn=lambda: _input_fn([train_path]),\n", 557 | " steps=1000)\n", 558 | "\n", 559 | " evaluation_metrics = classifier.evaluate(\n", 560 | " input_fn=lambda: _input_fn([train_path]),\n", 561 | " steps=1)\n", 562 | " print(\"Training set metrics:\")\n", 563 | " for m in evaluation_metrics:\n", 564 | " print(m, evaluation_metrics[m])\n", 565 | " print(\"---\")\n", 566 | "\n", 567 | " evaluation_metrics = classifier.evaluate(\n", 568 | " input_fn=lambda: _input_fn([test_path]),\n", 569 | " steps=1)\n", 570 | "\n", 571 | " print(\"Test set metrics:\")\n", 572 | " for m in evaluation_metrics:\n", 573 | " print(m, evaluation_metrics[m])\n", 574 | " print(\"---\")\n", 575 | "except ValueError as err:\n", 576 | " print(err)" 577 | ] 578 | }, 579 | { 580 | "cell_type": "markdown", 581 | "metadata": { 582 | "colab_type": "text", 583 | "id": "cZz68luxLS8j" 584 | }, 585 | "source": [ 586 | "## Task 3: Use an Embedding with a DNN Model\n", 587 | "\n", 588 | "In this task, we'll implement our DNN model using an embedding column. An embedding column takes sparse data as input and returns a lower-dimensional dense vector as output." 
589 | ] 590 | }, 591 | { 592 | "cell_type": "markdown", 593 | "metadata": { 594 | "colab_type": "text", 595 | "id": "AliRzhvJLS8k" 596 | }, 597 | "source": [ 598 | "**NOTE:** *An embedding_column is usually the computationally most efficient option to use for training a model on sparse data. In an [optional section](#scrollTo=XDMlGgRfKSVz) at the end of this exercise, we'll discuss in more depth the implementational differences between using an `embedding_column` and an `indicator_column`, and the tradeoffs of selecting one over the other.*" 599 | ] 600 | }, 601 | { 602 | "cell_type": "markdown", 603 | "metadata": { 604 | "colab_type": "text", 605 | "id": "F-as3PtALS8l" 606 | }, 607 | "source": [ 608 | "In the following code, do the following:\n", 609 | "\n", 610 | "* Define the feature columns for the model using an `embedding_column` that projects the data into 2 dimensions (see the [TF docs](https://www.tensorflow.org/api_docs/python/tf/feature_column/embedding_column) for more details on the function signature for `embedding_column`).\n", 611 | "* Define a `DNNClassifier` with the following specifications:\n", 612 | " * Two hidden layers of 20 units each\n", 613 | " * Adagrad optimization with a learning rate of 0.1\n", 614 | " * A `gradient_clip_norm` of 5.0" 615 | ] 616 | }, 617 | { 618 | "cell_type": "markdown", 619 | "metadata": { 620 | "colab_type": "text", 621 | "id": "UlPZ-Q9bLS8m" 622 | }, 623 | "source": [ 624 | "**NOTE:** *In practice, we might project to dimensions higher than 2, like 50 or 100. 
But for now, 2 dimensions is easy to visualize.*" 625 | ] 626 | }, 627 | { 628 | "cell_type": "markdown", 629 | "metadata": { 630 | "colab_type": "text", 631 | "id": "mNCLhxsXyOIS" 632 | }, 633 | "source": [ 634 | "### Hint" 635 | ] 636 | }, 637 | { 638 | "cell_type": "code", 639 | "execution_count": null, 640 | "metadata": { 641 | "colab": {}, 642 | "colab_type": "code", 643 | "id": "L67xYD7hLS8m" 644 | }, 645 | "outputs": [], 646 | "source": [ 647 | "# Here's an example code snippet you might use to define the feature columns:\n", 648 | "\n", 649 | "terms_embedding_column = tf.feature_column.embedding_column(terms_feature_column, dimension=2)\n", 650 | "feature_columns = [ terms_embedding_column ]" 651 | ] 652 | }, 653 | { 654 | "cell_type": "markdown", 655 | "metadata": { 656 | "colab_type": "text", 657 | "id": "iv1UBsJxyV37" 658 | }, 659 | "source": [ 660 | "### Complete the Code Below" 661 | ] 662 | }, 663 | { 664 | "cell_type": "code", 665 | "execution_count": null, 666 | "metadata": { 667 | "colab": { 668 | "base_uri": "https://localhost:8080/", 669 | "height": 476 670 | }, 671 | "colab_type": "code", 672 | "id": "5PG_yhNGLS8u", 673 | "outputId": "d56941fc-2e4a-4be1-9a86-09a51a802eee" 674 | }, 675 | "outputs": [ 676 | { 677 | "name": "stdout", 678 | "output_type": "stream", 679 | "text": [ 680 | "Training set metrics:\n", 681 | "accuracy 0.65308\n", 682 | "accuracy_baseline 0.5\n", 683 | "auc 0.8681754\n", 684 | "auc_precision_recall 0.8571652\n", 685 | "average_loss 0.6069626\n", 686 | "label/mean 0.5\n", 687 | "loss 15.174066\n", 688 | "precision 0.91588783\n", 689 | "prediction/mean 0.42483702\n", 690 | "recall 0.33712\n", 691 | "global_step 1000\n", 692 | "---\n", 693 | "Test set metrics:\n", 694 | "accuracy 0.65704\n", 695 | "accuracy_baseline 0.5\n", 696 | "auc 0.86710733\n", 697 | "auc_precision_recall 0.8548789\n", 698 | "average_loss 0.6060262\n", 699 | "label/mean 0.5\n", 700 | "loss 15.150654\n", 701 | "precision 0.9151861\n", 702 | 
"prediction/mean 0.42522198\n", 703 | "recall 0.34616\n", 704 | "global_step 1000\n", 705 | "---\n" 706 | ] 707 | } 708 | ], 709 | "source": [ 710 | "########################## YOUR CODE HERE ######################################\n", 711 | "terms_embedding_column = tf.feature_column.embedding_column(categorical_column=terms_feature_column, \n", 712 | " dimension=2) # Define the embedding column\n", 713 | "feature_columns = [ terms_embedding_column ] # Define the feature columns\n", 714 | "\n", 715 | "my_optimizer = tf.train.AdagradOptimizer(learning_rate=0.08) # 0.1\n", 716 | "my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)\n", 717 | "\n", 718 | "# try adam\n", 719 | "my_optimizer = tf.train.AdamOptimizer(learning_rate=0.01,\n", 720 | " beta1=0.9,\n", 721 | " beta2=0.999,\n", 722 | " epsilon=1e-08)\n", 723 | "\n", 724 | "classifier = tf.estimator.DNNClassifier( # # Define the DNNClassifier\n", 725 | " feature_columns=feature_columns, #\n", 726 | " hidden_units=[40,20], #\n", 727 | " optimizer=my_optimizer, #\n", 728 | " batch_norm=True, # was False\n", 729 | ")\n", 730 | "################################################################################\n", 731 | "\n", 732 | "classifier.train(\n", 733 | " input_fn=lambda: _input_fn([train_path]),\n", 734 | " steps=1000) # was 1000\n", 735 | "\n", 736 | "evaluation_metrics = classifier.evaluate(\n", 737 | " input_fn=lambda: _input_fn([train_path]),\n", 738 | " steps=1000)\n", 739 | "print(\"Training set metrics:\")\n", 740 | "for m in evaluation_metrics:\n", 741 | " print(m, evaluation_metrics[m])\n", 742 | "print(\"---\")\n", 743 | "\n", 744 | "evaluation_metrics = classifier.evaluate(\n", 745 | " input_fn=lambda: _input_fn([test_path]),\n", 746 | " steps=1000)\n", 747 | "\n", 748 | "print(\"Test set metrics:\")\n", 749 | "for m in evaluation_metrics:\n", 750 | " print(m, evaluation_metrics[m])\n", 751 | "print(\"---\")" 752 | ] 753 | }, 754 | { 755 | "cell_type": "markdown", 756 | 
"metadata": { 757 | "colab_type": "text", 758 | "id": "eQS5KQzBybTY" 759 | }, 760 | "source": [ 761 | "### Solution\n", 762 | "\n", 763 | "Click below for a solution." 764 | ] 765 | }, 766 | { 767 | "cell_type": "code", 768 | "execution_count": null, 769 | "metadata": { 770 | "colab": {}, 771 | "colab_type": "code", 772 | "id": "R5xOdYeQydi5" 773 | }, 774 | "outputs": [], 775 | "source": [ 776 | "########################## SOLUTION CODE ########################################\n", 777 | "terms_embedding_column = tf.feature_column.embedding_column(terms_feature_column, dimension=2)\n", 778 | "feature_columns = [ terms_embedding_column ]\n", 779 | "\n", 780 | "my_optimizer = tf.train.AdagradOptimizer(learning_rate=0.1)\n", 781 | "my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)\n", 782 | "\n", 783 | "classifier = tf.estimator.DNNClassifier(\n", 784 | " feature_columns=feature_columns,\n", 785 | " hidden_units=[20,20],\n", 786 | " optimizer=my_optimizer\n", 787 | ")\n", 788 | "#################################################################################\n", 789 | "\n", 790 | "classifier.train(\n", 791 | " input_fn=lambda: _input_fn([train_path]),\n", 792 | " steps=1000)\n", 793 | "\n", 794 | "evaluation_metrics = classifier.evaluate(\n", 795 | " input_fn=lambda: _input_fn([train_path]),\n", 796 | " steps=1000)\n", 797 | "print(\"Training set metrics:\")\n", 798 | "for m in evaluation_metrics:\n", 799 | " print(m, evaluation_metrics[m])\n", 800 | "print(\"---\")\n", 801 | "\n", 802 | "evaluation_metrics = classifier.evaluate(\n", 803 | " input_fn=lambda: _input_fn([test_path]),\n", 804 | " steps=1000)\n", 805 | "\n", 806 | "print(\"Test set metrics:\")\n", 807 | "for m in evaluation_metrics:\n", 808 | " print(m, evaluation_metrics[m])\n", 809 | "print(\"---\")" 810 | ] 811 | }, 812 | { 813 | "cell_type": "markdown", 814 | "metadata": { 815 | "colab_type": "text", 816 | "id": "aiHnnVtzLS8w" 817 | }, 818 | "source": [ 819 | "## Task 4: 
Convince yourself there's actually an embedding in there\n", 820 | "\n", 821 | "The above model used an `embedding_column`, and it seemed to work, but this doesn't tell us much about what's going on internally. How can we check that the model is actually using an embedding inside?\n", 822 | "\n", 823 | "To start, let's look at the tensors in the model:" 824 | ] 825 | }, 826 | { 827 | "cell_type": "code", 828 | "execution_count": null, 829 | "metadata": { 830 | "colab": { 831 | "base_uri": "https://localhost:8080/", 832 | "height": 509 833 | }, 834 | "colab_type": "code", 835 | "id": "h1jNgLdQLS8w", 836 | "outputId": "93c7d758-2fcf-4052-be5a-6a73856f1d7d" 837 | }, 838 | "outputs": [ 839 | { 840 | "data": { 841 | "text/plain": [ 842 | "['dnn/hiddenlayer_0/bias',\n", 843 | " 'dnn/hiddenlayer_0/bias/t_0/Adagrad',\n", 844 | " 'dnn/hiddenlayer_0/kernel',\n", 845 | " 'dnn/hiddenlayer_0/kernel/t_0/Adagrad',\n", 846 | " 'dnn/hiddenlayer_1/bias',\n", 847 | " 'dnn/hiddenlayer_1/bias/t_0/Adagrad',\n", 848 | " 'dnn/hiddenlayer_1/kernel',\n", 849 | " 'dnn/hiddenlayer_1/kernel/t_0/Adagrad',\n", 850 | " 'dnn/input_from_feature_columns/input_layer/terms_embedding/embedding_weights',\n", 851 | " 'dnn/input_from_feature_columns/input_layer/terms_embedding/embedding_weights/t_0/Adagrad',\n", 852 | " 'dnn/logits/bias',\n", 853 | " 'dnn/logits/bias/t_0/Adagrad',\n", 854 | " 'dnn/logits/kernel',\n", 855 | " 'dnn/logits/kernel/t_0/Adagrad',\n", 856 | " 'global_step']" 857 | ] 858 | }, 859 | "execution_count": 30, 860 | "metadata": { 861 | "tags": [] 862 | }, 863 | "output_type": "execute_result" 864 | } 865 | ], 866 | "source": [ 867 | "classifier.get_variable_names()" 868 | ] 869 | }, 870 | { 871 | "cell_type": "markdown", 872 | "metadata": { 873 | "colab_type": "text", 874 | "id": "Sl4-VctMLS8z" 875 | }, 876 | "source": [ 877 | "Okay, we can see that there is an embedding layer in there: `'dnn/input_from_feature_columns/input_layer/terms_embedding/...'`. 
(What's interesting here, by the way, is that this layer is trainable along with the rest of the model just as any hidden layer is.)\n", 878 | "\n", 879 | "Is the embedding layer the correct shape? Run the following code to find out." 880 | ] 881 | }, 882 | { 883 | "cell_type": "markdown", 884 | "metadata": { 885 | "colab_type": "text", 886 | "id": "JNFxyQUiLS80" 887 | }, 888 | "source": [ 889 | "**NOTE:** *Remember, in our case, the embedding is a matrix that allows us to project a 50-dimensional vector down to 2 dimensions.*" 890 | ] 891 | }, 892 | { 893 | "cell_type": "code", 894 | "execution_count": null, 895 | "metadata": { 896 | "colab": { 897 | "base_uri": "https://localhost:8080/", 898 | "height": 136 899 | }, 900 | "colab_type": "code", 901 | "id": "1xMbpcEjLS80", 902 | "outputId": "8c1a0845-d0b7-47ed-c760-4bb2ab5d011a" 903 | }, 904 | "outputs": [ 905 | { 906 | "data": { 907 | "text/plain": [ 908 | "(50, 2)" 909 | ] 910 | }, 911 | "execution_count": 48, 912 | "metadata": { 913 | "tags": [] 914 | }, 915 | "output_type": "execute_result" 916 | } 917 | ], 918 | "source": [ 919 | "classifier.get_variable_value('dnn/input_from_feature_columns/input_layer/terms_embedding/embedding_weights').shape" 920 | ] 921 | }, 922 | { 923 | "cell_type": "markdown", 924 | "metadata": { 925 | "colab_type": "text", 926 | "id": "MnLCIogjLS82" 927 | }, 928 | "source": [ 929 | "Spend some time manually checking the various layers and shapes to make sure everything is connected the way you would expect."
930 | ] 931 | }, 932 | { 933 | "cell_type": "code", 934 | "execution_count": null, 935 | "metadata": { 936 | "colab": { 937 | "base_uri": "https://localhost:8080/", 938 | "height": 136 939 | }, 940 | "colab_type": "code", 941 | "id": "PBOczo_D-88-", 942 | "outputId": "2ed7885c-35d6-4a55-f784-0cd7d6208814" 943 | }, 944 | "outputs": [ 945 | { 946 | "data": { 947 | "text/plain": [ 948 | "(2, 20)" 949 | ] 950 | }, 951 | "execution_count": 47, 952 | "metadata": { 953 | "tags": [] 954 | }, 955 | "output_type": "execute_result" 956 | } 957 | ], 958 | "source": [ 959 | "# first hidden layer takes 2d to 20d\n", 960 | "classifier.get_variable_value('dnn/hiddenlayer_0/kernel').shape" 961 | ] 962 | }, 963 | { 964 | "cell_type": "code", 965 | "execution_count": null, 966 | "metadata": { 967 | "colab": { 968 | "base_uri": "https://localhost:8080/", 969 | "height": 296 970 | }, 971 | "colab_type": "code", 972 | "id": "w9UAhcDI_JOC", 973 | "outputId": "b2e16b53-d281-47c1-8b1d-9ecea73c2988" 974 | }, 975 | "outputs": [ 976 | { 977 | "data": { 978 | "text/plain": [ 979 | "(20, 20)" 980 | ] 981 | }, 982 | "execution_count": 39, 983 | "metadata": { 984 | "tags": [] 985 | }, 986 | "output_type": "execute_result" 987 | } 988 | ], 989 | "source": [ 990 | "# second hidden layer takes 20d to 20d\n", 991 | "classifier.get_variable_value('dnn/hiddenlayer_1/kernel').shape" 992 | ] 993 | }, 994 | { 995 | "cell_type": "code", 996 | "execution_count": null, 997 | "metadata": { 998 | "colab": { 999 | "base_uri": "https://localhost:8080/", 1000 | "height": 296 1001 | }, 1002 | "colab_type": "code", 1003 | "id": "yjy9pNi7_hhV", 1004 | "outputId": "d0a6fe06-4bd2-41c6-d3af-282e1f7e6794" 1005 | }, 1006 | "outputs": [ 1007 | { 1008 | "data": { 1009 | "text/plain": [ 1010 | "(20, 1)" 1011 | ] 1012 | }, 1013 | "execution_count": 40, 1014 | "metadata": { 1015 | "tags": [] 1016 | }, 1017 | "output_type": "execute_result" 1018 | } 1019 | ], 1020 | "source": [ 1021 | "# final layer takes 20d to
1d\n", 1022 | "classifier.get_variable_value('dnn/logits/kernel').shape" 1023 | ] 1024 | }, 1025 | { 1026 | "cell_type": "markdown", 1027 | "metadata": { 1028 | "colab_type": "text", 1029 | "id": "1tpo1HrzAh0W" 1030 | }, 1031 | "source": [ 1032 | "Robert Hatem:\n", 1033 | "\n", 1034 | "The shapes make sense, but I'm not sure what the extra `/t_0/Adagrad` means (most likely the Adagrad optimizer's accumulator slot for that variable, which has the same shape as the weights it tracks):\n", 1035 | "\n", 1036 | "```python\n", 1037 | "'dnn/hiddenlayer_0/kernel',\n", 1038 | "'dnn/hiddenlayer_0/kernel/t_0/Adagrad',\n", 1039 | "```" 1040 | ] 1041 | }, 1042 | { 1043 | "cell_type": "markdown", 1044 | "metadata": { 1045 | "colab_type": "text", 1046 | "id": "rkKAaRWDLS83" 1047 | }, 1048 | "source": [ 1049 | "## Task 5: Examine the Embedding\n", 1050 | "\n", 1051 | "Let's now take a look at the actual embedding space, and see where the terms end up in it. Do the following:\n", 1052 | "1. Run the following code to see the embedding we trained in **Task 3**. Do things end up where you'd expect?\n", 1053 | "\n", 1054 | "2. Re-train the model by rerunning the code in **Task 3**, and then run the embedding visualization below again. What stays the same? What changes?\n", 1055 | "\n", 1056 | "3. Finally, re-train the model again using only 10 steps (which will yield a terrible model). Run the embedding visualization below again. What do you see now, and why?"
1057 | ] 1058 | }, 1059 | { 1060 | "cell_type": "code", 1061 | "execution_count": null, 1062 | "metadata": { 1063 | "colab": { 1064 | "base_uri": "https://localhost:8080/", 1065 | "height": 976 1066 | }, 1067 | "colab_type": "code", 1068 | "id": "s4NNu7KqLS84", 1069 | "outputId": "01b9bb6b-2451-4cca-8c9e-7878fda329bd" 1070 | }, 1071 | "outputs": [ 1072 | { 1073 | "data": { 1074 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAA3YAAANSCAYAAAApkFytAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4zLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvnQurowAAIABJREFUeJzs3X1UVWX+///XBhVRSJy0ZrwpsKWI\ncOAABxXxBixRy1JTI8MSLcvxruhOnSYzPtZ041cbS3NqUppSxyWOOTg24+BN3o+AHRVNQu1MVq5C\nCxKF5Gb//jDPL8Z7RQ4bn4+1Wuvsva99Xe/rrNaqF9d19jZM0xQAAAAAwLq8PF0AAAAAAODqEOwA\nAAAAwOIIdgAAAABgcQQ7AAAAALA4gh0AAAAAWBzBDgAAAAAsjmAHAAAAABZHsAMAAAAAiyPYAQAA\nAIDFNfB0ARfSokULMzAw0NNlAAAAAIBH5ObmHjVNs+XF2tXpYBcYGKicnBxPlwEAAAAAHmEYxn8v\npR1bMQEAAADA4gh2AAAAAGBxBDsAAAAAsDiCHQAAAABYHMEOAAAAACyOYAcAAAAAFkewAwAAAACL\nI9gBAAAAgMUR7AAAAADA4gh2AAAAAGBxBDsAAAAAsDiCHQAAAABYXI0EO8MwFhiG8Z1hGHnnuR5v\nGEaxYRjOn/+ZVhPjAgAAAACkBjXUT7qktyT95QJtNpmmOaCGxgMAAAAA/KxGVuxM09wo6fua6AsA\nAAAAcHlq8zd2sYZh7DIM42PDMEJrcVwAAAAAqNdqaivmxeyUdKtpmiWGYdwp6SNJ7c/V0DCMRyU9\nKkm33HJLLZUHAAAAANZVKyt2pmn+aJpmyc+fV0tqaBhGi/O0fcc0TYdpmo6WLVvWRnkAAAAAYGm1\nEuwMw/i1YRjGz587/zzusdoYGwAAAADquxrZimkYxhJJ8ZJaGIbxlaQXJDWUJNM050saKum3hmFU\nSCqVdL9pmmZNjA0AAAAA17saCXamaQ6/yPW3dPp1CAAAAACAGlabT8UEAAAAAFwDBDsAAAAAsDiC\nHQAAAABYHMEOAAAAACyOYAcAAAAAFkewAwAAAACLI9gBAAAAgMUR7AAAAADA4gh2AAAAAGBxBDsA\nAAAAsDiCHQAAAABYHMEOAAAAACyOYAcAAAAAFkewAwAAAACLI9gBAAAAgMUR7AAAAADA4gh2AAAA\nAGBxBDsAAAAAsDiCHQAAAABYHMEOAAAAACyOYAcAAAAAFkewAwAAAACLI9gBAAAAgMUR7AAAAADA\n4gh2AAAAAGBxBDsAAAAAsDiCHQAAAABYHMEOAAAAACyOYAcAAAAAFkewAwAAAACLI9gBAAAAgMUR\n7AAAAADA4gh2AAAAAGBxBDsAAAAAsDiCHQAAAABYHMEOAAAAACyOYAcAAAAAFkewAwAAAACLI9gB\nAAAAgMUR7AAAAADA4gh2AAAAAGBxBDsAAAAAsDiCHQAAAABYHMEOAAAAACyOYAcAAAAAFkewAwAA\nAACLI9gBAAAAgMUR7AAAAADA4gh2AAAAAGBxBDsAAAA
AsDiCHQAAAABYHMEOAAAAACyOYAcAAAAA\nFkewAwAAAACLI9gBAAAAgMUR7AAAAADA4gh2AAAAAGBxBDsAAAAAsDiCHQAAAABYHMEOAAAAACyO\nYAcAAAAAFkewAwAAAACLI9gBAAAAgMUR7AAAAADA4gh2AAAAAGBxBDsAAAAAsDiCHQAAAABYHMEO\nAAAAACyOYAcAAAAAFkewAwAAAACLI9gBAACgRkyfPl0zZ870dBnAdYlgBwAAgGumoqLC0yUA14UG\nni4AAAAA1vXSSy/p/fff10033aS2bdsqOjpa8fHxstvt2rx5s4YPH64OHTpoxowZOnXqlG688UYt\nWrRIN998s6ZPn64vvvhChw4d0pdffqnZs2dr+/bt+vjjj9W6dWtlZmaqYcOGSktLU2ZmpkpLS9Wt\nWzf96U9/kmEYnp46UKewYgcAAIArkpubq7/+9a9yOp1avXq1srOz3ddOnTqlnJwcPfXUU+revbu2\nb9+uTz/9VPfff79ee+01d7uDBw9q3bp1+vvf/64RI0YoISFBe/bska+vr/7xj39IkiZMmKDs7Gzl\n5eWptLRUq1atqvW5AnUdK3YAAAC4Ips2bdLgwYPVpEkTSdI999zjvpaUlOT+/NVXXykpKUlHjhzR\nqVOnFBQU5L7Wv39/NWzYUDabTZWVlerXr58kyWazyeVySZLWr1+v1157TSdPntT333+v0NBQ3X33\n3bUwQ8A6WLEDAABAjWvatKn788SJEzVhwgTt2bNHf/rTn1RWVua+5uPjI0ny8vJSw4YN3Vssvby8\nVFFRobKyMo0bN04ZGRnas2ePxowZU+1+AKcR7AAAAHBFevbsqY8++kilpaU6fvy4MjMzz9muuLhY\nrVu3liS9//77lzXGmRDXokULlZSUKCMj4+qKBuopgh2uWnp6uiZMmCDp6h5z7HK5tHjx4posDQAA\nXENRUVFKSkpSRESE+vfvr5iYmHO2mz59uoYNG6bo6Gi1aNHissYICAjQmDFjFBYWpr59+553DOB6\nZ5im6ekazsvhcJg5OTmeLgMXkZ6erpycHL311luaPn26/Pz89PTTT192Pxs2bNDMmTP5QTQAAADw\nM8Mwck3TdFysHSt215EPP/xQnTt3lt1u12OPPab//ve/at++vY4ePaqqqir16NFDa9askST95S9/\nUXh4uCIiIvTggw9KkgoLCzVkyBDFxMQoJiZGW7ZsueB4Bw8eVL9+/RQdHa0ePXpo//79kqSUlBRN\nmjRJ3bp1U7t27dxbKqZMmaJNmzbJbrdr9uzZ1/CbAAAAAOoXnop5nfjss8+0dOlSbdmyRQ0bNtS4\nceP0ySefaPLkyfrtb3+rzp07q1OnTkpMTNTevXs1Y8YMbd26VS1atND3338vSXr88ceVmpqq7t27\n68svv1Tfvn312WefnXfMRx99VPPnz1f79u31n//8R+PGjdO6deskSUeOHNHmzZu1f/9+3XPPPRo6\ndKheeeUVVuwAAACAK0Cwu06sXbtWubm57n3ppaWluummmzR9+nQtW7ZM8+fPl9PplCStW7dOw4YN\nc++B/9WvfiVJysrK0r59+9x9/vjjjyopKTnneCUlJdq6dauGDRvmPvfTTz+5Pw8aNEheXl7q1KmT\nvv3225qdLAAAAHCdIdhdJ0zT1MiRI/WHP/yh2vmTJ0/qq6++knQ6jPn7+5+3j6qqKm3fvl2NGze+\n6HhVVVUKCAhwh8X/debRxmdqAwAAAHDl+I3ddeL2229XRkaGvvvuO0nS999/r//+97+aPHmykpOT\nlZaWpjFjxkiSevfurWXLlunYsWPutpKUmJioN998093n+UKbJN1www0KCgrSsmXLJJ0Ob7t27bpg\njf7+/jp+/PiVTxIAAAC4ThHsrhOdOnXSjBkzlJiYqPDwcPXp00cul0vZ2dnucNeoUSMtXLhQoaGh\neu6559SrVy9FRET
oySeflCTNmTNHOTk5Cg8PV6dOnTR//vwLjrlo0SK99957ioiIUGhoqFauXHnB\n9uHh4fL29lZERAQPTwEAAAAuA687AAAAAIA6itcdAAAAAMB1gmAHAAAAABZHsAMAAAAAiyPYAQAA\nAIDFEewAAAAAwOIIdgAAAABgcQQ7AAAAALA4gh0AAAAAWBzBDgAAAAAsjmAHAAAAABZHsAMAAAAA\niyPYAQBwEd26dfN0CQAAXBDBDgCAi9i6daunSwAA4IIIdgAAyxo0aJCio6MVGhqqd955R5Lk5+en\nZ555RqGhobrjjju0Y8cOxcfHq127dvr73/8uSXK5XOrRo4eioqIUFRXlDm7Tpk2T3W6X3W5X69at\nNWrUKHefkrRhwwbFx8dr6NCh6tixo5KTk2WapiRp9erV6tixo6KjozVp0iQNGDCgtr8OAMB1zDjz\nH6S6yOFwmDk5OZ4uAwBQR33//ff61a9+pdLSUsXExOiTTz5RixYttHr1avXv31+DBw/WiRMn9I9/\n/EP79u3TyJEj5XQ6dfLkSXl5ealx48YqKCjQ8OHD9cv/3hQVFalHjx5KT09XdHS0/Pz8VFJSog0b\nNmjgwIHau3evWrVqpbi4OL3++utyOBxq3769Nm7cqKCgIA0fPlzHjx/XqlWrPPjtAADqA8Mwck3T\ndFysXYPaKAYAgGthzpw5WrFihSTp8OHDKigoUKNGjdSvXz9Jks1mk4+Pjxo2bCibzSaXyyVJKi8v\n14QJE+R0OuXt7a3PP//c3adpmhoxYoSefPJJRUdHnzVm586d1aZNG0mS3W6Xy+WSn5+f2rVrp6Cg\nIEnS8OHD3SuIAADUBoIdAMCSNmzYoKysLG3btk1NmjRRfHy8ysrK1LBhQxmGIUny8vKSj4+P+3NF\nRYUkafbs2br55pu1a9cuVVVVqXHjxu5+p0+frjZt2ri3Yf6vM/1Jkre3t7tPAAA8iWAHALCk4uJi\nNW/eXE2aNNH+/fu1ffv2y7q3TZs28vLy0vvvv6/KykpJUmZmprKysrR+/frLqiU4OFiHDh2Sy+VS\nYGCgli5deln3AwBwtXh4CgDAkvr166eKigqFhIRoypQp6tq16yXfO27cOL3//vuKiIjQ/v371bRp\nU0nSrFmz9PXXX6tz586y2+2aNm3aJfXn6+urefPmqV+/foqOjpa/v7+aNWt2RfMCAOBK8PAUAABq\nQElJifz8/GSapsaPH6/27dsrNTXV02UBACzuUh+ewoodAAA14N1335XdbldoaKiKi4v12GOPebok\nAMB1hBU7AAAAAKijWLEDAAAAgOsEwQ4AAAAALI5gBwAAAAAWR7ADAAAAAIsj2AEAAACAxRHsAAB1\n2pw5cxQSEqLk5ORzXs/JydGkSZMkSenp6ZowYUJtlgcAQJ3QwNMFAABwIfPmzVNWVpbatGlzzusO\nh0MOx0WfAg0AQL3Gih0AoM4aO3asDh06pP79++vVV19VbGysIiMj1a1bN+Xn50uSNmzYoAEDBpx1\n77JlyxQWFqaIiAj17NmztksHAKBWsWIHAKiz5s+fr3/+859av369GjVqpKeeekoNGjRQVlaWfve7\n32n58uXnvTctLU3/+te/1Lp1axUVFdVi1QAA1D6CHQDAEoqLizVy5EgVFBTIMAyVl5dfsH1cXJxS\nUlJ033336d57762lKgEA8Ay2YgIALOH5559XQkKC8vLylJmZqbKysgu2nz9/vmbMmKHDhw8rOjpa\nx44dq6VKAQCofTUS7AzDWGAYxneGYeSd57phGMYcwzAOGIax2zCMqJoYFwBw/SguLlbr1q0lnX76\n5cUcPHhQXbp0UVpamlq2bKnDhw9f4woBAPCcmlqxS5fU7wLX+0tq//M/j0p6u4bGBQBcJ5599llN\nnTpVkZGRqqiouGj7Z555RjabTWFhYerWrZsiIiJqoUoAADzDME2zZjoyjEBJq0zTD
DvHtT9J2mCa\n5pKfj/MlxZumeeRCfTocDjMnJ6dG6gMAAAAAqzEMI9c0zYu+16e2fmPXWtIv98B89fM5AAAAAMBV\nqnMPTzEM41HDMHIMw8gpLCz0dDkAAAAAUOfVVrD7WlLbXxy3+fncWUzTfMc0TYdpmo6WLVvWSnEA\nAAAAYGW1Fez+Lumhn5+O2VVS8cV+XwcAAAAAuDQ18oJywzCWSIqX1MIwjK8kvSCpoSSZpjlf0mpJ\nd0o6IOmkpFE1MS4AAAAAoIaCnWmawy9y3ZQ0vibGAgAAAABUV+cengIAAAAAuDwEOwAAAACwOIId\nAAAAAFgcwQ4AAAAALI5gBwAAAAAWR7ADAAAAAIsj2AEAAACAxRHsAAAAAMDiCHYAAAAAYHEEOwAA\nAACwOIIdAAAAAFgcwQ4AAAAALI5gBwAAAAAWR7ADAAAAAIsj2AEAAACAxRHsAAAAAMDiCHYAAAAA\nYHEEOwAAAACwOIIdAAAAAFgcwQ4AAAAALI5gBwAAAAAWR7ADAAAAAIsj2AEAAACAxRHsAAAAAMDi\nCHYAAAAAYHEEOwAAAACwOIIdAAAAAFgcwQ4AAAAALI5gBwAAAAAWR7ADAAAAAIsj2AEAAACAxRHs\nAKCOmTNnjkJCQpScnHxV/UybNk1ZWVmSpPj4eOXk5NREeQAAoA5q4OkCAADVzZs3T1lZWWrTps1V\n9ZOWllZDFQEAgLqOFTsAqEPGjh2rQ4cOqX///nr11VcVGxuryMhIdevWTfn5+ZKk9PR0DRo0SH36\n9FFgYKDeeustzZo1S5GRkeratau+//57SVJKSooyMjKq9b9gwQI98cQT7uN3331XqamptTdBAABw\nTRDsAKAOmT9/vlq1aqX169frt7/9rTZt2qRPP/1UaWlp+t3vfudul5eXp7/97W/Kzs7Wc889pyZN\nmujTTz9VbGys/vKXv5y3//vuu0+ZmZkqLy+XJC1cuFCjR4++5vMCAADXFlsxAaCOKi4u1siRI1VQ\nUCDDMNxhTJISEhLk7+8vf39/NWvWTHfffbckyWazaffu3eft08/PT71799aqVasUEhKi8vJy2Wy2\naz4XAABwbbFiBwB11PPPP6+EhATl5eUpMzNTZWVl7ms+Pj7uz15eXu5jLy8vVVRUXLDfRx55ROnp\n6Vq4cKFGjRp1bYoHAAC1ihU7AKijiouL1bp1a0mnf1dXU7p06aLDhw9r586dF1zdAwAA1sGKHQDU\nUc8++6ymTp2qyMjIi67CXa777rtPcXFxat68eY32azVFRUWaN2/eVffzyCOPaN++fZJOb3c9l3M9\nzAYAgJpimKbp6RrOy+FwmLx3CQAuT3p6uhITE9WqVavzthkwYIBSU1N1++2312JldY/L5dKAAQOU\nl5d3Se1N05RpmvLy+v//LlpZWSlvb2/3sZ+fn0pKSs66NyUlRQMGDNDQoUOvvnAAwHXDMIxc0zQd\nF2vHih0A1DPp6en65ptvznmtqKhIHTp0kK+v73Uf6iRpypQpOnjwoOx2u5555hm9/vrriomJUXh4\nuF544QVJp8NfcHCwHnroIYWFhenw4cPy8/PTU089pYiICG3btu2sF8CnpqYqNDRUt99+uwoLC88a\nNzc3V7169VJ0dLT69u2rI0eO1NqcAQD1E8EOAOo4l8ulkJAQjRkzRqGhoUpMTFRpaamcTqe6du2q\n8PBwDR48WD/88IMyMjKUk5Oj5ORk2e12lZaWVusrICBAn3/+uZYtW+ah2dQtr7zyim677TY5nU71\n6dNHBQUF2rFjh5xOp3Jzc7Vx40ZJUkFBgcaNG6e9e/fq1ltv1YkTJ9SlSxft2rVL3bt3r9bniRMn\n5HA4tHfvXvXq1Usvvvhitevl5eWaOHGiMjIylJubq9GjR+u5556rtTkDAOongh0A1BEul0thYWFn\nnZ81a5Y+//xzjR8/Xi1btpRpmlq+fLkeeughv
frqq9q9e7dsNptefPFFDR06VA6HQ4sWLZLT6ZSv\nr6+mTZumrKwsD8zIWtasWaM1a9YoMjJSUVFR2r9/vwoKCiRJt956q7p27epu6+3trSFDhpyzHy8v\nLyUlJUmSRowYoc2bN1e7np+fr7y8PPXp00d2u10zZszQV199dY1mBQC4XvBUTACo45588kl9/PHH\nstvtkqSOHTvq4MGDKioqUq9evSRJI0eO1LBhw866t7KyUmlpabVar1WZpqmpU6fqscceq3be5XKp\nadOm1c41bty42u/qLsQwjLPGCQ0N1bZt266uYAAAfoEVOwCoQyorK8/acvn000/rp59+crfx8vJS\nUVGRfvjhBzkcDoWGhuqNN95wX9++fbvmzJmjqKgoLVu2rNrTGNeuXavIyEjZbDaNHj3a3W9gYKCO\nHj0qScrJyVF8fLwk6ZNPPpHdbpfdbldkZKSOHz9eS99E7fD393fPqW/fvlqwYIH7wSdff/21vvvu\nu8vus6qqyv19L168+KytmsHBwSosLHQHu/Lycu3du/dqpgEAAMEOAOqSgoICjR8/Xnv37lVAQICW\nL19+znbNmjXTrbfeqtmzZ2v37t3KzMxUaGiopNPBz9fXVzt37tT999/vvqesrEwpKSlaunSp9uzZ\no4qKCr399tsXrGfmzJmaO3eunE6nNm3aJF9f35qbbB1w4403Ki4uTmFhYfr3v/+tBx54QLGxsbLZ\nbBo6dOgVBdmmTZtqx44dCgsL07p16zRt2rRq1xs1aqSMjAxNnjxZERERstvt2rp1a01NCQBwnWIr\nJgDUIUFBQe4tl9HR0XK5XOdtO3DgQPXr109VVVWqqqpyb8ts2rSpsrKyZLfbq233y8/PV1BQkDp0\n6CDp9PbNuXPn6oknnjjvGHFxcXryySeVnJyse++9V23atKmBWdYtixcvrnb8+OOPn9Xmf1+H8L+v\nM9iwYcN5r53xy5fM2+1294NZAACoCazYAUAd4uPj4/7s7e2tiooK+fn5adasWe7zDz74oEaOHKml\nS5fqq6++UmlpqYYPH64GDU7/ra5p06b6z3/+4354yqVo0KCBqqqqJJ1e2TtjypQp+vOf/6zS0lLF\nxcVp//79NTFNAABQwwh2AGBBP/74o5o2bapmzZrp22+/1ccff3zRe4KDg+VyuXTgwAFJ0gcffOBe\n5QsMDFRubq4kVdv+efDgQdlsNk2ePFkxMTEEOwAA6iiCHQBYUEREhCIjI9WxY0c98MADiouLu+g9\njRs31sKFCzVs2DDZbDZ5eXlp7NixkqQXXnhBjz/+uBwOR7WnPb7xxhsKCwtTeHi4GjZsqP79+1+z\nOQEAgCtnmKbp6RrOy+FwmDk5OZ4uAwAAAAA8wjCMXNM0HRdrx4odAAAAAFgcwQ4AAAAALI5gBwAA\nAAAWR7ADAAAAAIsj2AEAAACAxRHsAAAAAMDiCHYAAAAAYHEEOwAAAACwOIIdAAAAAFgcwQ4AgAuY\nPn26Zs6ced7rhYWF6tKliyIjI7Vp0ybdeeedKioqumCf06ZNU1ZWliTpjTfe0MmTJy9aR3x8vHJy\nci6veADAdaOBpwsAAMDK1q5dK5vNpj//+c+SpB49elz0nrS0NPfnN954QyNGjFCTJk2uWY0AgPqP\nFTsAAP7HSy+9pA4dOqh79+7Kz8+XJB08eFD9+vVTdHS0evToof3798vpdOrZZ5/VypUrZbfbVVpa\nqsDAQB09elQul0shISEaM2aMQkNDlZiYqNLSUklSSkqKMjIyNGfOHH3zzTdKSEhQQkKCJGnNmjWK\njY1VVFSUhg0bppKSkmq1LViwQE888YT7+N1331VqamotfTMAgLqKYAcAwC/k5ubqr3/9q5xOp1av\nXq3s7GxJ0qOPPqo333xTubm5mjlzpsaNGye73a60tDQlJSXJ6XTK19e3Wl8FBQUaP3689u7dq4CA\nAC1fvrza9
UmTJqlVq1Zav3691q9fr6NHj2rGjBnKysrSzp075XA4NGvWrGr33HfffcrMzFR5ebkk\naeHChRo9evQ1/EYAAFbAVkwAAH5h06ZNGjx4sHtr5D333KOysjJt3bpVw4YNc7f76aefLtpXUFCQ\n7Ha7JCk6Oloul+uC7bdv3659+/YpLi5OknTq1CnFxsZWa+Pn56fevXtr1apVCgkJUXl5uWw22+VM\nEQBQDxHsAAC4iKqqKgUEBMjpdF7WfT4+Pu7P3t7e7q2Y52Oapvr06aMlS5ZcsN0jjzyil19+WR07\ndtSoUaMuqyYAQP3EVkwAAH6hZ8+e+uijj1RaWqrjx48rMzNTTZo0UVBQkJYtWybpdADbtWtXjYzn\n7++v48ePS5K6du2qLVu26MCBA5KkEydO6PPPPz/rni5duujw4cNavHixhg8fXiN1AACsjWAHAMAv\nREVFKSkpSREREerfv79iYmIkSYsWLdJ7772niIgIhYaGauXKlTUy3qOPPqp+/fopISFBLVu2VHp6\nuoYPH67w8HDFxsZq//7957zvvvvuU1xcnJo3b14jdQAArM0wTdPTNZyXw+EweWcPAABnGzBggFJT\nU3X77bd7uhQAwDVkGEauaZqOi7VjxQ4AAAspKipShw4d5OvrS6gDALjx8BQAACwkICDgnL+7AwBc\n31ixAwAAAACLI9gBAAAAgMUR7AAAAADA4gh2AAAAAGBxBDsAAAAAsDiCHQAAAABYHMEOAAAAACyO\nYAcAAAAAFkewAwAAAACLI9gBAAAAgMUR7AAAAADA4gh2AAAAAGBxBDsAAAAAsDiCHQAAAABYHMEO\nAAAAACyOYAcAACCpsrLS0yUAwBUj2AEAAMtxuVzq2LGjkpOTFRISoqFDh+rkyZNau3atIiMjZbPZ\nNHr0aP3000+SdN7zgYGBmjx5sqKiorRs2TJPTgkArgrBDgAAWFJ+fr7GjRunzz77TDfccINmzZql\nlJQULV26VHv27FFFRYXefvttlZWVnfP8GTfeeKN27typ+++/34OzAYCrQ7ADAACW1LZtW8XFxUmS\nRowYobVr1yooKEgdOnSQJI0cOVIbN25Ufn7+Oc+fkZSUVPvFA0ANI9gBAABLMgyj2nFAQMAV9dO0\nadOaKAcAPIpgBwAALOnLL7/Utm3bJEmLFy+Ww+GQy+XSgQMHJEkffPCBevXqpeDg4HOeB4D6hGAH\nAAAsKTg4WHPnzlVISIh++OEHpaamauHChRo2bJhsNpu8vLw0duxYNW7c+JznAaA+MUzT9HQN5+Vw\nOMycnBxPlwEAAOoYl8ulAQMGKC8vz9OlAMA1ZRhGrmmajou1Y8UOAAAAACyOYAcAACwnMDCQ1ToA\n+AWCHQAAAABYHMEOAAAAACyOYAcAAAAAFkewAwCgHkpJSVFGRoanywAA1BKCHQAA9UxlZaWnSwAA\n1DKCHQAAHuRyudSxY0clJycrJCREQ4cO1cmTJ7V27VpFRkbKZrNp9OjR+umnnyTpvOcDAwM1efJk\nRUVFadmyZe7+161bp0GDBrmP//3vf2vw4MG1O0kAwDVHsAMAwMPy8/M1btw4ffbZZ7rhhhs0a9Ys\npaSkaOnSpdqzZ48qKir09ttvq6ys7Jznz7jxxhu1c+dO3X///e5zCQkJ2r9/vwoLCyVJCxcu1OjR\no2t9jgCAa4tgBwCAh7Vt21ZxcXGSpBEjRmjt2rUKCgpShw4dJEkjR47Uxo0blZ+ff87zZyQlJZ3V\nt2EYevDBB/Xhhx+qqKhI27Y8osvJAAAgAElEQVRtU//+/WthVgCA2tTA0wUAAHC9Mwyj2nFAQICO\nHTt22f00bdr0nOdHjRqlu+++W40bN9awYcPUoAH/+QeA+oYVOwAAPOzLL7/Utm3bJEmLFy+Ww+GQ\ny+XSgQMHJEkffPCBevXqpeDg4HOev5hWrVqpVatWmjFjhkaNGnXtJgIA8Bi
CHQAAHhYcHKy5c+cq\nJCREP/zwg1JTU7Vw4UINGzZMNptNXl5eGjt2rBo3bnzO85ciOTlZbdu2VUhIyDWeDQDAEwzTND1d\nw3k5HA4zJyfH02XAQqZPny4/Pz/9+OOP6tmzp+644w6P1XLnnXdq8eLFCggIOG+b9PR0JSYmqlWr\nVte0ltoaB8Dlc7lcGjBggPLy8q7pOBMmTFBkZKQefvjhazoOAKBmGYaRa5qm42Lt2GSPeiktLc3T\nJWj16tUXbZOenq6wsLBaCXa1MQ6Auik6OlpNmzbV//t//8/TpQAArhG2YsLyXnrpJXXo0EHdu3dX\nfn6+JCklJUUZGRmSpClTpqhTp04KDw/X008/LUnKzMxUly5dFBkZqTvuuEPffvutpNMrfg8++KBi\nY2PVvn17vfvuu5KkDRs2qGfPnrrrrrsUHByssWPHqqqqSpK0ZMkS2Ww2hYWFafLkye66AgMDdfTo\nUblcLoWEhGjMmDEKDQ1VYmKiSktLlZGRoZycHCUnJ8tut6u0tFSBgYGaOnWq7Ha7HA6Hdu7cqb59\n++q2227T/Pnz3X2//vrriomJUXh4uF544QVJuqxxANQdgYGB13y1Ljc3Vxs3bpSPj881HQcA4DkE\nO1habm6u/vrXv8rpdGr16tXKzs6udv3YsWNasWKF9u7dq927d+v3v/+9JKl79+7avn27Pv30U91/\n//167bXX3Pfs3r1b69at07Zt25SWlqZvvvlGkrRjxw69+eab2rdvnw4ePKi//e1v+uabbzR58mSt\nW7dOTqdT2dnZ+uijj86qs6CgQOPHj9fevXsVEBCg5cuXa+jQoXI4HFq0aJGcTqd8fX0lSbfccouc\nTqd69OjhDqjbt293B7g1a9aooKBAO3bskNPpdP8P2+WOAwAAgPqDrZiwtE2bNmnw4MFq0qSJJOme\ne+6pdr1Zs2Zq3LixHn74YQ0YMEADBgyQJH311VdKSkrSkSNHdOrUKQUFBbnvGThwoHx9feXr66uE\nhATt2LFDAQEB6ty5s9q1aydJGj58uDZv3qyGDRsqPj5eLVu2lHT64QQbN27UoEGDqtURFBQku90u\n6fSWKJfLdd45nZmDzWZTSUmJ/P395e/vLx8fHxUVFWnNmjVas2aNIiMjJUklJSUqKCjQLbfcclnj\nAAAAoP5gxQ71WoMGDbRjxw4NHTpUq1atUr9+/SRJEydO1IQJE7Rnzx796U9/UllZmfue/32f1Jnj\n852/FL/c/uTt7a2KioqLtvXy8qp2n5eXlyoqKmSapqZOnSqn0ymn06kDBw64H4ZwOeMAAACg/iDY\nwdJ69uypjz76SKWlpTp+/LgyMzOrXS8pKVFxcbHuvPNOzZ49W7t27ZIkFRcXq3Xr1pKk999/v9o9\nK1euVFlZmY4dO6YNGzYoJiZG0umtmF988YWqqqq0dOlSde/eXZ07d9Ynn3yio0ePqrKyUkuWLLmk\nd0qd4e/vr+PHj1/WnPv27asFCxaopKREkvT111/ru+++q/FxAAAAYB1sxYSlRUVFKSkpSREREbrp\nppvcIeyM48ePa+DAgSorK5Npmpo1a5ak0w9JGTZsmJo3b67evXvriy++cN8THh6uhIQEHT16VM8/\n/7xatWqlzz//XDExMZowYYIOHDighIQEDR48WF5eXnrllVeUkJAg0zR11113aeDAgZdcf0pKisaO\nHStfX1/3y4kvJjExUZ999pliY2MlSX5+fvrwww/l7e19yePwOzsAAID6hffYAb9w5j14Z56eecaG\nDRs0c+ZMrVq1ykOVAQAA4Hp0qe+xYysmAAAAAFgcK3YAAAAAUEexYgcAAAAA14kaCXaGYfQzDCPf\nMIwDhmFMOcf1FMMwCg3DcP78zyM1MS4AAAAAoAaeimkYhrekuZL6SPpKUrZhGH83TXPf/zRdaprm\nhKsdDwAAAABQXU2s2HWWdMA0zUOmaZ6
S9FdJl/68dwAAAADAVamJYNda0uFfHH/187n/NcQwjN2G\nYWQYhtG2BsYFAAAAAKj2Hp6SKSnQNM1wSf+W9P75GhqG8ahhGDmGYeQUFhbWUnkAAAAAYF01Eey+\nlvTLFbg2P59zM03zmGmaP/18+GdJ0efrzDTNd0zTdJim6WjZsmUNlAcAAAAA9VtNBLtsSe0Nwwgy\nDKORpPsl/f2XDQzD+M0vDu+R9FkNjAsAAAAAUA08FdM0zQrDMCZI+pckb0kLTNPcaxhGmqQc0zT/\nLmmSYRj3SKqQ9L2klKsdFwAAAABwmmGapqdrOC+Hw2Hm5OR4ugwAAAAA8AjDMHJN03RcrF1tPTwF\nAAAAAHCNEOwAAAAAwOIIdgBqncvlUlhY2Fnnp02bpqysrAveO336dM2cOfOc1/z8/GqkPgAAAKsh\n2AGoM9LS0nTHHXd4ugwA9dD5/qAEAPUFwQ6AR1RWVmrMmDEKDQ1VYmKiSktLlZKSooyMDEnS6tWr\n1bFjR0VHR2vSpEkaMGCA+959+/YpPj5e7dq105w5c87q+6GHHtJHH33kPk5OTtbKlSuv/aQAAAA8\nhGAHwCMKCgo0fvx47d27VwEBAVq+fLn7WllZmR577DF9/PHHys3NVWFhYbV79+/fr3/961/asWOH\nXnzxRZWXl1e7/vDDDys9PV2SVFxcrK1bt+quu+665nMCUHP+7//+T8HBwerevbuGDx+umTNnyul0\nqmvXrgoPD9fgwYP1ww8/SNJ5z+fm5ioiIkIRERGaO3euJ6cDANccwQ6ARwQFBclut0uSoqOj5XK5\n3Nf279+vdu3aKSgoSJI0fPjwavfedddd8vHxUYsWLXTTTTfp22+/rXa9V69eKigoUGFhoZYsWaIh\nQ4aoQYOrfm0ngFqSnZ2t5cuXa9euXfr444915tVHDz30kF599VXt3r1bNptNL7744gXPjxo1Sm++\n+aZ27drlsbkAQG0h2AHwCB8fH/dnb29vVVRU1Oi9Dz30kD788EMtXLhQo0ePvrpiAdSqLVu2aODA\ngWrcuLH8/f11991368SJEyoqKlKvXr0kSSNHjtTGjRtVXFx8zvNFRUUqKipSz549JUkPPvigx+YD\nALWBP2EDqHOCg4N16NAhuVwuBQYGaunSpZfdR0pKijp37qxf//rX6tSp0zWoEgAAoO5gxQ5AnePr\n66t58+apX79+io6Olr+/v5o1a3ZZfdx8880KCQnRqFGjrlGVAK6VuLg4ZWZmqqysTCUlJVq1apWa\nNm2q5s2ba9OmTZKkDz74QL169VKzZs3OeT4gIEABAQHavHmzJGnRokUemw8A1AbDNE1P13BeDofD\nPLOvHsD1paSkRH5+fjJNU+PHj1f79u2Vmpp6yfefPHlSNptNO3fuvOxQCMDzpk+frsWLF+vmm2/W\nTTfdpH79+ikmJkZjx47VyZMn1a5dOy1cuFDNmzeX0+k85/nc3FyNHj1ahmEoMTFRq1evVl5enqen\nBgCXxTCMXNM0HRdtR7ADUBfNnj1b77//vk6dOqXIyEi9++67atKkySXdm5WVpYcfflipqal64okn\nrnGlAK6FM3/cOXnypHr27Kl33nlHUVFRni4LAGodwQ4AAFjWAw88oH379qmsrEwjR47U1KlTPV0S\nAHjEpQY7Hp4CAADqnMWLF3u6BACwFB6eAgAAAAAWR7ADAAAAAIsj2AEAAACAxRHsAAAAAMDiCHYA\nAAAAYHEEOwAAAACwOIIdAAAAAFgcwQ4AAAAALI5gBwAAAAAWR7ADAAAAAIsj2AEAAACAxRHsAOAi\nioqKNG/evMu6JyUlRRkZGdeoIgAAgOoIdgDqvcLCQnXp0kWRkZHatGnTZd3rdDq1YsWKyw52AAAA\ntYlgB6Beq6io0Nq1a2Wz2fTpp5+qR48el3W/0+nUyy+/rIMHD8put+uZZ57RM888o7CwMNlsNi1d\nulS
SZJqmJkyYoODgYN1xxx367rvv3H2kpaUpJiZGYWFhevTRR2Wapg4ePKioqCh3m4KCgmrHAAAA\nl4NgB6DOc7lc6tixo5KTkxUSEqKhQ4fq5MmTys3NVa9evRQdHa2+ffvqyJEjkqT4+Hg98cQTcjgc\n+uMf/6hnn31WK1eulN1uV2lpqdasWaPY2FhFRUVp2LBhKikpkSRlZ2erW7duioiIUOfOnVVcXKxp\n06bp2LFjkqSpU6eqa9eucjqd2rVrl7KysvTMM8/oyJEjWrFihfLz87Vv3z795S9/0datW931T5gw\nQdnZ2crLy1NpaalWrVql2267Tc2aNZPT6ZQkLVy4UKNGjarlbxYAANQXBDsAlpCfn69x48bps88+\n0w033KC5c+dq4sSJysjIUG5urkaPHq3nnnvO3f7UqVPKycnRU089pbS0NCUlJcnpdOrEiROaMWOG\nsrKytHPnTjkcDs2aNUunTp1SUlKS/vjHP7pDW9OmTZWWlqYBAwbotttuU1JSkjZv3qzhw4fL29tb\nN998s3r16qXs7Gxt3LjRfb5Vq1bq3bu3u5b169erS5custlsWrdunfbu3StJeuSRR7Rw4UJVVlZq\n6dKleuCBB2r9ewUAAPVDA08XAACXom3btoqLi5MkjRgxQi+//LLy8vLUp08fSVJlZaV+85vfuNsn\nJSWds5/t27dr37597r5OnTql2NhY5efn6ze/+Y1iYmIkSTfccEON1F1WVqZx48YpJydHbdu21fTp\n01VWViZJGjJkiF588UX17t1b0dHRuvHGG2tkTAAAcP1hxQ6AJRiGUe3Y399foaGhcjqdcjqd2rNn\nj9asWeO+3rRp03P2Y5qm+vTp475v3759eu+99y44dqNGjXT8+HFJUo8ePbR06VJVVlaqsLBQGzdu\nVOfOndWzZ0/3+SNHjmj9+vWS5A5xLVq0UElJSbUnZTZu3Fh9+/bVb3/7W7ZhAgCAq0KwA2AJX375\npbZt2yZJWrx4sbp27arCwkL3ufLycvcWxwvp2rWrtmzZogMHDkiSTpw4oc8//1zBwcE6cuSIsrOz\nJUnHjx9XRUWF/P39VV5erri4OIWFhWnbtm0KDw9XRESEevfurddee02//vWvNXjwYLVv316dOnXS\nQw89pNjYWElSQECAxowZo7CwMPXt29e9InhGcnKyvLy8lJiYWGPfFQAAuP6wFROAJQQHB2vu3Lka\nPXq0OnXqpIkTJ6pv376aNGmSiouLVVFRoSeeeEKhoaEX7Kdly5ZKT0/X8OHD9dNPP0mSZsyYoQ4d\nOmjp0qWaOHGiSktL5evrq6ysLCUkJOiVV15ReXm5nn/+efcWz9dff71av4Zh6K233jrnmDNmzNCM\nGTPOeW3z5s0aNWqUvL29L/crAQAAcDNM0/R0DeflcDjMnJwcT5cBwMNcLpcGDBigvLw8T5dSowYP\nHqyDBw9q3bp1atGihafLAeqdOXPm6O2331ZUVJQWLVp0Wfe+/PLL+t3vfndF46anpysxMVGtWrWS\ndPpBSU8++aQ6dep0Rf0BuL4ZhpFrmqbjou0IdgDquvoa7ABcWx07dlRWVpbatGlz2ff6+fm5X4Vy\nueLj4zVz5kw5HBf9/zAAuKhLDXb8xg5AnRcYGEioA3BZxo4dq0OHDql///569dVXFRsbq8jISHXr\n1k35+fmSTq+s3XvvverXr5/at2+vZ599VpI0ZcoUlZaWym63Kzk5WZI0aNAgRUdHKzQ0VO+8846k\n00/jTUlJUVhYmGw2m2bPnq2MjAzl5OQoOTnZ/e7M+Ph4nflD9T//+U9FRUUpIiJCt99+uwe+GQD1\nFSt2AACgXgoMDFROTo4aNWqkJk2aqEGDBsrKytLbb7+t5cuXKz09XWlpafr000/l4+Oj4OBgbd68\nWW3btj1rxe7777/Xr371K5WWliomJkaffPKJXC6XpkyZon//+9+Sp
KKiIgUEBJy1Ynfm+NZbb1VU\nVJQ2btyooKAgd58AcCGXumLHw1MAAEC9VlxcrJEjR6qgoECGYai8vNx97fbbb1ezZs0kSZ06ddJ/\n//tftW3b9qw+5syZoxUrVkiSDh8+rIKCAgUHB+vQoUOaOHGi7rrrros+3Xb79u3q2bOngoKCJIlQ\nB6BGsRUTAADUa88//7wSEhKUl5enzMxM9/slJcnHx8f92dvbWxUVFWfdv2HDBmVlZWnbtm3atWuX\nIiMjVVZWpubNm2vXrl2Kj4/X/Pnz9cgjj9TKfADgXAh2AACgXisuLlbr1q0lnf5d3aVo2LChe2Wv\nuLhYzZs3V5MmTbR//35t375dknT06FFVVVVpyJAhmjFjhnbu3ClJ8vf31/Hjx8/qs2vXrtq4caO+\n+OILSae3dwJATWErJgAAqNeeffZZjRw5UjNmzNBdd911Sfc8+uijCg8PV1RUlBYsWKD58+crJCRE\nwcHB6tq1qyTp66+/1qhRo1RVVSVJ+sMf/iBJSklJ0dixY+Xr66tt27a5+2zZsqXeeecd3Xvvvaqq\nqtJNN93k/n0eAFwtHp4CAAAAAHUUrzsAAAAAgOsEwQ4AAAAALI5gBwAAAAAWR7ADAAAAAIsj2AEA\nAACAxRHsAABAvTNnzhyFhISoefPmeuWVVyRJ06dP18yZMz1cGQBcG7zHDgAA1Dvz5s1TVlaW2rRp\n4+lSAKBWsGIHAADqlbFjx+rQoUPq37+/Zs+erQkTJpzVJj4+XqmpqXI4HAoJCVF2drbuvfdetW/f\nXr///e89UDUAXB2CHYDL5nK5FBYWdlV9bNiwQVu3bq2hioBzO9+/q9OmTVNWVpYHKkJtmD9/vlq1\naqX169erefPm523XqFEj5eTkaOzYsRo4cKDmzp2rvLw8paen69ixY7VYMQBcPbZiAvCIDRs2yM/P\nT926dfN0KbgOpaWleboE1AH33HOPJMlmsyk0NFS/+c1vJEnt2rXT4cOHdeONN3qyPAC4LKzYAbgi\nFRUVSk5OVkhIiIYOHaqTJ08qNzdXvXr1UnR0tPr27asjR45IOv0Qg06dOik8PFz333+/XC6X5s+f\nr9mzZ8tut2vTpk0eng3qs8rKSo0ZM0ahoaFKTExUaWmpUlJSlJGRIUkKDAzU1KlTZbfb5XA4tHPn\nTvXt21e33Xab5s+f7+HqcS35+PhIkry8vNyfzxxXVFR4qiwAuCKs2AG4Ivn5+XrvvfcUFxen0aNH\na+7cuVqxYoVWrlypli1baunSpXruuee0YMECvfLKK/riiy/k4+OjoqIiBQQEaOzYsfLz89PTTz/t\n6amgnisoKNCSJUv07rvv6r777tPy5cvPanPLLbfI6XQqNTVVKSkp2rJli8rKyhQWFqaxY8d6oGoA\nAC4PwQ7AFWnbtq3i4uIkSSNGjNDLL7+svLw89enTR9LpVZIz25rCw8OVnJysQYMGadCgQR6rGden\noKAg2e12SVJ0dLRcLtdZbX65Ja+kpET+/v7y9/ev9scIAADqMoIdgCtiGEa1Y39/f4WGhmrbtm1n\ntf3HP/6hjRs3KjMzUy+99JL27NlTW2UC1bbYeXt7q7S09Lxt2JJXf5wJ8CkpKUpJSZF0+j12Z2zY\nsMH9OT4+XvHx8ee8BgBWwW/sAFyRL7/80h3iFi9erK5du6qwsNB9rry8XHv37lVVVZUOHz6shIQE\nvfrqqyouLnaviBw/ftyTUwAAAKg3CHYArkhwcLDmzp2rkJAQ/fDDD5o4caIyMjI0efJkRUREyG63\na+vWraqsrNSIESNks9kUGRmpSZMmKSAgQHfffbdWrFjBw1MAAABqgGGapqdrOC+Hw2Hm5OR4ugwA\nAAAA8AjDMHJN03RcrB0rdgAAAABgcQQ7AABw3amsrPR0CQBQowh2AADAUl5//XXNmTNHkpSamqre\nvXtLktatW6fk5GQtWbJENptNY
WFhmjx5svs+Pz8/PfXUU4qIiNC2bds0ZcoUderUSeHh4e53ahYW\nFmrIkCGKiYlRTEyMtmzZUvsTBIArQLADAACW0qNHD/dDl3JyclRSUqLy8nJt2rRJHTp00OTJk7Vu\n3To5nU5lZ2fro48+kiSdOHFCXbp00a5duxQSEqIVK1Zo79692r17t37/+99Lkh5//HGlpqYqOztb\ny5cv1yOPPOKxeQLA5SDYAQAAS4mOjlZubq5+/PFH+fj4KDY2Vjk5Odq0aZMCAgIUHx+vli1bqkGD\nBkpOTtbGjRslnX6P4ZAhQyRJzZo1U+PGjfXwww/rb3/7m5o0aSJJysrK0oQJE2S323XPPffoxx9/\nVElJicfmCgCXimCH/4+9Ow+vqrr7v//eBAQRBCpK1doC3oBARpJAGBKmCqjIICAilalqUcFKK2pL\n7R3UWpX80EJrU6wyVIpREFRurTZCSpgKSXuCglikxglKQSQQBiVwnj+U80gBwTKcHHi/rsvLPay1\n9nedXmo/rD1IkhRTqlWrRqNGjZg2bRrt2rUjMzOThQsX8s4779CwYcMj9qtRowZxcXEAVK1alRUr\nVtC/f3/mz59Pjx49ANi/fz/Lly8nFAoRCoX46KOPqFWr1qmYliQdF4OdJEmKOZmZmeTk5JCVlUVm\nZia5ubmkpKTQunVr/vKXv7Blyxb27dvHrFmz6Nix4yH9y8vLKSsr48orr+TRRx+lpKQEgG7dujF5\n8uRIu1AodMrmJEnHw2AnSZJiTmZmJhs3bqRt27Y0aNCAGjVqkJmZyYUXXshDDz1E586dSUpKIjU1\nld69ex/Sf8eOHfTs2ZPExEQ6dOjAxIkTAZg0aRJFRUUkJibSokULcnNzT/XUJOm/4gfKJUmSJKmS\n8gPlkiRJknSGMNhJkiRJUowz2EmSJElSjDPYSZIkSVKMM9hJkiRJUowz2EmSJElSjDPYSZIkSVKM\nM9hJkiRJUowz2EmSJElSjDPYSZIkSVKMM9hJkiRJUowz2EmSJElSjDPYSZIkSVKMM9hJkiRJUowz\n2EmSJElSjDPYSZIkSVKMM9hJkiRJUowz2EmSJElSjDPYSZIkVRK5ubnMmDHjhI7ZqVMnioqKDjk+\nbdo0Ro0adUKvJSl6qka7AEmSJH1u5MiR0S5BUoxyxU6SJOkkevrpp2ndujXJycn84Ac/YN++fdSq\nVYtx48aRlJRERkYGmzZtAiA7O5ucnBwAQqEQGRkZJCYm0rdvXz755BPWr19Pq1atImOvW7cusn/f\nffeRnp5OfHw8N998M+FwONLuD3/4A8nJycTHx7NixYpDaty8eTP9+vUjPT2d9PR0lixZcjJ/Ekkn\ngcFOkiTpJHnrrbfIy8tjyZIlhEIh4uLimDlzJjt37iQjI4OSkhKysrJ44oknDuk7ZMgQHn74YVat\nWkVCQgLjx4/n0ksvpU6dOoRCIQCmTp3K8OHDARg1ahQrV67kzTffZPfu3cyfPz8y1q5duwiFQjz+\n+OOMGDHikGv98Ic/ZMyYMaxcuZI5c+Zw4403nqRfRNLJ4q2YkiRJJ8nrr79OcXEx6enpAOzevZsL\nLriAs846i549ewKQmprKn//854P6lZWVsW3bNjp27AjA0KFDGTBgAAA33ngjU6dOZeLEieTl5UVW\n4BYuXMgjjzzCrl272Lp1Ky1btuTqq68GYNCgQQBkZWWxfft2tm3bdtD18vPzWbNmTWR/+/btlJeX\nU6tWrRP9k0g6SQx2kiRJJ0k4HGbo0KH88pe/POh4Tk4OQRAAEBcXR0VFxTGP2a9fP8aPH0+XLl1I\nTU3lvPPOY8+ePdx6660UFRVxySWXkJ2dzZ49eyJ9DlzrSPv79+9n+fLl1KhR4+tOUVIl4a2YkiRJ\nJ0nXrl2ZPXs2//73vwHYunUr77333lH71alTh3r16lFYWAh8/ozcgdW7GjVq0L17d2655ZbIbZg
H\nQlz9+vUpLy9n9uzZB42Xl5cHwOLFi6lTpw516tQ56Hy3bt2YPHlyZP/ArZ6SYocrdpIkSSdJixYt\neOCBB+jWrRv79++nWrVq/OY3v/nKPgdW06ZPn87IkSPZtWsXjRs3ZurUqZE2gwcPZu7cuXTr1g2A\nunXrctNNNxEfH883v/nNyK2fB9SoUYOUlBT27t3LU089dcg1J02axG233UZiYiIVFRVkZWWRm5t7\nvNOXdAoFX35jUmWTlpYWPtx3VyRJkk5Ho0ePplWrVpGVuCPJycmhrKyM+++//xRVJilagiAoDofD\naUdr54qdJElSJXDvvffy17/+lezs7K9s17dvX9avX8+CBQtOTWGSYoIrdpIkSZJUSR3rip0vT5Ek\nSZKkGGewkyRJkqQYZ7CTJEmSpBhnsJMkSZKkGGewkyRJkqQYZ7CTJEmSpBhnsJMkSdJpKzc3lxkz\nZkS7DOmk8wPlkiRJOm2NHDky2iVIp4QrdpIkSaoUSktLueyyyxg2bBhNmzZl8ODB5Ofn0759e5o0\nacKKFSvYunUrffr0ITExkYyMDFatWsX+/ftp2LAh27Zti4zVpEkTNm3aRHZ2Njk5OQCsX7+eHj16\nkJqaSmZmJmvXro3WVKUTzmAnSZKkSuOdd97hxz/+MWvXrmXt2rX88Y9/ZPHixeTk5PDggw/yv//7\nv6SkpLBq1SoefPBBhgwZQpUqVejduzdz584F4K9//Svf+c53aNCgwUFj33zzzUyePJni4mJycnK4\n9dZbozFF6aTwVkxJkiRVGo0aNSIhIQGAli1b0rVrV4IgICEhgdLSUt577z3mzJkDQJcuXfj444/Z\nvn07AwcO5L777mP48OE888wzDBw48KBxy8vLWbp0KQMGDIgc+/TTT0/dxKSTzGAnSZKkSqN69eqR\n7SpVqkT2q1SpQkVFBdWqVTtsv7Zt2/LOO++wefNm5s2bx89+9rODzu/fv5+6desSCoVOXvFSFHkr\npiRJkmJGZmYmM2fOBB4h0rsAACAASURBVKCgoID69etz7rnnEgQBffv25Uc/+hHNmzfnvPPOO6jf\nueeeS6NGjXjuuecACIfDlJSUnPL6pZPFYCdJkqSYkZ2dTXFxMYmJidxzzz1Mnz49cm7gwIE8/fTT\nh9yGecDMmTN58sknSUpKomXLlrzwwgunqmzppAvC4XC0aziitLS0cFFRUbTLkCRJkg4xb948mjZt\nSosWLaJdik5jQRAUh8PhtKO1c8VOkiSpEti2bRuPP/44ABs2bKB///5RrkgH7Nu377DH582bx5o1\na05xNdLhGewkSZIqgS8Hu4suuojZs2dHuaLTw4QJE5g0aRIAY8aMoUuXLgAsWLCAwYMHM2vWLBIS\nEoiPj+fuu++O9KtVqxY//vGPSUpKYtmyZdxzzz20aNGCxMRE7rzzTpYuXcqLL77I2LFjSU5OZv36\n9VGZn3SAwU6SJKkSuOeee1i/fj3JyckMGDCA+Ph4AKZNm0afPn24/PLLadiwIb/+9a+ZOHEiKSkp\nZGRksHXrVsCPbx9JZmYmhYWFABQVFVFeXs7evXspLCykadOm3H333SxYsIBQKMTKlSuZN28eADt3\n7qRNmzaUlJTQvHlz5s6dy+rVq1m1ahU/+9nPaNeuHb169WLChAmEQiEuvfTSaE5TMthJkiRVBg89\n9BCXXnopoVCICRMmHHTuzTff5Pnnn2flypWMGzeOmjVr8ve//522bdsyY8YMwI9vH0lqairFxcVs\n376d6tWr07ZtW4qKiigsLKRu3bp06tSJ888/n6pVqzJ48GAWLVoEQFxcHP369QOgTp061KhRg+9/\n//s8//zz1KxZM5pTkg7L79hJkiRVcp07d6Z27drUrl2bOnXqcPXVVwOQkJDAqlWr/Pj2V6hWrRqN\nGjVi2rRptGvXjsTERBYuXMg777xDw4YNKS4uPmy/GjVqEBc
", 1075 | "text/plain": [ 1076 | "
" 1077 | ] 1078 | }, 1079 | "metadata": { 1080 | "tags": [] 1081 | }, 1082 | "output_type": "display_data" 1083 | } 1084 | ], 1085 | "source": [ 1086 | "import numpy as np\n", 1087 | "import matplotlib.pyplot as plt\n", 1088 | "\n", 1089 | "embedding_matrix = classifier.get_variable_value('dnn/input_from_feature_columns/input_layer/terms_embedding/embedding_weights')\n", 1090 | "\n", 1091 | "for term_index in range(len(informative_terms)):\n", 1092 | " # Create a one-hot encoding for our term. It has 0s everywhere, except for\n", 1093 | " # a single 1 in the coordinate that corresponds to that term.\n", 1094 | " term_vector = np.zeros(len(informative_terms))\n", 1095 | " term_vector[term_index] = 1\n", 1096 | " # We'll now project that one-hot vector into the embedding space.\n", 1097 | " embedding_xy = np.matmul(term_vector, embedding_matrix)\n", 1098 | " plt.text(embedding_xy[0],\n", 1099 | " embedding_xy[1],\n", 1100 | " informative_terms[term_index])\n", 1101 | "\n", 1102 | "# Do a little setup to make sure the plot displays nicely.\n", 1103 | "plt.rcParams[\"figure.figsize\"] = (15, 15)\n", 1104 | "plt.xlim(1.2 * embedding_matrix.min(), 1.2 * embedding_matrix.max())\n", 1105 | "plt.ylim(1.2 * embedding_matrix.min(), 1.2 * embedding_matrix.max())\n", 1106 | "plt.show() " 1107 | ] 1108 | }, 1109 | { 1110 | "cell_type": "markdown", 1111 | "metadata": { 1112 | "colab_type": "text", 1113 | "id": "MIIlbKtvB8Ui" 1114 | }, 1115 | "source": [ 1116 | "Robert Hatem:\n", 1117 | "### Observations\n", 1118 | "* Positive words are near each other (e.g. 'excellent' and 'amazing').\n", 1119 | "* Similarly, negataives words are near each other (e.g. 'terrible' and 'word').\n", 1120 | "* These clusters-based-on-meaning is intuitive and desirable for our embeddings.\n", 1121 | "\n", 1122 | "### After re-training model from Task 3:\n", 1123 | "* The words still cluster based on meaning (e.g. 
'excellent' is near 'perfect').\n", 1124 | "* The clusters separate along a different direction than before (now top-left and bottom-right corners).\n", 1125 | "* The clusters are more spread out - I don't know if this is because the word vectors are truly farther apart or if it's just an artifact of the visualization.\n", 1126 | "\n", 1127 | "### After re-training with only 10 steps:\n", 1128 | "* Now the embeddings are not clustered nicely. For example, 'excellent' is near 'disappointment' and 'boring' is near 'beautiful'." 1129 | ] 1130 | }, 1131 | { 1132 | "cell_type": "markdown", 1133 | "metadata": { 1134 | "colab_type": "text", 1135 | "id": "pUb3L7pqLS86" 1136 | }, 1137 | "source": [ 1138 | "## Task 6: Try to improve the model's performance\n", 1139 | "\n", 1140 | "See if you can refine the model to improve performance. A couple of things you may want to try:\n", 1141 | "\n", 1142 | "* **Changing hyperparameters**, or **using a different optimizer** like Adam (you may only gain one or two accuracy percentage points following these strategies).\n", 1143 | "* **Adding additional terms to `informative_terms`.** There's a full vocabulary file with all 30,716 terms for this data set that you can use at: https://download.mlcc.google.com/mledu-datasets/sparse-data-embedding/terms.txt You can pick out additional terms from this vocabulary file, or use the whole thing via the `categorical_column_with_vocabulary_file` feature column."
1144 | ] 1145 | }, 1146 | { 1147 | "cell_type": "markdown", 1148 | "metadata": { 1149 | "colab_type": "text", 1150 | "id": "DAKQMf9ptpeD" 1151 | }, 1152 | "source": [ 1153 | "Robert Hatem\n", 1154 | "\n", 1155 | "I performed a simple grid search:\n", 1156 | "\n", 1157 | "| Hidden units | batch norm | steps | optimizer | learning rate | test AUC |\n", 1158 | "| -- | -- | -- | -- | -- | -- |\n", 1159 | "| [20,20] | False | 1000 | adagrad | 0.1 | 0.8666485 |\n", 1160 | "| [20,20] | False | 100 | adagrad | 0.1 | 0.71556914 |\n", 1161 | "| [40,20] | False | 1000 | adagrad | 0.1 | 0.86758274 |\n", 1162 | "| [40,20] | False | 5000 | adagrad | 0.1 | 0.8707501 |\n", 1163 | "| [40,20] | False | 10000 | adagrad | 0.1 | 0.8708241 |\n", 1164 | "| [40,20] | False | 10000 | adagrad | 0.08 | 0.8709081 |\n", 1165 | "| [40,20] | True | 10000 | adagrad | 0.08 | __0.8709133__ |\n", 1166 | "| [40,20] | NA | 1000 | adam | 0.001 | 0.7138001 |\n", 1167 | "| [40,20] | NA | 1000 | adam | 0.0001 | 0.5597429 |\n", 1168 | "| [40,20] | NA | 1000 | adam | 0.01 | 0.86710733 |\n", 1169 | "\n", 1170 | "I'm not going to get into regularization on the optimizer." 
1171 | ] 1172 | }, 1173 | { 1174 | "cell_type": "code", 1175 | "execution_count": null, 1176 | "metadata": { 1177 | "colab": { 1178 | "base_uri": "https://localhost:8080/", 1179 | "height": 52 1180 | }, 1181 | "colab_type": "code", 1182 | "id": "6-b3BqXvLS86", 1183 | "outputId": "a586aa75-bdfc-494e-f260-43db690e2c22" 1184 | }, 1185 | "outputs": [ 1186 | { 1187 | "name": "stdout", 1188 | "output_type": "stream", 1189 | "text": [ 1190 | "Downloading data from https://download.mlcc.google.com/mledu-datasets/sparse-data-embedding/terms.txt\n", 1191 | "253952/253538 [==============================] - 0s 0us/step\n" 1192 | ] 1193 | } 1194 | ], 1195 | "source": [ 1196 | "# Download the vocabulary file.\n", 1197 | "terms_url = 'https://download.mlcc.google.com/mledu-datasets/sparse-data-embedding/terms.txt'\n", 1198 | "terms_path = tf.keras.utils.get_file(terms_url.split('/')[-1], terms_url)" 1199 | ] 1200 | }, 1201 | { 1202 | "cell_type": "code", 1203 | "execution_count": null, 1204 | "metadata": { 1205 | "colab": { 1206 | "base_uri": "https://localhost:8080/", 1207 | "height": 476 1208 | }, 1209 | "colab_type": "code", 1210 | "id": "0jbJlwW5LS8-", 1211 | "outputId": "f0162456-121a-4901-aa12-5ba274579f60" 1212 | }, 1213 | "outputs": [ 1214 | { 1215 | "name": "stdout", 1216 | "output_type": "stream", 1217 | "text": [ 1218 | "Training set metrics:\n", 1219 | "accuracy 0.80692\n", 1220 | "accuracy_baseline 0.5\n", 1221 | "auc 0.8792404\n", 1222 | "auc_precision_recall 0.88184667\n", 1223 | "average_loss 0.45094785\n", 1224 | "label/mean 0.5\n", 1225 | "loss 11.273696\n", 1226 | "precision 0.7966901\n", 1227 | "prediction/mean 0.5484024\n", 1228 | "recall 0.82416\n", 1229 | "global_step 1000\n", 1230 | "---\n", 1231 | "Test set metrics:\n", 1232 | "accuracy 0.79\n", 1233 | "accuracy_baseline 0.5\n", 1234 | "auc 0.8636051\n", 1235 | "auc_precision_recall 0.86517555\n", 1236 | "average_loss 0.47325826\n", 1237 | "label/mean 0.5\n", 1238 | "loss 11.831456\n", 1239 | "precision 
0.7802691\n", 1240 | "prediction/mean 0.54887104\n", 1241 | "recall 0.80736\n", 1242 | "global_step 1000\n", 1243 | "---\n" 1244 | ] 1245 | } 1246 | ], 1247 | "source": [ 1248 | "# Create a feature column from \"terms\", using a full vocabulary file.\n", 1249 | "informative_terms = None\n", 1250 | "with io.open(terms_path, 'r', encoding='utf8') as f:\n", 1251 | " # Convert it to a set first to remove duplicates.\n", 1252 | " informative_terms = list(set(f.read().split()))\n", 1253 | " \n", 1254 | "terms_feature_column = tf.feature_column.categorical_column_with_vocabulary_list(key=\"terms\", \n", 1255 | " vocabulary_list=informative_terms)\n", 1256 | "\n", 1257 | "terms_embedding_column = tf.feature_column.embedding_column(terms_feature_column, dimension=2)\n", 1258 | "feature_columns = [ terms_embedding_column ]\n", 1259 | "\n", 1260 | "my_optimizer = tf.train.AdagradOptimizer(learning_rate=0.1)\n", 1261 | "my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)\n", 1262 | "\n", 1263 | "classifier = tf.estimator.DNNClassifier(\n", 1264 | " feature_columns=feature_columns,\n", 1265 | " hidden_units=[10,10],\n", 1266 | " optimizer=my_optimizer\n", 1267 | ")\n", 1268 | "\n", 1269 | "classifier.train(\n", 1270 | " input_fn=lambda: _input_fn([train_path]),\n", 1271 | " steps=1000)\n", 1272 | "\n", 1273 | "evaluation_metrics = classifier.evaluate(\n", 1274 | " input_fn=lambda: _input_fn([train_path]),\n", 1275 | " steps=1000)\n", 1276 | "print(\"Training set metrics:\")\n", 1277 | "for m in evaluation_metrics:\n", 1278 | " print(m, evaluation_metrics[m])\n", 1279 | "print(\"---\")\n", 1280 | "\n", 1281 | "evaluation_metrics = classifier.evaluate(\n", 1282 | " input_fn=lambda: _input_fn([test_path]),\n", 1283 | " steps=1000)\n", 1284 | "\n", 1285 | "print(\"Test set metrics:\")\n", 1286 | "for m in evaluation_metrics:\n", 1287 | " print(m, evaluation_metrics[m])\n", 1288 | "print(\"---\")" 1289 | ] 1290 | }, 1291 | { 1292 | "cell_type": "markdown", 
1293 | "metadata": { 1294 | "colab_type": "text", 1295 | "id": "ew3kwGM-LS9B" 1296 | }, 1297 | "source": [ 1298 | "## A Final Word\n", 1299 | "\n", 1300 | "We may have gotten a DNN solution with an embedding that was better than our original linear model, but the linear model was also pretty good and was quite a bit faster to train. Linear models train more quickly because they do not have nearly as many parameters to update or layers to backprop through.\n", 1301 | "\n", 1302 | "In some applications, the speed of linear models may be a game changer, or linear models may be perfectly sufficient from a quality standpoint. In other areas, the additional model complexity and capacity provided by DNNs might be more important. When defining your model architecture, remember to explore your problem sufficiently so that you know which space you're in." 1303 | ] 1304 | }, 1305 | { 1306 | "cell_type": "markdown", 1307 | "metadata": { 1308 | "colab_type": "text", 1309 | "id": "9MquXy9zLS9B" 1310 | }, 1311 | "source": [ 1312 | "### *Optional Discussion:* Trade-offs between `embedding_column` and `indicator_column`\n", 1313 | "\n", 1314 | "Conceptually, when training a `LinearClassifier` or a `DNNClassifier`, an adapter is needed to use a sparse column. TF provides two options: `embedding_column` or `indicator_column`.\n", 1315 | "\n", 1316 | "When training a `LinearClassifier` (as in **Task 1**), an `embedding_column` is used under the hood. As seen in **Task 2**, when training a `DNNClassifier`, you must explicitly choose either `embedding_column` or `indicator_column`. This section discusses the distinction between the two, and the trade-offs of using one over the other, by looking at a simple example." 1317 | ] 1318 | }, 1319 | { 1320 | "cell_type": "markdown", 1321 | "metadata": { 1322 | "colab_type": "text", 1323 | "id": "M_3XuZ_LLS9C" 1324 | }, 1325 | "source": [ 1326 | "Suppose we have sparse data containing the values `\"great\"`, `\"beautiful\"`, `\"excellent\"`. 
Since the vocabulary size we're using here is $V = 50$, each unit (neuron) in the first layer will have 50 weights. We denote the number of terms in a sparse input using $s$. So for this example sparse data, $s = 3$. For an input layer with $V$ possible values, a hidden layer with $d$ units needs to do a vector-matrix multiply: $(1 \\times V) * (V \\times d)$. This has $O(V * d)$ computational cost. Note that this cost is proportional to the number of weights in that hidden layer and independent of $s$.\n", 1327 | "\n", 1328 | "If the inputs are one-hot encoded (a Boolean vector of length $V$ with a 1 for the terms present and a 0 for the rest) using an [`indicator_column`](https://www.tensorflow.org/api_docs/python/tf/feature_column/indicator_column), this means multiplying and adding a lot of zeros." 1329 | ] 1330 | }, 1331 | { 1332 | "cell_type": "markdown", 1333 | "metadata": { 1334 | "colab_type": "text", 1335 | "id": "I7mR4Wa2LS9C" 1336 | }, 1337 | "source": [ 1338 | "If we instead use an [`embedding_column`](https://www.tensorflow.org/api_docs/python/tf/feature_column/embedding_column) of size $d$, we get the exact same result by looking up and adding up just the embeddings corresponding to the three features present in our example input of \"`great`\", \"`beautiful`\", \"`excellent`\": $(1 \\times d) + (1 \\times d) + (1 \\times d)$. Since the weights for the features that are absent are multiplied by zero in the vector-matrix multiply, they do not contribute to the result. Weights for the features that are present are multiplied by 1 in the vector-matrix multiply. Thus, adding the weights obtained via the embedding lookup leads to the same result as the vector-matrix multiply.\n", 1339 | "\n", 1340 | "When using an embedding, computing the embedding lookup is an $O(s * d)$ computation, which is much cheaper than the $O(V * d)$ cost for the `indicator_column` on sparse data for which $s$ is much smaller than $V$. 
(Remember, these embeddings are being learned. In any given training iteration it is the current weights that are being looked up.)" 1341 | ] 1342 | }, 1343 | { 1344 | "cell_type": "markdown", 1345 | "metadata": { 1346 | "colab_type": "text", 1347 | "id": "etZ9qf0kLS9D" 1348 | }, 1349 | "source": [ 1350 | "As we saw in **Task 3**, by using an `embedding_column` in training the `DNNClassifier`, our model learns a low-dimensional representation for the features, where the dot product defines a similarity metric tailored to the desired task. In this example, terms that are used similarly in the context of movie reviews (e.g., `\"great\"` and `\"excellent\"`) will be closer to each other in the embedding space (i.e., have a large dot product), and terms that are dissimilar (e.g., `\"great\"` and `\"bad\"`) will be farther away from each other in the embedding space (i.e., have a small dot product)." 1351 | ] 1352 | } 1353 | ], 1354 | "metadata": { 1355 | "colab": { 1356 | "collapsed_sections": [ 1357 | "JndnmDMp66FL", 1358 | "mNCLhxsXyOIS", 1359 | "eQS5KQzBybTY" 1360 | ], 1361 | "name": "intro_to_sparse_data_and_embeddings.ipynb", 1362 | "provenance": [], 1363 | "version": "0.3.2" 1364 | }, 1365 | "kernelspec": { 1366 | "display_name": "Python 3", 1367 | "language": "python", 1368 | "name": "python3" 1369 | }, 1370 | "language_info": { 1371 | "codemirror_mode": { 1372 | "name": "ipython", 1373 | "version": 3 1374 | }, 1375 | "file_extension": ".py", 1376 | "mimetype": "text/x-python", 1377 | "name": "python", 1378 | "nbconvert_exporter": "python", 1379 | "pygments_lexer": "ipython3", 1380 | "version": "3.6.8" 1381 | } 1382 | }, 1383 | "nbformat": 4, 1384 | "nbformat_minor": 2 1385 | } 1386 | --------------------------------------------------------------------------------