"
198 | ]
199 | },
200 | "metadata": {},
201 | "output_type": "display_data"
202 | }
203 | ],
204 | "execution_count": 5
205 | },
206 | {
207 | "metadata": {},
208 | "cell_type": "markdown",
209 | "source": "For the purpose of our classification model, we shall employ the **encoder** part of the architecture, which constitutes the left-hand part in the image above. First, the embedded input samples enter the multi-head attention layer, whose output is then summed with the original input coming through a residual connection. Following a normalization, the tensor enters a fully connected segment containing two Dense layers. The output from this dense projection is then added to the input tensor via a residual connection, and normalized once more to produce the final output of the Transformer encoder.",
210 | "id": "f3304f680edc0de6"
211 | },
212 | {
213 | "metadata": {
214 | "ExecuteTime": {
215 | "end_time": "2024-11-27T13:53:23.773292Z",
216 | "start_time": "2024-11-27T13:53:23.764666Z"
217 | }
218 | },
219 | "cell_type": "code",
220 | "source": [
221 | "class TransformerEncoder(layers.Layer):\n",
222 | " def __init__(self, embed_dim, dense_dim, num_heads, **kwargs):\n",
223 | " super().__init__(**kwargs)\n",
224 | " self.embed_dim = embed_dim\n",
225 | " self.dense_dim = dense_dim\n",
226 | " self.num_heads = num_heads\n",
227 | " self.attention = layers.MultiHeadAttention(\n",
228 | " num_heads=num_heads, key_dim=embed_dim\n",
229 | " )\n",
230 | " self.dense_proj = keras.Sequential(\n",
231 | " [\n",
232 | " layers.Dense(dense_dim, activation=\"relu\"),\n",
233 | " layers.Dense(embed_dim),\n",
234 | " ]\n",
235 | " )\n",
236 | " self.layernorm_1 = layers.LayerNormalization()\n",
237 | " self.layernorm_2 = layers.LayerNormalization()\n",
238 | " self.supports_masking = True\n",
239 | "\n",
240 | " def call(self, inputs, mask=None):\n",
241 | " if mask is not None:\n",
242 | " padding_mask = ops.cast(mask[:, None, :], dtype=\"int32\")\n",
243 | " else:\n",
244 | " padding_mask = None\n",
245 | "\n",
246 | " attention_output = self.attention(\n",
247 | " query=inputs, value=inputs, key=inputs, attention_mask=padding_mask\n",
248 | " )\n",
249 | " proj_input = self.layernorm_1(inputs + attention_output)\n",
250 | " proj_output = self.dense_proj(proj_input)\n",
251 | " return self.layernorm_2(proj_input + proj_output)\n",
252 | "\n",
253 | " def get_config(self):\n",
254 | " config = super().get_config()\n",
255 | " config.update(\n",
256 | " {\n",
257 | " \"embed_dim\": self.embed_dim,\n",
258 | " \"dense_dim\": self.dense_dim,\n",
259 | " \"num_heads\": self.num_heads,\n",
260 | " }\n",
261 | " )\n",
262 | " return config\n"
263 | ],
264 | "id": "b258aa4d3d86c12d",
265 | "outputs": [],
266 | "execution_count": 6
267 | },
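{
"metadata": {},
"cell_type": "markdown",
"source": [
"As a quick sanity check (a hypothetical snippet, not part of the original run), note that the encoder preserves the shape of its input: every sub-layer maps `embed_dim` back to `embed_dim`, so a batch of embedded sequences comes out with the same `(batch, sequence, embed_dim)` shape it went in with.\n",
"\n",
"```python\n",
"# Hypothetical shape check; not executed above.\n",
"encoder = TransformerEncoder(embed_dim=32, dense_dim=64, num_heads=2)\n",
"dummy = ops.ones((4, 10, 32))  # batch of 4 sequences, length 10, 32-dim embeddings\n",
"print(encoder(dummy).shape)    # (4, 10, 32)\n",
"```"
],
"id": "9f3a51c27d0e4b86"
},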
268 | {
269 | "metadata": {},
270 | "cell_type": "markdown",
271 | "source": "We are now ready to build the actual classifier model. The output from the encoder is flattened by a global pooling layer, and then fed straight into the output layer of a single neuron with sigmoid activation.",
272 | "id": "5bf1eb515995b216"
273 | },
274 | {
275 | "metadata": {
276 | "ExecuteTime": {
277 | "end_time": "2024-11-27T13:53:24.672410Z",
278 | "start_time": "2024-11-27T13:53:23.773292Z"
279 | }
280 | },
281 | "cell_type": "code",
282 | "source": [
283 | "embed_dim = 32 # dimension of word embeddings (token + pos)\n",
284 | "dense_dim = 64 # \n",
285 | "num_heads = 2\n",
286 | "\n",
287 | "inputs = keras.Input(shape=(None,), dtype=\"int64\")\n",
288 | "x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(inputs)\n",
289 | "x = TransformerEncoder(embed_dim, dense_dim, num_heads)(x)\n",
290 | "x = layers.GlobalAveragePooling1D()(x)\n",
291 | "x = layers.Dropout(0.5)(x)\n",
292 | "outputs = layers.Dense(1, activation='sigmoid')(x)\n",
293 | "\n",
294 | "model = keras.Model(inputs=inputs, outputs=outputs)\n",
295 | "\n",
296 | "model.compile(optimizer=keras.optimizers.RMSprop(), \n",
297 | " loss='binary_crossentropy', \n",
298 | " metrics=['accuracy'])\n",
299 | "\n",
300 | "model.summary()"
301 | ],
302 | "id": "8ecae66d00669c53",
303 | "outputs": [
304 | {
305 | "name": "stdout",
306 | "output_type": "stream",
307 | "text": [
308 | "WARNING:tensorflow:From C:\\Users\\kopuj\\Anaconda3\\envs\\keras-cpu\\Lib\\site-packages\\keras\\src\\backend\\tensorflow\\core.py:204: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.\n",
309 | "\n"
310 | ]
311 | },
312 | {
313 | "data": {
314 | "text/plain": [
315 | "\u001B[1mModel: \"functional_1\"\u001B[0m\n"
316 | ],
317 | "text/html": [
318 | "Model: \"functional_1\"\n",
319 | "
\n"
320 | ]
321 | },
322 | "metadata": {},
323 | "output_type": "display_data"
324 | },
325 | {
326 | "data": {
327 | "text/plain": [
328 | "┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓\n",
329 | "┃\u001B[1m \u001B[0m\u001B[1mLayer (type) \u001B[0m\u001B[1m \u001B[0m┃\u001B[1m \u001B[0m\u001B[1mOutput Shape \u001B[0m\u001B[1m \u001B[0m┃\u001B[1m \u001B[0m\u001B[1m Param #\u001B[0m\u001B[1m \u001B[0m┃\u001B[1m \u001B[0m\u001B[1mConnected to \u001B[0m\u001B[1m \u001B[0m┃\n",
330 | "┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩\n",
331 | "│ input_layer │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;45mNone\u001B[0m) │ \u001B[38;5;34m0\u001B[0m │ - │\n",
332 | "│ (\u001B[38;5;33mInputLayer\u001B[0m) │ │ │ │\n",
333 | "├─────────────────────┼───────────────────┼────────────┼───────────────────┤\n",
334 | "│ positional_embeddi… │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m32\u001B[0m) │ \u001B[38;5;34m328,000\u001B[0m │ input_layer[\u001B[38;5;34m0\u001B[0m][\u001B[38;5;34m0\u001B[0m] │\n",
335 | "│ (\u001B[38;5;33mPositionalEmbeddi…\u001B[0m │ │ │ │\n",
336 | "├─────────────────────┼───────────────────┼────────────┼───────────────────┤\n",
337 | "│ not_equal │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;45mNone\u001B[0m) │ \u001B[38;5;34m0\u001B[0m │ input_layer[\u001B[38;5;34m0\u001B[0m][\u001B[38;5;34m0\u001B[0m] │\n",
338 | "│ (\u001B[38;5;33mNotEqual\u001B[0m) │ │ │ │\n",
339 | "├─────────────────────┼───────────────────┼────────────┼───────────────────┤\n",
340 | "│ transformer_encoder │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m32\u001B[0m) │ \u001B[38;5;34m12,736\u001B[0m │ positional_embed… │\n",
341 | "│ (\u001B[38;5;33mTransformerEncode…\u001B[0m │ │ │ not_equal[\u001B[38;5;34m0\u001B[0m][\u001B[38;5;34m0\u001B[0m] │\n",
342 | "├─────────────────────┼───────────────────┼────────────┼───────────────────┤\n",
343 | "│ global_average_poo… │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m32\u001B[0m) │ \u001B[38;5;34m0\u001B[0m │ transformer_enco… │\n",
344 | "│ (\u001B[38;5;33mGlobalAveragePool…\u001B[0m │ │ │ not_equal[\u001B[38;5;34m0\u001B[0m][\u001B[38;5;34m0\u001B[0m] │\n",
345 | "├─────────────────────┼───────────────────┼────────────┼───────────────────┤\n",
346 | "│ dropout_1 (\u001B[38;5;33mDropout\u001B[0m) │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m32\u001B[0m) │ \u001B[38;5;34m0\u001B[0m │ global_average_p… │\n",
347 | "├─────────────────────┼───────────────────┼────────────┼───────────────────┤\n",
348 | "│ dense_2 (\u001B[38;5;33mDense\u001B[0m) │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m1\u001B[0m) │ \u001B[38;5;34m33\u001B[0m │ dropout_1[\u001B[38;5;34m0\u001B[0m][\u001B[38;5;34m0\u001B[0m] │\n",
349 | "└─────────────────────┴───────────────────┴────────────┴───────────────────┘\n"
350 | ],
351 | "text/html": [
352 | "┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓\n",
353 | "┃ Layer (type) ┃ Output Shape ┃ Param # ┃ Connected to ┃\n",
354 | "┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩\n",
355 | "│ input_layer │ (None, None) │ 0 │ - │\n",
356 | "│ (InputLayer) │ │ │ │\n",
357 | "├─────────────────────┼───────────────────┼────────────┼───────────────────┤\n",
358 | "│ positional_embeddi… │ (None, None, 32) │ 328,000 │ input_layer[0][0] │\n",
359 | "│ (PositionalEmbeddi… │ │ │ │\n",
360 | "├─────────────────────┼───────────────────┼────────────┼───────────────────┤\n",
361 | "│ not_equal │ (None, None) │ 0 │ input_layer[0][0] │\n",
362 | "│ (NotEqual) │ │ │ │\n",
363 | "├─────────────────────┼───────────────────┼────────────┼───────────────────┤\n",
364 | "│ transformer_encoder │ (None, None, 32) │ 12,736 │ positional_embed… │\n",
365 | "│ (TransformerEncode… │ │ │ not_equal[0][0] │\n",
366 | "├─────────────────────┼───────────────────┼────────────┼───────────────────┤\n",
367 | "│ global_average_poo… │ (None, 32) │ 0 │ transformer_enco… │\n",
368 | "│ (GlobalAveragePool… │ │ │ not_equal[0][0] │\n",
369 | "├─────────────────────┼───────────────────┼────────────┼───────────────────┤\n",
370 | "│ dropout_1 (Dropout) │ (None, 32) │ 0 │ global_average_p… │\n",
371 | "├─────────────────────┼───────────────────┼────────────┼───────────────────┤\n",
372 | "│ dense_2 (Dense) │ (None, 1) │ 33 │ dropout_1[0][0] │\n",
373 | "└─────────────────────┴───────────────────┴────────────┴───────────────────┘\n",
374 | "
\n"
375 | ]
376 | },
377 | "metadata": {},
378 | "output_type": "display_data"
379 | },
380 | {
381 | "data": {
382 | "text/plain": [
383 | "\u001B[1m Total params: \u001B[0m\u001B[38;5;34m340,769\u001B[0m (1.30 MB)\n"
384 | ],
385 | "text/html": [
386 | " Total params: 340,769 (1.30 MB)\n",
387 | "
\n"
388 | ]
389 | },
390 | "metadata": {},
391 | "output_type": "display_data"
392 | },
393 | {
394 | "data": {
395 | "text/plain": [
396 | "\u001B[1m Trainable params: \u001B[0m\u001B[38;5;34m340,769\u001B[0m (1.30 MB)\n"
397 | ],
398 | "text/html": [
399 | " Trainable params: 340,769 (1.30 MB)\n",
400 | "
\n"
401 | ]
402 | },
403 | "metadata": {},
404 | "output_type": "display_data"
405 | },
406 | {
407 | "data": {
408 | "text/plain": [
409 | "\u001B[1m Non-trainable params: \u001B[0m\u001B[38;5;34m0\u001B[0m (0.00 B)\n"
410 | ],
411 | "text/html": [
412 | " Non-trainable params: 0 (0.00 B)\n",
413 | "
\n"
414 | ]
415 | },
416 | "metadata": {},
417 | "output_type": "display_data"
418 | }
419 | ],
420 | "execution_count": 7
421 | },
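{
"metadata": {},
"cell_type": "markdown",
"source": [
"The parameter count reported by `model.summary()` can be verified by hand, assuming the `vocab_size` of 10,000 and `sequence_length` of 250 set earlier in the notebook (a combination consistent with the 328,000 figure, since 32 × (10,000 + 250) = 328,000):\n",
"\n",
"- positional embedding: token and position embeddings of width 32, i.e. 32 × (10,000 + 250) = 328,000\n",
"- multi-head attention: three input projections of 32 → 2 × 32, plus one output projection of 2 × 32 → 32, i.e. 3 × (32 × 64 + 64) + (64 × 32 + 32) = 8,416\n",
"- feed-forward block: (32 × 64 + 64) + (64 × 32 + 32) = 4,192\n",
"- two LayerNormalizations: 2 × (32 + 32) = 128, giving 8,416 + 4,192 + 128 = 12,736 for the encoder\n",
"- output layer: 32 weights + 1 bias = 33\n",
"\n",
"Altogether 328,000 + 12,736 + 33 = 340,769, matching the summary."
],
"id": "5c8e02b7a4d1f639"
},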
422 | {
423 | "metadata": {},
424 | "cell_type": "markdown",
425 | "source": "Train the model ...",
426 | "id": "dc117025b5a0faab"
427 | },
428 | {
429 | "metadata": {
430 | "ExecuteTime": {
431 | "end_time": "2024-11-27T13:54:25.724615Z",
432 | "start_time": "2024-11-27T13:53:24.672410Z"
433 | }
434 | },
435 | "cell_type": "code",
436 | "source": [
437 | "history = model.fit(\n",
438 | " train_ds_int, \n",
439 | " batch_size=batch_size, \n",
440 | " epochs=3, \n",
441 | " validation_data=(val_ds_int)\n",
442 | ")\n"
443 | ],
444 | "id": "d0bd0b59acca4b53",
445 | "outputs": [
446 | {
447 | "name": "stdout",
448 | "output_type": "stream",
449 | "text": [
450 | "Epoch 1/3\n"
451 | ]
452 | },
453 | {
454 | "name": "stderr",
455 | "output_type": "stream",
456 | "text": [
457 | "C:\\Users\\kopuj\\Anaconda3\\envs\\keras-cpu\\Lib\\site-packages\\keras\\src\\layers\\layer.py:932: UserWarning: Layer 'query' (of type EinsumDense) was passed an input with a mask attached to it. However, this layer does not support masking and will therefore destroy the mask information. Downstream layers will not see the mask.\n",
458 | " warnings.warn(\n",
459 | "C:\\Users\\kopuj\\Anaconda3\\envs\\keras-cpu\\Lib\\site-packages\\keras\\src\\layers\\layer.py:932: UserWarning: Layer 'key' (of type EinsumDense) was passed an input with a mask attached to it. However, this layer does not support masking and will therefore destroy the mask information. Downstream layers will not see the mask.\n",
460 | " warnings.warn(\n",
461 | "C:\\Users\\kopuj\\Anaconda3\\envs\\keras-cpu\\Lib\\site-packages\\keras\\src\\layers\\layer.py:932: UserWarning: Layer 'value' (of type EinsumDense) was passed an input with a mask attached to it. However, this layer does not support masking and will therefore destroy the mask information. Downstream layers will not see the mask.\n",
462 | " warnings.warn(\n"
463 | ]
464 | },
465 | {
466 | "name": "stdout",
467 | "output_type": "stream",
468 | "text": [
469 | "\u001B[1m625/625\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m21s\u001B[0m 30ms/step - accuracy: 0.6883 - loss: 0.5721 - val_accuracy: 0.7454 - val_loss: 0.5455\n",
470 | "Epoch 2/3\n",
471 | "\u001B[1m625/625\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m19s\u001B[0m 30ms/step - accuracy: 0.8607 - loss: 0.3280 - val_accuracy: 0.8264 - val_loss: 0.4016\n",
472 | "Epoch 3/3\n",
473 | "\u001B[1m625/625\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m21s\u001B[0m 33ms/step - accuracy: 0.8887 - loss: 0.2775 - val_accuracy: 0.8722 - val_loss: 0.3088\n"
474 | ]
475 | }
476 | ],
477 | "execution_count": 8
478 | },
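{
"metadata": {},
"cell_type": "markdown",
"source": "The `UserWarning` messages about a destroyed mask are expected here: the padding mask attached to the inputs by the embedding layer does not propagate through the attention layer's internal `EinsumDense` projections. No information is lost, however, because `TransformerEncoder.call` passes the mask explicitly via the `attention_mask` argument.",
"id": "7d41a9c3e8f25b60"
},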
479 | {
480 | "metadata": {},
481 | "cell_type": "markdown",
482 | "source": "... and test it.",
483 | "id": "d1a6a3f6a109812e"
484 | },
485 | {
486 | "metadata": {
487 | "ExecuteTime": {
488 | "end_time": "2024-11-27T13:54:39.310606Z",
489 | "start_time": "2024-11-27T13:54:25.724615Z"
490 | }
491 | },
492 | "cell_type": "code",
493 | "source": "model.evaluate(test_ds_int)",
494 | "id": "f3c097972c6aea9a",
495 | "outputs": [
496 | {
497 | "name": "stdout",
498 | "output_type": "stream",
499 | "text": [
500 | "\u001B[1m782/782\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m14s\u001B[0m 17ms/step - accuracy: 0.8578 - loss: 0.3268\n"
501 | ]
502 | },
503 | {
504 | "data": {
505 | "text/plain": [
506 | "[0.3285871744155884, 0.8590400218963623]"
507 | ]
508 | },
509 | "execution_count": 9,
510 | "metadata": {},
511 | "output_type": "execute_result"
512 | }
513 | ],
514 | "execution_count": 9
515 | },
516 | {
517 | "metadata": {},
518 | "cell_type": "markdown",
519 | "source": [
520 | "## Text generation\n",
521 | "\n",
522 | "In addition to simple tasks like classification, Transformer architecture models can also be used for more ambitious natural language processing tasks, such as machine translation and text generation. Here we shall take a look at how a neural network model can be trained to produce movie reviews from an initial prompt. \n",
523 | "\n",
524 | "We begin by creating a new Dataset object from the data in the training directory only. This time, we do not need the sentiment labels, and remove the line break tokens in order to avoid generating those later."
525 | ],
526 | "id": "3cb96f236d7cd3fc"
527 | },
528 | {
529 | "metadata": {
530 | "ExecuteTime": {
531 | "end_time": "2024-11-27T13:54:41.067718Z",
532 | "start_time": "2024-11-27T13:54:39.310606Z"
533 | }
534 | },
535 | "cell_type": "code",
536 | "source": [
537 | "dataset = keras.utils.text_dataset_from_directory(\n",
538 | " directory=\"../../aclImdb/train/\", \n",
539 | " label_mode=None, \n",
540 | " batch_size=256)\n",
541 | "\n",
542 | "dataset = dataset.map(lambda x: tf.strings.regex_replace(x, \"
\", \"\"))"
543 | ],
544 | "id": "24c15dc8aa7bf7f1",
545 | "outputs": [
546 | {
547 | "name": "stdout",
548 | "output_type": "stream",
549 | "text": [
550 | "Found 25000 files.\n"
551 | ]
552 | }
553 | ],
554 | "execution_count": 10
555 | },
556 | {
557 | "metadata": {},
558 | "cell_type": "markdown",
559 | "source": "Next, we convert the strings to integer lists, as before. To speed up training, we restrict the sequence lengths to be somewhat shorter than before.",
560 | "id": "4a8e929d6bae3c27"
561 | },
562 | {
563 | "metadata": {
564 | "ExecuteTime": {
565 | "end_time": "2024-11-27T13:54:47.026413Z",
566 | "start_time": "2024-11-27T13:54:41.067718Z"
567 | }
568 | },
569 | "cell_type": "code",
570 | "source": [
571 | "sequence_length = 100\n",
572 | "vocab_size = 10000\n",
573 | "text_vectorization = layers.TextVectorization(\n",
574 | " max_tokens=vocab_size,\n",
575 | " output_mode=\"int\",\n",
576 | " output_sequence_length=sequence_length,\n",
577 | ")\n",
578 | "text_vectorization.adapt(dataset)"
579 | ],
580 | "id": "3d7856bb7ce47196",
581 | "outputs": [],
582 | "execution_count": 11
583 | },
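{
"metadata": {},
"cell_type": "markdown",
"source": [
"To see what the vectorizer produces, we can feed it a short string (a hypothetical check; the exact token ids depend on the adapted vocabulary):\n",
"\n",
"```python\n",
"# Hypothetical check; the ids depend on the learned vocabulary.\n",
"sample = text_vectorization([\"this movie was great\"])\n",
"print(sample.shape)  # (1, 100): each review is padded/truncated to sequence_length\n",
"```"
],
"id": "2b6f90d4c1a8e357"
},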
584 | {
585 | "metadata": {},
586 | "cell_type": "markdown",
587 | "source": "Next, we generate specialized targets for the purpose of training the model: the targets are simply the same integer sequences as the samples, but shifted one token to the right (this means that the sequence length gets reduced by one token from the original ones). ",
588 | "id": "36aa3103da705b07"
589 | },
590 | {
591 | "metadata": {
592 | "ExecuteTime": {
593 | "end_time": "2024-11-27T13:54:47.100553Z",
594 | "start_time": "2024-11-27T13:54:47.026413Z"
595 | }
596 | },
597 | "cell_type": "code",
598 | "source": [
599 | "def prepare_lm_dataset(text_batch):\n",
600 | " vectorized_sequences = text_vectorization(text_batch)\n",
601 | " x = vectorized_sequences[:, :-1]\n",
602 | " y = vectorized_sequences[:, 1:]\n",
603 | " return x, y\n",
604 | "\n",
605 | "lm_dataset = dataset.map(prepare_lm_dataset)"
606 | ],
607 | "id": "f9841b801b257f8c",
608 | "outputs": [],
609 | "execution_count": 12
610 | },
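{
"metadata": {},
"cell_type": "markdown",
"source": [
"As a toy illustration (with made-up token ids), a vectorized review `[12, 7, 45, 3]` would be split as follows:\n",
"\n",
"```python\n",
"# Toy illustration, made-up ids:\n",
"# vectorized: [12,  7, 45,  3]\n",
"# x (input):  [12,  7, 45]      # what the model sees up to each position\n",
"# y (target): [ 7, 45,  3]      # the next token at each position\n",
"```"
],
"id": "8e5d13a7f0b9c246"
},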
611 | {
612 | "metadata": {},
613 | "cell_type": "markdown",
614 | "source": [
615 | "For the actual text generation, we need to employ the **decoder** part of the Transformer (the right-hand side of the architecture image above). The building blocks of the decoder are very similar to those of the encoder, but with a couple of differences. First, there are two separate attention layers; the first layer takes the embedded inputs in as query, key and value. The second attention layer takes the output of the first one in as the query. In the case of a machine translation task the key and value would come from in from the encoder outputs generated by the source sequence to be translated; here, however, there is no such separate source sequence, but the keys and values are provided by the original inputs to the decoder.\n",
616 | "\n",
617 | "Another important detail is the **causal mask**, which prevents the generated from attending to future words, when predicting a given word in a sequence. The implementation of this, as well as all the other essentials of the code are from F. Chollet: **Deep Learning with Python** (Chapter 12), with migration guidelines to Keras 3 in [this example](https://keras.io/examples/nlp/neural_machine_translation_with_transformer/)."
618 | ],
619 | "id": "e5f17086db0e2ea1"
620 | },
621 | {
622 | "metadata": {
623 | "ExecuteTime": {
624 | "end_time": "2024-11-27T13:54:47.108871Z",
625 | "start_time": "2024-11-27T13:54:47.100553Z"
626 | }
627 | },
628 | "cell_type": "code",
629 | "source": [
630 | "class TransformerDecoder(layers.Layer):\n",
631 | " def __init__(self, embed_dim, latent_dim, num_heads, **kwargs):\n",
632 | " super().__init__(**kwargs)\n",
633 | " self.embed_dim = embed_dim\n",
634 | " self.latent_dim = latent_dim\n",
635 | " self.num_heads = num_heads\n",
636 | " self.attention_1 = layers.MultiHeadAttention(\n",
637 | " num_heads=num_heads, key_dim=embed_dim\n",
638 | " )\n",
639 | " self.attention_2 = layers.MultiHeadAttention(\n",
640 | " num_heads=num_heads, key_dim=embed_dim\n",
641 | " )\n",
642 | " self.dense_proj = keras.Sequential(\n",
643 | " [\n",
644 | " layers.Dense(latent_dim, activation=\"relu\"),\n",
645 | " layers.Dense(embed_dim),\n",
646 | " ]\n",
647 | " )\n",
648 | " self.layernorm_1 = layers.LayerNormalization()\n",
649 | " self.layernorm_2 = layers.LayerNormalization()\n",
650 | " self.layernorm_3 = layers.LayerNormalization()\n",
651 | " self.supports_masking = True\n",
652 | "\n",
653 | " def call(self, inputs, encoder_outputs, mask=None):\n",
654 | " causal_mask = self.get_causal_attention_mask(inputs)\n",
655 | " if mask is not None:\n",
656 | " padding_mask = ops.cast(mask[:, None, :], dtype=\"int32\")\n",
657 | " padding_mask = ops.minimum(padding_mask, causal_mask)\n",
658 | " else:\n",
659 | " padding_mask = None\n",
660 | "\n",
661 | " attention_output_1 = self.attention_1(\n",
662 | " query=inputs, value=inputs, key=inputs, attention_mask=causal_mask\n",
663 | " )\n",
664 | " out_1 = self.layernorm_1(inputs + attention_output_1)\n",
665 | "\n",
666 | " attention_output_2 = self.attention_2(\n",
667 | " query=out_1,\n",
668 | " value=encoder_outputs,\n",
669 | " key=encoder_outputs,\n",
670 | " attention_mask=padding_mask,\n",
671 | " )\n",
672 | " out_2 = self.layernorm_2(out_1 + attention_output_2)\n",
673 | "\n",
674 | " proj_output = self.dense_proj(out_2)\n",
675 | " return self.layernorm_3(out_2 + proj_output)\n",
676 | "\n",
677 | " def get_causal_attention_mask(self, inputs):\n",
678 | " input_shape = ops.shape(inputs)\n",
679 | " batch_size, sequence_length = input_shape[0], input_shape[1]\n",
680 | " i = ops.arange(sequence_length)[:, None]\n",
681 | " j = ops.arange(sequence_length)\n",
682 | " mask = ops.cast(i >= j, dtype=\"int32\")\n",
683 | " mask = ops.reshape(mask, (1, input_shape[1], input_shape[1]))\n",
684 | " mult = ops.concatenate(\n",
685 | " [ops.expand_dims(batch_size, -1), ops.convert_to_tensor([1, 1])],\n",
686 | " axis=0,\n",
687 | " )\n",
688 | " return ops.tile(mask, mult)\n",
689 | "\n",
690 | " def get_config(self):\n",
691 | " config = super().get_config()\n",
692 | " config.update(\n",
693 | " {\n",
694 | " \"embed_dim\": self.embed_dim,\n",
695 | " \"latent_dim\": self.latent_dim,\n",
696 | " \"num_heads\": self.num_heads,\n",
697 | " }\n",
698 | " )\n",
699 | " return config"
700 | ],
701 | "id": "400ba417d38ab6f2",
702 | "outputs": [],
703 | "execution_count": 13
704 | },
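{
"metadata": {},
"cell_type": "markdown",
"source": [
"The causal mask produced by `get_causal_attention_mask` is simply a lower-triangular matrix of ones, tiled over the batch: position i may attend to positions j ≤ i only. A hypothetical check for a length-4 sequence (not part of the original run):\n",
"\n",
"```python\n",
"# Hypothetical check of the causal mask for a batch of one length-4 sequence.\n",
"decoder = TransformerDecoder(embed_dim=32, latent_dim=64, num_heads=2)\n",
"mask = decoder.get_causal_attention_mask(ops.zeros((1, 4, 32)))\n",
"print(ops.convert_to_numpy(mask)[0])\n",
"# [[1 0 0 0]\n",
"#  [1 1 0 0]\n",
"#  [1 1 1 0]\n",
"#  [1 1 1 1]]\n",
"```"
],
"id": "6a2c47e91b3d8f05"
},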
711 | {
712 | "metadata": {},
713 | "cell_type": "markdown",
714 | "source": "We are now ready to build the model with the Transformer decoder. Its output is directly connected to the output layer with the size of the vocabulary and softmax activation, to provide a probability distribution for token predictions. Note that we choose `sparse_categorical_crossentropy` as our loss function, because the target labels consist of integers (instead of being one-hot encoded).",
715 | "id": "d486ccb406b67d7d"
716 | },
717 | {
718 | "metadata": {
719 | "ExecuteTime": {
720 | "end_time": "2024-11-27T13:54:47.356113Z",
721 | "start_time": "2024-11-27T13:54:47.108871Z"
722 | }
723 | },
724 | "cell_type": "code",
725 | "source": [
726 | "embed_dim = 32\n",
727 | "latent_dim = 64\n",
728 | "num_heads = 2\n",
729 | "\n",
730 | "inputs = keras.Input(shape=(None,), dtype=\"int64\")\n",
731 | "x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(inputs)\n",
732 | "x = TransformerDecoder(embed_dim, latent_dim, num_heads)(x, x)\n",
733 | "outputs = layers.Dense(vocab_size, activation=\"softmax\")(x)\n",
734 | "\n",
735 | "model = keras.Model(inputs, outputs)\n",
736 | "\n",
737 | "model.compile(loss=\"sparse_categorical_crossentropy\", \n",
738 | " optimizer=keras.optimizers.RMSprop())\n",
739 | "\n",
740 | "model.summary()"
741 | ],
742 | "id": "1b51e72262245b76",
743 | "outputs": [
744 | {
745 | "data": {
746 | "text/plain": [
747 | "\u001B[1mModel: \"functional_3\"\u001B[0m\n"
748 | ],
749 | "text/html": [
750 | "Model: \"functional_3\"\n",
751 | "
\n"
752 | ]
753 | },
754 | "metadata": {},
755 | "output_type": "display_data"
756 | },
757 | {
758 | "data": {
759 | "text/plain": [
760 | "┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓\n",
761 | "┃\u001B[1m \u001B[0m\u001B[1mLayer (type) \u001B[0m\u001B[1m \u001B[0m┃\u001B[1m \u001B[0m\u001B[1mOutput Shape \u001B[0m\u001B[1m \u001B[0m┃\u001B[1m \u001B[0m\u001B[1m Param #\u001B[0m\u001B[1m \u001B[0m┃\u001B[1m \u001B[0m\u001B[1mConnected to \u001B[0m\u001B[1m \u001B[0m┃\n",
762 | "┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩\n",
763 | "│ input_layer_2 │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;45mNone\u001B[0m) │ \u001B[38;5;34m0\u001B[0m │ - │\n",
764 | "│ (\u001B[38;5;33mInputLayer\u001B[0m) │ │ │ │\n",
765 | "├─────────────────────┼───────────────────┼────────────┼───────────────────┤\n",
766 | "│ positional_embeddi… │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m32\u001B[0m) │ \u001B[38;5;34m323,200\u001B[0m │ input_layer_2[\u001B[38;5;34m0\u001B[0m]… │\n",
767 | "│ (\u001B[38;5;33mPositionalEmbeddi…\u001B[0m │ │ │ │\n",
768 | "├─────────────────────┼───────────────────┼────────────┼───────────────────┤\n",
769 | "│ transformer_decoder │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;45mNone\u001B[0m, \u001B[38;5;34m32\u001B[0m) │ \u001B[38;5;34m21,216\u001B[0m │ positional_embed… │\n",
770 | "│ (\u001B[38;5;33mTransformerDecode…\u001B[0m │ │ │ positional_embed… │\n",
771 | "├─────────────────────┼───────────────────┼────────────┼───────────────────┤\n",
772 | "│ dense_5 (\u001B[38;5;33mDense\u001B[0m) │ (\u001B[38;5;45mNone\u001B[0m, \u001B[38;5;45mNone\u001B[0m, │ \u001B[38;5;34m330,000\u001B[0m │ transformer_deco… │\n",
773 | "│ │ \u001B[38;5;34m10000\u001B[0m) │ │ │\n",
774 | "└─────────────────────┴───────────────────┴────────────┴───────────────────┘\n"
775 | ],
776 | "text/html": [
777 | "┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓\n",
778 | "┃ Layer (type) ┃ Output Shape ┃ Param # ┃ Connected to ┃\n",
779 | "┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩\n",
780 | "│ input_layer_2 │ (None, None) │ 0 │ - │\n",
781 | "│ (InputLayer) │ │ │ │\n",
782 | "├─────────────────────┼───────────────────┼────────────┼───────────────────┤\n",
783 | "│ positional_embeddi… │ (None, None, 32) │ 323,200 │ input_layer_2[0]… │\n",
784 | "│ (PositionalEmbeddi… │ │ │ │\n",
785 | "├─────────────────────┼───────────────────┼────────────┼───────────────────┤\n",
786 | "│ transformer_decoder │ (None, None, 32) │ 21,216 │ positional_embed… │\n",
787 | "│ (TransformerDecode… │ │ │ positional_embed… │\n",
788 | "├─────────────────────┼───────────────────┼────────────┼───────────────────┤\n",
789 | "│ dense_5 (Dense) │ (None, None, │ 330,000 │ transformer_deco… │\n",
790 | "│ │ 10000) │ │ │\n",
791 | "└─────────────────────┴───────────────────┴────────────┴───────────────────┘\n",
792 | "
\n"
793 | ]
794 | },
795 | "metadata": {},
796 | "output_type": "display_data"
797 | },
798 | {
799 | "data": {
800 | "text/plain": [
801 | "\u001B[1m Total params: \u001B[0m\u001B[38;5;34m674,416\u001B[0m (2.57 MB)\n"
802 | ],
803 | "text/html": [
804 | " Total params: 674,416 (2.57 MB)\n",
805 | "
\n"
806 | ]
807 | },
808 | "metadata": {},
809 | "output_type": "display_data"
810 | },
811 | {
812 | "data": {
813 | "text/plain": [
814 | "\u001B[1m Trainable params: \u001B[0m\u001B[38;5;34m674,416\u001B[0m (2.57 MB)\n"
815 | ],
816 | "text/html": [
817 | " Trainable params: 674,416 (2.57 MB)\n",
818 | "
\n"
819 | ]
820 | },
821 | "metadata": {},
822 | "output_type": "display_data"
823 | },
824 | {
825 | "data": {
826 | "text/plain": [
827 | "\u001B[1m Non-trainable params: \u001B[0m\u001B[38;5;34m0\u001B[0m (0.00 B)\n"
828 | ],
829 | "text/html": [
830 | " Non-trainable params: 0 (0.00 B)\n",
831 | "
\n"
832 | ]
833 | },
834 | "metadata": {},
835 | "output_type": "display_data"
836 | }
837 | ],
838 | "execution_count": 14
839 | },
840 | {
841 | "metadata": {},
842 | "cell_type": "markdown",
843 | "source": "Now we can train the model.",
844 | "id": "a234182d70731a37"
845 | },
846 | {
847 | "metadata": {
848 | "ExecuteTime": {
849 | "end_time": "2024-11-27T14:51:48.432106Z",
850 | "start_time": "2024-11-27T14:35:09.876053Z"
851 | }
852 | },
853 | "cell_type": "code",
854 | "source": "model.fit(lm_dataset, epochs=10)",
855 | "id": "f65d340549ae25e4",
856 | "outputs": [
857 | {
858 | "name": "stdout",
859 | "output_type": "stream",
860 | "text": [
861 | "Epoch 1/10\n",
862 | "\u001B[1m98/98\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m95s\u001B[0m 967ms/step - loss: 5.3083\n",
863 | "Epoch 2/10\n",
864 | "\u001B[1m98/98\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m106s\u001B[0m 1s/step - loss: 5.2834\n",
865 | "Epoch 3/10\n",
866 | "\u001B[1m98/98\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m108s\u001B[0m 1s/step - loss: 5.2622\n",
867 | "Epoch 4/10\n",
868 | "\u001B[1m98/98\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m105s\u001B[0m 1s/step - loss: 5.2397\n",
869 | "Epoch 5/10\n",
870 | "\u001B[1m98/98\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m102s\u001B[0m 1s/step - loss: 5.2251\n",
871 | "Epoch 6/10\n",
872 | "\u001B[1m98/98\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m103s\u001B[0m 1s/step - loss: 5.2083\n",
873 | "Epoch 7/10\n",
874 | "\u001B[1m98/98\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m102s\u001B[0m 1s/step - loss: 5.1917\n",
875 | "Epoch 8/10\n",
876 | "\u001B[1m98/98\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m104s\u001B[0m 1s/step - loss: 5.1768\n",
877 | "Epoch 9/10\n",
878 | "\u001B[1m98/98\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m85s\u001B[0m 860ms/step - loss: 5.1650\n",
879 | "Epoch 10/10\n",
880 | "\u001B[1m98/98\u001B[0m \u001B[32m━━━━━━━━━━━━━━━━━━━━\u001B[0m\u001B[37m\u001B[0m \u001B[1m88s\u001B[0m 895ms/step - loss: 5.1521\n"
881 | ]
882 | },
883 | {
884 | "data": {
885 | "text/plain": [
886 | ""
887 | ]
888 | },
889 | "execution_count": 22,
890 | "metadata": {},
891 | "output_type": "execute_result"
892 | }
893 | ],
894 | "execution_count": 22
895 | },
896 | {
897 | "metadata": {},
898 | "cell_type": "markdown",
899 | "source": "Once the model has been trained, we can experiment with generating text from an initial seed prompt. Instead of always predicting the token with the highest probability in the predicted distribution, we scale the distribution with a parameter referred to as **temperature**: high values of temperature tend to flatten the probability distribution, which leads to somewhat more surprising choices for next tokens.",
900 | "id": "721ade4e1cf77948"
901 | },
902 | {
903 | "metadata": {
904 | "ExecuteTime": {
905 | "end_time": "2024-11-27T14:34:09.027122Z",
906 | "start_time": "2024-11-27T14:34:09.002197Z"
907 | }
908 | },
909 | "cell_type": "code",
910 | "source": [
911 | "tokens_index = dict(enumerate(text_vectorization.get_vocabulary()))\n",
912 | "\n",
913 | "def sample_next(predictions, temperature=1.0):\n",
914 | " predictions = np.asarray(predictions).astype(\"float64\")\n",
915 | " predictions = np.log(predictions) / temperature\n",
916 | " exp_preds = np.exp(predictions)\n",
917 | " predictions = exp_preds / np.sum(exp_preds)\n",
918 | " probas = np.random.multinomial(1, predictions, 1)\n",
919 | " return np.argmax(probas)"
920 | ],
921 | "id": "81505246136ec361",
922 | "outputs": [],
923 | "execution_count": 19
924 | },
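{
"metadata": {},
"cell_type": "markdown",
"source": [
"In other words, `sample_next` draws the next token from the rescaled distribution\n",
"\n",
"$$q_i = \\frac{p_i^{1/T}}{\\sum_j p_j^{1/T}},$$\n",
"\n",
"so $T = 1$ recovers the model's distribution, $T \\to 0$ approaches greedy argmax decoding, and $T > 1$ flattens the distribution. For example, probabilities $(0.7, 0.2, 0.1)$ become roughly $(0.91, 0.07, 0.02)$ at $T = 0.5$ and $(0.52, 0.28, 0.20)$ at $T = 2$."
],
"id": "4f7b28c60d9e1a53"
},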
925 | {
926 | "metadata": {},
927 | "cell_type": "markdown",
928 | "source": "Finally, we can define an initial prompt, and check out the quality of generated text. Unfortunately, due to the short training time, the results are fairly disappointing. ",
929 | "id": "3ced59f70f9d0ac1"
930 | },
931 | {
932 | "metadata": {
933 | "ExecuteTime": {
934 | "end_time": "2024-11-27T14:52:17.333110Z",
935 | "start_time": "2024-11-27T14:52:15.678818Z"
936 | }
937 | },
938 | "cell_type": "code",
939 | "source": [
940 | "temperature = 0.7\n",
941 | "\n",
942 | "sentence = \"in my view\"\n",
943 | "generate_length = 50\n",
944 | "for i in range(generate_length):\n",
945 | " tokenized_sentence = text_vectorization([sentence])\n",
946 | " predictions = model(tokenized_sentence)\n",
947 | " next_token = sample_next(predictions[0, i, :])\n",
948 | " sampled_token = tokens_index[next_token]\n",
949 | " sentence += \" \" + sampled_token\n",
950 | "print(sentence)"
951 | ],
952 | "id": "dab069e1cca9bc04",
953 | "outputs": [
954 | {
955 | "name": "stdout",
956 | "output_type": "stream",
957 | "text": [
958 | "in my view all foreign of the movies the film have sex or few hunting the years the greatest not buffs live here or in this [UNK] the film that way is the included land stadium the surf which film is all most a the soup deranged and of [UNK] any an the\n"
959 | ]
960 | }
961 | ],
962 | "execution_count": 24
963 |   }
977 | ],
978 | "metadata": {
979 | "kernelspec": {
980 | "display_name": "Python 3",
981 | "language": "python",
982 | "name": "python3"
983 | },
984 | "language_info": {
985 | "codemirror_mode": {
986 | "name": "ipython",
987 | "version": 2
988 | },
989 | "file_extension": ".py",
990 | "mimetype": "text/x-python",
991 | "name": "python",
992 | "nbconvert_exporter": "python",
993 | "pygments_lexer": "ipython2",
994 | "version": "2.7.6"
995 | }
996 | },
997 | "nbformat": 4,
998 | "nbformat_minor": 5
999 | }
1000 |
--------------------------------------------------------------------------------