├── logo.png
├── Slides
│   └── RNN.pdf
├── images
│   ├── image_1.png
│   ├── table.png
│   ├── data_set.png
│   ├── attn_model.png
│   ├── cosine_sim.png
│   ├── embedding1.png
│   ├── emojifier-v2.png
│   ├── attn_mechanism.png
│   ├── date_attention.png
│   ├── date_attention2.png
│   └── poorly_trained_model.png
├── TimeDistributed.ipynb
├── README.md
├── nmt_utils.py
├── 05-1-video-action-recognition-train-extract-features-with-cnn.ipynb
├── 06_analogy-using-embeddings.ipynb
├── 09_add-numbers-with-seq2seq.ipynb
├── 02_1_simple-RNN-diffrent-sequence-length.ipynb
├── 08_shahnameh-text-generation-language-model.ipynb
└── 07_text-classification-Emojify.ipynb

/logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Alireza-Akhavan/rnn-notebooks/master/logo.png -------------------------------------------------------------------------------- /Slides/RNN.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Alireza-Akhavan/rnn-notebooks/master/Slides/RNN.pdf -------------------------------------------------------------------------------- /images/image_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Alireza-Akhavan/rnn-notebooks/master/images/image_1.png -------------------------------------------------------------------------------- /images/table.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Alireza-Akhavan/rnn-notebooks/master/images/table.png -------------------------------------------------------------------------------- /images/data_set.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Alireza-Akhavan/rnn-notebooks/master/images/data_set.png -------------------------------------------------------------------------------- /images/attn_model.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Alireza-Akhavan/rnn-notebooks/master/images/attn_model.png -------------------------------------------------------------------------------- /images/cosine_sim.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Alireza-Akhavan/rnn-notebooks/master/images/cosine_sim.png -------------------------------------------------------------------------------- /images/embedding1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Alireza-Akhavan/rnn-notebooks/master/images/embedding1.png -------------------------------------------------------------------------------- /images/emojifier-v2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Alireza-Akhavan/rnn-notebooks/master/images/emojifier-v2.png -------------------------------------------------------------------------------- /images/attn_mechanism.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Alireza-Akhavan/rnn-notebooks/master/images/attn_mechanism.png -------------------------------------------------------------------------------- /images/date_attention.png: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/Alireza-Akhavan/rnn-notebooks/master/images/date_attention.png -------------------------------------------------------------------------------- /images/date_attention2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Alireza-Akhavan/rnn-notebooks/master/images/date_attention2.png -------------------------------------------------------------------------------- /images/poorly_trained_model.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Alireza-Akhavan/rnn-notebooks/master/images/poorly_trained_model.png -------------------------------------------------------------------------------- /TimeDistributed.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [ 8 | { 9 | "name": "stderr", 10 | "output_type": "stream", 11 | "text": [ 12 | "Using TensorFlow backend.\n" 13 | ] 14 | }, 15 | { 16 | "data": { 17 | "text/plain": [ 18 | "'2.0.0'" 19 | ] 20 | }, 21 | "execution_count": 1, 22 | "metadata": {}, 23 | "output_type": "execute_result" 24 | } 25 | ], 26 | "source": [ 27 | "import tensorflow as tf\n", 28 | "from keras.models import Sequential \n", 29 | "from keras.layers import Dense \n", 30 | "from keras.layers import TimeDistributed\n", 31 | "tf.__version__" 32 | ] 33 | }, 34 | { 35 | "cell_type": "code", 36 | "execution_count": 2, 37 | "metadata": {}, 38 | "outputs": [ 39 | { 40 | "name": "stdout", 41 | "output_type": "stream", 42 | "text": [ 43 | "Model: \"sequential_1\"\n", 44 | "_________________________________________________________________\n", 45 | "Layer (type) Output Shape Param # \n", 46 | "=================================================================\n", 47 | "time_distributed_1 (TimeDist (None, 10, 8) 48 \n", 48 | "=================================================================\n", 49 | "Total params: 48\n", 50 | "Trainable params: 48\n", 51 | "Non-trainable params: 0\n", 52 | "_________________________________________________________________\n" 53 | ] 54 | } 55 | ], 56 | "source": [ 57 | "model = Sequential()\n", 58 | "model.add(TimeDistributed(Dense(8), input_shape=(10, 5)))\n", 59 | "model.summary()" 60 | ] 61 | }, 62 | { 63 | "cell_type": "code", 64 | "execution_count": 3, 65 | "metadata": {}, 66 | "outputs": [ 67 | { 68 | "name": "stdout", 69 | "output_type": "stream", 70 | "text": [ 71 | "Model: \"sequential_2\"\n", 72 | "_________________________________________________________________\n", 73 | "Layer (type) Output Shape Param # \n", 74 | "=================================================================\n", 75 | "dense_2 (Dense) (None, 10, 8) 136 \n", 76 | "=================================================================\n", 77 | "Total params: 136\n", 78 | "Trainable params: 136\n", 79 | "Non-trainable params: 0\n", 80 | "_________________________________________________________________\n" 81 | ] 82 | } 83 | ], 84 | "source": [ 85 | "model = Sequential()\n", 86 | "model.add(Dense(8, input_shape=(10, 16)))\n", 87 | "model.summary()" 88 | ] 89 | }, 90 | { 91 | "cell_type": "code", 92 | "execution_count": null, 93 | "metadata": {}, 94 | "outputs": [], 95 | "source": [] 96 | } 97 | ], 98 | "metadata": { 99 | "kernelspec": { 100 | "display_name": "tf2-GPU", 101 | "language": "python", 102 | "name": "tf2" 103 | }, 104 | "language_info": { 105 | 
"codemirror_mode": { 106 | "name": "ipython", 107 | "version": 3 108 | }, 109 | "file_extension": ".py", 110 | "mimetype": "text/x-python", 111 | "name": "python", 112 | "nbconvert_exporter": "python", 113 | "pygments_lexer": "ipython3", 114 | "version": "3.6.9" 115 | } 116 | }, 117 | "nbformat": 4, 118 | "nbformat_minor": 2 119 | } 120 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # rnn-notebooks 2 | RNN(SimpleRNN, LSTM, GRU) Tensorflow2.0 & Keras Notebooks (Workshop materials) 3 | 4 | class.vision 5 | 6 | [class.vision](http://Class.vision) 7 | 8 | # Slides 9 | 10 | [RNN.pdf](./Slides/RNN.pdf) 11 | 12 | # Video 13 | Some parts are freely available from our [Aparat channel](https://www.aparat.com/v/qD1Mi?playlist=287685) or 14 | you can purchase a full package including 32 videos in Persian from [class.vision](http://class.vision/deeplearning2/) 15 | 16 | # Notebooks 17 | 18 | ## Intro to RNN: 19 | [01_simple-RNN.ipynb](https://nbviewer.jupyter.org/github/Alireza-Akhavan/rnn-notebooks/blob/master/01_simple-RNN.ipynb) 20 | 21 | ## How we can inference with diffrent sequence length?! 22 | [02_1_simple-RNN-diffrent-sequence-length.ipynb](https://nbviewer.jupyter.org/github/Alireza-Akhavan/rnn-notebooks/blob/master/02_1_simple-RNN-diffrent-sequence-length.ipynb) 23 | 24 | [02_2_simple-RNN-diffrent-sequence-length.ipynb](https://nbviewer.jupyter.org/github/Alireza-Akhavan/rnn-notebooks/blob/master/02_2_simple-RNN-diffrent-sequence-length.ipynb) 25 | 26 | ## Cryptocurrency predicting 27 | - when we use return_sequences=True ? 28 | - Stacked RNN (Deep RNN) 29 | - using a LSTM layer 30 | 31 | [03_1_Cryptocurrency-predicting.ipynb](https://nbviewer.jupyter.org/github/Alireza-Akhavan/rnn-notebooks/blob/master/03_1_Cryptocurrency-predicting.ipynb) 32 | 33 | [03_2_Cryptocurrency-predicting.ipynb](https://nbviewer.jupyter.org/github/Alireza-Akhavan/rnn-notebooks/blob/master/03_2_Cryptocurrency-predicting.ipynb) 34 | 35 | ## CNN + LSTM for Ball movement classification 36 | - what is TimeDistributed layer in Keras? 37 | - Introduction to video classification 38 | - CNN + LSTM 39 | 40 | [04_simple-CNN-LSTM.ipynb](https://nbviewer.jupyter.org/github/Alireza-Akhavan/rnn-notebooks/blob/master/04_simple-CNN-LSTM.ipynb) 41 | 42 | ## Action Recognition with pre-trained CNN and LSTM 43 | 44 | - How using pre-trained CNN as a feature extracture for RNN 45 | - using GRU layer 46 | 47 | [05-1-video-action-recognition-train-extract-features-with-cnn](https://nbviewer.jupyter.org/github/Alireza-Akhavan/rnn-notebooks/blob/master/05-1-video-action-recognition-train-extract-features-with-cnn.ipynb) 48 | 49 | [05-2_video-action-recognition-train-rnn.ipynb](https://nbviewer.jupyter.org/github/Alireza-Akhavan/rnn-notebooks/blob/master/05-2_video-action-recognition-train-rnn.ipynb) 50 | 51 | 52 | ## Word Embedding and Analogy 53 | 54 | - Using Glove 55 | - Cosine Similarity 56 | - Analogy 57 | 58 | [06_analogy-using-embeddings.ipynb](https://nbviewer.jupyter.org/github/Alireza-Akhavan/rnn-notebooks/blob/master/06_analogy-using-embeddings.ipynb) 59 | 60 | ## Text Classification 61 | 62 | - What is Bag of Embeddings? 
63 | - Using the Embedding layer in Keras 64 | - Initializing the Embedding layer with pre-trained embeddings 65 | - Using RNNs for NLP tasks 66 | 67 | [07_text-classification-Emojify.ipynb](https://nbviewer.jupyter.org/github/Alireza-Akhavan/rnn-notebooks/blob/master/07_text-classification-Emojify.ipynb) 68 | 69 | ## Language Model and Text Generation (on Persian poetry, the Shahnameh) 70 | 71 | - What is tf.data.Dataset? 72 | - Stateful vs. stateless RNNs 73 | - When do we need batch_input_shape? 74 | 75 | [08_shahnameh-text-generation-language-model.ipynb](https://nbviewer.jupyter.org/github/Alireza-Akhavan/rnn-notebooks/blob/master/08_shahnameh-text-generation-language-model.ipynb) 76 | 77 | # Seq2Seq networks (Encoder-Decoder) 78 | 79 | ## Understanding mathematical strings with seq2seq 80 | 81 | - Using RepeatVector to connect the encoder to the decoder 82 | - Using the encoder hidden state as input to the decoder 83 | 84 | [09_add-numbers-with-seq2seq.ipynb](https://nbviewer.jupyter.org/github/Alireza-Akhavan/rnn-notebooks/blob/master/09_add-numbers-with-seq2seq.ipynb) 85 | 86 | ## NMT (Neural Machine Translation) with Attention in Keras 87 | 88 | [10_Neural-machine-translation-with-attention-for-date-convert.ipynb](https://nbviewer.jupyter.org/github/Alireza-Akhavan/rnn-notebooks/blob/master/10_Neural-machine-translation-with-attention-for-date-convert.ipynb) 89 | 90 | ## NMT with Attention and teacher forcing in TF 2.0 91 | 92 | - Teacher forcing 93 | - Masked loss for zero padding 94 | - Using model subclassing 95 | 96 | [11_nmt-with-attention.ipynb](https://nbviewer.jupyter.org/github/Alireza-Akhavan/rnn-notebooks/blob/master/11_nmt-with-attention.ipynb) 97 | 98 | ## Image Captioning with Attention 99 | [12_image-captioning-with-attention.ipynb](https://nbviewer.jupyter.org/github/Alireza-Akhavan/rnn-notebooks/blob/master/12_image-captioning-with-attention.ipynb) 100 | -------------------------------------------------------------------------------- /nmt_utils.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from faker import Faker 3 | import random 4 | from tqdm import tqdm 5 | from babel.dates import format_date 6 | from keras.utils import to_categorical 7 | import keras.backend as K 8 | import matplotlib.pyplot as plt 9 | 10 | fake = Faker() 11 | fake.seed(12345) 12 | random.seed(12345) 13 | 14 | # Define the formats of the data we would like to generate 15 | FORMATS = ['short', 16 | 'medium', 17 | 'long', 18 | 'full', 19 | 'full', 20 | 'full', 21 | 'full', 22 | 'full', 23 | 'full', 24 | 'full', 25 | 'full', 26 | 'full', 27 | 'full', 28 | 'd MMM YYY', 29 | 'd MMMM YYY', 30 | 'dd MMM YYY', 31 | 'd MMM, YYY', 32 | 'd MMMM, YYY', 33 | 'dd, MMM YYY', 34 | 'd MM YY', 35 | 'd MMMM YYY', 36 | 'MMMM d YYY', 37 | 'MMMM d, YYY', 38 | 'dd.MM.YY'] 39 | 40 | # change this if you want it to work with another language 41 | LOCALES = ['en_US'] 42 | 43 | def load_date(): 44 | """ 45 | Loads some fake dates 46 | :returns: tuple containing human readable string, machine readable string, and date object 47 | """ 48 | dt = fake.date_object() 49 | 50 | try: 51 | human_readable = format_date(dt, format=random.choice(FORMATS), locale='en_US') # locale=random.choice(LOCALES)) 52 | human_readable = human_readable.lower() 53 | human_readable = human_readable.replace(',','') 54 | machine_readable = dt.isoformat() 55 | 56 | except AttributeError as e: 57 | return None, None, None 58 | 59 | return human_readable, machine_readable, dt 60 | 61 | def load_dataset(m): 62 | """ 63 | Loads a dataset 
with m examples and vocabularies 64 | :m: the number of examples to generate 65 | """ 66 | 67 | human_vocab = set() 68 | machine_vocab = set() 69 | dataset = [] 70 | Tx = 30 71 | 72 | 73 | for i in tqdm(range(m)): 74 | h, m, _ = load_date() 75 | if h is not None: 76 | dataset.append((h, m)) 77 | human_vocab.update(tuple(h)) 78 | machine_vocab.update(tuple(m)) 79 | 80 | human = dict(zip(sorted(human_vocab) + ['', ''], 81 | list(range(len(human_vocab) + 2)))) 82 | inv_machine = dict(enumerate(sorted(machine_vocab))) 83 | machine = {v:k for k,v in inv_machine.items()} 84 | 85 | return dataset, human, machine, inv_machine 86 | 87 | def preprocess_data(dataset, human_vocab, machine_vocab, Tx, Ty): 88 | 89 | X, Y = zip(*dataset) 90 | 91 | X = np.array([string_to_int(i, Tx, human_vocab) for i in X]) 92 | Y = [string_to_int(t, Ty, machine_vocab) for t in Y] 93 | 94 | Xoh = np.array(list(map(lambda x: to_categorical(x, num_classes=len(human_vocab)), X))) 95 | Yoh = np.array(list(map(lambda x: to_categorical(x, num_classes=len(machine_vocab)), Y))) 96 | 97 | return X, np.array(Y), Xoh, Yoh 98 | 99 | def string_to_int(string, length, vocab): 100 | """ 101 | Converts all strings in the vocabulary into a list of integers representing the positions of the 102 | input string's characters in the "vocab" 103 | 104 | Arguments: 105 | string -- input string, e.g. 'Wed 10 Jul 2007' 106 | length -- the number of time steps you'd like, determines if the output will be padded or cut 107 | vocab -- vocabulary, dictionary used to index every character of your "string" 108 | 109 | Returns: 110 | rep -- list of integers (or '') (size = length) representing the position of the string's character in the vocabulary 111 | """ 112 | 113 | #make lower to standardize 114 | string = string.lower() 115 | string = string.replace(',','') 116 | 117 | if len(string) > length: 118 | string = string[:length] 119 | 120 | rep = list(map(lambda x: vocab.get(x, ''), string)) 121 | 122 | if len(string) < length: 123 | rep += [vocab['']] * (length - len(string)) 124 | 125 | #print (rep) 126 | return rep 127 | 128 | 129 | def int_to_string(ints, inv_vocab): 130 | """ 131 | Output a machine readable list of characters based on a list of indexes in the machine's vocabulary 132 | 133 | Arguments: 134 | ints -- list of integers representing indexes in the machine's vocabulary 135 | inv_vocab -- dictionary mapping machine readable indexes to machine readable characters 136 | 137 | Returns: 138 | l -- list of characters corresponding to the indexes of ints thanks to the inv_vocab mapping 139 | """ 140 | 141 | l = [inv_vocab[i] for i in ints] 142 | return l 143 | 144 | 145 | EXAMPLES = ['3 May 1979', '5 Apr 09', '20th February 2016', 'Wed 10 Jul 2007'] 146 | 147 | def run_example(model, input_vocabulary, inv_output_vocabulary, text): 148 | encoded = string_to_int(text, TIME_STEPS, input_vocabulary) 149 | prediction = model.predict(np.array([encoded])) 150 | prediction = np.argmax(prediction[0], axis=-1) 151 | return int_to_string(prediction, inv_output_vocabulary) 152 | 153 | def run_examples(model, input_vocabulary, inv_output_vocabulary, examples=EXAMPLES): 154 | predicted = [] 155 | for example in examples: 156 | predicted.append(''.join(run_example(model, input_vocabulary, inv_output_vocabulary, example))) 157 | print('input:', example) 158 | print('output:', predicted[-1]) 159 | return predicted 160 | 161 | 162 | def softmax(x, axis=1): 163 | """Softmax activation function. 164 | # Arguments 165 | x : Tensor. 
166 | axis: Integer, axis along which the softmax normalization is applied. 167 | # Returns 168 | Tensor, output of softmax transformation. 169 | # Raises 170 | ValueError: In case `dim(x) == 1`. 171 | """ 172 | ndim = K.ndim(x) 173 | if ndim == 2: 174 | return K.softmax(x) 175 | elif ndim > 2: 176 | e = K.exp(x - K.max(x, axis=axis, keepdims=True)) 177 | s = K.sum(e, axis=axis, keepdims=True) 178 | return e / s 179 | else: 180 | raise ValueError('Cannot apply softmax to a tensor that is 1D') 181 | 182 | 183 | def plot_attention_map(model, input_vocabulary, inv_output_vocabulary, text, n_s = 128, num = 6, Tx = 30, Ty = 10): 184 | """ 185 | Plot the attention map. 186 | 187 | """ 188 | attention_map = np.zeros((10, 30)) 189 | Ty, Tx = attention_map.shape 190 | 191 | s0 = np.zeros((1, n_s)) 192 | c0 = np.zeros((1, n_s)) 193 | layer = model.layers[num] 194 | 195 | encoded = np.array(string_to_int(text, Tx, input_vocabulary)).reshape((1, 30)) 196 | encoded = np.array(list(map(lambda x: to_categorical(x, num_classes=len(input_vocabulary)), encoded))) 197 | 198 | f = K.function(model.inputs, [layer.get_output_at(t) for t in range(Ty)]) 199 | r = f([encoded, s0, c0]) 200 | 201 | for t in range(Ty): 202 | for t_prime in range(Tx): 203 | attention_map[t][t_prime] = r[t][0,t_prime,0] 204 | 205 | # Normalize attention map 206 | # row_max = attention_map.max(axis=1) 207 | # attention_map = attention_map / row_max[:, None] 208 | 209 | prediction = model.predict([encoded, s0, c0]) 210 | 211 | predicted_text = [] 212 | for i in range(len(prediction)): 213 | predicted_text.append(int(np.argmax(prediction[i], axis=1))) 214 | 215 | predicted_text = list(predicted_text) 216 | predicted_text = int_to_string(predicted_text, inv_output_vocabulary) 217 | text_ = list(text) 218 | 219 | # get the lengths of the string 220 | input_length = len(text) 221 | output_length = Ty 222 | 223 | # Plot the attention_map 224 | plt.clf() 225 | f = plt.figure(figsize=(8, 8.5)) 226 | ax = f.add_subplot(1, 1, 1) 227 | 228 | # add image 229 | i = ax.imshow(attention_map, interpolation='nearest', cmap='Blues') 230 | 231 | # add colorbar 232 | cbaxes = f.add_axes([0.2, 0, 0.6, 0.03]) 233 | cbar = f.colorbar(i, cax=cbaxes, orientation='horizontal') 234 | cbar.ax.set_xlabel('Alpha value (Probability output of the "softmax")', labelpad=2) 235 | 236 | # add labels 237 | ax.set_yticks(range(output_length)) 238 | ax.set_yticklabels(predicted_text[:output_length]) 239 | 240 | ax.set_xticks(range(input_length)) 241 | ax.set_xticklabels(text_[:input_length], rotation=45) 242 | 243 | ax.set_xlabel('Input Sequence') 244 | ax.set_ylabel('Output Sequence') 245 | 246 | # add grid and legend 247 | ax.grid() 248 | 249 | #f.show() 250 | 251 | return attention_map -------------------------------------------------------------------------------- /05-1-video-action-recognition-train-extract-features-with-cnn.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "
In the Name of God
\n", 8 | "\"class.vision\"\n", 9 | "

Video Classification with Recurrent Networks - Feature Extraction

" 10 | ] 11 | }, 12 | { 13 | "cell_type": "markdown", 14 | "metadata": {}, 15 | "source": [ 16 | "##
Dataset
\n", 17 | "\n", 18 | "\n", 19 | "
\n", 20 | "قبلا 6 کلاس از دیتاست UCF-101 را به عنوان نمونه انتخاب و فریم‌های ویدیوهای متعلق به این 6 کلاس از این مجموعه داده را استخراج کرده ایم و اطلاعات هر ویدیو نظیر اسم - کلاس و تعداد فریم را در یک فایل متنی قرار داده ایم.\n", 21 | "
\n", 22 | " \n", 23 | "این 6 کلاس که برای این آموزش آماده شده است را از اینجا دانلود کنید: \n", 24 | "
\n", 25 | "\n", 26 | "http://dataset.class.vision/rnn/RNN-Video-6action.zip\n", 27 | "\n", 28 | "
\n", 29 | "
\n", 30 | " همچنین\n", 31 | " دیتاست اصلی شامل 101 کلاس مختلف را می‌توانید از لینک زیر دانلود کنید:\n", 32 | "
\n", 33 | "\n", 34 | "UCF-101\n", 35 | "[https://www.crcv.ucf.edu/data/UCF101.php](https://www.crcv.ucf.edu/data/UCF101.php)\n", 36 | "\n", 37 | "\n", 38 | "\n", 39 | "\n" 40 | ] 41 | }, 42 | { 43 | "cell_type": "code", 44 | "execution_count": 1, 45 | "metadata": {}, 46 | "outputs": [ 47 | { 48 | "name": "stderr", 49 | "output_type": "stream", 50 | "text": [ 51 | "Using TensorFlow backend.\n" 52 | ] 53 | } 54 | ], 55 | "source": [ 56 | "from keras.preprocessing import image\n", 57 | "from keras.applications.inception_v3 import InceptionV3, preprocess_input\n", 58 | "from keras.models import Model, load_model\n", 59 | "from keras.layers import Input\n", 60 | "import numpy as np\n", 61 | "import os.path\n", 62 | "from tqdm import tqdm\n", 63 | "import csv\n", 64 | "import random\n", 65 | "import glob\n", 66 | "import os.path\n", 67 | "import sys\n", 68 | "import operator\n", 69 | "import threading\n", 70 | "from keras.utils import to_categorical\n", 71 | "from keras.preprocessing.image import img_to_array, load_img" 72 | ] 73 | }, 74 | { 75 | "cell_type": "code", 76 | "execution_count": 2, 77 | "metadata": {}, 78 | "outputs": [], 79 | "source": [ 80 | "seq_length= 40\n", 81 | "max_frames = 300\n", 82 | "image_shape=(224, 224, 3)\n", 83 | "base_path = \"D:/dataset/RNN-Video\"\n" 84 | ] 85 | }, 86 | { 87 | "cell_type": "code", 88 | "execution_count": 3, 89 | "metadata": {}, 90 | "outputs": [], 91 | "source": [ 92 | "with open(os.path.join('D:/dataset/RNN-Video/data_file_5class.csv'), 'r') as fin:\n", 93 | " reader = csv.reader(fin)\n", 94 | " data = list(reader)" 95 | ] 96 | }, 97 | { 98 | "cell_type": "code", 99 | "execution_count": 4, 100 | "metadata": {}, 101 | "outputs": [ 102 | { 103 | "data": { 104 | "text/plain": [ 105 | "['CricketBowling',\n", 106 | " 'CricketShot',\n", 107 | " 'FieldHockeyPenalty',\n", 108 | " 'HandstandPushups',\n", 109 | " 'HandstandWalking',\n", 110 | " 'SoccerPenalty']" 111 | ] 112 | }, 113 | "execution_count": 4, 114 | "metadata": {}, 115 | "output_type": "execute_result" 116 | } 117 | ], 118 | "source": [ 119 | "train_path = os.path.join(base_path, 'train')\n", 120 | "classes =os.listdir(train_path)\n", 121 | "classes = sorted(classes)\n", 122 | "classes" 123 | ] 124 | }, 125 | { 126 | "cell_type": "markdown", 127 | "metadata": {}, 128 | "source": [ 129 | "
\n", 130 | " در اینجا آن ویدیوهایی که حداقل 40 فریم و حداکثر 300 فریم دارند را لود می‌کنیم.\n", 131 | "
" 132 | ] 133 | }, 134 | { 135 | "cell_type": "code", 136 | "execution_count": 6, 137 | "metadata": {}, 138 | "outputs": [], 139 | "source": [ 140 | "data_clean = []\n", 141 | "for item in data:\n", 142 | " if int(item[3]) >= seq_length and int(item[3]) <= max_frames:\n", 143 | " data_clean.append(item)" 144 | ] 145 | }, 146 | { 147 | "cell_type": "code", 148 | "execution_count": 7, 149 | "metadata": {}, 150 | "outputs": [ 151 | { 152 | "data": { 153 | "text/plain": [ 154 | "439" 155 | ] 156 | }, 157 | "execution_count": 7, 158 | "metadata": {}, 159 | "output_type": "execute_result" 160 | } 161 | ], 162 | "source": [ 163 | "len(data_clean)" 164 | ] 165 | }, 166 | { 167 | "cell_type": "code", 168 | "execution_count": 8, 169 | "metadata": {}, 170 | "outputs": [], 171 | "source": [ 172 | "def get_n_sample_from_video(sample, seq_length):\n", 173 | " path = os.path.join(base_path, sample[0], sample[1])\n", 174 | " filename = sample[2]\n", 175 | " images = sorted(glob.glob(os.path.join(path, filename + '*jpg')))\n", 176 | "\n", 177 | " #Given a list and a size, return a rescaled/samples list. For example,\n", 178 | " #if we want a list of size 5 and we have a list of size 25, return a new\n", 179 | " #list of size five which is every 5th element of the origina list.\n", 180 | " # Get the number to skip between iterations.\n", 181 | " skip = len(images) // seq_length\n", 182 | "\n", 183 | " # Build our new output.\n", 184 | " output = [images[i] for i in range(0, len(images), skip)]\n", 185 | "\n", 186 | " # Cut off the last one if needed.\n", 187 | " return output[:seq_length]" 188 | ] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "execution_count": 9, 193 | "metadata": {}, 194 | "outputs": [ 195 | { 196 | "data": { 197 | "text/plain": [ 198 | "['train', 'HandstandWalking', 'v_HandstandWalking_g24_c06', '151']" 199 | ] 200 | }, 201 | "execution_count": 9, 202 | "metadata": {}, 203 | "output_type": "execute_result" 204 | } 205 | ], 206 | "source": [ 207 | "data_clean[3]" 208 | ] 209 | }, 210 | { 211 | "cell_type": "code", 212 | "execution_count": 10, 213 | "metadata": {}, 214 | "outputs": [ 215 | { 216 | "data": { 217 | "text/plain": [ 218 | "40" 219 | ] 220 | }, 221 | "execution_count": 10, 222 | "metadata": {}, 223 | "output_type": "execute_result" 224 | } 225 | ], 226 | "source": [ 227 | "len(get_n_sample_from_video(data_clean[3], 40))" 228 | ] 229 | }, 230 | { 231 | "cell_type": "code", 232 | "execution_count": 11, 233 | "metadata": {}, 234 | "outputs": [], 235 | "source": [ 236 | "# Get model with pretrained weights.\n", 237 | "base_model = InceptionV3(weights='imagenet', include_top=True)\n", 238 | "\n", 239 | "# We'll extract features at the final pool layer.\n", 240 | "model = Model(inputs=base_model.input,\n", 241 | " outputs=base_model.get_layer('avg_pool').output)\n", 242 | "\n", 243 | "def model_predict(image_path):\n", 244 | " img = image.load_img(image_path, target_size=(299, 299))\n", 245 | " x = image.img_to_array(img)\n", 246 | " x = np.expand_dims(x, axis=0)\n", 247 | " x = preprocess_input(x)\n", 248 | "\n", 249 | " # Get the prediction.\n", 250 | " features = model.predict(x)\n", 251 | " return features[0]" 252 | ] 253 | }, 254 | { 255 | "cell_type": "code", 256 | "execution_count": 12, 257 | "metadata": {}, 258 | "outputs": [ 259 | { 260 | "name": "stderr", 261 | "output_type": "stream", 262 | "text": [ 263 | "100%|████████████████████████████████████████████████████████████████████████████████| 439/439 [18:31<00:00, 2.85s/it]\n" 264 | ] 265 | } 266 | ], 267 | "source": [ 
268 | "os.makedirs('sequences', exist_ok=True)\n", 269 | "for video in tqdm(data_clean):\n", 270 | "\n", 271 | " # Get the path to the sequence for this video.\n", 272 | " path = os.path.join('sequences', video[2] + '-' + str(seq_length) + \\\n", 273 | " '-features') # numpy will auto-append .npy\n", 274 | "\n", 275 | " # Check if we already have it.\n", 276 | " if os.path.isfile(path + '.npy'):\n", 277 | " continue\n", 278 | "\n", 279 | " # Get the frames for this video.\n", 280 | " frames = get_n_sample_from_video(video, seq_length)\n", 281 | "\n", 282 | " # Now loop through and extract features to build the sequence.\n", 283 | " sequence = []\n", 284 | " for frame in frames:\n", 285 | " features = model_predict(frame)\n", 286 | " sequence.append(features)\n", 287 | "\n", 288 | " # Save the sequence.\n", 289 | " np.save(path, sequence)" 290 | ] 291 | }, 292 | { 293 | "cell_type": "markdown", 294 | "metadata": {}, 295 | "source": [ 296 | "
\n", 297 | "
Advanced Deep Learning Course
Alireza Akhavanpour
Aban-Azar 1399 (fall 2020)
\n", 298 | "
\n", 299 | "Class.Vision - AkhavanPour.ir - GitHub\n", 300 | "\n", 301 | "
" 302 | ] 303 | } 304 | ], 305 | "metadata": { 306 | "kernelspec": { 307 | "display_name": "tensorflow", 308 | "language": "python", 309 | "name": "tensorflow" 310 | }, 311 | "language_info": { 312 | "codemirror_mode": { 313 | "name": "ipython", 314 | "version": 3 315 | }, 316 | "file_extension": ".py", 317 | "mimetype": "text/x-python", 318 | "name": "python", 319 | "nbconvert_exporter": "python", 320 | "pygments_lexer": "ipython3", 321 | "version": "3.6.8" 322 | } 323 | }, 324 | "nbformat": 4, 325 | "nbformat_minor": 2 326 | } 327 | -------------------------------------------------------------------------------- /06_analogy-using-embeddings.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "
In the Name of God
\n", 8 | "\"class.vision\"\n", 9 | "

Word Analogies

" 10 | ] 11 | }, 12 | { 13 | "cell_type": "markdown", 14 | "metadata": {}, 15 | "source": [ 16 | "
\n", 17 | "کدها با تغییرات برگرفته از کورس Sequence Models پروفسور Andrew NG است.\n", 18 | "
\n", 19 | "\n", 20 | "[https://www.coursera.org/learn/nlp-sequence-models](https://www.coursera.org/learn/nlp-sequence-models)\n", 21 | "\n", 22 | "
\n", 23 | "بردار از قبل آموزش داده شده را می‌توانید از اینجا دانلود کنید:
\n", 24 | "\n", 25 | "https://nlp.stanford.edu/projects/glove/\n", 26 | "\n", 27 | "http://nlp.stanford.edu/data/glove.6B.zip\n" 28 | ] 29 | }, 30 | { 31 | "cell_type": "code", 32 | "execution_count": 1, 33 | "metadata": {}, 34 | "outputs": [], 35 | "source": [ 36 | "import numpy as np\n", 37 | "import os" 38 | ] 39 | }, 40 | { 41 | "cell_type": "code", 42 | "execution_count": 2, 43 | "metadata": {}, 44 | "outputs": [], 45 | "source": [ 46 | "glove_dir = 'D:/dataset/glove.6B'\n", 47 | "\n", 48 | "embeddings_index = {}\n", 49 | "f = open(os.path.join(glove_dir, 'glove.6B.100d.txt'), encoding=\"utf8\")\n", 50 | "for line in f:\n", 51 | " values = line.split()\n", 52 | " word = values[0]\n", 53 | " coefs = np.asarray(values[1:], dtype='float32')\n", 54 | " embeddings_index[word] = coefs\n", 55 | "f.close()\n" 56 | ] 57 | }, 58 | { 59 | "cell_type": "markdown", 60 | "metadata": {}, 61 | "source": [ 62 | "# 1 - Cosine similarity\n", 63 | "\n", 64 | "To measure how similar two words are, we need a way to measure the degree of similarity between two embedding vectors for the two words. Given two vectors $u$ and $v$, cosine similarity is defined as follows: \n", 65 | "\n", 66 | "$$\\text{CosineSimilarity(u, v)} = \\frac {u . v} {||u||_2 ||v||_2} = cos(\\theta) \\tag{1}$$\n", 67 | "\n", 68 | "where $u.v$ is the dot product (or inner product) of two vectors, $||u||_2$ is the norm (or length) of the vector $u$, and $\\theta$ is the angle between $u$ and $v$. This similarity depends on the angle between $u$ and $v$. If $u$ and $v$ are very similar, their cosine similarity will be close to 1; if they are dissimilar, the cosine similarity will take a smaller value. \n", 69 | "\n", 70 | "\n" 71 | ] 72 | }, 73 | { 74 | "cell_type": "code", 75 | "execution_count": 3, 76 | "metadata": {}, 77 | "outputs": [], 78 | "source": [ 79 | "from sklearn.metrics.pairwise import cosine_similarity\n", 80 | "\n", 81 | "def similarity(u, v):\n", 82 | " return np.squeeze(cosine_similarity(u.reshape(1, -1), v.reshape(1, -1)))\n" 83 | ] 84 | }, 85 | { 86 | "cell_type": "code", 87 | "execution_count": 6, 88 | "metadata": {}, 89 | "outputs": [ 90 | { 91 | "name": "stdout", 92 | "output_type": "stream", 93 | "text": [ 94 | "cosine_similarity(father, mother) = 0.86566603\n", 95 | "cosine_similarity(ball, crocodile) = 0.15206575\n", 96 | "cosine_similarity(france - paris, tehran - iran) = -0.6854124\n" 97 | ] 98 | } 99 | ], 100 | "source": [ 101 | "father = embeddings_index[\"father\"]\n", 102 | "mother = embeddings_index[\"mother\"]\n", 103 | "ball = embeddings_index[\"ball\"]\n", 104 | "crocodile = embeddings_index[\"crocodile\"]\n", 105 | "france = embeddings_index[\"france\"]\n", 106 | "tehran = embeddings_index[\"tehran\"]\n", 107 | "paris = embeddings_index[\"paris\"]\n", 108 | "iran = embeddings_index[\"iran\"]\n", 109 | "print(\"cosine_similarity(father, mother) = \", similarity(father, mother))\n", 110 | "print(\"cosine_similarity(ball, crocodile) = \",similarity(ball, crocodile))\n", 111 | "print(\"cosine_similarity(france - paris, tehran - iran) = \",similarity(france - paris, tehran - iran))" 112 | ] 113 | }, 114 | { 115 | "cell_type": "markdown", 116 | "metadata": {}, 117 | "source": [ 118 | "## 2 - Word analogy task\n", 119 | "\n", 120 | "In the word analogy task, we complete the sentence \"*a* is to *b* as *c* is to **____**\". An example is '*man* is to *woman* as *king* is to *queen*' . 
In detail, we are trying to find a word *d*, such that the associated word vectors $e_a, e_b, e_c, e_d$ are related in the following manner: $e_b - e_a \\approx e_d - e_c$. We will measure the similarity between $e_b - e_a$ and $e_d - e_c$ using cosine similarity. \n" 121 | ] 122 | }, 123 | { 124 | "cell_type": "code", 125 | "execution_count": 7, 126 | "metadata": {}, 127 | "outputs": [ 128 | { 129 | "data": { 130 | "text/plain": [ 131 | "array([ 0.64706 , -0.068067, 0.15468 , -0.17408 , -0.29134 , 0.76999 ,\n", 132 | " -0.3192 , -0.25663 , -0.25082 , -0.036737, -0.25509 , 0.29636 ,\n", 133 | " 0.5776 , 0.49641 , 0.19167 , -0.83888 , 0.58482 , -0.38717 ,\n", 134 | " -0.71591 , 0.9519 , -0.37966 , -0.1131 , 0.47154 , 0.20921 ,\n", 135 | " 0.38197 , 0.067582, -0.92879 , -1.1237 , 0.84831 , 0.68744 ,\n", 136 | " -0.15472 , 0.92714 , 0.53371 , -0.037392, -0.856 , 0.19056 ,\n", 137 | " -0.014594, 0.15186 , 0.53514 , -0.20306 , -0.35164 , 0.33152 ,\n", 138 | " 1.1306 , -0.72787 , -0.19724 , 0.031659, -0.24041 , -0.057617,\n", 139 | " 0.60473 , -0.49233 , -0.24405 , -0.3184 , 0.96156 , 1.0895 ,\n", 140 | " 0.21534 , -2.0542 , -1.0615 , 0.052439, 0.57958 , 0.2748 ,\n", 141 | " 0.91587 , 0.85195 , 0.36113 , -0.31901 , 0.7784 , -0.36865 ,\n", 142 | " 0.64387 , 0.33104 , -0.27181 , 0.58524 , -0.15143 , 0.11121 ,\n", 143 | " 0.2126 , -0.60345 , 0.16148 , 0.32952 , -0.1354 , -0.30629 ,\n", 144 | " -0.89143 , 0.091912, 0.49753 , 0.55932 , 0.19329 , 0.044859,\n", 145 | " -1.0416 , -0.41566 , -0.54174 , -0.7244 , -0.57492 , -1.1188 ,\n", 146 | " 0.087097, -0.2992 , 0.87227 , 0.86996 , -0.89641 , -0.28259 ,\n", 147 | " -0.47295 , -0.74062 , -0.39 , -0.78099 ], dtype=float32)" 148 | ] 149 | }, 150 | "execution_count": 7, 151 | "metadata": {}, 152 | "output_type": "execute_result" 153 | } 154 | ], 155 | "source": [ 156 | "embeddings_index[\"father\"]" 157 | ] 158 | }, 159 | { 160 | "cell_type": "code", 161 | "execution_count": 8, 162 | "metadata": {}, 163 | "outputs": [], 164 | "source": [ 165 | "def complete_analogy(word_a, word_b, word_c, embeddings_index):\n", 166 | " \n", 167 | " # convert words to lower case\n", 168 | " word_a, word_b, word_c = word_a.lower(), word_b.lower(), word_c.lower()\n", 169 | " \n", 170 | " # Get the word embeddings v_a, v_b and v_c \n", 171 | " e_a, e_b, e_c = embeddings_index[word_a], embeddings_index[word_b], embeddings_index[word_c]\n", 172 | " \n", 173 | " words = embeddings_index.keys()\n", 174 | " max_cosine_sim = -100 # Initialize max_cosine_sim to a large negative number\n", 175 | " best_word = None # Initialize best_word with None, it will help keep track of the word to output\n", 176 | "\n", 177 | " # loop over the whole word vector set\n", 178 | " for w in words: \n", 179 | " # to avoid best_word being one of the input words, pass on them.\n", 180 | " if w in [word_a, word_b, word_c] :\n", 181 | " continue\n", 182 | " \n", 183 | " # Compute cosine similarity between the vector (e_b - e_a) and the vector ((w's vector representation) - e_c) (≈1 line)\n", 184 | " cosine_sim = similarity(e_b - e_a, embeddings_index[w] - e_c)\n", 185 | " \n", 186 | " # If the cosine_sim is more than the max_cosine_sim seen so far,\n", 187 | " # then: set the new max_cosine_sim to the current cosine_sim and the best_word to the current word (≈3 lines)\n", 188 | " if cosine_sim > max_cosine_sim:\n", 189 | " max_cosine_sim = cosine_sim\n", 190 | " best_word = w\n", 191 | " \n", 192 | " return best_word" 193 | ] 194 | }, 195 | { 196 | "cell_type": "markdown", 197 | "metadata": {}, 198 | 
"source": [ 199 | "Run the cell below to test your code, this may take 1-2 minutes." 200 | ] 201 | }, 202 | { 203 | "cell_type": "code", 204 | "execution_count": 9, 205 | "metadata": {}, 206 | "outputs": [ 207 | { 208 | "data": { 209 | "text/plain": [ 210 | "'iranian'" 211 | ] 212 | }, 213 | "execution_count": 9, 214 | "metadata": {}, 215 | "output_type": "execute_result" 216 | } 217 | ], 218 | "source": [ 219 | "complete_analogy('china', 'chinese', 'iran', embeddings_index)" 220 | ] 221 | }, 222 | { 223 | "cell_type": "code", 224 | "execution_count": 10, 225 | "metadata": {}, 226 | "outputs": [ 227 | { 228 | "data": { 229 | "text/plain": [ 230 | "'tehran'" 231 | ] 232 | }, 233 | "execution_count": 10, 234 | "metadata": {}, 235 | "output_type": "execute_result" 236 | } 237 | ], 238 | "source": [ 239 | "complete_analogy('india', 'delhi', 'iran', embeddings_index)" 240 | ] 241 | }, 242 | { 243 | "cell_type": "code", 244 | "execution_count": 11, 245 | "metadata": {}, 246 | "outputs": [ 247 | { 248 | "data": { 249 | "text/plain": [ 250 | "'girl'" 251 | ] 252 | }, 253 | "execution_count": 11, 254 | "metadata": {}, 255 | "output_type": "execute_result" 256 | } 257 | ], 258 | "source": [ 259 | "complete_analogy('man', 'woman', 'boy', embeddings_index)" 260 | ] 261 | }, 262 | { 263 | "cell_type": "code", 264 | "execution_count": 12, 265 | "metadata": {}, 266 | "outputs": [ 267 | { 268 | "data": { 269 | "text/plain": [ 270 | "'bigger'" 271 | ] 272 | }, 273 | "execution_count": 12, 274 | "metadata": {}, 275 | "output_type": "execute_result" 276 | } 277 | ], 278 | "source": [ 279 | "complete_analogy('small', 'smaller', 'big', embeddings_index)" 280 | ] 281 | }, 282 | { 283 | "cell_type": "code", 284 | "execution_count": 9, 285 | "metadata": {}, 286 | "outputs": [ 287 | { 288 | "data": { 289 | "text/plain": [ 290 | "'inuktitut'" 291 | ] 292 | }, 293 | "execution_count": 9, 294 | "metadata": {}, 295 | "output_type": "execute_result" 296 | } 297 | ], 298 | "source": [ 299 | "complete_analogy('iran', 'farsi', 'canada', embeddings_index)" 300 | ] 301 | }, 302 | { 303 | "cell_type": "markdown", 304 | "metadata": {}, 305 | "source": [ 306 | "
\n", 307 | "
Advanced Deep Learning Course
Alireza Akhavanpour
Aban-Azar 1399 (fall 2020)
\n", 308 | "
\n", 309 | "Class.Vision - AkhavanPour.ir - GitHub\n", 310 | "\n", 311 | "
" 312 | ] 313 | } 314 | ], 315 | "metadata": { 316 | "coursera": { 317 | "course_slug": "nlp-sequence-models", 318 | "graded_item_id": "8hb5s", 319 | "launcher_item_id": "5NrJ6" 320 | }, 321 | "kernelspec": { 322 | "display_name": "tf2-GPU", 323 | "language": "python", 324 | "name": "tf2" 325 | }, 326 | "language_info": { 327 | "codemirror_mode": { 328 | "name": "ipython", 329 | "version": 3 330 | }, 331 | "file_extension": ".py", 332 | "mimetype": "text/x-python", 333 | "name": "python", 334 | "nbconvert_exporter": "python", 335 | "pygments_lexer": "ipython3", 336 | "version": "3.6.9" 337 | } 338 | }, 339 | "nbformat": 4, 340 | "nbformat_minor": 2 341 | } 342 | -------------------------------------------------------------------------------- /09_add-numbers-with-seq2seq.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "
In the Name of God
\n", 8 | "\"class.vision\"\n", 9 | "

Seq2Seq for Adding Numbers!

" 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": 1, 15 | "metadata": {}, 16 | "outputs": [ 17 | { 18 | "name": "stderr", 19 | "output_type": "stream", 20 | "text": [ 21 | "Using TensorFlow backend.\n" 22 | ] 23 | } 24 | ], 25 | "source": [ 26 | "from random import seed\n", 27 | "from random import randint\n", 28 | "from numpy import array\n", 29 | "from math import ceil\n", 30 | "from math import log10\n", 31 | "from math import sqrt\n", 32 | "from numpy import argmax\n", 33 | "from keras.models import Sequential\n", 34 | "from keras.layers import Dense\n", 35 | "from keras.layers import LSTM\n", 36 | "from keras.layers import TimeDistributed\n", 37 | "from keras.layers import RepeatVector" 38 | ] 39 | }, 40 | { 41 | "cell_type": "code", 42 | "execution_count": 2, 43 | "metadata": {}, 44 | "outputs": [], 45 | "source": [ 46 | "# generate lists of random integers and their sum\n", 47 | "def random_sum_pairs(n_examples, n_numbers, largest):\n", 48 | " X, y = list(), list()\n", 49 | " for i in range(n_examples):\n", 50 | " in_pattern = [randint(1,largest) for _ in range(n_numbers)]\n", 51 | " out_pattern = sum(in_pattern)\n", 52 | " X.append(in_pattern)\n", 53 | " y.append(out_pattern)\n", 54 | " return X, y" 55 | ] 56 | }, 57 | { 58 | "cell_type": "code", 59 | "execution_count": 3, 60 | "metadata": {}, 61 | "outputs": [ 62 | { 63 | "name": "stdout", 64 | "output_type": "stream", 65 | "text": [ 66 | "100\n", 67 | "[12, 3, 4]\n", 68 | "19\n" 69 | ] 70 | } 71 | ], 72 | "source": [ 73 | "x,y = random_sum_pairs(100,3,15)\n", 74 | "print(len(x))\n", 75 | "print(x[0])\n", 76 | "print(y[0])" 77 | ] 78 | }, 79 | { 80 | "cell_type": "code", 81 | "execution_count": 4, 82 | "metadata": {}, 83 | "outputs": [], 84 | "source": [ 85 | "# convert data to strings\n", 86 | "def to_string(X, y, n_numbers, largest):\n", 87 | " max_length = n_numbers * ceil(log10(largest+1)) + n_numbers - 1\n", 88 | " Xstr = list()\n", 89 | " for pattern in X:\n", 90 | " strp = '+'.join([str(n) for n in pattern])\n", 91 | " strp = ''.join([' ' for _ in range(max_length-len(strp))]) + strp\n", 92 | " Xstr.append(strp)\n", 93 | " max_length = ceil(log10(n_numbers * (largest+1)))\n", 94 | " ystr = list()\n", 95 | " for pattern in y:\n", 96 | " strp = str(pattern)\n", 97 | " strp = ''.join([' ' for _ in range(max_length-len(strp))]) + strp\n", 98 | " ystr.append(strp)\n", 99 | " return Xstr, ystr" 100 | ] 101 | }, 102 | { 103 | "cell_type": "code", 104 | "execution_count": 5, 105 | "metadata": {}, 106 | "outputs": [], 107 | "source": [ 108 | "# integer encode strings\n", 109 | "def integer_encode(X, y, alphabet):\n", 110 | " char_to_int = dict((c, i) for i, c in enumerate(alphabet))\n", 111 | " Xenc = list()\n", 112 | " for pattern in X:\n", 113 | " integer_encoded = [char_to_int[char] for char in pattern]\n", 114 | " Xenc.append(integer_encoded)\n", 115 | " yenc = list()\n", 116 | " for pattern in y:\n", 117 | " integer_encoded = [char_to_int[char] for char in pattern]\n", 118 | " yenc.append(integer_encoded)\n", 119 | " return Xenc, yenc" 120 | ] 121 | }, 122 | { 123 | "cell_type": "code", 124 | "execution_count": 6, 125 | "metadata": {}, 126 | "outputs": [], 127 | "source": [ 128 | " # one hot encode\n", 129 | "def one_hot_encode(X, y, max_int):\n", 130 | " Xenc = list()\n", 131 | " for seq in X:\n", 132 | " pattern = list()\n", 133 | " for index in seq:\n", 134 | " vector = [0 for _ in range(max_int)]\n", 135 | " vector[index] = 1\n", 136 | " pattern.append(vector)\n", 137 | " Xenc.append(pattern)\n", 138 
| " yenc = list()\n", 139 | " for seq in y:\n", 140 | " pattern = list()\n", 141 | " for index in seq:\n", 142 | " vector = [0 for _ in range(max_int)]\n", 143 | " vector[index] = 1\n", 144 | " pattern.append(vector)\n", 145 | " yenc.append(pattern)\n", 146 | " return Xenc, yenc" 147 | ] 148 | }, 149 | { 150 | "cell_type": "code", 151 | "execution_count": 7, 152 | "metadata": {}, 153 | "outputs": [], 154 | "source": [ 155 | "# generate an encoded dataset\n", 156 | "def generate_data(n_samples, n_numbers, largest, alphabet):\n", 157 | " # generate pairs\n", 158 | " X, y = random_sum_pairs(n_samples, n_numbers, largest)\n", 159 | " # convert to strings\n", 160 | " X, y = to_string(X, y, n_numbers, largest)\n", 161 | " # integer encode\n", 162 | " X, y = integer_encode(X, y, alphabet)\n", 163 | " # one hot encode\n", 164 | " X, y = one_hot_encode(X, y, len(alphabet))\n", 165 | " # return as numpy arrays\n", 166 | " X, y = array(X), array(y)\n", 167 | " return X, y" 168 | ] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "execution_count": 8, 173 | "metadata": {}, 174 | "outputs": [], 175 | "source": [ 176 | "# invert encoding\n", 177 | "def invert(seq, alphabet):\n", 178 | " int_to_char = dict((i, c) for i, c in enumerate(alphabet))\n", 179 | " strings = list()\n", 180 | " for pattern in seq:\n", 181 | " string = int_to_char[argmax(pattern)]\n", 182 | " strings.append(string)\n", 183 | " return ''.join(strings)" 184 | ] 185 | }, 186 | { 187 | "cell_type": "code", 188 | "execution_count": 9, 189 | "metadata": {}, 190 | "outputs": [], 191 | "source": [ 192 | "# define dataset\n", 193 | "seed(1)\n", 194 | "n_samples = 1000\n", 195 | "n_numbers = 2\n", 196 | "largest = 10\n", 197 | "alphabet = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '+', ' ']\n", 198 | "n_chars = len(alphabet)\n", 199 | "n_in_seq_length = n_numbers * ceil(log10(largest+1)) + n_numbers - 1\n", 200 | "n_out_seq_length = ceil(log10(n_numbers * (largest+1)))" 201 | ] 202 | }, 203 | { 204 | "cell_type": "code", 205 | "execution_count": 10, 206 | "metadata": {}, 207 | "outputs": [], 208 | "source": [ 209 | "X, y = generate_data(n_samples, n_numbers, largest, alphabet)" 210 | ] 211 | }, 212 | { 213 | "cell_type": "code", 214 | "execution_count": 11, 215 | "metadata": {}, 216 | "outputs": [ 217 | { 218 | "name": "stdout", 219 | "output_type": "stream", 220 | "text": [ 221 | "shape of X (1000, 5, 12)\n", 222 | "shape of y (1000, 2, 12)\n", 223 | "X[0]:\n", 224 | "[[0 0 0 0 0 0 0 0 0 0 0 1]\n", 225 | " [0 0 0 1 0 0 0 0 0 0 0 0]\n", 226 | " [0 0 0 0 0 0 0 0 0 0 1 0]\n", 227 | " [0 1 0 0 0 0 0 0 0 0 0 0]\n", 228 | " [1 0 0 0 0 0 0 0 0 0 0 0]]\n", 229 | "y[0]\n", 230 | "[[0 1 0 0 0 0 0 0 0 0 0 0]\n", 231 | " [0 0 0 1 0 0 0 0 0 0 0 0]]\n", 232 | "invert X[0] 3+10\n", 233 | "invert y[0] 13\n" 234 | ] 235 | } 236 | ], 237 | "source": [ 238 | "print(\"shape of X\", X.shape)\n", 239 | "print(\"shape of y\", y.shape)\n", 240 | "print(\"X[0]:\")\n", 241 | "print(X[0])\n", 242 | "print(\"y[0]\")\n", 243 | "print(y[0])\n", 244 | "\n", 245 | "print(\"invert X[0]\", invert(X[0], alphabet) )\n", 246 | "print(\"invert y[0]\", invert(y[0], alphabet) )" 247 | ] 248 | }, 249 | { 250 | "cell_type": "code", 251 | "execution_count": 12, 252 | "metadata": {}, 253 | "outputs": [], 254 | "source": [ 255 | "# define LSTM configuration\n", 256 | "n_batch = 10\n", 257 | "n_epoch = 30" 258 | ] 259 | }, 260 | { 261 | "cell_type": "code", 262 | "execution_count": 13, 263 | "metadata": {}, 264 | "outputs": [ 265 | { 266 | "name": "stdout", 267 | 
"output_type": "stream", 268 | "text": [ 269 | "Model: \"sequential_1\"\n", 270 | "_________________________________________________________________\n", 271 | "Layer (type) Output Shape Param # \n", 272 | "=================================================================\n", 273 | "lstm_1 (LSTM) (None, 100) 45200 \n", 274 | "_________________________________________________________________\n", 275 | "repeat_vector_1 (RepeatVecto (None, 2, 100) 0 \n", 276 | "_________________________________________________________________\n", 277 | "lstm_2 (LSTM) (None, 2, 50) 30200 \n", 278 | "_________________________________________________________________\n", 279 | "time_distributed_1 (TimeDist (None, 2, 12) 612 \n", 280 | "=================================================================\n", 281 | "Total params: 76,012\n", 282 | "Trainable params: 76,012\n", 283 | "Non-trainable params: 0\n", 284 | "_________________________________________________________________\n", 285 | "None\n" 286 | ] 287 | } 288 | ], 289 | "source": [ 290 | "# create LSTM\n", 291 | "model = Sequential()\n", 292 | "model.add(LSTM(100, input_shape=(n_in_seq_length, n_chars)))\n", 293 | "model.add(RepeatVector(n_out_seq_length))\n", 294 | "model.add(LSTM(50, return_sequences=True))\n", 295 | "model.add(TimeDistributed(Dense(n_chars, activation='softmax')))\n", 296 | "model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])\n", 297 | "print(model.summary())" 298 | ] 299 | }, 300 | { 301 | "cell_type": "code", 302 | "execution_count": 14, 303 | "metadata": {}, 304 | "outputs": [ 305 | { 306 | "name": "stdout", 307 | "output_type": "stream", 308 | "text": [ 309 | "0\n", 310 | "Epoch 1/1\n", 311 | "1000/1000 [==============================] - 4s 4ms/step - loss: 1.9967 - accuracy: 0.3445\n", 312 | "1\n", 313 | "Epoch 1/1\n", 314 | "1000/1000 [==============================] - 2s 2ms/step - loss: 1.5194 - accuracy: 0.3680\n", 315 | "2\n", 316 | "Epoch 1/1\n", 317 | "1000/1000 [==============================] - 2s 2ms/step - loss: 1.3945 - accuracy: 0.4670\n", 318 | "3\n", 319 | "Epoch 1/1\n", 320 | "1000/1000 [==============================] - 3s 3ms/step - loss: 1.3099 - accuracy: 0.5075\n", 321 | "4\n", 322 | "Epoch 1/1\n", 323 | "1000/1000 [==============================] - 2s 2ms/step - loss: 1.1971 - accuracy: 0.5595\n", 324 | "5\n", 325 | "Epoch 1/1\n", 326 | "1000/1000 [==============================] - 2s 2ms/step - loss: 1.0768 - accuracy: 0.6285\n", 327 | "6\n", 328 | "Epoch 1/1\n", 329 | "1000/1000 [==============================] - 3s 3ms/step - loss: 0.9106 - accuracy: 0.6870\n", 330 | "7\n", 331 | "Epoch 1/1\n", 332 | "1000/1000 [==============================] - 2s 2ms/step - loss: 0.7884 - accuracy: 0.7330\n", 333 | "8\n", 334 | "Epoch 1/1\n", 335 | "1000/1000 [==============================] - 3s 3ms/step - loss: 0.6850 - accuracy: 0.8075\n", 336 | "9\n", 337 | "Epoch 1/1\n", 338 | "1000/1000 [==============================] - 2s 2ms/step - loss: 0.5964 - accuracy: 0.8660\n", 339 | "10\n", 340 | "Epoch 1/1\n", 341 | "1000/1000 [==============================] - 2s 2ms/step - loss: 0.5245 - accuracy: 0.8975\n", 342 | "11\n", 343 | "Epoch 1/1\n", 344 | "1000/1000 [==============================] - 2s 2ms/step - loss: 0.4445 - accuracy: 0.9350\n", 345 | "12\n", 346 | "Epoch 1/1\n", 347 | "1000/1000 [==============================] - 3s 3ms/step - loss: 0.3942 - accuracy: 0.9580\n", 348 | "13\n", 349 | "Epoch 1/1\n", 350 | "1000/1000 [==============================] - 2s 2ms/step - loss: 
0.3604 - accuracy: 0.9520\n", 351 | "14\n", 352 | "Epoch 1/1\n", 353 | "1000/1000 [==============================] - 3s 3ms/step - loss: 0.3117 - accuracy: 0.9580\n", 354 | "15\n", 355 | "Epoch 1/1\n", 356 | "1000/1000 [==============================] - 2s 2ms/step - loss: 0.2776 - accuracy: 0.9790\n", 357 | "16\n", 358 | "Epoch 1/1\n", 359 | "1000/1000 [==============================] - 3s 3ms/step - loss: 0.2440 - accuracy: 0.9800\n", 360 | "17\n", 361 | "Epoch 1/1\n", 362 | "1000/1000 [==============================] - 2s 2ms/step - loss: 0.2322 - accuracy: 0.9855\n", 363 | "18\n", 364 | "Epoch 1/1\n", 365 | "1000/1000 [==============================] - 3s 3ms/step - loss: 0.1978 - accuracy: 0.9885\n", 366 | "19\n", 367 | "Epoch 1/1\n", 368 | "1000/1000 [==============================] - 3s 3ms/step - loss: 0.1971 - accuracy: 0.9875\n", 369 | "20\n", 370 | "Epoch 1/1\n", 371 | "1000/1000 [==============================] - 2s 2ms/step - loss: 0.2231 - accuracy: 0.9595\n", 372 | "21\n", 373 | "Epoch 1/1\n", 374 | "1000/1000 [==============================] - 2s 2ms/step - loss: 0.1460 - accuracy: 0.9875\n", 375 | "22\n", 376 | "Epoch 1/1\n", 377 | "1000/1000 [==============================] - 4s 4ms/step - loss: 0.1305 - accuracy: 0.9955\n", 378 | "23\n", 379 | "Epoch 1/1\n", 380 | "1000/1000 [==============================] - 3s 3ms/step - loss: 0.1155 - accuracy: 0.9945\n", 381 | "24\n", 382 | "Epoch 1/1\n", 383 | "1000/1000 [==============================] - 2s 2ms/step - loss: 0.1039 - accuracy: 0.9950\n", 384 | "25\n", 385 | "Epoch 1/1\n", 386 | "1000/1000 [==============================] - 3s 3ms/step - loss: 0.0897 - accuracy: 0.9960\n", 387 | "26\n", 388 | "Epoch 1/1\n", 389 | "1000/1000 [==============================] - 3s 3ms/step - loss: 0.0833 - accuracy: 0.9975\n", 390 | "27\n", 391 | "Epoch 1/1\n", 392 | "1000/1000 [==============================] - 3s 3ms/step - loss: 0.0742 - accuracy: 0.9960\n", 393 | "28\n", 394 | "Epoch 1/1\n", 395 | "1000/1000 [==============================] - 3s 3ms/step - loss: 0.0678 - accuracy: 0.9975: 0s - l\n", 396 | "29\n", 397 | "Epoch 1/1\n", 398 | "1000/1000 [==============================] - 3s 3ms/step - loss: 0.0590 - accuracy: 0.9985\n" 399 | ] 400 | } 401 | ], 402 | "source": [ 403 | "# train LSTM\n", 404 | "for i in range(n_epoch):\n", 405 | " X, y = generate_data(n_samples, n_numbers, largest, alphabet)\n", 406 | " print(i)\n", 407 | " model.fit(X, y, epochs=1, batch_size=n_batch)" 408 | ] 409 | }, 410 | { 411 | "cell_type": "code", 412 | "execution_count": 15, 413 | "metadata": {}, 414 | "outputs": [ 415 | { 416 | "name": "stdout", 417 | "output_type": "stream", 418 | "text": [ 419 | "Expected=14, Predicted=14\n", 420 | "Expected=15, Predicted=15\n", 421 | "Expected=12, Predicted=12\n", 422 | "Expected=12, Predicted=12\n", 423 | "Expected=13, Predicted=13\n", 424 | "Expected=15, Predicted=15\n", 425 | "Expected= 8, Predicted= 8\n", 426 | "Expected= 4, Predicted= 4\n", 427 | "Expected=19, Predicted=19\n", 428 | "Expected=13, Predicted=13\n", 429 | "Expected=15, Predicted=15\n", 430 | "Expected= 8, Predicted= 8\n", 431 | "Expected=13, Predicted=13\n", 432 | "Expected= 4, Predicted= 4\n", 433 | "Expected=16, Predicted=16\n", 434 | "Expected=13, Predicted=13\n", 435 | "Expected=16, Predicted=16\n", 436 | "Expected= 8, Predicted= 8\n", 437 | "Expected=14, Predicted=14\n", 438 | "Expected=16, Predicted=16\n" 439 | ] 440 | } 441 | ], 442 | "source": [ 443 | "# evaluate on some new patterns\n", 444 | "X, y = generate_data(n_samples, 
n_numbers, largest, alphabet)\n", 446 | "result = model.predict(X, batch_size=n_batch, verbose=0)\n", 447 | "# calculate error\n", 448 | "expected = [invert(x, alphabet) for x in y]\n", 449 | "predicted = [invert(x, alphabet) for x in result]\n", 450 | "# show some examples\n", 451 | "for i in range(20):\n", 452 | " print('Expected=%s, Predicted=%s' % (expected[i], predicted[i]))" 453 | ] 454 | }, 455 | { 456 | "cell_type": "markdown", 457 | "metadata": {}, 458 | "source": [ 459 | "### Extensions\n", 460 | "\n", 461 | "This section lists some natural extensions to this tutorial that you may wish to explore.\n", 462 | "\n", 463 | " ***Integer Encoding***. Explore whether the model can learn the task better using an integer encoding alone. The ordinal relationship between most of the inputs may prove very useful.\n", 464 | " \n", 465 | " ***Variable Numbers***. Change the example to support a variable number of terms in each input sequence. This should be straightforward as long as you perform sufficient padding.\n", 466 | " \n", 467 | " ***Variable Mathematical Operations***. Change the example to vary the mathematical operation, to allow the network to generalize even further.\n", 468 | " ***Brackets***. Allow the use of brackets along with other mathematical operations." 469 | ] 470 | }, 471 | { 472 | "cell_type": "markdown", 473 | "metadata": {}, 474 | "source": [ 475 | "Source:\n", 476 | " https://machinelearningmastery.com/learn-add-numbers-seq2seq-recurrent-neural-networks/" 477 | ] 478 | }, 479 | { 480 | "cell_type": "markdown", 481 | "metadata": {}, 482 | "source": [
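
To try the trained model on a hand-picked sum instead of a random batch, the same encode/decode helpers defined above can be reused; a minimal sketch (the pair 4+9 is arbitrary):

```python
# Encode '4+9' exactly the way the training data was built, then decode the prediction.
qX, qy = to_string([[4, 9]], [13], n_numbers, largest)
qX, qy = integer_encode(qX, qy, alphabet)
qX, _ = one_hot_encode(qX, qy, len(alphabet))
pred = model.predict(array(qX))
print(invert(pred[0], alphabet))  # expected: '13'
```
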
\n", 483 | "
Advanced Deep Learning Course
Alireza Akhavanpour
Aban-Azar 1399 (fall 2020)
\n", 484 | "
\n", 485 | "Class.Vision - AkhavanPour.ir - GitHub\n", 486 | "\n", 487 | "
" 488 | ] 489 | } 490 | ], 491 | "metadata": { 492 | "kernelspec": { 493 | "display_name": "tf2-GPU", 494 | "language": "python", 495 | "name": "tf2" 496 | }, 497 | "language_info": { 498 | "codemirror_mode": { 499 | "name": "ipython", 500 | "version": 3 501 | }, 502 | "file_extension": ".py", 503 | "mimetype": "text/x-python", 504 | "name": "python", 505 | "nbconvert_exporter": "python", 506 | "pygments_lexer": "ipython3", 507 | "version": "3.6.9" 508 | } 509 | }, 510 | "nbformat": 4, 511 | "nbformat_minor": 2 512 | } 513 | -------------------------------------------------------------------------------- /02_1_simple-RNN-diffrent-sequence-length.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "
In the Name of God
\n", 8 | "\"class.vision\"\n", 9 | "

Varying the length of the network's input time windows

" 10 | ] 11 | }, 12 | { 13 | "cell_type": "code", 14 | "execution_count": 1, 15 | "metadata": {}, 16 | "outputs": [ 17 | { 18 | "name": "stderr", 19 | "output_type": "stream", 20 | "text": [ 21 | "Using TensorFlow backend.\n" 22 | ] 23 | } 24 | ], 25 | "source": [ 26 | "import pandas as pd\n", 27 | "import numpy as np\n", 28 | "from keras.models import Sequential\n", 29 | "from keras.layers import Dense, SimpleRNN\n", 30 | "import matplotlib.pyplot as plt\n", 31 | "%matplotlib inline\n", 32 | "\n", 33 | "t = np.arange(0,1500)\n", 34 | "x = np.sin(0.02*t)+ np.random.rand(1500) * 2\n", 35 | "\n", 36 | "train,test = x[0:1000], x[1000:]\n", 37 | "\n", 38 | "# convert into dataset data and label\n", 39 | "def convertToDataset(data, step):\n", 40 | " #data = np.append(data,np.repeat(data[-1,],step))\n", 41 | " X, Y =[], []\n", 42 | " for i in range(len(data)-step):\n", 43 | " d=i+step \n", 44 | " X.append(data[i:d,])\n", 45 | " Y.append(data[d,])\n", 46 | " return np.array(X), np.array(Y)" 47 | ] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "execution_count": 2, 52 | "metadata": {}, 53 | "outputs": [], 54 | "source": [ 55 | "train_step = 10\n", 56 | "test_step = 20\n", 57 | "\n", 58 | "trainX,trainY =convertToDataset(train,train_step)\n", 59 | "testX,testY =convertToDataset(test,test_step)\n", 60 | "\n", 61 | "trainX = np.reshape(trainX, (trainX.shape[0], trainX.shape[1], 1))\n", 62 | "testX = np.reshape(testX, (testX.shape[0],testX.shape[1], 1))" 63 | ] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "execution_count": 3, 68 | "metadata": {}, 69 | "outputs": [ 70 | { 71 | "name": "stdout", 72 | "output_type": "stream", 73 | "text": [ 74 | "(990, 10, 1)\n", 75 | "(480, 20, 1)\n" 76 | ] 77 | } 78 | ], 79 | "source": [ 80 | "print(trainX.shape)\n", 81 | "print(testX.shape)" 82 | ] 83 | }, 84 | { 85 | "cell_type": "markdown", 86 | "metadata": {}, 87 | "source": [ 88 | "#
Network architecture and compiling it
\n" 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 4, 94 | "metadata": {}, 95 | "outputs": [], 96 | "source": [ 97 | "model = Sequential()\n", 98 | "model.add(SimpleRNN(units=64, input_shape=(None, 1), activation=\"tanh\"))\n", 99 | "model.add(Dense(1))\n", 100 | "model.compile(loss='mean_squared_error', optimizer='rmsprop')" 101 | ] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "execution_count": 5, 106 | "metadata": {}, 107 | "outputs": [ 108 | { 109 | "name": "stdout", 110 | "output_type": "stream", 111 | "text": [ 112 | "_________________________________________________________________\n", 113 | "Layer (type) Output Shape Param # \n", 114 | "=================================================================\n", 115 | "simple_rnn_1 (SimpleRNN) (None, 64) 4224 \n", 116 | "_________________________________________________________________\n", 117 | "dense_1 (Dense) (None, 1) 65 \n", 118 | "=================================================================\n", 119 | "Total params: 4,289\n", 120 | "Trainable params: 4,289\n", 121 | "Non-trainable params: 0\n", 122 | "_________________________________________________________________\n" 123 | ] 124 | } 125 | ], 126 | "source": [ 127 | "model.summary()" 128 | ] 129 | }, 130 | { 131 | "cell_type": "code", 132 | "execution_count": 6, 133 | "metadata": {}, 134 | "outputs": [ 135 | { 136 | "data": { 137 | "text/plain": [ 138 | "" 139 | ] 140 | }, 141 | "execution_count": 6, 142 | "metadata": {}, 143 | "output_type": "execute_result" 144 | } 145 | ], 146 | "source": [ 147 | "model.input" 148 | ] 149 | }, 150 | { 151 | "cell_type": "markdown", 152 | "metadata": {}, 153 | "source": [ 154 | "#
Training the model
\n" 155 | ] 156 | }, 157 | { 158 | "cell_type": "markdown", 159 | "metadata": {}, 160 | "source": [ 161 | "
\n", 162 | "حالا با این پیاده سازی به نظر شما میتوانیم در طول آموزش هم طول timestep متفاوت داشته باشیم؟
\n", 163 | "\n", 164 | "
" 165 | ] 166 | }, 167 | { 168 | "cell_type": "code", 169 | "execution_count": 7, 170 | "metadata": {}, 171 | "outputs": [ 172 | { 173 | "name": "stdout", 174 | "output_type": "stream", 175 | "text": [ 176 | "Epoch 1/100\n", 177 | " - 4s - loss: 0.5067\n", 178 | "Epoch 2/100\n", 179 | " - 0s - loss: 0.4110\n", 180 | "Epoch 3/100\n", 181 | " - 0s - loss: 0.4095\n", 182 | "Epoch 4/100\n", 183 | " - 0s - loss: 0.3995\n", 184 | "Epoch 5/100\n", 185 | " - 1s - loss: 0.3934\n", 186 | "Epoch 6/100\n", 187 | " - 0s - loss: 0.3865\n", 188 | "Epoch 7/100\n", 189 | " - 0s - loss: 0.3920\n", 190 | "Epoch 8/100\n", 191 | " - 1s - loss: 0.3851\n", 192 | "Epoch 9/100\n", 193 | " - 1s - loss: 0.3858\n", 194 | "Epoch 10/100\n", 195 | " - 0s - loss: 0.3869\n", 196 | "Epoch 11/100\n", 197 | " - 0s - loss: 0.3804\n", 198 | "Epoch 12/100\n", 199 | " - 0s - loss: 0.3806\n", 200 | "Epoch 13/100\n", 201 | " - 0s - loss: 0.3736\n", 202 | "Epoch 14/100\n", 203 | " - 0s - loss: 0.3740\n", 204 | "Epoch 15/100\n", 205 | " - 0s - loss: 0.3662\n", 206 | "Epoch 16/100\n", 207 | " - 0s - loss: 0.3708\n", 208 | "Epoch 17/100\n", 209 | " - 0s - loss: 0.3668\n", 210 | "Epoch 18/100\n", 211 | " - 0s - loss: 0.3632\n", 212 | "Epoch 19/100\n", 213 | " - 0s - loss: 0.3697\n", 214 | "Epoch 20/100\n", 215 | " - 0s - loss: 0.3612\n", 216 | "Epoch 21/100\n", 217 | " - 0s - loss: 0.3619\n", 218 | "Epoch 22/100\n", 219 | " - 0s - loss: 0.3577\n", 220 | "Epoch 23/100\n", 221 | " - 0s - loss: 0.3577\n", 222 | "Epoch 24/100\n", 223 | " - 0s - loss: 0.3534\n", 224 | "Epoch 25/100\n", 225 | " - 0s - loss: 0.3537\n", 226 | "Epoch 26/100\n", 227 | " - 0s - loss: 0.3556\n", 228 | "Epoch 27/100\n", 229 | " - 0s - loss: 0.3499\n", 230 | "Epoch 28/100\n", 231 | " - 0s - loss: 0.3484\n", 232 | "Epoch 29/100\n", 233 | " - 0s - loss: 0.3455\n", 234 | "Epoch 30/100\n", 235 | " - 0s - loss: 0.3415\n", 236 | "Epoch 31/100\n", 237 | " - 0s - loss: 0.3393\n", 238 | "Epoch 32/100\n", 239 | " - 0s - loss: 0.3332\n", 240 | "Epoch 33/100\n", 241 | " - 0s - loss: 0.3361\n", 242 | "Epoch 34/100\n", 243 | " - 0s - loss: 0.3387\n", 244 | "Epoch 35/100\n", 245 | " - 0s - loss: 0.3236\n", 246 | "Epoch 36/100\n", 247 | " - 0s - loss: 0.3313\n", 248 | "Epoch 37/100\n", 249 | " - 0s - loss: 0.3276\n", 250 | "Epoch 38/100\n", 251 | " - 0s - loss: 0.3225\n", 252 | "Epoch 39/100\n", 253 | " - 0s - loss: 0.3239\n", 254 | "Epoch 40/100\n", 255 | " - 0s - loss: 0.3161\n", 256 | "Epoch 41/100\n", 257 | " - 0s - loss: 0.3075\n", 258 | "Epoch 42/100\n", 259 | " - 0s - loss: 0.3118\n", 260 | "Epoch 43/100\n", 261 | " - 0s - loss: 0.3085\n", 262 | "Epoch 44/100\n", 263 | " - 0s - loss: 0.3020\n", 264 | "Epoch 45/100\n", 265 | " - 0s - loss: 0.2969\n", 266 | "Epoch 46/100\n", 267 | " - 0s - loss: 0.2989\n", 268 | "Epoch 47/100\n", 269 | " - 0s - loss: 0.2919\n", 270 | "Epoch 48/100\n", 271 | " - 0s - loss: 0.2893\n", 272 | "Epoch 49/100\n", 273 | " - 0s - loss: 0.2869\n", 274 | "Epoch 50/100\n", 275 | " - 0s - loss: 0.2843\n", 276 | "Epoch 51/100\n", 277 | " - 0s - loss: 0.2818\n", 278 | "Epoch 52/100\n", 279 | " - 0s - loss: 0.2788\n", 280 | "Epoch 53/100\n", 281 | " - 0s - loss: 0.2698\n", 282 | "Epoch 54/100\n", 283 | " - 0s - loss: 0.2686\n", 284 | "Epoch 55/100\n", 285 | " - 0s - loss: 0.2606\n", 286 | "Epoch 56/100\n", 287 | " - 0s - loss: 0.2595\n", 288 | "Epoch 57/100\n", 289 | " - 0s - loss: 0.2569\n", 290 | "Epoch 58/100\n", 291 | " - 0s - loss: 0.2531\n", 292 | "Epoch 59/100\n", 293 | " - 0s - loss: 0.2535\n", 294 | "Epoch 60/100\n", 295 | " - 0s - loss: 0.2413\n", 
296 | "Epoch 61/100\n", 297 | " - 0s - loss: 0.2381\n", 298 | "Epoch 62/100\n", 299 | " - 0s - loss: 0.2340\n", 300 | "Epoch 63/100\n", 301 | " - 0s - loss: 0.2337\n", 302 | "Epoch 64/100\n", 303 | " - 0s - loss: 0.2308\n", 304 | "Epoch 65/100\n", 305 | " - 0s - loss: 0.2165\n", 306 | "Epoch 66/100\n", 307 | " - 0s - loss: 0.2228\n", 308 | "Epoch 67/100\n", 309 | " - 0s - loss: 0.2148\n", 310 | "Epoch 68/100\n", 311 | " - 0s - loss: 0.2090\n", 312 | "Epoch 69/100\n", 313 | " - 0s - loss: 0.2065\n", 314 | "Epoch 70/100\n", 315 | " - 0s - loss: 0.2043\n", 316 | "Epoch 71/100\n", 317 | " - 0s - loss: 0.1964\n", 318 | "Epoch 72/100\n", 319 | " - 0s - loss: 0.1957\n", 320 | "Epoch 73/100\n", 321 | " - 1s - loss: 0.1896\n", 322 | "Epoch 74/100\n", 323 | " - 1s - loss: 0.1830\n", 324 | "Epoch 75/100\n", 325 | " - 0s - loss: 0.1827\n", 326 | "Epoch 76/100\n", 327 | " - 1s - loss: 0.1808\n", 328 | "Epoch 77/100\n", 329 | " - 0s - loss: 0.1748\n", 330 | "Epoch 78/100\n", 331 | " - 0s - loss: 0.1662\n", 332 | "Epoch 79/100\n", 333 | " - 0s - loss: 0.1674\n", 334 | "Epoch 80/100\n", 335 | " - 0s - loss: 0.1636\n", 336 | "Epoch 81/100\n", 337 | " - 0s - loss: 0.1580\n", 338 | "Epoch 82/100\n", 339 | " - 0s - loss: 0.1580\n", 340 | "Epoch 83/100\n", 341 | " - 0s - loss: 0.1476\n", 342 | "Epoch 84/100\n", 343 | " - 0s - loss: 0.1502\n", 344 | "Epoch 85/100\n", 345 | " - 0s - loss: 0.1431\n", 346 | "Epoch 86/100\n", 347 | " - 0s - loss: 0.1377\n", 348 | "Epoch 87/100\n", 349 | " - 0s - loss: 0.1369\n", 350 | "Epoch 88/100\n", 351 | " - 0s - loss: 0.1321\n", 352 | "Epoch 89/100\n", 353 | " - 0s - loss: 0.1286\n", 354 | "Epoch 90/100\n", 355 | " - 0s - loss: 0.1256\n", 356 | "Epoch 91/100\n", 357 | " - 0s - loss: 0.1229\n", 358 | "Epoch 92/100\n", 359 | " - 0s - loss: 0.1206\n", 360 | "Epoch 93/100\n", 361 | " - 0s - loss: 0.1167\n", 362 | "Epoch 94/100\n", 363 | " - 0s - loss: 0.1145\n", 364 | "Epoch 95/100\n", 365 | " - 0s - loss: 0.1071\n", 366 | "Epoch 96/100\n", 367 | " - 0s - loss: 0.1078\n", 368 | "Epoch 97/100\n", 369 | " - 0s - loss: 0.1038\n", 370 | "Epoch 98/100\n", 371 | " - 0s - loss: 0.1023\n", 372 | "Epoch 99/100\n", 373 | " - 0s - loss: 0.0970\n", 374 | "Epoch 100/100\n", 375 | " - 0s - loss: 0.0994\n" 376 | ] 377 | } 378 | ], 379 | "source": [ 380 | "history = model.fit(trainX,trainY, epochs=100, batch_size=16, verbose=2)" 381 | ] 382 | }, 383 | { 384 | "cell_type": "markdown", 385 | "metadata": {}, 386 | "source": [ 387 | "#
Evaluating the model
" 388 | ] 389 | }, 390 | { 391 | "cell_type": "code", 392 | "execution_count": 9, 393 | "metadata": {}, 394 | "outputs": [ 395 | { 396 | "name": "stdout", 397 | "output_type": "stream", 398 | "text": [ 399 | "0.07633800416281729\n" 400 | ] 401 | } 402 | ], 403 | "source": [ 404 | "trainScore = model.evaluate(trainX, trainY, verbose=0)\n", 405 | "print(trainScore)" 406 | ] 407 | }, 408 | { 409 | "cell_type": "markdown", 410 | "metadata": {}, 411 | "source": [ 412 | "#
Plotting the original series and the predictions for the training and test data
\n" 413 | ] 414 | }, 415 | { 416 | "cell_type": "code", 417 | "execution_count": 10, 418 | "metadata": {}, 419 | "outputs": [], 420 | "source": [ 421 | "trainPredict = model.predict(trainX)\n", 422 | "testPredict= model.predict(testX)\n", 423 | "predicted=np.concatenate((trainPredict,testPredict),axis=0)" 424 | ] 425 | }, 426 | { 427 | "cell_type": "code", 428 | "execution_count": 11, 429 | "metadata": {}, 430 | "outputs": [ 431 | { 432 | "data": { 433 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXYAAAD8CAYAAABjAo9vAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAIABJREFUeJztnXec1FTXx3936hY6LL0sICBFBUUQUUQBQez19bFj76I++lgeEdtj7xUsKCoqViwgonSQJr3q0jtLW1hgd2cm9/0jyUwmk2QymdTZ+/18YGeSzL0n7eTk3HPPIZRSMBgMBiN38DktAIPBYDDMhSl2BoPByDGYYmcwGIwcgyl2BoPByDGYYmcwGIwcgyl2BoPByDGYYmcwGIwcgyl2BoPByDGYYmcwGIwcI+BEpw0aNKDFxcVOdM1g5DZr1vB/O3RwVg6GJfz111+7KaVF6bZzRLEXFxdjwYIFTnTNYOQ2ffvyf6dOdVIKhkUQQjbq2Y65YhgMBiPHYIqdwWAwcgym2BkMBiPHYIqdwWAwcoysFTshJI8QMo8QsoQQsoIQ8oQZgjEYDAbDGGZExVQCOINSWk4ICQKYSQiZQCmdY0LbDAaDwciQrBU75UswlQtfg8I/VpaJwWAwHMIUHzshxE8IWQxgF4BJlNK5ZrTLMJ/Jq3di2/4jTothH7tWgW6cjdklu0Epxbb9R1AZjTktFYNhKaYodkppjFLaFUBzAD0IIV3k2xBCbiaELCCELCgtLTWjW4YBrv94Ac57a6bTYtjHOyeBjDoLV3wwF1/O34yTn5uM+8YucVoqx9i6/whGzVrvtBgMizE1KoZSuh/AVACDFNaNpJR2p5R2LypKOyOWYSG7y6ucFsER1u8+BAD4feVOhyVxjiGj5uGJn1Zi14EKp0UxjyVfARtnOy2FqzAjKqaIEFJH+JwPoD+A1dm2y7CWikgMpQcrnRbDVvjhIMBHiMOSOEfZkQgAgMulUbDvbwZGnaW4isupHdWPGRZ7EwBTCCFLAcwH72P/2YR2GRZy0+gFOPGZ350Ww1LkN3VU+F6N9Xq14usFm9HmkfHYWp3GlASyVuyU0qWU0m6U0mMppV0opU+aIZhZLN2yH+e9NRNHqtiAmZQZ/+x2WgTLWb/nUNL3mKjYnRCGYTs/LtkGACjZVZ5my9wj52eePvnTSizdUoZlW8ucFoVhM1T2Fj76Tz4xHqnGJrv8mOQihyqjuHn0Auwoy6FxhAxxJG2vnfh8/E0cq6a+NpFdBysQ9OX8c1wX1VivVwt+XroNv0kGyKvj6c55xR4QFHtFNIaywxHULgg6LJEz9HjmD6dFcADlh3l1vNFFqsNDTW7DKe3z9rIjKKoRRsCfm8ZObu6VBL+g2IeMmo/jnvwtZT2lFI+PW441Ow7aLRqDYQkrtx/AuMVbFddVB1cMp7CTizbtQ/FDv2BHWQX2HapCr2cn4/EfV6AqyjkgofXkvGJX8qeW7DqIldsOAOAnbHzy50Zc//F8u0VjOEQu+9gpgANHIrjny8WK63dVgxBXeTQUAcGnc/jxlZklu3Gggg/5/HzuJrT/7wTb5bODnFfsfoV7uP8r0zH4jRn2C2MTFZEY3pr8DyKx3LRG9KJmnfoIf4xml+ReZJA4liS+qS7fWoY3//gHALD/cPWYmKY0njZ33d74Z1INnHG5r9h9uX8S5Yycvg4v/fY3PpujqzxizqLmdSCE4KbRC3DFB3OxescBW2WymmiM3+uaefzw2XlvzcTLk/4GpRSHJCG/OfzSojj5qrrFsue8Yq+OswwPCzfwYRa7r8jeQ1XxOP79hyMOS2Muon95/+EIZpXsjis5ubLLZV+73Mf+y7JtDkniHDml2PcfrsK0v5MTjKWz2Mctzq2T/uyEVXhv2tqMf1cRiSGaY64bvcpr897D2LTnsLXCOMCoWRvin6Ncbp1bLeSK/Yt5m5O+VwdbL6cU+51jFuHaj+bhld/W4N6v+MEjXxrF/uLENQBy51Vt1MwN8c9Uh2br/dxkAMDRj/2Ky0fmRm2UAxURFD/0C35cohwZIufUF6agz4tTLJbKHqRnPOgncSUW42jS9UAI73/Xc414jRyzTwzhLcVedQgYczmwf5PiajHP+BuTS/D9oq2YsmZXtRkwiiN5jum5Z6UPtAUb91kgkP2IU8jfnpL5m4v3SZx0v4/AL2j2qMwXM21NKc55c2aKNZsLKIU7Vje8pdhX/wL8PQH4QzkdjXzy0ZBR8zGrZI9qcxe9M8tU8dzGy5P+TplWXUy2o7/vL4ckso4Nuw/hzjELq3cRjU1zk0z28cu2xxV6LJas7MQ8Op4fPP7hjvjHvwTDpLpmdJTiLcWehsKQ9kRaP2LAxj/j3xdu2m+1SLYjdzx9NT9hkTUnpZgavh8fhF62VygbeOCbJfh56XYs2rQ/pwcG1Vg47UfgozMRKE+MGUn1m9xiV3JQluwqj8d4e4bFn8U/XvzubOwpr8ytlMQGyRnFvmXfYcxME5d8T+BbYNQgHE/+VlwvTlryMvIYXgqKf3bys2qnhYY6IZIt7BGKh9QrDOGDGesclsZ+3p+4AADgjyhnMpywfDvOfTO1cpb0Idj/lWm45F0PFKxYP0PVHXskEkMszZNdbfD0i3mbcOMnC7KVzhXkjGL/vxHpB/7aE34wrYgoW+q/Lt9uqkxOILfMKAUGvDodAOAnuWvKlB2JYHjgY7R/tzkmLN+Bi3zTUQuH0v5OfOh5nYMo4D9wUcX1w8atwD5JaKeacvt7pwdS3H5yDvB6V9XV6Vwx8pnHK7cdwIRl2/Hwd8vw+6rcqK7lLcWu8STeeyjzQdLzfTNRC4kL2fNqr+QPFBOlhxPFhrwrbBfHTqIcxXUBPhfQ0WQTXgm9hxeDI5K2IeDwL/8fCCKh/B4bt8JWOa2inOYDAAJV+t46xdmXFBRLNu/3nl+aqo+lpBs8lT/TBr8xA7d9vtAEodyDtxR7HMmp2b8JWPARHiYfIwx9yp0AaE224/XQO3g9+HZ8ued9s59dhKnh+5MWUQA+7z+y0iJ1QRWAHzBuKHszu9g/A88GP8Qt/p
9slc0OMj3DESGufc66vTj/7Vn4YGbuuK/SPaNy/27IhbS9H/QHynfiGh/wt78xPosNUN1UekLzwSdDakxyI8RPFUpBqsGlHOMo4Fdf38u3As0JP3mtHskN94uUTOfcjJjGK/It+/iJWau2584x0YrN/3X5DtTJz/3U3d5X7OUJn5gP+mYmSBUdldwSYqFfr9MIe7ET9eLf0yn2XPAzpxsw+yL0jOq6fFR4Oi3U4+OWG35451pCLEq1LfLfV+3MGT+6Fh51xSiT7hJtQvYK2ymf+k/nbERFJIY569Rj371Af3/CX6jmirnKPwl/hHi3zSPfL7NLNMtQyuinmgRMtmZZ+EaszLsexQ/9gts+816M/yd/blS8phugDHnQTtMrjiPmygxUSrX3pRH24rPgM0lja7lITin2dHT1pZ+JeNcXi3D5yDnYXe6NvNVnvDwVo2atBw4lQj2T3kio8oPs6eAotPVtF7ZPfiSuKy3H/8av8tTNLlXs4v5SndZogCTe9CYs32GuYA6yIO82zTcVKd4509pwlGqOld0W+BGn+FfgIn9q6GcukVOKXe/rKAHwQ+gxAEAn30Z+4pLAJKFWoleiBNaVHsITP60EPhqkuJ6mPSap628cvQAjp6/DRo8nxpKe1+pKN1+J5vpcywAao9pXvPiwz/Vxp5xS7HL+zz8FA33KlZFCJCbZbmrKes+d9j3/KC6mVHvsIYBYig9LfKh5IucGpcDhvUmLxJu2q0850iMXb+p0+7Qh7wq8G3xVdb0XTrUeaBqLnRNUXq5HiuW0Yn8++D5GhNQvZpEaSLVMvXyhH+/7B3moxJOBUbh26VWaDolL/NNRHEl2UYkTODxxCGa+CrzQGo2RGBcxYzgwGuMU/fZuRc/D6iy/evlH7+xpgn9/vSRlWYzTfkvlqonF7rGomHQTD/S6YpK3U3p6p3dhOI+aD/wi/0z4weF8/2zgsPZxeS74AbAbGIsx8WWiYvTEw23lDwCABqQsviidNabnOjnq0Qk4oVVdfHvbydnJZwNdyDqc488u5fIaDyYD++avLXgpL3mZ3Mf+WZAfY7gq8iiAhCtG6y2WUur5urgeU+wCWR50XYrd7UrtyH7E/PmqqzuRRFm8TK2TxOF1+0EAEOEnI1UgFF9E0qRO0Hv1/OWRNMY/h/+ruT4fFZrrAY+kEtDBwk3J5+wUvzCzWIhk5uKKXcOqp8q1kr1ETrli1M7FX+FbNH+n9PR2vUp7vhXID7eqrpbKr9ef2J5sBiYNw9pS/ib3hCciyiutSiQmnaR7kHX0bUQbkgOVs767GZj5WtrNVuVdb4Mw7uDR75enpKqWkhg8VcdLLjg1ckyxK5+Q+rKZhvKTKiq+9mQzXgy8Bx84d4f6CbL5V36nugknObWXKQwOK/F56H/ArNdRDwel3bgbQbFLz6k4w1SNE31/Y3L437jd/wOmhu61UDiTWfE9sFOS22bpV8Dvjzsnj8M0hXI214OV6hMNEz52dVeMJ4IG0pBTil2/nS1zxQhxzO8GX8OlgeloTba7W6lx6UPUpDHc/w1+nlHzfuGi98I4g6jY7/T/EF/0YnCkrp8+GByLYl/yLMSr/JOADS4twPL1dcC79vj8xy3eik9mb7ClL6PMDN8jWyJEc2lMQBcNHj84HEdKoKQzXH3v68STip2jFM9OWGXaJCK9qQhcgyyz3ZX+31M3MRAbEotf9Hz7W/Z6oA5slL8GTvAp59jPlKeDo4CPB5vSlpWUHrR2At09Xy7G4z+6O/OlTzaWcq6PL6IjGiS9fakzqsVfXOCfhXHhYfHfSOEoxYGKSLzEohfxpGLfdbASI6atw6MZTIXXyvwoumLChH+FoyDufmoLObcp4bNePRP8KGUTI+LHJNYMwE9U2nUw/cCbo0RdLp9F5IK7wGyO8vHjJqKL/PPQsynbiAaPOOu6jUKaa45S/N+IOej/yjSLJLUeTyp20f8dkdVx1LJR1+RdF//8RujtpHWn+xYjiCiaE95nR0GSbpx/dh5E2WEXJQgTFDvRyEltyGKnwuQNyRT7/W7ab5GZrwJ7kmPvjeyvl6lee6sXIZWExkNPfp3IrX6AfzCs2u698E8p3lTswl/5xU1AMTTwDVqRHaiPMvnPVOns25i0PUWyxTvg1ek4/20X5ZbQ4WPv4tuQcbOixR6QuKZcZxge2gP8PhwYfUHSYtFaqw58PGt92jKQ1RHxzVsrqIWjcpWn5GN320WfOd6KYxcPuPBHHs7+aJCfZNPXtwRjY30zalqaCIp3xVCs3HYAzeryseIb3JQ3RYdiN0JMSGguHXNw3wCqIE8kfdm7XGX4TysBABflKa/XCvfsQVahCdmDcdwpiusboAz48S6E0A9VcFHe8mglEAhrbuIDh0IcwfEVa/AXOihuIz8ySsdq4GvTjUrpGrxpsROC5wMjcX3p84rru/rWophklqUvhITLQSxNMfiNGbjsvdTBFcfRcMFkQ8JiT7TvWuPFYsEG+uahNsoxYtpabNidOw+RseGn8Hronfh3Ag4NwU/qmRe+HQvybgMWjsYg3zynREyl9G/g6YbA0q81NyMAXgq+h0fLnkAr2f1/IlmNECJJYcCA8hyPnQcSA9M/L92GsfM3G5fdITyp2EGB/wtMxcnlk1Q3uTnwS0ZNPhd8X9p8XG+scWMRCpWCxVk3q+CKcR/2eJdHhF7D28HX8eyE1bjyg7m29JkZ2g+26/0TdLVyl/8HzMu7A81QmlRK0FW5VHYKQRKTnwRi6te+DxTFhA9fLZDlof86/CTuCXyb6mNPs593jlmEB79dakBoZ/GmYreAHr41Sd/95VsdkkQHFin2aFyxJ9p3rcVug+IRB9OPRNyX2lZLIRFQDAt+qqudPn5eaTUm8gyZLmT/JmDmKxobyEfHkmmAA/EJSiKueoCZSNaKnRDSghAyhRCyihCyghAinzVgOs02jbO0/aGB79B6dA/VmW2OY7GPPckVk6MXfib4XKjlAiblmhcVnV/2luYKhbdnLTC8NrBJkuBst3J6aiC99V2GwhSLXX5qW5CdmqHRXsEMiz0K4H5KaUcAJwG4gxDSyYR2HeNUYWJDI7cWurZIsVOVm9yNUBAMfn2GLX25MdOfVhGRTKSNz8QkLlTs64U48qVfJZZpDKCq1TIW4eBTsNiT93tG+F6MDT1pQFh3kbVip5Rup5QuFD4fBLAKQLNs23USF1zS2lg0eCoivfR9VQeARZmlJLAHipU2xRq7T62b9/AV5y6kZDxNkyHTMYi6ykpnsXMgunzsx6kUaPESpvrYCSHFALoBcONok27ceCNLqYokXhW1khkZRZr2tum0B4BxtwPbFpvejzHsVzg+F1rsWkosCP1jMK5wxZT8ruJiUTjuzY5XbeaGwASENPadA8F1gYmyHlz6AMsS0xQ7IaQGgG8BDKWUpphShJCbCSELCCELSku1s+85jdtP9vTViVCuO/zmjzdI9z9weBcA4PDhcnw6Z6Pjkzc+n7sx/UYmEZ8I5wa9vj7Z7dSeqIfgtSI7VdfJcUWpuM8uBt7qnrpcPPBS0dK4IYskk
T1yOJD4gHi8C70yegxTFDshJAheqX9OKVXMJUspHUkp7U4p7V5UVGSwJ3crXNuQXNxmvTZe4p8WL3wsVeyi8f7BjPV47IflmLNur9LPLWfngQqc8fJUvPbbal6uI/vwfWiYLX27wmJflBzl8k1Y3Q9cl+hPXpUoPOFCH7sSNN0bKr8/AQXLnUrUXYxqV1LyXGJAGWZExRAAHwJYRSnVikXyDEVCmTW3XtyEJi7aU33mxNi+FByRaD9pEIqnrILvsyLqTOjfyOnrsK70EKQPd/FBZBXxogwu0OvpFZox5InfRFxV7Fl6/HUGDvTzLUpZFpOkExDfVNTu8Z6+VfrlcyFmWOy9AVwN4AxCyGLhn/vznnoYIrnJ84j5SbqS9RiV/O881upY5b10m2I/hpg3uJdwxShb7MUP/eKumbdpAgfEMxgRQnelSCNiEvmmpOc88Vk8HnVxAD2I95S8GVExMymlhFJ6LKW0q/BvvBnCMZTRyupoSvsqFzu/zmbKS4Hvb4U/JlZKsu4RI287XvjYDZpdoth/SlPjNBPULFfpHk9ds8u0/gwhEa3ssL40zZWSGrgiyYo9tfap0rX1RegZjA0/hX92HrQ8B76ZsJmnXmPk6ei18H5Lu2hDtqOnaKXEr3WHlNvvw4ElX6DL3t8EKaxT7Bf5lDN4btxzGE/9vNKyfnVhmSsmfVRMMOCUmki95j6eqf22IirsKoX8hlIfe6L2qVSxp/Z8tI8fpB7w6jT0eWGKHqFdgbeyOzKAbQuhktTPNJ4IfgIAKK4YE1/mdDSMkoVlNj18q1XXfThzPdoW1cAVPVta1r8mFil2TsXHnqTY/Q4pduFNKUZp3LFSWRWBnqSTSnH+Rix2ER+oK1NLqOEpi91p5cJwYham3EVg3TWQLhLikQwqdpmORde+uism8f3Bb5Zi+Vb99Q3MJsal+r7ToZRyIQbp4KmSxa5+jMUHxU2jF+jq32k8pdhnlexxWgRQSjFi2lrsKKsuJdnEwVMX+Jlh7UCmvG237DMACy12fROURD/7p3M2YuMe5wZT0824rUN42ZQUu3SPxIlM0mXJrhjZTFyh30kr9c8RcBJPKfaFm5zP3fLshNV4dsJq3PKpN57cuUJqAKb5uDW8FYD1rpiUXDHJxDigIhLDYz8sx6W21ShIfbD6iL7joPQAkLYWJLGUpVpvA16La/eUYrc76K4Z2Q0/YijEEdzi/wngOIyczg/eLNni3KuprVD1aAkn+rc2vtqlin3vOmD336Y2KZaC5GiqS0L6vRYOoQj7EZOch7IjztXB1ZsjpwFJvT+VHtxqb2XituJEJi8kxpPiqcFTu2+7N0NvYWBsPg7QQlwRmAysOQdS1XakKob8UGq8bC4i3tcHKiI46pHxGDXkRJzazugMYuP9W+tjd6lif6Ob6U3+lXcbbqq6T3WCknic54bvQD6pwsvUuYpK0tTReovADJHlhAHSXzsDffMV+ibg42lcem2o4C2L3YHB03P8c1FIBH965EjSuqqYt57iZrBi2wFEOYq3p1g76zOBfYOnrnbFWEA3X4lqrhjRfMknfMK5NnumIvhax6QSkpajMKCila44HQ2IdjbQmxSqriWiZ7x1r3tKsTt128XDpEpX4Tr/r47I8PfOg5j5j/2FP/aUC5MyhJuMEyIU/DZXn0jEHVuHvO2jfNtQAy4qYm4yUUl+8tSZp8nfT1//MvyHdqIh2Y/KKIfySmuqeCkiufGzcYmolctUyisj75pZ7Fbi0LGNF8Cd8TKGB0c7IsOZr07HVR/amw2ZgEPLCj62W1SsYuiZbbMxU3z89lrsN+isHepFYtSv4YpJRvTFi2zaY8cDj++zMpqQLRuLXYkrA3+gJO8a1EJypI+4t14qPiPFU4rdMYudyi9zmvTHMirLgd+HY/H6Hem3tYCzfakPEjGk2D6LPTnc0m7FThwuOMFx1vUfgV9SwDxZYbrVLWWVgu3uW4NjfBvi34nsumOumByEkx0m2y76ac8DM1/Fb5+9bE9/MqTZEzcIscucYEE3jW4BJg0DVlpbf1bETyPgb7PMjv3PsZ66t1V6VPH9OafkRky3rppPVKLYtSYoAYi/Of078BXsJ/MJSplymm+JSs/Wz3i2AqbYdZBaJ9Emqvi82kGHAm8u9CeKO+w7zPshOUrRnJTif1uHALNeB8ZeY6kM2/bzr/wXbH4et/p/svTYb6f1UpbdE/geG/KutLBXbRZtsG5cJQa/6gQluSLzRw4CAC7wzwbgXMZLv0UKtpAoJ/iKu2J0xs+7BU8pdqcyCjhmscf4CATq05EcwwLk+w3wir0e7Kk1CgBz1ycKe1zkn5Hxsc9k+zyXVadftnQhRm4407L2o5D62OWumGRsmYW77Jv4x8poDD8v3c7LIuk6QKzJ11KII4rLE+l9mWK3EGc0e0xFsVOr5eF4K5n4nVHsSuFhMY639JzAZ8AV075hDd3b9tEoWtLf9xcAYMqaXZhVYkN0UnkpyDdDLO0iCr/q4KBaCmNV9qwFdqknUdPFtzfEP45dsAWT15SKnccxe/BUJD/loc53qpYkTZGSP4D96iUL7cRTij1Ti31g5XOm9OuYK0aw2J1S7FLER1mvXWNVrRsrkCdp0vJ1Dotcm/T9IM2P//6ByM1p+8oj6hb7ByF+nGPIqPm48gMbopPG3Y4uksE8K0iy2GWuhhN8yTNd0956bx4PvKN/PEOV4bWBGS+jKpq466T3m1WDp2oGQ0bhjp9dBLzTyzyhssBTij1T1tDkFKtzuI5YzLXJuB25tWKbK4YTJoO4QLEDQA+yGufteANjw0850j8fda1+7CfFEgWRrwy8itMrX4m/xh+kBehW8Z5m+/nIrJDC/A178b/xFlXXqbI+nJADUQ137OdPLi0X0oj1Np15H4BSGo9Iklrpdit2IDUqZvNejXNTddBEqYyT04odAOZxHeKfX45ciiM082zmaj52K33+Y+ZuwpSV2/gvPj7zQ1PYP0FJSogo39wHKuyZjehL4xCQvlmt87fCbtSWxCMD+1BLs/3U13FtLn3vz3juIFP59CJgo3LRDzPhaOKIpos2qUHsz2Yq3mf5kjcpu+LJu/lKsCI8BLXI4ZR+T31hCn5d7kwIsl5yXrHLiRrYZbkrxg4e+X4ZqOCKiQopfQb6U3NZ2EUHsiUpSkbKnWNSCwdnxfLveJ/t8NrxKAwgfeihVO2nvsanP4dBiwbmMmbtH7Z083LoPdzs52djWhVtYgwKSpXdH1b52OXcE/g+KVJGLsuKbe5OAuixJGCZX3zyVywjA3/qg6fWIk4aWbY98/zX87gO6OFbY5osL4fU3RjBHYsB9DCtL3wzBFCIBPIRbR97kmInyROa3KS23ESBoLyUFab6URNdXPPW70VxgwI0NFMoSsFR5YFyp2aAyt9o3F7zx1sWu4GDKbfTjFnszoQ7iv00InvRgWzK6LeXVz1mhUiKdIpaUA+US3XvBBDTPPZKjho73rVyobKXksJMN2C4fGsZLhvxJ855Q+I2+vGulGR5RqAq/dvvY1deP27JVkvkMAtPKfbm9Qqy
+j0BNWSxq0XFWH1DixfTC8H3MTH8UEa/VYpBt4poSsqFLNrSyJjpTzN4Kl3jE3ZfPjXcCnJAryta7M1IqeK2k0P3YfvUj3DOm7xC33VQMui8cDSwNIPZqYoHT3TFpF4LTlns8qtn816Nh9ehPcCGmXxKEIfwlGKvXxjK6vcEqW4VADi38mnN36Uqdnte70/xr7C4B3Oo4kxQmuumAod2Y/LqXaqb+NKMdnBJHnW5Kyb5l70rXkeUmnP5czmg2a8NTEpZ9mbwTcVt2/h24PTVj5vTsdKx03DFWDVBKR0ZvaVv+hP4+Gxg71rrBEqDpxS7EdNoJtcl6fsi7qiUbZbRNng5conuNt8IvgXRqgD4WXLP/LLStugQt1ERpfjmry3GG+A4YPT5wCfnat4+6S12qY89+a/8V1tRhEFVJs1zcJFeV7q+jRI2HOKYyYNe/eApuWKcSsaVNo49Io0aEqvCOKdePaXYfTRzxfl69CJs4vhKP4RQfBAbjD/PHIeZp3yStN1W2kB3m/38ixBCND6Y++1fW/H+jPV4dZIJ5cv2rAWWfq24KqjzRlvFteDlrHwRp1dan0CMA8G/v1ZOoqQP4UbYtVIzHXC6OHalqBildSIltDk6V3yYkaRKJFnsa6cAFc5FTFTCvDkP6azUfFSgEfZqbpMWJWPt0C74qw4qW+wOKfYfwsPQXMU1BQD47dHE53h9WueKoXtLsXOZ5/Kg8GEbGiR9L6/TEfuKkqM4vuNOzVwgCmD1eFz8e2+EUYVRszbgYLZW+zu9gO9uVFz1SPALXU1cVsW/Jq+lzRQTW5lN1v58yc09YGx71c3SpRTQjopRvsnya9TJSFQlPpuzkU+ve2g38OkFwNfXZd2mEW6putfU9tL5s78IPYO5eXemrsgkQ9jSLxUXd988StFKlmYctZvTfRpDOhFQAAAgAElEQVRhvfs2JD7H6zgyxa4LfwaKvVPFRynLEhOLqIJlmNlJIKK9PukxhCMH0Izwk4fWlmYemphELLPZj0pU2RzFmn2cvz5fRkaKPf5Xu+25j/TT1bcWT/+yik+vGxVex0vNCzPNhEPIfPKdFumOXVefCT7kDcoTscKRMlsTb+nxoSttE41xeHHiakRi0nXMFZMRPoUQODUOSy5yqhC1URWLYS3XBO9HB6dtS83aoxQA4aNsROvGDQNpyQrOenm4bAchdR4zP2L6B09FH7vYhcpvzKoX8vyvWSbAMgGlwIBscLK4RChabmsOdKOBCr8s2463p6zFsq37Ewtd4Irx1AQlI64YNUp2lePeKuP+5/hF7+MVu5hnxA0xzaJiv+P0tmhVJwhYXN1NVKiTV+/EGUc3MtCCvmMWJlH8N/iZRivqrhjxJtvINcR62iRlu1wgZlKUj4g9ilX5+Idj5WjlcPUqPXCV/Bt62ZEo4pHUe4Q0E8xi14c/A4s9HdFYdhcNgTATVlDsP4YfQz/fX6bFNP838Knh34oi3HH6UTi7aytFt5SZiIp95TaDedozOGjH+dRzs0hbUcoMCACnVb2G6yL/0d2flAZIMyhq9OSvnw5UZJ/jPmpyOmXLLPZDu/l/gKofuujQ37gqYE9qBcNsW4QLJ5yIQb55yct3Luf/Mh+7PjLxsacjqhCjVkrUBhpTT5APXJIrBgD6+paYFvp2Y8C4mZ2oD8r/lbqlrBhMFQdPg36jl5M5B006iCveU9FgIQCgJvhkTucc2wTf3nayofYX5N0W/7xk836NLTOgfBfwybnAdzdl3ZTprhjDFnMahfZiW/6fBoWRLKNt7GAbP5jaR6WsHouK0Umm4Y4Th/bByKtPkLyMJyYWKc1wfCF0l2I7V/p/T5UFFDs3lwDbFyctc5uPXV50ehNtiOKKMfHvn0b7Z92fqFBCAYOXkwXHTHyo7a97DICE0m9etwAntKqbVds3+Mej5vupuccPVBqYPCOUP4zsWIlXJ/2dlSvPMxZ7zqCmuFlUTEZMajkUt1YN1b19h8Y1cWbnxlhE+UkbpTQR2lY7PzXe169ioShVEiKgmPbVKynLKiIuyRAI/rrKkxVMraTJ+/1Y9Pqs++GyVewmWuxjo6ehrM258XtqXbvrcUvVUIzn+PBWpXvtwsonMurnseBnaONLTdt68TuzMpYZHK88Dx3cjzf/WIOt+43nWakAPzO7vF4nw21IMexjz0iheWeMoz3ZghASxmVpOT+uxrtlFfaDKXZ9REkIv3KZZxF8OXoZBlY+h39o8/iy209PnaHny6Bg7X8CX2Jo4DvZUorrRs3HT0u2ZSyjmYi3o2j8dWmWyENuRSikeFGHfBT48x3sLTuIiPhGtGYCXxWnTCNpkkkWOwXwYPQWbBvwDiqjfP/5oSAmcj1AhUv9uOa1U363hGq7BdTo61uEgb5EKuVDlQZmalLeEKhDD+CxwKeYvXaPIVkAft4CAHCB7HIqiehV7LXgXE4UO7kq8AeeDiTGq16e9E/8c5Jij8exs8FTXRj1X3PwpVRTkluyAR9BIAML5YrA5JRl4qmdtHJnxjKaSY28IJ6/+Bjkh/h9lOpNM2cmioiDp0dtHQdMfBijX7w7MRN14Wj+r8RlJWfqGnOOV3xsgQAHjvCWVVHNcNI2g7o0Sfmd0Tj8j0MvYkToVUO/BQDsXQ/MT8x8Pc8/Gw9+o1531W70umKGB0fLlug8ntFKLxnsAIATfalhrf8KTEleEJ/HwCx2XVjlvp72QF/MeaQfbjy1OKt2RB++U352MalV09oF+L8TEw+yZMWeXSI1JeI+9hgf+lUTR/CjnreWQ7uB5d/hzjELDfW7kUvOAs5JBo1rF/APsA6Na+LBQXwVrRphtbcVs29Ane19NAiYNyL+1c64bT3oVewFGZYUjDPmMnhOs0tQnS+yW1DszBWjj3hulpiB6f/SdmT3T6v6hWhQI4yWdbKbuZdI55tVM4Y5pfJ1xTEIqTjfxPqY3m8QMWzIuwLF6xIpD3Rd0l9eCXwzBA2JsQiTx6PXJn2X3mif3tATb/yrG/KCflzRg3/ImTUZKT06L4DyZD+9eYOV5uyo3geN4ct93VSjv3SM1r6d6EJ0lkN00BXjqQlK8WyK1JjYXVvUQYMaYfTtUKTSQXY3VhPC+0eNVHoygx2oj1+5+jhadl9TStGx4iN8e/speCJcwEfKvG1evzWFupCFh/hiIBTQTOYVp4zPCGm0UPJm2hD3Vt2GV0PvCkv4PiMxDs3q5KNZnXwAQEzw4QUMh2Pag21F0nWSzRtEJMZhR1kFWshrKERl1r0HJ4j19S3B8libpKNzhl/J1ehxi50Q8hEhZBchZLkZ7akhhoIZveCa1c3HB9d2T/GvJzrILqKlj38ZmmAPxi/b4aroGAA4gjwgkI92jWqiTVENU9uWp2xoQUoV5wmoYVShqaV6OFiR/KCokccbAvcOUE8wZi7GbmizXDFm1T3R+waRch62zMPQrxbj1BemoDIquw+eNrWIniPoPk85MHj6MYBBJrWliqgrlK7bfTRZWf06NNVd06hmsqvF7yO4sFuzxAITfCgNyT4AfC1INcoro5i9dnfWfenF6mR
zRBYmOtC/AABwuEqfJW5UsfNJfFN3Sh67Hw74seG5s3H1Sa0M9ZM5/P5s2H0I28v0hy+aVR2I2OyKSTl/f32MX5ZuB5B4W0KkApg0zBS5RMqoOdE/maL7evW6j51SOh3INjGzjn4gWuypN8CE2IlJE2+KaoRTthEH0UTW/m8wXv2/rpIOsr+xxFOpNYA69MvFuOL9udh1oCJ5xcofs+4fSM1/8p+zOqBmXgDF9QtNaT+lP5ULvdOwidoPy3g+F2Nw8Cn2fWJxdhOQsoWCovihX9D3pano9Wxq9JTaJCS3DZ6Kha7ToXX+4rs6bwQw6/WU9QcqjL/Z3he5Lf1GFiA3ZDS2TPq2dMt+rN+dZfZXndj2rkAIuZkQsoAQsqC0VCNhvQb/PbsTTmpTT9dUZ6ly++bWXvjv2R3VXTAipih2MTWw+jartvMTnioisv4yqRUp4ZHIDXg8khhIbCXza55xdCMsGz4wHv4oZ0DlC4b6FdFjwazffRjFD/2CnfKHGQCjw29qYYpWJvbq5UufBVD+FrF6R2KC24Rl29H64fFYW6oU+22WYneP3zq+R/s3K66vtfJzw23vozWN/bB/ZhPS5CQyhqY5zjJXzHlvzcLpL03Nqm+92KbYKaUjKaXdKaXdi4pUBi/TkBf0o0Y4qKJI1A9y9+J6uPHUNuk74LL3i0tDHnceqMC0v9UfYmbpnzGxfvgkNjD+/aXLjsvo99KJWz/GemXcvx5Lc/IavpbpjH92A4f3AuumxdddpZCyQQ+Ukvis4mzo2Vp//pwvQs+k3abscHLqi0GvzcAGwVIbv5yPhFm+NTWhWBAxFEF/hNABmo9nIldgOVeM96Ln6P6d2Wg92A9XRjG7ZLej/uYUfNnGjLjfFeOpqBgeqjmo80TkarQnWzDAiL/cBItdhKPAxe/OxpZ9R7DhubOF9ikQi8TdND4L4u9+vusUjXjt9DwZuQbn+f/M6Dd6LPbdByWW+pjLgC3zcSRYF/lQmOChEw4+bKCpE4700KBGOF4c/atbegHD+eVltAC1hSgfoxxWGDjfc6gSxQ14V1hHshFH/fNXyjY+QjE/73bd/VAQvB87B+/HnFPqgPb5H/rVYsxeuwcrTorBbEdgWotZDV92OXX0+9hZuGNGaJ3OUbGzAAD9HFbslFJs2ccPnMU4yg/o/fEEMPNVBIQSd1bEVXdpljplPhOiBl7iNHdDsFpKdvGuhyNVUdCdK0EAVEUiyM9cxDjZOC4W/Fc5+VkFQqiN7BS7EvEBbAATwg8Dxuo6JLepcuTFKCA38PfOgwCAqFmhOhIMK/YsFa74hpqu/zcm/4NbB52Iz+dutGx8Sw2zwh2/APAngA6EkC2EkBvMaFcNPWFYQZ+BXTPRxy6N9ntvmlBCbMEoAEAe5QeldMV6W4U/DJySWiMzZiBDYCZRLY+NW4HKCB8tY8XNbpiLPsCa4NGWeaetGBZVUyyOXlcy4vaVBdar4WOapSyNsE/Xdh/M3IDxy7bjiZ9WYsjH89P/wETMior5F6W0CaU0SCltTinNvvS7al/ph4bG3NgTdQsNTJ3PMo4dSK6rGhBM8hcnJtfAzLLGhzk8tgvoPxz/PrM9Pr2hB1B0NACjFrvGDim8OYlRIdnWSjU1iuTYSxG57jfLJglZEXLqhssIAJoT9dDdlnQbwqgybl1rYEWberg0MJ3vP41hQuHLokZBdrjnnS0D0hW5PfmoBsYaNiGOvbVvB27wTQCiHyLgJ7KJOgmlL+3upYlrUBD243YDd/0xFR9kJe+dZ7TjP7SZhXaP/mJIsen5jfJNmN2NabYS7tKsNkpNapOAQyEqUA7rYq2dUmxyWpBdisvzUYHvubvxc7AnQI43vV+nXDEA0JZoZCsVoADygs4odhcNVeuDQtlSCxMTqiuZ4Ip5OjAKZ/nno87uearuINFSFQdR35pSghd+NVbZ/qBZisMfQAQBQ8UatCxnpfQKZilk/fHE9nOX/wcsz7sRdcGHOooPc3eoYnNRO59hIXd5b98Kiyx2g2Q5eAoAf4QfSNs/BYmnj7Ybzyl2IFmRiIUjSrjmapvrp9BYGKYU8W2CUiAoKTxBKY1fiaJCf2dqiSz1gPO3vRH3iJaiVnoJMmsv91NzUyMAxmRrRXagKynBv4Nj48vO9s8BABQRPqyRgq/aVW6kypIKU7huAIC3ruiGekZcjyahdv6lFctcFe5okywciGOpRTzpipEq9jW0OW6rHIod4GORzz2uqfGGO18IfDPEFNl80SNoQSLYCz5KhR8bSHbFfDZnE2rlmZ8fPTsyV21dfSWq66S3fEPsw7FJxaiN2VzruMYYUPUiujSvh9PaFwGzDTWjiJG3iWnh+1KWJerOJtxut3++EL+v2glkl0QUAHBJ5TAsEWL4j2teBwsfG4Dih37JvmEDqF0x4vJ6pBxY+Jbp/VKjdqlJij3dWwgFSZ2EaBOeU+yU0iQfewx+bAVvaa96cpDq7EpdmDiyddK8uzAOQDHGKKxNKI/DVe5KFmaEC/zqmlVqsX8degKtfLsMZ+cUIaCIwQ+/j+DoJrXS/yDDtrOlGdkDCJk+E7MUKX4zsQDLAnp0/LN42XZrWQd1CxQs97w6QIVJxbcVSGexW4W09Xei5+H2gEpKjk4XACt/kAhmj8VOQRDjmCtGFxTA45HrUJHfGAAQkfiE3RDlpXYxUyCu5YiCf6I2yjO++bZQg4PEtkARRhUCJb8CABqQMrTy8YNs2d7w0tNsdu5780tu2DcO8P3tvdGhscI0+2A+MLwMaNXbkn6V9vGhwBiMCmWXqiIdUotZ8ygff3Xy97A5xoAeiz1mtOxblnjOYgeArSjC0lPfRY/fLsQGrnF8uRvid9UKYksTP1HKAaB4PDAa0UOXACjEkrybgfWZ9dWn8jXjgloMAcUYyfT754KJ6J0QiQnbGG9bietOLs66epXZilhsb8Y/9mXzBAA07cY/9bYvBmJCigN5LnSTUDqPtwZ+tqQvKVLFeohmMNWtRiMLpEmFghgu55ktnlTsAFBerzNuqRqKaVwiL4oL9LoqkmEkEEoRQAxDAhPBrfkdz+BTQ21yJr9wDTunE1oXFQJfpN82HX5wOMH3T/oNDSAdY5FG3Qw/r3PWbZuv2HnenbrW1HaT+lC68G+eCiz5Cvj+ZoCzVrHbFVF/gBagliTdg9hrCdcUvds1BDbobMjGwVOnymR6zxUTn5pNMJHrgQok0vO6WK/zcgvC30S+R1CoGpQuJj8TZj90Rla/v/6U1ji9gzmFEKxMQSuGOVIYL3Cuhl+mJA9kYgkq4mBIpl8YmBeT20X154bPBLvcTSdUvqclhH4ysADvrrozg4aToQ4qds9a7ErICywY4tqfUbltOcKTHsq+LRWu8f2KUj8fqmfmTdG0TrZKyDz07JfbSsEBQEHIB0imRGT7VpS6jzbuszyLoWUWuz1EZOpKPDfK5VY0yECxL6HqWWFfC72j+VsOBP8bv1p3X2biOYtdygsXH4vLuvPx68c2r21OHu7WpwI9b86+HRm8yyBxU+stYpCOD6/tjptObW1KW2ZiXm
HmVKQhhGpFK4wivyFiWSt2efvmK3bVq16ciEOEvyYmuUvqxuR9+iiqrxib/l5lRyiDtL3ZTazyeAUlRyDAZSe2wDMXHoNBnRvjhUuONa1pQwnEVOhBVqEZSvHlvM2ISZRQtqf85io+gVedgiAePbtTIjWwS9Bzs2c7G7G80lgR7EwwkhRNyrG+dQgiihpCxkizyt9JCQVUrlfRlyxez1d+A/S+B3h4i6n9m/3mFdF5zI2nFNB/Tt33TqkPz7li5Ac66PfhvatPMLUPM/Okjw0/BY4StPnxc1wWjiFfaHqIf0JW7f7GnWiCdNahR4EZVQji70p2lZse7ii/wrK12J8JfoTBvrno7V+B5Vwxrq36T1btiQw+pjHGL9uBt684Hg0UykACSCgwUcE36gQMeBKoVKre5B6MTDxqWFPlGAgtJmGbxe4cnrXYvXS4xVJ+UkUWJtZbm4YxwaKz0n+eFMdudj/Ck0LMcpltBkoA6O3nk6938W3AX3nm1Ol8+dKuGH/3qTj7WI1CI6KlLo8CMTl8TC3E1yhax/yn2Enxz2Jd4xb1CnBUkUa+c59sdrcJuWLcjmcVu9foSVaprquZRWEHSwbdwwZrSUqwwuUgIn1oFITMfukU23a36ZAf8qNT0zQTbUSFnuJ6cPe+aSn27bR+/PPIa7oDAEJaqXH7DQNa90leloFiD8CbM8M9p9jNHiyzixa+XcgjEcV1n4aetVka69FzQ9QnBw21LVXsZ3YyebKJLFe8WyJ37qy6Cw9EMhzUl7ti4su9q9jzkQg60FUC8tT7U/c3Ax87U+w2Y2Uleis43bdIdV1Xn/HJK+5QO6mYkkZZBenAbPw6CJgV6ikqdl9KX07yM9cLX8f6YjHXFp+Er9D3I5+KYne9xa6ulvIl11WyCpCcp1CaN84MfOwhmO8ynb9hr+ltyvGsYvcaZ/vnZfX7VVxLkySxhzwov52YQYoV/cg24MF1yhtnilgExWUWu8gFVU9hTN6/9G0sj4qJL3e5Yqfqaum72CkKS2XnKJSmvmgGir2U1tG9rV7sSPzHFLtHUCtZ51bPVBjWWexiAYc+7YX8+aFCIGRWpSK5Yvcwaq6YDBSbE2i5YmZzXeKfiXS7TBKcZeBjL0UddKr4SH/bOrDjmvKcYpdWe69ORFUiU60ac4jcOBW3Vg01/PuwhRZ7Hqrw+32nYfT1Pcxv/OxXgPy6WAd+4lu24Y6OojZ46vKoEEOzfVueBNwyXd+2Ge7/YTMS6Euw44XJs1ety98mTUfvpA2zIE2Ow6+cccUZVhkoNoMgiVl3/o+/GvjPBtztewQ3Vt2PA9S6mqV6uakqtZCHLtTCHV2O7hBT+UUQH2dJY+xkMHgqcmu9DzP+jWr3Npil3jrjAF8xB0Dzus7fcHYyJdYt/lkaHWGVJybbvDtmWOxvRi9QXH6A5lueovnuc0/C75y5E9+M8geXXAha966ruWJcjvEZpcLv0r3FGnBF7fBrzBfIEGaxK3Djqa0x/9H+aN0gzQBJjrGStkJxxRhc3vRXfB3ra3l/YrTJZ9F+hn6vVrk+E16OXqa4/PXoxZbbPBd0a4Zzjm1i6eDpt7FTdW1neJJUfPDU3a4Xw9QQMpEef62wQOdx8vmBOxdYIpIemGJXgBCCIs3pw7lJIq46+ao4Wqlijkn8fNcpGDBI2WpOR1OhNJzZ9Kp4Ex/GBttyc7x06XGWPkCqdJcIzFIKvRb7oOez68du8mrzlaF6381/j18UOlwxDdoB10lqxHa+SPMnFEA5NcfXzlwxjDji66k0cu23e/ugjlKNS5Po0qw2GjU0NgHIqpmnag84K8gL+hH0W9dPpi6HL246Kf1GSR0IYXV6FPvwMuCkWzNr30bmc+3Tb6THFdOqd+ImKpaETl46Km3zZuWNYRY7I454qUp9y7bUU2x3pqGfFcKaog7xMESbBs+t7CZTRVErP0PfsJimNwdcMcMiQ3RspeN4DhmftSzZwsIdGXGUQsBsUeyEYDGnXmxAjSsCUywQxv5se3lB624Ro75z3bOuOUGxe2zwVImWDXS4HLVcMQX1U5c5hB2z5r1/xqsZ/TsmXCN2TU66vOox9Kx4y57O0sDZbLFbOZ5jeYy8aLGnC+9LNwXfZpQu6/sHddLxS42L4r5VwKM7jYpk6s1mYlZw9T6s74JhBqKl2rpBIY5tXhsAkgp3WEkFwtgF86dWGyHhirFHs1uZK8bytw/xGAXTDPr9S6N6eYMO5smTBYTTMQ1ffDNRui8C4fTHwSaYj50Rh1L+agj4CYb2bwdCgLZaOahzlMTgqffJVLFn/Bxv1h045T7gwhHa22mlYzDZjVPCNTX0Ox+no5Sk3qiYDPjhjkSqAvNaZa4Y57htNnCxebPNskW8qAI+H844uhHWP3s2auYFNX9jbv/uUKV2u2JEKpHsknkmcgXOqXw6qzbNKOKhic8H9H8cqJVOmWrI4ZIp3lwoTe55AFYrTLNaZxa7kzTqDBxzidNSxIkKKQUCFobfqZEf9MNuG/mXmFo6A16OeoXWhXkmIZjJ2/zNkhaXoRCraXYZNw3lRHGCepkPnqth5GF2buXTiNXVUbDdQo2pZa2XZZh2wo47yd1p3hhxIsKpMrPQtl7+uP80bN57GBhte9cpUBD8+8z2CAfsCuHjb+mYzCVBQbJWzJn+vq7wMDuhlc3jHdf9ArzS0ZSm9Lz5Sbc5v/JJLKNt4vMWmtXJByrUfimJY791FuDP/o22bgHfRuemtUFLlbcppXVQm+ivgmZ1OgyAWez6OU5nDmyLEC32bHO4GKFpnXz0bJMIF9tO69kug4jtWYp78jVK9/gapKzK1pWSscu8Tj4m3HMqhp3TOat+M8bBcMkl9CgAACe8ORWENB7ocTkp0LgLUJT9wG+r+oUYd0dvDD9PPSonUzelHa4YZrF7BDG7o5UzId2E2l7a7uvveTPQ82YcejZ5yjkvR7aKXfv3V1U9jJAsmVrHJnp8zWZCgBrmlR80ev7KK/lKRjXzAsABlY0s0pjHteDfkNSGbzN9wLOUAm7g0k+Ay5z3QSR87M6fMp+Fhar1cOHxzW3v0wl/+AbaGJNlmR0tQUkhnvaf5PXD9mJzuF180Xj/GYa6MvrGVRXlr7mGNbVCFkVXjMFODJKxYmeDpy6g8wVAp/OdliJeaKOehblh9OLkOwMF4f2sNpMfTj7uYvipldhWHatWs9RlHc9L/u7zg5NOdDKsnYz9rleb+rinXzs8d/ExGk07dWW67y3aFMVOCBlECFlDCCkhhDxkRpuMZNo1rouFjw1A7QL7QhzVeCl6qWN9F+bbr9QBoEfboqTvgUD2t046vW3bW4KY/hYAOgwWPihJRyWfjCkzo7/z+QjuHdBeO+md1MduIznpYyeE+AG8DWAAgC0A5hNCfqSUrsy2bUaCqC9oX4hfGqwo8JuODhUfg4MPT17UJf3GFhAMJD9QQ/7so3LS3d+Wx7krceknQFU5cGAr/12ihSgnVezGHjp6VG7WOfCNvuo06w5s1crTbs758EpUTA8AJZTSdZTSKgBfAnDed2EhOx1Qb
OVwT8UoK4tPqFGJECII4F89sosdN0w4OZ8KsSE6KeJEbEMgBBTUS+SXCSVmN+dLEqLRNMrpf5HUKLJ+lS+qWrezacLFYjiNQ7YKU5qfXRFluTKV1is+9mYANku+bxGW5RYtEhNmbqj6t23dTuW6obhijO01T7VwQrE7Tl7tpK9lR6JZN5nuOFbZodgfXK+8vGFH4Iz/Apd+nFhUU/rGqK2dlNxIa2kznRa7UbLUmAZzyWTsivFIVIySlCnnjxByMyFkASFkQWmpSqS/mzkhkQ+6DPblaPERPhqAczYQJQmpRbWGsz9CxRGadkv6akbYZSiNn94Wi71AZU4CIUCfB5LSEUgfRLE0hoba8dFz3CpgcBzJTFP47FeAwS8lLaoRTpXrhcj/5azFvgVAC8n35gC2yTeilI6klHanlHYvKiqSr3Y/krNhp49Z7JWzLUQiPXZY7K57K2hzGnD7HMzj+EkvQRNmvhbWT32xHS9JpeCIK0YnO/2NNddbnpJYCTMHT0+8AehxU9Ii0TcelezbDO4YAxa79Zhx9OcDaEcIaU0ICQG4HMCPJrTrWioQRnHFGFv6EmPGXaTXHQnumv7A6fh1qL7iz5bRsGPcxZDNDOCfYz1xVdXDmFN7cMo6qQvDdblkJNfgQZ92Dne1gV89SjBrV4zFN4v0vBjpyROFNiilUQB3ApgIYBWAsZTSFdm2y+ARLVc3WeyTuW7pN8oSuQJoWb8ARze2e9ZlKi3q8uGW2dycMfgxkzsGkUAhMLwMle3Ojq9zJBJGN/qjYtQeSqZExQxdBty1UEdL1vCh7+L4ZyMuOa+4YkApHU8pbU8pbUspfcaMNqsLP8WUCxRzNDnvuJsUu9RFYNUUf9e5YgQqarcFAGwLFhtuQ9wzv8Id7mrFTqWKPfPBUx49Fnuac1+nJVC/rcYG1l47P5D+iMUnqBFXumLc68SrJhyhyqXX4sWrBVeMHeVNGelZfuwjeOifjmjSoSsebloLMFDaVVQESu4cR3zTupFchGmyjGbnisky3NFiIygU8CMSDcCPCCiMhDt6wBXDyA61G0C0eHwudMVMf+B0y/twS2EPOf5QHubToxHlONxympbVqI64b+JgHCFSn60P/SpfxJCqB+LLXr+8K+Y+0i8Lqc2HpomKicGHvpUvG2rbeDlCA9dM7QzmRQjN+30k/tbK36eJfpdzxXqbsbr8JoYAABP4SURBVBSm2DPhniXAHfNMbVI9LIynTQN+YpKL9Dpa1i/gZyhaiGi13dJ4LHD/35b2lQl92hehc9NaGNq/veE24q4Y8e6T5A3nKMFa2gxTJOMYXZrVRqNaLqjXKXXFpLE6KQg20CYpy/W4mrJ3w2Xw+9v/BB5Ym1GzVOJ+kfc0nTs2bTOe8bFXG+oWA0UdUFzfvFmgP8RO0VxfI8SfophbfDFBYd/9/GQVq33hB/21gZrmpY3Nllp5Qfxy96lo30g7KkSbZFcM9SU8olJXzPldm2Lmf05H26IaWfRlJlLFnsYVo5IkTc2QiUreAAzrvYDg1jz5Lv2/CdcAClNz7WtRr0Y4fiSozMf+WvRi5R/ZDPOxG+C3e0/jXSNZDhOPjJ6NBVTZ8uOjDmIQZ3EP7d9OcTtbuWIsUHQ0/9mm4gsuKblpKqIiEH2t1BdMWQfwrprmdd2TSiL5tVH7xGQ6VpCs2JNn4+UFdbbl8wPDyzLqNyOEXX7xkmNB3xEt9mTTJgo/HozchEL1Mk+2wBS7AdLNGMwE+avpH7Fu6OdfJBk8pdjw3NmpP3SC9gMTnwXFbrUv3E0uKLMQ9ykeFaNisbsP/Ra7Wjik2umU7neVZObp0uFn2pI0KxMa1MjDXhVXDACMjWmPQdlxTbv5KqoWyBXjvhptk5dTF+USkGKxxb6IO8rS9p1EHhUjtdilD3p3qTNkGO6YoSuG8g+39VwjjI6dGV9eKy+IGmG32Z9SH3vm4Y52wBS74yRfFF2OPQGA5MZwrWK37mLuV/ki3o+55C3FAuRRMVKL3XWzTdVI82BXe/NQU4Jikrs3ohclWexuRbofyZ/dgUeuotxEfhFMjnXFkcIWso2qn2LnswDm7qUpj4qRWuydm0mySLrNECyU5Hgi2uGOmU60ipd+JLGMxbIdQpIGTzPFDuWfu3ePHTyyPaufyy+JKgQRCfE39loqZNVzq5NZsNjEKfZW0dzi9p0gbrELrphgKJEOt36NxP7akd41Iy5LhLhedEILjQ31Z3dcetIrwK2z4qUfg/CAYgeQFxTj2JMHT93ilmGKPRtC5kQsjGn1FAD+Iqld3A1XVz2E+yK38SsbHm1KH6ZD9CXDOrfyacNdjLj6BDxxnjMVk6xEnlIg2O2K+DpO4uLQHQ1iFzUaAvX56KyOTbUznKpNMpLXit3Z8hygcZeExY7s89xbR0L2mnn8W5Yhi90GY81lV071Qjy9+0J8CtSTTumPDo1rYgZ3LEpoc+Dan4Bz33BOQC1EBZTGJRN/8zDAwM6NkR9yT4ER85ClFGhwFNDvcQDJPvaHznLpQx1I64rxQ9mFKFdp4tUjKnZ3W+yp4Z439WkjU+7usNjdNtxcLdla0BG4bTbqFXVMXtG6jzMC6SFuWZobz1wdSBk8BYBjLgFmv4n1LS4Glu/HRcc3i1uFriTN4KlPRbHLEVXlu/Ri1IwdwJex0xH0u0M5aiKcu6t6FmPLqgLggP6f2uFcZYrdLTTqHP8475F+iLplpqkaYhy7X7vAdjrFvovWQUOyP2lZ1xZ1bHlddYr4HAWpYq/TEvjPehyYvxnAfutjtwvqA4f3ZP47US6Dil3NdbEfNXFv5A78OvRUFNUIAy8pbuYwEtklb6zN6+RnpNjtgCl2FyDXYQ3dkBckLWKoXvpkUFoo1XL94Y7ehqXyAgmLPXVdTLgYlFL6msqdC4BKA9pIvFjTKHa9rhj5A7xlvQIUhLyglhJpezONEMsPWu9eZO/J2XLUAKclcAbhxiZpwjFzOWzRKKJin7RqZ8o6MYunL4sKTbooqMfnPjJKOsVOlK+L4dFrMSPWBX0rX8bJFS4dP9IDMeZX79qiDprWsT7Si9112XLVNyn5KaR1K2fGOst/kRuYVF/SrQU1rETc4wNHIinrjhISfh3f0r66uhkRd8VoKzO1qJgS2hxXRx7BBtoE25BZ8i13Idn/2sm1a/t3VE9ad8pR9uwzU+xm0ffh+MflXOv454CK5QIAM+IpPj2o3MRc4jnsC9fFoOdSFh1XMVLzJ2opXwGgZ5v6mPHg6bjkhOZmSGchqYr9jqq745/1Dp6KvHLZcejQqCbyTCgSbgvx3afA2a/EF/dqU18z+2v9GtpjUmbBFLtZ9H0o/vEwElWR8lCluPkW2gAzuWMsF8syRFcMOHSveNeUJrtWjMCxFe+b0pZtnHQbcGJyNfsyaKfZjSt2lWdii3oFtlTZMQZJ+iNlA01Yqmo+djXOOqYJJt7bx3oXlGlIUn6EE+c7FPBpvsxc06vYWrEEmGK3gH00caJLaW3FbS6tfDz+2ZNGr+iKoRS7obyPamyl9RPNSJbvR00cQKEJwrkbcR+p
F9/UREwOd/QEUo2tUoYv3YS9dOvNgil2C9iPRBGGw8jDJq4oaf02Wg/bUV/+M28hDLwFT7sPDw7qoLnptFhyVRm5NXdD1f24pHKYqeK5hRIudYLWeo6fkObJB3ocbQU1M3yaTXLYiPSEqTzYBnVu7Ir6xEyxW8AcLjHRKIwIwiR1kEzKTX3aWC2S+eTVAoaXgRxzCW7vq5xi9/XohQBSE0LJB9b+4E7AAuriWZZZEFUI59xEGzogiclo+BuWc8Xw1/DywKgelLOvXnZiC1c8sJlit4BKJAZIwqhS9bOLuKf0mXkMi1yLV6OXAEiNZZdXyMlllBT7XuGN7vXLu6Ws8wwKKQXqFyau+1i22i2cmXvPFroKOX2C+aquGMAdLjam2C1mOW2NMLQt9lyDEh9GxwYC8eiP5MtM6orZRV0a1mcSUYVbrIryqQLaFnl4PEHBYq8nifjgsvVH3L0IuGthdm0YodedSfnxkzjzGeDhrbxij7+FKih25/U6U+xW81r0YjwWHeK0GI6i5oop4ZrihqoHnBDJNsbFUmfRRoQJ33YNpJlKXKGnyv7QoIQ7TW9KDFUlWFgfqN82Q+FMYOAzwDCVVAs+XyIC5v8+BbpdBdRPdUO6IR2GF+buepoY/Pg61hfzCk4DLd+F6eF7UcI1S//DHCLVFcNf+N/FTo1H1BSG/Pjhjt4I+nPL1hgVG4THg58mLasUbruAz8P7KrXYT7odaHsGGhcmUmFkbbG7nYYdgfPfVlzlhj338JXlbl6IXAYM+TX+vZwLYxNthCurHsaJD/zkngLVlpBszc3nOsjW8pd+ozqJiRwUQLtGNVHcwIvuicStPCo6ULYu1bL1tMUuIrVKBz0LtBsA6XHQ72N3gxo0F/muD+3fDv07NsJnN/S0TQam2C3indgFQKte8e+1C3i/6skDLkF+rbpOiWUPEmvuhFZ1MSo2KGn1qNggoMVJiB57pd2SWcvglzDk6bFpNxNrenpTseuTOZbrFrsGnEyz3973KHxwbXec0s6+SCHmirGJRwd3RJ/2RQh48mbWwflv86lgv7g8qaTbZzf0xP4jVcCriU130nrADR+jx5YyYMouAO4YcDJMm9OB+R8ATZWjXDZxRWhZ3A7YNBsA754DlLM7egftE6ZfsXv6ICgi7vp1Jxfj6l6tEArYbz8zi90m+nVshKDflzRV/PXLu2LUkBMdlMpEul0FtD0jZXF+yI8mtZWz2R3TvDb+fvosAMCATuqJk1xPx3OAh7cAzbsrru5T9Tpw/QQglAhrvaBrUxenDcie5nX1lo308hNdDX6f2jWq4VgoM7PYHeT8rjk6iJpGYUkzOoYCPsx5uB/qFdqTHMkywjXTb3PrTNzyEl8Q+jUvx7ADAAgw4CnFKl9tiwoxqEtjLNtapvC73Ceest7BtxFmsVtAwEcQduD1y6s0rp3nyOuq7dRrjYlcjryhgQK97waadk0sCvID3/lFrXHDKa1TfnFUw4T1OrAz/4bmaRecCuI+OelqYxa7BSx/Qh4ZUU2I36XaV/QSzoMpFBg8Wm9jRe2By0YDbU5HXtCPprXzgMrE6h/v7I1OwyZaL6PDiIOnTnraqoGZZD95QT/ybCh/5TrExEiNOqlu0rHiIyynua3YJw51cRFyq+l0Pp9HSAGlkne5OMzQpRk/N6NVfedCd5nFbjHPXXQMOjZRvtBzjkAIuOZHoPExwJN/4qJuqWMIR+CFeq7Z0aFxTWDgs3wUzCKnpbEIHT4UCuCbWB9c4p+eTTOe45perdCzTT0c3di5+54pdjMZ8FRS0n0AuLxHS4eEcYg2fLrWlU8ORFheDSdgfa1H19Drdv7fol+clsRk9JvY71/THR/PfhqXrEh9g+nboSEmrtiJtg1zLwEeIcRRpQ4wxW4uve9Ov001IeW1e8ivQJ2WwLOLnRHIIZ46vzMmr97ltBiO0KVZbbx06XHAitR1l5/YAoM6N0Zdr0dDuZSsfOyEkEsJISsIIRwhRDmIl8EA+Fm4tXM0vFODq3sVY9SQHuk3rCbUygugWZ18EEKYUreQbC325QAuAjDCBFkYDEaOs3jYmU6LUC3ISrFTSlcByOkZdAyGmQR8BEN6FzsthmN4p1i1t2E+doatPHVBF7TPwQEzvZT8b7DTIphEDoaz5BBpFTsh5HcAjRVWPUopHae3I0LIzQBuBoCWLatZpAgjztUntXJaBEY2FPcGdq0A8us5LQlDg7SKnVLa34yOKKUjAYwEgO7du7PHPYPhRQb+Dzjxpmo5EO4l2MxTBoOhH3+QTx3AcDXZhjteSAjZAqAXgF8IIbmfCILBYDBcTrZRMd8D+N4kWRgMBsObFDQADu92Woo4LCqGwWAwsuW2WcD+TU5LEYcpdgaDwciWmo35fy6BDZ4yGAxGjsEUO4PBYOQYTLEzGAxGjsEUO4PBYOQYTLEzGAxGjsEUO4PBYOQYLNyRwWBYy8UfAgUsaZidMMXOYDCs5ZhLnJag2sFcMQwGg5FjMMXOYDAYOQZT7AwGg5FjMMXOYDAYOQZT7AwGg5FjMMXOYDAYOQZT7AwGg5FjMMXOYDAYOQahlNrfKSGlADYa/HkDAO6pQaWM22V0u3wAk9EM3C4f4H4Z3SZfK0ppUbqNHFHs2UAIWUAp7e60HFq4XUa3ywcwGc3A7fIB7pfR7fKpwVwxDAaDkWMwxc5gMBg5hhcV+0inBdCB22V0u3wAk9EM3C4f4H4Z3S6fIp7zsTMYDAZDGy9a7AwGg8HQwFOKnRAyiBCyhhBSQgh5yCEZWhBCphBCVhFCVhBC7hGW1yOETCKE/CP8rSssJ4SQNwSZlxJCjrdJTj8hZBEh5Gfhe2tCyFxBvq8IISFheVj4XiKsL7ZJvjqEkG8IIauFY9nLhcfwXuEcLyeEfEEIyXP6OBJCPiKE7CKELJcsy/i4EUKuFbb/hxByrcXyvSic56WEkO8JIXUk6x4W5FtDCBkoWW7Zva4ko2TdvwkhlBDSQPhu+zE0BUqpJ/4B8ANYC6ANgBCAJQA6OSBHEwDHC59rAvgbQCcALwB4SFj+EIDnhc+DAUwAQACcBGCuTXLeB2AMgJ+F72MBXC58fg/AbcLn2wG8J3y+HMBXNsn3CYAbhc8hAHXcdAwBNAOwHkC+5Phd5/RxBNAHwPEAlkuWZXTcANQDsE74W1f4XNdC+c4EEBA+Py+Rr5NwH4cBtBbub7/V97qSjMLyFgAmgp9j08CpY2jKPjotQAYnoxeAiZLvDwN42AVyjQMwAMAaAE2EZU0ArBE+jwDwL8n28e0slKk5gD8AnAHgZ+Gi3C25ueLHUriQewmfA8J2xGL5aglKk8iWu+kYNgOwWbhxA8JxHOiG4wigWKY4MzpuAP4FYIRkedJ2ZssnW3chgM+Fz0n3sHgM7bjXlWQE8A2A4wBsQEKxO3IMs/3nJVeMeKOJbBGWOYbwut0NwFwAjSil2wFA+NtQ2MwJuV8D8CAATvheH8B+SmlUQYa4fML6MmF7K2kDoBTAKMFd9AEhpBAuOoaU0q0AXgKwCcB28MflL7jrOIpketycvJeuB28
BQ0MO2+UjhJwHYCuldIlslWtkzAQvKXaisMyxkB5CSA0A3wIYSik9oLWpwjLL5CaEnANgF6X0L50yOHFcA+Bfhd+llHYDcAi8C0EN22UU/NTng3cRNAVQCOAsDTlcdX0KqMnkiKyEkEcBRAF8Li5SkcPue6YAwKMAhimtVpHFjec7jpcU+xbwPjCR5gC2OSEIISQIXql/Tin9Tli8kxDSRFjfBMAuYbndcvcGcB4hZAOAL8G7Y14DUIcQIhYvl8oQl09YXxvAXgvlE/vcQimdK3z/Bryid8sxBID+ANZTSksppREA3wE4Ge46jiKZHjfbj6cwuHgOgCup4LtwkXxtwT/Alwj3TXMACwkhjV0kY0Z4SbHPB9BOiEoIgR+g+tFuIQghBMCHAFZRSl+RrPoRgDgyfi1437u4/BphdP0kAGXia7MVUEofppQ2p5QWgz9GkymlVwKYAkAsFy+XT5T7EmF7Sy0PSukOAJsJIR2ERf0ArIRLjqHAJgAnEUIKhHMuyuia4ygh0+M2EcCZhJC6wpvJmcIySyCEDALwHwDnUUoPy+S+XIgoag2gHYB5sPlep5Quo5Q2pJQWC/fNFvABEjvgkmOYMU47+TMc8BgMPgplLYBHHZLhFPCvXEsBLBb+DQbvT/0DwD/C33rC9gTA24LMywB0t1HWvkhExbQBf9OUAPgaQFhYnid8LxHWt7FJtq4AFgjH8QfwkQWuOoYAngCwGsByAJ+Cj95w9DgC+AK8zz8CXgHdYOS4gfd1lwj/hlgsXwl4f7R4v7wn2f5RQb41AM6SLLfsXleSUbZ+AxKDp7YfQzP+sZmnDAaDkWN4yRXDYDAYDB0wxc5gMBg5BlPsDAaDkWMwxc5gMBg5BlPsDAaDkWMwxc5gMBg5BlPsDAaDkWMwxc5gMBg5xv8DBJM4cqomQ18AAAAASUVORK5CYII=\n", 434 | "text/plain": [ 435 | "
" 436 | ] 437 | }, 438 | "metadata": { 439 | "needs_background": "light" 440 | }, 441 | "output_type": "display_data" 442 | } 443 | ], 444 | "source": [ 445 | "plt.plot(x)\n", 446 | "plt.plot(predicted)\n", 447 | "plt.axvline(len(trainX), c=\"r\")\n", 448 | "plt.show()" 449 | ] 450 | }, 451 | { 452 | "cell_type": "markdown", 453 | "metadata": {}, 454 | "source": [ 455 | "
\n", 456 | "
Advanced Deep Learning Course
Alireza Akhavan Pour
Aban and Azar 1399 (fall 2020)
\n", 457 | "
\n", 458 | "Class.Vision - AkhavanPour.ir - GitHub\n", 459 | "\n", 460 | "
" 461 | ] 462 | } 463 | ], 464 | "metadata": { 465 | "kernelspec": { 466 | "display_name": "tensorflow", 467 | "language": "python", 468 | "name": "tensorflow" 469 | }, 470 | "language_info": { 471 | "codemirror_mode": { 472 | "name": "ipython", 473 | "version": 3 474 | }, 475 | "file_extension": ".py", 476 | "mimetype": "text/x-python", 477 | "name": "python", 478 | "nbconvert_exporter": "python", 479 | "pygments_lexer": "ipython3", 480 | "version": "3.6.8" 481 | } 482 | }, 483 | "nbformat": 4, 484 | "nbformat_minor": 2 485 | } 486 | -------------------------------------------------------------------------------- /08_shahnameh-text-generation-language-model.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "
In the Name of God
\n", 8 | "\"class.vision\"\n", 9 | "

A character-level language model and generating Shahnameh-like text

" 10 | ] 11 | }, 12 | { 13 | "cell_type": "markdown", 14 | "metadata": { 15 | "colab_type": "text", 16 | "id": "ovpZyIhNIgoq" 17 | }, 18 | "source": [ 19 | "# Text generation with an RNN" 20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "metadata": { 25 | "colab_type": "text", 26 | "id": "WGyKZj3bzf9p" 27 | }, 28 | "source": [ 29 | "### Import TensorFlow and other libraries" 30 | ] 31 | }, 32 | { 33 | "cell_type": "code", 34 | "execution_count": 1, 35 | "metadata": { 36 | "colab": {}, 37 | "colab_type": "code", 38 | "id": "yG_n40gFzf9s" 39 | }, 40 | "outputs": [], 41 | "source": [ 42 | "import tensorflow as tf\n", 43 | "\n", 44 | "import numpy as np\n", 45 | "import os\n", 46 | "import time" 47 | ] 48 | }, 49 | { 50 | "cell_type": "markdown", 51 | "metadata": {}, 52 | "source": [ 53 | "###
Dataset\n",
\n", 55 | "\n", 56 | "
We extracted the dataset below from the Ganjoor website for this exercise. Please download the txt file and place it next to this notebook (or use the optional snippet that follows).
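If you prefer fetching the file from code rather than downloading it by hand, a small optional snippet (assuming the URL below remains reachable):

```python
import urllib.request

# Optional: download the corpus next to the notebook; URL taken from below.
urllib.request.urlretrieve("http://dataset.class.vision/NLP/shahnameh.txt",
                           "shahnameh.txt")
```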
\n", 57 | "\n", 58 | "\n", 59 | "\n", 60 | "http://dataset.class.vision/NLP/shahnameh.txt" 61 | ] 62 | }, 63 | { 64 | "cell_type": "code", 65 | "execution_count": 2, 66 | "metadata": { 67 | "colab": {}, 68 | "colab_type": "code", 69 | "id": "pD_55cOxLkAb" 70 | }, 71 | "outputs": [], 72 | "source": [ 73 | "path_to_file = \"shahnameh.txt\"" 74 | ] 75 | }, 76 | { 77 | "cell_type": "markdown", 78 | "metadata": { 79 | "colab_type": "text", 80 | "id": "UHjdCjDuSvX_" 81 | }, 82 | "source": [ 83 | "### Read the data\n", 84 | "\n", 85 | "First, look in the text:" 86 | ] 87 | }, 88 | { 89 | "cell_type": "code", 90 | "execution_count": 4, 91 | "metadata": { 92 | "colab": {}, 93 | "colab_type": "code", 94 | "id": "aavnuByVymwK" 95 | }, 96 | "outputs": [ 97 | { 98 | "name": "stdout", 99 | "output_type": "stream", 100 | "text": [ 101 | "Length of text: 2653849 characters\n" 102 | ] 103 | } 104 | ], 105 | "source": [ 106 | "# Read, then decode for py2 compat.\n", 107 | "text = open(path_to_file, 'rb').read().decode(encoding='utf-8')\n", 108 | "# length of text is the number of characters in it\n", 109 | "print ('Length of text: {} characters'.format(len(text)))" 110 | ] 111 | }, 112 | { 113 | "cell_type": "code", 114 | "execution_count": 5, 115 | "metadata": { 116 | "colab": {}, 117 | "colab_type": "code", 118 | "id": "Duhg9NrUymwO" 119 | }, 120 | "outputs": [ 121 | { 122 | "name": "stdout", 123 | "output_type": "stream", 124 | "text": [ 125 | "|به نام خداوند جان و خرد\n", 126 | "|کزین برتر اندیشه برنگذرد\n", 127 | "|خداوند نام و خداوند جای\n", 128 | "|خداوند روزی ده رهنمای\n", 129 | "|خداوند کیوان و گردان سپهر\n", 130 | "|فروزنده ماه و ناهید و مهر\n", 131 | "|ز نام و نشان و گمان برترست\n", 132 | "|نگارندهٔ بر شده پیکرست\n", 133 | "|به بینندگان آفریننده را\n", 134 | "|نبینی مرنجان دو بین\n" 135 | ] 136 | } 137 | ], 138 | "source": [ 139 | "# Take a look at the first 250 characters in text\n", 140 | "print(text[:250])" 141 | ] 142 | }, 143 | { 144 | "cell_type": "code", 145 | "execution_count": 6, 146 | "metadata": { 147 | "colab": {}, 148 | "colab_type": "code", 149 | "id": "IlCgQBRVymwR" 150 | }, 151 | "outputs": [ 152 | { 153 | "name": "stdout", 154 | "output_type": "stream", 155 | "text": [ 156 | "48 unique characters\n" 157 | ] 158 | } 159 | ], 160 | "source": [ 161 | "# The unique characters in the file\n", 162 | "vocab = sorted(set(text))\n", 163 | "print ('{} unique characters'.format(len(vocab)))" 164 | ] 165 | }, 166 | { 167 | "cell_type": "markdown", 168 | "metadata": { 169 | "colab_type": "text", 170 | "id": "rNnrKn_lL-IJ" 171 | }, 172 | "source": [ 173 | "## Process the text" 174 | ] 175 | }, 176 | { 177 | "cell_type": "markdown", 178 | "metadata": { 179 | "colab_type": "text", 180 | "id": "LFjSVAlWzf-N" 181 | }, 182 | "source": [ 183 | "### Vectorize the text\n", 184 | "\n", 185 | "Before training, we need to map strings to a numerical representation. Create two lookup tables: one mapping characters to numbers, and another for numbers to characters." 
186 | ] 187 | }, 188 | { 189 | "cell_type": "code", 190 | "execution_count": 7, 191 | "metadata": { 192 | "colab": {}, 193 | "colab_type": "code", 194 | "id": "IalZLbvOzf-F" 195 | }, 196 | "outputs": [], 197 | "source": [ 198 | "# Creating a mapping from unique characters to indices\n", 199 | "char2idx = {u:i for i, u in enumerate(vocab)}\n", 200 | "idx2char = np.array(vocab)\n", 201 | "\n", 202 | "text_as_int = np.array([char2idx[c] for c in text])" 203 | ] 204 | }, 205 | { 206 | "cell_type": "markdown", 207 | "metadata": { 208 | "colab_type": "text", 209 | "id": "tZfqhkYCymwX" 210 | }, 211 | "source": [ 212 | "Now we have an integer representation for each character. Notice that we mapped the character as indexes from 0 to `len(unique)`." 213 | ] 214 | }, 215 | { 216 | "cell_type": "code", 217 | "execution_count": 8, 218 | "metadata": { 219 | "colab": {}, 220 | "colab_type": "code", 221 | "id": "FYyNlCNXymwY" 222 | }, 223 | "outputs": [ 224 | { 225 | "name": "stdout", 226 | "output_type": "stream", 227 | "text": [ 228 | "{\n", 229 | " '\\n': 0,\n", 230 | " ' ' : 1,\n", 231 | " '(' : 2,\n", 232 | " ')' : 3,\n", 233 | " '|' : 4,\n", 234 | " '«' : 5,\n", 235 | " '»' : 6,\n", 236 | " '،' : 7,\n", 237 | " '؟' : 8,\n", 238 | " 'ء' : 9,\n", 239 | " 'آ' : 10,\n", 240 | " 'أ' : 11,\n", 241 | " 'ؤ' : 12,\n", 242 | " 'ئ' : 13,\n", 243 | " 'ا' : 14,\n", 244 | " 'ب' : 15,\n", 245 | " 'ت' : 16,\n", 246 | " 'ث' : 17,\n", 247 | " 'ج' : 18,\n", 248 | " 'ح' : 19,\n", 249 | " ...\n", 250 | "}\n" 251 | ] 252 | } 253 | ], 254 | "source": [ 255 | "print('{')\n", 256 | "for char,_ in zip(char2idx, range(20)):\n", 257 | " print(' {:4s}: {:3d},'.format(repr(char), char2idx[char]))\n", 258 | "print(' ...\\n}')" 259 | ] 260 | }, 261 | { 262 | "cell_type": "code", 263 | "execution_count": 9, 264 | "metadata": { 265 | "colab": {}, 266 | "colab_type": "code", 267 | "id": "l1VKcQHcymwb" 268 | }, 269 | "outputs": [ 270 | { 271 | "name": "stdout", 272 | "output_type": "stream", 273 | "text": [ 274 | "'|به نام خداون' ---- characters mapped to int ---- > [ 4 15 38 1 37 14 36 1 20 21 14 39 37]\n" 275 | ] 276 | } 277 | ], 278 | "source": [ 279 | "# Show how the first 13 characters from the text are mapped to integers\n", 280 | "print ('{} ---- characters mapped to int ---- > {}'.format(repr(text[:13]), text_as_int[:13]))" 281 | ] 282 | }, 283 | { 284 | "cell_type": "markdown", 285 | "metadata": { 286 | "colab_type": "text", 287 | "id": "bbmsf23Bymwe" 288 | }, 289 | "source": [ 290 | "### The prediction task" 291 | ] 292 | }, 293 | { 294 | "cell_type": "markdown", 295 | "metadata": { 296 | "colab_type": "text", 297 | "id": "wssHQ1oGymwe" 298 | }, 299 | "source": [ 300 | "Given a character, or a sequence of characters, what is the most probable next character? This is the task we're training the model to perform. The input to the model will be a sequence of characters, and we train the model to predict the output—the following character at each time step.\n", 301 | "\n", 302 | "Since RNNs maintain an internal state that depends on the previously seen elements, given all the characters computed until this moment, what is the next character?\n" 303 | ] 304 | }, 305 | { 306 | "cell_type": "markdown", 307 | "metadata": { 308 | "colab_type": "text", 309 | "id": "hgsVvVxnymwf" 310 | }, 311 | "source": [ 312 | "### Create training examples and targets\n", 313 | "\n", 314 | "Next divide the text into example sequences. 
Each input sequence will contain `seq_length` characters from the text.\n", 315 | "\n", 316 | "For each input sequence, the corresponding targets contain the same length of text, except shifted one character to the right.\n", 317 | "\n", 318 | "So break the text into chunks of `seq_length+1`. For example, say `seq_length` is 4 and our text is \"Hello\". The input sequence would be \"Hell\", and the target sequence \"ello\".\n", 319 | "\n", 320 | "To do this first use the `tf.data.Dataset.from_tensor_slices` function to convert the text vector into a stream of character indices." 321 | ] 322 | }, 323 | { 324 | "cell_type": "code", 325 | "execution_count": 10, 326 | "metadata": { 327 | "colab": {}, 328 | "colab_type": "code", 329 | "id": "0UHJDA39zf-O" 330 | }, 331 | "outputs": [ 332 | { 333 | "name": "stdout", 334 | "output_type": "stream", 335 | "text": [ 336 | "|\n", 337 | "ب\n", 338 | "ه\n", 339 | " \n", 340 | "ن\n" 341 | ] 342 | } 343 | ], 344 | "source": [ 345 | "# The maximum length sentence we want for a single input in characters\n", 346 | "seq_length = 100\n", 347 | "\n", 348 | "# Create training examples / targets\n", 349 | "char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)\n", 350 | "\n", 351 | "for i in char_dataset.take(5):\n", 352 | " print(idx2char[i.numpy()])" 353 | ] 354 | }, 355 | { 356 | "cell_type": "markdown", 357 | "metadata": { 358 | "colab_type": "text", 359 | "id": "-ZSYAcQV8OGP" 360 | }, 361 | "source": [ 362 | "The `batch` method lets us easily convert these individual characters to sequences of the desired size." 363 | ] 364 | }, 365 | { 366 | "cell_type": "code", 367 | "execution_count": 14, 368 | "metadata": { 369 | "colab": {}, 370 | "colab_type": "code", 371 | "id": "l4hkDU3i7ozi" 372 | }, 373 | "outputs": [ 374 | { 375 | "name": "stdout", 376 | "output_type": "stream", 377 | "text": [ 378 | "'|به نام خداوند جان و خرد\\n|کزین برتر اندیشه برنگذرد\\n|خداوند نام و خداوند جای\\n|خداوند روزی ده رهنمای\\n|خ'\n", 379 | "***************\n", 380 | "'داوند کیوان و گردان سپهر\\n|فروزنده ماه و ناهید و مهر\\n|ز نام و نشان و گمان برترست\\n|نگارندهٔ بر شده پیکر'\n", 381 | "***************\n", 382 | "'ست\\n|به بینندگان آفریننده را\\n|نبینی مرنجان دو بیننده را\\n|نیابد بدو نیز اندیشه راه\\n|که او برتر از نام و'\n", 383 | "***************\n", 384 | "' از جایگاه\\n|سخن هر چه زین گوهران بگذرد\\n|نیابد بدو راه جان و خرد\\n|خرد گر سخن برگزیند همی\\n|همان را گزین'\n", 385 | "***************\n", 386 | "'د که بیند همی\\n|ستودن نداند کس او را چو هست\\n|میان بندگی را ببایدت بست\\n|خرد را و جان را همی سنجد اوی\\n|د'\n", 387 | "***************\n" 388 | ] 389 | } 390 | ], 391 | "source": [ 392 | "sequences = char_dataset.batch(seq_length+1, drop_remainder=True)\n", 393 | "\n", 394 | "for item in sequences.take(5):\n", 395 | " print(repr(''.join(idx2char[item.numpy()])))\n", 396 | " print(\"***\"*5)" 397 | ] 398 | }, 399 | { 400 | "cell_type": "markdown", 401 | "metadata": { 402 | "colab_type": "text", 403 | "id": "UbLcIPBj_mWZ" 404 | }, 405 | "source": [ 406 | "For each sequence, duplicate and shift it to form the input and target text by using the `map` method to apply a simple function to each batch:" 407 | ] 408 | }, 409 | { 410 | "cell_type": "code", 411 | "execution_count": 15, 412 | "metadata": { 413 | "colab": {}, 414 | "colab_type": "code", 415 | "id": "9NGu-FkO_kYU" 416 | }, 417 | "outputs": [], 418 | "source": [ 419 | "def split_input_target(chunk):\n", 420 | " input_text = chunk[:-1]\n", 421 | " target_text = chunk[1:]\n", 422 | " return input_text, target_text\n", 
423 | "\n", 424 | "dataset = sequences.map(split_input_target)" 425 | ] 426 | }, 427 | { 428 | "cell_type": "markdown", 429 | "metadata": { 430 | "colab_type": "text", 431 | "id": "hiCopyGZymwi" 432 | }, 433 | "source": [ 434 | "Print the first examples input and target values:" 435 | ] 436 | }, 437 | { 438 | "cell_type": "code", 439 | "execution_count": 16, 440 | "metadata": { 441 | "colab": {}, 442 | "colab_type": "code", 443 | "id": "GNbw-iR0ymwj" 444 | }, 445 | "outputs": [ 446 | { 447 | "name": "stdout", 448 | "output_type": "stream", 449 | "text": [ 450 | "Input data: '|به نام خداوند جان و خرد\\n|کزین برتر اندیشه برنگذرد\\n|خداوند نام و خداوند جای\\n|خداوند روزی ده رهنمای\\n|'\n", 451 | "Target data: 'به نام خداوند جان و خرد\\n|کزین برتر اندیشه برنگذرد\\n|خداوند نام و خداوند جای\\n|خداوند روزی ده رهنمای\\n|خ'\n" 452 | ] 453 | } 454 | ], 455 | "source": [ 456 | "for input_example, target_example in dataset.take(1):\n", 457 | " print ('Input data: ', repr(''.join(idx2char[input_example.numpy()])))\n", 458 | " print ('Target data:', repr(''.join(idx2char[target_example.numpy()])))" 459 | ] 460 | }, 461 | { 462 | "cell_type": "markdown", 463 | "metadata": { 464 | "colab_type": "text", 465 | "id": "_33OHL3b84i0" 466 | }, 467 | "source": [ 468 | "Each index of these vectors are processed as one time step. For the input at time step 0, the model receives the index for \"F\" and trys to predict the index for \"i\" as the next character. At the next timestep, it does the same thing but the `RNN` considers the previous step context in addition to the current input character." 469 | ] 470 | }, 471 | { 472 | "cell_type": "code", 473 | "execution_count": 17, 474 | "metadata": { 475 | "colab": {}, 476 | "colab_type": "code", 477 | "id": "0eBu9WZG84i0" 478 | }, 479 | "outputs": [ 480 | { 481 | "name": "stdout", 482 | "output_type": "stream", 483 | "text": [ 484 | "Step 0\n", 485 | " input: 4 ('|')\n", 486 | " expected output: 15 ('ب')\n", 487 | "Step 1\n", 488 | " input: 15 ('ب')\n", 489 | " expected output: 38 ('ه')\n", 490 | "Step 2\n", 491 | " input: 38 ('ه')\n", 492 | " expected output: 1 (' ')\n", 493 | "Step 3\n", 494 | " input: 1 (' ')\n", 495 | " expected output: 37 ('ن')\n", 496 | "Step 4\n", 497 | " input: 37 ('ن')\n", 498 | " expected output: 14 ('ا')\n" 499 | ] 500 | } 501 | ], 502 | "source": [ 503 | "for i, (input_idx, target_idx) in enumerate(zip(input_example[:5], target_example[:5])):\n", 504 | " print(\"Step {:4d}\".format(i))\n", 505 | " print(\" input: {} ({:s})\".format(input_idx, repr(idx2char[input_idx])))\n", 506 | " print(\" expected output: {} ({:s})\".format(target_idx, repr(idx2char[target_idx])))" 507 | ] 508 | }, 509 | { 510 | "cell_type": "markdown", 511 | "metadata": { 512 | "colab_type": "text", 513 | "id": "MJdfPmdqzf-R" 514 | }, 515 | "source": [ 516 | "### Create training batches\n", 517 | "\n", 518 | "We used `tf.data` to split the text into manageable sequences. But before feeding this data into the model, we need to shuffle the data and pack it into batches." 
519 | ] 520 | }, 521 | { 522 | "cell_type": "code", 523 | "execution_count": 18, 524 | "metadata": { 525 | "colab": {}, 526 | "colab_type": "code", 527 | "id": "p2pGotuNzf-S" 528 | }, 529 | "outputs": [ 530 | { 531 | "data": { 532 | "text/plain": [ 533 | "" 534 | ] 535 | }, 536 | "execution_count": 18, 537 | "metadata": {}, 538 | "output_type": "execute_result" 539 | } 540 | ], 541 | "source": [ 542 | "# Batch size\n", 543 | "BATCH_SIZE = 64\n", 544 | "\n", 545 | "\n", 546 | "dataset = dataset.batch(BATCH_SIZE, drop_remainder=True)\n", 547 | "\n", 548 | "dataset" 549 | ] 550 | }, 551 | { 552 | "cell_type": "markdown", 553 | "metadata": { 554 | "colab_type": "text", 555 | "id": "r6oUuElIMgVx" 556 | }, 557 | "source": [ 558 | "## Build The Model" 559 | ] 560 | }, 561 | { 562 | "cell_type": "markdown", 563 | "metadata": { 564 | "colab_type": "text", 565 | "id": "m8gPwEjRzf-Z" 566 | }, 567 | "source": [ 568 | "Use `tf.keras.Sequential` to define the model. For this simple example three layers are used to define our model:\n", 569 | "\n", 570 | "* `tf.keras.layers.Embedding`: The input layer. A trainable lookup table that will map the numbers of each character to a vector with `embedding_dim` dimensions;\n", 571 | "* `tf.keras.layers.GRU`: A type of RNN with size `units=rnn_units` (You can also use a LSTM layer here.)\n", 572 | "* `tf.keras.layers.Dense`: The output layer, with `vocab_size` outputs." 573 | ] 574 | }, 575 | { 576 | "cell_type": "code", 577 | "execution_count": 19, 578 | "metadata": { 579 | "colab": {}, 580 | "colab_type": "code", 581 | "id": "zHT8cLh7EAsg" 582 | }, 583 | "outputs": [], 584 | "source": [ 585 | "# Length of the vocabulary in chars\n", 586 | "vocab_size = len(vocab)\n", 587 | "\n", 588 | "# The embedding dimension\n", 589 | "embedding_dim = 25\n", 590 | "\n", 591 | "# Number of RNN units\n", 592 | "rnn_units = 1024" 593 | ] 594 | }, 595 | { 596 | "cell_type": "code", 597 | "execution_count": 20, 598 | "metadata": { 599 | "colab": {}, 600 | "colab_type": "code", 601 | "id": "MtCrdfzEI2N0" 602 | }, 603 | "outputs": [], 604 | "source": [ 605 | "def build_model(vocab_size, embedding_dim, rnn_units, batch_size):\n", 606 | " model = tf.keras.Sequential([\n", 607 | " tf.keras.layers.Embedding(vocab_size, embedding_dim,\n", 608 | " batch_input_shape=[batch_size, None]),\n", 609 | " tf.keras.layers.GRU(rnn_units,\n", 610 | " return_sequences=True,\n", 611 | " stateful=True,\n", 612 | " recurrent_initializer='glorot_uniform'),\n", 613 | " tf.keras.layers.Dense(vocab_size)\n", 614 | " ])\n", 615 | " return model" 616 | ] 617 | }, 618 | { 619 | "cell_type": "code", 620 | "execution_count": 21, 621 | "metadata": { 622 | "colab": {}, 623 | "colab_type": "code", 624 | "id": "wwsrpOik5zhv" 625 | }, 626 | "outputs": [], 627 | "source": [ 628 | "model = build_model(\n", 629 | " vocab_size = len(vocab),\n", 630 | " embedding_dim=embedding_dim,\n", 631 | " rnn_units=rnn_units,\n", 632 | " batch_size=BATCH_SIZE)" 633 | ] 634 | }, 635 | { 636 | "cell_type": "markdown", 637 | "metadata": { 638 | "colab_type": "text", 639 | "id": "RkA5upJIJ7W7" 640 | }, 641 | "source": [ 642 | "For each character the model looks up the embedding, runs the GRU one timestep with the embedding as input, and applies the dense layer to generate logits predicting the log-likelihood of the next character:\n", 643 | "\n", 644 | "![A drawing of the data passing through the model](images/text_generation_training.png)" 645 | ] 646 | }, 647 | { 648 | "cell_type": "markdown", 649 | "metadata": { 650 | "colab_type": 
"text", 651 | "id": "-ubPo0_9Prjb" 652 | }, 653 | "source": [ 654 | "## Try the model\n", 655 | "\n", 656 | "Now run the model to see that it behaves as expected.\n", 657 | "\n", 658 | "First check the shape of the output:" 659 | ] 660 | }, 661 | { 662 | "cell_type": "code", 663 | "execution_count": 22, 664 | "metadata": { 665 | "colab": {}, 666 | "colab_type": "code", 667 | "id": "C-_70kKAPrPU" 668 | }, 669 | "outputs": [ 670 | { 671 | "name": "stdout", 672 | "output_type": "stream", 673 | "text": [ 674 | "(64, 100, 48) # (batch_size, sequence_length, vocab_size)\n" 675 | ] 676 | } 677 | ], 678 | "source": [ 679 | "for input_example_batch, target_example_batch in dataset.take(1):\n", 680 | " example_batch_predictions = model.predict(input_example_batch)\n", 681 | " print(example_batch_predictions.shape, \"# (batch_size, sequence_length, vocab_size)\")" 682 | ] 683 | }, 684 | { 685 | "cell_type": "markdown", 686 | "metadata": { 687 | "colab_type": "text", 688 | "id": "Q6NzLBi4VM4o" 689 | }, 690 | "source": [ 691 | "In the above example the sequence length of the input is `100` but the model can be run on inputs of any length:" 692 | ] 693 | }, 694 | { 695 | "cell_type": "code", 696 | "execution_count": 23, 697 | "metadata": { 698 | "colab": {}, 699 | "colab_type": "code", 700 | "id": "vPGmAAXmVLGC" 701 | }, 702 | "outputs": [ 703 | { 704 | "name": "stdout", 705 | "output_type": "stream", 706 | "text": [ 707 | "Model: \"sequential\"\n", 708 | "_________________________________________________________________\n", 709 | "Layer (type) Output Shape Param # \n", 710 | "=================================================================\n", 711 | "embedding (Embedding) (64, None, 25) 1200 \n", 712 | "_________________________________________________________________\n", 713 | "gru (GRU) (64, None, 1024) 3228672 \n", 714 | "_________________________________________________________________\n", 715 | "dense (Dense) (64, None, 48) 49200 \n", 716 | "=================================================================\n", 717 | "Total params: 3,279,072\n", 718 | "Trainable params: 3,279,072\n", 719 | "Non-trainable params: 0\n", 720 | "_________________________________________________________________\n" 721 | ] 722 | } 723 | ], 724 | "source": [ 725 | "model.summary()" 726 | ] 727 | }, 728 | { 729 | "cell_type": "markdown", 730 | "metadata": { 731 | "colab_type": "text", 732 | "id": "uwv0gEkURfx1" 733 | }, 734 | "source": [ 735 | "To get actual predictions from the model we need to sample from the output distribution, to get actual character indices. 
This distribution is defined by the logits over the character vocabulary.\n", 736 | "\n", 737 | "Note: It is important to _sample_ from this distribution as taking the _argmax_ of the distribution can easily get the model stuck in a loop.\n", 738 | "\n", 739 | "Try it for the first example in the batch:" 740 | ] 741 | }, 742 | { 743 | "cell_type": "code", 744 | "execution_count": 24, 745 | "metadata": { 746 | "colab": {}, 747 | "colab_type": "code", 748 | "id": "4V4MfFg0RQJg" 749 | }, 750 | "outputs": [], 751 | "source": [ 752 | "sampled_indices = tf.random.categorical(example_batch_predictions[0], num_samples=1)\n", 753 | "sampled_indices = tf.squeeze(sampled_indices,axis=-1).numpy()" 754 | ] 755 | }, 756 | { 757 | "cell_type": "markdown", 758 | "metadata": { 759 | "colab_type": "text", 760 | "id": "QM1Vbxs_URw5" 761 | }, 762 | "source": [ 763 | "This gives us, at each timestep, a prediction of the next character index:" 764 | ] 765 | }, 766 | { 767 | "cell_type": "code", 768 | "execution_count": 25, 769 | "metadata": { 770 | "colab": {}, 771 | "colab_type": "code", 772 | "id": "YqFMUQc_UFgM" 773 | }, 774 | "outputs": [ 775 | { 776 | "data": { 777 | "text/plain": [ 778 | "array([38, 47, 10, 11, 21, 29, 37, 41, 13, 35, 41, 23, 12, 21, 41, 15, 47,\n", 779 | " 29, 3, 10, 42, 33, 12, 13, 18, 4, 35, 27, 41, 16, 8, 45, 15, 43,\n", 780 | " 8, 40, 28, 35, 29, 3, 5, 8, 17, 32, 10, 44, 25, 24, 45, 24, 26,\n", 781 | " 40, 19, 8, 6, 45, 25, 3, 31, 13, 28, 12, 6, 46, 22, 45, 12, 33,\n", 782 | " 42, 33, 0, 15, 32, 36, 45, 22, 33, 44, 43, 33, 36, 36, 9, 47, 4,\n", 783 | " 18, 0, 31, 33, 33, 0, 45, 37, 15, 44, 22, 30, 10, 29, 28],\n", 784 | " dtype=int64)" 785 | ] 786 | }, 787 | "execution_count": 25, 788 | "metadata": {}, 789 | "output_type": "execute_result" 790 | } 791 | ], 792 | "source": [ 793 | "sampled_indices" 794 | ] 795 | }, 796 | { 797 | "cell_type": "markdown", 798 | "metadata": { 799 | "colab_type": "text", 800 | "id": "LfLtsP3mUhCG" 801 | }, 802 | "source": [ 803 | "Decode these to see the text predicted by this untrained model:" 804 | ] 805 | }, 806 | { 807 | "cell_type": "code", 808 | "execution_count": 26, 809 | "metadata": { 810 | "colab": {}, 811 | "colab_type": "code", 812 | "id": "xWcFwPwLSo05" 813 | }, 814 | "outputs": [ 815 | { 816 | "name": "stdout", 817 | "output_type": "stream", 818 | "text": [ 819 | "Input: \n", 820 | " '|به نام خداوند جان و خرد\\n|کزین برتر اندیشه برنگذرد\\n|خداوند نام و خداوند جای\\n|خداوند روزی ده رهنمای\\n|'\n", 821 | "\n", 822 | "Next Char Predictions: \n", 823 | " 'ه\\u200cآأدطنپئلپرؤدپب\\u200cط)آچفؤئج|لصپت؟گبژ؟ٔضلط)«؟ثغآکسزگزشٔح؟»گس)عئضؤ»یذگؤفچف\\nبغمگذفکژفممء\\u200c|ج\\nعفف\\nگنبکذظآطض'\n" 824 | ] 825 | } 826 | ], 827 | "source": [ 828 | "print(\"Input: \\n\", repr(\"\".join(idx2char[input_example_batch[0]])))\n", 829 | "print()\n", 830 | "print(\"Next Char Predictions: \\n\", repr(\"\".join(idx2char[sampled_indices ])))" 831 | ] 832 | }, 833 | { 834 | "cell_type": "markdown", 835 | "metadata": { 836 | "colab_type": "text", 837 | "id": "LJL0Q0YPY6Ee" 838 | }, 839 | "source": [ 840 | "## Train the model" 841 | ] 842 | }, 843 | { 844 | "cell_type": "markdown", 845 | "metadata": { 846 | "colab_type": "text", 847 | "id": "YCbHQHiaa4Ic" 848 | }, 849 | "source": [ 850 | "At this point the problem can be treated as a standard classification problem. Given the previous RNN state, and the input this time step, predict the class of the next character." 
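,
 "\n",
 "As a quick sanity check (a sketch, not in the original notebook), the mean loss of the untrained model should be close to `ln(vocab_size) = ln(48) ≈ 3.87`:\n",
 "\n",
 "```python\n",
 "import tensorflow as tf\n",
 "untrained_loss = tf.keras.losses.sparse_categorical_crossentropy(\n",
 "    target_example_batch, example_batch_predictions, from_logits=True)\n",
 "print(untrained_loss.numpy().mean())  # roughly 3.9 before any training\n",
 "```"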
851 | ] 852 | }, 853 | { 854 | "cell_type": "markdown", 855 | "metadata": { 856 | "colab_type": "text", 857 | "id": "trpqTWyvk0nr" 858 | }, 859 | "source": [ 860 | "### Attach an optimizer, and a loss function" 861 | ] 862 | }, 863 | { 864 | "cell_type": "markdown", 865 | "metadata": { 866 | "colab_type": "text", 867 | "id": "UAjbjY03eiQ4" 868 | }, 869 | "source": [ 870 | "The standard `tf.keras.losses.sparse_categorical_crossentropy` loss function works in this case because it is applied across the last dimension of the predictions.\n", 871 | "\n", 872 | "Because our model returns logits, we need to set the `from_logits` flag.\n" 873 | ] 874 | }, 875 | { 876 | "cell_type": "code", 877 | "execution_count": 27, 878 | "metadata": { 879 | "colab": {}, 880 | "colab_type": "code", 881 | "id": "4HrXTACTdzY-" 882 | }, 883 | "outputs": [], 884 | "source": [ 885 | "def loss(labels, logits):\n", 886 | " return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)\n" 887 | ] 888 | }, 889 | { 890 | "cell_type": "markdown", 891 | "metadata": { 892 | "colab_type": "text", 893 | "id": "jeOXriLcymww" 894 | }, 895 | "source": [ 896 | "Configure the training procedure using the `tf.keras.Model.compile` method. We'll use `tf.keras.optimizers.Adam` with default arguments and the loss function." 897 | ] 898 | }, 899 | { 900 | "cell_type": "code", 901 | "execution_count": 28, 902 | "metadata": { 903 | "colab": {}, 904 | "colab_type": "code", 905 | "id": "DDl1_Een6rL0" 906 | }, 907 | "outputs": [], 908 | "source": [ 909 | "model.compile(optimizer='adam', loss=loss)" 910 | ] 911 | }, 912 | { 913 | "cell_type": "markdown", 914 | "metadata": { 915 | "colab_type": "text", 916 | "id": "ieSJdchZggUj" 917 | }, 918 | "source": [ 919 | "### Configure checkpoints" 920 | ] 921 | }, 922 | { 923 | "cell_type": "markdown", 924 | "metadata": { 925 | "colab_type": "text", 926 | "id": "C6XBUUavgF56" 927 | }, 928 | "source": [ 929 | "Use a `tf.keras.callbacks.ModelCheckpoint` to ensure that checkpoints are saved during training:" 930 | ] 931 | }, 932 | { 933 | "cell_type": "code", 934 | "execution_count": 29, 935 | "metadata": { 936 | "colab": {}, 937 | "colab_type": "code", 938 | "id": "W6fWTriUZP-n" 939 | }, 940 | "outputs": [], 941 | "source": [ 942 | "# Directory where the checkpoints will be saved\n", 943 | "checkpoint_dir = './training_checkpoints'\n", 944 | "# Name of the checkpoint files\n", 945 | "checkpoint_prefix = os.path.join(checkpoint_dir, \"ckpt_{epoch}\")\n", 946 | "\n", 947 | "checkpoint_callback=tf.keras.callbacks.ModelCheckpoint(\n", 948 | " filepath=checkpoint_prefix,\n", 949 | " save_weights_only=True)" 950 | ] 951 | }, 952 | { 953 | "cell_type": "markdown", 954 | "metadata": { 955 | "colab_type": "text", 956 | "id": "3Ky3F_BhgkTW" 957 | }, 958 | "source": [ 959 | "### Execute the training" 960 | ] 961 | }, 962 | { 963 | "cell_type": "markdown", 964 | "metadata": { 965 | "colab_type": "text", 966 | "id": "IxdOA-rgyGvs" 967 | }, 968 | "source": [ 969 | "To keep training time reasonable, use 10 epochs to train the model. In Colab, set the runtime to GPU for faster training." 
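,
 "\n",
 "Optionally (a sketch, not part of the original notebook), `ModelCheckpoint` can also keep just the best weights by monitoring the training loss:\n",
 "\n",
 "```python\n",
 "best_checkpoint = tf.keras.callbacks.ModelCheckpoint(\n",
 "    filepath=os.path.join(checkpoint_dir, 'best'),\n",
 "    save_weights_only=True,\n",
 "    monitor='loss',       # no validation split here, so monitor the training loss\n",
 "    save_best_only=True)\n",
 "```"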
970 | ] 971 | }, 972 | { 973 | "cell_type": "code", 974 | "execution_count": 30, 975 | "metadata": { 976 | "colab": {}, 977 | "colab_type": "code", 978 | "id": "7yGBE2zxMMHs" 979 | }, 980 | "outputs": [], 981 | "source": [ 982 | "EPOCHS=10" 983 | ] 984 | }, 985 | { 986 | "cell_type": "code", 987 | "execution_count": 31, 988 | "metadata": { 989 | "colab": {}, 990 | "colab_type": "code", 991 | "id": "UK-hmKjYVoll" 992 | }, 993 | "outputs": [ 994 | { 995 | "name": "stdout", 996 | "output_type": "stream", 997 | "text": [ 998 | "Epoch 1/10\n", 999 | "410/410 [==============================] - 129s 314ms/step - loss: 2.4480\n", 1000 | "Epoch 2/10\n", 1001 | "410/410 [==============================] - 149s 363ms/step - loss: 1.8074\n", 1002 | "Epoch 3/10\n", 1003 | "410/410 [==============================] - 152s 371ms/step - loss: 1.5528\n", 1004 | "Epoch 4/10\n", 1005 | "410/410 [==============================] - 154s 375ms/step - loss: 1.4235\n", 1006 | "Epoch 5/10\n", 1007 | "410/410 [==============================] - 154s 377ms/step - loss: 1.3444\n", 1008 | "Epoch 6/10\n", 1009 | "410/410 [==============================] - 157s 382ms/step - loss: 1.2861\n", 1010 | "Epoch 7/10\n", 1011 | "410/410 [==============================] - 157s 383ms/step - loss: 1.2370\n", 1012 | "Epoch 8/10\n", 1013 | "410/410 [==============================] - 136s 331ms/step - loss: 1.1920\n", 1014 | "Epoch 9/10\n", 1015 | "410/410 [==============================] - 147s 359ms/step - loss: 1.1487\n", 1016 | "Epoch 10/10\n", 1017 | "410/410 [==============================] - 152s 370ms/step - loss: 1.1066\n" 1018 | ] 1019 | } 1020 | ], 1021 | "source": [ 1022 | "history = model.fit(dataset, epochs=EPOCHS, callbacks=[checkpoint_callback])" 1023 | ] 1024 | }, 1025 | { 1026 | "cell_type": "markdown", 1027 | "metadata": { 1028 | "colab_type": "text", 1029 | "id": "kKkD5M6eoSiN" 1030 | }, 1031 | "source": [ 1032 | "## Generate text" 1033 | ] 1034 | }, 1035 | { 1036 | "cell_type": "markdown", 1037 | "metadata": { 1038 | "colab_type": "text", 1039 | "id": "JIPcXllKjkdr" 1040 | }, 1041 | "source": [ 1042 | "### Restore the latest checkpoint" 1043 | ] 1044 | }, 1045 | { 1046 | "cell_type": "markdown", 1047 | "metadata": { 1048 | "colab_type": "text", 1049 | "id": "LyeYRiuVjodY" 1050 | }, 1051 | "source": [ 1052 | "To keep this prediction step simple, use a batch size of 1.\n", 1053 | "\n", 1054 | "Because of the way the RNN state is passed from timestep to timestep, the model only accepts a fixed batch size once built.\n", 1055 | "\n", 1056 | "To run the model with a different `batch_size`, we need to rebuild the model and restore the weights from the checkpoint.\n" 1057 | ] 1058 | }, 1059 | { 1060 | "cell_type": "code", 1061 | "execution_count": 32, 1062 | "metadata": { 1063 | "colab": {}, 1064 | "colab_type": "code", 1065 | "id": "zk2WJ2-XjkGz" 1066 | }, 1067 | "outputs": [ 1068 | { 1069 | "data": { 1070 | "text/plain": [ 1071 | "'./training_checkpoints\\\\ckpt_10'" 1072 | ] 1073 | }, 1074 | "execution_count": 32, 1075 | "metadata": {}, 1076 | "output_type": "execute_result" 1077 | } 1078 | ], 1079 | "source": [ 1080 | "tf.train.latest_checkpoint(checkpoint_dir)" 1081 | ] 1082 | }, 1083 | { 1084 | "cell_type": "code", 1085 | "execution_count": 33, 1086 | "metadata": { 1087 | "colab": {}, 1088 | "colab_type": "code", 1089 | "id": "LycQ-ot_jjyu" 1090 | }, 1091 | "outputs": [], 1092 | "source": [ 1093 | "model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)\n", 1094 | "\n", 1095 | 
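"# The checkpointed weights fit this new batch_size=1 model, because\n",
 "# parameter shapes do not depend on the batch dimension.\n",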
"model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))\n", 1096 | "\n", 1097 | "model.build(tf.TensorShape([1, None]))" 1098 | ] 1099 | }, 1100 | { 1101 | "cell_type": "code", 1102 | "execution_count": 34, 1103 | "metadata": { 1104 | "colab": {}, 1105 | "colab_type": "code", 1106 | "id": "71xa6jnYVrAN" 1107 | }, 1108 | "outputs": [ 1109 | { 1110 | "name": "stdout", 1111 | "output_type": "stream", 1112 | "text": [ 1113 | "Model: \"sequential_1\"\n", 1114 | "_________________________________________________________________\n", 1115 | "Layer (type) Output Shape Param # \n", 1116 | "=================================================================\n", 1117 | "embedding_1 (Embedding) (1, None, 25) 1200 \n", 1118 | "_________________________________________________________________\n", 1119 | "gru_1 (GRU) (1, None, 1024) 3228672 \n", 1120 | "_________________________________________________________________\n", 1121 | "dense_1 (Dense) (1, None, 48) 49200 \n", 1122 | "=================================================================\n", 1123 | "Total params: 3,279,072\n", 1124 | "Trainable params: 3,279,072\n", 1125 | "Non-trainable params: 0\n", 1126 | "_________________________________________________________________\n" 1127 | ] 1128 | } 1129 | ], 1130 | "source": [ 1131 | "model.summary()" 1132 | ] 1133 | }, 1134 | { 1135 | "cell_type": "markdown", 1136 | "metadata": { 1137 | "colab_type": "text", 1138 | "id": "DjGz1tDkzf-u" 1139 | }, 1140 | "source": [ 1141 | "### The prediction loop\n", 1142 | "\n", 1143 | "The following code block generates the text:\n", 1144 | "\n", 1145 | "* It Starts by choosing a start string, initializing the RNN state and setting the number of characters to generate.\n", 1146 | "\n", 1147 | "* Get the prediction distribution of the next character using the start string and the RNN state.\n", 1148 | "\n", 1149 | "* Then, use a categorical distribution to calculate the index of the predicted character. Use this predicted character as our next input to the model.\n", 1150 | "\n", 1151 | "* The RNN state returned by the model is fed back into the model so that it now has more context, instead than only one word. After predicting the next word, the modified RNN states are again fed back into the model, which is how it learns as it gets more context from the previously predicted words.\n", 1152 | "\n", 1153 | "\n", 1154 | "![To generate text the model's output is fed back to the input](images/text_generation_sampling.png)\n", 1155 | "\n", 1156 | "Looking at the generated text, you'll see the model knows when to capitalize, make paragraphs and imitates a Shakespeare-like writing vocabulary. With the small number of training epochs, it has not yet learned to form coherent sentences." 
1157 | ] 1158 | }, 1159 | { 1160 | "cell_type": "code", 1161 | "execution_count": 35, 1162 | "metadata": { 1163 | "colab": {}, 1164 | "colab_type": "code", 1165 | "id": "WvuwZBX5Ogfd" 1166 | }, 1167 | "outputs": [], 1168 | "source": [ 1169 | "def generate_text(model, start_string, temperature=1.0):\n", 1170 | " # Evaluation step (generating text using the learned model)\n", 1171 | "\n", 1172 | " # Number of characters to generate\n", 1173 | " num_generate = 1000\n", 1174 | "\n", 1175 | " # Converting our start string to numbers (vectorizing)\n", 1176 | " input_eval = [char2idx[s] for s in start_string]\n", 1177 | " input_eval = tf.expand_dims(input_eval, 0)\n", 1178 | "\n", 1179 | " # Empty string to store our results\n", 1180 | " text_generated = []\n", 1181 | "\n", 1182 | "\n", 1183 | " # Here batch size == 1\n", 1184 | " model.reset_states()\n", 1185 | " for i in range(num_generate):\n", 1186 | " predictions = model(input_eval)\n", 1187 | " # remove the batch dimension\n", 1188 | " predictions = tf.squeeze(predictions, 0)\n", 1189 | "\n", 1190 | " # using a categorical distribution to predict the character returned by the model;\n", 1191 | " # dividing by temperature controls randomness (restored from the upstream TF tutorial)\n", 1192 | " predictions = predictions / temperature\n", 1193 | " predicted_id = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()\n", 1194 | "\n", 1195 | " # We pass the predicted character as the next input to the model\n", 1196 | " # along with the previous hidden state\n", 1197 | " input_eval = tf.expand_dims([predicted_id], 0)\n", 1198 | "\n", 1199 | " text_generated.append(idx2char[predicted_id])\n", 1200 | "\n", 1201 | " return (start_string + ''.join(text_generated))" 1202 | ] 1203 | }, 1204 | { 1205 | "cell_type": "code", 1206 | "execution_count": 38, 1207 | "metadata": { 1208 | "colab": {}, 1209 | "colab_type": "code", 1210 | "id": "ktovv0RFhrkn" 1211 | }, 1212 | "outputs": [ 1213 | { 1214 | "name": "stdout", 1215 | "output_type": "stream", 1216 | "text": [ 1217 | "به نام خدایست نستوه تو\n", 1218 | "|به گیتی نماید نگارد ز بتاختی\n", 1219 | "|وزان پس چنین تا برآرم بماه\n", 1220 | "|صد آن تختها برکشمد از تو بر تن خویش یابی به خون\n", 1221 | "|ز شادی شگفتی که بیکار گشت\n", 1222 | "|دوماه\n", 1223 | "|کجا آن همه ریز کردم همی\n", 1224 | "|ز تخم بد و باژ و پر بوی مهر\n", 1225 | "|شنیده تخت باژی چو کوه بزرگ\n", 1226 | "|بدست سخن گوی برخاستند\n", 1227 | "|به زندان بیاوردش از جنگ جفت\n", 1228 | "|یکی دیگر آنگه که تن بگذرد\n", 1229 | "|من آن تخت راخسر بر تنگ هنگام موسن شود\n", 1230 | "|سربخت این را که پوشیده‌ام\n", 1231 | "|سراسان کنم داد و دانندگان\n", 1232 | "|گلاب و عنان برگرفتند راه\n", 1233 | "|نماند به رستم که لشکر براند\n", 1234 | "|چه افگند دینار و گرمان به دست\n", 1235 | "|چو ارجات داری خرامید یاد\n", 1236 | "|که نزد کزت بر تو بر خاک روی\n", 1237 | "|شهنشاه بینندهٔ رخش بروخون\n", 1238 | "|تو گفتی همی درکشید این سخن\n", 1239 | "|سواری بر اب گوهرنگار\n", 1240 | "|صزو تن به پا اندر آویختست\n", 1241 | "|نه زین باره و گردیه را بدست\n", 1242 | "|به خون خسره آیید گفتار من\n", 1243 | "|نگردد به بازد اسیدش تخل به درد\n", 1244 | "|سوی حلبهاد آن سه زر\n", 1245 | "|سپاس از دبیرو ستم\n", 1246 | "|همی دشمنندان او تخت را نو نمرد\n", 1247 | "|هرآنکس که او دشمن ایمن ببین\n", 1248 | "|بدو گفت بهرام چون بر روان\n", 1249 | "|یبا پیرسر گفت زن پر ز خون\n", 1250 | "|نگه کرده و از بلت خسرو شوردار\n", 1251 | "|بدآنید تاوان به ایران تویی\n", 1252 | "|\n", 1253 | "|ار و دوبست و زه برکشد\n", 1254 | "|فروشد نه بیما نیر خ\n" 1255 | ] 1256 | } 1257 | ], 1258 | "source": [ 1259 | "print(generate_text(model, start_string=u\"به نام خدا\"))" 1260 | ] 1261 | }, 1262 | { 1263 | "cell_type": "markdown", 
"metadata": { 1264 | "colab_type": "text", 1265 | "id": "AM2Uma_-yVIq" 1266 | }, 1267 | "source": [ 1268 | "The easiest thing you can do to improve the results it to train it for longer (try `EPOCHS=30`).\n", 1269 | "\n", 1270 | "You can also experiment with a different start string, or try adding another RNN layer to improve the model's accuracy, or adjusting the temperature parameter to generate more or less random predictions." 1271 | ] 1272 | }, 1273 | { 1274 | "cell_type": "markdown", 1275 | "metadata": {}, 1276 | "source": [ 1277 | "
\n", 1278 | "
\n", 1279 | "source:\n", 1280 | " \n", 1281 | "https://www.tensorflow.org/tutorials/text/text_generation" 1282 | ] 1283 | }, 1284 | { 1285 | "cell_type": "markdown", 1286 | "metadata": {}, 1287 | "source": [ 1288 | "
\n", 1289 | "
Advanced Deep Learning Course
Alireza Akhavanpour
Aban and Azar 1399 (fall 2020)
\n", 1290 | "
\n", 1291 | "Class.Vision - AkhavanPour.ir - GitHub\n", 1292 | "\n", 1293 | "
" 1294 | ] 1295 | } 1296 | ], 1297 | "metadata": { 1298 | "accelerator": "GPU", 1299 | "colab": { 1300 | "collapsed_sections": [], 1301 | "name": "text_generation.ipynb", 1302 | "private_outputs": true, 1303 | "provenance": [], 1304 | "toc_visible": true, 1305 | "version": "0.3.2" 1306 | }, 1307 | "kernelspec": { 1308 | "display_name": "tf2-GPU", 1309 | "language": "python", 1310 | "name": "tf2" 1311 | }, 1312 | "language_info": { 1313 | "codemirror_mode": { 1314 | "name": "ipython", 1315 | "version": 3 1316 | }, 1317 | "file_extension": ".py", 1318 | "mimetype": "text/x-python", 1319 | "name": "python", 1320 | "nbconvert_exporter": "python", 1321 | "pygments_lexer": "ipython3", 1322 | "version": "3.6.9" 1323 | } 1324 | }, 1325 | "nbformat": 4, 1326 | "nbformat_minor": 1 1327 | } 1328 | -------------------------------------------------------------------------------- /07_text-classification-Emojify.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "
In the Name of God
\n", 8 | "\"class.vision\"\n", 9 | "

Text Classification - Emojify!

" 10 | ] 11 | }, 12 | { 13 | "cell_type": "markdown", 14 | "metadata": {}, 15 | "source": [ 16 | "
\n", 17 | "کدها با تغییرات برگرفته از کورس Sequence Models پروفسور Andrew NG است.\n", 18 | "
\n", 19 | "\n", 20 | "[https://www.coursera.org/learn/nlp-sequence-models](https://www.coursera.org/learn/nlp-sequence-models)\n", 21 | "\n", 22 | "\n", 23 | "\n", 24 | "
In this notebook we want to automatically attach a relevant emoji to arbitrary sentences!\n", 25 | "It is in fact a simple 5-class classification task that assigns each sentence to one emoji.\n", 26 | "
\n", 27 | "\n", 28 | "\n" 29 | ] 30 | }, 31 | { 32 | "cell_type": "markdown", 33 | "metadata": {}, 34 | "source": [ 35 | "##
Loading the required libraries
\n", 36 | "
\n", 37 | "برای اجرای این نوت‌بوک باید کتابخانه ی emoji را نصب کنید.\n", 38 | "بدین منظور به اینترنت متصل شود و در ترمینال دستورات زیر را بنویسید:\n", 39 | "
\n", 40 | "

pip install emoji

\n", 41 | "\n", 42 | "
\n", 43 | "میتوانید به جای pip از کلمه ی conda استفاده کنید. (اگر از آناکوندا استفاده میکنید.)\n", 44 | "
\n" 45 | ] 46 | }, 47 | { 48 | "cell_type": "code", 49 | "execution_count": 1, 50 | "metadata": {}, 51 | "outputs": [ 52 | { 53 | "name": "stderr", 54 | "output_type": "stream", 55 | "text": [ 56 | "Using TensorFlow backend.\n" 57 | ] 58 | } 59 | ], 60 | "source": [ 61 | "import numpy as np\n", 62 | "import matplotlib.pyplot as plt\n", 63 | "import keras\n", 64 | "%matplotlib inline" 65 | ] 66 | }, 67 | { 68 | "cell_type": "markdown", 69 | "metadata": {}, 70 | "source": [ 71 | "## 1 - Baseline model: Emojifier-V1\n", 72 | "\n", 73 | "### 1.1 - Dataset EMOJISET\n", 74 | "\n", 75 | "Let's start by building a simple baseline classifier. \n", 76 | "\n", 77 | "You have a tiny dataset (X, Y) where:\n", 78 | "- X contains 127 sentences (strings)\n", 79 | "- Y contains a integer label between 0 and 4 corresponding to an emoji for each sentence\n", 80 | "\n", 81 | "\n" 82 | ] 83 | }, 84 | { 85 | "cell_type": "markdown", 86 | "metadata": {}, 87 | "source": [ 88 | "###
Helper function for reading the dataset\n", 89 | "
\n" 90 | ] 91 | }, 92 | { 93 | "cell_type": "code", 94 | "execution_count": 2, 95 | "metadata": {}, 96 | "outputs": [], 97 | "source": [ 98 | "import csv\n", 99 | "def read_csv(filename):\n", 100 | " phrase = []\n", 101 | " emoji = []\n", 102 | "\n", 103 | " with open (filename) as csvDataFile:\n", 104 | " csvReader = csv.reader(csvDataFile)\n", 105 | "\n", 106 | " for row in csvReader:\n", 107 | " phrase.append(row[0])\n", 108 | " emoji.append(row[1])\n", 109 | "\n", 110 | " X = np.asarray(phrase)\n", 111 | " Y = np.asarray(emoji, dtype=int)\n", 112 | "\n", 113 | " return X, Y" 114 | ] 115 | }, 116 | { 117 | "cell_type": "markdown", 118 | "metadata": {}, 119 | "source": [ 120 | "###
The Emoji dataset\n", 121 | "
\n", 122 | "\n", 123 | "
You can download the dataset from the following link.
\n", 124 | "
\n", 125 | "\n", 126 | "[http://dataset.class.vision/NLP/emoji.zip](http://dataset.class.vision/NLP/emoji.zip)\n", 127 | "\n", 128 | "
\n" 129 | ] 130 | }, 131 | { 132 | "cell_type": "code", 133 | "execution_count": 3, 134 | "metadata": {}, 135 | "outputs": [], 136 | "source": [ 137 | "X_train, Y_train = read_csv('D:/dataset/NLP/emoji/train_emoji.csv')\n", 138 | "X_test, Y_test = read_csv('D:/dataset/NLP/emoji/tesss.csv')" 139 | ] 140 | }, 141 | { 142 | "cell_type": "markdown", 143 | "metadata": {}, 144 | "source": [ 145 | "
Length of the longest sentence
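\n",
"\n",
"Note: `len(max(X_train, key=len).split())` takes the longest sentence by *character* count and then counts its words; `max(len(s.split()) for s in X_train)` would measure the maximum word count directly.\n",
"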
" 146 | ] 147 | }, 148 | { 149 | "cell_type": "code", 150 | "execution_count": 4, 151 | "metadata": {}, 152 | "outputs": [ 153 | { 154 | "data": { 155 | "text/plain": [ 156 | "10" 157 | ] 158 | }, 159 | "execution_count": 4, 160 | "metadata": {}, 161 | "output_type": "execute_result" 162 | } 163 | ], 164 | "source": [ 165 | "maxLen = len(max(X_train, key=len).split())\n", 166 | "maxLen" 167 | ] 168 | }, 169 | { 170 | "cell_type": "markdown", 171 | "metadata": {}, 172 | "source": [ 173 | "###
Helper function to convert labels to emojis\n", 174 | "
\n" 175 | ] 176 | }, 177 | { 178 | "cell_type": "code", 179 | "execution_count": 9, 180 | "metadata": {}, 181 | "outputs": [], 182 | "source": [ 183 | "import emoji \n", 184 | "\n", 185 | "emoji_dictionary = {\"0\": \"\\u2764\\uFE0F\", # :heart: prints a black instead of red heart depending on the font\n", 186 | " \"1\": \":baseball:\",\n", 187 | " \"2\": \":smile:\",\n", 188 | " \"3\": \":disappointed:\",\n", 189 | " \"4\": \":fork_and_knife:\"}\n", 190 | "\n", 191 | "def label_to_emoji(label):\n", 192 | "\n", 193 | " return emoji.emojize(emoji_dictionary[str(label)], use_aliases=True)" 194 | ] 195 | }, 196 | { 197 | "cell_type": "code", 198 | "execution_count": 10, 199 | "metadata": {}, 200 | "outputs": [ 201 | { 202 | "name": "stdout", 203 | "output_type": "stream", 204 | "text": [ 205 | "I love you mum ❤️\n" 206 | ] 207 | } 208 | ], 209 | "source": [ 210 | "index = 5\n", 211 | "print(X_train[index], label_to_emoji(Y_train[index]))" 212 | ] 213 | }, 214 | { 215 | "cell_type": "markdown", 216 | "metadata": {}, 217 | "source": [ 218 | "### 1.2 - Overview of the Emojifier-V1\n", 219 | "\n", 220 | "\n", 221 | "
\n", 222 | "\n", 223 | "
\n" 224 | ] 225 | }, 226 | { 227 | "cell_type": "markdown", 228 | "metadata": {}, 229 | "source": [ 230 | "###
Converting labels to one-hot vectors\n", 231 | "
\n" 232 | ] 233 | }, 234 | { 235 | "cell_type": "code", 236 | "execution_count": 11, 237 | "metadata": {}, 238 | "outputs": [], 239 | "source": [ 240 | "Y_oh_train = keras.utils.to_categorical(Y_train, 5)\n", 241 | "Y_oh_test = keras.utils.to_categorical(Y_test, 5)" 242 | ] 243 | }, 244 | { 245 | "cell_type": "code", 246 | "execution_count": 12, 247 | "metadata": {}, 248 | "outputs": [ 249 | { 250 | "name": "stdout", 251 | "output_type": "stream", 252 | "text": [ 253 | "0 is converted into one hot [1. 0. 0. 0. 0.]\n" 254 | ] 255 | } 256 | ], 257 | "source": [ 258 | "index = 50\n", 259 | "print(Y_train[index], \"is converted into one hot\", Y_oh_train[index])" 260 | ] 261 | }, 262 | { 263 | "cell_type": "markdown", 264 | "metadata": {}, 265 | "source": [ 266 | "### 1.3 - Implementing Emojifier-V1\n", 267 | "\n", 268 | "As shown in Figure (2), the first step is to convert an input sentence into the word vector representation, which then get averaged together. Similar to the previous exercise, we will use pretrained 50-dimensional GloVe embeddings. Run the following cell to load the `word_to_vec_map`, which contains all the vector representations." 269 | ] 270 | }, 271 | { 272 | "cell_type": "markdown", 273 | "metadata": {}, 274 | "source": [ 275 | "###
Helper function for loading the pretrained embeddings\n", 276 | "
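\n",
"\n",
"The `glove.6B.50d.txt` file used below is part of the standard GloVe \"6B\" download (an assumed source) from https://nlp.stanford.edu/projects/glove/; adjust the local path to wherever you unzip it.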
\n" 277 | ] 278 | }, 279 | { 280 | "cell_type": "code", 281 | "execution_count": 13, 282 | "metadata": {}, 283 | "outputs": [], 284 | "source": [ 285 | "def read_glove_vecs(glove_file):\n", 286 | " with open(glove_file, encoding=\"utf8\") as f:\n", 287 | " words = set()\n", 288 | " word_to_vec_map = {}\n", 289 | " for line in f:\n", 290 | " line = line.strip().split()\n", 291 | " curr_word = line[0]\n", 292 | " words.add(curr_word)\n", 293 | " word_to_vec_map[curr_word] = np.array(line[1:], dtype=np.float64)\n", 294 | " \n", 295 | " i = 1\n", 296 | " words_to_index = {}\n", 297 | " index_to_words = {}\n", 298 | " for w in sorted(words):\n", 299 | " words_to_index[w] = i\n", 300 | " index_to_words[i] = w\n", 301 | " i = i + 1\n", 302 | " return words_to_index, index_to_words, word_to_vec_map" 303 | ] 304 | }, 305 | { 306 | "cell_type": "code", 307 | "execution_count": 14, 308 | "metadata": {}, 309 | "outputs": [], 310 | "source": [ 311 | "word_to_index, index_to_word, word_to_vec_map = read_glove_vecs('D:/dataset/glove.6B/glove.6B.50d.txt')" 312 | ] 313 | }, 314 | { 315 | "cell_type": "markdown", 316 | "metadata": {}, 317 | "source": [ 318 | "You've loaded:\n", 319 | "- `word_to_index`: dictionary mapping from words to their indices in the vocabulary (400,001 words, with the valid indices ranging from 0 to 400,000)\n", 320 | "- `index_to_word`: dictionary mapping from indices to their corresponding words in the vocabulary\n", 321 | "- `word_to_vec_map`: dictionary mapping words to their GloVe vector " 322 | ] 323 | }, 324 | { 325 | "cell_type": "code", 326 | "execution_count": 15, 327 | "metadata": {}, 328 | "outputs": [ 329 | { 330 | "name": "stdout", 331 | "output_type": "stream", 332 | "text": [ 333 | "the index of ali in the vocabulary is 51314\n", 334 | "the 113317th word in the vocabulary is cucumber\n" 335 | ] 336 | } 337 | ], 338 | "source": [ 339 | "word = \"ali\"\n", 340 | "index = 113317\n", 341 | "print(\"the index of\", word, \"in the vocabulary is\", word_to_index[word])\n", 342 | "print(\"the\", str(index) + \"th word in the vocabulary is\", index_to_word[index])" 343 | ] 344 | }, 345 | { 346 | "cell_type": "code", 347 | "execution_count": 16, 348 | "metadata": {}, 349 | "outputs": [ 350 | { 351 | "data": { 352 | "text/plain": [ 353 | "array([-0.71587 , 0.7874 , 0.71305 , -0.089955, 1.366 , -1.3149 ,\n", 354 | " 0.7309 , 0.79725 , 0.47211 , 0.53347 , 0.37542 , -0.10256 ,\n", 355 | " -1.0003 , -0.31226 , 0.26217 , 0.92426 , 0.43014 , -0.015593,\n", 356 | " 0.4149 , 0.88286 , 0.10869 , 0.95213 , 1.1807 , 0.06445 ,\n", 357 | " -0.05814 , -1.797 , -0.18432 , -0.41754 , -0.73625 , 1.1607 ,\n", 358 | " 1.5932 , -0.70268 , -0.61621 , 0.47118 , 0.95046 , 0.35206 ,\n", 359 | " 0.6072 , 0.59339 , -0.47091 , 1.4916 , 0.27146 , 1.8252 ,\n", 360 | " -1.2073 , -0.80058 , 0.52558 , -0.33346 , -1.4102 , -0.21514 ,\n", 361 | " 0.12945 , -0.69603 ])" 362 | ] 363 | }, 364 | "execution_count": 16, 365 | "metadata": {}, 366 | "output_type": "execute_result" 367 | } 368 | ], 369 | "source": [ 370 | "word_to_vec_map[\"ali\"]" 371 | ] 372 | }, 373 | { 374 | "cell_type": "markdown", 375 | "metadata": {}, 376 | "source": [ 377 | "###
Converting a sentence to the average of its word embeddings\n", 378 | "
\n", 379 | "
\n", 380 | "هر جمله را به کلمات تشکیل دهنده آن و سپس هر کلمه را به embedding و در نهایت این بردارهای embedding را میانگین خواهیم گرفت.\n", 381 | "
" 382 | ] 383 | }, 384 | { 385 | "cell_type": "code", 386 | "execution_count": 17, 387 | "metadata": {}, 388 | "outputs": [], 389 | "source": [ 390 | "def sentence_to_avg(sentence, word_to_vec_map):\n", 391 | " \n", 392 | " # Split sentence into list of lower case words\n", 393 | " words = sentence.lower().split()\n", 394 | "\n", 395 | " # Initialize the average word vector, should have the same shape as your word vectors.\n", 396 | " avg = np.zeros((50,))\n", 397 | " \n", 398 | " # average the word vectors. You can loop over the words in the list \"words\".\n", 399 | " for w in words:\n", 400 | " avg += word_to_vec_map[w]\n", 401 | " avg = avg / len(words)\n", 402 | " \n", 403 | " \n", 404 | " return avg" 405 | ] 406 | }, 407 | { 408 | "cell_type": "code", 409 | "execution_count": 18, 410 | "metadata": { 411 | "scrolled": true 412 | }, 413 | "outputs": [ 414 | { 415 | "name": "stdout", 416 | "output_type": "stream", 417 | "text": [ 418 | "avg = [-0.008005 0.56370833 -0.50427333 0.258865 0.55131103 0.03104983\n", 419 | " -0.21013718 0.16893933 -0.09590267 0.141784 -0.15708967 0.18525867\n", 420 | " 0.6495785 0.38371117 0.21102167 0.11301667 0.02613967 0.26037767\n", 421 | " 0.05820667 -0.01578167 -0.12078833 -0.02471267 0.4128455 0.5152061\n", 422 | " 0.38756167 -0.898661 -0.535145 0.33501167 0.68806933 -0.2156265\n", 423 | " 1.797155 0.10476933 -0.36775333 0.750785 0.10282583 0.348925\n", 424 | " -0.27262833 0.66768 -0.10706167 -0.283635 0.59580117 0.28747333\n", 425 | " -0.3366635 0.23393817 0.34349183 0.178405 0.1166155 -0.076433\n", 426 | " 0.1445417 0.09808667]\n" 427 | ] 428 | } 429 | ], 430 | "source": [ 431 | "avg = sentence_to_avg(\"Morrocan couscous is my favorite dish\", word_to_vec_map)\n", 432 | "print(\"avg = \", avg)" 433 | ] 434 | }, 435 | { 436 | "cell_type": "markdown", 437 | "metadata": { 438 | "collapsed": true 439 | }, 440 | "source": [ 441 | "##
The model\n", 442 | "
\n", 443 | "You now have all the pieces to finish implementing the `model()` function. After using `sentence_to_avg()` you need to pass the average through forward propagation, compute the cost, and then backpropagate to update the softmax's parameters. \n", 444 | "\n", 445 | "Assuming here that $Yoh$ (\"Y one hot\") is the one-hot encoding of the output labels, the equations you need to implement in the forward pass and to compute the cross-entropy cost are:\n", 446 | "$$ z^{(i)} = W . avg^{(i)} + b$$\n", 447 | "$$ a^{(i)} = softmax(z^{(i)})$$\n", 448 | "$$ \\mathcal{L}^{(i)} = - \\sum_{k = 0}^{n_y - 1} Yoh^{(i)}_k * log(a^{(i)}_k)$$\n", 449 | "\n", 450 | "It is possible to come up with a more efficient vectorized implementation. But since we are using a for-loop to convert the sentences one at a time into the avg^{(i)} representation anyway, let's not bother this time. \n" 451 | ] 452 | }, 453 | { 454 | "cell_type": "code", 455 | "execution_count": 19, 456 | "metadata": {}, 457 | "outputs": [], 458 | "source": [ 459 | "def softmax(x):\n", 460 | " \"\"\"Compute softmax values for each sets of scores in x.\"\"\"\n", 461 | " e_x = np.exp(x - np.max(x))\n", 462 | " return e_x / e_x.sum()\n", 463 | "\n", 464 | "def predict(X, Y, W, b, word_to_vec_map):\n", 465 | " \"\"\"\n", 466 | " Given X (sentences) and Y (emoji indices), predict emojis and compute the accuracy of your model over the given set.\n", 467 | " \n", 468 | " Arguments:\n", 469 | " X -- input data containing sentences, numpy array of shape (m, None)\n", 470 | " Y -- labels, containing index of the label emoji, numpy array of shape (m, 1)\n", 471 | " \n", 472 | " Returns:\n", 473 | " pred -- numpy array of shape (m, 1) with your predictions\n", 474 | " \"\"\"\n", 475 | " m = X.shape[0]\n", 476 | " pred = np.zeros((m, 1))\n", 477 | " \n", 478 | " for j in range(m): # Loop over training examples\n", 479 | " \n", 480 | " # Split jth test example (sentence) into list of lower case words\n", 481 | " words = X[j].lower().split()\n", 482 | " \n", 483 | " # Average words' vectors\n", 484 | " avg = np.zeros((50,))\n", 485 | " for w in words:\n", 486 | " avg += word_to_vec_map[w]\n", 487 | " avg = avg/len(words)\n", 488 | "\n", 489 | " # Forward propagation\n", 490 | " Z = np.dot(W, avg) + b\n", 491 | " A = softmax(Z)\n", 492 | " pred[j] = np.argmax(A)\n", 493 | " \n", 494 | " print(\"Accuracy: \" + str(np.mean((pred[:] == Y.reshape(Y.shape[0],1)[:]))))\n", 495 | " \n", 496 | " return pred\n", 497 | "\n", 498 | "def model(X, Y, word_to_vec_map, learning_rate = 0.01, num_iterations = 401):\n", 499 | " \"\"\"\n", 500 | " Model to train word vector representations in numpy.\n", 501 | " \n", 502 | " Arguments:\n", 503 | " X -- input data, numpy array of sentences as strings, of shape (m, 1)\n", 504 | " Y -- labels, numpy array of integers between 0 and 7, numpy-array of shape (m, 1)\n", 505 | " word_to_vec_map -- dictionary mapping every word in a vocabulary into its 50-dimensional vector representation\n", 506 | " learning_rate -- learning_rate for the stochastic gradient descent algorithm\n", 507 | " num_iterations -- number of iterations\n", 508 | " \n", 509 | " Returns:\n", 510 | " pred -- vector of predictions, numpy-array of shape (m, 1)\n", 511 | " W -- weight matrix of the softmax layer, of shape (n_y, n_h)\n", 512 | " b -- bias of the softmax layer, of shape (n_y,)\n", 513 | " \"\"\"\n", 514 | " \n", 515 | " np.random.seed(1)\n", 516 | "\n", 517 | " # Define number of training examples\n", 518 | " m = Y.shape[0] # number of training 
examples\n", 519 | " n_y = 5 # number of classes \n", 520 | " n_h = 50 # dimensions of the GloVe vectors \n", 521 | " \n", 522 | " # Initialize parameters using Xavier initialization\n", 523 | " W = np.random.randn(n_y, n_h) / np.sqrt(n_h)\n", 524 | " b = np.zeros((n_y,))\n", 525 | " \n", 526 | " # Convert Y to Y_onehot with n_y classes\n", 527 | " Y_oh = keras.utils.to_categorical(Y, n_y) \n", 528 | " \n", 529 | " # Optimization loop\n", 530 | " for t in range(num_iterations): # Loop over the number of iterations\n", 531 | " for i in range(m): # Loop over the training examples\n", 532 | " \n", 533 | " # Average the word vectors of the words from the i'th training example\n", 534 | " avg = sentence_to_avg(X[i], word_to_vec_map)\n", 535 | "\n", 536 | " # Forward propagate the avg through the softmax layer\n", 537 | " z = np.dot(W, avg) + b\n", 538 | " a = softmax(z)\n", 539 | "\n", 540 | " # Compute cost using the i'th training label's one hot representation and \"A\" (the output of the softmax)\n", 541 | " cost = -np.sum(Y_oh[i] * np.log(a))\n", 542 | " \n", 543 | " # Compute gradients \n", 544 | " dz = a - Y_oh[i]\n", 545 | " dW = np.dot(dz.reshape(n_y,1), avg.reshape(1, n_h))\n", 546 | " db = dz\n", 547 | "\n", 548 | " # Update parameters with Stochastic Gradient Descent\n", 549 | " W = W - learning_rate * dW\n", 550 | " b = b - learning_rate * db\n", 551 | " \n", 552 | " if t % 100 == 0:\n", 553 | " print(\"Epoch: \" + str(t) + \" --- cost = \" + str(cost))\n", 554 | " pred = predict(X, Y, W, b, word_to_vec_map)\n", 555 | "\n", 556 | " return pred, W, b" 557 | ] 558 | }, 559 | { 560 | "cell_type": "code", 561 | "execution_count": 20, 562 | "metadata": { 563 | "scrolled": true 564 | }, 565 | "outputs": [ 566 | { 567 | "name": "stdout", 568 | "output_type": "stream", 569 | "text": [ 570 | "Epoch: 0 --- cost = 1.9520498812810072\n", 571 | "Accuracy: 0.3484848484848485\n", 572 | "Epoch: 100 --- cost = 0.07971818726014807\n", 573 | "Accuracy: 0.9318181818181818\n", 574 | "Epoch: 200 --- cost = 0.04456369243681402\n", 575 | "Accuracy: 0.9545454545454546\n", 576 | "Epoch: 300 --- cost = 0.03432267378786059\n", 577 | "Accuracy: 0.9696969696969697\n", 578 | "Epoch: 400 --- cost = 0.02906976783312465\n", 579 | "Accuracy: 0.9772727272727273\n" 580 | ] 581 | } 582 | ], 583 | "source": [ 584 | "pred, W, b = model(X_train, Y_train, word_to_vec_map)\n" 585 | ] 586 | }, 587 | { 588 | "cell_type": "markdown", 589 | "metadata": { 590 | "collapsed": true 591 | }, 592 | "source": [ 593 | "### 1.4 - Examining test set performance \n" 594 | ] 595 | }, 596 | { 597 | "cell_type": "code", 598 | "execution_count": 22, 599 | "metadata": { 600 | "scrolled": false 601 | }, 602 | "outputs": [ 603 | { 604 | "name": "stdout", 605 | "output_type": "stream", 606 | "text": [ 607 | "Training set:\n", 608 | "Accuracy: 0.9772727272727273\n", 609 | "Test set:\n", 610 | "Accuracy: 0.8571428571428571\n" 611 | ] 612 | } 613 | ], 614 | "source": [ 615 | "print(\"Training set:\")\n", 616 | "pred_train = predict(X_train, Y_train, W, b, word_to_vec_map)\n", 617 | "print('Test set:')\n", 618 | "pred_test = predict(X_test, Y_test, W, b, word_to_vec_map)" 619 | ] 620 | }, 621 | { 622 | "cell_type": "markdown", 623 | "metadata": {}, 624 | "source": [ 625 | "Random guessing would have had 20% accuracy given that there are 5 classes. This is pretty good performance after training on only 127 examples. \n", 626 | "\n", 627 | "In the training set, the algorithm saw the sentence \"*I love you*\" with the label ❤️. 
You can check, however, that the word \"adore\" does not appear in the training set. Nonetheless, let's see what happens if you write \"*I adore you*.\"\n", 628 | "\n" 629 | ] 630 | }, 631 | { 632 | "cell_type": "code", 633 | "execution_count": 23, 634 | "metadata": {}, 635 | "outputs": [], 636 | "source": [ 637 | "def print_predictions(X, pred):\n", 638 | " print()\n", 639 | " for i in range(X.shape[0]):\n", 640 | " print(X[i], label_to_emoji(int(pred[i])))" 641 | ] 642 | }, 643 | { 644 | "cell_type": "code", 645 | "execution_count": 24, 646 | "metadata": {}, 647 | "outputs": [ 648 | { 649 | "name": "stdout", 650 | "output_type": "stream", 651 | "text": [ 652 | "Accuracy: 0.8333333333333334\n", 653 | "\n", 654 | "i adore you ❤️\n", 655 | "i love you ❤️\n", 656 | "funny lol 😄\n", 657 | "lets play with a ball ⚾\n", 658 | "food is ready 🍴\n", 659 | "not feeling happy 😄\n" 660 | ] 661 | } 662 | ], 663 | "source": [ 664 | "X_my_sentences = np.array([\"i adore you\", \"i love you\", \"funny lol\", \"lets play with a ball\", \"food is ready\", \"not feeling happy\"])\n", 665 | "Y_my_labels = np.array([[0], [0], [2], [1], [4],[3]])\n", 666 | "\n", 667 | "pred = predict(X_my_sentences, Y_my_labels , W, b, word_to_vec_map)\n", 668 | "print_predictions(X_my_sentences, pred)" 669 | ] 670 | }, 671 | { 672 | "cell_type": "markdown", 673 | "metadata": {}, 674 | "source": [ 675 | "Amazing! Because *adore* has an embedding similar to *love*, the algorithm has generalized correctly even to a word it has never seen before. Words such as *heart*, *dear*, *beloved* or *adore* have embedding vectors similar to *love*, and so might work too; feel free to modify the inputs above and try out a variety of input sentences. How well does it work?\n", 676 | "\n", 677 | "Note though that it doesn't get \"not feeling happy\" correct. This algorithm ignores word ordering, so it is not good at understanding phrases like \"not happy.\"\n", 678 | "\n", 679 | "Printing the confusion matrix can also help you understand which classes are more difficult for your model. A confusion matrix shows how often an example whose label is one class (\"actual\" class) is mislabeled by the algorithm with a different class (\"predicted\" class).\n", 680 | "\n", 681 | "\n" 682 | ] 683 | }, 684 | { 685 | "cell_type": "markdown", 686 | "metadata": {}, 687 | "source": [ 688 | "## 2 - Emojifier-V2: Using LSTMs in Keras\n", 689 | "\n", 690 | "Let's build an LSTM model that takes word sequences as input. This model will be able to take word ordering into account. Emojifier-V2 will continue to use pre-trained word embeddings to represent words, but will feed them into an LSTM, whose job it is to predict the most appropriate emoji.\n", 691 | "\n", 692 | "Run the following cell to load the Keras packages." 693 | ] 694 | }, 695 | { 696 | "cell_type": "code", 697 | "execution_count": 25, 698 | "metadata": {}, 699 | "outputs": [], 700 | "source": [ 701 | "import numpy as np\n", 702 | "np.random.seed(0)\n", 703 | "from keras.models import Model\n", 704 | "from keras.layers import Dense, Input, Dropout, LSTM, Activation\n", 705 | "from keras.layers.embeddings import Embedding\n", 706 | "from keras.preprocessing import sequence\n", 707 | "from keras.initializers import glorot_uniform\n", 708 | "np.random.seed(1)" 709 | ] 710 | }, 711 | { 712 | "cell_type": "markdown", 713 | "metadata": {}, 714 | "source": [ 715 | "### 2.1 - Overview of the model\n", 716 | "\n", 717 | "Here is the Emojifier-v2 you will implement:\n", 718 | "\n", 719 | "
\n", 720 | "
**Figure 3**: Emojifier-V2. A 2-layer LSTM sequence classifier.
\n", 721 | "\n" 722 | ] 723 | }, 724 | { 725 | "cell_type": "markdown", 726 | "metadata": {}, 727 | "source": [ 728 | "### 2.3 - The Embedding layer\n", 729 | "\n", 730 | "In Keras, the embedding matrix is represented as a \"layer\", and maps positive integers (indices corresponding to words) into dense vectors of fixed size (the embedding vectors). It can be trained or initialized with a pretrained embedding. In this part, you will learn how to create an [Embedding()](https://keras.io/layers/embeddings/) layer in Keras, initialize it with the GloVe 50-dimensional vectors loaded earlier in the notebook. Because our training set is quite small, we will not update the word embeddings but will instead leave their values fixed. But in the code below, we'll show you how Keras allows you to either train or leave fixed this layer. \n", 731 | "\n", 732 | "The `Embedding()` layer takes an integer matrix of size (batch size, max input length) as input. This corresponds to sentences converted into lists of indices (integers), as shown in the figure below.\n", 733 | "\n", 734 | "\n", 735 | "
**Figure 4**: Embedding layer. This example shows the propagation of two examples through the embedding layer. Both have been zero-padded to a length of `max_len=5`. The final dimension of the representation is `(2,max_len,50)` because the word embeddings we are using are 50 dimensional.
\n", 736 | "\n", 737 | "The largest integer (i.e. word index) in the input should be no larger than the vocabulary size. The layer outputs an array of shape (batch size, max input length, dimension of word vectors).\n", 738 | "\n", 739 | "The first step is to convert all your training sentences into lists of indices, and then zero-pad all these lists so that their length is the length of the longest sentence. \n", 740 | "\n" 741 | ] 742 | }, 743 | { 744 | "cell_type": "markdown", 745 | "metadata": {}, 746 | "source": [ 747 | "###
Converting sentences to indices\n", 748 | "
\n", 749 | "
\n", 750 | "این تابع طول تمام جمله ها را نیز یکسان میکند.\n", 751 | "
\n", 752 | "\n" 753 | ] 754 | }, 755 | { 756 | "cell_type": "code", 757 | "execution_count": 26, 758 | "metadata": {}, 759 | "outputs": [], 760 | "source": [ 761 | "\n", 762 | "def sentences_to_indices(X, word_to_index, max_len):\n", 763 | " m = X.shape[0] # number of training examples\n", 764 | " \n", 765 | " # Initialize X_indices as a numpy matrix of zeros and the correct shape (≈ 1 line)\n", 766 | " X_indices = np.zeros((m, max_len))\n", 767 | " \n", 768 | " for i in range(m): # loop over training examples\n", 769 | " \n", 770 | " # Convert the ith training sentence in lower case and split is into words. You should get a list of words.\n", 771 | " sentence_words =X[i].lower().split()\n", 772 | "\n", 773 | " \n", 774 | " # Loop over the words of sentence_words\n", 775 | " for j, w in enumerate(sentence_words):\n", 776 | " # Set the (i,j)th entry of X_indices to the index of the correct word.\n", 777 | " X_indices[i, j] = word_to_index[w]\n", 778 | "\n", 779 | " return X_indices" 780 | ] 781 | }, 782 | { 783 | "cell_type": "markdown", 784 | "metadata": {}, 785 | "source": [ 786 | "Run the following cell to check what `sentences_to_indices()` does, and check your results." 787 | ] 788 | }, 789 | { 790 | "cell_type": "code", 791 | "execution_count": 29, 792 | "metadata": {}, 793 | "outputs": [ 794 | { 795 | "name": "stdout", 796 | "output_type": "stream", 797 | "text": [ 798 | "X1 = ['funny lol' 'lets play baseball' 'food is ready for you']\n", 799 | "X1_indices = [[155345. 225122. 0. 0. 0.]\n", 800 | " [220930. 286375. 69714. 0. 0.]\n", 801 | " [151204. 192973. 302254. 151349. 394475.]]\n" 802 | ] 803 | } 804 | ], 805 | "source": [ 806 | "X1 = np.array([\"funny lol\", \"lets play baseball\", \"food is ready for you\"])\n", 807 | "X1_indices = sentences_to_indices(X1,word_to_index, max_len = 5)\n", 808 | "print(\"X1 =\", X1)\n", 809 | "print(\"X1_indices =\", X1_indices)" 810 | ] 811 | }, 812 | { 813 | "cell_type": "markdown", 814 | "metadata": {}, 815 | "source": [ 816 | "###
A function to create the Embedding layer and load the pretrained weights\n", 817 | "
" 818 | ] 819 | }, 820 | { 821 | "cell_type": "markdown", 822 | "metadata": {}, 823 | "source": [ 824 | "Let's build the `Embedding()` layer in Keras, using pre-trained word vectors. After this layer is built, you will pass the output of `sentences_to_indices()` to it as an input, and the `Embedding()` layer will return the word embeddings for a sentence. \n", 825 | "\n", 826 | "\n", 827 | "1. Initialize the embedding matrix as a numpy array of zeroes with the correct shape.\n", 828 | "2. Fill in the embedding matrix with all the word embeddings extracted from `word_to_vec_map`.\n", 829 | "3. Define Keras embedding layer. Use [Embedding()](https://keras.io/layers/embeddings/). Be sure to make this layer non-trainable, by setting `trainable = False` when calling `Embedding()`. If you were to set `trainable = True`, then it will allow the optimization algorithm to modify the values of the word embeddings. \n", 830 | "4. Set the embedding weights to be equal to the embedding matrix " 831 | ] 832 | }, 833 | { 834 | "cell_type": "code", 835 | "execution_count": 30, 836 | "metadata": {}, 837 | "outputs": [], 838 | "source": [ 839 | "from keras.layers import Embedding\n", 840 | "def pretrained_embedding_layer(word_to_vec_map, word_to_index):\n", 841 | " \n", 842 | " vocab_len = len(word_to_index) + 1 # adding 1 to fit Keras embedding (requirement)\n", 843 | " emb_dim = word_to_vec_map[\"cucumber\"].shape[0] # define dimensionality of your GloVe word vectors (= 50)\n", 844 | " \n", 845 | " # Initialize the embedding matrix as a numpy array of zeros of shape (vocab_len, dimensions of word vectors = emb_dim)\n", 846 | " emb_matrix = np.zeros((vocab_len, emb_dim))\n", 847 | " \n", 848 | " # Set each row \"index\" of the embedding matrix to be the word vector representation of the \"index\"th word of the vocabulary\n", 849 | " for word, index in word_to_index.items():\n", 850 | " emb_matrix[index, :] = word_to_vec_map[word]\n", 851 | "\n", 852 | " # Define Keras embedding layer with the correct output/input sizes, make it trainable. Use Embedding(...). Make sure to set trainable=False. \n", 853 | " embedding_layer = Embedding(vocab_len, emb_dim, trainable = False)\n", 854 | "\n", 855 | " # Build the embedding layer, it is required before setting the weights of the embedding layer. Do not modify the \"None\".\n", 856 | " embedding_layer.build((None,))\n", 857 | " \n", 858 | " # Set the weights of the embedding layer to the embedding matrix. 
Your layer is now pretrained.\n", 859 | " embedding_layer.set_weights([emb_matrix])\n", 860 | " \n", 861 | " return embedding_layer" 862 | ] 863 | }, 864 | { 865 | "cell_type": "markdown", 866 | "metadata": {}, 867 | "source": [ 868 | "## 2.3 Building the Emojifier-V2\n" 869 | ] 870 | }, 871 | { 872 | "cell_type": "code", 873 | "execution_count": 33, 874 | "metadata": {}, 875 | "outputs": [], 876 | "source": [ 877 | "from keras.layers import Input\n", 878 | "from keras.layers import LSTM\n", 879 | "from keras.layers import Dense, Dropout\n", 880 | "from keras.models import Model\n", 881 | "\n", 882 | "def Emojify_V2(input_shape, word_to_vec_map, word_to_index):\n", 883 | "\n", 884 | " # Define sentence_indices as the input of the graph, it should be of shape input_shape and dtype 'int32' (as it contains indices).\n", 885 | " sentence_indices = Input(input_shape, dtype = np.int32)\n", 886 | " \n", 887 | " # Create the embedding layer pretrained with GloVe Vectors (≈1 line)\n", 888 | " embedding_layer = pretrained_embedding_layer(word_to_vec_map, word_to_index)\n", 889 | " \n", 890 | " # Propagate sentence_indices through your embedding layer, you get back the embeddings\n", 891 | " embeddings = embedding_layer(sentence_indices)\n", 892 | " \n", 893 | " # Propagate the embeddings through an LSTM layer with 128-dimensional hidden state\n", 894 | " # Be careful, the returned output should be a batch of sequences.\n", 895 | " X = LSTM(128, return_sequences=True)(embeddings)\n", 896 | " # Add dropout with a probability of 0.5\n", 897 | " X = Dropout(0.5)(X)\n", 898 | " # Propagate X trough another LSTM layer with 128-dimensional hidden state\n", 899 | " # Be careful, the returned output should be a single hidden state, not a batch of sequences.\n", 900 | " X = LSTM(128)(X)\n", 901 | " # Add dropout with a probability of 0.5\n", 902 | " X = Dropout(0.5)(X)\n", 903 | " # Propagate X through a Dense layer with softmax activation to get back a batch of 5-dimensional vectors.\n", 904 | " X = Dense(5, activation = 'softmax')(X)\n", 905 | " \n", 906 | " # Create Model instance which converts sentence_indices into X.\n", 907 | " model = Model(sentence_indices, X)\n", 908 | " \n", 909 | " \n", 910 | " return model" 911 | ] 912 | }, 913 | { 914 | "cell_type": "code", 915 | "execution_count": 34, 916 | "metadata": { 917 | "scrolled": false 918 | }, 919 | "outputs": [ 920 | { 921 | "name": "stdout", 922 | "output_type": "stream", 923 | "text": [ 924 | "Model: \"model_1\"\n", 925 | "_________________________________________________________________\n", 926 | "Layer (type) Output Shape Param # \n", 927 | "=================================================================\n", 928 | "input_2 (InputLayer) (None, 10) 0 \n", 929 | "_________________________________________________________________\n", 930 | "embedding_2 (Embedding) (None, 10, 50) 20000050 \n", 931 | "_________________________________________________________________\n", 932 | "lstm_3 (LSTM) (None, 10, 128) 91648 \n", 933 | "_________________________________________________________________\n", 934 | "dropout_3 (Dropout) (None, 10, 128) 0 \n", 935 | "_________________________________________________________________\n", 936 | "lstm_4 (LSTM) (None, 128) 131584 \n", 937 | "_________________________________________________________________\n", 938 | "dropout_4 (Dropout) (None, 128) 0 \n", 939 | "_________________________________________________________________\n", 940 | "dense_2 (Dense) (None, 5) 645 \n", 941 | 
"=================================================================\n", 942 | "Total params: 20,223,927\n", 943 | "Trainable params: 223,877\n", 944 | "Non-trainable params: 20,000,050\n", 945 | "_________________________________________________________________\n" 946 | ] 947 | } 948 | ], 949 | "source": [ 950 | "model = Emojify_V2((maxLen,), word_to_vec_map, word_to_index)\n", 951 | "model.summary()" 952 | ] 953 | }, 954 | { 955 | "cell_type": "markdown", 956 | "metadata": {}, 957 | "source": [ 958 | "As usual, after creating your model in Keras, you need to compile it and define what loss, optimizer and metrics your are want to use. Compile your model using `categorical_crossentropy` loss, `adam` optimizer and `['accuracy']` metrics:" 959 | ] 960 | }, 961 | { 962 | "cell_type": "code", 963 | "execution_count": 35, 964 | "metadata": {}, 965 | "outputs": [], 966 | "source": [ 967 | "model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])" 968 | ] 969 | }, 970 | { 971 | "cell_type": "markdown", 972 | "metadata": {}, 973 | "source": [ 974 | "It's time to train your model. Your Emojifier-V2 `model` takes as input an array of shape (`m`, `max_len`) and outputs probability vectors of shape (`m`, `number of classes`). We thus have to convert X_train (array of sentences as strings) to X_train_indices (array of sentences as list of word indices), and Y_train (labels as indices) to Y_train_oh (labels as one-hot vectors)." 975 | ] 976 | }, 977 | { 978 | "cell_type": "code", 979 | "execution_count": 36, 980 | "metadata": {}, 981 | "outputs": [], 982 | "source": [ 983 | "X_train_indices = sentences_to_indices(X_train, word_to_index, maxLen)\n", 984 | "Y_train_oh = keras.utils.to_categorical(Y_train, 5)" 985 | ] 986 | }, 987 | { 988 | "cell_type": "code", 989 | "execution_count": 37, 990 | "metadata": { 991 | "scrolled": true 992 | }, 993 | "outputs": [ 994 | { 995 | "name": "stdout", 996 | "output_type": "stream", 997 | "text": [ 998 | "Epoch 1/50\n", 999 | "132/132 [==============================] - 2s 14ms/step - loss: 1.5785 - accuracy: 0.2803\n", 1000 | "Epoch 2/50\n", 1001 | "132/132 [==============================] - 0s 2ms/step - loss: 1.5018 - accuracy: 0.3333\n", 1002 | "Epoch 3/50\n", 1003 | "132/132 [==============================] - 0s 2ms/step - loss: 1.4604 - accuracy: 0.3485\n", 1004 | "Epoch 4/50\n", 1005 | "132/132 [==============================] - 0s 2ms/step - loss: 1.3685 - accuracy: 0.4470\n", 1006 | "Epoch 5/50\n", 1007 | "132/132 [==============================] - 0s 2ms/step - loss: 1.2991 - accuracy: 0.4697\n", 1008 | "Epoch 6/50\n", 1009 | "132/132 [==============================] - 0s 2ms/step - loss: 1.1840 - accuracy: 0.5455\n", 1010 | "Epoch 7/50\n", 1011 | "132/132 [==============================] - 0s 2ms/step - loss: 1.0434 - accuracy: 0.6061\n", 1012 | "Epoch 8/50\n", 1013 | "132/132 [==============================] - 0s 2ms/step - loss: 0.8732 - accuracy: 0.7197\n", 1014 | "Epoch 9/50\n", 1015 | "132/132 [==============================] - 0s 2ms/step - loss: 0.8077 - accuracy: 0.7273\n", 1016 | "Epoch 10/50\n", 1017 | "132/132 [==============================] - 0s 2ms/step - loss: 0.6823 - accuracy: 0.7500\n", 1018 | "Epoch 11/50\n", 1019 | "132/132 [==============================] - 0s 2ms/step - loss: 0.6593 - accuracy: 0.7424\n", 1020 | "Epoch 12/50\n", 1021 | "132/132 [==============================] - 0s 2ms/step - loss: 0.5245 - accuracy: 0.7879\n", 1022 | "Epoch 13/50\n", 1023 | "132/132 [==============================] - 0s 
2ms/step - loss: 0.5793 - accuracy: 0.8106\n", 1024 | "Epoch 14/50\n", 1025 | "132/132 [==============================] - 0s 2ms/step - loss: 0.4114 - accuracy: 0.8712\n", 1026 | "Epoch 15/50\n", 1027 | "132/132 [==============================] - 0s 2ms/step - loss: 0.4243 - accuracy: 0.8333\n", 1028 | "Epoch 16/50\n", 1029 | "132/132 [==============================] - 0s 2ms/step - loss: 0.3319 - accuracy: 0.8788\n", 1030 | "Epoch 17/50\n", 1031 | "132/132 [==============================] - 0s 2ms/step - loss: 0.3110 - accuracy: 0.8864\n", 1032 | "Epoch 18/50\n", 1033 | "132/132 [==============================] - 0s 2ms/step - loss: 0.3255 - accuracy: 0.8864\n", 1034 | "Epoch 19/50\n", 1035 | "132/132 [==============================] - 0s 2ms/step - loss: 0.3134 - accuracy: 0.9167\n", 1036 | "Epoch 20/50\n", 1037 | "132/132 [==============================] - 0s 2ms/step - loss: 0.2598 - accuracy: 0.9015\n", 1038 | "Epoch 21/50\n", 1039 | "132/132 [==============================] - 0s 2ms/step - loss: 0.2034 - accuracy: 0.9318\n", 1040 | "Epoch 22/50\n", 1041 | "132/132 [==============================] - 0s 2ms/step - loss: 0.3006 - accuracy: 0.8939\n", 1042 | "Epoch 23/50\n", 1043 | "132/132 [==============================] - 0s 2ms/step - loss: 0.1704 - accuracy: 0.9394\n", 1044 | "Epoch 24/50\n", 1045 | "132/132 [==============================] - 0s 2ms/step - loss: 0.1986 - accuracy: 0.9470\n", 1046 | "Epoch 25/50\n", 1047 | "132/132 [==============================] - 0s 2ms/step - loss: 0.1613 - accuracy: 0.9621\n", 1048 | "Epoch 26/50\n", 1049 | "132/132 [==============================] - 0s 2ms/step - loss: 0.1410 - accuracy: 0.9545\n", 1050 | "Epoch 27/50\n", 1051 | "132/132 [==============================] - 0s 2ms/step - loss: 0.1857 - accuracy: 0.9167\n", 1052 | "Epoch 28/50\n", 1053 | "132/132 [==============================] - 0s 2ms/step - loss: 0.2262 - accuracy: 0.9394\n", 1054 | "Epoch 29/50\n", 1055 | "132/132 [==============================] - 0s 2ms/step - loss: 0.1146 - accuracy: 0.9470\n", 1056 | "Epoch 30/50\n", 1057 | "132/132 [==============================] - 0s 2ms/step - loss: 0.1039 - accuracy: 0.9697\n", 1058 | "Epoch 31/50\n", 1059 | "132/132 [==============================] - 0s 2ms/step - loss: 0.1061 - accuracy: 0.9621\n", 1060 | "Epoch 32/50\n", 1061 | "132/132 [==============================] - 0s 2ms/step - loss: 0.2473 - accuracy: 0.9167\n", 1062 | "Epoch 33/50\n", 1063 | "132/132 [==============================] - 0s 2ms/step - loss: 0.1782 - accuracy: 0.9318\n", 1064 | "Epoch 34/50\n", 1065 | "132/132 [==============================] - 0s 2ms/step - loss: 0.1944 - accuracy: 0.9394\n", 1066 | "Epoch 35/50\n", 1067 | "132/132 [==============================] - 0s 2ms/step - loss: 0.1595 - accuracy: 0.9318\n", 1068 | "Epoch 36/50\n", 1069 | "132/132 [==============================] - 0s 2ms/step - loss: 0.1048 - accuracy: 0.9545\n", 1070 | "Epoch 37/50\n", 1071 | "132/132 [==============================] - 0s 2ms/step - loss: 0.1591 - accuracy: 0.9545\n", 1072 | "Epoch 38/50\n", 1073 | "132/132 [==============================] - 0s 2ms/step - loss: 0.0960 - accuracy: 0.9621\n", 1074 | "Epoch 39/50\n", 1075 | "132/132 [==============================] - 0s 2ms/step - loss: 0.0617 - accuracy: 0.9924\n", 1076 | "Epoch 40/50\n", 1077 | "132/132 [==============================] - 0s 2ms/step - loss: 0.0381 - accuracy: 1.0000\n", 1078 | "Epoch 41/50\n", 1079 | "132/132 [==============================] - 0s 2ms/step - loss: 0.0345 - accuracy: 1.0000\n", 1080 | 
"Epoch 42/50\n", 1081 | "132/132 [==============================] - 0s 2ms/step - loss: 0.0371 - accuracy: 1.0000\n", 1082 | "Epoch 43/50\n", 1083 | "132/132 [==============================] - 0s 2ms/step - loss: 0.1414 - accuracy: 0.9621\n", 1084 | "Epoch 44/50\n", 1085 | "132/132 [==============================] - 0s 2ms/step - loss: 0.0375 - accuracy: 0.9924\n", 1086 | "Epoch 45/50\n", 1087 | "132/132 [==============================] - 0s 2ms/step - loss: 0.0719 - accuracy: 0.9848\n", 1088 | "Epoch 46/50\n", 1089 | "132/132 [==============================] - 0s 2ms/step - loss: 0.0595 - accuracy: 0.9848\n", 1090 | "Epoch 47/50\n", 1091 | "132/132 [==============================] - 0s 2ms/step - loss: 0.0240 - accuracy: 0.9848\n", 1092 | "Epoch 48/50\n", 1093 | "132/132 [==============================] - 0s 2ms/step - loss: 0.0170 - accuracy: 1.0000\n", 1094 | "Epoch 49/50\n", 1095 | "132/132 [==============================] - 0s 2ms/step - loss: 0.0152 - accuracy: 1.0000\n", 1096 | "Epoch 50/50\n", 1097 | "132/132 [==============================] - 0s 2ms/step - loss: 0.0097 - accuracy: 1.0000\n" 1098 | ] 1099 | }, 1100 | { 1101 | "data": { 1102 | "text/plain": [ 1103 | "" 1104 | ] 1105 | }, 1106 | "execution_count": 37, 1107 | "metadata": {}, 1108 | "output_type": "execute_result" 1109 | } 1110 | ], 1111 | "source": [ 1112 | "model.fit(X_train_indices, Y_train_oh, epochs = 50, batch_size = 32, shuffle=True)" 1113 | ] 1114 | }, 1115 | { 1116 | "cell_type": "markdown", 1117 | "metadata": {}, 1118 | "source": [ 1119 | "Your model should perform close to **100% accuracy** on the training set. The exact accuracy you get may be a little different. Run the following cell to evaluate your model on the test set. " 1120 | ] 1121 | }, 1122 | { 1123 | "cell_type": "code", 1124 | "execution_count": 38, 1125 | "metadata": { 1126 | "scrolled": true 1127 | }, 1128 | "outputs": [ 1129 | { 1130 | "name": "stdout", 1131 | "output_type": "stream", 1132 | "text": [ 1133 | "56/56 [==============================] - 0s 4ms/step\n", 1134 | "\n", 1135 | "Test accuracy = 0.8035714030265808\n" 1136 | ] 1137 | } 1138 | ], 1139 | "source": [ 1140 | "X_test_indices = sentences_to_indices(X_test, word_to_index, max_len = maxLen)\n", 1141 | "Y_test_oh = keras.utils.to_categorical(Y_test, 5)\n", 1142 | "loss, acc = model.evaluate(X_test_indices, Y_test_oh)\n", 1143 | "print()\n", 1144 | "print(\"Test accuracy = \", acc)" 1145 | ] 1146 | }, 1147 | { 1148 | "cell_type": "markdown", 1149 | "metadata": {}, 1150 | "source": [ 1151 | "You should get a test accuracy between 80% and 95%. Run the cell below to see the mislabelled examples. 
" 1152 | ] 1153 | }, 1154 | { 1155 | "cell_type": "code", 1156 | "execution_count": 39, 1157 | "metadata": {}, 1158 | "outputs": [ 1159 | { 1160 | "name": "stdout", 1161 | "output_type": "stream", 1162 | "text": [ 1163 | "Expected emoji:😄 prediction: she got me a nice present\t❤️\n", 1164 | "Expected emoji:😞 prediction: work is hard\t😄\n", 1165 | "Expected emoji:😞 prediction: This girl is messing with me\t❤️\n", 1166 | "Expected emoji:😞 prediction: work is horrible\t😄\n", 1167 | "Expected emoji:🍴 prediction: any suggestions for dinner\t😄\n", 1168 | "Expected emoji:😄 prediction: you brighten my day\t❤️\n", 1169 | "Expected emoji:😞 prediction: she is a bully\t❤️\n", 1170 | "Expected emoji:😞 prediction: My life is so boring\t❤️\n", 1171 | "Expected emoji:😄 prediction: What you did was awesome\t😞\n", 1172 | "Expected emoji:😞 prediction: go away\t⚾\n", 1173 | "Expected emoji:❤️ prediction: family is all I have\t😞\n" 1174 | ] 1175 | } 1176 | ], 1177 | "source": [ 1178 | "# This code allows you to see the mislabelled examples\n", 1179 | "C = 5\n", 1180 | "y_test_oh = np.eye(C)[Y_test.reshape(-1)]\n", 1181 | "X_test_indices = sentences_to_indices(X_test, word_to_index, maxLen)\n", 1182 | "pred = model.predict(X_test_indices)\n", 1183 | "for i in range(len(X_test)):\n", 1184 | " x = X_test_indices\n", 1185 | " num = np.argmax(pred[i])\n", 1186 | " if(num != Y_test[i]):\n", 1187 | " print('Expected emoji:'+ label_to_emoji(Y_test[i]) + ' prediction: '+ X_test[i] + label_to_emoji(num).strip())" 1188 | ] 1189 | }, 1190 | { 1191 | "cell_type": "markdown", 1192 | "metadata": {}, 1193 | "source": [ 1194 | "Now you can try it on your own example. Write your own sentence below. " 1195 | ] 1196 | }, 1197 | { 1198 | "cell_type": "code", 1199 | "execution_count": 40, 1200 | "metadata": {}, 1201 | "outputs": [ 1202 | { 1203 | "name": "stdout", 1204 | "output_type": "stream", 1205 | "text": [ 1206 | "not feeling happy 😞\n" 1207 | ] 1208 | } 1209 | ], 1210 | "source": [ 1211 | "# Change the sentence below to see your prediction. Make sure all the words are in the Glove embeddings. \n", 1212 | "x_test = np.array(['not feeling happy'])\n", 1213 | "X_test_indices = sentences_to_indices(x_test, word_to_index, maxLen)\n", 1214 | "print(x_test[0] +' '+ label_to_emoji(np.argmax(model.predict(X_test_indices))))" 1215 | ] 1216 | }, 1217 | { 1218 | "cell_type": "markdown", 1219 | "metadata": {}, 1220 | "source": [ 1221 | "Previously, Emojify-V1 model did not correctly label \"not feeling happy,\" but our implementation of Emojiy-V2 got it right. (Keras' outputs are slightly random each time, so you may not have obtained the same result.) The current model still isn't very robust at understanding negation (like \"not happy\") because the training set is small and so doesn't have a lot of examples of negation. But if the training set were larger, the LSTM model would be much better than the Emojify-V1 model at understanding such complex sentences. \n" 1222 | ] 1223 | }, 1224 | { 1225 | "cell_type": "markdown", 1226 | "metadata": {}, 1227 | "source": [ 1228 | "
\n", 1229 | "
1217 | { 1218 | "cell_type": "markdown", 1219 | "metadata": {}, 1220 | "source": [ 1221 | "Previously, the Emojify-V1 model did not correctly label \"not feeling happy,\" but our implementation of Emojify-V2 got it right. (Keras initializes weights randomly, so you may not obtain exactly the same result.) The current model is still not very robust at handling negation (like \"not happy\"), because the training set is small and contains few examples of negation. But if the training set were larger, the LSTM model would be much better than the Emojify-V1 model at understanding such complex sentences. \n" 1222 | ] 1223 | },
\n", 1231 | "Class.Vision - AkhavanPour.ir - GitHub\n", 1232 | "\n", 1233 | "
" 1234 | ] 1235 | } 1236 | ], 1237 | "metadata": { 1238 | "coursera": { 1239 | "course_slug": "nlp-sequence-models", 1240 | "graded_item_id": "RNnEs", 1241 | "launcher_item_id": "acNYU" 1242 | }, 1243 | "kernelspec": { 1244 | "display_name": "tf2-GPU", 1245 | "language": "python", 1246 | "name": "tf2" 1247 | }, 1248 | "language_info": { 1249 | "codemirror_mode": { 1250 | "name": "ipython", 1251 | "version": 3 1252 | }, 1253 | "file_extension": ".py", 1254 | "mimetype": "text/x-python", 1255 | "name": "python", 1256 | "nbconvert_exporter": "python", 1257 | "pygments_lexer": "ipython3", 1258 | "version": "3.6.9" 1259 | } 1260 | }, 1261 | "nbformat": 4, 1262 | "nbformat_minor": 2 1263 | } 1264 | --------------------------------------------------------------------------------