├── .gitignore ├── .ipynb_checkpoints └── README-checkpoint.md ├── Deep Learning Emotion Recognition PyTorch.ipynb ├── Deep Learning Emotion Recognition TensorFlow .ipynb ├── README.md ├── data └── merged_training.pkl ├── helpers ├── .ipynb_checkpoints │ └── evaluate-checkpoint.py ├── __pycache__ │ ├── evaluate.cpython-36.pyc │ └── pickle_helpers.cpython-36.pyc ├── evaluate.py └── pickle_helpers.py └── img ├── autograd.jpg ├── dl_frameworks.png ├── emotion_classifier.png ├── gru-model.png └── tensor.png /.gitignore: -------------------------------------------------------------------------------- 1 | */__pycache__ 2 | __pycache__/ 3 | .ipynb_checkpoints -------------------------------------------------------------------------------- /.ipynb_checkpoints/README-checkpoint.md: -------------------------------------------------------------------------------- 1 | ## Deep Learning Based NLP 2 | This repository contains python notebooks related to several deep learning based NLP tasks such as emotion recognition and neural machine translation. It will provide implementations based on several deep learning frameworks such as PyTorch and TensorFlow. 
3 | 4 | ## Emotion Recognition with GRU 5 | - [Deep Learning Based Emotion Recognition With PyTorch](https://github.com/omarsar/nlp_pytorch_tensorflow_notebooks/blob/master/Deep%20Learning%20Emotion%20Recognition%20PyTorch.ipynb) 6 | - [Deep Learning Based Emotion Recognition With TensorFlow](https://github.com/omarsar/nlp_pytorch_tensorflow_notebooks/blob/master/Deep%20Learning%20Emotion%20Recognition%20TensorFlow%20.ipynb) 7 | 8 | --- 9 | Author: [Elvis Saravia](https://twitter.com/omarsar0) -------------------------------------------------------------------------------- /Deep Learning Emotion Recognition TensorFlow .ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Deep Learning Based Emotion Recognition with TensorFlow\n", 8 | "\n", 9 | "![alt txt](img/emotion_classifier.png)\n", 10 | "\n", 11 | "In this notebook we are going to learn how to train deep neural networks, such as recurrent neural networks (RNNs), for addressing a natural language task known as **emotion recognition**. We will cover everything you need to know to get started with NLP using deep learning frameworks such as TensorFlow. We will cover the common best practices, functionalities, and steps you need to understand the basics of TensorFlow APIs to build powerful predictive models via the computation graph. In the process of building our models, we will compare PyTorch and TensorFlow to let the learner appreciate the strengths of each tool.\n", 12 | "\n", 13 | "by [Elvis Saravia](https://twitter.com/omarsar0)" 14 | ] 15 | }, 16 | { 17 | "cell_type": "markdown", 18 | "metadata": {}, 19 | "source": [ 20 | "---" 21 | ] 22 | }, 23 | { 24 | "cell_type": "markdown", 25 | "metadata": {}, 26 | "source": [ 27 | "## Outline\n", 28 | "1. Deep Learning Frameworks\n", 29 | " - 1.1 Eager execution\n", 30 | " - 1.2 Computation graph\n", 31 | "2. 
Tensors\n", 32 | " - 2.1 Basic math with tensors\n", 33 | " - 2.2 Transforming tensors\n", 34 | "3. Data\n", 35 | " - 3.1 Preprocessing data\n", 36 | " - Tokenization and Sampling\n", 37 | " - Constructing Vocabulary and Index-Word Mapping\n", 38 | " - 3.2 Converting data into tensors\n", 39 | " - 3.3 Padding data\n", 40 | " - 3.4 Binarization\n", 41 | " - 3.5 Split data\n", 42 | " - 3.6 Data Loader\n", 43 | "4. Model\n", 44 | " - 4.1 Pretesting Model\n", 45 | " - 4.2 Testing models with eager execution\n", 46 | "5. Training\n", 47 | "6. Evaluation on Testing Dataset\n", 48 | " - 6.1 Confusion matrix\n", 49 | "- Final Words\n", 50 | "- References\n", 51 | "- *Storing models and setting checkpoints (Exercise)*\n", 52 | "- *Restoring models (Exercise)*" 53 | ] 54 | }, 55 | { 56 | "cell_type": "markdown", 57 | "metadata": {}, 58 | "source": [ 59 | "---" 60 | ] 61 | }, 62 | { 63 | "cell_type": "markdown", 64 | "metadata": {}, 65 | "source": [ 66 | "## 1. Deep Learning Frameworks\n", 67 | "There are many deep learning frameworks such as Chainer, DyNet, MXNet, PyTorch, TensorFlow, and Keras. Each framework has its own strengths, which a researcher or developer may want to consider before choosing one. In my opinion, PyTorch is great for researchers and offers eager execution by default, but its high-level APIs require some understanding of deep learning concepts such as **affine layers** and **automatic differentiation**. On the other hand, TensorFlow was originally built as a low-level API that provides a robust list of functionalities to build deep learning models from the ground up. 
More recently, TensorFlow also offers **eager execution** and is equipped with a high-level API known as Keras.\n", 68 | "\n", 69 | "![alt txt](img/dl_frameworks.png)" 70 | ] 71 | }, 72 | { 73 | "cell_type": "markdown", 74 | "metadata": {}, 75 | "source": [ 76 | "### 1.1 Eager Execution\n", 77 | "Eager execution evaluates operations immediately as they are called instead of first building a static computation graph, a style also known as **imperative programming**. The TensorFlow 1.x release used here requires that you manually turn this mode on, while PyTorch comes with this mode by default. Below we import the necessary library and enable eager execution." 78 | ] 79 | }, 80 | { 81 | "cell_type": "code", 82 | "execution_count": 1, 83 | "metadata": {}, 84 | "outputs": [ 85 | { 86 | "name": "stdout", 87 | "output_type": "stream", 88 | "text": [ 89 | "1.10.0\n", 90 | "EE enabled? True\n" 91 | ] 92 | } 93 | ], 94 | "source": [ 95 | "import tensorflow as tf\n", 96 | "tf.enable_eager_execution()\n", 97 | "print(tf.__version__)\n", 98 | "print(\"EE enabled?\", tf.executing_eagerly())" 99 | ] 100 | }, 101 | { 102 | "cell_type": "markdown", 103 | "metadata": {}, 104 | "source": [ 105 | "### 1.2 Computation Graph\n", 106 | "A simplified definition of a neural network is a string of functions that are **differentiable** and that we can combine together to get more complicated functions. An intuitive way to express this process is through computation graphs. \n", 107 | "\n", 108 | "![alt txt](http://colah.github.io/posts/2015-08-Backprop/img/tree-eval-derivs.png)\n", 109 | "\n", 110 | "Image credit: [Chris Olah](http://colah.github.io/posts/2015-08-Backprop/)" 111 | ] 112 | }, 113 | { 114 | "cell_type": "markdown", 115 | "metadata": {}, 116 | "source": [ 117 | "## 2. Tensors\n", 118 | "Tensors are the fundamental data structure used to store data that will be fed as input to a computation graph for processing and applying transformations. Let's create two tensors and multiply them, and then output the result. 
The figure below shows a 4-D Tensor.\n", 119 | "\n", 120 | "![alt txt](img/tensor.png)" 121 | ] 122 | }, 123 | { 124 | "cell_type": "code", 125 | "execution_count": 2, 126 | "metadata": {}, 127 | "outputs": [ 128 | { 129 | "name": "stdout", 130 | "output_type": "stream", 131 | "text": [ 132 | "tf.Tensor(\n", 133 | "[[1. 3.]\n", 134 | " [3. 7.]], shape=(2, 2), dtype=float32)\n" 135 | ] 136 | } 137 | ], 138 | "source": [ 139 | "c = tf.constant([[1.0, 2.0], [3.0, 4.0]])\n", 140 | "d = tf.constant([[1.0, 1.0], [0.0, 1.0]])\n", 141 | "e = tf.matmul(c, d)\n", 142 | "print(e)" 143 | ] 144 | }, 145 | { 146 | "cell_type": "markdown", 147 | "metadata": {}, 148 | "source": [ 149 | "### 2.1 Math with Tensors\n", 150 | "TensorFlow and other deep learning libraries like PyTorch allow you to do **automatic differentiation**. Let's try to compute the derivative of a function -- in this case the mean of the values stored in the variable `z`. In TensorFlow, the `tf.GradientTape()` context manager records the operations applied to the input tensor so that gradients can be computed afterwards. 
" 151 | ] 152 | }, 153 | { 154 | "cell_type": "code", 155 | "execution_count": 3, 156 | "metadata": {}, 157 | "outputs": [ 158 | { 159 | "name": "stdout", 160 | "output_type": "stream", 161 | "text": [ 162 | "tf.Tensor(\n", 163 | "[[4.5 4.5]\n", 164 | " [4.5 4.5]], shape=(2, 2), dtype=float32)\n" 165 | ] 166 | } 167 | ], 168 | "source": [ 169 | "### Automatic differentiation with TensorFlow\n", 170 | "\n", 171 | "x = tf.contrib.eager.Variable(tf.ones((2,2)))\n", 172 | "with tf.GradientTape() as tape:\n", 173 | " y = x + 2\n", 174 | " z = y * y * 3\n", 175 | " out = tf.reduce_mean(z)\n", 176 | "\n", 177 | "grad = tape.gradient(out, x) # d(out)/dx\n", 178 | "print(grad)" 179 | ] 180 | }, 181 | { 182 | "cell_type": "markdown", 183 | "metadata": {}, 184 | "source": [ 185 | "You can verify the output with the equations in the figure below:\n", 186 | "\n", 187 | "![alt txt](img/autograd.jpg)" 188 | ] 189 | }, 190 | { 191 | "cell_type": "markdown", 192 | "metadata": {}, 193 | "source": [ 194 | "### 2.2 Transforming Tensors\n", 195 | "We can also apply transformations to a tensor, such as adding a dimension or transposing it. Let's try both adding a dimension and transposing a matrix below." 
196 | ] 197 | }, 198 | { 199 | "cell_type": "code", 200 | "execution_count": 7, 201 | "metadata": {}, 202 | "outputs": [ 203 | { 204 | "name": "stdout", 205 | "output_type": "stream", 206 | "text": [ 207 | "X shape: (2, 3)\n", 208 | "tf.Tensor([2 1 3], shape=(3,), dtype=int32)\n" 209 | ] 210 | }, 211 | { 212 | "data": { 213 | "text/plain": [ 214 | "TensorShape([Dimension(3), Dimension(2)])" 215 | ] 216 | }, 217 | "execution_count": 7, 218 | "metadata": {}, 219 | "output_type": "execute_result" 220 | } 221 | ], 222 | "source": [ 223 | "x = tf.constant([[1, 2, 3], [4, 5, 6]])\n", 224 | "print(\"X shape: \", x.shape)\n", 225 | "\n", 226 | "# add dimension\n", 227 | "print(tf.shape(tf.expand_dims(x, 1)))\n", 228 | "\n", 229 | "# transpose\n", 230 | "tf.transpose(x).shape" 231 | ] 232 | }, 233 | { 234 | "cell_type": "markdown", 235 | "metadata": {}, 236 | "source": [ 237 | "---" 238 | ] 239 | }, 240 | { 241 | "cell_type": "markdown", 242 | "metadata": {}, 243 | "source": [ 244 | "## 3. Emotion Dataset\n", 245 | "In this notebook we are working on an emotion classification task. We are using the public emotion dataset provided [here](https://github.com/huseinzol05/NLP-Dataset/tree/master/emotion-english). The dataset contains tweets labeled into 6 categories." 
246 | ] 247 | }, 248 | { 249 | "cell_type": "code", 250 | "execution_count": 8, 251 | "metadata": {}, 252 | "outputs": [], 253 | "source": [ 254 | "import re\n", 255 | "import numpy as np\n", 256 | "import time\n", 257 | "import helpers.pickle_helpers as ph\n", 258 | "from sklearn import preprocessing\n", 259 | "from sklearn.model_selection import train_test_split\n", 260 | "import matplotlib.pyplot as plt\n", 261 | "%matplotlib inline" 262 | ] 263 | }, 264 | { 265 | "cell_type": "code", 266 | "execution_count": 9, 267 | "metadata": {}, 268 | "outputs": [ 269 | { 270 | "data": { 271 | "text/plain": [ 272 | "" 273 | ] 274 | }, 275 | "execution_count": 9, 276 | "metadata": {}, 277 | "output_type": "execute_result" 278 | }, 279 | { 280 | "data": { 281 | "image/png": "\n", 282 | "text/plain": [ 283 | "
" 284 | ] 285 | }, 286 | "metadata": {}, 287 | "output_type": "display_data" 288 | } 289 | ], 290 | "source": [ 291 | "# load data\n", 292 | "data = ph.load_from_pickle(directory=\"data/merged_training.pkl\")\n", 293 | "data.emotions.value_counts().plot.bar()" 294 | ] 295 | }, 296 | { 297 | "cell_type": "code", 298 | "execution_count": 10, 299 | "metadata": {}, 300 | "outputs": [ 301 | { 302 | "data": { 303 | "text/html": [ 304 | "
\n", 305 | "\n", 318 | "\n", 319 | " \n", 320 | " \n", 321 | " \n", 322 | " \n", 323 | " \n", 324 | " \n", 325 | " \n", 326 | " \n", 327 | " \n", 328 | " \n", 329 | " \n", 330 | " \n", 331 | " \n", 332 | " \n", 333 | " \n", 334 | " \n", 335 | " \n", 336 | " \n", 337 | " \n", 338 | " \n", 339 | " \n", 340 | " \n", 341 | " \n", 342 | " \n", 343 | " \n", 344 | " \n", 345 | " \n", 346 | " \n", 347 | " \n", 348 | " \n", 349 | " \n", 350 | " \n", 351 | " \n", 352 | " \n", 353 | " \n", 354 | " \n", 355 | " \n", 356 | " \n", 357 | " \n", 358 | " \n", 359 | " \n", 360 | " \n", 361 | " \n", 362 | " \n", 363 | " \n", 364 | " \n", 365 | " \n", 366 | " \n", 367 | " \n", 368 | " \n", 369 | " \n", 370 | " \n", 371 | " \n", 372 | " \n", 373 | " \n", 374 | " \n", 375 | " \n", 376 | " \n", 377 | " \n", 378 | "
textemotions
27383i feel awful about it too because it s my job ...sadness
110083im alone i feel awfulsadness
140764ive probably mentioned this before but i reall...joy
100071i was feeling a little low few days backsadness
2837i beleive that i am much more sensitive to oth...love
18231i find myself frustrated with christians becau...love
10714i am one of those people who feels like going ...joy
35177i feel especially pleased about this as this h...joy
122177i was struggling with these awful feelings and...joy
26723i feel so enraged but helpless at the same timeanger
\n", 379 | "
" 380 | ], 381 | "text/plain": [ 382 | " text emotions\n", 383 | "27383 i feel awful about it too because it s my job ... sadness\n", 384 | "110083 im alone i feel awful sadness\n", 385 | "140764 ive probably mentioned this before but i reall... joy\n", 386 | "100071 i was feeling a little low few days back sadness\n", 387 | "2837 i beleive that i am much more sensitive to oth... love\n", 388 | "18231 i find myself frustrated with christians becau... love\n", 389 | "10714 i am one of those people who feels like going ... joy\n", 390 | "35177 i feel especially pleased about this as this h... joy\n", 391 | "122177 i was struggling with these awful feelings and... joy\n", 392 | "26723 i feel so enraged but helpless at the same time anger" 393 | ] 394 | }, 395 | "execution_count": 10, 396 | "metadata": {}, 397 | "output_type": "execute_result" 398 | } 399 | ], 400 | "source": [ 401 | "data.head(10)" 402 | ] 403 | }, 404 | { 405 | "cell_type": "markdown", 406 | "metadata": {}, 407 | "source": [ 408 | "### 3.1 Preprocessing Data\n", 409 | "In the next steps we are going to tokenize the text, construct a vocabulary, and create index mappings for the words. 
" 410 | ] 411 | }, 412 | { 413 | "cell_type": "markdown", 414 | "metadata": {}, 415 | "source": [ 416 | "#### Tokenization and Sampling" 417 | ] 418 | }, 419 | { 420 | "cell_type": "code", 421 | "execution_count": 11, 422 | "metadata": {}, 423 | "outputs": [], 424 | "source": [ 425 | "# retain only texts that contain fewer than 70 tokens to avoid too much padding\n", 426 | "data[\"token_size\"] = data[\"text\"].apply(lambda x: len(x.split(' ')))\n", 427 | "data = data.loc[data['token_size'] < 70].copy()\n", 428 | "\n", 429 | "# sampling\n", 430 | "data = data.sample(n=50000);" 431 | ] 432 | }, 433 | { 434 | "cell_type": "markdown", 435 | "metadata": {}, 436 | "source": [ 437 | "#### Constructing Vocabulary and Index-Word Mapping" 438 | ] 439 | }, 440 | { 441 | "cell_type": "code", 442 | "execution_count": 12, 443 | "metadata": {}, 444 | "outputs": [], 445 | "source": [ 446 | "# This class creates a word -> index mapping (e.g., \"dad\" -> 5) and vice-versa \n", 447 | "# (e.g., 5 -> \"dad\") for the dataset\n", 448 | "class ConstructVocab():\n", 449 | " def __init__(self, sentences):\n", 450 | " self.sentences = sentences\n", 451 | " self.word2idx = {}\n", 452 | " self.idx2word = {}\n", 453 | " self.vocab = set()\n", 454 | " self.create_index()\n", 455 | " \n", 456 | " def create_index(self):\n", 457 | " for s in self.sentences:\n", 458 | " # update with individual tokens\n", 459 | " self.vocab.update(s.split(' '))\n", 460 | " \n", 461 | " # sort the vocab\n", 462 | " self.vocab = sorted(self.vocab)\n", 463 | "\n", 464 | " # add a padding token with index 0\n", 465 | " self.word2idx[''] = 0\n", 466 | " \n", 467 | " # word to index mapping\n", 468 | " for index, word in enumerate(self.vocab):\n", 469 | " self.word2idx[word] = index + 1 # +1 because of pad token\n", 470 | " \n", 471 | " # index to word mapping\n", 472 | " for word, index in self.word2idx.items():\n", 473 | " self.idx2word[index] = word " 474 | ] 475 | }, 476 | { 477 | "cell_type": "code", 478 | 
"execution_count": 16, 479 | "metadata": {}, 480 | "outputs": [ 481 | { 482 | "data": { 483 | "text/plain": [ 484 | "['a',\n", 485 | " 'aa',\n", 486 | " 'aaa',\n", 487 | " 'aaaaaaaaaaaaaaaaggghhhh',\n", 488 | " 'aaaaall',\n", 489 | " 'aaaand',\n", 490 | " 'aaradhya',\n", 491 | " 'aaron',\n", 492 | " 'aashiqui',\n", 493 | " 'ab']" 494 | ] 495 | }, 496 | "execution_count": 16, 497 | "metadata": {}, 498 | "output_type": "execute_result" 499 | } 500 | ], 501 | "source": [ 502 | "# construct vocab and indexing\n", 503 | "inputs = ConstructVocab(data[\"text\"].values.tolist())\n", 504 | "\n", 505 | "# examples of what is in the vocab\n", 506 | "inputs.vocab[0:10]" 507 | ] 508 | }, 509 | { 510 | "cell_type": "markdown", 511 | "metadata": {}, 512 | "source": [ 513 | "### 3.2 Converting Data into Tensors \n", 514 | "For convenience we would like to convert the data into tensors. " 515 | ] 516 | }, 517 | { 518 | "cell_type": "code", 519 | "execution_count": 17, 520 | "metadata": {}, 521 | "outputs": [], 522 | "source": [ 523 | "# vectorize to tensor\n", 524 | "input_tensor = [[inputs.word2idx[s] for s in es.split(' ')] for es in data[\"text\"].values.tolist()]" 525 | ] 526 | }, 527 | { 528 | "cell_type": "code", 529 | "execution_count": 18, 530 | "metadata": {}, 531 | "outputs": [ 532 | { 533 | "data": { 534 | "text/plain": [ 535 | "[[11503,\n", 536 | " 3362,\n", 537 | " 6829,\n", 538 | " 26812,\n", 539 | " 15723,\n", 540 | " 17689,\n", 541 | " 11766,\n", 542 | " 24088,\n", 543 | " 18414,\n", 544 | " 16507,\n", 545 | " 16822,\n", 546 | " 9190,\n", 547 | " 11291,\n", 548 | " 26812,\n", 549 | " 8667,\n", 550 | " 6770,\n", 551 | " 11766,\n", 552 | " 15723,\n", 553 | " 22504],\n", 554 | " [11503,\n", 555 | " 19743,\n", 556 | " 24088,\n", 557 | " 14496,\n", 558 | " 14234,\n", 559 | " 11766,\n", 560 | " 10867,\n", 561 | " 865,\n", 562 | " 16507,\n", 563 | " 667,\n", 564 | " 16507,\n", 565 | " 24088,\n", 566 | " 24380,\n", 567 | " 11503,\n", 568 | " 26325,\n", 569 | " 16609,\n", 
570 | " 20769,\n", 571 | " 15723,\n", 572 | " 9410,\n", 573 | " 27106,\n", 574 | " 865,\n", 575 | " 11503,\n", 576 | " 3397,\n", 577 | " 10876,\n", 578 | " 3214,\n", 579 | " 8660,\n", 580 | " 1,\n", 581 | " 25058,\n", 582 | " 16507,\n", 583 | " 10212,\n", 584 | " 24622,\n", 585 | " 10909,\n", 586 | " 8059,\n", 587 | " 24232,\n", 588 | " 21437,\n", 589 | " 1,\n", 590 | " 10689,\n", 591 | " 2332,\n", 592 | " 865,\n", 593 | " 11503,\n", 594 | " 6249,\n", 595 | " 8081,\n", 596 | " 8768,\n", 597 | " 16507,\n", 598 | " 10909,\n", 599 | " 2079,\n", 600 | " 865,\n", 601 | " 12917,\n", 602 | " 328,\n", 603 | " 24206,\n", 604 | " 14386,\n", 605 | " 15723,\n", 606 | " 21849,\n", 607 | " 5348]]" 608 | ] 609 | }, 610 | "execution_count": 18, 611 | "metadata": {}, 612 | "output_type": "execute_result" 613 | } 614 | ], 615 | "source": [ 616 | "# examples of what is in the input tensors\n", 617 | "input_tensor[0:2]" 618 | ] 619 | }, 620 | { 621 | "cell_type": "markdown", 622 | "metadata": {}, 623 | "source": [ 624 | "### 3.3 Padding data\n", 625 | "To train our recurrent neural network later in the notebook, we need to pad the sequences so that all inputs have the same length." 
626 | ] 627 | }, 628 | { 629 | "cell_type": "code", 630 | "execution_count": 19, 631 | "metadata": {}, 632 | "outputs": [], 633 | "source": [ 634 | "def max_length(tensor):\n", 635 | " return max(len(t) for t in tensor)" 636 | ] 637 | }, 638 | { 639 | "cell_type": "code", 640 | "execution_count": 20, 641 | "metadata": {}, 642 | "outputs": [ 643 | { 644 | "name": "stdout", 645 | "output_type": "stream", 646 | "text": [ 647 | "68\n" 648 | ] 649 | } 650 | ], 651 | "source": [ 652 | "# calculate the max_length of input tensor\n", 653 | "max_length_inp = max_length(input_tensor)\n", 654 | "print(max_length_inp)" 655 | ] 656 | }, 657 | { 658 | "cell_type": "code", 659 | "execution_count": 21, 660 | "metadata": {}, 661 | "outputs": [], 662 | "source": [ 663 | "# Padding the input and output tensor to the maximum length\n", 664 | "input_tensor = tf.keras.preprocessing.sequence.pad_sequences(input_tensor, \n", 665 | " maxlen=max_length_inp,\n", 666 | " padding='post')" 667 | ] 668 | }, 669 | { 670 | "cell_type": "code", 671 | "execution_count": 40, 672 | "metadata": {}, 673 | "outputs": [ 674 | { 675 | "data": { 676 | "text/plain": [ 677 | "array([[11503, 3362, 6829, 26812, 15723, 17689, 11766, 24088, 18414,\n", 678 | " 16507, 16822, 9190, 11291, 26812, 8667, 6770, 11766, 15723,\n", 679 | " 22504, 0, 0, 0, 0, 0, 0, 0, 0,\n", 680 | " 0, 0, 0, 0, 0, 0, 0, 0, 0,\n", 681 | " 0, 0, 0, 0, 0, 0, 0, 0, 0,\n", 682 | " 0, 0, 0, 0, 0, 0, 0, 0, 0,\n", 683 | " 0, 0, 0, 0, 0, 0, 0, 0, 0,\n", 684 | " 0, 0, 0, 0, 0],\n", 685 | " [11503, 19743, 24088, 14496, 14234, 11766, 10867, 865, 16507,\n", 686 | " 667, 16507, 24088, 24380, 11503, 26325, 16609, 20769, 15723,\n", 687 | " 9410, 27106, 865, 11503, 3397, 10876, 3214, 8660, 1,\n", 688 | " 25058, 16507, 10212, 24622, 10909, 8059, 24232, 21437, 1,\n", 689 | " 10689, 2332, 865, 11503, 6249, 8081, 8768, 16507, 10909,\n", 690 | " 2079, 865, 12917, 328, 24206, 14386, 15723, 21849, 5348,\n", 691 | " 0, 0, 0, 0, 0, 0, 0, 0, 0,\n", 692 | " 0, 0, 0, 
0, 0]], dtype=int32)" 693 | ] 694 | }, 695 | "execution_count": 40, 696 | "metadata": {}, 697 | "output_type": "execute_result" 698 | } 699 | ], 700 | "source": [ 701 | "input_tensor[0:2]" 702 | ] 703 | }, 704 | { 705 | "cell_type": "markdown", 706 | "metadata": {}, 707 | "source": [ 708 | "### 3.4 Binarization\n", 709 | "We would like to binarize our target so that we can obtain one-hot encodings as target values. These are easier and more efficient to work with and will be useful when training the models." 710 | ] 711 | }, 712 | { 713 | "cell_type": "code", 714 | "execution_count": 23, 715 | "metadata": {}, 716 | "outputs": [], 717 | "source": [ 718 | "### convert targets to one-hot encoding vectors\n", 719 | "emotions = list(set(data.emotions.unique()))\n", 720 | "num_emotions = len(emotions)\n", 721 | "# binarizer\n", 722 | "mlb = preprocessing.MultiLabelBinarizer()\n", 723 | "data_labels = [set(emos) & set(emotions) for emos in data[['emotions']].values]\n", 724 | "bin_emotions = mlb.fit_transform(data_labels)\n", 725 | "target_tensor = np.array(bin_emotions.tolist())" 726 | ] 727 | }, 728 | { 729 | "cell_type": "code", 730 | "execution_count": 24, 731 | "metadata": {}, 732 | "outputs": [ 733 | { 734 | "data": { 735 | "text/plain": [ 736 | "array([[0, 0, 0, 0, 1, 0],\n", 737 | " [1, 0, 0, 0, 0, 0]])" 738 | ] 739 | }, 740 | "execution_count": 24, 741 | "metadata": {}, 742 | "output_type": "execute_result" 743 | } 744 | ], 745 | "source": [ 746 | "target_tensor[0:2] " 747 | ] 748 | }, 749 | { 750 | "cell_type": "code", 751 | "execution_count": 30, 752 | "metadata": {}, 753 | "outputs": [ 754 | { 755 | "data": { 756 | "text/html": [ 757 | "
\n", 758 | "\n", 771 | "\n", 772 | " \n", 773 | " \n", 774 | " \n", 775 | " \n", 776 | " \n", 777 | " \n", 778 | " \n", 779 | " \n", 780 | " \n", 781 | " \n", 782 | " \n", 783 | " \n", 784 | " \n", 785 | " \n", 786 | " \n", 787 | " \n", 788 | " \n", 789 | " \n", 790 | " \n", 791 | " \n", 792 | " \n", 793 | " \n", 794 | "
textemotionstoken_size
11677i can do without my phone in the presence of o...sadness19
41836i remember the many lunches in hell and of all...anger54
\n", 795 | "
" 796 | ], 797 | "text/plain": [ 798 | " text emotions token_size\n", 799 | "11677 i can do without my phone in the presence of o... sadness 19\n", 800 | "41836 i remember the many lunches in hell and of all... anger 54" 801 | ] 802 | }, 803 | "execution_count": 30, 804 | "metadata": {}, 805 | "output_type": "execute_result" 806 | } 807 | ], 808 | "source": [ 809 | "data[0:2]" 810 | ] 811 | }, 812 | { 813 | "cell_type": "code", 814 | "execution_count": 26, 815 | "metadata": {}, 816 | "outputs": [], 817 | "source": [ 818 | "get_emotion = lambda t: np.argmax(t)" 819 | ] 820 | }, 821 | { 822 | "cell_type": "code", 823 | "execution_count": 27, 824 | "metadata": {}, 825 | "outputs": [ 826 | { 827 | "data": { 828 | "text/plain": [ 829 | "4" 830 | ] 831 | }, 832 | "execution_count": 27, 833 | "metadata": {}, 834 | "output_type": "execute_result" 835 | } 836 | ], 837 | "source": [ 838 | "get_emotion(target_tensor[0])" 839 | ] 840 | }, 841 | { 842 | "cell_type": "code", 843 | "execution_count": 32, 844 | "metadata": {}, 845 | "outputs": [], 846 | "source": [ 847 | "emotion_dict = {0: 'anger', 1: 'fear', 2: 'joy', 3: 'love', 4: 'sadness', 5: 'surprise'}" 848 | ] 849 | }, 850 | { 851 | "cell_type": "code", 852 | "execution_count": 35, 853 | "metadata": {}, 854 | "outputs": [ 855 | { 856 | "data": { 857 | "text/plain": [ 858 | "'sadness'" 859 | ] 860 | }, 861 | "execution_count": 35, 862 | "metadata": {}, 863 | "output_type": "execute_result" 864 | } 865 | ], 866 | "source": [ 867 | "emotion_dict[get_emotion(target_tensor[0])]" 868 | ] 869 | }, 870 | { 871 | "cell_type": "markdown", 872 | "metadata": {}, 873 | "source": [ 874 | "### 3.5 Split data\n", 875 | "We would like to split our data into a train and validation set. In addition, we also want a holdout dataset (test set) for evaluating the models." 
876 | ] 877 | }, 878 | { 879 | "cell_type": "code", 880 | "execution_count": 36, 881 | "metadata": {}, 882 | "outputs": [ 883 | { 884 | "data": { 885 | "text/plain": [ 886 | "(40000, 40000, 5000, 5000, 5000, 5000)" 887 | ] 888 | }, 889 | "execution_count": 36, 890 | "metadata": {}, 891 | "output_type": "execute_result" 892 | } 893 | ], 894 | "source": [ 895 | "# Creating training and validation sets using an 80-20 split\n", 896 | "input_tensor_train, input_tensor_val, target_tensor_train, target_tensor_val = train_test_split(input_tensor, target_tensor, test_size=0.2)\n", 897 | "\n", 898 | "# Split the validation set further to obtain a holdout dataset (for testing) -- split 50:50\n", 899 | "input_tensor_val, input_tensor_test, target_tensor_val, target_tensor_test = train_test_split(input_tensor_val, target_tensor_val, test_size=0.5)\n", 900 | "\n", 901 | "# Show length\n", 902 | "len(input_tensor_train), len(target_tensor_train), len(input_tensor_val), len(target_tensor_val), len(input_tensor_test), len(target_tensor_test)" 903 | ] 904 | }, 905 | { 906 | "cell_type": "markdown", 907 | "metadata": {}, 908 | "source": [ 909 | "### 3.6 Data Loader\n", 910 | "We can also load the data into a data loader, which makes it easy to **manipulate the data**, **create batches**, and apply further **transformations**. In TensorFlow we can use the `tf.data` API." 
911 | ] 912 | }, 913 | { 914 | "cell_type": "code", 915 | "execution_count": 37, 916 | "metadata": {}, 917 | "outputs": [], 918 | "source": [ 919 | "TRAIN_BUFFER_SIZE = len(input_tensor_train)\n", 920 | "VAL_BUFFER_SIZE = len(input_tensor_val)\n", 921 | "TEST_BUFFER_SIZE = len(input_tensor_test)\n", 922 | "BATCH_SIZE = 64\n", 923 | "TRAIN_N_BATCH = TRAIN_BUFFER_SIZE // BATCH_SIZE\n", 924 | "VAL_N_BATCH = VAL_BUFFER_SIZE // BATCH_SIZE\n", 925 | "TEST_N_BATCH = TEST_BUFFER_SIZE // BATCH_SIZE\n", 926 | "\n", 927 | "embedding_dim = 256\n", 928 | "units = 1024\n", 929 | "vocab_inp_size = len(inputs.word2idx)\n", 930 | "target_size = num_emotions\n", 931 | "\n", 932 | "train_dataset = tf.data.Dataset.from_tensor_slices((input_tensor_train, \n", 933 | " target_tensor_train)).shuffle(TRAIN_BUFFER_SIZE)\n", 934 | "train_dataset = train_dataset.batch(BATCH_SIZE, drop_remainder=True)\n", 935 | "val_dataset = tf.data.Dataset.from_tensor_slices((input_tensor_val, \n", 936 | " target_tensor_val)).shuffle(VAL_BUFFER_SIZE)\n", 937 | "val_dataset = val_dataset.batch(BATCH_SIZE, drop_remainder=True)\n", 938 | "test_dataset = tf.data.Dataset.from_tensor_slices((input_tensor_test, \n", 939 | " target_tensor_test)).shuffle(TEST_BUFFER_SIZE)\n", 940 | "test_dataset = test_dataset.batch(BATCH_SIZE, drop_remainder=True)" 941 | ] 942 | }, 943 | { 944 | "cell_type": "code", 945 | "execution_count": 38, 946 | "metadata": {}, 947 | "outputs": [ 948 | { 949 | "name": "stdout", 950 | "output_type": "stream", 951 | "text": [ 952 | "\n", 953 | "\n", 954 | "\n" 955 | ] 956 | } 957 | ], 958 | "source": [ 959 | "# checking minibatch\n", 960 | "print(train_dataset)\n", 961 | "print(val_dataset)\n", 962 | "print(test_dataset)" 963 | ] 964 | }, 965 | { 966 | "cell_type": "markdown", 967 | "metadata": {}, 968 | "source": [ 969 | "## 4. 
Model\n", 970 | "After the data has been preprocessed, transformed, and prepared, it is time to construct the model, or the so-called computation graph, that will be used to train our classifier. We are going to use a gated recurrent unit (GRU) network, a gated variant of the basic RNN that is generally better at retaining information across long sequences. The figure below shows a high-level overview of the model details. \n", 971 | "\n", 972 | "![alt txt](img/gru-model.png)" 973 | ] 974 | }, 975 | { 976 | "cell_type": "markdown", 977 | "metadata": {}, 978 | "source": [ 979 | "### 4.1 Constructing the Model\n", 980 | "Below we construct our model:" 981 | ] 982 | }, 983 | { 984 | "cell_type": "code", 985 | "execution_count": 41, 986 | "metadata": {}, 987 | "outputs": [], 988 | "source": [ 989 | "### define the GRU component\n", 990 | "def gru(units):\n", 991 | " # If you have a GPU, we recommend using CuDNNGRU (provides a 3x speedup over GRU);\n", 992 | " # the code automatically does that.\n", 993 | " if tf.test.is_gpu_available():\n", 994 | " return tf.keras.layers.CuDNNGRU(units, \n", 995 | " return_sequences=True, \n", 996 | " return_state=True, \n", 997 | " recurrent_initializer='glorot_uniform')\n", 998 | " else:\n", 999 | " return tf.keras.layers.GRU(units, \n", 1000 | " return_sequences=True, \n", 1001 | " return_state=True, \n", 1002 | " recurrent_activation='relu', \n", 1003 | " recurrent_initializer='glorot_uniform')\n", 1004 | "\n", 1005 | "### Build the model\n", 1006 | "class EmoGRU(tf.keras.Model):\n", 1007 | " def __init__(self, vocab_size, embedding_dim, hidden_units, batch_sz, output_size):\n", 1008 | " super(EmoGRU, self).__init__()\n", 1009 | " self.batch_sz = batch_sz\n", 1010 | " self.hidden_units = hidden_units\n", 1011 | " \n", 1012 | " # layers\n", 1013 | " self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)\n", 1014 | " self.dropout = tf.keras.layers.Dropout(0.5)\n", 1015 | " self.gru = gru(self.hidden_units)\n", 1016 | " self.fc = 
tf.keras.layers.Dense(output_size)\n", 1017 | " \n", 1018 | " def call(self, x, hidden):\n", 1019 | " x = self.embedding(x) # batch_size X max_len X embedding_dim\n", 1020 | " output, state = self.gru(x, initial_state = hidden) # batch_size X max_len X hidden_units\n", 1021 | " out = output[:,-1,:] # last time step: batch_size X hidden_units\n", 1022 | " out = self.dropout(out)\n", 1023 | " out = self.fc(out) # batch_size X output_size\n", 1024 | " return out, state\n", 1025 | " \n", 1026 | " def initialize_hidden_state(self):\n", 1027 | " return tf.zeros((self.batch_sz, self.hidden_units))" 1028 | ] 1029 | }, 1030 | { 1031 | "cell_type": "markdown", 1032 | "metadata": {}, 1033 | "source": [ 1034 | "### 4.2 Pretesting Model\n", 1035 | "Since eager execution is enabled, we can pass a sample batch through the model, print the output, and check that its dimensions are as expected." 1036 | ] 1037 | }, 1038 | { 1039 | "cell_type": "code", 1040 | "execution_count": 42, 1041 | "metadata": {}, 1042 | "outputs": [ 1043 | { 1044 | "name": "stdout", 1045 | "output_type": "stream", 1046 | "text": [ 1047 | "(64, 6)\n" 1048 | ] 1049 | } 1050 | ], 1051 | "source": [ 1052 | "model = EmoGRU(vocab_inp_size, embedding_dim, units, BATCH_SIZE, target_size)\n", 1053 | "\n", 1054 | "# initialize the hidden state of the RNN\n", 1055 | "hidden = model.initialize_hidden_state()\n", 1056 | "\n", 1057 | "# test on the first batch only, then break out of the for loop\n", 1058 | "# Potential bug: out is not randomized enough\n", 1059 | "for (batch, (inp, targ)) in enumerate(train_dataset):\n", 1060 | " out, state = model(inp, hidden)\n", 1061 | " print(out.shape) \n", 1062 | " break" 1063 | ] 1064 | }, 1065 | { 1066 | "cell_type": "markdown", 1067 | "metadata": {}, 1068 | "source": [ 1069 | "## 5. Training the Model\n", 1070 | "Now that we have tested the model, it is time to train it. We will define our optimization algorithm, learning rate, and other necessary information to train the model." 
1071 | ] 1072 | }, 1073 | { 1074 | "cell_type": "code", 1075 | "execution_count": 43, 1076 | "metadata": {}, 1077 | "outputs": [], 1078 | "source": [ 1079 | "optimizer = tf.train.AdamOptimizer()\n", 1080 | "\n", 1081 | "def loss_function(y, prediction):\n", 1082 | " return tf.losses.softmax_cross_entropy(y, logits=prediction)\n", 1083 | "\n", 1084 | "def accuracy(y, yhat):\n", 1085 | " #compare the predictions to the truth\n", 1086 | " yhat = tf.argmax(yhat, 1).numpy()\n", 1087 | " y = tf.argmax(y , 1).numpy()\n", 1088 | " return np.sum(y == yhat)/len(y)" 1089 | ] 1090 | }, 1091 | { 1092 | "cell_type": "code", 1093 | "execution_count": 44, 1094 | "metadata": {}, 1095 | "outputs": [ 1096 | { 1097 | "name": "stdout", 1098 | "output_type": "stream", 1099 | "text": [ 1100 | "Epoch 1 Batch 0 Val. Loss 0.3018\n", 1101 | "Epoch 1 Batch 100 Val. Loss 0.2750\n", 1102 | "Epoch 1 Batch 200 Val. Loss 0.2609\n", 1103 | "Epoch 1 Batch 300 Val. Loss 0.1847\n", 1104 | "Epoch 1 Batch 400 Val. Loss 0.0389\n", 1105 | "Epoch 1 Batch 500 Val. Loss 0.0250\n", 1106 | "Epoch 1 Batch 600 Val. Loss 0.0175\n", 1107 | "Epoch 1 Loss 0.1530 -- Train Acc. 0.6361 -- Val Acc. 0.9319\n", 1108 | "Time taken for 1 epoch 41.33251762390137 sec\n", 1109 | "\n", 1110 | "Epoch 2 Batch 0 Val. Loss 0.0263\n", 1111 | "Epoch 2 Batch 100 Val. Loss 0.0195\n", 1112 | "Epoch 2 Batch 200 Val. Loss 0.0247\n", 1113 | "Epoch 2 Batch 300 Val. Loss 0.0213\n", 1114 | "Epoch 2 Batch 400 Val. Loss 0.0381\n", 1115 | "Epoch 2 Batch 500 Val. Loss 0.0204\n", 1116 | "Epoch 2 Batch 600 Val. Loss 0.0086\n", 1117 | "Epoch 2 Loss 0.0208 -- Train Acc. 0.9358 -- Val Acc. 0.9431\n", 1118 | "Time taken for 1 epoch 41.09313082695007 sec\n", 1119 | "\n", 1120 | "Epoch 3 Batch 0 Val. Loss 0.0228\n", 1121 | "Epoch 3 Batch 100 Val. Loss 0.0345\n", 1122 | "Epoch 3 Batch 200 Val. Loss 0.0262\n", 1123 | "Epoch 3 Batch 300 Val. Loss 0.0169\n", 1124 | "Epoch 3 Batch 400 Val. Loss 0.0142\n", 1125 | "Epoch 3 Batch 500 Val. 
Loss 0.0175\n", 1126 | "Epoch 3 Batch 600 Val. Loss 0.0194\n", 1127 | "Epoch 3 Loss 0.0172 -- Train Acc. 0.9425 -- Val Acc. 0.9393\n", 1128 | "Time taken for 1 epoch 41.424949169158936 sec\n", 1129 | "\n", 1130 | "Epoch 4 Batch 0 Val. Loss 0.0080\n", 1131 | "Epoch 4 Batch 100 Val. Loss 0.0179\n", 1132 | "Epoch 4 Batch 200 Val. Loss 0.0104\n", 1133 | "Epoch 4 Batch 300 Val. Loss 0.0192\n", 1134 | "Epoch 4 Batch 400 Val. Loss 0.0105\n", 1135 | "Epoch 4 Batch 500 Val. Loss 0.0223\n", 1136 | "Epoch 4 Batch 600 Val. Loss 0.0192\n", 1137 | "Epoch 4 Loss 0.0161 -- Train Acc. 0.9463 -- Val Acc. 0.9463\n", 1138 | "Time taken for 1 epoch 41.28922462463379 sec\n", 1139 | "\n", 1140 | "Epoch 5 Batch 0 Val. Loss 0.0172\n", 1141 | "Epoch 5 Batch 100 Val. Loss 0.0188\n", 1142 | "Epoch 5 Batch 200 Val. Loss 0.0138\n", 1143 | "Epoch 5 Batch 300 Val. Loss 0.0220\n", 1144 | "Epoch 5 Batch 400 Val. Loss 0.0089\n", 1145 | "Epoch 5 Batch 500 Val. Loss 0.0146\n", 1146 | "Epoch 5 Batch 600 Val. Loss 0.0107\n", 1147 | "Epoch 5 Loss 0.0153 -- Train Acc. 0.9496 -- Val Acc. 0.9425\n", 1148 | "Time taken for 1 epoch 41.406941413879395 sec\n", 1149 | "\n", 1150 | "Epoch 6 Batch 0 Val. Loss 0.0157\n", 1151 | "Epoch 6 Batch 100 Val. Loss 0.0232\n", 1152 | "Epoch 6 Batch 200 Val. Loss 0.0162\n", 1153 | "Epoch 6 Batch 300 Val. Loss 0.0166\n", 1154 | "Epoch 6 Batch 400 Val. Loss 0.0100\n", 1155 | "Epoch 6 Batch 500 Val. Loss 0.0156\n", 1156 | "Epoch 6 Batch 600 Val. Loss 0.0116\n", 1157 | "Epoch 6 Loss 0.0145 -- Train Acc. 0.9530 -- Val Acc. 0.9415\n", 1158 | "Time taken for 1 epoch 41.41020727157593 sec\n", 1159 | "\n", 1160 | "Epoch 7 Batch 0 Val. Loss 0.0048\n", 1161 | "Epoch 7 Batch 100 Val. Loss 0.0097\n", 1162 | "Epoch 7 Batch 200 Val. Loss 0.0168\n", 1163 | "Epoch 7 Batch 300 Val. Loss 0.0154\n", 1164 | "Epoch 7 Batch 400 Val. Loss 0.0075\n", 1165 | "Epoch 7 Batch 500 Val. Loss 0.0104\n", 1166 | "Epoch 7 Batch 600 Val. Loss 0.0105\n", 1167 | "Epoch 7 Loss 0.0128 -- Train Acc. 
0.9607 -- Val Acc. 0.9387\n", 1168 | "Time taken for 1 epoch 41.614171266555786 sec\n", 1169 | "\n", 1170 | "Epoch 8 Batch 0 Val. Loss 0.0038\n", 1171 | "Epoch 8 Batch 100 Val. Loss 0.0120\n", 1172 | "Epoch 8 Batch 200 Val. Loss 0.0007\n", 1173 | "Epoch 8 Batch 300 Val. Loss 0.0105\n", 1174 | "Epoch 8 Batch 400 Val. Loss 0.0278\n", 1175 | "Epoch 8 Batch 500 Val. Loss 0.0136\n", 1176 | "Epoch 8 Batch 600 Val. Loss 0.0149\n", 1177 | "Epoch 8 Loss 0.0116 -- Train Acc. 0.9665 -- Val Acc. 0.9373\n", 1178 | "Time taken for 1 epoch 41.726645946502686 sec\n", 1179 | "\n", 1180 | "Epoch 9 Batch 0 Val. Loss 0.0066\n", 1181 | "Epoch 9 Batch 100 Val. Loss 0.0156\n", 1182 | "Epoch 9 Batch 200 Val. Loss 0.0122\n", 1183 | "Epoch 9 Batch 300 Val. Loss 0.0114\n", 1184 | "Epoch 9 Batch 400 Val. Loss 0.0040\n", 1185 | "Epoch 9 Batch 500 Val. Loss 0.0078\n", 1186 | "Epoch 9 Batch 600 Val. Loss 0.0088\n", 1187 | "Epoch 9 Loss 0.0106 -- Train Acc. 0.9690 -- Val Acc. 0.9361\n", 1188 | "Time taken for 1 epoch 41.65887784957886 sec\n", 1189 | "\n", 1190 | "Epoch 10 Batch 0 Val. Loss 0.0225\n", 1191 | "Epoch 10 Batch 100 Val. Loss 0.0042\n", 1192 | "Epoch 10 Batch 200 Val. Loss 0.0122\n", 1193 | "Epoch 10 Batch 300 Val. Loss 0.0003\n", 1194 | "Epoch 10 Batch 400 Val. Loss 0.0362\n", 1195 | "Epoch 10 Batch 500 Val. Loss 0.0187\n", 1196 | "Epoch 10 Batch 600 Val. Loss 0.0089\n", 1197 | "Epoch 10 Loss 0.0105 -- Train Acc. 0.9702 -- Val Acc. 
0.9401\n", 1198 | "Time taken for 1 epoch 41.729103326797485 sec\n", 1199 | "\n" 1200 | ] 1201 | } 1202 | ], 1203 | "source": [ 1204 | "EPOCHS = 10\n", 1205 | "\n", 1206 | "for epoch in range(EPOCHS):\n", 1207 | " start = time.time()\n", 1208 | " \n", 1209 | " ### Initialize hidden state\n", 1210 | " hidden = model.initialize_hidden_state()\n", 1211 | " total_loss = 0\n", 1212 | " train_accuracy, val_accuracy = 0, 0\n", 1213 | " \n", 1214 | " ### Training\n", 1215 | " for (batch, (inp, targ)) in enumerate(train_dataset):\n", 1216 | " loss = 0\n", 1217 | " \n", 1218 | " with tf.GradientTape() as tape:\n", 1219 | " predictions,_ = model(inp, hidden)\n", 1220 | " loss += loss_function(targ, predictions)\n", 1221 | " batch_loss = (loss / int(targ.shape[1])) # reported loss is divided by the number of classes (targ is one-hot)\n", 1222 | " total_loss += batch_loss\n", 1223 | " \n", 1224 | " batch_accuracy = accuracy(targ, predictions)\n", 1225 | " train_accuracy += batch_accuracy\n", 1226 | " \n", 1227 | " gradients = tape.gradient(loss, model.variables)\n", 1228 | " optimizer.apply_gradients(zip(gradients, model.variables))\n", 1229 | " \n", 1230 | " if batch % 100 == 0: # the value printed as 'Val. Loss' is the loss on the current training batch\n", 1231 | " print('Epoch {} Batch {} Val. Loss {:.4f}'.format(epoch + 1,\n", 1232 | " batch,\n", 1233 | " batch_loss.numpy()))\n", 1234 | " \n", 1235 | " ### Validating\n", 1236 | " hidden = model.initialize_hidden_state()\n", 1237 | "\n", 1238 | " for (batch, (inp, targ)) in enumerate(val_dataset): \n", 1239 | " predictions,_ = model(inp, hidden) \n", 1240 | " batch_accuracy = accuracy(targ, predictions)\n", 1241 | " val_accuracy += batch_accuracy\n", 1242 | " \n", 1243 | " print('Epoch {} Loss {:.4f} -- Train Acc. {:.4f} -- Val Acc. 
{:.4f}'.format(epoch + 1, \n", 1244 | " total_loss / TRAIN_N_BATCH, \n", 1245 | " train_accuracy / TRAIN_N_BATCH,\n", 1246 | " val_accuracy / VAL_N_BATCH))\n", 1247 | " print('Time taken for 1 epoch {} sec\\n'.format(time.time() - start))" 1248 | ] 1249 | }, 1250 | { 1251 | "cell_type": "code", 1252 | "execution_count": 45, 1253 | "metadata": {}, 1254 | "outputs": [ 1255 | { 1256 | "name": "stdout", 1257 | "output_type": "stream", 1258 | "text": [ 1259 | "_________________________________________________________________\n", 1260 | "Layer (type) Output Shape Param # \n", 1261 | "=================================================================\n", 1262 | "embedding (Embedding) multiple 6984192 \n", 1263 | "_________________________________________________________________\n", 1264 | "dropout (Dropout) multiple 0 \n", 1265 | "_________________________________________________________________\n", 1266 | "cu_dnngru (CuDNNGRU) multiple 3938304 \n", 1267 | "_________________________________________________________________\n", 1268 | "dense (Dense) multiple 6150 \n", 1269 | "=================================================================\n", 1270 | "Total params: 10,928,646\n", 1271 | "Trainable params: 10,928,646\n", 1272 | "Non-trainable params: 0\n", 1273 | "_________________________________________________________________\n" 1274 | ] 1275 | } 1276 | ], 1277 | "source": [ 1278 | "model.summary()" 1279 | ] 1280 | }, 1281 | { 1282 | "cell_type": "markdown", 1283 | "metadata": {}, 1284 | "source": [ 1285 | "## 6. Evaluation on the Testing Data\n", 1286 | "Now we will evaluate the model with the holdout dataset." 
1287 | ] 1288 | }, 1289 | { 1290 | "cell_type": "code", 1291 | "execution_count": 57, 1292 | "metadata": {}, 1293 | "outputs": [ 1294 | { 1295 | "name": "stdout", 1296 | "output_type": "stream", 1297 | "text": [ 1298 | "Test Accuracy: 0.9326923076923077\n" 1299 | ] 1300 | } 1301 | ], 1302 | "source": [ 1303 | "test_accuracy = 0\n", 1304 | "all_predictions = []\n", 1305 | "x_raw = []\n", 1306 | "y_raw = []\n", 1307 | "\n", 1308 | "hidden = model.initialize_hidden_state()\n", 1309 | "\n", 1310 | "for (batch, (inp, targ)) in enumerate(test_dataset): \n", 1311 | " predictions,_ = model(inp, hidden) \n", 1312 | " batch_accuracy = accuracy(targ, predictions)\n", 1313 | " test_accuracy += batch_accuracy\n", 1314 | " \n", 1315 | " x_raw = x_raw + [x for x in inp]\n", 1316 | " y_raw = y_raw + [y for y in targ]\n", 1317 | " \n", 1318 | " all_predictions.append(predictions)\n", 1319 | " \n", 1320 | "print(\"Test Accuracy: \", test_accuracy/TEST_N_BATCH)" 1321 | ] 1322 | }, 1323 | { 1324 | "cell_type": "markdown", 1325 | "metadata": {}, 1326 | "source": [ 1327 | "### 6.1 Confusion Matrix\n", 1328 | "Test accuracy alone is not very informative here, because the emotion classes are imbalanced. Let's compute per-class metrics and plot a confusion matrix to get a drilled-down view of how the model performs on each emotion." 
1329 | ] 1330 | }, 1331 | { 1332 | "cell_type": "code", 1333 | "execution_count": 73, 1334 | "metadata": {}, 1335 | "outputs": [ 1336 | { 1337 | "name": "stdout", 1338 | "output_type": "stream", 1339 | "text": [ 1340 | "Default Classification report\n", 1341 | " precision recall f1-score support\n", 1342 | "\n", 1343 | " anger 0.96 0.90 0.93 687\n", 1344 | " fear 0.82 0.96 0.88 532\n", 1345 | " joy 0.94 0.97 0.96 1765\n", 1346 | " love 0.89 0.79 0.84 425\n", 1347 | " sadness 0.97 0.97 0.97 1405\n", 1348 | " surprise 0.89 0.66 0.76 178\n", 1349 | "\n", 1350 | "avg / total 0.93 0.93 0.93 4992\n", 1351 | "\n", 1352 | "\n", 1353 | "Accuracy:\n", 1354 | "0.9326923076923077\n", 1355 | "Correct Predictions: 4656\n", 1356 | "precision: 0.91\n", 1357 | "recall: 0.88\n", 1358 | "f1: 0.89\n", 1359 | "\n", 1360 | "confusion matrix\n", 1361 | " [[ 621 36 3 1 26 0]\n", 1362 | " [ 6 511 1 0 5 9]\n", 1363 | " [ 5 2 1711 39 4 4]\n", 1364 | " [ 1 0 86 336 2 0]\n", 1365 | " [ 14 29 2 0 1359 1]\n", 1366 | " [ 0 46 14 0 0 118]]\n", 1367 | "(row=expected, col=predicted)\n" 1368 | ] 1369 | }, 1370 | { 1371 | "data": { 1372 | "image/png": "\n", 1373 | "text/plain": [ 1374 | "
" 1375 | ] 1376 | }, 1377 | "metadata": {}, 1378 | "output_type": "display_data" 1379 | } 1380 | ], 1381 | "source": [ 1382 | "import helpers.evaluate as ev\n", 1383 | "evaluator = ev.Evaluate()\n", 1384 | "import pandas as pd\n", 1385 | "\n", 1386 | "final_predictions = []\n", 1387 | "\n", 1388 | "for p in all_predictions:\n", 1389 | " for sub_p in p:\n", 1390 | " final_predictions.append(sub_p)\n", 1391 | "\n", 1392 | "predictions = [np.argmax(p).item() for p in final_predictions]\n", 1393 | "targets = [np.argmax(t).item() for t in y_raw]\n", 1394 | "correct_predictions = float(np.sum(np.array(predictions) == np.array(targets))) # compare element-wise, not list to list\n", 1395 | "\n", 1396 | "# predictions\n", 1397 | "predictions_human_readable = ((x_raw, predictions))\n", 1398 | "# actual targets\n", 1399 | "target_human_readable = ((x_raw, targets))\n", 1400 | "\n", 1401 | "emotion_dict = {0: 'anger', 1: 'fear', 2: 'joy', 3: 'love', 4: 'sadness', 5: 'surprise'}\n", 1402 | "\n", 1403 | "# convert results into dataframe\n", 1404 | "model_test_result = pd.DataFrame(predictions_human_readable[1],columns=[\"emotion\"])\n", 1405 | "test = pd.DataFrame(target_human_readable[1], columns=[\"emotion\"])\n", 1406 | "\n", 1407 | "model_test_result.emotion = model_test_result.emotion.map(lambda x: emotion_dict[int(float(x))])\n", 1408 | "test.emotion = test.emotion.map(lambda x: emotion_dict[int(x)])\n", 1409 | "\n", 1410 | "evaluator.evaluate_class(model_test_result.emotion, test.emotion );" 1411 | ] 1412 | }, 1413 | { 1414 | "cell_type": "markdown", 1415 | "metadata": {}, 1416 | "source": [ 1417 | "## Final Words\n", 1418 | "You have learned how to perform neural-based emotion recognition using RNNs. There are many things you can do after you have completed this tutorial. You can attempt the exercises outlined in the \"Outline\" section of this notebook. You can also try other types of neural architectures such as LSTMs, Bi-LSTMs, attention models, and CNNs. 
In addition, you can also store the models and conduct transfer learning to other emotion-related tasks. \n" 1419 | ] 1420 | }, 1421 | { 1422 | "cell_type": "markdown", 1423 | "metadata": {}, 1424 | "source": [ 1425 | "---" 1426 | ] 1427 | }, 1428 | { 1429 | "cell_type": "markdown", 1430 | "metadata": {}, 1431 | "source": [ 1432 | "## References\n", 1433 | "\n", 1434 | "- [Introduction to what is a Tensor](https://www.youtube.com/watch?v=hCSjWCVrphc&t=1137s)\n", 1435 | "- [Deep Learning for NLP](https://docs.google.com/presentation/d/1cf2H1qMvP1rdKUF5000ifOIRv1_b0bvj0ZTVL7-RaVE/edit?usp=sharing)\n", 1436 | "- [Enable Eager Execution on TensorFlow](https://colab.research.google.com/github/zaidalyafeai/Notebooks/blob/master/Eager_Execution_Gradient_.ipynb)\n", 1437 | "- [Basic Text Classification](https://www.tensorflow.org/tutorials/keras/basic_text_classification)\n", 1438 | "- [Deep Learning for NLP: An Overview of Recent Trends](https://medium.com/dair-ai/deep-learning-for-nlp-an-overview-of-recent-trends-d0d8f40a776d)" 1439 | ] 1440 | } 1441 | ], 1442 | "metadata": { 1443 | "kernelspec": { 1444 | "display_name": "Python 3", 1445 | "language": "python", 1446 | "name": "python3" 1447 | }, 1448 | "language_info": { 1449 | "codemirror_mode": { 1450 | "name": "ipython", 1451 | "version": 3 1452 | }, 1453 | "file_extension": ".py", 1454 | "mimetype": "text/x-python", 1455 | "name": "python", 1456 | "nbconvert_exporter": "python", 1457 | "pygments_lexer": "ipython3", 1458 | "version": "3.6.1" 1459 | } 1460 | }, 1461 | "nbformat": 4, 1462 | "nbformat_minor": 2 1463 | } 1464 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ## Deep Learning Based NLP 2 | This repository contains python notebooks related to several deep learning based NLP tasks such as emotion recognition and neural machine translation. 
It will provide implementations based on several deep learning frameworks such as PyTorch and TensorFlow. 3 | 4 | ## Emotion Recognition with GRU 5 | - [Deep Learning Based Emotion Recognition With PyTorch](https://github.com/omarsar/nlp_pytorch_tensorflow_notebooks/blob/master/Deep%20Learning%20Emotion%20Recognition%20PyTorch.ipynb) 6 | - [Deep Learning Based Emotion Recognition With TensorFlow](https://github.com/omarsar/nlp_pytorch_tensorflow_notebooks/blob/master/Deep%20Learning%20Emotion%20Recognition%20TensorFlow%20.ipynb) 7 | 8 | --- 9 | Author: [Elvis Saravia](https://twitter.com/omarsar0) -------------------------------------------------------------------------------- /data/merged_training.pkl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omarsar/nlp_pytorch_tensorflow_notebooks/2fc77b0aa2b1a4be4558f14365c429cbde892586/data/merged_training.pkl 
-------------------------------------------------------------------------------- /helpers/__pycache__/evaluate.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omarsar/nlp_pytorch_tensorflow_notebooks/2fc77b0aa2b1a4be4558f14365c429cbde892586/helpers/__pycache__/evaluate.cpython-36.pyc -------------------------------------------------------------------------------- /helpers/__pycache__/pickle_helpers.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omarsar/nlp_pytorch_tensorflow_notebooks/2fc77b0aa2b1a4be4558f14365c429cbde892586/helpers/__pycache__/pickle_helpers.cpython-36.pyc -------------------------------------------------------------------------------- /helpers/evaluate.py: -------------------------------------------------------------------------------- 1 | from sklearn import metrics 2 | from sklearn.preprocessing import LabelEncoder 3 | from scipy import stats 4 | import numpy as np 5 | import pandas as pd 6 | import matplotlib.pyplot as plt 7 | import itertools 8 | 9 | """ Author: Elvis Saravia (Adapted from Renaud)""" 10 | 11 | class Evaluate(): 12 | 13 | def va_dist(cls, prediction, target, va_df, binarizer, name='', silent=False): 14 | """ Computes distance between actual and prediction through cosine distance """ 15 | va_matrix = va_df.loc[binarizer.classes_][['valence','arousal']].values 16 | y_va = target.dot(va_matrix) 17 | F_va = prediction.dot(va_matrix) 18 | 19 | # dist is a one row vector with size of the test data passed(emotion) 20 | dist = metrics.pairwise.paired_cosine_distances(y_va, 
F_va) 21 | res = stats.describe(dist) 22 | 23 | # print by default (if silent=False) 24 | if not silent: 25 | print('%s\tmean: %f\tvariance: %f' % (name, res.mean, res.variance)) 26 | 27 | return { 28 | 'distances': dist, 29 | 'dist_stat': res 30 | } 31 | 32 | def evaluate_class(cls, predictions, target, target2=None, silent=False): 33 | """ Evaluate the predicted classes against the target classes """ 34 | p_2_annotation = dict() 35 | 36 | precision_recall_fscore_support = [ 37 | (pair[0], pair[1].mean()) for pair in zip( 38 | ['precision', 'recall', 'f1', 'support'], 39 | metrics.precision_recall_fscore_support(target, predictions) 40 | ) 41 | ] 42 | 43 | # precision, recall and f1 above are averaged over the classes (macro average) 44 | 45 | # confusion matrix 46 | le = LabelEncoder() 47 | target_le = le.fit_transform(target) 48 | predictions_le = le.transform(predictions) 49 | cm = metrics.confusion_matrix(target_le, predictions_le) 50 | 51 | # precision when two annotations are given for the test data 52 | if target2 is not None: 53 | p_2_annotation = pd.DataFrame( 54 | [(pred, pred in set([t1,t2])) for pred, t1, t2 in zip(predictions, target, target2)], 55 | columns=['emo','success'] 56 | ).groupby('emo').apply(lambda emo: emo.success.sum()/ len(emo.success)).to_dict() 57 | 58 | if not silent: 59 | print("Default Classification report") 60 | print(metrics.classification_report(target, predictions)) 61 | 62 | # print if target2 was provided 63 | if len(p_2_annotation) > 0: 64 | print('\nPrecision on 2 annotations:') 65 | for emo in p_2_annotation: 66 | print("%s: %.2f" % (emo, p_2_annotation[emo])) 67 | 68 | # print accuracies, precision, recall, and f1 69 | print('\nAccuracy:') 70 | print(metrics.accuracy_score(target, predictions)) 71 | print("Correct Predictions: ", metrics.accuracy_score(target, predictions,normalize=False)) 72 | for to_print in precision_recall_fscore_support[:3]: 73 | print( "%s: %.2f" % to_print ) 74 | 75 | # normalizing the values of the confusion matrix 76 | print('\nconfusion matrix\n %s' % 
cm) 77 | print('(row=expected, col=predicted)') 78 | cm_normalized = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis] 79 | cls.plot_confusion_matrix(cm_normalized, le.classes_, 'Confusion matrix Normalized') 80 | 81 | return { 82 | 'precision_recall_fscore_support': precision_recall_fscore_support, 83 | 'accuracy': metrics.accuracy_score(target, predictions), 84 | 'p_2_annotation': p_2_annotation, 85 | 'confusion_matrix': cm 86 | } 87 | 88 | def predict_class(cls, X_train, y_train, X_test, y_test, 89 | pipeline, silent=False, target2=None): 90 | """ Predicted class,then run some performance evaluation """ 91 | pipeline.fit(X_train, y_train) 92 | predictions = pipeline.predict(X_test) 93 | print("predictions computed....") 94 | return cls.evaluate_class(predictions, y_test, target2, silent) 95 | 96 | def evaluate_prob(cls, prediction, target_rank, target_class, binarizer, va_df, silent=False, target2=None): 97 | """ Evaluate through probability """ 98 | # Run normal class evaluator 99 | predict_class = binarizer.classes_[prediction.argmax(axis=1)] 100 | class_eval = cls.evaluate_class(predict_class, target_class, target2, silent) 101 | 102 | print("==============================================================================") 103 | print("Finished from evaluation class, now continue with the probability evaluation") 104 | 105 | if not silent: 106 | print('\n - First Emotion Classification Metrics -') 107 | print('\n - Multiple Emotion rank Metrics -') 108 | print('VA Cosine Distance') 109 | 110 | classes_dist = [ 111 | ( 112 | emo, 113 | cls.va_dist( 114 | prediction[np.array(target_class) == emo], 115 | target_rank[np.array(target_class) == emo], 116 | va_df, 117 | binarizer, 118 | emo, 119 | silent) 120 | ) for emo in binarizer.classes_ 121 | ] 122 | avg_dist = cls.va_dist(prediction, target_rank, va_df, binarizer, 'avg', silent) 123 | 124 | coverage_error = metrics.coverage_error(target_rank, prediction) 125 | average_precision_score = 
metrics.average_precision_score(target_rank, prediction) 126 | label_ranking_average_precision_score = metrics.label_ranking_average_precision_score(target_rank, prediction) 127 | label_ranking_loss = metrics.label_ranking_loss(target_rank, prediction) 128 | 129 | # recall at 2 130 | # obtain top two predictions 131 | top2_pred = [set([binarizer.classes_[i[0]], binarizer.classes_[i[1]]]) for i in (prediction.argsort(axis=1).T[-2:].T)] 132 | recall_at_2 = pd.DataFrame( 133 | [ 134 | t in p for t, p in zip(target_class, top2_pred) 135 | ], index=target_class, columns=['recall@2']).groupby(level=0).apply(lambda emo: emo.sum()/len(emo)) 136 | 137 | # combine target into sets 138 | if target2 is not None: 139 | union_target = [set(t) for t in zip(target_class, target2)] 140 | else: 141 | union_target = [set(t) for t in zip(target_class)] 142 | 143 | # precision at k 144 | top_k_pred = [ 145 | [set([binarizer.classes_[i] for i in i_list]) for i_list in (prediction.argsort(axis=1).T[-i:].T)] 146 | for i in range(2, len(binarizer.classes_)+1)] 147 | precision_at_k = [ 148 | ('p@' + str(k+2), np.array([len(t & p)/(k+2) for t, p in zip(union_target, top_k_pred[k])]).mean()) 149 | for k in range(len(top_k_pred))] 150 | 151 | # print the ranking metrics unless silent=True 152 | if not silent: 153 | print('\n') 154 | print(recall_at_2) 155 | print('\n') 156 | print('p@k') 157 | for pk in precision_at_k: 158 | print(pk[0] + ':\t' + str(pk[1])) 159 | print('\ncoverage_error: %f' % coverage_error) 160 | print('average_precision_score: %f' % average_precision_score) 161 | print('label_ranking_average_precision_score: %f' % label_ranking_average_precision_score) 162 | print('label_ranking_loss: %f' % label_ranking_loss) 163 | 164 | return { 165 | 'class_eval': class_eval, 166 | 'recall_at_2': recall_at_2.to_dict(), 167 | 'precision_at_2': precision_at_k, 168 | 'classes_dist': classes_dist, 169 | 'avg_dist': avg_dist, 170 | 'coverage_error': coverage_error, 171 | 'average_precision_score': average_precision_score, 
172 | 'label_ranking_average_precision_score': label_ranking_average_precision_score, 173 | 'label_ranking_loss': label_ranking_loss 174 | } 175 | 176 | 177 | def predict_prob(cls, X_train, y_train, X_test, y_test, label_test, pipeline, binarizer, va_df, silent=False, target2=None): 178 | """ Fit the pipeline, predict class probabilities on the test set, and evaluate them """ 179 | pipeline.fit(X_train, y_train) 180 | predictions = pipeline.predict_proba(X_test) 181 | pred_to_mlb = [np.where(pipeline.classes_ == emo)[0][0] for emo in binarizer.classes_.tolist()] 182 | return cls.evaluate_prob(predictions[:,pred_to_mlb], y_test, label_test, binarizer, va_df, silent, target2) 183 | 184 | 185 | def plot_confusion_matrix(cls, cm, my_tags, title='Confusion matrix', cmap=plt.cm.Blues): 186 | """ Plot the confusion matrix """ 187 | plt.rc('figure', figsize=(4, 4), dpi=100) 188 | plt.imshow(cm, interpolation='nearest', cmap=cmap) 189 | plt.title(title) 190 | plt.colorbar() 191 | tick_marks = np.arange(len(my_tags)) 192 | target_names = my_tags 193 | plt.xticks(tick_marks, target_names, rotation=45) 194 | plt.yticks(tick_marks, target_names) 195 | 196 | # add normalized values inside the confusion matrix 197 | fmt = '.2f' 198 | thresh = cm.max() / 2. 
199 | for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])): 200 | plt.text(j, i, format(cm[i, j], fmt), horizontalalignment="center", color="white" if cm[i, j] > thresh else "black") 201 | 202 | plt.tight_layout() 203 | plt.ylabel('True label') 204 | plt.xlabel('Predicted label') 205 | -------------------------------------------------------------------------------- /helpers/pickle_helpers.py: -------------------------------------------------------------------------------- 1 | import pickle 2 | 3 | ''' 4 | Store and load anything to and from pickle 5 | ''' 6 | 7 | def convert_to_pickle(item, directory): 8 | ''' 9 | Usage: convert dictionary object to pickle format 10 | pickle_helpers.convert_to_pickle(cat_list,"data/liwc_pickle/liwc_cat.p") 11 | ''' 12 | 13 | with open(directory, "wb") as f: 14 | pickle.dump(item, f) 15 | 16 | 17 | def load_from_pickle(directory): 18 | ''' 19 | Usage: Load pickle file 20 | pickle_helpers.load_from_pickle("data/liwc_pickle/liwc_cat.p") 21 | ''' 22 | 23 | with open(directory, "rb") as f: 24 | return pickle.load(f) 25 | -------------------------------------------------------------------------------- /img/autograd.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omarsar/nlp_pytorch_tensorflow_notebooks/2fc77b0aa2b1a4be4558f14365c429cbde892586/img/autograd.jpg -------------------------------------------------------------------------------- /img/dl_frameworks.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omarsar/nlp_pytorch_tensorflow_notebooks/2fc77b0aa2b1a4be4558f14365c429cbde892586/img/dl_frameworks.png -------------------------------------------------------------------------------- /img/emotion_classifier.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/omarsar/nlp_pytorch_tensorflow_notebooks/2fc77b0aa2b1a4be4558f14365c429cbde892586/img/emotion_classifier.png -------------------------------------------------------------------------------- /img/gru-model.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omarsar/nlp_pytorch_tensorflow_notebooks/2fc77b0aa2b1a4be4558f14365c429cbde892586/img/gru-model.png -------------------------------------------------------------------------------- /img/tensor.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/omarsar/nlp_pytorch_tensorflow_notebooks/2fc77b0aa2b1a4be4558f14365c429cbde892586/img/tensor.png --------------------------------------------------------------------------------
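Note on helpers/evaluate.py: the precision-at-k step in `evaluate_prob` sorts each row of the probability matrix, keeps the last k column indices, and measures how many of those labels fall in the target set. The sketch below isolates that idea on toy data. It is a simplified illustration rather than the repository's code: the function names `top_k_labels` and `precision_at_k`, the label list, and the probabilities are all hypothetical, and the label binarizer is replaced by a plain list of class names.

```python
import numpy as np


def top_k_labels(probs, classes, k):
    """Return, for each row of `probs`, the set of the k highest-scoring labels."""
    # argsort is ascending, so the last k column indices are the top-k classes
    top_idx = np.argsort(probs, axis=1)[:, -k:]
    return [set(classes[i] for i in row) for row in top_idx]


def precision_at_k(probs, classes, targets, k):
    """Mean over samples of |target labels & top-k predictions| / k."""
    preds = top_k_labels(probs, classes, k)
    return float(np.mean([len(t & p) / k for t, p in zip(targets, preds)]))


# Toy data (hypothetical): three samples scored over three emotion labels
classes = ["anger", "joy", "sadness"]
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
])
targets = [{"anger"}, {"joy"}, {"sadness"}]

p_at_1 = precision_at_k(probs, classes, targets, 1)
p_at_2 = precision_at_k(probs, classes, targets, 2)
print("p@1:", p_at_1)
print("p@2:", p_at_2)
```

With a single target label per sample, precision at k can never exceed 1/k, which is why the p@k values printed by `evaluate_prob` shrink as k grows unless a second target (`target2`) is supplied.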