├── .gitattributes ├── .ipynb_checkpoints ├── Tensorflow+Tutorial-checkpoint.ipynb ├── tensorflow-checkpoint.ipynb └── translate_Tensorflow+Tutorial-checkpoint.ipynb ├── README.md ├── Tensorflow+Tutorial.ipynb ├── __pycache__ └── tf_utils.cpython-35.pyc ├── datasets ├── test_signs.h5 └── train_signs.h5 ├── images ├── hands.png ├── onehot.png └── thumbs_up.jpg ├── improv_utils.py ├── pytorch ├── __pycache__ │ └── net.cpython-35.pyc ├── net.py └── train.py ├── tensorflow.ipynb ├── tf_utils.py └── translate_Tensorflow+Tutorial.ipynb /.gitattributes: -------------------------------------------------------------------------------- 1 | # Auto detect text files and perform LF normalization 2 | * text=auto 3 | -------------------------------------------------------------------------------- /.ipynb_checkpoints/Tensorflow+Tutorial-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# TensorFlow Tutorial\n", 8 | "\n", 9 | "Welcome to this week's programming assignment. Until now, you've always used numpy to build neural networks. Now we will step you through a deep learning framework that will allow you to build neural networks more easily. Machine learning frameworks like TensorFlow, PaddlePaddle, Torch, Caffe, Keras, and many others can speed up your machine learning development significantly. All of these frameworks also have a lot of documentation, which you should feel free to read. In this assignment, you will learn to do the following in TensorFlow: \n", 10 | "\n", 11 | "- Initialize variables\n", 12 | "- Start your own session\n", 13 | "- Train algorithms \n", 14 | "- Implement a Neural Network\n", 15 | "\n", 16 | "Programing frameworks can not only shorten your coding time, but sometimes also perform optimizations that speed up your code. \n", 17 | "\n", 18 | "---\n", 19 | "\n", 20 | "欢迎来到本周的编程任务。到目前为止,你一直使用numpy来建立神经网络。现在我们将引导您深入学习框架,让您更容易地建立神经网络。TensorFlow,PaddlePaddle,Torch,Caffe,Keras等机器学习框架可以显着加速您的机器学习开发。这些框架有很多文档,你可以随意阅读。在这个任务中,您将学习如何在TensorFlow中执行以下操作:\n", 21 | "\n", 22 | "- 初始化变量\n", 23 | "- 开始你自己的会话\n", 24 | "- 训练算法\n", 25 | "- 实现一个神经网络\n", 26 | "\n", 27 | "编程框架不仅可以缩短您的编码时间,但有时也可以执行优化来加速您的代码。\n", 28 | "\n", 29 | "\n", 30 | "## 1 - Exploring the Tensorflow Library\n", 31 | "\n", 32 | "To start, you will import the library:\n" 33 | ] 34 | }, 35 | { 36 | "cell_type": "code", 37 | "execution_count": 66, 38 | "metadata": { 39 | "collapsed": false 40 | }, 41 | "outputs": [], 42 | "source": [ 43 | "import math\n", 44 | "import numpy as np\n", 45 | "import h5py\n", 46 | "import matplotlib.pyplot as plt\n", 47 | "import tensorflow as tf\n", 48 | "from tensorflow.python.framework import ops\n", 49 | "from tf_utils import load_dataset, random_mini_batches, convert_to_one_hot, predict\n", 50 | "\n", 51 | "%matplotlib inline\n", 52 | "np.random.seed(1)" 53 | ] 54 | }, 55 | { 56 | "cell_type": "markdown", 57 | "metadata": {}, 58 | "source": [ 59 | "Now that you have imported the library, we will walk you through its different applications. You will start with an example, where we compute for you the loss of one training example. 
\n", 60 | "$$loss = \\mathcal{L}(\\hat{y}, y) = (\\hat y^{(i)} - y^{(i)})^2 \\tag{1}$$" 61 | ] 62 | }, 63 | { 64 | "cell_type": "code", 65 | "execution_count": 38, 66 | "metadata": { 67 | "collapsed": false 68 | }, 69 | "outputs": [ 70 | { 71 | "name": "stdout", 72 | "output_type": "stream", 73 | "text": [ 74 | "9\n" 75 | ] 76 | } 77 | ], 78 | "source": [ 79 | "y_hat = tf.constant(36, name='y_hat') # Define y_hat constant. Set to 36.\n", 80 | "y = tf.constant(39, name='y') # Define y. Set to 39\n", 81 | "\n", 82 | "loss = tf.Variable((y - y_hat)**2, name='loss') # Create a variable for the loss\n", 83 | "\n", 84 | "init = tf.global_variables_initializer() # When init is run later (session.run(init)),\n", 85 | " # the loss variable will be initialized and ready to be computed\n", 86 | "with tf.Session() as session: # Create a session and print the output\n", 87 | " session.run(init) # Initializes the variables\n", 88 | " print(session.run(loss)) # Prints the loss" 89 | ] 90 | }, 91 | { 92 | "cell_type": "markdown", 93 | "metadata": {}, 94 | "source": [ 95 | "Writing and running programs in TensorFlow has the following steps:\n", 96 | "\n", 97 | "1. Create Tensors (variables) that are not yet executed/evaluated. \n", 98 | "2. Write operations between those Tensors.\n", 99 | "3. Initialize your Tensors. \n", 100 | "4. Create a Session. \n", 101 | "5. Run the Session. This will run the operations you'd written above. \n", 102 | "\n", 103 | "Therefore, when we created a variable for the loss, we simply defined the loss as a function of other quantities, but did not evaluate its value. To evaluate it, we had to run `init=tf.global_variables_initializer()`. That initialized the loss variable, and in the last line we were finally able to evaluate the value of `loss` and print its value.\n", 104 | "\n", 105 | "Now let us look at an easy example. Run the cell below:" 106 | ] 107 | }, 108 | { 109 | "cell_type": "code", 110 | "execution_count": 39, 111 | "metadata": { 112 | "collapsed": false 113 | }, 114 | "outputs": [ 115 | { 116 | "name": "stdout", 117 | "output_type": "stream", 118 | "text": [ 119 | "Tensor(\"Mul:0\", shape=(), dtype=int32)\n" 120 | ] 121 | } 122 | ], 123 | "source": [ 124 | "a = tf.constant(2)\n", 125 | "b = tf.constant(10)\n", 126 | "c = tf.multiply(a,b)\n", 127 | "print(c)" 128 | ] 129 | }, 130 | { 131 | "cell_type": "markdown", 132 | "metadata": {}, 133 | "source": [ 134 | "As expected, you will not see 20! You got a tensor saying that the result is a tensor that does not have the shape attribute, and is of type \"int32\". All you did was put in the 'computation graph', but you have not run this computation yet. In order to actually multiply the two numbers, you will have to create a session and run it." 135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": 40, 140 | "metadata": { 141 | "collapsed": false 142 | }, 143 | "outputs": [ 144 | { 145 | "name": "stdout", 146 | "output_type": "stream", 147 | "text": [ 148 | "20\n" 149 | ] 150 | } 151 | ], 152 | "source": [ 153 | "sess = tf.Session()\n", 154 | "print(sess.run(c))" 155 | ] 156 | }, 157 | { 158 | "cell_type": "markdown", 159 | "metadata": {}, 160 | "source": [ 161 | "Great! To summarize, **remember to initialize your variables, create a session and run the operations inside the session**. \n", 162 | "\n", 163 | "Next, you'll also have to know about placeholders. A placeholder is an object whose value you can specify only later. 
\n", 164 | "To specify values for a placeholder, you can pass in values by using a \"feed dictionary\" (`feed_dict` variable). Below, we created a placeholder for x. This allows us to pass in a number later when we run the session. " 165 | ] 166 | }, 167 | { 168 | "cell_type": "code", 169 | "execution_count": 41, 170 | "metadata": { 171 | "collapsed": false 172 | }, 173 | "outputs": [ 174 | { 175 | "name": "stdout", 176 | "output_type": "stream", 177 | "text": [ 178 | "6\n" 179 | ] 180 | } 181 | ], 182 | "source": [ 183 | "# Change the value of x in the feed_dict\n", 184 | "\n", 185 | "x = tf.placeholder(tf.int64, name = 'x')\n", 186 | "print(sess.run(2 * x, feed_dict = {x: 3}))\n", 187 | "sess.close()" 188 | ] 189 | }, 190 | { 191 | "cell_type": "markdown", 192 | "metadata": {}, 193 | "source": [ 194 | "When you first defined `x` you did not have to specify a value for it. A placeholder is simply a variable that you will assign data to only later, when running the session. We say that you **feed data** to these placeholders when running the session. \n", 195 | "\n", 196 | "Here's what's happening: When you specify the operations needed for a computation, you are telling TensorFlow how to construct a computation graph. The computation graph can have some placeholders whose values you will specify only later. Finally, when you run the session, you are telling TensorFlow to execute the computation graph." 197 | ] 198 | }, 199 | { 200 | "cell_type": "markdown", 201 | "metadata": {}, 202 | "source": [ 203 | "### 1.1 - Linear function\n", 204 | "\n", 205 | "Lets start this programming exercise by computing the following equation: $Y = WX + b$, where $W$ and $X$ are random matrices and b is a random vector. \n", 206 | "\n", 207 | "**Exercise**: Compute $WX + b$ where $W, X$, and $b$ are drawn from a random normal distribution. W is of shape (4, 3), X is (3,1) and b is (4,1). As an example, here is how you would define a constant X that has shape (3,1):\n", 208 | "```python\n", 209 | "X = tf.constant(np.random.randn(3,1), name = \"X\")\n", 210 | "\n", 211 | "```\n", 212 | "You might find the following functions helpful: \n", 213 | "- tf.matmul(..., ...) to do a matrix multiplication\n", 214 | "- tf.add(..., ...) to do an addition\n", 215 | "- np.random.randn(...) to initialize randomly\n" 216 | ] 217 | }, 218 | { 219 | "cell_type": "code", 220 | "execution_count": 42, 221 | "metadata": { 222 | "collapsed": true 223 | }, 224 | "outputs": [], 225 | "source": [ 226 | "# GRADED FUNCTION: linear_function\n", 227 | "\n", 228 | "def linear_function():\n", 229 | " \"\"\"\n", 230 | " Implements a linear function: \n", 231 | " Initializes W to be a random tensor of shape (4,3)\n", 232 | " Initializes X to be a random tensor of shape (3,1)\n", 233 | " Initializes b to be a random tensor of shape (4,1)\n", 234 | " Returns: \n", 235 | " result -- runs the session for Y = WX + b \n", 236 | " \"\"\"\n", 237 | " \n", 238 | " np.random.seed(1)\n", 239 | " \n", 240 | " ### START CODE HERE ### (4 lines of code)\n", 241 | " X = tf.constant(np.random.randn(3,1), name = \"X\")\n", 242 | " W = tf.constant(np.random.randn(4,3), name = \"W\")\n", 243 | " b = tf.constant(np.random.randn(4,1), name = \"b\")\n", 244 | "# print(X)\n", 245 | "# print(W)\n", 246 | "# print(b)\n", 247 | " Y = tf.add(tf.matmul(W,X), b)\n", 248 | " ### END CODE HERE ### \n", 249 | " \n", 250 | " # Create the session using tf.Session() and run it with sess.run(...) 
on the variable you want to calculate\n", 251 | " \n", 252 | " ### START CODE HERE ###\n", 253 | " sess = tf.Session()\n", 254 | " result = sess.run(Y)\n", 255 | " ### END CODE HERE ### \n", 256 | " \n", 257 | " # close the session \n", 258 | " sess.close()\n", 259 | "\n", 260 | " return result" 261 | ] 262 | }, 263 | { 264 | "cell_type": "code", 265 | "execution_count": 43, 266 | "metadata": { 267 | "collapsed": false 268 | }, 269 | "outputs": [ 270 | { 271 | "name": "stdout", 272 | "output_type": "stream", 273 | "text": [ 274 | "result = [[-2.15657382]\n", 275 | " [ 2.95891446]\n", 276 | " [-1.08926781]\n", 277 | " [-0.84538042]]\n" 278 | ] 279 | } 280 | ], 281 | "source": [ 282 | "print( \"result = \" + str(linear_function()))" 283 | ] 284 | }, 285 | { 286 | "cell_type": "markdown", 287 | "metadata": {}, 288 | "source": [ 289 | "*** Expected Output ***: \n", 290 | "\n", 291 | " \n", 292 | " \n", 293 | "\n", 296 | "\n", 302 | " \n", 303 | "\n", 304 | "
\n", 294 | "**result**\n", 295 | "\n", 297 | "[[-2.15657382]\n", 298 | " [ 2.95891446]\n", 299 | " [-1.08926781]\n", 300 | " [-0.84538042]]\n", 301 | "
" 305 | ] 306 | }, 307 | { 308 | "cell_type": "markdown", 309 | "metadata": {}, 310 | "source": [ 311 | "### 1.2 - Computing the sigmoid \n", 312 | "Great! You just implemented a linear function. Tensorflow offers a variety of commonly used neural network functions like `tf.sigmoid` and `tf.softmax`. For this exercise lets compute the sigmoid function of an input. \n", 313 | "\n", 314 | "You will do this exercise using a placeholder variable `x`. When running the session, you should use the feed dictionary to pass in the input `z`. In this exercise, you will have to (i) create a placeholder `x`, (ii) define the operations needed to compute the sigmoid using `tf.sigmoid`, and then (iii) run the session. \n", 315 | "\n", 316 | "** Exercise **: Implement the sigmoid function below. You should use the following: \n", 317 | "\n", 318 | "- `tf.placeholder(tf.float32, name = \"...\")`\n", 319 | "- `tf.sigmoid(...)`\n", 320 | "- `sess.run(..., feed_dict = {x: z})`\n", 321 | "\n", 322 | "\n", 323 | "Note that there are two typical ways to create and use sessions in tensorflow: \n", 324 | "\n", 325 | "**Method 1:**\n", 326 | "```python\n", 327 | "sess = tf.Session()\n", 328 | "# Run the variables initialization (if needed), run the operations\n", 329 | "result = sess.run(..., feed_dict = {...})\n", 330 | "sess.close() # Close the session\n", 331 | "```\n", 332 | "**Method 2:**\n", 333 | "```python\n", 334 | "with tf.Session() as sess: \n", 335 | " # run the variables initialization (if needed), run the operations\n", 336 | " result = sess.run(..., feed_dict = {...})\n", 337 | " # This takes care of closing the session for you :)\n", 338 | "```\n" 339 | ] 340 | }, 341 | { 342 | "cell_type": "code", 343 | "execution_count": 44, 344 | "metadata": { 345 | "collapsed": true 346 | }, 347 | "outputs": [], 348 | "source": [ 349 | "# GRADED FUNCTION: sigmoid\n", 350 | "\n", 351 | "def sigmoid(z):\n", 352 | " \"\"\"\n", 353 | " Computes the sigmoid of z\n", 354 | " \n", 355 | " Arguments:\n", 356 | " z -- input value, scalar or vector\n", 357 | " \n", 358 | " Returns: \n", 359 | " results -- the sigmoid of z\n", 360 | " \"\"\"\n", 361 | " \n", 362 | " ### START CODE HERE ### ( approx. 4 lines of code)\n", 363 | " # Create a placeholder for x. Name it 'x'.\n", 364 | " x = tf.placeholder(tf.float32, name=\"x\")\n", 365 | "\n", 366 | " # compute sigmoid(x)\n", 367 | " sigmoid = tf.sigmoid(x)\n", 368 | "\n", 369 | " # Create a session, and run it. Please use the method 2 explained above. \n", 370 | " # You should use a feed_dict to pass z's value to x. \n", 371 | " with tf.Session() as sess:\n", 372 | " # Run session and call the output \"result\"\n", 373 | " result = sess.run(sigmoid, feed_dict={x:z})\n", 374 | " \n", 375 | " ### END CODE HERE ###\n", 376 | " \n", 377 | " return result" 378 | ] 379 | }, 380 | { 381 | "cell_type": "code", 382 | "execution_count": 45, 383 | "metadata": { 384 | "collapsed": false 385 | }, 386 | "outputs": [ 387 | { 388 | "name": "stdout", 389 | "output_type": "stream", 390 | "text": [ 391 | "sigmoid(0) = 0.5\n", 392 | "sigmoid(12) = 0.999994\n" 393 | ] 394 | } 395 | ], 396 | "source": [ 397 | "print (\"sigmoid(0) = \" + str(sigmoid(0)))\n", 398 | "print (\"sigmoid(12) = \" + str(sigmoid(12)))" 399 | ] 400 | }, 401 | { 402 | "cell_type": "markdown", 403 | "metadata": {}, 404 | "source": [ 405 | "*** Expected Output ***: \n", 406 | "\n", 407 | " \n", 408 | " \n", 409 | "\n", 412 | "\n", 415 | "\n", 416 | " \n", 417 | "\n", 420 | "\n", 423 | " \n", 424 | "\n", 425 | "
\n", 410 | "**sigmoid(0)**\n", 411 | "\n", 413 | "0.5\n", 414 | "
\n", 418 | "**sigmoid(12)**\n", 419 | "\n", 421 | "0.999994\n", 422 | "
" 426 | ] 427 | }, 428 | { 429 | "cell_type": "markdown", 430 | "metadata": {}, 431 | "source": [ 432 | "\n", 433 | "**To summarize, you how know how to**:\n", 434 | "1. Create placeholders\n", 435 | "2. Specify the computation graph corresponding to operations you want to compute\n", 436 | "3. Create the session\n", 437 | "4. Run the session, using a feed dictionary if necessary to specify placeholder variables' values. " 438 | ] 439 | }, 440 | { 441 | "cell_type": "markdown", 442 | "metadata": {}, 443 | "source": [ 444 | "### 1.3 - Computing the Cost\n", 445 | "\n", 446 | "You can also use a built-in function to compute the cost of your neural network. So instead of needing to write code to compute this as a function of $a^{[2](i)}$ and $y^{(i)}$ for i=1...m: \n", 447 | "$$ J = - \\frac{1}{m} \\sum_{i = 1}^m \\large ( \\small y^{(i)} \\log a^{ [2] (i)} + (1-y^{(i)})\\log (1-a^{ [2] (i)} )\\large )\\small\\tag{2}$$\n", 448 | "\n", 449 | "you can do it in one line of code in tensorflow!\n", 450 | "\n", 451 | "**Exercise**: Implement the cross entropy loss. The function you will use is: \n", 452 | "\n", 453 | "\n", 454 | "- `tf.nn.sigmoid_cross_entropy_with_logits(logits = ..., labels = ...)`\n", 455 | "\n", 456 | "Your code should input `z`, compute the sigmoid (to get `a`) and then compute the cross entropy cost $J$. All this can be done using one call to `tf.nn.sigmoid_cross_entropy_with_logits`, which computes\n", 457 | "\n", 458 | "$$- \\frac{1}{m} \\sum_{i = 1}^m \\large ( \\small y^{(i)} \\log \\sigma(z^{[2](i)}) + (1-y^{(i)})\\log (1-\\sigma(z^{[2](i)})\\large )\\small\\tag{2}$$\n", 459 | "\n" 460 | ] 461 | }, 462 | { 463 | "cell_type": "code", 464 | "execution_count": 46, 465 | "metadata": { 466 | "collapsed": true 467 | }, 468 | "outputs": [], 469 | "source": [ 470 | "# GRADED FUNCTION: cost\n", 471 | "\n", 472 | "def cost(logits, labels):\n", 473 | " \"\"\"\n", 474 | "    Computes the cost using the sigmoid cross entropy\n", 475 | "    \n", 476 | "    Arguments:\n", 477 | "    logits -- vector containing z, output of the last linear unit (before the final sigmoid activation)\n", 478 | "    labels -- vector of labels y (1 or 0) \n", 479 | " \n", 480 | " Note: What we've been calling \"z\" and \"y\" in this class are respectively called \"logits\" and \"labels\" \n", 481 | " in the TensorFlow documentation. So logits will feed into z, and labels into y. \n", 482 | "    \n", 483 | "    Returns:\n", 484 | "    cost -- runs the session of the cost (formula (2))\n", 485 | " \"\"\"\n", 486 | " \n", 487 | " ### START CODE HERE ### \n", 488 | " \n", 489 | " # Create the placeholders for \"logits\" (z) and \"labels\" (y) (approx. 2 lines)\n", 490 | " z = tf.placeholder(tf.float32, name=\"logits\")\n", 491 | " y = tf.placeholder(tf.float32, name=\"labels\")\n", 492 | " \n", 493 | " # Use the loss function (approx. 1 line)\n", 494 | " cost = tf.nn.sigmoid_cross_entropy_with_logits(logits=z, labels=y)\n", 495 | " \n", 496 | " # Create a session (approx. 1 line). See method 1 above.\n", 497 | " sess = tf.Session()\n", 498 | " \n", 499 | " # Run the session (approx. 1 line).\n", 500 | " cost = sess.run(cost, feed_dict={z:logits, y:labels})\n", 501 | " \n", 502 | " \n", 503 | " # Close the session (approx. 1 line). 
See method 1 above.\n", 504 | " sess.close()\n", 505 | " \n", 506 | " ### END CODE HERE ###\n", 507 | " \n", 508 | " return cost" 509 | ] 510 | }, 511 | { 512 | "cell_type": "code", 513 | "execution_count": 47, 514 | "metadata": { 515 | "collapsed": false 516 | }, 517 | "outputs": [ 518 | { 519 | "name": "stdout", 520 | "output_type": "stream", 521 | "text": [ 522 | "[ 0.54983395 0.59868765 0.66818774 0.71094948]\n" 523 | ] 524 | } 525 | ], 526 | "source": [ 527 | "logits = sigmoid(np.array([0.2,0.4,0.7,0.9]))\n", 528 | "cost = cost(logits, np.array([0,0,1,1]))\n", 529 | "print(logits)" 530 | ] 531 | }, 532 | { 533 | "cell_type": "markdown", 534 | "metadata": {}, 535 | "source": [ 536 | "** Expected Output** : \n", 537 | "\n", 538 | " \n", 539 | " \n", 540 | " \n", 543 | " \n", 546 | " \n", 547 | "\n", 548 | "
\n", 541 | " **cost**\n", 542 | " \n", 544 | " [ 1.00538719 1.03664088 0.41385433 0.39956614]\n", 545 | "
" 549 | ] 550 | }, 551 | { 552 | "cell_type": "markdown", 553 | "metadata": {}, 554 | "source": [ 555 | "### 1.4 - Using One Hot encodings\n", 556 | "\n", 557 | "Many times in deep learning you will have a y vector with numbers ranging from 0 to C-1, where C is the number of classes. If C is for example 4, then you might have the following y vector which you will need to convert as follows:\n", 558 | "\n", 559 | "\n", 560 | "\n", 561 | "\n", 562 | "This is called a \"one hot\" encoding, because in the converted representation exactly one element of each column is \"hot\" (meaning set to 1). To do this conversion in numpy, you might have to write a few lines of code. In tensorflow, you can use one line of code: \n", 563 | "\n", 564 | "- tf.one_hot(labels, depth, axis) \n", 565 | "\n", 566 | "**Exercise:** Implement the function below to take one vector of labels and the total number of classes $C$, and return the one hot encoding. Use `tf.one_hot()` to do this. " 567 | ] 568 | }, 569 | { 570 | "cell_type": "code", 571 | "execution_count": 48, 572 | "metadata": { 573 | "collapsed": true 574 | }, 575 | "outputs": [], 576 | "source": [ 577 | "# GRADED FUNCTION: one_hot_matrix\n", 578 | "\n", 579 | "def one_hot_matrix(labels, C):\n", 580 | " \"\"\"\n", 581 | " Creates a matrix where the i-th row corresponds to the ith class number and the jth column\n", 582 | " corresponds to the jth training example. So if example j had a label i. Then entry (i,j) \n", 583 | " will be 1. \n", 584 | " \n", 585 | " Arguments:\n", 586 | " labels -- vector containing the labels \n", 587 | " C -- number of classes, the depth of the one hot dimension\n", 588 | " \n", 589 | " Returns: \n", 590 | " one_hot -- one hot matrix\n", 591 | " \"\"\"\n", 592 | " \n", 593 | " ### START CODE HERE ###\n", 594 | " \n", 595 | " # Create a tf.constant equal to C (depth), name it 'C'. (approx. 1 line)\n", 596 | " C = tf.constant(C)\n", 597 | " \n", 598 | " # Use tf.one_hot, be careful with the axis (approx. 1 line)\n", 599 | " one_hot_matrix = tf.one_hot(labels, C, axis=0)\n", 600 | " \n", 601 | " # Create the session (approx. 1 line)\n", 602 | " sess = tf.Session()\n", 603 | " \n", 604 | " # Run the session (approx. 1 line)\n", 605 | " one_hot = sess.run(one_hot_matrix)\n", 606 | " \n", 607 | " # Close the session (approx. 1 line). See method 1 above.\n", 608 | " sess.close()\n", 609 | " \n", 610 | " ### END CODE HERE ###\n", 611 | " \n", 612 | " return one_hot" 613 | ] 614 | }, 615 | { 616 | "cell_type": "code", 617 | "execution_count": 49, 618 | "metadata": { 619 | "collapsed": false 620 | }, 621 | "outputs": [ 622 | { 623 | "name": "stdout", 624 | "output_type": "stream", 625 | "text": [ 626 | "one_hot = [[ 0. 0. 0. 1. 0. 0.]\n", 627 | " [ 1. 0. 0. 0. 0. 1.]\n", 628 | " [ 0. 1. 0. 0. 1. 0.]\n", 629 | " [ 0. 0. 1. 0. 0. 0.]]\n" 630 | ] 631 | } 632 | ], 633 | "source": [ 634 | "labels = np.array([1,2,3,0,2,1])\n", 635 | "one_hot = one_hot_matrix(labels, C = 4)\n", 636 | "print (\"one_hot = \" + str(one_hot))" 637 | ] 638 | }, 639 | { 640 | "cell_type": "markdown", 641 | "metadata": {}, 642 | "source": [ 643 | "**Expected Output**: \n", 644 | "\n", 645 | " \n", 646 | " \n", 647 | " \n", 650 | " \n", 656 | " \n", 657 | "\n", 658 | "
\n", 648 | " **one_hot**\n", 649 | " \n", 651 | " [[ 0. 0. 0. 1. 0. 0.]\n", 652 | " [ 1. 0. 0. 0. 0. 1.]\n", 653 | " [ 0. 1. 0. 0. 1. 0.]\n", 654 | " [ 0. 0. 1. 0. 0. 0.]]\n", 655 | "
\n" 659 | ] 660 | }, 661 | { 662 | "cell_type": "markdown", 663 | "metadata": {}, 664 | "source": [ 665 | "### 1.5 - Initialize with zeros and ones\n", 666 | "\n", 667 | "Now you will learn how to initialize a vector of zeros and ones. The function you will be calling is `tf.ones()`. To initialize with zeros you could use tf.zeros() instead. These functions take in a shape and return an array of dimension shape full of zeros and ones respectively. \n", 668 | "\n", 669 | "**Exercise:** Implement the function below to take in a shape and to return an array (of the shape's dimension of ones). \n", 670 | "\n", 671 | " - tf.ones(shape)\n" 672 | ] 673 | }, 674 | { 675 | "cell_type": "code", 676 | "execution_count": 50, 677 | "metadata": { 678 | "collapsed": true 679 | }, 680 | "outputs": [], 681 | "source": [ 682 | "# GRADED FUNCTION: ones\n", 683 | "\n", 684 | "def ones(shape):\n", 685 | " \"\"\"\n", 686 | " Creates an array of ones of dimension shape\n", 687 | " \n", 688 | " Arguments:\n", 689 | " shape -- shape of the array you want to create\n", 690 | " \n", 691 | " Returns: \n", 692 | " ones -- array containing only ones\n", 693 | " \"\"\"\n", 694 | " \n", 695 | " ### START CODE HERE ###\n", 696 | " \n", 697 | " # Create \"ones\" tensor using tf.ones(...). (approx. 1 line)\n", 698 | " ones = tf.ones(shape)\n", 699 | " \n", 700 | " # Create the session (approx. 1 line)\n", 701 | " sess = tf.Session()\n", 702 | " \n", 703 | " # Run the session to compute 'ones' (approx. 1 line)\n", 704 | " ones = sess.run(ones)\n", 705 | " \n", 706 | " # Close the session (approx. 1 line). See method 1 above.\n", 707 | " sess.close()\n", 708 | " \n", 709 | " ### END CODE HERE ###\n", 710 | " return ones" 711 | ] 712 | }, 713 | { 714 | "cell_type": "code", 715 | "execution_count": 51, 716 | "metadata": { 717 | "collapsed": false 718 | }, 719 | "outputs": [ 720 | { 721 | "name": "stdout", 722 | "output_type": "stream", 723 | "text": [ 724 | "ones = [ 1. 1. 1.]\n" 725 | ] 726 | } 727 | ], 728 | "source": [ 729 | "print (\"ones = \" + str(ones([3])))" 730 | ] 731 | }, 732 | { 733 | "cell_type": "markdown", 734 | "metadata": {}, 735 | "source": [ 736 | "**Expected Output:**\n", 737 | "\n", 738 | " \n", 739 | " \n", 740 | " \n", 743 | " \n", 746 | " \n", 747 | "\n", 748 | "
\n", 741 | " **ones**\n", 742 | " \n", 744 | " [ 1. 1. 1.]\n", 745 | "
" 749 | ] 750 | }, 751 | { 752 | "cell_type": "markdown", 753 | "metadata": {}, 754 | "source": [ 755 | "# 2 - Building your first neural network in tensorflow\n", 756 | "\n", 757 | "In this part of the assignment you will build a neural network using tensorflow. Remember that there are two parts to implement a tensorflow model:\n", 758 | "\n", 759 | "- Create the computation graph\n", 760 | "- Run the graph\n", 761 | "\n", 762 | "Let's delve into the problem you'd like to solve!\n", 763 | "\n", 764 | "### 2.0 - Problem statement: SIGNS Dataset\n", 765 | "\n", 766 | "One afternoon, with some friends we decided to teach our computers to decipher sign language. We spent a few hours taking pictures in front of a white wall and came up with the following dataset. It's now your job to build an algorithm that would facilitate communications from a speech-impaired person to someone who doesn't understand sign language.\n", 767 | "\n", 768 | "- **Training set**: 1080 pictures (64 by 64 pixels) of signs representing numbers from 0 to 5 (180 pictures per number).\n", 769 | "- **Test set**: 120 pictures (64 by 64 pixels) of signs representing numbers from 0 to 5 (20 pictures per number).\n", 770 | "\n", 771 | "Note that this is a subset of the SIGNS dataset. The complete dataset contains many more signs.\n", 772 | "\n", 773 | "Here are examples for each number, and how an explanation of how we represent the labels. These are the original pictures, before we lowered the image resolutoion to 64 by 64 pixels.\n", 774 | "
**Figure 1**: SIGNS dataset
\n", 775 | "\n", 776 | "\n", 777 | "Run the following code to load the dataset." 778 | ] 779 | }, 780 | { 781 | "cell_type": "code", 782 | "execution_count": 74, 783 | "metadata": { 784 | "collapsed": false 785 | }, 786 | "outputs": [ 787 | { 788 | "ename": "AttributeError", 789 | "evalue": "module 'h5py' has no attribute 'File'", 790 | "output_type": "error", 791 | "traceback": [ 792 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", 793 | "\u001b[0;31mAttributeError\u001b[0m Traceback (most recent call last)", 794 | "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[1;31m# Loading the dataset\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mtrain_dataset\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mh5py\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mFile\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'1212.mp'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m\"r\"\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 3\u001b[0m \u001b[0mX_train_orig\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mY_train_orig\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mX_test_orig\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mY_test_orig\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mclasses\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mload_dataset\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", 795 | "\u001b[0;31mAttributeError\u001b[0m: module 'h5py' has no attribute 'File'" 796 | ] 797 | } 798 | ], 799 | "source": [ 800 | "# Loading the dataset\n", 801 | "X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset()" 802 | ] 803 | }, 804 | { 805 | "cell_type": "markdown", 806 | "metadata": {}, 807 | "source": [ 808 | "Change the index below and run the cell to visualize some examples in the dataset." 809 | ] 810 | }, 811 | { 812 | "cell_type": "code", 813 | "execution_count": 23, 814 | "metadata": { 815 | "collapsed": false 816 | }, 817 | "outputs": [], 818 | "source": [ 819 | "# Example of a picture\n", 820 | "index = 0\n", 821 | "plt.imshow(X_train_orig[index])\n", 822 | "print (\"y = \" + str(np.squeeze(Y_train_orig[:, index])))" 823 | ] 824 | }, 825 | { 826 | "cell_type": "markdown", 827 | "metadata": {}, 828 | "source": [ 829 | "As usual you flatten the image dataset, then normalize it by dividing by 255. On top of that, you will convert each label to a one-hot vector as shown in Figure 1. Run the cell below to do so." 
830 |    ]
831 |   },
832 |   {
833 |    "cell_type": "code",
834 |    "execution_count": null,
835 |    "metadata": {
836 |     "collapsed": true
837 |    },
838 |    "outputs": [],
839 |    "source": [
840 |     "# Flatten the training and test images\n",
841 |     "X_train_flatten = X_train_orig.reshape(X_train_orig.shape[0], -1).T\n",
842 |     "X_test_flatten = X_test_orig.reshape(X_test_orig.shape[0], -1).T\n",
843 |     "# Normalize image vectors\n",
844 |     "X_train = X_train_flatten/255.\n",
845 |     "X_test = X_test_flatten/255.\n",
846 |     "# Convert training and test labels to one hot matrices\n",
847 |     "Y_train = convert_to_one_hot(Y_train_orig, 6)\n",
848 |     "Y_test = convert_to_one_hot(Y_test_orig, 6)\n",
849 |     "\n",
850 |     "print (\"number of training examples = \" + str(X_train.shape[1]))\n",
851 |     "print (\"number of test examples = \" + str(X_test.shape[1]))\n",
852 |     "print (\"X_train shape: \" + str(X_train.shape))\n",
853 |     "print (\"Y_train shape: \" + str(Y_train.shape))\n",
854 |     "print (\"X_test shape: \" + str(X_test.shape))\n",
855 |     "print (\"Y_test shape: \" + str(Y_test.shape))"
856 |    ]
857 |   },
858 |   {
859 |    "cell_type": "markdown",
860 |    "metadata": {},
861 |    "source": [
862 |     "**Note** that 12288 comes from $64 \\times 64 \\times 3$. Each image is square, 64 by 64 pixels, and 3 is for the RGB colors. Please make sure all these shapes make sense to you before continuing."
863 |    ]
864 |   },
865 |   {
866 |    "cell_type": "markdown",
867 |    "metadata": {},
868 |    "source": [
869 |     "**Your goal** is to build an algorithm capable of recognizing a sign with high accuracy. To do so, you are going to build a tensorflow model that is almost the same as one you have previously built in numpy for cat recognition (but now using a softmax output). It is a great occasion to compare your numpy implementation to the tensorflow one. \n",
870 |     "\n",
871 |     "**The model** is *LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SOFTMAX*. The SIGMOID output layer has been converted to a SOFTMAX. A SOFTMAX layer generalizes SIGMOID to when there are more than two classes. "
872 |    ]
873 |   },
874 |   {
875 |    "cell_type": "markdown",
876 |    "metadata": {},
877 |    "source": [
878 |     "### 2.1 - Create placeholders\n",
879 |     "\n",
880 |     "Your first task is to create placeholders for `X` and `Y`. This will allow you to later pass your training data in when you run your session. \n",
881 |     "\n",
882 |     "**Exercise:** Implement the function below to create the placeholders in tensorflow."
883 |    ]
884 |   },
885 |   {
886 |    "cell_type": "code",
887 |    "execution_count": null,
888 |    "metadata": {
889 |     "collapsed": true
890 |    },
891 |    "outputs": [],
892 |    "source": [
893 |     "# GRADED FUNCTION: create_placeholders\n",
894 |     "\n",
895 |     "def create_placeholders(n_x, n_y):\n",
896 |     "    \"\"\"\n",
897 |     "    Creates the placeholders for the tensorflow session.\n",
898 |     "    \n",
899 |     "    Arguments:\n",
900 |     "    n_x -- scalar, size of an image vector (num_px * num_px = 64 * 64 * 3 = 12288)\n",
901 |     "    n_y -- scalar, number of classes (from 0 to 5, so -> 6)\n",
902 |     "    \n",
903 |     "    Returns:\n",
904 |     "    X -- placeholder for the data input, of shape [n_x, None] and dtype \"float\"\n",
905 |     "    Y -- placeholder for the input labels, of shape [n_y, None] and dtype \"float\"\n",
906 |     "    \n",
907 |     "    Tips:\n",
908 |     "    - You will use None because it lets us be flexible about the number of examples you will use for the placeholders.\n",
909 |     "      In fact, the number of examples during test/train is different.\n",
910 |     "    \"\"\"\n",
911 |     "\n",
912 |     "    ### START CODE HERE ### (approx. 
2 lines)\n", 913 | " X = tf.placeholder(tf.float32, shape=(n_x,None), name=\"X\")\n", 914 | " Y = tf.placeholder(tf.float32, shape=(n_y,None), name=\"Y\")\n", 915 | " ### END CODE HERE ###\n", 916 | " \n", 917 | " return X, Y" 918 | ] 919 | }, 920 | { 921 | "cell_type": "code", 922 | "execution_count": null, 923 | "metadata": { 924 | "collapsed": true 925 | }, 926 | "outputs": [], 927 | "source": [ 928 | "X, Y = create_placeholders(12288, 6)\n", 929 | "print (\"X = \" + str(X))\n", 930 | "print (\"Y = \" + str(Y))" 931 | ] 932 | }, 933 | { 934 | "cell_type": "markdown", 935 | "metadata": {}, 936 | "source": [ 937 | "**Expected Output**: \n", 938 | "\n", 939 | " \n", 940 | " \n", 941 | " \n", 944 | " \n", 947 | " \n", 948 | " \n", 949 | " \n", 952 | " \n", 955 | " \n", 956 | "\n", 957 | "
\n", 942 | " **X**\n", 943 | " \n", 945 | " Tensor(\"Placeholder_1:0\", shape=(12288, ?), dtype=float32) (not necessarily Placeholder_1)\n", 946 | "
\n", 950 | " **Y**\n", 951 | " \n", 953 | " Tensor(\"Placeholder_2:0\", shape=(10, ?), dtype=float32) (not necessarily Placeholder_2)\n", 954 | "
" 958 | ] 959 | }, 960 | { 961 | "cell_type": "markdown", 962 | "metadata": {}, 963 | "source": [ 964 | "### 2.2 - Initializing the parameters\n", 965 | "\n", 966 | "Your second task is to initialize the parameters in tensorflow.\n", 967 | "\n", 968 | "**Exercise:** Implement the function below to initialize the parameters in tensorflow. You are going use Xavier Initialization for weights and Zero Initialization for biases. The shapes are given below. As an example, to help you, for W1 and b1 you could use: \n", 969 | "\n", 970 | "```python\n", 971 | "W1 = tf.get_variable(\"W1\", [25,12288], initializer = tf.contrib.layers.xavier_initializer(seed = 1))\n", 972 | "b1 = tf.get_variable(\"b1\", [25,1], initializer = tf.zeros_initializer())\n", 973 | "```\n", 974 | "Please use `seed = 1` to make sure your results match ours." 975 | ] 976 | }, 977 | { 978 | "cell_type": "code", 979 | "execution_count": null, 980 | "metadata": { 981 | "collapsed": true 982 | }, 983 | "outputs": [], 984 | "source": [ 985 | "# GRADED FUNCTION: initialize_parameters\n", 986 | "\n", 987 | "def initialize_parameters():\n", 988 | " \"\"\"\n", 989 | " Initializes parameters to build a neural network with tensorflow. The shapes are:\n", 990 | " W1 : [25, 12288]\n", 991 | " b1 : [25, 1]\n", 992 | " W2 : [12, 25]\n", 993 | " b2 : [12, 1]\n", 994 | " W3 : [6, 12]\n", 995 | " b3 : [6, 1]\n", 996 | " \n", 997 | " Returns:\n", 998 | " parameters -- a dictionary of tensors containing W1, b1, W2, b2, W3, b3\n", 999 | " \"\"\"\n", 1000 | " \n", 1001 | " tf.set_random_seed(1) # so that your \"random\" numbers match ours\n", 1002 | " \n", 1003 | " ### START CODE HERE ### (approx. 6 lines of code)\n", 1004 | " W1 = tf.get_variable(\"W1\", [25,12288], initializer = tf.contrib.layers.xavier_initializer(seed = 1))\n", 1005 | " b1 = tf.get_variable(\"b1\", [25,1], initializer = tf.zeros_initializer())\n", 1006 | " W2 = tf.get_variable(\"W2\", [12,25], initializer = tf.contrib.layers.xavier_initializer(seed = 1))\n", 1007 | " b2 = tf.get_variable(\"b2\", [12,1], initializer = tf.zeros_initializer())\n", 1008 | " W3 = tf.get_variable(\"W3\", [6,12], initializer = tf.contrib.layers.xavier_initializer(seed = 1))\n", 1009 | " b3 = tf.get_variable(\"b3\", [6,1], initializer = tf.zeros_initializer())\n", 1010 | " ### END CODE HERE ###\n", 1011 | "\n", 1012 | " parameters = {\"W1\": W1,\n", 1013 | " \"b1\": b1,\n", 1014 | " \"W2\": W2,\n", 1015 | " \"b2\": b2,\n", 1016 | " \"W3\": W3,\n", 1017 | " \"b3\": b3}\n", 1018 | " \n", 1019 | " return parameters" 1020 | ] 1021 | }, 1022 | { 1023 | "cell_type": "code", 1024 | "execution_count": 25, 1025 | "metadata": { 1026 | "collapsed": false 1027 | }, 1028 | "outputs": [], 1029 | "source": [ 1030 | "tf.reset_default_graph()\n", 1031 | "with tf.Session() as sess:\n", 1032 | " parameters = initialize_parameters()\n", 1033 | " print(\"W1 = \" + str(parameters[\"W1\"]))\n", 1034 | " print(\"b1 = \" + str(parameters[\"b1\"]))\n", 1035 | " print(\"W2 = \" + str(parameters[\"W2\"]))\n", 1036 | " print(\"b2 = \" + str(parameters[\"b2\"]))" 1037 | ] 1038 | }, 1039 | { 1040 | "cell_type": "markdown", 1041 | "metadata": {}, 1042 | "source": [ 1043 | "**Expected Output**: \n", 1044 | "\n", 1045 | " \n", 1046 | " \n", 1047 | " \n", 1050 | " \n", 1053 | " \n", 1054 | " \n", 1055 | " \n", 1058 | " \n", 1061 | " \n", 1062 | " \n", 1063 | " \n", 1066 | " \n", 1069 | " \n", 1070 | " \n", 1071 | " \n", 1074 | " \n", 1077 | " \n", 1078 | "\n", 1079 | "
\n", 1048 | " **W1**\n", 1049 | " \n", 1051 | " < tf.Variable 'W1:0' shape=(25, 12288) dtype=float32_ref >\n", 1052 | "
\n", 1056 | " **b1**\n", 1057 | " \n", 1059 | " < tf.Variable 'b1:0' shape=(25, 1) dtype=float32_ref >\n", 1060 | "
\n", 1064 | " **W2**\n", 1065 | " \n", 1067 | " < tf.Variable 'W2:0' shape=(12, 25) dtype=float32_ref >\n", 1068 | "
\n", 1072 | " **b2**\n", 1073 | " \n", 1075 | " < tf.Variable 'b2:0' shape=(12, 1) dtype=float32_ref >\n", 1076 | "
" 1080 | ] 1081 | }, 1082 | { 1083 | "cell_type": "markdown", 1084 | "metadata": {}, 1085 | "source": [ 1086 | "As expected, the parameters haven't been evaluated yet." 1087 | ] 1088 | }, 1089 | { 1090 | "cell_type": "markdown", 1091 | "metadata": {}, 1092 | "source": [ 1093 | "### 2.3 - Forward propagation in tensorflow \n", 1094 | "\n", 1095 | "You will now implement the forward propagation module in tensorflow. The function will take in a dictionary of parameters and it will complete the forward pass. The functions you will be using are: \n", 1096 | "\n", 1097 | "- `tf.add(...,...)` to do an addition\n", 1098 | "- `tf.matmul(...,...)` to do a matrix multiplication\n", 1099 | "- `tf.nn.relu(...)` to apply the ReLU activation\n", 1100 | "\n", 1101 | "**Question:** Implement the forward pass of the neural network. We commented for you the numpy equivalents so that you can compare the tensorflow implementation to numpy. It is important to note that the forward propagation stops at `z3`. The reason is that in tensorflow the last linear layer output is given as input to the function computing the loss. Therefore, you don't need `a3`!\n", 1102 | "\n" 1103 | ] 1104 | }, 1105 | { 1106 | "cell_type": "code", 1107 | "execution_count": null, 1108 | "metadata": { 1109 | "collapsed": true 1110 | }, 1111 | "outputs": [], 1112 | "source": [ 1113 | "# GRADED FUNCTION: forward_propagation\n", 1114 | "\n", 1115 | "def forward_propagation(X, parameters):\n", 1116 | " \"\"\"\n", 1117 | " Implements the forward propagation for the model: LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SOFTMAX\n", 1118 | " \n", 1119 | " Arguments:\n", 1120 | " X -- input dataset placeholder, of shape (input size, number of examples)\n", 1121 | " parameters -- python dictionary containing your parameters \"W1\", \"b1\", \"W2\", \"b2\", \"W3\", \"b3\"\n", 1122 | " the shapes are given in initialize_parameters\n", 1123 | "\n", 1124 | " Returns:\n", 1125 | " Z3 -- the output of the last LINEAR unit\n", 1126 | " \"\"\"\n", 1127 | " \n", 1128 | " # Retrieve the parameters from the dictionary \"parameters\" \n", 1129 | " W1 = parameters['W1']\n", 1130 | " b1 = parameters['b1']\n", 1131 | " W2 = parameters['W2']\n", 1132 | " b2 = parameters['b2']\n", 1133 | " W3 = parameters['W3']\n", 1134 | " b3 = parameters['b3']\n", 1135 | " \n", 1136 | " ### START CODE HERE ### (approx. 5 lines) # Numpy Equivalents:\n", 1137 | " Z1 = tf.matmul(W1, X)+b1 # Z1 = np.dot(W1, X) + b1\n", 1138 | " A1 = tf.nn.relu(Z1) # A1 = relu(Z1)\n", 1139 | " Z2 = tf.matmul(W2, A1)+b2 # Z2 = np.dot(W2, a1) + b2\n", 1140 | " A2 = tf.nn.relu(Z2) # A2 = relu(Z2)\n", 1141 | " Z3 = tf.matmul(W3, A2)+b3 # Z3 = np.dot(W3,Z2) + b3\n", 1142 | " ### END CODE HERE ###\n", 1143 | " \n", 1144 | " return Z3" 1145 | ] 1146 | }, 1147 | { 1148 | "cell_type": "code", 1149 | "execution_count": null, 1150 | "metadata": { 1151 | "collapsed": true, 1152 | "scrolled": true 1153 | }, 1154 | "outputs": [], 1155 | "source": [ 1156 | "tf.reset_default_graph()\n", 1157 | "\n", 1158 | "with tf.Session() as sess:\n", 1159 | " X, Y = create_placeholders(12288, 6)\n", 1160 | " parameters = initialize_parameters()\n", 1161 | " Z3 = forward_propagation(X, parameters)\n", 1162 | " print(\"Z3 = \" + str(Z3))" 1163 | ] 1164 | }, 1165 | { 1166 | "cell_type": "markdown", 1167 | "metadata": {}, 1168 | "source": [ 1169 | "**Expected Output**: \n", 1170 | "\n", 1171 | " \n", 1172 | " \n", 1173 | " \n", 1176 | " \n", 1179 | " \n", 1180 | "\n", 1181 | "
\n", 1174 | " **Z3**\n", 1175 | " \n", 1177 | " Tensor(\"Add_2:0\", shape=(6, ?), dtype=float32)\n", 1178 | "
" 1182 | ] 1183 | }, 1184 | { 1185 | "cell_type": "markdown", 1186 | "metadata": {}, 1187 | "source": [ 1188 | "You may have noticed that the forward propagation doesn't output any cache. You will understand why below, when we get to brackpropagation." 1189 | ] 1190 | }, 1191 | { 1192 | "cell_type": "markdown", 1193 | "metadata": {}, 1194 | "source": [ 1195 | "### 2.4 Compute cost\n", 1196 | "\n", 1197 | "As seen before, it is very easy to compute the cost using:\n", 1198 | "```python\n", 1199 | "tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = ..., labels = ...))\n", 1200 | "```\n", 1201 | "**Question**: Implement the cost function below. \n", 1202 | "- It is important to know that the \"`logits`\" and \"`labels`\" inputs of `tf.nn.softmax_cross_entropy_with_logits` are expected to be of shape (number of examples, num_classes). We have thus transposed Z3 and Y for you.\n", 1203 | "- Besides, `tf.reduce_mean` basically does the summation over the examples." 1204 | ] 1205 | }, 1206 | { 1207 | "cell_type": "code", 1208 | "execution_count": null, 1209 | "metadata": { 1210 | "collapsed": true 1211 | }, 1212 | "outputs": [], 1213 | "source": [ 1214 | "# GRADED FUNCTION: compute_cost \n", 1215 | "\n", 1216 | "def compute_cost(Z3, Y):\n", 1217 | " \"\"\"\n", 1218 | " Computes the cost\n", 1219 | " \n", 1220 | " Arguments:\n", 1221 | " Z3 -- output of forward propagation (output of the last LINEAR unit), of shape (6, number of examples)\n", 1222 | " Y -- \"true\" labels vector placeholder, same shape as Z3\n", 1223 | " \n", 1224 | " Returns:\n", 1225 | " cost - Tensor of the cost function\n", 1226 | " \"\"\"\n", 1227 | " \n", 1228 | " # to fit the tensorflow requirement for tf.nn.softmax_cross_entropy_with_logits(...,...)\n", 1229 | " logits = tf.transpose(Z3)\n", 1230 | " labels = tf.transpose(Y)\n", 1231 | " \n", 1232 | " ### START CODE HERE ### (1 line of code)\n", 1233 | " cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = logits, labels = labels))\n", 1234 | " ### END CODE HERE ###\n", 1235 | " \n", 1236 | " return cost" 1237 | ] 1238 | }, 1239 | { 1240 | "cell_type": "code", 1241 | "execution_count": null, 1242 | "metadata": { 1243 | "collapsed": true 1244 | }, 1245 | "outputs": [], 1246 | "source": [ 1247 | "tf.reset_default_graph()\n", 1248 | "\n", 1249 | "with tf.Session() as sess:\n", 1250 | " X, Y = create_placeholders(12288, 6)\n", 1251 | " parameters = initialize_parameters()\n", 1252 | " Z3 = forward_propagation(X, parameters)\n", 1253 | " cost = compute_cost(Z3, Y)\n", 1254 | " print(\"cost = \" + str(cost))" 1255 | ] 1256 | }, 1257 | { 1258 | "cell_type": "markdown", 1259 | "metadata": {}, 1260 | "source": [ 1261 | "**Expected Output**: \n", 1262 | "\n", 1263 | " \n", 1264 | " \n", 1265 | " \n", 1268 | " \n", 1271 | " \n", 1272 | "\n", 1273 | "
\n", 1266 | " **cost**\n", 1267 | " \n", 1269 | " Tensor(\"Mean:0\", shape=(), dtype=float32)\n", 1270 | "
" 1274 | ] 1275 | }, 1276 | { 1277 | "cell_type": "markdown", 1278 | "metadata": {}, 1279 | "source": [ 1280 | "### 2.5 - Backward propagation & parameter updates\n", 1281 | "\n", 1282 | "This is where you become grateful to programming frameworks. All the backpropagation and the parameters update is taken care of in 1 line of code. It is very easy to incorporate this line in the model.\n", 1283 | "\n", 1284 | "After you compute the cost function. You will create an \"`optimizer`\" object. You have to call this object along with the cost when running the tf.session. When called, it will perform an optimization on the given cost with the chosen method and learning rate.\n", 1285 | "\n", 1286 | "For instance, for gradient descent the optimizer would be:\n", 1287 | "```python\n", 1288 | "optimizer = tf.train.GradientDescentOptimizer(learning_rate = learning_rate).minimize(cost)\n", 1289 | "```\n", 1290 | "\n", 1291 | "To make the optimization you would do:\n", 1292 | "```python\n", 1293 | "_ , c = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y})\n", 1294 | "```\n", 1295 | "\n", 1296 | "This computes the backpropagation by passing through the tensorflow graph in the reverse order. From cost to inputs.\n", 1297 | "\n", 1298 | "**Note** When coding, we often use `_` as a \"throwaway\" variable to store values that we won't need to use later. Here, `_` takes on the evaluated value of `optimizer`, which we don't need (and `c` takes the value of the `cost` variable). " 1299 | ] 1300 | }, 1301 | { 1302 | "cell_type": "markdown", 1303 | "metadata": {}, 1304 | "source": [ 1305 | "### 2.6 - Building the model\n", 1306 | "\n", 1307 | "Now, you will bring it all together! \n", 1308 | "\n", 1309 | "**Exercise:** Implement the model. You will be calling the functions you had previously implemented." 1310 | ] 1311 | }, 1312 | { 1313 | "cell_type": "code", 1314 | "execution_count": null, 1315 | "metadata": { 1316 | "collapsed": true 1317 | }, 1318 | "outputs": [], 1319 | "source": [ 1320 | "def model(X_train, Y_train, X_test, Y_test, learning_rate = 0.0001,\n", 1321 | " num_epochs = 1500, minibatch_size = 32, print_cost = True):\n", 1322 | " \"\"\"\n", 1323 | " Implements a three-layer tensorflow neural network: LINEAR->RELU->LINEAR->RELU->LINEAR->SOFTMAX.\n", 1324 | " \n", 1325 | " Arguments:\n", 1326 | " X_train -- training set, of shape (input size = 12288, number of training examples = 1080)\n", 1327 | " Y_train -- test set, of shape (output size = 6, number of training examples = 1080)\n", 1328 | " X_test -- training set, of shape (input size = 12288, number of training examples = 120)\n", 1329 | " Y_test -- test set, of shape (output size = 6, number of test examples = 120)\n", 1330 | " learning_rate -- learning rate of the optimization\n", 1331 | " num_epochs -- number of epochs of the optimization loop\n", 1332 | " minibatch_size -- size of a minibatch\n", 1333 | " print_cost -- True to print the cost every 100 epochs\n", 1334 | " \n", 1335 | " Returns:\n", 1336 | " parameters -- parameters learnt by the model. 
They can then be used to predict.\n", 1337 | " \"\"\"\n", 1338 | " \n", 1339 | " ops.reset_default_graph() # to be able to rerun the model without overwriting tf variables\n", 1340 | " tf.set_random_seed(1) # to keep consistent results\n", 1341 | " seed = 3 # to keep consistent results\n", 1342 | " (n_x, m) = X_train.shape # (n_x: input size, m : number of examples in the train set)\n", 1343 | " n_y = Y_train.shape[0] # n_y : output size\n", 1344 | " costs = [] # To keep track of the cost\n", 1345 | " \n", 1346 | " # Create Placeholders of shape (n_x, n_y)\n", 1347 | " ### START CODE HERE ### (1 line)\n", 1348 | " X, Y = create_placeholders(n_x, n_y)\n", 1349 | " ### END CODE HERE ###\n", 1350 | "\n", 1351 | " # Initialize parameters\n", 1352 | " ### START CODE HERE ### (1 line)\n", 1353 | " parameters = initialize_parameters()\n", 1354 | " ### END CODE HERE ###\n", 1355 | " \n", 1356 | " # Forward propagation: Build the forward propagation in the tensorflow graph\n", 1357 | " ### START CODE HERE ### (1 line)\n", 1358 | " Z3 = forward_propagation(X, parameters)\n", 1359 | " ### END CODE HERE ###\n", 1360 | " \n", 1361 | " # Cost function: Add cost function to tensorflow graph\n", 1362 | " ### START CODE HERE ### (1 line)\n", 1363 | " cost = compute_cost(Z3, Y)\n", 1364 | " ### END CODE HERE ###\n", 1365 | " \n", 1366 | " # Backpropagation: Define the tensorflow optimizer. Use an AdamOptimizer.\n", 1367 | " ### START CODE HERE ### (1 line)\n", 1368 | " optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)\n", 1369 | " ### END CODE HERE ###\n", 1370 | " \n", 1371 | " # Initialize all the variables\n", 1372 | " init = tf.global_variables_initializer()\n", 1373 | "\n", 1374 | " # Start the session to compute the tensorflow graph\n", 1375 | " with tf.Session() as sess:\n", 1376 | " \n", 1377 | " # Run the initialization\n", 1378 | " sess.run(init)\n", 1379 | " \n", 1380 | " # Do the training loop\n", 1381 | " for epoch in range(num_epochs):\n", 1382 | "\n", 1383 | " epoch_cost = 0. 
# Defines a cost related to an epoch\n",
1384 |     "            num_minibatches = int(m / minibatch_size) # number of minibatches of size minibatch_size in the train set\n",
1385 |     "            seed = seed + 1\n",
1386 |     "            minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed)\n",
1387 |     "\n",
1388 |     "            for minibatch in minibatches:\n",
1389 |     "\n",
1390 |     "                # Select a minibatch\n",
1391 |     "                (minibatch_X, minibatch_Y) = minibatch\n",
1392 |     "                \n",
1393 |     "                # IMPORTANT: The line that runs the graph on a minibatch.\n",
1394 |     "                # Run the session to execute the \"optimizer\" and the \"cost\"; the feed_dict should contain a minibatch for (X,Y).\n",
1395 |     "                ### START CODE HERE ### (1 line)\n",
1396 |     "                _ , minibatch_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y})\n",
1397 |     "                ### END CODE HERE ###\n",
1398 |     "                \n",
1399 |     "                epoch_cost += minibatch_cost / num_minibatches\n",
1400 |     "\n",
1401 |     "            # Print the cost every 100 epochs, and record it every 5 for the plot\n",
1402 |     "            if print_cost == True and epoch % 100 == 0:\n",
1403 |     "                print (\"Cost after epoch %i: %f\" % (epoch, epoch_cost))\n",
1404 |     "            if print_cost == True and epoch % 5 == 0:\n",
1405 |     "                costs.append(epoch_cost)\n",
1406 |     "                \n",
1407 |     "        # plot the cost\n",
1408 |     "        plt.plot(np.squeeze(costs))\n",
1409 |     "        plt.ylabel('cost')\n",
1410 |     "        plt.xlabel('iterations (per tens)')\n",
1411 |     "        plt.title(\"Learning rate =\" + str(learning_rate))\n",
1412 |     "        plt.show()\n",
1413 |     "\n",
1414 |     "        # let's save the parameters in a variable\n",
1415 |     "        parameters = sess.run(parameters)\n",
1416 |     "        print (\"Parameters have been trained!\")\n",
1417 |     "\n",
1418 |     "        # Calculate the correct predictions\n",
1419 |     "        correct_prediction = tf.equal(tf.argmax(Z3), tf.argmax(Y))\n",
1420 |     "\n",
1421 |     "        # Calculate accuracy on the test set\n",
1422 |     "        accuracy = tf.reduce_mean(tf.cast(correct_prediction, \"float\"))\n",
1423 |     "\n",
1424 |     "        print (\"Train Accuracy:\", accuracy.eval({X: X_train, Y: Y_train}))\n",
1425 |     "        print (\"Test Accuracy:\", accuracy.eval({X: X_test, Y: Y_test}))\n",
1426 |     "        \n",
1427 |     "        return parameters"
1428 |    ]
1429 |   },
1430 |   {
1431 |    "cell_type": "markdown",
1432 |    "metadata": {
1433 |     "collapsed": true
1434 |    },
1435 |    "source": [
1436 |     "Run the following cell to train your model! On our machine it takes about 5 minutes. Your \"Cost after epoch 100\" should be 1.016458. If it's not, don't waste time; interrupt the training by clicking on the square (⬛) in the upper bar of the notebook, and try to correct your code. If it is the correct cost, take a break and come back in 5 minutes!"
1437 |    ]
1438 |   },
1439 |   {
1440 |    "cell_type": "code",
1441 |    "execution_count": null,
1442 |    "metadata": {
1443 |     "collapsed": true,
1444 |     "scrolled": false
1445 |    },
1446 |    "outputs": [],
1447 |    "source": [
1448 |     "parameters = model(X_train, Y_train, X_test, Y_test)"
1449 |    ]
1450 |   },
1451 |   {
1452 |    "cell_type": "markdown",
1453 |    "metadata": {},
1454 |    "source": [
1455 |     "**Expected Output**:\n",
1456 |     "\n",
1457 |     "<table> \n",
1458 |     "    <tr> \n",
1459 |     "        <td> 
\n", 1460 | " **Train Accuracy**\n", 1461 | " \n", 1463 | " 0.999074\n", 1464 | "
\n", 1468 | " **Test Accuracy**\n", 1469 | " \n", 1471 | " 0.716667\n", 1472 | "
\n", 1476 | "\n", 1477 | "Amazing, your algorithm can recognize a sign representing a figure between 0 and 5 with 71.7% accuracy.\n", 1478 | "\n", 1479 | "**Insights**:\n", 1480 | "- Your model seems big enough to fit the training set well. However, given the difference between train and test accuracy, you could try to add L2 or dropout regularization to reduce overfitting. \n", 1481 | "- Think about the session as a block of code to train the model. Each time you run the session on a minibatch, it trains the parameters. In total you have run the session a large number of times (1500 epochs) until you obtained well trained parameters." 1482 | ] 1483 | }, 1484 | { 1485 | "cell_type": "markdown", 1486 | "metadata": {}, 1487 | "source": [ 1488 | "### 2.7 - Test with your own image (optional / ungraded exercise)\n", 1489 | "\n", 1490 | "Congratulations on finishing this assignment. You can now take a picture of your hand and see the output of your model. To do that:\n", 1491 | " 1. Click on \"File\" in the upper bar of this notebook, then click \"Open\" to go on your Coursera Hub.\n", 1492 | " 2. Add your image to this Jupyter Notebook's directory, in the \"images\" folder\n", 1493 | " 3. Write your image's name in the following code\n", 1494 | " 4. Run the code and check if the algorithm is right!" 1495 | ] 1496 | }, 1497 | { 1498 | "cell_type": "code", 1499 | "execution_count": null, 1500 | "metadata": { 1501 | "collapsed": true, 1502 | "scrolled": true 1503 | }, 1504 | "outputs": [], 1505 | "source": [ 1506 | "import scipy\n", 1507 | "from PIL import Image\n", 1508 | "from scipy import ndimage\n", 1509 | "\n", 1510 | "## START CODE HERE ## (PUT YOUR IMAGE NAME) \n", 1511 | "my_image = \"thumbs_up.jpg\"\n", 1512 | "## END CODE HERE ##\n", 1513 | "\n", 1514 | "# We preprocess your image to fit your algorithm.\n", 1515 | "fname = \"images/\" + my_image\n", 1516 | "image = np.array(ndimage.imread(fname, flatten=False))\n", 1517 | "my_image = scipy.misc.imresize(image, size=(64,64)).reshape((1, 64*64*3)).T\n", 1518 | "my_image_prediction = predict(my_image, parameters)\n", 1519 | "\n", 1520 | "plt.imshow(image)\n", 1521 | "print(\"Your algorithm predicts: y = \" + str(np.squeeze(my_image_prediction)))" 1522 | ] 1523 | }, 1524 | { 1525 | "cell_type": "markdown", 1526 | "metadata": {}, 1527 | "source": [ 1528 | "You indeed deserved a \"thumbs-up\" although as you can see the algorithm seems to classify it incorrectly. The reason is that the training set doesn't contain any \"thumbs-up\", so the model doesn't know how to deal with it! We call that a \"mismatched data distribution\" and it is one of the various of the next course on \"Structuring Machine Learning Projects\"." 1529 | ] 1530 | }, 1531 | { 1532 | "cell_type": "markdown", 1533 | "metadata": { 1534 | "collapsed": true 1535 | }, 1536 | "source": [ 1537 | "\n", 1538 | "**What you should remember**:\n", 1539 | "- Tensorflow is a programming framework used in deep learning\n", 1540 | "- The two main object classes in tensorflow are Tensors and Operators. \n", 1541 | "- When you code in tensorflow you have to take the following steps:\n", 1542 | " - Create a graph containing Tensors (Variables, Placeholders ...) 
and Operations (tf.matmul, tf.add, ...)\n", 1543 | " - Create a session\n", 1544 | " - Initialize the session\n", 1545 | " - Run the session to execute the graph\n", 1546 | "- You can execute the graph multiple times as you've seen in model()\n", 1547 | "- The backpropagation and optimization is automatically done when running the session on the \"optimizer\" object." 1548 | ] 1549 | } 1550 | ], 1551 | "metadata": { 1552 | "anaconda-cloud": {}, 1553 | "coursera": { 1554 | "course_slug": "deep-neural-network", 1555 | "graded_item_id": "BFd89", 1556 | "launcher_item_id": "AH2rK" 1557 | }, 1558 | "kernelspec": { 1559 | "display_name": "Python [default]", 1560 | "language": "python", 1561 | "name": "python3" 1562 | }, 1563 | "language_info": { 1564 | "codemirror_mode": { 1565 | "name": "ipython", 1566 | "version": 3 1567 | }, 1568 | "file_extension": ".py", 1569 | "mimetype": "text/x-python", 1570 | "name": "python", 1571 | "nbconvert_exporter": "python", 1572 | "pygments_lexer": "ipython3", 1573 | "version": "3.5.2" 1574 | } 1575 | }, 1576 | "nbformat": 4, 1577 | "nbformat_minor": 1 1578 | } 1579 | -------------------------------------------------------------------------------- /.ipynb_checkpoints/tensorflow-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "metadata": { 3 | "name": "", 4 | "signature": "sha256:fa7e6c6b17097e7f0c015a6d63ca784bf6b7c6536f991597a555ef8631f547d0" 5 | }, 6 | "nbformat": 3, 7 | "nbformat_minor": 0, 8 | "worksheets": [ 9 | { 10 | "cells": [ 11 | { 12 | "cell_type": "code", 13 | "collapsed": false, 14 | "input": [ 15 | "import math\n", 16 | "import numpy as np\n", 17 | "import h5py\n", 18 | "import matplotlib.pyplot as plt\n", 19 | "import tensorflow as tf\n", 20 | "from tensorflow.python.framework import ops\n", 21 | "from tf_utils import load_dataset, random_mini_batches, convert_to_one_hot, predict\n", 22 | "\n", 23 | "%matplotlib inline\n", 24 | "np.random.seed(1)" 25 | ], 26 | "language": "python", 27 | "metadata": {}, 28 | "outputs": [], 29 | "prompt_number": 1 30 | }, 31 | { 32 | "cell_type": "code", 33 | "collapsed": false, 34 | "input": [ 35 | "def Liner_fuction():\n", 36 | " np.random.seed(1)\n", 37 | " X = tf.constant(np.random.randn(3,1),name = 'X')\n", 38 | " W = tf.constant(np.random.randn(4,3), name = 'W')\n", 39 | " b = tf.constant(np.random.randn(4,1),name = 'b')\n", 40 | " \n", 41 | " Y = tf.add(tf.matmul(W ,X),b)\n", 42 | " sess = tf.Session()\n", 43 | " result = sess.run(Y)\n", 44 | " \n", 45 | " sess.close()\n", 46 | " \n", 47 | " return result" 48 | ], 49 | "language": "python", 50 | "metadata": {}, 51 | "outputs": [], 52 | "prompt_number": 2 53 | }, 54 | { 55 | "cell_type": "code", 56 | "collapsed": false, 57 | "input": [ 58 | "print (\"result:\" + str(Liner_fuction()))" 59 | ], 60 | "language": "python", 61 | "metadata": {}, 62 | "outputs": [ 63 | { 64 | "output_type": "stream", 65 | "stream": "stdout", 66 | "text": [ 67 | "result:[[-2.15657382]\n", 68 | " [ 2.95891446]\n", 69 | " [-1.08926781]\n", 70 | " [-0.84538042]]\n" 71 | ] 72 | } 73 | ], 74 | "prompt_number": 3 75 | }, 76 | { 77 | "cell_type": "code", 78 | "collapsed": false, 79 | "input": [ 80 | "def sigmoid(z):\n", 81 | " x = tf.placeholder(tf.float32, name='x')\n", 82 | " sigmoid = tf.sigmoid(x)\n", 83 | " with tf.Session() as sess:\n", 84 | " result = sess.run(sigmoid,feed_dict={x:z})\n", 85 | " \n", 86 | " return result" 87 | ], 88 | "language": "python", 89 | "metadata": {}, 90 | "outputs": [], 91 | "prompt_number": 13 92 
| }, 93 | { 94 | "cell_type": "code", 95 | "collapsed": false, 96 | "input": [ 97 | "print (sigmoid(0))" 98 | ], 99 | "language": "python", 100 | "metadata": {}, 101 | "outputs": [ 102 | { 103 | "output_type": "stream", 104 | "stream": "stdout", 105 | "text": [ 106 | "0.5\n" 107 | ] 108 | } 109 | ], 110 | "prompt_number": 14 111 | }, 112 | { 113 | "cell_type": "code", 114 | "collapsed": false, 115 | "input": [ 116 | "def cost(logits, label):\n", 117 | " logits1 = tf.placeholder(tf.float32, name = 'logits')\n", 118 | " label1 = tf.placeholder(tf.float32, name = 'label')\n", 119 | " \n", 120 | " cost = tf.nn.sigmoid_cross_entropy_with_logits(logits = logits1, labels = label1)\n", 121 | " with tf.Session() as sess:\n", 122 | " result = sess.run(cost, feed_dict={logits1:logits,label1:label})\n", 123 | " \n", 124 | " return result" 125 | ], 126 | "language": "python", 127 | "metadata": {}, 128 | "outputs": [], 129 | "prompt_number": 24 130 | }, 131 | { 132 | "cell_type": "code", 133 | "collapsed": false, 134 | "input": [ 135 | "logits = sigmoid([0.2,0.4,0.6,0.8])\n", 136 | "label = [0 ,0, 1, 1]\n", 137 | "print (logits)\n", 138 | "print (cost(logits, label))" 139 | ], 140 | "language": "python", 141 | "metadata": {}, 142 | "outputs": [ 143 | { 144 | "output_type": "stream", 145 | "stream": "stdout", 146 | "text": [ 147 | "[ 0.54983395 0.59868765 0.64565629 0.68997449]\n", 148 | "[ 1.00538719 1.03664088 0.42154732 0.40652379]" 149 | ] 150 | }, 151 | { 152 | "output_type": "stream", 153 | "stream": "stdout", 154 | "text": [ 155 | "\n" 156 | ] 157 | } 158 | ], 159 | "prompt_number": 25 160 | }, 161 | { 162 | "cell_type": "code", 163 | "collapsed": false, 164 | "input": [ 165 | "def one_hot(labels, C):\n", 166 | " depth = tf.constant(C)\n", 167 | " one_hot = tf.one_hot(labels ,depth, axis = 0)\n", 168 | " with tf.Session() as sess:\n", 169 | " result = sess.run(one_hot)\n", 170 | " \n", 171 | " return result\n" 172 | ], 173 | "language": "python", 174 | "metadata": {}, 175 | "outputs": [], 176 | "prompt_number": 26 177 | }, 178 | { 179 | "cell_type": "code", 180 | "collapsed": false, 181 | "input": [ 182 | "labels = [1, 3, 2, 0, 3, 2]\n", 183 | "print (one_hot(labels, C=4))" 184 | ], 185 | "language": "python", 186 | "metadata": {}, 187 | "outputs": [ 188 | { 189 | "output_type": "stream", 190 | "stream": "stdout", 191 | "text": [ 192 | "[[ 0. 0. 0. 1. 0. 0.]\n", 193 | " [ 1. 0. 0. 0. 0. 0.]\n", 194 | " [ 0. 0. 1. 0. 0. 1.]\n", 195 | " [ 0. 1. 0. 0. 1. 0.]]\n" 196 | ] 197 | } 198 | ], 199 | "prompt_number": 28 200 | }, 201 | { 202 | "cell_type": "code", 203 | "collapsed": false, 204 | "input": [ 205 | "with tf.Session() as sess:\n", 206 | " print (sess.run(tf.ones([3,2,4])))" 207 | ], 208 | "language": "python", 209 | "metadata": {}, 210 | "outputs": [ 211 | { 212 | "output_type": "stream", 213 | "stream": "stdout", 214 | "text": [ 215 | "[[[ 1. 1. 1. 1.]\n", 216 | " [ 1. 1. 1. 1.]]\n", 217 | "\n", 218 | " [[ 1. 1. 1. 1.]\n", 219 | " [ 1. 1. 1. 1.]]\n", 220 | "\n", 221 | " [[ 1. 1. 1. 1.]\n", 222 | " [ 1. 1. 1. 
1.]]]\n" 223 | ] 224 | } 225 | ], 226 | "prompt_number": 34 227 | }, 228 | { 229 | "cell_type": "code", 230 | "collapsed": false, 231 | "input": [ 232 | "X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset()" 233 | ], 234 | "language": "python", 235 | "metadata": {}, 236 | "outputs": [], 237 | "prompt_number": 35 238 | }, 239 | { 240 | "cell_type": "code", 241 | "collapsed": false, 242 | "input": [ 243 | "print (X_train_orig.size)\n", 244 | "X_train_flatten = X_train_orig.reshape(X_train_orig.shape[0], -1).T\n", 245 | "print (X_train_flatten/255)" 246 | ], 247 | "language": "python", 248 | "metadata": {}, 249 | "outputs": [ 250 | { 251 | "output_type": "stream", 252 | "stream": "stdout", 253 | "text": [ 254 | "13271040\n", 255 | "[[ 0.89019608 0.93333333 0.89411765 ..., 0.92156863 0.91372549\n", 256 | " 0.90196078]\n", 257 | " [ 0.8627451 0.90980392 0.8627451 ..., 0.88627451 0.88627451\n", 258 | " 0.8627451 ]\n", 259 | " [ 0.83921569 0.8745098 0.81568627 ..., 0.84705882 0.85098039\n", 260 | " 0.81960784]\n", 261 | " ..., \n", 262 | " [ 0.81568627 0.84313725 0.82745098 ..., 0.78431373 0.8 0.79215686]\n", 263 | " [ 0.81960784 0.8 0.81176471 ..., 0.75294118 0.78823529\n", 264 | " 0.78039216]\n", 265 | " [ 0.81960784 0.75294118 0.79215686 ..., 0.71372549 0.77647059\n", 266 | " 0.77254902]]" 267 | ] 268 | }, 269 | { 270 | "output_type": "stream", 271 | "stream": "stdout", 272 | "text": [ 273 | "\n" 274 | ] 275 | } 276 | ], 277 | "prompt_number": 47 278 | }, 279 | { 280 | "cell_type": "code", 281 | "collapsed": false, 282 | "input": [ 283 | "Y_train = one_hot(Y_train_orig, 6)\n", 284 | "Y_train = Y_train.reshape(Y_train.shape[0], -1)\n", 285 | "print(Y_train)" 286 | ], 287 | "language": "python", 288 | "metadata": {}, 289 | "outputs": [ 290 | { 291 | "output_type": "stream", 292 | "stream": "stdout", 293 | "text": [ 294 | "[[ 0. 1. 0. ..., 0. 0. 0.]\n", 295 | " [ 0. 0. 0. ..., 0. 0. 0.]\n", 296 | " [ 0. 0. 1. ..., 1. 0. 0.]\n", 297 | " [ 0. 0. 0. ..., 0. 0. 0.]\n", 298 | " [ 0. 0. 0. ..., 0. 1. 0.]\n", 299 | " [ 1. 0. 0. ..., 0. 0. 1.]]\n" 300 | ] 301 | } 302 | ], 303 | "prompt_number": 50 304 | }, 305 | { 306 | "cell_type": "code", 307 | "collapsed": false, 308 | "input": [ 309 | "Y_train = convert_to_one_hot(Y_train_orig, 6)\n", 310 | "print(Y_train)" 311 | ], 312 | "language": "python", 313 | "metadata": {}, 314 | "outputs": [ 315 | { 316 | "output_type": "stream", 317 | "stream": "stdout", 318 | "text": [ 319 | "[[ 0. 1. 0. ..., 0. 0. 0.]\n", 320 | " [ 0. 0. 0. ..., 0. 0. 0.]\n", 321 | " [ 0. 0. 1. ..., 1. 0. 0.]\n", 322 | " [ 0. 0. 0. ..., 0. 0. 0.]\n", 323 | " [ 0. 0. 0. ..., 0. 1. 0.]\n", 324 | " [ 1. 0. 0. ..., 0. 0. 
1.]]\n" 325 | ] 326 | } 327 | ], 328 | "prompt_number": 51 329 | }, 330 | { 331 | "cell_type": "code", 332 | "collapsed": false, 333 | "input": [ 334 | "X_train_flatten = X_train_orig.reshape(X_train_orig.shape[0], -1).T\n", 335 | "print (X_train_flatten.shape)\n", 336 | "X_test_flatten = X_test_orig.reshape(X_test_orig.shape[0], -1).T\n", 337 | "\n", 338 | "X_train_flatten = X_train_flatten/255\n", 339 | "X_test_flatten = X_test_flatten/255\n", 340 | "\n", 341 | "Y_test = convert_to_one_hot(Y_test_orig, 6)\n", 342 | "Y_train = convert_to_one_hot(Y_train_orig, 6)" 343 | ], 344 | "language": "python", 345 | "metadata": {}, 346 | "outputs": [ 347 | { 348 | "output_type": "stream", 349 | "stream": "stdout", 350 | "text": [ 351 | "(12288, 1080)\n" 352 | ] 353 | } 354 | ], 355 | "prompt_number": 58 356 | }, 357 | { 358 | "cell_type": "code", 359 | "collapsed": false, 360 | "input": [ 361 | "def create_placeholder(n_x, n_y):\n", 362 | " X = tf.placeholder(tf.float32, shape=(n_x, None), name='X')\n", 363 | " Y = tf.placeholder(tf.float32, shape=(n_y, None), name='Y')\n", 364 | " return X, Y" 365 | ], 366 | "language": "python", 367 | "metadata": {}, 368 | "outputs": [], 369 | "prompt_number": 62 370 | }, 371 | { 372 | "cell_type": "code", 373 | "collapsed": false, 374 | "input": [ 375 | "X, Y = create_placeholder(12288, 6)\n", 376 | "print (X)\n", 377 | "print (Y)" 378 | ], 379 | "language": "python", 380 | "metadata": {}, 381 | "outputs": [ 382 | { 383 | "output_type": "stream", 384 | "stream": "stdout", 385 | "text": [ 386 | "Tensor(\"X_2:0\", shape=(12288, ?), dtype=float32)\n", 387 | "Tensor(\"Y_1:0\", shape=(6, ?), dtype=float32)\n" 388 | ] 389 | } 390 | ], 391 | "prompt_number": 63 392 | }, 393 | { 394 | "cell_type": "code", 395 | "collapsed": false, 396 | "input": [ 397 | "def initializer_parameters():\n", 398 | " tf.set_random_seed(1)\n", 399 | " W1 = tf.get_variable('W1', [25,12288], initializer = tf.contrib.layers.xavier_initializer(seed=1))\n", 400 | " b1 = tf.get_variable('b1', [25,1], initializer = tf.contrib.layers.xavier_initializer(seed=1))\n", 401 | " W2 = tf.get_variable('W2', [12,25], initializer = tf.contrib.layers.xavier_initializer(seed=1))\n", 402 | " b2 = tf.get_variable('b2', [12,1], initializer = tf.contrib.layers.xavier_initializer(seed=1))\n", 403 | " W3 = tf.get_variable('W3', [6,12], initializer = tf.contrib.layers.xavier_initializer(seed=1))\n", 404 | " b3 = tf.get_variable('b3', [6,1], initializer = tf.contrib.layers.xavier_initializer(seed=1))\n", 405 | " parameters = {'W1':W1, 'W2':W2, 'W3':W3, 'b1':b1, 'b2':b2, 'b3':b3}\n", 406 | "\n", 407 | " return parameters\n", 408 | " " 409 | ], 410 | "language": "python", 411 | "metadata": {}, 412 | "outputs": [], 413 | "prompt_number": 75 414 | }, 415 | { 416 | "cell_type": "code", 417 | "collapsed": false, 418 | "input": [ 419 | "parameters = initializer_parameters()\n", 420 | "print (parameters['W1'])" 421 | ], 422 | "language": "python", 423 | "metadata": {}, 424 | "outputs": [ 425 | { 426 | "ename": "ValueError", 427 | "evalue": "Variable W1 already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? 
Originally defined at:\n\n File \"\", line 3, in initializer_parameters\n W1 = tf.get_variable('W1', [25,12288], initializer = tf.contrib.layers.xavier_initializer(seed=1))\n File \"\", line 5, in \n parameters = initializer_parameters()\n File \"/usr/lib/python3/dist-packages/IPython/core/interactiveshell.py\", line 2883, in run_code\n exec(code_obj, self.user_global_ns, self.user_ns)\n", 428 | "output_type": "pyerr", 429 | "traceback": [ 430 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", 431 | "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mparameters\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0minitializer_parameters\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mparameters\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m'W1'\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 432 | "\u001b[0;32m\u001b[0m in \u001b[0;36minitializer_parameters\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0minitializer_parameters\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mtf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mset_random_seed\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mW1\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_variable\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'W1'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;36m25\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m12288\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minitializer\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcontrib\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlayers\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mxavier_initializer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mseed\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 4\u001b[0m \u001b[0mb1\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_variable\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'b1'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;36m25\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minitializer\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcontrib\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlayers\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mxavier_initializer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mseed\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mW2\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_variable\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'W2'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;36m12\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m25\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minitializer\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0mtf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcontrib\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlayers\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mxavier_initializer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mseed\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 433 | "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/variable_scope.py\u001b[0m in \u001b[0;36mget_variable\u001b[0;34m(name, shape, dtype, initializer, regularizer, trainable, collections, caching_device, partitioner, validate_shape, use_resource, custom_getter, constraint)\u001b[0m\n\u001b[1;32m 1201\u001b[0m \u001b[0mpartitioner\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mpartitioner\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mvalidate_shape\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mvalidate_shape\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1202\u001b[0m \u001b[0muse_resource\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0muse_resource\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcustom_getter\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mcustom_getter\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1203\u001b[0;31m constraint=constraint)\n\u001b[0m\u001b[1;32m 1204\u001b[0m get_variable_or_local_docstring = (\n\u001b[1;32m 1205\u001b[0m \"\"\"%s\n", 434 | "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/variable_scope.py\u001b[0m in \u001b[0;36mget_variable\u001b[0;34m(self, var_store, name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape, use_resource, custom_getter, constraint)\u001b[0m\n\u001b[1;32m 1090\u001b[0m \u001b[0mpartitioner\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mpartitioner\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mvalidate_shape\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mvalidate_shape\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1091\u001b[0m \u001b[0muse_resource\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0muse_resource\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcustom_getter\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mcustom_getter\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1092\u001b[0;31m constraint=constraint)\n\u001b[0m\u001b[1;32m 1093\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1094\u001b[0m def _get_partitioned_variable(self,\n", 435 | "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/variable_scope.py\u001b[0m in \u001b[0;36mget_variable\u001b[0;34m(self, name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape, use_resource, custom_getter, constraint)\u001b[0m\n\u001b[1;32m 423\u001b[0m \u001b[0mcaching_device\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mcaching_device\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mpartitioner\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mpartitioner\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 424\u001b[0m \u001b[0mvalidate_shape\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mvalidate_shape\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0muse_resource\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0muse_resource\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 425\u001b[0;31m constraint=constraint)\n\u001b[0m\u001b[1;32m 426\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 427\u001b[0m def _get_partitioned_variable(\n", 436 | 
"\u001b[0;32m/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/variable_scope.py\u001b[0m in \u001b[0;36m_true_getter\u001b[0;34m(name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape, use_resource, constraint)\u001b[0m\n\u001b[1;32m 392\u001b[0m \u001b[0mtrainable\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mtrainable\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcollections\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mcollections\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 393\u001b[0m \u001b[0mcaching_device\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mcaching_device\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mvalidate_shape\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mvalidate_shape\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 394\u001b[0;31m use_resource=use_resource, constraint=constraint)\n\u001b[0m\u001b[1;32m 395\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 396\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mcustom_getter\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 437 | "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/variable_scope.py\u001b[0m in \u001b[0;36m_get_single_variable\u001b[0;34m(self, name, shape, dtype, initializer, regularizer, partition_info, reuse, trainable, collections, caching_device, validate_shape, use_resource, constraint)\u001b[0m\n\u001b[1;32m 740\u001b[0m \u001b[0;34m\"reuse=tf.AUTO_REUSE in VarScope? \"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 741\u001b[0m \"Originally defined at:\\n\\n%s\" % (\n\u001b[0;32m--> 742\u001b[0;31m name, \"\".join(traceback.format_list(tb))))\n\u001b[0m\u001b[1;32m 743\u001b[0m \u001b[0mfound_var\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_vars\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mname\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 744\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mshape\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mis_compatible_with\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfound_var\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_shape\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 438 | "\u001b[0;31mValueError\u001b[0m: Variable W1 already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? 
Originally defined at:\n\n File \"\", line 3, in initializer_parameters\n W1 = tf.get_variable('W1', [25,12288], initializer = tf.contrib.layers.xavier_initializer(seed=1))\n File \"\", line 5, in \n parameters = initializer_parameters()\n File \"/usr/lib/python3/dist-packages/IPython/core/interactiveshell.py\", line 2883, in run_code\n exec(code_obj, self.user_global_ns, self.user_ns)\n" 439 | ] 440 | } 441 | ], 442 | "prompt_number": 76 443 | }, 444 | { 445 | "cell_type": "code", 446 | "collapsed": false, 447 | "input": [ 448 | "def forward_propagation(x, parameters):\n", 449 | " W1 = parameters['W1']\n", 450 | " W2 = parameters['W2']\n", 451 | " W3 = parameters['W3']\n", 452 | " b1 = parameters['b1']\n", 453 | " b2 = parameters['b2']\n", 454 | " b3 = parameters['b3']\n", 455 | " \n", 456 | " out = tf.matmul(W1,x) + b1\n", 457 | " out = tf.nn.relu(out)\n", 458 | " out = tf.matmul(W2,out) + b2\n", 459 | " out = tf.nn.relu(out)\n", 460 | " out = tf.matmul(W3,out) + b3\n", 461 | " \n", 462 | " return out" 463 | ], 464 | "language": "python", 465 | "metadata": {}, 466 | "outputs": [], 467 | "prompt_number": 77 468 | }, 469 | { 470 | "cell_type": "code", 471 | "collapsed": false, 472 | "input": [ 473 | "tf.reset_default_graph()\n", 474 | "\n", 475 | "with tf.Session() as sess:\n", 476 | " X, Y = create_placeholder(12288, 6)\n", 477 | " parameters = initializer_parameters()\n", 478 | " out = forward_propagation(X, parameters)\n", 479 | "\n", 480 | "print (out)" 481 | ], 482 | "language": "python", 483 | "metadata": {}, 484 | "outputs": [ 485 | { 486 | "output_type": "stream", 487 | "stream": "stdout", 488 | "text": [ 489 | "Tensor(\"add_2:0\", shape=(6, ?), dtype=float32)\n" 490 | ] 491 | } 492 | ], 493 | "prompt_number": 78 494 | }, 495 | { 496 | "cell_type": "code", 497 | "collapsed": false, 498 | "input": [ 499 | "def compute_cost(out, y):\n", 500 | " logits = tf.transpose(out)\n", 501 | " labels = tf.transpose(y)\n", 502 | " \n", 503 | " cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = logits, labels=labels))\n", 504 | " return cost" 505 | ], 506 | "language": "python", 507 | "metadata": {}, 508 | "outputs": [], 509 | "prompt_number": 81 510 | }, 511 | { 512 | "cell_type": "code", 513 | "collapsed": false, 514 | "input": [ 515 | "tf.reset_default_graph()\n", 516 | "with tf.Session() as sess:\n", 517 | " X, Y = create_placeholder(12288, 6)\n", 518 | " parameters = initializer_parameters()\n", 519 | " out = forward_propagation(X, parameters)\n", 520 | " cost = compute_cost(out, Y)\n", 521 | "print (cost)" 522 | ], 523 | "language": "python", 524 | "metadata": {}, 525 | "outputs": [ 526 | { 527 | "output_type": "stream", 528 | "stream": "stdout", 529 | "text": [ 530 | "Tensor(\"Mean:0\", shape=(), dtype=float32)\n" 531 | ] 532 | } 533 | ], 534 | "prompt_number": 82 535 | }, 536 | { 537 | "cell_type": "code", 538 | "collapsed": false, 539 | "input": [ 540 | "def model(X_train, Y_train, X_test, Y_test, learning_rate = 0.0001,\n", 541 | " nums_epoch = 1500, minibatch_size = 32, print_cost = True):\n", 542 | " tf.reset_default_graph()\n", 543 | " tf.set_random_seed(1)\n", 544 | " seed = 3\n", 545 | " (n_x, m) = X_train.shape\n", 546 | " n_y = Y_train.shape[0]\n", 547 | " costs = []\n", 548 | " \n", 549 | " X, Y = create_placeholder(n_x, n_y)\n", 550 | " parameters = initializer_parameters()\n", 551 | " out = forward_propagation(X, parameters)\n", 552 | " cost = compute_cost(out, Y)\n", 553 | " optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)\n", 554 | " \n", 555 | " init = tf.global_variables_initializer()\n", 556 | " \n", 557 | " with tf.Session() as sess:\n", 558 | " sess.run(init)\n", 559 | " for epoch in range(nums_epoch):\n", 560 | " epoch_cost = 0\n", 561 | " num_minibatches = int(m / minibatch_size)\n", 562 | " seed = seed+1\n", 563 | " minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed)\n", 564 | " \n", 565 | " for minibatch in minibatches:\n", 566 | " (minibatch_X, minibatch_Y) = minibatch\n", 567 | " _, minibatch_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y:minibatch_Y})\n", 568 | " epoch_cost += minibatch_cost / num_minibatches\n", 569 | " \n", 570 | " if print_cost == True and epoch % 100 == 0:\n", 571 | " print (\"Cost after epoch %i: %f\" % (epoch, epoch_cost))\n", 572 | " if print_cost == True and epoch%5 == 0:\n", 573 | " costs.append(epoch_cost)\n", 574 | " \n", 575 | " plt.plot(np.squeeze(costs))\n", 576 | " plt.ylabel('cost')\n", 577 | " plt.xlabel('iterations (per tens)')\n", 578 | " plt.title(\"Learning rate = \" + str(learning_rate))\n", 579 | " plt.show()\n", 580 | " \n", 581 | " parameters = sess.run(parameters)\n", 582 | " print(\"Parameters have been trained!\")\n", 583 | " correct_prediction = tf.equal(tf.argmax(out), tf.argmax(Y))\n", 584 | " \n", 585 | " accuracy = tf.reduce_mean(tf.cast(correct_prediction, \"float\"))\n", 586 | " \n", 587 | " print(\"Train Accuracy:\", accuracy.eval({X:X_train, Y:Y_train}))\n", 588 | " print (\"Test Accuracy:\", accuracy.eval({X:X_test, Y:Y_test}))\n", 589 | " \n", 590 | " return parameters" 591 | ], 592 | "language": "python", 593 | "metadata": {}, 594 | "outputs": [] 595 | }
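The notebook defines `model` but never runs it in this checkpoint. A hypothetical driver call (not in the original file), assuming the preprocessed SIGNS arrays from the earlier cells (`X_train_flatten`, `Y_train`, `X_test_flatten`, `Y_test`) are still in scope, would look like:

```python
# Hypothetical usage sketch; keyword names match this notebook's model() signature.
parameters = model(X_train_flatten, Y_train, X_test_flatten, Y_test,
                   learning_rate=0.0001, nums_epoch=1500, minibatch_size=32)
```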
" init = tf.global_variables_initializer()\n", 556 | " \n", 557 | " with tf.Session() as sess:\n", 558 | " sess.run(init)\n", 559 | " for epoch in range(nums_epoch):\n", 560 | " epoch_cost = 0\n", 561 | " num_minibatches = int(m / minibatch_size)\n", 562 | " seed = seed+1\n", 563 | " minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed)\n", 564 | " \n", 565 | " for minibatch in minibatches:\n", 566 | " (minibatch_X, minibatch_Y) = minibatch\n", 567 | " _, minibatch_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y:minibatch_Y})\n", 568 | " epoch_cost += minibatch_cost / num_minibatches\n", 569 | " \n", 570 | " if print_cost == True and epoch % 100 ==0:\n", 571 | " print (\"Cost after epoch %i: %f\" % (epoch, epoch_cost))\n", 572 | " if print_cost == True and epoch%5 == 0:\n", 573 | " costs.append(epoch_cost)\n", 574 | " \n", 575 | " plt.plot(np.squeeze(costs))\n", 576 | " plt.ylabel('cost')\n", 577 | " plt.xlabel('iterations (per tens)')\n", 578 | " plt.title(\"Learning rate = \" + str(learning_rate))\n", 579 | " plt.show()\n", 580 | " \n", 581 | " parameters = sess.run(parameters)\n", 582 | " print(\"Parameters have benn trained!\")\n", 583 | " correct_prediction = tf.equal(tf.argmax(out), tf.argmax(Y))\n", 584 | " \n", 585 | " accuracy = tf.reduce_mean(tf.cast(correct_prediction, \"float\"))\n", 586 | " \n", 587 | " print(\"Train Accuracy:\", accuracy.eval({X:X_train, Y:Y_train}))\n", 588 | " print (\"Test Accuracy:\", accuracy.eval({X:X_test, Y:Y_test}))\n", 589 | " \n", 590 | " return parameters" 591 | ], 592 | "language": "python", 593 | "metadata": {}, 594 | "outputs": [] 595 | } 596 | ], 597 | "metadata": {} 598 | } 599 | ] 600 | } -------------------------------------------------------------------------------- /.ipynb_checkpoints/translate_Tensorflow+Tutorial-checkpoint.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# TensorFlow Tutorial\n", 8 | "\n", 9 | "欢迎来到本周的编程任务。到目前为止,你一直使用numpy来建立神经网络。现在我们将引导您深入学习框架,让您更容易地建立神经网络。TensorFlow,PaddlePaddle,Torch,Caffe,Keras等机器学习框架可以显着加速您的机器学习开发。这些框架有很多文档,你可以随意阅读。在这个任务中,您将学习如何在TensorFlow中执行以下操作:\n", 10 | "\n", 11 | "- 初始化变量\n", 12 | "- 开始你自己的会话\n", 13 | "- 训练算法\n", 14 | "- 实现一个神经网络\n", 15 | "\n", 16 | "编程框架不仅可以缩短您的编码时间,但有时也可以执行优化来加速您的代码。\n", 17 | "\n", 18 | "\n", 19 | "## 1 - Exploring the Tensorflow Library\n", 20 | "\n", 21 | "首先导入相关库\n" 22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": 66, 27 | "metadata": { 28 | "collapsed": false 29 | }, 30 | "outputs": [], 31 | "source": [ 32 | "import math\n", 33 | "import numpy as np\n", 34 | "import h5py\n", 35 | "import matplotlib.pyplot as plt\n", 36 | "import tensorflow as tf\n", 37 | "from tensorflow.python.framework import ops\n", 38 | "from tf_utils import load_dataset, random_mini_batches, convert_to_one_hot, predict\n", 39 | "\n", 40 | "%matplotlib inline\n", 41 | "np.random.seed(1)" 42 | ] 43 | }, 44 | { 45 | "cell_type": "markdown", 46 | "metadata": {}, 47 | "source": [ 48 | "我们将从一个例子开始,这个例子里需要计算一个训练样例的损失。\n", 49 | "$$loss = \\mathcal{L}(\\hat{y}, y) = (\\hat y^{(i)} - y^{(i)})^2 \\tag{1}$$" 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "execution_count": 38, 55 | "metadata": { 56 | "collapsed": false 57 | }, 58 | "outputs": [ 59 | { 60 | "name": "stdout", 61 | "output_type": "stream", 62 | "text": [ 63 | "9\n" 64 | ] 65 | } 66 | ], 67 | "source": [ 68 | "y_hat = tf.constant(36, name='y_hat') # 
52 | { 53 | "cell_type": "code", 54 | "execution_count": 38, 55 | "metadata": { 56 | "collapsed": false 57 | }, 58 | "outputs": [ 59 | { 60 | "name": "stdout", 61 | "output_type": "stream", 62 | "text": [ 63 | "9\n" 64 | ] 65 | } 66 | ], 67 | "source": [ 68 | "y_hat = tf.constant(36, name='y_hat') # Define y_hat constant. Set to 36.\n", 69 | "y = tf.constant(39, name='y') # Define y. Set to 39\n", 70 | "\n", 71 | "loss = tf.Variable((y - y_hat)**2, name='loss') # Create a variable for the loss\n", 72 | "\n", 73 | "init = tf.global_variables_initializer() # When init is run later (session.run(init)),\n", 74 | " # the loss variable will be initialized and ready to be computed\n", 75 | "with tf.Session() as session: # Create a session and print the output\n", 76 | " session.run(init) # Initializes the variables\n", 77 | " print(session.run(loss)) # Prints the loss" 78 | ] 79 | }, 80 | { 81 | "cell_type": "markdown", 82 | "metadata": {}, 83 | "source": [ 84 | "Writing and running programs in TensorFlow usually involves the following steps:\n", 85 | "\n", 86 | "1. Create tensors (variables) that are not yet executed/evaluated.\n", 87 | "2. Write operations between those tensors.\n", 88 | "3. Initialize your tensors.\n", 89 | "4. Create a session.\n", 90 | "5. Run the session. This will run the operations you wrote above. \n", 91 | "\n", 92 | "Therefore, when we created a variable for the loss, we simply defined the loss as a function of other quantities, but did not evaluate its value. To evaluate it, we had to run `init=tf.global_variables_initializer()`. That initialized the loss variable, and in the last line we were finally able to evaluate the value of `loss` and print its value.\n", 93 | "\n", 94 | "Now let us look at an easy example. Run the cell below:" 95 | ] 96 | }, 97 | { 98 | "cell_type": "code", 99 | "execution_count": 39, 100 | "metadata": { 101 | "collapsed": false 102 | }, 103 | "outputs": [ 104 | { 105 | "name": "stdout", 106 | "output_type": "stream", 107 | "text": [ 108 | "Tensor(\"Mul:0\", shape=(), dtype=int32)\n" 109 | ] 110 | } 111 | ], 112 | "source": [ 113 | "a = tf.constant(2)\n", 114 | "b = tf.constant(10)\n", 115 | "c = tf.multiply(a,b)\n", 116 | "print(c)" 117 | ] 118 | }, 119 | { 120 | "cell_type": "markdown", 121 | "metadata": {}, 122 | "source": [ 123 | "As expected, you will not see 20! You got back a tensor of type \"int32\" that has no shape attribute. So far you have only put it in the \"computation graph\"; you have not run this computation yet. In order to actually multiply the two numbers, you will have to create a session and run it." 124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": 40, 129 | "metadata": { 130 | "collapsed": false 131 | }, 132 | "outputs": [ 133 | { 134 | "name": "stdout", 135 | "output_type": "stream", 136 | "text": [ 137 | "20\n" 138 | ] 139 | } 140 | ], 141 | "source": [ 142 | "sess = tf.Session()\n", 143 | "print(sess.run(c))" 144 | ] 145 | },
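As a side note (not in the original notebook): once a default session is active, a tensor can also be evaluated with its own `eval` method, which is shorthand for `sess.run`; a minimal sketch:

```python
import tensorflow as tf

a = tf.constant(2)
b = tf.constant(10)
c = tf.multiply(a, b)

with tf.Session() as sess:   # the with-block installs a default session
    print(c.eval())          # equivalent to sess.run(c); prints 20
```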
146 | { 147 | "cell_type": "markdown", 148 | "metadata": {}, 149 | "source": [ 150 | "Great! To summarize, **remember to initialize your variables, create a session and run the operations inside the session**. \n", 151 | "\n", 152 | "Next, you'll also have to know about placeholders. A placeholder is an object whose value you can only specify later. \n", 153 | "To specify values for a placeholder, you can pass them in using a \"feed dictionary\" (the `feed_dict` variable). Below, we created a placeholder for x. This allows us to pass in a number later when we run the session." 154 | ] 155 | }, 156 | { 157 | "cell_type": "code", 158 | "execution_count": 41, 159 | "metadata": { 160 | "collapsed": false 161 | }, 162 | "outputs": [ 163 | { 164 | "name": "stdout", 165 | "output_type": "stream", 166 | "text": [ 167 | "6\n" 168 | ] 169 | } 170 | ], 171 | "source": [ 172 | "# Change the value of x in the feed_dict\n", 173 | "\n", 174 | "x = tf.placeholder(tf.int64, name = 'x')\n", 175 | "print(sess.run(2 * x, feed_dict = {x: 3}))\n", 176 | "sess.close()" 177 | ] 178 | }, 179 | { 180 | "cell_type": "markdown", 181 | "metadata": {}, 182 | "source": [ 183 | "When you first defined x you did not have to specify a value for it. A placeholder is simply a variable that you will assign data to later, when running the session.\n", 184 | "\n", 185 | "Here is what is happening: when you specify the operations needed for a computation, you are telling TensorFlow how to construct a computation graph. The computation graph can have some placeholders whose values you will specify later. Finally, when you run the session, you are telling TensorFlow to execute the computation graph." 186 | ] 187 | }, 188 | { 189 | "cell_type": "markdown", 190 | "metadata": {}, 191 | "source": [ 192 | "### 1.1 - Linear function\n", 193 | "\n", 194 | "Let's start this programming exercise by computing the following equation: $Y = WX + b$, where $W$ and $X$ are random matrices and b is a random vector. \n", 195 | "\n", 196 | "**Exercise**: Compute $WX + b$, where $W, X$, and $b$ are drawn from a random normal distribution. W is of shape (4, 3), X is (3,1) and b is (4,1). Here is a small example showing how to define a constant X with shape (3,1):\n", 197 | "```python\n", 198 | "X = tf.constant(np.random.randn(3,1), name = \"X\")\n", 199 | "\n", 200 | "```\n", 201 | "The following functions might be useful: \n", 202 | "- tf.matmul(..., ...) to do a matrix multiplication\n", 203 | "- tf.add(..., ...) to do an addition\n", 204 | "- np.random.randn(...) to initialize randomly\n" 205 | ] 206 | }, 207 | { 208 | "cell_type": "code", 209 | "execution_count": 42, 210 | "metadata": { 211 | "collapsed": true 212 | }, 213 | "outputs": [], 214 | "source": [ 215 | "# GRADED FUNCTION: linear_function\n", 216 | "\n", 217 | "def linear_function():\n", 218 | " \"\"\"\n", 219 | " Implements a linear function: \n", 220 | " Initializes W to be a random tensor of shape (4,3)\n", 221 | " Initializes X to be a random tensor of shape (3,1)\n", 222 | " Initializes b to be a random tensor of shape (4,1)\n", 223 | " Returns: \n", 224 | " result -- runs the session for Y = WX + b \n", 225 | " \"\"\"\n", 226 | " \n", 227 | " np.random.seed(1)\n", 228 | " \n", 229 | " ### START CODE HERE ### (4 lines of code)\n", 230 | " X = tf.constant(np.random.randn(3,1), name = \"X\")\n", 231 | " W = tf.constant(np.random.randn(4,3), name = \"W\")\n", 232 | " b = tf.constant(np.random.randn(4,1), name = \"b\")\n", 233 | "# print(X)\n", 234 | "# print(W)\n", 235 | "# print(b)\n", 236 | " Y = tf.add(tf.matmul(W,X), b)\n", 237 | " ### END CODE HERE ### \n", 238 | " \n", 239 | " # Create the session using tf.Session() and run it with sess.run(...) 
on the variable you want to calculate\n", 240 | " \n", 241 | " ### START CODE HERE ###\n", 242 | " sess = tf.Session()\n", 243 | " result = sess.run(Y)\n", 244 | " ### END CODE HERE ### \n", 245 | " \n", 246 | " # close the session \n", 247 | " sess.close()\n", 248 | "\n", 249 | " return result" 250 | ] 251 | }, 252 | { 253 | "cell_type": "code", 254 | "execution_count": 43, 255 | "metadata": { 256 | "collapsed": false 257 | }, 258 | "outputs": [ 259 | { 260 | "name": "stdout", 261 | "output_type": "stream", 262 | "text": [ 263 | "result = [[-2.15657382]\n", 264 | " [ 2.95891446]\n", 265 | " [-1.08926781]\n", 266 | " [-0.84538042]]\n" 267 | ] 268 | } 269 | ], 270 | "source": [ 271 | "print( \"result = \" + str(linear_function()))" 272 | ] 273 | }, 274 | { 275 | "cell_type": "markdown", 276 | "metadata": {}, 277 | "source": [ 278 | "*** Expected Output ***: \n", 279 | "\n", 280 | " \n", 281 | " \n", 282 | "\n", 285 | "\n", 291 | " \n", 292 | "\n", 293 | "
\n", 283 | "**result**\n", 284 | "\n", 286 | "[[-2.15657382]\n", 287 | " [ 2.95891446]\n", 288 | " [-1.08926781]\n", 289 | " [-0.84538042]]\n", 290 | "
" 294 | ] 295 | }, 296 | { 297 | "cell_type": "markdown", 298 | "metadata": {}, 299 | "source": [ 300 | "### 1.2 - Computing the sigmoid \n", 301 | "Great! 你刚刚实现了一个线性函数. Tensorflow 提供了很多类似 `tf.sigmoid` 和 `tf.softmax`这样的常用的神经网络功能. For this exercise lets compute the sigmoid function of an input. \n", 302 | "\n", 303 | "你将使用占位符变量`x`来做这个练习. When running the session, you should use the feed dictionary to pass in the input `z`. In this exercise, you will have to (i) create a placeholder `x`, (ii) define the operations needed to compute the sigmoid using `tf.sigmoid`, and then (iii) run the session. \n", 304 | "\n", 305 | "** Exercise **: 实现下面的 sigmoid 函数. You should use the following: \n", 306 | "\n", 307 | "- `tf.placeholder(tf.float32, name = \"...\")`\n", 308 | "- `tf.sigmoid(...)`\n", 309 | "- `sess.run(..., feed_dict = {x: z})`\n", 310 | "\n", 311 | "请注意,有两种典型的方法来去创建和使用tensor流中的会话:\n", 312 | "\n", 313 | "**Method 1:**\n", 314 | "```python\n", 315 | "sess = tf.Session()\n", 316 | "# Run the variables initialization (if needed), run the operations\n", 317 | "result = sess.run(..., feed_dict = {...})\n", 318 | "sess.close() # Close the session\n", 319 | "```\n", 320 | "**Method 2:**\n", 321 | "```python\n", 322 | "with tf.Session() as sess: \n", 323 | " # run the variables initialization (if needed), run the operations\n", 324 | " result = sess.run(..., feed_dict = {...})\n", 325 | " # This takes care of closing the session for you :)\n", 326 | "```\n" 327 | ] 328 | }, 329 | { 330 | "cell_type": "code", 331 | "execution_count": 44, 332 | "metadata": { 333 | "collapsed": true 334 | }, 335 | "outputs": [], 336 | "source": [ 337 | "# GRADED FUNCTION: sigmoid\n", 338 | "\n", 339 | "def sigmoid(z):\n", 340 | " \"\"\"\n", 341 | " Computes the sigmoid of z\n", 342 | " \n", 343 | " Arguments:\n", 344 | " z -- input value, scalar or vector\n", 345 | " \n", 346 | " Returns: \n", 347 | " results -- the sigmoid of z\n", 348 | " \"\"\"\n", 349 | " \n", 350 | " ### START CODE HERE ### ( approx. 4 lines of code)\n", 351 | " # Create a placeholder for x. Name it 'x'.\n", 352 | " x = tf.placeholder(tf.float32, name=\"x\")\n", 353 | "\n", 354 | " # compute sigmoid(x)\n", 355 | " sigmoid = tf.sigmoid(x)\n", 356 | "\n", 357 | " # Create a session, and run it. Please use the method 2 explained above. \n", 358 | " # You should use a feed_dict to pass z's value to x. \n", 359 | " with tf.Session() as sess:\n", 360 | " # Run session and call the output \"result\"\n", 361 | " result = sess.run(sigmoid, feed_dict={x:z})\n", 362 | " \n", 363 | " ### END CODE HERE ###\n", 364 | " \n", 365 | " return result" 366 | ] 367 | }, 368 | { 369 | "cell_type": "code", 370 | "execution_count": 45, 371 | "metadata": { 372 | "collapsed": false 373 | }, 374 | "outputs": [ 375 | { 376 | "name": "stdout", 377 | "output_type": "stream", 378 | "text": [ 379 | "sigmoid(0) = 0.5\n", 380 | "sigmoid(12) = 0.999994\n" 381 | ] 382 | } 383 | ], 384 | "source": [ 385 | "print (\"sigmoid(0) = \" + str(sigmoid(0)))\n", 386 | "print (\"sigmoid(12) = \" + str(sigmoid(12)))" 387 | ] 388 | }, 389 | { 390 | "cell_type": "markdown", 391 | "metadata": {}, 392 | "source": [ 393 | "*** Expected Output ***: \n", 394 | "\n", 395 | " \n", 396 | " \n", 397 | "\n", 400 | "\n", 403 | "\n", 404 | " \n", 405 | "\n", 408 | "\n", 411 | " \n", 412 | "\n", 413 | "
\n", 398 | "**sigmoid(0)**\n", 399 | "\n", 401 | "0.5\n", 402 | "
\n", 406 | "**sigmoid(12)**\n", 407 | "\n", 409 | "0.999994\n", 410 | "
" 414 | ] 415 | }, 416 | { 417 | "cell_type": "markdown", 418 | "metadata": {}, 419 | "source": [ 420 | "\n", 421 | "**总结一下,你应该知道怎样去:**\n", 422 | "1. 创建占位符\n", 423 | "2. 指定与您要计算的操作对应的计算图\n", 424 | "3. 创建会话\n", 425 | "4. 运行会话,必要时使用Feed字典来指定占位符变量的值. " 426 | ] 427 | }, 428 | { 429 | "cell_type": "markdown", 430 | "metadata": {}, 431 | "source": [ 432 | "### 1.3 - Computing the Cost\n", 433 | "\n", 434 | "您也可以使用内置函数来计算您的神经网络的损失. So instead of needing to write code to compute this as a function of $a^{[2](i)}$ and $y^{(i)}$ for i=1...m: \n", 435 | "$$ J = - \\frac{1}{m} \\sum_{i = 1}^m \\large ( \\small y^{(i)} \\log a^{ [2] (i)} + (1-y^{(i)})\\log (1-a^{ [2] (i)} )\\large )\\small\\tag{2}$$\n", 436 | "\n", 437 | "在tensorflow中你可以用一行代码中做到这一点!\n", 438 | "\n", 439 | "**Exercise**: 实现交叉熵损失. The function you will use is: \n", 440 | "\n", 441 | "\n", 442 | "- `tf.nn.sigmoid_cross_entropy_with_logits(logits = ..., labels = ...)`\n", 443 | "\n", 444 | "Your code should input `z`, compute the sigmoid (to get `a`) and then compute the cross entropy cost $J$. All this can be done using one call to `tf.nn.sigmoid_cross_entropy_with_logits`, which computes\n", 445 | "\n", 446 | "$$- \\frac{1}{m} \\sum_{i = 1}^m \\large ( \\small y^{(i)} \\log \\sigma(z^{[2](i)}) + (1-y^{(i)})\\log (1-\\sigma(z^{[2](i)})\\large )\\small\\tag{2}$$\n", 447 | "\n" 448 | ] 449 | }, 450 | { 451 | "cell_type": "code", 452 | "execution_count": 46, 453 | "metadata": { 454 | "collapsed": true 455 | }, 456 | "outputs": [], 457 | "source": [ 458 | "# GRADED FUNCTION: cost\n", 459 | "\n", 460 | "def cost(logits, labels):\n", 461 | " \"\"\"\n", 462 | "    Computes the cost using the sigmoid cross entropy\n", 463 | "    \n", 464 | "    Arguments:\n", 465 | "    logits -- vector containing z, output of the last linear unit (before the final sigmoid activation)\n", 466 | "    labels -- vector of labels y (1 or 0) \n", 467 | " \n", 468 | " Note: What we've been calling \"z\" and \"y\" in this class are respectively called \"logits\" and \"labels\" \n", 469 | " in the TensorFlow documentation. So logits will feed into z, and labels into y. \n", 470 | "    \n", 471 | "    Returns:\n", 472 | "    cost -- runs the session of the cost (formula (2))\n", 473 | " \"\"\"\n", 474 | " \n", 475 | " ### START CODE HERE ### \n", 476 | " \n", 477 | " # Create the placeholders for \"logits\" (z) and \"labels\" (y) (approx. 2 lines)\n", 478 | " z = tf.placeholder(tf.float32, name=\"logits\")\n", 479 | " y = tf.placeholder(tf.float32, name=\"labels\")\n", 480 | " \n", 481 | " # Use the loss function (approx. 1 line)\n", 482 | " cost = tf.nn.sigmoid_cross_entropy_with_logits(logits=z, labels=y)\n", 483 | " \n", 484 | " # Create a session (approx. 1 line). See method 1 above.\n", 485 | " sess = tf.Session()\n", 486 | " \n", 487 | " # Run the session (approx. 1 line).\n", 488 | " cost = sess.run(cost, feed_dict={z:logits, y:labels})\n", 489 | " \n", 490 | " \n", 491 | " # Close the session (approx. 1 line). 
450 | { 451 | "cell_type": "code", 452 | "execution_count": 46, 453 | "metadata": { 454 | "collapsed": true 455 | }, 456 | "outputs": [], 457 | "source": [ 458 | "# GRADED FUNCTION: cost\n", 459 | "\n", 460 | "def cost(logits, labels):\n", 461 | " \"\"\"\n", 462 | "    Computes the cost using the sigmoid cross entropy\n", 463 | "    \n", 464 | "    Arguments:\n", 465 | "    logits -- vector containing z, output of the last linear unit (before the final sigmoid activation)\n", 466 | "    labels -- vector of labels y (1 or 0) \n", 467 | " \n", 468 | " Note: What we've been calling \"z\" and \"y\" in this class are respectively called \"logits\" and \"labels\" \n", 469 | " in the TensorFlow documentation. So logits will feed into z, and labels into y. \n", 470 | "    \n", 471 | "    Returns:\n", 472 | "    cost -- runs the session of the cost (formula (2))\n", 473 | " \"\"\"\n", 474 | " \n", 475 | " ### START CODE HERE ### \n", 476 | " \n", 477 | " # Create the placeholders for \"logits\" (z) and \"labels\" (y) (approx. 2 lines)\n", 478 | " z = tf.placeholder(tf.float32, name=\"logits\")\n", 479 | " y = tf.placeholder(tf.float32, name=\"labels\")\n", 480 | " \n", 481 | " # Use the loss function (approx. 1 line)\n", 482 | " cost = tf.nn.sigmoid_cross_entropy_with_logits(logits=z, labels=y)\n", 483 | " \n", 484 | " # Create a session (approx. 1 line). See method 1 above.\n", 485 | " sess = tf.Session()\n", 486 | " \n", 487 | " # Run the session (approx. 1 line).\n", 488 | " cost = sess.run(cost, feed_dict={z:logits, y:labels})\n", 489 | " \n", 490 | " \n", 491 | " # Close the session (approx. 1 line). See method 1 above.\n", 492 | " sess.close()\n", 493 | " \n", 494 | " ### END CODE HERE ###\n", 495 | " \n", 496 | " return cost" 497 | ] 498 | }, 499 | { 500 | "cell_type": "code", 501 | "execution_count": 47, 502 | "metadata": { 503 | "collapsed": false 504 | }, 505 | "outputs": [ 506 | { 507 | "name": "stdout", 508 | "output_type": "stream", 509 | "text": [ 510 | "[ 1.00538719 1.03664088 0.41385433 0.39956614]\n" 511 | ] 512 | } 513 | ], 514 | "source": [ 515 | "logits = sigmoid(np.array([0.2,0.4,0.7,0.9]))\n", 516 | "cost = cost(logits, np.array([0,0,1,1]))\n", 517 | "print(cost)" 518 | ] 519 | }, 520 | { 521 | "cell_type": "markdown", 522 | "metadata": {}, 523 | "source": [ 524 | "** Expected Output** : \n", 525 | "\n", 526 | " \n", 527 | " \n", 528 | " 
\n", 529 | " **cost**\n", 530 | " \n", 532 | " [ 1.00538719 1.03664088 0.41385433 0.39956614]\n", 533 | "
" 537 | ] 538 | }, 539 | { 540 | "cell_type": "markdown", 541 | "metadata": {}, 542 | "source": [ 543 | "### 1.4 - Using One Hot encodings\n", 544 | "\n", 545 | "Many times in deep learning you will have a y vector with numbers ranging from 0 to C-1, where C is the number of classes. If C is for example 4, then you might have the following y vector which you will need to convert as follows:\n", 546 | "\n", 547 | "\n", 548 | "\n", 549 | "\n", 550 | "This is called a \"one hot\" encoding, because in the converted representation exactly one element of each column is \"hot\" (meaning set to 1). To do this conversion in numpy, you might have to write a few lines of code. In tensorflow, you can use one line of code: \n", 551 | "\n", 552 | "- tf.one_hot(labels, depth, axis) \n", 553 | "\n", 554 | "**Exercise:** Implement the function below to take one vector of labels and the total number of classes $C$, and return the one hot encoding. Use `tf.one_hot()` to do this. " 555 | ] 556 | }, 557 | { 558 | "cell_type": "code", 559 | "execution_count": 48, 560 | "metadata": { 561 | "collapsed": true 562 | }, 563 | "outputs": [], 564 | "source": [ 565 | "# GRADED FUNCTION: one_hot_matrix\n", 566 | "\n", 567 | "def one_hot_matrix(labels, C):\n", 568 | " \"\"\"\n", 569 | " Creates a matrix where the i-th row corresponds to the ith class number and the jth column\n", 570 | " corresponds to the jth training example. So if example j had a label i. Then entry (i,j) \n", 571 | " will be 1. \n", 572 | " \n", 573 | " Arguments:\n", 574 | " labels -- vector containing the labels \n", 575 | " C -- number of classes, the depth of the one hot dimension\n", 576 | " \n", 577 | " Returns: \n", 578 | " one_hot -- one hot matrix\n", 579 | " \"\"\"\n", 580 | " \n", 581 | " ### START CODE HERE ###\n", 582 | " \n", 583 | " # Create a tf.constant equal to C (depth), name it 'C'. (approx. 1 line)\n", 584 | " C = tf.constant(C)\n", 585 | " \n", 586 | " # Use tf.one_hot, be careful with the axis (approx. 1 line)\n", 587 | " one_hot_matrix = tf.one_hot(labels, C, axis=0)\n", 588 | " \n", 589 | " # Create the session (approx. 1 line)\n", 590 | " sess = tf.Session()\n", 591 | " \n", 592 | " # Run the session (approx. 1 line)\n", 593 | " one_hot = sess.run(one_hot_matrix)\n", 594 | " \n", 595 | " # Close the session (approx. 1 line). See method 1 above.\n", 596 | " sess.close()\n", 597 | " \n", 598 | " ### END CODE HERE ###\n", 599 | " \n", 600 | " return one_hot" 601 | ] 602 | }, 603 | { 604 | "cell_type": "code", 605 | "execution_count": 49, 606 | "metadata": { 607 | "collapsed": false 608 | }, 609 | "outputs": [ 610 | { 611 | "name": "stdout", 612 | "output_type": "stream", 613 | "text": [ 614 | "one_hot = [[ 0. 0. 0. 1. 0. 0.]\n", 615 | " [ 1. 0. 0. 0. 0. 1.]\n", 616 | " [ 0. 1. 0. 0. 1. 0.]\n", 617 | " [ 0. 0. 1. 0. 0. 0.]]\n" 618 | ] 619 | } 620 | ], 621 | "source": [ 622 | "labels = np.array([1,2,3,0,2,1])\n", 623 | "one_hot = one_hot_matrix(labels, C = 4)\n", 624 | "print (\"one_hot = \" + str(one_hot))" 625 | ] 626 | }, 627 | { 628 | "cell_type": "markdown", 629 | "metadata": {}, 630 | "source": [ 631 | "**Expected Output**: \n", 632 | "\n", 633 | " \n", 634 | " \n", 635 | " \n", 638 | " \n", 644 | " \n", 645 | "\n", 646 | "
\n", 636 | " **one_hot**\n", 637 | " \n", 639 | " [[ 0. 0. 0. 1. 0. 0.]\n", 640 | " [ 1. 0. 0. 0. 0. 1.]\n", 641 | " [ 0. 1. 0. 0. 1. 0.]\n", 642 | " [ 0. 0. 1. 0. 0. 0.]]\n", 643 | "
\n" 647 | ] 648 | }, 649 | { 650 | "cell_type": "markdown", 651 | "metadata": {}, 652 | "source": [ 653 | "### 1.5 - Initialize with zeros and ones\n", 654 | "\n", 655 | "Now you will learn how to initialize a vector of zeros or ones. The function you will be calling is `tf.ones()`. To initialize with zeros you could use tf.zeros() instead. These functions take in a shape and return an array of dimension shape full of zeros and ones respectively. \n", 656 | "\n", 657 | "**Exercise:** Implement the function below to take in a shape and return an array of ones with that shape. \n", 658 | "\n", 659 | " - tf.ones(shape)\n" 660 | ] 661 | }, 662 | { 663 | "cell_type": "code", 664 | "execution_count": 50, 665 | "metadata": { 666 | "collapsed": true 667 | }, 668 | "outputs": [], 669 | "source": [ 670 | "# GRADED FUNCTION: ones\n", 671 | "\n", 672 | "def ones(shape):\n", 673 | " \"\"\"\n", 674 | " Creates an array of ones of dimension shape\n", 675 | " \n", 676 | " Arguments:\n", 677 | " shape -- shape of the array you want to create\n", 678 | " \n", 679 | " Returns: \n", 680 | " ones -- array containing only ones\n", 681 | " \"\"\"\n", 682 | " \n", 683 | " ### START CODE HERE ###\n", 684 | " \n", 685 | " # Create \"ones\" tensor using tf.ones(...). (approx. 1 line)\n", 686 | " ones = tf.ones(shape)\n", 687 | " \n", 688 | " # Create the session (approx. 1 line)\n", 689 | " sess = tf.Session()\n", 690 | " \n", 691 | " # Run the session to compute 'ones' (approx. 1 line)\n", 692 | " ones = sess.run(ones)\n", 693 | " \n", 694 | " # Close the session (approx. 1 line). See method 1 above.\n", 695 | " sess.close()\n", 696 | " \n", 697 | " ### END CODE HERE ###\n", 698 | " return ones" 699 | ] 700 | }, 701 | { 702 | "cell_type": "code", 703 | "execution_count": 51, 704 | "metadata": { 705 | "collapsed": false 706 | }, 707 | "outputs": [ 708 | { 709 | "name": "stdout", 710 | "output_type": "stream", 711 | "text": [ 712 | "ones = [ 1. 1. 1.]\n" 713 | ] 714 | } 715 | ], 716 | "source": [ 717 | "print (\"ones = \" + str(ones([3])))" 718 | ] 719 | }, 720 | { 721 | "cell_type": "markdown", 722 | "metadata": {}, 723 | "source": [ 724 | "**Expected Output:**\n", 725 | "\n", 726 | " \n", 727 | " \n", 728 | " 
\n", 729 | " **ones**\n", 730 | " \n", 732 | " [ 1. 1. 1.]\n", 733 | "
" 737 | ] 738 | }, 739 | { 740 | "cell_type": "markdown", 741 | "metadata": {}, 742 | "source": [ 743 | "# 2 - 用tensorflow构建第一个神经网络\n", 744 | "\n", 745 | "在这个任务中,您将使用tensorflow建立一个神经网络. 通常有如下两步:\n", 746 | "\n", 747 | "- Create the computation graph\n", 748 | "- Run the graph\n", 749 | "\n", 750 | "### 2.0 - Problem statement: SIGNS Dataset\n", 751 | "\n", 752 | "某天下午,你和一群朋友一起决定去教电脑破解手语。你们花了几个小时在白墙前拍摄照片,然后得到了下面的数据集。现在你的工作是建立一个算法,以促进语言障碍人士和不懂手语的人进行沟通。\n", 753 | "\n", 754 | "- **训练集**: 1080个图片(64×64像素),代表从0到5的数字(每个数字180张图片).\n", 755 | "- **测试集**: 120张图片(64×64像素),代表从0到5的数字(每张数20张).\n", 756 | "\n", 757 | "请注意,这是SIGNS数据集的一个子集。完整的数据集包含更多的signs。\n", 758 | "\n", 759 | "以下是每个数字的示例,以及如何用标签表示示例。 These are the original pictures, before we lowered the image resolutoion to 64 by 64 pixels.\n", 760 | "
**Figure 1**: SIGNS dataset
 \n", 761 | "\n", 762 | "\n", 763 | "Run the following code to load the dataset." 764 | ] 765 | }, 766 | { 767 | "cell_type": "code", 768 | "execution_count": 74, 769 | "metadata": { 770 | "collapsed": false 771 | }, 772 | "outputs": [ 773 | { 774 | "ename": "AttributeError", 775 | "evalue": "module 'h5py' has no attribute 'File'", 776 | "output_type": "error", 777 | "traceback": [ 778 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", 779 | "\u001b[0;31mAttributeError\u001b[0m Traceback (most recent call last)", 780 | "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[1;31m# Loading the dataset\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mtrain_dataset\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mh5py\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mFile\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'1212.mp'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m\"r\"\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 3\u001b[0m \u001b[0mX_train_orig\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mY_train_orig\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mX_test_orig\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mY_test_orig\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mclasses\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mload_dataset\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", 781 | "\u001b[0;31mAttributeError\u001b[0m: module 'h5py' has no attribute 'File'" 782 | ] 783 | } 784 | ], 785 | "source": [ 786 | "# Loading the dataset\n", 787 | "X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset()" 788 | ] 789 | }, 790 | { 791 | "cell_type": "markdown", 792 | "metadata": {}, 793 | "source": [ 794 | "Change the index below and run the cell to visualize some examples in the dataset." 795 | ] 796 | }, 797 | { 798 | "cell_type": "code", 799 | "execution_count": 23, 800 | "metadata": { 801 | "collapsed": false 802 | }, 803 | "outputs": [], 804 | "source": [ 805 | "# Example of a picture\n", 806 | "index = 0\n", 807 | "plt.imshow(X_train_orig[index])\n", 808 | "print (\"y = \" + str(np.squeeze(Y_train_orig[:, index])))" 809 | ] 810 | },
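The next cells flatten each 64×64×3 image into a single column vector. As a quick sanity check of the shape arithmetic (illustrative values only, not part of the notebook):

```python
import numpy as np

# (m, 64, 64, 3) -> reshape(m, -1) -> (m, 12288) -> transpose -> (12288, m)
X_demo = np.zeros((1080, 64, 64, 3))                 # same shape as X_train_orig
print(X_demo.reshape(X_demo.shape[0], -1).T.shape)   # (12288, 1080)
```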
811 | { 812 | "cell_type": "markdown", 813 | "metadata": {}, 814 | "source": [ 815 | "As usual, you flatten the image dataset, then normalize it by dividing by 255. On top of that, you will convert each label to a one-hot vector as shown in Figure 1. Run the cell below to do that." 816 | ] 817 | }, 818 | { 819 | "cell_type": "code", 820 | "execution_count": null, 821 | "metadata": { 822 | "collapsed": true 823 | }, 824 | "outputs": [], 825 | "source": [ 826 | "# Flatten the training and test images\n", 827 | "X_train_flatten = X_train_orig.reshape(X_train_orig.shape[0], -1).T\n", 828 | "X_test_flatten = X_test_orig.reshape(X_test_orig.shape[0], -1).T\n", 829 | "# Normalize image vectors\n", 830 | "X_train = X_train_flatten/255.\n", 831 | "X_test = X_test_flatten/255.\n", 832 | "# Convert training and test labels to one hot matrices\n", 833 | "Y_train = convert_to_one_hot(Y_train_orig, 6)\n", 834 | "Y_test = convert_to_one_hot(Y_test_orig, 6)\n", 835 | "\n", 836 | "print (\"number of training examples = \" + str(X_train.shape[1]))\n", 837 | "print (\"number of test examples = \" + str(X_test.shape[1]))\n", 838 | "print (\"X_train shape: \" + str(X_train.shape))\n", 839 | "print (\"Y_train shape: \" + str(Y_train.shape))\n", 840 | "print (\"X_test shape: \" + str(X_test.shape))\n", 841 | "print (\"Y_test shape: \" + str(Y_test.shape))" 842 | ] 843 | }, 844 | { 845 | "cell_type": "markdown", 846 | "metadata": {}, 847 | "source": [ 848 | "**Note** that 12288 comes from $64 \\times 64 \\times 3$. Each image is a square of 64 by 64 pixels, and 3 is for the RGB colors." 849 | ] 850 | }, 851 | { 852 | "cell_type": "markdown", 853 | "metadata": {}, 854 | "source": [ 855 | "**Your goal** is to build an algorithm capable of recognizing a sign with high accuracy. To do so, you are going to build a tensorflow model that is almost the same as the cat-recognition model you previously built in python (except that now you are using a softmax output). It is a great occasion to compare the two implementations (numpy and tensorflow).\n", 856 | "\n", 857 | "**The model** is *LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SOFTMAX*. The SIGMOID output layer has been converted to a SOFTMAX. A SOFTMAX layer generalizes SIGMOID to when there are more than two classes. " 858 | ] 859 | }, 860 | { 861 | "cell_type": "markdown", 862 | "metadata": {}, 863 | "source": [ 864 | "### 2.1 - Create placeholders\n", 865 | "\n", 866 | "Your first task is to create placeholders for `X` and `Y`. This will allow you to later pass your training data in when you run your session.\n", 867 | "\n", 868 | "**Exercise:** Implement the function below to create the placeholders in tensorflow." 869 | ] 870 | },
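Why the `None` in the placeholder shape matters: the same graph can then be fed batches of any size. A small standalone sketch (the shapes here are arbitrary, not from the assignment):

```python
import numpy as np
import tensorflow as tf

X = tf.placeholder(tf.float32, shape=(5, None), name="X_demo")
double = 2 * X

with tf.Session() as sess:
    # The same placeholder accepts batches of different sizes.
    print(sess.run(double, feed_dict={X: np.ones((5, 3))}).shape)    # (5, 3)
    print(sess.run(double, feed_dict={X: np.ones((5, 10))}).shape)   # (5, 10)
```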
871 | { 872 | "cell_type": "code", 873 | "execution_count": null, 874 | "metadata": { 875 | "collapsed": true 876 | }, 877 | "outputs": [], 878 | "source": [ 879 | "# GRADED FUNCTION: create_placeholders\n", 880 | "\n", 881 | "def create_placeholders(n_x, n_y):\n", 882 | " \"\"\"\n", 883 | " Creates the placeholders for the tensorflow session.\n", 884 | " \n", 885 | " Arguments:\n", 886 | " n_x -- scalar, size of an image vector (num_px * num_px = 64 * 64 * 3 = 12288)\n", 887 | " n_y -- scalar, number of classes (from 0 to 5, so -> 6)\n", 888 | " \n", 889 | " Returns:\n", 890 | " X -- placeholder for the data input, of shape [n_x, None] and dtype \"float\"\n", 891 | " Y -- placeholder for the input labels, of shape [n_y, None] and dtype \"float\"\n", 892 | " \n", 893 | " Tips:\n", 894 | " - You will use None because it lets us be flexible on the number of examples you will feed for the placeholders.\n", 895 | " In fact, the number of examples during test/train is different.\n", 896 | " \"\"\"\n", 897 | "\n", 898 | " ### START CODE HERE ### (approx. 2 lines)\n", 899 | " X = tf.placeholder(tf.float32, shape=(n_x,None), name=\"X\")\n", 900 | " Y = tf.placeholder(tf.float32, shape=(n_y,None), name=\"Y\")\n", 901 | " ### END CODE HERE ###\n", 902 | " \n", 903 | " return X, Y" 904 | ] 905 | }, 906 | { 907 | "cell_type": "code", 908 | "execution_count": null, 909 | "metadata": { 910 | "collapsed": true 911 | }, 912 | "outputs": [], 913 | "source": [ 914 | "X, Y = create_placeholders(12288, 6)\n", 915 | "print (\"X = \" + str(X))\n", 916 | "print (\"Y = \" + str(Y))" 917 | ] 918 | }, 919 | { 920 | "cell_type": "markdown", 921 | "metadata": {}, 922 | "source": [ 923 | "**Expected Output**: \n", 924 | "\n", 925 | " \n", 926 | " \n", 927 | " 
\n", 928 | " **X**\n", 929 | " \n", 931 | " Tensor(\"Placeholder_1:0\", shape=(12288, ?), dtype=float32) (not necessarily Placeholder_1)\n", 932 | "
 \n", 936 | " **Y**\n", 937 | " \n", 939 | " Tensor(\"Placeholder_2:0\", shape=(6, ?), dtype=float32) (not necessarily Placeholder_2)\n", 940 | " \n", 941 | " \n", 942 | "\n", 943 | "
" 944 | ] 945 | },
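Before implementing `initialize_parameters`, here is a minimal look (an illustration under the same TF1 APIs used below, not part of the assignment) at the behavior of the two initializers from section 2.2:

```python
import tensorflow as tf

tf.reset_default_graph()
W = tf.get_variable("W_demo", [25, 12288],
                    initializer=tf.contrib.layers.xavier_initializer(seed=1))
b = tf.get_variable("b_demo", [25, 1], initializer=tf.zeros_initializer())

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(W).std())   # small spread, scaled by fan-in/fan-out (Xavier)
    print(sess.run(b).sum())   # 0.0 -- biases start at zero
```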
" 944 | ] 945 | }, 946 | { 947 | "cell_type": "markdown", 948 | "metadata": {}, 949 | "source": [ 950 | "### 2.2 - Initializing the parameters\n", 951 | "\n", 952 | "你的第二个任务是初始化tensorflow的参数。\n", 953 | "\n", 954 | "**Exercise:** 执行下面的函数来初始化tensorflow中的参数。You are going use Xavier Initialization for weights and Zero Initialization for biases. The shapes are given below. As an example, to help you, for W1 and b1 you could use: \n", 955 | "\n", 956 | "```python\n", 957 | "W1 = tf.get_variable(\"W1\", [25,12288], initializer = tf.contrib.layers.xavier_initializer(seed = 1))\n", 958 | "b1 = tf.get_variable(\"b1\", [25,1], initializer = tf.zeros_initializer())\n", 959 | "```\n", 960 | "请使用 `seed = 1` ,以确保您的结果和我们的相匹配。" 961 | ] 962 | }, 963 | { 964 | "cell_type": "code", 965 | "execution_count": null, 966 | "metadata": { 967 | "collapsed": true 968 | }, 969 | "outputs": [], 970 | "source": [ 971 | "# GRADED FUNCTION: initialize_parameters\n", 972 | "\n", 973 | "def initialize_parameters():\n", 974 | " \"\"\"\n", 975 | " Initializes parameters to build a neural network with tensorflow. The shapes are:\n", 976 | " W1 : [25, 12288]\n", 977 | " b1 : [25, 1]\n", 978 | " W2 : [12, 25]\n", 979 | " b2 : [12, 1]\n", 980 | " W3 : [6, 12]\n", 981 | " b3 : [6, 1]\n", 982 | " \n", 983 | " Returns:\n", 984 | " parameters -- a dictionary of tensors containing W1, b1, W2, b2, W3, b3\n", 985 | " \"\"\"\n", 986 | " \n", 987 | " tf.set_random_seed(1) # so that your \"random\" numbers match ours\n", 988 | " \n", 989 | " ### START CODE HERE ### (approx. 6 lines of code)\n", 990 | " W1 = tf.get_variable(\"W1\", [25,12288], initializer = tf.contrib.layers.xavier_initializer(seed = 1))\n", 991 | " b1 = tf.get_variable(\"b1\", [25,1], initializer = tf.zeros_initializer())\n", 992 | " W2 = tf.get_variable(\"W2\", [12,25], initializer = tf.contrib.layers.xavier_initializer(seed = 1))\n", 993 | " b2 = tf.get_variable(\"b2\", [12,1], initializer = tf.zeros_initializer())\n", 994 | " W3 = tf.get_variable(\"W3\", [6,12], initializer = tf.contrib.layers.xavier_initializer(seed = 1))\n", 995 | " b3 = tf.get_variable(\"b3\", [6,1], initializer = tf.zeros_initializer())\n", 996 | " ### END CODE HERE ###\n", 997 | "\n", 998 | " parameters = {\"W1\": W1,\n", 999 | " \"b1\": b1,\n", 1000 | " \"W2\": W2,\n", 1001 | " \"b2\": b2,\n", 1002 | " \"W3\": W3,\n", 1003 | " \"b3\": b3}\n", 1004 | " \n", 1005 | " return parameters" 1006 | ] 1007 | }, 1008 | { 1009 | "cell_type": "code", 1010 | "execution_count": 25, 1011 | "metadata": { 1012 | "collapsed": false 1013 | }, 1014 | "outputs": [], 1015 | "source": [ 1016 | "tf.reset_default_graph()\n", 1017 | "with tf.Session() as sess:\n", 1018 | " parameters = initialize_parameters()\n", 1019 | " print(\"W1 = \" + str(parameters[\"W1\"]))\n", 1020 | " print(\"b1 = \" + str(parameters[\"b1\"]))\n", 1021 | " print(\"W2 = \" + str(parameters[\"W2\"]))\n", 1022 | " print(\"b2 = \" + str(parameters[\"b2\"]))" 1023 | ] 1024 | }, 1025 | { 1026 | "cell_type": "markdown", 1027 | "metadata": {}, 1028 | "source": [ 1029 | "**Expected Output**: \n", 1030 | "\n", 1031 | " \n", 1032 | " \n", 1033 | " \n", 1036 | " \n", 1039 | " \n", 1040 | " \n", 1041 | " \n", 1044 | " \n", 1047 | " \n", 1048 | " \n", 1049 | " \n", 1052 | " \n", 1055 | " \n", 1056 | " \n", 1057 | " \n", 1060 | " \n", 1063 | " \n", 1064 | "\n", 1065 | "
\n", 1034 | " **W1**\n", 1035 | " \n", 1037 | " < tf.Variable 'W1:0' shape=(25, 12288) dtype=float32_ref >\n", 1038 | "
\n", 1042 | " **b1**\n", 1043 | " \n", 1045 | " < tf.Variable 'b1:0' shape=(25, 1) dtype=float32_ref >\n", 1046 | "
\n", 1050 | " **W2**\n", 1051 | " \n", 1053 | " < tf.Variable 'W2:0' shape=(12, 25) dtype=float32_ref >\n", 1054 | "
\n", 1058 | " **b2**\n", 1059 | " \n", 1061 | " < tf.Variable 'b2:0' shape=(12, 1) dtype=float32_ref >\n", 1062 | "
" 1066 | ] 1067 | }, 1068 | { 1069 | "cell_type": "markdown", 1070 | "metadata": {}, 1071 | "source": [ 1072 | "As expected, the parameters haven't been evaluated yet." 1073 | ] 1074 | }, 1075 | { 1076 | "cell_type": "markdown", 1077 | "metadata": {}, 1078 | "source": [ 1079 | "### 2.3 - tensorflow 的前向传播\n", 1080 | "\n", 1081 | "您现在将实现tensorflow中的前向传播模块。该函数将带有一个参数字典,它将完成前向传递。您将使用的功能是:\n", 1082 | "\n", 1083 | "- `tf.add(...,...)` to do an addition\n", 1084 | "- `tf.matmul(...,...)` to do a matrix multiplication\n", 1085 | "- `tf.nn.relu(...)` to apply the ReLU activation\n", 1086 | "\n", 1087 | "**Question:** 实现神经网络的前向传递. We commented for you the numpy equivalents so that you can compare the tensorflow implementation to numpy. It is important to note that the forward propagation stops at `z3`. The reason is that in tensorflow the last linear layer output is given as input to the function computing the loss. Therefore, you don't need `a3`!\n", 1088 | "\n" 1089 | ] 1090 | }, 1091 | { 1092 | "cell_type": "code", 1093 | "execution_count": null, 1094 | "metadata": { 1095 | "collapsed": true 1096 | }, 1097 | "outputs": [], 1098 | "source": [ 1099 | "# GRADED FUNCTION: forward_propagation\n", 1100 | "\n", 1101 | "def forward_propagation(X, parameters):\n", 1102 | " \"\"\"\n", 1103 | " Implements the forward propagation for the model: LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SOFTMAX\n", 1104 | " \n", 1105 | " Arguments:\n", 1106 | " X -- input dataset placeholder, of shape (input size, number of examples)\n", 1107 | " parameters -- python dictionary containing your parameters \"W1\", \"b1\", \"W2\", \"b2\", \"W3\", \"b3\"\n", 1108 | " the shapes are given in initialize_parameters\n", 1109 | "\n", 1110 | " Returns:\n", 1111 | " Z3 -- the output of the last LINEAR unit\n", 1112 | " \"\"\"\n", 1113 | " \n", 1114 | " # Retrieve the parameters from the dictionary \"parameters\" \n", 1115 | " W1 = parameters['W1']\n", 1116 | " b1 = parameters['b1']\n", 1117 | " W2 = parameters['W2']\n", 1118 | " b2 = parameters['b2']\n", 1119 | " W3 = parameters['W3']\n", 1120 | " b3 = parameters['b3']\n", 1121 | " \n", 1122 | " ### START CODE HERE ### (approx. 5 lines) # Numpy Equivalents:\n", 1123 | " Z1 = tf.matmul(W1, X)+b1 # Z1 = np.dot(W1, X) + b1\n", 1124 | " A1 = tf.nn.relu(Z1) # A1 = relu(Z1)\n", 1125 | " Z2 = tf.matmul(W2, A1)+b2 # Z2 = np.dot(W2, a1) + b2\n", 1126 | " A2 = tf.nn.relu(Z2) # A2 = relu(Z2)\n", 1127 | " Z3 = tf.matmul(W3, A2)+b3 # Z3 = np.dot(W3,Z2) + b3\n", 1128 | " ### END CODE HERE ###\n", 1129 | " \n", 1130 | " return Z3" 1131 | ] 1132 | }, 1133 | { 1134 | "cell_type": "code", 1135 | "execution_count": null, 1136 | "metadata": { 1137 | "collapsed": true, 1138 | "scrolled": true 1139 | }, 1140 | "outputs": [], 1141 | "source": [ 1142 | "tf.reset_default_graph()\n", 1143 | "\n", 1144 | "with tf.Session() as sess:\n", 1145 | " X, Y = create_placeholders(12288, 6)\n", 1146 | " parameters = initialize_parameters()\n", 1147 | " Z3 = forward_propagation(X, parameters)\n", 1148 | " print(\"Z3 = \" + str(Z3))" 1149 | ] 1150 | }, 1151 | { 1152 | "cell_type": "markdown", 1153 | "metadata": {}, 1154 | "source": [ 1155 | "**Expected Output**: \n", 1156 | "\n", 1157 | " \n", 1158 | " \n", 1159 | " \n", 1162 | " \n", 1165 | " \n", 1166 | "\n", 1167 | "
\n", 1160 | " **Z3**\n", 1161 | " \n", 1163 | " Tensor(\"Add_2:0\", shape=(6, ?), dtype=float32)\n", 1164 | "
" 1168 | ] 1169 | }, 1170 | { 1171 | "cell_type": "markdown", 1172 | "metadata": {}, 1173 | "source": [ 1174 | "您可能已经注意到,前向传播不会输出任何缓存. You will understand why below, when we get to brackpropagation." 1175 | ] 1176 | }, 1177 | { 1178 | "cell_type": "markdown", 1179 | "metadata": {}, 1180 | "source": [ 1181 | "### 2.4 Compute cost\n", 1182 | "\n", 1183 | "就像之前看到的,我们使用下面的方法计算成本非常简单:\n", 1184 | "```python\n", 1185 | "tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = ..., labels = ...))\n", 1186 | "```\n", 1187 | "**Question**: 实现下面的 cost function. \n", 1188 | "- It is important to know that the \"`logits`\" and \"`labels`\" inputs of `tf.nn.softmax_cross_entropy_with_logits` are expected to be of shape (number of examples, num_classes). We have thus transposed Z3 and Y for you.\n", 1189 | "- Besides, `tf.reduce_mean` basically does the summation over the examples." 1190 | ] 1191 | }, 1192 | { 1193 | "cell_type": "code", 1194 | "execution_count": null, 1195 | "metadata": { 1196 | "collapsed": true 1197 | }, 1198 | "outputs": [], 1199 | "source": [ 1200 | "# GRADED FUNCTION: compute_cost \n", 1201 | "\n", 1202 | "def compute_cost(Z3, Y):\n", 1203 | " \"\"\"\n", 1204 | " Computes the cost\n", 1205 | " \n", 1206 | " Arguments:\n", 1207 | " Z3 -- output of forward propagation (output of the last LINEAR unit), of shape (6, number of examples)\n", 1208 | " Y -- \"true\" labels vector placeholder, same shape as Z3\n", 1209 | " \n", 1210 | " Returns:\n", 1211 | " cost - Tensor of the cost function\n", 1212 | " \"\"\"\n", 1213 | " \n", 1214 | " # to fit the tensorflow requirement for tf.nn.softmax_cross_entropy_with_logits(...,...)\n", 1215 | " logits = tf.transpose(Z3)\n", 1216 | " labels = tf.transpose(Y)\n", 1217 | " \n", 1218 | " ### START CODE HERE ### (1 line of code)\n", 1219 | " cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = logits, labels = labels))\n", 1220 | " ### END CODE HERE ###\n", 1221 | " \n", 1222 | " return cost" 1223 | ] 1224 | }, 1225 | { 1226 | "cell_type": "code", 1227 | "execution_count": null, 1228 | "metadata": { 1229 | "collapsed": true 1230 | }, 1231 | "outputs": [], 1232 | "source": [ 1233 | "tf.reset_default_graph()\n", 1234 | "\n", 1235 | "with tf.Session() as sess:\n", 1236 | " X, Y = create_placeholders(12288, 6)\n", 1237 | " parameters = initialize_parameters()\n", 1238 | " Z3 = forward_propagation(X, parameters)\n", 1239 | " cost = compute_cost(Z3, Y)\n", 1240 | " print(\"cost = \" + str(cost))" 1241 | ] 1242 | }, 1243 | { 1244 | "cell_type": "markdown", 1245 | "metadata": {}, 1246 | "source": [ 1247 | "**Expected Output**: \n", 1248 | "\n", 1249 | " \n", 1250 | " \n", 1251 | " \n", 1254 | " \n", 1257 | " \n", 1258 | "\n", 1259 | "
\n", 1252 | " **cost**\n", 1253 | " \n", 1255 | " Tensor(\"Mean:0\", shape=(), dtype=float32)\n", 1256 | "
" 1260 | ] 1261 | }, 1262 | { 1263 | "cell_type": "markdown", 1264 | "metadata": {}, 1265 | "source": [ 1266 | "### 2.5 - Backward propagation & parameter updates\n", 1267 | "\n", 1268 | "所有的反向传播和参数更新都在1行代码中处理。\n", 1269 | "\n", 1270 | "计算成本函数后。你将创建一个\"`optimizer`\"对象. 在运行tf.session时,你必须调用这个对象和成本函数. 当被调用时,它将根据所选择的方法和学习率对给定的成本进行优化。\n", 1271 | "\n", 1272 | "例如,对于梯度下降,优化器将是:\n", 1273 | "```python\n", 1274 | "optimizer = tf.train.GradientDescentOptimizer(learning_rate = learning_rate).minimize(cost)\n", 1275 | "```\n", 1276 | "\n", 1277 | "要进行优化,您可以这样做:\n", 1278 | "```python\n", 1279 | "_ , c = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y})\n", 1280 | "```\n", 1281 | "\n", 1282 | "This computes the backpropagation by passing through the tensorflow graph in the reverse order. From cost to inputs.\n", 1283 | "\n", 1284 | "**Note** 在编码时,我们经常使用“一次性”变量 下滑线 `_` 来存储我们稍后不需要使用的值。 `_` takes on the evaluated value of `optimizer`, which we don't need (and `c` takes the value of the `cost` variable). " 1285 | ] 1286 | }, 1287 | { 1288 | "cell_type": "markdown", 1289 | "metadata": {}, 1290 | "source": [ 1291 | "### 2.6 - Building the model\n", 1292 | "\n", 1293 | "Now, you will bring it all together! \n", 1294 | "\n", 1295 | "**Exercise:** 完成模型。你将会调用你之前已经实现的功能。" 1296 | ] 1297 | }, 1298 | { 1299 | "cell_type": "code", 1300 | "execution_count": null, 1301 | "metadata": { 1302 | "collapsed": true 1303 | }, 1304 | "outputs": [], 1305 | "source": [ 1306 | "def model(X_train, Y_train, X_test, Y_test, learning_rate = 0.0001,\n", 1307 | " num_epochs = 1500, minibatch_size = 32, print_cost = True):\n", 1308 | " \"\"\"\n", 1309 | " Implements a three-layer tensorflow neural network: LINEAR->RELU->LINEAR->RELU->LINEAR->SOFTMAX.\n", 1310 | " \n", 1311 | " Arguments:\n", 1312 | " X_train -- training set, of shape (input size = 12288, number of training examples = 1080)\n", 1313 | " Y_train -- test set, of shape (output size = 6, number of training examples = 1080)\n", 1314 | " X_test -- training set, of shape (input size = 12288, number of training examples = 120)\n", 1315 | " Y_test -- test set, of shape (output size = 6, number of test examples = 120)\n", 1316 | " learning_rate -- learning rate of the optimization\n", 1317 | " num_epochs -- number of epochs of the optimization loop\n", 1318 | " minibatch_size -- size of a minibatch\n", 1319 | " print_cost -- True to print the cost every 100 epochs\n", 1320 | " \n", 1321 | " Returns:\n", 1322 | " parameters -- parameters learnt by the model. 
They can then be used to predict.\n", 1323 | " \"\"\"\n", 1324 | " \n", 1325 | " ops.reset_default_graph() # to be able to rerun the model without overwriting tf variables\n", 1326 | " tf.set_random_seed(1) # to keep consistent results\n", 1327 | " seed = 3 # to keep consistent results\n", 1328 | " (n_x, m) = X_train.shape # (n_x: input size, m : number of examples in the train set)\n", 1329 | " n_y = Y_train.shape[0] # n_y : output size\n", 1330 | " costs = [] # To keep track of the cost\n", 1331 | " \n", 1332 | " # Create Placeholders of shape (n_x, n_y)\n", 1333 | " ### START CODE HERE ### (1 line)\n", 1334 | " X, Y = create_placeholders(n_x, n_y)\n", 1335 | " ### END CODE HERE ###\n", 1336 | "\n", 1337 | " # Initialize parameters\n", 1338 | " ### START CODE HERE ### (1 line)\n", 1339 | " parameters = initialize_parameters()\n", 1340 | " ### END CODE HERE ###\n", 1341 | " \n", 1342 | " # Forward propagation: Build the forward propagation in the tensorflow graph\n", 1343 | " ### START CODE HERE ### (1 line)\n", 1344 | " Z3 = forward_propagation(X, parameters)\n", 1345 | " ### END CODE HERE ###\n", 1346 | " \n", 1347 | " # Cost function: Add cost function to tensorflow graph\n", 1348 | " ### START CODE HERE ### (1 line)\n", 1349 | " cost = compute_cost(Z3, Y)\n", 1350 | " ### END CODE HERE ###\n", 1351 | " \n", 1352 | " # Backpropagation: Define the tensorflow optimizer. Use an AdamOptimizer.\n", 1353 | " ### START CODE HERE ### (1 line)\n", 1354 | " optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)\n", 1355 | " ### END CODE HERE ###\n", 1356 | " \n", 1357 | " # Initialize all the variables\n", 1358 | " init = tf.global_variables_initializer()\n", 1359 | "\n", 1360 | " # Start the session to compute the tensorflow graph\n", 1361 | " with tf.Session() as sess:\n", 1362 | " \n", 1363 | " # Run the initialization\n", 1364 | " sess.run(init)\n", 1365 | " \n", 1366 | " # Do the training loop\n", 1367 | " for epoch in range(num_epochs):\n", 1368 | "\n", 1369 | " epoch_cost = 0. 
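# Defines a cost related to an epoch\n", "            # (An epoch is one full pass over the training set, processed as m/minibatch_size\n", "            # minibatches; the minibatch costs are averaged below into the epoch cost.)\n",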
1370 | "            num_minibatches = int(m / minibatch_size) # number of minibatches of size minibatch_size in the train set\n", 1371 | "            seed = seed + 1\n", 1372 | "            minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed)\n", 1373 | "\n", 1374 | "            for minibatch in minibatches:\n", 1375 | "\n", 1376 | "                # Select a minibatch\n", 1377 | "                (minibatch_X, minibatch_Y) = minibatch\n", 1378 | "                \n", 1379 | "                # IMPORTANT: The line that runs the graph on a minibatch.\n", 1380 | "                # Run the session to execute the \"optimizer\" and the \"cost\"; the feed_dict should contain a minibatch for (X,Y).\n", 1381 | "                ### START CODE HERE ### (1 line)\n", 1382 | "                _ , minibatch_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y})\n", 1383 | "                ### END CODE HERE ###\n", 1384 | "                \n", 1385 | "                epoch_cost += minibatch_cost / num_minibatches\n", 1386 | "\n", 1387 | "            # Print the cost every 100 epochs\n", 1388 | "            if print_cost == True and epoch % 100 == 0:\n", 1389 | "                print (\"Cost after epoch %i: %f\" % (epoch, epoch_cost))\n", 1390 | "            if print_cost == True and epoch % 5 == 0:\n", 1391 | "                costs.append(epoch_cost)\n", 1392 | "        \n", 1393 | "        # plot the cost\n", 1394 | "        plt.plot(np.squeeze(costs))\n", 1395 | "        plt.ylabel('cost')\n", 1396 | "        plt.xlabel('iterations (per tens)')\n", 1397 | "        plt.title(\"Learning rate =\" + str(learning_rate))\n", 1398 | "        plt.show()\n", 1399 | "\n", 1400 | "        # let's save the parameters in a variable\n", 1401 | "        parameters = sess.run(parameters)\n", 1402 | "        print (\"Parameters have been trained!\")\n", 1403 | "\n", 1404 | "        # Calculate the correct predictions\n", 1405 | "        correct_prediction = tf.equal(tf.argmax(Z3), tf.argmax(Y))\n", 1406 | "\n", 1407 | "        # Calculate accuracy on the test set\n", 1408 | "        accuracy = tf.reduce_mean(tf.cast(correct_prediction, \"float\"))\n", 1409 | "\n", 1410 | "        print (\"Train Accuracy:\", accuracy.eval({X: X_train, Y: Y_train}))\n", 1411 | "        print (\"Test Accuracy:\", accuracy.eval({X: X_test, Y: Y_test}))\n", 1412 | "        \n", 1413 | "        return parameters" 1414 | ] 1415 | }, 1416 | { 1417 | "cell_type": "markdown", 1418 | "metadata": { 1419 | "collapsed": true 1420 | }, 1421 | "source": [ 1422 | "Run the cell below to train your model! On our machine it takes about 5 minutes. Your \"Cost after epoch 100\" should be 1.016458. If it isn't, don't waste time; interrupt the training by clicking on the square (⬛) in the upper bar of the notebook, and try to correct your code. If the cost is correct, take a break and come back in 5 minutes!" 1423 | ] 1424 | }, 1425 | { 1426 | "cell_type": "code", 1427 | "execution_count": null, 1428 | "metadata": { 1429 | "collapsed": true, 1430 | "scrolled": false 1431 | }, 1432 | "outputs": [], 1433 | "source": [ 1434 | "parameters = model(X_train, Y_train, X_test, Y_test)" 1435 | ] 1436 | }, 1437 | { 1438 | "cell_type": "markdown", 1439 | "metadata": {}, 1440 | "source": [ 1441 | "**Expected Output**:\n", 1442 | "\n", 1443 | " \n", 1444 | " \n", 1445 | "
\n", 1446 | " **Train Accuracy**\n", 1447 | " \n", 1449 | " 0.999074\n", 1450 | "
\n", 1454 | " **Test Accuracy**\n", 1455 | " \n", 1457 | " 0.716667\n", 1458 | "
\n", 1462 | "\n", 1463 | "太令人惊讶了,您的算法可以识别代表0到5之间的数字的符号,准确率为71.7%。\n", 1464 | "\n", 1465 | "**Insights**:\n", 1466 | "- 你的模型似乎足够大去拟合训练集。但是,考虑到训练集和测试集精确度之间的差异,您可以尝试添加 L2 或 dropout 正则化去减少过拟合。\n", 1467 | "- 把会话视为一组训练模型的代码。每次在小批量集上运行会话时,都会训练参数。In total you have run the session a large number of times (1500 epochs) until you obtained well trained parameters." 1468 | ] 1469 | }, 1470 | { 1471 | "cell_type": "markdown", 1472 | "metadata": {}, 1473 | "source": [ 1474 | "### 2.7 - Test with your own image (optional / ungraded exercise)\n", 1475 | "\n", 1476 | "祝贺您完成这项任务。您现在可以拍摄您的手的照片,并查看模型的输出:\n", 1477 | " 1. Click on \"File\" in the upper bar of this notebook, then click \"Open\" to go on your Coursera Hub.\n", 1478 | " 2. Add your image to this Jupyter Notebook's directory, in the \"images\" folder\n", 1479 | " 3. Write your image's name in the following code\n", 1480 | " 4. Run the code and check if the algorithm is right!" 1481 | ] 1482 | }, 1483 | { 1484 | "cell_type": "code", 1485 | "execution_count": null, 1486 | "metadata": { 1487 | "collapsed": true, 1488 | "scrolled": true 1489 | }, 1490 | "outputs": [], 1491 | "source": [ 1492 | "import scipy\n", 1493 | "from PIL import Image\n", 1494 | "from scipy import ndimage\n", 1495 | "\n", 1496 | "## START CODE HERE ## (PUT YOUR IMAGE NAME) \n", 1497 | "my_image = \"thumbs_up.jpg\"\n", 1498 | "## END CODE HERE ##\n", 1499 | "\n", 1500 | "# We preprocess your image to fit your algorithm.\n", 1501 | "fname = \"images/\" + my_image\n", 1502 | "image = np.array(ndimage.imread(fname, flatten=False))\n", 1503 | "my_image = scipy.misc.imresize(image, size=(64,64)).reshape((1, 64*64*3)).T\n", 1504 | "my_image_prediction = predict(my_image, parameters)\n", 1505 | "\n", 1506 | "plt.imshow(image)\n", 1507 | "print(\"Your algorithm predicts: y = \" + str(np.squeeze(my_image_prediction)))" 1508 | ] 1509 | }, 1510 | { 1511 | "cell_type": "markdown", 1512 | "metadata": {}, 1513 | "source": [ 1514 | "You indeed deserved a \"thumbs-up\" although as you can see the algorithm seems to classify it incorrectly. 原因是训练集不包含任何“竖起大拇指”的图像,所以模型不知道该如何处理!我们称这个为 \"mismatched data distribution\", 是下一个课程 \"Structuring Machine Learning Projects\" 要讲解的内容之一." 1515 | ] 1516 | }, 1517 | { 1518 | "cell_type": "markdown", 1519 | "metadata": { 1520 | "collapsed": true 1521 | }, 1522 | "source": [ 1523 | "\n", 1524 | "**What you should remember**:\n", 1525 | "- 在深度学习中 Tensorflow 是一个编程框架\n", 1526 | "- tensorflow 有两个很重要的对象类 Tensors 和 Operators. \n", 1527 | "- 当你用 tensorflow 编码时,你的编码步骤必须是:\n", 1528 | " - 创建一个包含 Tensors (Variables, Placeholders ...) 
1534 | ] 1535 | } 1536 | ], 1537 | "metadata": { 1538 | "anaconda-cloud": {}, 1539 | "coursera": { 1540 | "course_slug": "deep-neural-network", 1541 | "graded_item_id": "BFd89", 1542 | "launcher_item_id": "AH2rK" 1543 | }, 1544 | "kernelspec": { 1545 | "display_name": "Python [default]", 1546 | "language": "python", 1547 | "name": "python3" 1548 | }, 1549 | "language_info": { 1550 | "codemirror_mode": { 1551 | "name": "ipython", 1552 | "version": 3 1553 | }, 1554 | "file_extension": ".py", 1555 | "mimetype": "text/x-python", 1556 | "name": "python", 1557 | "nbconvert_exporter": "python", 1558 | "pygments_lexer": "ipython3", 1559 | "version": "3.5.2" 1560 | } 1561 | }, 1562 | "nbformat": 4, 1563 | "nbformat_minor": 1 1564 | } 1565 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ## Gesture digit recognition, implemented in pytorch and tensorflow 2 | The gesture digit dataset, in h5 format, is under datasets 3 | 1. tensorflow.ipynb is the tensorflow implementation 4 | > 5 | 2. The pytorch folder contains the pytorch implementation 6 | > 7 | python train.py ## run this to start training 8 | ![Example](images/hands.png) 9 | -------------------------------------------------------------------------------- /__pycache__/tf_utils.cpython-35.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/idotc/Gesture-digit-recognition/5b9081c30abde8acfaf392ae6db2a1fdbf1d3182/__pycache__/tf_utils.cpython-35.pyc -------------------------------------------------------------------------------- /datasets/test_signs.h5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/idotc/Gesture-digit-recognition/5b9081c30abde8acfaf392ae6db2a1fdbf1d3182/datasets/test_signs.h5 -------------------------------------------------------------------------------- /datasets/train_signs.h5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/idotc/Gesture-digit-recognition/5b9081c30abde8acfaf392ae6db2a1fdbf1d3182/datasets/train_signs.h5 -------------------------------------------------------------------------------- /images/hands.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/idotc/Gesture-digit-recognition/5b9081c30abde8acfaf392ae6db2a1fdbf1d3182/images/hands.png -------------------------------------------------------------------------------- /images/onehot.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/idotc/Gesture-digit-recognition/5b9081c30abde8acfaf392ae6db2a1fdbf1d3182/images/onehot.png -------------------------------------------------------------------------------- /images/thumbs_up.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/idotc/Gesture-digit-recognition/5b9081c30abde8acfaf392ae6db2a1fdbf1d3182/images/thumbs_up.jpg -------------------------------------------------------------------------------- /improv_utils.py: -------------------------------------------------------------------------------- 1 | import h5py 2 | import numpy as np 3 | import tensorflow as tf 4 | import math 5 | import matplotlib.pyplot as plt   # needed below by model() for the cost plot 6 | def load_dataset(): 7 |     train_dataset = 
h5py.File('datasets/train_signs.h5', "r") 8 |     train_set_x_orig = np.array(train_dataset["train_set_x"][:]) # your train set features 9 |     train_set_y_orig = np.array(train_dataset["train_set_y"][:]) # your train set labels 10 | 11 |     test_dataset = h5py.File('datasets/test_signs.h5', "r") 12 |     test_set_x_orig = np.array(test_dataset["test_set_x"][:]) # your test set features 13 |     test_set_y_orig = np.array(test_dataset["test_set_y"][:]) # your test set labels 14 | 15 |     classes = np.array(test_dataset["list_classes"][:]) # the list of classes 16 | 17 |     train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0])) 18 |     test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0])) 19 | 20 |     return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes 21 | 22 | 23 | def random_mini_batches(X, Y, mini_batch_size = 64, seed = 0): 24 |     """ 25 |     Creates a list of random minibatches from (X, Y) 26 | 27 |     Arguments: 28 |     X -- input data, of shape (input size, number of examples) 29 |     Y -- true "label" vector (here: one-hot labels), of shape (number of classes, number of examples) 30 |     mini_batch_size -- size of the mini-batches, integer 31 |     seed -- this is only for the purpose of grading, so that your "random" minibatches are the same as ours. 32 | 33 |     Returns: 34 |     mini_batches -- list of synchronous (mini_batch_X, mini_batch_Y) 35 |     """ 36 | 37 |     m = X.shape[1]                  # number of training examples 38 |     mini_batches = [] 39 |     np.random.seed(seed) 40 | 41 |     # Step 1: Shuffle (X, Y) 42 |     permutation = list(np.random.permutation(m)) 43 |     shuffled_X = X[:, permutation] 44 |     shuffled_Y = Y[:, permutation].reshape((Y.shape[0],m)) 45 | 46 |     # Step 2: Partition (shuffled_X, shuffled_Y). Minus the end case. 47 |     num_complete_minibatches = math.floor(m/mini_batch_size) # number of mini batches of size mini_batch_size in your partitioning 48 |     for k in range(0, num_complete_minibatches): 49 |         mini_batch_X = shuffled_X[:, k * mini_batch_size : k * mini_batch_size + mini_batch_size] 50 |         mini_batch_Y = shuffled_Y[:, k * mini_batch_size : k * mini_batch_size + mini_batch_size] 51 |         mini_batch = (mini_batch_X, mini_batch_Y) 52 |         mini_batches.append(mini_batch) 53 | 54 |     # Handling the end case (last mini-batch < mini_batch_size) 55 |     if m % mini_batch_size != 0: 56 |         mini_batch_X = shuffled_X[:, num_complete_minibatches * mini_batch_size : m] 57 |         mini_batch_Y = shuffled_Y[:, num_complete_minibatches * mini_batch_size : m] 58 |         mini_batch = (mini_batch_X, mini_batch_Y) 59 |         mini_batches.append(mini_batch) 60 | 61 |     return mini_batches 62 | 63 | def convert_to_one_hot(Y, C): 64 |     Y = np.eye(C)[Y.reshape(-1)].T 65 |     return Y 66 | 67 | def predict(X, parameters): 68 | 69 |     W1 = tf.convert_to_tensor(parameters["W1"]) 70 |     b1 = tf.convert_to_tensor(parameters["b1"]) 71 |     W2 = tf.convert_to_tensor(parameters["W2"]) 72 |     b2 = tf.convert_to_tensor(parameters["b2"]) 73 |     W3 = tf.convert_to_tensor(parameters["W3"]) 74 |     b3 = tf.convert_to_tensor(parameters["b3"]) 75 | 76 |     params = {"W1": W1, 77 |               "b1": b1, 78 |               "W2": W2, 79 |               "b2": b2, 80 |               "W3": W3, 81 |               "b3": b3} 82 | 83 |     x = tf.placeholder("float", [12288, 1]) 84 | 85 |     z3 = forward_propagation(x, params) 86 |     p = tf.argmax(z3) 87 | 88 |     with tf.Session() as sess: 89 |         prediction = sess.run(p, feed_dict = {x: X}) 90 | 91 |     return prediction 92 | 
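# Example use of predict() (a sketch, assuming `parameters` holds the trained weights
# returned by model() and X_test is the flattened, normalized (12288, m) test matrix):
#     one_image = X_test[:, 0:1]               # a single image as a (12288, 1) column
#     print(predict(one_image, parameters))    # prints the predicted class, an int in 0..5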
93 | 94 | def create_placeholders(n_x, n_y): 95 |     """ 96 |     Creates the placeholders for the tensorflow session. 97 | 98 |     Arguments: 99 |     n_x -- scalar, size of an image vector (num_px * num_px = 64 * 64 * 3 = 12288) 100 |     n_y -- scalar, number of classes (from 0 to 5, so -> 6) 101 | 102 |     Returns: 103 |     X -- placeholder for the data input, of shape [n_x, None] and dtype "float" 104 |     Y -- placeholder for the input labels, of shape [n_y, None] and dtype "float" 105 | 106 |     Tips: 107 |     - You will use None because it lets us be flexible on the number of examples you will use for the placeholders. 108 |       In fact, the number of examples during test/train is different. 109 |     """ 110 | 111 |     ### START CODE HERE ### (approx. 2 lines) 112 |     X = tf.placeholder("float", [n_x, None]) 113 |     Y = tf.placeholder("float", [n_y, None]) 114 |     ### END CODE HERE ### 115 | 116 |     return X, Y 117 | 118 | 119 | def initialize_parameters(): 120 |     """ 121 |     Initializes parameters to build a neural network with tensorflow. The shapes are: 122 |                         W1 : [25, 12288] 123 |                         b1 : [25, 1] 124 |                         W2 : [12, 25] 125 |                         b2 : [12, 1] 126 |                         W3 : [6, 12] 127 |                         b3 : [6, 1] 128 | 129 |     Returns: 130 |     parameters -- a dictionary of tensors containing W1, b1, W2, b2, W3, b3 131 |     """ 132 | 133 |     tf.set_random_seed(1)                   # so that your "random" numbers match ours 134 | 135 |     ### START CODE HERE ### (approx. 6 lines of code) 136 |     W1 = tf.get_variable("W1", [25,12288], initializer = tf.contrib.layers.xavier_initializer(seed = 1)) 137 |     b1 = tf.get_variable("b1", [25,1], initializer = tf.zeros_initializer()) 138 |     W2 = tf.get_variable("W2", [12,25], initializer = tf.contrib.layers.xavier_initializer(seed = 1)) 139 |     b2 = tf.get_variable("b2", [12,1], initializer = tf.zeros_initializer()) 140 |     W3 = tf.get_variable("W3", [6,12], initializer = tf.contrib.layers.xavier_initializer(seed = 1)) 141 |     b3 = tf.get_variable("b3", [6,1], initializer = tf.zeros_initializer()) 142 |     ### END CODE HERE ### 143 | 144 |     parameters = {"W1": W1, 145 |                   "b1": b1, 146 |                   "W2": W2, 147 |                   "b2": b2, 148 |                   "W3": W3, 149 |                   "b3": b3} 150 | 151 |     return parameters 152 | 153 | 154 | def compute_cost(z3, Y): 155 |     """ 156 |     Computes the cost 157 | 158 |     Arguments: 159 |     z3 -- output of forward propagation (output of the last LINEAR unit), of shape (6, number of examples) 160 |     Y -- "true" labels vector placeholder, same shape as z3 161 | 162 |     Returns: 163 |     cost - Tensor of the cost function 164 |     """ 165 | 166 |     # to fit the tensorflow requirement for tf.nn.softmax_cross_entropy_with_logits() 167 |     logits = tf.transpose(z3) 168 |     labels = tf.transpose(Y) 169 | 170 |     ### START CODE HERE ### (1 line of code) 171 |     cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = logits, labels = labels)) 172 |     ### END CODE HERE ### 173 | 174 |     return cost 175 | 176 | def forward_propagation(X, parameters):    # LINEAR -> RELU -> LINEAR -> RELU -> LINEAR, as in the notebook; needed by predict() and model() 177 |     W1, b1, W2, b2, W3, b3 = (parameters[k] for k in ("W1", "b1", "W2", "b2", "W3", "b3")) 178 |     A1 = tf.nn.relu(tf.add(tf.matmul(W1, X), b1)) 179 |     A2 = tf.nn.relu(tf.add(tf.matmul(W2, A1), b2)) 180 |     return tf.add(tf.matmul(W3, A2), b3)   # z3, the output of the last LINEAR unit 181 | 182 | def model(X_train, Y_train, X_test, Y_test, learning_rate = 0.0001, 183 |           num_epochs = 1500, minibatch_size = 32, print_cost = True): 184 |     """ 185 |     Implements a three-layer tensorflow neural network: LINEAR->RELU->LINEAR->RELU->LINEAR->SOFTMAX. 
186 | 187 |     Arguments: 188 |     X_train -- training set, of shape (input size = 12288, number of training examples = 1080) 189 |     Y_train -- training labels, of shape (output size = 6, number of training examples = 1080) 190 |     X_test -- test set, of shape (input size = 12288, number of test examples = 120) 191 |     Y_test -- test labels, of shape (output size = 6, number of test examples = 120) 192 |     learning_rate -- learning rate of the optimization 193 |     num_epochs -- number of epochs of the optimization loop 194 |     minibatch_size -- size of a minibatch 195 |     print_cost -- True to print the cost every 100 epochs 196 | 197 |     Returns: 198 |     parameters -- parameters learnt by the model. They can then be used to predict. 199 |     """ 200 | 201 |     tf.reset_default_graph()                          # to be able to rerun the model without overwriting tf variables 202 |     tf.set_random_seed(1)                             # to keep consistent results 203 |     seed = 3                                          # to keep consistent results 204 |     (n_x, m) = X_train.shape                          # (n_x: input size, m: number of examples in the train set) 205 |     n_y = Y_train.shape[0]                            # n_y: output size 206 |     costs = []                                        # To keep track of the cost 207 | 208 |     # Create Placeholders of shape (n_x, n_y) 209 |     ### START CODE HERE ### (1 line) 210 |     X, Y = create_placeholders(n_x, n_y) 211 |     ### END CODE HERE ### 212 | 213 |     # Initialize parameters 214 |     ### START CODE HERE ### (1 line) 215 |     parameters = initialize_parameters() 216 |     ### END CODE HERE ### 217 | 218 |     # Forward propagation: Build the forward propagation in the tensorflow graph 219 |     ### START CODE HERE ### (1 line) 220 |     z3 = forward_propagation(X, parameters) 221 |     ### END CODE HERE ### 222 | 223 |     # Cost function: Add cost function to tensorflow graph 224 |     ### START CODE HERE ### (1 line) 225 |     cost = compute_cost(z3, Y) 226 |     ### END CODE HERE ### 227 | 228 |     # Backpropagation: Define the tensorflow optimizer. Use an AdamOptimizer. 229 |     ### START CODE HERE ### (1 line) 230 |     optimizer = tf.train.AdamOptimizer(learning_rate = learning_rate).minimize(cost) 231 |     ### END CODE HERE ### 232 | 233 |     # Initialize all the variables 234 |     init = tf.global_variables_initializer() 235 | 236 |     # Start the session to compute the tensorflow graph 237 |     with tf.Session() as sess: 238 | 239 |         # Run the initialization 240 |         sess.run(init) 241 | 242 |         # Do the training loop 243 |         for epoch in range(num_epochs): 244 | 245 |             minibatch_cost = 0. 246 |             num_minibatches = int(m / minibatch_size) # number of minibatches of size minibatch_size in the train set 247 |             seed = seed + 1 248 |             minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed) 249 | 250 |             for minibatch in minibatches: 251 | 252 |                 # Select a minibatch 253 |                 (minibatch_X, minibatch_Y) = minibatch 254 | 255 |                 # IMPORTANT: The line that runs the graph on a minibatch. 256 |                 # Run the session to execute the optimizer and the cost; the feed_dict should contain a minibatch for (X,Y).
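                # (sess.run returns one value per fetched node: the optimizer op evaluates to
                # None, since its effect is the in-place parameter update, while cost evaluates
                # to the scalar minibatch loss; hence the `_ , temp_cost` unpacking below.)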
257 |                 ### START CODE HERE ### (1 line) 258 |                 _ , temp_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y}) 259 |                 ### END CODE HERE ### 260 | 261 |                 minibatch_cost += temp_cost / num_minibatches 262 | 263 |             # Print the cost every 100 epochs 264 |             if print_cost == True and epoch % 100 == 0: 265 |                 print ("Cost after epoch %i: %f" % (epoch, minibatch_cost)) 266 |             if print_cost == True and epoch % 5 == 0: 267 |                 costs.append(minibatch_cost) 268 | 269 |         # plot the cost 270 |         plt.plot(np.squeeze(costs)) 271 |         plt.ylabel('cost') 272 |         plt.xlabel('iterations (per tens)') 273 |         plt.title("Learning rate =" + str(learning_rate)) 274 |         plt.show() 275 | 276 |         # let's save the parameters in a variable 277 |         parameters = sess.run(parameters) 278 |         print ("Parameters have been trained!") 279 | 280 |         # Calculate the correct predictions 281 |         correct_prediction = tf.equal(tf.argmax(z3), tf.argmax(Y)) 282 | 283 |         # Calculate accuracy on the test set 284 |         accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float")) 285 | 286 |         print ("Train Accuracy:", accuracy.eval({X: X_train, Y: Y_train})) 287 |         print ("Test Accuracy:", accuracy.eval({X: X_test, Y: Y_test})) 288 | 289 |         return parameters -------------------------------------------------------------------------------- /pytorch/__pycache__/net.cpython-35.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/idotc/Gesture-digit-recognition/5b9081c30abde8acfaf392ae6db2a1fdbf1d3182/pytorch/__pycache__/net.cpython-35.pyc -------------------------------------------------------------------------------- /pytorch/net.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | 5 | class Net(torch.nn.Module):  # fully-connected net with two ReLU hidden layers; train.py instantiates Net(12288, 25, 16, 6) 6 |     def __init__(self, n_feature, n_hidden1, n_hidden2, n_output): 7 |         super(Net, self).__init__() 8 |         self.hidden1 = torch.nn.Linear(n_feature, n_hidden1) 9 |         self.hidden2 = torch.nn.Linear(n_hidden1, n_hidden2) 10 |         self.predict = torch.nn.Linear(n_hidden2, n_output) 11 |         self.relu = torch.nn.ReLU() 12 | 13 |     def forward(self, x): 14 |         out = self.hidden1(x) 15 |         out = self.relu(out) 16 |         out = self.hidden2(out) 17 |         out = self.relu(out) 18 |         out = self.predict(out) 19 |         return out 20 | -------------------------------------------------------------------------------- /pytorch/train.py: -------------------------------------------------------------------------------- 1 | import h5py 2 | import numpy as np 3 | import math 4 | import torch 5 | import torch.nn as nn 6 | from torch.autograd import Variable 7 | from net import Net 8 | import matplotlib.pyplot as plt 9 | 10 | def load_dataset(): 11 |     train_dataset = h5py.File('../datasets/train_signs.h5',"r") 12 |     train_set_x_orig = np.array(train_dataset["train_set_x"]) 13 |     train_set_y_orig = np.array(train_dataset["train_set_y"],dtype='int') 14 | 15 |     test_dataset = h5py.File('../datasets/test_signs.h5', "r") 16 |     test_set_x_orig = np.array(test_dataset["test_set_x"]) 17 |     test_set_y_orig = np.array(test_dataset["test_set_y"],dtype='int') 18 |     print (train_set_y_orig.shape) 19 |     print (test_set_y_orig.shape) 20 | 21 |     classes = np.array(test_dataset["list_classes"]) 22 | 23 |     train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0])) 24 |     test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0])) 25 |     print (train_set_y_orig.shape) 26 |     print (test_set_y_orig.shape) 27 |     return train_set_x_orig, train_set_y_orig, 
test_set_x_orig, test_set_y_orig, classes 28 | 29 | def random_mini_batches(X, Y, mini_batch_size = 64, seed = 0): 30 |     m = X.shape[1]   # number of training examples 31 |     mini_batches = [] 32 |     np.random.seed(seed) 33 | 34 |     # Step 1 : Shuffle (X, Y) 35 |     permutation = list(np.random.permutation(m)) 36 |     Shuffle_X = X[:, permutation] 37 |     Shuffle_Y = Y[:, permutation] 38 | 39 |     # Step 2 : Partition 40 |     num_complete_minibatches = math.floor(m / mini_batch_size) 41 |     for k in range(0, num_complete_minibatches): 42 |         mini_batch_X = Shuffle_X[:, k*mini_batch_size : (k+1)*mini_batch_size] 43 |         mini_batch_Y = Shuffle_Y[:, k*mini_batch_size : (k+1)*mini_batch_size] 44 |         mini_batch = (mini_batch_X, mini_batch_Y) 45 |         mini_batches.append(mini_batch) 46 | 47 |     if m % mini_batch_size != 0:   # handle the remaining examples (last minibatch < mini_batch_size) 48 |         mini_batch_X = Shuffle_X[:, num_complete_minibatches * mini_batch_size : m] 49 |         mini_batch_Y = Shuffle_Y[:, num_complete_minibatches * mini_batch_size : m] 50 |         mini_batch = (mini_batch_X, mini_batch_Y) 51 |         mini_batches.append(mini_batch) 52 | 53 |     return mini_batches 54 | 55 | def convert_to_one_hot(Y, C): 56 |     Y = np.eye(C)[Y.reshape(-1)].T 57 |     return Y 58 | 59 | def data_process(): 60 |     X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset() 61 |     X_train_flatten = X_train_orig.reshape((X_train_orig.shape[0],-1)).T 62 |     X_test_flatten = X_test_orig.reshape((X_test_orig.shape[0], -1)).T 63 |     print (X_train_flatten.shape) 64 |     print(X_test_flatten.shape) 65 |     X_train = X_train_flatten / 255 66 |     X_test = X_test_flatten / 255 67 |     C = (len(classes)) 68 |     Y_train = convert_to_one_hot(Y_train_orig, C) 69 |     Y_test = convert_to_one_hot(Y_test_orig, C) 70 |     return X_train, Y_train, X_test, Y_test 71 | 72 | def train(learning_rate = 0.0001, nums_epoch = 1500, minibatch_size = 32, print_cost = True): 73 |     seed = 3 74 |     X_train, Y_train, X_test, Y_test = data_process() 75 |     (n_x, m) = X_train.shape 76 |     n_y = Y_train.shape[0] 77 |     costs = [] 78 | 79 |     net = Net(12288, 25, 16, 6) 80 |     print (net) 81 | 82 |     optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate) 83 |     #loss_func = nn.CrossEntropyLoss() 84 |     loss_func = nn.MultiLabelSoftMarginLoss() 85 | 86 |     for epoch in range(nums_epoch): 87 |         epoch_cost = 0 88 |         epoch_accuracy = 0 89 |         num_batches = int(m / minibatch_size) 90 |         seed = seed + 1 91 |         minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed) 92 | 93 |         for minibatch in minibatches: 94 |             (minibatchX, minibatchY) = minibatch 95 |             minibatchX = minibatchX.astype(np.float32).T 96 |             minibatchY = minibatchY.astype(np.float32).T 97 |             b_x = Variable(torch.from_numpy(minibatchX)) 98 |             b_y = Variable(torch.from_numpy(minibatchY)) 99 | 100 |             output = net(b_x) 101 |             minibatch_cost = loss_func(output, b_y) 102 |             optimizer.zero_grad() 103 |             minibatch_cost.backward() 104 |             optimizer.step() 105 |             #print(b_y) 106 |             #print(torch.max(b_y, 1)) 107 |             #print(torch.max(b_y, 1)[1].data.squeeze())
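            # torch.max(t, 1) returns a (values, indices) pair along dim 1, so torch.max(..., 1)[1]
            # is the arg-max class index per example, both for the logits and for the one-hot labels;
            # comparing the two index vectors and summing counts the correct predictions.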
after epoch %i: %f" % (epoch, epoch_accracy)) 115 | X_test_tensor = X_test.astype(np.float32).T 116 | Y_test_tensor = Y_test.astype(np.float32).T 117 | X_test_tensor = Variable(torch.from_numpy(X_test_tensor)) 118 | Y_test_tensor = Variable(torch.from_numpy(Y_test_tensor)) 119 | test_output = net(X_test_tensor) 120 | correct_prediction = sum(torch.max(test_output, 1)[1].data.squeeze() == torch.max(Y_test_tensor, 1)[1].data.squeeze()) 121 | #correct_prediction = sum(torch.max(test_output) == torch.max(Y_test_tensor)) 122 | correct_prediction = correct_prediction / test_output.size(0) 123 | print ("Traing Acc. after epoch %i: %f" % (epoch, correct_prediction)) 124 | 125 | if print_cost == True and epoch%5 == 0: 126 | costs.append(epoch_cost) 127 | 128 | plt.plot(np.squeeze(costs)) 129 | plt.ylabel('cost') 130 | plt.xlabel('iterations (per tens)') 131 | plt.title("Learning rate = " + str(learning_rate)) 132 | plt.show() 133 | 134 | 135 | if __name__ == "__main__": 136 | train() 137 | #load_dataset() 138 | #random_mini_batches([2,2,3,1,4,5,6,1],[1,2,3,4,5,6,7,4],mini_batch_size = 3) 139 | -------------------------------------------------------------------------------- /tensorflow.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "metadata": { 3 | "name": "", 4 | "signature": "sha256:a4dc62943a0d749391085bce814022a8115cdb8a3f8ddf8f13489c9ab5caae2b" 5 | }, 6 | "nbformat": 3, 7 | "nbformat_minor": 0, 8 | "worksheets": [ 9 | { 10 | "cells": [ 11 | { 12 | "cell_type": "code", 13 | "collapsed": false, 14 | "input": [ 15 | "import math\n", 16 | "import numpy as np\n", 17 | "import h5py\n", 18 | "import matplotlib.pyplot as plt\n", 19 | "import tensorflow as tf\n", 20 | "from tensorflow.python.framework import ops\n", 21 | "from tf_utils import load_dataset, random_mini_batches, convert_to_one_hot, predict\n", 22 | "\n", 23 | "%matplotlib inline\n", 24 | "np.random.seed(1)" 25 | ], 26 | "language": "python", 27 | "metadata": {}, 28 | "outputs": [], 29 | "prompt_number": 1 30 | }, 31 | { 32 | "cell_type": "code", 33 | "collapsed": false, 34 | "input": [ 35 | "def Liner_fuction():\n", 36 | " np.random.seed(1)\n", 37 | " X = tf.constant(np.random.randn(3,1),name = 'X')\n", 38 | " W = tf.constant(np.random.randn(4,3), name = 'W')\n", 39 | " b = tf.constant(np.random.randn(4,1),name = 'b')\n", 40 | " \n", 41 | " Y = tf.add(tf.matmul(W ,X),b)\n", 42 | " sess = tf.Session()\n", 43 | " result = sess.run(Y)\n", 44 | " \n", 45 | " sess.close()\n", 46 | " \n", 47 | " return result" 48 | ], 49 | "language": "python", 50 | "metadata": {}, 51 | "outputs": [], 52 | "prompt_number": 2 53 | }, 54 | { 55 | "cell_type": "code", 56 | "collapsed": false, 57 | "input": [ 58 | "print (\"result:\" + str(Liner_fuction()))" 59 | ], 60 | "language": "python", 61 | "metadata": {}, 62 | "outputs": [ 63 | { 64 | "output_type": "stream", 65 | "stream": "stdout", 66 | "text": [ 67 | "result:[[-2.15657382]\n", 68 | " [ 2.95891446]\n", 69 | " [-1.08926781]\n", 70 | " [-0.84538042]]\n" 71 | ] 72 | } 73 | ], 74 | "prompt_number": 3 75 | }, 76 | { 77 | "cell_type": "code", 78 | "collapsed": false, 79 | "input": [ 80 | "def sigmoid(z):\n", 81 | " x = tf.placeholder(tf.float32, name='x')\n", 82 | " sigmoid = tf.sigmoid(x)\n", 83 | " with tf.Session() as sess:\n", 84 | " result = sess.run(sigmoid,feed_dict={x:z})\n", 85 | " \n", 86 | " return result" 87 | ], 88 | "language": "python", 89 | "metadata": {}, 90 | "outputs": [], 91 | "prompt_number": 13 92 | }, 93 | { 94 | "cell_type": "code", 95 
| "collapsed": false, 96 | "input": [ 97 | "print (sigmoid(0))" 98 | ], 99 | "language": "python", 100 | "metadata": {}, 101 | "outputs": [ 102 | { 103 | "output_type": "stream", 104 | "stream": "stdout", 105 | "text": [ 106 | "0.5\n" 107 | ] 108 | } 109 | ], 110 | "prompt_number": 14 111 | }, 112 | { 113 | "cell_type": "code", 114 | "collapsed": false, 115 | "input": [ 116 | "def cost(logits, label):\n", 117 | " logits1 = tf. placeholder(tf.float32, name = 'logits')\n", 118 | " label1 = tf.placeholder(tf.float32, name = 'label')\n", 119 | " \n", 120 | " cost = tf.nn.sigmoid_cross_entropy_with_logits(logits = logits1, labels = label1)\n", 121 | " with tf.Session() as sess:\n", 122 | " result = sess.run(cost, feed_dict={logits1:logits,label1:label})\n", 123 | " \n", 124 | " return result" 125 | ], 126 | "language": "python", 127 | "metadata": {}, 128 | "outputs": [], 129 | "prompt_number": 24 130 | }, 131 | { 132 | "cell_type": "code", 133 | "collapsed": false, 134 | "input": [ 135 | "logits = sigmoid([0.2,0.4,0.6,0.8])\n", 136 | "label = [0 ,0, 1, 1]\n", 137 | "print (logits)\n", 138 | "print (cost(logits, label))" 139 | ], 140 | "language": "python", 141 | "metadata": {}, 142 | "outputs": [ 143 | { 144 | "output_type": "stream", 145 | "stream": "stdout", 146 | "text": [ 147 | "[ 0.54983395 0.59868765 0.64565629 0.68997449]\n", 148 | "[ 1.00538719 1.03664088 0.42154732 0.40652379]" 149 | ] 150 | }, 151 | { 152 | "output_type": "stream", 153 | "stream": "stdout", 154 | "text": [ 155 | "\n" 156 | ] 157 | } 158 | ], 159 | "prompt_number": 25 160 | }, 161 | { 162 | "cell_type": "code", 163 | "collapsed": false, 164 | "input": [ 165 | "def one_hot(labels, C):\n", 166 | " depth = tf.constant(C)\n", 167 | " one_hot = tf.one_hot(labels ,depth, axis = 0)\n", 168 | " with tf.Session() as sess:\n", 169 | " result = sess.run(one_hot)\n", 170 | " \n", 171 | " return result\n" 172 | ], 173 | "language": "python", 174 | "metadata": {}, 175 | "outputs": [], 176 | "prompt_number": 26 177 | }, 178 | { 179 | "cell_type": "code", 180 | "collapsed": false, 181 | "input": [ 182 | "labels = [1, 3, 2, 0, 3, 2]\n", 183 | "print (one_hot(labels, C=4))" 184 | ], 185 | "language": "python", 186 | "metadata": {}, 187 | "outputs": [ 188 | { 189 | "output_type": "stream", 190 | "stream": "stdout", 191 | "text": [ 192 | "[[ 0. 0. 0. 1. 0. 0.]\n", 193 | " [ 1. 0. 0. 0. 0. 0.]\n", 194 | " [ 0. 0. 1. 0. 0. 1.]\n", 195 | " [ 0. 1. 0. 0. 1. 0.]]\n" 196 | ] 197 | } 198 | ], 199 | "prompt_number": 28 200 | }, 201 | { 202 | "cell_type": "code", 203 | "collapsed": false, 204 | "input": [ 205 | "with tf.Session() as sess:\n", 206 | " print (sess.run(tf.ones([3,2,4])))" 207 | ], 208 | "language": "python", 209 | "metadata": {}, 210 | "outputs": [ 211 | { 212 | "output_type": "stream", 213 | "stream": "stdout", 214 | "text": [ 215 | "[[[ 1. 1. 1. 1.]\n", 216 | " [ 1. 1. 1. 1.]]\n", 217 | "\n", 218 | " [[ 1. 1. 1. 1.]\n", 219 | " [ 1. 1. 1. 1.]]\n", 220 | "\n", 221 | " [[ 1. 1. 1. 1.]\n", 222 | " [ 1. 1. 1. 
1.]]]\n" 223 | ] 224 | } 225 | ], 226 | "prompt_number": 34 227 | }, 228 | { 229 | "cell_type": "code", 230 | "collapsed": false, 231 | "input": [ 232 | "X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset()" 233 | ], 234 | "language": "python", 235 | "metadata": {}, 236 | "outputs": [], 237 | "prompt_number": 35 238 | }, 239 | { 240 | "cell_type": "code", 241 | "collapsed": false, 242 | "input": [ 243 | "print (X_train_orig.size)\n", 244 | "X_train_flatten = X_train_orig.reshape(X_train_orig.shape[0], -1).T\n", 245 | "print (X_train_flatten/255)" 246 | ], 247 | "language": "python", 248 | "metadata": {}, 249 | "outputs": [ 250 | { 251 | "output_type": "stream", 252 | "stream": "stdout", 253 | "text": [ 254 | "13271040\n", 255 | "[[ 0.89019608 0.93333333 0.89411765 ..., 0.92156863 0.91372549\n", 256 | " 0.90196078]\n", 257 | " [ 0.8627451 0.90980392 0.8627451 ..., 0.88627451 0.88627451\n", 258 | " 0.8627451 ]\n", 259 | " [ 0.83921569 0.8745098 0.81568627 ..., 0.84705882 0.85098039\n", 260 | " 0.81960784]\n", 261 | " ..., \n", 262 | " [ 0.81568627 0.84313725 0.82745098 ..., 0.78431373 0.8 0.79215686]\n", 263 | " [ 0.81960784 0.8 0.81176471 ..., 0.75294118 0.78823529\n", 264 | " 0.78039216]\n", 265 | " [ 0.81960784 0.75294118 0.79215686 ..., 0.71372549 0.77647059\n", 266 | " 0.77254902]]" 267 | ] 268 | }, 269 | { 270 | "output_type": "stream", 271 | "stream": "stdout", 272 | "text": [ 273 | "\n" 274 | ] 275 | } 276 | ], 277 | "prompt_number": 47 278 | }, 279 | { 280 | "cell_type": "code", 281 | "collapsed": false, 282 | "input": [ 283 | "Y_train = one_hot(Y_train_orig, 6)\n", 284 | "Y_train = Y_train.reshape(Y_train.shape[0], -1)\n", 285 | "print(Y_train)" 286 | ], 287 | "language": "python", 288 | "metadata": {}, 289 | "outputs": [ 290 | { 291 | "output_type": "stream", 292 | "stream": "stdout", 293 | "text": [ 294 | "[[ 0. 1. 0. ..., 0. 0. 0.]\n", 295 | " [ 0. 0. 0. ..., 0. 0. 0.]\n", 296 | " [ 0. 0. 1. ..., 1. 0. 0.]\n", 297 | " [ 0. 0. 0. ..., 0. 0. 0.]\n", 298 | " [ 0. 0. 0. ..., 0. 1. 0.]\n", 299 | " [ 1. 0. 0. ..., 0. 0. 1.]]\n" 300 | ] 301 | } 302 | ], 303 | "prompt_number": 50 304 | }, 305 | { 306 | "cell_type": "code", 307 | "collapsed": false, 308 | "input": [ 309 | "Y_train = convert_to_one_hot(Y_train_orig, 6)\n", 310 | "print(Y_train)" 311 | ], 312 | "language": "python", 313 | "metadata": {}, 314 | "outputs": [ 315 | { 316 | "output_type": "stream", 317 | "stream": "stdout", 318 | "text": [ 319 | "[[ 0. 1. 0. ..., 0. 0. 0.]\n", 320 | " [ 0. 0. 0. ..., 0. 0. 0.]\n", 321 | " [ 0. 0. 1. ..., 1. 0. 0.]\n", 322 | " [ 0. 0. 0. ..., 0. 0. 0.]\n", 323 | " [ 0. 0. 0. ..., 0. 1. 0.]\n", 324 | " [ 1. 0. 0. ..., 0. 0. 
1.]]\n" 325 | ] 326 | } 327 | ], 328 | "prompt_number": 51 329 | }, 330 | { 331 | "cell_type": "code", 332 | "collapsed": false, 333 | "input": [ 334 | "X_train_flaten = X_train_orig.reshape(X_train_orig.shape[0], -1).T\n", 335 | "print (X_train_flaten.shape)\n", 336 | "X_test_flaten = X_test_orig.reshape(X_test_orig.shape[0], -1).T\n", 337 | "\n", 338 | "X_train = X_train_flaten/255\n", 339 | "X_test = X_test_flaten/255\n", 340 | "\n", 341 | "Y_test = convert_to_one_hot(Y_test_orig, 6)\n", 342 | "Y_train = convert_to_one_hot(Y_train_orig, 6)" 343 | ], 344 | "language": "python", 345 | "metadata": {}, 346 | "outputs": [ 347 | { 348 | "output_type": "stream", 349 | "stream": "stdout", 350 | "text": [ 351 | "(12288, 1080)\n" 352 | ] 353 | } 354 | ], 355 | "prompt_number": 85 356 | }, 357 | { 358 | "cell_type": "code", 359 | "collapsed": false, 360 | "input": [ 361 | "def creat_placeholder(n_x, n_y):\n", 362 | " X = tf.placeholder(tf.float32, shape=(n_x,None) ,name='X')\n", 363 | " Y = tf.placeholder(tf.float32, shape=(n_y,None) ,name='Y')\n", 364 | " return X ,Y" 365 | ], 366 | "language": "python", 367 | "metadata": {}, 368 | "outputs": [], 369 | "prompt_number": 62 370 | }, 371 | { 372 | "cell_type": "code", 373 | "collapsed": false, 374 | "input": [ 375 | "X, Y = creat_placeholder(12288, 6)\n", 376 | "print (X)\n", 377 | "print (Y)" 378 | ], 379 | "language": "python", 380 | "metadata": {}, 381 | "outputs": [ 382 | { 383 | "output_type": "stream", 384 | "stream": "stdout", 385 | "text": [ 386 | "Tensor(\"X_2:0\", shape=(12288, ?), dtype=float32)\n", 387 | "Tensor(\"Y_1:0\", shape=(6, ?), dtype=float32)\n" 388 | ] 389 | } 390 | ], 391 | "prompt_number": 63 392 | }, 393 | { 394 | "cell_type": "code", 395 | "collapsed": false, 396 | "input": [ 397 | "def initializer_parameters():\n", 398 | " tf.set_random_seed(1)\n", 399 | " W1 = tf.get_variable('W1', [25,12288], initializer = tf.contrib.layers.xavier_initializer(seed=1))\n", 400 | " b1 = tf.get_variable('b1', [25,1], initializer = tf.contrib.layers.xavier_initializer(seed=1))\n", 401 | " W2 = tf.get_variable('W2', [12,25], initializer = tf.contrib.layers.xavier_initializer(seed=1))\n", 402 | " b2 = tf.get_variable('b2', [12,1], initializer = tf.contrib.layers.xavier_initializer(seed=1))\n", 403 | " W3 = tf.get_variable('W3', [6,12], initializer = tf.contrib.layers.xavier_initializer(seed=1))\n", 404 | " b3 = tf.get_variable('b3', [6,1], initializer = tf.contrib.layers.xavier_initializer(seed=1))\n", 405 | " parameters = {'W1':W1, 'W2':W2, 'W3':W3, 'b1':b1, 'b2':b2, 'b3':b3}\n", 406 | "\n", 407 | " return parameters\n", 408 | " " 409 | ], 410 | "language": "python", 411 | "metadata": {}, 412 | "outputs": [], 413 | "prompt_number": 75 414 | }, 415 | { 416 | "cell_type": "code", 417 | "collapsed": false, 418 | "input": [ 419 | "parameters = initializer_parameters()\n", 420 | "print (parameters['W1'])" 421 | ], 422 | "language": "python", 423 | "metadata": {}, 424 | "outputs": [ 425 | { 426 | "ename": "ValueError", 427 | "evalue": "Variable W1 already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? 
Originally defined at:\n\n File \"\", line 3, in initializer_parameters\n W1 = tf.get_variable('W1', [25,12288], initializer = tf.contrib.layers.xavier_initializer(seed=1))\n File \"\", line 5, in \n parameters = initializer_parameters()\n File \"/usr/lib/python3/dist-packages/IPython/core/interactiveshell.py\", line 2883, in run_code\n exec(code_obj, self.user_global_ns, self.user_ns)\n", 428 | "output_type": "pyerr", 429 | "traceback": [ 430 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", 431 | "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mparameters\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0minitializer_parameters\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0mparameters\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m'W1'\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 432 | "\u001b[0;32m\u001b[0m in \u001b[0;36minitializer_parameters\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0minitializer_parameters\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mtf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mset_random_seed\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mW1\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_variable\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'W1'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;36m25\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m12288\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minitializer\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcontrib\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlayers\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mxavier_initializer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mseed\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 4\u001b[0m \u001b[0mb1\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_variable\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'b1'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;36m25\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minitializer\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcontrib\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlayers\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mxavier_initializer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mseed\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mW2\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_variable\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'W2'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;36m12\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m25\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0minitializer\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0mtf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcontrib\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlayers\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mxavier_initializer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mseed\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 433 | "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/variable_scope.py\u001b[0m in \u001b[0;36mget_variable\u001b[0;34m(name, shape, dtype, initializer, regularizer, trainable, collections, caching_device, partitioner, validate_shape, use_resource, custom_getter, constraint)\u001b[0m\n\u001b[1;32m 1201\u001b[0m \u001b[0mpartitioner\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mpartitioner\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mvalidate_shape\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mvalidate_shape\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1202\u001b[0m \u001b[0muse_resource\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0muse_resource\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcustom_getter\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mcustom_getter\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1203\u001b[0;31m constraint=constraint)\n\u001b[0m\u001b[1;32m 1204\u001b[0m get_variable_or_local_docstring = (\n\u001b[1;32m 1205\u001b[0m \"\"\"%s\n", 434 | "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/variable_scope.py\u001b[0m in \u001b[0;36mget_variable\u001b[0;34m(self, var_store, name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape, use_resource, custom_getter, constraint)\u001b[0m\n\u001b[1;32m 1090\u001b[0m \u001b[0mpartitioner\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mpartitioner\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mvalidate_shape\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mvalidate_shape\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1091\u001b[0m \u001b[0muse_resource\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0muse_resource\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcustom_getter\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mcustom_getter\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1092\u001b[0;31m constraint=constraint)\n\u001b[0m\u001b[1;32m 1093\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1094\u001b[0m def _get_partitioned_variable(self,\n", 435 | "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/variable_scope.py\u001b[0m in \u001b[0;36mget_variable\u001b[0;34m(self, name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape, use_resource, custom_getter, constraint)\u001b[0m\n\u001b[1;32m 423\u001b[0m \u001b[0mcaching_device\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mcaching_device\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mpartitioner\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mpartitioner\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 424\u001b[0m \u001b[0mvalidate_shape\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mvalidate_shape\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0muse_resource\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0muse_resource\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 425\u001b[0;31m constraint=constraint)\n\u001b[0m\u001b[1;32m 426\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 427\u001b[0m def _get_partitioned_variable(\n", 436 | 
"\u001b[0;32m/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/variable_scope.py\u001b[0m in \u001b[0;36m_true_getter\u001b[0;34m(name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape, use_resource, constraint)\u001b[0m\n\u001b[1;32m 392\u001b[0m \u001b[0mtrainable\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mtrainable\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcollections\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mcollections\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 393\u001b[0m \u001b[0mcaching_device\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mcaching_device\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mvalidate_shape\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mvalidate_shape\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 394\u001b[0;31m use_resource=use_resource, constraint=constraint)\n\u001b[0m\u001b[1;32m 395\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 396\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mcustom_getter\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 437 | "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/variable_scope.py\u001b[0m in \u001b[0;36m_get_single_variable\u001b[0;34m(self, name, shape, dtype, initializer, regularizer, partition_info, reuse, trainable, collections, caching_device, validate_shape, use_resource, constraint)\u001b[0m\n\u001b[1;32m 740\u001b[0m \u001b[0;34m\"reuse=tf.AUTO_REUSE in VarScope? \"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 741\u001b[0m \"Originally defined at:\\n\\n%s\" % (\n\u001b[0;32m--> 742\u001b[0;31m name, \"\".join(traceback.format_list(tb))))\n\u001b[0m\u001b[1;32m 743\u001b[0m \u001b[0mfound_var\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_vars\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mname\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 744\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mshape\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mis_compatible_with\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfound_var\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_shape\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 438 | "\u001b[0;31mValueError\u001b[0m: Variable W1 already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? 
Originally defined at:\n\n File \"\", line 3, in initializer_parameters\n W1 = tf.get_variable('W1', [25,12288], initializer = tf.contrib.layers.xavier_initializer(seed=1))\n File \"\", line 5, in \n parameters = initializer_parameters()\n File \"/usr/lib/python3/dist-packages/IPython/core/interactiveshell.py\", line 2883, in run_code\n exec(code_obj, self.user_global_ns, self.user_ns)\n" 439 | ] 440 | } 441 | ], 442 | "prompt_number": 76 443 | }, 444 | { 445 | "cell_type": "code", 446 | "collapsed": false, 447 | "input": [ 448 | "def forward_propagation(x, parameters):\n", 449 | " W1 = parameters['W1']\n", 450 | " W2 = parameters['W2']\n", 451 | " W3 = parameters['W3']\n", 452 | " b1 = parameters['b1']\n", 453 | " b2 = parameters['b2']\n", 454 | " b3 = parameters['b3']\n", 455 | " \n", 456 | " out = tf.matmul(W1,x) + b1\n", 457 | " out = tf.nn.relu(out)\n", 458 | " out = tf.matmul(W2,out) + b2\n", 459 | " out = tf.nn.relu(out)\n", 460 | " out = tf.matmul(W3,out) + b3\n", 461 | " \n", 462 | " return out" 463 | ], 464 | "language": "python", 465 | "metadata": {}, 466 | "outputs": [], 467 | "prompt_number": 77 468 | }, 469 | { 470 | "cell_type": "code", 471 | "collapsed": false, 472 | "input": [ 473 | "tf.reset_default_graph()\n", 474 | "\n", 475 | "with tf.Session() as sess:\n", 476 | " X, Y = creat_placeholder(12288, 6)\n", 477 | " parameters = initializer_parameters()\n", 478 | " out = forward_propagation(X, parameters)\n", 479 | "\n", 480 | "print (out)" 481 | ], 482 | "language": "python", 483 | "metadata": {}, 484 | "outputs": [ 485 | { 486 | "output_type": "stream", 487 | "stream": "stdout", 488 | "text": [ 489 | "Tensor(\"add_2:0\", shape=(6, ?), dtype=float32)\n" 490 | ] 491 | } 492 | ], 493 | "prompt_number": 78 494 | }, 495 | { 496 | "cell_type": "code", 497 | "collapsed": false, 498 | "input": [ 499 | "def compute_cost(out, y):\n", 500 | " logits = tf.transpose(out)\n", 501 | " labels = tf.transpose(y)\n", 502 | " \n", 503 | " cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = logits, labels=labels))\n", 504 | " return cost" 505 | ], 506 | "language": "python", 507 | "metadata": {}, 508 | "outputs": [], 509 | "prompt_number": 81 510 | }, 511 | { 512 | "cell_type": "code", 513 | "collapsed": false, 514 | "input": [ 515 | "tf.reset_default_graph()\n", 516 | "with tf.Session() as sess:\n", 517 | " X, Y = creat_placeholder(12288, 6)\n", 518 | " parameters = initializer_parameters()\n", 519 | " out = forward_propagation(X, parameters)\n", 520 | " cost = compute_cost(out, Y)\n", 521 | "print (cost)" 522 | ], 523 | "language": "python", 524 | "metadata": {}, 525 | "outputs": [ 526 | { 527 | "output_type": "stream", 528 | "stream": "stdout", 529 | "text": [ 530 | "Tensor(\"Mean:0\", shape=(), dtype=float32)\n" 531 | ] 532 | } 533 | ], 534 | "prompt_number": 82 535 | }, 536 | { 537 | "cell_type": "code", 538 | "collapsed": false, 539 | "input": [ 540 | "def model(X_train, Y_train, X_test, Y_test, learning_rate = 0.0001,\n", 541 | " nums_epoch = 1500, minibatch_size = 32, print_cost = True):\n", 542 | " tf.reset_default_graph()\n", 543 | " tf.set_random_seed(1)\n", 544 | " seed = 3\n", 545 | " (n_x, m)= X_train.shape\n", 546 | " n_y = Y_train.shape[0]\n", 547 | " costs = []\n", 548 | " \n", 549 | " X, Y = creat_placeholder(n_x, n_y)\n", 550 | " parameters = initializer_parameters()\n", 551 | " out = forward_propagation(X, parameters)\n", 552 | " cost = compute_cost(out, Y)\n", 553 | " optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)\n", 554 | " \n", 555 
| " init = tf.global_variables_initializer()\n", 556 | " \n", 557 | " with tf.Session() as sess:\n", 558 | " sess.run(init)\n", 559 | " for epoch in range(nums_epoch):\n", 560 | " epoch_cost = 0\n", 561 | " num_minibatches = int(m / minibatch_size)\n", 562 | " seed = seed+1\n", 563 | " minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed)\n", 564 | " \n", 565 | " for minibatch in minibatches:\n", 566 | " (minibatch_X, minibatch_Y) = minibatch\n", 567 | " _, minibatch_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y:minibatch_Y})\n", 568 | " epoch_cost += minibatch_cost / num_minibatches\n", 569 | " \n", 570 | " if print_cost == True and epoch % 100 ==0:\n", 571 | " print (\"Cost after epoch %i: %f\" % (epoch, epoch_cost))\n", 572 | " if print_cost == True and epoch%5 == 0:\n", 573 | " costs.append(epoch_cost)\n", 574 | " \n", 575 | " plt.plot(np.squeeze(costs))\n", 576 | " plt.ylabel('cost')\n", 577 | " plt.xlabel('iterations (per tens)')\n", 578 | " plt.title(\"Learning rate = \" + str(learning_rate))\n", 579 | " plt.show()\n", 580 | " \n", 581 | " parameters = sess.run(parameters)\n", 582 | " print(\"Parameters have benn trained!\")\n", 583 | " correct_prediction = tf.equal(tf.argmax(out), tf.argmax(Y))\n", 584 | " \n", 585 | " accuracy = tf.reduce_mean(tf.cast(correct_prediction, \"float\"))\n", 586 | " \n", 587 | " print(\"Train Accuracy:\", accuracy.eval({X:X_train, Y:Y_train}))\n", 588 | " print (\"Test Accuracy:\", accuracy.eval({X:X_test, Y:Y_test}))\n", 589 | " \n", 590 | " return parameters" 591 | ], 592 | "language": "python", 593 | "metadata": {}, 594 | "outputs": [], 595 | "prompt_number": 91 596 | }, 597 | { 598 | "cell_type": "code", 599 | "collapsed": false, 600 | "input": [ 601 | "parameters = model(X_train , Y_train, X_test, Y_test)" 602 | ], 603 | "language": "python", 604 | "metadata": {}, 605 | "outputs": [ 606 | { 607 | "output_type": "stream", 608 | "stream": "stdout", 609 | "text": [ 610 | "Cost after epoch 0: 1.880317\n", 611 | "Cost after epoch 100: 0.915240" 612 | ] 613 | }, 614 | { 615 | "output_type": "stream", 616 | "stream": "stdout", 617 | "text": [ 618 | "\n", 619 | "Cost after epoch 200: 0.522724" 620 | ] 621 | }, 622 | { 623 | "output_type": "stream", 624 | "stream": "stdout", 625 | "text": [ 626 | "\n", 627 | "Cost after epoch 300: 0.293652" 628 | ] 629 | }, 630 | { 631 | "output_type": "stream", 632 | "stream": "stdout", 633 | "text": [ 634 | "\n", 635 | "Cost after epoch 400: 0.173677" 636 | ] 637 | }, 638 | { 639 | "output_type": "stream", 640 | "stream": "stdout", 641 | "text": [ 642 | "\n", 643 | "Cost after epoch 500: 0.094603" 644 | ] 645 | }, 646 | { 647 | "output_type": "stream", 648 | "stream": "stdout", 649 | "text": [ 650 | "\n", 651 | "Cost after epoch 600: 0.057401" 652 | ] 653 | }, 654 | { 655 | "output_type": "stream", 656 | "stream": "stdout", 657 | "text": [ 658 | "\n", 659 | "Cost after epoch 700: 0.029311" 660 | ] 661 | }, 662 | { 663 | "output_type": "stream", 664 | "stream": "stdout", 665 | "text": [ 666 | "\n", 667 | "Cost after epoch 800: 0.016100" 668 | ] 669 | }, 670 | { 671 | "output_type": "stream", 672 | "stream": "stdout", 673 | "text": [ 674 | "\n", 675 | "Cost after epoch 900: 0.007676" 676 | ] 677 | }, 678 | { 679 | "output_type": "stream", 680 | "stream": "stdout", 681 | "text": [ 682 | "\n", 683 | "Cost after epoch 1000: 0.003377" 684 | ] 685 | }, 686 | { 687 | "output_type": "stream", 688 | "stream": "stdout", 689 | "text": [ 690 | "\n", 691 | "Cost after epoch 1100: 0.001795" 692 | ] 
693 | }, 694 | { 695 | "output_type": "stream", 696 | "stream": "stdout", 697 | "text": [ 698 | "\n", 699 | "Cost after epoch 1200: 0.001447" 700 | ] 701 | }, 702 | { 703 | "output_type": "stream", 704 | "stream": "stdout", 705 | "text": [ 706 | "\n", 707 | "Cost after epoch 1300: 0.001007" 708 | ] 709 | }, 710 | { 711 | "output_type": "stream", 712 | "stream": "stdout", 713 | "text": [ 714 | "\n", 715 | "Cost after epoch 1400: 0.000706" 716 | ] 717 | }, 718 | { 719 | "output_type": "stream", 720 | "stream": "stdout", 721 | "text": [ 722 | "\n" 723 | ] 724 | }, 725 | { 726 | "metadata": {}, 727 | "output_type": "display_data", 728 | "png": "[base64 PNG data elided: training-cost curve, ylabel 'cost', xlabel 'iterations (per tens)', title 'Learning rate = 0.0001']", 729 | "text": [ 730 | "" 731 | ] 732 | }, 733 | { 734 | "output_type": "stream", 735 | "stream": "stdout", 736 | "text": [ 737 | "Parameters have been trained!\n", 738 | "Train Accuracy:" 739 | ] 740 | }, 741 | { 742 | "output_type": "stream", 743 | "stream": "stdout", 744 | "text": [ 745 | " 1.0\n", 746 | "Test Accuracy: 0.875\n" 747 | ] 748 | } 749 | ], 750 | "prompt_number": 92 751 | }, 752 | { 753 | "cell_type": "code", 754 | "collapsed": false, 755 | "input": [], 756 | "language": "python", 757 | "metadata": {}, 758 | "outputs": [] 759 | } 760 | ], 761 | "metadata": {} 762 | } 763 | ] 764 | } -------------------------------------------------------------------------------- /tf_utils.py: -------------------------------------------------------------------------------- 1 | import h5py 2 | import numpy as np 3 | import tensorflow as tf 4 | import math 5 | 6 | def load_dataset(): 7 | train_dataset = h5py.File('datasets/train_signs.h5', "r") 8 | train_set_x_orig = np.array(train_dataset["train_set_x"][:]) # your train set features 9 | train_set_y_orig = np.array(train_dataset["train_set_y"][:]) # your train set labels
10 | 11 | test_dataset = h5py.File('datasets/test_signs.h5', "r") 12 | test_set_x_orig = np.array(test_dataset["test_set_x"][:]) # your test set features 13 | test_set_y_orig = np.array(test_dataset["test_set_y"][:]) # your test set labels 14 | 15 | classes = np.array(test_dataset["list_classes"][:]) # the list of classes 16 | 17 | train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0])) 18 | test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0])) 19 | 20 | return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes 21 | 22 | 23 | def random_mini_batches(X, Y, mini_batch_size = 64, seed = 0): 24 | """ 25 | Creates a list of random minibatches from (X, Y) 26 | 27 | Arguments: 28 | X -- input data, of shape (input size, number of examples) 29 | Y -- true "label" vector (containing the class index of each example), of shape (1, number of examples) 30 | mini_batch_size -- size of the mini-batches, integer 31 | seed -- this is only for the purpose of grading, so that your "random" minibatches are the same as ours. 32 | 33 | Returns: 34 | mini_batches -- list of synchronous (mini_batch_X, mini_batch_Y) 35 | """ 36 | 37 | m = X.shape[1] # number of training examples 38 | mini_batches = [] 39 | np.random.seed(seed) 40 | 41 | # Step 1: Shuffle (X, Y) 42 | permutation = list(np.random.permutation(m)) 43 | shuffled_X = X[:, permutation] 44 | shuffled_Y = Y[:, permutation].reshape((Y.shape[0], m)) 45 | 46 | # Step 2: Partition (shuffled_X, shuffled_Y). Minus the end case. 47 | num_complete_minibatches = math.floor(m / mini_batch_size) # number of mini batches of size mini_batch_size in your partitioning 48 | for k in range(0, num_complete_minibatches): 49 | mini_batch_X = shuffled_X[:, k * mini_batch_size : k * mini_batch_size + mini_batch_size] 50 | mini_batch_Y = shuffled_Y[:, k * mini_batch_size : k * mini_batch_size + mini_batch_size] 51 | mini_batch = (mini_batch_X, mini_batch_Y) 52 | mini_batches.append(mini_batch) 53 | 54 | # Handling the end case (last mini-batch < mini_batch_size) 55 | if m % mini_batch_size != 0: 56 | mini_batch_X = shuffled_X[:, num_complete_minibatches * mini_batch_size : m] 57 | mini_batch_Y = shuffled_Y[:, num_complete_minibatches * mini_batch_size : m] 58 | mini_batch = (mini_batch_X, mini_batch_Y) 59 | mini_batches.append(mini_batch) 60 | 61 | return mini_batches 62 | 63 | def convert_to_one_hot(Y, C): 64 | Y = np.eye(C)[Y.reshape(-1)].T # shape (C, m): column j has a 1 in row Y[j] 65 | return Y 66 | 67 | 68 | def predict(X, parameters): 69 | 70 | W1 = tf.convert_to_tensor(parameters["W1"]) 71 | b1 = tf.convert_to_tensor(parameters["b1"]) 72 | W2 = tf.convert_to_tensor(parameters["W2"]) 73 | b2 = tf.convert_to_tensor(parameters["b2"]) 74 | W3 = tf.convert_to_tensor(parameters["W3"]) 75 | b3 = tf.convert_to_tensor(parameters["b3"]) 76 | 77 | params = {"W1": W1, 78 | "b1": b1, 79 | "W2": W2, 80 | "b2": b2, 81 | "W3": W3, 82 | "b3": b3} 83 | 84 | x = tf.placeholder("float", [12288, 1]) 85 | 86 | z3 = forward_propagation_for_predict(x, params) 87 | p = tf.argmax(z3) 88 | 89 | sess = tf.Session() 90 | prediction = sess.run(p, feed_dict = {x: X}) 91 | 92 | return prediction 93 | 94 | def forward_propagation_for_predict(X, parameters): 95 | """ 96 | Implements the forward propagation for the model: LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SOFTMAX 97 | 98 | Arguments: 99 | X -- input dataset placeholder, of shape (input size, number of examples) 100 | parameters -- python dictionary containing your parameters "W1", "b1", "W2", "b2", "W3", "b3"
parameters "W1", "b1", "W2", "b2", "W3", "b3" 101 | the shapes are given in initialize_parameters 102 | 103 | Returns: 104 | Z3 -- the output of the last LINEAR unit 105 | """ 106 | 107 | # Retrieve the parameters from the dictionary "parameters" 108 | W1 = parameters['W1'] 109 | b1 = parameters['b1'] 110 | W2 = parameters['W2'] 111 | b2 = parameters['b2'] 112 | W3 = parameters['W3'] 113 | b3 = parameters['b3'] 114 | # Numpy Equivalents: 115 | Z1 = tf.add(tf.matmul(W1, X), b1) # Z1 = np.dot(W1, X) + b1 116 | A1 = tf.nn.relu(Z1) # A1 = relu(Z1) 117 | Z2 = tf.add(tf.matmul(W2, A1), b2) # Z2 = np.dot(W2, a1) + b2 118 | A2 = tf.nn.relu(Z2) # A2 = relu(Z2) 119 | Z3 = tf.add(tf.matmul(W3, A2), b3) # Z3 = np.dot(W3,Z2) + b3 120 | 121 | return Z3 122 | --------------------------------------------------------------------------------