├── .gitignore
├── README.md
├── S1
│   ├── S1_notebook.ipynb
│   ├── S1a_live.py
│   └── S1b_live.py
├── S2_live.py
├── S3_live.py
├── S4_live.py
├── S5_live.py
├── S6
│   ├── create_tf_records.py
│   ├── freeze_model.py
│   ├── inference.py
│   ├── nets
│   │   └── mobilenet_v1.py
│   ├── resize_images.py
│   └── trainer.py
├── S7
│   ├── checkpoints
│   │   └── checkpoint
│   ├── nets
│   │   └── mobilenet_v1.py
│   └── trainer.py
├── S8
│   ├── datagenerator.py
│   ├── nets
│   │   └── model.py
│   └── trainer.py
├── S9
│   ├── S1.py
│   ├── S2.py
│   ├── S3.py
│   ├── S4.py
│   ├── S5.py
│   └── S6.py
└── img
    ├── dlcc_github.jpg
    └── tfcs_github.png

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
  1 | # Byte-compiled / optimized / DLL files
  2 | __pycache__/
  3 | *.py[cod]
  4 | *$py.class
  5 | 
  6 | # C extensions
  7 | *.so
  8 | 
  9 | # Distribution / packaging
 10 | .Python
 11 | env/
 12 | build/
 13 | develop-eggs/
 14 | dist/
 15 | downloads/
 16 | eggs/
 17 | .eggs/
 18 | lib/
 19 | lib64/
 20 | parts/
 21 | sdist/
 22 | var/
 23 | wheels/
 24 | *.egg-info/
 25 | .installed.cfg
 26 | *.egg
 27 | 
 28 | # PyInstaller
 29 | #  Usually these files are written by a python script from a template
 30 | #  before PyInstaller builds the exe, so as to inject date/other infos into it.
 31 | *.manifest
 32 | *.spec
 33 | 
 34 | # Installer logs
 35 | pip-log.txt
 36 | pip-delete-this-directory.txt
 37 | 
 38 | # Unit test / coverage reports
 39 | htmlcov/
 40 | .tox/
 41 | .coverage
 42 | .coverage.*
 43 | .cache
 44 | nosetests.xml
 45 | coverage.xml
 46 | *.cover
 47 | .hypothesis/
 48 | 
 49 | # Translations
 50 | *.mo
 51 | *.pot
 52 | 
 53 | # Django stuff:
 54 | *.log
 55 | local_settings.py
 56 | 
 57 | # Flask stuff:
 58 | instance/
 59 | .webassets-cache
 60 | 
 61 | # Scrapy stuff:
 62 | .scrapy
 63 | 
 64 | # Sphinx documentation
 65 | docs/_build/
 66 | 
 67 | # PyBuilder
 68 | target/
 69 | 
 70 | # Jupyter Notebook
 71 | .ipynb_checkpoints
 72 | 
 73 | # pyenv
 74 | .python-version
 75 | 
 76 | # celery beat schedule file
 77 | celerybeat-schedule
 78 | 
 79 | # SageMath parsed files
 80 | *.sage.py
 81 | 
 82 | # dotenv
 83 | .env
 84 | 
 85 | # virtualenv
 86 | .venv
 87 | venv/
 88 | ENV/
 89 | 
 90 | # Spyder project settings
 91 | .spyderproject
 92 | .spyproject
 93 | 
 94 | # Rope project settings
 95 | .ropeproject
 96 | 
 97 | # mkdocs documentation
 98 | /site
 99 | 
100 | # mypy
101 | .mypy_cache/
102 | 

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | TensorFlow Coding Sessions
 2 | 
 3 | ## Hands-on Deep Learning: TensorFlow Coding Sessions
 4 | 
 5 | This repository has the code for the Hands-on Deep Learning: TensorFlow Coding Sessions. The videos will be uploaded on a weekly basis.
 6 | 
 7 | The series consists of the introductory TensorFlow tutorials outlined below:
 8 | 
 9 | | # | Tutorial | Code | Video |
10 | |-|------------------------------------------------------------------------|------|------------------|
11 | |1| Introduction to TensorFlow: graphs, sessions, constants, and variables |[S1](S1/) and [S1_notebook.ipynb](S1/S1_notebook.ipynb)| [Video #1](https://youtu.be/1KzJbIFnVTE) |
12 | |2| Training a multilayer perceptron |[S2_live.py](S2_live.py)| [Video #2](https://youtu.be/b7ykcBzz9wo) |
13 | |3| Setting up the training and validation pipeline |[S3_live.py](S3_live.py)| [Video #3](https://youtu.be/l_ZvxKBToWs) |
14 | |4| Regularization, saving and resuming from checkpoints, and TensorBoard |[S4_live.py](S4_live.py)| [Video #4](https://youtu.be/ni9FZtF_gLs) |
15 | |5| Convolutional neural networks, batchnorm, learning rate schedules, optimizers|[S5_live.py](S5_live.py)| [Video #5](https://youtu.be/ULX1nWPAJbM) |
16 | |6| Converting a dataset into TFRecords, training an image classifier, and freezing the model for deployment|[S6](S6/)| [Video #6](https://youtu.be/tzKqjPdAf8M) |
17 | |7| Transfer learning: fine-tuning a model in TensorFlow |[S7](S7/)| [Video #7](https://youtu.be/jccBP_uA98k) |
18 | |8| Using a Python iterator as a data generator and training a denoising autoencoder |[S8](S8/)| N/A |
19 | |9| What is new in TensorFlow 2.0 **[new]** |[S9](S9/)| [Video #8](https://youtu.be/GI_QVLNCgPo) |
20 | 
21 | ---
22 | 
23 | Deep Learning Crash Course
24 | 
25 | ## Deep Learning Crash Course
26 | 
27 | A series of mini-lectures on the fundamentals of machine learning, with a focus on neural networks and deep learning.
28 | 
29 | * [Lecture #1: Introduction](https://youtu.be/nmnaO6esC7c)
30 | * [Lecture #2: Artificial Neural Networks Demystified](https://youtu.be/oS5fz_mHVz0)
31 | * [Lecture #3: Artificial Neural Networks: Going Deeper](https://youtu.be/_XPkAxm0Yx0)
32 | * [Lecture #4: Overfitting, Underfitting, and Model Capacity](https://youtu.be/ms-Ooh9mjiE)
33 | * [Lecture #5: Regularization](https://youtu.be/NRCZJUviZN0)
34 | * [Lecture #6: Data Collection and Preprocessing](https://youtu.be/dAg-_gzFo14)
35 | * [Lecture #7: Convolutional Neural Networks Explained](https://youtu.be/-I0lry5ceDs)
36 | * [Lecture #8: How to Design a Convolutional Neural Network](https://youtu.be/fTw3K8D5xDs)
37 | * [Lecture #9: Transfer Learning](https://youtu.be/_2EHcpg52uU)
38 | * [Lecture #10: Optimization Tricks: momentum, batch-norm, and more](https://youtu.be/kK8-jCCR4is)
39 | * [Lecture #11: Recurrent Neural Networks](https://youtu.be/k97Jrg_4tFA)
40 | * [Lecture #12: Deep Unsupervised Learning](https://youtu.be/P8_W5Wc4zeg)
41 | * [Lecture #13: Generative Adversarial Networks](https://youtu.be/7tFBoxex4JE)
42 | * [Lecture #14: Practical Methodology in Deep Learning](https://youtu.be/9Sl_t_GxX6w)
43 | 
44 | ---
45 | 

--------------------------------------------------------------------------------
/S1/S1_notebook.ipynb:
--------------------------------------------------------------------------------
 1 | {
 2 |  "cells": [
 3 |   {
 4 |    "cell_type": "markdown",
 5 |    "metadata": {},
 6 |    "source": [
 7 |     "# Deep Learning With TensorFlow\n",
 8 |     "\n",
 9 |     "## Introduction\n",
10 |     "\n",
11 |     "Let's start by importing TensorFlow into our project and making sure that we have the right version installed.\n",
12 |     "If you haven't installed TensorFlow yet, you can easily do so using PyPI: https://www.tensorflow.org/install/."
13 |    ]
14 |   },
15 |   {
16 |    "cell_type": "code",
17 |    "execution_count": 1,
18 |    "metadata": {},
19 |    "outputs": [
20 |     {
21 |      "name": "stdout",
22 |      "output_type": "stream",
23 |      "text": [
24 |       "1.10.0\n"
25 |      ]
26 |     }
27 |    ],
28 |    "source": [
29 |     "import tensorflow as tf\n",
30 |     "print(tf.__version__)"
31 |    ]
32 |   },
33 |   {
34 |    "cell_type": "markdown",
35 |    "metadata": {},
36 |    "source": [
37 |     "### Graphs and Sessions\n",
38 |     "Unless you are using the eager execution mode, operations in TensorFlow are not executed immediately. In TensorFlow, the description of the computations is separated from the execution. A typical TensorFlow program constructs a computational graph first, then creates a session to execute the operations in the graph. Let's create a very simple graph and run it in a session to compute the geometric mean of two numbers. In this example we used placeholders to feed the inputs to the graph. By defining a placeholder we tell the model that we will feed the values later, when we execute the graph. Feeding data this way can lead to input/output bottlenecks in large-scale applications. We will later see how to read data in parallel while the graph is being executed."
39 |    ]
40 |   },
41 |   {
42 |    "cell_type": "code",
43 |    "execution_count": 2,
44 |    "metadata": {},
45 |    "outputs": [
46 |     {
47 |      "name": "stdout",
48 |      "output_type": "stream",
49 |      "text": [
50 |       "4.0\n"
51 |      ]
52 |     }
53 |    ],
54 |    "source": [
55 |     "# Define the inputs\n",
56 |     "x = tf.placeholder(tf.float32)\n",
57 |     "y = tf.placeholder(tf.float32)\n",
58 |     "\n",
59 |     "# Define the graph\n",
60 |     "g_mean = tf.sqrt(x * y)\n",
61 |     "\n",
62 |     "# Run the graph\n",
63 |     "with tf.Session() as sess:\n",
64 |     "    res = sess.run(g_mean, feed_dict={x: 2, y: 8})\n",
65 |     "    print(res)"
66 |    ]
67 |   },
68 |   {
69 |    "cell_type": "markdown",
70 |    "metadata": {},
71 |    "source": [
72 |     "### Constants and Variables\n",
73 |     "\n",
74 |     "We can declare constants and variables to use in a graph. The main differences between these two are:\n",
75 |     "* Constants have constant values whereas variables can change during execution. A typical example of a variable is a trainable weight in a neural network.\n",
76 |     "* Constants are stored in the graph, whereas variables are not. Using constants increases the size of the graph.\n",
77 |     "\n",
78 |     "Let's take a look at an example."
79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": 7, 84 | "metadata": {}, 85 | "outputs": [ 86 | { 87 | "name": "stdout", 88 | "output_type": "stream", 89 | "text": [ 90 | "0.2\n" 91 | ] 92 | } 93 | ], 94 | "source": [ 95 | "# This block gets an existing variable with a specific name within a variable scope\n", 96 | "# or creates a new one if no such variable exists\n", 97 | "# In this case it's identical to using tf.Variable\n", 98 | "# Variable scopes help us define and reuse variables within a context\n", 99 | "with tf.variable_scope(\"linear_model\", reuse=tf.AUTO_REUSE):\n", 100 | " w = tf.get_variable(\"weight\", dtype=tf.float32, initializer=tf.constant(0.1))\n", 101 | " c = tf.get_variable(\"bias\", dtype=tf.float32, initializer=tf.constant(0.0))\n", 102 | "\n", 103 | "# here we define our graph\n", 104 | "model = x * w + c\n", 105 | "\n", 106 | "with tf.Session() as sess:\n", 107 | " # we need to initialize all variables otherwise it will throw an error\n", 108 | " sess.run(tf.global_variables_initializer())\n", 109 | " print(sess.run(model, feed_dict={x: 2.0}))" 110 | ] 111 | }, 112 | { 113 | "cell_type": "markdown", 114 | "metadata": {}, 115 | "source": [ 116 | "In the example above, we defined a very simple linear model with a single input, weight, and bias. We initialized the variables with constant values and ran the graph to print the initial output. We will later see how to train these variables to fit a function to data." 117 | ] 118 | } 119 | ], 120 | "metadata": { 121 | "kernelspec": { 122 | "display_name": "Python 3", 123 | "language": "python", 124 | "name": "python3" 125 | }, 126 | "language_info": { 127 | "codemirror_mode": { 128 | "name": "ipython", 129 | "version": 3 130 | }, 131 | "file_extension": ".py", 132 | "mimetype": "text/x-python", 133 | "name": "python", 134 | "nbconvert_exporter": "python", 135 | "pygments_lexer": "ipython3", 136 | "version": "3.6.4" 137 | } 138 | }, 139 | "nbformat": 4, 140 | "nbformat_minor": 2 141 | } 142 | -------------------------------------------------------------------------------- /S1/S1a_live.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | # define the inputs 4 | x = tf.placeholder(tf.float32) 5 | y = tf.placeholder(tf.float32) 6 | 7 | # define the graph 8 | g_mean = tf.sqrt(x * y) 9 | 10 | # run the graph 11 | with tf.Session() as sess: 12 | res = sess.run(g_mean, feed_dict={x: 2, y: 8}) 13 | print(res) -------------------------------------------------------------------------------- /S1/S1b_live.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | # define the inputs 4 | x = tf.placeholder(tf.float32) 5 | 6 | with tf.variable_scope("linear_model", reuse=tf.AUTO_REUSE): 7 | w = tf.get_variable("weight", dtype=tf.float32, initializer=tf.constant(0.1)) 8 | c = tf.get_variable("bias", dtype=tf.float32, initializer=tf.constant(0.0)) 9 | model = x * w + c 10 | 11 | with tf.Session() as sess: 12 | sess.run(tf.global_variables_initializer()) 13 | print(sess.run(model, feed_dict={x: 2.0})) -------------------------------------------------------------------------------- /S2_live.py: -------------------------------------------------------------------------------- 1 | """ Deep Learning with TensorFlow 2 | Coding session 2: Training a Multilayer Perceptron 3 | 4 | Let's train a simple neural network that classifies handwritten digits using the MNIST dataset. 
 5 | Video: https://youtu.be/b7ykcBzz9wo
 6 | """
 7 | 
 8 | import tensorflow as tf
 9 | 
10 | def preprocess_data(im, label):
11 |     im = tf.cast(im, tf.float32)
12 |     im = im / 127.5
13 |     im = im - 1
14 |     im = tf.reshape(im, [-1])
15 |     return im, label
16 | 
17 | def data_layer(data_tensor, num_threads=8, prefetch_buffer=100, batch_size=32):
18 |     with tf.variable_scope("data"):
19 |         dataset = tf.data.Dataset.from_tensor_slices(data_tensor)
20 |         dataset = dataset.shuffle(buffer_size=60000).repeat()
21 |         dataset = dataset.map(preprocess_data, num_parallel_calls=num_threads)
22 |         dataset = dataset.batch(batch_size)
23 |         dataset = dataset.prefetch(prefetch_buffer)
24 |         iterator = dataset.make_one_shot_iterator()
25 |     return iterator
26 | 
27 | def model(input_layer, num_classes=10):
28 |     with tf.variable_scope("model"):
29 |         net = tf.layers.dense(input_layer, 512)
30 |         net = tf.nn.relu(net)
31 |         net = tf.layers.dense(net, num_classes)
32 |     return net
33 | 
34 | def loss_functions(logits, labels, num_classes=10):
35 |     with tf.variable_scope("loss"):
36 |         target_prob = tf.one_hot(labels, num_classes)
37 |         total_loss = tf.losses.softmax_cross_entropy(target_prob, logits)
38 |     return total_loss
39 | 
40 | def optimizer_func(total_loss, global_step, learning_rate=0.1):
41 |     with tf.variable_scope("optimizer"):
42 |         optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
43 |         optimizer = optimizer.minimize(total_loss, global_step=global_step)
44 |     return optimizer
45 | 
46 | def performance_metric(logits, labels):
47 |     with tf.variable_scope("performance_metric"):
48 |         preds = tf.argmax(logits, axis=1)
49 |         labels = tf.cast(labels, tf.int64)
50 |         corrects = tf.equal(preds, labels)
51 |         accuracy = tf.reduce_mean(tf.cast(corrects, tf.float32))
52 |     return accuracy
53 | 
54 | def train(data_tensor):
55 |     global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name="iter_number")
56 | 
57 |     # training graph
58 |     images, labels = data_layer(data_tensor).get_next()
59 |     logits = model(images)
60 |     loss = loss_functions(logits, labels)
61 |     optimizer = optimizer_func(loss, global_step)
62 |     accuracy = performance_metric(logits, labels)
63 | 
64 |     # start training
65 |     num_iter = 10000
66 |     log_iter = 1000
67 |     with tf.Session() as sess:
68 |         sess.run(tf.global_variables_initializer())
69 |         streaming_loss = 0
70 |         streaming_accuracy = 0
71 | 
72 |         for i in range(1, num_iter + 1):
73 |             _, loss_batch, acc_batch = sess.run([optimizer, loss, accuracy])
74 |             streaming_loss += loss_batch
75 |             streaming_accuracy += acc_batch
76 |             if i % log_iter == 0:
77 |                 print("Iteration: {}, Streaming loss: {:.2f}, Streaming accuracy: {:.2f}"
78 |                       .format(i, streaming_loss/log_iter, streaming_accuracy/log_iter))
79 |                 streaming_loss = 0
80 |                 streaming_accuracy = 0
81 | 
82 | if __name__ == "__main__":
83 |     # It's very easy to load the MNIST dataset through the Keras module.
84 |     # Keras is a high-level neural network API that has become a part of TensorFlow since version 1.2.
85 |     # Therefore, we don't need to install Keras separately.
86 |     # In the upcoming lectures we will also see how to load and preprocess custom data.
87 |     data_train, data_val = tf.keras.datasets.mnist.load_data()
88 | 
89 |     # The training set has 60,000 samples where each sample is a 28x28 grayscale image.
90 |     # Each one of these samples has a single label. Similarly, the validation set has 10,000 images and corresponding labels.
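    # (load_data() returns two (images, labels) tuples of NumPy uint8 arrays:
    # (60000, 28, 28) and (60000,) for training, and (10000, 28, 28) and
    # (10000,) for validation.)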
91 | # We can verify this by printing the shapes of the loaded tensors 92 | print(data_train[0].shape, data_train[1].shape, data_val[0].shape, data_val[1].shape) 93 | 94 | # Let the training begin! 95 | train(data_tensor=data_train) 96 | 97 | # Even after very few epochs, we got a model that can classify the handwritten digits in the training set 98 | # with 98% accuracy. So far we haven't used the validation set at all. 99 | # You might wonder why we need a separate validation set in the first place. 100 | # The answer is to make sure that the model generalizes well to unseen data to have an idea of the actual performance of the model. 101 | # We will talk about that in the next session. -------------------------------------------------------------------------------- /S3_live.py: -------------------------------------------------------------------------------- 1 | """ Deep Learning with TensorFlow 2 | Coding session 3: Setting up the training and validation pipeline 3 | 4 | In the previous session we trained a model without keeping track of how it's 5 | doing on a validation set. Let's pick up where we left off and modify our code 6 | from the previous session to keep track of validation accuracy while training. 7 | """ 8 | 9 | import tensorflow as tf 10 | import os 11 | 12 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3' 13 | 14 | def preprocess_data(im, label): 15 | im = tf.cast(im, tf.float32) 16 | im = im / 127.5 17 | im = im - 1 18 | im = tf.reshape(im, [-1]) 19 | return im, label 20 | 21 | # We will be using the same data pipeline for both training and validation sets 22 | # So let's create a helper function for that 23 | def create_dataset_pipeline(data_tensor, is_train=True, num_threads=8, prefetch_buffer=100, batch_size=32): 24 | dataset = tf.data.Dataset.from_tensor_slices(data_tensor) 25 | if is_train: 26 | dataset = dataset.shuffle(buffer_size=60000).repeat() 27 | dataset = dataset.map(preprocess_data, num_parallel_calls=num_threads) 28 | dataset = dataset.batch(batch_size) 29 | dataset = dataset.prefetch(prefetch_buffer) 30 | return dataset 31 | 32 | def data_layer(): 33 | with tf.variable_scope("data"): 34 | data_train, data_val = tf.keras.datasets.mnist.load_data() 35 | dataset_train = create_dataset_pipeline(data_train, is_train=True) 36 | dataset_val = create_dataset_pipeline(data_val, is_train=False, batch_size=1) 37 | iterator = tf.data.Iterator.from_structure(dataset_train.output_types, dataset_train.output_shapes) 38 | init_op_train = iterator.make_initializer(dataset_train) 39 | init_op_val = iterator.make_initializer(dataset_val) 40 | return iterator, init_op_train, init_op_val 41 | 42 | def model(input_layer, num_classes=10): 43 | with tf.variable_scope("model"): 44 | net = tf.layers.dense(input_layer, 512) 45 | net = tf.nn.relu(net) 46 | net = tf.layers.dense(net, num_classes) 47 | return net 48 | 49 | def loss_functions(logits, labels, num_classes=10): 50 | with tf.variable_scope("loss"): 51 | target_prob = tf.one_hot(labels, num_classes) 52 | total_loss = tf.losses.softmax_cross_entropy(target_prob, logits) 53 | return total_loss 54 | 55 | def optimizer_func(total_loss, global_step, learning_rate=0.1): 56 | with tf.variable_scope("optimizer"): 57 | optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate) 58 | optimizer = optimizer.minimize(total_loss, global_step=global_step) 59 | return optimizer 60 | 61 | def performance_metric(logits, labels): 62 | with tf.variable_scope("performance_metric"): 63 | preds = tf.argmax(logits, axis=1) 64 | labels = 
tf.cast(labels, tf.int64) 65 | corrects = tf.equal(preds, labels) 66 | accuracy = tf.reduce_mean(tf.cast(corrects, tf.float32)) 67 | return accuracy 68 | 69 | def train(): 70 | global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name="iter_number") 71 | 72 | # define the training graph 73 | iterator, init_op_train, init_op_val = data_layer() 74 | images, labels = iterator.get_next() 75 | logits = model(images) 76 | loss = loss_functions(logits, labels) 77 | optimizer = optimizer_func(loss, global_step) 78 | accuracy = performance_metric(logits, labels) 79 | 80 | # start training 81 | num_iter = 18750 # 10 epochs 82 | log_iter = 1875 83 | val_iter = 1875 84 | with tf.Session() as sess: 85 | sess.run(tf.global_variables_initializer()) 86 | sess.run(init_op_train) 87 | 88 | streaming_loss = 0 89 | streaming_accuracy = 0 90 | 91 | for i in range(1, num_iter + 1): 92 | _, loss_batch, acc_batch = sess.run([optimizer, loss, accuracy]) 93 | streaming_loss += loss_batch 94 | streaming_accuracy += acc_batch 95 | if i % log_iter == 0: 96 | print("Iteration: {}, Streaming loss: {:.2f}, Streaming accuracy: {:.2f}" 97 | .format(i, streaming_loss/log_iter, streaming_accuracy/log_iter)) 98 | streaming_loss = 0 99 | streaming_accuracy = 0 100 | 101 | if i % val_iter == 0: 102 | sess.run(init_op_val) 103 | validation_accuracy = 0 104 | num_iter = 0 105 | while True: 106 | try: 107 | acc_batch = sess.run(accuracy) 108 | validation_accuracy += acc_batch 109 | num_iter += 1 110 | except tf.errors.OutOfRangeError: 111 | validation_accuracy /= num_iter 112 | print("Iteration: {}, Validation accuracy: {:.2f}".format(i, validation_accuracy)) 113 | sess.run(init_op_train) # switch back to training set 114 | break 115 | 116 | if __name__ == "__main__": 117 | train() 118 | -------------------------------------------------------------------------------- /S4_live.py: -------------------------------------------------------------------------------- 1 | """ Deep Learning with TensorFlow 2 | Live coding session 4: Regularization, saving and resuming from checkpoints, basics of TensorBoard 3 | 4 | In the previous session, we wrote this code to train a simple model in TensorFlow. 5 | In this session, we will train a deeper model, regularize it, and visualize it in TensorBoard. 
6 | """ 7 | 8 | import tensorflow as tf 9 | import os 10 | 11 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3' 12 | 13 | def preprocess_data(im, label): 14 | im = tf.cast(im, tf.float32) 15 | im = im / 127.5 16 | im = im - 1 17 | im = tf.reshape(im, [-1]) 18 | return im, label 19 | 20 | # We will be using the same data pipeline for both training and validation sets 21 | # So let's create a helper function for that 22 | def create_dataset_pipeline(data_tensor, is_train=True, num_threads=8, prefetch_buffer=100, batch_size=32): 23 | dataset = tf.data.Dataset.from_tensor_slices(data_tensor) 24 | if is_train: 25 | dataset = dataset.shuffle(buffer_size=60000).repeat() 26 | dataset = dataset.map(preprocess_data, num_parallel_calls=num_threads) 27 | dataset = dataset.batch(batch_size) 28 | dataset = dataset.prefetch(prefetch_buffer) 29 | return dataset 30 | 31 | def data_layer(): 32 | with tf.variable_scope("data"): 33 | data_train, data_val = tf.keras.datasets.mnist.load_data() 34 | dataset_train = create_dataset_pipeline(data_train, is_train=True) 35 | dataset_val = create_dataset_pipeline(data_val, is_train=False, batch_size=1) 36 | iterator = tf.data.Iterator.from_structure(dataset_train.output_types, dataset_train.output_shapes) 37 | init_op_train = iterator.make_initializer(dataset_train) 38 | init_op_val = iterator.make_initializer(dataset_val) 39 | return iterator, init_op_train, init_op_val 40 | 41 | ######################################################################## 42 | def model(input_layer, num_classes=10): 43 | with tf.variable_scope("model"): 44 | reg = tf.contrib.layers.l2_regularizer(0.00001) 45 | net = input_layer 46 | for i in range(3): 47 | net = tf.layers.dense(net, 48 | units=512, 49 | kernel_regularizer=reg) 50 | net = tf.nn.relu(net) 51 | net = tf.layers.dropout(net, rate=0.2) 52 | net = tf.layers.dense(net, num_classes) 53 | return net 54 | 55 | def loss_functions(logits, labels, num_classes=10): 56 | with tf.variable_scope("loss"): 57 | target_prob = tf.one_hot(labels, num_classes) 58 | tf.losses.softmax_cross_entropy(target_prob, logits) 59 | total_loss = tf.losses.get_total_loss() # include regularization loss (I forgot to add this in the video) 60 | return total_loss 61 | ######################################################################## 62 | 63 | def optimizer_func(total_loss, global_step, learning_rate=0.1): 64 | with tf.variable_scope("optimizer"): 65 | optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate) 66 | optimizer = optimizer.minimize(total_loss, global_step=global_step) 67 | return optimizer 68 | 69 | def performance_metric(logits, labels): 70 | with tf.variable_scope("performance_metric"): 71 | preds = tf.argmax(logits, axis=1) 72 | labels = tf.cast(labels, tf.int64) 73 | corrects = tf.equal(preds, labels) 74 | accuracy = tf.reduce_mean(tf.cast(corrects, tf.float32)) 75 | return accuracy 76 | 77 | def train(): 78 | global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name="iter_number") 79 | 80 | # define the training graph 81 | iterator, init_op_train, init_op_val = data_layer() 82 | images, labels = iterator.get_next() 83 | logits = model(images) 84 | loss = loss_functions(logits, labels) 85 | optimizer = optimizer_func(loss, global_step) 86 | accuracy = performance_metric(logits, labels) 87 | 88 | ######################################################################## 89 | # summary placeholders 90 | streaming_loss_p = tf.placeholder(tf.float32) 91 | streaming_acc_p = tf.placeholder(tf.float32) 92 | val_acc_p = 
tf.placeholder(tf.float32) 93 | val_summ_ops = tf.summary.scalar('validation_acc', val_acc_p) 94 | train_summ_ops = tf.summary.merge([ 95 | tf.summary.scalar('streaming_loss', streaming_loss_p), 96 | tf.summary.scalar('streaming_accuracy', streaming_acc_p) 97 | ]) 98 | ######################################################################## 99 | 100 | # start training 101 | num_iter = 18750 # 10 epochs 102 | log_iter = 1875 103 | val_iter = 1875 104 | with tf.Session() as sess: 105 | sess.run(tf.global_variables_initializer()) 106 | sess.run(init_op_train) 107 | 108 | ######################################################################## 109 | # logs for TensorBoard 110 | logdir = 'logs' 111 | writer = tf.summary.FileWriter(logdir, sess.graph) # visualize the graph 112 | 113 | # load / save checkpoints 114 | checkpoint_path = 'checkpoints' 115 | saver = tf.train.Saver(max_to_keep=None) 116 | ckpt = tf.train.get_checkpoint_state(checkpoint_path) 117 | 118 | # resume training if a checkpoint exists 119 | if ckpt and ckpt.model_checkpoint_path: 120 | saver.restore(sess, ckpt.model_checkpoint_path) 121 | print("Loaded parameters from {}".format(ckpt.model_checkpoint_path)) 122 | 123 | initial_step = global_step.eval() 124 | ######################################################################## 125 | 126 | streaming_loss = 0 127 | streaming_accuracy = 0 128 | 129 | for i in range(initial_step, num_iter + 1): #################################### initial step 130 | _, loss_batch, acc_batch = sess.run([optimizer, loss, accuracy]) 131 | streaming_loss += loss_batch 132 | streaming_accuracy += acc_batch 133 | if i % log_iter == 0: 134 | print("Iteration: {}, Streaming loss: {:.2f}, Streaming accuracy: {:.2f}" 135 | .format(i, streaming_loss/log_iter, streaming_accuracy/log_iter)) 136 | 137 | ##################################################################################### 138 | # save to log file for TensorBoard 139 | summary_train = sess.run(train_summ_ops, feed_dict={streaming_loss_p: streaming_loss, 140 | streaming_acc_p: streaming_accuracy}) 141 | writer.add_summary(summary_train, global_step=i) 142 | ##################################################################################### 143 | 144 | streaming_loss = 0 145 | streaming_accuracy = 0 146 | 147 | if i % val_iter == 0: 148 | ##################################################################################### 149 | saver.save(sess, os.path.join(checkpoint_path, 'checkpoint'), global_step=global_step) 150 | print("Model saved!") 151 | ##################################################################################### 152 | 153 | sess.run(init_op_val) 154 | validation_accuracy = 0 155 | num_iter = 0 156 | while True: 157 | try: 158 | acc_batch = sess.run(accuracy) 159 | validation_accuracy += acc_batch 160 | num_iter += 1 161 | except tf.errors.OutOfRangeError: 162 | validation_accuracy /= num_iter 163 | print("Iteration: {}, Validation accuracy: {:.2f}".format(i, validation_accuracy)) 164 | 165 | ############################################################################### 166 | # save log file to TensorBoard 167 | summary_val = sess.run(val_summ_ops, feed_dict={val_acc_p: validation_accuracy}) 168 | writer.add_summary(summary_val, global_step=i) 169 | ############################################################################### 170 | 171 | sess.run(init_op_train) # switch back to training set 172 | break 173 | writer.close() 174 | 175 | if __name__ == "__main__": 176 | train() 177 | 
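Note: once a training run has populated the `logs` and `checkpoints` directories created above, the results can be inspected outside of train(). Below is a minimal sketch, assuming S4_live.py is importable from the repo root so that the same graph can be rebuilt before restoring:

    # Inspect the training curves and the graph in a browser:
    #   $ tensorboard --logdir logs
    import tensorflow as tf
    from S4_live import data_layer, model

    # Rebuild the same graph, then load the saved weights into it.
    iterator, init_op_train, init_op_val = data_layer()
    images, labels = iterator.get_next()
    logits = model(images)

    with tf.Session() as sess:
        ckpt = tf.train.get_checkpoint_state('checkpoints')
        if ckpt and ckpt.model_checkpoint_path:
            tf.train.Saver().restore(sess, ckpt.model_checkpoint_path)
            print("Restored parameters from {}".format(ckpt.model_checkpoint_path))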
-------------------------------------------------------------------------------- /S5_live.py: -------------------------------------------------------------------------------- 1 | """ Deep Learning with TensorFlow 2 | Live coding session 5: convolutional neural networks, batchnorm, learning rate schedules, optimizers 3 | """ 4 | 5 | import tensorflow as tf 6 | import os 7 | 8 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3' 9 | 10 | def preprocess_data(im, label): 11 | im = tf.cast(im, tf.float32) 12 | im = im / 127.5 13 | im = im - 1 14 | # im = tf.reshape(im, [-1]) 15 | return im, label 16 | 17 | # We will be using the same data pipeline for both training and validation sets 18 | # So let's create a helper function for that 19 | def create_dataset_pipeline(data_tensor, is_train=True, num_threads=8, prefetch_buffer=100, batch_size=32): 20 | dataset = tf.data.Dataset.from_tensor_slices(data_tensor) 21 | if is_train: 22 | dataset = dataset.shuffle(buffer_size=60000).repeat() 23 | dataset = dataset.map(preprocess_data, num_parallel_calls=num_threads) 24 | dataset = dataset.batch(batch_size) 25 | dataset = dataset.prefetch(prefetch_buffer) 26 | return dataset 27 | 28 | def data_layer(): 29 | with tf.variable_scope("data"): 30 | data_train, data_val = tf.keras.datasets.mnist.load_data() 31 | dataset_train = create_dataset_pipeline(data_train, is_train=True) 32 | dataset_val = create_dataset_pipeline(data_val, is_train=False, batch_size=1) 33 | iterator = tf.data.Iterator.from_structure(dataset_train.output_types, dataset_train.output_shapes) 34 | init_op_train = iterator.make_initializer(dataset_train) 35 | init_op_val = iterator.make_initializer(dataset_val) 36 | return iterator, init_op_train, init_op_val 37 | 38 | ######################################################################## 39 | def model(input_layer, training, num_classes=10): 40 | with tf.variable_scope("model"): 41 | net = tf.expand_dims(input_layer, axis=3) 42 | 43 | net = tf.layers.conv2d(net, 20, (5, 5)) 44 | net = tf.layers.batch_normalization(net, training=training) 45 | net = tf.nn.relu(net) 46 | net = tf.layers.max_pooling2d(net, pool_size=(2, 2), strides=(2, 2)) 47 | 48 | net = tf.layers.conv2d(net, 50, (5, 5)) 49 | net = tf.layers.batch_normalization(net, training=training) 50 | net = tf.nn.relu(net) 51 | net = tf.layers.max_pooling2d(net, pool_size=(2, 2), strides=(2, 2)) 52 | 53 | net = tf.layers.flatten(net) 54 | net = tf.layers.dense(net, 500) 55 | net = tf.nn.relu(net) # I forgot to add this ReLU in the video 56 | net = tf.layers.dropout(net, rate=0.2, training=training) # I forgot the training argument in the video 57 | net = tf.layers.dense(net, num_classes) 58 | return net 59 | 60 | def loss_functions(logits, labels, num_classes=10): 61 | with tf.variable_scope("loss"): 62 | target_prob = tf.one_hot(labels, num_classes) 63 | tf.losses.softmax_cross_entropy(target_prob, logits) 64 | total_loss = tf.losses.get_total_loss() # include regularization loss 65 | return total_loss 66 | 67 | def optimizer_func_momentum(total_loss, global_step, learning_rate=0.01): 68 | with tf.variable_scope("optimizer"): 69 | lr_schedule = tf.train.exponential_decay(learning_rate=learning_rate, 70 | global_step=global_step, 71 | decay_steps=1875, 72 | decay_rate=0.9, 73 | staircase=True) 74 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) 75 | with tf.control_dependencies(update_ops): 76 | optimizer = tf.train.MomentumOptimizer(learning_rate=lr_schedule, momentum=0.9) 77 | optimizer = optimizer.minimize(total_loss, 
global_step=global_step) 78 | return optimizer 79 | 80 | def optimizer_func_adam(total_loss, global_step, learning_rate=0.01): 81 | with tf.variable_scope("optimizer"): 82 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) 83 | with tf.control_dependencies(update_ops): 84 | optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate, epsilon=0.1) 85 | optimizer = optimizer.minimize(total_loss, global_step=global_step) 86 | return optimizer 87 | ######################################################################## 88 | 89 | def performance_metric(logits, labels): 90 | with tf.variable_scope("performance_metric"): 91 | preds = tf.argmax(logits, axis=1) 92 | labels = tf.cast(labels, tf.int64) 93 | corrects = tf.equal(preds, labels) 94 | accuracy = tf.reduce_mean(tf.cast(corrects, tf.float32)) 95 | return accuracy 96 | 97 | def train(): 98 | global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name="iter_number") 99 | 100 | # define the training graph 101 | iterator, init_op_train, init_op_val = data_layer() 102 | images, labels = iterator.get_next() 103 | training = tf.placeholder(tf.bool) 104 | logits = model(images, training) ############################## 105 | loss = loss_functions(logits, labels) 106 | optimizer = optimizer_func_adam(loss, global_step) ############################## 107 | accuracy = performance_metric(logits, labels) 108 | 109 | # summary placeholders 110 | streaming_loss_p = tf.placeholder(tf.float32) 111 | streaming_acc_p = tf.placeholder(tf.float32) 112 | val_acc_p = tf.placeholder(tf.float32) 113 | val_summ_ops = tf.summary.scalar('validation_acc', val_acc_p) 114 | train_summ_ops = tf.summary.merge([ 115 | tf.summary.scalar('streaming_loss', streaming_loss_p), 116 | tf.summary.scalar('streaming_accuracy', streaming_acc_p) 117 | ]) 118 | 119 | # start training 120 | num_iter = 18750 # 10 epochs 121 | log_iter = 1875 122 | val_iter = 1875 123 | with tf.Session() as sess: 124 | sess.run(tf.global_variables_initializer()) 125 | sess.run(init_op_train) 126 | 127 | # logs for TensorBoard 128 | logdir = 'logs' 129 | writer = tf.summary.FileWriter(logdir, sess.graph) # visualize the graph 130 | 131 | # load / save checkpoints 132 | checkpoint_path = 'checkpoints' 133 | saver = tf.train.Saver(max_to_keep=None) 134 | ckpt = tf.train.get_checkpoint_state(checkpoint_path) 135 | 136 | # resume training if a checkpoint exists 137 | if ckpt and ckpt.model_checkpoint_path: 138 | saver.restore(sess, ckpt.model_checkpoint_path) 139 | print("Loaded parameters from {}".format(ckpt.model_checkpoint_path)) 140 | 141 | initial_step = global_step.eval() 142 | 143 | streaming_loss = 0 144 | streaming_accuracy = 0 145 | 146 | for i in range(initial_step, num_iter + 1): 147 | _, loss_batch, acc_batch = sess.run([optimizer, loss, accuracy], feed_dict={training: True}) ############################## 148 | streaming_loss += loss_batch 149 | streaming_accuracy += acc_batch 150 | if i % log_iter == 0: 151 | print("Iteration: {}, Streaming loss: {:.2f}, Streaming accuracy: {:.2f}" 152 | .format(i, streaming_loss/log_iter, streaming_accuracy/log_iter)) 153 | 154 | # save to log file for TensorBoard 155 | summary_train = sess.run(train_summ_ops, feed_dict={streaming_loss_p: streaming_loss, 156 | streaming_acc_p: streaming_accuracy}) 157 | writer.add_summary(summary_train, global_step=i) 158 | 159 | streaming_loss = 0 160 | streaming_accuracy = 0 161 | 162 | if i % val_iter == 0: 163 | saver.save(sess, os.path.join(checkpoint_path, 'checkpoint'), global_step=global_step) 164 | 
print("Model saved!") 165 | 166 | sess.run(init_op_val) 167 | validation_accuracy = 0 168 | num_iter = 0 169 | while True: 170 | try: 171 | acc_batch = sess.run(accuracy, feed_dict={training: False}) ############################## 172 | validation_accuracy += acc_batch 173 | num_iter += 1 174 | except tf.errors.OutOfRangeError: 175 | validation_accuracy /= num_iter 176 | print("Iteration: {}, Validation accuracy: {:.2f}".format(i, validation_accuracy)) 177 | 178 | # save log file to TensorBoard 179 | summary_val = sess.run(val_summ_ops, feed_dict={val_acc_p: validation_accuracy}) 180 | writer.add_summary(summary_val, global_step=i) 181 | 182 | sess.run(init_op_train) # switch back to training set 183 | break 184 | writer.close() 185 | 186 | if __name__ == "__main__": 187 | train() 188 | -------------------------------------------------------------------------------- /S6/create_tf_records.py: -------------------------------------------------------------------------------- 1 | """ Converts an image dataset into TFRecords. The dataset should be organized as: 2 | 3 | base_dir: 4 | -- class_name1 5 | ---- image_name.jpg 6 | ... 7 | -- class_name2 8 | ---- image_name.jpg 9 | ... 10 | -- class_name3 11 | ---- image_name.jpg 12 | ... 13 | 14 | Example: 15 | $ python create_tf_records.py --input_dir ./dataset --output_dir ./tfrecords --num_shards 10 --split_ratio 0.2 16 | """ 17 | 18 | import tensorflow as tf 19 | import os, glob 20 | import argparse 21 | import random 22 | 23 | def _int64_feature(value): 24 | return tf.train.Feature(int64_list=tf.train.Int64List(value=[value])) 25 | 26 | def _bytes_feature(value): 27 | return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value])) 28 | 29 | def _create_tfexample(image_data, label): 30 | example = tf.train.Example(features=tf.train.Features(feature={ 31 | 'image': _bytes_feature(image_data), 32 | 'label': _int64_feature(label) 33 | })) 34 | return example 35 | 36 | def enumerate_classes(class_list, sort=True): 37 | class_ids = {} 38 | class_id_counter = 0 39 | 40 | if sort: 41 | class_list.sort() 42 | 43 | for class_name in class_list: 44 | if class_name not in class_ids: 45 | class_ids[class_name] = class_id_counter 46 | class_id_counter += 1 47 | 48 | return class_ids 49 | 50 | def create_tfrecords(save_dir, dataset_name, filenames, class_ids, num_shards): 51 | 52 | im_per_shard = int( len(filenames) / num_shards ) + 1 53 | 54 | for shard in range(num_shards): 55 | output_filename = os.path.join(save_dir, '{}_{:03d}-of-{:03d}.tfrecord' 56 | .format(dataset_name, shard, num_shards)) 57 | print('Writing into {}'.format(output_filename)) 58 | filenames_shard = filenames[shard*im_per_shard:(shard+1)*im_per_shard] 59 | 60 | with tf.python_io.TFRecordWriter(output_filename) as tfrecord_writer: 61 | 62 | for filename in filenames_shard: 63 | image = tf.gfile.FastGFile(filename, 'rb').read() 64 | class_name = os.path.basename(os.path.dirname(filename)) 65 | label = class_ids[class_name] 66 | 67 | example = _create_tfexample(image, label) 68 | tfrecord_writer.write(example.SerializeToString()) 69 | 70 | print('Finished writing {} images into TFRecords'.format(len(filenames))) 71 | 72 | def main(args): 73 | 74 | supported_formats = ['*.jpg', '*.JPG', '*.jpeg', '*.JPEG'] 75 | filenames = [] 76 | for extension in supported_formats: 77 | pattern = os.path.join(args.input_dir, '**', extension) 78 | filenames.extend(glob.glob(pattern, recursive=False)) 79 | 80 | random.seed(args.seed) 81 | random.shuffle(filenames) 82 | 83 | num_test = 
int(args.split_ratio * len(filenames)) 84 | num_shards_test = int(args.split_ratio * args.num_shards) 85 | num_shards_train = args.num_shards - num_shards_test 86 | 87 | # write the list of classes and their corresponding ids to a file 88 | class_list = [name for name in os.listdir(args.input_dir) 89 | if os.path.isdir(os.path.join(args.input_dir, name))] 90 | class_ids = enumerate_classes(class_list) 91 | with open(os.path.join(args.output_dir, 'classes.txt'), 'w') as f: 92 | for cid in class_ids: 93 | print('{}:{}'.format(class_ids[cid], cid), file=f) 94 | 95 | # create TFRecords for the training and test sets 96 | create_tfrecords(save_dir=args.output_dir, 97 | dataset_name='train', 98 | filenames=filenames[num_test:], 99 | class_ids=class_ids, 100 | num_shards=num_shards_train) 101 | create_tfrecords(save_dir=args.output_dir, 102 | dataset_name='test', 103 | filenames=filenames[:num_test], 104 | class_ids=class_ids, 105 | num_shards=num_shards_test) 106 | 107 | if __name__ == '__main__': 108 | parser = argparse.ArgumentParser() 109 | parser.add_argument('--input_dir', type=str, 110 | help='path to the directory where the images will be read from') 111 | parser.add_argument('--output_dir', type=str, 112 | help='path to the directory where the TFRecords will be saved to') 113 | parser.add_argument('--num_shards', type=int, 114 | help='total number of shards') 115 | parser.add_argument('--split_ratio', type=float, default=0.2, 116 | help='ratio of number of images in the test set to the total number of images') 117 | parser.add_argument('--seed', type=int, default=42, 118 | help='random seed for repeatable train/test splits') 119 | args = parser.parse_args() 120 | main(args) 121 | -------------------------------------------------------------------------------- /S6/freeze_model.py: -------------------------------------------------------------------------------- 1 | """ Freezes a checkpoint, outputs a single pbfile that encapsulates both the graph and weights 2 | Example: 3 | $ python freeze_model.py --checkpoint_path ./checkpoints 4 | """ 5 | 6 | import tensorflow as tf 7 | import argparse 8 | from nets import mobilenet_v1 9 | import os 10 | 11 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3' 12 | 13 | def freeze_graph(checkpoint_path, output_node_name, outfile): 14 | input_layer = tf.placeholder(tf.uint8, shape=[None, None, 3], name='input') 15 | with tf.variable_scope('input_scaling'): 16 | image = tf.expand_dims(input_layer, axis=0) 17 | image = tf.image.resize_bilinear(image, [224, 224]) 18 | image = tf.cast(image, tf.float32) 19 | image = image / 127.5 20 | image = image - 1 21 | 22 | logits, _ = mobilenet_v1.mobilenet_v1(image, num_classes=2, is_training=False) 23 | preds = tf.squeeze(tf.nn.softmax(logits), name='preds') 24 | 25 | with tf.Session() as sess: 26 | ckpt = tf.train.get_checkpoint_state(checkpoint_path) 27 | saver = tf.train.Saver() 28 | saver.restore(sess, ckpt.model_checkpoint_path) 29 | 30 | output_graph_def = tf.graph_util.convert_variables_to_constants( 31 | sess, tf.get_default_graph().as_graph_def(), [output_node_name]) 32 | 33 | with tf.gfile.GFile(outfile, 'wb') as f: 34 | f.write(output_graph_def.SerializeToString()) 35 | 36 | # print a list of ops 37 | for op in output_graph_def.node: 38 | print(op.name) 39 | 40 | print('Saved frozen model to {}'.format(outfile)) 41 | print('{:d} ops in the final graph.'.format(len(output_graph_def.node))) 42 | 43 | if __name__ == '__main__': 44 | parser = argparse.ArgumentParser() 45 | parser.add_argument('--checkpoint_path', 
type=str, default='./', help="Path to the dir where the checkpoints are saved") 46 | parser.add_argument('--output_node_name', type=str, default='preds', help="Name of the output node") 47 | parser.add_argument('--outfile', type=str, default='frozen_model.pb', help="Frozen model path") 48 | args = parser.parse_args() 49 | freeze_graph(args.checkpoint_path, args.output_node_name, args.outfile) -------------------------------------------------------------------------------- /S6/inference.py: -------------------------------------------------------------------------------- 1 | """ Runs inference given a frozen model and a set of images 2 | Example: 3 | $ python inference.py --frozen_model frozen_model.pb --input_path ./test_images 4 | """ 5 | 6 | import argparse 7 | import tensorflow as tf 8 | import os, glob 9 | import cv2 10 | 11 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3' 12 | 13 | class InferenceEngine: 14 | def __init__(self, frozen_graph_filename): 15 | with tf.gfile.GFile(frozen_graph_filename, "rb") as f: 16 | graph_def = tf.GraphDef() 17 | graph_def.ParseFromString(f.read()) 18 | 19 | with tf.Graph().as_default() as graph: 20 | tf.import_graph_def(graph_def, name="Pretrained") 21 | 22 | self.graph = graph 23 | 24 | def run_inference(self, input_path): 25 | if os.path.isdir(input_path): 26 | filenames = glob.glob(os.path.join(input_path, '*.jpg')) 27 | filenames.extend(glob.glob(os.path.join(input_path, '*.jpeg'))) 28 | filenames.extend(glob.glob(os.path.join(input_path, '*.png'))) 29 | filenames.extend(glob.glob(os.path.join(input_path, '*.bmp'))) 30 | else: 31 | filenames = [input_path] 32 | 33 | input_layer = self.graph.get_tensor_by_name('Pretrained/input:0') 34 | preds = self.graph.get_tensor_by_name('Pretrained/preds:0') 35 | pred_idx = tf.argmax(preds) 36 | 37 | with tf.Session(graph=self.graph) as sess: 38 | for filename in filenames: 39 | image = cv2.imread(filename) 40 | class_label, probs = sess.run([pred_idx, preds], feed_dict={input_layer: image}) 41 | print("Label: {:d}, Probability: {:.2f} \t File: {}".format(class_label, probs[class_label], filename)) 42 | 43 | if __name__ == '__main__': 44 | parser = argparse.ArgumentParser() 45 | parser.add_argument("--frozen_model", default="frozen_model.pb", type=str, help="Path to the frozen model file to import") 46 | parser.add_argument("--input_path", type=str, help="Path to the input file(s). If this is a dir all files will be processed.") 47 | args = parser.parse_args() 48 | 49 | ie = InferenceEngine(args.frozen_model) 50 | ie.run_inference(args.input_path) 51 | 52 | -------------------------------------------------------------------------------- /S6/nets/mobilenet_v1.py: -------------------------------------------------------------------------------- 1 | # Copyright 2017 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================= 15 | """MobileNet v1. 
16 | 17 | MobileNet is a general architecture and can be used for multiple use cases. 18 | Depending on the use case, it can use different input layer size and different 19 | head (for example: embeddings, localization and classification). 20 | 21 | As described in https://arxiv.org/abs/1704.04861. 22 | 23 | MobileNets: Efficient Convolutional Neural Networks for 24 | Mobile Vision Applications 25 | Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, 26 | Tobias Weyand, Marco Andreetto, Hartwig Adam 27 | 28 | 100% Mobilenet V1 (base) with input size 224x224: 29 | 30 | See mobilenet_v1() 31 | 32 | Layer params macs 33 | -------------------------------------------------------------------------------- 34 | MobilenetV1/Conv2d_0/Conv2D: 864 10,838,016 35 | MobilenetV1/Conv2d_1_depthwise/depthwise: 288 3,612,672 36 | MobilenetV1/Conv2d_1_pointwise/Conv2D: 2,048 25,690,112 37 | MobilenetV1/Conv2d_2_depthwise/depthwise: 576 1,806,336 38 | MobilenetV1/Conv2d_2_pointwise/Conv2D: 8,192 25,690,112 39 | MobilenetV1/Conv2d_3_depthwise/depthwise: 1,152 3,612,672 40 | MobilenetV1/Conv2d_3_pointwise/Conv2D: 16,384 51,380,224 41 | MobilenetV1/Conv2d_4_depthwise/depthwise: 1,152 903,168 42 | MobilenetV1/Conv2d_4_pointwise/Conv2D: 32,768 25,690,112 43 | MobilenetV1/Conv2d_5_depthwise/depthwise: 2,304 1,806,336 44 | MobilenetV1/Conv2d_5_pointwise/Conv2D: 65,536 51,380,224 45 | MobilenetV1/Conv2d_6_depthwise/depthwise: 2,304 451,584 46 | MobilenetV1/Conv2d_6_pointwise/Conv2D: 131,072 25,690,112 47 | MobilenetV1/Conv2d_7_depthwise/depthwise: 4,608 903,168 48 | MobilenetV1/Conv2d_7_pointwise/Conv2D: 262,144 51,380,224 49 | MobilenetV1/Conv2d_8_depthwise/depthwise: 4,608 903,168 50 | MobilenetV1/Conv2d_8_pointwise/Conv2D: 262,144 51,380,224 51 | MobilenetV1/Conv2d_9_depthwise/depthwise: 4,608 903,168 52 | MobilenetV1/Conv2d_9_pointwise/Conv2D: 262,144 51,380,224 53 | MobilenetV1/Conv2d_10_depthwise/depthwise: 4,608 903,168 54 | MobilenetV1/Conv2d_10_pointwise/Conv2D: 262,144 51,380,224 55 | MobilenetV1/Conv2d_11_depthwise/depthwise: 4,608 903,168 56 | MobilenetV1/Conv2d_11_pointwise/Conv2D: 262,144 51,380,224 57 | MobilenetV1/Conv2d_12_depthwise/depthwise: 4,608 225,792 58 | MobilenetV1/Conv2d_12_pointwise/Conv2D: 524,288 25,690,112 59 | MobilenetV1/Conv2d_13_depthwise/depthwise: 9,216 451,584 60 | MobilenetV1/Conv2d_13_pointwise/Conv2D: 1,048,576 51,380,224 61 | -------------------------------------------------------------------------------- 62 | Total: 3,185,088 567,716,352 63 | 64 | 65 | 75% Mobilenet V1 (base) with input size 128x128: 66 | 67 | See mobilenet_v1_075() 68 | 69 | Layer params macs 70 | -------------------------------------------------------------------------------- 71 | MobilenetV1/Conv2d_0/Conv2D: 648 2,654,208 72 | MobilenetV1/Conv2d_1_depthwise/depthwise: 216 884,736 73 | MobilenetV1/Conv2d_1_pointwise/Conv2D: 1,152 4,718,592 74 | MobilenetV1/Conv2d_2_depthwise/depthwise: 432 442,368 75 | MobilenetV1/Conv2d_2_pointwise/Conv2D: 4,608 4,718,592 76 | MobilenetV1/Conv2d_3_depthwise/depthwise: 864 884,736 77 | MobilenetV1/Conv2d_3_pointwise/Conv2D: 9,216 9,437,184 78 | MobilenetV1/Conv2d_4_depthwise/depthwise: 864 221,184 79 | MobilenetV1/Conv2d_4_pointwise/Conv2D: 18,432 4,718,592 80 | MobilenetV1/Conv2d_5_depthwise/depthwise: 1,728 442,368 81 | MobilenetV1/Conv2d_5_pointwise/Conv2D: 36,864 9,437,184 82 | MobilenetV1/Conv2d_6_depthwise/depthwise: 1,728 110,592 83 | MobilenetV1/Conv2d_6_pointwise/Conv2D: 73,728 4,718,592 84 | MobilenetV1/Conv2d_7_depthwise/depthwise: 3,456 
221,184 85 | MobilenetV1/Conv2d_7_pointwise/Conv2D: 147,456 9,437,184 86 | MobilenetV1/Conv2d_8_depthwise/depthwise: 3,456 221,184 87 | MobilenetV1/Conv2d_8_pointwise/Conv2D: 147,456 9,437,184 88 | MobilenetV1/Conv2d_9_depthwise/depthwise: 3,456 221,184 89 | MobilenetV1/Conv2d_9_pointwise/Conv2D: 147,456 9,437,184 90 | MobilenetV1/Conv2d_10_depthwise/depthwise: 3,456 221,184 91 | MobilenetV1/Conv2d_10_pointwise/Conv2D: 147,456 9,437,184 92 | MobilenetV1/Conv2d_11_depthwise/depthwise: 3,456 221,184 93 | MobilenetV1/Conv2d_11_pointwise/Conv2D: 147,456 9,437,184 94 | MobilenetV1/Conv2d_12_depthwise/depthwise: 3,456 55,296 95 | MobilenetV1/Conv2d_12_pointwise/Conv2D: 294,912 4,718,592 96 | MobilenetV1/Conv2d_13_depthwise/depthwise: 6,912 110,592 97 | MobilenetV1/Conv2d_13_pointwise/Conv2D: 589,824 9,437,184 98 | -------------------------------------------------------------------------------- 99 | Total: 1,800,144 106,002,432 100 | 101 | """ 102 | 103 | # Tensorflow mandates these. 104 | from __future__ import absolute_import 105 | from __future__ import division 106 | from __future__ import print_function 107 | 108 | from collections import namedtuple 109 | import functools 110 | 111 | import tensorflow as tf 112 | 113 | slim = tf.contrib.slim 114 | 115 | # Conv and DepthSepConv namedtuple define layers of the MobileNet architecture 116 | # Conv defines 3x3 convolution layers 117 | # DepthSepConv defines 3x3 depthwise convolution followed by 1x1 convolution. 118 | # stride is the stride of the convolution 119 | # depth is the number of channels or filters in a layer 120 | Conv = namedtuple('Conv', ['kernel', 'stride', 'depth']) 121 | DepthSepConv = namedtuple('DepthSepConv', ['kernel', 'stride', 'depth']) 122 | 123 | # MOBILENETV1_CONV_DEFS specifies the MobileNet body 124 | MOBILENETV1_CONV_DEFS = [ 125 | Conv(kernel=[3, 3], stride=2, depth=32), 126 | DepthSepConv(kernel=[3, 3], stride=1, depth=64), 127 | DepthSepConv(kernel=[3, 3], stride=2, depth=128), 128 | DepthSepConv(kernel=[3, 3], stride=1, depth=128), 129 | DepthSepConv(kernel=[3, 3], stride=2, depth=256), 130 | DepthSepConv(kernel=[3, 3], stride=1, depth=256), 131 | DepthSepConv(kernel=[3, 3], stride=2, depth=512), 132 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 133 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 134 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 135 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 136 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 137 | DepthSepConv(kernel=[3, 3], stride=2, depth=1024), 138 | DepthSepConv(kernel=[3, 3], stride=1, depth=1024) 139 | ] 140 | 141 | 142 | def _fixed_padding(inputs, kernel_size, rate=1): 143 | """Pads the input along the spatial dimensions independently of input size. 144 | 145 | Pads the input such that if it was used in a convolution with 'VALID' padding, 146 | the output would have the same dimensions as if the unpadded input was used 147 | in a convolution with 'SAME' padding. 148 | 149 | Args: 150 | inputs: A tensor of size [batch, height_in, width_in, channels]. 151 | kernel_size: The kernel to be used in the conv2d or max_pool2d operation. 152 | rate: An integer, rate for atrous convolution. 153 | 154 | Returns: 155 | output: A tensor of size [batch, height_out, width_out, channels] with the 156 | input, either intact (if kernel_size == 1) or padded (if kernel_size > 1). 
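  For example, with kernel_size=[3, 3] and rate=1, kernel_size_effective is
  [3, 3], pad_total is [2, 2], and a single row/column of zeros is added on
  each side of the input (pad_beg=[1, 1], pad_end=[1, 1]).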
157 | """ 158 | kernel_size_effective = [kernel_size[0] + (kernel_size[0] - 1) * (rate - 1), 159 | kernel_size[0] + (kernel_size[0] - 1) * (rate - 1)] 160 | pad_total = [kernel_size_effective[0] - 1, kernel_size_effective[1] - 1] 161 | pad_beg = [pad_total[0] // 2, pad_total[1] // 2] 162 | pad_end = [pad_total[0] - pad_beg[0], pad_total[1] - pad_beg[1]] 163 | padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg[0], pad_end[0]], 164 | [pad_beg[1], pad_end[1]], [0, 0]]) 165 | return padded_inputs 166 | 167 | 168 | def mobilenet_v1_base(inputs, 169 | final_endpoint='Conv2d_13_pointwise', 170 | min_depth=8, 171 | depth_multiplier=1.0, 172 | conv_defs=None, 173 | output_stride=None, 174 | use_explicit_padding=False, 175 | scope=None): 176 | """Mobilenet v1. 177 | 178 | Constructs a Mobilenet v1 network from inputs to the given final endpoint. 179 | 180 | Args: 181 | inputs: a tensor of shape [batch_size, height, width, channels]. 182 | final_endpoint: specifies the endpoint to construct the network up to. It 183 | can be one of ['Conv2d_0', 'Conv2d_1_pointwise', 'Conv2d_2_pointwise', 184 | 'Conv2d_3_pointwise', 'Conv2d_4_pointwise', 'Conv2d_5'_pointwise, 185 | 'Conv2d_6_pointwise', 'Conv2d_7_pointwise', 'Conv2d_8_pointwise', 186 | 'Conv2d_9_pointwise', 'Conv2d_10_pointwise', 'Conv2d_11_pointwise', 187 | 'Conv2d_12_pointwise', 'Conv2d_13_pointwise']. 188 | min_depth: Minimum depth value (number of channels) for all convolution ops. 189 | Enforced when depth_multiplier < 1, and not an active constraint when 190 | depth_multiplier >= 1. 191 | depth_multiplier: Float multiplier for the depth (number of channels) 192 | for all convolution ops. The value must be greater than zero. Typical 193 | usage will be to set this value in (0, 1) to reduce the number of 194 | parameters or computation cost of the model. 195 | conv_defs: A list of ConvDef namedtuples specifying the net architecture. 196 | output_stride: An integer that specifies the requested ratio of input to 197 | output spatial resolution. If not None, then we invoke atrous convolution 198 | if necessary to prevent the network from reducing the spatial resolution 199 | of the activation maps. Allowed values are 8 (accurate fully convolutional 200 | mode), 16 (fast fully convolutional mode), 32 (classification mode). 201 | use_explicit_padding: Use 'VALID' padding for convolutions, but prepad 202 | inputs so that the output dimensions are the same as if 'SAME' padding 203 | were used. 204 | scope: Optional variable_scope. 205 | 206 | Returns: 207 | tensor_out: output tensor corresponding to the final_endpoint. 208 | end_points: a set of activations for external use, for example summaries or 209 | losses. 210 | 211 | Raises: 212 | ValueError: if final_endpoint is not set to one of the predefined values, 213 | or depth_multiplier <= 0, or the target output_stride is not 214 | allowed. 215 | """ 216 | depth = lambda d: max(int(d * depth_multiplier), min_depth) 217 | end_points = {} 218 | 219 | # Used to find thinned depths for each layer. 
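  # For example, depth(64) = max(int(64 * 0.75), 8) = 48 when depth_multiplier
  # is 0.75; min_depth puts a floor under very small multipliers.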
220 | if depth_multiplier <= 0: 221 | raise ValueError('depth_multiplier is not greater than zero.') 222 | 223 | if conv_defs is None: 224 | conv_defs = MOBILENETV1_CONV_DEFS 225 | 226 | if output_stride is not None and output_stride not in [8, 16, 32]: 227 | raise ValueError('Only allowed output_stride values are 8, 16, 32.') 228 | 229 | padding = 'SAME' 230 | if use_explicit_padding: 231 | padding = 'VALID' 232 | with tf.variable_scope(scope, 'MobilenetV1', [inputs]): 233 | with slim.arg_scope([slim.conv2d, slim.separable_conv2d], padding=padding): 234 | # The current_stride variable keeps track of the output stride of the 235 | # activations, i.e., the running product of convolution strides up to the 236 | # current network layer. This allows us to invoke atrous convolution 237 | # whenever applying the next convolution would result in the activations 238 | # having output stride larger than the target output_stride. 239 | current_stride = 1 240 | 241 | # The atrous convolution rate parameter. 242 | rate = 1 243 | 244 | net = inputs 245 | for i, conv_def in enumerate(conv_defs): 246 | end_point_base = 'Conv2d_%d' % i 247 | 248 | if output_stride is not None and current_stride == output_stride: 249 | # If we have reached the target output_stride, then we need to employ 250 | # atrous convolution with stride=1 and multiply the atrous rate by the 251 | # current unit's stride for use in subsequent layers. 252 | layer_stride = 1 253 | layer_rate = rate 254 | rate *= conv_def.stride 255 | else: 256 | layer_stride = conv_def.stride 257 | layer_rate = 1 258 | current_stride *= conv_def.stride 259 | 260 | if isinstance(conv_def, Conv): 261 | end_point = end_point_base 262 | if use_explicit_padding: 263 | net = _fixed_padding(net, conv_def.kernel) 264 | net = slim.conv2d(net, depth(conv_def.depth), conv_def.kernel, 265 | stride=conv_def.stride, 266 | normalizer_fn=slim.batch_norm, 267 | scope=end_point) 268 | end_points[end_point] = net 269 | if end_point == final_endpoint: 270 | return net, end_points 271 | 272 | elif isinstance(conv_def, DepthSepConv): 273 | end_point = end_point_base + '_depthwise' 274 | 275 | # By passing filters=None 276 | # separable_conv2d produces only a depthwise convolution layer 277 | if use_explicit_padding: 278 | net = _fixed_padding(net, conv_def.kernel, layer_rate) 279 | net = slim.separable_conv2d(net, None, conv_def.kernel, 280 | depth_multiplier=1, 281 | stride=layer_stride, 282 | rate=layer_rate, 283 | normalizer_fn=slim.batch_norm, 284 | scope=end_point) 285 | 286 | end_points[end_point] = net 287 | if end_point == final_endpoint: 288 | return net, end_points 289 | 290 | end_point = end_point_base + '_pointwise' 291 | 292 | net = slim.conv2d(net, depth(conv_def.depth), [1, 1], 293 | stride=1, 294 | normalizer_fn=slim.batch_norm, 295 | scope=end_point) 296 | 297 | end_points[end_point] = net 298 | if end_point == final_endpoint: 299 | return net, end_points 300 | else: 301 | raise ValueError('Unknown convolution type %s for layer %d' 302 | % (conv_def.ltype, i)) 303 | raise ValueError('Unknown final endpoint %s' % final_endpoint) 304 | 305 | 306 | def mobilenet_v1(inputs, 307 | num_classes=1000, 308 | dropout_keep_prob=0.999, 309 | is_training=True, 310 | min_depth=8, 311 | depth_multiplier=1.0, 312 | conv_defs=None, 313 | prediction_fn=tf.contrib.layers.softmax, 314 | spatial_squeeze=True, 315 | reuse=None, 316 | scope='MobilenetV1', 317 | global_pool=False): 318 | """Mobilenet v1 model for classification. 
319 | 320 | Args: 321 | inputs: a tensor of shape [batch_size, height, width, channels]. 322 | num_classes: number of predicted classes. If 0 or None, the logits layer 323 | is omitted and the input features to the logits layer (before dropout) 324 | are returned instead. 325 | dropout_keep_prob: the percentage of activation values that are retained. 326 | is_training: whether is training or not. 327 | min_depth: Minimum depth value (number of channels) for all convolution ops. 328 | Enforced when depth_multiplier < 1, and not an active constraint when 329 | depth_multiplier >= 1. 330 | depth_multiplier: Float multiplier for the depth (number of channels) 331 | for all convolution ops. The value must be greater than zero. Typical 332 | usage will be to set this value in (0, 1) to reduce the number of 333 | parameters or computation cost of the model. 334 | conv_defs: A list of ConvDef namedtuples specifying the net architecture. 335 | prediction_fn: a function to get predictions out of logits. 336 | spatial_squeeze: if True, logits is of shape is [B, C], if false logits is 337 | of shape [B, 1, 1, C], where B is batch_size and C is number of classes. 338 | reuse: whether or not the network and its variables should be reused. To be 339 | able to reuse 'scope' must be given. 340 | scope: Optional variable_scope. 341 | global_pool: Optional boolean flag to control the avgpooling before the 342 | logits layer. If false or unset, pooling is done with a fixed window 343 | that reduces default-sized inputs to 1x1, while larger inputs lead to 344 | larger outputs. If true, any input size is pooled down to 1x1. 345 | 346 | Returns: 347 | net: a 2D Tensor with the logits (pre-softmax activations) if num_classes 348 | is a non-zero integer, or the non-dropped-out input to the logits layer 349 | if num_classes is 0 or None. 350 | end_points: a dictionary from components of the network to the corresponding 351 | activation. 352 | 353 | Raises: 354 | ValueError: Input rank is invalid. 355 | """ 356 | input_shape = inputs.get_shape().as_list() 357 | if len(input_shape) != 4: 358 | raise ValueError('Invalid input tensor rank, expected 4, was: %d' % 359 | len(input_shape)) 360 | 361 | with tf.variable_scope(scope, 'MobilenetV1', [inputs], reuse=reuse) as scope: 362 | with slim.arg_scope([slim.batch_norm, slim.dropout], 363 | is_training=is_training): 364 | net, end_points = mobilenet_v1_base(inputs, scope=scope, 365 | min_depth=min_depth, 366 | depth_multiplier=depth_multiplier, 367 | conv_defs=conv_defs) 368 | with tf.variable_scope('Logits'): 369 | if global_pool: 370 | # Global average pooling. 371 | net = tf.reduce_mean(net, [1, 2], keep_dims=True, name='global_pool') 372 | end_points['global_pool'] = net 373 | else: 374 | # Pooling with a fixed kernel size. 
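# With the default 224x224 input the feature map entering this head is 7x7
# (the base network has an output stride of 32, and 224 / 32 = 7), so the
# [7, 7] window below pools it down to 1x1; for smaller inputs,
# _reduced_kernel_size_for_small_input shrinks the window to fit.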
375 | kernel_size = _reduced_kernel_size_for_small_input(net, [7, 7]) 376 | net = slim.avg_pool2d(net, kernel_size, padding='VALID', 377 | scope='AvgPool_1a') 378 | end_points['AvgPool_1a'] = net 379 | if not num_classes: 380 | return net, end_points 381 | # 1 x 1 x 1024 382 | net = slim.dropout(net, keep_prob=dropout_keep_prob, scope='Dropout_1b') 383 | logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None, 384 | normalizer_fn=None, scope='Conv2d_1c_1x1') 385 | if spatial_squeeze: 386 | logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze') 387 | end_points['Logits'] = logits 388 | if prediction_fn: 389 | end_points['Predictions'] = prediction_fn(logits, scope='Predictions') 390 | return logits, end_points 391 | 392 | mobilenet_v1.default_image_size = 224 393 | 394 | 395 | def wrapped_partial(func, *args, **kwargs): 396 | partial_func = functools.partial(func, *args, **kwargs) 397 | functools.update_wrapper(partial_func, func) 398 | return partial_func 399 | 400 | 401 | mobilenet_v1_075 = wrapped_partial(mobilenet_v1, depth_multiplier=0.75) 402 | mobilenet_v1_050 = wrapped_partial(mobilenet_v1, depth_multiplier=0.50) 403 | mobilenet_v1_025 = wrapped_partial(mobilenet_v1, depth_multiplier=0.25) 404 | 405 | 406 | def _reduced_kernel_size_for_small_input(input_tensor, kernel_size): 407 | """Define kernel size which is automatically reduced for small input. 408 | 409 | If the shape of the input images is unknown at graph construction time this 410 | function assumes that the input images are large enough. 411 | 412 | Args: 413 | input_tensor: input tensor of size [batch_size, height, width, channels]. 414 | kernel_size: desired kernel size of length 2: [kernel_height, kernel_width] 415 | 416 | Returns: 417 | a tensor with the kernel size. 418 | """ 419 | shape = input_tensor.get_shape().as_list() 420 | if shape[1] is None or shape[2] is None: 421 | kernel_size_out = kernel_size 422 | else: 423 | kernel_size_out = [min(shape[1], kernel_size[0]), 424 | min(shape[2], kernel_size[1])] 425 | return kernel_size_out 426 | 427 | 428 | def mobilenet_v1_arg_scope( 429 | is_training=True, 430 | weight_decay=0.00004, 431 | stddev=0.09, 432 | regularize_depthwise=False, 433 | batch_norm_decay=0.9997, 434 | batch_norm_epsilon=0.001, 435 | batch_norm_updates_collections=tf.GraphKeys.UPDATE_OPS): 436 | """Defines the default MobilenetV1 arg scope. 437 | 438 | Args: 439 | is_training: Whether or not we're training the model. If this is set to 440 | None, the parameter is not added to the batch_norm arg_scope. 441 | weight_decay: The weight decay to use for regularizing the model. 442 | stddev: The standard deviation of the trunctated normal weight initializer. 443 | regularize_depthwise: Whether or not apply regularization on depthwise. 444 | batch_norm_decay: Decay for batch norm moving average. 445 | batch_norm_epsilon: Small float added to variance to avoid dividing by zero 446 | in batch norm. 447 | batch_norm_updates_collections: Collection for the update ops for 448 | batch norm. 449 | 450 | Returns: 451 | An `arg_scope` to use for the mobilenet v1 model. 452 | """ 453 | batch_norm_params = { 454 | 'center': True, 455 | 'scale': True, 456 | 'decay': batch_norm_decay, 457 | 'epsilon': batch_norm_epsilon, 458 | 'updates_collections': batch_norm_updates_collections, 459 | } 460 | if is_training is not None: 461 | batch_norm_params['is_training'] = is_training 462 | 463 | # Set weight_decay for weights in Conv and DepthSepConv layers. 
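# Typical usage (a sketch -- the variable names here are illustrative):
# build the model inside this scope so that every conv layer picks up the
# initializer, regularizer, and batch-norm settings defined below, e.g.
#
#   with slim.arg_scope(mobilenet_v1_arg_scope(is_training=True)):
#       logits, end_points = mobilenet_v1(images, num_classes=num_classes)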
464 | weights_init = tf.truncated_normal_initializer(stddev=stddev) 465 | regularizer = tf.contrib.layers.l2_regularizer(weight_decay) 466 | if regularize_depthwise: 467 | depthwise_regularizer = regularizer 468 | else: 469 | depthwise_regularizer = None 470 | with slim.arg_scope([slim.conv2d, slim.separable_conv2d], 471 | weights_initializer=weights_init, 472 | activation_fn=tf.nn.relu6, normalizer_fn=slim.batch_norm): 473 | with slim.arg_scope([slim.batch_norm], **batch_norm_params): 474 | with slim.arg_scope([slim.conv2d], weights_regularizer=regularizer): 475 | with slim.arg_scope([slim.separable_conv2d], 476 | weights_regularizer=depthwise_regularizer) as sc: 477 | return sc 478 | -------------------------------------------------------------------------------- /S6/resize_images.py: -------------------------------------------------------------------------------- 1 | """ A utility script that resizes all images in a given directory to a specified size 2 | WARNING: the original images will be overwritten! 3 | """ 4 | 5 | import cv2 6 | import os, glob 7 | import argparse 8 | 9 | def main(args): 10 | supported_formats = ['*.jpg', '*.JPG', '*.jpeg', '*.JPEG'] 11 | filenames = [] 12 | for extension in supported_formats: 13 | pattern = os.path.join(args.input_dir, '**', extension) 14 | filenames.extend(glob.glob(pattern, recursive=True)) 15 | 16 | num_images = len(filenames) 17 | for i in range(num_images): 18 | if i % 100 == 0: 19 | print("{} of {} \t Resizing: {}".format(i, num_images, filenames[i])) 20 | image = cv2.imread(filenames[i]) 21 | image = cv2.resize(image, (args.resize, args.resize), interpolation=cv2.INTER_AREA) 22 | cv2.imwrite(filenames[i], image) 23 | 24 | if __name__ == '__main__': 25 | # resizes images in-place 26 | parser = argparse.ArgumentParser() 27 | parser.add_argument('--input_dir', type=str, 28 | help='path to the directory where the images will be read from') 29 | parser.add_argument('--resize', type=int, 30 | help='the images will be resized to NxN') 31 | 32 | args = parser.parse_args() 33 | main(args) 34 | -------------------------------------------------------------------------------- /S6/trainer.py: -------------------------------------------------------------------------------- 1 | """ Trains a TensorFlow model 2 | Example: 3 | $ python trainer.py --checkpoint_path ./checkpoints --data_path ./tfrecords 4 | """ 5 | 6 | import tensorflow as tf 7 | import numpy as np 8 | import os, glob 9 | import argparse 10 | from nets import mobilenet_v1 ##################################################### 11 | 12 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3' 13 | 14 | class TFModelTrainer: 15 | 16 | def __init__(self, checkpoint_path, data_path): 17 | self.checkpoint_path = checkpoint_path 18 | 19 | # set training parameters ################################################# 20 | self.learning_rate = 0.01 21 | self.num_iter = 100000 22 | self.save_iter = 5000 23 | self.val_iter = 5000 24 | self.log_iter = 100 25 | self.batch_size = 32 26 | 27 | # set up data layer 28 | self.training_filenames = glob.glob(os.path.join(data_path, 'train_*.tfrecord')) 29 | self.validation_filenames = glob.glob(os.path.join(data_path, 'test_*.tfrecord')) 30 | self.iterator, self.filenames = self._data_layer() 31 | self.num_val_samples = 10000 32 | self.num_classes = 2 33 | self.image_size = 224 34 | 35 | def preprocess_image(self, image_string): 36 | image = tf.image.decode_jpeg(image_string, channels=3) 37 | 38 | # flip for data augmentation 39 | image = tf.image.random_flip_left_right(image) 
############################ 40 | 41 | # normalize image to [-1, +1] 42 | image = tf.cast(image, tf.float32) 43 | image = image / 127.5 44 | image = image - 1 45 | return image 46 | 47 | def _parse_tfrecord(self, example_proto): ##################################### 48 | keys_to_features = {'image': tf.FixedLenFeature([], tf.string), 49 | 'label': tf.FixedLenFeature([], tf.int64)} 50 | parsed_features = tf.parse_single_example(example_proto, keys_to_features) 51 | image = parsed_features['image'] 52 | label = parsed_features['label'] 53 | image = self.preprocess_image(image) 54 | return image, label 55 | 56 | def _data_layer(self, num_threads=8, prefetch_buffer=100): 57 | with tf.variable_scope('data'): 58 | filenames = tf.placeholder(tf.string, shape=[None]) 59 | dataset = tf.data.TFRecordDataset(filenames) ########################## 60 | dataset = dataset.map(self._parse_tfrecord, num_parallel_calls=num_threads) 61 | dataset = dataset.repeat() 62 | dataset = dataset.batch(self.batch_size) 63 | dataset = dataset.prefetch(prefetch_buffer) 64 | iterator = dataset.make_initializable_iterator() 65 | return iterator, filenames 66 | 67 | def _loss_functions(self, logits, labels): 68 | with tf.variable_scope('loss'): 69 | target_prob = tf.one_hot(labels, self.num_classes) 70 | tf.losses.softmax_cross_entropy(target_prob, logits) 71 | total_loss = tf.losses.get_total_loss() #include regularization loss 72 | return total_loss 73 | 74 | def _optimizer(self, total_loss, global_step): 75 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) 76 | with tf.control_dependencies(update_ops): 77 | optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate, epsilon=0.1) 78 | optimizer = optimizer.minimize(total_loss, global_step=global_step) 79 | return optimizer 80 | 81 | def _performance_metric(self, logits, labels): 82 | with tf.variable_scope("performance_metric"): 83 | preds = tf.argmax(logits, axis=1) 84 | labels = tf.cast(labels, tf.int64) 85 | corrects = tf.equal(preds, labels) 86 | accuracy = tf.reduce_mean(tf.cast(corrects, tf.float32)) 87 | return accuracy 88 | 89 | def train(self): 90 | # iteration number 91 | global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name='iter_number') 92 | 93 | # training graph 94 | images, labels = self.iterator.get_next() 95 | images = tf.image.resize_bilinear(images, (self.image_size, self.image_size)) 96 | training = tf.placeholder(tf.bool, name='is_training') 97 | logits, _ = mobilenet_v1.mobilenet_v1(images, 98 | num_classes=self.num_classes, 99 | is_training=training, 100 | scope='MobilenetV1', 101 | global_pool=True) ############################################ 102 | loss = self._loss_functions(logits, labels) 103 | optimizer = self._optimizer(loss, global_step) 104 | accuracy = self._performance_metric(logits, labels) 105 | 106 | # summary placeholders 107 | streaming_loss_p = tf.placeholder(tf.float32) 108 | accuracy_p = tf.placeholder(tf.float32) 109 | summ_op_train = tf.summary.scalar('streaming_loss', streaming_loss_p) 110 | summ_op_test = tf.summary.scalar('accuracy', accuracy_p) 111 | 112 | with tf.Session() as sess: 113 | sess.run(tf.global_variables_initializer()) 114 | sess.run(self.iterator.initializer, feed_dict={self.filenames: self.training_filenames}) 115 | 116 | writer = tf.summary.FileWriter(self.checkpoint_path, sess.graph) 117 | 118 | saver = tf.train.Saver(max_to_keep=None) # keep all checkpoints 119 | ckpt = tf.train.get_checkpoint_state(self.checkpoint_path) 120 | 121 | # resume training if a checkpoint exists 122 
| if ckpt and ckpt.model_checkpoint_path:
123 |                 saver.restore(sess, ckpt.model_checkpoint_path)
124 |                 print('Loaded parameters from {}'.format(ckpt.model_checkpoint_path))
125 |
126 |             initial_step = global_step.eval()
127 |
128 |             # train the model
129 |             streaming_loss = 0
130 |             for i in range(initial_step, self.num_iter + 1):
131 |                 _, loss_batch = sess.run([optimizer, loss], feed_dict={training: True})
132 |
133 |                 if not np.isfinite(loss_batch):
134 |                     print('loss diverged, stopping')
135 |                     exit()
136 |
137 |                 # log summary
138 |                 streaming_loss += loss_batch
139 |                 if i % self.log_iter == self.log_iter - 1:
140 |                     streaming_loss /= self.log_iter
141 |                     print(i + 1, streaming_loss)
142 |                     summary_train = sess.run(summ_op_train, feed_dict={streaming_loss_p: streaming_loss})
143 |                     writer.add_summary(summary_train, global_step=i)
144 |                     streaming_loss = 0
145 |
146 |                 # save model
147 |                 if i % self.save_iter == self.save_iter - 1:
148 |                     saver.save(sess, os.path.join(self.checkpoint_path, 'checkpoint'), global_step=global_step)
149 |                     print("Model saved!")
150 |
151 |                 # run validation
152 |                 if i % self.val_iter == self.val_iter - 1:
153 |                     print("Running validation.")
154 |                     sess.run(self.iterator.initializer, feed_dict={self.filenames: self.validation_filenames})
155 |
156 |                     validation_accuracy = 0
157 |                     for j in range(self.num_val_samples // self.batch_size): ###################################
158 |                         acc_batch = sess.run(accuracy, feed_dict={training: False})
159 |                         validation_accuracy += acc_batch
160 |                     validation_accuracy /= (self.num_val_samples // self.batch_size)  # average over the number of batches, not the last loop index
161 |
162 |                     print("Accuracy: {}".format(validation_accuracy))
163 |
164 |                     summary_test = sess.run(summ_op_test, feed_dict={accuracy_p: validation_accuracy})
165 |                     writer.add_summary(summary_test, global_step=i)
166 |
167 |                     sess.run(self.iterator.initializer, feed_dict={self.filenames: self.training_filenames})
168 |
169 |             writer.close()
170 |
171 | def main():
172 |     parser = argparse.ArgumentParser()
173 |     parser.add_argument('--checkpoint_path', type=str, default='./checkpoints/',
174 |                         help="Path to the dir where the checkpoints are saved")
175 |     parser.add_argument('--data_path', type=str, default='./tfrecords/', help="Path to the TFRecords")
176 |     args = parser.parse_args()
177 |     trainer = TFModelTrainer(args.checkpoint_path, args.data_path)
178 |     trainer.train()
179 |
180 | if __name__ == '__main__':
181 |     main()
182 |
--------------------------------------------------------------------------------
/S7/checkpoints/checkpoint:
--------------------------------------------------------------------------------
1 | model_checkpoint_path: "mobilenet_v1_1.0_224.ckpt"
2 |
--------------------------------------------------------------------------------
/S7/nets/mobilenet_v1.py:
--------------------------------------------------------------------------------
1 | # Copyright 2017 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | #     http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ============================================================================= 15 | """MobileNet v1. 16 | 17 | MobileNet is a general architecture and can be used for multiple use cases. 18 | Depending on the use case, it can use different input layer size and different 19 | head (for example: embeddings, localization and classification). 20 | 21 | As described in https://arxiv.org/abs/1704.04861. 22 | 23 | MobileNets: Efficient Convolutional Neural Networks for 24 | Mobile Vision Applications 25 | Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, 26 | Tobias Weyand, Marco Andreetto, Hartwig Adam 27 | 28 | 100% Mobilenet V1 (base) with input size 224x224: 29 | 30 | See mobilenet_v1() 31 | 32 | Layer params macs 33 | -------------------------------------------------------------------------------- 34 | MobilenetV1/Conv2d_0/Conv2D: 864 10,838,016 35 | MobilenetV1/Conv2d_1_depthwise/depthwise: 288 3,612,672 36 | MobilenetV1/Conv2d_1_pointwise/Conv2D: 2,048 25,690,112 37 | MobilenetV1/Conv2d_2_depthwise/depthwise: 576 1,806,336 38 | MobilenetV1/Conv2d_2_pointwise/Conv2D: 8,192 25,690,112 39 | MobilenetV1/Conv2d_3_depthwise/depthwise: 1,152 3,612,672 40 | MobilenetV1/Conv2d_3_pointwise/Conv2D: 16,384 51,380,224 41 | MobilenetV1/Conv2d_4_depthwise/depthwise: 1,152 903,168 42 | MobilenetV1/Conv2d_4_pointwise/Conv2D: 32,768 25,690,112 43 | MobilenetV1/Conv2d_5_depthwise/depthwise: 2,304 1,806,336 44 | MobilenetV1/Conv2d_5_pointwise/Conv2D: 65,536 51,380,224 45 | MobilenetV1/Conv2d_6_depthwise/depthwise: 2,304 451,584 46 | MobilenetV1/Conv2d_6_pointwise/Conv2D: 131,072 25,690,112 47 | MobilenetV1/Conv2d_7_depthwise/depthwise: 4,608 903,168 48 | MobilenetV1/Conv2d_7_pointwise/Conv2D: 262,144 51,380,224 49 | MobilenetV1/Conv2d_8_depthwise/depthwise: 4,608 903,168 50 | MobilenetV1/Conv2d_8_pointwise/Conv2D: 262,144 51,380,224 51 | MobilenetV1/Conv2d_9_depthwise/depthwise: 4,608 903,168 52 | MobilenetV1/Conv2d_9_pointwise/Conv2D: 262,144 51,380,224 53 | MobilenetV1/Conv2d_10_depthwise/depthwise: 4,608 903,168 54 | MobilenetV1/Conv2d_10_pointwise/Conv2D: 262,144 51,380,224 55 | MobilenetV1/Conv2d_11_depthwise/depthwise: 4,608 903,168 56 | MobilenetV1/Conv2d_11_pointwise/Conv2D: 262,144 51,380,224 57 | MobilenetV1/Conv2d_12_depthwise/depthwise: 4,608 225,792 58 | MobilenetV1/Conv2d_12_pointwise/Conv2D: 524,288 25,690,112 59 | MobilenetV1/Conv2d_13_depthwise/depthwise: 9,216 451,584 60 | MobilenetV1/Conv2d_13_pointwise/Conv2D: 1,048,576 51,380,224 61 | -------------------------------------------------------------------------------- 62 | Total: 3,185,088 567,716,352 63 | 64 | 65 | 75% Mobilenet V1 (base) with input size 128x128: 66 | 67 | See mobilenet_v1_075() 68 | 69 | Layer params macs 70 | -------------------------------------------------------------------------------- 71 | MobilenetV1/Conv2d_0/Conv2D: 648 2,654,208 72 | MobilenetV1/Conv2d_1_depthwise/depthwise: 216 884,736 73 | MobilenetV1/Conv2d_1_pointwise/Conv2D: 1,152 4,718,592 74 | MobilenetV1/Conv2d_2_depthwise/depthwise: 432 442,368 75 | MobilenetV1/Conv2d_2_pointwise/Conv2D: 4,608 4,718,592 76 | MobilenetV1/Conv2d_3_depthwise/depthwise: 864 884,736 77 | MobilenetV1/Conv2d_3_pointwise/Conv2D: 9,216 9,437,184 78 | MobilenetV1/Conv2d_4_depthwise/depthwise: 864 221,184 79 | MobilenetV1/Conv2d_4_pointwise/Conv2D: 18,432 4,718,592 80 | MobilenetV1/Conv2d_5_depthwise/depthwise: 1,728 442,368 81 | MobilenetV1/Conv2d_5_pointwise/Conv2D: 36,864 9,437,184 82 | MobilenetV1/Conv2d_6_depthwise/depthwise: 1,728 110,592 83 | 
MobilenetV1/Conv2d_6_pointwise/Conv2D: 73,728 4,718,592 84 | MobilenetV1/Conv2d_7_depthwise/depthwise: 3,456 221,184 85 | MobilenetV1/Conv2d_7_pointwise/Conv2D: 147,456 9,437,184 86 | MobilenetV1/Conv2d_8_depthwise/depthwise: 3,456 221,184 87 | MobilenetV1/Conv2d_8_pointwise/Conv2D: 147,456 9,437,184 88 | MobilenetV1/Conv2d_9_depthwise/depthwise: 3,456 221,184 89 | MobilenetV1/Conv2d_9_pointwise/Conv2D: 147,456 9,437,184 90 | MobilenetV1/Conv2d_10_depthwise/depthwise: 3,456 221,184 91 | MobilenetV1/Conv2d_10_pointwise/Conv2D: 147,456 9,437,184 92 | MobilenetV1/Conv2d_11_depthwise/depthwise: 3,456 221,184 93 | MobilenetV1/Conv2d_11_pointwise/Conv2D: 147,456 9,437,184 94 | MobilenetV1/Conv2d_12_depthwise/depthwise: 3,456 55,296 95 | MobilenetV1/Conv2d_12_pointwise/Conv2D: 294,912 4,718,592 96 | MobilenetV1/Conv2d_13_depthwise/depthwise: 6,912 110,592 97 | MobilenetV1/Conv2d_13_pointwise/Conv2D: 589,824 9,437,184 98 | -------------------------------------------------------------------------------- 99 | Total: 1,800,144 106,002,432 100 | 101 | """ 102 | 103 | # Tensorflow mandates these. 104 | from __future__ import absolute_import 105 | from __future__ import division 106 | from __future__ import print_function 107 | 108 | from collections import namedtuple 109 | import functools 110 | 111 | import tensorflow as tf 112 | 113 | slim = tf.contrib.slim 114 | 115 | # Conv and DepthSepConv namedtuple define layers of the MobileNet architecture 116 | # Conv defines 3x3 convolution layers 117 | # DepthSepConv defines 3x3 depthwise convolution followed by 1x1 convolution. 118 | # stride is the stride of the convolution 119 | # depth is the number of channels or filters in a layer 120 | Conv = namedtuple('Conv', ['kernel', 'stride', 'depth']) 121 | DepthSepConv = namedtuple('DepthSepConv', ['kernel', 'stride', 'depth']) 122 | 123 | # MOBILENETV1_CONV_DEFS specifies the MobileNet body 124 | MOBILENETV1_CONV_DEFS = [ 125 | Conv(kernel=[3, 3], stride=2, depth=32), 126 | DepthSepConv(kernel=[3, 3], stride=1, depth=64), 127 | DepthSepConv(kernel=[3, 3], stride=2, depth=128), 128 | DepthSepConv(kernel=[3, 3], stride=1, depth=128), 129 | DepthSepConv(kernel=[3, 3], stride=2, depth=256), 130 | DepthSepConv(kernel=[3, 3], stride=1, depth=256), 131 | DepthSepConv(kernel=[3, 3], stride=2, depth=512), 132 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 133 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 134 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 135 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 136 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 137 | DepthSepConv(kernel=[3, 3], stride=2, depth=1024), 138 | DepthSepConv(kernel=[3, 3], stride=1, depth=1024) 139 | ] 140 | 141 | 142 | def _fixed_padding(inputs, kernel_size, rate=1): 143 | """Pads the input along the spatial dimensions independently of input size. 144 | 145 | Pads the input such that if it was used in a convolution with 'VALID' padding, 146 | the output would have the same dimensions as if the unpadded input was used 147 | in a convolution with 'SAME' padding. 148 | 149 | Args: 150 | inputs: A tensor of size [batch, height_in, width_in, channels]. 151 | kernel_size: The kernel to be used in the conv2d or max_pool2d operation. 152 | rate: An integer, rate for atrous convolution. 153 | 154 | Returns: 155 | output: A tensor of size [batch, height_out, width_out, channels] with the 156 | input, either intact (if kernel_size == 1) or padded (if kernel_size > 1). 
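  For example, with kernel_size=[3, 3] and rate=1 the effective kernel is
  3x3, so pad_total is 2 in each spatial dimension and one row/column of
  zeros is added on every side of the input.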
157 | """ 158 | kernel_size_effective = [kernel_size[0] + (kernel_size[0] - 1) * (rate - 1), 159 | kernel_size[0] + (kernel_size[0] - 1) * (rate - 1)] 160 | pad_total = [kernel_size_effective[0] - 1, kernel_size_effective[1] - 1] 161 | pad_beg = [pad_total[0] // 2, pad_total[1] // 2] 162 | pad_end = [pad_total[0] - pad_beg[0], pad_total[1] - pad_beg[1]] 163 | padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg[0], pad_end[0]], 164 | [pad_beg[1], pad_end[1]], [0, 0]]) 165 | return padded_inputs 166 | 167 | 168 | def mobilenet_v1_base(inputs, 169 | final_endpoint='Conv2d_13_pointwise', 170 | min_depth=8, 171 | depth_multiplier=1.0, 172 | conv_defs=None, 173 | output_stride=None, 174 | use_explicit_padding=False, 175 | scope=None): 176 | """Mobilenet v1. 177 | 178 | Constructs a Mobilenet v1 network from inputs to the given final endpoint. 179 | 180 | Args: 181 | inputs: a tensor of shape [batch_size, height, width, channels]. 182 | final_endpoint: specifies the endpoint to construct the network up to. It 183 | can be one of ['Conv2d_0', 'Conv2d_1_pointwise', 'Conv2d_2_pointwise', 184 | 'Conv2d_3_pointwise', 'Conv2d_4_pointwise', 'Conv2d_5'_pointwise, 185 | 'Conv2d_6_pointwise', 'Conv2d_7_pointwise', 'Conv2d_8_pointwise', 186 | 'Conv2d_9_pointwise', 'Conv2d_10_pointwise', 'Conv2d_11_pointwise', 187 | 'Conv2d_12_pointwise', 'Conv2d_13_pointwise']. 188 | min_depth: Minimum depth value (number of channels) for all convolution ops. 189 | Enforced when depth_multiplier < 1, and not an active constraint when 190 | depth_multiplier >= 1. 191 | depth_multiplier: Float multiplier for the depth (number of channels) 192 | for all convolution ops. The value must be greater than zero. Typical 193 | usage will be to set this value in (0, 1) to reduce the number of 194 | parameters or computation cost of the model. 195 | conv_defs: A list of ConvDef namedtuples specifying the net architecture. 196 | output_stride: An integer that specifies the requested ratio of input to 197 | output spatial resolution. If not None, then we invoke atrous convolution 198 | if necessary to prevent the network from reducing the spatial resolution 199 | of the activation maps. Allowed values are 8 (accurate fully convolutional 200 | mode), 16 (fast fully convolutional mode), 32 (classification mode). 201 | use_explicit_padding: Use 'VALID' padding for convolutions, but prepad 202 | inputs so that the output dimensions are the same as if 'SAME' padding 203 | were used. 204 | scope: Optional variable_scope. 205 | 206 | Returns: 207 | tensor_out: output tensor corresponding to the final_endpoint. 208 | end_points: a set of activations for external use, for example summaries or 209 | losses. 210 | 211 | Raises: 212 | ValueError: if final_endpoint is not set to one of the predefined values, 213 | or depth_multiplier <= 0, or the target output_stride is not 214 | allowed. 215 | """ 216 | depth = lambda d: max(int(d * depth_multiplier), min_depth) 217 | end_points = {} 218 | 219 | # Used to find thinned depths for each layer. 
220 | if depth_multiplier <= 0: 221 | raise ValueError('depth_multiplier is not greater than zero.') 222 | 223 | if conv_defs is None: 224 | conv_defs = MOBILENETV1_CONV_DEFS 225 | 226 | if output_stride is not None and output_stride not in [8, 16, 32]: 227 | raise ValueError('Only allowed output_stride values are 8, 16, 32.') 228 | 229 | padding = 'SAME' 230 | if use_explicit_padding: 231 | padding = 'VALID' 232 | with tf.variable_scope(scope, 'MobilenetV1', [inputs]): 233 | with slim.arg_scope([slim.conv2d, slim.separable_conv2d], padding=padding): 234 | # The current_stride variable keeps track of the output stride of the 235 | # activations, i.e., the running product of convolution strides up to the 236 | # current network layer. This allows us to invoke atrous convolution 237 | # whenever applying the next convolution would result in the activations 238 | # having output stride larger than the target output_stride. 239 | current_stride = 1 240 | 241 | # The atrous convolution rate parameter. 242 | rate = 1 243 | 244 | net = inputs 245 | for i, conv_def in enumerate(conv_defs): 246 | end_point_base = 'Conv2d_%d' % i 247 | 248 | if output_stride is not None and current_stride == output_stride: 249 | # If we have reached the target output_stride, then we need to employ 250 | # atrous convolution with stride=1 and multiply the atrous rate by the 251 | # current unit's stride for use in subsequent layers. 252 | layer_stride = 1 253 | layer_rate = rate 254 | rate *= conv_def.stride 255 | else: 256 | layer_stride = conv_def.stride 257 | layer_rate = 1 258 | current_stride *= conv_def.stride 259 | 260 | if isinstance(conv_def, Conv): 261 | end_point = end_point_base 262 | if use_explicit_padding: 263 | net = _fixed_padding(net, conv_def.kernel) 264 | net = slim.conv2d(net, depth(conv_def.depth), conv_def.kernel, 265 | stride=conv_def.stride, 266 | normalizer_fn=slim.batch_norm, 267 | scope=end_point) 268 | end_points[end_point] = net 269 | if end_point == final_endpoint: 270 | return net, end_points 271 | 272 | elif isinstance(conv_def, DepthSepConv): 273 | end_point = end_point_base + '_depthwise' 274 | 275 | # By passing filters=None 276 | # separable_conv2d produces only a depthwise convolution layer 277 | if use_explicit_padding: 278 | net = _fixed_padding(net, conv_def.kernel, layer_rate) 279 | net = slim.separable_conv2d(net, None, conv_def.kernel, 280 | depth_multiplier=1, 281 | stride=layer_stride, 282 | rate=layer_rate, 283 | normalizer_fn=slim.batch_norm, 284 | scope=end_point) 285 | 286 | end_points[end_point] = net 287 | if end_point == final_endpoint: 288 | return net, end_points 289 | 290 | end_point = end_point_base + '_pointwise' 291 | 292 | net = slim.conv2d(net, depth(conv_def.depth), [1, 1], 293 | stride=1, 294 | normalizer_fn=slim.batch_norm, 295 | scope=end_point) 296 | 297 | end_points[end_point] = net 298 | if end_point == final_endpoint: 299 | return net, end_points 300 | else: 301 | raise ValueError('Unknown convolution type %s for layer %d' 302 | % (conv_def.ltype, i)) 303 | raise ValueError('Unknown final endpoint %s' % final_endpoint) 304 | 305 | 306 | def mobilenet_v1(inputs, 307 | num_classes=1000, 308 | dropout_keep_prob=0.999, 309 | is_training=True, 310 | min_depth=8, 311 | depth_multiplier=1.0, 312 | conv_defs=None, 313 | prediction_fn=tf.contrib.layers.softmax, 314 | spatial_squeeze=True, 315 | reuse=None, 316 | scope='MobilenetV1', 317 | global_pool=False): 318 | """Mobilenet v1 model for classification. 
319 | 320 | Args: 321 | inputs: a tensor of shape [batch_size, height, width, channels]. 322 | num_classes: number of predicted classes. If 0 or None, the logits layer 323 | is omitted and the input features to the logits layer (before dropout) 324 | are returned instead. 325 | dropout_keep_prob: the percentage of activation values that are retained. 326 | is_training: whether is training or not. 327 | min_depth: Minimum depth value (number of channels) for all convolution ops. 328 | Enforced when depth_multiplier < 1, and not an active constraint when 329 | depth_multiplier >= 1. 330 | depth_multiplier: Float multiplier for the depth (number of channels) 331 | for all convolution ops. The value must be greater than zero. Typical 332 | usage will be to set this value in (0, 1) to reduce the number of 333 | parameters or computation cost of the model. 334 | conv_defs: A list of ConvDef namedtuples specifying the net architecture. 335 | prediction_fn: a function to get predictions out of logits. 336 | spatial_squeeze: if True, logits is of shape is [B, C], if false logits is 337 | of shape [B, 1, 1, C], where B is batch_size and C is number of classes. 338 | reuse: whether or not the network and its variables should be reused. To be 339 | able to reuse 'scope' must be given. 340 | scope: Optional variable_scope. 341 | global_pool: Optional boolean flag to control the avgpooling before the 342 | logits layer. If false or unset, pooling is done with a fixed window 343 | that reduces default-sized inputs to 1x1, while larger inputs lead to 344 | larger outputs. If true, any input size is pooled down to 1x1. 345 | 346 | Returns: 347 | net: a 2D Tensor with the logits (pre-softmax activations) if num_classes 348 | is a non-zero integer, or the non-dropped-out input to the logits layer 349 | if num_classes is 0 or None. 350 | end_points: a dictionary from components of the network to the corresponding 351 | activation. 352 | 353 | Raises: 354 | ValueError: Input rank is invalid. 355 | """ 356 | input_shape = inputs.get_shape().as_list() 357 | if len(input_shape) != 4: 358 | raise ValueError('Invalid input tensor rank, expected 4, was: %d' % 359 | len(input_shape)) 360 | 361 | with tf.variable_scope(scope, 'MobilenetV1', [inputs], reuse=reuse) as scope: 362 | with slim.arg_scope([slim.batch_norm, slim.dropout], 363 | is_training=is_training): 364 | net, end_points = mobilenet_v1_base(inputs, scope=scope, 365 | min_depth=min_depth, 366 | depth_multiplier=depth_multiplier, 367 | conv_defs=conv_defs) 368 | with tf.variable_scope('Logits'): 369 | if global_pool: 370 | # Global average pooling. 371 | net = tf.reduce_mean(net, [1, 2], keep_dims=True, name='global_pool') 372 | end_points['global_pool'] = net 373 | else: 374 | # Pooling with a fixed kernel size. 
375 | kernel_size = _reduced_kernel_size_for_small_input(net, [7, 7]) 376 | net = slim.avg_pool2d(net, kernel_size, padding='VALID', 377 | scope='AvgPool_1a') 378 | end_points['AvgPool_1a'] = net 379 | if not num_classes: 380 | return net, end_points 381 | # 1 x 1 x 1024 382 | net = slim.dropout(net, keep_prob=dropout_keep_prob, scope='Dropout_1b') 383 | logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None, 384 | normalizer_fn=None, scope='Conv2d_1c_1x1') 385 | if spatial_squeeze: 386 | logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze') 387 | end_points['Logits'] = logits 388 | if prediction_fn: 389 | end_points['Predictions'] = prediction_fn(logits, scope='Predictions') 390 | return logits, end_points 391 | 392 | mobilenet_v1.default_image_size = 224 393 | 394 | 395 | def wrapped_partial(func, *args, **kwargs): 396 | partial_func = functools.partial(func, *args, **kwargs) 397 | functools.update_wrapper(partial_func, func) 398 | return partial_func 399 | 400 | 401 | mobilenet_v1_075 = wrapped_partial(mobilenet_v1, depth_multiplier=0.75) 402 | mobilenet_v1_050 = wrapped_partial(mobilenet_v1, depth_multiplier=0.50) 403 | mobilenet_v1_025 = wrapped_partial(mobilenet_v1, depth_multiplier=0.25) 404 | 405 | 406 | def _reduced_kernel_size_for_small_input(input_tensor, kernel_size): 407 | """Define kernel size which is automatically reduced for small input. 408 | 409 | If the shape of the input images is unknown at graph construction time this 410 | function assumes that the input images are large enough. 411 | 412 | Args: 413 | input_tensor: input tensor of size [batch_size, height, width, channels]. 414 | kernel_size: desired kernel size of length 2: [kernel_height, kernel_width] 415 | 416 | Returns: 417 | a tensor with the kernel size. 418 | """ 419 | shape = input_tensor.get_shape().as_list() 420 | if shape[1] is None or shape[2] is None: 421 | kernel_size_out = kernel_size 422 | else: 423 | kernel_size_out = [min(shape[1], kernel_size[0]), 424 | min(shape[2], kernel_size[1])] 425 | return kernel_size_out 426 | 427 | 428 | def mobilenet_v1_arg_scope( 429 | is_training=True, 430 | weight_decay=0.00004, 431 | stddev=0.09, 432 | regularize_depthwise=False, 433 | batch_norm_decay=0.9997, 434 | batch_norm_epsilon=0.001, 435 | batch_norm_updates_collections=tf.GraphKeys.UPDATE_OPS): 436 | """Defines the default MobilenetV1 arg scope. 437 | 438 | Args: 439 | is_training: Whether or not we're training the model. If this is set to 440 | None, the parameter is not added to the batch_norm arg_scope. 441 | weight_decay: The weight decay to use for regularizing the model. 442 | stddev: The standard deviation of the trunctated normal weight initializer. 443 | regularize_depthwise: Whether or not apply regularization on depthwise. 444 | batch_norm_decay: Decay for batch norm moving average. 445 | batch_norm_epsilon: Small float added to variance to avoid dividing by zero 446 | in batch norm. 447 | batch_norm_updates_collections: Collection for the update ops for 448 | batch norm. 449 | 450 | Returns: 451 | An `arg_scope` to use for the mobilenet v1 model. 452 | """ 453 | batch_norm_params = { 454 | 'center': True, 455 | 'scale': True, 456 | 'decay': batch_norm_decay, 457 | 'epsilon': batch_norm_epsilon, 458 | 'updates_collections': batch_norm_updates_collections, 459 | } 460 | if is_training is not None: 461 | batch_norm_params['is_training'] = is_training 462 | 463 | # Set weight_decay for weights in Conv and DepthSepConv layers. 
464 | weights_init = tf.truncated_normal_initializer(stddev=stddev) 465 | regularizer = tf.contrib.layers.l2_regularizer(weight_decay) 466 | if regularize_depthwise: 467 | depthwise_regularizer = regularizer 468 | else: 469 | depthwise_regularizer = None 470 | with slim.arg_scope([slim.conv2d, slim.separable_conv2d], 471 | weights_initializer=weights_init, 472 | activation_fn=tf.nn.relu6, normalizer_fn=slim.batch_norm): 473 | with slim.arg_scope([slim.batch_norm], **batch_norm_params): 474 | with slim.arg_scope([slim.conv2d], weights_regularizer=regularizer): 475 | with slim.arg_scope([slim.separable_conv2d], 476 | weights_regularizer=depthwise_regularizer) as sc: 477 | return sc 478 | -------------------------------------------------------------------------------- /S7/trainer.py: -------------------------------------------------------------------------------- 1 | """ Coding Session 7: fine tuning a model in TensorFlow 2 | You can download the pre-trained checkpoint from: 3 | https://github.com/tensorflow/models/tree/master/research/slim 4 | """ 5 | 6 | import tensorflow as tf 7 | import numpy as np 8 | import os, glob 9 | import argparse 10 | from nets import mobilenet_v1 11 | 12 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3' 13 | 14 | class TFModelTrainer: 15 | 16 | def __init__(self, checkpoint_path, data_path): 17 | self.checkpoint_path = checkpoint_path 18 | 19 | # set training parameters 20 | self.learning_rate = 0.01 21 | self.num_iter = 100000 22 | self.save_iter = 5000 23 | self.val_iter = 5000 24 | self.log_iter = 100 25 | self.batch_size = 32 26 | 27 | # set up data layer 28 | self.training_filenames = glob.glob(os.path.join(data_path, 'train_*.tfrecord')) 29 | self.validation_filenames = glob.glob(os.path.join(data_path, 'test_*.tfrecord')) 30 | self.iterator, self.filenames = self._data_layer() 31 | self.num_val_samples = 10000 32 | self.num_classes = 2 33 | self.image_size = 224 34 | 35 | # fine tune only the last layer 36 | self.fine_tune = True ##################################################################### 37 | 38 | def preprocess_image(self, image_string): 39 | image = tf.image.decode_jpeg(image_string, channels=3) 40 | 41 | # flip for data augmentation 42 | image = tf.image.random_flip_left_right(image) 43 | 44 | # normalize image to [-1, +1] 45 | image = tf.cast(image, tf.float32) 46 | image = image / 127.5 47 | image = image - 1 48 | return image 49 | 50 | def _parse_tfrecord(self, example_proto): 51 | keys_to_features = {'image': tf.FixedLenFeature([], tf.string), 52 | 'label': tf.FixedLenFeature([], tf.int64)} 53 | parsed_features = tf.parse_single_example(example_proto, keys_to_features) 54 | image = parsed_features['image'] 55 | label = parsed_features['label'] 56 | image = self.preprocess_image(image) 57 | return image, label 58 | 59 | def _data_layer(self, num_threads=8, prefetch_buffer=100): 60 | with tf.variable_scope('data'): 61 | filenames = tf.placeholder(tf.string, shape=[None]) 62 | dataset = tf.data.TFRecordDataset(filenames) 63 | dataset = dataset.map(self._parse_tfrecord, num_parallel_calls=num_threads) 64 | dataset = dataset.repeat() 65 | dataset = dataset.batch(self.batch_size) 66 | dataset = dataset.prefetch(prefetch_buffer) 67 | iterator = dataset.make_initializable_iterator() 68 | return iterator, filenames 69 | 70 | def _loss_functions(self, logits, labels): 71 | with tf.variable_scope('loss'): 72 | target_prob = tf.one_hot(labels, self.num_classes) 73 | tf.losses.softmax_cross_entropy(target_prob, logits) 74 | total_loss = 
tf.losses.get_total_loss() #include regularization loss
75 |         return total_loss
76 |
77 |     def _optimizer(self, total_loss, global_step):
78 |         update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
79 |         with tf.control_dependencies(update_ops):
80 |             optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate, epsilon=0.1)
81 |             if self.fine_tune: #####################################################################
82 |                 train_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, "MobilenetV1/Logits")
83 |                 optimizer = optimizer.minimize(total_loss, var_list=train_vars, global_step=global_step)
84 |             else:
85 |                 optimizer = optimizer.minimize(total_loss, global_step=global_step)
86 |         return optimizer
87 |
88 |     def _performance_metric(self, logits, labels):
89 |         with tf.variable_scope("performance_metric"):
90 |             preds = tf.argmax(logits, axis=1)
91 |             labels = tf.cast(labels, tf.int64)
92 |             corrects = tf.equal(preds, labels)
93 |             accuracy = tf.reduce_mean(tf.cast(corrects, tf.float32))
94 |         return accuracy
95 |
96 |     def _variables_to_restore(self, save_file, graph): #############################################
97 |         # returns a list of variables that can be restored from a checkpoint
98 |         reader = tf.train.NewCheckpointReader(save_file)
99 |         saved_shapes = reader.get_variable_to_shape_map()
100 |         var_names = sorted([(var.name, var.name.split(':')[0]) for var in tf.global_variables()
101 |                             if var.name.split(':')[0] in saved_shapes])
102 |         restore_vars = []
103 |         for var_name, saved_var_name in var_names:
104 |             curr_var = graph.get_tensor_by_name(var_name)
105 |             var_shape = curr_var.get_shape().as_list()
106 |             if var_shape == saved_shapes[saved_var_name]:
107 |                 restore_vars.append(curr_var)
108 |         return restore_vars
109 |
110 |     def train(self):
111 |         # iteration number
112 |         global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name='iter_number')
113 |
114 |         # training graph
115 |         images, labels = self.iterator.get_next()
116 |         images = tf.image.resize_bilinear(images, (self.image_size, self.image_size))
117 |         training = tf.placeholder(tf.bool, name='is_training')
118 |         logits, _ = mobilenet_v1.mobilenet_v1(images,
119 |                                               num_classes=self.num_classes,
120 |                                               is_training=training,
121 |                                               scope='MobilenetV1',
122 |                                               global_pool=True)
123 |         loss = self._loss_functions(logits, labels)
124 |         optimizer = self._optimizer(loss, global_step)
125 |         accuracy = self._performance_metric(logits, labels)
126 |
127 |         # summary placeholders
128 |         streaming_loss_p = tf.placeholder(tf.float32)
129 |         accuracy_p = tf.placeholder(tf.float32)
130 |         summ_op_train = tf.summary.scalar('streaming_loss', streaming_loss_p)
131 |         summ_op_test = tf.summary.scalar('accuracy', accuracy_p)
132 |
133 |         # don't allocate entire GPU memory #########################################################
134 |         config = tf.ConfigProto()
135 |         config.gpu_options.allow_growth = True
136 |
137 |         with tf.Session(config=config) as sess:
138 |             sess.run(tf.global_variables_initializer())
139 |             sess.run(self.iterator.initializer, feed_dict={self.filenames: self.training_filenames})
140 |
141 |             writer = tf.summary.FileWriter(self.checkpoint_path, sess.graph)
142 |
143 |             saver = tf.train.Saver(max_to_keep=None) # keep all checkpoints
144 |             ckpt = tf.train.get_checkpoint_state(self.checkpoint_path)
145 |
146 |             # resume training if a checkpoint exists
147 |             if ckpt and ckpt.model_checkpoint_path:
148 |                 restore_vars = self._variables_to_restore(ckpt.model_checkpoint_path, sess.graph)
149 |                 restore_saver = tf.train.Saver(var_list=restore_vars)  # restore with a separate saver so `saver` still saves every variable
150 |                 restore_saver.restore(sess, ckpt.model_checkpoint_path)
151 |                 print('Loaded parameters from {}'.format(ckpt.model_checkpoint_path))
152 |
153 |             initial_step = global_step.eval()
154 |
155 |             # train the model
156 |             streaming_loss = 0
157 |             for i in range(initial_step, self.num_iter + 1):
158 |                 _, loss_batch = sess.run([optimizer, loss], feed_dict={training: True})
159 |
160 |                 if not np.isfinite(loss_batch):
161 |                     print('loss diverged, stopping')
162 |                     exit()
163 |
164 |                 # log summary
165 |                 streaming_loss += loss_batch
166 |                 if i % self.log_iter == self.log_iter - 1:
167 |                     streaming_loss /= self.log_iter
168 |                     print(i + 1, streaming_loss)
169 |                     summary_train = sess.run(summ_op_train, feed_dict={streaming_loss_p: streaming_loss})
170 |                     writer.add_summary(summary_train, global_step=i)
171 |                     streaming_loss = 0
172 |
173 |                 # save model
174 |                 if i % self.save_iter == self.save_iter - 1:
175 |                     saver.save(sess, os.path.join(self.checkpoint_path, 'checkpoint'), global_step=global_step)
176 |                     print("Model saved!")
177 |
178 |                 # run validation
179 |                 if i % self.val_iter == self.val_iter - 1:
180 |                     print("Running validation.")
181 |                     sess.run(self.iterator.initializer, feed_dict={self.filenames: self.validation_filenames})
182 |
183 |                     validation_accuracy = 0
184 |                     for j in range(self.num_val_samples // self.batch_size):
185 |                         acc_batch = sess.run(accuracy, feed_dict={training: False})
186 |                         validation_accuracy += acc_batch
187 |                     validation_accuracy /= (self.num_val_samples // self.batch_size)  # average over the number of batches, not the last loop index
188 |
189 |                     print("Accuracy: {}".format(validation_accuracy))
190 |
191 |                     summary_test = sess.run(summ_op_test, feed_dict={accuracy_p: validation_accuracy})
192 |                     writer.add_summary(summary_test, global_step=i)
193 |
194 |                     sess.run(self.iterator.initializer, feed_dict={self.filenames: self.training_filenames})
195 |
196 |             writer.close()
197 |
198 | def main():
199 |     parser = argparse.ArgumentParser()
200 |     parser.add_argument('--checkpoint_path', type=str, default='./checkpoints/',
201 |                         help="Path to the dir where the checkpoints are saved")
202 |     parser.add_argument('--data_path', type=str, default='./tfrecords/', help="Path to the TFRecords")
203 |     args = parser.parse_args()
204 |     trainer = TFModelTrainer(args.checkpoint_path, args.data_path)
205 |     trainer.train()
206 |
207 | if __name__ == '__main__':
208 |     main()
209 |
--------------------------------------------------------------------------------
/S8/datagenerator.py:
--------------------------------------------------------------------------------
1 | """ A Python iterator that loads and processes images.
2 | This iterator will be called through the TensorFlow Dataset API to feed pairs of
3 | clean and noisy images into the model.
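Example: next(DataGenerator(image_size=(256, 256))) returns an
(image_in, image_out) pair, where image_out is image_in with Gaussian noise
of a random strength added (see the __main__ block at the bottom).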
4 | """ 5 | 6 | import cv2 7 | import numpy as np 8 | import random 9 | import os, glob 10 | 11 | class DataGenerator: 12 | 13 | def __init__(self, image_size, base_dir = '../S6/dataset'): 14 | # all images will be center cropped and resized to image_size 15 | self.image_size = image_size 16 | 17 | # number of validation samples 18 | self.num_val = 320 19 | 20 | filenames = glob.glob(os.path.join(base_dir, '**', '*.jpg'), recursive=True) 21 | self.filenames_train = filenames[self.num_val:] 22 | self.filenames_val = filenames[:self.num_val] 23 | 24 | # dataset mode 25 | self.is_training = True 26 | self.val_idx = 0 27 | 28 | def __iter__(self): 29 | return self 30 | 31 | def __next__(self): 32 | try: 33 | return self.fetch_sample() 34 | except IndexError: 35 | raise StopIteration() 36 | 37 | def get_tensor_shape(self): 38 | return (self.image_size[1], self.image_size[0], 3) 39 | 40 | def set_mode(self, is_training): 41 | self.is_training = is_training 42 | self.val_idx = 0 43 | 44 | def fetch_sample(self): 45 | if self.is_training: 46 | # pick a random image 47 | impath = random.choice(self.filenames_train) 48 | else: 49 | # pick the next validation sample 50 | impath = self.filenames_val[self.val_idx] 51 | self.val_idx += 1 52 | image_in = cv2.imread(impath) 53 | 54 | # resize to image_size 55 | image_in = self.center_crop_and_resize(image_in) 56 | 57 | # inject noise 58 | image_out = self.add_random_noise(image_in) 59 | 60 | return image_in, image_out 61 | 62 | def center_crop_and_resize(self, image): 63 | R, C, _ = image.shape 64 | if R > C: 65 | pad = (R - C) // 2 66 | image = image[pad:-pad, :] 67 | elif C > R: 68 | pad = (C - R) // 2 69 | image = image[:, pad:-pad] 70 | image = cv2.resize(image, self.image_size) 71 | return image 72 | 73 | def add_random_noise(self, image): 74 | noise_var = random.randrange(3, 15) 75 | h, w, c = image.shape 76 | image_out = image.copy() 77 | image_out = image_out.astype(np.float32) 78 | noise = np.random.randn(h, w, c) * noise_var 79 | image_out += noise 80 | image_out = np.minimum(np.maximum(image_out, 0), 255) 81 | image_out = image_out.astype(np.uint8) 82 | return image_out 83 | 84 | if __name__ == '__main__': 85 | dg = DataGenerator(image_size=(256, 256)) 86 | image_in, image_out = next(dg) 87 | cv2.imwrite('image_in.jpg', image_in) 88 | cv2.imwrite('image_out.jpg', image_out) 89 | -------------------------------------------------------------------------------- /S8/nets/model.py: -------------------------------------------------------------------------------- 1 | """ A simple encoder-decoder network with skip connections. 
2 | """ 3 | 4 | import tensorflow as tf 5 | 6 | def model(input_layer, training, weight_decay=0.00001): 7 | 8 | reg = tf.contrib.layers.l2_regularizer(weight_decay) 9 | 10 | def conv_block(inputs, num_filters, name): 11 | with tf.variable_scope(name): 12 | net = tf.layers.separable_conv2d( 13 | inputs=inputs, 14 | filters=num_filters, 15 | kernel_size=(3,3), 16 | padding='SAME', 17 | use_bias=False, 18 | activation=None, 19 | pointwise_regularizer=reg, 20 | depthwise_regularizer=reg) 21 | net = tf.layers.batch_normalization( 22 | inputs=net, 23 | training=training) 24 | net = tf.nn.relu(net) 25 | return net 26 | 27 | def pointwise_block(inputs, num_filters, name): 28 | with tf.variable_scope(name): 29 | net = tf.layers.conv2d( 30 | inputs=inputs, 31 | filters=num_filters, 32 | kernel_size=1, 33 | use_bias=False, 34 | activation=None, 35 | kernel_regularizer=reg) 36 | net = tf.layers.batch_normalization( 37 | inputs=net, 38 | training=training) 39 | net = tf.nn.relu(net) 40 | return net 41 | 42 | def pooling(inputs, name): 43 | with tf.variable_scope(name): 44 | net = tf.layers.max_pooling2d(inputs, 45 | pool_size=(2,2), 46 | strides=(2,2)) 47 | return net 48 | 49 | def downsampling(inputs, name): 50 | with tf.variable_scope(name): 51 | net = tf.layers.average_pooling2d(inputs, 52 | pool_size=(2,2), 53 | strides=(2,2)) 54 | return net 55 | 56 | def upsampling(inputs, name): 57 | with tf.variable_scope(name): 58 | dims = tf.shape(inputs) 59 | new_size = [dims[1]*2, dims[2]*2] 60 | net = tf.image.resize_bilinear(inputs, new_size) 61 | return net 62 | 63 | def output_block(inputs, name): 64 | with tf.variable_scope(name): 65 | net = tf.layers.conv2d( 66 | inputs=inputs, 67 | filters=3, 68 | kernel_size=(1,1), 69 | activation=None) 70 | return net 71 | 72 | def subnet_module(inputs, name, num_filters, num_layers=3): 73 | with tf.variable_scope(name): 74 | for i in range(num_layers-1): 75 | net = conv_block(inputs, num_filters=num_filters, name='{}_conv{}'.format(name, i)) 76 | inputs = tf.concat([net, inputs], axis=3) 77 | net = conv_block(inputs, num_filters=num_filters, name='{}_conv3'.format(name)) 78 | return net 79 | 80 | num_filters = 16 81 | net = input_layer 82 | skip_connections = [] 83 | # encoder 84 | with tf.variable_scope('encoder'): 85 | for i in range(4): 86 | net = subnet_module(net, num_filters=num_filters, name='conv_e{}'.format(i)) 87 | skip_connections.append(net) 88 | net = pooling(net, name='pool{}'.format(i)) 89 | num_filters *= 2 90 | 91 | # bottleneck 92 | net = subnet_module(net, num_filters=num_filters, name='conv_bottleneck'.format(i)) 93 | 94 | # decoder 95 | with tf.variable_scope('decoder'): 96 | for i in range(4): 97 | num_filters /= 2 98 | net = upsampling(net, name='upsample{}'.format(i)) 99 | net = tf.concat([net, skip_connections.pop()], axis=3) 100 | net = subnet_module(net, num_filters=num_filters, name='subnet_d{}'.format(i)) 101 | 102 | # exit flow 103 | with tf.variable_scope('exit_flow'): 104 | logits = output_block(net, name='output_block') 105 | 106 | return logits 107 | -------------------------------------------------------------------------------- /S8/trainer.py: -------------------------------------------------------------------------------- 1 | """ Coding Session 8: using a python iterator as a data generator and training a denoising autoencoder 2 | """ 3 | 4 | import tensorflow as tf 5 | import numpy as np 6 | import os, glob 7 | import argparse 8 | from nets.model import model 9 | from datagenerator import DataGenerator 
######################### 10 | 11 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3' 12 | 13 | class TFModelTrainer: 14 | 15 | def __init__(self, checkpoint_path): 16 | self.checkpoint_path = checkpoint_path 17 | 18 | # set training parameters 19 | self.learning_rate = 0.01 20 | self.num_iter = 100000 21 | self.save_iter = 1000 22 | self.val_iter = 1000 23 | self.log_iter = 100 24 | self.batch_size = 16 25 | 26 | # set up data layer 27 | self.image_size = (224, 224) 28 | self.data_generator = DataGenerator(self.image_size) 29 | 30 | def preprocess_image(self, image): 31 | # normalize image to [-1, +1] 32 | image = tf.cast(image, tf.float32) 33 | image = image / 127.5 34 | image = image - 1 35 | return image 36 | 37 | def _preprocess_images(self, image_orig, image_noisy): 38 | image_orig = self.preprocess_image(image_orig) 39 | image_noisy = self.preprocess_image(image_noisy) 40 | return image_orig, image_noisy 41 | 42 | def _data_layer(self, num_threads=8, prefetch_buffer=100): 43 | with tf.variable_scope('data'): 44 | data_shape = self.data_generator.get_tensor_shape() ######################### 45 | dataset = tf.data.Dataset.from_generator(lambda: self.data_generator, 46 | (tf.float32, tf.float32), 47 | (tf.TensorShape(data_shape), 48 | tf.TensorShape(data_shape))) 49 | dataset = dataset.map(self._preprocess_images, num_parallel_calls=num_threads) 50 | dataset = dataset.batch(self.batch_size) 51 | dataset = dataset.prefetch(prefetch_buffer) 52 | iterator = dataset.make_initializable_iterator() 53 | return iterator 54 | 55 | def _loss_functions(self, preds, ground_truth): 56 | with tf.name_scope('loss'): 57 | tf.losses.mean_squared_error(ground_truth, preds) ######################### 58 | total_loss = tf.losses.get_total_loss() 59 | return total_loss 60 | 61 | def _optimizer(self, loss, global_step): 62 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) 63 | with tf.control_dependencies(update_ops): 64 | optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate, epsilon=0.1) 65 | optimizer = optimizer.minimize(loss, global_step=global_step) 66 | return optimizer 67 | 68 | def train(self): 69 | # iteration number 70 | global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name='iter_number') 71 | 72 | # training graph 73 | iterator = self._data_layer() 74 | image_orig, image_noisy = iterator.get_next() 75 | training = tf.placeholder(tf.bool, name='is_training') 76 | logits = model(image_noisy, training=training) 77 | loss = self._loss_functions(logits, image_orig) 78 | optimizer = self._optimizer(loss, global_step) 79 | 80 | # summary placeholders 81 | streaming_loss_p = tf.placeholder(tf.float32) 82 | validation_loss_p = tf.placeholder(tf.float32) 83 | summ_op_train = tf.summary.scalar('streaming_loss', streaming_loss_p) 84 | summ_op_test = tf.summary.scalar('validation_loss', validation_loss_p) 85 | 86 | # don't allocate entire gpu memory 87 | config = tf.ConfigProto() 88 | config.gpu_options.allow_growth = True 89 | 90 | with tf.Session(config=config) as sess: 91 | sess.run(tf.global_variables_initializer()) 92 | sess.run(iterator.initializer) 93 | 94 | writer = tf.summary.FileWriter(self.checkpoint_path, sess.graph) 95 | 96 | saver = tf.train.Saver(max_to_keep=None) # keep all checkpoints 97 | ckpt = tf.train.get_checkpoint_state(self.checkpoint_path) 98 | 99 | # resume training if a checkpoint exists 100 | if ckpt and ckpt.model_checkpoint_path: 101 | saver.restore(sess, ckpt.model_checkpoint_path) 102 | print('Loaded parameters from 
{}'.format(ckpt.model_checkpoint_path))
103 |
104 |             initial_step = global_step.eval()
105 |
106 |             # train the model
107 |             streaming_loss = 0
108 |             for i in range(initial_step, self.num_iter + 1):
109 |                 _, loss_batch = sess.run([optimizer, loss], feed_dict={training: True})
110 |
111 |                 if not np.isfinite(loss_batch):
112 |                     print('loss diverged, stopping')
113 |                     exit()
114 |
115 |                 # log summary
116 |                 streaming_loss += loss_batch
117 |                 if i % self.log_iter == self.log_iter - 1:
118 |                     streaming_loss /= self.log_iter
119 |                     print(i + 1, streaming_loss)
120 |                     summary_train = sess.run(summ_op_train, feed_dict={streaming_loss_p: streaming_loss})
121 |                     writer.add_summary(summary_train, global_step=i)
122 |                     streaming_loss = 0
123 |
124 |                 # save model
125 |                 if i % self.save_iter == self.save_iter - 1:
126 |                     saver.save(sess, os.path.join(self.checkpoint_path, 'checkpoint'), global_step=global_step)
127 |                     print("Model saved!")
128 |
129 |                 # run validation
130 |                 if i % self.val_iter == self.val_iter - 1:
131 |                     print("Running validation.")
132 |                     self.data_generator.set_mode(is_training=False)
133 |                     sess.run(iterator.initializer)
134 |
135 |                     validation_loss = 0
136 |                     for j in range(self.data_generator.num_val // self.batch_size):
137 |                         loss_batch = sess.run(loss, feed_dict={training: False})
138 |                         validation_loss += loss_batch
139 |                     validation_loss /= (self.data_generator.num_val // self.batch_size)  # average over the number of batches, not the last loop index
140 |
141 |                     print("Validation loss: {}".format(validation_loss))
142 |
143 |                     summary_test = sess.run(summ_op_test, feed_dict={validation_loss_p: validation_loss})
144 |                     writer.add_summary(summary_test, global_step=i)
145 |
146 |                     self.data_generator.set_mode(is_training=True)
147 |                     sess.run(iterator.initializer)
148 |
149 |             writer.close()
150 |
151 | def main():
152 |     parser = argparse.ArgumentParser()
153 |     parser.add_argument('--checkpoint_path', type=str, default='./checkpoints/',
154 |                         help="Path to the dir where the checkpoints are saved")
155 |     args = parser.parse_args()
156 |     trainer = TFModelTrainer(args.checkpoint_path)
157 |     trainer.train()
158 |
159 | if __name__ == '__main__':
160 |     main()
161 |
--------------------------------------------------------------------------------
/S9/S1.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 | x = 2.0
4 | y = 8.0
5 |
6 | @tf.function
7 | def geometric_mean(x, y):
8 |     g_mean = tf.sqrt(x * y)
9 |     return g_mean
10 |
11 | g_mean = geometric_mean(x, y)
12 | tf.print(g_mean)
--------------------------------------------------------------------------------
/S9/S2.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 | (train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()
4 |
5 | model = tf.keras.Sequential([
6 |     tf.keras.layers.Flatten(input_shape=(28, 28)),
7 |     tf.keras.layers.Dense(512, activation=tf.nn.relu),
8 |     tf.keras.layers.Dense(10, activation=tf.nn.softmax)
9 | ])
10 |
11 | model.compile(optimizer='adam',
12 |               loss='sparse_categorical_crossentropy',
13 |               metrics=['accuracy'])
14 |
15 | model.fit(train_images, train_labels, epochs=5)
16 |
--------------------------------------------------------------------------------
/S9/S3.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 | (train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()
4 |
5 | inputs = tf.keras.Input(shape=(28,28))
6 | net = tf.keras.layers.Flatten()(inputs)
7 | net
8 | net = tf.keras.layers.Dense(10, activation=tf.nn.softmax)(net)
9 | model = tf.keras.Model(inputs=inputs, outputs=net)
10 | 
11 | model.compile(optimizer='adam',
12 |               loss='sparse_categorical_crossentropy',
13 |               metrics=['accuracy'])
14 | 
15 | model.fit(train_images, train_labels, epochs=5)
--------------------------------------------------------------------------------
/S9/S4.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | 
3 | data_train, _ = tf.keras.datasets.mnist.load_data()
4 | dataset = tf.data.Dataset.from_tensor_slices(data_train)
5 | dataset = dataset.shuffle(buffer_size=60000)
6 | dataset = dataset.batch(32)
7 | 
8 | for row in dataset.take(1):  # datasets are Python iterables in eager mode; peek at one batch
9 |     print(row)
10 | 
11 | model = tf.keras.Sequential([
12 |     tf.keras.layers.Flatten(input_shape=(28, 28)),
13 |     tf.keras.layers.Dense(512, activation=tf.nn.relu),
14 |     tf.keras.layers.Dense(10, activation=tf.nn.softmax)
15 | ])
16 | 
17 | model.compile(optimizer='adam',
18 |               loss='sparse_categorical_crossentropy',
19 |               metrics=['accuracy'])
20 | model.fit(dataset, epochs=5)
--------------------------------------------------------------------------------
/S9/S5.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | 
3 | data_train, _ = tf.keras.datasets.mnist.load_data()
4 | dataset = tf.data.Dataset.from_tensor_slices(data_train)
5 | dataset = dataset.shuffle(buffer_size=60000)
6 | dataset = dataset.batch(32)
7 | 
8 | model = tf.keras.Sequential([
9 |     tf.keras.layers.Flatten(input_shape=(28, 28)),
10 |     tf.keras.layers.Dense(512, activation=tf.nn.relu),
11 |     tf.keras.layers.Dense(10, activation=tf.nn.softmax)
12 | ])
13 | 
14 | model.compile(optimizer='adam',
15 |               loss='sparse_categorical_crossentropy',
16 |               metrics=['accuracy'])
17 | 
18 | tb = tf.keras.callbacks.TensorBoard(log_dir='./checkpoints')  # writes training logs for TensorBoard
19 | model.fit(dataset, epochs=5, callbacks=[tb])
--------------------------------------------------------------------------------
/S9/S6.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | 
3 | data_train, _ = tf.keras.datasets.mnist.load_data()
4 | dataset = tf.data.Dataset.from_tensor_slices(data_train)
5 | dataset = dataset.shuffle(buffer_size=60000)
6 | dataset = dataset.batch(32)
7 | 
8 | strategy = tf.distribute.MirroredStrategy()  # synchronous data parallelism across available GPUs
9 | with strategy.scope():  # variables created under this scope are mirrored on each replica
10 |     model = tf.keras.Sequential([
11 |         tf.keras.layers.Flatten(input_shape=(28, 28)),
12 |         tf.keras.layers.Dense(512, activation=tf.nn.relu),
13 |         tf.keras.layers.Dense(10, activation=tf.nn.softmax)
14 |     ])
15 | 
16 |     model.compile(optimizer='adam',
17 |                   loss='sparse_categorical_crossentropy',
18 |                   metrics=['accuracy'])
19 | 
20 | tb = tf.keras.callbacks.TensorBoard(log_dir='./checkpoints')
21 | model.fit(dataset, epochs=5, callbacks=[tb])
--------------------------------------------------------------------------------
/img/dlcc_github.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/isikdogan/deep_learning_tutorials/7a81d56c1b6e8bee715ddb08e85ea25562acbdd8/img/dlcc_github.jpg
--------------------------------------------------------------------------------
/img/tfcs_github.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/isikdogan/deep_learning_tutorials/7a81d56c1b6e8bee715ddb08e85ea25562acbdd8/img/tfcs_github.png
--------------------------------------------------------------------------------