├── .gitignore
├── README.md
├── S1
│   ├── S1_notebook.ipynb
│   ├── S1a_live.py
│   └── S1b_live.py
├── S2_live.py
├── S3_live.py
├── S4_live.py
├── S5_live.py
├── S6
│   ├── create_tf_records.py
│   ├── freeze_model.py
│   ├── inference.py
│   ├── nets
│   │   └── mobilenet_v1.py
│   ├── resize_images.py
│   └── trainer.py
├── S7
│   ├── checkpoints
│   │   └── checkpoint
│   ├── nets
│   │   └── mobilenet_v1.py
│   └── trainer.py
├── S8
│   ├── datagenerator.py
│   ├── nets
│   │   └── model.py
│   └── trainer.py
├── S9
│   ├── S1.py
│   ├── S2.py
│   ├── S3.py
│   ├── S4.py
│   ├── S5.py
│   └── S6.py
└── img
    ├── dlcc_github.jpg
    └── tfcs_github.png

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
  1 | # Byte-compiled / optimized / DLL files
  2 | __pycache__/
  3 | *.py[cod]
  4 | *$py.class
  5 | 
  6 | # C extensions
  7 | *.so
  8 | 
  9 | # Distribution / packaging
 10 | .Python
 11 | env/
 12 | build/
 13 | develop-eggs/
 14 | dist/
 15 | downloads/
 16 | eggs/
 17 | .eggs/
 18 | lib/
 19 | lib64/
 20 | parts/
 21 | sdist/
 22 | var/
 23 | wheels/
 24 | *.egg-info/
 25 | .installed.cfg
 26 | *.egg
 27 | 
 28 | # PyInstaller
 29 | #  Usually these files are written by a python script from a template
 30 | #  before PyInstaller builds the exe, so as to inject date/other infos into it.
 31 | *.manifest
 32 | *.spec
 33 | 
 34 | # Installer logs
 35 | pip-log.txt
 36 | pip-delete-this-directory.txt
 37 | 
 38 | # Unit test / coverage reports
 39 | htmlcov/
 40 | .tox/
 41 | .coverage
 42 | .coverage.*
 43 | .cache
 44 | nosetests.xml
 45 | coverage.xml
 46 | *.cover
 47 | .hypothesis/
 48 | 
 49 | # Translations
 50 | *.mo
 51 | *.pot
 52 | 
 53 | # Django stuff:
 54 | *.log
 55 | local_settings.py
 56 | 
 57 | # Flask stuff:
 58 | instance/
 59 | .webassets-cache
 60 | 
 61 | # Scrapy stuff:
 62 | .scrapy
 63 | 
 64 | # Sphinx documentation
 65 | docs/_build/
 66 | 
 67 | # PyBuilder
 68 | target/
 69 | 
 70 | # Jupyter Notebook
 71 | .ipynb_checkpoints
 72 | 
 73 | # pyenv
 74 | .python-version
 75 | 
 76 | # celery beat schedule file
 77 | celerybeat-schedule
 78 | 
 79 | # SageMath parsed files
 80 | *.sage.py
 81 | 
 82 | # dotenv
 83 | .env
 84 | 
 85 | # virtualenv
 86 | .venv
 87 | venv/
 88 | ENV/
 89 | 
 90 | # Spyder project settings
 91 | .spyderproject
 92 | .spyproject
 93 | 
 94 | # Rope project settings
 95 | .ropeproject
 96 | 
 97 | # mkdocs documentation
 98 | /site
 99 | 
100 | # mypy
101 | .mypy_cache/
102 | 

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | TensorFlow Coding Sessions
 2 | 
 3 | ## Hands-on Deep Learning: TensorFlow Coding Sessions
 4 | 
 5 | This repository has the code for the Hands-on Deep Learning: TensorFlow Coding Sessions. The videos will be uploaded on a weekly basis.
 6 | 
 7 | The series consists of the introductory TensorFlow tutorials outlined below:
 8 | 
 9 | | # | Tutorial | Code | Video |
10 | |-|------------------------------------------------------------------------|------|------------------|
11 | |1| Introduction to TensorFlow: graphs, sessions, constants, and variables |[S1](S1/) and [S1_notebook.ipynb](S1/S1_notebook.ipynb)| [Video #1](https://youtu.be/1KzJbIFnVTE) |
12 | |2| Training a multilayer perceptron |[S2_live.py](S2_live.py)| [Video #2](https://youtu.be/b7ykcBzz9wo) |
13 | |3| Setting up the training and validation pipeline |[S3_live.py](S3_live.py)| [Video #3](https://youtu.be/l_ZvxKBToWs) |
14 | |4| Regularization, saving and resuming from checkpoints, and TensorBoard |[S4_live.py](S4_live.py)| [Video #4](https://youtu.be/ni9FZtF_gLs) |
15 | |5| Convolutional neural networks, batchnorm, learning rate schedules, optimizers|[S5_live.py](S5_live.py)| [Video #5](https://youtu.be/ULX1nWPAJbM) |
16 | |6| Converting a dataset into TFRecords, training an image classifier, and freezing the model for deployment|[S6](S6/)| [Video #6](https://youtu.be/tzKqjPdAf8M) |
17 | |7| Transfer learning: fine-tuning a model in TensorFlow |[S7](S7/)| [Video #7](https://youtu.be/jccBP_uA98k) |
18 | |8| Using a Python iterator as a data generator and training a denoising autoencoder |[S8](S8/)| N/A |
19 | |9| What is new in TensorFlow 2.0 **[new]** |[S9](S9/)| [Video #8](https://youtu.be/GI_QVLNCgPo) |
20 | 
21 | ---
22 | 
23 | Deep Learning Crash Course
24 | 
25 | ## Deep Learning Crash Course
26 | 
27 | A series of mini-lectures on the fundamentals of machine learning, with a focus on neural networks and deep learning.
28 | 
29 | * [Lecture #1: Introduction](https://youtu.be/nmnaO6esC7c)
30 | * [Lecture #2: Artificial Neural Networks Demystified](https://youtu.be/oS5fz_mHVz0)
31 | * [Lecture #3: Artificial Neural Networks: Going Deeper](https://youtu.be/_XPkAxm0Yx0)
32 | * [Lecture #4: Overfitting, Underfitting, and Model Capacity](https://youtu.be/ms-Ooh9mjiE)
33 | * [Lecture #5: Regularization](https://youtu.be/NRCZJUviZN0)
34 | * [Lecture #6: Data Collection and Preprocessing](https://youtu.be/dAg-_gzFo14)
35 | * [Lecture #7: Convolutional Neural Networks Explained](https://youtu.be/-I0lry5ceDs)
36 | * [Lecture #8: How to Design a Convolutional Neural Network](https://youtu.be/fTw3K8D5xDs)
37 | * [Lecture #9: Transfer Learning](https://youtu.be/_2EHcpg52uU)
38 | * [Lecture #10: Optimization Tricks: momentum, batch-norm, and more](https://youtu.be/kK8-jCCR4is)
39 | * [Lecture #11: Recurrent Neural Networks](https://youtu.be/k97Jrg_4tFA)
40 | * [Lecture #12: Deep Unsupervised Learning](https://youtu.be/P8_W5Wc4zeg)
41 | * [Lecture #13: Generative Adversarial Networks](https://youtu.be/7tFBoxex4JE)
42 | * [Lecture #14: Practical Methodology in Deep Learning](https://youtu.be/9Sl_t_GxX6w)
43 | 
44 | ---
45 | 

--------------------------------------------------------------------------------
/S1/S1_notebook.ipynb:
--------------------------------------------------------------------------------
 1 | {
 2 |  "cells": [
 3 |   {
 4 |    "cell_type": "markdown",
 5 |    "metadata": {},
 6 |    "source": [
 7 |     "# Deep Learning With TensorFlow\n",
 8 |     "\n",
 9 |     "## Introduction\n",
10 |     "\n",
11 |     "Let's start by importing TensorFlow into our project and making sure that we have the right version installed.\n",
12 |     "If you haven't installed TensorFlow yet, you can easily do so using PyPI: https://www.tensorflow.org/install/."
13 |    ]
14 |   },
15 |   {
16 |    "cell_type": "code",
17 |    "execution_count": 1,
18 |    "metadata": {},
19 |    "outputs": [
20 |     {
21 |      "name": "stdout",
22 |      "output_type": "stream",
23 |      "text": [
24 |       "1.10.0\n"
25 |      ]
26 |     }
27 |    ],
28 |    "source": [
29 |     "import tensorflow as tf\n",
30 |     "print(tf.__version__)"
31 |    ]
32 |   },
33 |   {
34 |    "cell_type": "markdown",
35 |    "metadata": {},
36 |    "source": [
37 |     "### Graphs and Sessions\n",
38 |     "Unless you are using the eager execution mode, operations in TensorFlow are not executed immediately. In TensorFlow, the description of the computations is separated from the execution. A typical TensorFlow program constructs a computational graph first, then creates a session to execute the operations in the graph. Let's create a very simple graph and run it in a session to compute the geometric mean of two numbers. In this example we used placeholders to feed the inputs to the graph. By defining a placeholder we tell the model that we will feed the values later, when we execute the graph. Feeding data this way can lead to input/output bottlenecks in large-scale applications. We will later see how to read data in parallel while the graph is being executed."
39 |    ]
40 |   },
41 |   {
42 |    "cell_type": "code",
43 |    "execution_count": 2,
44 |    "metadata": {},
45 |    "outputs": [
46 |     {
47 |      "name": "stdout",
48 |      "output_type": "stream",
49 |      "text": [
50 |       "4.0\n"
51 |      ]
52 |     }
53 |    ],
54 |    "source": [
55 |     "# Define the inputs\n",
56 |     "x = tf.placeholder(tf.float32)\n",
57 |     "y = tf.placeholder(tf.float32)\n",
58 |     "\n",
59 |     "# Define the graph\n",
60 |     "g_mean = tf.sqrt(x * y)\n",
61 |     "\n",
62 |     "# Run the graph\n",
63 |     "with tf.Session() as sess:\n",
64 |     "    res = sess.run(g_mean, feed_dict={x: 2, y: 8})\n",
65 |     "    print(res)"
66 |    ]
67 |   },
68 |   {
69 |    "cell_type": "markdown",
70 |    "metadata": {},
71 |    "source": [
72 |     "### Constants and Variables\n",
73 |     "\n",
74 |     "We can declare constants and variables to use in a graph. The main differences between these two are:\n",
75 |     "* Constants have constant values whereas variables can change during execution. A typical example of a variable is a trainable weight in a neural network.\n",
76 |     "* Constants are stored in the graph, whereas variables are not. Using constants increases the size of the graph.\n",
77 |     "\n",
78 |     "Let's take a look at an example."
79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": 7, 84 | "metadata": {}, 85 | "outputs": [ 86 | { 87 | "name": "stdout", 88 | "output_type": "stream", 89 | "text": [ 90 | "0.2\n" 91 | ] 92 | } 93 | ], 94 | "source": [ 95 | "# This block gets an existing variable with a specific name within a variable scope\n", 96 | "# or creates a new one if no such variable exists\n", 97 | "# In this case it's identical to using tf.Variable\n", 98 | "# Variable scopes help us define and reuse variables within a context\n", 99 | "with tf.variable_scope(\"linear_model\", reuse=tf.AUTO_REUSE):\n", 100 | " w = tf.get_variable(\"weight\", dtype=tf.float32, initializer=tf.constant(0.1))\n", 101 | " c = tf.get_variable(\"bias\", dtype=tf.float32, initializer=tf.constant(0.0))\n", 102 | "\n", 103 | "# here we define our graph\n", 104 | "model = x * w + c\n", 105 | "\n", 106 | "with tf.Session() as sess:\n", 107 | " # we need to initialize all variables otherwise it will throw an error\n", 108 | " sess.run(tf.global_variables_initializer())\n", 109 | " print(sess.run(model, feed_dict={x: 2.0}))" 110 | ] 111 | }, 112 | { 113 | "cell_type": "markdown", 114 | "metadata": {}, 115 | "source": [ 116 | "In the example above, we defined a very simple linear model with a single input, weight, and bias. We initialized the variables with constant values and ran the graph to print the initial output. We will later see how to train these variables to fit a function to data." 117 | ] 118 | } 119 | ], 120 | "metadata": { 121 | "kernelspec": { 122 | "display_name": "Python 3", 123 | "language": "python", 124 | "name": "python3" 125 | }, 126 | "language_info": { 127 | "codemirror_mode": { 128 | "name": "ipython", 129 | "version": 3 130 | }, 131 | "file_extension": ".py", 132 | "mimetype": "text/x-python", 133 | "name": "python", 134 | "nbconvert_exporter": "python", 135 | "pygments_lexer": "ipython3", 136 | "version": "3.6.4" 137 | } 138 | }, 139 | "nbformat": 4, 140 | "nbformat_minor": 2 141 | } 142 | -------------------------------------------------------------------------------- /S1/S1a_live.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | # define the inputs 4 | x = tf.placeholder(tf.float32) 5 | y = tf.placeholder(tf.float32) 6 | 7 | # define the graph 8 | g_mean = tf.sqrt(x * y) 9 | 10 | # run the graph 11 | with tf.Session() as sess: 12 | res = sess.run(g_mean, feed_dict={x: 2, y: 8}) 13 | print(res) -------------------------------------------------------------------------------- /S1/S1b_live.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | # define the inputs 4 | x = tf.placeholder(tf.float32) 5 | 6 | with tf.variable_scope("linear_model", reuse=tf.AUTO_REUSE): 7 | w = tf.get_variable("weight", dtype=tf.float32, initializer=tf.constant(0.1)) 8 | c = tf.get_variable("bias", dtype=tf.float32, initializer=tf.constant(0.0)) 9 | model = x * w + c 10 | 11 | with tf.Session() as sess: 12 | sess.run(tf.global_variables_initializer()) 13 | print(sess.run(model, feed_dict={x: 2.0})) -------------------------------------------------------------------------------- /S2_live.py: -------------------------------------------------------------------------------- 1 | """ Deep Learning with TensorFlow 2 | Coding session 2: Training a Multilayer Perceptron 3 | 4 | Let's train a simple neural network that classifies handwritten digits using the MNIST dataset. 
 5 | Video: https://youtu.be/b7ykcBzz9wo
 6 | """
 7 | 
 8 | import tensorflow as tf
 9 | 
10 | def preprocess_data(im, label):
11 |     im = tf.cast(im, tf.float32)
12 |     im = im / 127.5
13 |     im = im - 1
14 |     im = tf.reshape(im, [-1])
15 |     return im, label
16 | 
17 | def data_layer(data_tensor, num_threads=8, prefetch_buffer=100, batch_size=32):
18 |     with tf.variable_scope("data"):
19 |         dataset = tf.data.Dataset.from_tensor_slices(data_tensor)
20 |         dataset = dataset.shuffle(buffer_size=60000).repeat()
21 |         dataset = dataset.map(preprocess_data, num_parallel_calls=num_threads)
22 |         dataset = dataset.batch(batch_size)
23 |         dataset = dataset.prefetch(prefetch_buffer)
24 |         iterator = dataset.make_one_shot_iterator()
25 |     return iterator
26 | 
27 | def model(input_layer, num_classes=10):
28 |     with tf.variable_scope("model"):
29 |         net = tf.layers.dense(input_layer, 512)
30 |         net = tf.nn.relu(net)
31 |         net = tf.layers.dense(net, num_classes)
32 |     return net
33 | 
34 | def loss_functions(logits, labels, num_classes=10):
35 |     with tf.variable_scope("loss"):
36 |         target_prob = tf.one_hot(labels, num_classes)
37 |         total_loss = tf.losses.softmax_cross_entropy(target_prob, logits)
38 |     return total_loss
39 | 
40 | def optimizer_func(total_loss, global_step, learning_rate=0.1):
41 |     with tf.variable_scope("optimizer"):
42 |         optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
43 |         optimizer = optimizer.minimize(total_loss, global_step=global_step)
44 |     return optimizer
45 | 
46 | def performance_metric(logits, labels):
47 |     with tf.variable_scope("performance_metric"):
48 |         preds = tf.argmax(logits, axis=1)
49 |         labels = tf.cast(labels, tf.int64)
50 |         corrects = tf.equal(preds, labels)
51 |         accuracy = tf.reduce_mean(tf.cast(corrects, tf.float32))
52 |     return accuracy
53 | 
54 | def train(data_tensor):
55 |     global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name="iter_number")
56 | 
57 |     # training graph
58 |     images, labels = data_layer(data_tensor).get_next()
59 |     logits = model(images)
60 |     loss = loss_functions(logits, labels)
61 |     optimizer = optimizer_func(loss, global_step)
62 |     accuracy = performance_metric(logits, labels)
63 | 
64 |     # start training
65 |     num_iter = 10000
66 |     log_iter = 1000
67 |     with tf.Session() as sess:
68 |         sess.run(tf.global_variables_initializer())
69 |         streaming_loss = 0
70 |         streaming_accuracy = 0
71 | 
72 |         for i in range(1, num_iter + 1):
73 |             _, loss_batch, acc_batch = sess.run([optimizer, loss, accuracy])
74 |             streaming_loss += loss_batch
75 |             streaming_accuracy += acc_batch
76 |             if i % log_iter == 0:
77 |                 print("Iteration: {}, Streaming loss: {:.2f}, Streaming accuracy: {:.2f}"
78 |                       .format(i, streaming_loss/log_iter, streaming_accuracy/log_iter))
79 |                 streaming_loss = 0
80 |                 streaming_accuracy = 0
81 | 
82 | if __name__ == "__main__":
83 |     # It's very easy to load the MNIST dataset through the Keras module.
84 |     # Keras is a high-level neural network API that has become a part of TensorFlow since version 1.2.
85 |     # Therefore, we don't need to install Keras separately.
86 |     # In the upcoming lectures we will also see how to load and preprocess custom data.
87 |     data_train, data_val = tf.keras.datasets.mnist.load_data()
88 | 
89 |     # The training set has 60,000 samples where each sample is a 28x28 grayscale image.
90 |     # Each one of these samples has a single label. Similarly, the validation set has 10,000 images and corresponding labels.
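    # (load_data() returns two (images, labels) tuples of NumPy uint8 arrays:
    # (60000, 28, 28) and (60000,) for training, and (10000, 28, 28) and
    # (10000,) for validation.)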
91 | # We can verify this by printing the shapes of the loaded tensors 92 | print(data_train[0].shape, data_train[1].shape, data_val[0].shape, data_val[1].shape) 93 | 94 | # Let the training begin! 95 | train(data_tensor=data_train) 96 | 97 | # Even after very few epochs, we got a model that can classify the handwritten digits in the training set 98 | # with 98% accuracy. So far we haven't used the validation set at all. 99 | # You might wonder why we need a separate validation set in the first place. 100 | # The answer is to make sure that the model generalizes well to unseen data to have an idea of the actual performance of the model. 101 | # We will talk about that in the next session. -------------------------------------------------------------------------------- /S3_live.py: -------------------------------------------------------------------------------- 1 | """ Deep Learning with TensorFlow 2 | Coding session 3: Setting up the training and validation pipeline 3 | 4 | In the previous session we trained a model without keeping track of how it's 5 | doing on a validation set. Let's pick up where we left off and modify our code 6 | from the previous session to keep track of validation accuracy while training. 7 | """ 8 | 9 | import tensorflow as tf 10 | import os 11 | 12 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3' 13 | 14 | def preprocess_data(im, label): 15 | im = tf.cast(im, tf.float32) 16 | im = im / 127.5 17 | im = im - 1 18 | im = tf.reshape(im, [-1]) 19 | return im, label 20 | 21 | # We will be using the same data pipeline for both training and validation sets 22 | # So let's create a helper function for that 23 | def create_dataset_pipeline(data_tensor, is_train=True, num_threads=8, prefetch_buffer=100, batch_size=32): 24 | dataset = tf.data.Dataset.from_tensor_slices(data_tensor) 25 | if is_train: 26 | dataset = dataset.shuffle(buffer_size=60000).repeat() 27 | dataset = dataset.map(preprocess_data, num_parallel_calls=num_threads) 28 | dataset = dataset.batch(batch_size) 29 | dataset = dataset.prefetch(prefetch_buffer) 30 | return dataset 31 | 32 | def data_layer(): 33 | with tf.variable_scope("data"): 34 | data_train, data_val = tf.keras.datasets.mnist.load_data() 35 | dataset_train = create_dataset_pipeline(data_train, is_train=True) 36 | dataset_val = create_dataset_pipeline(data_val, is_train=False, batch_size=1) 37 | iterator = tf.data.Iterator.from_structure(dataset_train.output_types, dataset_train.output_shapes) 38 | init_op_train = iterator.make_initializer(dataset_train) 39 | init_op_val = iterator.make_initializer(dataset_val) 40 | return iterator, init_op_train, init_op_val 41 | 42 | def model(input_layer, num_classes=10): 43 | with tf.variable_scope("model"): 44 | net = tf.layers.dense(input_layer, 512) 45 | net = tf.nn.relu(net) 46 | net = tf.layers.dense(net, num_classes) 47 | return net 48 | 49 | def loss_functions(logits, labels, num_classes=10): 50 | with tf.variable_scope("loss"): 51 | target_prob = tf.one_hot(labels, num_classes) 52 | total_loss = tf.losses.softmax_cross_entropy(target_prob, logits) 53 | return total_loss 54 | 55 | def optimizer_func(total_loss, global_step, learning_rate=0.1): 56 | with tf.variable_scope("optimizer"): 57 | optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate) 58 | optimizer = optimizer.minimize(total_loss, global_step=global_step) 59 | return optimizer 60 | 61 | def performance_metric(logits, labels): 62 | with tf.variable_scope("performance_metric"): 63 | preds = tf.argmax(logits, axis=1) 64 | labels = 
tf.cast(labels, tf.int64) 65 | corrects = tf.equal(preds, labels) 66 | accuracy = tf.reduce_mean(tf.cast(corrects, tf.float32)) 67 | return accuracy 68 | 69 | def train(): 70 | global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name="iter_number") 71 | 72 | # define the training graph 73 | iterator, init_op_train, init_op_val = data_layer() 74 | images, labels = iterator.get_next() 75 | logits = model(images) 76 | loss = loss_functions(logits, labels) 77 | optimizer = optimizer_func(loss, global_step) 78 | accuracy = performance_metric(logits, labels) 79 | 80 | # start training 81 | num_iter = 18750 # 10 epochs 82 | log_iter = 1875 83 | val_iter = 1875 84 | with tf.Session() as sess: 85 | sess.run(tf.global_variables_initializer()) 86 | sess.run(init_op_train) 87 | 88 | streaming_loss = 0 89 | streaming_accuracy = 0 90 | 91 | for i in range(1, num_iter + 1): 92 | _, loss_batch, acc_batch = sess.run([optimizer, loss, accuracy]) 93 | streaming_loss += loss_batch 94 | streaming_accuracy += acc_batch 95 | if i % log_iter == 0: 96 | print("Iteration: {}, Streaming loss: {:.2f}, Streaming accuracy: {:.2f}" 97 | .format(i, streaming_loss/log_iter, streaming_accuracy/log_iter)) 98 | streaming_loss = 0 99 | streaming_accuracy = 0 100 | 101 | if i % val_iter == 0: 102 | sess.run(init_op_val) 103 | validation_accuracy = 0 104 | num_iter = 0 105 | while True: 106 | try: 107 | acc_batch = sess.run(accuracy) 108 | validation_accuracy += acc_batch 109 | num_iter += 1 110 | except tf.errors.OutOfRangeError: 111 | validation_accuracy /= num_iter 112 | print("Iteration: {}, Validation accuracy: {:.2f}".format(i, validation_accuracy)) 113 | sess.run(init_op_train) # switch back to training set 114 | break 115 | 116 | if __name__ == "__main__": 117 | train() 118 | -------------------------------------------------------------------------------- /S4_live.py: -------------------------------------------------------------------------------- 1 | """ Deep Learning with TensorFlow 2 | Live coding session 4: Regularization, saving and resuming from checkpoints, basics of TensorBoard 3 | 4 | In the previous session, we wrote this code to train a simple model in TensorFlow. 5 | In this session, we will train a deeper model, regularize it, and visualize it in TensorBoard. 
6 | """ 7 | 8 | import tensorflow as tf 9 | import os 10 | 11 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3' 12 | 13 | def preprocess_data(im, label): 14 | im = tf.cast(im, tf.float32) 15 | im = im / 127.5 16 | im = im - 1 17 | im = tf.reshape(im, [-1]) 18 | return im, label 19 | 20 | # We will be using the same data pipeline for both training and validation sets 21 | # So let's create a helper function for that 22 | def create_dataset_pipeline(data_tensor, is_train=True, num_threads=8, prefetch_buffer=100, batch_size=32): 23 | dataset = tf.data.Dataset.from_tensor_slices(data_tensor) 24 | if is_train: 25 | dataset = dataset.shuffle(buffer_size=60000).repeat() 26 | dataset = dataset.map(preprocess_data, num_parallel_calls=num_threads) 27 | dataset = dataset.batch(batch_size) 28 | dataset = dataset.prefetch(prefetch_buffer) 29 | return dataset 30 | 31 | def data_layer(): 32 | with tf.variable_scope("data"): 33 | data_train, data_val = tf.keras.datasets.mnist.load_data() 34 | dataset_train = create_dataset_pipeline(data_train, is_train=True) 35 | dataset_val = create_dataset_pipeline(data_val, is_train=False, batch_size=1) 36 | iterator = tf.data.Iterator.from_structure(dataset_train.output_types, dataset_train.output_shapes) 37 | init_op_train = iterator.make_initializer(dataset_train) 38 | init_op_val = iterator.make_initializer(dataset_val) 39 | return iterator, init_op_train, init_op_val 40 | 41 | ######################################################################## 42 | def model(input_layer, num_classes=10): 43 | with tf.variable_scope("model"): 44 | reg = tf.contrib.layers.l2_regularizer(0.00001) 45 | net = input_layer 46 | for i in range(3): 47 | net = tf.layers.dense(net, 48 | units=512, 49 | kernel_regularizer=reg) 50 | net = tf.nn.relu(net) 51 | net = tf.layers.dropout(net, rate=0.2) 52 | net = tf.layers.dense(net, num_classes) 53 | return net 54 | 55 | def loss_functions(logits, labels, num_classes=10): 56 | with tf.variable_scope("loss"): 57 | target_prob = tf.one_hot(labels, num_classes) 58 | tf.losses.softmax_cross_entropy(target_prob, logits) 59 | total_loss = tf.losses.get_total_loss() # include regularization loss (I forgot to add this in the video) 60 | return total_loss 61 | ######################################################################## 62 | 63 | def optimizer_func(total_loss, global_step, learning_rate=0.1): 64 | with tf.variable_scope("optimizer"): 65 | optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate) 66 | optimizer = optimizer.minimize(total_loss, global_step=global_step) 67 | return optimizer 68 | 69 | def performance_metric(logits, labels): 70 | with tf.variable_scope("performance_metric"): 71 | preds = tf.argmax(logits, axis=1) 72 | labels = tf.cast(labels, tf.int64) 73 | corrects = tf.equal(preds, labels) 74 | accuracy = tf.reduce_mean(tf.cast(corrects, tf.float32)) 75 | return accuracy 76 | 77 | def train(): 78 | global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name="iter_number") 79 | 80 | # define the training graph 81 | iterator, init_op_train, init_op_val = data_layer() 82 | images, labels = iterator.get_next() 83 | logits = model(images) 84 | loss = loss_functions(logits, labels) 85 | optimizer = optimizer_func(loss, global_step) 86 | accuracy = performance_metric(logits, labels) 87 | 88 | ######################################################################## 89 | # summary placeholders 90 | streaming_loss_p = tf.placeholder(tf.float32) 91 | streaming_acc_p = tf.placeholder(tf.float32) 92 | val_acc_p = 
tf.placeholder(tf.float32) 93 | val_summ_ops = tf.summary.scalar('validation_acc', val_acc_p) 94 | train_summ_ops = tf.summary.merge([ 95 | tf.summary.scalar('streaming_loss', streaming_loss_p), 96 | tf.summary.scalar('streaming_accuracy', streaming_acc_p) 97 | ]) 98 | ######################################################################## 99 | 100 | # start training 101 | num_iter = 18750 # 10 epochs 102 | log_iter = 1875 103 | val_iter = 1875 104 | with tf.Session() as sess: 105 | sess.run(tf.global_variables_initializer()) 106 | sess.run(init_op_train) 107 | 108 | ######################################################################## 109 | # logs for TensorBoard 110 | logdir = 'logs' 111 | writer = tf.summary.FileWriter(logdir, sess.graph) # visualize the graph 112 | 113 | # load / save checkpoints 114 | checkpoint_path = 'checkpoints' 115 | saver = tf.train.Saver(max_to_keep=None) 116 | ckpt = tf.train.get_checkpoint_state(checkpoint_path) 117 | 118 | # resume training if a checkpoint exists 119 | if ckpt and ckpt.model_checkpoint_path: 120 | saver.restore(sess, ckpt.model_checkpoint_path) 121 | print("Loaded parameters from {}".format(ckpt.model_checkpoint_path)) 122 | 123 | initial_step = global_step.eval() 124 | ######################################################################## 125 | 126 | streaming_loss = 0 127 | streaming_accuracy = 0 128 | 129 | for i in range(initial_step, num_iter + 1): #################################### initial step 130 | _, loss_batch, acc_batch = sess.run([optimizer, loss, accuracy]) 131 | streaming_loss += loss_batch 132 | streaming_accuracy += acc_batch 133 | if i % log_iter == 0: 134 | print("Iteration: {}, Streaming loss: {:.2f}, Streaming accuracy: {:.2f}" 135 | .format(i, streaming_loss/log_iter, streaming_accuracy/log_iter)) 136 | 137 | ##################################################################################### 138 | # save to log file for TensorBoard 139 | summary_train = sess.run(train_summ_ops, feed_dict={streaming_loss_p: streaming_loss, 140 | streaming_acc_p: streaming_accuracy}) 141 | writer.add_summary(summary_train, global_step=i) 142 | ##################################################################################### 143 | 144 | streaming_loss = 0 145 | streaming_accuracy = 0 146 | 147 | if i % val_iter == 0: 148 | ##################################################################################### 149 | saver.save(sess, os.path.join(checkpoint_path, 'checkpoint'), global_step=global_step) 150 | print("Model saved!") 151 | ##################################################################################### 152 | 153 | sess.run(init_op_val) 154 | validation_accuracy = 0 155 | num_iter = 0 156 | while True: 157 | try: 158 | acc_batch = sess.run(accuracy) 159 | validation_accuracy += acc_batch 160 | num_iter += 1 161 | except tf.errors.OutOfRangeError: 162 | validation_accuracy /= num_iter 163 | print("Iteration: {}, Validation accuracy: {:.2f}".format(i, validation_accuracy)) 164 | 165 | ############################################################################### 166 | # save log file to TensorBoard 167 | summary_val = sess.run(val_summ_ops, feed_dict={val_acc_p: validation_accuracy}) 168 | writer.add_summary(summary_val, global_step=i) 169 | ############################################################################### 170 | 171 | sess.run(init_op_train) # switch back to training set 172 | break 173 | writer.close() 174 | 175 | if __name__ == "__main__": 176 | train() 177 | 
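Note: once a training run has populated the `logs` and `checkpoints` directories created above, the results can be inspected outside of train(). Below is a minimal sketch, assuming S4_live.py is importable from the repo root so that the same graph can be rebuilt before restoring:

    # Inspect the training curves and the graph in a browser:
    #   $ tensorboard --logdir logs
    import tensorflow as tf
    from S4_live import data_layer, model

    # Rebuild the same graph, then load the saved weights into it.
    iterator, init_op_train, init_op_val = data_layer()
    images, labels = iterator.get_next()
    logits = model(images)

    with tf.Session() as sess:
        ckpt = tf.train.get_checkpoint_state('checkpoints')
        if ckpt and ckpt.model_checkpoint_path:
            tf.train.Saver().restore(sess, ckpt.model_checkpoint_path)
            print("Restored parameters from {}".format(ckpt.model_checkpoint_path))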
-------------------------------------------------------------------------------- /S5_live.py: -------------------------------------------------------------------------------- 1 | """ Deep Learning with TensorFlow 2 | Live coding session 5: convolutional neural networks, batchnorm, learning rate schedules, optimizers 3 | """ 4 | 5 | import tensorflow as tf 6 | import os 7 | 8 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3' 9 | 10 | def preprocess_data(im, label): 11 | im = tf.cast(im, tf.float32) 12 | im = im / 127.5 13 | im = im - 1 14 | # im = tf.reshape(im, [-1]) 15 | return im, label 16 | 17 | # We will be using the same data pipeline for both training and validation sets 18 | # So let's create a helper function for that 19 | def create_dataset_pipeline(data_tensor, is_train=True, num_threads=8, prefetch_buffer=100, batch_size=32): 20 | dataset = tf.data.Dataset.from_tensor_slices(data_tensor) 21 | if is_train: 22 | dataset = dataset.shuffle(buffer_size=60000).repeat() 23 | dataset = dataset.map(preprocess_data, num_parallel_calls=num_threads) 24 | dataset = dataset.batch(batch_size) 25 | dataset = dataset.prefetch(prefetch_buffer) 26 | return dataset 27 | 28 | def data_layer(): 29 | with tf.variable_scope("data"): 30 | data_train, data_val = tf.keras.datasets.mnist.load_data() 31 | dataset_train = create_dataset_pipeline(data_train, is_train=True) 32 | dataset_val = create_dataset_pipeline(data_val, is_train=False, batch_size=1) 33 | iterator = tf.data.Iterator.from_structure(dataset_train.output_types, dataset_train.output_shapes) 34 | init_op_train = iterator.make_initializer(dataset_train) 35 | init_op_val = iterator.make_initializer(dataset_val) 36 | return iterator, init_op_train, init_op_val 37 | 38 | ######################################################################## 39 | def model(input_layer, training, num_classes=10): 40 | with tf.variable_scope("model"): 41 | net = tf.expand_dims(input_layer, axis=3) 42 | 43 | net = tf.layers.conv2d(net, 20, (5, 5)) 44 | net = tf.layers.batch_normalization(net, training=training) 45 | net = tf.nn.relu(net) 46 | net = tf.layers.max_pooling2d(net, pool_size=(2, 2), strides=(2, 2)) 47 | 48 | net = tf.layers.conv2d(net, 50, (5, 5)) 49 | net = tf.layers.batch_normalization(net, training=training) 50 | net = tf.nn.relu(net) 51 | net = tf.layers.max_pooling2d(net, pool_size=(2, 2), strides=(2, 2)) 52 | 53 | net = tf.layers.flatten(net) 54 | net = tf.layers.dense(net, 500) 55 | net = tf.nn.relu(net) # I forgot to add this ReLU in the video 56 | net = tf.layers.dropout(net, rate=0.2, training=training) # I forgot the training argument in the video 57 | net = tf.layers.dense(net, num_classes) 58 | return net 59 | 60 | def loss_functions(logits, labels, num_classes=10): 61 | with tf.variable_scope("loss"): 62 | target_prob = tf.one_hot(labels, num_classes) 63 | tf.losses.softmax_cross_entropy(target_prob, logits) 64 | total_loss = tf.losses.get_total_loss() # include regularization loss 65 | return total_loss 66 | 67 | def optimizer_func_momentum(total_loss, global_step, learning_rate=0.01): 68 | with tf.variable_scope("optimizer"): 69 | lr_schedule = tf.train.exponential_decay(learning_rate=learning_rate, 70 | global_step=global_step, 71 | decay_steps=1875, 72 | decay_rate=0.9, 73 | staircase=True) 74 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) 75 | with tf.control_dependencies(update_ops): 76 | optimizer = tf.train.MomentumOptimizer(learning_rate=lr_schedule, momentum=0.9) 77 | optimizer = optimizer.minimize(total_loss, 
global_step=global_step) 78 | return optimizer 79 | 80 | def optimizer_func_adam(total_loss, global_step, learning_rate=0.01): 81 | with tf.variable_scope("optimizer"): 82 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) 83 | with tf.control_dependencies(update_ops): 84 | optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate, epsilon=0.1) 85 | optimizer = optimizer.minimize(total_loss, global_step=global_step) 86 | return optimizer 87 | ######################################################################## 88 | 89 | def performance_metric(logits, labels): 90 | with tf.variable_scope("performance_metric"): 91 | preds = tf.argmax(logits, axis=1) 92 | labels = tf.cast(labels, tf.int64) 93 | corrects = tf.equal(preds, labels) 94 | accuracy = tf.reduce_mean(tf.cast(corrects, tf.float32)) 95 | return accuracy 96 | 97 | def train(): 98 | global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name="iter_number") 99 | 100 | # define the training graph 101 | iterator, init_op_train, init_op_val = data_layer() 102 | images, labels = iterator.get_next() 103 | training = tf.placeholder(tf.bool) 104 | logits = model(images, training) ############################## 105 | loss = loss_functions(logits, labels) 106 | optimizer = optimizer_func_adam(loss, global_step) ############################## 107 | accuracy = performance_metric(logits, labels) 108 | 109 | # summary placeholders 110 | streaming_loss_p = tf.placeholder(tf.float32) 111 | streaming_acc_p = tf.placeholder(tf.float32) 112 | val_acc_p = tf.placeholder(tf.float32) 113 | val_summ_ops = tf.summary.scalar('validation_acc', val_acc_p) 114 | train_summ_ops = tf.summary.merge([ 115 | tf.summary.scalar('streaming_loss', streaming_loss_p), 116 | tf.summary.scalar('streaming_accuracy', streaming_acc_p) 117 | ]) 118 | 119 | # start training 120 | num_iter = 18750 # 10 epochs 121 | log_iter = 1875 122 | val_iter = 1875 123 | with tf.Session() as sess: 124 | sess.run(tf.global_variables_initializer()) 125 | sess.run(init_op_train) 126 | 127 | # logs for TensorBoard 128 | logdir = 'logs' 129 | writer = tf.summary.FileWriter(logdir, sess.graph) # visualize the graph 130 | 131 | # load / save checkpoints 132 | checkpoint_path = 'checkpoints' 133 | saver = tf.train.Saver(max_to_keep=None) 134 | ckpt = tf.train.get_checkpoint_state(checkpoint_path) 135 | 136 | # resume training if a checkpoint exists 137 | if ckpt and ckpt.model_checkpoint_path: 138 | saver.restore(sess, ckpt.model_checkpoint_path) 139 | print("Loaded parameters from {}".format(ckpt.model_checkpoint_path)) 140 | 141 | initial_step = global_step.eval() 142 | 143 | streaming_loss = 0 144 | streaming_accuracy = 0 145 | 146 | for i in range(initial_step, num_iter + 1): 147 | _, loss_batch, acc_batch = sess.run([optimizer, loss, accuracy], feed_dict={training: True}) ############################## 148 | streaming_loss += loss_batch 149 | streaming_accuracy += acc_batch 150 | if i % log_iter == 0: 151 | print("Iteration: {}, Streaming loss: {:.2f}, Streaming accuracy: {:.2f}" 152 | .format(i, streaming_loss/log_iter, streaming_accuracy/log_iter)) 153 | 154 | # save to log file for TensorBoard 155 | summary_train = sess.run(train_summ_ops, feed_dict={streaming_loss_p: streaming_loss, 156 | streaming_acc_p: streaming_accuracy}) 157 | writer.add_summary(summary_train, global_step=i) 158 | 159 | streaming_loss = 0 160 | streaming_accuracy = 0 161 | 162 | if i % val_iter == 0: 163 | saver.save(sess, os.path.join(checkpoint_path, 'checkpoint'), global_step=global_step) 164 | 
print("Model saved!") 165 | 166 | sess.run(init_op_val) 167 | validation_accuracy = 0 168 | num_iter = 0 169 | while True: 170 | try: 171 | acc_batch = sess.run(accuracy, feed_dict={training: False}) ############################## 172 | validation_accuracy += acc_batch 173 | num_iter += 1 174 | except tf.errors.OutOfRangeError: 175 | validation_accuracy /= num_iter 176 | print("Iteration: {}, Validation accuracy: {:.2f}".format(i, validation_accuracy)) 177 | 178 | # save log file to TensorBoard 179 | summary_val = sess.run(val_summ_ops, feed_dict={val_acc_p: validation_accuracy}) 180 | writer.add_summary(summary_val, global_step=i) 181 | 182 | sess.run(init_op_train) # switch back to training set 183 | break 184 | writer.close() 185 | 186 | if __name__ == "__main__": 187 | train() 188 | -------------------------------------------------------------------------------- /S6/create_tf_records.py: -------------------------------------------------------------------------------- 1 | """ Converts an image dataset into TFRecords. The dataset should be organized as: 2 | 3 | base_dir: 4 | -- class_name1 5 | ---- image_name.jpg 6 | ... 7 | -- class_name2 8 | ---- image_name.jpg 9 | ... 10 | -- class_name3 11 | ---- image_name.jpg 12 | ... 13 | 14 | Example: 15 | $ python create_tf_records.py --input_dir ./dataset --output_dir ./tfrecords --num_shards 10 --split_ratio 0.2 16 | """ 17 | 18 | import tensorflow as tf 19 | import os, glob 20 | import argparse 21 | import random 22 | 23 | def _int64_feature(value): 24 | return tf.train.Feature(int64_list=tf.train.Int64List(value=[value])) 25 | 26 | def _bytes_feature(value): 27 | return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value])) 28 | 29 | def _create_tfexample(image_data, label): 30 | example = tf.train.Example(features=tf.train.Features(feature={ 31 | 'image': _bytes_feature(image_data), 32 | 'label': _int64_feature(label) 33 | })) 34 | return example 35 | 36 | def enumerate_classes(class_list, sort=True): 37 | class_ids = {} 38 | class_id_counter = 0 39 | 40 | if sort: 41 | class_list.sort() 42 | 43 | for class_name in class_list: 44 | if class_name not in class_ids: 45 | class_ids[class_name] = class_id_counter 46 | class_id_counter += 1 47 | 48 | return class_ids 49 | 50 | def create_tfrecords(save_dir, dataset_name, filenames, class_ids, num_shards): 51 | 52 | im_per_shard = int( len(filenames) / num_shards ) + 1 53 | 54 | for shard in range(num_shards): 55 | output_filename = os.path.join(save_dir, '{}_{:03d}-of-{:03d}.tfrecord' 56 | .format(dataset_name, shard, num_shards)) 57 | print('Writing into {}'.format(output_filename)) 58 | filenames_shard = filenames[shard*im_per_shard:(shard+1)*im_per_shard] 59 | 60 | with tf.python_io.TFRecordWriter(output_filename) as tfrecord_writer: 61 | 62 | for filename in filenames_shard: 63 | image = tf.gfile.FastGFile(filename, 'rb').read() 64 | class_name = os.path.basename(os.path.dirname(filename)) 65 | label = class_ids[class_name] 66 | 67 | example = _create_tfexample(image, label) 68 | tfrecord_writer.write(example.SerializeToString()) 69 | 70 | print('Finished writing {} images into TFRecords'.format(len(filenames))) 71 | 72 | def main(args): 73 | 74 | supported_formats = ['*.jpg', '*.JPG', '*.jpeg', '*.JPEG'] 75 | filenames = [] 76 | for extension in supported_formats: 77 | pattern = os.path.join(args.input_dir, '**', extension) 78 | filenames.extend(glob.glob(pattern, recursive=False)) 79 | 80 | random.seed(args.seed) 81 | random.shuffle(filenames) 82 | 83 | num_test = 
int(args.split_ratio * len(filenames)) 84 | num_shards_test = int(args.split_ratio * args.num_shards) 85 | num_shards_train = args.num_shards - num_shards_test 86 | 87 | # write the list of classes and their corresponding ids to a file 88 | class_list = [name for name in os.listdir(args.input_dir) 89 | if os.path.isdir(os.path.join(args.input_dir, name))] 90 | class_ids = enumerate_classes(class_list) 91 | with open(os.path.join(args.output_dir, 'classes.txt'), 'w') as f: 92 | for cid in class_ids: 93 | print('{}:{}'.format(class_ids[cid], cid), file=f) 94 | 95 | # create TFRecords for the training and test sets 96 | create_tfrecords(save_dir=args.output_dir, 97 | dataset_name='train', 98 | filenames=filenames[num_test:], 99 | class_ids=class_ids, 100 | num_shards=num_shards_train) 101 | create_tfrecords(save_dir=args.output_dir, 102 | dataset_name='test', 103 | filenames=filenames[:num_test], 104 | class_ids=class_ids, 105 | num_shards=num_shards_test) 106 | 107 | if __name__ == '__main__': 108 | parser = argparse.ArgumentParser() 109 | parser.add_argument('--input_dir', type=str, 110 | help='path to the directory where the images will be read from') 111 | parser.add_argument('--output_dir', type=str, 112 | help='path to the directory where the TFRecords will be saved to') 113 | parser.add_argument('--num_shards', type=int, 114 | help='total number of shards') 115 | parser.add_argument('--split_ratio', type=float, default=0.2, 116 | help='ratio of number of images in the test set to the total number of images') 117 | parser.add_argument('--seed', type=int, default=42, 118 | help='random seed for repeatable train/test splits') 119 | args = parser.parse_args() 120 | main(args) 121 | -------------------------------------------------------------------------------- /S6/freeze_model.py: -------------------------------------------------------------------------------- 1 | """ Freezes a checkpoint, outputs a single pbfile that encapsulates both the graph and weights 2 | Example: 3 | $ python freeze_model.py --checkpoint_path ./checkpoints 4 | """ 5 | 6 | import tensorflow as tf 7 | import argparse 8 | from nets import mobilenet_v1 9 | import os 10 | 11 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3' 12 | 13 | def freeze_graph(checkpoint_path, output_node_name, outfile): 14 | input_layer = tf.placeholder(tf.uint8, shape=[None, None, 3], name='input') 15 | with tf.variable_scope('input_scaling'): 16 | image = tf.expand_dims(input_layer, axis=0) 17 | image = tf.image.resize_bilinear(image, [224, 224]) 18 | image = tf.cast(image, tf.float32) 19 | image = image / 127.5 20 | image = image - 1 21 | 22 | logits, _ = mobilenet_v1.mobilenet_v1(image, num_classes=2, is_training=False) 23 | preds = tf.squeeze(tf.nn.softmax(logits), name='preds') 24 | 25 | with tf.Session() as sess: 26 | ckpt = tf.train.get_checkpoint_state(checkpoint_path) 27 | saver = tf.train.Saver() 28 | saver.restore(sess, ckpt.model_checkpoint_path) 29 | 30 | output_graph_def = tf.graph_util.convert_variables_to_constants( 31 | sess, tf.get_default_graph().as_graph_def(), [output_node_name]) 32 | 33 | with tf.gfile.GFile(outfile, 'wb') as f: 34 | f.write(output_graph_def.SerializeToString()) 35 | 36 | # print a list of ops 37 | for op in output_graph_def.node: 38 | print(op.name) 39 | 40 | print('Saved frozen model to {}'.format(outfile)) 41 | print('{:d} ops in the final graph.'.format(len(output_graph_def.node))) 42 | 43 | if __name__ == '__main__': 44 | parser = argparse.ArgumentParser() 45 | parser.add_argument('--checkpoint_path', 
type=str, default='./', help="Path to the dir where the checkpoints are saved") 46 | parser.add_argument('--output_node_name', type=str, default='preds', help="Name of the output node") 47 | parser.add_argument('--outfile', type=str, default='frozen_model.pb', help="Frozen model path") 48 | args = parser.parse_args() 49 | freeze_graph(args.checkpoint_path, args.output_node_name, args.outfile) -------------------------------------------------------------------------------- /S6/inference.py: -------------------------------------------------------------------------------- 1 | """ Runs inference given a frozen model and a set of images 2 | Example: 3 | $ python inference.py --frozen_model frozen_model.pb --input_path ./test_images 4 | """ 5 | 6 | import argparse 7 | import tensorflow as tf 8 | import os, glob 9 | import cv2 10 | 11 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3' 12 | 13 | class InferenceEngine: 14 | def __init__(self, frozen_graph_filename): 15 | with tf.gfile.GFile(frozen_graph_filename, "rb") as f: 16 | graph_def = tf.GraphDef() 17 | graph_def.ParseFromString(f.read()) 18 | 19 | with tf.Graph().as_default() as graph: 20 | tf.import_graph_def(graph_def, name="Pretrained") 21 | 22 | self.graph = graph 23 | 24 | def run_inference(self, input_path): 25 | if os.path.isdir(input_path): 26 | filenames = glob.glob(os.path.join(input_path, '*.jpg')) 27 | filenames.extend(glob.glob(os.path.join(input_path, '*.jpeg'))) 28 | filenames.extend(glob.glob(os.path.join(input_path, '*.png'))) 29 | filenames.extend(glob.glob(os.path.join(input_path, '*.bmp'))) 30 | else: 31 | filenames = [input_path] 32 | 33 | input_layer = self.graph.get_tensor_by_name('Pretrained/input:0') 34 | preds = self.graph.get_tensor_by_name('Pretrained/preds:0') 35 | pred_idx = tf.argmax(preds) 36 | 37 | with tf.Session(graph=self.graph) as sess: 38 | for filename in filenames: 39 | image = cv2.imread(filename) 40 | class_label, probs = sess.run([pred_idx, preds], feed_dict={input_layer: image}) 41 | print("Label: {:d}, Probability: {:.2f} \t File: {}".format(class_label, probs[class_label], filename)) 42 | 43 | if __name__ == '__main__': 44 | parser = argparse.ArgumentParser() 45 | parser.add_argument("--frozen_model", default="frozen_model.pb", type=str, help="Path to the frozen model file to import") 46 | parser.add_argument("--input_path", type=str, help="Path to the input file(s). If this is a dir all files will be processed.") 47 | args = parser.parse_args() 48 | 49 | ie = InferenceEngine(args.frozen_model) 50 | ie.run_inference(args.input_path) 51 | 52 | -------------------------------------------------------------------------------- /S6/nets/mobilenet_v1.py: -------------------------------------------------------------------------------- 1 | # Copyright 2017 The TensorFlow Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================= 15 | """MobileNet v1. 
16 | 17 | MobileNet is a general architecture and can be used for multiple use cases. 18 | Depending on the use case, it can use different input layer size and different 19 | head (for example: embeddings, localization and classification). 20 | 21 | As described in https://arxiv.org/abs/1704.04861. 22 | 23 | MobileNets: Efficient Convolutional Neural Networks for 24 | Mobile Vision Applications 25 | Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, 26 | Tobias Weyand, Marco Andreetto, Hartwig Adam 27 | 28 | 100% Mobilenet V1 (base) with input size 224x224: 29 | 30 | See mobilenet_v1() 31 | 32 | Layer params macs 33 | -------------------------------------------------------------------------------- 34 | MobilenetV1/Conv2d_0/Conv2D: 864 10,838,016 35 | MobilenetV1/Conv2d_1_depthwise/depthwise: 288 3,612,672 36 | MobilenetV1/Conv2d_1_pointwise/Conv2D: 2,048 25,690,112 37 | MobilenetV1/Conv2d_2_depthwise/depthwise: 576 1,806,336 38 | MobilenetV1/Conv2d_2_pointwise/Conv2D: 8,192 25,690,112 39 | MobilenetV1/Conv2d_3_depthwise/depthwise: 1,152 3,612,672 40 | MobilenetV1/Conv2d_3_pointwise/Conv2D: 16,384 51,380,224 41 | MobilenetV1/Conv2d_4_depthwise/depthwise: 1,152 903,168 42 | MobilenetV1/Conv2d_4_pointwise/Conv2D: 32,768 25,690,112 43 | MobilenetV1/Conv2d_5_depthwise/depthwise: 2,304 1,806,336 44 | MobilenetV1/Conv2d_5_pointwise/Conv2D: 65,536 51,380,224 45 | MobilenetV1/Conv2d_6_depthwise/depthwise: 2,304 451,584 46 | MobilenetV1/Conv2d_6_pointwise/Conv2D: 131,072 25,690,112 47 | MobilenetV1/Conv2d_7_depthwise/depthwise: 4,608 903,168 48 | MobilenetV1/Conv2d_7_pointwise/Conv2D: 262,144 51,380,224 49 | MobilenetV1/Conv2d_8_depthwise/depthwise: 4,608 903,168 50 | MobilenetV1/Conv2d_8_pointwise/Conv2D: 262,144 51,380,224 51 | MobilenetV1/Conv2d_9_depthwise/depthwise: 4,608 903,168 52 | MobilenetV1/Conv2d_9_pointwise/Conv2D: 262,144 51,380,224 53 | MobilenetV1/Conv2d_10_depthwise/depthwise: 4,608 903,168 54 | MobilenetV1/Conv2d_10_pointwise/Conv2D: 262,144 51,380,224 55 | MobilenetV1/Conv2d_11_depthwise/depthwise: 4,608 903,168 56 | MobilenetV1/Conv2d_11_pointwise/Conv2D: 262,144 51,380,224 57 | MobilenetV1/Conv2d_12_depthwise/depthwise: 4,608 225,792 58 | MobilenetV1/Conv2d_12_pointwise/Conv2D: 524,288 25,690,112 59 | MobilenetV1/Conv2d_13_depthwise/depthwise: 9,216 451,584 60 | MobilenetV1/Conv2d_13_pointwise/Conv2D: 1,048,576 51,380,224 61 | -------------------------------------------------------------------------------- 62 | Total: 3,185,088 567,716,352 63 | 64 | 65 | 75% Mobilenet V1 (base) with input size 128x128: 66 | 67 | See mobilenet_v1_075() 68 | 69 | Layer params macs 70 | -------------------------------------------------------------------------------- 71 | MobilenetV1/Conv2d_0/Conv2D: 648 2,654,208 72 | MobilenetV1/Conv2d_1_depthwise/depthwise: 216 884,736 73 | MobilenetV1/Conv2d_1_pointwise/Conv2D: 1,152 4,718,592 74 | MobilenetV1/Conv2d_2_depthwise/depthwise: 432 442,368 75 | MobilenetV1/Conv2d_2_pointwise/Conv2D: 4,608 4,718,592 76 | MobilenetV1/Conv2d_3_depthwise/depthwise: 864 884,736 77 | MobilenetV1/Conv2d_3_pointwise/Conv2D: 9,216 9,437,184 78 | MobilenetV1/Conv2d_4_depthwise/depthwise: 864 221,184 79 | MobilenetV1/Conv2d_4_pointwise/Conv2D: 18,432 4,718,592 80 | MobilenetV1/Conv2d_5_depthwise/depthwise: 1,728 442,368 81 | MobilenetV1/Conv2d_5_pointwise/Conv2D: 36,864 9,437,184 82 | MobilenetV1/Conv2d_6_depthwise/depthwise: 1,728 110,592 83 | MobilenetV1/Conv2d_6_pointwise/Conv2D: 73,728 4,718,592 84 | MobilenetV1/Conv2d_7_depthwise/depthwise: 3,456 
221,184 85 | MobilenetV1/Conv2d_7_pointwise/Conv2D: 147,456 9,437,184 86 | MobilenetV1/Conv2d_8_depthwise/depthwise: 3,456 221,184 87 | MobilenetV1/Conv2d_8_pointwise/Conv2D: 147,456 9,437,184 88 | MobilenetV1/Conv2d_9_depthwise/depthwise: 3,456 221,184 89 | MobilenetV1/Conv2d_9_pointwise/Conv2D: 147,456 9,437,184 90 | MobilenetV1/Conv2d_10_depthwise/depthwise: 3,456 221,184 91 | MobilenetV1/Conv2d_10_pointwise/Conv2D: 147,456 9,437,184 92 | MobilenetV1/Conv2d_11_depthwise/depthwise: 3,456 221,184 93 | MobilenetV1/Conv2d_11_pointwise/Conv2D: 147,456 9,437,184 94 | MobilenetV1/Conv2d_12_depthwise/depthwise: 3,456 55,296 95 | MobilenetV1/Conv2d_12_pointwise/Conv2D: 294,912 4,718,592 96 | MobilenetV1/Conv2d_13_depthwise/depthwise: 6,912 110,592 97 | MobilenetV1/Conv2d_13_pointwise/Conv2D: 589,824 9,437,184 98 | -------------------------------------------------------------------------------- 99 | Total: 1,800,144 106,002,432 100 | 101 | """ 102 | 103 | # Tensorflow mandates these. 104 | from __future__ import absolute_import 105 | from __future__ import division 106 | from __future__ import print_function 107 | 108 | from collections import namedtuple 109 | import functools 110 | 111 | import tensorflow as tf 112 | 113 | slim = tf.contrib.slim 114 | 115 | # Conv and DepthSepConv namedtuple define layers of the MobileNet architecture 116 | # Conv defines 3x3 convolution layers 117 | # DepthSepConv defines 3x3 depthwise convolution followed by 1x1 convolution. 118 | # stride is the stride of the convolution 119 | # depth is the number of channels or filters in a layer 120 | Conv = namedtuple('Conv', ['kernel', 'stride', 'depth']) 121 | DepthSepConv = namedtuple('DepthSepConv', ['kernel', 'stride', 'depth']) 122 | 123 | # MOBILENETV1_CONV_DEFS specifies the MobileNet body 124 | MOBILENETV1_CONV_DEFS = [ 125 | Conv(kernel=[3, 3], stride=2, depth=32), 126 | DepthSepConv(kernel=[3, 3], stride=1, depth=64), 127 | DepthSepConv(kernel=[3, 3], stride=2, depth=128), 128 | DepthSepConv(kernel=[3, 3], stride=1, depth=128), 129 | DepthSepConv(kernel=[3, 3], stride=2, depth=256), 130 | DepthSepConv(kernel=[3, 3], stride=1, depth=256), 131 | DepthSepConv(kernel=[3, 3], stride=2, depth=512), 132 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 133 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 134 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 135 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 136 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 137 | DepthSepConv(kernel=[3, 3], stride=2, depth=1024), 138 | DepthSepConv(kernel=[3, 3], stride=1, depth=1024) 139 | ] 140 | 141 | 142 | def _fixed_padding(inputs, kernel_size, rate=1): 143 | """Pads the input along the spatial dimensions independently of input size. 144 | 145 | Pads the input such that if it was used in a convolution with 'VALID' padding, 146 | the output would have the same dimensions as if the unpadded input was used 147 | in a convolution with 'SAME' padding. 148 | 149 | Args: 150 | inputs: A tensor of size [batch, height_in, width_in, channels]. 151 | kernel_size: The kernel to be used in the conv2d or max_pool2d operation. 152 | rate: An integer, rate for atrous convolution. 153 | 154 | Returns: 155 | output: A tensor of size [batch, height_out, width_out, channels] with the 156 | input, either intact (if kernel_size == 1) or padded (if kernel_size > 1). 
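  For example, with kernel_size=[3, 3] and rate=1, kernel_size_effective is
  [3, 3], pad_total is [2, 2], and a single row/column of zeros is added on
  each side of the input (pad_beg=[1, 1], pad_end=[1, 1]).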
157 | """ 158 | kernel_size_effective = [kernel_size[0] + (kernel_size[0] - 1) * (rate - 1), 159 | kernel_size[0] + (kernel_size[0] - 1) * (rate - 1)] 160 | pad_total = [kernel_size_effective[0] - 1, kernel_size_effective[1] - 1] 161 | pad_beg = [pad_total[0] // 2, pad_total[1] // 2] 162 | pad_end = [pad_total[0] - pad_beg[0], pad_total[1] - pad_beg[1]] 163 | padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg[0], pad_end[0]], 164 | [pad_beg[1], pad_end[1]], [0, 0]]) 165 | return padded_inputs 166 | 167 | 168 | def mobilenet_v1_base(inputs, 169 | final_endpoint='Conv2d_13_pointwise', 170 | min_depth=8, 171 | depth_multiplier=1.0, 172 | conv_defs=None, 173 | output_stride=None, 174 | use_explicit_padding=False, 175 | scope=None): 176 | """Mobilenet v1. 177 | 178 | Constructs a Mobilenet v1 network from inputs to the given final endpoint. 179 | 180 | Args: 181 | inputs: a tensor of shape [batch_size, height, width, channels]. 182 | final_endpoint: specifies the endpoint to construct the network up to. It 183 | can be one of ['Conv2d_0', 'Conv2d_1_pointwise', 'Conv2d_2_pointwise', 184 | 'Conv2d_3_pointwise', 'Conv2d_4_pointwise', 'Conv2d_5'_pointwise, 185 | 'Conv2d_6_pointwise', 'Conv2d_7_pointwise', 'Conv2d_8_pointwise', 186 | 'Conv2d_9_pointwise', 'Conv2d_10_pointwise', 'Conv2d_11_pointwise', 187 | 'Conv2d_12_pointwise', 'Conv2d_13_pointwise']. 188 | min_depth: Minimum depth value (number of channels) for all convolution ops. 189 | Enforced when depth_multiplier < 1, and not an active constraint when 190 | depth_multiplier >= 1. 191 | depth_multiplier: Float multiplier for the depth (number of channels) 192 | for all convolution ops. The value must be greater than zero. Typical 193 | usage will be to set this value in (0, 1) to reduce the number of 194 | parameters or computation cost of the model. 195 | conv_defs: A list of ConvDef namedtuples specifying the net architecture. 196 | output_stride: An integer that specifies the requested ratio of input to 197 | output spatial resolution. If not None, then we invoke atrous convolution 198 | if necessary to prevent the network from reducing the spatial resolution 199 | of the activation maps. Allowed values are 8 (accurate fully convolutional 200 | mode), 16 (fast fully convolutional mode), 32 (classification mode). 201 | use_explicit_padding: Use 'VALID' padding for convolutions, but prepad 202 | inputs so that the output dimensions are the same as if 'SAME' padding 203 | were used. 204 | scope: Optional variable_scope. 205 | 206 | Returns: 207 | tensor_out: output tensor corresponding to the final_endpoint. 208 | end_points: a set of activations for external use, for example summaries or 209 | losses. 210 | 211 | Raises: 212 | ValueError: if final_endpoint is not set to one of the predefined values, 213 | or depth_multiplier <= 0, or the target output_stride is not 214 | allowed. 215 | """ 216 | depth = lambda d: max(int(d * depth_multiplier), min_depth) 217 | end_points = {} 218 | 219 | # Used to find thinned depths for each layer. 
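  # For example, depth(64) = max(int(64 * 0.75), 8) = 48 when depth_multiplier
  # is 0.75; min_depth puts a floor under very small multipliers.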
220 | if depth_multiplier <= 0: 221 | raise ValueError('depth_multiplier is not greater than zero.') 222 | 223 | if conv_defs is None: 224 | conv_defs = MOBILENETV1_CONV_DEFS 225 | 226 | if output_stride is not None and output_stride not in [8, 16, 32]: 227 | raise ValueError('Only allowed output_stride values are 8, 16, 32.') 228 | 229 | padding = 'SAME' 230 | if use_explicit_padding: 231 | padding = 'VALID' 232 | with tf.variable_scope(scope, 'MobilenetV1', [inputs]): 233 | with slim.arg_scope([slim.conv2d, slim.separable_conv2d], padding=padding): 234 | # The current_stride variable keeps track of the output stride of the 235 | # activations, i.e., the running product of convolution strides up to the 236 | # current network layer. This allows us to invoke atrous convolution 237 | # whenever applying the next convolution would result in the activations 238 | # having output stride larger than the target output_stride. 239 | current_stride = 1 240 | 241 | # The atrous convolution rate parameter. 242 | rate = 1 243 | 244 | net = inputs 245 | for i, conv_def in enumerate(conv_defs): 246 | end_point_base = 'Conv2d_%d' % i 247 | 248 | if output_stride is not None and current_stride == output_stride: 249 | # If we have reached the target output_stride, then we need to employ 250 | # atrous convolution with stride=1 and multiply the atrous rate by the 251 | # current unit's stride for use in subsequent layers. 252 | layer_stride = 1 253 | layer_rate = rate 254 | rate *= conv_def.stride 255 | else: 256 | layer_stride = conv_def.stride 257 | layer_rate = 1 258 | current_stride *= conv_def.stride 259 | 260 | if isinstance(conv_def, Conv): 261 | end_point = end_point_base 262 | if use_explicit_padding: 263 | net = _fixed_padding(net, conv_def.kernel) 264 | net = slim.conv2d(net, depth(conv_def.depth), conv_def.kernel, 265 | stride=conv_def.stride, 266 | normalizer_fn=slim.batch_norm, 267 | scope=end_point) 268 | end_points[end_point] = net 269 | if end_point == final_endpoint: 270 | return net, end_points 271 | 272 | elif isinstance(conv_def, DepthSepConv): 273 | end_point = end_point_base + '_depthwise' 274 | 275 | # By passing filters=None 276 | # separable_conv2d produces only a depthwise convolution layer 277 | if use_explicit_padding: 278 | net = _fixed_padding(net, conv_def.kernel, layer_rate) 279 | net = slim.separable_conv2d(net, None, conv_def.kernel, 280 | depth_multiplier=1, 281 | stride=layer_stride, 282 | rate=layer_rate, 283 | normalizer_fn=slim.batch_norm, 284 | scope=end_point) 285 | 286 | end_points[end_point] = net 287 | if end_point == final_endpoint: 288 | return net, end_points 289 | 290 | end_point = end_point_base + '_pointwise' 291 | 292 | net = slim.conv2d(net, depth(conv_def.depth), [1, 1], 293 | stride=1, 294 | normalizer_fn=slim.batch_norm, 295 | scope=end_point) 296 | 297 | end_points[end_point] = net 298 | if end_point == final_endpoint: 299 | return net, end_points 300 | else: 301 | raise ValueError('Unknown convolution type %s for layer %d' 302 | % (conv_def.ltype, i)) 303 | raise ValueError('Unknown final endpoint %s' % final_endpoint) 304 | 305 | 306 | def mobilenet_v1(inputs, 307 | num_classes=1000, 308 | dropout_keep_prob=0.999, 309 | is_training=True, 310 | min_depth=8, 311 | depth_multiplier=1.0, 312 | conv_defs=None, 313 | prediction_fn=tf.contrib.layers.softmax, 314 | spatial_squeeze=True, 315 | reuse=None, 316 | scope='MobilenetV1', 317 | global_pool=False): 318 | """Mobilenet v1 model for classification. 
319 | 320 | Args: 321 | inputs: a tensor of shape [batch_size, height, width, channels]. 322 | num_classes: number of predicted classes. If 0 or None, the logits layer 323 | is omitted and the input features to the logits layer (before dropout) 324 | are returned instead. 325 | dropout_keep_prob: the percentage of activation values that are retained. 326 | is_training: whether is training or not. 327 | min_depth: Minimum depth value (number of channels) for all convolution ops. 328 | Enforced when depth_multiplier < 1, and not an active constraint when 329 | depth_multiplier >= 1. 330 | depth_multiplier: Float multiplier for the depth (number of channels) 331 | for all convolution ops. The value must be greater than zero. Typical 332 | usage will be to set this value in (0, 1) to reduce the number of 333 | parameters or computation cost of the model. 334 | conv_defs: A list of ConvDef namedtuples specifying the net architecture. 335 | prediction_fn: a function to get predictions out of logits. 336 | spatial_squeeze: if True, logits is of shape is [B, C], if false logits is 337 | of shape [B, 1, 1, C], where B is batch_size and C is number of classes. 338 | reuse: whether or not the network and its variables should be reused. To be 339 | able to reuse 'scope' must be given. 340 | scope: Optional variable_scope. 341 | global_pool: Optional boolean flag to control the avgpooling before the 342 | logits layer. If false or unset, pooling is done with a fixed window 343 | that reduces default-sized inputs to 1x1, while larger inputs lead to 344 | larger outputs. If true, any input size is pooled down to 1x1. 345 | 346 | Returns: 347 | net: a 2D Tensor with the logits (pre-softmax activations) if num_classes 348 | is a non-zero integer, or the non-dropped-out input to the logits layer 349 | if num_classes is 0 or None. 350 | end_points: a dictionary from components of the network to the corresponding 351 | activation. 352 | 353 | Raises: 354 | ValueError: Input rank is invalid. 355 | """ 356 | input_shape = inputs.get_shape().as_list() 357 | if len(input_shape) != 4: 358 | raise ValueError('Invalid input tensor rank, expected 4, was: %d' % 359 | len(input_shape)) 360 | 361 | with tf.variable_scope(scope, 'MobilenetV1', [inputs], reuse=reuse) as scope: 362 | with slim.arg_scope([slim.batch_norm, slim.dropout], 363 | is_training=is_training): 364 | net, end_points = mobilenet_v1_base(inputs, scope=scope, 365 | min_depth=min_depth, 366 | depth_multiplier=depth_multiplier, 367 | conv_defs=conv_defs) 368 | with tf.variable_scope('Logits'): 369 | if global_pool: 370 | # Global average pooling. 371 | net = tf.reduce_mean(net, [1, 2], keep_dims=True, name='global_pool') 372 | end_points['global_pool'] = net 373 | else: 374 | # Pooling with a fixed kernel size. 
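# With the default 224x224 input the feature map entering this head is 7x7
# (the base network has an output stride of 32, and 224 / 32 = 7), so the
# [7, 7] window below pools it down to 1x1; for smaller inputs,
# _reduced_kernel_size_for_small_input shrinks the window to fit.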
375 | kernel_size = _reduced_kernel_size_for_small_input(net, [7, 7]) 376 | net = slim.avg_pool2d(net, kernel_size, padding='VALID', 377 | scope='AvgPool_1a') 378 | end_points['AvgPool_1a'] = net 379 | if not num_classes: 380 | return net, end_points 381 | # 1 x 1 x 1024 382 | net = slim.dropout(net, keep_prob=dropout_keep_prob, scope='Dropout_1b') 383 | logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None, 384 | normalizer_fn=None, scope='Conv2d_1c_1x1') 385 | if spatial_squeeze: 386 | logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze') 387 | end_points['Logits'] = logits 388 | if prediction_fn: 389 | end_points['Predictions'] = prediction_fn(logits, scope='Predictions') 390 | return logits, end_points 391 | 392 | mobilenet_v1.default_image_size = 224 393 | 394 | 395 | def wrapped_partial(func, *args, **kwargs): 396 | partial_func = functools.partial(func, *args, **kwargs) 397 | functools.update_wrapper(partial_func, func) 398 | return partial_func 399 | 400 | 401 | mobilenet_v1_075 = wrapped_partial(mobilenet_v1, depth_multiplier=0.75) 402 | mobilenet_v1_050 = wrapped_partial(mobilenet_v1, depth_multiplier=0.50) 403 | mobilenet_v1_025 = wrapped_partial(mobilenet_v1, depth_multiplier=0.25) 404 | 405 | 406 | def _reduced_kernel_size_for_small_input(input_tensor, kernel_size): 407 | """Define kernel size which is automatically reduced for small input. 408 | 409 | If the shape of the input images is unknown at graph construction time this 410 | function assumes that the input images are large enough. 411 | 412 | Args: 413 | input_tensor: input tensor of size [batch_size, height, width, channels]. 414 | kernel_size: desired kernel size of length 2: [kernel_height, kernel_width] 415 | 416 | Returns: 417 | a tensor with the kernel size. 418 | """ 419 | shape = input_tensor.get_shape().as_list() 420 | if shape[1] is None or shape[2] is None: 421 | kernel_size_out = kernel_size 422 | else: 423 | kernel_size_out = [min(shape[1], kernel_size[0]), 424 | min(shape[2], kernel_size[1])] 425 | return kernel_size_out 426 | 427 | 428 | def mobilenet_v1_arg_scope( 429 | is_training=True, 430 | weight_decay=0.00004, 431 | stddev=0.09, 432 | regularize_depthwise=False, 433 | batch_norm_decay=0.9997, 434 | batch_norm_epsilon=0.001, 435 | batch_norm_updates_collections=tf.GraphKeys.UPDATE_OPS): 436 | """Defines the default MobilenetV1 arg scope. 437 | 438 | Args: 439 | is_training: Whether or not we're training the model. If this is set to 440 | None, the parameter is not added to the batch_norm arg_scope. 441 | weight_decay: The weight decay to use for regularizing the model. 442 | stddev: The standard deviation of the trunctated normal weight initializer. 443 | regularize_depthwise: Whether or not apply regularization on depthwise. 444 | batch_norm_decay: Decay for batch norm moving average. 445 | batch_norm_epsilon: Small float added to variance to avoid dividing by zero 446 | in batch norm. 447 | batch_norm_updates_collections: Collection for the update ops for 448 | batch norm. 449 | 450 | Returns: 451 | An `arg_scope` to use for the mobilenet v1 model. 452 | """ 453 | batch_norm_params = { 454 | 'center': True, 455 | 'scale': True, 456 | 'decay': batch_norm_decay, 457 | 'epsilon': batch_norm_epsilon, 458 | 'updates_collections': batch_norm_updates_collections, 459 | } 460 | if is_training is not None: 461 | batch_norm_params['is_training'] = is_training 462 | 463 | # Set weight_decay for weights in Conv and DepthSepConv layers. 
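# Typical usage (a sketch -- the variable names here are illustrative):
# build the model inside this scope so that every conv layer picks up the
# initializer, regularizer, and batch-norm settings defined below, e.g.
#
#   with slim.arg_scope(mobilenet_v1_arg_scope(is_training=True)):
#       logits, end_points = mobilenet_v1(images, num_classes=num_classes)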
464 | weights_init = tf.truncated_normal_initializer(stddev=stddev) 465 | regularizer = tf.contrib.layers.l2_regularizer(weight_decay) 466 | if regularize_depthwise: 467 | depthwise_regularizer = regularizer 468 | else: 469 | depthwise_regularizer = None 470 | with slim.arg_scope([slim.conv2d, slim.separable_conv2d], 471 | weights_initializer=weights_init, 472 | activation_fn=tf.nn.relu6, normalizer_fn=slim.batch_norm): 473 | with slim.arg_scope([slim.batch_norm], **batch_norm_params): 474 | with slim.arg_scope([slim.conv2d], weights_regularizer=regularizer): 475 | with slim.arg_scope([slim.separable_conv2d], 476 | weights_regularizer=depthwise_regularizer) as sc: 477 | return sc 478 | -------------------------------------------------------------------------------- /S6/resize_images.py: -------------------------------------------------------------------------------- 1 | """ A utility script that resizes all images in a given directory to a specified size 2 | WARNING: the original images will be overwritten! 3 | """ 4 | 5 | import cv2 6 | import os, glob 7 | import argparse 8 | 9 | def main(args): 10 | supported_formats = ['*.jpg', '*.JPG', '*.jpeg', '*.JPEG'] 11 | filenames = [] 12 | for extension in supported_formats: 13 | pattern = os.path.join(args.input_dir, '**', extension) 14 | filenames.extend(glob.glob(pattern, recursive=True)) 15 | 16 | num_images = len(filenames) 17 | for i in range(num_images): 18 | if i % 100 == 0: 19 | print("{} of {} \t Resizing: {}".format(i, num_images, filenames[i])) 20 | image = cv2.imread(filenames[i]) 21 | image = cv2.resize(image, (args.resize, args.resize), interpolation=cv2.INTER_AREA) 22 | cv2.imwrite(filenames[i], image) 23 | 24 | if __name__ == '__main__': 25 | # resizes images in-place 26 | parser = argparse.ArgumentParser() 27 | parser.add_argument('--input_dir', type=str, 28 | help='path to the directory where the images will be read from') 29 | parser.add_argument('--resize', type=int, 30 | help='the images will be resized to NxN') 31 | 32 | args = parser.parse_args() 33 | main(args) 34 | -------------------------------------------------------------------------------- /S6/trainer.py: -------------------------------------------------------------------------------- 1 | """ Trains a TensorFlow model 2 | Example: 3 | $ python trainer.py --checkpoint_path ./checkpoints --data_path ./tfrecords 4 | """ 5 | 6 | import tensorflow as tf 7 | import numpy as np 8 | import os, glob 9 | import argparse 10 | from nets import mobilenet_v1 ##################################################### 11 | 12 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3' 13 | 14 | class TFModelTrainer: 15 | 16 | def __init__(self, checkpoint_path, data_path): 17 | self.checkpoint_path = checkpoint_path 18 | 19 | # set training parameters ################################################# 20 | self.learning_rate = 0.01 21 | self.num_iter = 100000 22 | self.save_iter = 5000 23 | self.val_iter = 5000 24 | self.log_iter = 100 25 | self.batch_size = 32 26 | 27 | # set up data layer 28 | self.training_filenames = glob.glob(os.path.join(data_path, 'train_*.tfrecord')) 29 | self.validation_filenames = glob.glob(os.path.join(data_path, 'test_*.tfrecord')) 30 | self.iterator, self.filenames = self._data_layer() 31 | self.num_val_samples = 10000 32 | self.num_classes = 2 33 | self.image_size = 224 34 | 35 | def preprocess_image(self, image_string): 36 | image = tf.image.decode_jpeg(image_string, channels=3) 37 | 38 | # flip for data augmentation 39 | image = tf.image.random_flip_left_right(image) 
############################ 40 | 41 | # normalize image to [-1, +1] 42 | image = tf.cast(image, tf.float32) 43 | image = image / 127.5 44 | image = image - 1 45 | return image 46 | 47 | def _parse_tfrecord(self, example_proto): ##################################### 48 | keys_to_features = {'image': tf.FixedLenFeature([], tf.string), 49 | 'label': tf.FixedLenFeature([], tf.int64)} 50 | parsed_features = tf.parse_single_example(example_proto, keys_to_features) 51 | image = parsed_features['image'] 52 | label = parsed_features['label'] 53 | image = self.preprocess_image(image) 54 | return image, label 55 | 56 | def _data_layer(self, num_threads=8, prefetch_buffer=100): 57 | with tf.variable_scope('data'): 58 | filenames = tf.placeholder(tf.string, shape=[None]) 59 | dataset = tf.data.TFRecordDataset(filenames) ########################## 60 | dataset = dataset.map(self._parse_tfrecord, num_parallel_calls=num_threads) 61 | dataset = dataset.repeat() 62 | dataset = dataset.batch(self.batch_size) 63 | dataset = dataset.prefetch(prefetch_buffer) 64 | iterator = dataset.make_initializable_iterator() 65 | return iterator, filenames 66 | 67 | def _loss_functions(self, logits, labels): 68 | with tf.variable_scope('loss'): 69 | target_prob = tf.one_hot(labels, self.num_classes) 70 | tf.losses.softmax_cross_entropy(target_prob, logits) 71 | total_loss = tf.losses.get_total_loss() #include regularization loss 72 | return total_loss 73 | 74 | def _optimizer(self, total_loss, global_step): 75 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) 76 | with tf.control_dependencies(update_ops): 77 | optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate, epsilon=0.1) 78 | optimizer = optimizer.minimize(total_loss, global_step=global_step) 79 | return optimizer 80 | 81 | def _performance_metric(self, logits, labels): 82 | with tf.variable_scope("performance_metric"): 83 | preds = tf.argmax(logits, axis=1) 84 | labels = tf.cast(labels, tf.int64) 85 | corrects = tf.equal(preds, labels) 86 | accuracy = tf.reduce_mean(tf.cast(corrects, tf.float32)) 87 | return accuracy 88 | 89 | def train(self): 90 | # iteration number 91 | global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name='iter_number') 92 | 93 | # training graph 94 | images, labels = self.iterator.get_next() 95 | images = tf.image.resize_bilinear(images, (self.image_size, self.image_size)) 96 | training = tf.placeholder(tf.bool, name='is_training') 97 | logits, _ = mobilenet_v1.mobilenet_v1(images, 98 | num_classes=self.num_classes, 99 | is_training=training, 100 | scope='MobilenetV1', 101 | global_pool=True) ############################################ 102 | loss = self._loss_functions(logits, labels) 103 | optimizer = self._optimizer(loss, global_step) 104 | accuracy = self._performance_metric(logits, labels) 105 | 106 | # summary placeholders 107 | streaming_loss_p = tf.placeholder(tf.float32) 108 | accuracy_p = tf.placeholder(tf.float32) 109 | summ_op_train = tf.summary.scalar('streaming_loss', streaming_loss_p) 110 | summ_op_test = tf.summary.scalar('accuracy', accuracy_p) 111 | 112 | with tf.Session() as sess: 113 | sess.run(tf.global_variables_initializer()) 114 | sess.run(self.iterator.initializer, feed_dict={self.filenames: self.training_filenames}) 115 | 116 | writer = tf.summary.FileWriter(self.checkpoint_path, sess.graph) 117 | 118 | saver = tf.train.Saver(max_to_keep=None) # keep all checkpoints 119 | ckpt = tf.train.get_checkpoint_state(self.checkpoint_path) 120 | 121 | # resume training if a checkpoint exists 122 
| if ckpt and ckpt.model_checkpoint_path:
123 |                 saver.restore(sess, ckpt.model_checkpoint_path)
124 |                 print('Loaded parameters from {}'.format(ckpt.model_checkpoint_path))
125 |
126 |             initial_step = global_step.eval()
127 |
128 |             # train the model
129 |             streaming_loss = 0
130 |             for i in range(initial_step, self.num_iter + 1):
131 |                 _, loss_batch = sess.run([optimizer, loss], feed_dict={training: True})
132 |
133 |                 if not np.isfinite(loss_batch):
134 |                     print('loss diverged, stopping')
135 |                     exit()
136 |
137 |                 # log summary
138 |                 streaming_loss += loss_batch
139 |                 if i % self.log_iter == self.log_iter - 1:
140 |                     streaming_loss /= self.log_iter
141 |                     print(i + 1, streaming_loss)
142 |                     summary_train = sess.run(summ_op_train, feed_dict={streaming_loss_p: streaming_loss})
143 |                     writer.add_summary(summary_train, global_step=i)
144 |                     streaming_loss = 0
145 |
146 |                 # save model
147 |                 if i % self.save_iter == self.save_iter - 1:
148 |                     saver.save(sess, os.path.join(self.checkpoint_path, 'checkpoint'), global_step=global_step)
149 |                     print("Model saved!")
150 |
151 |                 # run validation
152 |                 if i % self.val_iter == self.val_iter - 1:
153 |                     print("Running validation.")
154 |                     sess.run(self.iterator.initializer, feed_dict={self.filenames: self.validation_filenames})
155 |
156 |                     validation_accuracy = 0
157 |                     for j in range(self.num_val_samples // self.batch_size): ###################################
158 |                         acc_batch = sess.run(accuracy, feed_dict={training: False})
159 |                         validation_accuracy += acc_batch
160 |                     validation_accuracy /= (self.num_val_samples // self.batch_size)  # average over the number of batches, not the last loop index
161 |
162 |                     print("Accuracy: {}".format(validation_accuracy))
163 |
164 |                     summary_test = sess.run(summ_op_test, feed_dict={accuracy_p: validation_accuracy})
165 |                     writer.add_summary(summary_test, global_step=i)
166 |
167 |                     sess.run(self.iterator.initializer, feed_dict={self.filenames: self.training_filenames})
168 |
169 |             writer.close()
170 |
171 | def main():
172 |     parser = argparse.ArgumentParser()
173 |     parser.add_argument('--checkpoint_path', type=str, default='./checkpoints/',
174 |                         help="Path to the dir where the checkpoints are saved")
175 |     parser.add_argument('--data_path', type=str, default='./tfrecords/', help="Path to the TFRecords")
176 |     args = parser.parse_args()
177 |     trainer = TFModelTrainer(args.checkpoint_path, args.data_path)
178 |     trainer.train()
179 |
180 | if __name__ == '__main__':
181 |     main()
182 |
--------------------------------------------------------------------------------
/S7/checkpoints/checkpoint:
--------------------------------------------------------------------------------
1 | model_checkpoint_path: "mobilenet_v1_1.0_224.ckpt"
2 |
--------------------------------------------------------------------------------
/S7/nets/mobilenet_v1.py:
--------------------------------------------------------------------------------
1 | # Copyright 2017 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | #     http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ============================================================================= 15 | """MobileNet v1. 16 | 17 | MobileNet is a general architecture and can be used for multiple use cases. 18 | Depending on the use case, it can use different input layer size and different 19 | head (for example: embeddings, localization and classification). 20 | 21 | As described in https://arxiv.org/abs/1704.04861. 22 | 23 | MobileNets: Efficient Convolutional Neural Networks for 24 | Mobile Vision Applications 25 | Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, 26 | Tobias Weyand, Marco Andreetto, Hartwig Adam 27 | 28 | 100% Mobilenet V1 (base) with input size 224x224: 29 | 30 | See mobilenet_v1() 31 | 32 | Layer params macs 33 | -------------------------------------------------------------------------------- 34 | MobilenetV1/Conv2d_0/Conv2D: 864 10,838,016 35 | MobilenetV1/Conv2d_1_depthwise/depthwise: 288 3,612,672 36 | MobilenetV1/Conv2d_1_pointwise/Conv2D: 2,048 25,690,112 37 | MobilenetV1/Conv2d_2_depthwise/depthwise: 576 1,806,336 38 | MobilenetV1/Conv2d_2_pointwise/Conv2D: 8,192 25,690,112 39 | MobilenetV1/Conv2d_3_depthwise/depthwise: 1,152 3,612,672 40 | MobilenetV1/Conv2d_3_pointwise/Conv2D: 16,384 51,380,224 41 | MobilenetV1/Conv2d_4_depthwise/depthwise: 1,152 903,168 42 | MobilenetV1/Conv2d_4_pointwise/Conv2D: 32,768 25,690,112 43 | MobilenetV1/Conv2d_5_depthwise/depthwise: 2,304 1,806,336 44 | MobilenetV1/Conv2d_5_pointwise/Conv2D: 65,536 51,380,224 45 | MobilenetV1/Conv2d_6_depthwise/depthwise: 2,304 451,584 46 | MobilenetV1/Conv2d_6_pointwise/Conv2D: 131,072 25,690,112 47 | MobilenetV1/Conv2d_7_depthwise/depthwise: 4,608 903,168 48 | MobilenetV1/Conv2d_7_pointwise/Conv2D: 262,144 51,380,224 49 | MobilenetV1/Conv2d_8_depthwise/depthwise: 4,608 903,168 50 | MobilenetV1/Conv2d_8_pointwise/Conv2D: 262,144 51,380,224 51 | MobilenetV1/Conv2d_9_depthwise/depthwise: 4,608 903,168 52 | MobilenetV1/Conv2d_9_pointwise/Conv2D: 262,144 51,380,224 53 | MobilenetV1/Conv2d_10_depthwise/depthwise: 4,608 903,168 54 | MobilenetV1/Conv2d_10_pointwise/Conv2D: 262,144 51,380,224 55 | MobilenetV1/Conv2d_11_depthwise/depthwise: 4,608 903,168 56 | MobilenetV1/Conv2d_11_pointwise/Conv2D: 262,144 51,380,224 57 | MobilenetV1/Conv2d_12_depthwise/depthwise: 4,608 225,792 58 | MobilenetV1/Conv2d_12_pointwise/Conv2D: 524,288 25,690,112 59 | MobilenetV1/Conv2d_13_depthwise/depthwise: 9,216 451,584 60 | MobilenetV1/Conv2d_13_pointwise/Conv2D: 1,048,576 51,380,224 61 | -------------------------------------------------------------------------------- 62 | Total: 3,185,088 567,716,352 63 | 64 | 65 | 75% Mobilenet V1 (base) with input size 128x128: 66 | 67 | See mobilenet_v1_075() 68 | 69 | Layer params macs 70 | -------------------------------------------------------------------------------- 71 | MobilenetV1/Conv2d_0/Conv2D: 648 2,654,208 72 | MobilenetV1/Conv2d_1_depthwise/depthwise: 216 884,736 73 | MobilenetV1/Conv2d_1_pointwise/Conv2D: 1,152 4,718,592 74 | MobilenetV1/Conv2d_2_depthwise/depthwise: 432 442,368 75 | MobilenetV1/Conv2d_2_pointwise/Conv2D: 4,608 4,718,592 76 | MobilenetV1/Conv2d_3_depthwise/depthwise: 864 884,736 77 | MobilenetV1/Conv2d_3_pointwise/Conv2D: 9,216 9,437,184 78 | MobilenetV1/Conv2d_4_depthwise/depthwise: 864 221,184 79 | MobilenetV1/Conv2d_4_pointwise/Conv2D: 18,432 4,718,592 80 | MobilenetV1/Conv2d_5_depthwise/depthwise: 1,728 442,368 81 | MobilenetV1/Conv2d_5_pointwise/Conv2D: 36,864 9,437,184 82 | MobilenetV1/Conv2d_6_depthwise/depthwise: 1,728 110,592 83 | 
MobilenetV1/Conv2d_6_pointwise/Conv2D: 73,728 4,718,592 84 | MobilenetV1/Conv2d_7_depthwise/depthwise: 3,456 221,184 85 | MobilenetV1/Conv2d_7_pointwise/Conv2D: 147,456 9,437,184 86 | MobilenetV1/Conv2d_8_depthwise/depthwise: 3,456 221,184 87 | MobilenetV1/Conv2d_8_pointwise/Conv2D: 147,456 9,437,184 88 | MobilenetV1/Conv2d_9_depthwise/depthwise: 3,456 221,184 89 | MobilenetV1/Conv2d_9_pointwise/Conv2D: 147,456 9,437,184 90 | MobilenetV1/Conv2d_10_depthwise/depthwise: 3,456 221,184 91 | MobilenetV1/Conv2d_10_pointwise/Conv2D: 147,456 9,437,184 92 | MobilenetV1/Conv2d_11_depthwise/depthwise: 3,456 221,184 93 | MobilenetV1/Conv2d_11_pointwise/Conv2D: 147,456 9,437,184 94 | MobilenetV1/Conv2d_12_depthwise/depthwise: 3,456 55,296 95 | MobilenetV1/Conv2d_12_pointwise/Conv2D: 294,912 4,718,592 96 | MobilenetV1/Conv2d_13_depthwise/depthwise: 6,912 110,592 97 | MobilenetV1/Conv2d_13_pointwise/Conv2D: 589,824 9,437,184 98 | -------------------------------------------------------------------------------- 99 | Total: 1,800,144 106,002,432 100 | 101 | """ 102 | 103 | # Tensorflow mandates these. 104 | from __future__ import absolute_import 105 | from __future__ import division 106 | from __future__ import print_function 107 | 108 | from collections import namedtuple 109 | import functools 110 | 111 | import tensorflow as tf 112 | 113 | slim = tf.contrib.slim 114 | 115 | # Conv and DepthSepConv namedtuple define layers of the MobileNet architecture 116 | # Conv defines 3x3 convolution layers 117 | # DepthSepConv defines 3x3 depthwise convolution followed by 1x1 convolution. 118 | # stride is the stride of the convolution 119 | # depth is the number of channels or filters in a layer 120 | Conv = namedtuple('Conv', ['kernel', 'stride', 'depth']) 121 | DepthSepConv = namedtuple('DepthSepConv', ['kernel', 'stride', 'depth']) 122 | 123 | # MOBILENETV1_CONV_DEFS specifies the MobileNet body 124 | MOBILENETV1_CONV_DEFS = [ 125 | Conv(kernel=[3, 3], stride=2, depth=32), 126 | DepthSepConv(kernel=[3, 3], stride=1, depth=64), 127 | DepthSepConv(kernel=[3, 3], stride=2, depth=128), 128 | DepthSepConv(kernel=[3, 3], stride=1, depth=128), 129 | DepthSepConv(kernel=[3, 3], stride=2, depth=256), 130 | DepthSepConv(kernel=[3, 3], stride=1, depth=256), 131 | DepthSepConv(kernel=[3, 3], stride=2, depth=512), 132 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 133 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 134 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 135 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 136 | DepthSepConv(kernel=[3, 3], stride=1, depth=512), 137 | DepthSepConv(kernel=[3, 3], stride=2, depth=1024), 138 | DepthSepConv(kernel=[3, 3], stride=1, depth=1024) 139 | ] 140 | 141 | 142 | def _fixed_padding(inputs, kernel_size, rate=1): 143 | """Pads the input along the spatial dimensions independently of input size. 144 | 145 | Pads the input such that if it was used in a convolution with 'VALID' padding, 146 | the output would have the same dimensions as if the unpadded input was used 147 | in a convolution with 'SAME' padding. 148 | 149 | Args: 150 | inputs: A tensor of size [batch, height_in, width_in, channels]. 151 | kernel_size: The kernel to be used in the conv2d or max_pool2d operation. 152 | rate: An integer, rate for atrous convolution. 153 | 154 | Returns: 155 | output: A tensor of size [batch, height_out, width_out, channels] with the 156 | input, either intact (if kernel_size == 1) or padded (if kernel_size > 1). 
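  For example, with kernel_size=[3, 3] and rate=1 the effective kernel is
  3x3, so pad_total is 2 in each spatial dimension and one row/column of
  zeros is added on every side of the input.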
157 | """ 158 | kernel_size_effective = [kernel_size[0] + (kernel_size[0] - 1) * (rate - 1), 159 | kernel_size[0] + (kernel_size[0] - 1) * (rate - 1)] 160 | pad_total = [kernel_size_effective[0] - 1, kernel_size_effective[1] - 1] 161 | pad_beg = [pad_total[0] // 2, pad_total[1] // 2] 162 | pad_end = [pad_total[0] - pad_beg[0], pad_total[1] - pad_beg[1]] 163 | padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg[0], pad_end[0]], 164 | [pad_beg[1], pad_end[1]], [0, 0]]) 165 | return padded_inputs 166 | 167 | 168 | def mobilenet_v1_base(inputs, 169 | final_endpoint='Conv2d_13_pointwise', 170 | min_depth=8, 171 | depth_multiplier=1.0, 172 | conv_defs=None, 173 | output_stride=None, 174 | use_explicit_padding=False, 175 | scope=None): 176 | """Mobilenet v1. 177 | 178 | Constructs a Mobilenet v1 network from inputs to the given final endpoint. 179 | 180 | Args: 181 | inputs: a tensor of shape [batch_size, height, width, channels]. 182 | final_endpoint: specifies the endpoint to construct the network up to. It 183 | can be one of ['Conv2d_0', 'Conv2d_1_pointwise', 'Conv2d_2_pointwise', 184 | 'Conv2d_3_pointwise', 'Conv2d_4_pointwise', 'Conv2d_5'_pointwise, 185 | 'Conv2d_6_pointwise', 'Conv2d_7_pointwise', 'Conv2d_8_pointwise', 186 | 'Conv2d_9_pointwise', 'Conv2d_10_pointwise', 'Conv2d_11_pointwise', 187 | 'Conv2d_12_pointwise', 'Conv2d_13_pointwise']. 188 | min_depth: Minimum depth value (number of channels) for all convolution ops. 189 | Enforced when depth_multiplier < 1, and not an active constraint when 190 | depth_multiplier >= 1. 191 | depth_multiplier: Float multiplier for the depth (number of channels) 192 | for all convolution ops. The value must be greater than zero. Typical 193 | usage will be to set this value in (0, 1) to reduce the number of 194 | parameters or computation cost of the model. 195 | conv_defs: A list of ConvDef namedtuples specifying the net architecture. 196 | output_stride: An integer that specifies the requested ratio of input to 197 | output spatial resolution. If not None, then we invoke atrous convolution 198 | if necessary to prevent the network from reducing the spatial resolution 199 | of the activation maps. Allowed values are 8 (accurate fully convolutional 200 | mode), 16 (fast fully convolutional mode), 32 (classification mode). 201 | use_explicit_padding: Use 'VALID' padding for convolutions, but prepad 202 | inputs so that the output dimensions are the same as if 'SAME' padding 203 | were used. 204 | scope: Optional variable_scope. 205 | 206 | Returns: 207 | tensor_out: output tensor corresponding to the final_endpoint. 208 | end_points: a set of activations for external use, for example summaries or 209 | losses. 210 | 211 | Raises: 212 | ValueError: if final_endpoint is not set to one of the predefined values, 213 | or depth_multiplier <= 0, or the target output_stride is not 214 | allowed. 215 | """ 216 | depth = lambda d: max(int(d * depth_multiplier), min_depth) 217 | end_points = {} 218 | 219 | # Used to find thinned depths for each layer. 
220 | if depth_multiplier <= 0: 221 | raise ValueError('depth_multiplier is not greater than zero.') 222 | 223 | if conv_defs is None: 224 | conv_defs = MOBILENETV1_CONV_DEFS 225 | 226 | if output_stride is not None and output_stride not in [8, 16, 32]: 227 | raise ValueError('Only allowed output_stride values are 8, 16, 32.') 228 | 229 | padding = 'SAME' 230 | if use_explicit_padding: 231 | padding = 'VALID' 232 | with tf.variable_scope(scope, 'MobilenetV1', [inputs]): 233 | with slim.arg_scope([slim.conv2d, slim.separable_conv2d], padding=padding): 234 | # The current_stride variable keeps track of the output stride of the 235 | # activations, i.e., the running product of convolution strides up to the 236 | # current network layer. This allows us to invoke atrous convolution 237 | # whenever applying the next convolution would result in the activations 238 | # having output stride larger than the target output_stride. 239 | current_stride = 1 240 | 241 | # The atrous convolution rate parameter. 242 | rate = 1 243 | 244 | net = inputs 245 | for i, conv_def in enumerate(conv_defs): 246 | end_point_base = 'Conv2d_%d' % i 247 | 248 | if output_stride is not None and current_stride == output_stride: 249 | # If we have reached the target output_stride, then we need to employ 250 | # atrous convolution with stride=1 and multiply the atrous rate by the 251 | # current unit's stride for use in subsequent layers. 252 | layer_stride = 1 253 | layer_rate = rate 254 | rate *= conv_def.stride 255 | else: 256 | layer_stride = conv_def.stride 257 | layer_rate = 1 258 | current_stride *= conv_def.stride 259 | 260 | if isinstance(conv_def, Conv): 261 | end_point = end_point_base 262 | if use_explicit_padding: 263 | net = _fixed_padding(net, conv_def.kernel) 264 | net = slim.conv2d(net, depth(conv_def.depth), conv_def.kernel, 265 | stride=conv_def.stride, 266 | normalizer_fn=slim.batch_norm, 267 | scope=end_point) 268 | end_points[end_point] = net 269 | if end_point == final_endpoint: 270 | return net, end_points 271 | 272 | elif isinstance(conv_def, DepthSepConv): 273 | end_point = end_point_base + '_depthwise' 274 | 275 | # By passing filters=None 276 | # separable_conv2d produces only a depthwise convolution layer 277 | if use_explicit_padding: 278 | net = _fixed_padding(net, conv_def.kernel, layer_rate) 279 | net = slim.separable_conv2d(net, None, conv_def.kernel, 280 | depth_multiplier=1, 281 | stride=layer_stride, 282 | rate=layer_rate, 283 | normalizer_fn=slim.batch_norm, 284 | scope=end_point) 285 | 286 | end_points[end_point] = net 287 | if end_point == final_endpoint: 288 | return net, end_points 289 | 290 | end_point = end_point_base + '_pointwise' 291 | 292 | net = slim.conv2d(net, depth(conv_def.depth), [1, 1], 293 | stride=1, 294 | normalizer_fn=slim.batch_norm, 295 | scope=end_point) 296 | 297 | end_points[end_point] = net 298 | if end_point == final_endpoint: 299 | return net, end_points 300 | else: 301 | raise ValueError('Unknown convolution type %s for layer %d' 302 | % (conv_def.ltype, i)) 303 | raise ValueError('Unknown final endpoint %s' % final_endpoint) 304 | 305 | 306 | def mobilenet_v1(inputs, 307 | num_classes=1000, 308 | dropout_keep_prob=0.999, 309 | is_training=True, 310 | min_depth=8, 311 | depth_multiplier=1.0, 312 | conv_defs=None, 313 | prediction_fn=tf.contrib.layers.softmax, 314 | spatial_squeeze=True, 315 | reuse=None, 316 | scope='MobilenetV1', 317 | global_pool=False): 318 | """Mobilenet v1 model for classification. 
319 | 320 | Args: 321 | inputs: a tensor of shape [batch_size, height, width, channels]. 322 | num_classes: number of predicted classes. If 0 or None, the logits layer 323 | is omitted and the input features to the logits layer (before dropout) 324 | are returned instead. 325 | dropout_keep_prob: the percentage of activation values that are retained. 326 | is_training: whether is training or not. 327 | min_depth: Minimum depth value (number of channels) for all convolution ops. 328 | Enforced when depth_multiplier < 1, and not an active constraint when 329 | depth_multiplier >= 1. 330 | depth_multiplier: Float multiplier for the depth (number of channels) 331 | for all convolution ops. The value must be greater than zero. Typical 332 | usage will be to set this value in (0, 1) to reduce the number of 333 | parameters or computation cost of the model. 334 | conv_defs: A list of ConvDef namedtuples specifying the net architecture. 335 | prediction_fn: a function to get predictions out of logits. 336 | spatial_squeeze: if True, logits is of shape is [B, C], if false logits is 337 | of shape [B, 1, 1, C], where B is batch_size and C is number of classes. 338 | reuse: whether or not the network and its variables should be reused. To be 339 | able to reuse 'scope' must be given. 340 | scope: Optional variable_scope. 341 | global_pool: Optional boolean flag to control the avgpooling before the 342 | logits layer. If false or unset, pooling is done with a fixed window 343 | that reduces default-sized inputs to 1x1, while larger inputs lead to 344 | larger outputs. If true, any input size is pooled down to 1x1. 345 | 346 | Returns: 347 | net: a 2D Tensor with the logits (pre-softmax activations) if num_classes 348 | is a non-zero integer, or the non-dropped-out input to the logits layer 349 | if num_classes is 0 or None. 350 | end_points: a dictionary from components of the network to the corresponding 351 | activation. 352 | 353 | Raises: 354 | ValueError: Input rank is invalid. 355 | """ 356 | input_shape = inputs.get_shape().as_list() 357 | if len(input_shape) != 4: 358 | raise ValueError('Invalid input tensor rank, expected 4, was: %d' % 359 | len(input_shape)) 360 | 361 | with tf.variable_scope(scope, 'MobilenetV1', [inputs], reuse=reuse) as scope: 362 | with slim.arg_scope([slim.batch_norm, slim.dropout], 363 | is_training=is_training): 364 | net, end_points = mobilenet_v1_base(inputs, scope=scope, 365 | min_depth=min_depth, 366 | depth_multiplier=depth_multiplier, 367 | conv_defs=conv_defs) 368 | with tf.variable_scope('Logits'): 369 | if global_pool: 370 | # Global average pooling. 371 | net = tf.reduce_mean(net, [1, 2], keep_dims=True, name='global_pool') 372 | end_points['global_pool'] = net 373 | else: 374 | # Pooling with a fixed kernel size. 
375 | kernel_size = _reduced_kernel_size_for_small_input(net, [7, 7]) 376 | net = slim.avg_pool2d(net, kernel_size, padding='VALID', 377 | scope='AvgPool_1a') 378 | end_points['AvgPool_1a'] = net 379 | if not num_classes: 380 | return net, end_points 381 | # 1 x 1 x 1024 382 | net = slim.dropout(net, keep_prob=dropout_keep_prob, scope='Dropout_1b') 383 | logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None, 384 | normalizer_fn=None, scope='Conv2d_1c_1x1') 385 | if spatial_squeeze: 386 | logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze') 387 | end_points['Logits'] = logits 388 | if prediction_fn: 389 | end_points['Predictions'] = prediction_fn(logits, scope='Predictions') 390 | return logits, end_points 391 | 392 | mobilenet_v1.default_image_size = 224 393 | 394 | 395 | def wrapped_partial(func, *args, **kwargs): 396 | partial_func = functools.partial(func, *args, **kwargs) 397 | functools.update_wrapper(partial_func, func) 398 | return partial_func 399 | 400 | 401 | mobilenet_v1_075 = wrapped_partial(mobilenet_v1, depth_multiplier=0.75) 402 | mobilenet_v1_050 = wrapped_partial(mobilenet_v1, depth_multiplier=0.50) 403 | mobilenet_v1_025 = wrapped_partial(mobilenet_v1, depth_multiplier=0.25) 404 | 405 | 406 | def _reduced_kernel_size_for_small_input(input_tensor, kernel_size): 407 | """Define kernel size which is automatically reduced for small input. 408 | 409 | If the shape of the input images is unknown at graph construction time this 410 | function assumes that the input images are large enough. 411 | 412 | Args: 413 | input_tensor: input tensor of size [batch_size, height, width, channels]. 414 | kernel_size: desired kernel size of length 2: [kernel_height, kernel_width] 415 | 416 | Returns: 417 | a tensor with the kernel size. 418 | """ 419 | shape = input_tensor.get_shape().as_list() 420 | if shape[1] is None or shape[2] is None: 421 | kernel_size_out = kernel_size 422 | else: 423 | kernel_size_out = [min(shape[1], kernel_size[0]), 424 | min(shape[2], kernel_size[1])] 425 | return kernel_size_out 426 | 427 | 428 | def mobilenet_v1_arg_scope( 429 | is_training=True, 430 | weight_decay=0.00004, 431 | stddev=0.09, 432 | regularize_depthwise=False, 433 | batch_norm_decay=0.9997, 434 | batch_norm_epsilon=0.001, 435 | batch_norm_updates_collections=tf.GraphKeys.UPDATE_OPS): 436 | """Defines the default MobilenetV1 arg scope. 437 | 438 | Args: 439 | is_training: Whether or not we're training the model. If this is set to 440 | None, the parameter is not added to the batch_norm arg_scope. 441 | weight_decay: The weight decay to use for regularizing the model. 442 | stddev: The standard deviation of the trunctated normal weight initializer. 443 | regularize_depthwise: Whether or not apply regularization on depthwise. 444 | batch_norm_decay: Decay for batch norm moving average. 445 | batch_norm_epsilon: Small float added to variance to avoid dividing by zero 446 | in batch norm. 447 | batch_norm_updates_collections: Collection for the update ops for 448 | batch norm. 449 | 450 | Returns: 451 | An `arg_scope` to use for the mobilenet v1 model. 452 | """ 453 | batch_norm_params = { 454 | 'center': True, 455 | 'scale': True, 456 | 'decay': batch_norm_decay, 457 | 'epsilon': batch_norm_epsilon, 458 | 'updates_collections': batch_norm_updates_collections, 459 | } 460 | if is_training is not None: 461 | batch_norm_params['is_training'] = is_training 462 | 463 | # Set weight_decay for weights in Conv and DepthSepConv layers. 
464 | weights_init = tf.truncated_normal_initializer(stddev=stddev) 465 | regularizer = tf.contrib.layers.l2_regularizer(weight_decay) 466 | if regularize_depthwise: 467 | depthwise_regularizer = regularizer 468 | else: 469 | depthwise_regularizer = None 470 | with slim.arg_scope([slim.conv2d, slim.separable_conv2d], 471 | weights_initializer=weights_init, 472 | activation_fn=tf.nn.relu6, normalizer_fn=slim.batch_norm): 473 | with slim.arg_scope([slim.batch_norm], **batch_norm_params): 474 | with slim.arg_scope([slim.conv2d], weights_regularizer=regularizer): 475 | with slim.arg_scope([slim.separable_conv2d], 476 | weights_regularizer=depthwise_regularizer) as sc: 477 | return sc 478 | -------------------------------------------------------------------------------- /S7/trainer.py: -------------------------------------------------------------------------------- 1 | """ Coding Session 7: fine tuning a model in TensorFlow 2 | You can download the pre-trained checkpoint from: 3 | https://github.com/tensorflow/models/tree/master/research/slim 4 | """ 5 | 6 | import tensorflow as tf 7 | import numpy as np 8 | import os, glob 9 | import argparse 10 | from nets import mobilenet_v1 11 | 12 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3' 13 | 14 | class TFModelTrainer: 15 | 16 | def __init__(self, checkpoint_path, data_path): 17 | self.checkpoint_path = checkpoint_path 18 | 19 | # set training parameters 20 | self.learning_rate = 0.01 21 | self.num_iter = 100000 22 | self.save_iter = 5000 23 | self.val_iter = 5000 24 | self.log_iter = 100 25 | self.batch_size = 32 26 | 27 | # set up data layer 28 | self.training_filenames = glob.glob(os.path.join(data_path, 'train_*.tfrecord')) 29 | self.validation_filenames = glob.glob(os.path.join(data_path, 'test_*.tfrecord')) 30 | self.iterator, self.filenames = self._data_layer() 31 | self.num_val_samples = 10000 32 | self.num_classes = 2 33 | self.image_size = 224 34 | 35 | # fine tune only the last layer 36 | self.fine_tune = True ##################################################################### 37 | 38 | def preprocess_image(self, image_string): 39 | image = tf.image.decode_jpeg(image_string, channels=3) 40 | 41 | # flip for data augmentation 42 | image = tf.image.random_flip_left_right(image) 43 | 44 | # normalize image to [-1, +1] 45 | image = tf.cast(image, tf.float32) 46 | image = image / 127.5 47 | image = image - 1 48 | return image 49 | 50 | def _parse_tfrecord(self, example_proto): 51 | keys_to_features = {'image': tf.FixedLenFeature([], tf.string), 52 | 'label': tf.FixedLenFeature([], tf.int64)} 53 | parsed_features = tf.parse_single_example(example_proto, keys_to_features) 54 | image = parsed_features['image'] 55 | label = parsed_features['label'] 56 | image = self.preprocess_image(image) 57 | return image, label 58 | 59 | def _data_layer(self, num_threads=8, prefetch_buffer=100): 60 | with tf.variable_scope('data'): 61 | filenames = tf.placeholder(tf.string, shape=[None]) 62 | dataset = tf.data.TFRecordDataset(filenames) 63 | dataset = dataset.map(self._parse_tfrecord, num_parallel_calls=num_threads) 64 | dataset = dataset.repeat() 65 | dataset = dataset.batch(self.batch_size) 66 | dataset = dataset.prefetch(prefetch_buffer) 67 | iterator = dataset.make_initializable_iterator() 68 | return iterator, filenames 69 | 70 | def _loss_functions(self, logits, labels): 71 | with tf.variable_scope('loss'): 72 | target_prob = tf.one_hot(labels, self.num_classes) 73 | tf.losses.softmax_cross_entropy(target_prob, logits) 74 | total_loss = 
tf.losses.get_total_loss() #include regularization loss
75 |         return total_loss
76 |
77 |     def _optimizer(self, total_loss, global_step):
78 |         update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
79 |         with tf.control_dependencies(update_ops):
80 |             optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate, epsilon=0.1)
81 |             if self.fine_tune: #####################################################################
82 |                 train_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, "MobilenetV1/Logits")
83 |                 optimizer = optimizer.minimize(total_loss, var_list=train_vars, global_step=global_step)
84 |             else:
85 |                 optimizer = optimizer.minimize(total_loss, global_step=global_step)
86 |         return optimizer
87 |
88 |     def _performance_metric(self, logits, labels):
89 |         with tf.variable_scope("performance_metric"):
90 |             preds = tf.argmax(logits, axis=1)
91 |             labels = tf.cast(labels, tf.int64)
92 |             corrects = tf.equal(preds, labels)
93 |             accuracy = tf.reduce_mean(tf.cast(corrects, tf.float32))
94 |         return accuracy
95 |
96 |     def _variables_to_restore(self, save_file, graph): #############################################
97 |         # returns a list of variables that can be restored from a checkpoint
98 |         reader = tf.train.NewCheckpointReader(save_file)
99 |         saved_shapes = reader.get_variable_to_shape_map()
100 |         var_names = sorted([(var.name, var.name.split(':')[0]) for var in tf.global_variables()
101 |                             if var.name.split(':')[0] in saved_shapes])
102 |         restore_vars = []
103 |         for var_name, saved_var_name in var_names:
104 |             curr_var = graph.get_tensor_by_name(var_name)
105 |             var_shape = curr_var.get_shape().as_list()
106 |             if var_shape == saved_shapes[saved_var_name]:
107 |                 restore_vars.append(curr_var)
108 |         return restore_vars
109 |
110 |     def train(self):
111 |         # iteration number
112 |         global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name='iter_number')
113 |
114 |         # training graph
115 |         images, labels = self.iterator.get_next()
116 |         images = tf.image.resize_bilinear(images, (self.image_size, self.image_size))
117 |         training = tf.placeholder(tf.bool, name='is_training')
118 |         logits, _ = mobilenet_v1.mobilenet_v1(images,
119 |                                               num_classes=self.num_classes,
120 |                                               is_training=training,
121 |                                               scope='MobilenetV1',
122 |                                               global_pool=True)
123 |         loss = self._loss_functions(logits, labels)
124 |         optimizer = self._optimizer(loss, global_step)
125 |         accuracy = self._performance_metric(logits, labels)
126 |
127 |         # summary placeholders
128 |         streaming_loss_p = tf.placeholder(tf.float32)
129 |         accuracy_p = tf.placeholder(tf.float32)
130 |         summ_op_train = tf.summary.scalar('streaming_loss', streaming_loss_p)
131 |         summ_op_test = tf.summary.scalar('accuracy', accuracy_p)
132 |
133 |         # don't allocate entire GPU memory #########################################################
134 |         config = tf.ConfigProto()
135 |         config.gpu_options.allow_growth = True
136 |
137 |         with tf.Session(config=config) as sess:
138 |             sess.run(tf.global_variables_initializer())
139 |             sess.run(self.iterator.initializer, feed_dict={self.filenames: self.training_filenames})
140 |
141 |             writer = tf.summary.FileWriter(self.checkpoint_path, sess.graph)
142 |
143 |             saver = tf.train.Saver(max_to_keep=None) # keep all checkpoints
144 |             ckpt = tf.train.get_checkpoint_state(self.checkpoint_path)
145 |
146 |             # resume training if a checkpoint exists
147 |             if ckpt and ckpt.model_checkpoint_path:
148 |                 restore_vars = self._variables_to_restore(ckpt.model_checkpoint_path, sess.graph)
149 |                 restore_saver = tf.train.Saver(var_list=restore_vars)  # restore with a separate saver so `saver` still saves every variable
150 |                 restore_saver.restore(sess, ckpt.model_checkpoint_path)
151 |                 print('Loaded parameters from {}'.format(ckpt.model_checkpoint_path))
152 |
153 |             initial_step = global_step.eval()
154 |
155 |             # train the model
156 |             streaming_loss = 0
157 |             for i in range(initial_step, self.num_iter + 1):
158 |                 _, loss_batch = sess.run([optimizer, loss], feed_dict={training: True})
159 |
160 |                 if not np.isfinite(loss_batch):
161 |                     print('loss diverged, stopping')
162 |                     exit()
163 |
164 |                 # log summary
165 |                 streaming_loss += loss_batch
166 |                 if i % self.log_iter == self.log_iter - 1:
167 |                     streaming_loss /= self.log_iter
168 |                     print(i + 1, streaming_loss)
169 |                     summary_train = sess.run(summ_op_train, feed_dict={streaming_loss_p: streaming_loss})
170 |                     writer.add_summary(summary_train, global_step=i)
171 |                     streaming_loss = 0
172 |
173 |                 # save model
174 |                 if i % self.save_iter == self.save_iter - 1:
175 |                     saver.save(sess, os.path.join(self.checkpoint_path, 'checkpoint'), global_step=global_step)
176 |                     print("Model saved!")
177 |
178 |                 # run validation
179 |                 if i % self.val_iter == self.val_iter - 1:
180 |                     print("Running validation.")
181 |                     sess.run(self.iterator.initializer, feed_dict={self.filenames: self.validation_filenames})
182 |
183 |                     validation_accuracy = 0
184 |                     for j in range(self.num_val_samples // self.batch_size):
185 |                         acc_batch = sess.run(accuracy, feed_dict={training: False})
186 |                         validation_accuracy += acc_batch
187 |                     validation_accuracy /= (self.num_val_samples // self.batch_size)  # average over the number of batches, not the last loop index
188 |
189 |                     print("Accuracy: {}".format(validation_accuracy))
190 |
191 |                     summary_test = sess.run(summ_op_test, feed_dict={accuracy_p: validation_accuracy})
192 |                     writer.add_summary(summary_test, global_step=i)
193 |
194 |                     sess.run(self.iterator.initializer, feed_dict={self.filenames: self.training_filenames})
195 |
196 |             writer.close()
197 |
198 | def main():
199 |     parser = argparse.ArgumentParser()
200 |     parser.add_argument('--checkpoint_path', type=str, default='./checkpoints/',
201 |                         help="Path to the dir where the checkpoints are saved")
202 |     parser.add_argument('--data_path', type=str, default='./tfrecords/', help="Path to the TFRecords")
203 |     args = parser.parse_args()
204 |     trainer = TFModelTrainer(args.checkpoint_path, args.data_path)
205 |     trainer.train()
206 |
207 | if __name__ == '__main__':
208 |     main()
209 |
--------------------------------------------------------------------------------
/S8/datagenerator.py:
--------------------------------------------------------------------------------
1 | """ A Python iterator that loads and processes images.
2 | This iterator will be called through the TensorFlow Dataset API to feed pairs of
3 | clean and noisy images into the model.
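Example: next(DataGenerator(image_size=(256, 256))) returns an
(image_in, image_out) pair, where image_out is image_in with Gaussian noise
of a random strength added (see the __main__ block at the bottom).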
4 | """ 5 | 6 | import cv2 7 | import numpy as np 8 | import random 9 | import os, glob 10 | 11 | class DataGenerator: 12 | 13 | def __init__(self, image_size, base_dir = '../S6/dataset'): 14 | # all images will be center cropped and resized to image_size 15 | self.image_size = image_size 16 | 17 | # number of validation samples 18 | self.num_val = 320 19 | 20 | filenames = glob.glob(os.path.join(base_dir, '**', '*.jpg'), recursive=True) 21 | self.filenames_train = filenames[self.num_val:] 22 | self.filenames_val = filenames[:self.num_val] 23 | 24 | # dataset mode 25 | self.is_training = True 26 | self.val_idx = 0 27 | 28 | def __iter__(self): 29 | return self 30 | 31 | def __next__(self): 32 | try: 33 | return self.fetch_sample() 34 | except IndexError: 35 | raise StopIteration() 36 | 37 | def get_tensor_shape(self): 38 | return (self.image_size[1], self.image_size[0], 3) 39 | 40 | def set_mode(self, is_training): 41 | self.is_training = is_training 42 | self.val_idx = 0 43 | 44 | def fetch_sample(self): 45 | if self.is_training: 46 | # pick a random image 47 | impath = random.choice(self.filenames_train) 48 | else: 49 | # pick the next validation sample 50 | impath = self.filenames_val[self.val_idx] 51 | self.val_idx += 1 52 | image_in = cv2.imread(impath) 53 | 54 | # resize to image_size 55 | image_in = self.center_crop_and_resize(image_in) 56 | 57 | # inject noise 58 | image_out = self.add_random_noise(image_in) 59 | 60 | return image_in, image_out 61 | 62 | def center_crop_and_resize(self, image): 63 | R, C, _ = image.shape 64 | if R > C: 65 | pad = (R - C) // 2 66 | image = image[pad:-pad, :] 67 | elif C > R: 68 | pad = (C - R) // 2 69 | image = image[:, pad:-pad] 70 | image = cv2.resize(image, self.image_size) 71 | return image 72 | 73 | def add_random_noise(self, image): 74 | noise_var = random.randrange(3, 15) 75 | h, w, c = image.shape 76 | image_out = image.copy() 77 | image_out = image_out.astype(np.float32) 78 | noise = np.random.randn(h, w, c) * noise_var 79 | image_out += noise 80 | image_out = np.minimum(np.maximum(image_out, 0), 255) 81 | image_out = image_out.astype(np.uint8) 82 | return image_out 83 | 84 | if __name__ == '__main__': 85 | dg = DataGenerator(image_size=(256, 256)) 86 | image_in, image_out = next(dg) 87 | cv2.imwrite('image_in.jpg', image_in) 88 | cv2.imwrite('image_out.jpg', image_out) 89 | -------------------------------------------------------------------------------- /S8/nets/model.py: -------------------------------------------------------------------------------- 1 | """ A simple encoder-decoder network with skip connections. 
2 | """ 3 | 4 | import tensorflow as tf 5 | 6 | def model(input_layer, training, weight_decay=0.00001): 7 | 8 | reg = tf.contrib.layers.l2_regularizer(weight_decay) 9 | 10 | def conv_block(inputs, num_filters, name): 11 | with tf.variable_scope(name): 12 | net = tf.layers.separable_conv2d( 13 | inputs=inputs, 14 | filters=num_filters, 15 | kernel_size=(3,3), 16 | padding='SAME', 17 | use_bias=False, 18 | activation=None, 19 | pointwise_regularizer=reg, 20 | depthwise_regularizer=reg) 21 | net = tf.layers.batch_normalization( 22 | inputs=net, 23 | training=training) 24 | net = tf.nn.relu(net) 25 | return net 26 | 27 | def pointwise_block(inputs, num_filters, name): 28 | with tf.variable_scope(name): 29 | net = tf.layers.conv2d( 30 | inputs=inputs, 31 | filters=num_filters, 32 | kernel_size=1, 33 | use_bias=False, 34 | activation=None, 35 | kernel_regularizer=reg) 36 | net = tf.layers.batch_normalization( 37 | inputs=net, 38 | training=training) 39 | net = tf.nn.relu(net) 40 | return net 41 | 42 | def pooling(inputs, name): 43 | with tf.variable_scope(name): 44 | net = tf.layers.max_pooling2d(inputs, 45 | pool_size=(2,2), 46 | strides=(2,2)) 47 | return net 48 | 49 | def downsampling(inputs, name): 50 | with tf.variable_scope(name): 51 | net = tf.layers.average_pooling2d(inputs, 52 | pool_size=(2,2), 53 | strides=(2,2)) 54 | return net 55 | 56 | def upsampling(inputs, name): 57 | with tf.variable_scope(name): 58 | dims = tf.shape(inputs) 59 | new_size = [dims[1]*2, dims[2]*2] 60 | net = tf.image.resize_bilinear(inputs, new_size) 61 | return net 62 | 63 | def output_block(inputs, name): 64 | with tf.variable_scope(name): 65 | net = tf.layers.conv2d( 66 | inputs=inputs, 67 | filters=3, 68 | kernel_size=(1,1), 69 | activation=None) 70 | return net 71 | 72 | def subnet_module(inputs, name, num_filters, num_layers=3): 73 | with tf.variable_scope(name): 74 | for i in range(num_layers-1): 75 | net = conv_block(inputs, num_filters=num_filters, name='{}_conv{}'.format(name, i)) 76 | inputs = tf.concat([net, inputs], axis=3) 77 | net = conv_block(inputs, num_filters=num_filters, name='{}_conv3'.format(name)) 78 | return net 79 | 80 | num_filters = 16 81 | net = input_layer 82 | skip_connections = [] 83 | # encoder 84 | with tf.variable_scope('encoder'): 85 | for i in range(4): 86 | net = subnet_module(net, num_filters=num_filters, name='conv_e{}'.format(i)) 87 | skip_connections.append(net) 88 | net = pooling(net, name='pool{}'.format(i)) 89 | num_filters *= 2 90 | 91 | # bottleneck 92 | net = subnet_module(net, num_filters=num_filters, name='conv_bottleneck'.format(i)) 93 | 94 | # decoder 95 | with tf.variable_scope('decoder'): 96 | for i in range(4): 97 | num_filters /= 2 98 | net = upsampling(net, name='upsample{}'.format(i)) 99 | net = tf.concat([net, skip_connections.pop()], axis=3) 100 | net = subnet_module(net, num_filters=num_filters, name='subnet_d{}'.format(i)) 101 | 102 | # exit flow 103 | with tf.variable_scope('exit_flow'): 104 | logits = output_block(net, name='output_block') 105 | 106 | return logits 107 | -------------------------------------------------------------------------------- /S8/trainer.py: -------------------------------------------------------------------------------- 1 | """ Coding Session 8: using a python iterator as a data generator and training a denoising autoencoder 2 | """ 3 | 4 | import tensorflow as tf 5 | import numpy as np 6 | import os, glob 7 | import argparse 8 | from nets.model import model 9 | from datagenerator import DataGenerator 
######################### 10 | 11 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3' 12 | 13 | class TFModelTrainer: 14 | 15 | def __init__(self, checkpoint_path): 16 | self.checkpoint_path = checkpoint_path 17 | 18 | # set training parameters 19 | self.learning_rate = 0.01 20 | self.num_iter = 100000 21 | self.save_iter = 1000 22 | self.val_iter = 1000 23 | self.log_iter = 100 24 | self.batch_size = 16 25 | 26 | # set up data layer 27 | self.image_size = (224, 224) 28 | self.data_generator = DataGenerator(self.image_size) 29 | 30 | def preprocess_image(self, image): 31 | # normalize image to [-1, +1] 32 | image = tf.cast(image, tf.float32) 33 | image = image / 127.5 34 | image = image - 1 35 | return image 36 | 37 | def _preprocess_images(self, image_orig, image_noisy): 38 | image_orig = self.preprocess_image(image_orig) 39 | image_noisy = self.preprocess_image(image_noisy) 40 | return image_orig, image_noisy 41 | 42 | def _data_layer(self, num_threads=8, prefetch_buffer=100): 43 | with tf.variable_scope('data'): 44 | data_shape = self.data_generator.get_tensor_shape() ######################### 45 | dataset = tf.data.Dataset.from_generator(lambda: self.data_generator, 46 | (tf.float32, tf.float32), 47 | (tf.TensorShape(data_shape), 48 | tf.TensorShape(data_shape))) 49 | dataset = dataset.map(self._preprocess_images, num_parallel_calls=num_threads) 50 | dataset = dataset.batch(self.batch_size) 51 | dataset = dataset.prefetch(prefetch_buffer) 52 | iterator = dataset.make_initializable_iterator() 53 | return iterator 54 | 55 | def _loss_functions(self, preds, ground_truth): 56 | with tf.name_scope('loss'): 57 | tf.losses.mean_squared_error(ground_truth, preds) ######################### 58 | total_loss = tf.losses.get_total_loss() 59 | return total_loss 60 | 61 | def _optimizer(self, loss, global_step): 62 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) 63 | with tf.control_dependencies(update_ops): 64 | optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate, epsilon=0.1) 65 | optimizer = optimizer.minimize(loss, global_step=global_step) 66 | return optimizer 67 | 68 | def train(self): 69 | # iteration number 70 | global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name='iter_number') 71 | 72 | # training graph 73 | iterator = self._data_layer() 74 | image_orig, image_noisy = iterator.get_next() 75 | training = tf.placeholder(tf.bool, name='is_training') 76 | logits = model(image_noisy, training=training) 77 | loss = self._loss_functions(logits, image_orig) 78 | optimizer = self._optimizer(loss, global_step) 79 | 80 | # summary placeholders 81 | streaming_loss_p = tf.placeholder(tf.float32) 82 | validation_loss_p = tf.placeholder(tf.float32) 83 | summ_op_train = tf.summary.scalar('streaming_loss', streaming_loss_p) 84 | summ_op_test = tf.summary.scalar('validation_loss', validation_loss_p) 85 | 86 | # don't allocate entire gpu memory 87 | config = tf.ConfigProto() 88 | config.gpu_options.allow_growth = True 89 | 90 | with tf.Session(config=config) as sess: 91 | sess.run(tf.global_variables_initializer()) 92 | sess.run(iterator.initializer) 93 | 94 | writer = tf.summary.FileWriter(self.checkpoint_path, sess.graph) 95 | 96 | saver = tf.train.Saver(max_to_keep=None) # keep all checkpoints 97 | ckpt = tf.train.get_checkpoint_state(self.checkpoint_path) 98 | 99 | # resume training if a checkpoint exists 100 | if ckpt and ckpt.model_checkpoint_path: 101 | saver.restore(sess, ckpt.model_checkpoint_path) 102 | print('Loaded parameters from 
{}'.format(ckpt.model_checkpoint_path))
103 |
104 |             initial_step = global_step.eval()
105 |
106 |             # train the model
107 |             streaming_loss = 0
108 |             for i in range(initial_step, self.num_iter + 1):
109 |                 _, loss_batch = sess.run([optimizer, loss], feed_dict={training: True})
110 |
111 |                 if not np.isfinite(loss_batch):
112 |                     print('loss diverged, stopping')
113 |                     exit()
114 |
115 |                 # log summary
116 |                 streaming_loss += loss_batch
117 |                 if i % self.log_iter == self.log_iter - 1:
118 |                     streaming_loss /= self.log_iter
119 |                     print(i + 1, streaming_loss)
120 |                     summary_train = sess.run(summ_op_train, feed_dict={streaming_loss_p: streaming_loss})
121 |                     writer.add_summary(summary_train, global_step=i)
122 |                     streaming_loss = 0
123 |
124 |                 # save model
125 |                 if i % self.save_iter == self.save_iter - 1:
126 |                     saver.save(sess, os.path.join(self.checkpoint_path, 'checkpoint'), global_step=global_step)
127 |                     print("Model saved!")
128 |
129 |                 # run validation
130 |                 if i % self.val_iter == self.val_iter - 1:
131 |                     print("Running validation.")
132 |                     self.data_generator.set_mode(is_training=False)
133 |                     sess.run(iterator.initializer)
134 |
135 |                     validation_loss = 0
136 |                     for j in range(self.data_generator.num_val // self.batch_size):
137 |                         loss_batch = sess.run(loss, feed_dict={training: False})
138 |                         validation_loss += loss_batch
139 |                     validation_loss /= (self.data_generator.num_val // self.batch_size)  # average over the number of batches, not the last loop index
140 |
141 |                     print("Validation loss: {}".format(validation_loss))
142 |
143 |                     summary_test = sess.run(summ_op_test, feed_dict={validation_loss_p: validation_loss})
144 |                     writer.add_summary(summary_test, global_step=i)
145 |
146 |                     self.data_generator.set_mode(is_training=True)
147 |                     sess.run(iterator.initializer)
148 |
149 |             writer.close()
150 |
151 | def main():
152 |     parser = argparse.ArgumentParser()
153 |     parser.add_argument('--checkpoint_path', type=str, default='./checkpoints/',
154 |                         help="Path to the dir where the checkpoints are saved")
155 |     args = parser.parse_args()
156 |     trainer = TFModelTrainer(args.checkpoint_path)
157 |     trainer.train()
158 |
159 | if __name__ == '__main__':
160 |     main()
161 |
--------------------------------------------------------------------------------
/S9/S1.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 | x = 2.0
4 | y = 8.0
5 |
6 | @tf.function
7 | def geometric_mean(x, y):
8 |     g_mean = tf.sqrt(x * y)
9 |     return g_mean
10 |
11 | g_mean = geometric_mean(x, y)
12 | tf.print(g_mean)
--------------------------------------------------------------------------------
/S9/S2.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 | (train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()
4 |
5 | model = tf.keras.Sequential([
6 |     tf.keras.layers.Flatten(input_shape=(28, 28)),
7 |     tf.keras.layers.Dense(512, activation=tf.nn.relu),
8 |     tf.keras.layers.Dense(10, activation=tf.nn.softmax)
9 | ])
10 |
11 | model.compile(optimizer='adam',
12 |               loss='sparse_categorical_crossentropy',
13 |               metrics=['accuracy'])
14 |
15 | model.fit(train_images, train_labels, epochs=5)
16 |
--------------------------------------------------------------------------------
/S9/S3.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 | (train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()
4 |
5 | inputs = tf.keras.Input(shape=(28,28))
6 | net = tf.keras.layers.Flatten()(inputs)
7 | net
8 | net = tf.keras.layers.Dense(10, activation=tf.nn.softmax)(net)
9 | model = tf.keras.Model(inputs=inputs, outputs=net)
10 | 
11 | model.compile(optimizer='adam',
12 |               loss='sparse_categorical_crossentropy',
13 |               metrics=['accuracy'])
14 | 
15 | model.fit(train_images, train_labels, epochs=5)
--------------------------------------------------------------------------------
/S9/S4.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | 
3 | data_train, _ = tf.keras.datasets.mnist.load_data()
4 | dataset = tf.data.Dataset.from_tensor_slices(data_train)
5 | dataset = dataset.shuffle(buffer_size=60000)
6 | dataset = dataset.batch(32)
7 | 
8 | for row in dataset.take(1):  # datasets are Python iterables in eager mode; peek at one batch
9 |     print(row)
10 | 
11 | model = tf.keras.Sequential([
12 |     tf.keras.layers.Flatten(input_shape=(28, 28)),
13 |     tf.keras.layers.Dense(512, activation=tf.nn.relu),
14 |     tf.keras.layers.Dense(10, activation=tf.nn.softmax)
15 | ])
16 | 
17 | model.compile(optimizer='adam',
18 |               loss='sparse_categorical_crossentropy',
19 |               metrics=['accuracy'])
20 | model.fit(dataset, epochs=5)
--------------------------------------------------------------------------------
/S9/S5.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | 
3 | data_train, _ = tf.keras.datasets.mnist.load_data()
4 | dataset = tf.data.Dataset.from_tensor_slices(data_train)
5 | dataset = dataset.shuffle(buffer_size=60000)
6 | dataset = dataset.batch(32)
7 | 
8 | model = tf.keras.Sequential([
9 |     tf.keras.layers.Flatten(input_shape=(28, 28)),
10 |     tf.keras.layers.Dense(512, activation=tf.nn.relu),
11 |     tf.keras.layers.Dense(10, activation=tf.nn.softmax)
12 | ])
13 | 
14 | model.compile(optimizer='adam',
15 |               loss='sparse_categorical_crossentropy',
16 |               metrics=['accuracy'])
17 | 
18 | tb = tf.keras.callbacks.TensorBoard(log_dir='./checkpoints')  # writes training logs for TensorBoard
19 | model.fit(dataset, epochs=5, callbacks=[tb])
--------------------------------------------------------------------------------
/S9/S6.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | 
3 | data_train, _ = tf.keras.datasets.mnist.load_data()
4 | dataset = tf.data.Dataset.from_tensor_slices(data_train)
5 | dataset = dataset.shuffle(buffer_size=60000)
6 | dataset = dataset.batch(32)
7 | 
8 | strategy = tf.distribute.MirroredStrategy()  # synchronous data parallelism across available GPUs
9 | with strategy.scope():  # variables created under this scope are mirrored on each replica
10 |     model = tf.keras.Sequential([
11 |         tf.keras.layers.Flatten(input_shape=(28, 28)),
12 |         tf.keras.layers.Dense(512, activation=tf.nn.relu),
13 |         tf.keras.layers.Dense(10, activation=tf.nn.softmax)
14 |     ])
15 | 
16 |     model.compile(optimizer='adam',
17 |                   loss='sparse_categorical_crossentropy',
18 |                   metrics=['accuracy'])
19 | 
20 | tb = tf.keras.callbacks.TensorBoard(log_dir='./checkpoints')
21 | model.fit(dataset, epochs=5, callbacks=[tb])
--------------------------------------------------------------------------------
/img/dlcc_github.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/isikdogan/deep_learning_tutorials/7a81d56c1b6e8bee715ddb08e85ea25562acbdd8/img/dlcc_github.jpg
--------------------------------------------------------------------------------
/img/tfcs_github.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/isikdogan/deep_learning_tutorials/7a81d56c1b6e8bee715ddb08e85ea25562acbdd8/img/tfcs_github.png
--------------------------------------------------------------------------------