├── .gitignore
├── README.md
├── S1
│   ├── S1_notebook.ipynb
│   ├── S1a_live.py
│   └── S1b_live.py
├── S2_live.py
├── S3_live.py
├── S4_live.py
├── S5_live.py
├── S6
│   ├── create_tf_records.py
│   ├── freeze_model.py
│   ├── inference.py
│   ├── nets
│   │   └── mobilenet_v1.py
│   ├── resize_images.py
│   └── trainer.py
├── S7
│   ├── checkpoints
│   │   └── checkpoint
│   ├── nets
│   │   └── mobilenet_v1.py
│   └── trainer.py
├── S8
│   ├── datagenerator.py
│   ├── nets
│   │   └── model.py
│   └── trainer.py
├── S9
│   ├── S1.py
│   ├── S2.py
│   ├── S3.py
│   ├── S4.py
│   ├── S5.py
│   └── S6.py
└── img
    ├── dlcc_github.jpg
    └── tfcs_github.png
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 |
9 | # Distribution / packaging
10 | .Python
11 | env/
12 | build/
13 | develop-eggs/
14 | dist/
15 | downloads/
16 | eggs/
17 | .eggs/
18 | lib/
19 | lib64/
20 | parts/
21 | sdist/
22 | var/
23 | wheels/
24 | *.egg-info/
25 | .installed.cfg
26 | *.egg
27 |
28 | # PyInstaller
29 | # Usually these files are written by a python script from a template
30 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
31 | *.manifest
32 | *.spec
33 |
34 | # Installer logs
35 | pip-log.txt
36 | pip-delete-this-directory.txt
37 |
38 | # Unit test / coverage reports
39 | htmlcov/
40 | .tox/
41 | .coverage
42 | .coverage.*
43 | .cache
44 | nosetests.xml
45 | coverage.xml
46 | *.cover
47 | .hypothesis/
48 |
49 | # Translations
50 | *.mo
51 | *.pot
52 |
53 | # Django stuff:
54 | *.log
55 | local_settings.py
56 |
57 | # Flask stuff:
58 | instance/
59 | .webassets-cache
60 |
61 | # Scrapy stuff:
62 | .scrapy
63 |
64 | # Sphinx documentation
65 | docs/_build/
66 |
67 | # PyBuilder
68 | target/
69 |
70 | # Jupyter Notebook
71 | .ipynb_checkpoints
72 |
73 | # pyenv
74 | .python-version
75 |
76 | # celery beat schedule file
77 | celerybeat-schedule
78 |
79 | # SageMath parsed files
80 | *.sage.py
81 |
82 | # dotenv
83 | .env
84 |
85 | # virtualenv
86 | .venv
87 | venv/
88 | ENV/
89 |
90 | # Spyder project settings
91 | .spyderproject
92 | .spyproject
93 |
94 | # Rope project settings
95 | .ropeproject
96 |
97 | # mkdocs documentation
98 | /site
99 |
100 | # mypy
101 | .mypy_cache/
102 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ![Hands-on Deep Learning: TensorFlow Coding Sessions](img/tfcs_github.png)
2 |
3 | ## Hands-on Deep Learning: TensorFlow Coding Sessions
4 |
5 | This repository contains the code for the Hands-on Deep Learning: TensorFlow Coding Sessions. The videos will be uploaded on a weekly basis.
6 |
7 | The series consists of the introductory TensorFlow tutorials outlined below:
8 |
9 | | # | Tutorial | Code | Video |
10 | |-|------------------------------------------------------------------------|------|------------------|
11 | |1| Introduction to TensorFlow: graphs, sessions, constants, and variables |[S1](S1/) and [S1_notebook.ipynb](S1/S1_notebook.ipynb)| [Video #1](https://youtu.be/1KzJbIFnVTE) |
12 | |2| Training a multilayer perceptron |[S2_live.py](S2_live.py)| [Video #2](https://youtu.be/b7ykcBzz9wo) |
13 | |3| Setting up the training and validation pipeline |[S3_live.py](S3_live.py)| [Video #3](https://youtu.be/l_ZvxKBToWs) |
14 | |4| Regularization, saving and resuming from checkpoints, and TensorBoard |[S4_live.py](S4_live.py)| [Video #4](https://youtu.be/ni9FZtF_gLs) |
15 | |5| Convolutional neural networks, batchnorm, learning rate schedules, optimizers|[S5_live.py](S5_live.py)| [Video #5](https://youtu.be/ULX1nWPAJbM) |
16 | |6| Converting a dataset into TFRecords, training an image classifier, and freezing the model for deployment|[S6](S6/)| [Video #6](https://youtu.be/tzKqjPdAf8M) |
17 | |7| Transfer learning: fine tuning a model in TensorFlow |[S7](S7/)| [Video #7](https://youtu.be/jccBP_uA98k) |
18 | |8| Using a Python iterator as a data generator and training a denoising autoencoder |[S8](S8/)| N/A |
19 | |9| What is new in TensorFlow 2.0 **[new]** |[S9](S9/)| [Video #8](https://youtu.be/GI_QVLNCgPo) |
20 |
21 | ---
22 |
23 | ![Deep Learning Crash Course](img/dlcc_github.jpg)
24 |
25 | ## Deep Learning Crash Course
26 |
27 | A series of mini-lectures on the fundamentals of machine learning, with a focus on neural networks and deep learning.
28 |
29 | * [Lecture #1: Introduction](https://youtu.be/nmnaO6esC7c)
30 | * [Lecture #2: Artificial Neural Networks Demystified](https://youtu.be/oS5fz_mHVz0)
31 | * [Lecture #3: Artificial Neural Networks: Going Deeper](https://youtu.be/_XPkAxm0Yx0)
32 | * [Lecture #4: Overfitting, Underfitting, and Model Capacity](https://youtu.be/ms-Ooh9mjiE)
33 | * [Lecture #5: Regularization](https://youtu.be/NRCZJUviZN0)
34 | * [Lecture #6: Data Collection and Preprocessing](https://youtu.be/dAg-_gzFo14)
35 | * [Lecture #7: Convolutional Neural Networks Explained](https://youtu.be/-I0lry5ceDs)
36 | * [Lecture #8: How to Design a Convolutional Neural Network](https://youtu.be/fTw3K8D5xDs)
37 | * [Lecture #9: Transfer Learning](https://youtu.be/_2EHcpg52uU)
38 | * [Lecture #10: Optimization Tricks: momentum, batch-norm, and more](https://youtu.be/kK8-jCCR4is)
39 | * [Lecture #11: Recurrent Neural Networks](https://youtu.be/k97Jrg_4tFA)
40 | * [Lecture #12: Deep Unsupervised Learning](https://youtu.be/P8_W5Wc4zeg)
41 | * [Lecture #13: Generative Adversarial Networks](https://youtu.be/7tFBoxex4JE)
42 | * [Lecture #14: Practical Methodology in Deep Learning](https://youtu.be/9Sl_t_GxX6w)
43 |
44 | ---
45 |
--------------------------------------------------------------------------------
/S1/S1_notebook.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 |     "# Deep Learning with TensorFlow\n",
8 | "\n",
9 | "## Introduction\n",
10 | "\n",
11 |     "Let's start by importing TensorFlow into our project and making sure that we have the right version installed.\n",
12 | "If you haven't installed TensorFlow yet, you can easily do so using PyPI: https://www.tensorflow.org/install/."
13 | ]
14 | },
15 | {
16 | "cell_type": "code",
17 | "execution_count": 1,
18 | "metadata": {},
19 | "outputs": [
20 | {
21 | "name": "stdout",
22 | "output_type": "stream",
23 | "text": [
24 | "1.10.0\n"
25 | ]
26 | }
27 | ],
28 | "source": [
29 | "import tensorflow as tf\n",
30 | "print(tf.__version__)"
31 | ]
32 | },
33 | {
34 | "cell_type": "markdown",
35 | "metadata": {},
36 | "source": [
37 | "### Graphs and Sessions\n",
38 |     "Unless you are using the eager execution mode, operations in TensorFlow are not executed immediately. In TensorFlow, the description of the computations is separated from the execution. A typical TensorFlow program constructs a computational graph first, then creates a session to execute the operations in the graph. Let's create a very simple graph and run it in a session to compute the geometric mean of two numbers. In this example we use placeholders to feed the inputs to the graph. By defining a placeholder we tell the model that we will feed the values later, when we execute the graph. Feeding data this way can lead to input/output bottlenecks in large-scale applications. We will later see how to read data in parallel while the graph is being executed."
39 | ]
40 | },
41 | {
42 | "cell_type": "code",
43 | "execution_count": 2,
44 | "metadata": {},
45 | "outputs": [
46 | {
47 | "name": "stdout",
48 | "output_type": "stream",
49 | "text": [
50 | "4.0\n"
51 | ]
52 | }
53 | ],
54 | "source": [
55 | "# Define the inputs\n",
56 | "x = tf.placeholder(tf.float32)\n",
57 | "y = tf.placeholder(tf.float32)\n",
58 | "\n",
59 | "# Define the graph\n",
60 | "g_mean = tf.sqrt(x * y)\n",
61 | "\n",
62 | "# Run the graph\n",
63 | "with tf.Session() as sess:\n",
64 | " res = sess.run(g_mean, feed_dict={x: 2, y:8})\n",
65 | " print(res)"
66 | ]
67 | },
68 | {
69 | "cell_type": "markdown",
70 | "metadata": {},
71 | "source": [
72 | "### Constants and Variables\n",
73 | "\n",
74 | "We can declare constants and variables to use in a graph. The main differences between these two are:\n",
75 | "* Constants have constant values whereas variables can change during execution. A typical example of a variable is a trainable weight in a neural network.\n",
76 |     "* Constants are stored in the graph, whereas variables are not. Using constants increases the size of the graph.\n",
77 | "\n",
78 | "Let's take a look at an example."
79 | ]
80 | },
81 | {
82 | "cell_type": "code",
83 | "execution_count": 7,
84 | "metadata": {},
85 | "outputs": [
86 | {
87 | "name": "stdout",
88 | "output_type": "stream",
89 | "text": [
90 | "0.2\n"
91 | ]
92 | }
93 | ],
94 | "source": [
95 | "# This block gets an existing variable with a specific name within a variable scope\n",
96 | "# or creates a new one if no such variable exists\n",
97 | "# In this case it's identical to using tf.Variable\n",
98 | "# Variable scopes help us define and reuse variables within a context\n",
99 | "with tf.variable_scope(\"linear_model\", reuse=tf.AUTO_REUSE):\n",
100 | " w = tf.get_variable(\"weight\", dtype=tf.float32, initializer=tf.constant(0.1))\n",
101 | " c = tf.get_variable(\"bias\", dtype=tf.float32, initializer=tf.constant(0.0))\n",
102 | "\n",
103 | "# here we define our graph\n",
104 | "model = x * w + c\n",
105 | "\n",
106 | "with tf.Session() as sess:\n",
107 | " # we need to initialize all variables otherwise it will throw an error\n",
108 | " sess.run(tf.global_variables_initializer())\n",
109 | " print(sess.run(model, feed_dict={x: 2.0}))"
110 | ]
111 | },
112 | {
113 | "cell_type": "markdown",
114 | "metadata": {},
115 | "source": [
116 | "In the example above, we defined a very simple linear model with a single input, weight, and bias. We initialized the variables with constant values and ran the graph to print the initial output. We will later see how to train these variables to fit a function to data."
117 | ]
118 | }
119 | ],
120 | "metadata": {
121 | "kernelspec": {
122 | "display_name": "Python 3",
123 | "language": "python",
124 | "name": "python3"
125 | },
126 | "language_info": {
127 | "codemirror_mode": {
128 | "name": "ipython",
129 | "version": 3
130 | },
131 | "file_extension": ".py",
132 | "mimetype": "text/x-python",
133 | "name": "python",
134 | "nbconvert_exporter": "python",
135 | "pygments_lexer": "ipython3",
136 | "version": "3.6.4"
137 | }
138 | },
139 | "nbformat": 4,
140 | "nbformat_minor": 2
141 | }
142 |
--------------------------------------------------------------------------------
/S1/S1a_live.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 | # define the inputs
4 | x = tf.placeholder(tf.float32)
5 | y = tf.placeholder(tf.float32)
6 |
7 | # define the graph
8 | g_mean = tf.sqrt(x * y)
9 |
10 | # run the graph
11 | with tf.Session() as sess:
12 | res = sess.run(g_mean, feed_dict={x: 2, y: 8})
13 | print(res)
--------------------------------------------------------------------------------
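
The notebook mentions eager execution as the alternative to the graph/session model. As a point of comparison, here is a minimal sketch of the same geometric-mean computation with eager execution enabled; this assumes TensorFlow 1.7+, where tf.enable_eager_execution is available, and is purely illustrative:

    import tensorflow as tf

    tf.enable_eager_execution()  # must be called once, before any other TF ops

    # no placeholders and no session: operations run immediately and return values
    x = tf.constant(2.0)
    y = tf.constant(8.0)
    g_mean = tf.sqrt(x * y)
    print(g_mean.numpy())  # 4.0

--------------------------------------------------------------------------------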
/S1/S1b_live.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 | # define the inputs
4 | x = tf.placeholder(tf.float32)
5 |
6 | with tf.variable_scope("linear_model", reuse=tf.AUTO_REUSE):
7 | w = tf.get_variable("weight", dtype=tf.float32, initializer=tf.constant(0.1))
8 | c = tf.get_variable("bias", dtype=tf.float32, initializer=tf.constant(0.0))
9 | model = x * w + c
10 |
11 | with tf.Session() as sess:
12 | sess.run(tf.global_variables_initializer())
13 | print(sess.run(model, feed_dict={x: 2.0}))
--------------------------------------------------------------------------------
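
The notebook notes that we will later see how to train these variables to fit a function to data. As a quick preview, here is a minimal sketch that fits the same linear model to a toy dataset (y = 2x + 1) with plain gradient descent; the data points, learning rate, and step count are illustrative choices, not part of the original session:

    import tensorflow as tf

    # inputs and targets for a toy fit of y = 2x + 1
    x = tf.placeholder(tf.float32)
    y_true = tf.placeholder(tf.float32)

    with tf.variable_scope("linear_model", reuse=tf.AUTO_REUSE):
        w = tf.get_variable("weight", dtype=tf.float32, initializer=tf.constant(0.1))
        c = tf.get_variable("bias", dtype=tf.float32, initializer=tf.constant(0.0))

    model = x * w + c
    loss = tf.reduce_mean(tf.square(model - y_true))
    train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for _ in range(200):
            sess.run(train_op, feed_dict={x: [0., 1., 2., 3.], y_true: [1., 3., 5., 7.]})
        print(sess.run([w, c]))  # approaches [2.0, 1.0]

--------------------------------------------------------------------------------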
/S2_live.py:
--------------------------------------------------------------------------------
1 | """ Deep Learning with TensorFlow
2 | Coding session 2: Training a Multilayer Perceptron
3 |
4 | Let's train a simple neural network that classifies handwritten digits using the MNIST dataset.
5 | Video: https://youtu.be/b7ykcBzz9wo
6 | """
7 |
8 | import tensorflow as tf
9 |
10 | def preprocess_data(im, label):
11 | im = tf.cast(im, tf.float32)
12 | im = im / 127.5
13 | im = im - 1
14 | im = tf.reshape(im, [-1])
15 | return im, label
16 |
17 | def data_layer(data_tensor, num_threads=8, prefetch_buffer=100, batch_size=32):
18 | with tf.variable_scope("data"):
19 | dataset = tf.data.Dataset.from_tensor_slices(data_tensor)
20 | dataset = dataset.shuffle(buffer_size=60000).repeat()
21 | dataset = dataset.map(preprocess_data, num_parallel_calls=num_threads)
22 | dataset = dataset.batch(batch_size)
23 | dataset = dataset.prefetch(prefetch_buffer)
24 | iterator = dataset.make_one_shot_iterator()
25 | return iterator
26 |
27 | def model(input_layer, num_classes=10):
28 | with tf.variable_scope("model"):
29 | net = tf.layers.dense(input_layer, 512)
30 | net = tf.nn.relu(net)
31 | net = tf.layers.dense(net, num_classes)
32 | return net
33 |
34 | def loss_functions(logits, labels, num_classes=10):
35 | with tf.variable_scope("loss"):
36 | target_prob = tf.one_hot(labels, num_classes)
37 | total_loss = tf.losses.softmax_cross_entropy(target_prob, logits)
38 | return total_loss
39 |
40 | def optimizer_func(total_loss, global_step, learning_rate=0.1):
41 | with tf.variable_scope("optimizer"):
42 | optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
43 | optimizer = optimizer.minimize(total_loss, global_step=global_step)
44 | return optimizer
45 |
46 | def performance_metric(logits, labels):
47 | with tf.variable_scope("performance_metric"):
48 | preds = tf.argmax(logits, axis=1)
49 | labels = tf.cast(labels, tf.int64)
50 | corrects = tf.equal(preds, labels)
51 | accuracy = tf.reduce_mean(tf.cast(corrects, tf.float32))
52 | return accuracy
53 |
54 | def train(data_tensor):
55 | global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name="iter_number")
56 |
57 | # training graph
58 | images, labels = data_layer(data_tensor).get_next()
59 | logits = model(images)
60 | loss = loss_functions(logits, labels)
61 | optimizer = optimizer_func(loss, global_step)
62 | accuracy = performance_metric(logits, labels)
63 |
64 | # start training
65 | num_iter = 10000
66 | log_iter = 1000
67 | with tf.Session() as sess:
68 | sess.run(tf.global_variables_initializer())
69 | streaming_loss = 0
70 | streaming_accuracy = 0
71 |
72 | for i in range(1, num_iter + 1):
73 | _, loss_batch, acc_batch = sess.run([optimizer, loss, accuracy])
74 | streaming_loss += loss_batch
75 | streaming_accuracy += acc_batch
76 | if i % log_iter == 0:
77 | print("Iteration: {}, Streaming loss: {:.2f}, Streaming accuracy: {:.2f}"
78 | .format(i, streaming_loss/log_iter, streaming_accuracy/log_iter))
79 | streaming_loss = 0
80 | streaming_accuracy = 0
81 |
82 | if __name__ == "__main__":
83 | # It's very easy to load the MNIST dataset through the Keras module.
84 | # Keras is a high-level neural network API that has become a part of TensorFlow since version 1.2.
85 | # Therefore, we don't need to install Keras separately.
86 | # In the upcoming lectures we will also see how to load and preprocess custom data.
87 | data_train, data_val = tf.keras.datasets.mnist.load_data()
88 |
89 | # The training set has 60,000 samples where each sample is a 28x28 grayscale image.
90 |     # Each of these samples has a single label. Similarly, the validation set has 10,000 images and corresponding labels.
91 | # We can verify this by printing the shapes of the loaded tensors
92 | print(data_train[0].shape, data_train[1].shape, data_val[0].shape, data_val[1].shape)
93 |
94 | # Let the training begin!
95 | train(data_tensor=data_train)
96 |
97 |     # After only a few epochs, we get a model that classifies the handwritten digits in the training set
98 | # with 98% accuracy. So far we haven't used the validation set at all.
99 | # You might wonder why we need a separate validation set in the first place.
100 |     # The answer is that measuring performance on unseen data tells us how well the model generalizes, i.e., its actual performance.
101 | # We will talk about that in the next session.
--------------------------------------------------------------------------------
/S3_live.py:
--------------------------------------------------------------------------------
1 | """ Deep Learning with TensorFlow
2 | Coding session 3: Setting up the training and validation pipeline
3 |
4 | In the previous session we trained a model without keeping track of how it's
5 | doing on a validation set. Let's pick up where we left off and modify our code
6 | from the previous session to keep track of validation accuracy while training.
7 | """
8 |
9 | import tensorflow as tf
10 | import os
11 |
12 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3'
13 |
14 | def preprocess_data(im, label):
15 | im = tf.cast(im, tf.float32)
16 | im = im / 127.5
17 | im = im - 1
18 | im = tf.reshape(im, [-1])
19 | return im, label
20 |
21 | # We will be using the same data pipeline for both training and validation sets
22 | # So let's create a helper function for that
23 | def create_dataset_pipeline(data_tensor, is_train=True, num_threads=8, prefetch_buffer=100, batch_size=32):
24 | dataset = tf.data.Dataset.from_tensor_slices(data_tensor)
25 | if is_train:
26 | dataset = dataset.shuffle(buffer_size=60000).repeat()
27 | dataset = dataset.map(preprocess_data, num_parallel_calls=num_threads)
28 | dataset = dataset.batch(batch_size)
29 | dataset = dataset.prefetch(prefetch_buffer)
30 | return dataset
31 |
32 | def data_layer():
33 | with tf.variable_scope("data"):
34 | data_train, data_val = tf.keras.datasets.mnist.load_data()
35 | dataset_train = create_dataset_pipeline(data_train, is_train=True)
36 | dataset_val = create_dataset_pipeline(data_val, is_train=False, batch_size=1)
37 | iterator = tf.data.Iterator.from_structure(dataset_train.output_types, dataset_train.output_shapes)
38 | init_op_train = iterator.make_initializer(dataset_train)
39 | init_op_val = iterator.make_initializer(dataset_val)
40 | return iterator, init_op_train, init_op_val
41 |
42 | def model(input_layer, num_classes=10):
43 | with tf.variable_scope("model"):
44 | net = tf.layers.dense(input_layer, 512)
45 | net = tf.nn.relu(net)
46 | net = tf.layers.dense(net, num_classes)
47 | return net
48 |
49 | def loss_functions(logits, labels, num_classes=10):
50 | with tf.variable_scope("loss"):
51 | target_prob = tf.one_hot(labels, num_classes)
52 | total_loss = tf.losses.softmax_cross_entropy(target_prob, logits)
53 | return total_loss
54 |
55 | def optimizer_func(total_loss, global_step, learning_rate=0.1):
56 | with tf.variable_scope("optimizer"):
57 | optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
58 | optimizer = optimizer.minimize(total_loss, global_step=global_step)
59 | return optimizer
60 |
61 | def performance_metric(logits, labels):
62 | with tf.variable_scope("performance_metric"):
63 | preds = tf.argmax(logits, axis=1)
64 | labels = tf.cast(labels, tf.int64)
65 | corrects = tf.equal(preds, labels)
66 | accuracy = tf.reduce_mean(tf.cast(corrects, tf.float32))
67 | return accuracy
68 |
69 | def train():
70 | global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name="iter_number")
71 |
72 | # define the training graph
73 | iterator, init_op_train, init_op_val = data_layer()
74 | images, labels = iterator.get_next()
75 | logits = model(images)
76 | loss = loss_functions(logits, labels)
77 | optimizer = optimizer_func(loss, global_step)
78 | accuracy = performance_metric(logits, labels)
79 |
80 | # start training
81 | num_iter = 18750 # 10 epochs
82 | log_iter = 1875
83 | val_iter = 1875
84 | with tf.Session() as sess:
85 | sess.run(tf.global_variables_initializer())
86 | sess.run(init_op_train)
87 |
88 | streaming_loss = 0
89 | streaming_accuracy = 0
90 |
91 | for i in range(1, num_iter + 1):
92 | _, loss_batch, acc_batch = sess.run([optimizer, loss, accuracy])
93 | streaming_loss += loss_batch
94 | streaming_accuracy += acc_batch
95 | if i % log_iter == 0:
96 | print("Iteration: {}, Streaming loss: {:.2f}, Streaming accuracy: {:.2f}"
97 | .format(i, streaming_loss/log_iter, streaming_accuracy/log_iter))
98 | streaming_loss = 0
99 | streaming_accuracy = 0
100 |
101 | if i % val_iter == 0:
102 | sess.run(init_op_val)
103 | validation_accuracy = 0
104 |                 num_val_batches = 0  # use a separate counter; num_iter is the outer loop bound
105 |                 while True:
106 |                     try:
107 |                         acc_batch = sess.run(accuracy)
108 |                         validation_accuracy += acc_batch
109 |                         num_val_batches += 1
110 |                     except tf.errors.OutOfRangeError:
111 |                         validation_accuracy /= num_val_batches
112 | print("Iteration: {}, Validation accuracy: {:.2f}".format(i, validation_accuracy))
113 | sess.run(init_op_train) # switch back to training set
114 | break
115 |
116 | if __name__ == "__main__":
117 | train()
118 |
--------------------------------------------------------------------------------
/S4_live.py:
--------------------------------------------------------------------------------
1 | """ Deep Learning with TensorFlow
2 | Live coding session 4: Regularization, saving and resuming from checkpoints, basics of TensorBoard
3 |
4 | In the previous session, we wrote this code to train a simple model in TensorFlow.
5 | In this session, we will train a deeper model, regularize it, and visualize it in TensorBoard.
6 | """
7 |
8 | import tensorflow as tf
9 | import os
10 |
11 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3'
12 |
13 | def preprocess_data(im, label):
14 | im = tf.cast(im, tf.float32)
15 | im = im / 127.5
16 | im = im - 1
17 | im = tf.reshape(im, [-1])
18 | return im, label
19 |
20 | # We will be using the same data pipeline for both training and validation sets
21 | # So let's create a helper function for that
22 | def create_dataset_pipeline(data_tensor, is_train=True, num_threads=8, prefetch_buffer=100, batch_size=32):
23 | dataset = tf.data.Dataset.from_tensor_slices(data_tensor)
24 | if is_train:
25 | dataset = dataset.shuffle(buffer_size=60000).repeat()
26 | dataset = dataset.map(preprocess_data, num_parallel_calls=num_threads)
27 | dataset = dataset.batch(batch_size)
28 | dataset = dataset.prefetch(prefetch_buffer)
29 | return dataset
30 |
31 | def data_layer():
32 | with tf.variable_scope("data"):
33 | data_train, data_val = tf.keras.datasets.mnist.load_data()
34 | dataset_train = create_dataset_pipeline(data_train, is_train=True)
35 | dataset_val = create_dataset_pipeline(data_val, is_train=False, batch_size=1)
36 | iterator = tf.data.Iterator.from_structure(dataset_train.output_types, dataset_train.output_shapes)
37 | init_op_train = iterator.make_initializer(dataset_train)
38 | init_op_val = iterator.make_initializer(dataset_val)
39 | return iterator, init_op_train, init_op_val
40 |
41 | ########################################################################
42 | def model(input_layer, num_classes=10):
43 | with tf.variable_scope("model"):
44 | reg = tf.contrib.layers.l2_regularizer(0.00001)
45 | net = input_layer
46 | for i in range(3):
47 | net = tf.layers.dense(net,
48 | units=512,
49 | kernel_regularizer=reg)
50 | net = tf.nn.relu(net)
51 | net = tf.layers.dropout(net, rate=0.2)
52 | net = tf.layers.dense(net, num_classes)
53 | return net
54 |
55 | def loss_functions(logits, labels, num_classes=10):
56 | with tf.variable_scope("loss"):
57 | target_prob = tf.one_hot(labels, num_classes)
58 | tf.losses.softmax_cross_entropy(target_prob, logits)
59 | total_loss = tf.losses.get_total_loss() # include regularization loss (I forgot to add this in the video)
60 | return total_loss
61 | ########################################################################
62 |
63 | def optimizer_func(total_loss, global_step, learning_rate=0.1):
64 | with tf.variable_scope("optimizer"):
65 | optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
66 | optimizer = optimizer.minimize(total_loss, global_step=global_step)
67 | return optimizer
68 |
69 | def performance_metric(logits, labels):
70 | with tf.variable_scope("performance_metric"):
71 | preds = tf.argmax(logits, axis=1)
72 | labels = tf.cast(labels, tf.int64)
73 | corrects = tf.equal(preds, labels)
74 | accuracy = tf.reduce_mean(tf.cast(corrects, tf.float32))
75 | return accuracy
76 |
77 | def train():
78 | global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name="iter_number")
79 |
80 | # define the training graph
81 | iterator, init_op_train, init_op_val = data_layer()
82 | images, labels = iterator.get_next()
83 | logits = model(images)
84 | loss = loss_functions(logits, labels)
85 | optimizer = optimizer_func(loss, global_step)
86 | accuracy = performance_metric(logits, labels)
87 |
88 | ########################################################################
89 | # summary placeholders
90 | streaming_loss_p = tf.placeholder(tf.float32)
91 | streaming_acc_p = tf.placeholder(tf.float32)
92 | val_acc_p = tf.placeholder(tf.float32)
93 | val_summ_ops = tf.summary.scalar('validation_acc', val_acc_p)
94 | train_summ_ops = tf.summary.merge([
95 | tf.summary.scalar('streaming_loss', streaming_loss_p),
96 | tf.summary.scalar('streaming_accuracy', streaming_acc_p)
97 | ])
98 | ########################################################################
99 |
100 | # start training
101 | num_iter = 18750 # 10 epochs
102 | log_iter = 1875
103 | val_iter = 1875
104 | with tf.Session() as sess:
105 | sess.run(tf.global_variables_initializer())
106 | sess.run(init_op_train)
107 |
108 | ########################################################################
109 | # logs for TensorBoard
110 | logdir = 'logs'
111 | writer = tf.summary.FileWriter(logdir, sess.graph) # visualize the graph
112 |
113 | # load / save checkpoints
114 | checkpoint_path = 'checkpoints'
115 | saver = tf.train.Saver(max_to_keep=None)
116 | ckpt = tf.train.get_checkpoint_state(checkpoint_path)
117 |
118 | # resume training if a checkpoint exists
119 | if ckpt and ckpt.model_checkpoint_path:
120 | saver.restore(sess, ckpt.model_checkpoint_path)
121 | print("Loaded parameters from {}".format(ckpt.model_checkpoint_path))
122 |
123 | initial_step = global_step.eval()
124 | ########################################################################
125 |
126 | streaming_loss = 0
127 | streaming_accuracy = 0
128 |
129 | for i in range(initial_step, num_iter + 1): #################################### initial step
130 | _, loss_batch, acc_batch = sess.run([optimizer, loss, accuracy])
131 | streaming_loss += loss_batch
132 | streaming_accuracy += acc_batch
133 | if i % log_iter == 0:
134 | print("Iteration: {}, Streaming loss: {:.2f}, Streaming accuracy: {:.2f}"
135 | .format(i, streaming_loss/log_iter, streaming_accuracy/log_iter))
136 |
137 | #####################################################################################
138 | # save to log file for TensorBoard
139 | summary_train = sess.run(train_summ_ops, feed_dict={streaming_loss_p: streaming_loss,
140 | streaming_acc_p: streaming_accuracy})
141 | writer.add_summary(summary_train, global_step=i)
142 | #####################################################################################
143 |
144 | streaming_loss = 0
145 | streaming_accuracy = 0
146 |
147 | if i % val_iter == 0:
148 | #####################################################################################
149 | saver.save(sess, os.path.join(checkpoint_path, 'checkpoint'), global_step=global_step)
150 | print("Model saved!")
151 | #####################################################################################
152 |
153 | sess.run(init_op_val)
154 | validation_accuracy = 0
155 |                     num_val_batches = 0  # use a separate counter; num_iter is the outer loop bound
156 |                     while True:
157 |                         try:
158 |                             acc_batch = sess.run(accuracy)
159 |                             validation_accuracy += acc_batch
160 |                             num_val_batches += 1
161 |                         except tf.errors.OutOfRangeError:
162 |                             validation_accuracy /= num_val_batches
163 | print("Iteration: {}, Validation accuracy: {:.2f}".format(i, validation_accuracy))
164 |
165 | ###############################################################################
166 | # save log file to TensorBoard
167 | summary_val = sess.run(val_summ_ops, feed_dict={val_acc_p: validation_accuracy})
168 | writer.add_summary(summary_val, global_step=i)
169 | ###############################################################################
170 |
171 | sess.run(init_op_train) # switch back to training set
172 | break
173 | writer.close()
174 |
175 | if __name__ == "__main__":
176 | train()
177 |
--------------------------------------------------------------------------------
/S5_live.py:
--------------------------------------------------------------------------------
1 | """ Deep Learning with TensorFlow
2 | Live coding session 5: convolutional neural networks, batchnorm, learning rate schedules, optimizers
3 | """
4 |
5 | import tensorflow as tf
6 | import os
7 |
8 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3'
9 |
10 | def preprocess_data(im, label):
11 | im = tf.cast(im, tf.float32)
12 | im = im / 127.5
13 | im = im - 1
14 | # im = tf.reshape(im, [-1])
15 | return im, label
16 |
17 | # We will be using the same data pipeline for both training and validation sets
18 | # So let's create a helper function for that
19 | def create_dataset_pipeline(data_tensor, is_train=True, num_threads=8, prefetch_buffer=100, batch_size=32):
20 | dataset = tf.data.Dataset.from_tensor_slices(data_tensor)
21 | if is_train:
22 | dataset = dataset.shuffle(buffer_size=60000).repeat()
23 | dataset = dataset.map(preprocess_data, num_parallel_calls=num_threads)
24 | dataset = dataset.batch(batch_size)
25 | dataset = dataset.prefetch(prefetch_buffer)
26 | return dataset
27 |
28 | def data_layer():
29 | with tf.variable_scope("data"):
30 | data_train, data_val = tf.keras.datasets.mnist.load_data()
31 | dataset_train = create_dataset_pipeline(data_train, is_train=True)
32 | dataset_val = create_dataset_pipeline(data_val, is_train=False, batch_size=1)
33 | iterator = tf.data.Iterator.from_structure(dataset_train.output_types, dataset_train.output_shapes)
34 | init_op_train = iterator.make_initializer(dataset_train)
35 | init_op_val = iterator.make_initializer(dataset_val)
36 | return iterator, init_op_train, init_op_val
37 |
38 | ########################################################################
39 | def model(input_layer, training, num_classes=10):
40 | with tf.variable_scope("model"):
41 | net = tf.expand_dims(input_layer, axis=3)
42 |
43 | net = tf.layers.conv2d(net, 20, (5, 5))
44 | net = tf.layers.batch_normalization(net, training=training)
45 | net = tf.nn.relu(net)
46 | net = tf.layers.max_pooling2d(net, pool_size=(2, 2), strides=(2, 2))
47 |
48 | net = tf.layers.conv2d(net, 50, (5, 5))
49 | net = tf.layers.batch_normalization(net, training=training)
50 | net = tf.nn.relu(net)
51 | net = tf.layers.max_pooling2d(net, pool_size=(2, 2), strides=(2, 2))
52 |
53 | net = tf.layers.flatten(net)
54 | net = tf.layers.dense(net, 500)
55 | net = tf.nn.relu(net) # I forgot to add this ReLU in the video
56 | net = tf.layers.dropout(net, rate=0.2, training=training) # I forgot the training argument in the video
57 | net = tf.layers.dense(net, num_classes)
58 | return net
59 |
60 | def loss_functions(logits, labels, num_classes=10):
61 | with tf.variable_scope("loss"):
62 | target_prob = tf.one_hot(labels, num_classes)
63 | tf.losses.softmax_cross_entropy(target_prob, logits)
64 | total_loss = tf.losses.get_total_loss() # include regularization loss
65 | return total_loss
66 |
67 | def optimizer_func_momentum(total_loss, global_step, learning_rate=0.01):
68 | with tf.variable_scope("optimizer"):
69 | lr_schedule = tf.train.exponential_decay(learning_rate=learning_rate,
70 | global_step=global_step,
71 | decay_steps=1875,
72 | decay_rate=0.9,
73 | staircase=True)
74 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
75 | with tf.control_dependencies(update_ops):
76 | optimizer = tf.train.MomentumOptimizer(learning_rate=lr_schedule, momentum=0.9)
77 | optimizer = optimizer.minimize(total_loss, global_step=global_step)
78 | return optimizer
79 |
80 | def optimizer_func_adam(total_loss, global_step, learning_rate=0.01):
81 | with tf.variable_scope("optimizer"):
82 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
83 | with tf.control_dependencies(update_ops):
84 | optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate, epsilon=0.1)
85 | optimizer = optimizer.minimize(total_loss, global_step=global_step)
86 | return optimizer
87 | ########################################################################
88 |
89 | def performance_metric(logits, labels):
90 | with tf.variable_scope("performance_metric"):
91 | preds = tf.argmax(logits, axis=1)
92 | labels = tf.cast(labels, tf.int64)
93 | corrects = tf.equal(preds, labels)
94 | accuracy = tf.reduce_mean(tf.cast(corrects, tf.float32))
95 | return accuracy
96 |
97 | def train():
98 | global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name="iter_number")
99 |
100 | # define the training graph
101 | iterator, init_op_train, init_op_val = data_layer()
102 | images, labels = iterator.get_next()
103 | training = tf.placeholder(tf.bool)
104 | logits = model(images, training) ##############################
105 | loss = loss_functions(logits, labels)
106 | optimizer = optimizer_func_adam(loss, global_step) ##############################
107 | accuracy = performance_metric(logits, labels)
108 |
109 | # summary placeholders
110 | streaming_loss_p = tf.placeholder(tf.float32)
111 | streaming_acc_p = tf.placeholder(tf.float32)
112 | val_acc_p = tf.placeholder(tf.float32)
113 | val_summ_ops = tf.summary.scalar('validation_acc', val_acc_p)
114 | train_summ_ops = tf.summary.merge([
115 | tf.summary.scalar('streaming_loss', streaming_loss_p),
116 | tf.summary.scalar('streaming_accuracy', streaming_acc_p)
117 | ])
118 |
119 | # start training
120 | num_iter = 18750 # 10 epochs
121 | log_iter = 1875
122 | val_iter = 1875
123 | with tf.Session() as sess:
124 | sess.run(tf.global_variables_initializer())
125 | sess.run(init_op_train)
126 |
127 | # logs for TensorBoard
128 | logdir = 'logs'
129 | writer = tf.summary.FileWriter(logdir, sess.graph) # visualize the graph
130 |
131 | # load / save checkpoints
132 | checkpoint_path = 'checkpoints'
133 | saver = tf.train.Saver(max_to_keep=None)
134 | ckpt = tf.train.get_checkpoint_state(checkpoint_path)
135 |
136 | # resume training if a checkpoint exists
137 | if ckpt and ckpt.model_checkpoint_path:
138 | saver.restore(sess, ckpt.model_checkpoint_path)
139 | print("Loaded parameters from {}".format(ckpt.model_checkpoint_path))
140 |
141 | initial_step = global_step.eval()
142 |
143 | streaming_loss = 0
144 | streaming_accuracy = 0
145 |
146 | for i in range(initial_step, num_iter + 1):
147 | _, loss_batch, acc_batch = sess.run([optimizer, loss, accuracy], feed_dict={training: True}) ##############################
148 | streaming_loss += loss_batch
149 | streaming_accuracy += acc_batch
150 | if i % log_iter == 0:
151 | print("Iteration: {}, Streaming loss: {:.2f}, Streaming accuracy: {:.2f}"
152 | .format(i, streaming_loss/log_iter, streaming_accuracy/log_iter))
153 |
154 | # save to log file for TensorBoard
155 | summary_train = sess.run(train_summ_ops, feed_dict={streaming_loss_p: streaming_loss,
156 | streaming_acc_p: streaming_accuracy})
157 | writer.add_summary(summary_train, global_step=i)
158 |
159 | streaming_loss = 0
160 | streaming_accuracy = 0
161 |
162 | if i % val_iter == 0:
163 | saver.save(sess, os.path.join(checkpoint_path, 'checkpoint'), global_step=global_step)
164 | print("Model saved!")
165 |
166 | sess.run(init_op_val)
167 | validation_accuracy = 0
168 |                     num_val_batches = 0  # use a separate counter; num_iter is the outer loop bound
169 |                     while True:
170 |                         try:
171 |                             acc_batch = sess.run(accuracy, feed_dict={training: False}) ##############################
172 |                             validation_accuracy += acc_batch
173 |                             num_val_batches += 1
174 |                         except tf.errors.OutOfRangeError:
175 |                             validation_accuracy /= num_val_batches
176 | print("Iteration: {}, Validation accuracy: {:.2f}".format(i, validation_accuracy))
177 |
178 | # save log file to TensorBoard
179 | summary_val = sess.run(val_summ_ops, feed_dict={val_acc_p: validation_accuracy})
180 | writer.add_summary(summary_val, global_step=i)
181 |
182 | sess.run(init_op_train) # switch back to training set
183 | break
184 | writer.close()
185 |
186 | if __name__ == "__main__":
187 | train()
188 |
--------------------------------------------------------------------------------
/S6/create_tf_records.py:
--------------------------------------------------------------------------------
1 | """ Converts an image dataset into TFRecords. The dataset should be organized as:
2 |
3 | base_dir:
4 | -- class_name1
5 | ---- image_name.jpg
6 | ...
7 | -- class_name2
8 | ---- image_name.jpg
9 | ...
10 | -- class_name3
11 | ---- image_name.jpg
12 | ...
13 |
14 | Example:
15 | $ python create_tf_records.py --input_dir ./dataset --output_dir ./tfrecords --num_shards 10 --split_ratio 0.2
16 | """
17 |
18 | import tensorflow as tf
19 | import os, glob
20 | import argparse
21 | import random
22 |
23 | def _int64_feature(value):
24 | return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
25 |
26 | def _bytes_feature(value):
27 | return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
28 |
29 | def _create_tfexample(image_data, label):
30 | example = tf.train.Example(features=tf.train.Features(feature={
31 | 'image': _bytes_feature(image_data),
32 | 'label': _int64_feature(label)
33 | }))
34 | return example
35 |
36 | def enumerate_classes(class_list, sort=True):
37 | class_ids = {}
38 | class_id_counter = 0
39 |
40 | if sort:
41 | class_list.sort()
42 |
43 | for class_name in class_list:
44 | if class_name not in class_ids:
45 | class_ids[class_name] = class_id_counter
46 | class_id_counter += 1
47 |
48 | return class_ids
49 |
50 | def create_tfrecords(save_dir, dataset_name, filenames, class_ids, num_shards):
51 |
52 |     im_per_shard = len(filenames) // num_shards + 1
53 |
54 | for shard in range(num_shards):
55 | output_filename = os.path.join(save_dir, '{}_{:03d}-of-{:03d}.tfrecord'
56 | .format(dataset_name, shard, num_shards))
57 | print('Writing into {}'.format(output_filename))
58 | filenames_shard = filenames[shard*im_per_shard:(shard+1)*im_per_shard]
59 |
60 | with tf.python_io.TFRecordWriter(output_filename) as tfrecord_writer:
61 |
62 | for filename in filenames_shard:
63 | image = tf.gfile.FastGFile(filename, 'rb').read()
64 | class_name = os.path.basename(os.path.dirname(filename))
65 | label = class_ids[class_name]
66 |
67 | example = _create_tfexample(image, label)
68 | tfrecord_writer.write(example.SerializeToString())
69 |
70 | print('Finished writing {} images into TFRecords'.format(len(filenames)))
71 |
72 | def main(args):
73 |
74 | supported_formats = ['*.jpg', '*.JPG', '*.jpeg', '*.JPEG']
75 | filenames = []
76 | for extension in supported_formats:
77 | pattern = os.path.join(args.input_dir, '**', extension)
78 |         filenames.extend(glob.glob(pattern, recursive=False))  # with recursive=False, '**' matches exactly one level (the class directories)
79 |
80 | random.seed(args.seed)
81 | random.shuffle(filenames)
82 |
83 | num_test = int(args.split_ratio * len(filenames))
84 | num_shards_test = int(args.split_ratio * args.num_shards)
85 | num_shards_train = args.num_shards - num_shards_test
86 |
87 | # write the list of classes and their corresponding ids to a file
88 | class_list = [name for name in os.listdir(args.input_dir)
89 | if os.path.isdir(os.path.join(args.input_dir, name))]
90 | class_ids = enumerate_classes(class_list)
91 | with open(os.path.join(args.output_dir, 'classes.txt'), 'w') as f:
92 |         for class_name in class_ids:
93 |             print('{}:{}'.format(class_ids[class_name], class_name), file=f)
94 |
95 | # create TFRecords for the training and test sets
96 | create_tfrecords(save_dir=args.output_dir,
97 | dataset_name='train',
98 | filenames=filenames[num_test:],
99 | class_ids=class_ids,
100 | num_shards=num_shards_train)
101 | create_tfrecords(save_dir=args.output_dir,
102 | dataset_name='test',
103 | filenames=filenames[:num_test],
104 | class_ids=class_ids,
105 | num_shards=num_shards_test)
106 |
107 | if __name__ == '__main__':
108 | parser = argparse.ArgumentParser()
109 | parser.add_argument('--input_dir', type=str,
110 | help='path to the directory where the images will be read from')
111 | parser.add_argument('--output_dir', type=str,
112 | help='path to the directory where the TFRecords will be saved to')
113 | parser.add_argument('--num_shards', type=int,
114 | help='total number of shards')
115 | parser.add_argument('--split_ratio', type=float, default=0.2,
116 | help='ratio of number of images in the test set to the total number of images')
117 | parser.add_argument('--seed', type=int, default=42,
118 | help='random seed for repeatable train/test splits')
119 | args = parser.parse_args()
120 | main(args)
121 |
--------------------------------------------------------------------------------
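
The records written above are consumed by the trainer in this session. For reference, here is a minimal sketch of the reading side using the TF 1.x dataset API; the feature keys and types mirror what _create_tfexample writes, while the glob pattern, resize resolution, and batch size are illustrative choices (see S6/trainer.py for the actual pipeline):

    import tensorflow as tf

    def parse_tfexample(serialized):
        # keys and types must match what _create_tfexample writes
        features = tf.parse_single_example(serialized, features={
            'image': tf.FixedLenFeature([], tf.string),
            'label': tf.FixedLenFeature([], tf.int64),
        })
        image = tf.image.decode_jpeg(features['image'], channels=3)
        image = tf.image.resize_images(image, [224, 224])  # batching needs a fixed size
        return image, features['label']

    filenames = tf.gfile.Glob('./tfrecords/train_*.tfrecord')
    dataset = tf.data.TFRecordDataset(filenames)
    dataset = dataset.map(parse_tfexample).batch(32)

--------------------------------------------------------------------------------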
/S6/freeze_model.py:
--------------------------------------------------------------------------------
1 | """ Freezes a checkpoint into a single .pb file that encapsulates both the graph and the weights
2 | Example:
3 | $ python freeze_model.py --checkpoint_path ./checkpoints
4 | """
5 |
6 | import tensorflow as tf
7 | import argparse
8 | from nets import mobilenet_v1
9 | import os
10 |
11 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3'
12 |
13 | def freeze_graph(checkpoint_path, output_node_name, outfile):
14 | input_layer = tf.placeholder(tf.uint8, shape=[None, None, 3], name='input')
15 | with tf.variable_scope('input_scaling'):
16 | image = tf.expand_dims(input_layer, axis=0)
17 | image = tf.image.resize_bilinear(image, [224, 224])
18 | image = tf.cast(image, tf.float32)
19 | image = image / 127.5
20 | image = image - 1
21 |
22 | logits, _ = mobilenet_v1.mobilenet_v1(image, num_classes=2, is_training=False)
23 | preds = tf.squeeze(tf.nn.softmax(logits), name='preds')
24 |
25 | with tf.Session() as sess:
26 | ckpt = tf.train.get_checkpoint_state(checkpoint_path)
27 | saver = tf.train.Saver()
28 | saver.restore(sess, ckpt.model_checkpoint_path)
29 |
30 | output_graph_def = tf.graph_util.convert_variables_to_constants(
31 | sess, tf.get_default_graph().as_graph_def(), [output_node_name])
32 |
33 | with tf.gfile.GFile(outfile, 'wb') as f:
34 | f.write(output_graph_def.SerializeToString())
35 |
36 | # print a list of ops
37 | for op in output_graph_def.node:
38 | print(op.name)
39 |
40 | print('Saved frozen model to {}'.format(outfile))
41 | print('{:d} ops in the final graph.'.format(len(output_graph_def.node)))
42 |
43 | if __name__ == '__main__':
44 | parser = argparse.ArgumentParser()
45 | parser.add_argument('--checkpoint_path', type=str, default='./', help="Path to the dir where the checkpoints are saved")
46 | parser.add_argument('--output_node_name', type=str, default='preds', help="Name of the output node")
47 | parser.add_argument('--outfile', type=str, default='frozen_model.pb', help="Frozen model path")
48 | args = parser.parse_args()
49 | freeze_graph(args.checkpoint_path, args.output_node_name, args.outfile)
--------------------------------------------------------------------------------
/S6/inference.py:
--------------------------------------------------------------------------------
1 | """ Runs inference given a frozen model and a set of images
2 | Example:
3 | $ python inference.py --frozen_model frozen_model.pb --input_path ./test_images
4 | """
5 |
6 | import argparse
7 | import tensorflow as tf
8 | import os, glob
9 | import cv2
10 |
11 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3'
12 |
13 | class InferenceEngine:
14 | def __init__(self, frozen_graph_filename):
15 | with tf.gfile.GFile(frozen_graph_filename, "rb") as f:
16 | graph_def = tf.GraphDef()
17 | graph_def.ParseFromString(f.read())
18 |
19 | with tf.Graph().as_default() as graph:
20 | tf.import_graph_def(graph_def, name="Pretrained")
21 |
22 | self.graph = graph
23 |
24 | def run_inference(self, input_path):
25 | if os.path.isdir(input_path):
26 | filenames = glob.glob(os.path.join(input_path, '*.jpg'))
27 | filenames.extend(glob.glob(os.path.join(input_path, '*.jpeg')))
28 | filenames.extend(glob.glob(os.path.join(input_path, '*.png')))
29 | filenames.extend(glob.glob(os.path.join(input_path, '*.bmp')))
30 | else:
31 | filenames = [input_path]
32 |
33 | input_layer = self.graph.get_tensor_by_name('Pretrained/input:0')
34 | preds = self.graph.get_tensor_by_name('Pretrained/preds:0')
35 | pred_idx = tf.argmax(preds)
36 |
37 | with tf.Session(graph=self.graph) as sess:
38 | for filename in filenames:
39 | image = cv2.imread(filename)
40 | class_label, probs = sess.run([pred_idx, preds], feed_dict={input_layer: image})
41 | print("Label: {:d}, Probability: {:.2f} \t File: {}".format(class_label, probs[class_label], filename))
42 |
43 | if __name__ == '__main__':
44 | parser = argparse.ArgumentParser()
45 | parser.add_argument("--frozen_model", default="frozen_model.pb", type=str, help="Path to the frozen model file to import")
46 | parser.add_argument("--input_path", type=str, help="Path to the input file(s). If this is a dir all files will be processed.")
47 | args = parser.parse_args()
48 |
49 | ie = InferenceEngine(args.frozen_model)
50 | ie.run_inference(args.input_path)
51 |
52 |
--------------------------------------------------------------------------------
/S6/nets/mobilenet_v1.py:
--------------------------------------------------------------------------------
1 | # Copyright 2017 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # =============================================================================
15 | """MobileNet v1.
16 |
17 | MobileNet is a general architecture and can be used for multiple use cases.
18 | Depending on the use case, it can use different input layer size and different
19 | head (for example: embeddings, localization and classification).
20 |
21 | As described in https://arxiv.org/abs/1704.04861.
22 |
23 | MobileNets: Efficient Convolutional Neural Networks for
24 | Mobile Vision Applications
25 | Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang,
26 | Tobias Weyand, Marco Andreetto, Hartwig Adam
27 |
28 | 100% Mobilenet V1 (base) with input size 224x224:
29 |
30 | See mobilenet_v1()
31 |
32 | Layer params macs
33 | --------------------------------------------------------------------------------
34 | MobilenetV1/Conv2d_0/Conv2D: 864 10,838,016
35 | MobilenetV1/Conv2d_1_depthwise/depthwise: 288 3,612,672
36 | MobilenetV1/Conv2d_1_pointwise/Conv2D: 2,048 25,690,112
37 | MobilenetV1/Conv2d_2_depthwise/depthwise: 576 1,806,336
38 | MobilenetV1/Conv2d_2_pointwise/Conv2D: 8,192 25,690,112
39 | MobilenetV1/Conv2d_3_depthwise/depthwise: 1,152 3,612,672
40 | MobilenetV1/Conv2d_3_pointwise/Conv2D: 16,384 51,380,224
41 | MobilenetV1/Conv2d_4_depthwise/depthwise: 1,152 903,168
42 | MobilenetV1/Conv2d_4_pointwise/Conv2D: 32,768 25,690,112
43 | MobilenetV1/Conv2d_5_depthwise/depthwise: 2,304 1,806,336
44 | MobilenetV1/Conv2d_5_pointwise/Conv2D: 65,536 51,380,224
45 | MobilenetV1/Conv2d_6_depthwise/depthwise: 2,304 451,584
46 | MobilenetV1/Conv2d_6_pointwise/Conv2D: 131,072 25,690,112
47 | MobilenetV1/Conv2d_7_depthwise/depthwise: 4,608 903,168
48 | MobilenetV1/Conv2d_7_pointwise/Conv2D: 262,144 51,380,224
49 | MobilenetV1/Conv2d_8_depthwise/depthwise: 4,608 903,168
50 | MobilenetV1/Conv2d_8_pointwise/Conv2D: 262,144 51,380,224
51 | MobilenetV1/Conv2d_9_depthwise/depthwise: 4,608 903,168
52 | MobilenetV1/Conv2d_9_pointwise/Conv2D: 262,144 51,380,224
53 | MobilenetV1/Conv2d_10_depthwise/depthwise: 4,608 903,168
54 | MobilenetV1/Conv2d_10_pointwise/Conv2D: 262,144 51,380,224
55 | MobilenetV1/Conv2d_11_depthwise/depthwise: 4,608 903,168
56 | MobilenetV1/Conv2d_11_pointwise/Conv2D: 262,144 51,380,224
57 | MobilenetV1/Conv2d_12_depthwise/depthwise: 4,608 225,792
58 | MobilenetV1/Conv2d_12_pointwise/Conv2D: 524,288 25,690,112
59 | MobilenetV1/Conv2d_13_depthwise/depthwise: 9,216 451,584
60 | MobilenetV1/Conv2d_13_pointwise/Conv2D: 1,048,576 51,380,224
61 | --------------------------------------------------------------------------------
62 | Total: 3,185,088 567,716,352
63 |
64 |
65 | 75% Mobilenet V1 (base) with input size 128x128:
66 |
67 | See mobilenet_v1_075()
68 |
69 | Layer params macs
70 | --------------------------------------------------------------------------------
71 | MobilenetV1/Conv2d_0/Conv2D: 648 2,654,208
72 | MobilenetV1/Conv2d_1_depthwise/depthwise: 216 884,736
73 | MobilenetV1/Conv2d_1_pointwise/Conv2D: 1,152 4,718,592
74 | MobilenetV1/Conv2d_2_depthwise/depthwise: 432 442,368
75 | MobilenetV1/Conv2d_2_pointwise/Conv2D: 4,608 4,718,592
76 | MobilenetV1/Conv2d_3_depthwise/depthwise: 864 884,736
77 | MobilenetV1/Conv2d_3_pointwise/Conv2D: 9,216 9,437,184
78 | MobilenetV1/Conv2d_4_depthwise/depthwise: 864 221,184
79 | MobilenetV1/Conv2d_4_pointwise/Conv2D: 18,432 4,718,592
80 | MobilenetV1/Conv2d_5_depthwise/depthwise: 1,728 442,368
81 | MobilenetV1/Conv2d_5_pointwise/Conv2D: 36,864 9,437,184
82 | MobilenetV1/Conv2d_6_depthwise/depthwise: 1,728 110,592
83 | MobilenetV1/Conv2d_6_pointwise/Conv2D: 73,728 4,718,592
84 | MobilenetV1/Conv2d_7_depthwise/depthwise: 3,456 221,184
85 | MobilenetV1/Conv2d_7_pointwise/Conv2D: 147,456 9,437,184
86 | MobilenetV1/Conv2d_8_depthwise/depthwise: 3,456 221,184
87 | MobilenetV1/Conv2d_8_pointwise/Conv2D: 147,456 9,437,184
88 | MobilenetV1/Conv2d_9_depthwise/depthwise: 3,456 221,184
89 | MobilenetV1/Conv2d_9_pointwise/Conv2D: 147,456 9,437,184
90 | MobilenetV1/Conv2d_10_depthwise/depthwise: 3,456 221,184
91 | MobilenetV1/Conv2d_10_pointwise/Conv2D: 147,456 9,437,184
92 | MobilenetV1/Conv2d_11_depthwise/depthwise: 3,456 221,184
93 | MobilenetV1/Conv2d_11_pointwise/Conv2D: 147,456 9,437,184
94 | MobilenetV1/Conv2d_12_depthwise/depthwise: 3,456 55,296
95 | MobilenetV1/Conv2d_12_pointwise/Conv2D: 294,912 4,718,592
96 | MobilenetV1/Conv2d_13_depthwise/depthwise: 6,912 110,592
97 | MobilenetV1/Conv2d_13_pointwise/Conv2D: 589,824 9,437,184
98 | --------------------------------------------------------------------------------
99 | Total: 1,800,144 106,002,432
100 |
101 | """
102 |
103 | # Tensorflow mandates these.
104 | from __future__ import absolute_import
105 | from __future__ import division
106 | from __future__ import print_function
107 |
108 | from collections import namedtuple
109 | import functools
110 |
111 | import tensorflow as tf
112 |
113 | slim = tf.contrib.slim
114 |
115 | # Conv and DepthSepConv namedtuple define layers of the MobileNet architecture
116 | # Conv defines 3x3 convolution layers
117 | # DepthSepConv defines 3x3 depthwise convolution followed by 1x1 convolution.
118 | # stride is the stride of the convolution
119 | # depth is the number of channels or filters in a layer
120 | Conv = namedtuple('Conv', ['kernel', 'stride', 'depth'])
121 | DepthSepConv = namedtuple('DepthSepConv', ['kernel', 'stride', 'depth'])
122 |
123 | # MOBILENETV1_CONV_DEFS specifies the MobileNet body
124 | MOBILENETV1_CONV_DEFS = [
125 | Conv(kernel=[3, 3], stride=2, depth=32),
126 | DepthSepConv(kernel=[3, 3], stride=1, depth=64),
127 | DepthSepConv(kernel=[3, 3], stride=2, depth=128),
128 | DepthSepConv(kernel=[3, 3], stride=1, depth=128),
129 | DepthSepConv(kernel=[3, 3], stride=2, depth=256),
130 | DepthSepConv(kernel=[3, 3], stride=1, depth=256),
131 | DepthSepConv(kernel=[3, 3], stride=2, depth=512),
132 | DepthSepConv(kernel=[3, 3], stride=1, depth=512),
133 | DepthSepConv(kernel=[3, 3], stride=1, depth=512),
134 | DepthSepConv(kernel=[3, 3], stride=1, depth=512),
135 | DepthSepConv(kernel=[3, 3], stride=1, depth=512),
136 | DepthSepConv(kernel=[3, 3], stride=1, depth=512),
137 | DepthSepConv(kernel=[3, 3], stride=2, depth=1024),
138 | DepthSepConv(kernel=[3, 3], stride=1, depth=1024)
139 | ]
140 |
141 |
142 | def _fixed_padding(inputs, kernel_size, rate=1):
143 | """Pads the input along the spatial dimensions independently of input size.
144 |
145 | Pads the input such that if it was used in a convolution with 'VALID' padding,
146 | the output would have the same dimensions as if the unpadded input was used
147 | in a convolution with 'SAME' padding.
148 |
149 | Args:
150 | inputs: A tensor of size [batch, height_in, width_in, channels].
151 | kernel_size: The kernel to be used in the conv2d or max_pool2d operation.
152 | rate: An integer, rate for atrous convolution.
153 |
154 | Returns:
155 | output: A tensor of size [batch, height_out, width_out, channels] with the
156 | input, either intact (if kernel_size == 1) or padded (if kernel_size > 1).
157 | """
158 | kernel_size_effective = [kernel_size[0] + (kernel_size[0] - 1) * (rate - 1),
159 |                            kernel_size[1] + (kernel_size[1] - 1) * (rate - 1)]
160 | pad_total = [kernel_size_effective[0] - 1, kernel_size_effective[1] - 1]
161 | pad_beg = [pad_total[0] // 2, pad_total[1] // 2]
162 | pad_end = [pad_total[0] - pad_beg[0], pad_total[1] - pad_beg[1]]
163 | padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg[0], pad_end[0]],
164 | [pad_beg[1], pad_end[1]], [0, 0]])
165 | return padded_inputs
166 |
167 |
168 | def mobilenet_v1_base(inputs,
169 | final_endpoint='Conv2d_13_pointwise',
170 | min_depth=8,
171 | depth_multiplier=1.0,
172 | conv_defs=None,
173 | output_stride=None,
174 | use_explicit_padding=False,
175 | scope=None):
176 | """Mobilenet v1.
177 |
178 | Constructs a Mobilenet v1 network from inputs to the given final endpoint.
179 |
180 | Args:
181 | inputs: a tensor of shape [batch_size, height, width, channels].
182 | final_endpoint: specifies the endpoint to construct the network up to. It
183 | can be one of ['Conv2d_0', 'Conv2d_1_pointwise', 'Conv2d_2_pointwise',
184 |       'Conv2d_3_pointwise', 'Conv2d_4_pointwise', 'Conv2d_5_pointwise',
185 | 'Conv2d_6_pointwise', 'Conv2d_7_pointwise', 'Conv2d_8_pointwise',
186 | 'Conv2d_9_pointwise', 'Conv2d_10_pointwise', 'Conv2d_11_pointwise',
187 | 'Conv2d_12_pointwise', 'Conv2d_13_pointwise'].
188 | min_depth: Minimum depth value (number of channels) for all convolution ops.
189 | Enforced when depth_multiplier < 1, and not an active constraint when
190 | depth_multiplier >= 1.
191 | depth_multiplier: Float multiplier for the depth (number of channels)
192 | for all convolution ops. The value must be greater than zero. Typical
193 | usage will be to set this value in (0, 1) to reduce the number of
194 | parameters or computation cost of the model.
195 | conv_defs: A list of ConvDef namedtuples specifying the net architecture.
196 | output_stride: An integer that specifies the requested ratio of input to
197 | output spatial resolution. If not None, then we invoke atrous convolution
198 | if necessary to prevent the network from reducing the spatial resolution
199 | of the activation maps. Allowed values are 8 (accurate fully convolutional
200 | mode), 16 (fast fully convolutional mode), 32 (classification mode).
201 | use_explicit_padding: Use 'VALID' padding for convolutions, but prepad
202 | inputs so that the output dimensions are the same as if 'SAME' padding
203 | were used.
204 | scope: Optional variable_scope.
205 |
206 | Returns:
207 | tensor_out: output tensor corresponding to the final_endpoint.
208 | end_points: a set of activations for external use, for example summaries or
209 | losses.
210 |
211 | Raises:
212 | ValueError: if final_endpoint is not set to one of the predefined values,
213 | or depth_multiplier <= 0, or the target output_stride is not
214 | allowed.
215 | """
216 | depth = lambda d: max(int(d * depth_multiplier), min_depth)
217 | end_points = {}
218 |
219 | # Used to find thinned depths for each layer.
220 | if depth_multiplier <= 0:
221 | raise ValueError('depth_multiplier is not greater than zero.')
222 |
223 | if conv_defs is None:
224 | conv_defs = MOBILENETV1_CONV_DEFS
225 |
226 | if output_stride is not None and output_stride not in [8, 16, 32]:
227 | raise ValueError('Only allowed output_stride values are 8, 16, 32.')
228 |
229 | padding = 'SAME'
230 | if use_explicit_padding:
231 | padding = 'VALID'
232 | with tf.variable_scope(scope, 'MobilenetV1', [inputs]):
233 | with slim.arg_scope([slim.conv2d, slim.separable_conv2d], padding=padding):
234 | # The current_stride variable keeps track of the output stride of the
235 | # activations, i.e., the running product of convolution strides up to the
236 | # current network layer. This allows us to invoke atrous convolution
237 | # whenever applying the next convolution would result in the activations
238 | # having output stride larger than the target output_stride.
239 | current_stride = 1
240 |
241 | # The atrous convolution rate parameter.
242 | rate = 1
243 |
244 | net = inputs
245 | for i, conv_def in enumerate(conv_defs):
246 | end_point_base = 'Conv2d_%d' % i
247 |
248 | if output_stride is not None and current_stride == output_stride:
249 | # If we have reached the target output_stride, then we need to employ
250 | # atrous convolution with stride=1 and multiply the atrous rate by the
251 | # current unit's stride for use in subsequent layers.
252 | layer_stride = 1
253 | layer_rate = rate
254 | rate *= conv_def.stride
255 | else:
256 | layer_stride = conv_def.stride
257 | layer_rate = 1
258 | current_stride *= conv_def.stride
259 |
260 | if isinstance(conv_def, Conv):
261 | end_point = end_point_base
262 | if use_explicit_padding:
263 | net = _fixed_padding(net, conv_def.kernel)
264 | net = slim.conv2d(net, depth(conv_def.depth), conv_def.kernel,
265 | stride=conv_def.stride,
266 | normalizer_fn=slim.batch_norm,
267 | scope=end_point)
268 | end_points[end_point] = net
269 | if end_point == final_endpoint:
270 | return net, end_points
271 |
272 | elif isinstance(conv_def, DepthSepConv):
273 | end_point = end_point_base + '_depthwise'
274 |
275 |           # By passing filters=None,
276 |           # separable_conv2d produces only a depthwise convolution layer.
277 | if use_explicit_padding:
278 | net = _fixed_padding(net, conv_def.kernel, layer_rate)
279 | net = slim.separable_conv2d(net, None, conv_def.kernel,
280 | depth_multiplier=1,
281 | stride=layer_stride,
282 | rate=layer_rate,
283 | normalizer_fn=slim.batch_norm,
284 | scope=end_point)
285 |
286 | end_points[end_point] = net
287 | if end_point == final_endpoint:
288 | return net, end_points
289 |
290 | end_point = end_point_base + '_pointwise'
291 |
292 | net = slim.conv2d(net, depth(conv_def.depth), [1, 1],
293 | stride=1,
294 | normalizer_fn=slim.batch_norm,
295 | scope=end_point)
296 |
297 | end_points[end_point] = net
298 | if end_point == final_endpoint:
299 | return net, end_points
300 | else:
301 | raise ValueError('Unknown convolution type %s for layer %d'
302 | % (conv_def.ltype, i))
303 | raise ValueError('Unknown final endpoint %s' % final_endpoint)
304 |
305 |
306 | def mobilenet_v1(inputs,
307 | num_classes=1000,
308 | dropout_keep_prob=0.999,
309 | is_training=True,
310 | min_depth=8,
311 | depth_multiplier=1.0,
312 | conv_defs=None,
313 | prediction_fn=tf.contrib.layers.softmax,
314 | spatial_squeeze=True,
315 | reuse=None,
316 | scope='MobilenetV1',
317 | global_pool=False):
318 | """Mobilenet v1 model for classification.
319 |
320 | Args:
321 | inputs: a tensor of shape [batch_size, height, width, channels].
322 | num_classes: number of predicted classes. If 0 or None, the logits layer
323 | is omitted and the input features to the logits layer (before dropout)
324 | are returned instead.
325 |     dropout_keep_prob: the fraction of activation values that are retained
326 | is_training: whether is training or not.
327 | min_depth: Minimum depth value (number of channels) for all convolution ops.
328 | Enforced when depth_multiplier < 1, and not an active constraint when
329 | depth_multiplier >= 1.
330 | depth_multiplier: Float multiplier for the depth (number of channels)
331 | for all convolution ops. The value must be greater than zero. Typical
332 | usage will be to set this value in (0, 1) to reduce the number of
333 | parameters or computation cost of the model.
334 | conv_defs: A list of ConvDef namedtuples specifying the net architecture.
335 | prediction_fn: a function to get predictions out of logits.
336 |     spatial_squeeze: if True, logits is of shape [B, C]; if False, logits is
337 | of shape [B, 1, 1, C], where B is batch_size and C is number of classes.
338 | reuse: whether or not the network and its variables should be reused. To be
339 | able to reuse 'scope' must be given.
340 | scope: Optional variable_scope.
341 | global_pool: Optional boolean flag to control the avgpooling before the
342 | logits layer. If false or unset, pooling is done with a fixed window
343 | that reduces default-sized inputs to 1x1, while larger inputs lead to
344 | larger outputs. If true, any input size is pooled down to 1x1.
345 |
346 | Returns:
347 | net: a 2D Tensor with the logits (pre-softmax activations) if num_classes
348 | is a non-zero integer, or the non-dropped-out input to the logits layer
349 | if num_classes is 0 or None.
350 | end_points: a dictionary from components of the network to the corresponding
351 | activation.
352 |
353 | Raises:
354 | ValueError: Input rank is invalid.
355 | """
356 | input_shape = inputs.get_shape().as_list()
357 | if len(input_shape) != 4:
358 | raise ValueError('Invalid input tensor rank, expected 4, was: %d' %
359 | len(input_shape))
360 |
361 | with tf.variable_scope(scope, 'MobilenetV1', [inputs], reuse=reuse) as scope:
362 | with slim.arg_scope([slim.batch_norm, slim.dropout],
363 | is_training=is_training):
364 | net, end_points = mobilenet_v1_base(inputs, scope=scope,
365 | min_depth=min_depth,
366 | depth_multiplier=depth_multiplier,
367 | conv_defs=conv_defs)
368 | with tf.variable_scope('Logits'):
369 | if global_pool:
370 | # Global average pooling.
371 | net = tf.reduce_mean(net, [1, 2], keep_dims=True, name='global_pool')
372 | end_points['global_pool'] = net
373 | else:
374 | # Pooling with a fixed kernel size.
375 | kernel_size = _reduced_kernel_size_for_small_input(net, [7, 7])
376 | net = slim.avg_pool2d(net, kernel_size, padding='VALID',
377 | scope='AvgPool_1a')
378 | end_points['AvgPool_1a'] = net
379 | if not num_classes:
380 | return net, end_points
381 | # 1 x 1 x 1024
382 | net = slim.dropout(net, keep_prob=dropout_keep_prob, scope='Dropout_1b')
383 | logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None,
384 | normalizer_fn=None, scope='Conv2d_1c_1x1')
385 | if spatial_squeeze:
386 | logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze')
387 | end_points['Logits'] = logits
388 | if prediction_fn:
389 | end_points['Predictions'] = prediction_fn(logits, scope='Predictions')
390 | return logits, end_points
391 |
392 | mobilenet_v1.default_image_size = 224
393 |
394 |
395 | def wrapped_partial(func, *args, **kwargs):
396 | partial_func = functools.partial(func, *args, **kwargs)
397 | functools.update_wrapper(partial_func, func)
398 | return partial_func
399 |
400 |
401 | mobilenet_v1_075 = wrapped_partial(mobilenet_v1, depth_multiplier=0.75)
402 | mobilenet_v1_050 = wrapped_partial(mobilenet_v1, depth_multiplier=0.50)
403 | mobilenet_v1_025 = wrapped_partial(mobilenet_v1, depth_multiplier=0.25)
404 |
405 |
406 | def _reduced_kernel_size_for_small_input(input_tensor, kernel_size):
407 | """Define kernel size which is automatically reduced for small input.
408 |
409 | If the shape of the input images is unknown at graph construction time this
410 | function assumes that the input images are large enough.
411 |
412 | Args:
413 | input_tensor: input tensor of size [batch_size, height, width, channels].
414 | kernel_size: desired kernel size of length 2: [kernel_height, kernel_width]
415 |
416 | Returns:
417 | a tensor with the kernel size.
418 | """
419 | shape = input_tensor.get_shape().as_list()
420 | if shape[1] is None or shape[2] is None:
421 | kernel_size_out = kernel_size
422 | else:
423 | kernel_size_out = [min(shape[1], kernel_size[0]),
424 | min(shape[2], kernel_size[1])]
425 | return kernel_size_out
426 |
427 |
428 | def mobilenet_v1_arg_scope(
429 | is_training=True,
430 | weight_decay=0.00004,
431 | stddev=0.09,
432 | regularize_depthwise=False,
433 | batch_norm_decay=0.9997,
434 | batch_norm_epsilon=0.001,
435 | batch_norm_updates_collections=tf.GraphKeys.UPDATE_OPS):
436 | """Defines the default MobilenetV1 arg scope.
437 |
438 | Args:
439 | is_training: Whether or not we're training the model. If this is set to
440 | None, the parameter is not added to the batch_norm arg_scope.
441 | weight_decay: The weight decay to use for regularizing the model.
442 |     stddev: The standard deviation of the truncated normal weight initializer.
443 |     regularize_depthwise: Whether or not to apply regularization on the depthwise weights.
444 | batch_norm_decay: Decay for batch norm moving average.
445 | batch_norm_epsilon: Small float added to variance to avoid dividing by zero
446 | in batch norm.
447 | batch_norm_updates_collections: Collection for the update ops for
448 | batch norm.
449 |
450 | Returns:
451 | An `arg_scope` to use for the mobilenet v1 model.
452 | """
453 | batch_norm_params = {
454 | 'center': True,
455 | 'scale': True,
456 | 'decay': batch_norm_decay,
457 | 'epsilon': batch_norm_epsilon,
458 | 'updates_collections': batch_norm_updates_collections,
459 | }
460 | if is_training is not None:
461 | batch_norm_params['is_training'] = is_training
462 |
463 | # Set weight_decay for weights in Conv and DepthSepConv layers.
464 | weights_init = tf.truncated_normal_initializer(stddev=stddev)
465 | regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
466 | if regularize_depthwise:
467 | depthwise_regularizer = regularizer
468 | else:
469 | depthwise_regularizer = None
470 | with slim.arg_scope([slim.conv2d, slim.separable_conv2d],
471 | weights_initializer=weights_init,
472 | activation_fn=tf.nn.relu6, normalizer_fn=slim.batch_norm):
473 | with slim.arg_scope([slim.batch_norm], **batch_norm_params):
474 | with slim.arg_scope([slim.conv2d], weights_regularizer=regularizer):
475 | with slim.arg_scope([slim.separable_conv2d],
476 | weights_regularizer=depthwise_regularizer) as sc:
477 | return sc
478 |
--------------------------------------------------------------------------------
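
A note on the file above: every layer width in mobilenet_v1_base is computed as
max(int(d * depth_multiplier), min_depth), so thin multipliers bottom out at the
min_depth floor. A minimal sketch of that arithmetic, with the channel counts
taken from MOBILENETV1_CONV_DEFS:

    # Mirrors the `depth = lambda d: max(int(d * depth_multiplier), min_depth)`
    # line in mobilenet_v1_base above.
    def thinned_depth(d, depth_multiplier=0.25, min_depth=8):
        return max(int(d * depth_multiplier), min_depth)

    for d in [32, 64, 128, 256, 512, 1024]:
        print(d, '->', thinned_depth(d))
    # 32 -> 8 (min_depth floor), 64 -> 16, 128 -> 32, ..., 1024 -> 256
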
/S6/resize_images.py:
--------------------------------------------------------------------------------
1 | """ A utility script that resizes all images in a given directory to a specified size
2 | WARNING: the original images will be overwritten!
3 | """
4 |
5 | import cv2
6 | import os, glob
7 | import argparse
8 |
9 | def main(args):
10 | supported_formats = ['*.jpg', '*.JPG', '*.jpeg', '*.JPEG']
11 | filenames = []
12 | for extension in supported_formats:
13 | pattern = os.path.join(args.input_dir, '**', extension)
14 | filenames.extend(glob.glob(pattern, recursive=True))
15 |
16 | num_images = len(filenames)
17 | for i in range(num_images):
18 | if i % 100 == 0:
19 | print("{} of {} \t Resizing: {}".format(i, num_images, filenames[i]))
20 | image = cv2.imread(filenames[i])
21 | image = cv2.resize(image, (args.resize, args.resize), interpolation=cv2.INTER_AREA)
22 | cv2.imwrite(filenames[i], image)
23 |
24 | if __name__ == '__main__':
25 | # resizes images in-place
26 | parser = argparse.ArgumentParser()
27 |     parser.add_argument('--input_dir', type=str, required=True,
28 |                         help='path to the directory where the images will be read from')
29 |     parser.add_argument('--resize', type=int, required=True,
30 |                         help='the images will be resized to NxN')
31 |
32 | args = parser.parse_args()
33 | main(args)
34 |
--------------------------------------------------------------------------------
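
Because resize_images.py overwrites the originals in place, it is worth running it
on a copy of the data first. A typical invocation (the ./dataset path is an
assumption, not fixed by the script) squashes every image to a square, ignoring
the original aspect ratio:

    python resize_images.py --input_dir ./dataset --resize 256
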
/S6/trainer.py:
--------------------------------------------------------------------------------
1 | """ Trains a TensorFlow model
2 | Example:
3 | $ python trainer.py --checkpoint_path ./checkpoints --data_path ./tfrecords
4 | """
5 |
6 | import tensorflow as tf
7 | import numpy as np
8 | import os, glob
9 | import argparse
10 | from nets import mobilenet_v1 #####################################################
11 |
12 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3'
13 |
14 | class TFModelTrainer:
15 |
16 | def __init__(self, checkpoint_path, data_path):
17 | self.checkpoint_path = checkpoint_path
18 |
19 | # set training parameters #################################################
20 | self.learning_rate = 0.01
21 | self.num_iter = 100000
22 | self.save_iter = 5000
23 | self.val_iter = 5000
24 | self.log_iter = 100
25 | self.batch_size = 32
26 |
27 | # set up data layer
28 | self.training_filenames = glob.glob(os.path.join(data_path, 'train_*.tfrecord'))
29 | self.validation_filenames = glob.glob(os.path.join(data_path, 'test_*.tfrecord'))
30 | self.iterator, self.filenames = self._data_layer()
31 | self.num_val_samples = 10000
32 | self.num_classes = 2
33 | self.image_size = 224
34 |
35 | def preprocess_image(self, image_string):
36 | image = tf.image.decode_jpeg(image_string, channels=3)
37 |
38 | # flip for data augmentation
39 | image = tf.image.random_flip_left_right(image) ############################
40 |
41 | # normalize image to [-1, +1]
42 | image = tf.cast(image, tf.float32)
43 | image = image / 127.5
44 | image = image - 1
45 | return image
46 |
47 | def _parse_tfrecord(self, example_proto): #####################################
48 | keys_to_features = {'image': tf.FixedLenFeature([], tf.string),
49 | 'label': tf.FixedLenFeature([], tf.int64)}
50 | parsed_features = tf.parse_single_example(example_proto, keys_to_features)
51 | image = parsed_features['image']
52 | label = parsed_features['label']
53 | image = self.preprocess_image(image)
54 | return image, label
55 |
56 | def _data_layer(self, num_threads=8, prefetch_buffer=100):
57 | with tf.variable_scope('data'):
58 | filenames = tf.placeholder(tf.string, shape=[None])
59 | dataset = tf.data.TFRecordDataset(filenames) ##########################
60 | dataset = dataset.map(self._parse_tfrecord, num_parallel_calls=num_threads)
61 | dataset = dataset.repeat()
62 | dataset = dataset.batch(self.batch_size)
63 | dataset = dataset.prefetch(prefetch_buffer)
64 | iterator = dataset.make_initializable_iterator()
65 | return iterator, filenames
66 |
67 | def _loss_functions(self, logits, labels):
68 | with tf.variable_scope('loss'):
69 | target_prob = tf.one_hot(labels, self.num_classes)
70 | tf.losses.softmax_cross_entropy(target_prob, logits)
71 | total_loss = tf.losses.get_total_loss() #include regularization loss
72 | return total_loss
73 |
74 | def _optimizer(self, total_loss, global_step):
75 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
76 | with tf.control_dependencies(update_ops):
77 | optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate, epsilon=0.1)
78 | optimizer = optimizer.minimize(total_loss, global_step=global_step)
79 | return optimizer
80 |
81 | def _performance_metric(self, logits, labels):
82 | with tf.variable_scope("performance_metric"):
83 | preds = tf.argmax(logits, axis=1)
84 | labels = tf.cast(labels, tf.int64)
85 | corrects = tf.equal(preds, labels)
86 | accuracy = tf.reduce_mean(tf.cast(corrects, tf.float32))
87 | return accuracy
88 |
89 | def train(self):
90 | # iteration number
91 | global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name='iter_number')
92 |
93 | # training graph
94 | images, labels = self.iterator.get_next()
95 | images = tf.image.resize_bilinear(images, (self.image_size, self.image_size))
96 | training = tf.placeholder(tf.bool, name='is_training')
97 | logits, _ = mobilenet_v1.mobilenet_v1(images,
98 | num_classes=self.num_classes,
99 | is_training=training,
100 | scope='MobilenetV1',
101 | global_pool=True) ############################################
102 | loss = self._loss_functions(logits, labels)
103 | optimizer = self._optimizer(loss, global_step)
104 | accuracy = self._performance_metric(logits, labels)
105 |
106 | # summary placeholders
107 | streaming_loss_p = tf.placeholder(tf.float32)
108 | accuracy_p = tf.placeholder(tf.float32)
109 | summ_op_train = tf.summary.scalar('streaming_loss', streaming_loss_p)
110 | summ_op_test = tf.summary.scalar('accuracy', accuracy_p)
111 |
112 | with tf.Session() as sess:
113 | sess.run(tf.global_variables_initializer())
114 | sess.run(self.iterator.initializer, feed_dict={self.filenames: self.training_filenames})
115 |
116 | writer = tf.summary.FileWriter(self.checkpoint_path, sess.graph)
117 |
118 | saver = tf.train.Saver(max_to_keep=None) # keep all checkpoints
119 | ckpt = tf.train.get_checkpoint_state(self.checkpoint_path)
120 |
121 | # resume training if a checkpoint exists
122 | if ckpt and ckpt.model_checkpoint_path:
123 | saver.restore(sess, ckpt.model_checkpoint_path)
124 | print('Loaded parameters from {}'.format(ckpt.model_checkpoint_path))
125 |
126 | initial_step = global_step.eval()
127 |
128 | # train the model
129 | streaming_loss = 0
130 | for i in range(initial_step, self.num_iter + 1):
131 | _, loss_batch = sess.run([optimizer, loss], feed_dict={training: True})
132 |
133 | if not np.isfinite(loss_batch):
134 | print('loss diverged, stopping')
135 | exit()
136 |
137 | # log summary
138 | streaming_loss += loss_batch
139 | if i % self.log_iter == self.log_iter - 1:
140 | streaming_loss /= self.log_iter
141 | print(i + 1, streaming_loss)
142 | summary_train = sess.run(summ_op_train, feed_dict={streaming_loss_p: streaming_loss})
143 | writer.add_summary(summary_train, global_step=i)
144 | streaming_loss = 0
145 |
146 | # save model
147 | if i % self.save_iter == self.save_iter - 1:
148 | saver.save(sess, os.path.join(self.checkpoint_path, 'checkpoint'), global_step=global_step)
149 | print("Model saved!")
150 |
151 | # run validation
152 | if i % self.val_iter == self.val_iter - 1:
153 | print("Running validation.")
154 | sess.run(self.iterator.initializer, feed_dict={self.filenames: self.validation_filenames})
155 |
156 | validation_accuracy = 0
157 | for j in range(self.num_val_samples // self.batch_size): ###################################
158 | acc_batch = sess.run(accuracy, feed_dict={training: False})
159 | validation_accuracy += acc_batch
160 |                     validation_accuracy /= (j + 1)  # average over the number of batches run, not the last index
161 |
162 | print("Accuracy: {}".format(validation_accuracy))
163 |
164 | summary_test = sess.run(summ_op_test, feed_dict={accuracy_p: validation_accuracy})
165 | writer.add_summary(summary_test, global_step=i)
166 |
167 | sess.run(self.iterator.initializer, feed_dict={self.filenames: self.training_filenames})
168 |
169 | writer.close()
170 |
171 | def main():
172 | parser = argparse.ArgumentParser()
173 | parser.add_argument('--checkpoint_path', type=str, default='./checkpoints/',
174 | help="Path to the dir where the checkpoints are saved")
175 | parser.add_argument('--data_path', type=str, default='./tfrecords/', help="Path to the TFRecords")
176 | args = parser.parse_args()
177 | trainer = TFModelTrainer(args.checkpoint_path, args.data_path)
178 | trainer.train()
179 |
180 | if __name__ == '__main__':
181 | main()
182 |
--------------------------------------------------------------------------------
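
The two features read by _parse_tfrecord above ('image' as raw JPEG bytes, 'label'
as an int64) define the record format this trainer expects. For reference, a
compatible record could be written roughly as follows; this is a sketch only, not
necessarily what S6/create_tf_records.py does, and the file/label values are
hypothetical:

    import tensorflow as tf

    def write_example(writer, jpeg_path, label):
        # Store the raw JPEG bytes; tf.image.decode_jpeg decodes them at read time.
        with open(jpeg_path, 'rb') as f:
            encoded = f.read()
        example = tf.train.Example(features=tf.train.Features(feature={
            'image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[encoded])),
            'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
        }))
        writer.write(example.SerializeToString())

    with tf.python_io.TFRecordWriter('train_0.tfrecord') as writer:
        write_example(writer, 'cat.jpg', 0)  # hypothetical image and label
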
/S7/checkpoints/checkpoint:
--------------------------------------------------------------------------------
1 | model_checkpoint_path: "mobilenet_v1_1.0_224.ckpt"
2 |
--------------------------------------------------------------------------------
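
This one-line text file is the index that tf.train.get_checkpoint_state parses;
listing the downloaded mobilenet_v1_1.0_224.ckpt as model_checkpoint_path is what
lets the restore branch in S7/trainer.py warm-start from the pre-trained weights.
A quick sanity check:

    import tensorflow as tf

    ckpt = tf.train.get_checkpoint_state('./checkpoints')
    print(ckpt.model_checkpoint_path)  # resolves to the mobilenet_v1_1.0_224.ckpt entry
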
/S7/nets/mobilenet_v1.py:
--------------------------------------------------------------------------------
1 | # Copyright 2017 The TensorFlow Authors. All Rights Reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # =============================================================================
15 | """MobileNet v1.
16 |
17 | MobileNet is a general architecture and can be used for multiple use cases.
18 | Depending on the use case, it can use different input layer size and different
19 | head (for example: embeddings, localization and classification).
20 |
21 | As described in https://arxiv.org/abs/1704.04861.
22 |
23 | MobileNets: Efficient Convolutional Neural Networks for
24 | Mobile Vision Applications
25 | Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang,
26 | Tobias Weyand, Marco Andreetto, Hartwig Adam
27 |
28 | 100% Mobilenet V1 (base) with input size 224x224:
29 |
30 | See mobilenet_v1()
31 |
32 | Layer params macs
33 | --------------------------------------------------------------------------------
34 | MobilenetV1/Conv2d_0/Conv2D: 864 10,838,016
35 | MobilenetV1/Conv2d_1_depthwise/depthwise: 288 3,612,672
36 | MobilenetV1/Conv2d_1_pointwise/Conv2D: 2,048 25,690,112
37 | MobilenetV1/Conv2d_2_depthwise/depthwise: 576 1,806,336
38 | MobilenetV1/Conv2d_2_pointwise/Conv2D: 8,192 25,690,112
39 | MobilenetV1/Conv2d_3_depthwise/depthwise: 1,152 3,612,672
40 | MobilenetV1/Conv2d_3_pointwise/Conv2D: 16,384 51,380,224
41 | MobilenetV1/Conv2d_4_depthwise/depthwise: 1,152 903,168
42 | MobilenetV1/Conv2d_4_pointwise/Conv2D: 32,768 25,690,112
43 | MobilenetV1/Conv2d_5_depthwise/depthwise: 2,304 1,806,336
44 | MobilenetV1/Conv2d_5_pointwise/Conv2D: 65,536 51,380,224
45 | MobilenetV1/Conv2d_6_depthwise/depthwise: 2,304 451,584
46 | MobilenetV1/Conv2d_6_pointwise/Conv2D: 131,072 25,690,112
47 | MobilenetV1/Conv2d_7_depthwise/depthwise: 4,608 903,168
48 | MobilenetV1/Conv2d_7_pointwise/Conv2D: 262,144 51,380,224
49 | MobilenetV1/Conv2d_8_depthwise/depthwise: 4,608 903,168
50 | MobilenetV1/Conv2d_8_pointwise/Conv2D: 262,144 51,380,224
51 | MobilenetV1/Conv2d_9_depthwise/depthwise: 4,608 903,168
52 | MobilenetV1/Conv2d_9_pointwise/Conv2D: 262,144 51,380,224
53 | MobilenetV1/Conv2d_10_depthwise/depthwise: 4,608 903,168
54 | MobilenetV1/Conv2d_10_pointwise/Conv2D: 262,144 51,380,224
55 | MobilenetV1/Conv2d_11_depthwise/depthwise: 4,608 903,168
56 | MobilenetV1/Conv2d_11_pointwise/Conv2D: 262,144 51,380,224
57 | MobilenetV1/Conv2d_12_depthwise/depthwise: 4,608 225,792
58 | MobilenetV1/Conv2d_12_pointwise/Conv2D: 524,288 25,690,112
59 | MobilenetV1/Conv2d_13_depthwise/depthwise: 9,216 451,584
60 | MobilenetV1/Conv2d_13_pointwise/Conv2D: 1,048,576 51,380,224
61 | --------------------------------------------------------------------------------
62 | Total: 3,185,088 567,716,352
63 |
64 |
65 | 75% Mobilenet V1 (base) with input size 128x128:
66 |
67 | See mobilenet_v1_075()
68 |
69 | Layer params macs
70 | --------------------------------------------------------------------------------
71 | MobilenetV1/Conv2d_0/Conv2D: 648 2,654,208
72 | MobilenetV1/Conv2d_1_depthwise/depthwise: 216 884,736
73 | MobilenetV1/Conv2d_1_pointwise/Conv2D: 1,152 4,718,592
74 | MobilenetV1/Conv2d_2_depthwise/depthwise: 432 442,368
75 | MobilenetV1/Conv2d_2_pointwise/Conv2D: 4,608 4,718,592
76 | MobilenetV1/Conv2d_3_depthwise/depthwise: 864 884,736
77 | MobilenetV1/Conv2d_3_pointwise/Conv2D: 9,216 9,437,184
78 | MobilenetV1/Conv2d_4_depthwise/depthwise: 864 221,184
79 | MobilenetV1/Conv2d_4_pointwise/Conv2D: 18,432 4,718,592
80 | MobilenetV1/Conv2d_5_depthwise/depthwise: 1,728 442,368
81 | MobilenetV1/Conv2d_5_pointwise/Conv2D: 36,864 9,437,184
82 | MobilenetV1/Conv2d_6_depthwise/depthwise: 1,728 110,592
83 | MobilenetV1/Conv2d_6_pointwise/Conv2D: 73,728 4,718,592
84 | MobilenetV1/Conv2d_7_depthwise/depthwise: 3,456 221,184
85 | MobilenetV1/Conv2d_7_pointwise/Conv2D: 147,456 9,437,184
86 | MobilenetV1/Conv2d_8_depthwise/depthwise: 3,456 221,184
87 | MobilenetV1/Conv2d_8_pointwise/Conv2D: 147,456 9,437,184
88 | MobilenetV1/Conv2d_9_depthwise/depthwise: 3,456 221,184
89 | MobilenetV1/Conv2d_9_pointwise/Conv2D: 147,456 9,437,184
90 | MobilenetV1/Conv2d_10_depthwise/depthwise: 3,456 221,184
91 | MobilenetV1/Conv2d_10_pointwise/Conv2D: 147,456 9,437,184
92 | MobilenetV1/Conv2d_11_depthwise/depthwise: 3,456 221,184
93 | MobilenetV1/Conv2d_11_pointwise/Conv2D: 147,456 9,437,184
94 | MobilenetV1/Conv2d_12_depthwise/depthwise: 3,456 55,296
95 | MobilenetV1/Conv2d_12_pointwise/Conv2D: 294,912 4,718,592
96 | MobilenetV1/Conv2d_13_depthwise/depthwise: 6,912 110,592
97 | MobilenetV1/Conv2d_13_pointwise/Conv2D: 589,824 9,437,184
98 | --------------------------------------------------------------------------------
99 | Total: 1,800,144 106,002,432
100 |
101 | """
102 |
103 | # Tensorflow mandates these.
104 | from __future__ import absolute_import
105 | from __future__ import division
106 | from __future__ import print_function
107 |
108 | from collections import namedtuple
109 | import functools
110 |
111 | import tensorflow as tf
112 |
113 | slim = tf.contrib.slim
114 |
115 | # Conv and DepthSepConv namedtuple define layers of the MobileNet architecture
116 | # Conv defines 3x3 convolution layers
117 | # DepthSepConv defines 3x3 depthwise convolution followed by 1x1 convolution.
118 | # stride is the stride of the convolution
119 | # depth is the number of channels or filters in a layer
120 | Conv = namedtuple('Conv', ['kernel', 'stride', 'depth'])
121 | DepthSepConv = namedtuple('DepthSepConv', ['kernel', 'stride', 'depth'])
122 |
123 | # MOBILENETV1_CONV_DEFS specifies the MobileNet body
124 | MOBILENETV1_CONV_DEFS = [
125 | Conv(kernel=[3, 3], stride=2, depth=32),
126 | DepthSepConv(kernel=[3, 3], stride=1, depth=64),
127 | DepthSepConv(kernel=[3, 3], stride=2, depth=128),
128 | DepthSepConv(kernel=[3, 3], stride=1, depth=128),
129 | DepthSepConv(kernel=[3, 3], stride=2, depth=256),
130 | DepthSepConv(kernel=[3, 3], stride=1, depth=256),
131 | DepthSepConv(kernel=[3, 3], stride=2, depth=512),
132 | DepthSepConv(kernel=[3, 3], stride=1, depth=512),
133 | DepthSepConv(kernel=[3, 3], stride=1, depth=512),
134 | DepthSepConv(kernel=[3, 3], stride=1, depth=512),
135 | DepthSepConv(kernel=[3, 3], stride=1, depth=512),
136 | DepthSepConv(kernel=[3, 3], stride=1, depth=512),
137 | DepthSepConv(kernel=[3, 3], stride=2, depth=1024),
138 | DepthSepConv(kernel=[3, 3], stride=1, depth=1024)
139 | ]
140 |
141 |
142 | def _fixed_padding(inputs, kernel_size, rate=1):
143 | """Pads the input along the spatial dimensions independently of input size.
144 |
145 | Pads the input such that if it was used in a convolution with 'VALID' padding,
146 | the output would have the same dimensions as if the unpadded input was used
147 | in a convolution with 'SAME' padding.
148 |
149 | Args:
150 | inputs: A tensor of size [batch, height_in, width_in, channels].
151 | kernel_size: The kernel to be used in the conv2d or max_pool2d operation.
152 | rate: An integer, rate for atrous convolution.
153 |
154 | Returns:
155 | output: A tensor of size [batch, height_out, width_out, channels] with the
156 | input, either intact (if kernel_size == 1) or padded (if kernel_size > 1).
157 | """
158 |   kernel_size_effective = [kernel_size[0] + (kernel_size[0] - 1) * (rate - 1),
159 |                            kernel_size[1] + (kernel_size[1] - 1) * (rate - 1)]
160 | pad_total = [kernel_size_effective[0] - 1, kernel_size_effective[1] - 1]
161 | pad_beg = [pad_total[0] // 2, pad_total[1] // 2]
162 | pad_end = [pad_total[0] - pad_beg[0], pad_total[1] - pad_beg[1]]
163 | padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg[0], pad_end[0]],
164 | [pad_beg[1], pad_end[1]], [0, 0]])
165 | return padded_inputs
166 |
167 |
168 | def mobilenet_v1_base(inputs,
169 | final_endpoint='Conv2d_13_pointwise',
170 | min_depth=8,
171 | depth_multiplier=1.0,
172 | conv_defs=None,
173 | output_stride=None,
174 | use_explicit_padding=False,
175 | scope=None):
176 | """Mobilenet v1.
177 |
178 | Constructs a Mobilenet v1 network from inputs to the given final endpoint.
179 |
180 | Args:
181 | inputs: a tensor of shape [batch_size, height, width, channels].
182 | final_endpoint: specifies the endpoint to construct the network up to. It
183 | can be one of ['Conv2d_0', 'Conv2d_1_pointwise', 'Conv2d_2_pointwise',
184 |       'Conv2d_3_pointwise', 'Conv2d_4_pointwise', 'Conv2d_5_pointwise',
185 | 'Conv2d_6_pointwise', 'Conv2d_7_pointwise', 'Conv2d_8_pointwise',
186 | 'Conv2d_9_pointwise', 'Conv2d_10_pointwise', 'Conv2d_11_pointwise',
187 | 'Conv2d_12_pointwise', 'Conv2d_13_pointwise'].
188 | min_depth: Minimum depth value (number of channels) for all convolution ops.
189 | Enforced when depth_multiplier < 1, and not an active constraint when
190 | depth_multiplier >= 1.
191 | depth_multiplier: Float multiplier for the depth (number of channels)
192 | for all convolution ops. The value must be greater than zero. Typical
193 | usage will be to set this value in (0, 1) to reduce the number of
194 | parameters or computation cost of the model.
195 | conv_defs: A list of ConvDef namedtuples specifying the net architecture.
196 | output_stride: An integer that specifies the requested ratio of input to
197 | output spatial resolution. If not None, then we invoke atrous convolution
198 | if necessary to prevent the network from reducing the spatial resolution
199 | of the activation maps. Allowed values are 8 (accurate fully convolutional
200 | mode), 16 (fast fully convolutional mode), 32 (classification mode).
201 | use_explicit_padding: Use 'VALID' padding for convolutions, but prepad
202 | inputs so that the output dimensions are the same as if 'SAME' padding
203 | were used.
204 | scope: Optional variable_scope.
205 |
206 | Returns:
207 | tensor_out: output tensor corresponding to the final_endpoint.
208 | end_points: a set of activations for external use, for example summaries or
209 | losses.
210 |
211 | Raises:
212 | ValueError: if final_endpoint is not set to one of the predefined values,
213 | or depth_multiplier <= 0, or the target output_stride is not
214 | allowed.
215 | """
216 | depth = lambda d: max(int(d * depth_multiplier), min_depth)
217 | end_points = {}
218 |
219 | # Used to find thinned depths for each layer.
220 | if depth_multiplier <= 0:
221 | raise ValueError('depth_multiplier is not greater than zero.')
222 |
223 | if conv_defs is None:
224 | conv_defs = MOBILENETV1_CONV_DEFS
225 |
226 | if output_stride is not None and output_stride not in [8, 16, 32]:
227 | raise ValueError('Only allowed output_stride values are 8, 16, 32.')
228 |
229 | padding = 'SAME'
230 | if use_explicit_padding:
231 | padding = 'VALID'
232 | with tf.variable_scope(scope, 'MobilenetV1', [inputs]):
233 | with slim.arg_scope([slim.conv2d, slim.separable_conv2d], padding=padding):
234 | # The current_stride variable keeps track of the output stride of the
235 | # activations, i.e., the running product of convolution strides up to the
236 | # current network layer. This allows us to invoke atrous convolution
237 | # whenever applying the next convolution would result in the activations
238 | # having output stride larger than the target output_stride.
239 | current_stride = 1
240 |
241 | # The atrous convolution rate parameter.
242 | rate = 1
243 |
244 | net = inputs
245 | for i, conv_def in enumerate(conv_defs):
246 | end_point_base = 'Conv2d_%d' % i
247 |
248 | if output_stride is not None and current_stride == output_stride:
249 | # If we have reached the target output_stride, then we need to employ
250 | # atrous convolution with stride=1 and multiply the atrous rate by the
251 | # current unit's stride for use in subsequent layers.
252 | layer_stride = 1
253 | layer_rate = rate
254 | rate *= conv_def.stride
255 | else:
256 | layer_stride = conv_def.stride
257 | layer_rate = 1
258 | current_stride *= conv_def.stride
259 |
260 | if isinstance(conv_def, Conv):
261 | end_point = end_point_base
262 | if use_explicit_padding:
263 | net = _fixed_padding(net, conv_def.kernel)
264 | net = slim.conv2d(net, depth(conv_def.depth), conv_def.kernel,
265 | stride=conv_def.stride,
266 | normalizer_fn=slim.batch_norm,
267 | scope=end_point)
268 | end_points[end_point] = net
269 | if end_point == final_endpoint:
270 | return net, end_points
271 |
272 | elif isinstance(conv_def, DepthSepConv):
273 | end_point = end_point_base + '_depthwise'
274 |
275 |           # By passing filters=None,
276 |           # separable_conv2d produces only a depthwise convolution layer.
277 | if use_explicit_padding:
278 | net = _fixed_padding(net, conv_def.kernel, layer_rate)
279 | net = slim.separable_conv2d(net, None, conv_def.kernel,
280 | depth_multiplier=1,
281 | stride=layer_stride,
282 | rate=layer_rate,
283 | normalizer_fn=slim.batch_norm,
284 | scope=end_point)
285 |
286 | end_points[end_point] = net
287 | if end_point == final_endpoint:
288 | return net, end_points
289 |
290 | end_point = end_point_base + '_pointwise'
291 |
292 | net = slim.conv2d(net, depth(conv_def.depth), [1, 1],
293 | stride=1,
294 | normalizer_fn=slim.batch_norm,
295 | scope=end_point)
296 |
297 | end_points[end_point] = net
298 | if end_point == final_endpoint:
299 | return net, end_points
300 | else:
301 | raise ValueError('Unknown convolution type %s for layer %d'
302 | % (conv_def.ltype, i))
303 | raise ValueError('Unknown final endpoint %s' % final_endpoint)
304 |
305 |
306 | def mobilenet_v1(inputs,
307 | num_classes=1000,
308 | dropout_keep_prob=0.999,
309 | is_training=True,
310 | min_depth=8,
311 | depth_multiplier=1.0,
312 | conv_defs=None,
313 | prediction_fn=tf.contrib.layers.softmax,
314 | spatial_squeeze=True,
315 | reuse=None,
316 | scope='MobilenetV1',
317 | global_pool=False):
318 | """Mobilenet v1 model for classification.
319 |
320 | Args:
321 | inputs: a tensor of shape [batch_size, height, width, channels].
322 | num_classes: number of predicted classes. If 0 or None, the logits layer
323 | is omitted and the input features to the logits layer (before dropout)
324 | are returned instead.
325 |     dropout_keep_prob: the fraction of activation values that are retained
326 | is_training: whether is training or not.
327 | min_depth: Minimum depth value (number of channels) for all convolution ops.
328 | Enforced when depth_multiplier < 1, and not an active constraint when
329 | depth_multiplier >= 1.
330 | depth_multiplier: Float multiplier for the depth (number of channels)
331 | for all convolution ops. The value must be greater than zero. Typical
332 | usage will be to set this value in (0, 1) to reduce the number of
333 | parameters or computation cost of the model.
334 | conv_defs: A list of ConvDef namedtuples specifying the net architecture.
335 | prediction_fn: a function to get predictions out of logits.
336 |     spatial_squeeze: if True, logits is of shape [B, C]; if False, logits is
337 | of shape [B, 1, 1, C], where B is batch_size and C is number of classes.
338 | reuse: whether or not the network and its variables should be reused. To be
339 | able to reuse 'scope' must be given.
340 | scope: Optional variable_scope.
341 | global_pool: Optional boolean flag to control the avgpooling before the
342 | logits layer. If false or unset, pooling is done with a fixed window
343 | that reduces default-sized inputs to 1x1, while larger inputs lead to
344 | larger outputs. If true, any input size is pooled down to 1x1.
345 |
346 | Returns:
347 | net: a 2D Tensor with the logits (pre-softmax activations) if num_classes
348 | is a non-zero integer, or the non-dropped-out input to the logits layer
349 | if num_classes is 0 or None.
350 | end_points: a dictionary from components of the network to the corresponding
351 | activation.
352 |
353 | Raises:
354 | ValueError: Input rank is invalid.
355 | """
356 | input_shape = inputs.get_shape().as_list()
357 | if len(input_shape) != 4:
358 | raise ValueError('Invalid input tensor rank, expected 4, was: %d' %
359 | len(input_shape))
360 |
361 | with tf.variable_scope(scope, 'MobilenetV1', [inputs], reuse=reuse) as scope:
362 | with slim.arg_scope([slim.batch_norm, slim.dropout],
363 | is_training=is_training):
364 | net, end_points = mobilenet_v1_base(inputs, scope=scope,
365 | min_depth=min_depth,
366 | depth_multiplier=depth_multiplier,
367 | conv_defs=conv_defs)
368 | with tf.variable_scope('Logits'):
369 | if global_pool:
370 | # Global average pooling.
371 | net = tf.reduce_mean(net, [1, 2], keep_dims=True, name='global_pool')
372 | end_points['global_pool'] = net
373 | else:
374 | # Pooling with a fixed kernel size.
375 | kernel_size = _reduced_kernel_size_for_small_input(net, [7, 7])
376 | net = slim.avg_pool2d(net, kernel_size, padding='VALID',
377 | scope='AvgPool_1a')
378 | end_points['AvgPool_1a'] = net
379 | if not num_classes:
380 | return net, end_points
381 | # 1 x 1 x 1024
382 | net = slim.dropout(net, keep_prob=dropout_keep_prob, scope='Dropout_1b')
383 | logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None,
384 | normalizer_fn=None, scope='Conv2d_1c_1x1')
385 | if spatial_squeeze:
386 | logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze')
387 | end_points['Logits'] = logits
388 | if prediction_fn:
389 | end_points['Predictions'] = prediction_fn(logits, scope='Predictions')
390 | return logits, end_points
391 |
392 | mobilenet_v1.default_image_size = 224
393 |
394 |
395 | def wrapped_partial(func, *args, **kwargs):
396 | partial_func = functools.partial(func, *args, **kwargs)
397 | functools.update_wrapper(partial_func, func)
398 | return partial_func
399 |
400 |
401 | mobilenet_v1_075 = wrapped_partial(mobilenet_v1, depth_multiplier=0.75)
402 | mobilenet_v1_050 = wrapped_partial(mobilenet_v1, depth_multiplier=0.50)
403 | mobilenet_v1_025 = wrapped_partial(mobilenet_v1, depth_multiplier=0.25)
404 |
405 |
406 | def _reduced_kernel_size_for_small_input(input_tensor, kernel_size):
407 | """Define kernel size which is automatically reduced for small input.
408 |
409 | If the shape of the input images is unknown at graph construction time this
410 | function assumes that the input images are large enough.
411 |
412 | Args:
413 | input_tensor: input tensor of size [batch_size, height, width, channels].
414 | kernel_size: desired kernel size of length 2: [kernel_height, kernel_width]
415 |
416 | Returns:
417 | a tensor with the kernel size.
418 | """
419 | shape = input_tensor.get_shape().as_list()
420 | if shape[1] is None or shape[2] is None:
421 | kernel_size_out = kernel_size
422 | else:
423 | kernel_size_out = [min(shape[1], kernel_size[0]),
424 | min(shape[2], kernel_size[1])]
425 | return kernel_size_out
426 |
427 |
428 | def mobilenet_v1_arg_scope(
429 | is_training=True,
430 | weight_decay=0.00004,
431 | stddev=0.09,
432 | regularize_depthwise=False,
433 | batch_norm_decay=0.9997,
434 | batch_norm_epsilon=0.001,
435 | batch_norm_updates_collections=tf.GraphKeys.UPDATE_OPS):
436 | """Defines the default MobilenetV1 arg scope.
437 |
438 | Args:
439 | is_training: Whether or not we're training the model. If this is set to
440 | None, the parameter is not added to the batch_norm arg_scope.
441 | weight_decay: The weight decay to use for regularizing the model.
442 |     stddev: The standard deviation of the truncated normal weight initializer.
443 |     regularize_depthwise: Whether or not to apply regularization on the depthwise weights.
444 | batch_norm_decay: Decay for batch norm moving average.
445 | batch_norm_epsilon: Small float added to variance to avoid dividing by zero
446 | in batch norm.
447 | batch_norm_updates_collections: Collection for the update ops for
448 | batch norm.
449 |
450 | Returns:
451 | An `arg_scope` to use for the mobilenet v1 model.
452 | """
453 | batch_norm_params = {
454 | 'center': True,
455 | 'scale': True,
456 | 'decay': batch_norm_decay,
457 | 'epsilon': batch_norm_epsilon,
458 | 'updates_collections': batch_norm_updates_collections,
459 | }
460 | if is_training is not None:
461 | batch_norm_params['is_training'] = is_training
462 |
463 | # Set weight_decay for weights in Conv and DepthSepConv layers.
464 | weights_init = tf.truncated_normal_initializer(stddev=stddev)
465 | regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
466 | if regularize_depthwise:
467 | depthwise_regularizer = regularizer
468 | else:
469 | depthwise_regularizer = None
470 | with slim.arg_scope([slim.conv2d, slim.separable_conv2d],
471 | weights_initializer=weights_init,
472 | activation_fn=tf.nn.relu6, normalizer_fn=slim.batch_norm):
473 | with slim.arg_scope([slim.batch_norm], **batch_norm_params):
474 | with slim.arg_scope([slim.conv2d], weights_regularizer=regularizer):
475 | with slim.arg_scope([slim.separable_conv2d],
476 | weights_regularizer=depthwise_regularizer) as sc:
477 | return sc
478 |
--------------------------------------------------------------------------------
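
The padding arithmetic in _fixed_padding above is worth spelling out: a kernel of
size k at atrous rate r covers k + (k - 1) * (r - 1) input pixels, so pad_total is
that extent minus one, split (almost) evenly between the two sides. A quick check:

    def pad_amounts(k, rate=1):
        k_eff = k + (k - 1) * (rate - 1)  # dilated (atrous) kernel extent
        pad_total = k_eff - 1
        pad_beg = pad_total // 2
        return pad_beg, pad_total - pad_beg

    print(pad_amounts(3))          # (1, 1): a plain 3x3 conv pads one pixel per side
    print(pad_amounts(3, rate=2))  # (2, 2): effective 5x5 kernel pads two per side
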
/S7/trainer.py:
--------------------------------------------------------------------------------
1 | """ Coding Session 7: fine tuning a model in TensorFlow
2 | You can download the pre-trained checkpoint from:
3 | https://github.com/tensorflow/models/tree/master/research/slim
4 | """
5 |
6 | import tensorflow as tf
7 | import numpy as np
8 | import os, glob
9 | import argparse
10 | from nets import mobilenet_v1
11 |
12 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3'
13 |
14 | class TFModelTrainer:
15 |
16 | def __init__(self, checkpoint_path, data_path):
17 | self.checkpoint_path = checkpoint_path
18 |
19 | # set training parameters
20 | self.learning_rate = 0.01
21 | self.num_iter = 100000
22 | self.save_iter = 5000
23 | self.val_iter = 5000
24 | self.log_iter = 100
25 | self.batch_size = 32
26 |
27 | # set up data layer
28 | self.training_filenames = glob.glob(os.path.join(data_path, 'train_*.tfrecord'))
29 | self.validation_filenames = glob.glob(os.path.join(data_path, 'test_*.tfrecord'))
30 | self.iterator, self.filenames = self._data_layer()
31 | self.num_val_samples = 10000
32 | self.num_classes = 2
33 | self.image_size = 224
34 |
35 | # fine tune only the last layer
36 | self.fine_tune = True #####################################################################
37 |
38 | def preprocess_image(self, image_string):
39 | image = tf.image.decode_jpeg(image_string, channels=3)
40 |
41 | # flip for data augmentation
42 | image = tf.image.random_flip_left_right(image)
43 |
44 | # normalize image to [-1, +1]
45 | image = tf.cast(image, tf.float32)
46 | image = image / 127.5
47 | image = image - 1
48 | return image
49 |
50 | def _parse_tfrecord(self, example_proto):
51 | keys_to_features = {'image': tf.FixedLenFeature([], tf.string),
52 | 'label': tf.FixedLenFeature([], tf.int64)}
53 | parsed_features = tf.parse_single_example(example_proto, keys_to_features)
54 | image = parsed_features['image']
55 | label = parsed_features['label']
56 | image = self.preprocess_image(image)
57 | return image, label
58 |
59 | def _data_layer(self, num_threads=8, prefetch_buffer=100):
60 | with tf.variable_scope('data'):
61 | filenames = tf.placeholder(tf.string, shape=[None])
62 | dataset = tf.data.TFRecordDataset(filenames)
63 | dataset = dataset.map(self._parse_tfrecord, num_parallel_calls=num_threads)
64 | dataset = dataset.repeat()
65 | dataset = dataset.batch(self.batch_size)
66 | dataset = dataset.prefetch(prefetch_buffer)
67 | iterator = dataset.make_initializable_iterator()
68 | return iterator, filenames
69 |
70 | def _loss_functions(self, logits, labels):
71 | with tf.variable_scope('loss'):
72 | target_prob = tf.one_hot(labels, self.num_classes)
73 | tf.losses.softmax_cross_entropy(target_prob, logits)
74 | total_loss = tf.losses.get_total_loss() #include regularization loss
75 | return total_loss
76 |
77 | def _optimizer(self, total_loss, global_step):
78 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
79 | with tf.control_dependencies(update_ops):
80 | optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate, epsilon=0.1)
81 | if self.fine_tune: #####################################################################
82 | train_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, "MobilenetV1/Logits")
83 | optimizer = optimizer.minimize(total_loss, var_list=train_vars, global_step=global_step)
84 | else:
85 | optimizer = optimizer.minimize(total_loss, global_step=global_step)
86 | return optimizer
87 |
88 | def _performance_metric(self, logits, labels):
89 | with tf.variable_scope("performance_metric"):
90 | preds = tf.argmax(logits, axis=1)
91 | labels = tf.cast(labels, tf.int64)
92 | corrects = tf.equal(preds, labels)
93 | accuracy = tf.reduce_mean(tf.cast(corrects, tf.float32))
94 | return accuracy
95 |
96 | def _variables_to_restore(self, save_file, graph): #############################################
97 | # returns a list of variables that can be restored from a checkpoint
98 | reader = tf.train.NewCheckpointReader(save_file)
99 | saved_shapes = reader.get_variable_to_shape_map()
100 | var_names = sorted([(var.name, var.name.split(':')[0]) for var in tf.global_variables()
101 | if var.name.split(':')[0] in saved_shapes])
102 | restore_vars = []
103 | for var_name, saved_var_name in var_names:
104 | curr_var = graph.get_tensor_by_name(var_name)
105 | var_shape = curr_var.get_shape().as_list()
106 | if var_shape == saved_shapes[saved_var_name]:
107 | restore_vars.append(curr_var)
108 | return restore_vars
109 |
110 | def train(self):
111 | # iteration number
112 | global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name='iter_number')
113 |
114 | # training graph
115 | images, labels = self.iterator.get_next()
116 | images = tf.image.resize_bilinear(images, (self.image_size, self.image_size))
117 | training = tf.placeholder(tf.bool, name='is_training')
118 | logits, _ = mobilenet_v1.mobilenet_v1(images,
119 | num_classes=self.num_classes,
120 | is_training=training,
121 | scope='MobilenetV1',
122 | global_pool=True)
123 | loss = self._loss_functions(logits, labels)
124 | optimizer = self._optimizer(loss, global_step)
125 | accuracy = self._performance_metric(logits, labels)
126 |
127 | # summary placeholders
128 | streaming_loss_p = tf.placeholder(tf.float32)
129 | accuracy_p = tf.placeholder(tf.float32)
130 | summ_op_train = tf.summary.scalar('streaming_loss', streaming_loss_p)
131 | summ_op_test = tf.summary.scalar('accuracy', accuracy_p)
132 |
133 | # don't allocate entire GPU memory #########################################################
134 | config = tf.ConfigProto()
135 | config.gpu_options.allow_growth = True
136 |
137 | with tf.Session(config=config) as sess:
138 | sess.run(tf.global_variables_initializer())
139 | sess.run(self.iterator.initializer, feed_dict={self.filenames: self.training_filenames})
140 |
141 | writer = tf.summary.FileWriter(self.checkpoint_path, sess.graph)
142 |
143 | saver = tf.train.Saver(max_to_keep=None) # keep all checkpoints
144 | ckpt = tf.train.get_checkpoint_state(self.checkpoint_path)
145 |
146 | # resume training if a checkpoint exists
147 | if ckpt and ckpt.model_checkpoint_path:
148 | restore_vars = self._variables_to_restore(ckpt.model_checkpoint_path, sess.graph)
149 |                 restorer = tf.train.Saver(var_list=restore_vars)  # separate saver, so the full `saver` above still saves everything later
150 |                 restorer.restore(sess, ckpt.model_checkpoint_path)
151 | print('Loaded parameters from {}'.format(ckpt.model_checkpoint_path))
152 |
153 | initial_step = global_step.eval()
154 |
155 | # train the model
156 | streaming_loss = 0
157 | for i in range(initial_step, self.num_iter + 1):
158 | _, loss_batch = sess.run([optimizer, loss], feed_dict={training: True})
159 |
160 | if not np.isfinite(loss_batch):
161 | print('loss diverged, stopping')
162 | exit()
163 |
164 | # log summary
165 | streaming_loss += loss_batch
166 | if i % self.log_iter == self.log_iter - 1:
167 | streaming_loss /= self.log_iter
168 | print(i + 1, streaming_loss)
169 | summary_train = sess.run(summ_op_train, feed_dict={streaming_loss_p: streaming_loss})
170 | writer.add_summary(summary_train, global_step=i)
171 | streaming_loss = 0
172 |
173 | # save model
174 | if i % self.save_iter == self.save_iter - 1:
175 | saver.save(sess, os.path.join(self.checkpoint_path, 'checkpoint'), global_step=global_step)
176 | print("Model saved!")
177 |
178 | # run validation
179 | if i % self.val_iter == self.val_iter - 1:
180 | print("Running validation.")
181 | sess.run(self.iterator.initializer, feed_dict={self.filenames: self.validation_filenames})
182 |
183 | validation_accuracy = 0
184 | for j in range(self.num_val_samples // self.batch_size):
185 | acc_batch = sess.run(accuracy, feed_dict={training: False})
186 | validation_accuracy += acc_batch
187 |                     validation_accuracy /= (j + 1)  # average over the number of batches run, not the last index
188 |
189 | print("Accuracy: {}".format(validation_accuracy))
190 |
191 | summary_test = sess.run(summ_op_test, feed_dict={accuracy_p: validation_accuracy})
192 | writer.add_summary(summary_test, global_step=i)
193 |
194 | sess.run(self.iterator.initializer, feed_dict={self.filenames: self.training_filenames})
195 |
196 | writer.close()
197 |
198 | def main():
199 | parser = argparse.ArgumentParser()
200 | parser.add_argument('--checkpoint_path', type=str, default='./checkpoints/',
201 | help="Path to the dir where the checkpoints are saved")
202 | parser.add_argument('--data_path', type=str, default='./tfrecords/', help="Path to the TFRecords")
203 | args = parser.parse_args()
204 | trainer = TFModelTrainer(args.checkpoint_path, args.data_path)
205 | trainer.train()
206 |
207 | if __name__ == '__main__':
208 | main()
209 |
--------------------------------------------------------------------------------
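
The shape check in _variables_to_restore is what makes the warm start work: the
slim checkpoint was trained with 1001 ImageNet classes, so its Conv2d_1c_1x1
logits kernel does not match this 2-class head and must be skipped, as must
variables that exist only in this graph (the Adam slots and iter_number). A
standalone sketch of the same filtering, assuming the graph has been built and
the checkpoint downloaded into the working directory:

    import tensorflow as tf

    reader = tf.train.NewCheckpointReader('mobilenet_v1_1.0_224.ckpt')
    saved_shapes = reader.get_variable_to_shape_map()
    for var in tf.global_variables():
        name = var.name.split(':')[0]
        if name not in saved_shapes:
            print('skip (not in checkpoint):', name)   # e.g. Adam slots, iter_number
        elif var.get_shape().as_list() != saved_shapes[name]:
            print('skip (shape mismatch):', name)      # e.g. the 2-class logits kernel
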
/S8/datagenerator.py:
--------------------------------------------------------------------------------
1 | """ A Python iterator that loads and processes images.
2 | This iterator will be called through TensorFlow Dataset API to feed pairs of
3 | clean and noisy images into the model.
4 | """
5 |
6 | import cv2
7 | import numpy as np
8 | import random
9 | import os, glob
10 |
11 | class DataGenerator:
12 |
13 |     def __init__(self, image_size, base_dir='../S6/dataset'):
14 | # all images will be center cropped and resized to image_size
15 | self.image_size = image_size
16 |
17 | # number of validation samples
18 | self.num_val = 320
19 |
20 | filenames = glob.glob(os.path.join(base_dir, '**', '*.jpg'), recursive=True)
21 | self.filenames_train = filenames[self.num_val:]
22 | self.filenames_val = filenames[:self.num_val]
23 |
24 | # dataset mode
25 | self.is_training = True
26 | self.val_idx = 0
27 |
28 | def __iter__(self):
29 | return self
30 |
31 | def __next__(self):
32 | try:
33 | return self.fetch_sample()
34 | except IndexError:
35 | raise StopIteration()
36 |
37 | def get_tensor_shape(self):
38 | return (self.image_size[1], self.image_size[0], 3)
39 |
40 | def set_mode(self, is_training):
41 | self.is_training = is_training
42 | self.val_idx = 0
43 |
44 | def fetch_sample(self):
45 | if self.is_training:
46 | # pick a random image
47 | impath = random.choice(self.filenames_train)
48 | else:
49 | # pick the next validation sample
50 | impath = self.filenames_val[self.val_idx]
51 | self.val_idx += 1
52 | image_in = cv2.imread(impath)
53 |
54 | # resize to image_size
55 | image_in = self.center_crop_and_resize(image_in)
56 |
57 | # inject noise
58 | image_out = self.add_random_noise(image_in)
59 |
60 | return image_in, image_out
61 |
62 | def center_crop_and_resize(self, image):
63 | R, C, _ = image.shape
64 | if R > C:
65 | pad = (R - C) // 2
66 | image = image[pad:-pad, :]
67 | elif C > R:
68 | pad = (C - R) // 2
69 | image = image[:, pad:-pad]
70 | image = cv2.resize(image, self.image_size)
71 | return image
72 |
73 |     def add_random_noise(self, image):
74 |         # standard deviation of the additive Gaussian noise, drawn per image
75 |         noise_sigma = random.randrange(3, 15)
76 |         h, w, c = image.shape
77 |         image_out = image.astype(np.float32)  # astype returns a copy
78 |         noise = np.random.randn(h, w, c) * noise_sigma
79 |         image_out += noise
80 |         image_out = np.clip(image_out, 0, 255)
81 |         image_out = image_out.astype(np.uint8)
82 |         return image_out
83 |
84 | if __name__ == '__main__':
85 | dg = DataGenerator(image_size=(256, 256))
86 | image_in, image_out = next(dg)
87 | cv2.imwrite('image_in.jpg', image_in)
88 | cv2.imwrite('image_out.jpg', image_out)
89 |
--------------------------------------------------------------------------------
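
The noise model above is additive Gaussian: noisy = clean + n with n ~ N(0, sigma^2),
sigma drawn per image from [3, 15), and the result clipped to [0, 255]. A quick
empirical check of that contract (assumes ../S6/dataset is populated, as the
default base_dir expects):

    import numpy as np
    from datagenerator import DataGenerator

    dg = DataGenerator(image_size=(224, 224))
    clean, noisy = next(dg)
    residual = noisy.astype(np.float32) - clean.astype(np.float32)
    print('measured noise std: %.1f' % residual.std())  # roughly 3-15, modulo clipping
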
/S8/nets/model.py:
--------------------------------------------------------------------------------
1 | """ A simple encoder-decoder network with skip connections.
2 | """
3 |
4 | import tensorflow as tf
5 |
6 | def model(input_layer, training, weight_decay=0.00001):
7 |
8 | reg = tf.contrib.layers.l2_regularizer(weight_decay)
9 |
10 | def conv_block(inputs, num_filters, name):
11 | with tf.variable_scope(name):
12 | net = tf.layers.separable_conv2d(
13 | inputs=inputs,
14 | filters=num_filters,
15 | kernel_size=(3,3),
16 | padding='SAME',
17 | use_bias=False,
18 | activation=None,
19 | pointwise_regularizer=reg,
20 | depthwise_regularizer=reg)
21 | net = tf.layers.batch_normalization(
22 | inputs=net,
23 | training=training)
24 | net = tf.nn.relu(net)
25 | return net
26 |
27 | def pointwise_block(inputs, num_filters, name):
28 | with tf.variable_scope(name):
29 | net = tf.layers.conv2d(
30 | inputs=inputs,
31 | filters=num_filters,
32 | kernel_size=1,
33 | use_bias=False,
34 | activation=None,
35 | kernel_regularizer=reg)
36 | net = tf.layers.batch_normalization(
37 | inputs=net,
38 | training=training)
39 | net = tf.nn.relu(net)
40 | return net
41 |
42 | def pooling(inputs, name):
43 | with tf.variable_scope(name):
44 | net = tf.layers.max_pooling2d(inputs,
45 | pool_size=(2,2),
46 | strides=(2,2))
47 | return net
48 |
49 | def downsampling(inputs, name):
50 | with tf.variable_scope(name):
51 | net = tf.layers.average_pooling2d(inputs,
52 | pool_size=(2,2),
53 | strides=(2,2))
54 | return net
55 |
56 | def upsampling(inputs, name):
57 | with tf.variable_scope(name):
58 | dims = tf.shape(inputs)
59 | new_size = [dims[1]*2, dims[2]*2]
60 | net = tf.image.resize_bilinear(inputs, new_size)
61 | return net
62 |
63 | def output_block(inputs, name):
64 | with tf.variable_scope(name):
65 | net = tf.layers.conv2d(
66 | inputs=inputs,
67 | filters=3,
68 | kernel_size=(1,1),
69 | activation=None)
70 | return net
71 |
72 | def subnet_module(inputs, name, num_filters, num_layers=3):
73 | with tf.variable_scope(name):
74 | for i in range(num_layers-1):
75 | net = conv_block(inputs, num_filters=num_filters, name='{}_conv{}'.format(name, i))
76 | inputs = tf.concat([net, inputs], axis=3)
77 | net = conv_block(inputs, num_filters=num_filters, name='{}_conv3'.format(name))
78 | return net
79 |
80 | num_filters = 16
81 | net = input_layer
82 | skip_connections = []
83 | # encoder
84 | with tf.variable_scope('encoder'):
85 | for i in range(4):
86 | net = subnet_module(net, num_filters=num_filters, name='conv_e{}'.format(i))
87 | skip_connections.append(net)
88 | net = pooling(net, name='pool{}'.format(i))
89 | num_filters *= 2
90 |
91 | # bottleneck
92 |     net = subnet_module(net, num_filters=num_filters, name='conv_bottleneck')
93 |
94 | # decoder
95 | with tf.variable_scope('decoder'):
96 | for i in range(4):
97 |             num_filters //= 2  # integer division keeps the filter count an int
98 | net = upsampling(net, name='upsample{}'.format(i))
99 | net = tf.concat([net, skip_connections.pop()], axis=3)
100 | net = subnet_module(net, num_filters=num_filters, name='subnet_d{}'.format(i))
101 |
102 | # exit flow
103 | with tf.variable_scope('exit_flow'):
104 | logits = output_block(net, name='output_block')
105 |
106 | return logits
107 |
--------------------------------------------------------------------------------
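
To make the symmetry of the network above concrete: with a 224x224 input, the four
2x2 poolings take the activations down to 14x14 at the bottleneck, and the four
bilinear upsamplings bring them back up, each decoder stage concatenating the
matching encoder output. A minimal shape check (TF 1.x, as in the rest of the repo):

    import numpy as np
    import tensorflow as tf
    from nets.model import model

    inputs = tf.placeholder(tf.float32, [None, 224, 224, 3])
    training = tf.placeholder(tf.bool)
    logits = model(inputs, training=training)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        out = sess.run(logits, feed_dict={inputs: np.zeros((1, 224, 224, 3), np.float32),
                                          training: False})
        print(out.shape)  # (1, 224, 224, 3): output matches the input resolution
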
/S8/trainer.py:
--------------------------------------------------------------------------------
1 | """ Coding Session 8: using a python iterator as a data generator and training a denoising autoencoder
2 | """
3 |
4 | import tensorflow as tf
5 | import numpy as np
6 | import os, glob
7 | import argparse
8 | from nets.model import model
9 | from datagenerator import DataGenerator #########################
10 |
11 | os.environ['TF_CPP_MIN_LOG_LEVEL']='3'
12 |
13 | class TFModelTrainer:
14 |
15 | def __init__(self, checkpoint_path):
16 | self.checkpoint_path = checkpoint_path
17 |
18 | # set training parameters
19 | self.learning_rate = 0.01
20 | self.num_iter = 100000
21 | self.save_iter = 1000
22 | self.val_iter = 1000
23 | self.log_iter = 100
24 | self.batch_size = 16
25 |
26 | # set up data layer
27 | self.image_size = (224, 224)
28 | self.data_generator = DataGenerator(self.image_size)
29 |
30 | def preprocess_image(self, image):
31 | # normalize image to [-1, +1]
32 | image = tf.cast(image, tf.float32)
33 | image = image / 127.5
34 | image = image - 1
35 | return image
36 |
37 | def _preprocess_images(self, image_orig, image_noisy):
38 | image_orig = self.preprocess_image(image_orig)
39 | image_noisy = self.preprocess_image(image_noisy)
40 | return image_orig, image_noisy
41 |
42 | def _data_layer(self, num_threads=8, prefetch_buffer=100):
43 | with tf.variable_scope('data'):
44 | data_shape = self.data_generator.get_tensor_shape() #########################
45 | dataset = tf.data.Dataset.from_generator(lambda: self.data_generator,
46 | (tf.float32, tf.float32),
47 | (tf.TensorShape(data_shape),
48 | tf.TensorShape(data_shape)))
49 | dataset = dataset.map(self._preprocess_images, num_parallel_calls=num_threads)
50 | dataset = dataset.batch(self.batch_size)
51 | dataset = dataset.prefetch(prefetch_buffer)
52 | iterator = dataset.make_initializable_iterator()
53 | return iterator
54 |
55 | def _loss_functions(self, preds, ground_truth):
56 | with tf.name_scope('loss'):
57 | tf.losses.mean_squared_error(ground_truth, preds) #########################
58 | total_loss = tf.losses.get_total_loss()
59 | return total_loss
60 |
61 | def _optimizer(self, loss, global_step):
62 | update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
63 | with tf.control_dependencies(update_ops):
64 | optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate, epsilon=0.1)
65 | optimizer = optimizer.minimize(loss, global_step=global_step)
66 | return optimizer
67 |
68 | def train(self):
69 | # iteration number
70 | global_step = tf.Variable(1, dtype=tf.int32, trainable=False, name='iter_number')
71 |
72 | # training graph
73 | iterator = self._data_layer()
74 | image_orig, image_noisy = iterator.get_next()
75 | training = tf.placeholder(tf.bool, name='is_training')
76 | logits = model(image_noisy, training=training)
77 | loss = self._loss_functions(logits, image_orig)
78 | optimizer = self._optimizer(loss, global_step)
79 |
80 | # summary placeholders
81 | streaming_loss_p = tf.placeholder(tf.float32)
82 | validation_loss_p = tf.placeholder(tf.float32)
83 | summ_op_train = tf.summary.scalar('streaming_loss', streaming_loss_p)
84 | summ_op_test = tf.summary.scalar('validation_loss', validation_loss_p)
85 |
86 | # don't allocate entire gpu memory
87 | config = tf.ConfigProto()
88 | config.gpu_options.allow_growth = True
89 |
90 | with tf.Session(config=config) as sess:
91 | sess.run(tf.global_variables_initializer())
92 | sess.run(iterator.initializer)
93 |
94 | writer = tf.summary.FileWriter(self.checkpoint_path, sess.graph)
95 |
96 | saver = tf.train.Saver(max_to_keep=None) # keep all checkpoints
97 | ckpt = tf.train.get_checkpoint_state(self.checkpoint_path)
98 |
99 | # resume training if a checkpoint exists
100 | if ckpt and ckpt.model_checkpoint_path:
101 | saver.restore(sess, ckpt.model_checkpoint_path)
102 | print('Loaded parameters from {}'.format(ckpt.model_checkpoint_path))
103 |
104 | initial_step = global_step.eval()
105 |
106 | # train the model
107 | streaming_loss = 0
108 | for i in range(initial_step, self.num_iter + 1):
109 | _, loss_batch = sess.run([optimizer, loss], feed_dict={training: True})
110 |
111 | if not np.isfinite(loss_batch):
112 | print('loss diverged, stopping')
113 | exit()
114 |
115 | # log summary
116 | streaming_loss += loss_batch
117 | if i % self.log_iter == self.log_iter - 1:
118 | streaming_loss /= self.log_iter
119 | print(i + 1, streaming_loss)
120 | summary_train = sess.run(summ_op_train, feed_dict={streaming_loss_p: streaming_loss})
121 | writer.add_summary(summary_train, global_step=i)
122 | streaming_loss = 0
123 |
124 | # save model
125 | if i % self.save_iter == self.save_iter - 1:
126 | saver.save(sess, os.path.join(self.checkpoint_path, 'checkpoint'), global_step=global_step)
127 | print("Model saved!")
128 |
129 | # run validation
130 | if i % self.val_iter == self.val_iter - 1:
131 | print("Running validation.")
132 | self.data_generator.set_mode(is_training=False)
133 | sess.run(iterator.initializer)
134 |
135 | validation_loss = 0
136 | for j in range(self.data_generator.num_val // self.batch_size):
137 | loss_batch = sess.run(loss, feed_dict={training: False})
138 | validation_loss += loss_batch
139 |                     validation_loss /= (j + 1)  # j is the last zero-based batch index, so j + 1 batches were summed
140 |
141 | print("Validation loss: {}".format(validation_loss))
142 |
143 | summary_test = sess.run(summ_op_test, feed_dict={validation_loss_p: validation_loss})
144 | writer.add_summary(summary_test, global_step=i)
145 |
146 | self.data_generator.set_mode(is_training=True)
147 | sess.run(iterator.initializer)
148 |
149 | writer.close()
150 |
151 | def main():
152 | parser = argparse.ArgumentParser()
153 | parser.add_argument('--checkpoint_path', type=str, default='./checkpoints/',
154 | help="Path to the dir where the checkpoints are saved")
155 | args = parser.parse_args()
156 | trainer = TFModelTrainer(args.checkpoint_path)
157 | trainer.train()
158 |
159 | if __name__ == '__main__':
160 | main()
161 |
--------------------------------------------------------------------------------
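The trainer above leans on tf.data.Dataset.from_generator; the pattern can be exercised in isolation. Below is a self-contained sketch under the same TF 1.x API, with a stand-in generator yielding random (clean, noisy) pairs rather than the repo's DataGenerator:

    import numpy as np
    import tensorflow as tf

    def pair_generator():
        # illustrative stand-in: random "clean" images plus Gaussian noise
        while True:
            clean = np.random.rand(224, 224, 3).astype(np.float32)
            noisy = clean + np.random.normal(scale=0.1, size=clean.shape).astype(np.float32)
            yield clean, noisy

    dataset = tf.data.Dataset.from_generator(pair_generator,
                                             (tf.float32, tf.float32),
                                             (tf.TensorShape([224, 224, 3]),
                                              tf.TensorShape([224, 224, 3])))
    dataset = dataset.batch(4)
    iterator = dataset.make_initializable_iterator()
    clean_batch, noisy_batch = iterator.get_next()

    with tf.Session() as sess:
        sess.run(iterator.initializer)
        print(sess.run(clean_batch).shape)  # (4, 224, 224, 3)
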
/S9/S1.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 | x = 2.0
4 | y = 8.0
5 |
6 | @tf.function
7 | def geometric_mean(x, y):
8 | g_mean = tf.sqrt(x * y)
9 | return g_mean
10 |
11 | g_mean = geometric_mean(x, y)
12 | tf.print(g_mean)
--------------------------------------------------------------------------------
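One caveat about the snippet above: tf.function traces a separate graph for every distinct Python scalar it receives, while tensor arguments share a single trace. A brief illustration (TF 2.x):

    import tensorflow as tf

    @tf.function
    def geometric_mean(x, y):
        return tf.sqrt(x * y)

    # Python floats are baked into the trace as constants, so each new value retraces
    tf.print(geometric_mean(2.0, 8.0))                            # 4
    # tensors reuse one traced graph for any float32 inputs
    tf.print(geometric_mean(tf.constant(2.0), tf.constant(8.0)))  # 4
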
/S9/S2.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 | (train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()
4 |
5 | model = tf.keras.Sequential([
6 | tf.keras.layers.Flatten(input_shape=(28, 28)),
7 | tf.keras.layers.Dense(512, activation=tf.nn.relu),
8 | tf.keras.layers.Dense(10, activation=tf.nn.softmax)
9 | ])
10 |
11 | model.compile(optimizer='adam',
12 | loss='sparse_categorical_crossentropy',
13 | metrics=['accuracy'])
14 |
15 | model.fit(train_images, train_labels, epochs=5)
16 |
--------------------------------------------------------------------------------
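The script trains but never touches the test split that load_data() returned. A natural follow-up, sketched here rather than taken from the session itself:

    test_loss, test_acc = model.evaluate(test_images, test_labels)
    print('test accuracy: {:.3f}'.format(test_acc))
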
/S9/S3.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 | (train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()
4 |
5 | inputs = tf.keras.Input(shape=(28,28))
6 | net = tf.keras.layers.Flatten()(inputs)
7 | net = tf.keras.layers.Dense(512, activation=tf.nn.relu)(net)
8 | net = tf.keras.layers.Dense(10, activation=tf.nn.softmax)(net)
9 | model = tf.keras.Model(inputs=inputs, outputs=net)
10 |
11 | model.compile(optimizer='adam',
12 | loss='sparse_categorical_crossentropy',
13 | metrics=['accuracy'])
14 |
15 | model.fit(train_images, train_labels, epochs=5)
16 |
--------------------------------------------------------------------------------
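This functional model is equivalent to the Sequential one in S2.py, but the explicit layer graph also makes the model easy to inspect before training, e.g.:

    model.summary()  # prints each layer's output shape and parameter count
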
/S9/S4.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 | data_train, _ = tf.keras.datasets.mnist.load_data()
4 | dataset = tf.data.Dataset.from_tensor_slices(data_train)
5 | dataset = dataset.shuffle(buffer_size=60000)
6 | dataset = dataset.batch(32)
7 |
8 | for row in dataset.take(1):  # peek at a single batch rather than printing all of them
9 | print(row)
10 |
11 | model = tf.keras.Sequential([
12 | tf.keras.layers.Flatten(input_shape=(28, 28)),
13 | tf.keras.layers.Dense(512, activation=tf.nn.relu),
14 | tf.keras.layers.Dense(10, activation=tf.nn.softmax)
15 | ])
16 |
17 | model.compile(optimizer='adam',
18 | loss='sparse_categorical_crossentropy',
19 | metrics=['accuracy'])
20 | model.fit(dataset, epochs=5)
21 |
--------------------------------------------------------------------------------
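A variation on the pipeline above, sketched rather than taken from the session: preprocessing can live in the tf.data pipeline itself, e.g. normalizing the uint8 pixels with map() before shuffling and batching (the normalize helper is illustrative):

    def normalize(image, label):
        return tf.cast(image, tf.float32) / 255.0, label

    dataset = tf.data.Dataset.from_tensor_slices(data_train)
    dataset = dataset.map(normalize).shuffle(buffer_size=60000).batch(32)
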
/S9/S5.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 | data_train, _ = tf.keras.datasets.mnist.load_data()
4 | dataset = tf.data.Dataset.from_tensor_slices(data_train)
5 | dataset = dataset.shuffle(buffer_size=60000)
6 | dataset = dataset.batch(32)
7 |
8 | model = tf.keras.Sequential([
9 | tf.keras.layers.Flatten(input_shape=(28, 28)),
10 | tf.keras.layers.Dense(512, activation=tf.nn.relu),
11 | tf.keras.layers.Dense(10, activation=tf.nn.softmax)
12 | ])
13 |
14 | model.compile(optimizer='adam',
15 | loss='sparse_categorical_crossentropy',
16 | metrics=['accuracy'])
17 |
18 | tb = tf.keras.callbacks.TensorBoard(log_dir='./checkpoints')
19 | model.fit(dataset, epochs=5, callbacks=[tb])
20 |
--------------------------------------------------------------------------------
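With the callback in place, running tensorboard --logdir ./checkpoints from a shell serves the logged metrics. The same callback list can also carry checkpointing; a sketch with an illustrative filepath pattern:

    ckpt = tf.keras.callbacks.ModelCheckpoint('./checkpoints/mnist_{epoch:02d}.h5')
    model.fit(dataset, epochs=5, callbacks=[tb, ckpt])
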
/S9/S6.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 | data_train, _ = tf.keras.datasets.mnist.load_data()
4 | dataset = tf.data.Dataset.from_tensor_slices(data_train)
5 | dataset = dataset.shuffle(buffer_size=60000)
6 | dataset = dataset.batch(32)
7 |
8 | strategy = tf.distribute.MirroredStrategy()
9 | with strategy.scope():
10 | model = tf.keras.Sequential([
11 | tf.keras.layers.Flatten(input_shape=(28, 28)),
12 | tf.keras.layers.Dense(512, activation=tf.nn.relu),
13 | tf.keras.layers.Dense(10, activation=tf.nn.softmax)
14 | ])
15 |
16 |     model.compile(optimizer='adam',  # compile inside the strategy scope so optimizer state is mirrored
17 |                   loss='sparse_categorical_crossentropy',
18 |                   metrics=['accuracy'])
19 |
20 | tb = tf.keras.callbacks.TensorBoard(log_dir='./checkpoints')
21 | model.fit(dataset, epochs=5, callbacks=[tb])
22 |
--------------------------------------------------------------------------------
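MirroredStrategy splits each batch of 32 across whatever GPUs it detects, falling back to a single device otherwise; a quick sanity check, as a sketch:

    print('replicas in sync:', strategy.num_replicas_in_sync)
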
/img/dlcc_github.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/isikdogan/deep_learning_tutorials/7a81d56c1b6e8bee715ddb08e85ea25562acbdd8/img/dlcc_github.jpg
--------------------------------------------------------------------------------
/img/tfcs_github.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/isikdogan/deep_learning_tutorials/7a81d56c1b6e8bee715ddb08e85ea25562acbdd8/img/tfcs_github.png
--------------------------------------------------------------------------------