├── README.md
├── client.py
├── custom_model.py
└── demo.ipynb


/README.md:
--------------------------------------------------------------------------------
 1 | # How-to-Deploy-a-Tensorflow-Model-in-Production
 2 | This is the code for the "How to Deploy a Tensorflow Model in Production" by Siraj Raval on YouTube
 3 | 
 4 | ## Overview
 5 | 
 6 | This is the code for [this](https://youtu.be/T_afaArR0E8) video on Youtube by Siraj Raval. We're going to use the [Tensorflow Serving](https://tensorflow.github.io/serving/) library to deploy an inception model in production. 
 7 | 
 8 | ## Dependencies
 9 | 
10 | All included in the iPython notebook. You just need [docker](https://www.docker.com/)
11 | 
12 | ## Usage
13 | 
14 | Run the notebook by running `jupyter notebook` in terminal in the main directory. All the instructions are in there. 
15 | 
16 | The 2 attached files -- one is the client and the other is an example of how we train and save an simple MNIST model for Tensorflow Serving.  
17 | 
18 | ## Credits
19 | 
20 | Credits go to Google. I've merely created a wrapper to get people started.
21 | 


--------------------------------------------------------------------------------
/client.py:
--------------------------------------------------------------------------------
 1 | # Copyright 2016 Google Inc. All Rights Reserved.
 2 | #
 3 | # Licensed under the Apache License, Version 2.0 (the "License");
 4 | # you may not use this file except in compliance with the License.
 5 | # You may obtain a copy of the License at
 6 | #
 7 | #     http://www.apache.org/licenses/LICENSE-2.0
 8 | #
 9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | # ==============================================================================
15 | 
16 | #!/usr/bin/env python2.7
17 | 
18 | """Send JPEG image to tensorflow_model_server loaded with inception model.
19 | """
20 | 
21 | from __future__ import print_function
22 | 
23 | # This is a placeholder for a Google-internal import.
24 | 
25 | from grpc.beta import implementations
26 | import tensorflow as tf
27 | 
28 | from tensorflow_serving.apis import predict_pb2
29 | from tensorflow_serving.apis import prediction_service_pb2
30 | 
31 | 
32 | tf.app.flags.DEFINE_string('server', 'localhost:9000',
33 |                            'PredictionService host:port')
34 | tf.app.flags.DEFINE_string('image', '', 'path to image in JPEG format')
35 | FLAGS = tf.app.flags.FLAGS
36 | 
37 | 
38 | def main(_):
39 |   host, port = FLAGS.server.split(':')
40 |   channel = implementations.insecure_channel(host, int(port))
41 |   stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)
42 |   # Send request
43 |   with open(FLAGS.image, 'rb') as f:
44 |     # See prediction_service.proto for gRPC request/response details.
45 |     data = f.read()
46 |     request = predict_pb2.PredictRequest()
47 |     request.model_spec.name = 'inception'
48 |     request.model_spec.signature_name = 'predict_images'
49 |     request.inputs['images'].CopyFrom(
50 |         tf.contrib.util.make_tensor_proto(data, shape=[1]))
51 |     result = stub.Predict(request, 10.0)  # 10 secs timeout
52 |     print(result)
53 | 
54 | 
55 | if __name__ == '__main__':
56 |   tf.app.run()
57 | 


--------------------------------------------------------------------------------
/custom_model.py:
--------------------------------------------------------------------------------
  1 | # Copyright 2016 Google Inc. All Rights Reserved.
  2 | #
  3 | # Licensed under the Apache License, Version 2.0 (the "License");
  4 | # you may not use this file except in compliance with the License.
  5 | # You may obtain a copy of the License at
  6 | #
  7 | #     http://www.apache.org/licenses/LICENSE-2.0
  8 | #
  9 | # Unless required by applicable law or agreed to in writing, software
 10 | # distributed under the License is distributed on an "AS IS" BASIS,
 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12 | # See the License for the specific language governing permissions and
 13 | # limitations under the License.
 14 | # ==============================================================================
 15 | 
 16 | #!/usr/bin/env python2.7
 17 | """Train and export a simple Softmax Regression TensorFlow model.
 18 | The model is from the TensorFlow "MNIST For ML Beginner" tutorial. This program
 19 | simply follows all its training instructions, and uses TensorFlow SavedModel to
 20 | export the trained model with proper signatures that can be loaded by standard
 21 | tensorflow_model_server.
 22 | Usage: mnist_export.py [--training_iteration=x] [--model_version=y] export_dir
 23 | """
 24 | 
 25 | import os
 26 | import sys
 27 | import tensorflow as tf
 28 | from tensorflow.python.saved_model import builder as saved_model_builder
 29 | from tensorflow.python.saved_model import signature_constants
 30 | from tensorflow.python.saved_model import signature_def_utils
 31 | from tensorflow.python.saved_model import tag_constants
 32 | from tensorflow.python.saved_model import utils
 33 | from tensorflow.python.util import compat
 34 | from tensorflow_serving.example import mnist_input_data
 35 | 
 36 | 
 37 | #training flags
 38 | tf.app.flags.DEFINE_integer('training_iteration', 1000,
 39 |                             'number of training iterations.')
 40 | tf.app.flags.DEFINE_integer('model_version', 1, 'version number of the model.')
 41 | tf.app.flags.DEFINE_string('work_dir', '/tmp', 'Working directory.')
 42 | FLAGS = tf.app.flags.FLAGS
 43 | 
 44 | 
 45 | def main(_):
 46 |   if len(sys.argv) < 2 or sys.argv[-1].startswith('-'):
 47 |     print('Usage: mnist_export.py [--training_iteration=x] '
 48 |           '[--model_version=y] export_dir')
 49 |     sys.exit(-1)
 50 |   if FLAGS.training_iteration <= 0:
 51 |     print 'Please specify a positive value for training iteration.'
 52 |     sys.exit(-1)
 53 |   if FLAGS.model_version <= 0:
 54 |     print 'Please specify a positive value for version number.'
 55 |     sys.exit(-1)
 56 | 
 57 |   # Train model
 58 |   print 'Training model...'
 59 |   
 60 |   #Read the data and format it
 61 |   mnist = mnist_input_data.read_data_sets(FLAGS.work_dir, one_hot=True)
 62 |   sess = tf.InteractiveSession()
 63 |   serialized_tf_example = tf.placeholder(tf.string, name='tf_example')
 64 |   feature_configs = {'x': tf.FixedLenFeature(shape=[784], dtype=tf.float32),}
 65 |   tf_example = tf.parse_example(serialized_tf_example, feature_configs)
 66 |   
 67 |   
 68 |   #Build model
 69 |   x = tf.identity(tf_example['x'], name='x')  # use tf.identity() to assign name
 70 |   y_ = tf.placeholder('float', shape=[None, 10])
 71 |   w = tf.Variable(tf.zeros([784, 10]))
 72 |   b = tf.Variable(tf.zeros([10]))
 73 |   sess.run(tf.global_variables_initializer())
 74 |   y = tf.nn.softmax(tf.matmul(x, w) + b, name='y')
 75 |   cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
 76 |   train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
 77 |   values, indices = tf.nn.top_k(y, 10)
 78 |   table = tf.contrib.lookup.index_to_string_table_from_tensor(
 79 |       tf.constant([str(i) for i in xrange(10)]))
 80 |   
 81 |   
 82 |   #train the model
 83 |   prediction_classes = table.lookup(tf.to_int64(indices))
 84 |   for _ in range(FLAGS.training_iteration):
 85 |     batch = mnist.train.next_batch(50)
 86 |     train_step.run(feed_dict={x: batch[0], y_: batch[1]})
 87 |   correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
 88 |   accuracy = tf.reduce_mean(tf.cast(correct_prediction, 'float'))
 89 |   print 'training accuracy %g' % sess.run(
 90 |       accuracy, feed_dict={x: mnist.test.images,
 91 |                            y_: mnist.test.labels})
 92 |   print 'Done training!'
 93 | 
 94 |   # Save the model
 95 |   
 96 |   #where to save to?
 97 |   export_path_base = sys.argv[-1]
 98 |   export_path = os.path.join(
 99 |       compat.as_bytes(export_path_base),
100 |       compat.as_bytes(str(FLAGS.model_version)))
101 |   print 'Exporting trained model to', export_path
102 |   
103 |   #This creates a SERVABLE from our model
104 |   #saves a "snapshot" of the trained model to reliable storage 
105 |   #so that it can be loaded later for inference.
106 |   #can save as many version as necessary
107 |   
108 |   #the tensoroflow serving main file tensorflow_model_server
109 |   #will create a SOURCE out of it, the source
110 |   #can house state that is shared across multiple servables 
111 |   #or versions
112 |   
113 |   #we can later create a LOADER from it using tf.saved_model.loader.load
114 |   
115 |   #then the MANAGER decides how to handle its lifecycle
116 |   
117 |   builder = saved_model_builder.SavedModelBuilder(export_path)
118 | 
119 |   # Build the signature_def_map.
120 |   #Signature specifies what type of model is being exported, 
121 |   #and the input/output tensors to bind to when running inference.
122 |   #think of them as annotiations on the graph for serving
123 |   #we can use them a number of ways
124 |   #grabbing whatever inputs/outputs/models we want either on server
125 |   #or via client
126 |   classification_inputs = utils.build_tensor_info(serialized_tf_example)
127 |   classification_outputs_classes = utils.build_tensor_info(prediction_classes)
128 |   classification_outputs_scores = utils.build_tensor_info(values)
129 | 
130 |    
131 |   classification_signature = signature_def_utils.build_signature_def(
132 |       inputs={signature_constants.CLASSIFY_INPUTS: classification_inputs},
133 |       outputs={
134 |           signature_constants.CLASSIFY_OUTPUT_CLASSES:
135 |               classification_outputs_classes,
136 |           signature_constants.CLASSIFY_OUTPUT_SCORES:
137 |               classification_outputs_scores
138 |       },
139 |       method_name=signature_constants.CLASSIFY_METHOD_NAME)
140 | 
141 |   tensor_info_x = utils.build_tensor_info(x)
142 |   tensor_info_y = utils.build_tensor_info(y)
143 | 
144 |   prediction_signature = signature_def_utils.build_signature_def(
145 |       inputs={'images': tensor_info_x},
146 |       outputs={'scores': tensor_info_y},
147 |       method_name=signature_constants.PREDICT_METHOD_NAME)
148 | 
149 |   legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')
150 |   
151 |   #add the sigs to the servable
152 |   builder.add_meta_graph_and_variables(
153 |       sess, [tag_constants.SERVING],
154 |       signature_def_map={
155 |           'predict_images':
156 |               prediction_signature,
157 |           signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
158 |               classification_signature,
159 |       },
160 |       legacy_init_op=legacy_init_op)
161 | 
162 |   #save it!
163 |   builder.save()
164 | 
165 |   print 'Done exporting!'
166 | 
167 | 
168 | if __name__ == '__main__':
169 |   tf.app.run()
170 | 


--------------------------------------------------------------------------------
/demo.ipynb:
--------------------------------------------------------------------------------
  1 | {
  2 |  "cells": [
  3 |   {
  4 |    "cell_type": "markdown",
  5 |    "metadata": {},
  6 |    "source": [
  7 |     "# How to Deploy a Tensorflow Model in Production\n",
  8 |     "\n",
  9 |     "We know how to \n",
 10 |     "- write models\n",
 11 |     "- train models\n",
 12 |     "- test models\n",
 13 |     "\n",
 14 |     "but how do we deploy them for production use?\n",
 15 |     "\n",
 16 |     "Let's create a simple  webapp that will allow the user to upload an image and run the Inception model over it for classifying.\n",
 17 |     "\n",
 18 |     "##  Tensorflow Serving \n",
 19 |     "\n",
 20 |     "![Image of Yaktocat](https://cdn-images-1.medium.com/max/1800/0*O7yprjYDk2WTO3__.)\n",
 21 |     "\n",
 22 |     "![Image of Yaktocat](https://blogs.nvidia.com/wp-content/uploads/2016/08/ai_difference_between_deep_learning_training_inference.jpg)\n",
 23 |     "\n",
 24 |     "- Google's open source library that accompanies Tensorflow\n",
 25 |     "- Meant for Inference. (managing models, giving versioned access via reference-counted lookup table i.e HTTP interface via RPC) \n",
 26 |     "https://apihandyman.io/do-you-really-know-why-you-prefer-rest-over-rpc/\n",
 27 |     "- Can serve multiple models simultaneously (great for A/B testing)\n",
 28 |     "- Can serve multiple versions of the same model\n",
 29 |     "- written in C++ \n",
 30 |     "\n",
 31 |     "\n",
 32 |     "## Architecture Overview\n",
 33 |     "\n",
 34 |     "![Image of Yaktocat](https://tensorflow.github.io/serving/images/serving_architecture.svg)\n",
 35 |     "\n",
 36 |     "### 4 major components\n",
 37 |     "\n",
 38 |     "### Servables\n",
 39 |     "\n",
 40 |     "- The central abstraction in TensorFlow Serving. They are the objects that clients use to perform computation (for example, a lookup or inference).\n",
 41 |     "- Flexible size (single lookup table shard, model, multiple models)\n",
 42 |     "- Good for Concurrent operations, A/B testing\n",
 43 |     "- Multiple versions of a servable in one instance helps refresh configs\n",
 44 |     "- Streams are sorted in-order like Git\n",
 45 |     "\n",
 46 |     "### Loaders\n",
 47 |     "\n",
 48 |     "- manage a servable's life cycle. Enables common infrastructure, standardizes the APIs for loading and unloading a servable.\n",
 49 |     "\n",
 50 |     "### Sources\n",
 51 |     "\n",
 52 |     "- plugin modules that originates zero or more servable streams. For each stream, a Source supplies one Loader instance for each version it wants to have loaded. \n",
 53 |     "\n",
 54 |     "### Managers\n",
 55 |     "\n",
 56 |     "- handle the full lifecycle of Servables (loading, serving, unloading)\n",
 57 |     "- listen to Sources and track all versions. \n",
 58 |     "- tries to fulfill Sources' requests, but may refuse to load an aspired version if, say, required resources aren't available. \n",
 59 |     "- may wait to unload until a newer version finishes loading, based on a policy to guarantee that at least one version is loaded at all times.\n",
 60 |     "\n",
 61 |     "### 2 Step Process \n",
 62 |     "\n",
 63 |     "1. Sources create Loaders for Servable Versions.\n",
 64 |     "2. Loaders are sent as Aspired Versions to the Manager, which loads and serves them to client requests.\n",
 65 |     "\n"
 66 |    ]
 67 |   },
 68 |   {
 69 |    "cell_type": "markdown",
 70 |    "metadata": {},
 71 |    "source": [
 72 |     "# Step 1 - Setup Development Environment\n",
 73 |     "\n",
 74 |     "Manually install from source? No. Let's use Docker.\n",
 75 |     "\n",
 76 |     "\n",
 77 |     "Docker is like a lightweight version of a virtual machine image that runs without the overhead of running a full OS inside it. It's like an app-specific VM. No need to worry about conflicting versions and other entanglements with the rest of the OS. \n",
 78 |     "\n",
 79 |     "![Image of Yaktocat](http://zdnet2.cbsistatic.com/hub/i/r/2017/05/08/af178c5a-64dd-4900-8447-3abd739757e3/resize/770xauto/78abd09a8d41c182a28118ac0465c914/docker-vm-container.png)\n",
 80 |     "\n",
 81 |     "Install Instructions https://docs.docker.com/engine/installation/.\n",
 82 |     "\n",
 83 |     "Let's first clone the Tensorflow serving repo\n",
 84 |     "\n",
 85 |     "```\n",
 86 |     "git clone --recursive https://github.com/tensorflow/serving\n",
 87 |     "cd serving\n",
 88 |     "```\n",
 89 |     "\n",
 90 |     "Now we can create a docker image with all the required dependencies(pip dependencies, bazel, grpc)\n",
 91 |     "\n",
 92 |     "```\n",
 93 |     "docker build --pull -t $USER/tensorflow-serving-devel -f tensorflow_serving/tools/docker/Dockerfile.devel .\n",
 94 |     "```\n",
 95 |     "\n",
 96 |     "\n",
 97 |     "Now let's run the container locally. Once it's running it will let us work in a terminal inside of it. \n",
 98 |     "\n",
 99 |     "```\n",
100 |     "docker run --name=tensorflow_container -it $USER/tensorflow-serving-devel\n",
101 |     "```\n",
102 |     "\n",
103 |     "Now we can clone Tensorflow serving into our dependency-ready container\n",
104 |     "\n",
105 |     "git clone --recursive https://github.com/tensorflow/serving\n",
106 |     "cd serving/tensorflow\n",
107 |     "./configure\n",
108 |     "\n",
109 |     "\n",
110 |     "Now we need to build it using Google's Bazel build tool from inside our container.  Bazel manages third party dependencies at code level, downloading and building them, as long as they are also built with Bazel. \n",
111 |     "\n",
112 |     "Dependencies needed are\n",
113 |     "\n",
114 |     "* tensorflow serving\n",
115 |     "* pre-trained inception model\n",
116 |     "\n",
117 |     "\n",
118 |     "First TF Serving (This will take like 20-50 minutes)\n",
119 |     "\n",
120 |     "```\n",
121 |     "cd ..\n",
122 |     "bazel build -c opt tensorflow_serving/...\n",
123 |     "```\n",
124 |     "\n",
125 |     "Once completed we can test it out by running the model server\n",
126 |     "\n",
127 |     "```\n",
128 |     "bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server\n",
129 |     "```\n",
130 |     "\n",
131 |     "Output should look like this if install was successful\n",
132 |     "\n",
133 |     "```\n",
134 |     "Usage: model_server [--port=8500] [--enable_batching] [--model_name=my_name] --model_base_path=/path/to/export\n",
135 |     "```\n",
136 |     "\n",
137 |     "Now for dependency 2 of 2, the Inception Model. It's a Deep convolutional neural network that achieved state of the art classification in the ImageNet competition in 2014. Trained on hundreds of thousands of images.\n",
138 |     "\n",
139 |     "```\n",
140 |     "curl -O http://download.tensorflow.org/models/image/imagenet/inception-v3-2016-03-01.tar.gz\n",
141 |     "tar xzf inception-v3-2016-03-01.tar.gz\n",
142 |     "bazel-bin/tensorflow_serving/example/inception_export --checkpoint_dir=inception-v3 --export_dir=inception-export\n",
143 |     "```\n",
144 |     "![Image of Yaktocat](https://1.bp.blogspot.com/-O7AznVGY9js/V8cV_wKKsMI/AAAAAAAABKQ/maO7n2w3dT4Pkcmk7wgGqiSX5FUW2sfZgCLcB/s1600/image00.png)\n",
145 |     "\n",
146 |     "Let's run it and the gRPC server locally!\n",
147 |     "\n",
148 |     "\n",
149 |     "```\n",
150 |     "bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --model_name=inception --model_base_path=inception-export &> inception_log &\n",
151 |     "```\n",
152 |     "\n",
153 |     "\n",
154 |     "Now that it's running on our local server, let's test it out using our python client app. We'll query it using a panda picture and it'll return a classification output\n",
155 |     "\n",
156 |     "```\n",
157 |     "wget https://upload.wikimedia.org/wikipedia/en/a/ac/Xiang_Xiang_panda.jpg\n",
158 |     "bazel-bin/tensorflow_serving/example/inception_client --server=localhost:9000 --image=./Xiang_Xiang_panda.jpg\n",
159 |     "```\n",
160 |     "\n",
161 |     "If everything works, we'll see a panda classification output to terminal!\n",
162 |     "\n",
163 |     "Wanna push this to the cloud? Well using Google cloud and the automatic container management tool (https://kubernetes.io/) we can. See part 2 of this tutorial to do that \n",
164 |     "\n",
165 |     "https://tensorflow.github.io/serving/serving_inception\n",
166 |     "\n",
167 |     "and when we're ready to build our own model\n",
168 |     "\n",
169 |     "https://tensorflow.github.io/serving/serving_basic"
170 |    ]
171 |   },
172 |   {
173 |    "cell_type": "code",
174 |    "execution_count": null,
175 |    "metadata": {
176 |     "collapsed": true
177 |    },
178 |    "outputs": [],
179 |    "source": []
180 |   }
181 |  ],
182 |  "metadata": {
183 |   "kernelspec": {
184 |    "display_name": "Python 3",
185 |    "language": "python",
186 |    "name": "python3"
187 |   },
188 |   "language_info": {
189 |    "codemirror_mode": {
190 |     "name": "ipython",
191 |     "version": 3
192 |    },
193 |    "file_extension": ".py",
194 |    "mimetype": "text/x-python",
195 |    "name": "python",
196 |    "nbconvert_exporter": "python",
197 |    "pygments_lexer": "ipython3",
198 |    "version": "3.6.0"
199 |   }
200 |  },
201 |  "nbformat": 4,
202 |  "nbformat_minor": 2
203 | }
204 | 


--------------------------------------------------------------------------------