├── README.md ├── client.py ├── custom_model.py └── demo.ipynb /README.md: -------------------------------------------------------------------------------- 1 | # How-to-Deploy-a-Tensorflow-Model-in-Production 2 | This is the code for the "How to Deploy a Tensorflow Model in Production" by Siraj Raval on YouTube 3 | 4 | ## Overview 5 | 6 | This is the code for [this](https://youtu.be/T_afaArR0E8) video on Youtube by Siraj Raval. We're going to use the [Tensorflow Serving](https://tensorflow.github.io/serving/) library to deploy an inception model in production. 7 | 8 | ## Dependencies 9 | 10 | All included in the iPython notebook. You just need [docker](https://www.docker.com/) 11 | 12 | ## Usage 13 | 14 | Run the notebook by running `jupyter notebook` in terminal in the main directory. All the instructions are in there. 15 | 16 | The 2 attached files -- one is the client and the other is an example of how we train and save an simple MNIST model for Tensorflow Serving. 17 | 18 | ## Credits 19 | 20 | Credits go to Google. I've merely created a wrapper to get people started. 21 | -------------------------------------------------------------------------------- /client.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================== 15 | 16 | #!/usr/bin/env python2.7 17 | 18 | """Send JPEG image to tensorflow_model_server loaded with inception model. 19 | """ 20 | 21 | from __future__ import print_function 22 | 23 | # This is a placeholder for a Google-internal import. 24 | 25 | from grpc.beta import implementations 26 | import tensorflow as tf 27 | 28 | from tensorflow_serving.apis import predict_pb2 29 | from tensorflow_serving.apis import prediction_service_pb2 30 | 31 | 32 | tf.app.flags.DEFINE_string('server', 'localhost:9000', 33 | 'PredictionService host:port') 34 | tf.app.flags.DEFINE_string('image', '', 'path to image in JPEG format') 35 | FLAGS = tf.app.flags.FLAGS 36 | 37 | 38 | def main(_): 39 | host, port = FLAGS.server.split(':') 40 | channel = implementations.insecure_channel(host, int(port)) 41 | stub = prediction_service_pb2.beta_create_PredictionService_stub(channel) 42 | # Send request 43 | with open(FLAGS.image, 'rb') as f: 44 | # See prediction_service.proto for gRPC request/response details. 45 | data = f.read() 46 | request = predict_pb2.PredictRequest() 47 | request.model_spec.name = 'inception' 48 | request.model_spec.signature_name = 'predict_images' 49 | request.inputs['images'].CopyFrom( 50 | tf.contrib.util.make_tensor_proto(data, shape=[1])) 51 | result = stub.Predict(request, 10.0) # 10 secs timeout 52 | print(result) 53 | 54 | 55 | if __name__ == '__main__': 56 | tf.app.run() 57 | -------------------------------------------------------------------------------- /custom_model.py: -------------------------------------------------------------------------------- 1 | # Copyright 2016 Google Inc. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================== 15 | 16 | #!/usr/bin/env python2.7 17 | """Train and export a simple Softmax Regression TensorFlow model. 18 | The model is from the TensorFlow "MNIST For ML Beginner" tutorial. This program 19 | simply follows all its training instructions, and uses TensorFlow SavedModel to 20 | export the trained model with proper signatures that can be loaded by standard 21 | tensorflow_model_server. 22 | Usage: mnist_export.py [--training_iteration=x] [--model_version=y] export_dir 23 | """ 24 | 25 | import os 26 | import sys 27 | import tensorflow as tf 28 | from tensorflow.python.saved_model import builder as saved_model_builder 29 | from tensorflow.python.saved_model import signature_constants 30 | from tensorflow.python.saved_model import signature_def_utils 31 | from tensorflow.python.saved_model import tag_constants 32 | from tensorflow.python.saved_model import utils 33 | from tensorflow.python.util import compat 34 | from tensorflow_serving.example import mnist_input_data 35 | 36 | 37 | #training flags 38 | tf.app.flags.DEFINE_integer('training_iteration', 1000, 39 | 'number of training iterations.') 40 | tf.app.flags.DEFINE_integer('model_version', 1, 'version number of the model.') 41 | tf.app.flags.DEFINE_string('work_dir', '/tmp', 'Working directory.') 42 | FLAGS = tf.app.flags.FLAGS 43 | 44 | 45 | def main(_): 46 | if len(sys.argv) < 2 or sys.argv[-1].startswith('-'): 47 | print('Usage: mnist_export.py [--training_iteration=x] ' 48 | '[--model_version=y] export_dir') 49 | sys.exit(-1) 50 | if FLAGS.training_iteration <= 0: 51 | print 'Please specify a positive value for training iteration.' 52 | sys.exit(-1) 53 | if FLAGS.model_version <= 0: 54 | print 'Please specify a positive value for version number.' 55 | sys.exit(-1) 56 | 57 | # Train model 58 | print 'Training model...' 59 | 60 | #Read the data and format it 61 | mnist = mnist_input_data.read_data_sets(FLAGS.work_dir, one_hot=True) 62 | sess = tf.InteractiveSession() 63 | serialized_tf_example = tf.placeholder(tf.string, name='tf_example') 64 | feature_configs = {'x': tf.FixedLenFeature(shape=[784], dtype=tf.float32),} 65 | tf_example = tf.parse_example(serialized_tf_example, feature_configs) 66 | 67 | 68 | #Build model 69 | x = tf.identity(tf_example['x'], name='x') # use tf.identity() to assign name 70 | y_ = tf.placeholder('float', shape=[None, 10]) 71 | w = tf.Variable(tf.zeros([784, 10])) 72 | b = tf.Variable(tf.zeros([10])) 73 | sess.run(tf.global_variables_initializer()) 74 | y = tf.nn.softmax(tf.matmul(x, w) + b, name='y') 75 | cross_entropy = -tf.reduce_sum(y_ * tf.log(y)) 76 | train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy) 77 | values, indices = tf.nn.top_k(y, 10) 78 | table = tf.contrib.lookup.index_to_string_table_from_tensor( 79 | tf.constant([str(i) for i in xrange(10)])) 80 | 81 | 82 | #train the model 83 | prediction_classes = table.lookup(tf.to_int64(indices)) 84 | for _ in range(FLAGS.training_iteration): 85 | batch = mnist.train.next_batch(50) 86 | train_step.run(feed_dict={x: batch[0], y_: batch[1]}) 87 | correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1)) 88 | accuracy = tf.reduce_mean(tf.cast(correct_prediction, 'float')) 89 | print 'training accuracy %g' % sess.run( 90 | accuracy, feed_dict={x: mnist.test.images, 91 | y_: mnist.test.labels}) 92 | print 'Done training!' 93 | 94 | # Save the model 95 | 96 | #where to save to? 97 | export_path_base = sys.argv[-1] 98 | export_path = os.path.join( 99 | compat.as_bytes(export_path_base), 100 | compat.as_bytes(str(FLAGS.model_version))) 101 | print 'Exporting trained model to', export_path 102 | 103 | #This creates a SERVABLE from our model 104 | #saves a "snapshot" of the trained model to reliable storage 105 | #so that it can be loaded later for inference. 106 | #can save as many version as necessary 107 | 108 | #the tensoroflow serving main file tensorflow_model_server 109 | #will create a SOURCE out of it, the source 110 | #can house state that is shared across multiple servables 111 | #or versions 112 | 113 | #we can later create a LOADER from it using tf.saved_model.loader.load 114 | 115 | #then the MANAGER decides how to handle its lifecycle 116 | 117 | builder = saved_model_builder.SavedModelBuilder(export_path) 118 | 119 | # Build the signature_def_map. 120 | #Signature specifies what type of model is being exported, 121 | #and the input/output tensors to bind to when running inference. 122 | #think of them as annotiations on the graph for serving 123 | #we can use them a number of ways 124 | #grabbing whatever inputs/outputs/models we want either on server 125 | #or via client 126 | classification_inputs = utils.build_tensor_info(serialized_tf_example) 127 | classification_outputs_classes = utils.build_tensor_info(prediction_classes) 128 | classification_outputs_scores = utils.build_tensor_info(values) 129 | 130 | 131 | classification_signature = signature_def_utils.build_signature_def( 132 | inputs={signature_constants.CLASSIFY_INPUTS: classification_inputs}, 133 | outputs={ 134 | signature_constants.CLASSIFY_OUTPUT_CLASSES: 135 | classification_outputs_classes, 136 | signature_constants.CLASSIFY_OUTPUT_SCORES: 137 | classification_outputs_scores 138 | }, 139 | method_name=signature_constants.CLASSIFY_METHOD_NAME) 140 | 141 | tensor_info_x = utils.build_tensor_info(x) 142 | tensor_info_y = utils.build_tensor_info(y) 143 | 144 | prediction_signature = signature_def_utils.build_signature_def( 145 | inputs={'images': tensor_info_x}, 146 | outputs={'scores': tensor_info_y}, 147 | method_name=signature_constants.PREDICT_METHOD_NAME) 148 | 149 | legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op') 150 | 151 | #add the sigs to the servable 152 | builder.add_meta_graph_and_variables( 153 | sess, [tag_constants.SERVING], 154 | signature_def_map={ 155 | 'predict_images': 156 | prediction_signature, 157 | signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: 158 | classification_signature, 159 | }, 160 | legacy_init_op=legacy_init_op) 161 | 162 | #save it! 163 | builder.save() 164 | 165 | print 'Done exporting!' 166 | 167 | 168 | if __name__ == '__main__': 169 | tf.app.run() 170 | -------------------------------------------------------------------------------- /demo.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# How to Deploy a Tensorflow Model in Production\n", 8 | "\n", 9 | "We know how to \n", 10 | "- write models\n", 11 | "- train models\n", 12 | "- test models\n", 13 | "\n", 14 | "but how do we deploy them for production use?\n", 15 | "\n", 16 | "Let's create a simple webapp that will allow the user to upload an image and run the Inception model over it for classifying.\n", 17 | "\n", 18 | "## Tensorflow Serving \n", 19 | "\n", 20 | "![Image of Yaktocat](https://cdn-images-1.medium.com/max/1800/0*O7yprjYDk2WTO3__.)\n", 21 | "\n", 22 | "![Image of Yaktocat](https://blogs.nvidia.com/wp-content/uploads/2016/08/ai_difference_between_deep_learning_training_inference.jpg)\n", 23 | "\n", 24 | "- Google's open source library that accompanies Tensorflow\n", 25 | "- Meant for Inference. (managing models, giving versioned access via reference-counted lookup table i.e HTTP interface via RPC) \n", 26 | "https://apihandyman.io/do-you-really-know-why-you-prefer-rest-over-rpc/\n", 27 | "- Can serve multiple models simultaneously (great for A/B testing)\n", 28 | "- Can serve multiple versions of the same model\n", 29 | "- written in C++ \n", 30 | "\n", 31 | "\n", 32 | "## Architecture Overview\n", 33 | "\n", 34 | "![Image of Yaktocat](https://tensorflow.github.io/serving/images/serving_architecture.svg)\n", 35 | "\n", 36 | "### 4 major components\n", 37 | "\n", 38 | "### Servables\n", 39 | "\n", 40 | "- The central abstraction in TensorFlow Serving. They are the objects that clients use to perform computation (for example, a lookup or inference).\n", 41 | "- Flexible size (single lookup table shard, model, multiple models)\n", 42 | "- Good for Concurrent operations, A/B testing\n", 43 | "- Multiple versions of a servable in one instance helps refresh configs\n", 44 | "- Streams are sorted in-order like Git\n", 45 | "\n", 46 | "### Loaders\n", 47 | "\n", 48 | "- manage a servable's life cycle. Enables common infrastructure, standardizes the APIs for loading and unloading a servable.\n", 49 | "\n", 50 | "### Sources\n", 51 | "\n", 52 | "- plugin modules that originates zero or more servable streams. For each stream, a Source supplies one Loader instance for each version it wants to have loaded. \n", 53 | "\n", 54 | "### Managers\n", 55 | "\n", 56 | "- handle the full lifecycle of Servables (loading, serving, unloading)\n", 57 | "- listen to Sources and track all versions. \n", 58 | "- tries to fulfill Sources' requests, but may refuse to load an aspired version if, say, required resources aren't available. \n", 59 | "- may wait to unload until a newer version finishes loading, based on a policy to guarantee that at least one version is loaded at all times.\n", 60 | "\n", 61 | "### 2 Step Process \n", 62 | "\n", 63 | "1. Sources create Loaders for Servable Versions.\n", 64 | "2. Loaders are sent as Aspired Versions to the Manager, which loads and serves them to client requests.\n", 65 | "\n" 66 | ] 67 | }, 68 | { 69 | "cell_type": "markdown", 70 | "metadata": {}, 71 | "source": [ 72 | "# Step 1 - Setup Development Environment\n", 73 | "\n", 74 | "Manually install from source? No. Let's use Docker.\n", 75 | "\n", 76 | "\n", 77 | "Docker is like a lightweight version of a virtual machine image that runs without the overhead of running a full OS inside it. It's like an app-specific VM. No need to worry about conflicting versions and other entanglements with the rest of the OS. \n", 78 | "\n", 79 | "![Image of Yaktocat](http://zdnet2.cbsistatic.com/hub/i/r/2017/05/08/af178c5a-64dd-4900-8447-3abd739757e3/resize/770xauto/78abd09a8d41c182a28118ac0465c914/docker-vm-container.png)\n", 80 | "\n", 81 | "Install Instructions https://docs.docker.com/engine/installation/.\n", 82 | "\n", 83 | "Let's first clone the Tensorflow serving repo\n", 84 | "\n", 85 | "```\n", 86 | "git clone --recursive https://github.com/tensorflow/serving\n", 87 | "cd serving\n", 88 | "```\n", 89 | "\n", 90 | "Now we can create a docker image with all the required dependencies(pip dependencies, bazel, grpc)\n", 91 | "\n", 92 | "```\n", 93 | "docker build --pull -t $USER/tensorflow-serving-devel -f tensorflow_serving/tools/docker/Dockerfile.devel .\n", 94 | "```\n", 95 | "\n", 96 | "\n", 97 | "Now let's run the container locally. Once it's running it will let us work in a terminal inside of it. \n", 98 | "\n", 99 | "```\n", 100 | "docker run --name=tensorflow_container -it $USER/tensorflow-serving-devel\n", 101 | "```\n", 102 | "\n", 103 | "Now we can clone Tensorflow serving into our dependency-ready container\n", 104 | "\n", 105 | "git clone --recursive https://github.com/tensorflow/serving\n", 106 | "cd serving/tensorflow\n", 107 | "./configure\n", 108 | "\n", 109 | "\n", 110 | "Now we need to build it using Google's Bazel build tool from inside our container. Bazel manages third party dependencies at code level, downloading and building them, as long as they are also built with Bazel. \n", 111 | "\n", 112 | "Dependencies needed are\n", 113 | "\n", 114 | "* tensorflow serving\n", 115 | "* pre-trained inception model\n", 116 | "\n", 117 | "\n", 118 | "First TF Serving (This will take like 20-50 minutes)\n", 119 | "\n", 120 | "```\n", 121 | "cd ..\n", 122 | "bazel build -c opt tensorflow_serving/...\n", 123 | "```\n", 124 | "\n", 125 | "Once completed we can test it out by running the model server\n", 126 | "\n", 127 | "```\n", 128 | "bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server\n", 129 | "```\n", 130 | "\n", 131 | "Output should look like this if install was successful\n", 132 | "\n", 133 | "```\n", 134 | "Usage: model_server [--port=8500] [--enable_batching] [--model_name=my_name] --model_base_path=/path/to/export\n", 135 | "```\n", 136 | "\n", 137 | "Now for dependency 2 of 2, the Inception Model. It's a Deep convolutional neural network that achieved state of the art classification in the ImageNet competition in 2014. Trained on hundreds of thousands of images.\n", 138 | "\n", 139 | "```\n", 140 | "curl -O http://download.tensorflow.org/models/image/imagenet/inception-v3-2016-03-01.tar.gz\n", 141 | "tar xzf inception-v3-2016-03-01.tar.gz\n", 142 | "bazel-bin/tensorflow_serving/example/inception_export --checkpoint_dir=inception-v3 --export_dir=inception-export\n", 143 | "```\n", 144 | "![Image of Yaktocat](https://1.bp.blogspot.com/-O7AznVGY9js/V8cV_wKKsMI/AAAAAAAABKQ/maO7n2w3dT4Pkcmk7wgGqiSX5FUW2sfZgCLcB/s1600/image00.png)\n", 145 | "\n", 146 | "Let's run it and the gRPC server locally!\n", 147 | "\n", 148 | "\n", 149 | "```\n", 150 | "bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --model_name=inception --model_base_path=inception-export &> inception_log &\n", 151 | "```\n", 152 | "\n", 153 | "\n", 154 | "Now that it's running on our local server, let's test it out using our python client app. We'll query it using a panda picture and it'll return a classification output\n", 155 | "\n", 156 | "```\n", 157 | "wget https://upload.wikimedia.org/wikipedia/en/a/ac/Xiang_Xiang_panda.jpg\n", 158 | "bazel-bin/tensorflow_serving/example/inception_client --server=localhost:9000 --image=./Xiang_Xiang_panda.jpg\n", 159 | "```\n", 160 | "\n", 161 | "If everything works, we'll see a panda classification output to terminal!\n", 162 | "\n", 163 | "Wanna push this to the cloud? Well using Google cloud and the automatic container management tool (https://kubernetes.io/) we can. See part 2 of this tutorial to do that \n", 164 | "\n", 165 | "https://tensorflow.github.io/serving/serving_inception\n", 166 | "\n", 167 | "and when we're ready to build our own model\n", 168 | "\n", 169 | "https://tensorflow.github.io/serving/serving_basic" 170 | ] 171 | }, 172 | { 173 | "cell_type": "code", 174 | "execution_count": null, 175 | "metadata": { 176 | "collapsed": true 177 | }, 178 | "outputs": [], 179 | "source": [] 180 | } 181 | ], 182 | "metadata": { 183 | "kernelspec": { 184 | "display_name": "Python 3", 185 | "language": "python", 186 | "name": "python3" 187 | }, 188 | "language_info": { 189 | "codemirror_mode": { 190 | "name": "ipython", 191 | "version": 3 192 | }, 193 | "file_extension": ".py", 194 | "mimetype": "text/x-python", 195 | "name": "python", 196 | "nbconvert_exporter": "python", 197 | "pygments_lexer": "ipython3", 198 | "version": "3.6.0" 199 | } 200 | }, 201 | "nbformat": 4, 202 | "nbformat_minor": 2 203 | } 204 | --------------------------------------------------------------------------------