├── README.md
├── iris_estimator
│   ├── data
│   │   ├── iris_test.csv
│   │   └── iris_training.csv
│   ├── export_raw
│   │   └── 1560841190
│   │       ├── saved_model.pb
│   │       └── variables
│   │           ├── variables.data-00000-of-00002
│   │           ├── variables.data-00001-of-00002
│   │           └── variables.index
│   ├── grpc_client_for_parsing_receiver.py
│   ├── grpc_client_for_raw_receiver.py
│   ├── iris_dnn.py
│   ├── rest_client_for_raw_receiver.py
│   └── shell.sh
├── mnist
│   ├── make_request.py
│   ├── readme.md
│   ├── running_service.py
│   └── train_saved_model.py
├── model_des.png
├── res.png
├── serving.png
├── serving2.png
└── serving_nginx
    ├── dockerfile
    ├── make_request.py
    ├── nginx.conf
    └── readme.md

/README.md:
--------------------------------------------------------------------------------
# TF Serving: Introduction, Deployment, and Demo
---

tf serving:
- supports hot model updates
- supports model version management
- good scalability
- good stability and performance

### Typical workflow:

1. Analyze and preprocess the data on HDFS with Spark/MapReduce/Hive.

2. Subsample part of the data, choose a model, pretrain initial parameters, and cross-validate.

3. On the full dataset, convert the Spark output to TFRecord; train on a single machine reading from HDFS, or train distributed across multiple machines and GPUs.

4. Serve the model.

# Some solutions:
---

## Option 1: YARN 3.1+

Supports docker_image, [but does not yet offer stability guarantees](https://hadoop.apache.org/docs/r3.1.1/hadoop-yarn/hadoop-yarn-site/DockerContainers.html)

![image](serving.png)

[Docker + GPU support + TF Serving + Hadoop 3.1](https://community.hortonworks.com/articles/231660/tensorflow-serving-function-as-a-service-faas-with.html)


## Option 2: Model serving & synchronization, from the Meituan blog
[Reference](https://gitbook.cn/books/5b3adc411166b9562e9af3f6/index.html)

### Training: TFRecords stored on HDFS (pulled to local disk at training time)
### Prediction: online serving scheme

- Model synchronization

We developed a highly available synchronization component: the user only needs to provide the HDFS path of the offline-trained model, and the component automatically syncs it to the online serving machines. The component is built on HTTPFS, the HTTP access interface to HDFS provided by Meituan's offline computing team. The sync process is as follows:

Before syncing, check the model's md5 file; a sync is needed only when that file has been updated.
When syncing, connect to a randomly chosen HTTPFS machine and throttle the download speed.
After syncing, verify the model file's md5 and back up the old model.

Any error or timeout during the sync triggers an alert and a retry. With this component, we reliably sync model files online within 2 minutes. (A minimal sketch of this flow appears at the end of this section.)

- Model computation

The main problems are network I/O and compute performance.

Concurrent requests. A single request recalls many qualifying ads; having the client query TF Serving for multiple ads concurrently effectively lowers the overall prediction latency.
Feature IDs. Hashing string feature names into a 64-bit integer space effectively reduces the amount of data transferred and the bandwidth used. (See the second sketch at the end of this section.)
Customized model computation, with targeted optimizations.
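The blog does not publish the sync component's code, but the flow above is easy to sketch. The following is a minimal illustration only: the HTTPFS endpoint, the paths, and the sidecar `.md5` file layout are all assumptions made for the example, and the alerting/retry and download-throttling pieces are omitted.

```
# Minimal sketch of the md5-check / download / verify / backup flow described
# above. Endpoint, paths, and the sidecar .md5 layout are assumptions.
import hashlib
import os
import shutil

import requests

HTTPFS = 'http://httpfs-host:14000/webhdfs/v1'    # assumed HTTPFS endpoint
HDFS_MODEL = '/models/iris/saved_model.pb'        # assumed HDFS layout
LOCAL_MODEL = '/data/serving/iris/saved_model.pb'


def remote_md5():
    # Assumes training writes a sidecar .md5 file next to the model file.
    r = requests.get(HTTPFS + HDFS_MODEL + '.md5', params={'op': 'OPEN'})
    r.raise_for_status()
    return r.text.strip()


def local_md5(path):
    h = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(1 << 20), b''):
            h.update(chunk)
    return h.hexdigest()


def sync():
    md5 = remote_md5()
    if os.path.exists(LOCAL_MODEL) and local_md5(LOCAL_MODEL) == md5:
        return  # md5 file unchanged: nothing to sync
    tmp = LOCAL_MODEL + '.tmp'
    with requests.get(HTTPFS + HDFS_MODEL, params={'op': 'OPEN'}, stream=True) as r:
        r.raise_for_status()
        with open(tmp, 'wb') as f:
            for chunk in r.iter_content(1 << 20):   # throttling omitted here
                f.write(chunk)
    if local_md5(tmp) != md5:
        raise IOError('md5 mismatch after download')
    if os.path.exists(LOCAL_MODEL):
        shutil.copy(LOCAL_MODEL, LOCAL_MODEL + '.bak')  # back up the old model
    os.replace(tmp, LOCAL_MODEL)
```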
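Feature-name ID'ing is likewise simple to illustrate. The blog does not say which 64-bit hash is used; md5 truncated to 64 bits is an assumption made here for the sketch.

```
# Illustration of hashing string feature names into a 64-bit integer space.
# The actual hash function is not specified in the blog; truncated md5 is an
# assumption made for this sketch.
import hashlib

def feature_id(name):
    digest = hashlib.md5(name.encode('utf-8')).digest()
    return int.from_bytes(digest[:8], 'little')   # 64-bit feature ID

print(feature_id('SepalLength'))  # the wire carries an int64 instead of a string
```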
## Option 3: CentOS 7 + docker + tfserving (the solution currently in use)

### Training: implementation details are in my other repo, [here](https://github.com/wangruichens/distributed_training)

### Prediction: online serving scheme

#### 1. Prerequisite: install docker

Use the Docker build of tfserving, to avoid the pitfalls of compiling it yourself and extending it with GPU support.
```
# 1: Install the required packages
sudo yum install -y yum-utils device-mapper-persistent-data lvm2
# 2: Add the repository info (Aliyun mirror)
sudo yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# 3: Refresh the cache and install Docker CE
sudo yum makecache fast
sudo yum -y install docker-ce
# 4: Start the Docker service
sudo service docker start
# 5: Stop the Docker service
sudo service docker stop
```

#### 2. Use the trained model: a handwritten-digit model trained on TFRecord data on HDFS. See my earlier [setup](https://github.com/wangruichens/samples/tree/master/distribute/tf/spark_tfrecord)

The model is simple, with roughly 1.38M parameters. It is trained from TFRecords on HDFS, and the model files are saved back to HDFS.

![image](model_des.png)

#### 3. Start tf serving with docker, pull the HDFS model to local disk, and load it

[Reference](https://www.tensorflow.org/tfx/serving/docker#serving_with_docker)

File layout after the model is saved:

```
test_serving
└── mnist_model_for_serving
    └── 1
        ├── variables
        └── saved_model.pb
```
You can inspect the model's inputs and outputs under the model directory /1/:
```
saved_model_cli show --dir . --all

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['input_image'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 28, 28, 1)
        name: Conv1_input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['Softmax/Softmax:0'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 10)
        name: Softmax/Softmax:0
  Method name is: tensorflow/serving/predict

```
Start the service with Docker:

```
# For gRPC, default port 8500
docker run -p 8500:8500 --mount type=bind,source=/home/wangrc/test_serving/mnist_model_for_serving,target=/models/mnist -e MODEL_NAME=mnist -t tensorflow/serving

# For REST, default port 8501
docker run -p 8501:8501 --mount type=bind,source=/home/wangrc/test_serving/mnist_model_for_serving,target=/models/mnist -e MODEL_NAME=mnist -t tensorflow/serving
```
You can of course expose both ports at once (-p 8500:8500 -p 8501:8501), and you can also supply your own config; see the official docs for details, and the sketch below for a minimal example.
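A minimal model config might look like the following; the name and base_path here just mirror the mnist example above, and the file location is an assumption.

```
model_config_list {
  config {
    name: 'mnist'
    base_path: '/models/mnist'
    model_platform: 'tensorflow'
  }
}
```

Mount it into the container and point the server at it by appending --model_config_file=/models/models.config after the image name in the docker run command.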
The command actually executed inside the container is:
```
tensorflow_model_server --port=8500 --rest_api_port=8501 \
    --model_name=my_model --model_base_path=/models/my_model
```

The service starts and the model loads successfully:

![image](serving2.png)

Check port usage and service status:
```
sudo netstat -nap | grep 8501
curl http://localhost:8501/v1/models/mnist
```

#### 4. Test the model with REST (production uses gRPC):
```
python ./mnist/make_request.py
```


![image](res.png)


## Option 4: CentOS 7 + tf serving + GPU without Docker

If you are willing to brave the pitfalls, you can compile it yourself with bazel: [reference](https://www.dearcodes.com/index.php/archives/25/)

# Load balancing tf serving with nginx
[Here](https://github.com/wangruichens/tfserving/blob/master/serving_nginx)

# Main issues:

With nginx, a single tensorflow serving service is still manageable. Once services multiply, or services start calling each other, managing everything through nginx configs is no longer realistic; you need a microservice framework along the lines of Spring Cloud. But tf serving only has a Python API, so there is no way to register it with Spring Cloud for management.

Although tensorflow serving is meant for deploying trained models in production, you have to implement clustering and health checks yourself, and there is still the overhead of a network hop between it and the Java application. So it is often best for the Java application to call the model directly: building and training the model in Python plus online prediction in Java is the more reasonable scheme.

For tensorflow, models usually go to production either through tensorflow serving or through the client API libraries; the former suits larger models and scenarios, the latter small and mid-sized ones. Algorithm engineers should evaluate and choose before shipping to production.

The best-fitting solution is still k8s + docker: containers as a service.
--------------------------------------------------------------------------------
/iris_estimator/data/iris_test.csv:
--------------------------------------------------------------------------------
30,4,setosa,versicolor,virginica
5.9,3.0,4.2,1.5,1
6.9,3.1,5.4,2.1,2
5.1,3.3,1.7,0.5,0
6.0,3.4,4.5,1.6,1
5.5,2.5,4.0,1.3,1
6.2,2.9,4.3,1.3,1
5.5,4.2,1.4,0.2,0
6.3,2.8,5.1,1.5,2
5.6,3.0,4.1,1.3,1
6.7,2.5,5.8,1.8,2
7.1,3.0,5.9,2.1,2
4.3,3.0,1.1,0.1,0
5.6,2.8,4.9,2.0,2
5.5,2.3,4.0,1.3,1
6.0,2.2,4.0,1.0,1
5.1,3.5,1.4,0.2,0
5.7,2.6,3.5,1.0,1
4.8,3.4,1.9,0.2,0
5.1,3.4,1.5,0.2,0
5.7,2.5,5.0,2.0,2
5.4,3.4,1.7,0.2,0
5.6,3.0,4.5,1.5,1
6.3,2.9,5.6,1.8,2
6.3,2.5,4.9,1.5,1
5.8,2.7,3.9,1.2,1
6.1,3.0,4.6,1.4,1
5.2,4.1,1.5,0.1,0
6.7,3.1,4.7,1.5,1
6.7,3.3,5.7,2.5,2
6.4,2.9,4.3,1.3,1
--------------------------------------------------------------------------------
/iris_estimator/data/iris_training.csv:
--------------------------------------------------------------------------------
120,4,setosa,versicolor,virginica
6.4,2.8,5.6,2.2,2
5.0,2.3,3.3,1.0,1
4.9,2.5,4.5,1.7,2
4.9,3.1,1.5,0.1,0
5.7,3.8,1.7,0.3,0
4.4,3.2,1.3,0.2,0
5.4,3.4,1.5,0.4,0
6.9,3.1,5.1,2.3,2
6.7,3.1,4.4,1.4,1
5.1,3.7,1.5,0.4,0
5.2,2.7,3.9,1.4,1
6.9,3.1,4.9,1.5,1
5.8,4.0,1.2,0.2,0
5.4,3.9,1.7,0.4,0
7.7,3.8,6.7,2.2,2
6.3,3.3,4.7,1.6,1
6.8,3.2,5.9,2.3,2
7.6,3.0,6.6,2.1,2
6.4,3.2,5.3,2.3,2
5.7,4.4,1.5,0.4,0
6.7,3.3,5.7,2.1,2
6.4,2.8,5.6,2.1,2
5.4,3.9,1.3,0.4,0
6.1,2.6,5.6,1.4,2
7.2,3.0,5.8,1.6,2
5.2,3.5,1.5,0.2,0
5.8,2.6,4.0,1.2,1
5.9,3.0,5.1,1.8,2
5.4,3.0,4.5,1.5,1
6.7,3.0,5.0,1.7,1
6.3,2.3,4.4,1.3,1
5.1,2.5,3.0,1.1,1
6.4,3.2,4.5,1.5,1
6.8,3.0,5.5,2.1,2
6.2,2.8,4.8,1.8,2
6.9,3.2,5.7,2.3,2
6.5,3.2,5.1,2.0,2
5.8,2.8,5.1,2.4,2
5.1,3.8,1.5,0.3,0
4.8,3.0,1.4,0.3,0
7.9,3.8,6.4,2.0,2
5.8,2.7,5.1,1.9,2
6.7,3.0,5.2,2.3,2
5.1,3.8,1.9,0.4,0
4.7,3.2,1.6,0.2,0
6.0,2.2,5.0,1.5,2
4.8,3.4,1.6,0.2,0
7.7,2.6,6.9,2.3,2
4.6,3.6,1.0,0.2,0
7.2,3.2,6.0,1.8,2
5.0,3.3,1.4,0.2,0
6.6,3.0,4.4,1.4,1
6.1,2.8,4.0,1.3,1
5.0,3.2,1.2,0.2,0
7.0,3.2,4.7,1.4,1
6.0,3.0,4.8,1.8,2
7.4,2.8,6.1,1.9,2
5.8,2.7,5.1,1.9,2
6.2,3.4,5.4,2.3,2
5.0,2.0,3.5,1.0,1
5.6,2.5,3.9,1.1,1
6.7,3.1,5.6,2.4,2
6.3,2.5,5.0,1.9,2
6.4,3.1,5.5,1.8,2
6.2,2.2,4.5,1.5,1
7.3,2.9,6.3,1.8,2
4.4,3.0,1.3,0.2,0
7.2,3.6,6.1,2.5,2
6.5,3.0,5.5,1.8,2
5.0,3.4,1.5,0.2,0
4.7,3.2,1.3,0.2,0
6.6,2.9,4.6,1.3,1
5.5,3.5,1.3,0.2,0
7.7,3.0,6.1,2.3,2
6.1,3.0,4.9,1.8,2
4.9,3.1,1.5,0.1,0
5.5,2.4,3.8,1.1,1
5.7,2.9,4.2,1.3,1
6.0,2.9,4.5,1.5,1
6.4,2.7,5.3,1.9,2
5.4,3.7,1.5,0.2,0
6.1,2.9,4.7,1.4,1
6.5,2.8,4.6,1.5,1
5.6,2.7,4.2,1.3,1
6.3,3.4,5.6,2.4,2
4.9,3.1,1.5,0.1,0
6.8,2.8,4.8,1.4,1
5.7,2.8,4.5,1.3,1
6.0,2.7,5.1,1.6,1
5.0,3.5,1.3,0.3,0
6.5,3.0,5.2,2.0,2
6.1,2.8,4.7,1.2,1
5.1,3.5,1.4,0.3,0
4.6,3.1,1.5,0.2,0
6.5,3.0,5.8,2.2,2
4.6,3.4,1.4,0.3,0
4.6,3.2,1.4,0.2,0
7.7,2.8,6.7,2.0,2
5.9,3.2,4.8,1.8,1
5.1,3.8,1.6,0.2,0
4.9,3.0,1.4,0.2,0
4.9,2.4,3.3,1.0,1
4.5,2.3,1.3,0.3,0
5.8,2.7,4.1,1.0,1
5.0,3.4,1.6,0.4,0
5.2,3.4,1.4,0.2,0
5.3,3.7,1.5,0.2,0
5.0,3.6,1.4,0.2,0
5.6,2.9,3.6,1.3,1
4.8,3.1,1.6,0.2,0
6.3,2.7,4.9,1.8,2
5.7,2.8,4.1,1.3,1
5.0,3.0,1.6,0.2,0
6.3,3.3,6.0,2.5,2
5.0,3.5,1.6,0.6,0
5.5,2.6,4.4,1.2,1
5.7,3.0,4.2,1.2,1
4.4,2.9,1.4,0.2,0
4.8,3.0,1.4,0.1,0
5.5,2.4,3.7,1.0,1
--------------------------------------------------------------------------------
/iris_estimator/export_raw/1560841190/saved_model.pb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wangruichens/tfserving/5107b07c0849ca16226f94943e029f3f53b3a19f/iris_estimator/export_raw/1560841190/saved_model.pb
--------------------------------------------------------------------------------
/iris_estimator/export_raw/1560841190/variables/variables.data-00000-of-00002:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wangruichens/tfserving/5107b07c0849ca16226f94943e029f3f53b3a19f/iris_estimator/export_raw/1560841190/variables/variables.data-00000-of-00002
--------------------------------------------------------------------------------
/iris_estimator/export_raw/1560841190/variables/variables.data-00001-of-00002:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wangruichens/tfserving/5107b07c0849ca16226f94943e029f3f53b3a19f/iris_estimator/export_raw/1560841190/variables/variables.data-00001-of-00002
--------------------------------------------------------------------------------
/iris_estimator/export_raw/1560841190/variables/variables.index:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wangruichens/tfserving/5107b07c0849ca16226f94943e029f3f53b3a19f/iris_estimator/export_raw/1560841190/variables/variables.index
--------------------------------------------------------------------------------
/iris_estimator/grpc_client_for_parsing_receiver.py:
--------------------------------------------------------------------------------
# gRPC remote call using an estimator model exported with
# build_parsing_serving_input_receiver_fn


import tensorflow as tf
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc
import grpc
from time import time
import numpy as np

tf.app.flags.DEFINE_string('server', 'ha05:8556',
                           'Server host:port.')
tf.app.flags.DEFINE_string('model', 'iris',
                           'Model name.')
FLAGS = tf.app.flags.FLAGS


def _float_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))


def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def main(_):
    channel = grpc.insecure_channel(FLAGS.server)
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

    request = predict_pb2.PredictRequest()
    request.model_spec.name = FLAGS.model
    request.model_spec.signature_name = 'predict'
    batching = []

    # Build a batch of 1000 serialized tf.Example protos with random features.
    for i in range(1000):
        feature_dict = {'SepalLength': _float_feature(value=np.random.random()),
                        'SepalWidth': _float_feature(value=np.random.random()),
                        'PetalLength': _float_feature(value=np.random.random()),
                        'PetalWidth': _float_feature(value=np.random.random())}

        example = tf.train.Example(features=tf.train.Features(feature=feature_dict))
        serialized = example.SerializeToString()
        batching.append(serialized)

    request.inputs['examples'].CopyFrom(
        tf.make_tensor_proto(batching, shape=[len(batching)]))

    start = time()
    result_future = stub.Predict.future(request, 5.0)  # 5-second timeout
    prediction = result_future.result().outputs['probabilities']
    # Stop the clock only after the future has resolved; timing just the
    # dispatch of the request would report a near-zero latency.
    elapsed = (time() - start)
    print(prediction)
    print("Time used:{0}ms".format(round(elapsed * 1000, 2)))


if __name__ == '__main__':
    tf.app.run()
--------------------------------------------------------------------------------
/iris_estimator/grpc_client_for_raw_receiver.py:
--------------------------------------------------------------------------------
# gRPC remote call using an estimator model exported with
# build_raw_serving_input_receiver_fn


import tensorflow as tf
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc
import grpc
from time import time
import numpy as np

tf.app.flags.DEFINE_string('server', 'ha05:8555',
                           'Server host:port.')
tf.app.flags.DEFINE_string('model', 'iris',
                           'Model name.')
FLAGS = tf.app.flags.FLAGS


def _float_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))


def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def main(_):
    channel = grpc.insecure_channel(FLAGS.server)
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

    request = predict_pb2.PredictRequest()
    request.model_spec.name = FLAGS.model
    request.model_spec.signature_name = 'predict'

    # Reuse one batch of 1000 random values for all four feature tensors.
    p = []
    for i in range(1000):
        p.append(np.random.random())

    request.inputs['SepalLength'].CopyFrom(
        tf.make_tensor_proto(p, shape=[len(p)]))
    request.inputs['SepalWidth'].CopyFrom(
        tf.make_tensor_proto(p, shape=[len(p)]))
    request.inputs['PetalLength'].CopyFrom(
        tf.make_tensor_proto(p, shape=[len(p)]))
    request.inputs['PetalWidth'].CopyFrom(
        tf.make_tensor_proto(p, shape=[len(p)]))
    start = time()
    result_future = stub.Predict.future(request, 5.0)  # 5-second timeout
    prediction = result_future.result().outputs['probabilities']
    # Stop the clock only after the future has resolved; otherwise only the
    # dispatch time is measured.
    elapsed = (time() - start)
    print(prediction)
    print("Time used:{0}ms".format(round(elapsed * 1000, 2)))


if __name__ == '__main__':
    tf.app.run()
--------------------------------------------------------------------------------
/iris_estimator/iris_dnn.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-

import pandas as pd
import tensorflow as tf
from tensorflow_estimator import estimator

COLUMN_NAMES = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species']
SPECIES = ['Setosa', 'Versicolor', 'Virginica']
BATCH_SIZE = 100
STEPS = 10000

# load data
y_name = 'Species'

train = pd.read_csv('data/iris_training.csv', names=COLUMN_NAMES, header=0)
train_x, train_y = train, train.pop(y_name)

test = pd.read_csv('data/iris_test.csv', names=COLUMN_NAMES, header=0)
test_x, test_y = test, test.pop(y_name)


# prepare input / eval fn
def train_input_fn(features, labels, batch_size):
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    dataset = dataset.shuffle(1000).repeat().batch(batch_size)
    return dataset


def eval_input_fn(features, labels, batch_size):
    features = dict(features)
    inputs = (features, labels) if labels is not None else features
    dataset = tf.data.Dataset.from_tensor_slices(inputs)
    dataset = dataset.batch(batch_size)
    return dataset


hook = estimator.ProfilerHook(save_steps=300, output_dir='./time/', show_memory=True, show_dataflow=True)
feature_columns = [tf.feature_column.numeric_column(key=key)
                   for key in train_x.keys()]

# Demo of a bucketized + embedded column on the first numeric feature.
# (Renamed from `test`, which shadowed the test DataFrame loaded above.)
bucket_source = tf.feature_column.numeric_column(train_x.keys()[0], default_value=0.0)
bucketized = tf.feature_column.bucketized_column(bucket_source, [0.1, 1, 100])
test_emb = tf.feature_column.embedding_column(bucketized, 10)
feature_columns.append(test_emb)

session_config = tf.ConfigProto()

mirrored_strategy = tf.distribute.MirroredStrategy()

config = estimator.RunConfig(
    train_distribute=mirrored_strategy,
    eval_distribute=mirrored_strategy,
    session_config=session_config,  # was created above but never passed in
)

classifier = estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[10, 10],
    n_classes=3,
    config=config)

classifier.train(
    input_fn=lambda: train_input_fn(train_x, train_y, batch_size=BATCH_SIZE), hooks=[hook],
    steps=STEPS)

# evaluate
eval_result = classifier.evaluate(
    input_fn=lambda: eval_input_fn(test_x, test_y, batch_size=BATCH_SIZE))

print('Test set accuracy: {accuracy:0.3f}'.format(**eval_result))

# predict
expected = ['Setosa', 'Versicolor', 'Virginica']
predict_x = {
    'SepalLength': [5.1, 5.9, 6.9],
    'SepalWidth': [3.3, 3.0, 3.1],
    'PetalLength': [1.7, 4.2, 5.4],
    'PetalWidth': [0.5, 1.5, 2.1],
}

predictions = classifier.predict(
    input_fn=lambda: eval_input_fn(predict_x, labels=None, batch_size=BATCH_SIZE))

for prediction, expect in zip(predictions, expected):
    class_id = prediction['class_ids'][0]
    probability = prediction['probabilities'][class_id]
    print('Prediction is "{}" ({:.1f}%), expected "{}"'.format(
        SPECIES[class_id], 100 * probability, expect))

# export model

from tensorflow import FixedLenFeature

feature_specification = {
    'SepalLength': FixedLenFeature(shape=(1,), dtype=tf.float32, default_value=None),
    'SepalWidth': FixedLenFeature(shape=(1,), dtype=tf.float32, default_value=None),
    'PetalLength': FixedLenFeature(shape=(1,), dtype=tf.float32, default_value=None),
    'PetalWidth': FixedLenFeature(shape=(1,), dtype=tf.float32, default_value=None)
}
# feature_spec = tf.feature_column.make_parse_example_spec(feature_columns)

features = {
    'SepalLength': tf.placeholder(dtype=tf.float32, shape=(1), name='SepalLength'),
    'SepalWidth': tf.placeholder(dtype=tf.float32, shape=(1), name='SepalWidth'),
    'PetalLength': tf.placeholder(dtype=tf.float32, shape=(1), name='PetalLength'),
    'PetalWidth': tf.placeholder(dtype=tf.float32, shape=(1), name='PetalWidth')
}

# Can pass the key-value format to the http REST api directly:
# curl -d '{"signature_name": "predict","instances": [{"SepalLength":[5.1],"SepalWidth":[3.3],"PetalLength":[1.7],"PetalWidth":[0.5]}]}' -X POST http://localhost:8501/v1/models/iris:predict

# If using shape=(None,1), the input shape should be (?,1), i.e. two dimensions:
# saved_model_cli run --dir /home/wangrc/github/summaries/serving/estimator/export_raw/1560322385 \
#     --tag_set serve --signature_def predict \
#     --input_exprs 'SepalLength=[[5.1],[5.1]];SepalWidth=[[3.3],[3]];PetalLength=[[1.7],[3]];PetalWidth=[[0.5],[2]]'

serving_input_receiver_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(features)
export_dir = classifier.export_savedmodel('export_raw', serving_input_receiver_fn)

# The parsing receiver below only works this way:
# if using the http REST api, the json cannot carry the serialized tf.Example string.
# saved_model_cli run --dir /home/wangrc/Downloads/tf-serve-master/export_parsing/1560302102 \
#     --tag_set serve --signature_def predict \
#     --input_examples 'examples=[{"SepalLength":[5.1],"SepalWidth":[3.3],"PetalLength":[1.7],"PetalWidth":[0.5]}]'

# serving_input_receiver_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_specification)
# export_dir = classifier.export_savedmodel('export_parsing', serving_input_receiver_fn)


print('Exported to {}'.format(export_dir))
--------------------------------------------------------------------------------
/iris_estimator/rest_client_for_raw_receiver.py:
--------------------------------------------------------------------------------
# REST remote call using an estimator model exported with
# build_raw_serving_input_receiver_fn

import json
import requests
import numpy as np
from time import time

# tensorflow_model_server \
#     --rest_api_port=8501 \
#     --model_name=deepfm \
#     --model_base_path="/home/wangrc/Desktop/"

batching = []
for i in range(1000):
    p = {
        "SepalLength": np.random.random(),
        "SepalWidth": np.random.random(),
        "PetalLength": np.random.random(),
        "PetalWidth": np.random.random()
    }
    batching.append(p)

data = json.dumps({"signature_name": "predict", "instances": batching})
# print(data)

headers = {"content-type": "application/json"}

start = time()
json_response = requests.post('http://ha05:8555/v1/models/iris:predict', data=data, headers=headers)
elapsed = (time() - start)
# print(json_response.text)
predictions = json.loads(json_response.text)['predictions']
print(predictions)
print("Time used:{0}ms".format(round(elapsed * 1000, 2)))
--------------------------------------------------------------------------------
/iris_estimator/shell.sh:
--------------------------------------------------------------------------------
#!/usr/bin/env bash
tensorflow_model_server \
    --port=8500 \
    --rest_api_port=8501 \
    --model_name=iris \
    --model_base_path="/home/wangrc/github/summaries/serving/estimator/export_raw"


docker run -p 8555:8555 --mount type=bind,source=/home/wangrc/export_parsing,target=/models/iris -e MODEL_NAME=iris -t tensorflow/serving

docker run -p 8500:8500 --mount type=bind,source=/home/wangrc/export/,target=/models/deepfm -e MODEL_NAME=deepfm -t tensorflow/serving

curl -o - http://ha05:8500/v1/models/deepfm


curl -o - http://algorithmsdeepfm.2345.cn:38609/v1/models/deepfm
--------------------------------------------------------------------------------
/mnist/make_request.py:
--------------------------------------------------------------------------------
import json
import random
from time import time

import numpy as np
import requests
from matplotlib import pyplot as plt
from tensorflow import keras

fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# scale the values to 0.0 to 1.0
train_images = train_images / 255.0
test_images = test_images / 255.0

# reshape for feeding into the model
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1)
test_images = test_images.reshape(test_images.shape[0], 28, 28, 1)

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']


def show(idx, title):
    plt.figure()
    plt.imshow(test_images[idx].reshape(28, 28))
    plt.axis('off')
    plt.title('\n\n{}'.format(title), fontdict={'size': 16})
    plt.show()


rando = random.randint(0, len(test_images) - 1)
# show(rando, 'An Example Image: {}'.format(class_names[test_labels[rando]]))


data = json.dumps({"signature_name": "serving_default", "instances": [test_images[rando].tolist()]})
# print('Data: {} ... {}'.format(data[:50], data[len(data)-52:]))


headers = {"content-type": "application/json"}
start = time()
json_response = requests.post('http://localhost:8502/v1/models/fashion_model/versions/1:predict', data=data,
                              headers=headers)
elapsed = (time() - start)
print(json_response.text)
predictions = json.loads(json_response.text)['predictions']

show(rando, 'predict: {} , actually: {} '.format(
    np.argmax(predictions[0]), test_labels[rando]))

print('predict: {} , actually: {}'.format(
    np.argmax(predictions[0]), test_labels[rando]),
    ", Time used:{0}ms".format(round(elapsed * 1000, 2)))
--------------------------------------------------------------------------------
/mnist/readme.md:
--------------------------------------------------------------------------------
train model and save -> running service -> make request

Run it in Docker:
Reference: https://www.tensorflow.org/tfx/serving/docker#serving_with_docker

Start the CPU version of the service:
docker run -p 8500:8500 --mount type=bind,source=/home/wangrc/test_serving/mnist_model_for_serving,target=/models/mnist -e MODEL_NAME=mnist -t tensorflow/serving

Start the GPU version of the service:
docker run --runtime=nvidia -p 8501:8501 --mount type=bind,source=/home/wangrc/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_gpu,target=/models/half_plus_two -e MODEL_NAME=half_plus_two -t tensorflow/serving:latest-gpu


Directory layout:
mnist_model_for_serving/1/  saved_model.pb  variables

saved_model_cli show --dir . --all

curl http://localhost:8501/v1/models/mnist
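For a quick REST smoke test against the container, a minimal sketch follows. It assumes the REST port was published with -p 8501:8501 and MODEL_NAME=mnist as above; the all-zeros 28x28 image is only a placeholder input for the input_image signature.

```
# Minimal REST smoke test. Assumes TF Serving runs locally with REST on 8501
# and MODEL_NAME=mnist; the blank 28x28 image is only a placeholder input.
import json

import numpy as np
import requests

instances = np.zeros((1, 28, 28, 1)).tolist()   # one blank image
payload = json.dumps({"signature_name": "serving_default", "instances": instances})
resp = requests.post('http://localhost:8501/v1/models/mnist:predict',
                     data=payload, headers={"content-type": "application/json"})
print(resp.json()['predictions'])   # ten softmax probabilities for the image
```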
--------------------------------------------------------------------------------
/mnist/running_service.py:
--------------------------------------------------------------------------------
import tempfile
import os
import subprocess

MODEL_DIR = tempfile.gettempdir()
version = 1
export_path = os.path.join(MODEL_DIR, str(version))

print(MODEL_DIR)

os.environ["MODEL_DIR"] = MODEL_DIR
# Trailing '&' so the server runs in the background; without it,
# subprocess.call would block here until the server exits.
cmd = 'nohup tensorflow_model_server \
    --rest_api_port=8502 \
    --model_name=fashion_model \
    --model_base_path="${MODEL_DIR}" >server.log 2>&1 &'
subprocess.call(cmd, shell=True)
--------------------------------------------------------------------------------
/mnist/train_saved_model.py:
--------------------------------------------------------------------------------
# TensorFlow and tf.keras
import tensorflow as tf
from tensorflow import keras

# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
import os
import shutil
import subprocess

tf.logging.set_verbosity(tf.logging.ERROR)
print(tf.__version__)


os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # dynamically grow the memory used on the GPU
# config.log_device_placement = True

sess = tf.Session(config=config)
keras.backend.set_session(sess)  # set this TensorFlow session as the default session for Keras


fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# scale the values to 0.0 to 1.0
train_images = train_images / 255.0
test_images = test_images / 255.0

# reshape for feeding into the model
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1)
test_images = test_images.reshape(test_images.shape[0], 28, 28, 1)

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

print('\ntrain_images.shape: {}, of {}'.format(train_images.shape, train_images.dtype))
print('test_images.shape: {}, of {}'.format(test_images.shape, test_images.dtype))


model = keras.Sequential([
    keras.layers.Conv2D(input_shape=(28, 28, 1), filters=8, kernel_size=3,
                        strides=2, activation='relu', name='Conv1'),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation=tf.nn.softmax, name='Softmax')
])

model.summary()

testing = False
epochs = 5

# model = tf.keras.utils.multi_gpu_model(model, gpus=2)

model.compile(optimizer=tf.train.AdamOptimizer(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=epochs)

test_loss, test_acc = model.evaluate(test_images, test_labels)
print('\nTest accuracy: {}'.format(test_acc))

# To load our trained model into TensorFlow Serving we first need to save it in SavedModel format.
import tempfile

MODEL_DIR = tempfile.gettempdir()
version = 1
export_path = os.path.join(MODEL_DIR, str(version))
print('export_path = {}\n'.format(export_path))
if os.path.isdir(export_path):
    print('\nAlready saved a model, cleaning up\n')
    shutil.rmtree(export_path)  # actually remove it; simple_save refuses to overwrite

tf.saved_model.simple_save(
    keras.backend.get_session(),
    export_path,
    inputs={'input_image': model.input},
    outputs={t.name: t for t in model.outputs})

print('\nSaved model:')

# saved_model_cli show --dir . --all
--------------------------------------------------------------------------------
/model_des.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wangruichens/tfserving/5107b07c0849ca16226f94943e029f3f53b3a19f/model_des.png
--------------------------------------------------------------------------------
/res.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wangruichens/tfserving/5107b07c0849ca16226f94943e029f3f53b3a19f/res.png
--------------------------------------------------------------------------------
/serving.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wangruichens/tfserving/5107b07c0849ca16226f94943e029f3f53b3a19f/serving.png
--------------------------------------------------------------------------------
/serving2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wangruichens/tfserving/5107b07c0849ca16226f94943e029f3f53b3a19f/serving2.png
--------------------------------------------------------------------------------
/serving_nginx/dockerfile:
--------------------------------------------------------------------------------
FROM nginx
COPY nginx.conf /etc/nginx
--------------------------------------------------------------------------------
/serving_nginx/make_request.py:
--------------------------------------------------------------------------------
##############################
#
# call tfserving using REST API demo
#
##############################

from matplotlib import pyplot as plt
from tensorflow import keras
import numpy as np
from time import time

mnist = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# reshape for feeding into the model
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1)
test_images = test_images.reshape(test_images.shape[0], 28, 28, 1)


def show(idx, title):
    plt.figure()
    plt.imshow(test_images[idx].reshape(28, 28))
    plt.axis('off')
    plt.title('\n\n{}'.format(title), fontdict={'size': 16})
    plt.show()


import random
import requests
import json


def test_post():
    # POST a batch of 400 images and return the request latency.
    # `rando` is only used by the commented-out single-image checks below.
    rando = random.randint(0, len(test_images) - 1)
    data = json.dumps({"signature_name": "serving_default", "instances": test_images[0:400].tolist()})
    # print('Data: {} ... {}'.format(data[:50], data[len(data)-52:]))
    headers = {"content-type": "application/json"}
    start = time()
    json_response = requests.post('http://algorithmsdemo.2345.cn/v1/models/mnist/versions/1:predict', data=data, headers=headers)
    elapsed = (time() - start)
    # print(json_response.text)
    # predictions = json.loads(json_response.text)['predictions']

    # print("Time used:{0}ms".format(round(elapsed * 1000, 2)))
    # print('predict: {} , actually: {} '.format(
    #     np.argmax(predictions[0]), test_labels[rando]))
    # show(rando, 'predict: {} , actually: {} '.format(
    #     np.argmax(predictions[0]), test_labels[rando]))
    return elapsed


sec = []
req_count = 100
for i in range(req_count):
    sec.append(test_post())
print('average cost time {0} ms'.format(np.sum(sec) / req_count * 1000))
--------------------------------------------------------------------------------
/serving_nginx/nginx.conf:
--------------------------------------------------------------------------------

user  nginx;
worker_processes  1;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;


events {
    worker_connections  1024;
}


http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$upstream_addr $remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    # Named upstream group for the TF Serving backends. (The original used the
    # literal IP 172.0.0.1 as the group name, which is legal but confusing.)
    upstream tf_servers
    {
        server 172.0.0.1:8501;
        server 172.0.0.2:8501;
        server 172.0.0.3:8501;
        server 172.0.0.4:8501;
        server 172.0.0.5:8501;
    }

    server
    {
        listen 8255;
        server_name 172.0.0.1;

        location / {
            proxy_pass http://tf_servers;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
}
--------------------------------------------------------------------------------
/serving_nginx/readme.md:
--------------------------------------------------------------------------------
Nginx + TF Serving using Docker

Set up load balancing for tf serving.

1.
First `docker pull` the latest nginx image and copy out its nginx.conf; adapt it to the format of ./nginx.conf here, configuring the IPs and ports of the nodes.
* $upstream_addr: a log_format field; when nginx is load balancing, it shows which backend actually served each request.

2.
Start the tfserving service on each machine:
```
docker run -d -p 8501:8501 --mount type=bind,source=/home/wangrc/mnist_model_for_serving,target=/models/mnist -e MODEL_NAME=mnist -t tensorflow/serving
```

3.
On the master node ha05, run:
```
docker run -d -p 8256:80 -p 8255:8255 -v '/home/wangrc/mnist_log:/var/log/nginx' --name nginx_server mynginx
```
This maps the nginx logs to local disk.

4. Check that the port works: submit POST requests with make_request.py, then check the access log to confirm that load balancing is working.
```
curl http://ha05:8255/v1/models/mnist
```
--------------------------------------------------------------------------------