├── Dockerfile
├── README.md
├── docker_to_ecr.sh
├── imgs
│   ├── docker.png
│   ├── server-app.png
│   ├── stack.png
│   ├── structure.png
│   └── wsgi.png
├── lambda
│   └── sagemaker-invoke.py
├── sagemaker-estimator.ipynb
└── sagemaker-estimator
    ├── nginx.conf
    ├── predictor.py
    ├── serve
    ├── train
    └── wsgi.py
/Dockerfile:
--------------------------------------------------------------------------------
1 | FROM ubuntu:20.04
2 |
3 | RUN apt-get -y update && apt-get install -y --no-install-recommends \
4 | wget \
5 | python3.8 \
6 | ca-certificates \
7 | python3-pip \
8 | python3-setuptools \
9 | python3-numpy \
10 | python3-scipy \
11 | python3-pandas \
12 | python3-sklearn \
13 | nginx \
14 | python3-flask \
15 | python3-gevent \
16 | gunicorn \
17 | python-is-python3
18 |
19 |
20 | ENV PYTHONUNBUFFERED=TRUE
21 | ENV PYTHONDONTWRITEBYTECODE=TRUE
22 | ENV PATH="/opt/program:${PATH}"
23 |
24 | COPY sagemaker-estimator /opt/program
25 | WORKDIR /opt/program
26 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Deploying Dockerized ML Models in AWS SageMaker
2 |
3 |
4 | ## Table of contents
5 |
6 | * [Docker](#Docker)
7 | * [ML Docker Structure for Sagemaker](#ML-Docker-Structure-for-Sagemaker)
8 | * [Execution Stack for Container](#Execution-Stack-for-Container)
9 | * [WSGI (Web Server Gateway Interface)](#WSGI-Web-Server-Gateway-Interface)
10 | * [Main Components](#Main-Components)
11 | * [Container Application](#Container-Application)
12 |
13 | Dataset can be found here: https://www.kaggle.com/andrewmvd/heart-failure-clinical-data
14 |
15 | It is very important that the target column stays in the first position (first column) of the dataset. The main reason is that the SageMaker Estimator accepts the dataset in the following format: {Y, X1, X2, ..., Xn}, where Y is the target variable and X1, ..., Xn are the features. This can be done in different ways; the current example uses the .pop() approach (found in the train file):
16 |
17 | ```
18 | first_column = train_data.pop(target_variable)
19 | train_data.insert(0, target_variable, first_column)
20 | ```
21 | To use it for your own dataset, just change **target_variable**.
22 |
23 | Also notice that you may need to change the separator in pd.read_csv() based on your exact dataset.
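As a concrete sketch of both points (the column names here are a hypothetical stand-in for the Kaggle dataset, and the separator is assumed to be ';'):

```python
import io

import pandas as pd

# Hypothetical stand-in for the heart failure CSV; the real file may use a
# different separator, so adjust sep= accordingly
csv_data = "age;smoking;heart_failure\n75;0;1\n55;1;0\n"
train_data = pd.read_csv(io.StringIO(csv_data), sep=';')

# Move the target column to the first position: {Y, X1, ..., Xn}
target_variable = 'heart_failure'
first_column = train_data.pop(target_variable)
train_data.insert(0, target_variable, first_column)

print(list(train_data.columns))  # ['heart_failure', 'age', 'smoking']
```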
24 |
25 | ## Docker
26 |
27 | Docker provides a simple way to package your code into a fully self-contained image. Once the image has been built, Docker can run a container based on that image. Because containers are isolated from each other and from the host, your program runs exactly the way you set it up.
28 |
29 | Compared to environments like virtualenv (or conda), Docker is completely language independent and captures the whole operating environment, including startup commands, environment variables, etc. In some ways a Docker container is like a virtual machine, but it is much lighter weight.
30 |
31 |
32 |
33 |
34 |
35 |
36 |
37 | ## ML Docker Structure for Sagemaker
38 |
39 |
40 |
41 |
42 |
43 |
44 | ### input
45 | * /opt/ml/input/config contains information that controls how your program runs. hyperparameters.json is a JSON-formatted dictionary mapping hyperparameter names to values. These values are always strings, so you may need to convert them. resourceConfig.json is a JSON-formatted file that describes the network layout used for distributed training. Since scikit-learn doesn't support distributed training, we ignore it here.
46 | * /opt/ml/input/data/<channel_name>/ (for File mode) contains the input data for that channel. The channels are created based on the call to CreateTrainingJob, but it's generally important that channels match what the algorithm expects. The files for each channel are copied from S3 to this directory, preserving the tree structure indicated by the S3 key structure.
47 |
48 | ### output
49 | * /opt/ml/model/ is the directory where you write the model that your algorithm generates. Your model can be in any format that you want. It can be a single file or a whole directory tree. SageMaker will package any files in this directory into a compressed tar archive file. This file will be available at the S3 location returned in the DescribeTrainingJob result.
50 | * /opt/ml/output is a directory where the algorithm can write a file failure that describes why the job failed. The contents of this file will be returned in the FailureReason field of the DescribeTrainingJob result. For jobs that succeed, there is no reason to write this file as it will be ignored.
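The input and output conventions above can be sketched in a few lines. The /opt/ml paths are the ones SageMaker mounts into the container; the hyperparameter name mentioned in the comment is a hypothetical example:

```python
import json
import os

# Paths fixed by the SageMaker container contract
PREFIX = '/opt/ml/'
PARAM_PATH = os.path.join(PREFIX, 'input/config/hyperparameters.json')
OUTPUT_PATH = os.path.join(PREFIX, 'output')


def load_hyperparameters(path=PARAM_PATH):
    # hyperparameters.json maps names to values; all values arrive as
    # strings, so convert them yourself, e.g. int(params['n_estimators'])
    with open(path) as f:
        return json.load(f)


def report_failure(message, output_dir=OUTPUT_PATH):
    # Writing /opt/ml/output/failure surfaces the message in the
    # FailureReason field of the DescribeTrainingJob result
    with open(os.path.join(output_dir, 'failure'), 'w') as f:
        f.write(message)
```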
51 |
52 | ## Execution Stack for Container
53 |
54 |
55 |
56 |
57 |
58 | * /ping is a simple health check endpoint that receives GET requests. If it returns 200 (Success), the container is up and running and ready to receive requests.
59 | * /invocations is the endpoint that receives client inference POST requests. The format of the request and the response is up to the algorithm.
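A minimal sketch of these two endpoints, mirroring the Flask approach used in predictor.py; here they are exercised with Flask's test client instead of a live server, and the canned CSV response stands in for a real model:

```python
import flask

app = flask.Flask(__name__)


@app.route('/ping', methods=['GET'])
def ping():
    # SageMaker polls this; return 200 once the model can be loaded
    return flask.Response(response='\n', status=200, mimetype='application/json')


@app.route('/invocations', methods=['POST'])
def invocations():
    # A real container would run the model on the posted CSV payload here
    return flask.Response(response='1\n', status=200, mimetype='text/csv')


# Exercise both endpoints without starting a server
client = app.test_client()
print(client.get('/ping').status_code)                       # 200
print(client.post('/invocations', data='1,2').status_code)   # 200
```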
60 |
61 |
62 | ## WSGI (Web Server Gateway Interface)
63 |
64 | WSGI consists of two parts:
65 |
66 | * Server part – usually a web server such as Nginx or Apache
67 | * App part – a web application created from Python scripts. In the case of ML models, these are usually REST API services wrapped in lightweight web frameworks such as Flask or Tornado.
68 |
69 | The server executes the web app and sends information and a callback function to the app. The request is processed on the app side, and a response is sent back to the server utilizing the callback function.
70 |
71 | Examples of Python frameworks that support WSGI include Django, CherryPy, Flask, TurboGears, and web2py.
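At its core, the WSGI contract is just a Python callable that receives the request environ dict and the server's start_response callback and returns an iterable of bytes. A minimal sketch, driven directly without a real server:

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    # The app side: process the request, then report status and headers
    # back to the server through the callback
    body = b'Hello from WSGI\n'
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]


# Simulate the server side: build a default environ and capture the callback
environ = {}
setup_testing_defaults(environ)
captured = {}


def start_response(status, headers):
    captured['status'] = status
    captured['headers'] = headers


result = b''.join(application(environ, start_response))
print(captured['status'], result)  # 200 OK b'Hello from WSGI\n'
```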
72 |
73 |
74 |
75 |
76 |
77 | ## Main Components
78 |
79 | * Dockerfile: Text file that contains all the commands used when you build an image with 'docker build'
80 |
81 | * docker_to_ecr.sh: Shell script that builds the Docker image from the Dockerfile and pushes it directly to AWS ECR (Elastic Container Registry). After this, the image can be used in SageMaker for fitting the Estimator and deploying the model. Requires the AWS CLI (Command Line Interface) to be installed and configured with the 'aws configure' command.
82 |
83 | * sagemaker-estimator: The main working directory for the ML model that you're building
84 |
85 | ## Container Application
86 |
87 | * train: The main script used for training your ML models. It can be combined with additional scripts for preprocessing, feature selection, etc.
88 | * serve: The wrapper that starts the inference server. This file usually stays as it is and can be reused across different ML models.
89 | * wsgi.py: Entry point that starts the individual workers.
90 | * predictor.py: Model prediction script combined with the Flask wrapper
91 | * nginx.conf: Configuration for the nginx master process (enables working with multiple workers)
92 |
93 |
--------------------------------------------------------------------------------
/docker_to_ecr.sh:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env bash
2 |
3 | image=$1
4 |
5 | if [ "$image" == "" ]
6 | then
7 | echo "Usage: $0 <image-name>"
8 | exit 1
9 | fi
10 |
11 | chmod +x sagemaker-estimator/train
12 | chmod +x sagemaker-estimator/serve
13 |
14 | account=$(aws sts get-caller-identity --query Account --output text)
15 |
16 | if [ $? -ne 0 ]
17 | then
18 | exit 255
19 | fi
20 |
21 |
22 | region=$(aws configure get region)
23 | region=${region:-eu-west-1}
24 |
25 |
26 | fullname="${account}.dkr.ecr.${region}.amazonaws.com/${image}:latest"
27 |
28 | aws ecr describe-repositories --repository-names "${image}" > /dev/null 2>&1
29 |
30 | if [ $? -ne 0 ]
31 | then
32 | aws ecr create-repository --repository-name "${image}" > /dev/null
33 | fi
34 |
35 | $(aws ecr get-login --region ${region} --no-include-email)  # AWS CLI v1; with CLI v2 use 'aws ecr get-login-password' piped to 'docker login'
36 |
37 |
38 | docker build -t ${image} .
39 | docker tag ${image} ${fullname}
40 |
41 | docker push ${fullname}
42 |
--------------------------------------------------------------------------------
/imgs/docker.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ds-muzalevskiy/sagemaker-docker-deploy/5e2b4710672867539b5f59985132307169435564/imgs/docker.png
--------------------------------------------------------------------------------
/imgs/server-app.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ds-muzalevskiy/sagemaker-docker-deploy/5e2b4710672867539b5f59985132307169435564/imgs/server-app.png
--------------------------------------------------------------------------------
/imgs/stack.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ds-muzalevskiy/sagemaker-docker-deploy/5e2b4710672867539b5f59985132307169435564/imgs/stack.png
--------------------------------------------------------------------------------
/imgs/structure.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ds-muzalevskiy/sagemaker-docker-deploy/5e2b4710672867539b5f59985132307169435564/imgs/structure.png
--------------------------------------------------------------------------------
/imgs/wsgi.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ds-muzalevskiy/sagemaker-docker-deploy/5e2b4710672867539b5f59985132307169435564/imgs/wsgi.png
--------------------------------------------------------------------------------
/lambda/sagemaker-invoke.py:
--------------------------------------------------------------------------------
1 | import os
2 | import io
3 | import boto3
4 | import json
5 | import csv
6 |
7 | ENDPOINT_NAME = os.environ['ENDPOINT_NAME']
8 | runtime= boto3.client('runtime.sagemaker')
9 |
10 | def lambda_handler(event, context):
11 | print("Received event: " + json.dumps(event, indent=2))
12 |
13 | data = json.loads(json.dumps(event))
14 | payload = data['data']
15 | print(payload)
16 |
17 | response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
18 | ContentType='text/csv',Body=payload)
19 | result = json.loads(response['Body'].read().decode())
20 |
21 | if(result=="0"):
22 | result="Not Failure"
23 | else:
24 | result="Failure"
25 | return result
--------------------------------------------------------------------------------
/sagemaker-estimator.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 39,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "import boto3\n",
10 | "import s3fs\n",
11 | "import re\n",
12 | "\n",
13 | "import os\n",
14 | "import numpy as np\n",
15 | "import pandas as pd\n",
16 | "from sagemaker import get_execution_role\n",
17 | "\n",
18 | "from io import StringIO\n",
19 | "\n",
20 | "role = get_execution_role()"
21 | ]
22 | },
23 | {
24 | "cell_type": "code",
25 | "execution_count": 40,
26 | "metadata": {},
27 | "outputs": [],
28 | "source": [
29 | "s3 = s3fs.S3FileSystem(anon=False)\n",
30 | "\n",
31 | "bucket = 'ml-presentation/'\n",
32 | "key = 'heart_failure_clinical_records_dataset.csv'\n",
33 | "\n",
34 | "df_key = pd.read_csv(s3.open('{}{}'.format(bucket, key)), sep=';')"
35 | ]
36 | },
37 | {
38 | "cell_type": "code",
39 | "execution_count": 41,
40 | "metadata": {},
41 | "outputs": [],
42 | "source": [
43 | "import sagemaker as sage\n",
44 | "from time import gmtime, strftime\n",
45 | "\n",
46 | "from sagemaker.predictor import CSVSerializer\n",
47 | "\n",
48 | "sess = sage.Session()"
49 | ]
50 | },
51 | {
52 | "cell_type": "code",
53 | "execution_count": 42,
54 | "metadata": {},
55 | "outputs": [
56 | {
57 | "data": {
58 | "text/plain": [
59 | "'s3://ml-presentation/heart_failure_clinical_records_dataset.csv'"
60 | ]
61 | },
62 | "execution_count": 42,
63 | "metadata": {},
64 | "output_type": "execute_result"
65 | }
66 | ],
67 | "source": [
68 | "data_location = 's3://' + bucket + key\n",
69 | "data_location"
70 | ]
71 | },
72 | {
73 | "cell_type": "code",
74 | "execution_count": 43,
75 | "metadata": {},
76 | "outputs": [
77 | {
78 | "data": {
198 | "text/plain": [
199 | " heart_failure anaemia creatinine_phosphokinase diabetes \\\n",
200 | "0 1 0 582 0 \n",
201 | "1 1 0 7861 0 \n",
202 | "2 1 0 146 0 \n",
203 | "3 1 1 111 0 \n",
204 | "4 1 1 160 1 \n",
205 | "\n",
206 | " ejection_fraction high_blood_pressure platelets serum_creatinine \\\n",
207 | "0 20 1 265000.00 1.9 \n",
208 | "1 38 0 263358.03 1.1 \n",
209 | "2 20 0 162000.00 1.3 \n",
210 | "3 20 0 210000.00 1.9 \n",
211 | "4 20 0 327000.00 2.7 \n",
212 | "\n",
213 | " serum_sodium sex smoking time age \n",
214 | "0 130 1 0 4 75.0 \n",
215 | "1 136 1 0 6 55.0 \n",
216 | "2 129 1 1 7 65.0 \n",
217 | "3 137 1 0 7 50.0 \n",
218 | "4 116 0 0 8 65.0 "
219 | ]
220 | },
221 | "execution_count": 43,
222 | "metadata": {},
223 | "output_type": "execute_result"
224 | }
225 | ],
226 | "source": [
227 | "df_key.head()"
228 | ]
229 | },
230 | {
231 | "cell_type": "code",
232 | "execution_count": 44,
233 | "metadata": {},
234 | "outputs": [
235 | {
236 | "name": "stdout",
237 | "output_type": "stream",
238 | "text": [
239 | "\n",
240 | "RangeIndex: 299 entries, 0 to 298\n",
241 | "Data columns (total 13 columns):\n",
242 | " # Column Non-Null Count Dtype \n",
243 | "--- ------ -------------- ----- \n",
244 | " 0 heart_failure 299 non-null int64 \n",
245 | " 1 anaemia 299 non-null int64 \n",
246 | " 2 creatinine_phosphokinase 299 non-null int64 \n",
247 | " 3 diabetes 299 non-null int64 \n",
248 | " 4 ejection_fraction 299 non-null int64 \n",
249 | " 5 high_blood_pressure 299 non-null int64 \n",
250 | " 6 platelets 299 non-null float64\n",
251 | " 7 serum_creatinine 299 non-null float64\n",
252 | " 8 serum_sodium 299 non-null int64 \n",
253 | " 9 sex 299 non-null int64 \n",
254 | " 10 smoking 299 non-null int64 \n",
255 | " 11 time 299 non-null int64 \n",
256 | " 12 age 299 non-null float64\n",
257 | "dtypes: float64(3), int64(10)\n",
258 | "memory usage: 30.5 KB\n"
259 | ]
260 | }
261 | ],
262 | "source": [
263 | "df_key.info()"
264 | ]
265 | },
266 | {
267 | "cell_type": "code",
268 | "execution_count": 45,
269 | "metadata": {
270 | "scrolled": true
271 | },
272 | "outputs": [],
273 | "source": [
274 | "account = sess.boto_session.client('sts').get_caller_identity()['Account']\n",
275 | "\n",
276 | "region = sess.boto_session.region_name\n",
277 | "\n",
278 | "image_uri = '{}.dkr.ecr.{}.amazonaws.com/modelling:latest'.format(account, region)"
279 | ]
280 | },
281 | {
282 | "cell_type": "code",
283 | "execution_count": 46,
284 | "metadata": {},
285 | "outputs": [
286 | {
287 | "name": "stdout",
288 | "output_type": "stream",
289 | "text": [
290 | "2020-11-09 23:23:27 Starting - Starting the training job...\n",
291 | "2020-11-09 23:23:28 Starting - Launching requested ML instances......\n",
292 | "2020-11-09 23:24:55 Starting - Preparing the instances for training......\n",
293 | "2020-11-09 23:25:40 Downloading - Downloading input data\n",
294 | "2020-11-09 23:25:40 Training - Downloading the training image..\u001b[34mTraining complete.\u001b[0m\n",
295 | "\n",
296 | "2020-11-09 23:26:13 Uploading - Uploading generated training model\n",
297 | "2020-11-09 23:26:13 Completed - Training job completed\n",
298 | "Training seconds: 39\n",
299 | "Billable seconds: 39\n"
300 | ]
301 | }
302 | ],
303 | "source": [
304 | "voting_clf = sage.estimator.Estimator(image_uri,\n",
305 | " role, 1, 'ml.c4.2xlarge',\n",
306 | " output_path=\"s3://{}/output\".format(sess.default_bucket()),\n",
307 | " sagemaker_session=sess)\n",
308 | "\n",
309 | "voting_clf.fit(data_location)"
310 | ]
311 | },
312 | {
313 | "cell_type": "code",
314 | "execution_count": 47,
315 | "metadata": {},
316 | "outputs": [
317 | {
318 | "name": "stdout",
319 | "output_type": "stream",
320 | "text": [
321 | "-----------!"
322 | ]
323 | }
324 | ],
325 | "source": [
326 | "predictor = voting_clf.deploy(1, 'ml.c4.2xlarge', serializer=CSVSerializer())"
327 | ]
328 | },
329 | {
330 | "cell_type": "code",
331 | "execution_count": 49,
332 | "metadata": {},
333 | "outputs": [],
334 | "source": [
335 | "xx=predictor.predict(df_key.iloc[:,1:].values)"
336 | ]
337 | },
338 | {
339 | "cell_type": "code",
340 | "execution_count": 50,
341 | "metadata": {},
342 | "outputs": [],
343 | "source": [
344 | "xx_str=str(xx,'utf-8')\n",
345 | "res_pred = StringIO(xx_str) "
346 | ]
347 | },
348 | {
349 | "cell_type": "code",
350 | "execution_count": 51,
351 | "metadata": {},
352 | "outputs": [
353 | {
354 | "data": {
427 | "text/plain": [
428 | " 0\n",
429 | "0 1\n",
430 | "1 1\n",
431 | "2 1\n",
432 | "3 1\n",
433 | "4 1\n",
434 | ".. ..\n",
435 | "294 0\n",
436 | "295 0\n",
437 | "296 0\n",
438 | "297 0\n",
439 | "298 0\n",
440 | "\n",
441 | "[299 rows x 1 columns]"
442 | ]
443 | },
444 | "execution_count": 51,
445 | "metadata": {},
446 | "output_type": "execute_result"
447 | }
448 | ],
449 | "source": [
450 | "res_pred = pd.read_csv(res_pred, header=None)\n",
451 | "res_pred"
452 | ]
453 | },
454 | {
455 | "cell_type": "code",
456 | "execution_count": null,
457 | "metadata": {},
458 | "outputs": [],
459 | "source": []
460 | }
461 | ],
462 | "metadata": {
463 | "kernelspec": {
464 | "display_name": "conda_python3",
465 | "language": "python",
466 | "name": "conda_python3"
467 | },
468 | "language_info": {
469 | "codemirror_mode": {
470 | "name": "ipython",
471 | "version": 3
472 | },
473 | "file_extension": ".py",
474 | "mimetype": "text/x-python",
475 | "name": "python",
476 | "nbconvert_exporter": "python",
477 | "pygments_lexer": "ipython3",
478 | "version": "3.6.10"
479 | }
480 | },
481 | "nbformat": 4,
482 | "nbformat_minor": 4
483 | }
484 |
--------------------------------------------------------------------------------
/sagemaker-estimator/nginx.conf:
--------------------------------------------------------------------------------
1 | worker_processes 1;
2 | daemon off;
3 |
4 |
5 | pid /tmp/nginx.pid;
6 | error_log /var/log/nginx/error.log;
7 |
8 | events {
9 | }
10 |
11 | http {
12 | include /etc/nginx/mime.types;
13 | default_type application/octet-stream;
14 | access_log /var/log/nginx/access.log combined;
15 |
16 | upstream gunicorn {
17 | server unix:/tmp/gunicorn.sock;
18 | }
19 |
20 | server {
21 | listen 8080 deferred;
22 | client_max_body_size 5m;
23 |
24 | keepalive_timeout 5;
25 | proxy_read_timeout 1200s;
26 |
27 | location ~ ^/(ping|invocations) {
28 | proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
29 | proxy_set_header Host $http_host;
30 | proxy_redirect off;
31 | proxy_pass http://gunicorn;
32 | }
33 |
34 | location / {
35 | return 404 "{}";
36 | }
37 | }
38 | }
39 |
--------------------------------------------------------------------------------
/sagemaker-estimator/predictor.py:
--------------------------------------------------------------------------------
1 | from __future__ import print_function
2 |
3 | import os
4 | import json
5 | import pickle
6 | import io
7 | import sys
8 | import signal
9 | import traceback
10 |
11 | import flask
12 |
13 | import pandas as pd
14 |
15 | prefix = '/opt/ml/'
16 | model_path = os.path.join(prefix, 'model')
17 |
18 |
19 | class ScoringService(object):
20 | model = None
21 |
22 | @classmethod
23 | def get_model(cls):
24 | if cls.model is None:
25 | with open(os.path.join(model_path, 'ml-model.pkl'), 'rb') as inp:
26 | cls.model = pickle.load(inp)
27 | return cls.model
28 |
29 | @classmethod
30 | def predict(cls, input):
31 |
32 | clf = cls.get_model()
33 | return clf.predict(input)
34 |
35 | app = flask.Flask(__name__)
36 |
37 | @app.route('/ping', methods=['GET'])
38 | def ping():
39 | health = ScoringService.get_model() is not None
40 |
41 | status = 200 if health else 404
42 | return flask.Response(response='\n', status=status, mimetype='application/json')
43 |
44 | @app.route('/invocations', methods=['POST'])
45 | def transformation():
46 | data = None
47 |
48 | if flask.request.content_type == 'text/csv':
49 | data = flask.request.data
50 | s = io.BytesIO(data)
51 | data = pd.read_csv(s, header=None)
52 | else:
53 | return flask.Response(response='This predictor only supports CSV data', status=415, mimetype='text/plain')
54 |
55 | print('Invoked with {} records'.format(data.shape[0]))
56 |
57 | predictions = ScoringService.predict(data)
58 |
59 | out = io.StringIO()
60 | pd.DataFrame({'results':predictions}).to_csv(out, header=False, index=False)
61 | result = out.getvalue()
62 |
63 | return flask.Response(response=result, status=200, mimetype='text/csv')
64 |
--------------------------------------------------------------------------------
/sagemaker-estimator/serve:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 |
3 | from __future__ import print_function
4 | import multiprocessing
5 | import os
6 | import signal
7 | import subprocess
8 | import sys
9 |
10 | cpu_count = multiprocessing.cpu_count()
11 |
12 | model_server_timeout = os.environ.get('MODEL_SERVER_TIMEOUT', 60)
13 | model_server_workers = int(os.environ.get('MODEL_SERVER_WORKERS', cpu_count))
14 |
15 | def sigterm_handler(nginx_pid, gunicorn_pid):
16 | try:
17 | os.kill(nginx_pid, signal.SIGQUIT)
18 | except OSError:
19 | pass
20 | try:
21 | os.kill(gunicorn_pid, signal.SIGTERM)
22 | except OSError:
23 | pass
24 |
25 | sys.exit(0)
26 |
27 | def start_server():
28 | print('Starting the inference server with {} workers.'.format(model_server_workers))
29 |
30 |
31 | subprocess.check_call(['ln', '-sf', '/dev/stdout', '/var/log/nginx/access.log'])
32 | subprocess.check_call(['ln', '-sf', '/dev/stderr', '/var/log/nginx/error.log'])
33 |
34 | nginx = subprocess.Popen(['nginx', '-c', '/opt/program/nginx.conf'])
35 | gunicorn = subprocess.Popen(['gunicorn',
36 | '--timeout', str(model_server_timeout),
37 | '-k', 'gevent',
38 | '-b', 'unix:/tmp/gunicorn.sock',
39 | '-w', str(model_server_workers),
40 | 'wsgi:app'])
41 |
42 | signal.signal(signal.SIGTERM, lambda a, b: sigterm_handler(nginx.pid, gunicorn.pid))
43 |
44 | pids = set([nginx.pid, gunicorn.pid])
45 | while True:
46 | pid, _ = os.wait()
47 | if pid in pids:
48 | break
49 |
50 | sigterm_handler(nginx.pid, gunicorn.pid)
51 | print('Inference server exiting')
52 |
53 |
54 | if __name__ == '__main__':
55 | start_server()
56 |
--------------------------------------------------------------------------------
/sagemaker-estimator/train:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 |
3 | from __future__ import print_function
4 |
5 | import os
6 | import json
7 | import pickle
8 | import sys
9 | import traceback
10 |
11 | import pandas as pd
12 |
13 | from sklearn.linear_model import LogisticRegression
14 | from sklearn.svm import SVC
15 | from sklearn.ensemble import VotingClassifier
16 | from sklearn.ensemble import RandomForestClassifier
17 |
18 | prefix = '/opt/ml/'
19 |
20 | input_path = prefix + 'input/data'
21 | output_path = os.path.join(prefix, 'output')
22 | model_path = os.path.join(prefix, 'model')
23 |
24 | channel_name='training'
25 | training_path = os.path.join(input_path, channel_name)
26 |
27 |
28 | def train():
29 |
30 | input_files = [ os.path.join(training_path, file) for file in os.listdir(training_path) ]
31 | if len(input_files) == 0:
32 | raise ValueError('No input files found in {} for channel {}'.format(training_path, channel_name))
33 | raw_data = [ pd.read_csv(file, sep=',') for file in input_files ]
34 | train_data = pd.concat(raw_data)
35 | target_variable = 'heart_failure'  # target column; change this for your own dataset
36 | first_column = train_data.pop(target_variable)
37 | train_data.insert(0, target_variable, first_column)
38 |
39 | train_y = train_data.iloc[:,0]
40 | train_X = train_data.iloc[:,1:]
41 |
42 | clf = VotingClassifier(estimators=[
43 | ('svm', SVC(probability=True)),
44 | ('lr', LogisticRegression()),
45 | ('rf', RandomForestClassifier())], voting='soft')
46 |
47 | grid = clf.fit(train_X, train_y)
48 |
49 | with open(os.path.join(model_path, 'ml-model.pkl'), 'wb') as out:
50 | pickle.dump(grid, out)
51 | print('Training complete.')
52 |
53 |
54 | if __name__ == '__main__':
55 | train()
56 |
57 | sys.exit(0)
58 |
--------------------------------------------------------------------------------
/sagemaker-estimator/wsgi.py:
--------------------------------------------------------------------------------
1 | import predictor as myapp
2 |
3 | app = myapp.app
4 |
--------------------------------------------------------------------------------