├── img ├── EKSCTL.png └── TorchServeOnAWS.png ├── CODE_OF_CONDUCT.md ├── template ├── cluster.yaml ├── eks_ami_policy.json ├── pt_inference.yaml └── cloud_watch_policy.json ├── delete_cluster.sh ├── LICENSE ├── installation.md ├── cloud_watch_util.sh ├── README.md ├── CONTRIBUTING.md ├── pt_serve_util.sh └── instructions.md /img/EKSCTL.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/torchserve-eks/HEAD/img/EKSCTL.png -------------------------------------------------------------------------------- /img/TorchServeOnAWS.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aws-samples/torchserve-eks/HEAD/img/TorchServeOnAWS.png -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | ## Code of Conduct 2 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 3 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 4 | opensource-codeofconduct@amazon.com with any additional questions or comments. 
5 | -------------------------------------------------------------------------------- /template/cluster.yaml: -------------------------------------------------------------------------------- 1 | apiVersion: eksctl.io/v1alpha5 2 | kind: ClusterConfig 3 | 4 | metadata: 5 | name: "your_cluster_name" 6 | region: "your_cluster_region" 7 | 8 | nodeGroups: 9 | - name: ng-1 10 | instanceType: g4dn.xlarge 11 | desiredCapacity: 1 12 | 13 | cloudWatch: 14 | clusterLogging: 15 | enableTypes: ["audit", "authenticator", "api", "controllerManager", "scheduler"] 16 | -------------------------------------------------------------------------------- /delete_cluster.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | # Util script for tearing down the cluster 3 | NODE_INSTANCE_ROLE_NAME=$(eksctl get iamidentitymapping --cluster $AWS_CLUSTER_NAME | tail -1 | cut -f1 | cut -f2 -d/) 4 | aws iam delete-role-policy --role-name ${NODE_INSTANCE_ROLE_NAME} --policy-name cw-log-policy 5 | eksctl delete cluster --name ${AWS_CLUSTER_NAME} 6 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 2 | 3 | Permission is hereby granted, free of charge, to any person obtaining a copy of 4 | this software and associated documentation files (the "Software"), to deal in 5 | the Software without restriction, including without limitation the rights to 6 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 7 | the Software, and to permit persons to whom the Software is furnished to do so. 8 | 9 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 10 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 11 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR 12 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 13 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 14 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 15 | 16 | -------------------------------------------------------------------------------- /template/eks_ami_policy.json: -------------------------------------------------------------------------------- 1 | { 2 | "Version": "2012-10-17", 3 | "Statement": [ 4 | { 5 | "Effect": "Allow", 6 | "Action": [ 7 | "iam:*" 8 | ], 9 | "Resource": [ 10 | "arn:aws:iam::{$ACCOUNT}:instance-profile/eksctl-*", 11 | "arn:aws:iam::{$ACCOUNT}:role/eksctl-*", 12 | "arn:aws:iam::{$ACCOUNT}:oidc-provider/oidc*" 13 | ] 14 | }, 15 | { 16 | "Effect": "Allow", 17 | "Action": [ 18 | "aws-marketplace:*", 19 | "ecs:*", 20 | "ec2:*", 21 | "cloudformation:*", 22 | "eks:*", 23 | "ecr:*", 24 | "ssm:GetParameter", 25 | "autoscaling:*", 26 | "logs:*", 27 | "elasticloadbalancing:*" 28 | ], 29 | "Resource": "*" 30 | } 31 | ] 32 | } 33 | -------------------------------------------------------------------------------- /template/pt_inference.yaml: -------------------------------------------------------------------------------- 1 | --- 2 | kind: Service 3 | apiVersion: v1 4 | metadata: 5 | name: your_service_name 6 | labels: 7 | app: your_service_name 8 | spec: 9 | ports: 10 | - name: preds 11 | port: 8080 12 | targetPort: ts 13 | - name: mdl 14 | port: 8081 15 | targetPort: ts-management 16 | type: LoadBalancer 17 | selector: 18 | app: your_service_name 19 | --- 20 | kind: Deployment 21 | apiVersion: apps/v1 22 | metadata: 23 | name: your_service_name 24 | labels: 25 | app: your_service_name 26 | spec: 27 | replicas: 1 28 | selector: 29 | matchLabels: 30 | app: your_service_name 31 | template: 32 | metadata: 33 | labels: 34 | app: your_service_name 35 | spec: 36 | containers: 37 | - name: your_service_name 38 | image: 
"pytorch/torchserve:latest-gpu" 39 | ports: 40 | - name: ts 41 | containerPort: 8080 42 | - name: ts-management 43 | containerPort: 8081 44 | imagePullPolicy: IfNotPresent 45 | resources: 46 | limits: 47 | cpu: 4 48 | memory: 4Gi 49 | nvidia.com/gpu: 1 50 | requests: 51 | cpu: "1" 52 | memory: 1Gi 53 | -------------------------------------------------------------------------------- /template/cloud_watch_policy.json: -------------------------------------------------------------------------------- 1 | { 2 | "Version": "2012-10-17", 3 | "Statement": [ 4 | { 5 | "Sid": "VisualEditor0", 6 | "Effect": "Allow", 7 | "Action": [ 8 | "logs:GetLogEvents", 9 | "logs:PutLogEvents" 10 | ], 11 | "Resource": "arn:aws:logs:*:*:log-group:/aws/containerinsights/{$AWS_CLUSTER_NAME}/*:log-stream:*" 12 | }, 13 | { 14 | "Sid": "VisualEditor1", 15 | "Effect": "Allow", 16 | "Action": [ 17 | "logs:CreateLogStream", 18 | "logs:DescribeLogStreams", 19 | "logs:PutRetentionPolicy" 20 | ], 21 | "Resource": "arn:aws:logs:*:*:log-group:/aws/containerinsights/{$AWS_CLUSTER_NAME}/*:*" 22 | }, 23 | { 24 | "Sid": "VisualEditor2", 25 | "Effect": "Allow", 26 | "Action": "logs:CreateLogGroup", 27 | "Resource": "*" 28 | }, 29 | { 30 | "Sid": "VisualEditor3", 31 | "Effect": "Allow", 32 | "Action": [ 33 | "logs:DescribeLogGroups" 34 | ], 35 | "Resource": "arn:aws:logs:*:*:log-group::log-stream:*" 36 | } 37 | ] 38 | } 39 | -------------------------------------------------------------------------------- /installation.md: -------------------------------------------------------------------------------- 1 | ## Installation Instructions for Linux 2 | 3 | Before beginning to setup the EKS cluster you must first install the required command line tools. 
You will need [Docker](https://www.docker.com/), [AWS CLI](https://aws.amazon.com/cli/), [KubeCTL](https://kubernetes.io/docs/tasks/tools/install-kubectl/), [EKSCTL](https://eksctl.io/) and [AWS IAM Authenticator](https://docs.aws.amazon.com/eks/latest/userguide/install-aws-iam-authenticator.html) installed to deploy [TorchServe](https://github.com/pytorch/serve) to EKS. On a Deep Learning AMI Ubuntu 18.04 instance, you can install them as follows: 4 | 5 | ``` 6 | sudo apt-get -y update 7 | 8 | # AWS CLI installation 9 | curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" 10 | unzip awscliv2.zip 11 | sudo ./aws/install 12 | 13 | # KubeCTL 14 | curl -LO https://storage.googleapis.com/kubernetes-release/release/`curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt`/bin/linux/amd64/kubectl 15 | chmod +x ./kubectl 16 | sudo mv ./kubectl /usr/local/bin/kubectl 17 | kubectl version --client 18 | 19 | # EKS CTL 20 | curl --silent --location "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp 21 | sudo mv /tmp/eksctl /usr/local/bin 22 | 23 | # AWS IAM Authenticator 24 | curl -o aws-iam-authenticator https://amazon-eks.s3.us-west-2.amazonaws.com/1.16.8/2020-04-16/bin/linux/amd64/aws-iam-authenticator 25 | chmod +x ./aws-iam-authenticator 26 | mkdir -p $HOME/bin && cp ./aws-iam-authenticator $HOME/bin/aws-iam-authenticator && export PATH=$PATH:$HOME/bin 27 | echo 'export PATH=$PATH:$HOME/bin' >> ~/.bashrc 28 | ``` 29 | -------------------------------------------------------------------------------- /cloud_watch_util.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | # Util functions to be used by scripts in this directory 3 | 4 | replace_text_in_file() { 5 | local FIND_TEXT=$1 6 | local REPLACE_TEXT=$2 7 | local SRC_FILE=$3 8 | 9 | sed -i.bak "s ${FIND_TEXT} ${REPLACE_TEXT} g" 
${SRC_FILE} 10 | rm $SRC_FILE.bak 11 | } 12 | 13 | attach_inline_policy() { 14 | declare -r POLICY_NAME="$1" POLICY_DOCUMENT="$2" IAM_ROLE="$3" 15 | echo "Attach inline policy $POLICY_NAME for iam role $IAM_ROLE" 16 | if ! aws iam put-role-policy --role-name $IAM_ROLE --policy-name $POLICY_NAME --policy-document file://${POLICY_DOCUMENT}; then 17 | echo "Unable to attach iam inline policy $POLICY_NAME to role $IAM_ROLE" >&2 18 | exit 1 19 | fi 20 | } 21 | 22 | get_node_instance_role() { 23 | # i.e. convert from 24 | # ARN USERNAME GROUPS 25 | # arn:aws:iam::12345:role/eksctl-my-cluster-nodegroup-Gpu-NodeInstanceRole-ABCDEFG system:node:{{EC2PrivateDNSName}} system:bootstrappers,system:nodes 26 | # 27 | # to 28 | # eksctl-my-cluster-nodegroup-Gpu-NodeInstanceRole-ABCDEFG 29 | echo "Getting NodeInstanceRoleName" 30 | export NODE_INSTANCE_ROLE_NAME=$(eksctl get iamidentitymapping --cluster $AWS_CLUSTER_NAME | tail -1 | cut -f1 | cut -f2 -d/) 31 | echo "NodeInstanceRoleName: $NODE_INSTANCE_ROLE_NAME" 32 | 33 | } 34 | 35 | if [ -z ${AWS_CLUSTER_NAME+x} ]; then 36 | echo "AWS_CLUSTER_NAME is unset" 37 | exit 1 38 | else 39 | echo "AWS_CLUSTER_NAME is set to '$AWS_CLUSTER_NAME'"; 40 | fi 41 | 42 | if [ -z ${AWS_REGION+x} ]; then 43 | echo "AWS_REGION is unset" 44 | exit 1 45 | else 46 | echo "AWS_REGION is set to '$AWS_REGION'"; 47 | fi 48 | 49 | get_node_instance_role 50 | 51 | if [[ -z "$NODE_INSTANCE_ROLE_NAME" ]]; then 52 | echo "NODE_INSTANCE_ROLE_NAME cannot be empty." 
53 | exit 1 54 | fi 55 | 56 | cp template/cloud_watch_policy.json cloud_watch_policy.json 57 | replace_text_in_file '{$AWS_CLUSTER_NAME}' ${AWS_CLUSTER_NAME} cloud_watch_policy.json 58 | attach_inline_policy cw-log-policy cloud_watch_policy.json $NODE_INSTANCE_ROLE_NAME 59 | 60 | # check https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-setup-EKS-quickstart.html 61 | curl https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/quickstart/cwagent-fluentd-quickstart.yaml \ 62 | | sed "s/{{cluster_name}}/${AWS_CLUSTER_NAME}/;s/{{region_name}}/${AWS_REGION}/" | kubectl apply -f - -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Running TorchServe on Amazon Elastic Kubernetes Service 2 | 3 | This repo contains the code for the blog post: [Running TorchServe on Amazon Elastic Kubernetes Service](https://aws.amazon.com/blogs/opensource/running-torchserve-on-amazon-elastic-kubernetes-service/) published to the AWS Open Source blog. 4 | 5 | ![alt text](/img/TorchServeOnAWS.png) 6 | 7 | [TorchServe](https://github.com/pytorch/serve) makes it easy to deploy and manage PyTorch models at scale in production environments. TorchServe is built and maintained by AWS in collaboration with Facebook and is available as part of the PyTorch open-source project. 8 | 9 | TorchServe supports any machine learning environment, including [Amazon Elastic Kubernetes Service (EKS)](https://aws.amazon.com/eks/). 10 | 11 | ## The benefits of TorchServe 12 | 13 | TorchServe makes it easy to deploy PyTorch models at scale in production environments. It delivers lightweight serving with low latency, so you can deploy your models for high-performance inference. 
It provides default handlers for the most common applications such as object detection and text classification, so you don’t have to write custom code to deploy your models. With powerful TorchServe features including multi-model serving, model versioning for A/B testing, metrics for monitoring, and RESTful endpoints for application integration, you can take your models from research to production quickly. TorchServe supports any machine learning environment, including Amazon SageMaker, Kubernetes, Amazon EKS, and Amazon EC2. 14 | 15 | ## The benefits of Amazon EKS 16 | 17 | Amazon EKS takes advantage of the fact that it is running in the AWS cloud making great use of many AWS services and features, while ensuring that everything you already know about Kubernetes remains applicable and helpful. EKS is deeply integrated with services such as Amazon CloudWatch, Auto Scaling Groups, AWS Identity and Access Management (IAM), and Amazon Virtual Private Cloud (VPC), providing you a seamless experience to monitor, scale, and load-balance your applications. 
18 | 19 | ## The directory structure of this repository 20 | ``` bash 21 | ├── LICENSE 22 | ├── README.md 23 | ├── cloud_watch_util.sh # Script to set up CloudWatch logs 24 | ├── delete_cluster.sh # Script to tear down the EKS cluster 25 | ├── img 26 | │   ├── EKSCTL.png 27 | │   └── TorchServeOnAWS.png 28 | ├── installation.md # How to install command line tools 29 | ├── instructions.md # Step-by-step setup instructions 30 | ├── pt_serve_util.sh # Script to auto-gen manifest files 31 | └── template # A directory with all template files 32 | ├── cloud_watch_policy.json # IAM CloudWatch policy template 33 | ├── cluster.yaml # EKS cluster manifest template 34 | ├── eks_ami_policy.json # IAM user policy template 35 | └── pt_inference.yaml # TorchServe manifest template 36 | ``` 37 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing Guidelines 2 | 3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional 4 | documentation, we greatly value feedback and contributions from our community. 5 | 6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary 7 | information to effectively respond to your bug report or contribution. 8 | 9 | 10 | ## Reporting Bugs/Feature Requests 11 | 12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features. 13 | 14 | When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already 15 | reported the issue. Please try to include as much information as you can. 
Details like these are incredibly useful: 16 | 17 | * A reproducible test case or series of steps 18 | * The version of our code being used 19 | * Any modifications you've made relevant to the bug 20 | * Anything unusual about your environment or deployment 21 | 22 | 23 | ## Contributing via Pull Requests 24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that: 25 | 26 | 1. You are working against the latest source on the *master* branch. 27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already. 28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted. 29 | 30 | To send us a pull request, please: 31 | 32 | 1. Fork the repository. 33 | 2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change. 34 | 3. Ensure local tests pass. 35 | 4. Commit to your fork using clear commit messages. 36 | 5. Send us a pull request, answering any default questions in the pull request interface. 37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation. 38 | 39 | GitHub provides additional document on [forking a repository](https://help.github.com/articles/fork-a-repo/) and 40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/). 41 | 42 | 43 | ## Finding contributions to work on 44 | Looking at the existing issues is a great way to find something to contribute on. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start. 45 | 46 | 47 | ## Code of Conduct 48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 
49 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 50 | opensource-codeofconduct@amazon.com with any additional questions or comments. 51 | 52 | 53 | ## Security issue notifications 54 | If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public github issue. 55 | 56 | 57 | ## Licensing 58 | 59 | See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution. 60 | 61 | We may ask you to sign a [Contributor License Agreement (CLA)](http://en.wikipedia.org/wiki/Contributor_License_Agreement) for larger changes. 62 | -------------------------------------------------------------------------------- /pt_serve_util.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | # Util functions to be used by scripts in this directory 3 | 4 | replace_text_in_file() { 5 | local FIND_TEXT=$1 6 | local REPLACE_TEXT=$2 7 | local SRC_FILE=$3 8 | 9 | sed -i.bak "s ${FIND_TEXT} ${REPLACE_TEXT} g" ${SRC_FILE} 10 | rm $SRC_FILE.bak 11 | } 12 | 13 | check_installed_deps() { 14 | declare -a pt_deps=("aws" "eksctl" "kubectl") 15 | 16 | for pt_dep in "${pt_deps[@]}"; do 17 | if ! which "${pt_dep}" &>/dev/null && ! type -a "${pt_dep}" &>/dev/null ; then 18 | echo "You don't have ${pt_dep} installed. Please install ${pt_dep}." 19 | exit 1 20 | fi 21 | done 22 | } 23 | 24 | check_aws_iam_authenticator() { 25 | if ! which "aws-iam-authenticator" &>/dev/null && ! type -a "aws-iam-authenticator" &>/dev/null ; then 26 | echo "You don't have aws-iam-authenticator installed. Please install aws-iam-authenticator. 
https://docs.aws.amazon.com/eks/latest/userguide/install-aws-iam-authenticator.html" 27 | exit 1 28 | fi 29 | } 30 | 31 | check_env_variables() { 32 | if [ -z ${K8S_MANIFESTS_DIR+x} ]; then 33 | echo "K8S_MANIFESTS_DIR is unset" 34 | exit 1 35 | else 36 | echo "K8S_MANIFESTS_DIR is set to '$K8S_MANIFESTS_DIR'"; 37 | fi 38 | 39 | if [ -z ${AWS_CLUSTER_NAME+x} ]; then 40 | echo "AWS_CLUSTER_NAME is unset" 41 | exit 1 42 | else 43 | echo "AWS_CLUSTER_NAME is set to '$AWS_CLUSTER_NAME'"; 44 | fi 45 | 46 | if [ -z ${AWS_REGION+x} ]; then 47 | echo "AWS_REGION is unset" 48 | exit 1 49 | else 50 | echo "AWS_REGION is set to '$AWS_REGION'"; 51 | fi 52 | 53 | if [ -z ${AWS_ACCOUNT+x} ]; then 54 | echo "AWS_ACCOUNT is unset" 55 | exit 1 56 | else 57 | echo "AWS_ACCOUNT is set to '$AWS_ACCOUNT'"; 58 | fi 59 | 60 | if [ -z ${PT_SERVE_NAME+x} ]; then 61 | echo "PT_SERVE_NAME is unset" 62 | exit 1 63 | else 64 | echo "PT_SERVE_NAME is set to '$PT_SERVE_NAME'"; 65 | fi 66 | } 67 | 68 | ## Prepare Infrastructure Configurations 69 | generate_aws_infra_configs() { 70 | # Create the infrastructure configs if they don't exist. 71 | if [ ! 
-d "${K8S_MANIFESTS_DIR}" ]; then 72 | echo "Creating AWS infrastructure configs in directory ${K8S_MANIFESTS_DIR}" 73 | mkdir -p "${K8S_MANIFESTS_DIR}" 74 | else 75 | echo AWS infrastructure configs already exist in directory "${K8S_MANIFESTS_DIR}" 76 | fi 77 | 78 | # Copy template files into the manifest dir 79 | cp template/pt_inference.yaml ${K8S_MANIFESTS_DIR}/pt_inference.yaml 80 | cp template/cluster.yaml ${K8S_MANIFESTS_DIR}/cluster.yaml 81 | cp template/eks_ami_policy.json eks_ami_policy.json 82 | 83 | # Replace placeholders with user configurations 84 | replace_text_in_file "your_cluster_name" ${AWS_CLUSTER_NAME} ${K8S_MANIFESTS_DIR}/cluster.yaml 85 | replace_text_in_file "your_cluster_region" ${AWS_REGION} ${K8S_MANIFESTS_DIR}/cluster.yaml 86 | 87 | image=torchserve 88 | # Get account and region (must have previously configured your AWS cli) 89 | # account=$(aws sts get-caller-identity --query Account --output text) 90 | #region=$(aws configure get region) 91 | IMAGE_URI="${AWS_ACCOUNT}.dkr.ecr.${AWS_REGION}.amazonaws.com/${image}" 92 | replace_text_in_file "your_image_ecr_uri" ${IMAGE_URI} ${K8S_MANIFESTS_DIR}/pt_inference.yaml 93 | replace_text_in_file "your_service_name" ${PT_SERVE_NAME} ${K8S_MANIFESTS_DIR}/pt_inference.yaml 94 | replace_text_in_file '{$REGION}' ${AWS_REGION} eks_ami_policy.json 95 | replace_text_in_file '{$ACCOUNT}' ${AWS_ACCOUNT} eks_ami_policy.json 96 | } 97 | 98 | check_installed_deps 99 | check_aws_iam_authenticator 100 | check_env_variables 101 | generate_aws_infra_configs 102 | -------------------------------------------------------------------------------- /instructions.md: -------------------------------------------------------------------------------- 1 | # Instructions 2 | 3 | ## Getting Started 4 | 5 | Please [install](https://github.com/smart-patrol/pytorch-serve-eks/blob/master/installation.md) the required packages before starting this walkthrough. 
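
Before moving on, it can help to confirm that the command line tools are actually on your `PATH`. The snippet below is a minimal sketch, not part of this repository: `missing_tools` is a hypothetical helper name, and the tool list is taken from `installation.md`, so adjust it to your setup.

```shell
# Hypothetical helper (not part of this repository): print every command
# from the argument list that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools named in installation.md; this list is an assumption about your setup.
missing="$(missing_tools docker aws kubectl eksctl aws-iam-authenticator)"
if [ -n "$missing" ]; then
  # Unquoted on purpose so the names are printed on one line.
  echo "Missing tools:" $missing
fi
```

If nothing is printed, all five tools were found. `pt_serve_util.sh` performs a similar check itself and exits on the first missing tool.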
6 | 7 | ## Setup Environment Variables 8 | 9 | ``` 10 | export AWS_ACCOUNT= 11 | export AWS_REGION= 12 | export K8S_MANIFESTS_DIR= 13 | export AWS_CLUSTER_NAME= 14 | export PT_SERVE_NAME= 15 | ``` 16 | 17 | ## Create EKS manifest files 18 | 19 | ``` 20 | git clone https://github.com/smart-patrol/pytorch-serve-eks 21 | 22 | cd pytorch-serve-eks 23 | 24 | ./pt_serve_util.sh 25 | ``` 26 | 27 | ## Setup IAM Roles and Policies 28 | 29 | An IAM user needs certain AWS resource permissions to set up the EKS cluster for TorchServe. However, if you set up the TorchServe EKS cluster using an AWS admin account, skip this step on IAM policies and jump directly to **Subscribe to EKS-optimized AMI with GPU Support in the AWS Marketplace** below. 30 | 31 | (A prerequisite for this step is an IAM user named "*EKSUser*". To create one, see [Creating an IAM User](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html).) 32 | 33 | The following two steps require admin privileges. 34 | 35 | ### Create IAM Policy 36 | 37 | ``` 38 | aws iam create-policy --policy-name eks_ami_policy \ 39 | --policy-document file://eks_ami_policy.json 40 | ``` 41 | 42 | ### Attach policy to user 43 | ``` 44 | aws iam attach-user-policy \ 45 | --policy-arn arn:aws:iam::${AWS_ACCOUNT}:policy/eks_ami_policy \ 46 | --user-name EKSUser 47 | ``` 48 | 49 | ## Switch User 50 | If the user designated to set up the TorchServe EKS cluster is *EKSUser*, switch to *EKSUser* for both AWS API and AWS Console interactions before executing the following steps. 
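
After switching, you may want to confirm which identity the AWS CLI is using. The following is a rough sketch under stated assumptions: `arn_to_name` is a hypothetical helper that is not part of this repository, and the account ID in the example ARN is a placeholder.

```shell
# Hypothetical helper (not part of this repository): reduce an IAM ARN such as
# arn:aws:iam::123456789012:user/EKSUser to its trailing name.
arn_to_name() {
  # Strip everything up to and including the last "/".
  printf '%s\n' "${1##*/}"
}

# With the AWS CLI configured, the active identity could be checked like this:
#   arn_to_name "$(aws sts get-caller-identity --query Arn --output text)"

# Offline example with a placeholder ARN:
arn_to_name "arn:aws:iam::123456789012:user/EKSUser"   # prints EKSUser
```

If the printed name is not the user you expect, check your AWS CLI profile or credentials before creating the cluster.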
51 | 52 | ## Subscribe to EKS-optimized AMI with GPU Support in the AWS Marketplace 53 | 54 | Subscribe [here](https://aws.amazon.com/marketplace/pp/B07GRHFXGM) 55 | 56 | 63 | 64 | ## Creating an EKS Cluster 65 | 66 | ``` 67 | eksctl create cluster -f ${K8S_MANIFESTS_DIR}/cluster.yaml 68 | ``` 69 | 70 | ## Install NVIDIA device plugin for Kubernetes 71 | 72 | ``` 73 | kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/master/deployments/static/nvidia-device-plugin.yml 74 | 75 | kubectl get daemonset -n kube-system 76 | ``` 77 | 78 | ## Deploy Pods to EKS cluster 79 | 80 | ``` 81 | NAMESPACE=pt-inference; kubectl create namespace ${NAMESPACE} 82 | 83 | kubectl -n ${NAMESPACE} apply -f ${K8S_MANIFESTS_DIR}/pt_inference.yaml 84 | 85 | kubectl get pods -n ${NAMESPACE} 86 | ``` 87 | Wait until the pod shows `STATUS=Running` before proceeding. 88 | 89 | ## Setup Logging on CloudWatch 90 | 91 | ``` 92 | ./cloud_watch_util.sh 93 | ``` 94 | 95 | Check out logs at: `/aws/containerinsights/${AWS_CLUSTER_NAME}/application/${PT_SERVE_NAME}*` 96 | 97 | ## Register Models with TorchServe 98 | 99 | ``` 100 | EXTERNAL_IP=`kubectl get svc -n ${NAMESPACE} -o jsonpath='{.items[0].status.loadBalancer.ingress[0].hostname}'` 101 | 102 | response=$(curl --write-out %{http_code} --silent --output /dev/null --retry 5 -X POST "http://${EXTERNAL_IP}:8081/models?url=https://torchserve.s3.amazonaws.com/mar_files/resnet-18.mar&initial_workers=1&synchronous=true") 103 | 104 | if [ 
"$response" != 200 ] 105 | then 106 | echo "failed to register model with torchserve" 107 | else 108 | echo "successfully registered model with torchserve" 109 | fi 110 | ``` 111 | 112 | Optionally, if you want to use port forwarding: 113 | 114 | ``` 115 | kubectl port-forward -n ${NAMESPACE} `kubectl get pods -n ${NAMESPACE} --selector=app=${PT_SERVE_NAME} -o jsonpath='{.items[0].metadata.name}'` 8080:8080 8081:8081 & 116 | ``` 117 | 118 | ## Inference on Endpoint 119 | 120 | ``` 121 | # Get a sample image 122 | wget https://raw.githubusercontent.com/pytorch/serve/master/docs/images/kitten_small.jpg 123 | 124 | curl -X POST http://${EXTERNAL_IP}:8080/predictions/resnet-18 -T kitten_small.jpg 125 | ``` 126 | 127 | List the registered models: 128 | 129 | ``` 130 | curl -X GET http://${EXTERNAL_IP}:8081/models/ 131 | ``` 132 | 133 | ## Cleaning up 134 | 135 | ``` 136 | ./delete_cluster.sh 137 | ``` 138 | --------------------------------------------------------------------------------