├── Dockerfile
├── LICENSE
├── README.md
├── codefresh.yml
├── example
│   ├── .gitignore
│   ├── Dockerfile
│   ├── codefresh.yml
│   ├── deployment.yml
│   ├── hello-web.go
│   └── service.yml
└── k8s-canary-rollout.sh
/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM codefresh/kube-helm:master 2 | 3 | RUN mkdir /app 4 | 5 | COPY k8s-canary-rollout.sh /app 6 | 7 | RUN chmod +x /app/k8s-canary-rollout.sh 8 | 9 | CMD /app/k8s-canary-rollout.sh $WORKING_VOLUME $SERVICE_NAME $DEPLOYMENT_NAME $TRAFFIC_INCREMENT $NAMESPACE $NEW_VERSION $SLEEP_SECONDS 10 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2019 Codefresh 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # This Repository is not used anymore 2 | 3 | Please use the new repo for Pull requests and issues at https://github.com/codefresh-io/steps/tree/master/incubating/k8s-canary-deployment 4 | 5 | 6 | # Kubernetes deployment with canaries 7 | 8 | This repository holds a bash script that allows you to perform canary deployments on a Kubernetes cluster. 9 | It accompanies the respective [Codefresh blog post](https://codefresh.io/kubernetes-tutorial/fully-automated-canary-deployments-kubernetes/). 10 | 11 | ## Description 12 | 13 | The script expects you to have an existing deployment and service on your K8s cluster. It does the following: 14 | 15 | 1. Reads the existing deployment from the cluster into a yml file 16 | 1. Changes the name of the deployment and the docker image to the new version 17 | 1. Deploys 1 replica of the new version (the canary) 18 | 1. Waits for some time (configurable) and checks the number of restarts 19 | 1. If everything is ok, it adds more canaries and scales down the production instances 20 | 1. The cycle continues until all replicas used by the service are canaries (the production replicas are zero) 21 | 22 | If something goes wrong (the pods have restarts), the script deletes all canaries and scales 23 | back the production version to the original number of replicas. 24 | 25 | 26 | Of course, during the wait period when both deployments are active, you are free to run your own additional 27 | checks or integration tests to see if the new deployment is ok. 28 | 29 | The canary percentage is configurable. 
The script will automatically calculate the phases. 30 | 31 | Example: 32 | 33 | * Production instance has 5 replicas 34 | * User sets the canary increment to 35% 35 | * The script calculates that 35% is about 2 pods 36 | 37 | | Phase | Production | Canary | 38 | | ------------- | ------------- |------| 39 | | Original | 5 | 0 | 40 | | A | 5 | 1 | 41 | | B | 3 | 3 | 42 | | C | 1 | 5 | 43 | | Final | 0 | 5 | 44 | 45 | ## Prerequisites 46 | 47 | As a convention the script expects: 48 | 49 | 1. The name of your deployment to be $APP_NAME-$VERSION 50 | 1. Your service to have a metadata label that shows which deployment is currently "in production" 51 | 52 | Notice that the new canary deployment created by the script will follow the same conventions. This 53 | way each subsequent pipeline you run will work in the same manner. 54 | 55 | You can see examples of the labels in the sample application: 56 | 57 | * [service](example/service.yml) 58 | * [deployment](example/deployment.yml) 59 | 60 | ## How to use the script on its own 61 | 62 | The script needs one environment variable called `KUBE_CONTEXT` that selects the K8s cluster that will be used (if you have more than one). 63 | 64 | The rest of the parameters are provided as command line arguments: 65 | 66 | | Parameter | Argument Number | Description | 67 | | ----------| --------------- | --------------- | 68 | | Working directory | 1 | Folder used for temp/debug files | 69 | | Service | 2 | Name of the existing service | 70 | | Deployment | 3 | Name of the existing deployment | 71 | | Traffic increment | 4 | Percentage of pods to convert to canaries at each stage | 72 | | Namespace | 5 | Kubernetes namespace that will be used | 73 | | New version | 6 | Tag of the new docker image | 74 | | Health seconds | 7 | Time to wait before each canary stage | 75 | 76 | 77 | Here is an example (the working directory is the first argument): 78 | 79 | ``` 80 | ./k8s-canary-rollout.sh . myService myApp 20 my-namespace 73df943 30 81 | ``` 82 | 83 | ## How to do Canary deployments in Codefresh 84 | 85 | 
The script also comes with a Dockerfile that allows you to use it as a Docker image in any Docker-based workflow such as Codefresh. 86 | 87 | For the `KUBE_CONTEXT` environment variable just use the name of your cluster as found in the Codefresh Kubernetes dashboard. For the rest of the arguments you need to define them as parameters in your [codefresh.yml](example/codefresh.yml) file. 88 | 89 | ``` 90 | canaryDeploy: 91 | title: "Deploying new version ${{CF_SHORT_REVISION}}" 92 | image: codefresh/k8s-canary:master 93 | environment: 94 | - WORKING_VOLUME=. 95 | - SERVICE_NAME=my-demo-app 96 | - DEPLOYMENT_NAME=my-demo-app 97 | - TRAFFIC_INCREMENT=20 98 | - NEW_VERSION=${{CF_SHORT_REVISION}} 99 | - SLEEP_SECONDS=40 100 | - NAMESPACE=canary 101 | - KUBE_CONTEXT=myDemoAKSCluster 102 | ``` 103 | 104 | The `CF_SHORT_REVISION` variable is offered by Codefresh and contains the git hash of the version that was just pushed. See all variables in the [official documentation](https://codefresh.io/docs/docs/codefresh-yaml/variables/). 105 | 106 | ## Docker Hub 107 | 108 | The canary step is also published on Docker Hub: 109 | 110 | https://hub.docker.com/r/codefresh/k8s-canary/ 111 | 112 | 113 | ## Future work 114 | 115 | Further improvements: 116 | 117 | * Make the script create an initial deployment/service if nothing is deployed in the Kubernetes cluster 118 | * Add more complex and configurable healthchecks 119 | 120 | -------------------------------------------------------------------------------- /codefresh.yml: -------------------------------------------------------------------------------- 1 | version: '1.0' 2 | steps: 3 | BuildingDockerImage: 4 | title: Building Docker Image 5 | type: build 6 | image_name: k8s-canary 7 | tag: '${{CF_BRANCH}}' 8 | dockerfile: Dockerfile -------------------------------------------------------------------------------- /example/.gitignore: -------------------------------------------------------------------------------- 1 | # Compiled 
Object files, Static and Dynamic libs (Shared Objects) 2 | *.o 3 | *.a 4 | *.so 5 | 6 | # Folders 7 | _obj 8 | _test 9 | bin 10 | 11 | # Architecture specific extensions/prefixes 12 | *.[568vq] 13 | [568vq].out 14 | 15 | *.cgo1.go 16 | *.cgo2.c 17 | _cgo_defun.c 18 | _cgo_gotypes.go 19 | _cgo_export.* 20 | 21 | _testmain.go 22 | 23 | *.exe 24 | *.test 25 | *.prof 26 | -------------------------------------------------------------------------------- /example/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM golang:alpine AS build-env 2 | ADD . /src 3 | RUN cd /src && go build -o hello-web 4 | 5 | FROM alpine 6 | 7 | EXPOSE 8080 8 | 9 | WORKDIR /app 10 | COPY --from=build-env /src/hello-web /app/ 11 | ENTRYPOINT ./hello-web 12 | 13 | -------------------------------------------------------------------------------- /example/codefresh.yml: -------------------------------------------------------------------------------- 1 | version: '1.0' 2 | steps: 3 | BuildingDockerImage: 4 | title: Building Docker Image 5 | type: build 6 | image_name: trivial-web 7 | working_directory: ./example/ 8 | tag: '${{CF_SHORT_REVISION}}' 9 | dockerfile: Dockerfile 10 | canaryDeploy: 11 | title: "Deploying new version ${{CF_SHORT_REVISION}}" 12 | image: codefresh/k8s-canary:master 13 | environment: 14 | - WORKING_VOLUME=. 
15 | - SERVICE_NAME=my-demo-app 16 | - DEPLOYMENT_NAME=my-demo-app 17 | - TRAFFIC_INCREMENT=20 18 | - NEW_VERSION=${{CF_SHORT_REVISION}} 19 | - SLEEP_SECONDS=40 20 | - NAMESPACE=canary 21 | - KUBE_CONTEXT=myDemoAKSCluster 22 | -------------------------------------------------------------------------------- /example/deployment.yml: -------------------------------------------------------------------------------- 1 | apiVersion: extensions/v1beta1 2 | kind: Deployment 3 | metadata: 4 | name: my-demo-app-c39f076 5 | namespace: canary 6 | spec: 7 | replicas: 5 8 | template: 9 | metadata: 10 | labels: 11 | name: my-demo-app 12 | app: my-demo-app 13 | spec: 14 | containers: 15 | - name: my-demo-app 16 | image: r.cfcr.io/kostis-codefresh/trivial-web:c39f076 17 | imagePullPolicy: Always 18 | ports: 19 | - name: http 20 | containerPort: 8080 21 | protocol: TCP 22 | imagePullSecrets: 23 | - name: codefresh-generated-r.cfcr.io-cfcr-canary -------------------------------------------------------------------------------- /example/hello-web.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "fmt" 5 | "net/http" 6 | ) 7 | 8 | func indexHandler(w http.ResponseWriter, r *http.Request) { 9 | fmt.Fprintf(w, "I am a GO application version 2 running in a Docker image.") 10 | 11 | } 12 | 13 | func main() { 14 | fmt.Println("Trivial web server is starting on port 8080...") 15 | http.HandleFunc("/", indexHandler) 16 | http.ListenAndServe(":8080", nil) 17 | } 18 | -------------------------------------------------------------------------------- /example/service.yml: -------------------------------------------------------------------------------- 1 | apiVersion: v1 2 | kind: Service 3 | metadata: 4 | creationTimestamp: 2018-08-06T11:25:47Z 5 | labels: 6 | version: "c39f076" 7 | name: my-demo-app 8 | namespace: canary 9 | spec: 10 | 11 | externalTrafficPolicy: Cluster 12 | ports: 13 | - name: http1 14 | port: 80 15 | 
protocol: TCP 16 | targetPort: 8080 17 | selector: 18 | app: my-demo-app 19 | sessionAffinity: None 20 | type: LoadBalancer 21 | 22 | -------------------------------------------------------------------------------- /k8s-canary-rollout.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | healthcheck(){ 4 | echo "[CANARY INFO] Starting healthcheck" 5 | h=true 6 | 7 | #Start custom healthcheck: unhealthy if the app's pods have more than 2 restarts in total 8 | output=$(kubectl get pods -l app="$DEPLOYMENT_NAME" -n $NAMESPACE --no-headers) 9 | echo "[CANARY HEALTH] $output" 10 | s=$(echo "$output" | awk '{s+=$4}END{print s}') 11 | c=$(echo "$output" | wc -l) 12 | 13 | if [ "$s" -gt "2" ]; then 14 | h=false 15 | fi 16 | ##if [ "$c" -lt "1" ]; then 17 | ## h=false 18 | ##fi 19 | #End custom healthcheck 20 | 21 | if [ "$h" != "true" ]; then 22 | echo "[CANARY HEALTH] Canary is unhealthy" 23 | cancel 24 | else 25 | echo "[CANARY HEALTH] Service healthy." 26 | fi 27 | } 28 | 29 | cancel(){ 30 | echo "[CANARY] Cancelling rollout - healthcheck failed" 31 | 32 | echo "[CANARY SCALE] Restoring original deployment to $PROD_DEPLOYMENT" 33 | kubectl apply -f $WORKING_VOLUME/original_deployment.yaml -n $NAMESPACE 34 | kubectl rollout status deployment/$PROD_DEPLOYMENT -n $NAMESPACE 35 | 36 | #we could also just scale to 0. 
37 | echo "[CANARY DELETE] Removing canary deployment completely" 38 | kubectl delete deployment $CANARY_DEPLOYMENT 39 | 40 | echo "[CANARY DELETE] Removing canary horizontal pod autoscaler completely" 41 | kubectl delete hpa $CANARY_DEPLOYMENT -n $NAMESPACE 42 | 43 | exit 1 44 | } 45 | 46 | cleanup(){ 47 | echo "[CANARY CLEANUP] removing previous deployment $PROD_DEPLOYMENT" 48 | kubectl delete deployment $PROD_DEPLOYMENT -n $NAMESPACE 49 | 50 | echo "[CANARY CLEANUP] removing previous horizontal pod autoscaler $PROD_DEPLOYMENT" 51 | kubectl delete hpa $PROD_DEPLOYMENT -n $NAMESPACE 52 | 53 | echo "[CANARY CLEANUP] marking canary as new production" 54 | kubectl get service $SERVICE_NAME -o=yaml --namespace=${NAMESPACE} | sed -e "s/$CURRENT_VERSION/$NEW_VERSION/g" | kubectl apply --namespace=${NAMESPACE} -f - 55 | } 56 | 57 | incrementservice(){ 58 | percent=$1 59 | starting_replicas=$2 60 | 61 | #debug 62 | #echo "Increasing canaries to $percent percent, max replicas is $starting_replicas" 63 | 64 | prod_replicas=$(kubectl get deployment $PROD_DEPLOYMENT -n $NAMESPACE -o=jsonpath='{.spec.replicas}') 65 | canary_replicas=$(kubectl get deployment $CANARY_DEPLOYMENT -n $NAMESPACE -o=jsonpath='{.spec.replicas}') 66 | echo "[CANARY INFO] Production has now $prod_replicas replicas, canary has $canary_replicas replicas" 67 | 68 | # This gets the floor for pods, 2.69 will equal 2 69 | let increment="($percent*$starting_replicas*100)/(100-$percent)/100" 70 | 71 | echo "[CANARY INFO] We will increment canary and decrease production for $increment replicas" 72 | 73 | let new_prod_replicas="$prod_replicas-$increment" 74 | #Sanity check 75 | if [ "$new_prod_replicas" -lt "0" ]; then 76 | new_prod_replicas=0 77 | fi 78 | 79 | let new_canary_replicas="$canary_replicas+$increment" 80 | #Sanity check 81 | if [ "$new_canary_replicas" -ge "$starting_replicas" ]; then 82 | new_canary_replicas=$starting_replicas 83 | new_prod_replicas=0 84 | fi 85 | 86 | echo "[CANARY SCALE] Setting 
canary replicas to $new_canary_replicas" 87 | kubectl -n $NAMESPACE scale --replicas=$new_canary_replicas deploy/$CANARY_DEPLOYMENT 88 | 89 | echo "[CANARY SCALE] Setting production replicas to $new_prod_replicas" 90 | kubectl -n $NAMESPACE scale --replicas=$new_prod_replicas deploy/$PROD_DEPLOYMENT 91 | 92 | #Wait a bit until production instances are down. This should always succeed 93 | kubectl -n $NAMESPACE rollout status deployment/$PROD_DEPLOYMENT 94 | 95 | #Calculate increments. N = the number of starting pods, I = Increment value, X = how many pods to add 96 | # x / (N + x) = I 97 | # Starting pods N = 5 98 | # Desired increment I = 0.35 99 | # Solve for X 100 | # X / (5+X)= 0.35 101 | # X = .35(5+x) 102 | # X = 1.75 + .35x 103 | # X-.35X=1.75 104 | # .65X = 1.75 105 | # X = 35/13 106 | # X = 2.69 107 | # X = 3 108 | # 5+3 = 8 #3/8 = 37.5% 109 | # Round A B 110 | # 1 5 3 111 | # 2 2 6 112 | # 3 0 5 113 | } 114 | 115 | copy_deployment(){ 116 | #Replace old deployment name with new 117 | sed -Ei -- "s/name\: $PROD_DEPLOYMENT/name: $CANARY_DEPLOYMENT/g" $WORKING_VOLUME/canary_deployment.yaml 118 | echo "[CANARY INFO] Replaced deployment name" 119 | 120 | #Replace docker image 121 | sed -Ei -- "s/$CURRENT_VERSION/$NEW_VERSION/g" $WORKING_VOLUME/canary_deployment.yaml 122 | echo "[CANARY INFO] Replaced image name" 123 | echo "[CANARY INFO] Production deployment is $PROD_DEPLOYMENT, canary is $CANARY_DEPLOYMENT" 124 | } 125 | 126 | input_deployment(){ 127 | #Output the user-provided yml file to use as deployment object 128 | echo "${INPUT_DEPLOYMENT}" > ${WORKING_VOLUME}/canary_deployment.yaml 129 | } 130 | 131 | mainloop(){ 132 | echo "[CANARY INFO] Selecting Kubernetes cluster" 133 | kubectl config use-context "${KUBE_CONTEXT}" 134 | 135 | echo "[CANARY INFO] Locating current version" 136 | CURRENT_VERSION=$(kubectl get service $SERVICE_NAME -o=jsonpath='{.metadata.labels.version}' --namespace=${NAMESPACE}) 137 | 138 | if [ "$CURRENT_VERSION" == "$NEW_VERSION" ]; then 
139 | echo "[DEPLOY NOP] NEW_VERSION is same as CURRENT_VERSION. Both are at $CURRENT_VERSION" 140 | exit 0 141 | fi 142 | 143 | echo "[CANARY INFO] current version is $CURRENT_VERSION" 144 | PROD_DEPLOYMENT=$DEPLOYMENT_NAME-$CURRENT_VERSION 145 | 146 | echo "[CANARY INFO] Locating current deployment" 147 | kubectl get deployment $PROD_DEPLOYMENT -n $NAMESPACE -o=yaml > $WORKING_VOLUME/canary_deployment.yaml 148 | 149 | echo "[CANARY INFO] keeping a backup of original deployment" 150 | cp $WORKING_VOLUME/canary_deployment.yaml $WORKING_VOLUME/original_deployment.yaml 151 | 152 | echo "[CANARY INFO] Reading current docker image" 153 | IMAGE=$(kubectl get deployment $PROD_DEPLOYMENT -n $NAMESPACE -o=yaml | grep image: | sed -E 's/.*image: (.*)/\1/') 154 | echo "[CANARY INFO] found image $IMAGE" 155 | echo "[CANARY INFO] Finding current replicas" 156 | 157 | if [[ -n ${INPUT_DEPLOYMENT} ]]; then 158 | #Allow user to provide custom new yaml deployment object 159 | input_deployment 160 | 161 | if ! 
STARTING_REPLICAS=$(grep "replicas:" < ${WORKING_VOLUME}/canary_deployment.yaml | awk '{print $2}'); then 162 | echo "[CANARY INFO] Failed getting replicas from input file: ${WORKING_VOLUME}/canary_deployment.yaml" 163 | echo "[CANARY INFO] Using the same number of replicas from prod deployment" 164 | STARTING_REPLICAS=$(kubectl get deployment $PROD_DEPLOYMENT -n $NAMESPACE -o=jsonpath='{.spec.replicas}') 165 | fi 166 | else 167 | #Copy existing deployment and update image only 168 | copy_deployment 169 | 170 | STARTING_REPLICAS=$(kubectl get deployment $PROD_DEPLOYMENT -n $NAMESPACE -o=jsonpath='{.spec.replicas}') 171 | echo "[CANARY INFO] Found replicas $STARTING_REPLICAS" 172 | fi 173 | 174 | #Start with one replica 175 | sed -Ei -- "s#replicas: $STARTING_REPLICAS#replicas: 1#g" $WORKING_VOLUME/canary_deployment.yaml 176 | echo "[CANARY INIT] Launching 1 pod with canary" 177 | 178 | #Launch canary 179 | kubectl apply -f $WORKING_VOLUME/canary_deployment.yaml -n $NAMESPACE 180 | 181 | echo "[CANARY INFO] Awaiting canary pod..." 
182 | while [ $(kubectl get pods -l app="$DEPLOYMENT_NAME" -n $NAMESPACE --no-headers | wc -l) -eq 0 ] 183 | do 184 | sleep 2 185 | done 186 | 187 | echo "[CANARY INFO] Canary target replicas: $STARTING_REPLICAS" 188 | 189 | healthcheck 190 | 191 | while [ $TRAFFIC_INCREMENT -lt 100 ] 192 | do 193 | p=$((p + $TRAFFIC_INCREMENT)) 194 | if [ "$p" -gt "100" ]; then 195 | p=100 196 | fi 197 | echo "[CANARY INFO] Rollout is at $p percent" 198 | 199 | incrementservice $TRAFFIC_INCREMENT $STARTING_REPLICAS 200 | 201 | if [ "$p" == "100" ]; then 202 | cleanup 203 | echo "[CANARY INFO] Done" 204 | exit 0 205 | fi 206 | echo "[CANARY INFO] Will now sleep for $SLEEP_SECONDS seconds" 207 | sleep $SLEEP_SECONDS 208 | healthcheck 209 | done 210 | } 211 | 212 | if [ "$1" != "" ] && [ "$2" != "" ] && [ "$3" != "" ] && [ "$4" != "" ] && [ "$5" != "" ] && [ "$6" != "" ] && [ "$7" != "" ]; then 213 | WORKING_VOLUME=${1%/} 214 | SERVICE_NAME=$2 215 | DEPLOYMENT_NAME=$3 216 | TRAFFIC_INCREMENT=$4 217 | NAMESPACE=$5 218 | NEW_VERSION=$6 219 | SLEEP_SECONDS=$7 220 | CANARY_DEPLOYMENT=$DEPLOYMENT_NAME-$NEW_VERSION 221 | else 222 | echo -e "USAGE\n k8s-canary-rollout.sh [WORKING_VOLUME] [SERVICE_NAME] [DEPLOYMENT_NAME] [TRAFFIC_INCREMENT] [NAMESPACE] [NEW_VERSION] [SLEEP_SECONDS]" 223 | echo -e "\t [WORKING_VOLUME] - This should be set with \${{CF_VOLUME_PATH}}" 224 | echo -e "\t [SERVICE_NAME] - Name of the current service" 225 | echo -e "\t [DEPLOYMENT_NAME] - The name of the current deployment" 226 | echo -e "\t [TRAFFIC_INCREMENT] - Integer between 1-99; percentage of pods converted to canaries at each step" 227 | echo -e "\t [NAMESPACE] - Namespace of the application" 228 | echo -e "\t [NEW_VERSION] - The next version of the Docker image" 229 | echo -e "\t [SLEEP_SECONDS] - Seconds to wait before each canary step" 230 | exit 1; 231 | fi 232 | 233 | echo $BASH_VERSION 234 | 235 | mainloop 236 | --------------------------------------------------------------------------------
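As a side note, the integer arithmetic that `incrementservice` uses to decide how many pods to move per phase can be checked in isolation. The following is a minimal sketch, not part of the repository; the `compute_increment` helper is a hypothetical standalone wrapper around the same expression the script evaluates with `let`:

```shell
#!/usr/bin/env bash
# Sketch of the scaling math from incrementservice:
# increment = floor(percent * starting_replicas / (100 - percent)),
# computed with bash integer division exactly as in the script.
compute_increment() {
  local percent=$1 starting_replicas=$2
  echo $(( (percent * starting_replicas * 100) / (100 - percent) / 100 ))
}

# 35% of 5 replicas: exact answer is 35/13 ~= 2.69, floored to 2
compute_increment 35 5
# 20% of 5 replicas: exact answer is 1.25, floored to 1
compute_increment 20 5
```

Note that the expression divides by `100 - percent`, which is why a traffic increment of 100 is invalid: bash would abort with a division-by-zero error.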