├── LICENSE.md
├── Makefile
├── README.md
├── js-templates
│   ├── rsXX-arb-template.js
│   ├── rsXX-cfg-template.js
│   ├── rsXX-pri-template.js
│   ├── rsXX-sec-template.js
│   └── shardkeys.js
├── src
│   ├── automate.sh
│   ├── clean.sh
│   ├── configure.sh
│   ├── create.sh
│   ├── delete.sh
│   ├── generate.sh
│   └── remote.sh
└── yaml-templates
    ├── cfgXX-template.yaml
    ├── mgsXX-template.yaml
    ├── nodeXX-template.yaml
    ├── rsXX-template.yaml
    ├── separator.yaml
    ├── svcXX-port-template.yaml
    ├── svcXX-template.yaml
    ├── volumes-head.yaml
    └── volumes-template.yaml

--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
1 | Copyright (c) 2011-2017 GitHub Inc.
2 | 
3 | Permission is hereby granted, free of charge, to any person obtaining
4 | a copy of this software and associated documentation files (the
5 | "Software"), to deal in the Software without restriction, including
6 | without limitation the rights to use, copy, modify, merge, publish,
7 | distribute, sublicense, and/or sell copies of the Software, and to
8 | permit persons to whom the Software is furnished to do so, subject to
9 | the following conditions:
10 | 
11 | The above copyright notice and this permission notice shall be
12 | included in all copies or substantial portions of the Software.
13 | 
14 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17 | NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18 | LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20 | WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
21 | 

--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | # Copyright 2016 The StyxLab Authors All rights reserved.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 | #
15 | # Build tools.
16 | #
17 | 
18 | build:
19 | 	src/clean.sh
20 | 	src/generate.sh
21 | .PHONY: build
22 | 
23 | run:
24 | 	src/create.sh
25 | 	sleep 60
26 | 	src/automate.sh
27 | .PHONY: run
28 | 
29 | config:
30 | 	src/configure.sh
31 | .PHONY: config
32 | 
33 | clean:
34 | 	src/clean.sh
35 | .PHONY: clean
36 | 
37 | create:
38 | 	src/create.sh
39 | .PHONY: create
40 | 
41 | delete:
42 | 	src/delete.sh
43 | .PHONY: delete

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # kubernetes-mongodb-shard
2 | Deploy a mongodb sharded cluster on kubernetes. This works on both small clusters with a minimum of 3 nodes and large clusters with 100+ nodes.
3 | 
4 | ## Prerequisites
5 | - A Kubernetes cluster with at least 3 schedulable nodes.
6 | - Kubernetes v1.2.3 or greater
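A quick way to check both prerequisites up front (a sketch; it assumes `kubectl` already points at the target cluster and mirrors the node count that `src/generate.sh` derives later):
```
$ kubectl version
$ kubectl get nodes | grep -v "SchedulingDisabled" | tail -n +2 | wc -l
```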
7 | 
8 | ## Features
9 | - Configurable number of shards, replicas, config servers and mongos
10 | - Shard members and data replicas are distributed evenly on available nodes
11 | - Storage is directly allocated on each node
12 | - All mongo servers are combined into one kubernetes pod per node
13 | - Services are set up that can be consumed upstream
14 | - Official mongodb docker image is used without modifications
15 | 
16 | ## Description
17 | Setting up a mongodb shard on kubernetes is easy with this repo. `kubectl`
18 | is used to determine the number of nodes in your cluster
19 | and the provided shell script `src/generate.sh` creates one kubernetes `yaml`
20 | file per node as well as the necessary `js` config scripts. Finally, the
21 | shard is automatically created by executing the `yaml` files and applying the
22 | config scripts.
23 | 
24 | Great care has been taken to distribute data across the cluster to
25 | maximize data redundancy and high availability. In addition we
26 | bind disk space with the kubernetes `hostPath` option in order
27 | to maximize I/O throughput.
28 | 
29 | Replication is achieved by the built-in mongodb feature rather than by kubernetes
30 | itself. However, as kubernetes knows about the desired state of your shard, it
31 | will try to restore all services automatically should one node go down.
32 | 
33 | ## Usage
34 | ```
35 | $ git clone https://github.com/styxlab/kubernetes-mongodb-shard.git
36 | $ cd kubernetes-mongodb-shard
37 | $ make build
38 | ```
39 | All needed files can be found in the `build` folder. You should find one
40 | `yaml` file for each node of your cluster and a couple of `js`
41 | files that will configure the mongodb shard. Finally, you
42 | need to execute these files on your kubernetes cluster:
43 | ```
44 | $ make run
45 | ```
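`make run` simply chains the scripts in `src/`. If you prefer to run the steps by hand, for example to watch the pods come up before the replica sets are initialized, the equivalent is:
```
$ src/create.sh      # kubectl create -f build/nodeXX-deployment.yaml, once per node
$ sleep 60           # give the pods time to start
$ src/automate.sh    # initialize replica sets, config servers and the shard
```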
46 | ## Verify
47 | After a minute or two (depending on how fast the docker images are fetched over your network) you should see that all deployments are up and running. For a 3-node shard, a typical output is shown below.
48 | ```
49 | $ kubectl get deployments -l role="mongoshard"
50 | NAME                   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
51 | mongodb-shard-node01   1         1         1            1           1d
52 | mongodb-shard-node02   1         1         1            1           1d
53 | mongodb-shard-node03   1         1         1            1           1d
54 | 
55 | $ kubectl get pods -l role="mongoshard"
56 | NAME                                    READY     STATUS    RESTARTS   AGE
57 | mongodb-shard-node01-1358154500-wyv5n   5/5       Running   0          1d
58 | mongodb-shard-node02-1578289992-i49fw   5/5       Running   0          1d
59 | mongodb-shard-node03-4184329044-vwref   5/5       Running   0          1d
60 | ```
61 | You can now connect to one of the mongos and inspect the status of the shard:
62 | ```
63 | $ kubectl exec -ti mongodb-shard-node01-1358154500-wyv5n -c mgs01-node01 mongo
64 | MongoDB shell version: 3.2.6
65 | connecting to: test
66 | mongos>
67 | ```
68 | Type `sh.status()` at the mongos prompt:
69 | ```
70 | mongos> sh.status()
71 | --- Sharding Status ---
72 |   sharding version: {
73 |     "_id" : 1,
74 |     "minCompatibleVersion" : 5,
75 |     "currentVersion" : 6,
76 |     "clusterId" : ObjectId("575abbcb568388677e5336ef")
77 |   }
78 |   shards:
79 |     { "_id" : "rs01", "host" : "rs01/mongodb-node01.default.svc.cluster.local:27020,mongodb-node02.default.svc.cluster.local:27021" }
80 |     { "_id" : "rs02", "host" : "rs02/mongodb-node01.default.svc.cluster.local:27021,mongodb-node03.default.svc.cluster.local:27020" }
81 |     { "_id" : "rs03", "host" : "rs03/mongodb-node02.default.svc.cluster.local:27020,mongodb-node03.default.svc.cluster.local:27021" }
82 |   active mongoses:
83 |     "3.2.6" : 3
84 |   balancer:
85 |     Currently enabled:  yes
86 |     Currently running:  no
87 |     Failed balancer rounds in last 5 attempts:  0
88 |     Migration Results for the last 24 hours:
89 |       No recent migrations
90 |   databases:
91 |     { "_id" : "styxmail", "primary" : "rs01", "partitioned" : true }
92 | ```
93 | 
94 | ## Consume
95 | The default configuration configures one mongos service per node. Use one of them to connect
96 | to your shard from any other application on your kubernetes cluster:
97 | ```
98 | $ kubectl get svc -l role="mongoshard"
99 | NAME             CLUSTER-IP   EXTERNAL-IP   PORT(S)                                             AGE
100 | mongodb-node01   10.3.0.175   <none>        27019/TCP,27018/TCP,27017/TCP,27020/TCP,27021/TCP   1d
101 | mongodb-node02   10.3.0.13    <none>        27019/TCP,27018/TCP,27017/TCP,27020/TCP,27021/TCP   1d
102 | mongodb-node03   10.3.0.47    <none>        27019/TCP,27018/TCP,27017/TCP,27020/TCP,27021/TCP   1d
103 | ```
104 | 
105 | ## Configuration Options
106 | Configuration options are currently hard-coded in `src/generate.sh`. This will be enhanced later. The following options are available:
107 | ```
108 | NODES: number of cluster nodes (default: all nodes on your cluster as determined by kubectl)
109 | SHARDS: number of shards in your mongo database (default: number of cluster nodes)
110 | MONGOS_PER_CLUSTER: you connect to your shard through mongos (default: one per node, minimum: 1)
111 | CFG_PER_CLUSTER: config servers per cluster (default: 1 config server, configured as a replica set)
112 | CFG_REPLICAS: number of replicas per configuration cluster (default: number of nodes)
113 | REPLICAS_PER_SHARD: each shard is configured as a replica set (default: 2)
114 | ```
115 | 
116 | ## Ports
117 | As each pod gets one IP address assigned, each service within a pod must have a distinct port.
118 | As the mongos are the services by which you access your shard from other applications, the standard
119 | mongodb port `27017` is given to them.
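From any pod inside the cluster, a client therefore only needs the mongos services on that port; a connection string might look like this (service names as created by this repo, in the `default` namespace):
```
mongodb://mongodb-node01.default.svc.cluster.local:27017,mongodb-node02.default.svc.cluster.local:27017
```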
Here is the list of port assignments: 120 | 121 | | Service | Port | 122 | | ------------------------------- | --------| 123 | | Mongos | 27017 | 124 | | Config Server | 27018 | 125 | | Arbiter (if present) | 27019 | 126 | | Replication Server (Primary) | 27020 | 127 | | Replication Server (Secondary) | 27021 | 128 | 129 | Usually you need not be concerned about the ports as you will only access the shard through the 130 | standard port `27017`. 131 | 132 | ## Examples 133 | A typical `yaml` file for one node is shown below: 134 | ``` 135 | apiVersion: v1 136 | kind: Service 137 | metadata: 138 | name: mongodb-node01 139 | labels: 140 | app: mongodb-node01 141 | role: mongoshard 142 | tier: backend 143 | spec: 144 | selector: 145 | app: mongodb-shard-node01 146 | role: mongoshard 147 | tier: backend 148 | ports: 149 | - name: arb03-node01 150 | port: 27019 151 | protocol: TCP 152 | - name: cfg01-node01 153 | port: 27018 154 | protocol: TCP 155 | - name: mgs01-node01 156 | port: 27017 157 | protocol: TCP 158 | - name: rsp01-node01 159 | port: 27020 160 | protocol: TCP 161 | - name: rss02-node01 162 | port: 27021 163 | protocol: TCP 164 | 165 | --- 166 | 167 | apiVersion: extensions/v1beta1 168 | kind: Deployment 169 | metadata: 170 | name: mongodb-shard-node01 171 | spec: 172 | replicas: 1 173 | template: 174 | metadata: 175 | labels: 176 | app: mongodb-shard-node01 177 | role: mongoshard 178 | tier: backend 179 | spec: 180 | nodeSelector: 181 | kubernetes.io/hostname: 80.40.200.130 182 | containers: 183 | - name: arb03-node01 184 | image: mongo:3.2 185 | args: 186 | - "--storageEngine" 187 | - wiredTiger 188 | - "--replSet" 189 | - rs03 190 | - "--port" 191 | - "27019" 192 | - "--noprealloc" 193 | - "--smallfiles" 194 | ports: 195 | - name: arb03-node01 196 | containerPort: 27019 197 | volumeMounts: 198 | - name: db-rs03 199 | mountPath: /data/db 200 | - name: rss02-node01 201 | image: mongo:3.2 202 | args: 203 | - "--storageEngine" 204 | - wiredTiger 205 | - "--replSet" 206 | - rs02 207 | - "--port" 208 | - "27021" 209 | - "--noprealloc" 210 | - "--smallfiles" 211 | ports: 212 | - name: rss02-node01 213 | containerPort: 27021 214 | volumeMounts: 215 | - name: db-rs02 216 | mountPath: /data/db 217 | - name: rsp01-node01 218 | image: mongo:3.2 219 | args: 220 | - "--storageEngine" 221 | - wiredTiger 222 | - "--replSet" 223 | - rs01 224 | - "--port" 225 | - "27020" 226 | - "--noprealloc" 227 | - "--smallfiles" 228 | ports: 229 | - name: rsp01-node01 230 | containerPort: 27020 231 | volumeMounts: 232 | - name: db-rs01 233 | mountPath: /data/db 234 | - name: cfg01-node01 235 | image: mongo:3.2 236 | args: 237 | - "--storageEngine" 238 | - wiredTiger 239 | - "--configsvr" 240 | - "--replSet" 241 | - configReplSet01 242 | - "--port" 243 | - "27018" 244 | - "--noprealloc" 245 | - "--smallfiles" 246 | ports: 247 | - name: cfg01-node01 248 | containerPort: 27018 249 | volumeMounts: 250 | - name: db-cfg 251 | mountPath: /data/db 252 | - name: mgs01-node01 253 | image: mongo:3.2 254 | command: 255 | - "mongos" 256 | args: 257 | - "--configdb" 258 | - "configReplSet01/mongodb-node01.default.svc.cluster.local:27018,mongodb-node02.default.svc.cluster.local:27018,mongodb-node03.default.svc.cluster.local:27018" 259 | - "--port" 260 | - "27017" 261 | ports: 262 | - name: mgs01-node01 263 | containerPort: 27017 264 | volumes: 265 | - name: db-cfg 266 | hostPath: 267 | path: /enc/mongodb/db-cfg 268 | - name: db-rs01 269 | hostPath: 270 | path: /enc/mongodb/db-rs01 271 | - name: db-rs02 272 | hostPath: 273 | 
path: /enc/mongodb/db-rs02
274 |       - name: db-rs03
275 |         hostPath:
276 |           path: /enc/mongodb/db-rs03
277 | ```
278 | 
279 | ## Layouts
280 | In order to get an understanding of how this script distributes the different mongodb servers and replica
281 | sets on your cluster, a couple of examples are shown. First, take note of the notation:
282 | 
283 | | Abbreviation | Meaning                    |
284 | | -------------| ---------------------------|
285 | | columns      | nodes                      |
286 | | rows         | shards                     |
287 | | -            | no assignment              |
288 | | rsp          | replica set (primary)      |
289 | | rss          | replica set (secondary)    |
290 | | arb          | arbiter                    |
291 | 
292 | ### 3 nodes, 3 shards, 2 shards per node, 1 arbiter
293 | 
294 | |         | node 1 | node 2  | node 3 |
295 | | ------- | ------ | ------- | ------ |
296 | | shard 1 | rsp    | rss     | arb    |
297 | | shard 2 | rss    | arb     | rsp    |
298 | | shard 3 | arb    | rsp     | rss    |
299 | 
300 | As can be seen, the secondary of a particular shard is always on a different node than the primary.
301 | This guarantees redundancy should a node fail. Also, each node contains the same number of data stores, thus
302 | distributing disk usage evenly across the cluster.
303 | 
304 | ### 5 nodes, 5 shards, 2 shards per node, 1 arbiter
305 | 
306 | |         | node 1 | node 2  | node 3 | node 4 | node 5 |
307 | | ------- | ------ | ------- | ------ | ------ | ------ |
308 | | shard 1 | rsp    | rss     | arb    | -      | -      |
309 | | shard 2 | rss    | arb     | -      | -      | rsp    |
310 | | shard 3 | arb    | -       | -      | rsp    | rss    |
311 | | shard 4 | -      | -       | rsp    | rss    | arb    |
312 | | shard 5 | -      | rsp     | rss    | arb    | -      |
313 | 
314 | Note that the same properties are retained for a larger cluster with 5 nodes. So you can achieve
315 | real horizontal scaling of your mongodb database with this technique.
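All of these layouts are generated from a single first-row pattern that is rotated by one position per shard; `src/generate.sh` implements this with its `rotateAxis` helper. A minimal bash sketch of the idea, using the digit encoding from `generate.sh` (1=rsp, 2=rss, 3=arb, 0=no assignment):
```
ROW="12300"                  # first-row pattern for 5 nodes and 3 replica set members
for shard in 1 2 3 4 5; do
  echo "shard ${shard}: ${ROW}"
  ROW="${ROW:1}${ROW:0:1}"   # rotate left by one position for the next shard
done
```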
CFGPORT="27018" 7 | PORT=$1 && [ -z "${1}" ] && PORT="27020" || true 8 | 9 | kubectl get pods -l role=mongoshard -o name > ./tmp/podfile 10 | 11 | i=1 12 | while read p; do 13 | ii=$(printf %02d "${i}") 14 | POD=$(echo $p | cut -d / -f2) 15 | NODE=$(echo $POD | cut -d'-' -f3 | cut -b 5-6) 16 | JSFILE=$(ls ./build/node${NODE}-rs*) 17 | RSNUM=$(echo $JSFILE | cut -d'-' -f2 | cut -b 3-4) 18 | CONTAINER="rsp${RSNUM}-node${NODE}" 19 | echo "${ii}: Initialize replication set rs${RSNUM} on node ${NODE}" 20 | echo "Execute command on pod ${POD} and container ${CONTAINER}" 21 | kubectl exec -ti ${POD} -c ${CONTAINER} mongo 127.0.0.1:${PORT} <${JSFILE} 22 | if [ -e "./build/cfg${ii}-init.js" ]; then 23 | echo "Initialize Config Server Replication Set" 24 | kubectl exec -ti ${POD} -c ${CONTAINER} mongo 127.0.0.1:${CFGPORT} <./build/cfg${ii}-init.js 25 | fi 26 | i=$((i+1)) 27 | done < ./tmp/podfile 28 | 29 | sleep 15 30 | echo "Initialize Shard..." 31 | kubectl exec -ti ${POD} -c ${CONTAINER} mongo 127.0.0.1:${MONGOSPORT} <./build/shard-init.js 32 | 33 | sleep 15 34 | echo "Initialize database collections for sharding ..." 35 | kubectl exec -ti ${POD} -c ${CONTAINER} mongo 127.0.0.1:${MONGOSPORT} <./js-templates/shardkeys.js 36 | -------------------------------------------------------------------------------- /src/clean.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | set -eo pipefail 4 | 5 | function cleanUp(){ 6 | rm -rf *.yaml 7 | rm -rf *.js 8 | rm -rf ./tmp 9 | rm -rf ./build 10 | mkdir -p ./tmp/yaml 11 | mkdir -p ./tmp/js 12 | mkdir -p ./build 13 | } 14 | 15 | cleanUp -------------------------------------------------------------------------------- /src/configure.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | set -eo pipefail 4 | 5 | NODES=$1 && [ -z "${1}" ] && NODES="3" || true 6 | VERSION="3.2" 7 | 8 | echo 9 | echo "Please supply some important configuration parameters below:" 10 | echo "============================================================" 11 | read -p "SSH user for cluster access [root]: " SSHUSER 12 | [ -z "$SSHUSER" ] && SSHUSER="root" || true 13 | read -p "SSH port for cluster access [22]: " SSHPORT 14 | [ -z "$SSHPORT" ] && SSHPORT="22" || true 15 | read -p "Root directory for your mongodb data [/data]: " BASEDIR 16 | [ -z "$BASEDIR" ] && BASEDIR="/data" || true 17 | read -p "Number of nodes for your shard [${NODES}]: " CFGNODES 18 | [ -z "$CFGNODES" ] && CFGNODES=$NODES || true 19 | read -p "MongoDB version [${VERSION}]: " VERSION 20 | [ -z "$VERSION" ] && VERSION="3.2" || true 21 | echo 22 | -------------------------------------------------------------------------------- /src/create.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | set -eo pipefail 4 | 5 | i=1 6 | while read p; do 7 | NODE=$(printf %02d "${i}") 8 | NAME=$(echo $p | cut -d / -f2) 9 | echo "Create Deployment on machine ${NAME} (node${NODE})..." 10 | kubectl create -f ./build/node${NODE}-deployment.yaml 11 | i=$((i+1)) 12 | done < ./tmp/nodefile 13 | -------------------------------------------------------------------------------- /src/delete.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | set -eo pipefail 4 | 5 | i=1 6 | while read p; do 7 | NODE=$(printf %02d "${i}") 8 | NAME=$(echo $p | cut -d / -f2) 9 | echo "Create Deployment on machine ${NAME} (node${NODE})..." 
--------------------------------------------------------------------------------
/src/generate.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | set -eo pipefail
4 | 
5 | # Layout Description
6 | # columns - nodes
7 | # rows - shards
8 | # 0 - empty
9 | # 1 - replica set - primary (rsp)
10 | # 2 - replica set - secondary (rss)
11 | # 3 - replica set - arbiter (arb)
12 | #
13 | # Examples
14 | #
15 | # 3 nodes, 3 shards, 2 shards per node, 1 arbiter
16 | # 123
17 | # 231
18 | # 312
19 | #
20 | # 4 nodes, 4 shards, 2 shards per node, 1 arbiter
21 | # 1230
22 | # 2301
23 | # 3012
24 | # 0123
25 | #
26 | # 5 nodes, 5 shards, 2 shards per node, 1 arbiter
27 | # 12300
28 | # 23001
29 | # 30012
30 | # 00123
31 | # 01230
32 | 
33 | # global vars
34 | CONFIG_SERVERS_SERVICES=""
35 | CONFIG_SERVERS_SERVICES_JS=""
36 | 
37 | # Helper Functions
38 | function cleanUp(){
39 |   rm -rf *.yaml
40 |   rm -rf *.js
41 |   rm -rf ./tmp
42 |   rm -rf ./build
43 |   mkdir -p ./tmp/yaml
44 |   mkdir -p ./tmp/js
45 |   mkdir -p ./build
46 | }
47 | function getNodeNames(){
48 |   i=1
49 |   while read p; do
50 |     ii=$(printf %02d "${i}")
51 |     echo $p | cut -d / -f2 > "${2}/node${ii}"
52 |     i=$((i+1))
53 |   done <$1
54 | }
55 | function rotateAxis(){
56 |   axis=$2 && [ -z "$2" ] && axis=1
57 |   left=$(echo $1 | cut -b $((axis+1))-)
58 |   right="" && [ ! "$axis" = "0" ] && right=$(echo $1 | cut -b 1-${axis})
59 |   echo "${left}${right}"
60 | }
61 | function validateConstraints(){
62 |   [ $1 -lt $2 ] && { echo "Value $1 must be greater than or equal to $2"; exit 1; } || true
63 |   [ $1 -gt $3 ] && { echo "Value $1 must be less than or equal to $3"; exit 1; } || true
64 | }
65 | function getPattern(){
66 |   NODEGAPS=$(($1 - $2))
67 |   GAPPATTERN=""
68 |   if [ "${NODEGAPS}" -gt "0" ]; then
69 |     GAPPATTERN=$(printf %0${NODEGAPS}d "0")
70 |   fi
71 |   ARBITER=""
72 |   if [ "${3}" -gt "0" ]; then
73 |     ARBITER="3"
74 |   fi
75 |   SECONDARIES=""
76 |   SECNUMS=$(($2-$3-1))
77 |   if [ "${SECNUMS}" -gt "0" ]; then
78 |     SECONDARIES=$(printf %0${SECNUMS}d "2")
79 |   fi
80 |   echo "1${SECONDARIES}${ARBITER}${GAPPATTERN}"
81 | }
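# Example: getPattern 5 3 1 yields "12300" (one primary, one secondary,
# one arbiter, padded with two gaps), i.e. the first row of the 5-node
# layout in the header comment above.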
82 | function getRole(){
83 |   echo $1 | cut -b 1
84 | }
85 | function getPortShift(){
86 |   IDX1=$((2 * $1 - 2))
87 |   echo "${RSSADD:${IDX1}:2}"
88 | }
89 | function addPortShift(){
90 |   VAL=$(getPortShift $1)
91 |   VAL=$(printf %02d $(($VAL + 1)))
92 |   AXIS1=$(( 2 * $1 - 2))
93 |   AXIS2=$((${#RSSADD} - $AXIS1 ))
94 |   AFTER=$(rotateAxis $RSSADD $AXIS1 | cut -b 3-)
95 |   RSSADD=$(rotateAxis "${VAL}${AFTER}" ${AXIS2})
96 | }
97 | function getSpec(){
98 |   NODENUM=$(printf %02d ${1})
99 |   RSNUM=$(printf %02d ${2})
100 |   ROLE=$(getRole ${3})
101 |   if [ "${ROLE}" = "1" ]; then
102 |     PORT=$RSPPORT
103 |     echo "rsp${RSNUM}-node${NODENUM}-port${PORT}"
104 |   fi
105 |   if [ "${ROLE}" = "2" ]; then
106 |     addPortShift $NODENUM
107 |     SHIFT=$(getPortShift $NODENUM)
108 |     PORT=$(($RSPPORT+$SHIFT))
109 |     echo "rss${RSNUM}-node${NODENUM}-port${PORT}"
110 |   fi
111 |   if [ "${ROLE}" = "3" ]; then
112 |     PORT=$ARBPORT
113 |     echo "arb${RSNUM}-node${NODENUM}-port${PORT}"
114 |   fi
115 | }
116 | function getCfg(){
117 |   [ $1 -gt 1 ] && (echo "Not yet implemented. Change CFG_PER_CLUSTER to 1."; exit 1) || true
118 |   echo "cfg${1}-node${2}-port${CFGPORT}"
119 | }
120 | function addShard(){
121 |   PREFIX=$(echo $1 | cut -b 1-3)
122 |   if [ "${PREFIX}" = "rsp" ]; then
123 |     RSNUM=$(echo $1 | cut -b 4-5)
124 |     NODENUM=$(echo $1 | cut -b 11-12)
125 |     echo "sh.addShard(\"rs${RSNUM}/mongodb-node${NODENUM}.default.svc.cluster.local:${RSPPORT}\")" \
126 |       >> ./build/shard-init.js
127 |   fi
128 | }
129 | function genYamlFromTemplates(){
130 |   TEMPLATE_PATH="./yaml-templates"
131 |   JS_PATH="./js-templates"
132 |   NODENUM=$(echo $1 | cut -b 11-12)
133 |   NODESELECTOR=$(cat "./tmp/node${NODENUM}")
134 |   OUTFILE="./tmp/yaml/node${NODENUM}-partial.yaml"
135 |   SVC_OUTFILE="./tmp/yaml/svc${NODENUM}-partial.yaml"
136 |   if [ ! -e "$OUTFILE" ]; then
137 |     cat "${TEMPLATE_PATH}/nodeXX-template.yaml" \
138 |       | sed "s|__NODENUM__|${NODENUM}|g" \
139 |       | sed "s|__NODESELECTOR__|${NODESELECTOR}|g" \
140 |       | sed "/##/d" \
141 |       > $OUTFILE
142 |     cat "${TEMPLATE_PATH}/svcXX-template.yaml" \
143 |       | sed "s|__NODENUM__|${NODENUM}|g" \
144 |       | sed "/##/d" \
145 |       > $SVC_OUTFILE
146 |   fi
147 |   RSID=$(echo $1 | cut -b 1-2)
148 |   if [ "$RSID" = "rs" ] || [ "$RSID" = "ar" ]; then
149 |     RSID=$(echo $1 | cut -b 1-3)
150 |     RSNUM=$(echo $1 | cut -b 4-5)
151 |     PORT=$(echo $1 | cut -b 18-22)
152 |     OUTFILE="./tmp/yaml/node${NODENUM}-${RSID}${RSNUM}-partial.yaml"
153 |     cat "${TEMPLATE_PATH}/rsXX-template.yaml" \
154 |       | sed "s|__NODENUM__|${NODENUM}|g" \
155 |       | sed "s|__RSNUM__|${RSNUM}|g" \
156 |       | sed "s|__PORT__|${PORT}|g" \
157 |       | sed "s|__RSID__|${RSID}|g" \
158 |       | sed "s|__VERSION__|${VERSION}|g" \
159 |       | sed "/##/d" \
160 |       > $OUTFILE
161 |     OUTFILE="./tmp/yaml/node${NODENUM}-db${RSNUM}-volumes.yaml"
162 |     cat "${TEMPLATE_PATH}/volumes-template.yaml" \
163 |       | sed "s|__BASEDIR__|${BASEDIR}|g" \
164 |       | sed "s|__RSNUM__|${RSNUM}|g" \
165 |       | sed "/##/d" \
166 |       > $OUTFILE
167 |     if [ "$RSID" = "rsp" ]; then
168 |       OUTFILE="./tmp/js/node${NODENUM}-rs${RSNUM}-pri.js"
169 |       PRIMARY_SVC_ADDR="mongodb-node${NODENUM}.default.svc.cluster.local:${PORT}"
170 |       cat "${JS_PATH}/rsXX-pri-template.js" \
171 |         | sed "s|__PRIMARY_SVC_ADDR__|${PRIMARY_SVC_ADDR}|g" \
172 |         | sed "/##/d" \
173 |         > $OUTFILE
174 |     fi
175 |     if [ "$RSID" = "rss" ]; then
176 |       OUTFILE="./tmp/js/node${NODENUM}-rs${RSNUM}-sec.js"
177 |       SECONDARY_SVC_ADDR="mongodb-node${NODENUM}.default.svc.cluster.local:${PORT}"
178 |       cat "${JS_PATH}/rsXX-sec-template.js" \
179 |         | sed "s|__SECONDARY_SVC_ADDR__|${SECONDARY_SVC_ADDR}|g" \
180 |         | sed "/##/d" \
181 |         > $OUTFILE
182 |     fi
183 |     if [ "$RSID" = "arb" ]; then
184 |       OUTFILE="./tmp/js/node${NODENUM}-rs${RSNUM}-arb.js"
185 |       ARBITER_SVC_ADDR="mongodb-node${NODENUM}.default.svc.cluster.local:${PORT}"
186 |       cat "${JS_PATH}/rsXX-arb-template.js" \
187 |         | sed "s|__ARBITER_SVC_ADDR__|${ARBITER_SVC_ADDR}|g" \
188 |         | sed "/##/d" \
189 |         > $OUTFILE
190 |     fi
191 |   fi
192 |   if [ "$RSID" = "cf" ]; then
193 |     RSID=$(echo $1 | cut -b 1-3)
194 |     RSNUM=$(echo $1 | cut -b 4-5)
195 |     PORT=$(echo $1 | cut -b 18-22)
196 |     OUTFILE="./tmp/yaml/node${NODENUM}-${RSID}${RSNUM}-partial.yaml"
197 |     cat "${TEMPLATE_PATH}/cfgXX-template.yaml" \
198 |       | sed "s|__NODENUM__|${NODENUM}|g" \
199 |       | sed "s|__RSNUM__|${RSNUM}|g" \
200 |       | sed "s|__PORT__|${PORT}|g" \
201 |       | sed "s|__RSID__|${RSID}|g" \
202 |       | sed "s|__VERSION__|${VERSION}|g" \
203 |       | sed "/##/d" \
204 |       > $OUTFILE
205 |     ID=$((${NODENUM}-1))
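    # Accumulate one config-server address per node: CONFIG_SERVERS_SERVICES is
    # spliced into the mongos --configdb argument, CONFIG_SERVERS_SERVICES_JS
    # into the members list of the config-server rs.initiate() script.
206 | 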
CONFIG_SERVERS_SERVICES="${CONFIG_SERVERS_SERVICES},mongodb-node${NODENUM}.default.svc.cluster.local:${PORT}" 207 | CONFIG_SERVERS_SERVICES_JS="${CONFIG_SERVERS_SERVICES_JS},\n\t\t{ _id: ${ID}, host: \"mongodb-node${NODENUM}.default.svc.cluster.local:${PORT}\" }" 208 | fi 209 | if [ "$RSID" = "mg" ]; then 210 | RSID=$(echo $1 | cut -b 1-3) 211 | MSGNUM=$(echo $1 | cut -b 4-5) 212 | PORT=$(echo $1 | cut -b 18-22) 213 | OUTFILE="./tmp/yaml/node${NODENUM}-${RSID}${MSGNUM}-partial.yaml" 214 | cat "${TEMPLATE_PATH}/mgsXX-template.yaml" \ 215 | | sed "s|__NODENUM__|${NODENUM}|g" \ 216 | | sed "s|__MSGNUM__|${MSGNUM}|g" \ 217 | | sed "s|__PORT__|${PORT}|g" \ 218 | | sed "s|__RSID__|${RSID}|g" \ 219 | | sed "s|__VERSION__|${VERSION}|g" \ 220 | | sed "/##/d" \ 221 | > $OUTFILE 222 | fi 223 | RSID=$(echo $1 | cut -b 1-3) 224 | RSNUM=$(echo $1 | cut -b 4-5) 225 | PORT=$(echo $1 | cut -b 18-22) 226 | OUTFILE="./tmp/yaml/svc${NODENUM}-${RSID}${RSNUM}-port-partial.yaml" 227 | cat "${TEMPLATE_PATH}/svcXX-port-template.yaml" \ 228 | | sed "s|__NODENUM__|${NODENUM}|g" \ 229 | | sed "s|__RSNUM__|${RSNUM}|g" \ 230 | | sed "s|__PORT__|${PORT}|g" \ 231 | | sed "s|__RSID__|${RSID}|g" \ 232 | | sed "/##/d" \ 233 | > $OUTFILE 234 | } 235 | genFinalFromPartials(){ 236 | YAML_PATH="./tmp/yaml" 237 | JS_PATH="./tmp/js" 238 | TEMPLATE_PATH="./yaml-templates" 239 | JS_TEMPLATE_PATH="./js-templates" 240 | #cfg01.default.svc.cluster.local:27017,cfg02.default.svc.cluster.local:27017,cfg03.default.svc.cluster.local:27017" 241 | CONFIG_SERVERS_SERVICES=$(echo $CONFIG_SERVERS_SERVICES | cut -b 2-) 242 | CONFIG_SERVERS_SERVICES_JS=$(echo $CONFIG_SERVERS_SERVICES_JS | cut -b 4-) 243 | for i in $(seq ${CFG_PER_CLUSTER}); do 244 | RSNUM=$(printf %02d ${i}) 245 | cat ${JS_PATH}/node*-rs${RSNUM}-cfg.js \ 246 | | sed "s|__CONFIG_SERVERS_SERVICES_JS__|${CONFIG_SERVERS_SERVICES_JS}|g" \ 247 | > "./build/cfg${RSNUM}-init.js" 248 | done 249 | for j in $(seq ${NODES}); do 250 | NODENUM=$(printf %02d ${j}) 251 | cat ${YAML_PATH}/svc${NODENUM}-partial.yaml \ 252 | ${YAML_PATH}/svc${NODENUM}-*-port-partial.yaml \ 253 | ${TEMPLATE_PATH}/separator.yaml \ 254 | ${YAML_PATH}/node${NODENUM}-partial.yaml \ 255 | ${YAML_PATH}/node${NODENUM}-arb*.yaml \ 256 | ${YAML_PATH}/node${NODENUM}-rss*.yaml \ 257 | ${YAML_PATH}/node${NODENUM}-rsp*.yaml \ 258 | ${YAML_PATH}/node${NODENUM}-cfg*.yaml \ 259 | ${YAML_PATH}/node${NODENUM}-mgs*.yaml \ 260 | ${TEMPLATE_PATH}/volumes-head.yaml \ 261 | ${YAML_PATH}/node${NODENUM}-db*-volumes.yaml \ 262 | | sed "s|__BASEDIR__|${BASEDIR}|g" \ 263 | | sed "s|__CONFIG_SERVERS_SERVICES__|${CONFIG_SERVERS_SERVICES}|g" \ 264 | > "./build/node${NODENUM}-deployment.yaml" 265 | done 266 | for i in $(seq ${SHARDS}); do 267 | RSNUM=$(printf %02d ${i}) 268 | NODENUM=$(ls ${JS_PATH}/node*-rs${RSNUM}-pri.js | cut -d'/' -f 4 | cut -b 5-6) 269 | cat ${JS_PATH}/node*-rs${RSNUM}-pri.js \ 270 | ${JS_PATH}/node*-rs${RSNUM}-sec.js \ 271 | ${JS_PATH}/node*-rs${RSNUM}-arb.js \ 272 | > "./build/node${NODENUM}-rs${RSNUM}-init.js" 273 | done 274 | } 275 | 276 | # Ensure clean startup 277 | cleanUp 278 | 279 | # Gather basic config parameters 280 | kubectl get nodes| grep -v "SchedulingDisabled" | awk '{print $1}' | tail -n +2 > ./tmp/nodefile 281 | getNodeNames "./tmp/nodefile" "./tmp" 282 | NODES=$(cat ./tmp/nodefile |wc -l) 283 | 284 | # Ask for some config parameters 285 | source src/configure.sh ${NODES} 286 | 287 | echo "Please ensure that pods can be scheduled on all these nodes." 
288 | echo "------------------------------------------------------------" 289 | NODES=${CFGNODES} 290 | echo "CLUSTER NODES.....................: ${NODES}" 291 | SHARDS=${NODES} 292 | validateConstraints $SHARDS 1 $NODES 293 | echo "SHARD MEMBERS.....................: ${SHARDS}" 294 | 295 | MONGOS_PER_CLUSTER=${NODES} 296 | echo "MONGOS PER CLUSTER................: ${MONGOS_PER_CLUSTER}" 297 | validateConstraints $MONGOS_PER_CLUSTER 1 $NODES 298 | 299 | CFG_PER_CLUSTER=1 300 | echo "CONFIG SERVERS PER CLUSTER........: ${CFG_PER_CLUSTER}" 301 | 302 | CFG_REPLICAS=${NODES} 303 | echo "CONFIG REPLICAS PER CLUSTER.......: ${CFG_REPLICAS}" 304 | validateConstraints $CFG_REPLICAS 1 $NODES 305 | 306 | REPLICAS_PER_SHARD=2 307 | echo "DATA REPLICAS PER SHARD...........: ${REPLICAS_PER_SHARD}" 308 | 309 | ARBITER=$(((${REPLICAS_PER_SHARD} + 1) % 2 )) 310 | echo "ARBITER PER REPLICA SET...........: ${ARBITER}" 311 | 312 | REPLICAS=$((${REPLICAS_PER_SHARD} + ${ARBITER})) 313 | validateConstraints $REPLICAS 1 $NODES 314 | echo "TOTAL REPLICAS PER SHARD..........: ${REPLICAS}" 315 | 316 | FIRSTROW=$(getPattern $NODES $REPLICAS $ARBITER) 317 | #echo "FIRST ROW PATTERN ...........: ${FIRSTROW}" 318 | echo "------------------------------------------------------------" 319 | echo "SHARDED CLUSTER DATA REDUNDANCY...: ${REPLICAS_PER_SHARD}" 320 | DISKSPACEFAC=$(echo "${SHARDS} / ${REPLICAS_PER_SHARD}" | bc -l) 321 | DISKSPACEFAC=$(printf %.1f ${DISKSPACEFAC}) 322 | echo "DISK SPACE FACTOR ................: ${DISKSPACEFAC} (times GB per node)" 323 | echo "------------------------------------------------------------" 324 | 325 | MONGOSPORT="27017" 326 | CFGPORT=$(($MONGOSPORT+1)) 327 | ARBPORT=$(($MONGOSPORT+2)) 328 | RSPPORT=$(($MONGOSPORT+3)) 329 | DOUBLENODES=$((${NODES} * 2)) 330 | RSSADD=$(printf %0${DOUBLENODES}d "0") 331 | 332 | echo "Generate Kubernetes YAML files according to spec..." 333 | for j in $(seq ${NODES}); do 334 | ROW=${FIRSTROW} 335 | for i in $(seq ${SHARDS}); do 336 | SPEC=$(getSpec $j $i $ROW $RSSADD) 337 | if [ ! -z "${SPEC}" ]; then 338 | echo "${SPEC}" 339 | genYamlFromTemplates $SPEC 340 | addShard $SPEC 341 | fi 342 | ROW=$(rotateAxis $ROW) 343 | done 344 | echo '---' 345 | FIRSTROW=$(rotateAxis $FIRSTROW) 346 | done 347 | 348 | echo "Config Server replicas loop..." 349 | for j in $(seq ${CFG_PER_CLUSTER}); do 350 | RSNUM=$(printf %02d ${j}) 351 | for i in $(seq ${CFG_REPLICAS}); do 352 | NODE=$(printf %02d ${i}) 353 | CFG=$(getCfg ${RSNUM} ${NODE}) 354 | echo "${CFG}" 355 | genYamlFromTemplates $CFG 356 | done 357 | OUTFILE="./tmp/js/node${RSNUM}-rs${RSNUM}-cfg.js" 358 | cat "${JS_PATH}/rsXX-cfg-template.js" \ 359 | | sed "s|__RSNUM__|${RSNUM}|g" \ 360 | | sed "/##/d" \ 361 | > $OUTFILE 362 | echo '---' 363 | done 364 | 365 | echo "Mongos server loop..." 366 | for j in $(seq ${CFG_PER_CLUSTER}); do 367 | RSNUM=$(printf %02d ${j}) 368 | for i in $(seq ${MONGOS_PER_CLUSTER}); do 369 | NODENUM=$(printf %02d ${i}) 370 | MGS="mgs${RSNUM}-node${NODENUM}-port${MONGOSPORT}" 371 | echo "${MGS}" 372 | genYamlFromTemplates $MGS 373 | done 374 | done 375 | 376 | genFinalFromPartials 377 | 378 | echo 'Generate needed directories on remote server ...' 379 | ./src/remote.sh ${SHARDS} ${SSHUSER} ${SSHPORT} ${BASEDIR} 380 | echo 381 | echo "Successfully executed." 382 | echo "Execute 'make run' to fire up the mongodb shard on your cluster." 
383 | 

--------------------------------------------------------------------------------
/src/remote.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | set -eo pipefail
4 | 
5 | SSHUSER=$2 && [ -z "${2}" ] && SSHUSER="core" || true
6 | SSHPORT=$3 && [ -z "${3}" ] && SSHPORT="22" || true
7 | BASEDIR=$4 && [ -z "${4}" ] && BASEDIR="/data" || true
8 | 
9 | function sshCall(){
10 |   ssh -p $SSHPORT $SSHUSER@$1 $2 < /dev/null
11 | }
12 | 
13 | function execRemote(){
14 |   FILE=$1
15 |   SHARDS=$2
16 |   i=1
17 |   while read p; do
18 |     HOSTNAME=$(echo $p | cut -d / -f2)
19 |     NODE=$(printf %02d "${i}")
20 |     echo "Prepare host ${HOSTNAME} (node${NODE}) for mongodb."
21 |     echo "Disable transparent huge pages."
22 |     sshCall $HOSTNAME 'sudo /bin/bash -c "echo never > /sys/kernel/mm/transparent_hugepage/enabled"'
23 |     sshCall $HOSTNAME 'sudo /bin/bash -c "echo never > /sys/kernel/mm/transparent_hugepage/defrag"'
24 |     for j in $(seq ${SHARDS}); do
25 |       RSNUM=$(printf %02d ${j})
26 |       echo "Create directories for replica set ${RSNUM}."
27 |       sshCall $HOSTNAME "sudo mkdir -p ${BASEDIR}/mongodb/db-rs${RSNUM}"
28 |     done
29 |     i=$((i+1))
30 |   done < "$FILE"
31 | }
32 | 
33 | if [ -z "$1" ]; then
34 |   echo "Please provide the number of shards: sh remote.sh <number-of-shards>"
35 |   exit 1
36 | fi
37 | echo "Number of Shards / Replication Sets: ${1}"
38 | if [ -e ./tmp/nodefile ]; then
39 |   execRemote "./tmp/nodefile" $1
40 | else
41 |   echo "You need to run generate.sh first."
42 |   exit 1
43 | fi
44 | 

--------------------------------------------------------------------------------
/yaml-templates/cfgXX-template.yaml:
--------------------------------------------------------------------------------
1 |       - name: __RSID____RSNUM__-node__NODENUM__
2 |         image: mongo:__VERSION__
3 |         args:
4 |           - "--storageEngine"
5 |           - wiredTiger
6 |           - "--configsvr"
7 |           - "--replSet"
8 |           - configReplSet__RSNUM__
9 |           - "--port"
10 |           - "__PORT__"
11 |           - "--noprealloc"
12 |           - "--smallfiles"
13 |         ports:
14 |           - name: __RSID____RSNUM__-node__NODENUM__
15 |             containerPort: __PORT__
16 |         volumeMounts:
17 |           - name: db-cfg
18 |             mountPath: /data/db
19 | 

--------------------------------------------------------------------------------
/yaml-templates/mgsXX-template.yaml:
--------------------------------------------------------------------------------
1 |       - name: __RSID____MSGNUM__-node__NODENUM__
2 |         image: mongo:__VERSION__
3 |         command:
4 |           - "mongos"
5 |         args:
6 |           - "--configdb"
7 |           - "configReplSet__MSGNUM__/__CONFIG_SERVERS_SERVICES__"
8 |           - "--port"
9 |           - "__PORT__"
10 |         ports:
11 |           - name: __RSID____MSGNUM__-node__NODENUM__
12 |             containerPort: __PORT__
13 | 

--------------------------------------------------------------------------------
/yaml-templates/nodeXX-template.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: extensions/v1beta1
2 | kind: Deployment
3 | metadata:
4 |   name: mongodb-shard-node__NODENUM__
5 | spec:
6 |   replicas: 1
7 |   template:
8 |     metadata:
9 |       labels:
10 |         app: mongodb-shard-node__NODENUM__
11 |         role: mongoshard
12 |         tier: backend
13 |     spec:
14 |       nodeSelector:
15 |         kubernetes.io/hostname: __NODESELECTOR__
16 |       containers:
17 | 

--------------------------------------------------------------------------------
/yaml-templates/rsXX-template.yaml:
--------------------------------------------------------------------------------
1 |       - name: __RSID____RSNUM__-node__NODENUM__
2 |         image: mongo:__VERSION__
3 |         args:
4 |           - "--storageEngine"
5 |           - wiredTiger
6 |           - "--replSet"
7 |           - rs__RSNUM__
8 |           - "--port"
9 |           - "__PORT__"
10 |           - "--noprealloc"
11 |           - "--smallfiles"
12 |         ports:
13 |           - name: __RSID____RSNUM__-node__NODENUM__
14 |             containerPort: __PORT__
15 |         volumeMounts:
16 |           - name: db-rs__RSNUM__
17 |             mountPath: /data/db
18 | 
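All `__UPPER_CASE__` placeholders in these templates are filled in by plain `sed` pipelines in `src/generate.sh`. A minimal sketch of the mechanism, with illustrative values:
```
$ cat yaml-templates/rsXX-template.yaml \
    | sed "s|__RSID__|rsp|g" \
    | sed "s|__RSNUM__|01|g" \
    | sed "s|__NODENUM__|01|g" \
    | sed "s|__PORT__|27020|g" \
    | sed "s|__VERSION__|3.2|g"
```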
-------------------------------------------------------------------------------- /yaml-templates/separator.yaml: -------------------------------------------------------------------------------- 1 | 2 | --- 3 | 4 | -------------------------------------------------------------------------------- /yaml-templates/svcXX-port-template.yaml: -------------------------------------------------------------------------------- 1 | - name: __RSID____RSNUM__-node__NODENUM__ 2 | port: __PORT__ 3 | protocol: TCP 4 | -------------------------------------------------------------------------------- /yaml-templates/svcXX-template.yaml: -------------------------------------------------------------------------------- 1 | apiVersion: v1 2 | kind: Service 3 | metadata: 4 | name: mongodb-node__NODENUM__ 5 | labels: 6 | app: mongodb-node__NODENUM__ 7 | role: mongoshard 8 | tier: backend 9 | spec: 10 | selector: 11 | app: mongodb-shard-node__NODENUM__ 12 | role: mongoshard 13 | tier: backend 14 | ports: 15 | -------------------------------------------------------------------------------- /yaml-templates/volumes-head.yaml: -------------------------------------------------------------------------------- 1 | volumes: 2 | - name: db-cfg 3 | hostPath: 4 | path: __BASEDIR__/mongodb/db-cfg 5 | -------------------------------------------------------------------------------- /yaml-templates/volumes-template.yaml: -------------------------------------------------------------------------------- 1 | - name: db-rs__RSNUM__ 2 | hostPath: 3 | path: __BASEDIR__/mongodb/db-rs__RSNUM__ 4 | --------------------------------------------------------------------------------