├── README.md
├── compose
│   ├── consul
│   │   ├── config
│   │   │   └── consul.json
│   │   └── consul.yml
│   └── mariadb
│       ├── common.env
│       └── mariadb.yml
├── config
│   ├── .gitignore
│   └── swarm.conf
├── machine
│   └── .gitignore
├── scripts
│   ├── build_swarm.sh
│   ├── cancel_softlayer.sh
│   ├── deploy_mariadb.sh
│   ├── destroy_swarm.sh
│   ├── instance
│   │   ├── cleanup_generic.sh
│   │   └── sl_post_provision.sh
│   ├── provision_softlayer.sh
│   └── rebuild_swarm.sh
├── source.me
└── ssh
    └── .gitignore
/README.md:
--------------------------------------------------------------------------------
1 | Zero to HA MariaDB and Docker Swarm in under 15 minutes on IBM Softlayer (or anywhere, really)
2 | ==================
3 | 
4 | Provisioning helper scripts for my post [Zero to HA MariaDB and Docker Swarm in under 15 minutes on IBM Softlayer (or anywhere, really)](http://18pct.com/zero-to-mariadb-cluster-in-docker-swarm-in-15-minutes-part-1/) over at my [blog](http://18pct.com/blog/).
5 | 
6 | ## Build Goals
7 | 
8 | - Multi-master, highly-available docker swarm cluster on CentOS 7.
9 | - HA Consul key-value store running on the swarm itself.
10 | - Use the btrfs storage driver (or alternatively, device-mapper with LVM).
11 | - Containerized MariaDB Galera cluster running on the swarm, natch.
12 | - Overlay network between MariaDB nodes for cluster communication, etc.
13 | - Percona Xtrabackup instead of rsync to reduce locking during state transfers.
14 | 
15 | # Putting it all Together
16 | 
17 | In addition to the deployment helper scripts in the `scripts` directory, you'll find compose files for Consul and MariaDB in the `compose` dir, using my [CentOS7 MariaDB Galera](https://hub.docker.com/r/dayreiner/centos7-mariadb-10.1-galera/) docker image from [Docker Hub](https://hub.docker.com).
18 | 
19 | Running through the scripts in order, you'll end up with an n-node MariaDB Galera cluster running on top of a multi-master Docker Swarm, using an HA Consul cluster for swarm discovery and btrfs container storage -- all self-contained within the *n* swarm masters. In a real production environment, you would want to consider moving the database and any other services onto Swarm agent hosts, leaving the three swarm masters to the task of managing the swarm itself.
20 | 
21 | To quickly deploy your own MariaDB cluster, follow the steps outlined below. If you prefer to just read through the scripts and see for yourself how things are done (or adapt them to a different platform), you can browse them directly [here](https://github.com/dayreiner/docker-swarm-mariadb/tree/master/scripts). The provisioning process was tested from a CentOS 7 host, so YMMV with other OSes -- if you run into any issues, feel free to [report them here](https://github.com/dayreiner/docker-swarm-mariadb/issues).
22 | 
23 | ### Prerequisites
24 | In order to run the scripts, you'll need the Softlayer "[slcli](https://github.com/softlayer/softlayer-python)" command-line API client (`pip install softlayer`) installed and configured on the system you'll be running docker-machine from (or an alternate way to get the IP addresses of your instances if provisioned elsewhere). The expect command (`yum -y install expect`) is also required to run the Softlayer order script -- otherwise you can provision instances yourself manually.
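If you're starting from a fresh CentOS 7 control host, a minimal setup sketch might look like the block below. The exact package names (`epel-release`, `python-pip`) and the `slcli config setup` step are assumptions for a stock CentOS/EPEL install, so adjust for your environment:

```bash
# Minimal sketch of prerequisite setup on a CentOS 7 control host (assumes EPEL).
yum -y install epel-release
yum -y install python-pip expect
pip install softlayer          # provides the slcli command-line client
slcli config setup             # prompts for your Softlayer username and API key
slcli vs list                  # sanity check that slcli can reach the Softlayer API
# docker, docker-machine and docker-compose are also expected on this host,
# since the helper scripts drive everything through them.
```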
25 | 
26 | ## Running the scripts
27 | To get started, first clone this repository:
28 | 
29 | ```bash
30 | git clone git@github.com:dayreiner/docker-swarm-mariadb.git
31 | ```
32 | 
33 | Change to the `docker-swarm-mariadb` directory and run `source source.me` to set your docker-machine storage path to the `machine` subdirectory of the repository. This will help keep everything self-contained. Next, go into the `config` directory and copy the premade example `swarm.conf` file to a new file called `swarm.local`. Make any changes you need to `swarm.local`; it will override the values in the example config. The config file defines some variables for your environment, the number of swarm nodes and so on. Once that's done, you can either provision the swarm instances automatically via the Softlayer provisioning script, or skip ahead to building the swarm and then the MariaDB cluster and its overlay network:
34 | 
35 | ### Steps
36 | 1. `cd config ; cp swarm.conf swarm.local`
37 | 2. `vi swarm.local` -- and change values for your nodes and environment
38 | 3. `cd ../scripts`
39 | 4. `./provision_softlayer.sh` -- generates ssh keys, orders nodes and runs post-provisioning scripts
40 | 5. `./build_swarm.sh` -- deploys the swarm and the consul cluster
41 | 6. Wait for the swarm nodes to find the consul cluster and finish bootstrapping the swarm. Check with:
42 |     - `eval $(docker-machine env --swarm sw1)`
43 |     - `docker info` -- and wait for all three nodes to be listed
44 | 7. `./deploy_mariadb.sh` -- bootstraps the MariaDB cluster on the swarm nodes.
45 |     - Check the container logs to confirm all nodes have started
46 |     - Run `docker exec -ti db-sw1 mysql -psecret -e "show status like 'wsrep%';"` to confirm the cluster is happy.
47 | 8. *Optionally* run `./deploy_mariadb.sh` a second time to redeploy the first node's container (`db-sw1`) as a standard Galera cluster member.
48 | 
49 | The repo also includes scripts for tearing down the swarm, rebuilding it and cancelling the swarm instances in Softlayer when you're done -- see the example at the end of this README.
50 | 
51 | #### Zero to a functional Galera Cluster across three Softlayer instances:
52 | 
53 | ```bash
54 | real 13m17.885s
55 | user 0m15.442s
56 | sys 0m3.577s
57 | ```
58 | 
59 | Nice!
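When you're finished with the environment, teardown would look something like the sketch below, using the cleanup scripts shipped in `scripts` (run from the repo with `source.me` sourced so docker-machine uses the repo's `machine` directory):

```bash
cd scripts
./destroy_swarm.sh      # runs cleanup_generic.sh on each node, then removes it from docker-machine
./cancel_softlayer.sh   # asks you to type CANCEL, then cancels the instances via slcli
# or, to wipe and rebuild the swarm on the same instances instead:
./rebuild_swarm.sh
```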
60 | -------------------------------------------------------------------------------- /compose/consul/config/consul.json: -------------------------------------------------------------------------------- 1 | { 2 | "ca_file": "/certs/ca.pem", 3 | "cert_file": "/certs/server.pem", 4 | "key_file": "/certs/server-key.pem", 5 | 6 | "ports": { 7 | "dns": 53 8 | }, 9 | 10 | "verify_incoming": true, 11 | "verify_outgoing": true 12 | } 13 | -------------------------------------------------------------------------------- /compose/consul/consul.yml: -------------------------------------------------------------------------------- 1 | consul: 2 | command: -dc ${datacenter} -server -node ${node} -client 0.0.0.0 -bootstrap-expect ${swarm_total_nodes} -advertise ${node_cluster_ip} -retry-interval 10s -recursor ${dns_primary} -recursor ${dns_secondary} -retry-join ${othernode0_cluster_ip} -retry-join ${othernode1_cluster_ip} 3 | container_name: consul 4 | net: host 5 | environment: 6 | - "GOMAXPROCS=2" 7 | image: gliderlabs/consul-server:latest 8 | ports: 9 | - 172.17.0.1:53:53 10 | - 172.17.0.1:53:53/udp 11 | - ${node_cluster_ip}:53:53 12 | - ${node_cluster_ip}:53:53/udp 13 | - ${node_cluster_ip}:8300-8302:8300-8302 14 | - ${node_cluster_ip}:8300-8302:8300-8302/udp 15 | - ${node_cluster_ip}:8400:8400 16 | - ${node_cluster_ip}:8500:8500 17 | restart: always 18 | volumes: 19 | - "consul-data:/data" 20 | - "/etc/consul/consul.json:/config/consul.json:ro" 21 | - "/etc/docker/ca.pem:/certs/ca.pem:ro" 22 | - "/etc/docker/server.pem:/certs/server.pem:ro" 23 | - "/etc/docker/server-key.pem:/certs/server-key.pem:ro" 24 | - "/var/run/docker.sock:/var/run/docker.sock" 25 | -------------------------------------------------------------------------------- /compose/mariadb/common.env: -------------------------------------------------------------------------------- 1 | # General 2 | HOME=/root 3 | TERM=xterm-256color 4 | # Service names for registrator 5 | SERVICE_3306_NAME=mariadb 6 | SERVICE_4567_NAME=galera 7 | SERVICE_4444_NAME=xtrabackup 8 | -------------------------------------------------------------------------------- /compose/mariadb/mariadb.yml: -------------------------------------------------------------------------------- 1 | version: '2' 2 | services: 3 | %%DBNODE%%: 4 | image: dayreiner/centos7-mariadb-10.1-galera 5 | container_name: %%DBNODE%% 6 | hostname: %%DBNODE%% 7 | restart: always 8 | networks: 9 | - mariadb 10 | ports: 11 | - 172.17.0.1:3306:3306 12 | expose: 13 | - "3306" 14 | - "4567" 15 | - "4444" 16 | volumes: 17 | - ${mariadb_data_path}:/var/lib/mysql 18 | env_file: 19 | - common.env 20 | environment: 21 | # This is set by the build script 22 | - CLUSTER=${cluster_members} 23 | # These are configured in swarm.conf 24 | - CLUSTER_NAME=${mariadb_cluster_name} 25 | - MYSQL_ROOT_PASSWORD=${mysql_root_password} 26 | - SST_USER=sst 27 | - SST_PASS=${sst_password} 28 | networks: 29 | mariadb: 30 | external: 31 | name: mariadb 32 | -------------------------------------------------------------------------------- /config/.gitignore: -------------------------------------------------------------------------------- 1 | *.local 2 | -------------------------------------------------------------------------------- /config/swarm.conf: -------------------------------------------------------------------------------- 1 | # 2 | # Copy this file to swarm.local to override default values 3 | # 4 | 5 | ############################################################################# 6 | # Basic swarm and node 
configuration 7 | ############################################################################# 8 | 9 | # Space separated array of node names for existing systems in slcli 10 | # Alternately these can be provisioned via the provision_softlayer.sh script 11 | # Three nodes recommended. Should work with more but untested. 12 | nodelist=( sw1 sw2 sw3 ) 13 | 14 | # Pass debug or native ssh opts to docker-machine if needed 15 | machine_opts="" 16 | 17 | # Name of datacenter, used for both consul and softlayer provisioning 18 | datacenter=tor01 19 | 20 | # DNS Settings for upstream resolvers and search domain 21 | dns_search_domain=example.com 22 | dns_primary=8.8.8.8 23 | dns_secondary=8.8.4.4 24 | 25 | ############################################################################# 26 | # Softlayer Configuration 27 | ############################################################################# 28 | 29 | # Billing, set to hourly or monthly. Hourly is the default. 30 | sl_billing="hourly" 31 | 32 | # System Sizing 33 | sl_cpu="1" 34 | sl_memory="1024" 35 | sl_os_disk_size="25" 36 | sl_docker_disk_size="25" 37 | sl_disk_type="" # Set this to "--san" to use SAN instead of local disk. 38 | 39 | # Provide the VLAN IDs for the instances 40 | # Use "slcli vlan list" to get the VLAN IDs. 41 | sl_public_vlan_id="" 42 | sl_private_vlan_id="" 43 | 44 | # Domain for instances (i.e. example.com) as used in your softlayer account. 45 | # Defaults to the DNS Search Domain 46 | sl_domain="${dns_search_domain}" 47 | 48 | # Softlayer datacenter. Defaults to the consul datacenter setting 49 | # Override this here if consul datacenter is set to an arbitrary value 50 | sl_region="${datacenter}" 51 | 52 | # Name of ssh key we will generate and use for provisioning operations 53 | sl_sshkey_name="jdr-swarm" 54 | 55 | # When provisioning with slci, ssh to the public or private ip of 56 | # the provisioned node to run post-provisioning scripts 57 | sl_ssh_interface="private" # Expects either "public" or "private" 58 | 59 | # Any optional extra Args to pass to slcli 60 | sl_extra_args="" 61 | 62 | ############################################################################# 63 | # MariaDB Configuration 64 | ############################################################################# 65 | 66 | # Name of the cluster. Should be unique to the network the nodes are on. 
67 | mariadb_cluster_name="swarmdb" 68 | 69 | # Path on host to store mariadb container database directory (/var/lib/mysql) 70 | mariadb_data_path=/opt/mysql 71 | 72 | # Passwords for mysql root and xtrabackup state transfer 73 | mysql_root_password="secret" 74 | sst_password="sst" 75 | 76 | 77 | ############################################################################# 78 | # Nothing to set past here 79 | ############################################################################# 80 | 81 | export nodelist datacenter machine_opts \ 82 | dns_search_domain dns_primary dns_secondary 83 | 84 | export sl_cpu sl_memory sl_disk_size sl_public_vlan_id \ 85 | sl_private_vlan_id sl_domain sl_region sl_billing \ 86 | sl_sshkey_name sl_disk_type sl_ssh_interface sl_extra_args 87 | 88 | export mariadb_data_path mysql_root_password sst_password \ 89 | mariadb_cluster_name 90 | -------------------------------------------------------------------------------- /machine/.gitignore: -------------------------------------------------------------------------------- 1 | # Ignore everything in this directory 2 | * 3 | # Except this file 4 | !.gitignore 5 | -------------------------------------------------------------------------------- /scripts/build_swarm.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | ############################################################################# 4 | # Build an n-node docker swarm HA cluster with HA consul for discovery 5 | # running on the swarm itself. Uses docker machine generic driver 6 | # and pre-existing CentOS 7 nodes in Softlayer. 7 | ############################################################################# 8 | 9 | #set -o xtrace # uncomment for debug 10 | 11 | ############################################################################# 12 | # Nothing to set past here 13 | ############################################################################# 14 | 15 | # Strict modes. Same as -euo pipefail 16 | set -o errexit 17 | set -o pipefail 18 | set -o nounset 19 | 20 | # Set some useful vars 21 | IFS=$'\n\t' # Set field separator to return+tab 22 | __dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" # Dir this script runs out of 23 | __root="$(cd "$(dirname "${__dir}")" && pwd)" # Parent dir 24 | __file="${__dir}/$(basename "${BASH_SOURCE[0]}")" # This script's filename 25 | 26 | # Trap errors 27 | trap $(printf "\n\tERROR: ${LINENO}\n\n" && exit 1) ERR 28 | 29 | cd ${__dir} 30 | 31 | # Get config. Load local override config if present. 32 | if [[ -e ${__root}/config/swarm.local ]] ; then 33 | source ${__root}/config/swarm.local 34 | else 35 | source ${__root}/config/swarm.conf 36 | fi 37 | 38 | # Set machine storage to machine subdir of repo 39 | export MACHINE_STORAGE_PATH="${__root}/machine" 40 | 41 | # Build cluster members with docker machine 42 | export node_consul=consul.service.consul 43 | for node in ${nodelist[@]} ; do 44 | export node_private_ip=$(slcli vs detail ${node} | grep private_ip | awk '{print $2}') 45 | export node_public_ip=$(slcli vs detail ${node} | grep public_ip | awk '{print $2}') 46 | export node_ssh_ip=$(slcli vs detail ${node} | grep ${sl_ssh_interface}_ip | awk '{print $2}') 47 | 48 | echo 49 | echo "-------------------------------------------------" 50 | echo "Current Run Details..." 
51 | echo "Provisioning node ${node}" 52 | echo "Public IP = ${node_public_ip}" 53 | echo "Private IP = ${node_private_ip}" 54 | echo "Consul URL = consul://${node_consul}:8500" 55 | echo "DNS Search Doamin = ${dns_search_domain}" 56 | echo "-------------------------------------------------" 57 | echo 58 | 59 | docker-machine ${machine_opts} create \ 60 | --driver generic \ 61 | --generic-ip-address ${node_ssh_ip} \ 62 | --generic-ssh-key ${__root}/ssh/swarm.rsa \ 63 | --generic-ssh-user root \ 64 | --engine-storage-driver btrfs \ 65 | --swarm --swarm-master \ 66 | --swarm-opt="replication=true" \ 67 | --swarm-opt="advertise=${node_private_ip}:3376" \ 68 | --swarm-discovery="consul://${node_consul}:8500" \ 69 | --engine-opt="cluster-store consul://${node_consul}:8500" \ 70 | --engine-opt="cluster-advertise=eth0:2376" \ 71 | --engine-opt="dns ${node_private_ip}" \ 72 | --engine-opt="dns ${dns_primary}" \ 73 | --engine-opt="dns ${dns_secondary}" \ 74 | --engine-opt="log-driver json-file" \ 75 | --engine-opt="log-opt max-file=10" \ 76 | --engine-opt="log-opt max-size=10m" \ 77 | --engine-opt="dns-search=${dns_search_domain}" \ 78 | --engine-label="dc=${datacenter}" \ 79 | --engine-label="instance_type=public_cloud" \ 80 | --tls-san ${node} \ 81 | --tls-san ${node_private_ip} \ 82 | --tls-san ${node_public_ip} \ 83 | ${node} 84 | done 85 | 86 | # Install consul across cluster members for swarm discovery 87 | export nodelist=( $(docker-machine ls -q) ) 88 | for node in ${nodelist[@]} ; do 89 | othernodes=( $(echo ${nodelist[@]} | tr ' ' '\n' | grep -v ${node}) ) 90 | nodecount=${#othernodes[@]} 91 | for (( nodenum=0; nodenum<${nodecount}; nodenum++ )) ; do 92 | export othernode${nodenum}_cluster_ip=$(docker-machine ip ${othernodes[${nodenum}]}) 93 | done 94 | export node_cluster_ip=$(docker-machine ip ${node}) 95 | export swarm_total_nodes=${#nodelist[@]} 96 | export node 97 | echo 98 | echo 99 | echo "-------------------------------------------------" 100 | echo "Current Run Details..." 
101 | echo "Node = ${node}" 102 | echo "Node Consul Advertise IP = ${node_cluster_ip}" 103 | echo "Joining Consul IP #1 = ${othernode0_cluster_ip}" 104 | echo "Joining Consul IP #2 = ${othernode1_cluster_ip}" 105 | echo "DNS Search Doamin = ${dns_search_domain}" 106 | echo "-------------------------------------------------" 107 | echo 108 | echo 109 | docker-machine ssh ${node} "printf 'nameserver ${node_cluster_ip}\nnameserver ${dns_primary}\nnameserver ${dns_secondary}\ndomain ${dns_search_domain}\n' > /etc/resolv.conf" 110 | docker-machine scp -r ${__root}/compose/consul/config ${node}:/tmp/consul 111 | docker-machine ssh ${node} "mv /tmp/consul /etc" 112 | eval $(docker-machine env ${node}) 113 | docker-compose -f ${__root}/compose/consul/consul.yml up -d consul 114 | docker-machine ssh ${node} "systemctl restart docker" 115 | done 116 | 117 | echo "Done" 118 | -------------------------------------------------------------------------------- /scripts/cancel_softlayer.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | ############################################################################# 4 | # Cancel configured instances in the public cloud in IBM Softlayer 5 | ############################################################################# 6 | 7 | #set -o xtrace # uncomment for debug 8 | 9 | ############################################################################# 10 | # Nothing to set past here 11 | ############################################################################# 12 | 13 | # Strict modes. Same as -euo pipefail 14 | set -o errexit 15 | set -o pipefail 16 | set -o nounset 17 | 18 | # Set some useful vars 19 | IFS=$'\n\t' # Set field separator to return+tab 20 | __dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" # Dir this script runs out of 21 | __root="$(cd "$(dirname "${__dir}")" && pwd)" # Parent dir 22 | __file="${__dir}/$(basename "${BASH_SOURCE[0]}")" # This script's filename 23 | 24 | # Trap errors 25 | trap $(printf "\n\tERROR: ${LINENO}\n\n" && exit 1) ERR 26 | 27 | cd ${__dir} 28 | 29 | # Get config. Load local override config if present. 30 | if [[ -e ${__root}/config/swarm.local ]] ; then 31 | source ${__root}/config/swarm.local 32 | else 33 | source ${__root}/config/swarm.conf 34 | fi 35 | 36 | # Set machine storage to machine subdir of repo 37 | export MACHINE_STORAGE_PATH="${__root}/machine" 38 | 39 | # Check for slcli 40 | [[ ! $(which slcli 2>/dev/null) ]] && echo "slcli tool required. Make sure slcli is configured and in your path" && exit 1 41 | 42 | # Cancel swarm instances using slcli 43 | 44 | re=$(tput setaf 1) 45 | wh=$(tput setaf 7) 46 | nor=$(tput sgr0) 47 | echo 48 | echo "########################################################" 49 | echo "# ${wh}ATTENTION - THIS WILL CANCEL ALL SWARM VM'S${nor} #" 50 | echo "########################################################" 51 | echo 52 | echo "All data on the cancelled systems will be lost." 53 | echo 54 | echo "${wh}The following instances will be cancelled:" 55 | echo 56 | echo "${re}${nodelist[@]}${nor}" 57 | echo 58 | 59 | read -p "Are you sure you wish to continue? Type ${re}CANCEL${nor} to initiate cancellations: " answer 60 | case ${answer} in 61 | CANCEL) 62 | echo "OK, going to cancel nodes: ${re}${nodelist[@]}${nor}" 63 | ;; 64 | *) 65 | echo "Aborting cancellation. No instances will be destroyed..." 66 | exit 67 | ;; 68 | esac 69 | 70 | [[ ! -f /usr/bin/expect ]] && echo "Please install expect to continue..." 
&& exit 1 71 | for node in ${nodelist[@]} ; do 72 | # Check if node is still in docker-machine and remove it before we cancel 73 | echo "Checking for ${node} in docker-machine..." 74 | echo 75 | if [[ $(docker-machine ip ${node}) ]] ; then 76 | echo "########## Found ${node} in machine list, removing it." 77 | docker-machine rm -f -y ${node} 78 | echo "########## ${node} removed from docker machine list." 79 | else 80 | echo "########## Node ${node} not found in machine list, skipping straight to softlayer cancellation..." 81 | echo 82 | fi 83 | 84 | # Try to avoid collateral damage. Abort if we get anything other than a single hit. 85 | export check_instance="$(slcli vs list | grep ${node} || true)" # Strictmode workaround 86 | export uniq_check=$(echo ${check_instance} | wc -l) 87 | if [[ ${uniq_check} = 0 ]] ; then 88 | echo "Match for ${node} not found. Exiting" 89 | exit 1 90 | elif [[ ${uniq_check} > 1 ]] ; then 91 | echo "More than one match for ${node} found. Exiting." 92 | exit 1 93 | else 94 | echo "Found exactly one match, continuing" 95 | fi 96 | # OK to go... 97 | node_id=$(slcli vs list | grep ${node} | awk '{print $1}') 98 | echo "########## Cancelling ${node}..." 99 | /usr/bin/expect <<- EOF 100 | set force_conservative 0 101 | set timeout 10 102 | spawn slcli vs cancel ${node} 103 | expect "Enter to abort: " 104 | send -- "${node_id}\r"; 105 | expect eof 106 | EOF 107 | echo "########## Cancellation of node ${node} complete!" 108 | echo 109 | done 110 | 111 | echo "All nodes complete!" 112 | echo "All nodes cancelled. Check ${wh}slcli vs list${nor} for cancellation status before reprovisioning" 113 | -------------------------------------------------------------------------------- /scripts/deploy_mariadb.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | ############################################################################# 4 | # Build an n-node docker swarm HA cluster with HA consul for discovery 5 | # running on the swarm itself. Uses docker machine generic driver 6 | # and pre-existing CentOS 7 nodes in Softlayer. 7 | ############################################################################# 8 | 9 | #set -o xtrace # uncomment for debug 10 | 11 | ############################################################################# 12 | # Nothing to set past here 13 | ############################################################################# 14 | 15 | # Strict modes. Same as -euo pipefail 16 | set -o errexit 17 | set -o pipefail 18 | set -o nounset 19 | 20 | # Set some useful vars 21 | #IFS=$'\n\t' # Set field separator to return+tab 22 | __dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" # Dir this script runs out of 23 | __root="$(cd "$(dirname "${__dir}")" && pwd)" # Parent dir 24 | __file="${__dir}/$(basename "${BASH_SOURCE[0]}")" # This script's filename 25 | 26 | # Trap errors 27 | trap $(printf "\n\tERROR: ${LINENO}\n\n" && exit 1) ERR 28 | 29 | cd ${__dir} 30 | 31 | # Get config. Load local override config if present. 
32 | if [[ -e ${__root}/config/swarm.local ]] ; then 33 | source ${__root}/config/swarm.local 34 | else 35 | source ${__root}/config/swarm.conf 36 | fi 37 | 38 | # Set machine storage to machine subdir of repo 39 | export MACHINE_STORAGE_PATH="${__root}/machine" 40 | 41 | # Get our list of nodes from docker machine 42 | export nodelist=( $(docker-machine ls -q) ) 43 | export node=${nodelist[0]} 44 | eval $(docker-machine env ${node}) 45 | 46 | # Create our overlay network if it doesn't already exist 47 | [[ ! $(docker network ls | grep mariadb) ]] && docker network create -d overlay --subnet=172.100.100.0/24 mariadb 48 | 49 | # Bootstrap mariadb cluster on the first node if not already here 50 | # Otherwise, try to join as a regular node 51 | export node=${nodelist[0]} 52 | cd ${__root}/compose/mariadb 53 | if [[ ! $(docker ps | grep db) ]] ; then 54 | echo "${nodelist[0]}/db not running. Attempting to bootstrap cluster..." 55 | for node in ${nodelist[0]} ; do 56 | eval $(docker-machine env ${node}) 57 | # Set the cluster mode to bootstrap, as we're the first one 58 | export cluster_members=BOOTSTRAP 59 | export node 60 | sed "s/%%DBNODE%%/db-${node}/g" mariadb.yml > ${node}.yml 61 | docker-compose -f ${__root}/compose/mariadb/${node}.yml up -d --no-recreate 62 | done 63 | else 64 | echo "${nodelist[0]}/db already running. Attempting to rejoin cluster as regular node..." 65 | for node in ${nodelist[0]} ; do 66 | eval $(docker-machine env ${node}) 67 | # Set the cluster mode to to the current node list as we're joining as a regular node 68 | cluster_members=$(printf ",db-%s" "${nodelist[@]}") 69 | export cluster_members=${cluster_members:1} 70 | export node 71 | sed "s/%%DBNODE%%/db-${node}/g" mariadb.yml > ${node}.yml 72 | docker-compose -f ${__root}/compose/mariadb/${node}.yml up -d --force-recreate 73 | done 74 | fi 75 | 76 | while [[ ! $(docker logs db-${node} 2>&1 |grep "Synchronized with group, ready for connections") ]] ; do 77 | echo "Waiting for ${nodelist[0]}/db to become available" 78 | sleep 10 79 | done 80 | 81 | sec_nodelist="${nodelist[@]:1}" 82 | for node in ${sec_nodelist} ; do 83 | cluster_members=$(printf ",db-%s" "${nodelist[@]}") 84 | export cluster_members=${cluster_members:1} 85 | export node 86 | echo "Building ${node} mariadb instance..." 87 | eval $(docker-machine env ${node}) 88 | sed "s/%%DBNODE%%/db-${node}/g" mariadb.yml > ${node}.yml 89 | docker-compose -f ${__root}/compose/mariadb/${node}.yml up -d --no-recreate 90 | done 91 | 92 | echo "Done" 93 | -------------------------------------------------------------------------------- /scripts/destroy_swarm.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | #set -o xtrace # uncomment for debug 4 | 5 | # Strict modes. Same as -euo pipefail 6 | set -o errexit 7 | set -o pipefail 8 | set -o nounset 9 | 10 | # Set some magic vars 11 | IFS=$'\n\t' # Set field separator to return+tab 12 | __dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" # Dir this script runs out of 13 | __root="$(cd "$(dirname "${__dir}")" && pwd)" # Parent dir 14 | __file="${__dir}/$(basename "${BASH_SOURCE[0]}")" # This script's filename 15 | 16 | # Trap errors 17 | trap $(printf "\n\tERROR: ${LINENO}\n\n" && exit 1) ERR 18 | 19 | cd ${__dir} 20 | 21 | # Get config. Load local override config if present. 
22 | if [[ -e ${__root}/config/swarm.local ]] ; then 23 | source ${__root}/config/swarm.local 24 | else 25 | source ${__root}/config/swarm.conf 26 | fi 27 | 28 | # Set machine storage to machine subdir of repo 29 | export MACHINE_STORAGE_PATH="${__root}/machine" 30 | 31 | export nodelist=( $(docker-machine ls -q) ) 32 | 33 | for node in ${nodelist[@]} ; do 34 | docker-machine scp ${__root}/scripts/instance/cleanup_generic.sh ${node}:/tmp 35 | docker-machine ssh ${node} "chmod +x /tmp/cleanup_generic.sh" 36 | docker-machine ssh ${node} "/tmp/cleanup_generic.sh" 37 | docker-machine rm -f -y ${node} 38 | done 39 | -------------------------------------------------------------------------------- /scripts/instance/cleanup_generic.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Cleanup after removing a generic centos system from docker machine. 4 | # 5 | # Without cleaning up first, attempting to reprovision with the 6 | # generic driver in machine will fail. 7 | # 8 | # Run on docker-machine $node before performing docker-machine rm $node 9 | # Node can then be reprovisioned using machine without crapping out. 10 | 11 | exec 1> >(logger -s -t $(basename $0)) 2>&1 12 | 13 | # Stop Docker 14 | systemctl stop docker 15 | 16 | # If using btrfs for /var/lib/docker, cleanup btrfs snapshots 17 | if [[ $(btrfs filesystem show /var/lib/docker|grep devid|wc -l) != 0 ]] ; then 18 | cd /var/lib/docker 19 | btrfs subvol del $(btrfs subvol list -t .|awk '{print $4}' |tail -n+3) 20 | cd / 21 | fi 22 | 23 | # Clear docker dirs and systemd config placed by machine 24 | rm -rf /etc/docker 25 | rm -rf /var/lib/docker 26 | rm /etc/systemd/system/docker.service 27 | 28 | # Remove Docker packages and yum repo 29 | yum -y remove docker-engine docker-engine-selinux 30 | rm /etc/yum.repos.d/docker* 31 | 32 | # Cleanup consul dirs if present 33 | rm -rf /etc/consul 34 | rm -rf /tmp/consul 35 | 36 | # Reset resolv.conf to use google 37 | echo "nameserver 8.8.8.8" > /etc/resolv.conf 38 | -------------------------------------------------------------------------------- /scripts/instance/sl_post_provision.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Post-Provisioning script for a generic CentOS 7 dockerhost on Softlayer 4 | # Second disk is added to LVM and formatted as btrfs to take advantage 5 | # of the btrfs driver in docker over the crappy default loopback devices. 6 | # 7 | # Assumes second provisioned volume is at /dev/xvdc, which should be 8 | # the default on a two-disk CentOS minimal hourly instance. 9 | 10 | #set -x 11 | set -euo pipefail 12 | 13 | exec 1> >(logger -s -t $(basename $0)) 2>&1 14 | 15 | # Install btrfs/lvm requirements plus git and 16 | # net-tools to provide netstat command 17 | # per https://github.com/docker/machine/issues/2480 18 | # Otherwise generic provisioning fails with SSH exit 127 error 19 | yum -y update && yum clean all && yum makecache fast 20 | yum -y install lvm2 lvm2-libs btrfs-progs git net-tools 21 | 22 | echo "Setup xvdc partition layout for docker" 23 | cat << EOF > /tmp/xvdc.layout 24 | # partition table of /dev/xvde 25 | unit: sectors 26 | 27 | /dev/xvdc1 : start= 2048, size= 26213376, Id=8e 28 | /dev/xvdc2 : start= 0, size= 0, Id= 0 29 | /dev/xvdc3 : start= 0, size= 0, Id= 0 30 | /dev/xvdc4 : start= 0, size= 0, Id= 0 31 | EOF 32 | 33 | echo "Apply partition layouts..." 
34 | sfdisk --force /dev/xvdc < /tmp/xvdc.layout 35 | 36 | echo "Create physical volumes in lvm..." 37 | pvcreate /dev/xvdc1 38 | 39 | echo "Create lvm volume groups..." 40 | vgcreate docker_vg /dev/xvdc1 41 | 42 | echo "Create lvm logical volumes..." 43 | lvcreate -l 100%FREE -n docker_lv1 docker_vg 44 | 45 | echo "Setup btrfs filesystem of lvm volumes..." 46 | mkfs.btrfs /dev/docker_vg/docker_lv1 47 | rm /tmp/xvdc.layout 48 | 49 | echo "Mounting btrfs volume to /var/lib/docker" 50 | [[ ! -d /var/lib/docker ]] && mkdir /var/lib/docker 51 | mount /dev/docker_vg/docker_lv1 /var/lib/docker 52 | 53 | echo "Adding lvm volumes to fstab.." 54 | echo "/dev/docker_vg/docker_lv1 /var/lib/docker btrfs defaults,noatime,autodefrag 0 0" >> /etc/fstab 55 | 56 | export TERM=xterm-256color 57 | 58 | motd="/etc/motd" 59 | W="\033[01;37m" 60 | B="\033[00;34m" 61 | R="\033[01;31m" 62 | RST="\033[0m" 63 | clear > $motd 64 | printf "${W}======================================================\n" >> $motd 65 | printf "\n" >> $motd 66 | printf " ${B}███████╗██╗ ██╗ █████╗ ██████╗ ███╗ ███╗\n" >> $motd 67 | printf " ██╔════╝██║ ██║██╔══██╗██╔══██╗████╗ ████║\n" >> $motd 68 | printf " ███████╗██║ █╗ ██║███████║██████╔╝██╔████╔██║\n" >> $motd 69 | printf " ╚════██║██║███╗██║██╔══██║██╔══██╗██║╚██╔╝██║\n" >> $motd 70 | printf " ███████║╚███╔███╔╝██║ ██║██║ ██║██║ ╚═╝ ██║\n" >> $motd 71 | printf " ╚══════╝ ╚══╝╚══╝ ╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═╝\n" >> $motd 72 | printf "\n" >> $motd 73 | printf " ${R}This service is restricted to authorized users only.\n" >> $motd 74 | printf " All activities on this system are logged.\n" >> $motd 75 | printf "$RST" >> $motd 76 | printf "${W}======================================================\n" >> $motd 77 | printf "$RST" >> $motd 78 | 79 | echo "Post-provisioning for host $(hostname) complete!" 80 | -------------------------------------------------------------------------------- /scripts/provision_softlayer.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | ############################################################################# 4 | # Provision configured instances in the public cloud in IBM Softlayer 5 | ############################################################################# 6 | 7 | #set -o xtrace # uncomment for debug 8 | 9 | ############################################################################# 10 | # Nothing to set past here 11 | ############################################################################# 12 | 13 | # Strict modes. Same as -euo pipefail 14 | set -o errexit 15 | set -o pipefail 16 | set -o nounset 17 | 18 | # Set some useful vars 19 | IFS=$'\n\t' # Set field separator to return+tab 20 | __dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" # Dir this script runs out of 21 | __root="$(cd "$(dirname "${__dir}")" && pwd)" # Parent dir 22 | __file="${__dir}/$(basename "${BASH_SOURCE[0]}")" # This script's filename 23 | 24 | # Trap errors 25 | trap $(printf "\n\tERROR: ${LINENO}\n\n" && exit 1) ERR 26 | 27 | cd ${__dir} 28 | 29 | # Get config. Load local override config if present. 30 | if [[ -e ${__root}/config/swarm.local ]] ; then 31 | source ${__root}/config/swarm.local 32 | else 33 | source ${__root}/config/swarm.conf 34 | fi 35 | 36 | # Set machine storage to machine subdir of repo 37 | export MACHINE_STORAGE_PATH="${__root}/machine" 38 | 39 | # Check for slcli 40 | [[ ! $(which slcli 2>/dev/null) ]] && echo "slcli tool required. 
Make sure slcli is configured and in your path" && exit 1 41 | 42 | # Order swarm instances using slcli 43 | 44 | # Get per-node cost 45 | export node=${nodelist[0]} 46 | node_hourly_cost="$(slcli vs create --test --public ${sl_disk_type} \ 47 | -H ${node} -D ${sl_domain} -c ${sl_cpu} -m ${sl_memory} \ 48 | -d ${sl_region} -o CENTOS_LATEST --billing ${sl_billing} \ 49 | --disk ${sl_os_disk_size} --disk ${sl_docker_disk_size} -n 100 \ 50 | | grep "Total hourly" | awk '{print $4}')" 51 | 52 | re=$(tput setaf 1) 53 | wh=$(tput setaf 7) 54 | nor=$(tput sgr0) 55 | echo 56 | echo "########################################################" 57 | echo "# ${wh}ATTENTION - THIS WILL PROVISION LIVE VM'S${nor} #" 58 | echo "# ${wh}YOU ${re}*WILL*${wh} ACCRUE COSTS IF YOU CONTINUE${nor} #" 59 | echo "########################################################" 60 | echo 61 | echo "- ${wh}Total # of Nodes:${nor} ${#nodelist[@]}" 62 | echo "- ${wh}Nodes:${nor} ${nodelist[@]}" 63 | echo "- ${wh}Cores:${nor} ${sl_cpu} ${wh}RAM:${nor} ${sl_memory}GB" 64 | echo "- ${wh}Cost Per Node:${nor} \$${node_hourly_cost} / ${sl_billing}" 65 | swarm_hourly_cost=$(echo "${node_hourly_cost} ${#nodelist[@]}" | awk '{printf "%f", $1 * $2}' | cut -b1-4) 66 | echo "- ${wh}TOTAL cost to run swarm:${nor} \$${swarm_hourly_cost} / ${sl_billing}" 67 | echo 68 | echo "########################################################" 69 | echo 70 | 71 | read -p "Are you sure you wish to continue? Type ${re}ORDER${nor} to continue: " answer 72 | case ${answer} in 73 | ORDER) 74 | echo "Proceeding with order of ${re}${nodelist[@]}${nor}..." 75 | ;; 76 | *) 77 | echo "Aborting, no order will be placed..." 78 | exit 79 | ;; 80 | esac 81 | 82 | echo "Provisioning ssh key for swarm provisioning..." 83 | if [[ ! $(slcli sshkey list | grep ${sl_sshkey_name}) ]] ; then 84 | if [[ ! -e ${__root}/ssh/swarm.rsa && ! -e ${__root}/ssh/swarm.rsa.pub ]] ; then 85 | echo "Existing swarm ssh cert and key not found, generating a new set..." 86 | echo -e 'y\n' | ssh-keygen -f ${__root}/ssh/swarm.rsa -t rsa -N '' 87 | fi 88 | echo "Adding swarm.rsa.pub to Softlayer account as ${sl_sshkey_name}" 89 | slcli sshkey add -f ${__root}/ssh/swarm.rsa.pub \ 90 | --note "Test Key for https://github.com/dayreiner/docker-swarm-mariadb" ${sl_sshkey_name} 91 | else 92 | if [[ ! -e ${__root}/ssh/swarm.rsa && ! -e ${__root}/ssh/swarm.rsa.pub ]] ; then 93 | echo 94 | echo "SSH Key present in softlayer but not found locally. Cannot continue." 95 | echo "Please run 'slcli sshkey remove ${sl_sshkey_name}' and re-run this script. Exiting." 96 | echo 97 | exit 1 98 | fi 99 | echo "SSH Key already present locally added to Softlayer account. Skipping..." 100 | fi 101 | 102 | echo 103 | 104 | # Actually place the orders 105 | 106 | echo 107 | echo "Provisioning ${#nodelist[@]} nodes: ${nodelist[@]}" 108 | echo 109 | [[ ! -f /usr/bin/expect ]] && echo "Please install expect to continue..." && exit 1 110 | for node in ${nodelist[@]} ; do 111 | echo 112 | echo "Ordering ${node}..." 
113 | /usr/bin/expect <<- EOF 114 | set force_conservative 0 115 | set timeout 10 116 | spawn slcli vs create ${sl_extra_args} \ 117 | -H ${node} -D ${sl_domain} -c ${sl_cpu} -m ${sl_memory} \ 118 | -d ${sl_region} -o CENTOS_LATEST --billing ${sl_billing} \ 119 | --public ${sl_disk_type} -k $(slcli sshkey list | grep ${sl_sshkey_name} | awk '{print $1}') \ 120 | --disk ${sl_os_disk_size} --disk ${sl_docker_disk_size} -n 100 \ 121 | --tag dockerhost --vlan-public ${sl_public_vlan_id} --vlan-private ${sl_private_vlan_id} 122 | expect "*?N]: " 123 | send "Y\r" 124 | expect eof 125 | EOF 126 | echo "########## Ordering for ${node} complete!" 127 | echo 128 | done 129 | 130 | echo 131 | echo "Swarm nodes ordered in softlayer. Waiting for provisioning to complete..." 132 | echo 133 | 134 | for node in ${nodelist[@]} ; do 135 | provision_state=$(slcli vs detail ${node} | grep state | awk '{print $2}') 136 | while [[ "${provision_state}" != "RUNNING" ]]; do 137 | provision_state=$(slcli vs detail ${node} | grep state | awk '{print $2}') 138 | echo "Provisioning in progress. State is ${provision_state}. Will check again in 1 minute..." 139 | sleep 60 140 | done 141 | echo 142 | echo "########## Provisioning of ${node} completed. Running post-provision script." 143 | echo 144 | node_ssh_ip=$(slcli vs detail ${node} | grep ${sl_ssh_interface}_ip | awk '{print $2}') 145 | known_hosts_file="${__root}/ssh/known_hosts" 146 | ssh-keyscan -H ${node_ssh_ip} >> ${known_hosts_file} 147 | ssh_opts=( -o StrictHostKeyChecking=no -o UserKnownHostsFile=${known_hosts_file} ) 148 | export TERM=xterm-256color 149 | scp ${ssh_opts[@]} ${__root}/scripts/instance/sl_post_provision.sh root@${node_ssh_ip}:/tmp/${node}-docker-prep.sh 150 | ssh ${ssh_opts[@]} root@${node_ssh_ip} "chmod +x /tmp/${node}-docker-prep.sh" 151 | ssh ${ssh_opts[@]} root@${node_ssh_ip} "/tmp/${node}-docker-prep.sh" 152 | echo 153 | echo "########## Post-provision script on ${node} completed!" 154 | echo 155 | done 156 | 157 | echo "########################################################" 158 | echo "# SWARM INSTANCE PROVISIONING COMPLETE #" 159 | echo "########################################################" 160 | echo 161 | echo "You can now run build_swarm.sh to build the docker swarm" 162 | echo "cluster on the provisioned instances" 163 | echo 164 | -------------------------------------------------------------------------------- /scripts/rebuild_swarm.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | #set -o xtrace 3 | 4 | set -euo pipefail 5 | 6 | __dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" 7 | 8 | cd ${__dir} 9 | ./destroy_swarm.sh 10 | ./build_swarm.sh 11 | -------------------------------------------------------------------------------- /source.me: -------------------------------------------------------------------------------- 1 | # Set the machine storage path to the machine subdir of this repo 2 | export MACHINE_STORAGE_PATH="$( cd "$(dirname "${BASH_SOURCE:-$0}")/machine" ; pwd -P )" 3 | -------------------------------------------------------------------------------- /ssh/.gitignore: -------------------------------------------------------------------------------- 1 | # Ignore everything in this directory 2 | * 3 | # Except this file 4 | !.gitignore 5 | --------------------------------------------------------------------------------