├── docker-mtu.md ├── jvm-resources-recommendations.md └── README.md /docker-mtu.md: -------------------------------------------------------------------------------- 1 | # Docker Swarm Network MTU 2 | 3 |

4 | 5 |

6 | 7 | If your Docker instance communicates with other Docker instances via VXLAN or any other network whose MTU differs from the default 1500, you need to delete the default ingress network and create a new one! This is needed because Docker doesn't inherit the MTU of your network interface; without this fix, there will be intermittent packet loss when communicating between nodes! 8 | 9 | Docker doesn't detect the network MTU because [reasons](https://github.com/moby/moby/pull/18108), so by default it uses an MTU of 1500 for connections, which will cause issues if your network has a different MTU! 10 | 11 |

12 | 13 |

14 | 15 | To fix this, check the MTU of the network that is going to be used for node communication by using `ip a`. 16 | 17 | ``` 18 | 3: ens19: mtu 1450 qdisc fq_codel state UP group default qlen 1000 19 | link/ether ... 20 | ``` 21 | 22 | As we can see, our MTU is 1450, so let's set up Docker to use an MTU of 1450! 23 | 24 | Create the file `/etc/docker/daemon.json` and insert the following contents; if the file already exists, add the `mtu` field to the existing JSON. 25 | 26 | ```json 27 | { 28 | "mtu": 1450 29 | } 30 | ``` 31 | 32 | Then restart Docker with `systemctl restart docker`; this sets up Docker's default networks to use an MTU of 1450. 33 | 34 | Now we need to change the MTU of the ingress network used by Docker Swarm for inter-node communication. To do this, recreate the ingress network: 35 | * `docker network rm ingress` 36 | * `docker network create --driver overlay --ingress --opt com.docker.network.driver.mtu=MTUOfTheNetworkHere --subnet 192.168.128.0/24 --gateway 192.168.128.1 ingress` 37 | * To check what `subnet` and `gateway` you should use, run `docker network inspect ingress` *before* removing the network and check the IPAM section! 38 | * Let's suppose that we are using a VXLAN network for communication. VXLAN networks use an MTU of 1450, so we need to use `--opt com.docker.network.driver.mtu=1450` 39 | * Restart Docker with `systemctl restart docker` 40 | 41 | This fixes Outside World -> Swarm Cluster communication; now we need to fix communication between your containers! 42 | 43 | To do this, append this to the end of every `docker-compose.yml`. 44 | 45 | ```yml 46 | networks: 47 | default: 48 | driver: overlay 49 | driver_opts: 50 | com.docker.network.driver.mtu: 1450 51 | ``` 52 | 53 | This sets up the overlay networks used by the stacks to also use your MTU. 54 | 55 | And that's it! Now you shouldn't have any connectivity issues in your Swarm.
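For the record, the 1450 figure isn't arbitrary: VXLAN encapsulation adds 50 bytes of overhead on an IPv4 underlay, so a standard 1500-byte MTU leaves 1450 bytes for the inner frame. A quick sketch of the arithmetic (the helper name is mine, not a Docker API):

```kotlin
// VXLAN overhead on an IPv4 underlay:
// outer Ethernet (14) + outer IPv4 (20) + UDP (8) + VXLAN header (8) = 50 bytes.
const val VXLAN_OVERHEAD = 14 + 20 + 8 + 8

// Hypothetical helper: the MTU to hand to Docker, given the underlay interface's MTU.
fun vxlanInnerMtu(underlayMtu: Int): Int = underlayMtu - VXLAN_OVERHEAD

fun main() {
    println(vxlanInnerMtu(1500)) // 1450, matching the `ip a` output above
}
```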
56 | 57 | [[If you want to learn more, this issue talks a lot about connectivity issues!]](https://github.com/moby/moby/issues/36689#issuecomment-987706496) 58 | 59 | There is a [Pull Request](https://github.com/moby/moby/pull/43197) that automatically sets the MTU for all networks without the need to change every network manually; let's hope it gets merged in the future. :3 60 | -------------------------------------------------------------------------------- /jvm-resources-recommendations.md: -------------------------------------------------------------------------------- 1 | # JVM Resources Recommendations 2 | 3 | Messing with `limits` and `reservations` may impact your Java application in ways that you weren't expecting, so here are some tips to avoid failing. 4 | 5 | Don't worry, *it is painfully hard on Kubernetes too*. 😭 6 | 7 | My Swarm node Virtual Machine has 4GB of RAM and 4 cores. So let's do some tests on it! The application will be using Java 17 (you MUST use Java 8u131 or above, because Java didn't respect cgroups before that. We are in `${InsertYearHere}` already, move on to Java 17!!) and will print memory and CPU stats to the console before quitting. If you want to play around with it on your computer, [the container is public](https://github.com/MrPowerGamerBR/DebugAllocationContainers/pkgs/container/debugallocationcontainers)!
8 | ```kotlin 9 | fun main() { 10 | val mb = 1024 * 1024 11 | val runtime = Runtime.getRuntime() 12 | 13 | println("Used Memory: ${(runtime.totalMemory() - runtime.freeMemory()) / mb}MiB") 14 | println("Free Memory: ${runtime.freeMemory() / mb}MiB") 15 | println("Total Memory: ${runtime.totalMemory() / mb}MiB") 16 | println("Max Memory: ${runtime.maxMemory() / mb}MiB") 17 | println("Available Processors: ${runtime.availableProcessors()}") 18 | } 19 | ``` 20 | 21 | ## No resources set 22 | 23 | ```yml 24 | version: "3.9" 25 | services: 26 | temurin: 27 | image: ghcr.io/mrpowergamerbr/debugallocationcontainers@sha256:d98ad5df3b5829fc7595eb48f6e49c9856cd9ad8ebefe75068ecd5063f0fb789 28 | environment: 29 | JAVA_TOOL_OPTIONS: "-verbose:gc" 30 | ``` 31 | **Output:** 32 | ``` 33 | temurin-test_temurin.1.xnwoo9oyyhfx@docker-swarm-worker-1 | [0.005s][info][gc] Using G1 34 | temurin-test_temurin.1.xnwoo9oyyhfx@docker-swarm-worker-1 | Used Memory: 1MiB 35 | temurin-test_temurin.1.xnwoo9oyyhfx@docker-swarm-worker-1 | Free Memory: 62MiB 36 | temurin-test_temurin.1.xnwoo9oyyhfx@docker-swarm-worker-1 | Total Memory: 64MiB 37 | temurin-test_temurin.1.xnwoo9oyyhfx@docker-swarm-worker-1 | Max Memory: 982MiB 38 | temurin-test_temurin.1.xnwoo9oyyhfx@docker-swarm-worker-1 | Available Processors: 4 39 | ``` 40 | 41 | The JVM's max heap is set to 1/4 of the entire VM's memory. This makes sense, because the default value of `-XX:MaxRAMPercentage` is 25, and 25% of 4GB is 1GB. 42 | 43 | ## With Xmx/Xms set 44 | Most Java developers use `-Xmx` and `-Xms` to set the heap size, so let's use them.
45 | 46 | ```yml 47 | version: "3.9" 48 | services: 49 | temurin: 50 | image: ghcr.io/mrpowergamerbr/debugallocationcontainers@sha256:d98ad5df3b5829fc7595eb48f6e49c9856cd9ad8ebefe75068ecd5063f0fb789 51 | environment: 52 | JAVA_TOOL_OPTIONS: "-verbose:gc -Xmx512M -Xms512M" 53 | ``` 54 | **Output:** 55 | ``` 56 | temurin-test_temurin.1.ulbt33xyo0d0@docker-swarm-worker-1 | [0.005s][info][gc] Using G1 57 | temurin-test_temurin.1.ulbt33xyo0d0@docker-swarm-worker-1 | Used Memory: 1MiB 58 | temurin-test_temurin.1.ulbt33xyo0d0@docker-swarm-worker-1 | Free Memory: 510MiB 59 | temurin-test_temurin.1.ulbt33xyo0d0@docker-swarm-worker-1 | Total Memory: 512MiB 60 | temurin-test_temurin.1.ulbt33xyo0d0@docker-swarm-worker-1 | Max Memory: 512MiB 61 | temurin-test_temurin.1.ulbt33xyo0d0@docker-swarm-worker-1 | Available Processors: 4 62 | ``` 63 | 64 | As we can see, the JVM is using our configured allocated memory! However, Swarm doesn't know about this, so *it will try to allocate our container on any node, even if they don't have 512MB available!* 65 | 66 | ## Xmx/Xms + Resource Reservations (The Best And Simplest Way Of Doing This™) 67 | ```yml 68 | version: "3.9" 69 | services: 70 | temurin: 71 | image: ghcr.io/mrpowergamerbr/debugallocationcontainers@sha256:d98ad5df3b5829fc7595eb48f6e49c9856cd9ad8ebefe75068ecd5063f0fb789 72 | deploy: 73 | resources: 74 | reservations: 75 | memory: 768M # We reserve more memory than we set the heap, due to off heap allocations and other JVM shenanigans. 
76 | environment: 77 | JAVA_TOOL_OPTIONS: "-verbose:gc -Xmx512M -Xms512M" 78 | ``` 79 | **Output:** 80 | ``` 81 | temurin-test_temurin.1.vrp8yjosnc5d@docker-swarm-manager-1 | [0.005s][info][gc] Using G1 82 | temurin-test_temurin.1.vrp8yjosnc5d@docker-swarm-manager-1 | Used Memory: 1MiB 83 | temurin-test_temurin.1.vrp8yjosnc5d@docker-swarm-manager-1 | Free Memory: 510MiB 84 | temurin-test_temurin.1.vrp8yjosnc5d@docker-swarm-manager-1 | Total Memory: 512MiB 85 | temurin-test_temurin.1.vrp8yjosnc5d@docker-swarm-manager-1 | Max Memory: 512MiB 86 | temurin-test_temurin.1.vrp8yjosnc5d@docker-swarm-manager-1 | Available Processors: 4 87 | ``` 88 | 89 | Once again, it works fine! In my opinion, this is the best AND easiest way to do this. 90 | 91 | ## But what if we don't set `Xmx/Xms` WHILE we have a `reservations` set? 92 | ```yml 93 | version: "3.9" 94 | services: 95 | temurin: 96 | image: ghcr.io/mrpowergamerbr/debugallocationcontainers@sha256:d98ad5df3b5829fc7595eb48f6e49c9856cd9ad8ebefe75068ecd5063f0fb789 97 | deploy: 98 | resources: 99 | reservations: 100 | memory: 768M # We reserve more memory than we set the heap, due to off-heap allocations and other JVM shenanigans. 101 | environment: 102 | JAVA_TOOL_OPTIONS: "-verbose:gc" 103 | ``` 104 | **Output:** 105 | ``` 106 | temurin-test_temurin.1.wost4c51s6ma@docker-swarm-manager-1 | [0.005s][info][gc] Using G1 107 | temurin-test_temurin.1.wost4c51s6ma@docker-swarm-manager-1 | Used Memory: 1MiB 108 | temurin-test_temurin.1.wost4c51s6ma@docker-swarm-manager-1 | Free Memory: 62MiB 109 | temurin-test_temurin.1.wost4c51s6ma@docker-swarm-manager-1 | Total Memory: 64MiB 110 | temurin-test_temurin.1.wost4c51s6ma@docker-swarm-manager-1 | Max Memory: 984MiB 111 | temurin-test_temurin.1.wost4c51s6ma@docker-swarm-manager-1 | Available Processors: 4 112 | ``` 113 | 114 | It still allocates ~1GB, so the container simply ignores our `reservations` when figuring out how much memory it can allocate.
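In other words, `reservations` is purely a scheduling input: it affects where Swarm places the task, never what the JVM sees. Here's a toy sketch of reservation-aware placement (names and logic are illustrative, not Swarm's actual implementation):

```kotlin
data class Node(val name: String, val totalMemoryMiB: Int, var reservedMiB: Int = 0)

// Toy placement: pick the first node whose unreserved memory covers the reservation,
// and record the reservation on the chosen node.
fun schedule(nodes: List<Node>, reservationMiB: Int): Node? =
    nodes.firstOrNull { it.totalMemoryMiB - it.reservedMiB >= reservationMiB }
        ?.also { it.reservedMiB += reservationMiB }

fun main() {
    // A 4 GiB node that already has most of its memory reserved by other tasks.
    val nodes = listOf(Node("worker-1", 4096, reservedMiB = 3900))

    // With no reservation declared, the task "fits" anywhere...
    println(schedule(nodes, 0)?.name)   // worker-1
    // ...but an honest 768 MiB reservation is rejected: insufficient resources.
    println(schedule(nodes, 768)?.name) // null
}
```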
115 | 116 | ## But what if we don't set `Xmx/Xms` WHILE we have a `limits` set? 117 | ```yml 118 | version: "3.9" 119 | services: 120 | temurin: 121 | image: ghcr.io/mrpowergamerbr/debugallocationcontainers@sha256:d98ad5df3b5829fc7595eb48f6e49c9856cd9ad8ebefe75068ecd5063f0fb789 122 | deploy: 123 | resources: 124 | limits: 125 | memory: 512M 126 | environment: 127 | JAVA_TOOL_OPTIONS: "-verbose:gc" 128 | ``` 129 | **Output:** 130 | ``` 131 | temurin-test_temurin.1.vq0a6qz6kjkr@docker-swarm-manager-1 | [0.002s][info][gc] Using Serial 132 | temurin-test_temurin.1.vq0a6qz6kjkr@docker-swarm-manager-1 | Used Memory: 0MiB 133 | temurin-test_temurin.1.vq0a6qz6kjkr@docker-swarm-manager-1 | Free Memory: 7MiB 134 | temurin-test_temurin.1.vq0a6qz6kjkr@docker-swarm-manager-1 | Total Memory: 7MiB 135 | temurin-test_temurin.1.vq0a6qz6kjkr@docker-swarm-manager-1 | Max Memory: 123MiB 136 | temurin-test_temurin.1.vq0a6qz6kjkr@docker-swarm-manager-1 | Available Processors: 4 137 | ``` 138 | 139 | Aha! Now the JVM is using 25% of the memory that we set in the `limits` section! And look, because our heap is smol, Java decides that using Serial instead of G1GC is a good idea (Spoiler: while Serial is fine for desktop applications that don't have a lot of threads, it isn't good for much else). 140 | 141 | Now, you can use `-XX:MaxRAMPercentage` to set what percentage of your configured `limits.memory` the JVM should allocate. While this works, I do think that this is a bit confusing and non-intuitive; besides, 99% of the time you are deploying containers that have the same max memory set, so this is not that useful. 142 | 143 | ## Should I limit CPU? 144 | 145 | In my opinion? Nah, let your services use all of your CPUs. Fewer headaches. 146 | 147 | ## But should I create CPU reservations? 148 | 149 | Yes! Mostly to avoid scheduling your containers on oversaturated nodes.
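Tying the memory numbers from all the runs above together: the JVM's max heap defaults to `MaxRAMPercentage` (25) applied to whatever RAM the container can see, i.e. the whole VM when nothing is set, or the `limits.memory` value when a limit is set. A sketch of that calculation (the helper name is mine, not JVM code):

```kotlin
// Default container heap sizing: max heap = visible RAM * MaxRAMPercentage / 100.
// The real JVM subtracts some internal overhead, which is why the runs above
// report 982 MiB and 123 MiB rather than these round numbers.
fun defaultMaxHeapMiB(visibleRamMiB: Int, maxRamPercentage: Double = 25.0): Int =
    (visibleRamMiB * maxRamPercentage / 100.0).toInt()

fun main() {
    println(defaultMaxHeapMiB(4096)) // 1024: the "no resources set" run (whole 4 GiB VM)
    println(defaultMaxHeapMiB(512))  // 128: the "limits: 512M" run
}
```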
150 | 151 | ## JVM Resources tl;dr: 152 | 153 | * Create memory reservations to avoid allocating your container to a node that doesn't have enough memory. Always reserve a bit more memory than what you are allocating to avoid the system killing your JVM app due to low memory. 154 | * Create CPU reservations to avoid allocating your container to a node that is oversaturated. 155 | * Set your memory with `-Xmx` and `-Xms` because it is easier than fiddling with `MaxRAMPercentage`. 156 | 157 | ```yml 158 | version: "3.9" 159 | services: 160 | temurin: 161 | image: ghcr.io/mrpowergamerbr/debugallocationcontainers@sha256:d98ad5df3b5829fc7595eb48f6e49c9856cd9ad8ebefe75068ecd5063f0fb789 162 | deploy: 163 | resources: 164 | reservations: 165 | memory: 768M 166 | cpus: "0.5" 167 | environment: 168 | JAVA_TOOL_OPTIONS: "-verbose:gc -Xmx512M -Xms512M" 169 | ``` 170 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 |

2 | 3 |

4 | 5 | # Docker Swarm Mode Tutorial 6 | 7 | I know, Kubernetes is what all the cool kidz are using nowadays! And Kubernetes is cool too... until it explodes and you spend hours trying to debug the issue. In fact, I decided to learn about Docker Swarm after getting burned by Kubernetes for the nth time... 8 | 9 | Kubernetes may be painless if you are using a managed Kubernetes instance, but if you are self-hosting it... it is hard! 10 | 11 | **Let's face it:** You ain't *insert huge company name here*, you just care about hosting some containers' replicas, you don't care about all the fancy big corp stuff that Kubernetes tries to provide to you... because you ain't a big corp! 12 | 13 | So why not use Docker Swarm? 14 | 15 | ## But is Docker Swarm... good? 16 | 17 | If you are like me and like to scour the internet to see whether users think that *insert tech here* is good, here are some posts from around the internet of people talking about their experience with Docker Swarm! 18 | 19 | * https://blog.kronis.dev/articles/docker-swarm-over-kubernetes 20 | * https://news.ycombinator.com/item?id=29448182 21 | * https://news.ycombinator.com/item?id=32306857 22 | * https://news.ycombinator.com/item?id=32305800 23 | * https://news.ycombinator.com/item?id=32305952 24 | * https://news.ycombinator.com/item?id=32307229 25 | * https://user-images.githubusercontent.com/9496359/183654510-568ddef6-68e8-4555-8380-429ad827d270.png 26 | * https://www.reddit.com/r/devops/comments/qxq1q9/is_docker_swarm_good_enough_for_production/hlg4xeu/ 27 | * https://www.reddit.com/r/devops/comments/wwpo91/k8s_encourages_people_to_deploy_really_complex/ilmjhl3/ (well that's my own comment but shhh just look at the replies) 28 | * https://www.reddit.com/r/devops/comments/wwpo91/k8s_encourages_people_to_deploy_really_complex/ilmw6uu/ 29 | 30 | ## When SHOULD I use Docker Swarm? 31 | 32 | Docker Swarm is an orchestrator, the "best little multi-node orchestrator that could".
It is very useful if you are already using Docker and need to replicate your containers across multiple nodes/VMs. 33 | 34 | * If you are self-hosting your services on a dedicated server or a VPS. (Example: dedicated servers @ OVH, Hetzner, etc) 35 | * If you are already using Docker Compose. 36 | * Docker Swarm uses Docker Compose files for deployment too, so it is a natural step from Docker Compose -> Swarm! And I mean it! You just need to add a new `deploy` key to your already existing compose files to set up how you want your container to be replicated and deployed... and that's it! 37 | * If your services don't require persistent storage. 38 | 39 | ## When SHOULDN'T I use Docker Swarm? 40 | 41 | As with any tool, there are some disadvantages to Docker Swarm; here are some reasons you may want to avoid it: 42 | 43 | * You are using AWS/GCP/Azure/etc. 44 | * Just use their managed Kubernetes service. As I said before: Kubernetes is cool, except when it breaks. However, using a Kubernetes instance that is managed by someone else lifts all the heavy lifting off of you, in exchange for a pretty hefty price increase. 45 | * You need to auto scale according to demand. 46 | * However, if you are using a rented dedicated server, why would you want to auto scale? And if you are using a host that charges per usage (like AWS, GCP, Azure, etc), then they probably have a managed Kubernetes service. 47 | * You need to have distributed storage because you want to have a distributed database. 48 | * Docker Swarm does support distributed storage, however Kubernetes' solutions are more rock solid, so just go with Kubernetes. But then again, do you *really* need distributed storage for your smol service? You can go pretty far with a single-instance database! 49 | * You need a feature that isn't supported by Docker Swarm.
50 | * While Docker Swarm isn't deprecated, it is also not actively developed, so if you require a feature that it doesn't support, it may take a loooong time until it is implemented. 51 | * You like to have shiny buzzwords in your CV. 52 | 53 | ## But isn't Docker Swarm... *dead*? 54 | 55 | People talk about Docker Swarm being dead because, back in the day, Docker Inc. had an orchestrator called "Docker Swarm", and *that* got deprecated. Docker Inc., in their infinite marketing wisdom, later created an orchestrator called "Swarm Mode" for Docker, and that's why a lot of people get confused about whether Swarm is dead or not. 56 | 57 | Contrary to popular belief, Swarm Mode is not deprecated! Yeah, sure, sadly there isn't much dev work on Swarm nowadays, but it isn't *dead* or *deprecated*; and after all, if it fits your needs, does it really need to be actively worked on? As long as it works, use it; if you get to a point where Swarm no longer fits your needs, *then* move to Kubernetes! Don't overcomplicate your life right now just because some day, *maybe*, you would need Kubernetes. 58 | 59 | If it does get deprecated, you can migrate off of it by converting your Docker Compose deployment files with [Kompose](https://kompose.io/), so it is not like you are going to end up stuck on a dead platform forever. 60 | 61 | ## Official Tutorial 62 | 63 | Docker has a Swarm Mode tutorial on their website, [check it out](https://docs.docker.com/engine/swarm/swarm-tutorial/)! 64 | 65 | Docker also has a tutorial on how to deploy Docker Compose files (also known as "Stacks") to Docker Swarm, [check it out](https://docs.docker.com/engine/swarm/stack-deploy/)!
66 | 67 | This tutorial is mostly a "I'm writing things as I'm learning", focused more on the Docker Swarm + Docker Compose combo than the official tutorial's Docker Swarm + Docker CLI combo, so there may be things that are incorrect or misleading; still, I think other people may find this tutorial useful! 68 | 69 | ## Install Ubuntu 70 | 71 | I'm using an Ubuntu 22.04 VM for this tutorial. 72 | 73 | ## Install Docker Engine 74 | 75 | https://docs.docker.com/engine/install/ 76 | 77 | ## Start Docker Swarm Mode 78 | 79 | Start your Docker Swarm with `docker swarm init`! 80 | 81 | If you have multiple network interfaces, Docker will ask you to choose what IP to use. You need to choose the IP that can access your other Docker instances! 82 | 83 | By default, Docker will use IPs in the 10.0.0.0/8 range; if your machine's interface is behind NAT and also uses the 10.0.0.0/8 range, it will cause issues when you try to access containers hosted in your Docker Swarm! 84 | 85 | To fix this, initialize your Docker Swarm with the `--default-addr-pool` parameter! Let's suppose we want to use `192.168.128.0/18` for our containers. [[🐳 Learn more]](https://docs.docker.com/engine/swarm/swarm-mode/#configuring-default-address-pools) 86 | 87 | ```bash 88 | sudo docker swarm init --default-addr-pool 192.168.128.0/18 89 | ``` 90 | 91 | > **Warning** 92 | > 93 | > If you change your `default-addr-pool`, check that the IPs aren't being used by another network! `ip a` shows what IP ranges are in use. If you use an IP range that is already being used by something else, your containers won't start, failing with `Pool overlaps with other one on this address space`!
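The `Pool overlaps with other one on this address space` error boils down to a CIDR intersection test: two IPv4 ranges overlap exactly when the shorter prefix contains the other network's address. Here's a rough sketch of that check (my own implementation, not Docker's code):

```kotlin
// Parse "a.b.c.d/len" into (network address as UInt, prefix length).
fun parseCidr(cidr: String): Pair<UInt, Int> {
    val (addr, len) = cidr.split("/")
    val ip = addr.split(".").fold(0u) { acc, octet -> (acc shl 8) or octet.toUInt() }
    return ip to len.toInt()
}

// Two IPv4 ranges overlap iff, under the shorter of the two prefixes,
// both network addresses are the same.
fun cidrsOverlap(a: String, b: String): Boolean {
    val (ipA, lenA) = parseCidr(a)
    val (ipB, lenB) = parseCidr(b)
    val len = minOf(lenA, lenB)
    val mask = if (len == 0) 0u else (0xFFFFFFFFu shl (32 - len))
    return (ipA and mask) == (ipB and mask)
}

fun main() {
    // A NAT network on 10.0.0.0/8 swallows Docker's default pools:
    println(cidrsOverlap("10.0.0.0/8", "10.0.1.0/24"))      // true
    // ...while a custom pool on 192.168.128.0/18 is safe:
    println(cidrsOverlap("10.0.0.0/8", "192.168.128.0/18")) // false
}
```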
94 | 95 | 96 | > **Warning** 97 | > 98 | > If you have multiple network interfaces, Docker will say `Error response from daemon: could not choose an IP address to advertise since this system has multiple addresses on different interfaces (10.29.10.1 on ens18 and 172.29.10.1 on ens19) - specify one with --advertise-addr`. In this case, the `--advertise-addr` parameter should be the IP that *can communicate with other nodes*! So, if `172.29.10.1` is the IP that can access other nodes, then that should be your `--advertise-addr`. 99 | 100 | > **Warning** 101 | > 102 | > If your Docker instance communicates with other Docker instances via VXLAN or any other network whose MTU differs from the default 1500, you need to delete the default ingress network and create a new one! This is needed because Docker doesn't inherit the MTU of your network interface; without this fix, there will be intermittent packet loss when communicating between nodes! [Read more about how to fix it here](docker-mtu.md) 103 | 104 | If you get `Swarm initialized: current node (qgsfyhmhwtpkp7zpo7lts2vhp) is now a manager.`, then your Docker instance is in a swarm, and your node is a manager node, sweet! 105 | 106 | ## Accessing private images hosted on GitHub's `ghcr.io` 107 | 108 | While not related to Docker Swarm, I thought it was nice to talk about this. :3 109 | 110 | ```bash 111 | docker login ghcr.io 112 | ``` 113 | 114 | Use your GitHub username as your Username, and a [Personal Access Token](https://github.com/settings/tokens) as your Password. 115 | 116 | When deploying your stack, add the parameter `--with-registry-auth`! 117 | 118 | ## Logs and your disk space 119 | 120 | Once again, something not related to Docker Swarm, but I thought it was nice to talk about this.
:3 121 | 122 | By default, Docker uses `json-file` as its logging driver; however, [Docker recommends changing the log driver to `local`, because the `json-file` driver is only the default for backwards compatibility](https://docs.docker.com/config/containers/logging/configure/). 123 | 124 | To do this, edit your `/etc/docker/daemon.json` and insert the following contents; if the file already exists, add the `log-driver` field to the existing JSON. 125 | ```json 126 | { 127 | "log-driver": "local" 128 | } 129 | ``` 130 | 131 | ## Hosting your first service 132 | 133 | Did you know that Docker Swarm uses `docker-compose.yml` files??? Crazy huh? 134 | 135 | ```yaml 136 | version: '3.9' 137 | services: 138 | helloworld: 139 | # This will set the hostname to helloworld-ReplicaID 140 | hostname: "helloworld-{{.Task.Slot}}" 141 | # The image, we will use a helloworld http image 142 | image: strm/helloworld-http 143 | # We will expose the service at port 8080 on the host 144 | ports: 145 | - "8080:80" 146 | # Docker Swarm deployment configurations! 147 | deploy: 148 | # We want to replicate our service... 149 | mode: replicated 150 | # And it will have two instances of the container! 151 | replicas: 2 152 | ``` 153 | 154 | Now let's deploy our service with `docker stack deploy --compose-file docker-compose.yml stackdemo`! 155 | 156 | If everything goes well, you will be able to see your stack in the `docker stack ls` command!
157 | ```bash 158 | mrpowergamerbr@docker-swarm-test:~$ sudo docker stack ls 159 | NAME SERVICES ORCHESTRATOR 160 | stackdemo 1 Swarm 161 | ``` 162 | 163 | And you can also see the service status with `docker stack services stackdemo` 164 | ```bash 165 | mrpowergamerbr@docker-swarm-test:~$ sudo docker stack services stackdemo 166 | ID NAME MODE REPLICAS IMAGE PORTS 167 | totx5zzra290 stackdemo_helloworld replicated 2/2 strm/helloworld-http:latest *:8080->80/tcp 168 | ``` 169 | 170 | You can also view events about the stack with `docker stack ps stackdemo` 171 | ```bash 172 | swarm@docker-swarm-manager-1:~$ sudo docker stack ps powercms 173 | ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS 174 | ztf65gnt1fjc powercms_powercms.1 ghcr.io/mrpowergamerbr/powercms@sha256:41e5cf391194792b4ddff8d0a5561122213cb793ada75edc1e4e6a5ba9b90e16 docker-swarm-manager-1 Ready Rejected 4 seconds ago "Pool overlaps with other one …" 175 | 9oiwb29bc1eu \_ powercms_powercms.1 ghcr.io/mrpowergamerbr/powercms@sha256:41e5cf391194792b4ddff8d0a5561122213cb793ada75edc1e4e6a5ba9b90e16 docker-swarm-manager-1 Shutdown Rejected 9 seconds ago "Pool overlaps with other one …" 176 | ezif348rn177 \_ powercms_powercms.1 ghcr.io/mrpowergamerbr/powercms@sha256:41e5cf391194792b4ddff8d0a5561122213cb793ada75edc1e4e6a5ba9b90e16 docker-swarm-worker-1 Shutdown Rejected 14 seconds ago "Pool overlaps with other one …" 177 | xszs8mbxrzdh \_ powercms_powercms.1 ghcr.io/mrpowergamerbr/powercms@sha256:41e5cf391194792b4ddff8d0a5561122213cb793ada75edc1e4e6a5ba9b90e16 docker-swarm-worker-1 Shutdown Rejected 19 seconds ago "Pool overlaps with other one …" 178 | nnq6i1ybb4v4 \_ powercms_powercms.1 ghcr.io/mrpowergamerbr/powercms@sha256:41e5cf391194792b4ddff8d0a5561122213cb793ada75edc1e4e6a5ba9b90e16 docker-swarm-manager-1 Shutdown Rejected 24 seconds ago "Pool overlaps with other one …" 179 | ``` 180 | 181 | Let's also check the containers running on our Docker instance... 
182 | 183 | ```bash 184 | mrpowergamerbr@docker-swarm-test:~$ sudo docker ps 185 | CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 186 | 25088b353de7 strm/helloworld-http:latest "/main.sh" 41 minutes ago Up 41 minutes 80/tcp stackdemo_front.2.wvzs76sk9iq05xozgz4x9cdqe 187 | 18588ab9c07c strm/helloworld-http:latest "/main.sh" 41 minutes ago Up 41 minutes 80/tcp stackdemo_front.1.nmreutrz6q4l57rolejuco6xa 188 | ``` 189 | 190 | > **Warning** 191 | > 192 | > `docker ps` only shows the containers running on the current Docker instance! Don't forget about this if you include multiple Docker instances on the swarm! 193 | 194 | ## Accessing services 195 | 196 | Now let's access the service with `curl`! 197 | ```bash 198 | mrpowergamerbr@docker-swarm-test:~$ curl 127.0.0.1:8080 199 | HTTP Hello World

Hello from helloworld-2
```

Each response is a small HTML page titled "HTTP Hello World". Run `curl` a few more times and the replies alternate between `helloworld-1` and `helloworld-2`: Swarm's routing mesh load balances each request across the service's replicas.
> **Warning** 243 | > 244 | > If you had a service, removed it from the Compose file, and used `stack deploy`, Docker will *not* remove the already running service! To remove a service, use `sudo docker service rm servicename`; you can see all of your stack's running services with `sudo docker stack services stackdemo`! 245 | 246 | ## Rolling Updates and Health Checks 247 | Here's our Web Server. It is very simple, and it takes a bit of time to be ready, because it needs to set up database connections and other thingamajigs. 248 | 249 | ```kotlin 250 | fun main() { 251 | println("Setting up Web Server...") 252 | 253 | // Imagine that this is initializing db connections and stuff like that 254 | Thread.sleep(15_000) 255 | 256 | println("Finished setting up Web Server!") 257 | 258 | println("Starting Web Server...") 259 | 260 | val server = embeddedServer(Netty) { 261 | routing { 262 | get("/") { 263 | call.respondText("Hello World!") 264 | } 265 | } 266 | } 267 | 268 | server.start(true) 269 | 270 | println("Finished Starting Web Server!") 271 | } 272 | ``` 273 | 274 | And then we have our Docker Compose file with this configuration. 275 | 276 | ```yml 277 | version: '3.9' 278 | services: 279 | slowstartwebserver: 280 | image: sha256:7ffe31bc8eb30ba35b98b25a13bd748a7b5cd284826100656c2c6ffcfe9630d1 281 | ports: 282 | - "33333:80" 283 | ``` 284 | 285 | Let's check that it works... 286 | 287 | ```bash 288 | mrpowergamerbr@PhoenixWhistler:/mnt/c/Windows/system32$ curl 127.0.0.1:33333 289 | Hello World! 290 | ``` 291 | 292 | Now, we have a new version of our web server. Because sending `Hello World!` is boring, we changed it to `Hello World! Loritta is so cute!! :3`!
293 | 294 | ```yml 295 | version: '3.9' 296 | services: 297 | slowstartwebserver: 298 | image: sha256:1c00b295e356ad17156d2c12e856132c4bbde008c862ab1e2d449afadbfda36b 299 | ports: 300 | - "33333:80" 301 | ``` 302 | 303 | However, after deploying the stack update, you will notice that the old version will be shut down first, then the new version will be deployed. 304 | 305 | This can be changed by changing `update_config`'s `order` option! 306 | * `stop-first`: old task is stopped before starting new one (default) 307 | * `start-first`: new task is started first, and the running tasks briefly overlap 308 | 309 | ```yml 310 | version: '3.9' 311 | services: 312 | slowstartwebserver: 313 | image: sha256:1c00b295e356ad17156d2c12e856132c4bbde008c862ab1e2d449afadbfda36b 314 | ports: 315 | - "33333:80" 316 | deploy: 317 | update_config: 318 | order: start-first 319 | ``` 320 | 321 | ```bash 322 | mrpowergamerbr@PhoenixWhistler:/mnt/c/Windows/system32$ docker ps 323 | CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 324 | c7acffd6aac7 7ffe31bc8eb3 "java -cp @/app/jib-…" 2 seconds ago Up Less than a second slow-start-web-server-stack_slowstartwebserver.1.imx9ubf9sa4nbl92cc657fq1x 325 | 1f94ead29bb0 1c00b295e356 "java -cp @/app/jib-…" 3 minutes ago Up 3 minutes slow-start-web-server-stack_slowstartwebserver.1.gewj7xl809tlzo0mvnkkuiy4s 326 | ``` 327 | 328 | Now, when applying updates, both instances will briefly overlap. If you have used Kubernetes before, you will notice that this is how Kubernetes also works. 329 | 330 | But wait, what about the web server's slow start? The previous instance will be shut down before our new instance is ready to serve new connections! 331 | 332 | That's where Docker's HEALTHCHECK comes in! 
333 | 334 | ```yml 335 | version: '3.9' 336 | services: 337 | slowstartwebserver: 338 | image: sha256:7ffe31bc8eb30ba35b98b25a13bd748a7b5cd284826100656c2c6ffcfe9630d1 339 | ports: 340 | - "33333:80" 341 | deploy: 342 | update_config: 343 | order: start-first 344 | healthcheck: 345 | # What command will be used to check if the system is healthy or not 346 | # Keep in mind that the command will be executed within your container! So, if you are using curl to check if your service is alive, 347 | # you need to be sure that curl is also installed on your container! 348 | test: ["CMD", "curl", "-f", "http://localhost"] 349 | interval: 5s # The check interval 350 | timeout: 10s # How much time the healthcheck process will wait before timing out 351 | retries: 3 # How many times the healthcheck will retry before considering the service as unhealthy 352 | start_period: 15s # Healthcheck initial delay 353 | ``` 354 | 355 | Now, the previous instance won't be shut down until the new instance is healthy! 356 | 357 | ```bash 358 | PS C:\Users\Leonardo\Documents\Docker> docker ps 359 | CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 360 | 53f4b4ec536c 7ffe31bc8eb3 "java -cp @/app/jib-…" 11 seconds ago Up 9 seconds (health: starting) slow-start-web-server-stack_slowstartwebserver.1.om1wcqjibzylpsnhmnn46twf2 361 | c7acffd6aac7 7ffe31bc8eb3 "java -cp @/app/jib-…" 2 minutes ago Up 2 minutes slow-start-web-server-stack_slowstartwebserver.1.imx9ubf9sa4nbl92cc657fq1x 362 | ``` 363 | 364 | But will Docker Swarm try to forward requests to the non-healthy instance while it is booting up? 365 | 366 | ```bash 367 | mrpowergamerbr@PhoenixWhistler:/mnt/c/Windows/system32$ while sleep 1; do curl 127.0.0.1:33333; echo ''; done 368 | Hello World! 369 | Hello World! 370 | Hello World! 371 | Hello World! 372 | Hello World! 373 | Hello World! 374 | Hello World! 375 | Hello World! 376 | Hello World! 377 | Hello World! 378 | Hello World! 379 | Hello World! 380 | Hello World! 381 | Hello World! 
382 | Hello World! 383 | Hello World! 384 | Hello World! 385 | Hello World! 386 | Hello World! 387 | Hello World! 388 | Hello World! 389 | Hello World! 390 | Hello World! 391 | Hello World! 392 | Hello World! Loritta is so cute!! :3 393 | Hello World! Loritta is so cute!! :3 394 | Hello World! Loritta is so cute!! :3 395 | Hello World! Loritta is so cute!! :3 396 | ``` 397 | 398 | Thankfully, Docker Swarm only forwards traffic after the instance is healthy and ready to rock! 399 | 400 | ## Application Configurations 401 | ```bash 402 | mrpowergamerbr@docker-swarm-test:~$ echo "Loritta is so cute! :3" > loritta.txt 403 | ``` 404 | 405 | ```yaml 406 | version: '3.9' 407 | services: 408 | helloworld: 409 | hostname: "helloworld-{{.Task.Slot}}" 410 | image: strm/helloworld-http 411 | ports: 412 | - "8080:80" 413 | deploy: 414 | mode: replicated 415 | replicas: 2 416 | # The configuration! 417 | configs: 418 | - source: my_first_config # The source should match the config name in the "configs" section 419 | target: /loritta_cute.txt # Where the config should be mounted 420 | 421 | configs: 422 | my_first_config: 423 | file: ./loritta.txt 424 | ``` 425 | 426 | `docker stack deploy --compose-file docker-compose.yml stackdemo` 427 | 428 | Don't worry if two applications have a config named `my_first_config`! Docker prefixes the service name before the config name. 429 | 430 | ```bash 431 | mrpowergamerbr@docker-swarm-test:~$ sudo docker exec -it 66726bffdd6d /bin/bash 432 | root@helloworld-2:/www# cat /loritta_cute.txt 433 | Loritta is so cute! :3 434 | ``` 435 | 436 | > **Warning** 437 | > 438 | > If you update your configuration file, the `docker stack deploy --compose-file docker-compose.yml stackdemo` command will fail! This is because there is another container that is already using the configuration file. To workaround this issue, change the configuration name (`my_first_config`) after changing anything in the configuration file! 
>
> **Idea:** Suffix the file's hash to the configuration name (that's what Kubernetes' Kustomize does), and then create a script that periodically tries to delete all configs from your Docker Swarm cluster; configs that are in use won't be deleted by Docker. :P

## Creating variations of your Docker Compose file (Kustomize-like patches)

While you can merge Compose files with `docker compose -f docker-compose.stats-collector.yml -f patch.yml config`, it doesn't work *that* great, because Docker mangles the file so much that `docker stack deploy` rejects it with "nuh huh, that ain't a Docker Compose file my dawg!" (example: it removes the `version` field from the Compose file, so Docker Swarm thinks it isn't a Docker Compose v3 file).

So you can use a tool like [yq](https://github.com/mikefarah/yq) to merge multiple YAML files, and then deploy the merged file.

## Limiting Resources for the Scheduler

Just like Kubernetes, you can also set resource requests and limits for your application!

If you have used Kubernetes before...
* k8s' `limits` = Swarm's `limits`
* k8s' `requests` = Swarm's `reservations`

* `reservations`: when scheduling a container, Swarm MUST guarantee that the container can allocate at least the configured amount.
  * If your container is configured with `reservations.memory: 4G` and none of your Swarm nodes have 4GB+ of RAM available, the task won't be scheduled due to `insufficient resources on X nodes`!
* `limits`: when scheduling a container, Swarm MUST prevent the container from allocating more than the configured amount.
  * If your container is configured with `limits.memory: 4G` and your container uses more than 4GB of RAM, Swarm will terminate your container automatically!

You can set both, only one of them, or none! The same goes for the individual resources specified within each section.
[[🐳 Learn more]](https://docs.docker.com/compose/compose-file/deploy/#resources)

```yaml
version: '3.9'
services:
  helloworld:
    hostname: "helloworld-{{.Task.Slot}}"
    image: strm/helloworld-http
    ports:
      - "8080:80"
    deploy:
      mode: replicated
      replicas: 2
      resources:
        limits:
          cpus: '0.50'
          memory: 50M
        reservations:
          cpus: '0.25'
          memory: 20M
```

If you are running apps on the JVM (Java Virtual Machine), check out our [Java Resources Recommendations guide](jvm-resources-recommendations.md) to help you understand how setting `reservations` and `limits` affects how the JVM allocates memory and behaves!

## Deleting your stack

You can delete the stack with `docker stack rm stackdemo`; this deletes the stack + its services + all containers associated with it!

```bash
mrpowergamerbr@docker-swarm-test:~$ sudo docker stack rm stackdemo
Removing service stackdemo_helloworld
Removing network stackdemo_default
```

## Adding more nodes to your Swarm

Having only one Docker instance in your swarm is kinda pointless, the fun really begins after we add other Docker instances to our swarm! This way, replicas can be hosted on different nodes, and this is all handled by Docker Swarm.

On your manager node, run:

`sudo docker swarm join-token worker`

Copy the printed command and execute it on the new VM!

`docker swarm join --token SWMTKN-1-2s4fg0ctnjcgy2hei784kz1c3lmc15axeozlfgs08obm6cajzj-7ycl00ssan3glz3309ac79kl4 10.0.12.10:2377`

If you get `This node joined a swarm as a worker.`, your Docker instance is now part of the Swarm!

On your manager node, use `docker node ls` to view all nodes in the swarm.
```
swarm@docker-swarm-manager-1:/etc/netplan$ sudo docker node ls
ID                            HOSTNAME                 STATUS    AVAILABILITY   MANAGER STATUS   ENGINE VERSION
qgsfyhmhwtpkp7zpo7lts2vhp *   docker-swarm-manager-1   Ready     Active         Leader           20.10.17
d0o8mcxcjb4fwk8jxqxym0kcc     docker-swarm-worker-1    Ready     Active                          20.10.17
```

## Replicating in Multiple Nodes

TODO

## Load Balancing between your Docker Swarm nodes with nginx

You can access your service at `YourMachineIP:8080`, but what if I told you that there is *another way* to access your service?

You can access the service at *any* node's IP! Not just your machine's IP, and it can even be a machine that *isn't* hosting your service at the moment! As long as the machine is connected to your swarm, you are able to access the service through it (this is Docker Swarm's routing mesh at work).

If you have ever used k3s' load balancer "Klipper", it works in the exact same way!

With this, we can load balance our service with nginx!

```conf
upstream powercms_backend {
    server 172.29.10.1:8080;
    server 172.29.11.1:8080;
}

server {
    listen 443 ssl;
    server_name mrpowergamerbr.com;

    include mrpowergamerbr_ssl.conf;

    location / {
        proxy_pass http://powercms_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```

## GitOps fun

https://betterprogramming.pub/docker-tips-access-the-docker-daemon-via-ssh-97cd6b44a53

* Generate a public/private key pair with `ssh-keygen`
* Log in to your Docker instance
* Append the public key (`cat ~/.ssh/id_rsa.pub`) to root's authorized keys on your Docker instance: `sudo nano /root/.ssh/authorized_keys`
* `docker -H ssh://root@127.0.0.1:2222 ps` or `export DOCKER_HOST=ssh://root@127.0.0.1:2222`
* yay!

bye bye~

If you are deploying a service that uses an image hosted in a private repository, you need to `docker login` on the machine that triggers the deploy!

TODO

## Cleaning up unused files

Docker doesn't clean up unused files by default, so it may be a good idea to set up a cron job that automatically runs `docker system prune` on your nodes, to avoid running out of disk space.

[[🐳 Learn more]](https://docs.docker.com/config/pruning/)

## If your node is not queueing new containers, or if you have containers that show up in `docker ps` but when trying to view their log, Docker says that the container doesn't exist

If this is happening, your node's disk is probably full! Clean it up and it will magically fix itself :3

## `exec` bug: OCI runtime exec failed: exec failed: unable to start container process: open /dev/pts/0: operation not permitted: unknown

If you are getting `OCI runtime exec failed: exec failed: unable to start container process: open /dev/pts/0: operation not permitted: unknown` when trying to `exec` into a container, downgrade your `containerd.io` package to version 1.6.6 until version 1.6.8 is released. https://askubuntu.com/questions/1424317/docker-20-10-ubuntu-22-04-oci-runtime-exec-failed

## Conclusion

TODO