├── .gitmodules
├── Docker-migration.md
├── Dockerfile
└── README.md


/.gitmodules:
--------------------------------------------------------------------------------
[submodule "konk"]
	path = konk
	url = https://github.com/planetA/konk.git
	branch = migros-atc-2021
[submodule "criu"]
	path = criu
	url = https://github.com/planetA/criu.git
	branch = migros-atc-2021
[submodule "rdma-core"]
	path = rdma-core
	url = https://github.com/planetA/rdma-core.git
	branch = migros-atc-2021
[submodule "perftest"]
	path = perftest
	url = https://github.com/planetA/perftest.git
	branch = migros-atc-2021
[submodule "rxe-fix-races"]
	path = rxe-fix-races
	url = https://github.com/planetA/linux.git
	branch = mplaneta/rxe-fix-races-rebase
[submodule "linux-dump"]
	path = linux-dump
	url = https://github.com/planetA/linux.git
	branch = migros-atc-2021/dump
[submodule "rdma-rxe-workaround"]
	path = rdma-core-workaround
	url = https://github.com/planetA/rdma-core.git
	branch = migros-atc-2021-workaround

--------------------------------------------------------------------------------
/Docker-migration.md:
--------------------------------------------------------------------------------
# Container migration with Docker

To test container migration without CR-X, we provide instructions for using
Docker. You need two nodes connected by a network, and SoftRoCE must be
configured on both nodes. We assume that the SoftRoCE device is `rxe0` with
minor number `192` (a limitation of the PoC, not a fundamental limitation).

1. Create the image

Build the image from the provided Dockerfile:

```
docker build -t docker-repo/perftest .
```

One way to distribute the image to both nodes is to push it to Docker Hub:

```
docker push docker-repo/perftest:latest
```

We do not describe how to register on Docker Hub.

2. Configure Docker

This step applies to both nodes. We do not describe the Docker installation
procedure.

To enable checkpoint/restart in Docker, create the file
`/etc/docker/daemon.json` with the following contents:

```
{
  "experimental": true
}
```

Then restart the Docker daemon on both nodes.

3. Configure swarm

The containers communicate over an overlay network that spans both nodes.
One way to set it up is Docker swarm mode. An overlay network can only be
created on a swarm manager, so initialize the swarm first. Run the following
command on the node designated as manager:

```
docker swarm init
```

This command prints a command to run on the worker nodes:

```
docker swarm join --token SWMTKN-<...> :
```

Run the printed command on the other node.

Then create the overlay network on the manager node:

```
docker network create -d overlay --attachable --subnet 10.0.1.0/24 rdma-net
```

4. Start the server side

The application is a bidirectional bandwidth benchmark that consists of a
server and a client. The server must be started before the client.

For our setup, we create two containers: `cont-static` and `cont-moving`. We
run the server in `cont-moving` and the client in `cont-static`.

The following script runs the server.

```
#!/bin/bash

# Remove any previous instance of the server container.
docker rm --force cont-moving

# Create (but do not yet start) the server container on the overlay network.
docker create --name cont-moving --network rdma-net --ip 10.0.1.7 \
    --security-opt seccomp:unconfined --ulimit memlock=1073741824 \
    --cap-add=ALL --memory=1g --kernel-memory=1G --device /dev/infiniband/ \
    docker-repo/perftest:latest ib_send_bw -d rxe0 -n 100000 -b

# Set the initial QP and MR numbers used by the SoftRoCE driver on this node.
echo 64 > /proc/sys/net/rdma_rxe/last_qpn
echo 64 > /proc/sys/net/rdma_rxe/last_mrn
docker start cont-moving
```

This script first removes any existing instance of `cont-moving`.

The second command creates the container from the image
`docker-repo/perftest`. The image name must be the same as the one used in
the first step of these instructions. The command specifies the overlay
network name, the resource limits, access to the InfiniBand devices, and the
security privileges.

The IP address must be specified explicitly for the migration experiment.

The next two commands set the initial numbers for the QP and MR ids. These
commands are also important for the migration experiment: the client and the
server must use different initial numbers to avoid conflicts between the ids
(see the paper for details).

Finally, we start the container.

We use a bidirectional test because we need to make sure that the QPs on both
ends are in the RTS state. It is a limitation of our PoC that we do not send
a resume message from the RTR state. This is not a fundamental limitation.

5. Start the client side

The client side uses parameters very similar to those of the server side.

```
#!/bin/bash

# Remove leftover containers from previous runs on this node.
docker rm --force cont-moving
docker rm --force cont-static

# Use initial QP and MR numbers that differ from those of the server node.
echo 16 > /proc/sys/net/rdma_rxe/last_qpn
echo 16 > /proc/sys/net/rdma_rxe/last_mrn
docker run --rm --name cont-static --network rdma-net --ip 10.0.1.3 \
    --security-opt seccomp:unconfined --ulimit memlock=1073741824 \
    --cap-add=ALL --memory=1g --kernel-memory=1G --device /dev/infiniband/ \
    docker-repo/perftest:latest ib_send_bw -d rxe0 -n 100000 10.0.1.7 -b
```

The main differences from the server-side command are the container's own IP
address and the additional argument that gives the IP address of the server.
The client also uses different initial QP and MR numbers (16 instead of 64).

The client and the server run on different nodes.

If the client and the server are left to run, the benchmark should finish
normally.

6. Migrate the server

This experiment shows how to migrate the server.

The following script must run on the client node. It refers to the node that
currently hosts the server as `server-node`.

```
#!/bin/bash

# Remove an old local instance of the server container.
docker rm cont-moving

# Checkpoint the running server container on the remote node.
ssh server-node docker checkpoint create cont-moving ckpt1

# Create the destination container locally, with the same name and IP address.
docker create --name cont-moving --network rdma-net --ip 10.0.1.7 \
    --security-opt seccomp:unconfined --ulimit memlock=1073741824 \
    --cap-add=ALL --memory=1g --kernel-memory=1G --device /dev/infiniband/ \
    docker-repo/perftest:latest ib_send_bw -d rxe0 -n 100000

SRC_ID=$(ssh server-node docker inspect --format="{{.Id}}" cont-moving)
DEST_ID=$(docker inspect --format="{{.Id}}" cont-moving)
CONTAINERS=/var/lib/docker/containers

echo 64 > /proc/sys/net/rdma_rxe/last_qpn
echo 64 > /proc/sys/net/rdma_rxe/last_mrn
# Copy the checkpoint into the checkpoint directory of the local container.
scp -r "server-node:$CONTAINERS/$SRC_ID/checkpoints/ckpt1" \
    "$CONTAINERS/$DEST_ID/checkpoints/"

# Remove the source container so that its IP address is free.
ssh server-node docker rm cont-moving

# Restore the server on this node from the copied checkpoint.
docker start --checkpoint ckpt1 cont-moving
```

First, we remove an old instance of the server container.

Next, we ask the Docker daemon on the server node to create a checkpoint of
the running container.

Then, we create the container for the server on the local node.

Now we copy the checkpoint from the remote node to the local node, into the
checkpoint directory of the newly created container. Before copying, the
script sets the initial QP and MR numbers on the local node to the values
used when the server was first started.

Before restarting the container on the local node, we remove the container on
the remote node to avoid an IP address collision.

Finally, we restart the server container on the local node from the
checkpoint.

If everything goes well, the benchmark finishes after several seconds.

--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------
FROM debian:testing

# ADD unpacks the local tarballs into /rdma-core and /perftest.
ADD rdma-core.tar.gz /
ADD perftest.tar.gz /

# Install the build dependencies, build and install rdma-core and perftest,
# then remove the build tools and the sources to keep the image small.
RUN apt-get update && \
    apt-get install -f -y build-essential cmake gcc libudev-dev libnl-3-dev \
        libnl-route-3-dev pkg-config cython3 \
        autoconf libtool-bin git pandoc python-docutils gfortran libgfortran5 && \
    useradd -m user && \
    cd /rdma-core && mkdir build && cd build && \
    cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/usr .. && \
    make -j $(nproc) && make install && \
    cd /perftest && ./autogen.sh && ./configure && make && make install && \
    apt-get purge -y autoconf libtool-bin git pandoc python-docutils \
        cmake gcc pkg-config && \
    apt-get autoremove -y && apt-get clean -y && apt-get autoclean -y && \
    rm -rf /perftest /rdma-core && \
    echo 'user:user' | chpasswd

USER user

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# migros-atc-2021

Repository linking to the software artifacts used for the MigrOS ATC 2021 paper.

# Contents

The code base consists of the following parts:

1. rxe-fix-races

Git module rxe-fix-races contains the sources of the Linux kernel with a fixed SoftRoCE driver.

2. linux-dump

Git module linux-dump contains the sources of the Linux kernel with a migratable SoftRoCE driver.

3. CRIU source

Git module criu contains the sources of the ibverbs-enabled CRIU.

4. RDMA-core (workaround)

Git module rdma-core-workaround contains the RDMA-core repository with a small
workaround that makes libibverbs try the rxe user-level device driver first.
We used this version inside the container because, in certain situations,
libibverbs was not able to use the SoftRoCE device driver for the SoftRoCE
device. Instead, it picked the mlx4 user-level device driver.

5. RDMA-core (host)

Git module rdma-core contains the RDMA-core repository enabled for
checkpointing/restarting of libibverbs objects. This repository is to be used
by CRIU.

6. Perftest tools

Git module perftest contains the modified version of the perftest benchmark.

7. konk

Our container runtime.

8. Docker

The Dockerfile is used to build the Docker image for testing container
migration. We reuse the same Docker image for our container runtime after
converting it into an OCI-compatible archive.

Docker-migration.md describes a workflow for running live migration with Docker.
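
Each Git module listed above has a branch recorded in `.gitmodules`. As a minimal sketch for checking those entries from a checkout, the helper below lists the recorded branch for every submodule. The function name `list_submodule_branches` is our own illustration and is not part of the artifact; the sketch assumes only that `git` is installed and uses `git config -f`, which can read any file in Git config syntax.

```shell
# Hypothetical helper (not part of the artifact): print the branch that a
# .gitmodules file records for each submodule, one "<key> <branch>" per line.
list_submodule_branches() {
    # Default to the .gitmodules file in the current directory.
    local file="${1:-.gitmodules}"
    # .gitmodules uses the Git config syntax, so `git config -f` can parse it.
    git config -f "$file" --get-regexp '^submodule\..*\.branch$'
}
```

Called as `list_submodule_branches` in the root of the repository, this prints one line per submodule; for the `konk` entry of the `.gitmodules` file shown earlier, the line is `submodule.konk.branch migros-atc-2021`.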