├── LICENSE
├── README.md
└── docker-sync


/LICENSE:
--------------------------------------------------------------------------------
The MIT License (MIT)

Copyright (c) 2015 David Darias

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Docker Sync

A script to synchronize docker images between hosts with emphasis on reducing the amount of data transferred. It is a pure python3 script, does **not** depend on the docker registry and has **no dependencies** on the remote server you are synchronizing with.

## Typical use case

You are part of a team of 4 programmers working on a code base that runs on a docker image.
Let's assume it is a `Django` image that has the following Dockerfile:

    FROM python:3.4-slim

    RUN apt-get update && apt-get install -y \
            gcc \
            mysql-client libmysqlclient-dev \
            postgresql-client libpq-dev \
            sqlite3 \
        --no-install-recommends && rm -rf /var/lib/apt/lists/*

    ENV DJANGO_VERSION 1.8.6

    RUN pip install mysqlclient psycopg2 django=="$DJANGO_VERSION"

Everything works fine until you need a new package installed on the image. Now you have to add the package to the first `RUN` command, rebuild the image and push it to the registry for everyone to download.

These two `RUN` commands account for 240 MB of the image and both layers are rebuilt, effectively making your whole team download these **240 MB** again from the registry every time a new dependency is added to the image.

With the command `docker-sync user@somewebserver.com django:latest` you would only transfer (with rsync) the compressed difference between the two images. In this case, if the package were for example `gettext`, the whole update would be only around **1 MB**, with no need to push to or pull from any docker registry.

## Installation

    curl -L https://github.com/dvddarias/docker-sync/raw/master/docker-sync > /usr/local/bin/docker-sync
    chmod +x /usr/local/bin/docker-sync

## Usage

Let's assume you want to synchronize your local machine docker images with the ones on your `myamazingweb.com` server. You would run:

    >> python3 docker-sync user@myamazingweb.com

The output is something like:

    Connecting to user@myamazingweb.com.
    Listing user@myamazingweb.com images...............................DONE
    Listing local images........................................DONE
    --> Report:
    current: ['squid:latest', 'ubuntu:latest', 'debian:8.0']
    need update: ['redis:latest']
    new: ['mongodb:latest', 'gogs:latest']

Nothing was synchronized yet, this is just to see the status of all the images:

* **Current** means the images have the exact same ID.
* **Need Update** means that the two images have the same name:tag but different IDs.
* **New** means you don't have a local image with the listed tag.

To sync the redis image run:

    >> python3 docker-sync user@myamazingweb.com redis:latest

Please run the script with `--help` for further details.

## How does it work

When synchronizing two images it connects via ssh to the remote server and:
1. Dumps both images (local & remote) to the hard drive (`docker save`).
2. Runs `rsync` over the ssh connection to synchronize the content of both files.
3. Loads the new image (`docker load`).
4. Removes the image dumps on both computers.

Note: When synchronizing an image on the **New** list the script will try to look for the highest common parent on your local machine to run the `rsync` command, so it pays off **a lot** to use images with the same base image. If no common parent is found the whole image is compressed and transferred.

## Dependencies & Configuration

You **need** python > 3.4 on your local machine, and both the user running the script and the user on the remote host must have permissions to run `docker` commands. It uses `ssh` to connect to the host so you should also have the appropriate permissions. There are **no dependencies on the remote host** other than bash, rsync & ssh, which are already installed on most linux distributions.
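
As a rough illustration of the four steps listed under "How does it work", the sketch below builds the sequence of commands for pulling one image that already exists locally in an older version. It is a simplified sketch, not the script's actual code: the function name `plan_pull` and the dump file naming are hypothetical, and nothing is executed. The commands are only returned as argument lists so they can be inspected first.

```python
def plan_pull(remote, image, tmp_dir="/tmp"):
    """Return the commands (as argv lists) for pulling `image` from `remote`.

    Hypothetical helper for illustration only; it does not run anything.
    """
    # Same path on both machines, so the local dump acts as rsync's basis file.
    safe_name = image.replace(":", "_").replace("/", "_")
    dump = "%s/%s_sync_temp.tar" % (tmp_dir, safe_name)

    return [
        # 1. Dump both images: the outdated local one and the current remote one.
        ["docker", "save", "-o", dump, image],
        ["ssh", remote, "docker", "save", "-o", dump, image],
        # 2. rsync only sends the parts of the remote dump that differ from
        #    the local dump (-z compresses the data in transit).
        ["rsync", "-z", "--progress", "%s:%s" % (remote, dump), dump],
        # 3. Load the now up-to-date dump into the local docker daemon.
        ["docker", "load", "-i", dump],
        # 4. Remove the dumps on both computers.
        ["rm", dump],
        ["ssh", remote, "rm", dump],
    ]


if __name__ == "__main__":
    for argv in plan_pull("user@myamazingweb.com", "redis:latest"):
        print(" ".join(argv))
```

Because both dumps share the same path, `rsync` can reuse the local file as the starting point for its delta transfer, which is where the bandwidth saving described above comes from.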


--------------------------------------------------------------------------------
/docker-sync:
--------------------------------------------------------------------------------
#!/usr/bin/env python3
import subprocess
import argparse
import time
import sys
import os

def main():
    args = parse_args()

    just_size = 60
    current_time = str(time.time()).replace(".","")
    socket_path = os.path.join(args.folder,"docker_sync_ssh_socket_%s"%current_time)
    ssh = ["ssh", args.remote, "-o", "ControlPath=%s"%socket_path]

    print("Connecting to %s."%args.remote)
    master, output = create_master_connection(ssh)
    if output is not None:
        print("Ssh connection failed.")
        sys.exit(1)

    if args.push:
        source=None
        dest=ssh
    else:
        source=ssh
        dest=None

    print("Listing source images".ljust(just_size,"."), end="", flush=True)
    source_images = get_source_images(source)
    check(source_images)

    print("Listing dest images".ljust(just_size,"."), end="", flush=True)
    dest_images = get_dest_images(dest)
    check(dest_images)

    new, need_update, current = partition_images(source=source_images, dest=dest_images)

    print("\n--> Report:\n")
    print("current:")
    print_list(current)
    print("need update:")
    print_list(need_update)
    print("new:")
    print_list(new)

    #end the program if only asked for the report
    if args.report or (not args.all and not args.update and not args.new and len(args.images)==0) : return 0

    if len(new)+len(need_update)==0:
        print("\nEverything is up to date.")
        sys.exit(0)

    update_set = set(args.images)
    if args.all:
        update_set = set(need_update).union(new)
    else:
        if args.update:
            update_set = update_set.union(set(need_update))
        if args.new:
            update_set = update_set.union(set(new))

    for name_tag in \
update_set:
        if len(args.images)==0 or name_tag in args.images:

            is_new = name_tag in new
            file_name = args.folder + "/%s_%s_sync_temp.tar.gz"%(name_tag.replace(":","_").replace("/","_").replace("\\","_"), current_time )

            if is_new:
                print("\n--> Adding '%s':\n"%name_tag)

                print("Getting source image ancestors".ljust(just_size,"."), end="", flush=True)
                source_family = get_image_parents(name_tag, source)
                check(source_family)

                print("Searching all dest images".ljust(just_size,"."), end="", flush=True)
                all_dest = get_all_images(dest)
                check(all_dest)

                common_parent = get_common_parent(source_family,all_dest)
                if common_parent:
                    print("Found common parent: %s"%common_parent[:20])
                    print("Dumping dest parent".ljust(just_size,"."), end="", flush=True)
                    # the common parent lives on the dest side, so dump it there
                    # to give rsync a basis file (dest is local unless --push)
                    _, output = execute("docker save -o %s %s"%(file_name, common_parent), ssh=dest)
                    check(output)
                else:
                    print("Unable to find a common parent. Importing full image.")
            else:
                print("\n--> Updating '%s':\n"%name_tag)
                print("Dumping dest image".ljust(just_size,"."), end="", flush=True)
                # dump the outdated image on the dest side (not on the source)
                _, output = execute("docker save -o %s %s"%(file_name, name_tag), ssh=dest)
                check(output)

            print("Dumping source image".ljust(just_size,"."), end="", flush=True)
            _, output = execute(["docker", "save", "-o", file_name, name_tag], ssh=source)
            check(output)

            print("Starting rsync connection.",end='\n\n')
            if args.push:
                files = [file_name, "%s:%s"%(args.remote,file_name)]
            else:
                files = ["%s:%s"%(args.remote,file_name), file_name]
            _, output = execute(["rsync", "-e", "ssh -o \"ControlPath=%s\""%socket_path, "-vhz", "--progress"] + files, ssh=None, print_output=True)
            check(output)

            print("Removing source temporary file".ljust(just_size,"."), end="", flush=True)
            _, output = execute(["rm", file_name], ssh=source)
            check(output)

            print("Loading new image".ljust(just_size,"."), end="", flush=True)
            _, output = execute("docker load -i %s"%(file_name), ssh=dest)
            check(output)

            print("Removing dest temporary file".ljust(just_size,"."), end="", flush=True)
            _, output = execute("rm %s"%file_name, ssh=dest)
            check(output)

    master.stdin.close()
    master.terminate()

    return

def print_list(list, indent=2):
    print(' '*indent + '[')
    for l in list:
        print(' '*indent*2 + str(l))
    print(' '*indent + ']')

def create_master_connection(ssh):
    message ="__ds-connected__"
    p = subprocess.Popen(ssh + ["-o", "ControlMaster=yes", "echo %s; bash"%message ], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    line = p.stdout.readline().decode("UTF-8").strip()
    if line!=message: return p, 1
    else: return p, p.poll()

def check(output):
    if output is None or (type(output) is int and output!=0):
        print("ERROR")
        print("Please make sure the user you logged in with is able to run docker commands.")
        sys.exit(1)
    else:
        print("DONE")

def partition_images(source, dest):
    new = []
    need_update = []
    current = []
    for source_name_tag, source_ID in source:
        found = False
        for dest_name_tag, dest_ID in dest:
            if source_name_tag==dest_name_tag:
                found = True
                if source_ID!=dest_ID: need_update.append( dest_name_tag )
                else: current.append( dest_name_tag )
        if not found:
            new.append( source_name_tag )

    return new, need_update, current

def get_source_images(ssh):
    images = get_images(ssh)
    if images is None: return
    # skip untagged images, which docker lists with the repository <none>
    return [i for i in images if i[0].split(":")[0]!="<none>"]

def get_dest_images(ssh):
    return get_images(ssh)

def get_images(ssh):
    output, result = execute(["docker", "images", "--no-trunc"], ssh=ssh)
    if result!=0:
        return
    output = [line.strip().split() for line in output if line.strip()!=""][1:]
    return [("%s:%s"%(image[0], image[1]), image[2]) for image in output]

def parse_args():
    parser = argparse.ArgumentParser(description='Synchronize docker images over the network.')
    parser.add_argument('remote', type=str, help='The remote host to sync the images with.')
    parser.add_argument('images', type=str, nargs='*', help='The images you want to sync. Use format image:tag.')
    parser.add_argument('-a', "--all", dest="all", action="store_true", default=False, help='Synchronize all the images.')
    parser.add_argument('-r', "--report", dest="report", action="store_true", default=False, help='Show the status of all the images.')
    parser.add_argument('-u', "--update", dest="update", action="store_true", default=False, help='Synchronize the images that need update.')
    parser.add_argument('-n', "--new", dest="new", action="store_true", default=False, help='Synchronize the images that are new.')
    parser.add_argument('-p', "--push", dest="push", action="store_true", default=False, help='Push images instead of pulling.')

    def folder(path):
        if os.path.isdir(path): return os.path.abspath(path)
        else: raise argparse.ArgumentTypeError("The specified path is not a directory")

    parser.add_argument('-f', "--tmpfolder", dest="folder", type=folder, default="/tmp", help='The path to temporarily put the image dumps. '
        'Defaults to /tmp.')

    return parser.parse_args()

def execute(command, ssh=None, print_output = False):
    if type(command) is str: parts = [s.strip() for s in command.split(" ")]
    else: parts = command

    if ssh is not None:
        parts = ssh + parts

    p = subprocess.Popen(parts, stdout=subprocess.PIPE)
    output = []

    while True:
        if print_output:
            line = p.stdout.read(1024).decode("UTF-8")
            sys.stdout.write(line)
            sys.stdout.flush()
        else:
            line = p.stdout.readline().decode("UTF-8")

        if len(line)==0: break
        output.append(line)

    return output, p.wait()

def get_image_parents(image, ssh):
    command = """
        parent=$(docker inspect %s | grep -Po \"(?<=\\\"Id\\\": \\\")[^\\\"]*\" );
        while test $(echo $parent|wc -m) -eq 72;
        do
            echo $parent;
            parent=$(docker inspect ${parent} | grep -Po \"(?<=\\\"Parent\\\": \\\")[^\\\"]*\" );
        done;
    """%image
    output, exit_val = execute(["bash", "-cl", command], ssh=ssh)
    return [ID.strip() for ID in output] if exit_val==0 else None

def get_all_images(ssh):
    output, exit_val = execute("docker images -qa --no-trunc", ssh=ssh)
    return [ID.strip() for ID in output] if exit_val==0 else None

def get_common_parent(source_parents, dest_images):
    for ID in source_parents:
        if ID in dest_images: return ID

if __name__ == '__main__':
    main()

--------------------------------------------------------------------------------