├── .gitignore
├── LICENSE
├── README.md
├── Vagrantfile
├── __init__.py
├── clean_old_versions.py
├── delete_docker_registry_image.py
├── install_docker.sh
└── test
    ├── clean_and_run
    ├── docker-compose.yml
    ├── fixtures
    │   ├── a
    │   │   ├── Dockerfile
    │   │   └── image
    │   ├── b
    │   │   ├── Dockerfile
    │   │   └── image
    │   ├── c
    │   │   ├── Dockerfile
    │   │   └── image
    │   ├── d
    │   │   ├── Dockerfile
    │   │   └── image
    │   └── e
    │       ├── Dockerfile
    │       └── image
    ├── start_up_vagrant_box_for_running_tests
    └── test

/.gitignore:
--------------------------------------------------------------------------------
.vagrant
.DS_Store
Gemfile*
Guardfile
aliases.sh

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2017 Kevin Burnett

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# delete-docker-registry-image

## Install

    curl https://raw.githubusercontent.com/burnettk/delete-docker-registry-image/master/delete_docker_registry_image.py | sudo tee /usr/local/bin/delete_docker_registry_image >/dev/null
    sudo chmod a+x /usr/local/bin/delete_docker_registry_image

## Run

Set up your data directory via an environment variable:

    export REGISTRY_DATA_DIR=/opt/registry_data/docker/registry/v2

You can also edit the script where this variable is set to make it work
for your setup.

Dry-run deleting a repo (nothing is removed):

    delete_docker_registry_image --image testrepo/awesomeimage --dry-run

Actually delete a repo (remember to shut down your registry first):

    delete_docker_registry_image --image testrepo/awesomeimage

Delete one tag from a repo:

    delete_docker_registry_image --image testrepo/awesomeimage:supertag


## clean_old_versions.py

This complementary script removes tags in a repository based on a
regexp pattern.

Usage:

    ./clean_old_versions.py --image reg_exp_of_repository_to_find --include reg_exp_of_tag_to_find -l history_to_maintain --registry-url location_of_docker_registry -o tag_ordering -b only_tags_before_date -a only_tags_after_date

Example:
Search for all images whose names start with 'repo/sitor' and delete all tags
whose names start with '0.1.', keeping the last 2 tags. Of the remaining tags,
delete only those with an image creation time between January 1, 2016 12 a.m.
and June 25, 2016 12 p.m. (both datetimes are exclusive).

    ./clean_old_versions.py --image '^repo/sitor*' --include '^0.1.*' -l 2 -b 2016-06-25T12:00:00 -a 2016-01-01T00:00:00 --registry-url http://localhost:5000

Add `--dry-run` as an argument for a test run without actual removal of tags.

## Run tests for this project

    ./test/start_up_vagrant_box_for_running_tests
    vagrant ssh
    cd /vagrant
    ./test/clean_and_run

Known test-passing configurations:
1. docker: 1.9.1, registry:2.2.1
2. docker: 1.10.2, registry:2.3.0
3. docker: 1.11.2, registry:2.3.0
4. docker: 1.12.1, registry:2.5.0

Known test-failing configurations:
1. docker: 1.10.2, registry:2.2.1

When tests are run with a newer docker daemon and an older registry,
architecture-specific config files are created, but they are not referenced
anywhere. Tests then fail: after we delete a tag or repo we expect all files to
be gone, but these architecture-specific config files are still hanging around.
With the newer registry, these config files are referenced in the schema
version 2 manifest, so we can easily delete them. It's probably best to avoid
using this script with the version combinations that fail tests.

## Alternatives

Much of this functionality has been, or is being, built into newer versions of
docker and the registry.

The ability to delete the metadata for a manifest was added in registry:2.2. Make
sure you give the registry the environment variable
REGISTRY_STORAGE_DELETE_ENABLED=true. Follow the instructions at
https://github.com/docker/docker-registry/issues/988#issuecomment-224280919 to
delete a tag by name. Once the metadata is deleted, follow the instructions at
https://github.com/docker/distribution/blob/master/docs/configuration.md to run
garbage collection, which will clean up the binary data (the big stuff).
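
The tag selection that the clean_old_versions.py section describes (include/exclude
regexps, keep the last N tags, then an exclusive date window) can be sketched as a
pure function. This is only an illustration, not the script's actual code: the
function name `select_tags_to_delete` and its parameters are hypothetical, tags are
ordered by creation date (the `-o date` behaviour) rather than by name, and the
creation dates are passed in as a dict instead of being fetched from the registry.

```python
import re
from datetime import datetime


def select_tags_to_delete(tags, created, include=None, exclude=None,
                          keep_last=5, before=None, after=None):
    """Return the tags a cleanup run would remove (hypothetical helper).

    `tags` is a list of tag names and `created` maps each tag to a datetime.
    """
    # Keep tags that match `include` (if given) and do not match `exclude`.
    matching = [t for t in tags
                if (not include or re.search(include, t))
                and (not exclude or not re.search(exclude, t))]
    # Oldest first, so the newest `keep_last` tags sit at the end of the list.
    matching.sort(key=lambda t: created[t])
    candidates = matching[:-keep_last] if keep_last > 0 else matching
    # Both date bounds are exclusive, as stated in the example above.
    return [t for t in candidates
            if (before is None or created[t] < before)
            and (after is None or created[t] > after)]


tags = ["0.1.1", "0.1.2", "0.1.3", "0.1.4", "0.2.0"]
# Made-up creation dates: one tag per month, starting January 1, 2016.
created = {t: datetime(2016, month, 1) for month, t in enumerate(tags, start=1)}
doomed = select_tags_to_delete(
    tags, created, include=r"^0\.1\.", keep_last=2,
    before=datetime(2016, 6, 25, 12), after=datetime(2016, 1, 1))
print(doomed)  # ['0.1.2']
```

Tag 0.1.1 survives here only because its made-up creation time equals the `after`
bound exactly, and the bounds are exclusive; 0.1.3 and 0.1.4 survive because they
are the last 2 matching tags.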
85 | -------------------------------------------------------------------------------- /Vagrantfile: -------------------------------------------------------------------------------- 1 | Vagrant.configure(2) do |config| 2 | config.vm.box = "ubuntu/trusty64" 3 | 4 | name = 'test-delete-docker-registry-image-docker-latest' 5 | config.vm.define name 6 | config.vm.provider "virtualbox" do |v| 7 | v.name = name 8 | end 9 | config.vm.hostname = name 10 | 11 | config.vm.provision :shell, path: "install_docker.sh" 12 | 13 | # you will need to install a vagrant plugin first: vagrant plugin install vagrant-docker-compose 14 | config.vm.provision :docker_compose, project_name: 'registry', yml: "/vagrant/test/docker-compose.yml", run: "always" 15 | end 16 | -------------------------------------------------------------------------------- /__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/burnettk/delete-docker-registry-image/4214924799e4e03bbd6ba0a98402f0bde9cdd752/__init__.py -------------------------------------------------------------------------------- /clean_old_versions.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | from __future__ import print_function 3 | import re 4 | import subprocess 5 | import argparse 6 | from distutils.version import LooseVersion 7 | import requests 8 | from datetime import datetime 9 | import json 10 | import sys 11 | 12 | DATE_FORMAT = "%Y-%m-%dT%H:%M:%S" 13 | 14 | # taken from http://stackoverflow.com/questions/25470844/specify-format-for-input-arguments-argparse-python#answer-25470943 15 | def valid_date(date_str): 16 | try: 17 | return datetime.strptime(date_str, DATE_FORMAT) 18 | except ValueError: 19 | msg = "Not a valid date: '{0}'.".format(date_str) 20 | raise argparse.ArgumentTypeError(msg) 21 | 22 | def get_created_date_for_tag(tag, repository, auth, args): 23 | headers = {'Accept': 
'application/vnd.docker.distribution.manifest.v2+json'} 24 | 25 | response = requests.get(args.registry_url + "/v2/" + repository + "/manifests/" + tag, 26 | auth=auth, verify=args.no_check_certificate, headers=headers) 27 | 28 | if response.json()['schemaVersion'] == 1: 29 | created_str = json.loads(response.json()['history'][0]['v1Compatibility'])['created'].split(".")[0] 30 | elif response.json()['schemaVersion'] == 2: 31 | digest = response.json()["config"]["digest"] 32 | response = requests.get(args.registry_url + "/v2/" + repository + "/blobs/" + digest, 33 | auth=auth, verify=args.no_check_certificate, headers=headers) 34 | created_str = response.json()['created'].split(".")[0] 35 | return(datetime.strptime(created_str,DATE_FORMAT)) 36 | 37 | def get_paginate_query(response): 38 | if 'Link' in response.headers: 39 | return response.headers['Link'].split('; ')[0][:-1][1:] 40 | else: 41 | return None 42 | 43 | def main(): 44 | """cli entrypoint""" 45 | parser = argparse.ArgumentParser(description="Cleanup docker registry") 46 | parser.add_argument("-e", "--exclude", 47 | dest="exclude", 48 | help="Regexp to exclude tags") 49 | parser.add_argument("-E", "--include", 50 | dest="include", 51 | help="Regexp to include tags") 52 | parser.add_argument("-i", "--image", 53 | dest="image", 54 | required=True, 55 | help="Docker image to cleanup") 56 | parser.add_argument("-v", "--verbose", 57 | dest="verbose", 58 | action="store_true", 59 | help="verbose") 60 | parser.add_argument("-u", "--registry-url", 61 | dest="registry_url", 62 | default="http://localhost", 63 | help="Registry URL") 64 | parser.add_argument("-s", "--script-path", 65 | dest="script_path", 66 | default="/usr/local/bin/delete_docker_registry_image", 67 | help="delete_docker_registry_image full script path") 68 | parser.add_argument("-l", "--last", 69 | dest="last", 70 | type=int, 71 | help="Keep last N tags") 72 | parser.add_argument("-b", "--before-date", 73 | dest="before", 74 | type=valid_date, 75 
                        help="Only delete tags created before given date. " +
                             "The date must be given in the format " +
                             "'YYYY-MM-DDTHH24:mm:ss' (e.g. '" +
                             datetime.now().strftime(DATE_FORMAT) + "').")
    parser.add_argument("-a", "--after-date",
                        dest="after",
                        type=valid_date,
                        help="Only delete tags created after given date. " +
                             "The date must be given in the format " +
                             "'YYYY-MM-DDTHH24:mm:ss' (e.g. '" +
                             datetime.now().strftime(DATE_FORMAT) + "').")
    parser.add_argument("-o", "--order",
                        dest="order",
                        choices=['name', 'date'],
                        default='name',
                        help="Selects the order in which tags are sorted when the option '--last' is used")
    parser.add_argument("-U", "--user",
                        dest="user",
                        help="User for auth")
    parser.add_argument("-P", "--password",
                        dest="password",
                        help="Password for auth")
    # store_false: the value is True by default and is passed to requests as `verify`
    parser.add_argument("--no_check_certificate",
                        action='store_false')
    parser.add_argument("--dry-run",
                        dest='dry_run',
                        action='store_true',
                        help="Dry run - show which tags would have been deleted but do not delete them")
    args = parser.parse_args()

    # Get catalog
    if args.user and args.password:
        auth = (args.user, args.password)
    else:
        auth = None
    response = requests.get(args.registry_url + "/v2/_catalog",
                            auth=auth, verify=args.no_check_certificate)

    nextQuery = get_paginate_query(response)
    repositories = response.json()["repositories"]

    while nextQuery is not None:
        response = requests.get(args.registry_url + nextQuery,
                                auth=auth, verify=args.no_check_certificate)
        repositories.extend(response.json()['repositories'])
        nextQuery = get_paginate_query(response)

    # For each repository, check whether it matches args.image
    for repository in repositories:
        if re.search(args.image, repository):
            # Get tags
            response = requests.get(args.registry_url + "/v2/" +
                                    repository + "/tags/list",
                                    auth=auth, verify=args.no_check_certificate)

            tags = None
            if "tags" in response.json().keys():
                tags = response.json()["tags"]

            # Keep each tag that does not match args.exclude and,
            # if args.include is set, does match args.include
            matching_tags = []
            if tags is not None:
                for tag in tags:
                    if not args.exclude or not re.search(args.exclude, tag):
                        if not args.include or re.search(args.include, tag):
                            matching_tags.append(tag)

                # Sort tags
                if args.order == 'name':
                    order_fn = lambda s: LooseVersion(re.sub('[^0-9.]', '9', s))
                else:
                    order_fn = lambda s: get_created_date_for_tag(s, repository, auth, args)

                matching_tags.sort(key=order_fn)

                # Set number of last tags to keep to the default value of 5
                # if 'last' is not set
                if args.last is None:
                    args.last = 5

                # Delete all except the last N items
                if args.last > 0:
                    matching_tags = matching_tags[:-args.last]

                tags_to_delete = []
                if args.before or args.after:
                    for tag in matching_tags:
                        created = get_created_date_for_tag(tag, repository, auth, args)

                        if (not args.before or created < args.before) and (not args.after or created > args.after):
                            tags_to_delete.append(tag)
                else:
                    tags_to_delete = matching_tags

                for tag in tags_to_delete:
                    command2run = "{0} --image {1}:{2}".
\
                        format(args.script_path, repository, tag)
                    if args.dry_run:
                        print("Simulate deletion of {0}:{1}".format(repository, tag))
                        command2run += " --dry-run"
                    print("Running: {0}".format(command2run))
                    out = subprocess.Popen(command2run, shell=True, stdout=subprocess.PIPE,
                                           stderr=subprocess.STDOUT).stdout.read()
                    print(out)
            else:
                print("No tags available for " + repository)


if __name__ == '__main__':
    main()

--------------------------------------------------------------------------------
/delete_docker_registry_image.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
"""
Usage:
Shut down your registry service to avoid race conditions and possible data loss
and then run the command with an image repo like this:
delete_docker_registry_image.py --image awesomeimage --dry-run
"""

import argparse
import json
import logging
import os
import sys
import shutil
import glob

logger = logging.getLogger(__name__)


def del_empty_dirs(s_dir, top_level):
    """recursively delete empty directories"""
    b_empty = True

    for s_target in os.listdir(s_dir):
        s_path = os.path.join(s_dir, s_target)
        if os.path.isdir(s_path):
            if not del_empty_dirs(s_path, False):
                b_empty = False
        else:
            b_empty = False

    if b_empty:
        logger.debug("Deleting empty directory '%s'", s_dir)
        if not top_level:
            os.rmdir(s_dir)

    return b_empty


def get_layers_from_blob(path):
    """parse json blob and get set of layer digests"""
    try:
        with open(path, "r") as blob:
            data_raw = blob.read()
            data = json.loads(data_raw)
            if data["schemaVersion"] == 1:
                result = set([entry["blobSum"].split(":")[1] for entry in data["fsLayers"]])
            else:
                result = set([entry["digest"].split(":")[1]
for entry in data["layers"]]) 50 | if "config" in data: 51 | result.add(data["config"]["digest"].split(":")[1]) 52 | return result 53 | except Exception as error: 54 | logger.critical("Failed to read layers from blob:%s", error) 55 | return set() 56 | 57 | 58 | def get_digest_from_blob(path): 59 | """parse file and get digest""" 60 | try: 61 | with open(path, "r") as blob: 62 | return blob.read().split(":")[1] 63 | except Exception as error: 64 | logger.critical("Failed to read digest from blob:%s", error) 65 | return "" 66 | 67 | 68 | def get_links(path, _filter=None): 69 | """recursively walk `path` and parse every link inside""" 70 | result = [] 71 | for root, _, files in os.walk(path): 72 | for each in files: 73 | if each == "link": 74 | filepath = os.path.join(root, each) 75 | if not _filter or _filter in filepath: 76 | result.append(get_digest_from_blob(filepath)) 77 | return result 78 | 79 | 80 | class RegistryCleanerError(Exception): 81 | pass 82 | 83 | 84 | class RegistryCleaner(object): 85 | """Clean registry""" 86 | 87 | def __init__(self, registry_data_dir, dry_run=False): 88 | self.registry_data_dir = registry_data_dir 89 | if not os.path.isdir(self.registry_data_dir): 90 | raise RegistryCleanerError("No repositories directory found inside " \ 91 | "REGISTRY_DATA_DIR '{0}'.". 
92 | format(self.registry_data_dir)) 93 | self.dry_run = dry_run 94 | 95 | def _delete_layer(self, repo, digest): 96 | """remove blob directory from filesystem""" 97 | path = os.path.join(self.registry_data_dir, "repositories", repo, "_layers/sha256", digest) 98 | self._delete_dir(path) 99 | 100 | def _delete_blob(self, digest): 101 | """remove blob directory from filesystem""" 102 | path = os.path.join(self.registry_data_dir, "blobs/sha256", digest[0:2], digest) 103 | self._delete_dir(path) 104 | 105 | def _blob_path_for_revision(self, digest): 106 | """where we can find the blob that contains the json describing this digest""" 107 | return os.path.join(self.registry_data_dir, "blobs/sha256", 108 | digest[0:2], digest, "data") 109 | 110 | def _blob_path_for_revision_is_missing(self, digest): 111 | """for each revision, there should be a blob describing it""" 112 | return not os.path.isfile(self._blob_path_for_revision(digest)) 113 | 114 | def _get_layers_from_blob(self, digest): 115 | """get layers from blob by digest""" 116 | return get_layers_from_blob(self._blob_path_for_revision(digest)) 117 | 118 | def _delete_dir(self, path): 119 | """remove directory from filesystem""" 120 | if self.dry_run: 121 | logger.info("DRY_RUN: would have deleted %s", path) 122 | else: 123 | logger.info("Deleting %s", path) 124 | try: 125 | shutil.rmtree(path) 126 | except Exception as error: 127 | logger.critical("Failed to delete directory:%s", error) 128 | 129 | def _delete_from_tag_index_for_revision(self, repo, digest): 130 | """delete revision from tag indexes""" 131 | paths = glob.glob( 132 | os.path.join(self.registry_data_dir, "repositories", repo, 133 | "_manifests/tags/*/index/sha256", digest) 134 | ) 135 | for path in paths: 136 | self._delete_dir(path) 137 | 138 | def _delete_revisions(self, repo, revisions, blobs_to_keep=None): 139 | """delete revisions from list of directories""" 140 | if blobs_to_keep is None: 141 | blobs_to_keep = [] 142 | for revision_dir in 
revisions: 143 | digests = get_links(revision_dir) 144 | for digest in digests: 145 | self._delete_from_tag_index_for_revision(repo, digest) 146 | if digest not in blobs_to_keep: 147 | self._delete_blob(digest) 148 | 149 | self._delete_dir(revision_dir) 150 | 151 | def _get_tags(self, repo): 152 | """get all tags for given repository""" 153 | path = os.path.join(self.registry_data_dir, "repositories", repo, "_manifests/tags") 154 | if not os.path.isdir(path): 155 | logger.critical("No repository '%s' found in repositories directory %s", 156 | repo, self.registry_data_dir) 157 | return None 158 | result = [] 159 | for each in os.listdir(path): 160 | filepath = os.path.join(path, each) 161 | if os.path.isdir(filepath): 162 | result.append(each) 163 | return result 164 | 165 | def _get_repositories(self): 166 | """get all repository repos""" 167 | result = [] 168 | root = os.path.join(self.registry_data_dir, "repositories") 169 | for each in os.listdir(root): 170 | filepath = os.path.join(root, each) 171 | if os.path.isdir(filepath): 172 | inside = os.listdir(filepath) 173 | if "_layers" in inside: 174 | result.append(each) 175 | else: 176 | for inner in inside: 177 | result.append(os.path.join(each, inner)) 178 | return result 179 | 180 | def _get_all_links(self, except_repo=""): 181 | """get links for every repository""" 182 | result = [] 183 | repositories = self._get_repositories() 184 | for repo in [r for r in repositories if r != except_repo]: 185 | path = os.path.join(self.registry_data_dir, "repositories", repo) 186 | for link in get_links(path): 187 | result.append(link) 188 | return result 189 | 190 | def prune(self): 191 | """delete all empty directories in registry_data_dir""" 192 | del_empty_dirs(self.registry_data_dir, True) 193 | 194 | def _layer_in_same_repo(self, repo, tag, layer): 195 | """check if layer is found in other tags of same repository""" 196 | for other_tag in [t for t in self._get_tags(repo) if t != tag]: 197 | path = 
os.path.join(self.registry_data_dir, "repositories", repo, 198 | "_manifests/tags", other_tag, "current/link") 199 | manifest = get_digest_from_blob(path) 200 | try: 201 | layers = self._get_layers_from_blob(manifest) 202 | if layer in layers: 203 | return True 204 | except IOError: 205 | if self._blob_path_for_revision_is_missing(manifest): 206 | logger.warn("Blob for digest %s does not exist. Deleting tag manifest: %s", manifest, other_tag) 207 | tag_dir = os.path.join(self.registry_data_dir, "repositories", repo, 208 | "_manifests/tags", other_tag) 209 | self._delete_dir(tag_dir) 210 | else: 211 | raise 212 | return False 213 | 214 | def _manifest_in_same_repo(self, repo, tag, manifest): 215 | """check if manifest is found in other tags of same repository""" 216 | for other_tag in [t for t in self._get_tags(repo) if t != tag]: 217 | path = os.path.join(self.registry_data_dir, "repositories", repo, 218 | "_manifests/tags", other_tag, "current/link") 219 | other_manifest = get_digest_from_blob(path) 220 | if other_manifest == manifest: 221 | return True 222 | 223 | return False 224 | 225 | def delete_entire_repository(self, repo): 226 | """delete all blobs for given repository repo""" 227 | logger.debug("Deleting entire repository '%s'", repo) 228 | repo_dir = os.path.join(self.registry_data_dir, "repositories", repo) 229 | if not os.path.isdir(repo_dir): 230 | raise RegistryCleanerError("No repository '{0}' found in repositories " 231 | "directory {1}/repositories". 232 | format(repo, self.registry_data_dir)) 233 | links = set(get_links(repo_dir)) 234 | all_links_but_current = set(self._get_all_links(except_repo=repo)) 235 | for layer in links: 236 | if layer in all_links_but_current: 237 | logger.debug("Blob found in another repository. 
Not deleting: %s", layer) 238 | else: 239 | self._delete_blob(layer) 240 | self._delete_dir(repo_dir) 241 | 242 | def delete_repository_tag(self, repo, tag): 243 | """delete all blobs only for given tag of repository""" 244 | logger.debug("Deleting repository '%s' with tag '%s'", repo, tag) 245 | tag_dir = os.path.join(self.registry_data_dir, "repositories", repo, "_manifests/tags", tag) 246 | if not os.path.isdir(tag_dir): 247 | raise RegistryCleanerError("No repository '{0}' tag '{1}' found in repositories " 248 | "directory {2}/repositories". 249 | format(repo, tag, self.registry_data_dir)) 250 | manifests_for_tag = set(get_links(tag_dir)) 251 | revisions_to_delete = [] 252 | blobs_to_keep = [] 253 | layers = [] 254 | all_links_not_in_current_repo = set(self._get_all_links(except_repo=repo)) 255 | for manifest in manifests_for_tag: 256 | logger.debug("Looking up filesystem layers for manifest digest %s", manifest) 257 | 258 | if self._manifest_in_same_repo(repo, tag, manifest): 259 | logger.debug("Not deleting since we found another tag using manifest: %s", manifest) 260 | continue 261 | else: 262 | revisions_to_delete.append( 263 | os.path.join(self.registry_data_dir, "repositories", repo, 264 | "_manifests/revisions/sha256", manifest) 265 | ) 266 | if manifest in all_links_not_in_current_repo: 267 | logger.debug("Not deleting the blob data since we found another repo using manifest: %s", manifest) 268 | blobs_to_keep.append(manifest) 269 | 270 | layers.extend(self._get_layers_from_blob(manifest)) 271 | 272 | layers_uniq = set(layers) 273 | for layer in layers_uniq: 274 | if self._layer_in_same_repo(repo, tag, layer): 275 | logger.debug("Not deleting since we found another tag using digest: %s", layer) 276 | continue 277 | 278 | self._delete_layer(repo, layer) 279 | if layer in all_links_not_in_current_repo: 280 | logger.debug("Blob found in another repository. 
Not deleting: %s", layer)
            else:
                self._delete_blob(layer)

        self._delete_revisions(repo, revisions_to_delete, blobs_to_keep)
        self._delete_dir(tag_dir)

    def delete_untagged(self, repo):
        """delete all untagged data from repo"""
        logger.debug("Deleting untagged data from repository '%s'", repo)
        repositories_dir = os.path.join(self.registry_data_dir, "repositories")
        repo_dir = os.path.join(repositories_dir, repo)
        if not os.path.isdir(repo_dir):
            raise RegistryCleanerError("No repository '{0}' found in repositories "
                                       "directory {1}/repositories".
                                       format(repo, self.registry_data_dir))
        tagged_links = set(get_links(repositories_dir, _filter="current"))
        layers_to_protect = []
        for link in tagged_links:
            layers_to_protect.extend(self._get_layers_from_blob(link))

        unique_layers_to_protect = set(layers_to_protect)
        for layer in unique_layers_to_protect:
            logger.debug("layer_to_protect: %s", layer)

        tagged_revisions = set(get_links(repo_dir, _filter="current"))

        revisions_to_delete = []
        layers_to_delete = []

        dir_for_revisions = os.path.join(repo_dir, "_manifests/revisions/sha256")
        for rev in os.listdir(dir_for_revisions):
            if rev not in tagged_revisions:
                revisions_to_delete.append(os.path.join(dir_for_revisions, rev))
                for layer in self._get_layers_from_blob(rev):
                    if layer not in unique_layers_to_protect:
                        layers_to_delete.append(layer)

        unique_layers_to_delete = set(layers_to_delete)

        self._delete_revisions(repo, revisions_to_delete)
        for layer in unique_layers_to_delete:
            self._delete_blob(layer)
            self._delete_layer(repo, layer)


    def get_tag_count(self, repo):
        """count the tags of repo; -1 means the tags directory is missing"""
        logger.debug("Get tag count of repository '%s'", repo)
        repo_dir = os.path.join(self.registry_data_dir, "repositories", repo)
        tags_dir =
os.path.join(repo_dir, "_manifests/tags") 330 | 331 | if os.path.isdir(tags_dir): 332 | tags = os.listdir(tags_dir) 333 | return len(tags) 334 | else: 335 | logger.info("Tags directory does not exist: '%s'", tags_dir) 336 | return -1 337 | 338 | def main(): 339 | """cli entrypoint""" 340 | parser = argparse.ArgumentParser(description="Cleanup docker registry") 341 | parser.add_argument("-i", "--image", 342 | dest="image", 343 | required=True, 344 | help="Docker image to cleanup") 345 | parser.add_argument("-v", "--verbose", 346 | dest="verbose", 347 | action="store_true", 348 | help="verbose") 349 | parser.add_argument("-n", "--dry-run", 350 | dest="dry_run", 351 | action="store_true", 352 | help="Dry run") 353 | parser.add_argument("-f", "--force", 354 | dest="force", 355 | action="store_true", 356 | help="Force delete (deprecated)") 357 | parser.add_argument("-p", "--prune", 358 | dest="prune", 359 | action="store_true", 360 | help="Prune") 361 | parser.add_argument("-u", "--untagged", 362 | dest="untagged", 363 | action="store_true", 364 | help="Delete all untagged blobs for image") 365 | args = parser.parse_args() 366 | 367 | 368 | handler = logging.StreamHandler() 369 | handler.setFormatter(logging.Formatter(u'%(levelname)-8s [%(asctime)s] %(message)s')) 370 | logger.addHandler(handler) 371 | 372 | if args.verbose: 373 | logger.setLevel(logging.DEBUG) 374 | else: 375 | logger.setLevel(logging.INFO) 376 | 377 | 378 | # make sure not to log before logging is setup. that'll hose your logging config. 379 | if args.force: 380 | logger.info( 381 | "You supplied the force switch, which is deprecated. 
It has no effect now, and the script defaults to doing what used to happen only when force was true")

    splitted = args.image.split(":")
    if len(splitted) == 2:
        image = splitted[0]
        tag = splitted[1]
    else:
        image = args.image
        tag = None

    if 'REGISTRY_DATA_DIR' in os.environ:
        registry_data_dir = os.environ['REGISTRY_DATA_DIR']
    else:
        registry_data_dir = "/opt/registry_data/docker/registry/v2"

    try:
        cleaner = RegistryCleaner(registry_data_dir, dry_run=args.dry_run)
        if args.untagged:
            cleaner.delete_untagged(image)
        else:
            if tag:
                # A repo holding a single tag is deleted wholesale
                # (the tag name itself is not checked here).
                tag_count = cleaner.get_tag_count(image)
                if tag_count == 1:
                    cleaner.delete_entire_repository(image)
                else:
                    cleaner.delete_repository_tag(image, tag)
            else:
                cleaner.delete_entire_repository(image)

        if args.prune:
            cleaner.prune()
    except RegistryCleanerError as error:
        logger.fatal(error)
        sys.exit(1)


if __name__ == "__main__":
    main()

--------------------------------------------------------------------------------
/install_docker.sh:
--------------------------------------------------------------------------------
#!/usr/bin/env bash

set -e

# get.docker.com unfortunately no longer accepts a version
# curl -sSL https://get.docker.com | DOCKER_VERSION=1.11.2 sh

# apt-key adv --keyserver hkp://pgp.mit.edu:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
mkdir -p /etc/apt/sources.list.d
echo deb https://apt.dockerproject.org/repo ubuntu-trusty main > /etc/apt/sources.list.d/docker.list
apt-get update

# pick your docker version:
# apt-get install -y -q docker-engine=1.10.2-0~trusty
apt-get install -y -q docker-engine=1.11.2-0~trusty
# apt-get
install -y -q docker-engine # latest 18 | 19 | usermod -aG docker vagrant 20 | -------------------------------------------------------------------------------- /test/clean_and_run: -------------------------------------------------------------------------------- 1 | echo && echo "********************************" && sudo rm -rf /opt/registry_data/docker/registry/v2/* && docker restart registry && sleep 5 && ./test/test 2 | -------------------------------------------------------------------------------- /test/docker-compose.yml: -------------------------------------------------------------------------------- 1 | registry: 2 | # image: registry:2.2.1 3 | image: registry:2.3.0 4 | # image: registry:2.5.0 5 | container_name: registry 6 | ports: 7 | - 5000:5000 8 | volumes: 9 | - /opt/registry_data:/var/lib/registry 10 | -------------------------------------------------------------------------------- /test/fixtures/a/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM busybox 2 | 3 | RUN mkdir -p test && touch test/a 4 | 5 | CMD ["sh"] 6 | -------------------------------------------------------------------------------- /test/fixtures/a/image: -------------------------------------------------------------------------------- 1 | localhost:5000/test/a 2 | -------------------------------------------------------------------------------- /test/fixtures/b/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM localhost:5000/test/a 2 | 3 | RUN mkdir -p test && touch test/b 4 | 5 | CMD ["sh"] 6 | -------------------------------------------------------------------------------- /test/fixtures/b/image: -------------------------------------------------------------------------------- 1 | localhost:5000/test/b 2 | -------------------------------------------------------------------------------- /test/fixtures/c/Dockerfile: 
-------------------------------------------------------------------------------- 1 | FROM busybox 2 | 3 | RUN mkdir -p test && touch test/c 4 | 5 | CMD ["sh"] 6 | -------------------------------------------------------------------------------- /test/fixtures/c/image: -------------------------------------------------------------------------------- 1 | localhost:5000/test/c:1 2 | -------------------------------------------------------------------------------- /test/fixtures/d/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM localhost:5000/test/c:1 2 | 3 | RUN mkdir -p test && touch test/d 4 | 5 | CMD ["sh"] 6 | -------------------------------------------------------------------------------- /test/fixtures/d/image: -------------------------------------------------------------------------------- 1 | localhost:5000/test/c:2 2 | -------------------------------------------------------------------------------- /test/fixtures/e/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM localhost:5000/test/a 2 | 3 | RUN mkdir -p test && touch test/e 4 | 5 | CMD ["sh"] 6 | -------------------------------------------------------------------------------- /test/fixtures/e/image: -------------------------------------------------------------------------------- 1 | localhost:5000/test/e:1 2 | -------------------------------------------------------------------------------- /test/start_up_vagrant_box_for_running_tests: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | if vagrant plugin list | grep vagrant-docker-compose; then 4 | echo "vagrant-docker-compose is already installed" 5 | else 6 | echo "installing vagrant-docker-compose" 7 | vagrant plugin install vagrant-docker-compose 8 | fi 9 | 10 | vagrant up 11 | -------------------------------------------------------------------------------- /test/test: 
-------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | function error_handler() { 4 | echo "Error occurred in script at line: ${1}." 5 | echo "Line exited with status: ${2}" 6 | } 7 | 8 | trap 'error_handler ${LINENO} $?' ERR 9 | 10 | set -o errexit 11 | set -o errtrace 12 | set -o nounset 13 | 14 | # done setting up error handling. actual code follows. 15 | 16 | if [[ ! -d "test/fixtures" ]] ; then 17 | echo "please run tests from the root of the delete-docker-registry-image repo, so there is a test/fixtures directory available" 18 | exit 1 19 | fi 20 | 21 | function get_test_docker_container_ids() { 22 | docker ps -a | grep localhost:5000/test | awk '{print $1}' 23 | } 24 | 25 | function get_test_docker_image_ids() { 26 | docker images | grep localhost:5000/test | awk '{print $3}' 27 | } 28 | 29 | function delete_test_docker_images() { 30 | (get_test_docker_image_ids | xargs docker rmi -f) || true 31 | } 32 | 33 | # if it finds a directory, fail. If not, noop. 34 | function assert_that_registry_has_no_data() { 35 | 36 | # who cares about uploads. they are temporary and we can delete them anytime. 37 | find /opt/registry_data/docker/registry/v2 -name _uploads -type d | sudo xargs rm -rf 38 | 39 | # delete empty directories after killing _uploads dirs 40 | lines=$(find /opt/registry_data/docker/registry/v2/* -type d -empty 2>/dev/null | wc -l) 41 | if [ $lines -gt 0 ]; then 42 | sudo find /opt/registry_data/docker/registry/v2/* -type d -empty -delete 43 | fi 44 | 45 | if [ "$(ls -A /opt/registry_data/docker/registry/v2)" ]; then 46 | echo "ASSERTION FAILURE: registry has data when we expected it not to have data. 
leaf node dirs:" 47 | # finding -type d (directories) with 2 links gets just "leaf" node directories 48 | find /opt/registry_data/docker/registry/v2 -type d -links 2 49 | # print the call stack. use i=$((i + 1)) rather than ((i++)): the latter returns status 1 when i is 0, which would trip errexit after the first frame 50 | i=0; while caller $i ;do i=$((i + 1)) ;done 51 | false 52 | fi 53 | } 54 | 55 | function build_test_image() { 56 | previous_location=$PWD 57 | cd test/fixtures/$1 58 | image=$(cat image) 59 | docker build -t $image . 60 | cd $previous_location 61 | } 62 | 63 | function push_test_image() { 64 | previous_location=$PWD 65 | cd test/fixtures/$1 66 | image=$(cat image) 67 | docker push $image 68 | cd $previous_location 69 | } 70 | 71 | function push_all_test_images() { 72 | for test_image in `ls test/fixtures`; do 73 | build_and_push_test_images $test_image 74 | done 75 | } 76 | 77 | function build_test_images() { 78 | for test_image in "$@"; do 79 | build_test_image $test_image 80 | done 81 | } 82 | 83 | function build_and_push_test_images() { 84 | for test_image in "$@"; do 85 | build_test_image $test_image 86 | push_test_image $test_image 87 | done 88 | } 89 | 90 | function delete_all_registry_data() { 91 | sudo rm -rf /opt/registry_data/docker/registry/v2/repositories /opt/registry_data/docker/registry/v2/blobs 92 | } 93 | 94 | function back_up_registry_data_for_later_inspection_in_case_test_fails() { 95 | sudo rm -rf /opt/registry_data/docker/registry/v2_backup 96 | sudo cp -r /opt/registry_data/docker/registry/v2 /opt/registry_data/docker/registry/v2_backup 97 | } 98 | 99 | function delete_all_test_data() { 100 | for id in $(get_test_docker_container_ids); do docker stop $id; done 101 | for id in $(get_test_docker_container_ids); do docker rm $id; done 102 | delete_test_docker_images 103 | delete_all_registry_data 104 | } 105 | 106 | function run_delete() { 107 | sudo /vagrant/delete_docker_registry_image.py "$@" 108 | } 109 | 110 | function bounce_registry() { 111 | docker ps | grep registry | awk '{ print $1 }' | xargs -n 1 docker restart 112 | } 113 | 114 | function setup() { 115 |
delete_all_test_data 115 | bounce_registry 116 | } 117 | 118 | function test_deleting_untagged() { 119 | setup 120 | build_test_images a b c 121 | 122 | docker tag localhost:5000/test/b localhost:5000/test/repousinglatest 123 | docker push localhost:5000/test/repousinglatest 124 | 125 | docker tag localhost:5000/test/a localhost:5000/test/repousinglatest 126 | docker push localhost:5000/test/repousinglatest 127 | 128 | docker tag localhost:5000/test/c:1 localhost:5000/test/repousinglatest 129 | docker push localhost:5000/test/repousinglatest 130 | 131 | before=$(find /opt/registry_data/docker/registry/v2/blobs -type f | wc -l) 132 | run_delete --image test/repousinglatest --force --prune --untagged 133 | after=$(find /opt/registry_data/docker/registry/v2/blobs -type f | wc -l) 134 | 135 | if [ ! "$after" -lt "$before" ]; then 136 | echo "After deleting --untagged, the number of blobs is unchanged: before: $before. after: $after" 137 | exit 1 138 | fi 139 | 140 | delete_test_docker_images 141 | 142 | docker pull localhost:5000/test/repousinglatest 143 | 144 | if ! 
docker run -it localhost:5000/test/repousinglatest ls test | grep 'c' > /dev/null; then 145 | echo "Did not see expected file (c) in container" 146 | exit 1 147 | fi 148 | 149 | run_delete --image test/repousinglatest:latest --force --prune 150 | delete_test_docker_images 151 | assert_that_registry_has_no_data 152 | } 153 | 154 | function test_deleting_a_tag_and_then_repushing_it_works() { 155 | setup 156 | build_test_images a 157 | docker tag localhost:5000/test/a localhost:5000/test/somethingwithtag:1 158 | docker push localhost:5000/test/somethingwithtag:1 159 | run_delete --image test/somethingwithtag:1 160 | bounce_registry 161 | docker push localhost:5000/test/somethingwithtag:1 162 | delete_test_docker_images 163 | docker pull localhost:5000/test/somethingwithtag:1 164 | } 165 | 166 | function test_deleting_all_images_deletes_all_data() { 167 | setup 168 | build_and_push_test_images a b 169 | 170 | docker tag localhost:5000/test/a localhost:5000/test/somethingwithtag:1 171 | docker push localhost:5000/test/somethingwithtag:1 172 | run_delete --image test/somethingwithtag:1 --force --prune 173 | delete_test_docker_images 174 | docker pull localhost:5000/test/a # we haven't deleted this image yet, even though it has the same data as the tag we deleted. confirm we can still pull it. 175 | 176 | run_delete --image test/a --force --prune 177 | delete_test_docker_images 178 | docker pull localhost:5000/test/b # we haven't deleted b from the registry yet. confirm that we can pull it with no issue. 179 | docker tag localhost:5000/test/b localhost:5000/test/somethingelsewithtag:1 180 | docker push localhost:5000/test/somethingelsewithtag:1 181 | docker pull localhost:5000/test/somethingelsewithtag:1 # of course this works, right 182 | run_delete --image test/b --force --prune 183 | delete_test_docker_images 184 | docker pull localhost:5000/test/somethingelsewithtag:1 # we haven't deleted this tag yet, even though it has the same data as b. 
confirm we can still pull it. 185 | run_delete --image test/somethingelsewithtag:1 --force --prune 186 | assert_that_registry_has_no_data 187 | } 188 | 189 | function test_deleting_tag_first_does_not_leave_stuff_lying_around() { 190 | setup 191 | build_and_push_test_images a 192 | docker tag localhost:5000/test/a localhost:5000/test/somethingwithtag:1 193 | docker push localhost:5000/test/somethingwithtag:1 194 | back_up_registry_data_for_later_inspection_in_case_test_fails 195 | run_delete --image test/somethingwithtag:1 --force --prune -v 196 | delete_test_docker_images 197 | docker pull localhost:5000/test/a # we haven't deleted this image yet, even though it has the same data as the tag we deleted. confirm we can still pull it. 198 | run_delete --image test/a --force --prune 199 | assert_that_registry_has_no_data 200 | } 201 | 202 | function test_deleting_an_image_does_not_harm_an_equivalent_tag_in_another_repo() { 203 | setup 204 | build_and_push_test_images a 205 | docker tag localhost:5000/test/a localhost:5000/test/stillanotherthingwithtag:1 206 | docker push localhost:5000/test/stillanotherthingwithtag:1 207 | delete_test_docker_images 208 | docker pull localhost:5000/test/stillanotherthingwithtag:1 # of course this works, right 209 | run_delete --image test/a --force --prune -v 210 | delete_test_docker_images 211 | docker pull localhost:5000/test/stillanotherthingwithtag:1 # we haven't deleted this tag yet, even though it has the same data as test/a, which we just deleted. confirm we can still pull it. 212 | run_delete --image test/stillanotherthingwithtag:1 --force --prune 213 | assert_that_registry_has_no_data 214 | } 215 | 216 | function test_deleting_by_tag() { 217 | setup 218 | build_and_push_test_images c d 219 | run_delete --image test/c:1 --force --prune 220 | delete_test_docker_images 221 | docker pull localhost:5000/test/c:2 # we haven't deleted d (c:2) from the registry yet. confirm that we can pull it with no issue.
222 | run_delete --image test/c:2 --force --prune 223 | assert_that_registry_has_no_data 224 | } 225 | 226 | function test_deleting_actually_deletes() { 227 | setup 228 | build_and_push_test_images c 229 | run_delete --image test/c:1 --force --prune 230 | delete_test_docker_images 231 | if docker pull localhost:5000/test/c:1; then 232 | echo "we should not have been able to pull that deleted repo" 233 | exit 1 234 | fi 235 | assert_that_registry_has_no_data 236 | } 237 | 238 | function test_deleting_with_dry_run_does_not_delete() { 239 | setup 240 | build_and_push_test_images c 241 | run_delete --image test/c:1 --force --prune --dry-run 242 | delete_test_docker_images 243 | docker pull localhost:5000/test/c:1 244 | run_delete --image test/c:1 --force --prune 245 | assert_that_registry_has_no_data 246 | } 247 | 248 | function test_deleting_with_no_image_argument_tells_you_to_try_again() { 249 | setup 250 | if run_delete --image test 2>&1 | grep "No repository 'test' found in repositories directory"; then 251 | echo "Yep, we expected this error message before adding a repo" 252 | else 253 | echo "Did not receive expected error message when trying to delete before adding a repo" 254 | exit 1 255 | fi 256 | build_and_push_test_images a 257 | if run_delete 2>&1 | grep 'image is required'; then 258 | echo "Yep, we expected this error message" 259 | else 260 | echo "Did not receive expected error message when trying to delete with no image argument" 261 | exit 1 262 | fi 263 | run_delete --image test/a --force --prune -v 264 | assert_that_registry_has_no_data 265 | } 266 | 267 | function test_clean_old_versions() { 268 | docker pull busybox 269 | 270 | docker tag busybox localhost:5000/busybox/busy:1 271 | docker tag busybox localhost:5000/busybox/busy:2 272 | docker tag busybox localhost:5000/busybox/busy:3 273 | docker tag busybox localhost:5000/busybox/busy:4 274 | docker tag busybox localhost:5000/busybox/busy:latest 275 | 276 | docker push 
localhost:5000/busybox/busy:1 277 | docker push localhost:5000/busybox/busy:2 278 | docker push localhost:5000/busybox/busy:3 279 | docker push localhost:5000/busybox/busy:4 280 | docker push localhost:5000/busybox/busy:latest 281 | 282 | # clean any tags (.*) of busybox/busy. keep the four latest tags. so this should kill busybox/busy:1 283 | sudo ./clean_old_versions.py --registry-url http://localhost:5000 --image 'busybox/busy' --include '.*' -l 4 --script-path /vagrant/delete_docker_registry_image.py 284 | 285 | # remove image locally 286 | docker rmi localhost:5000/busybox/busy:1 287 | 288 | if docker pull localhost:5000/busybox/busy:1; then 289 | echo "we should not have been able to pull the tag. it should have been deleted." 290 | exit 1 291 | fi 292 | 293 | # confirm we can pull :2. it was one of the four that we were supposed to keep around. 294 | docker pull localhost:5000/busybox/busy:2 295 | } 296 | 297 | test_deleting_untagged 298 | test_deleting_all_images_deletes_all_data # fail 299 | test_deleting_tag_first_does_not_leave_stuff_lying_around # fail, but not any more 300 | test_deleting_an_image_does_not_harm_an_equivalent_tag_in_another_repo # pass 301 | test_deleting_by_tag # next three pass 302 | test_deleting_actually_deletes 303 | test_deleting_with_dry_run_does_not_delete 304 | test_deleting_with_no_image_argument_tells_you_to_try_again 305 | test_clean_old_versions 306 | 307 | echo TESTS PASSED 308 | --------------------------------------------------------------------------------