├── .gitignore ├── LICENSE ├── README.md ├── datasets └── README.md ├── docs └── getting-started.md ├── notebooks ├── coco_image_viewer.ipynb └── train_mask_rcnn.ipynb ├── python ├── coco_json_utils.py └── image_composition.py └── requirements.txt /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | *.egg-info/ 24 | .installed.cfg 25 | *.egg 26 | MANIFEST 27 | 28 | # PyInstaller 29 | # Usually these files are written by a python script from a template 30 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 31 | *.manifest 32 | *.spec 33 | 34 | # Installer logs 35 | pip-log.txt 36 | pip-delete-this-directory.txt 37 | 38 | # Unit test / coverage reports 39 | htmlcov/ 40 | .tox/ 41 | .coverage 42 | .coverage.* 43 | .cache 44 | nosetests.xml 45 | coverage.xml 46 | *.cover 47 | .hypothesis/ 48 | .pytest_cache/ 49 | 50 | # Translations 51 | *.mo 52 | *.pot 53 | 54 | # Django stuff: 55 | *.log 56 | local_settings.py 57 | db.sqlite3 58 | 59 | # Flask stuff: 60 | instance/ 61 | .webassets-cache 62 | 63 | # Scrapy stuff: 64 | .scrapy 65 | 66 | # Sphinx documentation 67 | docs/_build/ 68 | 69 | # PyBuilder 70 | target/ 71 | 72 | # Jupyter Notebook 73 | .ipynb_checkpoints 74 | 75 | # pyenv 76 | .python-version 77 | 78 | # celery beat schedule file 79 | celerybeat-schedule 80 | 81 | # SageMath parsed files 82 | *.sage.py 83 | 84 | # Environments 85 | .env 86 | .venv 87 | env/ 88 | venv/ 89 | ENV/ 90 | env.bak/ 91 | venv.bak/ 92 | 93 | # Spyder project settings 94 | .spyderproject 95 | .spyproject 96 | 97 | # Rope project settings 98 | .ropeproject 99 | 100 | # mkdocs documentation 101 | 
/site 102 | 103 | # mypy 104 | .mypy_cache/ 105 | 106 | # Specific to CocoSynth 107 | datasets/* 108 | !datasets/README.md 109 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2019 Adam Kelly 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # cocosynth 2 | COCO Synth provides tools for creating synthetic COCO datasets. 
3 | 4 | # Complete Guide to Creating COCO Datasets 5 | ![Complete Guide to Creating COCO Datasets Course Image](https://images.squarespace-cdn.com/content/v1/55652c24e4b0edcadf841347/1586723596485-FI1W99L5F0X6X0VP36SW/ke17ZwdGBToddI8pDm48kJFjiAAEKQOxhtR6kyGixEZZw-zPPgdn4jUwVcJE1ZvWQUxwkmyExglNqGp0IvTJZamWLI2zvYWH8K3-s_4yszcp2ryTI0HqTOaaUohrI8PIZG0-o4ErKRJfhIwgspvy036Ezj4M485dTMevEG-VX_E/creating-coco-datasets.png) 6 | 7 | This code repo is a companion to a Udemy course for developers who'd like a step-by-step walk-through of how to create a synthetic COCO dataset from scratch. When you enroll, you'll get a full walkthrough of how all of the code in this repo works. When you finish, you'll have a COCO dataset with your own custom categories and a trained Mask R-CNN. 8 | 9 | Learn more at [https://www.immersivelimit.com/course/creating-coco-datasets](https://www.immersivelimit.com/course/creating-coco-datasets) 10 | 11 | Follow my various social media channels listed at [immersivelimit.com/connect](http://www.immersivelimit.com/connect) for updates! 12 | 13 | # Getting Started 14 | Check out the [Getting Started](./docs/getting-started.md) guide. It will walk you through the scripts with a sample dataset. 15 | -------------------------------------------------------------------------------- /datasets/README.md: -------------------------------------------------------------------------------- 1 | # TL;DR 2 | This directory is intentionally empty to save space. You can put datasets here. 3 | 4 | # More Detailed Explanation 5 | This directory is intended to contain the datasets that the scripts access; however, they are intentionally excluded from the git repo to save space. You may safely keep datasets here; the .gitignore file ensures they are not included in git commits. 6 | 7 | # Datasets 8 | Here are some datasets you can download to populate this directory. 
9 | If any are missing or have broken links, please create an issue on GitHub or contact me via one of the various channels listed at [immersivelimit.com/connect](http://www.immersivelimit.com/connect) 10 | 11 | ## box_dataset_synthetic 12 | A complete synthetic dataset of boxes. [Download](https://immersivelimit.page.link/gnbR) 13 | 14 | ## weeds 15 | Foreground and background images for the [Weed Detector](http://www.immersivelimit.com/blog/ai-weed-detector) case study I did. [Download](https://immersivelimit.page.link/euJu) 16 | 17 | Test Images and Videos for the Weed Detector. [Download](https://immersivelimit.page.link/Zk6V) 18 | 19 | Note: The full dataset I created was several gigabytes of 1024x1024 images so I'm not including it here. You can use this git repo to generate your own. For reference, it contained 10k training images and 1k validation images. 20 | -------------------------------------------------------------------------------- /docs/getting-started.md: -------------------------------------------------------------------------------- 1 | # Getting Started with COCO Synth 2 | 3 | ## Initial Setup 4 | I highly recommend using [Anaconda](https://docs.anaconda.com/anaconda/install/) for Python environment management. It will help you install Shapely, which I've had some problems installing with pip. 5 | 6 | Once you have Anaconda installed... 7 | 8 | On Windows: 9 | ``` 10 | conda create -n cocosynth python=3.6 11 | activate cocosynth 12 | conda install -c conda-forge shapely 13 | pip install -r requirements.txt 14 | ``` 15 | On Linux (and I assume Mac): 16 | ``` 17 | conda create -n cocosynth python=3.6 18 | source activate cocosynth 19 | conda install -c conda-forge shapely 20 | pip install -r requirements.txt 21 | ``` 22 | 23 | # View Segmentations with the COCO Image Viewer 24 | Fire up Jupyter Notebook. 25 | ``` 26 | jupyter notebook 27 | ``` 28 | Open up "../notebooks/coco_image_viewer.ipynb" and run through the cells in the notebook. 
Pay attention to the file paths. They are set up to work with this guide. If everything works correctly, you'll be able to view an image with image segmentation overlays. 29 | 30 | # Create Synthetic Images and Masks 31 | In this section, we will use "image_composition.py" to randomly pick foregrounds and automatically super-impose them on backgrounds. You will need a number of foreground cutouts with transparent backgrounds. For example, you might have a picture of an eagle with a completely transparent background. Due to the need for transparency, these images should be .png format (.jpg doesn't have transparency). I cut out my foregrounds with [GIMP](https://www.gimp.org/), which is free. 32 | 33 | ## Using the sample dataset 34 | For this guide, all examples assume you'll be using the box_dataset_synthetic sample dataset. Find it here [../datasets/README.md](../datasets/README.md). Download it and extract the contents to "../datasets/box_dataset_synthetic". 35 | 36 | ## Custom dataset directory setup: 37 | - Inside the "datasets" directory, create a new folder for your dataset (e.g. "wild_animal_dataset") 38 | - Inside that dataset directory, create a folder called "input" 39 | - Inside "input", create two folders called "foregrounds" and "backgrounds" 40 | - Inside "foregrounds", create a folder for each super category (e.g. "bird", "lizard") 41 | - Inside each foreground super category folder, create a folder for each category (e.g. "eagle", "owl") 42 | - Inside each category folder, add all foreground photos you intend to use for the respective category (e.g. 
all of your eagle foreground cutouts) 43 | - Inside "backgrounds", add all background photos you intend to use 44 | 45 | Run "image_composition.py" to create your images and masks: 46 | ``` 47 | python ./python/image_composition.py --input_dir ./datasets/box_dataset_synthetic/input --output_dir ./datasets/box_dataset_synthetic/output --count 10 --width 512 --height 512 48 | ``` 49 | 50 | # Create COCO Instances JSON 51 | Now we're going to use the generated images, masks, and JSON to create COCO instance annotations. 52 | 53 | Optional: Run "coco_json_utils.py" with --help to see the documentation. This will explain the next command. 54 | ``` 55 | python ./python/coco_json_utils.py --help 56 | ``` 57 | Run the command with the correct parameters: 58 | ``` 59 | python ./python/coco_json_utils.py -md ./datasets/box_dataset_synthetic/output/mask_definitions.json -di ./datasets/box_dataset_synthetic/output/dataset_info.json 60 | ``` 61 | 62 | You will now have a new json file called "coco_instances.json". It contains all of your COCO annotations in a single file! 
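If you'd like to sanity-check the result, the sketch below shows the top-level structure that "coco_instances.json" will have and one way to count annotations per category. The literal dict is a hypothetical miniature stand-in for your real file (the category, image, and annotation values are made up for illustration); swap in `json.load` on the real path to inspect your own output.

```python
import json

# Miniature stand-in for the generated coco_instances.json (hypothetical
# values; your real file will list your own images, categories, and licenses).
coco = {
    "info": {"description": "box_dataset_synthetic", "year": 2019},
    "licenses": [{"id": 1, "name": "MIT", "url": ""}],
    "categories": [{"supercategory": "box", "id": 1, "name": "box"}],
    "images": [{"license": 1, "file_name": "images/00000000.jpg",
                "width": 512, "height": 512, "id": 0}],
    "annotations": [{"segmentation": [[10.0, 10.0, 42.0, 10.0, 42.0, 42.0, 10.0, 42.0]],
                     "iscrowd": 0, "image_id": 0, "category_id": 1,
                     "id": 0, "bbox": [10.0, 10.0, 32.0, 32.0], "area": 1024.0}],
}

# To inspect the real file instead, replace `coco` with:
#   with open('./datasets/box_dataset_synthetic/output/coco_instances.json') as f:
#       coco = json.load(f)
names = {c["id"]: c["name"] for c in coco["categories"]}
counts = {}
for ann in coco["annotations"]:
    name = names[ann["category_id"]]
    counts[name] = counts.get(name, 0) + 1
print(counts)  # {'box': 1}
```

A quick per-category count like this is a cheap way to catch missing categories or empty annotation lists before you start training.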
63 | 64 | -------------------------------------------------------------------------------- /python/coco_json_utils.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | 3 | import numpy as np 4 | import json 5 | from pathlib import Path 6 | from tqdm import tqdm 7 | from skimage import measure, io 8 | from shapely.geometry import Polygon, MultiPolygon 9 | from PIL import Image 10 | 11 | class InfoJsonUtils(): 12 | """ Creates an info object to describe a COCO dataset 13 | """ 14 | def create_coco_info(self, description, url, version, year, contributor, date_created): 15 | """ Creates the "info" portion of COCO json 16 | """ 17 | info = dict() 18 | info['description'] = description 19 | info['url'] = url 20 | info['version'] = version 21 | info['year'] = year 22 | info['contributor'] = contributor 23 | info['date_created'] = date_created 24 | 25 | return info 26 | 27 | class LicenseJsonUtils(): 28 | """ Creates a license object to describe a COCO dataset 29 | """ 30 | def create_coco_license(self, url, license_id, name): 31 | """ Creates the "licenses" portion of COCO json 32 | """ 33 | lic = dict() 34 | lic['url'] = url 35 | lic['id'] = license_id 36 | lic['name'] = name 37 | 38 | return lic 39 | 40 | class CategoryJsonUtils(): 41 | """ Creates a category object to describe a COCO dataset 42 | """ 43 | def create_coco_category(self, supercategory, category_id, name): 44 | category = dict() 45 | category['supercategory'] = supercategory 46 | category['id'] = category_id 47 | category['name'] = name 48 | 49 | return category 50 | 51 | class ImageJsonUtils(): 52 | """ Creates an image object to describe a COCO dataset 53 | """ 54 | def create_coco_image(self, image_path, image_id, image_license): 55 | """ Creates the "image" portion of COCO json 56 | """ 57 | # Open the image and get the size 58 | image_file = Image.open(image_path) 59 | width, height = image_file.size 60 | 61 | image = dict() 62 | 
image['license'] = image_license 63 | image['file_name'] = image_path.name 64 | image['width'] = width 65 | image['height'] = height 66 | image['id'] = image_id 67 | 68 | return image 69 | 70 | class AnnotationJsonUtils(): 71 | """ Creates an annotation object to describe a COCO dataset 72 | """ 73 | def __init__(self): 74 | self.annotation_id_index = 0 75 | 76 | def create_coco_annotations(self, image_mask_path, image_id, category_ids): 77 | """ Takes a pixel-based RGB image mask and creates COCO annotations. 78 | Args: 79 | image_mask_path: a pathlib.Path to the image mask 80 | image_id: the integer image id 81 | category_ids: a dictionary of integer category ids keyed by RGB color (a tuple converted to a string) 82 | e.g. {'(255, 0, 0)': {'category': 'owl', 'super_category': 'bird'} } 83 | Returns: 84 | annotations: a list of COCO annotation dictionaries that can 85 | be converted to json. e.g.: 86 | { 87 | "segmentation": [[101.79,307.32,69.75,281.11,...,100.05,309.66]], 88 | "area": 51241.3617, 89 | "iscrowd": 0, 90 | "image_id": 284725, 91 | "bbox": [68.01,134.89,433.41,174.77], 92 | "category_id": 6, 93 | "id": 165690 94 | } 95 | """ 96 | # Set class variables 97 | self.image_id = image_id 98 | self.category_ids = category_ids 99 | 100 | # Make sure keys in category_ids are strings 101 | for key in self.category_ids.keys(): 102 | if type(key) is not str: 103 | raise TypeError('category_ids keys must be strings (e.g. 
"(0, 0, 255)")') 104 | break 105 | 106 | # Open and process image 107 | self.mask_image = Image.open(image_mask_path) 108 | self.mask_image = self.mask_image.convert('RGB') 109 | self.width, self.height = self.mask_image.size 110 | 111 | # Split up the multi-colored masks into multiple 0/1 bit masks 112 | self._isolate_masks() 113 | 114 | # Create annotations from the masks 115 | self._create_annotations() 116 | 117 | return self.annotations 118 | 119 | def _isolate_masks(self): 120 | # Breaks mask up into isolated masks based on color 121 | 122 | self.isolated_masks = dict() 123 | # for x in range(self.width): 124 | # for y in range(self.height): 125 | # pixel_rgb = self.mask_image.getpixel((x,y)) 126 | # pixel_rgb_str = str(pixel_rgb) 127 | 128 | # # If the pixel is any color other than black, add it to a respective isolated image mask 129 | # if not pixel_rgb == (0, 0, 0): 130 | # if self.isolated_masks.get(pixel_rgb_str) is None: 131 | # # Isolated mask doesn't have its own image yet, create one 132 | # # with 1-bit pixels, default black. 
Make room for 1 pixel of 133 | # # padding on each edge to allow the contours algorithm to work 134 | # # when shapes bleed up to the edge 135 | # self.isolated_masks[pixel_rgb_str] = Image.new('1', (self.width + 2, self.height + 2)) 136 | 137 | # # Add the pixel to the mask image, shifting by 1 pixel to account for padding 138 | # self.isolated_masks[pixel_rgb_str].putpixel((x + 1, y + 1), 1) 139 | 140 | # This is a much faster way to split masks using Numpy 141 | arr = np.array(self.mask_image, dtype=np.uint32) 142 | rgb32 = (arr[:,:,0] << 16) + (arr[:,:,1] << 8) + arr[:,:,2] 143 | unique_values = np.unique(rgb32) 144 | for u in unique_values: 145 | if u != 0: 146 | r = int((u & (255 << 16)) >> 16) 147 | g = int((u & (255 << 8)) >> 8) 148 | b = int(u & 255) 149 | self.isolated_masks[str((r, g, b))] = np.pad(np.equal(rgb32, u), 1) # pad 1px so contours close at the image edges; the offset is subtracted below 150 | 151 | def _create_annotations(self): 152 | # Creates annotations for each isolated mask 153 | 154 | # Each image may have multiple annotations, so create an array 155 | self.annotations = [] 156 | for key, mask in self.isolated_masks.items(): 157 | annotation = dict() 158 | annotation['segmentation'] = [] 159 | annotation['iscrowd'] = 0 160 | annotation['image_id'] = self.image_id 161 | if not self.category_ids.get(key): 162 | print(f'category color not found: {key}; check for missing category or antialiasing') 163 | continue 164 | annotation['category_id'] = self.category_ids[key] 165 | annotation['id'] = self._next_annotation_id() 166 | 167 | # Find contours in the isolated mask 168 | mask = np.asarray(mask, dtype=np.float32) 169 | contours = measure.find_contours(mask, 0.5, positive_orientation='low') 170 | 171 | polygons = [] 172 | for contour in contours: 173 | # Flip from (row, col) representation to (x, y) 174 | # and subtract the padding pixel 175 | for i in range(len(contour)): 176 | row, col = contour[i] 177 | contour[i] = (col - 1, row - 1) 178 | 179 | # Make a polygon and simplify it 180 | poly = Polygon(contour) 181 | poly = 
poly.simplify(1.0, preserve_topology=False) 182 | 183 | if (poly.area > 16): # Ignore tiny polygons 184 | if (poly.geom_type == 'MultiPolygon'): 185 | # if MultiPolygon, take the smallest convex Polygon containing all the points in the object 186 | poly = poly.convex_hull 187 | 188 | if (poly.geom_type == 'Polygon'): # Ignore if still not a Polygon (could be a line or point) 189 | polygons.append(poly) 190 | segmentation = np.array(poly.exterior.coords).ravel().tolist() 191 | annotation['segmentation'].append(segmentation) 192 | 193 | if len(polygons) == 0: 194 | # This item doesn't have any visible polygons, ignore it 195 | # (This can happen if a randomly placed foreground is covered up 196 | # by other foregrounds) 197 | continue 198 | 199 | # Combine the polygons to calculate the bounding box and area 200 | multi_poly = MultiPolygon(polygons) 201 | x, y, max_x, max_y = multi_poly.bounds 202 | bbox_width = max_x - x # use locals so the image width/height stored on self aren't overwritten 203 | bbox_height = max_y - y 204 | annotation['bbox'] = (x, y, bbox_width, bbox_height) 205 | annotation['area'] = multi_poly.area 206 | 207 | # Finally, add this annotation to the list 208 | self.annotations.append(annotation) 209 | 210 | def _next_annotation_id(self): 211 | # Gets the next annotation id 212 | # Note: This is not a unique id. 
It simply starts at 0 and increments each time it is called 213 | 214 | a_id = self.annotation_id_index 215 | self.annotation_id_index += 1 216 | return a_id 217 | 218 | class CocoJsonCreator(): 219 | def validate_and_process_args(self, args): 220 | """ Validates the arguments coming in from the command line and performs 221 | initial processing 222 | Args: 223 | args: ArgumentParser arguments 224 | """ 225 | # Validate the mask definition file exists 226 | mask_definition_file = Path(args.mask_definition) 227 | if not (mask_definition_file.exists() and mask_definition_file.is_file()): 228 | raise FileNotFoundError(f'mask definition file was not found: {mask_definition_file}') 229 | 230 | # Load the mask definition json 231 | with open(mask_definition_file) as json_file: 232 | self.mask_definitions = json.load(json_file) 233 | 234 | self.dataset_dir = mask_definition_file.parent 235 | 236 | # Validate the dataset info file exists 237 | dataset_info_file = Path(args.dataset_info) 238 | if not (dataset_info_file.exists() and dataset_info_file.is_file()): 239 | raise FileNotFoundError(f'dataset info file was not found: {dataset_info_file}') 240 | 241 | # Load the dataset info json 242 | with open(dataset_info_file) as json_file: 243 | self.dataset_info = json.load(json_file) 244 | 245 | assert 'info' in self.dataset_info, 'dataset_info JSON was missing "info"' 246 | assert 'license' in self.dataset_info, 'dataset_info JSON was missing "license"' 247 | 248 | def create_info(self): 249 | """ Creates the "info" piece of the COCO json 250 | """ 251 | info_json = self.dataset_info['info'] 252 | iju = InfoJsonUtils() 253 | return iju.create_coco_info( 254 | description = info_json['description'], 255 | version = info_json['version'], 256 | url = info_json['url'], 257 | year = info_json['year'], 258 | contributor = info_json['contributor'], 259 | date_created = info_json['date_created'] 260 | ) 261 | 262 | def create_licenses(self): 263 | """ Creates the "license" portion of 
the COCO json 264 | """ 265 | license_json = self.dataset_info['license'] 266 | lju = LicenseJsonUtils() 267 | lic = lju.create_coco_license( 268 | url = license_json['url'], 269 | license_id = license_json['id'], 270 | name = license_json['name'] 271 | ) 272 | return [lic] 273 | 274 | def create_categories(self): 275 | """ Creates the "categories" portion of the COCO json 276 | Returns: 277 | categories: category objects that become part of the final json 278 | category_ids_by_name: a lookup dictionary for category ids based 279 | on the name of the category 280 | """ 281 | cju = CategoryJsonUtils() 282 | categories = [] 283 | category_ids_by_name = dict() 284 | category_id = 1 # 0 is reserved for the background 285 | 286 | super_categories = self.mask_definitions['super_categories'] 287 | for super_category, _categories in super_categories.items(): 288 | for category_name in _categories: 289 | categories.append(cju.create_coco_category(super_category, category_id, category_name)) 290 | category_ids_by_name[category_name] = category_id 291 | category_id += 1 292 | 293 | return categories, category_ids_by_name 294 | 295 | def create_images_and_annotations(self, category_ids_by_name): 296 | """ Creates the list of images (in json) and the annotations for each 297 | image for the "image" and "annotations" portions of the COCO json 298 | """ 299 | iju = ImageJsonUtils() 300 | aju = AnnotationJsonUtils() 301 | 302 | image_objs = [] 303 | annotation_objs = [] 304 | image_license = self.dataset_info['license']['id'] 305 | image_id = 0 306 | 307 | mask_count = len(self.mask_definitions['masks']) 308 | print(f'Processing {mask_count} mask definitions...') 309 | 310 | # For each mask definition, create image and annotations 311 | for file_name, mask_def in tqdm(self.mask_definitions['masks'].items()): 312 | # Create a coco image json item 313 | image_path = Path(self.dataset_dir) / file_name 314 | image_obj = iju.create_coco_image( 315 | image_path, 316 | image_id, 317 | 
image_license) 318 | image_objs.append(image_obj) 319 | 320 | mask_path = Path(self.dataset_dir) / mask_def['mask'] 321 | 322 | # Create a dict of category ids keyed by rgb_color 323 | category_ids_by_rgb = dict() 324 | for rgb_color, category in mask_def['color_categories'].items(): 325 | category_ids_by_rgb[rgb_color] = category_ids_by_name[category['category']] 326 | annotation_obj = aju.create_coco_annotations(mask_path, image_id, category_ids_by_rgb) 327 | annotation_objs += annotation_obj # Add the new annotations to the existing list 328 | image_id += 1 329 | 330 | return image_objs, annotation_objs 331 | 332 | def main(self, args): 333 | self.validate_and_process_args(args) 334 | 335 | info = self.create_info() 336 | licenses = self.create_licenses() 337 | categories, category_ids_by_name = self.create_categories() 338 | images, annotations = self.create_images_and_annotations(category_ids_by_name) 339 | 340 | master_obj = { 341 | 'info': info, 342 | 'licenses': licenses, 343 | 'images': images, 344 | 'annotations': annotations, 345 | 'categories': categories 346 | } 347 | 348 | # Write the json to a file 349 | output_path = Path(self.dataset_dir) / 'coco_instances.json' 350 | with open(output_path, 'w+') as output_file: 351 | json.dump(master_obj, output_file) 352 | 353 | print(f'Annotations successfully written to file:\n{output_path}') 354 | 355 | if __name__ == "__main__": 356 | import argparse 357 | 358 | parser = argparse.ArgumentParser(description="Generate COCO JSON") 359 | 360 | parser.add_argument("-md", "--mask_definition", dest="mask_definition", 361 | help="path to a mask definition JSON file, generated by MaskJsonUtils module") 362 | parser.add_argument("-di", "--dataset_info", dest="dataset_info", 363 | help="path to a dataset info JSON file") 364 | 365 | args = parser.parse_args() 366 | 367 | cjc = CocoJsonCreator() 368 | cjc.main(args) 369 | -------------------------------------------------------------------------------- 
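Before the next file, a short aside on the most interesting technique in coco_json_utils.py: `_isolate_masks` replaces a slow per-pixel loop with a vectorized trick, packing each RGB pixel into a single 32-bit integer so `np.unique` can find the distinct mask colors in one pass. A standalone sketch of that idea (the 4x4 toy mask below is made up for illustration):

```python
import numpy as np

# Standalone sketch of the vectorized mask-splitting trick from
# AnnotationJsonUtils._isolate_masks: pack each RGB pixel into one 32-bit
# integer so np.unique finds the distinct mask colors in a single pass.
mask = np.zeros((4, 4, 3), dtype=np.uint32)
mask[0:2, 0:2] = (255, 0, 0)   # a "red" foreground occupies the top-left
mask[2:4, 2:4] = (0, 0, 255)   # a "blue" foreground occupies the bottom-right

rgb32 = (mask[:, :, 0] << 16) + (mask[:, :, 1] << 8) + mask[:, :, 2]
isolated = {}
for u in np.unique(rgb32):
    if u != 0:  # 0 is the black background, not a foreground
        r, g, b = int(u >> 16 & 255), int(u >> 8 & 255), int(u & 255)
        isolated[(r, g, b)] = np.equal(rgb32, u)  # one boolean mask per color

print(sorted(isolated))                  # [(0, 0, 255), (255, 0, 0)]
print(int(isolated[(255, 0, 0)].sum()))  # 4 pixels belong to the red mask
```

Because the packing is a pure array operation, the cost scales with image size rather than with image size times the number of colors, which is why the script replaced the commented-out `getpixel` loop.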
/python/image_composition.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | 3 | import json 4 | import warnings 5 | import random 6 | import numpy as np 7 | from datetime import datetime 8 | from pathlib import Path 9 | from tqdm import tqdm 10 | from PIL import Image, ImageEnhance 11 | 12 | class MaskJsonUtils(): 13 | """ Creates a JSON definition file for image masks. 14 | """ 15 | 16 | def __init__(self, output_dir): 17 | """ Initializes the class. 18 | Args: 19 | output_dir: the directory where the definition file will be saved 20 | """ 21 | self.output_dir = output_dir 22 | self.masks = dict() 23 | self.super_categories = dict() 24 | 25 | def add_category(self, category, super_category): 26 | """ Adds a new category to the set of the corresponding super_category 27 | Args: 28 | category: e.g. 'eagle' 29 | super_category: e.g. 'bird' 30 | Returns: 31 | True if successful, False if the category was already in the dictionary 32 | """ 33 | if not self.super_categories.get(super_category): 34 | # Super category doesn't exist yet, create a new set 35 | self.super_categories[super_category] = {category} 36 | elif category in self.super_categories[super_category]: 37 | # Category is already accounted for 38 | return False 39 | else: 40 | # Add the category to the existing super category set 41 | self.super_categories[super_category].add(category) 42 | 43 | return True # Addition was successful 44 | 45 | def add_mask(self, image_path, mask_path, color_categories): 46 | """ Takes an image path, its corresponding mask path, and its color categories, 47 | and adds it to the appropriate dictionaries 48 | Args: 49 | image_path: the relative path to the image, e.g. './images/00000001.png' 50 | mask_path: the relative path to the mask image, e.g. 
'./masks/00000001.png' 51 | color_categories: the legend of color categories, for this particular mask, 52 | represented as an rgb-color keyed dictionary of category names and their super categories. 53 | (the color category associations are not assumed to be consistent across images) 54 | Returns: 55 | True if successful, False if the image was already in the dictionary 56 | """ 57 | if self.masks.get(image_path): 58 | return False # image/mask is already in the dictionary 59 | 60 | # Create the mask definition 61 | mask = { 62 | 'mask': mask_path, 63 | 'color_categories': color_categories 64 | } 65 | 66 | # Add the mask definition to the dictionary of masks 67 | self.masks[image_path] = mask 68 | 69 | # Regardless of color, we need to store each new category under its supercategory 70 | for _, item in color_categories.items(): 71 | self.add_category(item['category'], item['super_category']) 72 | 73 | return True # Addition was successful 74 | 75 | def get_masks(self): 76 | """ Gets all masks that have been added 77 | """ 78 | return self.masks 79 | 80 | def get_super_categories(self): 81 | """ Gets the dictionary of super categories for each category in a JSON 82 | serializable format 83 | Returns: 84 | A dictionary of lists of categories keyed on super_category 85 | """ 86 | serializable_super_cats = dict() 87 | for super_cat, categories_set in self.super_categories.items(): 88 | # Sets are not json serializable, so convert to list 89 | serializable_super_cats[super_cat] = list(categories_set) 90 | return serializable_super_cats 91 | 92 | def write_masks_to_json(self): 93 | """ Writes all masks and color categories to the output file path as JSON 94 | """ 95 | # Serialize the masks and super categories dictionaries 96 | serializable_masks = self.get_masks() 97 | serializable_super_cats = self.get_super_categories() 98 | masks_obj = { 99 | 'masks': serializable_masks, 100 | 'super_categories': serializable_super_cats 101 | } 102 | 103 | # Write the JSON output 
file 104 | output_file_path = Path(self.output_dir) / 'mask_definitions.json' 105 | with open(output_file_path, 'w+') as json_file: 106 | json_file.write(json.dumps(masks_obj)) 107 | 108 | class ImageComposition(): 109 | """ Composes images together in random ways, applying transformations to the foreground to create a synthetic 110 | combined image. 111 | """ 112 | 113 | def __init__(self): 114 | self.allowed_output_types = ['.png', '.jpg', '.jpeg'] 115 | self.allowed_background_types = ['.png', '.jpg', '.jpeg'] 116 | self.zero_padding = 8 # 00000027.png, supports up to 100 million images 117 | self.max_foregrounds = 3 118 | self.mask_colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255)] 119 | assert len(self.mask_colors) >= self.max_foregrounds, 'length of mask_colors should be >= max_foregrounds' 120 | 121 | def _validate_and_process_args(self, args): 122 | # Validates input arguments and sets up class variables 123 | # Args: 124 | # args: the ArgumentParser command line arguments 125 | 126 | self.silent = args.silent 127 | 128 | # Validate the count 129 | assert args.count > 0, 'count must be greater than 0' 130 | self.count = args.count 131 | 132 | # Validate the width and height 133 | assert args.width >= 64, 'width must be at least 64' 134 | self.width = args.width 135 | assert args.height >= 64, 'height must be at least 64' 136 | self.height = args.height 137 | 138 | # Validate and process the output type 139 | if args.output_type is None: 140 | self.output_type = '.jpg' # default 141 | else: 142 | # Always store the type; prepend a dot if the user omitted it 143 | self.output_type = args.output_type if args.output_type.startswith('.') else f'.{args.output_type}' 144 | assert self.output_type in self.allowed_output_types, f'output_type is not supported: {self.output_type}' 145 | 146 | # Validate and process output and input directories 147 | self._validate_and_process_output_directory() 148 | self._validate_and_process_input_directory() 149 | 150 | def _validate_and_process_output_directory(self): 151 | self.output_dir = 
Path(args.output_dir) 152 | self.images_output_dir = self.output_dir / 'images' 153 | self.masks_output_dir = self.output_dir / 'masks' 154 | 155 | # Create directories 156 | self.output_dir.mkdir(exist_ok=True) 157 | self.images_output_dir.mkdir(exist_ok=True) 158 | self.masks_output_dir.mkdir(exist_ok=True) 159 | 160 | if not self.silent: 161 | # Check for existing contents in the images directory 162 | for _ in self.images_output_dir.iterdir(): 163 | # We found something, check if the user wants to overwrite files or quit 164 | should_continue = input('output_dir is not empty, files may be overwritten.\nContinue (y/n)? ').lower() 165 | if should_continue != 'y' and should_continue != 'yes': 166 | quit() 167 | break 168 | 169 | def _validate_and_process_input_directory(self): 170 | self.input_dir = Path(args.input_dir) 171 | assert self.input_dir.exists(), f'input_dir does not exist: {args.input_dir}' 172 | self.foregrounds_dir = self.backgrounds_dir = None # so the asserts below fail cleanly if a sub-directory is missing 173 | for x in self.input_dir.iterdir(): 174 | if x.name == 'foregrounds': 175 | self.foregrounds_dir = x 176 | elif x.name == 'backgrounds': 177 | self.backgrounds_dir = x 178 | 179 | assert self.foregrounds_dir is not None, 'foregrounds sub-directory was not found in the input_dir' 180 | assert self.backgrounds_dir is not None, 'backgrounds sub-directory was not found in the input_dir' 181 | 182 | self._validate_and_process_foregrounds() 183 | self._validate_and_process_backgrounds() 184 | 185 | def _validate_and_process_foregrounds(self): 186 | # Validates input foregrounds and processes them into a foregrounds dictionary. 
187 |         # Expected directory structure:
188 |         # + foregrounds_dir
189 |         #     + super_category_dir
190 |         #         + category_dir
191 |         #             + foreground_image.png
192 | 
193 |         self.foregrounds_dict = dict()
194 | 
195 |         for super_category_dir in self.foregrounds_dir.iterdir():
196 |             if not super_category_dir.is_dir():
197 |                 warnings.warn(f'file found in foregrounds directory (expected super-category directories), ignoring: {super_category_dir}')
198 |                 continue
199 | 
200 |             # This is a super category directory
201 |             for category_dir in super_category_dir.iterdir():
202 |                 if not category_dir.is_dir():
203 |                     warnings.warn(f'file found in super category directory (expected category directories), ignoring: {category_dir}')
204 |                     continue
205 | 
206 |                 # This is a category directory
207 |                 for image_file in category_dir.iterdir():
208 |                     if not image_file.is_file():
209 |                         warnings.warn(f'a directory was found inside a category directory, ignoring: {str(image_file)}')
210 |                         continue
211 |                     if image_file.suffix != '.png':
212 |                         warnings.warn(f'foreground must be a .png file, skipping: {str(image_file)}')
213 |                         continue
214 | 
215 |                     # Valid foreground image, add to foregrounds_dict
216 |                     super_category = super_category_dir.name
217 |                     category = category_dir.name
218 | 
219 |                     if super_category not in self.foregrounds_dict:
220 |                         self.foregrounds_dict[super_category] = dict()
221 | 
222 |                     if category not in self.foregrounds_dict[super_category]:
223 |                         self.foregrounds_dict[super_category][category] = []
224 | 
225 |                     self.foregrounds_dict[super_category][category].append(image_file)
226 | 
227 |         assert len(self.foregrounds_dict) > 0, 'no valid foregrounds were found'
228 | 
229 |     def _validate_and_process_backgrounds(self):
230 |         self.backgrounds = []
231 |         for image_file in self.backgrounds_dir.iterdir():
232 |             if not image_file.is_file():
233 |                 warnings.warn(f'a directory was found inside the backgrounds directory, ignoring: {image_file}')
234 |                 continue
235 | 
236 |             if image_file.suffix not in self.allowed_background_types:
237 |                 warnings.warn(f'background must match an accepted type {str(self.allowed_background_types)}, ignoring: {image_file}')
238 |                 continue
239 | 
240 |             # Valid file, add to backgrounds list
241 |             self.backgrounds.append(image_file)
242 | 
243 |         assert len(self.backgrounds) > 0, 'no valid backgrounds were found'
244 | 
245 |     def _generate_images(self):
246 |         # Generates a number of images and creates segmentation masks, then
247 |         # saves a mask_definitions.json file that describes the dataset.
248 | 
249 |         print(f'Generating {self.count} images with masks...')
250 | 
251 |         mju = MaskJsonUtils(self.output_dir)
252 | 
253 |         # Create all images/masks (with tqdm to show a progress bar)
254 |         for i in tqdm(range(self.count)):
255 |             # Randomly choose a background
256 |             background_path = random.choice(self.backgrounds)
257 | 
258 |             num_foregrounds = random.randint(1, self.max_foregrounds)
259 |             foregrounds = []
260 |             for fg_i in range(num_foregrounds):
261 |                 # Randomly choose a foreground
262 |                 super_category = random.choice(list(self.foregrounds_dict.keys()))
263 |                 category = random.choice(list(self.foregrounds_dict[super_category].keys()))
264 |                 foreground_path = random.choice(self.foregrounds_dict[super_category][category])
265 | 
266 |                 # Get the color
267 |                 mask_rgb_color = self.mask_colors[fg_i]
268 | 
269 |                 foregrounds.append({
270 |                     'super_category': super_category,
271 |                     'category': category,
272 |                     'foreground_path': foreground_path,
273 |                     'mask_rgb_color': mask_rgb_color
274 |                 })
275 | 
276 |             # Compose foregrounds and background
277 |             composite, mask = self._compose_images(foregrounds, background_path)
278 | 
279 |             # Create the file name (used for both composite and mask)
280 |             save_filename = f'{i:0{self.zero_padding}}'  # e.g. 00000023 (extension added below)
281 | 
282 |             # Save composite image to the images sub-directory
283 |             composite_filename = f'{save_filename}{self.output_type}'  # e.g. 00000023.jpg
284 |             composite_path = self.output_dir / 'images' / composite_filename  # e.g. my_output_dir/images/00000023.jpg
285 |             composite = composite.convert('RGB')  # remove alpha
286 |             composite.save(composite_path)
287 | 
288 |             # Save the mask image to the masks sub-directory
289 |             mask_filename = f'{save_filename}.png'  # masks are always png to avoid lossy compression
290 |             mask_path = self.output_dir / 'masks' / mask_filename  # e.g. my_output_dir/masks/00000023.png
291 |             mask.save(mask_path)
292 | 
293 |             color_categories = dict()
294 |             for fg in foregrounds:
295 |                 # Add category and color info
296 |                 mju.add_category(fg['category'], fg['super_category'])
297 |                 color_categories[str(fg['mask_rgb_color'])] = \
298 |                     {
299 |                         'category': fg['category'],
300 |                         'super_category': fg['super_category']
301 |                     }
302 | 
303 |             # Add the mask to MaskJsonUtils
304 |             mju.add_mask(
305 |                 composite_path.relative_to(self.output_dir).as_posix(),
306 |                 mask_path.relative_to(self.output_dir).as_posix(),
307 |                 color_categories
308 |             )
309 | 
310 |         # Write masks to JSON
311 |         mju.write_masks_to_json()
312 | 
313 |     def _compose_images(self, foregrounds, background_path):
314 |         # Composes a foreground image and a background image and creates a segmentation mask
315 |         # using the specified color. Validation should already be done by now.
316 |         # Args:
317 |         #     foregrounds: a list of dicts with format:
318 |         #         [{
319 |         #             'super_category': super_category,
320 |         #             'category': category,
321 |         #             'foreground_path': foreground_path,
322 |         #             'mask_rgb_color': mask_rgb_color
323 |         #         },...]
324 |         #     background_path: the path to a valid background image
325 |         # Returns:
326 |         #     composite: the composed image
327 |         #     mask: the mask image
328 | 
329 |         # Open background and convert to RGBA
330 |         background = Image.open(background_path)
331 |         background = background.convert('RGBA')
332 | 
333 |         # Crop background to desired size (self.width x self.height), randomly positioned
334 |         bg_width, bg_height = background.size
335 |         max_crop_x_pos = bg_width - self.width
336 |         max_crop_y_pos = bg_height - self.height
337 |         assert max_crop_x_pos >= 0, f'desired width, {self.width}, is greater than background width, {bg_width}, for {str(background_path)}'
338 |         assert max_crop_y_pos >= 0, f'desired height, {self.height}, is greater than background height, {bg_height}, for {str(background_path)}'
339 |         crop_x_pos = random.randint(0, max_crop_x_pos)
340 |         crop_y_pos = random.randint(0, max_crop_y_pos)
341 |         composite = background.crop((crop_x_pos, crop_y_pos, crop_x_pos + self.width, crop_y_pos + self.height))
342 |         composite_mask = Image.new('RGB', composite.size, 0)
343 | 
344 |         for fg in foregrounds:
345 |             fg_path = fg['foreground_path']
346 | 
347 |             # Perform transformations
348 |             fg_image = self._transform_foreground(fg, fg_path)
349 | 
350 |             # Choose a random x,y position for the foreground
351 |             max_x_position = composite.size[0] - fg_image.size[0]
352 |             max_y_position = composite.size[1] - fg_image.size[1]
353 |             assert max_x_position >= 0 and max_y_position >= 0, \
354 |                 f'foreground {fg_path} is too big ({fg_image.size[0]}x{fg_image.size[1]}) for the requested output size ({self.width}x{self.height}), check your input parameters'
355 |             paste_position = (random.randint(0, max_x_position), random.randint(0, max_y_position))
356 | 
357 |             # Create a new foreground image as large as the composite and paste it on top
358 |             new_fg_image = Image.new('RGBA', composite.size, color=(0, 0, 0, 0))
359 |             new_fg_image.paste(fg_image, paste_position)
360 | 
361 |             # Extract the alpha channel from the foreground and paste it into a new image the size of the composite
362 |             alpha_mask = fg_image.getchannel(3)
363 |             new_alpha_mask = Image.new('L', composite.size, color=0)
364 |             new_alpha_mask.paste(alpha_mask, paste_position)
365 |             composite = Image.composite(new_fg_image, composite, new_alpha_mask)
366 | 
367 |             # Grab the alpha pixels above a specified threshold
368 |             alpha_threshold = 200
369 |             mask_arr = np.array(np.greater(np.array(new_alpha_mask), alpha_threshold), dtype=np.uint8)
370 |             uint8_mask = np.uint8(mask_arr)  # This is composed of 1s and 0s
371 | 
372 |             # Multiply the mask value (1 or 0) by the color in each RGB channel and combine to get the mask
373 |             mask_rgb_color = fg['mask_rgb_color']
374 |             red_channel = uint8_mask * mask_rgb_color[0]
375 |             green_channel = uint8_mask * mask_rgb_color[1]
376 |             blue_channel = uint8_mask * mask_rgb_color[2]
377 |             rgb_mask_arr = np.dstack((red_channel, green_channel, blue_channel))
378 |             isolated_mask = Image.fromarray(rgb_mask_arr, 'RGB')
379 |             isolated_alpha = Image.fromarray(uint8_mask * 255, 'L')
380 | 
381 |             composite_mask = Image.composite(isolated_mask, composite_mask, isolated_alpha)
382 | 
383 |         return composite, composite_mask
384 | 
385 |     def _transform_foreground(self, fg, fg_path):
386 |         # Open foreground and get the alpha channel
387 |         fg_image = Image.open(fg_path)
388 |         fg_alpha = np.array(fg_image.getchannel(3))
389 |         assert np.any(fg_alpha == 0), f'foreground needs to have some transparency: {str(fg_path)}'
390 | 
391 |         # ** Apply Transformations **
392 |         # Rotate the foreground
393 |         angle_degrees = random.randint(0, 359)
394 |         fg_image = fg_image.rotate(angle_degrees, resample=Image.BICUBIC, expand=True)
395 | 
396 |         # Scale the foreground
397 |         scale = random.random() * .5 + .5  # Pick something between .5 and 1
398 |         new_size = (int(fg_image.size[0] * scale), int(fg_image.size[1] * scale))
399 |         fg_image = fg_image.resize(new_size, resample=Image.BICUBIC)
400 | 
401 |         # Adjust foreground brightness
402 |         brightness_factor = random.random() * .4 + .7  # Pick something between .7 and 1.1
403 |         enhancer = ImageEnhance.Brightness(fg_image)
404 |         fg_image = enhancer.enhance(brightness_factor)
405 | 
406 |         # Add any other transformations here...
407 | 
408 |         return fg_image
409 | 
410 |     def _create_info(self):
411 |         # A convenience wizard for automatically creating dataset info
412 |         # The user can always modify the resulting .json manually if needed
413 | 
414 |         if self.silent:
415 |             # No user wizard in silent mode
416 |             return
417 | 
418 |         should_continue = input('Would you like to create dataset info json? (y/n) ').lower()
419 |         if should_continue != 'y' and should_continue != 'yes':
420 |             print('No problem. You can always create the json manually.')
421 |             return  # skip the wizard (don't quit(), so main() can still finish normally)
422 | 
423 |         print('Note: you can always modify the json manually if you need to update this.')
424 |         info = dict()
425 |         info['description'] = input('Description: ')
426 |         info['url'] = input('URL: ')
427 |         info['version'] = input('Version: ')
428 |         info['contributor'] = input('Contributor: ')
429 |         now = datetime.now()
430 |         info['year'] = now.year
431 |         info['date_created'] = f'{now.month:02d}/{now.day:02d}/{now.year}'
432 | 
433 |         image_license = dict()
434 |         image_license['id'] = 0
435 | 
436 |         should_add_license = input('Add an image license? (y/n) ').lower()
437 |         if should_add_license != 'y' and should_add_license != 'yes':
438 |             image_license['url'] = ''
439 |             image_license['name'] = 'None'
440 |         else:
441 |             image_license['name'] = input('License name: ')
442 |             image_license['url'] = input('License URL: ')
443 | 
444 |         dataset_info = dict()
445 |         dataset_info['info'] = info
446 |         dataset_info['license'] = image_license
447 | 
448 |         # Write the JSON output file
449 |         output_file_path = Path(self.output_dir) / 'dataset_info.json'
450 |         with open(output_file_path, 'w+') as json_file:
451 |             json_file.write(json.dumps(dataset_info))
452 | 
453 |         print(f'Successfully created {output_file_path}')
454 | 
455 | 
456 |     # Start here
457 |     def main(self, args):
458 |         self._validate_and_process_args(args)
459 |         self._generate_images()
460 |         self._create_info()
461 |         print('Image composition completed.')
462 | 
463 | if __name__ == "__main__":
464 |     import argparse
465 | 
466 |     parser = argparse.ArgumentParser(description="Image Composition")
467 |     parser.add_argument("--input_dir", type=str, dest="input_dir", required=True, help="The input directory. \
468 |         This contains a 'backgrounds' directory of pngs or jpgs, and a 'foregrounds' directory which \
469 |         contains supercategory directories (e.g. 'animal', 'vehicle'), each of which contains category \
470 |         directories (e.g. 'horse', 'bear'). Each category directory contains png images of that item on a \
471 |         transparent background (e.g. a grizzly bear on a transparent background).")
472 |     parser.add_argument("--output_dir", type=str, dest="output_dir", required=True, help="The directory where images, masks, \
473 |         and json files will be placed")
474 |     parser.add_argument("--count", type=int, dest="count", required=True, help="number of composed images to create")
475 |     parser.add_argument("--width", type=int, dest="width", required=True, help="output image pixel width")
476 |     parser.add_argument("--height", type=int, dest="height", required=True, help="output image pixel height")
477 |     parser.add_argument("--output_type", type=str, dest="output_type", help="png or jpg (default)")
478 |     parser.add_argument("--silent", action='store_true', help="silent mode; doesn't prompt the user for input, \
479 |         automatically overwrites files")
480 | 
481 |     args = parser.parse_args()
482 | 
483 |     image_comp = ImageComposition()
484 |     image_comp.main(args)
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | jupyter
2 | notebook
3 | Pillow
4 | numpy
5 | scikit-image
6 | scipy
7 | tqdm
8 | Shapely
--------------------------------------------------------------------------------
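The mask-coloring step in `_compose_images` (threshold the pasted alpha channel, then multiply the resulting 0/1 array by each RGB component and stack the channels) can be checked in isolation. The following is a minimal numpy-only sketch of that step; the helper name `isolate_mask` and the sample alpha values are illustrative and not part of the script, and Pillow is omitted since the logic operates on plain arrays:

```python
import numpy as np

def isolate_mask(alpha, mask_rgb_color, alpha_threshold=200):
    # Mirror of the thresholding step in ImageComposition._compose_images:
    # pixels with alpha above the threshold become 1, everything else 0...
    binary = np.greater(alpha, alpha_threshold).astype(np.uint8)
    # ...then each RGB channel is the 0/1 array times that channel's color value
    channels = [binary * c for c in mask_rgb_color]
    return np.dstack(channels)  # H x W x 3 array, solid color where opaque

# A 2x2 alpha channel: fully opaque, fully transparent, 210 (above
# the 200 threshold), and 199 (just below it)
alpha = np.array([[255, 0], [210, 199]], dtype=np.uint8)
mask = isolate_mask(alpha, (255, 0, 0))
```

In the script itself this array is handed to `Image.fromarray(..., 'RGB')` and composited onto the running `composite_mask`, so each foreground ends up painted in its own `mask_rgb_color`.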