├── .gitignore
├── LICENSE
├── README.md
├── datasets
    └── README.md
├── docs
    └── getting-started.md
├── notebooks
    ├── coco_image_viewer.ipynb
    └── train_mask_rcnn.ipynb
├── python
    ├── coco_json_utils.py
    └── image_composition.py
└── requirements.txt


/.gitignore:
--------------------------------------------------------------------------------
  1 | # Byte-compiled / optimized / DLL files
  2 | __pycache__/
  3 | *.py[cod]
  4 | *$py.class
  5 | 
  6 | # C extensions
  7 | *.so
  8 | 
  9 | # Distribution / packaging
 10 | .Python
 11 | build/
 12 | develop-eggs/
 13 | dist/
 14 | downloads/
 15 | eggs/
 16 | .eggs/
 17 | lib/
 18 | lib64/
 19 | parts/
 20 | sdist/
 21 | var/
 22 | wheels/
 23 | *.egg-info/
 24 | .installed.cfg
 25 | *.egg
 26 | MANIFEST
 27 | 
 28 | # PyInstaller
 29 | #  Usually these files are written by a python script from a template
 30 | #  before PyInstaller builds the exe, so as to inject date/other infos into it.
 31 | *.manifest
 32 | *.spec
 33 | 
 34 | # Installer logs
 35 | pip-log.txt
 36 | pip-delete-this-directory.txt
 37 | 
 38 | # Unit test / coverage reports
 39 | htmlcov/
 40 | .tox/
 41 | .coverage
 42 | .coverage.*
 43 | .cache
 44 | nosetests.xml
 45 | coverage.xml
 46 | *.cover
 47 | .hypothesis/
 48 | .pytest_cache/
 49 | 
 50 | # Translations
 51 | *.mo
 52 | *.pot
 53 | 
 54 | # Django stuff:
 55 | *.log
 56 | local_settings.py
 57 | db.sqlite3
 58 | 
 59 | # Flask stuff:
 60 | instance/
 61 | .webassets-cache
 62 | 
 63 | # Scrapy stuff:
 64 | .scrapy
 65 | 
 66 | # Sphinx documentation
 67 | docs/_build/
 68 | 
 69 | # PyBuilder
 70 | target/
 71 | 
 72 | # Jupyter Notebook
 73 | .ipynb_checkpoints
 74 | 
 75 | # pyenv
 76 | .python-version
 77 | 
 78 | # celery beat schedule file
 79 | celerybeat-schedule
 80 | 
 81 | # SageMath parsed files
 82 | *.sage.py
 83 | 
 84 | # Environments
 85 | .env
 86 | .venv
 87 | env/
 88 | venv/
 89 | ENV/
 90 | env.bak/
 91 | venv.bak/
 92 | 
 93 | # Spyder project settings
 94 | .spyderproject
 95 | .spyproject
 96 | 
 97 | # Rope project settings
 98 | .ropeproject
 99 | 
100 | # mkdocs documentation
101 | /site
102 | 
103 | # mypy
104 | .mypy_cache/
105 | 
106 | # Specific to CocoSynth
107 | datasets/*
108 | !datasets/README.md
109 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2019 Adam Kelly
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # cocosynth
 2 | COCO Synth provides tools for creating synthetic COCO datasets.
 3 | 
 4 | # Complete Guide to Creating COCO Datasets
 5 | ![Complete Guide to Creating COCO Datasets Course Image](https://images.squarespace-cdn.com/content/v1/55652c24e4b0edcadf841347/1586723596485-FI1W99L5F0X6X0VP36SW/ke17ZwdGBToddI8pDm48kJFjiAAEKQOxhtR6kyGixEZZw-zPPgdn4jUwVcJE1ZvWQUxwkmyExglNqGp0IvTJZamWLI2zvYWH8K3-s_4yszcp2ryTI0HqTOaaUohrI8PIZG0-o4ErKRJfhIwgspvy036Ezj4M485dTMevEG-VX_E/creating-coco-datasets.png)
 6 | 
 7 | This code repo is a companion to a Udemy course for developers who'd like a step by step walk-through of how to create a synthetic COCO dataset from scratch. When you enroll, you'll get a full walkthrough of how all of the code in this repo works. When you finish, you'll have a COCO dataset with your own custom categories and a trained Mask R-CNN.
 8 | 
 9 | Learn more at [https://www.immersivelimit.com/course/creating-coco-datasets](https://www.immersivelimit.com/course/creating-coco-datasets)
10 | 
11 | Follow my various social media channels listed at [immersivelimit.com/connect](http://www.immersivelimit.com/connect) for updates!
12 | 
13 | # Getting Started
14 | Check out the [Getting Started](./docs/getting-started.md) guide. It will walk you through the scripts with a sample dataset.
15 | 


--------------------------------------------------------------------------------
/datasets/README.md:
--------------------------------------------------------------------------------
 1 | # TL;DR
 2 | This directory is intentionally empty to save space. You can put datasets here.
 3 | 
 4 | # More Detailed Explanation
 5 | This directory is intended to contain datasets that the scripts will access, however they are intentionally not included in the git repo to save space. You may safely leave datasets in here and the .gitignore file will ensure that they are not included in git commits.
 6 | 
 7 | # Datasets
 8 | Here are some datasets you can download to populate this directory.
 9 | If any are missing or have broken links, please create an issue on GitHub or contact me via one of the various channels listed at [immersivelimit.com/connect](http://www.immersivelimit.com/connect)
10 | 
11 | ## box_dataset_synthetic
12 | A complete synthetic dataset of boxes. [Download](https://immersivelimit.page.link/gnbR)
13 | 
14 | ## weeds
15 | Foreground and background images for the [Weed Detector](http://www.immersivelimit.com/blog/ai-weed-detector) case study I did. [Download](https://immersivelimit.page.link/euJu)
16 | 
17 | Test Images and Videos for the Weed Detector. [Download](https://immersivelimit.page.link/Zk6V)
18 | 
19 | Note: The full dataset I created was several gigabytes of 1024x1024 images so I'm not including it here. You can use this git repo to generate your own. For reference, it contained 10k training images and 1k validation images.
20 | 


--------------------------------------------------------------------------------
/docs/getting-started.md:
--------------------------------------------------------------------------------
 1 | # Getting Started with COCO Synth
 2 | 
 3 | ## Initial Setup
 4 | I highly recommend using [Anaconda](https://docs.anaconda.com/anaconda/install/) for Python environment management. It will help you install Shapely, which I've had some problems installing with pip.
 5 | 
 6 | Once you have Anaconda installed...
 7 | 
 8 | On Windows:
 9 | ```
10 | conda create -n cocosynth python=3.6
11 | activate cocosynth
12 | conda install -c conda-forge shapely
13 | pip install -r requirements.txt
14 | ```
15 | On Linux (and I assume Mac):
16 | ```
17 | conda create -n cocosynth python=3.6
18 | source activate cocosynth
19 | conda install -c conda-forge shapely
20 | pip install -r requirements.txt
21 | ```
22 | 
23 | # View Segmentations with the COCO Image Viewer
24 | Fire up Jupyter Notebook.
25 | ```
26 | jupyter notebook
27 | ```
28 | Open up "../notebooks/coco_image_viewer.ipynb" and run through the cells in the notebook. Pay attention to the file paths. They are set up to work with this guide. If everything works correctly, you'll be able to view an image with image segmentation overlays.
29 | 
30 | # Create Synthetic Images and Masks
31 | In this section, we will use "image_composition.py" to randomly pick foregrounds and automatically super-impose them on backgrounds. You will need a number of foreground cutouts with transparent backgrounds. For example, you might have a picture of an eagle with a completely transparent background. Due to the need for transparency, these images should be .png format (.jpg doesn't have transparency). I cut out my foregrounds with [GIMP](https://www.gimp.org/), which is free.
32 | 
33 | ## Using the sample dataset
34 | For this guide, all examples assume you'll be using the box_dataset_synthetic sample dataset. Find it here [../datasets/README.md](../datasets/README.md). Download it and extract the contents to "../datasets/box_dataset_synthetic".
35 | 
36 | ## Custom dataset directory setup:
37 | - Inside the "datasets" directory, create a new folder for your dataset (e.g. "wild_animal_dataset")
38 | - Inside that dataset directory, create a folder called "input"
39 | - Inside "input", create two folders called "foregrounds" and "backgrounds"
40 | - Inside "foregrounds", create a folder for each super category (e.g. "bird", "lizard")
41 | - Inside each foreground super category folder, create a folder for each category (e.g. "eagle", "owl")
42 | - Inside each category folder, add all foreground photos you intend to use for the respective category (e.g. all of you eagle foreground cutouts)
43 | - Inside "backgrounds", add all background photos you intend to use
44 | 
45 | Run "image_composition.py" to create your images and masks
46 | ```
47 | python ./python/image_composition.py --input_dir ./datasets/box_dataset_synthetic/input --output_dir ./datasets/box_dataset_synthetic/output --count 10 --width 512 --height 512
48 | ```
49 | 
50 | # Create COCO Instances JSON
51 | Now we're going to use the images, masks, and json to create COCO instances.
52 | 
53 | Optional: Run "coco_json_utils.py" with --help to see the documentation. This will explain the next command.
54 | ```
55 | python ./python/coco_json_utils.py --help
56 | ```
57 | Run the command with the correct parameters
58 | ```
59 | python ./python/coco_json_utils.py -md ./datasets/box_dataset_synthetic/output/mask_definitions.json -di ./datasets/box_dataset_synthetic/output/dataset_info.json
60 | ```
61 | 
62 | You will now have a new json file called "coco_instances.json". This is contains all of your COCO json!
63 | 
64 | 


--------------------------------------------------------------------------------
/python/coco_json_utils.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/python
  2 | 
  3 | import numpy as np
  4 | import json
  5 | from pathlib import Path
  6 | from tqdm import tqdm
  7 | from skimage import measure, io
  8 | from shapely.geometry import Polygon, MultiPolygon
  9 | from PIL import Image
 10 | 
 11 | class InfoJsonUtils():
 12 |     """ Creates an info object to describe a COCO dataset
 13 |     """
 14 |     def create_coco_info(self, description, url, version, year, contributor, date_created):
 15 |         """ Creates the "info" portion of COCO json
 16 |         """
 17 |         info = dict()
 18 |         info['description'] = description
 19 |         info['url'] = url
 20 |         info['version'] = version
 21 |         info['year'] = year
 22 |         info['contributor'] = contributor
 23 |         info['date_created'] = date_created
 24 | 
 25 |         return info
 26 | 
 27 | class LicenseJsonUtils():
 28 |     """ Creates a license object to describe a COCO dataset
 29 |     """
 30 |     def create_coco_license(self, url, license_id, name):
 31 |         """ Creates the "licenses" portion of COCO json
 32 |         """
 33 |         lic = dict()
 34 |         lic['url'] = url
 35 |         lic['id'] = license_id
 36 |         lic['name'] = name
 37 | 
 38 |         return lic
 39 | 
 40 | class CategoryJsonUtils():
 41 |     """ Creates a category object to describe a COCO dataset
 42 |     """
 43 |     def create_coco_category(self, supercategory, category_id, name):
 44 |         category = dict()
 45 |         category['supercategory'] = supercategory
 46 |         category['id'] = category_id
 47 |         category['name'] = name
 48 | 
 49 |         return category
 50 | 
 51 | class ImageJsonUtils():
 52 |     """ Creates an image object to describe a COCO dataset
 53 |     """
 54 |     def create_coco_image(self, image_path, image_id, image_license):
 55 |         """ Creates the "image" portion of COCO json
 56 |         """
 57 |         # Open the image and get the size
 58 |         image_file = Image.open(image_path)
 59 |         width, height = image_file.size
 60 | 
 61 |         image = dict()
 62 |         image['license'] = image_license
 63 |         image['file_name'] = image_path.name
 64 |         image['width'] = width
 65 |         image['height'] = height
 66 |         image['id'] = image_id
 67 | 
 68 |         return image
 69 | 
 70 | class AnnotationJsonUtils():
 71 |     """ Creates an annotation object to describe a COCO dataset
 72 |     """
 73 |     def __init__(self):
 74 |         self.annotation_id_index = 0
 75 | 
 76 |     def create_coco_annotations(self, image_mask_path, image_id, category_ids):
 77 |         """ Takes a pixel-based RGB image mask and creates COCO annotations.
 78 |         Args:
 79 |             image_mask_path: a pathlib.Path to the image mask
 80 |             image_id: the integer image id
 81 |             category_ids: a dictionary of integer category ids keyed by RGB color (a tuple converted to a string)
 82 |                 e.g. {'(255, 0, 0)': {'category': 'owl', 'super_category': 'bird'} }
 83 |         Returns:
 84 |             annotations: a list of COCO annotation dictionaries that can
 85 |             be converted to json. e.g.:
 86 |             {
 87 |                 "segmentation": [[101.79,307.32,69.75,281.11,...,100.05,309.66]],
 88 |                 "area": 51241.3617,
 89 |                 "iscrowd": 0,
 90 |                 "image_id": 284725,
 91 |                 "bbox": [68.01,134.89,433.41,174.77],
 92 |                 "category_id": 6,
 93 |                 "id": 165690
 94 |             }
 95 |         """
 96 |         # Set class variables
 97 |         self.image_id = image_id
 98 |         self.category_ids = category_ids
 99 | 
100 |         # Make sure keys in category_ids are strings
101 |         for key in self.category_ids.keys():
102 |             if type(key) is not str:
103 |                 raise TypeError('category_ids keys must be strings (e.g. "(0, 0, 255)")')
104 |             break
105 | 
106 |         # Open and process image
107 |         self.mask_image = Image.open(image_mask_path)
108 |         self.mask_image = self.mask_image.convert('RGB')
109 |         self.width, self.height = self.mask_image.size
110 | 
111 |         # Split up the multi-colored masks into multiple 0/1 bit masks
112 |         self._isolate_masks()
113 | 
114 |         # Create annotations from the masks
115 |         self._create_annotations()
116 | 
117 |         return self.annotations
118 | 
119 |     def _isolate_masks(self):
120 |         # Breaks mask up into isolated masks based on color
121 | 
122 |         self.isolated_masks = dict()
123 |         # for x in range(self.width):
124 |         #     for y in range(self.height):
125 |         #         pixel_rgb = self.mask_image.getpixel((x,y))
126 |         #         pixel_rgb_str = str(pixel_rgb)
127 | 
128 |         #         # If the pixel is any color other than black, add it to a respective isolated image mask
129 |         #         if not pixel_rgb == (0, 0, 0):
130 |         #             if self.isolated_masks.get(pixel_rgb_str) is None:
131 |         #                 # Isolated mask doesn't have its own image yet, create one
132 |         #                 # with 1-bit pixels, default black. Make room for 1 pixel of
133 |         #                 # padding on each edge to allow the contours algorithm to work
134 |         #                 # when shapes bleed up to the edge
135 |         #                 self.isolated_masks[pixel_rgb_str] = Image.new('1', (self.width + 2, self.height + 2))
136 | 
137 |         #             # Add the pixel to the mask image, shifting by 1 pixel to account for padding
138 |         #             self.isolated_masks[pixel_rgb_str].putpixel((x + 1, y + 1), 1)
139 | 
140 |         # This is a much faster way to split masks using Numpy
141 |         arr = np.array(self.mask_image, dtype=np.uint32)
142 |         rgb32 = (arr[:,:,0] << 16) + (arr[:,:,1] << 8) + arr[:,:,2]
143 |         unique_values = np.unique(rgb32)
144 |         for u in unique_values:
145 |             if u != 0:
146 |                 r = int((u & (255 << 16)) >> 16)
147 |                 g = int((u & (255 << 8)) >> 8)
148 |                 b = int(u & 255)
149 |                 self.isolated_masks[str((r, g, b))] = np.equal(rgb32, u)
150 | 
151 |     def _create_annotations(self):
152 |         # Creates annotations for each isolated mask
153 | 
154 |         # Each image may have multiple annotations, so create an array
155 |         self.annotations = []
156 |         for key, mask in self.isolated_masks.items():
157 |             annotation = dict()
158 |             annotation['segmentation'] = []
159 |             annotation['iscrowd'] = 0
160 |             annotation['image_id'] = self.image_id
161 |             if not self.category_ids.get(key):
162 |                 print(f'category color not found: {key}; check for missing category or antialiasing')
163 |                 continue
164 |             annotation['category_id'] = self.category_ids[key]
165 |             annotation['id'] = self._next_annotation_id()
166 | 
167 |             # Find contours in the isolated mask
168 |             mask = np.asarray(mask, dtype=np.float32)
169 |             contours = measure.find_contours(mask, 0.5, positive_orientation='low')
170 | 
171 |             polygons = []
172 |             for contour in contours:
173 |                 # Flip from (row, col) representation to (x, y)
174 |                 # and subtract the padding pixel
175 |                 for i in range(len(contour)):
176 |                     row, col = contour[i]
177 |                     contour[i] = (col - 1, row - 1)
178 | 
179 |                 # Make a polygon and simplify it
180 |                 poly = Polygon(contour)
181 |                 poly = poly.simplify(1.0, preserve_topology=False)
182 | 
183 |                 if (poly.area > 16): # Ignore tiny polygons
184 |                     if (poly.geom_type == 'MultiPolygon'):
185 |                         # if MultiPolygon, take the smallest convex Polygon containing all the points in the object
186 |                         poly = poly.convex_hull
187 | 
188 |                     if (poly.geom_type == 'Polygon'): # Ignore if still not a Polygon (could be a line or point)
189 |                         polygons.append(poly)
190 |                         segmentation = np.array(poly.exterior.coords).ravel().tolist()
191 |                         annotation['segmentation'].append(segmentation)
192 | 
193 |             if len(polygons) == 0:
194 |                 # This item doesn't have any visible polygons, ignore it
195 |                 # (This can happen if a randomly placed foreground is covered up
196 |                 #  by other foregrounds)
197 |                 continue
198 | 
199 |             # Combine the polygons to calculate the bounding box and area
200 |             multi_poly = MultiPolygon(polygons)
201 |             x, y, max_x, max_y = multi_poly.bounds
202 |             self.width = max_x - x
203 |             self.height = max_y - y
204 |             annotation['bbox'] = (x, y, self.width, self.height)
205 |             annotation['area'] = multi_poly.area
206 | 
207 |             # Finally, add this annotation to the list
208 |             self.annotations.append(annotation)
209 | 
210 |     def _next_annotation_id(self):
211 |         # Gets the next annotation id
212 |         # Note: This is not a unique id. It simply starts at 0 and increments each time it is called
213 | 
214 |         a_id = self.annotation_id_index
215 |         self.annotation_id_index += 1
216 |         return a_id
217 | 
218 | class CocoJsonCreator():
219 |     def validate_and_process_args(self, args):
220 |         """ Validates the arguments coming in from the command line and performs
221 |             initial processing
222 |         Args:
223 |             args: ArgumentParser arguments
224 |         """
225 |         # Validate the mask definition file exists
226 |         mask_definition_file = Path(args.mask_definition)
227 |         if not (mask_definition_file.exists and mask_definition_file.is_file()):
228 |             raise FileNotFoundError(f'mask definition file was not found: {mask_definition_file}')
229 | 
230 |         # Load the mask definition json
231 |         with open(mask_definition_file) as json_file:
232 |             self.mask_definitions = json.load(json_file)
233 | 
234 |         self.dataset_dir = mask_definition_file.parent
235 | 
236 |         # Validate the dataset info file exists
237 |         dataset_info_file = Path(args.dataset_info)
238 |         if not (dataset_info_file.exists() and dataset_info_file.is_file()):
239 |             raise FileNotFoundError(f'dataset info file was not found: {dataset_info_file}')
240 | 
241 |         # Load the dataset info json
242 |         with open(dataset_info_file) as json_file:
243 |             self.dataset_info = json.load(json_file)
244 | 
245 |         assert 'info' in self.dataset_info, 'dataset_info JSON was missing "info"'
246 |         assert 'license' in self.dataset_info, 'dataset_info JSON was missing "license"'
247 | 
248 |     def create_info(self):
249 |         """ Creates the "info" piece of the COCO json
250 |         """
251 |         info_json = self.dataset_info['info']
252 |         iju = InfoJsonUtils()
253 |         return iju.create_coco_info(
254 |             description = info_json['description'],
255 |             version = info_json['version'],
256 |             url = info_json['url'],
257 |             year = info_json['year'],
258 |             contributor = info_json['contributor'],
259 |             date_created = info_json['date_created']
260 |         )
261 | 
262 |     def create_licenses(self):
263 |         """ Creates the "license" portion of the COCO json
264 |         """
265 |         license_json = self.dataset_info['license']
266 |         lju = LicenseJsonUtils()
267 |         lic = lju.create_coco_license(
268 |             url = license_json['url'],
269 |             license_id = license_json['id'],
270 |             name = license_json['name']
271 |         )
272 |         return [lic]
273 | 
274 |     def create_categories(self):
275 |         """ Creates the "categories" portion of the COCO json
276 |         Returns:
277 |             categories: category objects that become part of the final json
278 |             category_ids_by_name: a lookup dictionary for category ids based
279 |                 on the name of the category
280 |         """
281 |         cju = CategoryJsonUtils()
282 |         categories = []
283 |         category_ids_by_name = dict()
284 |         category_id = 1 # 0 is reserved for the background
285 | 
286 |         super_categories = self.mask_definitions['super_categories']
287 |         for super_category, _categories in super_categories.items():
288 |             for category_name in _categories:
289 |                 categories.append(cju.create_coco_category(super_category, category_id, category_name))
290 |                 category_ids_by_name[category_name] = category_id
291 |                 category_id += 1
292 | 
293 |         return categories, category_ids_by_name
294 | 
295 |     def create_images_and_annotations(self, category_ids_by_name):
296 |         """ Creates the list of images (in json) and the annotations for each
297 |             image for the "image" and "annotations" portions of the COCO json
298 |         """
299 |         iju = ImageJsonUtils()
300 |         aju = AnnotationJsonUtils()
301 | 
302 |         image_objs = []
303 |         annotation_objs = []
304 |         image_license = self.dataset_info['license']['id']
305 |         image_id = 0
306 | 
307 |         mask_count = len(self.mask_definitions['masks'])
308 |         print(f'Processing {mask_count} mask definitions...')
309 | 
310 |         # For each mask definition, create image and annotations
311 |         for file_name, mask_def in tqdm(self.mask_definitions['masks'].items()):
312 |             # Create a coco image json item
313 |             image_path = Path(self.dataset_dir) / file_name
314 |             image_obj = iju.create_coco_image(
315 |                 image_path,
316 |                 image_id,
317 |                 image_license)
318 |             image_objs.append(image_obj)
319 | 
320 |             mask_path = Path(self.dataset_dir) / mask_def['mask']
321 | 
322 |             # Create a dict of category ids keyed by rgb_color
323 |             category_ids_by_rgb = dict()
324 |             for rgb_color, category in mask_def['color_categories'].items():
325 |                 category_ids_by_rgb[rgb_color] = category_ids_by_name[category['category']]
326 |             annotation_obj = aju.create_coco_annotations(mask_path, image_id, category_ids_by_rgb)
327 |             annotation_objs += annotation_obj # Add the new annotations to the existing list
328 |             image_id += 1
329 | 
330 |         return image_objs, annotation_objs
331 | 
332 |     def main(self, args):
333 |         self.validate_and_process_args(args)
334 | 
335 |         info = self.create_info()
336 |         licenses = self.create_licenses()
337 |         categories, category_ids_by_name = self.create_categories()
338 |         images, annotations = self.create_images_and_annotations(category_ids_by_name)
339 | 
340 |         master_obj = {
341 |             'info': info,
342 |             'licenses': licenses,
343 |             'images': images,
344 |             'annotations': annotations,
345 |             'categories': categories
346 |         }
347 | 
348 |         # Write the json to a file
349 |         output_path = Path(self.dataset_dir) / 'coco_instances.json'
350 |         with open(output_path, 'w+') as output_file:
351 |             json.dump(master_obj, output_file)
352 | 
353 |         print(f'Annotations successfully written to file:\n{output_path}')
354 | 
355 | if __name__ == "__main__":
356 |     import argparse
357 | 
358 |     parser = argparse.ArgumentParser(description="Generate COCO JSON")
359 | 
360 |     parser.add_argument("-md", "--mask_definition", dest="mask_definition",
361 |         help="path to a mask definition JSON file, generated by MaskJsonUtils module")
362 |     parser.add_argument("-di", "--dataset_info", dest="dataset_info",
363 |         help="path to a dataset info JSON file")
364 | 
365 |     args = parser.parse_args()
366 | 
367 |     cjc = CocoJsonCreator()
368 |     cjc.main(args)
369 | 


--------------------------------------------------------------------------------
/python/image_composition.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/env python3
  2 | 
  3 | import json
  4 | import warnings
  5 | import random
  6 | import numpy as np
  7 | from datetime import datetime
  8 | from pathlib import Path
  9 | from tqdm import tqdm
 10 | from PIL import Image, ImageEnhance
 11 | 
 12 | class MaskJsonUtils():
 13 |     """ Creates a JSON definition file for image masks.
 14 |     """
 15 | 
 16 |     def __init__(self, output_dir):
 17 |         """ Initializes the class.
 18 |         Args:
 19 |             output_dir: the directory where the definition file will be saved
 20 |         """
 21 |         self.output_dir = output_dir
 22 |         self.masks = dict()
 23 |         self.super_categories = dict()
 24 | 
 25 |     def add_category(self, category, super_category):
 26 |         """ Adds a new category to the set of the corresponding super_category
 27 |         Args:
 28 |             category: e.g. 'eagle'
 29 |             super_category: e.g. 'bird'
 30 |         Returns:
 31 |             True if successful, False if the category was already in the dictionary
 32 |         """
 33 |         if not self.super_categories.get(super_category):
 34 |             # Super category doesn't exist yet, create a new set
 35 |             self.super_categories[super_category] = {category}
 36 |         elif category in self.super_categories[super_category]:
 37 |             # Category is already accounted for
 38 |             return False
 39 |         else:
 40 |             # Add the category to the existing super category set
 41 |             self.super_categories[super_category].add(category)
 42 | 
 43 |         return True # Addition was successful
 44 | 
 45 |     def add_mask(self, image_path, mask_path, color_categories):
 46 |         """ Takes an image path, its corresponding mask path, and its color categories,
 47 |             and adds it to the appropriate dictionaries
 48 |         Args:
 49 |             image_path: the relative path to the image, e.g. './images/00000001.png'
 50 |             mask_path: the relative path to the mask image, e.g. './masks/00000001.png'
 51 |             color_categories: the legend of color categories, for this particular mask,
 52 |                 represented as an rgb-color keyed dictionary of category names and their super categories.
 53 |                 (the color category associations are not assumed to be consistent across images)
 54 |         Returns:
 55 |             True if successful, False if the image was already in the dictionary
 56 |         """
 57 |         if self.masks.get(image_path):
 58 |             return False # image/mask is already in the dictionary
 59 | 
 60 |         # Create the mask definition
 61 |         mask = {
 62 |             'mask': mask_path,
 63 |             'color_categories': color_categories
 64 |         }
 65 | 
 66 |         # Add the mask definition to the dictionary of masks
 67 |         self.masks[image_path] = mask
 68 | 
 69 |         # Regardless of color, we need to store each new category under its supercategory
 70 |         for _, item in color_categories.items():
 71 |             self.add_category(item['category'], item['super_category'])
 72 | 
 73 |         return True # Addition was successful
 74 | 
 75 |     def get_masks(self):
 76 |         """ Gets all masks that have been added
 77 |         """
 78 |         return self.masks
 79 | 
 80 |     def get_super_categories(self):
 81 |         """ Gets the dictionary of super categories for each category in a JSON
 82 |             serializable format
 83 |         Returns:
 84 |             A dictionary of lists of categories keyed on super_category
 85 |         """
 86 |         serializable_super_cats = dict()
 87 |         for super_cat, categories_set in self.super_categories.items():
 88 |             # Sets are not json serializable, so convert to list
 89 |             serializable_super_cats[super_cat] = list(categories_set)
 90 |         return serializable_super_cats
 91 | 
 92 |     def write_masks_to_json(self):
 93 |         """ Writes all masks and color categories to the output file path as JSON
 94 |         """
 95 |         # Serialize the masks and super categories dictionaries
 96 |         serializable_masks = self.get_masks()
 97 |         serializable_super_cats = self.get_super_categories()
 98 |         masks_obj = {
 99 |             'masks': serializable_masks,
100 |             'super_categories': serializable_super_cats
101 |         }
102 | 
103 |         # Write the JSON output file
104 |         output_file_path = Path(self.output_dir) / 'mask_definitions.json'
105 |         with open(output_file_path, 'w+') as json_file:
106 |             json_file.write(json.dumps(masks_obj))
107 | 
108 | class ImageComposition():
109 |     """ Composes images together in random ways, applying transformations to the foreground to create a synthetic
110 |         combined image.
111 |     """
112 | 
113 |     def __init__(self):
114 |         self.allowed_output_types = ['.png', '.jpg', '.jpeg']
115 |         self.allowed_background_types = ['.png', '.jpg', '.jpeg']
116 |         self.zero_padding = 8 # 00000027.png, supports up to 100 million images
117 |         self.max_foregrounds = 3
118 |         self.mask_colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
119 |         assert len(self.mask_colors) >= self.max_foregrounds, 'length of mask_colors should be >= max_foregrounds'
120 | 
121 |     def _validate_and_process_args(self, args):
122 |         # Validates input arguments and sets up class variables
123 |         # Args:
124 |         #     args: the ArgumentParser command line arguments
125 | 
126 |         self.silent = args.silent
127 | 
128 |         # Validate the count
129 |         assert args.count > 0, 'count must be greater than 0'
130 |         self.count = args.count
131 | 
132 |         # Validate the width and height
133 |         assert args.width >= 64, 'width must be greater than 64'
134 |         self.width = args.width
135 |         assert args.height >= 64, 'height must be greater than 64'
136 |         self.height = args.height
137 | 
138 |         # Validate and process the output type
139 |         if args.output_type is None:
140 |             self.output_type = '.jpg' # default
141 |         else:
142 |             if args.output_type[0] != '.':
143 |                 self.output_type = f'.{args.output_type}'
144 |             assert self.output_type in self.allowed_output_types, f'output_type is not supported: {self.output_type}'
145 | 
146 |         # Validate and process output and input directories
147 |         self._validate_and_process_output_directory()
148 |         self._validate_and_process_input_directory()
149 | 
150 |     def _validate_and_process_output_directory(self):
151 |         self.output_dir = Path(args.output_dir)
152 |         self.images_output_dir = self.output_dir / 'images'
153 |         self.masks_output_dir = self.output_dir / 'masks'
154 | 
155 |         # Create directories
156 |         self.output_dir.mkdir(exist_ok=True)
157 |         self.images_output_dir.mkdir(exist_ok=True)
158 |         self.masks_output_dir.mkdir(exist_ok=True)
159 | 
160 |         if not self.silent:
161 |             # Check for existing contents in the images directory
162 |             for _ in self.images_output_dir.iterdir():
163 |                 # We found something, check if the user wants to overwrite files or quit
164 |                 should_continue = input('output_dir is not empty, files may be overwritten.\nContinue (y/n)? ').lower()
165 |                 if should_continue != 'y' and should_continue != 'yes':
166 |                     quit()
167 |                 break
168 | 
169 |     def _validate_and_process_input_directory(self):
170 |         self.input_dir = Path(args.input_dir)
171 |         assert self.input_dir.exists(), f'input_dir does not exist: {args.input_dir}'
172 | 
173 |         for x in self.input_dir.iterdir():
174 |             if x.name == 'foregrounds':
175 |                 self.foregrounds_dir = x
176 |             elif x.name == 'backgrounds':
177 |                 self.backgrounds_dir = x
178 | 
179 |         assert self.foregrounds_dir is not None, 'foregrounds sub-directory was not found in the input_dir'
180 |         assert self.backgrounds_dir is not None, 'backgrounds sub-directory was not found in the input_dir'
181 | 
182 |         self._validate_and_process_foregrounds()
183 |         self._validate_and_process_backgrounds()
184 | 
185 |     def _validate_and_process_foregrounds(self):
186 |         # Validates input foregrounds and processes them into a foregrounds dictionary.
187 |         # Expected directory structure:
188 |         # + foregrounds_dir
189 |         #     + super_category_dir
190 |         #         + category_dir
191 |         #             + foreground_image.png
192 | 
193 |         self.foregrounds_dict = dict()
194 | 
195 |         for super_category_dir in self.foregrounds_dir.iterdir():
196 |             if not super_category_dir.is_dir():
197 |                 warnings.warn(f'file found in foregrounds directory (expected super-category directories), ignoring: {super_category_dir}')
198 |                 continue
199 | 
200 |             # This is a super category directory
201 |             for category_dir in super_category_dir.iterdir():
202 |                 if not category_dir.is_dir():
203 |                     warnings.warn(f'file found in super category directory (expected category directories), ignoring: {category_dir}')
204 |                     continue
205 | 
206 |                 # This is a category directory
207 |                 for image_file in category_dir.iterdir():
208 |                     if not image_file.is_file():
209 |                         warnings.warn(f'a directory was found inside a category directory, ignoring: {str(image_file)}')
210 |                         continue
211 |                     if image_file.suffix != '.png':
212 |                         warnings.warn(f'foreground must be a .png file, skipping: {str(image_file)}')
213 |                         continue
214 | 
215 |                     # Valid foreground image, add to foregrounds_dict
216 |                     super_category = super_category_dir.name
217 |                     category = category_dir.name
218 | 
219 |                     if super_category not in self.foregrounds_dict:
220 |                         self.foregrounds_dict[super_category] = dict()
221 | 
222 |                     if category not in self.foregrounds_dict[super_category]:
223 |                         self.foregrounds_dict[super_category][category] = []
224 | 
225 |                     self.foregrounds_dict[super_category][category].append(image_file)
226 | 
227 |         assert len(self.foregrounds_dict) > 0, 'no valid foregrounds were found'
228 | 
229 |     def _validate_and_process_backgrounds(self):
230 |         self.backgrounds = []
231 |         for image_file in self.backgrounds_dir.iterdir():
232 |             if not image_file.is_file():
233 |                 warnings.warn(f'a directory was found inside the backgrounds directory, ignoring: {image_file}')
234 |                 continue
235 | 
236 |             if image_file.suffix not in self.allowed_background_types:
237 |                 warnings.warn(f'background must match an accepted type {str(self.allowed_background_types)}, ignoring: {image_file}')
238 |                 continue
239 | 
240 |             # Valid file, add to backgrounds list
241 |             self.backgrounds.append(image_file)
242 | 
243 |         assert len(self.backgrounds) > 0, 'no valid backgrounds were found'
244 | 
245 |     def _generate_images(self):
246 |         # Generates a number of images and creates segmentation masks, then
247 |         # saves a mask_definitions.json file that describes the dataset.
248 | 
249 |         print(f'Generating {self.count} images with masks...')
250 | 
251 |         mju = MaskJsonUtils(self.output_dir)
252 | 
253 |         # Create all images/masks (with tqdm to have a progress bar)
254 |         for i in tqdm(range(self.count)):
255 |             # Randomly choose a background
256 |             background_path = random.choice(self.backgrounds)
257 | 
258 |             num_foregrounds = random.randint(1, self.max_foregrounds)
259 |             foregrounds = []
260 |             for fg_i in range(num_foregrounds):
261 |                 # Randomly choose a foreground
262 |                 super_category = random.choice(list(self.foregrounds_dict.keys()))
263 |                 category = random.choice(list(self.foregrounds_dict[super_category].keys()))
264 |                 foreground_path = random.choice(self.foregrounds_dict[super_category][category])
265 | 
266 |                 # Get the color
267 |                 mask_rgb_color = self.mask_colors[fg_i]
268 | 
269 |                 foregrounds.append({
270 |                     'super_category':super_category,
271 |                     'category':category,
272 |                     'foreground_path':foreground_path,
273 |                     'mask_rgb_color':mask_rgb_color
274 |                 })
275 | 
276 |             # Compose foregrounds and background
277 |             composite, mask = self._compose_images(foregrounds, background_path)
278 | 
279 |             # Create the file name (used for both composite and mask)
280 |             save_filename = f'{i:0{self.zero_padding}}' # e.g. 00000023.jpg
281 | 
282 |             # Save composite image to the images sub-directory
283 |             composite_filename = f'{save_filename}{self.output_type}' # e.g. 00000023.jpg
284 |             composite_path = self.output_dir / 'images' / composite_filename # e.g. my_output_dir/images/00000023.jpg
285 |             composite = composite.convert('RGB') # remove alpha
286 |             composite.save(composite_path)
287 | 
288 |             # Save the mask image to the masks sub-directory
289 |             mask_filename = f'{save_filename}.png' # masks are always png to avoid lossy compression
290 |             mask_path = self.output_dir / 'masks' / mask_filename # e.g. my_output_dir/masks/00000023.png
291 |             mask.save(mask_path)
292 | 
293 |             color_categories = dict()
294 |             for fg in foregrounds:
295 |                 # Add category and color info
296 |                 mju.add_category(fg['category'], fg['super_category'])
297 |                 color_categories[str(fg['mask_rgb_color'])] = \
298 |                     {
299 |                         'category':fg['category'],
300 |                         'super_category':fg['super_category']
301 |                     }
302 |             
303 |             # Add the mask to MaskJsonUtils
304 |             mju.add_mask(
305 |                 composite_path.relative_to(self.output_dir).as_posix(),
306 |                 mask_path.relative_to(self.output_dir).as_posix(),
307 |                 color_categories
308 |             )
309 | 
310 |         #Write masks to json
311 |         mju.write_masks_to_json()
312 | 
313 |     def _compose_images(self, foregrounds, background_path):
314 |         # Composes a foreground image and a background image and creates a segmentation mask
315 |         # using the specified color. Validation should already be done by now.
316 |         # Args:
317 |         #     foregrounds: a list of dicts with format:
318 |         #       [{
319 |         #           'super_category':super_category,
320 |         #           'category':category,
321 |         #           'foreground_path':foreground_path,
322 |         #           'mask_rgb_color':mask_rgb_color
323 |         #       },...]
324 |         #     background_path: the path to a valid background image
325 |         # Returns:
326 |         #     composite: the composed image
327 |         #     mask: the mask image
328 | 
329 |         # Open background and convert to RGBA
330 |         background = Image.open(background_path)
331 |         background = background.convert('RGBA')
332 | 
333 |         # Crop background to desired size (self.width x self.height), randomly positioned
334 |         bg_width, bg_height = background.size
335 |         max_crop_x_pos = bg_width - self.width
336 |         max_crop_y_pos = bg_height - self.height
337 |         assert max_crop_x_pos >= 0, f'desired width, {self.width}, is greater than background width, {bg_width}, for {str(background_path)}'
338 |         assert max_crop_y_pos >= 0, f'desired height, {self.height}, is greater than background height, {bg_height}, for {str(background_path)}'
339 |         crop_x_pos = random.randint(0, max_crop_x_pos)
340 |         crop_y_pos = random.randint(0, max_crop_y_pos)
341 |         composite = background.crop((crop_x_pos, crop_y_pos, crop_x_pos + self.width, crop_y_pos + self.height))
342 |         composite_mask = Image.new('RGB', composite.size, 0)
343 | 
344 |         for fg in foregrounds:
345 |             fg_path = fg['foreground_path']
346 | 
347 |             # Perform transformations
348 |             fg_image = self._transform_foreground(fg, fg_path)
349 | 
350 |             # Choose a random x,y position for the foreground
351 |             max_x_position = composite.size[0] - fg_image.size[0]
352 |             max_y_position = composite.size[1] - fg_image.size[1]
353 |             assert max_x_position >= 0 and max_y_position >= 0, \
354 |             f'foreground {fg_path} is too big ({fg_image.size[0]}x{fg_image.size[1]}) for the requested output size ({self.width}x{self.height}), check your input parameters'
355 |             paste_position = (random.randint(0, max_x_position), random.randint(0, max_y_position))
356 | 
357 |             # Create a new foreground image as large as the composite and paste it on top
358 |             new_fg_image = Image.new('RGBA', composite.size, color = (0, 0, 0, 0))
359 |             new_fg_image.paste(fg_image, paste_position)
360 | 
361 |             # Extract the alpha channel from the foreground and paste it into a new image the size of the composite
362 |             alpha_mask = fg_image.getchannel(3)
363 |             new_alpha_mask = Image.new('L', composite.size, color = 0)
364 |             new_alpha_mask.paste(alpha_mask, paste_position)
365 |             composite = Image.composite(new_fg_image, composite, new_alpha_mask)
366 | 
367 |             # Grab the alpha pixels above a specified threshold
368 |             alpha_threshold = 200
369 |             mask_arr = np.array(np.greater(np.array(new_alpha_mask), alpha_threshold), dtype=np.uint8)
370 |             uint8_mask = np.uint8(mask_arr) # This is composed of 1s and 0s
371 | 
372 |             # Multiply the mask value (1 or 0) by the color in each RGB channel and combine to get the mask
373 |             mask_rgb_color = fg['mask_rgb_color']
374 |             red_channel = uint8_mask * mask_rgb_color[0]
375 |             green_channel = uint8_mask * mask_rgb_color[1]
376 |             blue_channel = uint8_mask * mask_rgb_color[2]
377 |             rgb_mask_arr = np.dstack((red_channel, green_channel, blue_channel))
378 |             isolated_mask = Image.fromarray(rgb_mask_arr, 'RGB')
379 |             isolated_alpha = Image.fromarray(uint8_mask * 255, 'L')
380 | 
381 |             composite_mask = Image.composite(isolated_mask, composite_mask, isolated_alpha)
382 | 
383 |         return composite, composite_mask
384 | 
385 |     def _transform_foreground(self, fg, fg_path):
386 |         # Open foreground and get the alpha channel
387 |         fg_image = Image.open(fg_path)
388 |         fg_alpha = np.array(fg_image.getchannel(3))
389 |         assert np.any(fg_alpha == 0), f'foreground needs to have some transparency: {str(fg_path)}'
390 | 
391 |         # ** Apply Transformations **
392 |         # Rotate the foreground
393 |         angle_degrees = random.randint(0, 359)
394 |         fg_image = fg_image.rotate(angle_degrees, resample=Image.BICUBIC, expand=True)
395 | 
396 |         # Scale the foreground
397 |         scale = random.random() * .5 + .5 # Pick something between .5 and 1
398 |         new_size = (int(fg_image.size[0] * scale), int(fg_image.size[1] * scale))
399 |         fg_image = fg_image.resize(new_size, resample=Image.BICUBIC)
400 | 
401 |         # Adjust foreground brightness
402 |         brightness_factor = random.random() * .4 + .7 # Pick something between .7 and 1.1
403 |         enhancer = ImageEnhance.Brightness(fg_image)
404 |         fg_image = enhancer.enhance(brightness_factor)
405 | 
406 |         # Add any other transformations here...
407 | 
408 |         return fg_image
409 | 
410 |     def _create_info(self):
411 |         # A convenience wizard for automatically creating dataset info
412 |         # The user can always modify the resulting .json manually if needed
413 | 
414 |         if self.silent:
415 |             # No user wizard in silent mode
416 |             return
417 | 
418 |         should_continue = input('Would you like to create dataset info json? (y/n) ').lower()
419 |         if should_continue != 'y' and should_continue != 'yes':
420 |             print('No problem. You can always create the json manually.')
421 |             quit()
422 | 
423 |         print('Note: you can always modify the json manually if you need to update this.')
424 |         info = dict()
425 |         info['description'] = input('Description: ')
426 |         info['url'] = input('URL: ')
427 |         info['version'] = input('Version: ')
428 |         info['contributor'] = input('Contributor: ')
429 |         now = datetime.now()
430 |         info['year'] = now.year
431 |         info['date_created'] = f'{now.month:0{2}}/{now.day:0{2}}/{now.year}'
432 | 
433 |         image_license = dict()
434 |         image_license['id'] = 0
435 | 
436 |         should_add_license = input('Add an image license? (y/n) ').lower()
437 |         if should_add_license != 'y' and should_add_license != 'yes':
438 |             image_license['url'] = ''
439 |             image_license['name'] = 'None'
440 |         else:
441 |             image_license['name'] = input('License name: ')
442 |             image_license['url'] = input('License URL: ')
443 | 
444 |         dataset_info = dict()
445 |         dataset_info['info'] = info
446 |         dataset_info['license'] = image_license
447 | 
448 |         # Write the JSON output file
449 |         output_file_path = Path(self.output_dir) / 'dataset_info.json'
450 |         with open(output_file_path, 'w+') as json_file:
451 |             json_file.write(json.dumps(dataset_info))
452 | 
453 |         print('Successfully created {output_file_path}')
454 | 
455 | 
456 |     # Start here
457 |     def main(self, args):
458 |         self._validate_and_process_args(args)
459 |         self._generate_images()
460 |         self._create_info()
461 |         print('Image composition completed.')
462 | 
463 | if __name__ == "__main__":
464 |     import argparse
465 | 
466 |     parser = argparse.ArgumentParser(description="Image Composition")
467 |     parser.add_argument("--input_dir", type=str, dest="input_dir", required=True, help="The input directory. \
468 |                         This contains a 'backgrounds' directory of pngs or jpgs, and a 'foregrounds' directory which \
469 |                         contains supercategory directories (e.g. 'animal', 'vehicle'), each of which contain category \
470 |                         directories (e.g. 'horse', 'bear'). Each category directory contains png images of that item on a \
471 |                         transparent background (e.g. a grizzly bear on a transparent background).")
472 |     parser.add_argument("--output_dir", type=str, dest="output_dir", required=True, help="The directory where images, masks, \
473 |                         and json files will be placed")
474 |     parser.add_argument("--count", type=int, dest="count", required=True, help="number of composed images to create")
475 |     parser.add_argument("--width", type=int, dest="width", required=True, help="output image pixel width")
476 |     parser.add_argument("--height", type=int, dest="height", required=True, help="output image pixel height")
477 |     parser.add_argument("--output_type", type=str, dest="output_type", help="png or jpg (default)")
478 |     parser.add_argument("--silent", action='store_true', help="silent mode; doesn't prompt the user for input, \
479 |                         automatically overwrites files")
480 | 
481 |     args = parser.parse_args()
482 | 
483 |     image_comp = ImageComposition()
484 |     image_comp.main(args)


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | jupyter
2 | notebook
3 | Pillow
4 | numpy
5 | scikit-image
6 | scipy
7 | tqdm
8 | Shapely


--------------------------------------------------------------------------------