├── .gitignore ├── README.md ├── img ├── 15efe45820_D95DF0B1F4INSPIRE-label.png ├── example.jpg ├── out.gif └── plas.png ├── libs ├── __init__.py ├── config.py ├── datasets.py ├── datasets_fastai.py ├── datasets_keras.py ├── images2chips.py ├── inference.py ├── inference_keras.py ├── models_keras.py ├── scoring.py ├── training.py ├── training_keras.py ├── util.py └── util_keras.py ├── main_fastai.py ├── main_keras.py └── requirements.txt /.gitignore: -------------------------------------------------------------------------------- 1 | dataset-sample 2 | dataset-medium 3 | dataset-large 4 | dataset-whole 5 | __pycache__ 6 | wandb 7 | *.tar.gz 8 | models 9 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Segmentation Dataset 2 | === 3 | 4 | This repository contains a description of the DroneDeploy Segmentation Dataset and how to use it. It also contains example code to get a working segmentation model up and running quickly using a small sample dataset. See below for details of the full dataset and suggested directions for improvement. 5 | 6 | ![Example](https://github.com/dronedeploy/dd-ml-segmentation-benchmark/raw/master/img/example.jpg) 7 | 8 | ### Quickstart 9 | 10 | Follow these steps to train a model and run inference end-to-end: 11 | 12 | ``` 13 | git clone https://github.com/dronedeploy/dd-ml-segmentation-benchmark.git 14 | cd dd-ml-segmentation-benchmark 15 | pip3 install -r requirements.txt 16 | 17 | # optional: log in to W&B to track your experiments 18 | wandb login 19 | 20 | # train a Keras model 21 | python3 main_keras.py 22 | 23 | # train a Fastai model 24 | python3 main_fastai.py 25 | ``` 26 | 27 | This will download the sample dataset and begin training a model. You can monitor training performance on [Weights & Biases](https://www.wandb.com/). Once training is complete, inference will be performed on all test scenes and a number of prediction images with names like `123123_ABCABC-prediction.png` will be created in the `wandb` directory. After the images are created, they will be scored and the scores stored in the `predictions` directory. Here's what a prediction looks like - not bad for 50 lines of code, but there is a lot of room for improvement: 28 | 29 | ![Example](https://github.com/dronedeploy/dd-ml-segmentation-benchmark/raw/master/img/out.gif) 30 | 31 | ### Dataset Details 32 | 33 | The dataset comprises a number of aerial scenes captured from drones. Each scene has a ground resolution of 10 cm per pixel. For each scene there is a corresponding "image", "elevation" and "label". These are located in the `images`, `elevation` and `labels` directories. 34 | 35 | The images are RGB TIFFs, the elevations are single-channel floating-point TIFFs (where each pixel value represents elevation in meters), and finally the labels are PNGs with 7 colors representing the 7 classes (documented below). 36 | 37 | In addition, please see `index.csv` - inside the downloaded dataset folder - for a description of the quality of each labelled image and the distribution of the labels. 38 | 39 | To use a dataset for training, it must first be converted to chips (see `images2chips.py`). This will create two directories, `image-chips` and `label-chips`, which will contain a number of `300x300` (by default) RGB images. The `label-chips` are also RGB but hold very low pixel intensities (the class indices `0`-`5`, since chips containing IGNORE pixels are dropped), so they appear black at first glance. You can use the `color2class` and `category2mask` functions to switch between the two label representations.
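For example, converting a category image (class indices) back to a color mask is a small lookup. Below is a minimal sketch assuming the BGR mappings in `libs/config.py`; the repository's own `category2mask` in `libs/inference.py` does the same thing with `np.where`:

```
import numpy as np
from libs.config import LABELMAP

def category2mask(img):
    # paint each class index (0 = IGNORE .. 6 = CAR) with its BGR color
    mask = np.zeros(img.shape[:2] + (3,), dtype='uint8')
    for category, color in LABELMAP.items():
        mask[img == category] = color
    return mask
```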
40 | 41 | Here is an example of one of the labelled scenes: 42 | 43 | ![Example](https://github.com/dronedeploy/dd-ml-segmentation-benchmark/raw/master/img/15efe45820_D95DF0B1F4INSPIRE-label.png) 44 | 45 | Each color represents a different class. 46 | 47 | Color (Blue, Green, Red) to Class Name: 48 | --- 49 | ``` 50 | (075, 025, 230) : BUILDING 51 | (180, 030, 145) : CLUTTER 52 | (075, 180, 060) : VEGETATION 53 | (048, 130, 245) : WATER 54 | (255, 255, 255) : GROUND 55 | (200, 130, 000) : CAR 56 | (255, 000, 255) : IGNORE 57 | ``` 58 | 59 | - IGNORE - These magenta pixels mask areas of missing labels or image boundaries. They can be ignored. 60 | 61 | ### Possible Improvements 62 | ---- 63 | The sample implementation is very basic and there is immediate opportunity to experiment with: 64 | - Data augmentation (`datasets_keras.py`, `datasets_fastai.py`) 65 | - Hyperparameters (`training_keras.py`, `training.py`) 66 | - Post-processing (`inference_keras.py`, `inference.py`) 67 | - Chip size (`images2chips.py`) 68 | - Model architecture (`models_keras.py`, `training.py`) 69 | - Elevation tiles, which are not currently used at all (`images2chips.py`) 70 | -------------------------------------------------------------------------------- /img/15efe45820_D95DF0B1F4INSPIRE-label.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dronedeploy/dd-ml-segmentation-benchmark/8e1290df658562039a244791c41c388c8c8248da/img/15efe45820_D95DF0B1F4INSPIRE-label.png -------------------------------------------------------------------------------- /img/example.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dronedeploy/dd-ml-segmentation-benchmark/8e1290df658562039a244791c41c388c8c8248da/img/example.jpg -------------------------------------------------------------------------------- /img/out.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dronedeploy/dd-ml-segmentation-benchmark/8e1290df658562039a244791c41c388c8c8248da/img/out.gif -------------------------------------------------------------------------------- /img/plas.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dronedeploy/dd-ml-segmentation-benchmark/8e1290df658562039a244791c41c388c8c8248da/img/plas.png -------------------------------------------------------------------------------- /libs/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dronedeploy/dd-ml-segmentation-benchmark/8e1290df658562039a244791c41c388c8c8248da/libs/__init__.py -------------------------------------------------------------------------------- /libs/config.py: -------------------------------------------------------------------------------- 1 | LABELS = ['BUILDING', 'CLUTTER', 'VEGETATION', 'WATER', 'GROUND', 'CAR'] 2 | 3 | # Class to color (BGR) 4 | LABELMAP = { 5 | 0 : (255, 0, 255), 6 | 1 : (75, 25, 230), 7 | 2 :
(180, 30, 145), 8 | 3 : (75, 180, 60), 9 | 4 : (48, 130, 245), 10 | 5 : (255, 255, 255), 11 | 6 : (200, 130, 0), 12 | } 13 | 14 | # Color (BGR) to class 15 | INV_LABELMAP = { 16 | (255, 0, 255) : 0, 17 | (75, 25, 230) : 1, 18 | (180, 30, 145) : 2, 19 | (75, 180, 60) : 3, 20 | (48, 130, 245) : 4, 21 | (255, 255, 255) : 5, 22 | (200, 130, 0) : 6, 23 | } 24 | 25 | LABELMAP_RGB = { k: (v[2], v[1], v[0]) for k, v in LABELMAP.items() } 26 | 27 | INV_LABELMAP_RGB = { v: k for k, v in LABELMAP_RGB.items() } 28 | 29 | train_ids = [ 30 | "1d4fbe33f3_F1BE1D4184INSPIRE", 31 | "1df70e7340_4413A67E91INSPIRE", 32 | "274518390f_AFAC6311B8OPENPIPELINE", 33 | "32760710b0_EF73EE9CCDOPENPIPELINE", 34 | "7008b80b00_FF24A4975DINSPIRE", 35 | "e2e401ba8b_CFF58D01D0OPENPIPELINE", 36 | "c644f91210_27E21B7F30OPENPIPELINE", 37 | "edc59d4824_FE5B96942BOPENPIPELINE", 38 | "b705d0cc9c_E5F5E0E316OPENPIPELINE", 39 | "ade6e4b261_147755FEAAOPENPIPELINE", 40 | "3bdbe137a1_E1B9B139DEOPENPIPELINE", 41 | "84830cff24_FE5B96942BOPENPIPELINE", 42 | "564d5fd4ea_F7D81C1243OPENPIPELINE", 43 | "a1af86939f_F1BE1D4184OPENPIPELINE", 44 | "571ed24019_7EF127EDCFOPENPIPELINE", 45 | "a0cee5daca_9ABAFDAA93OPENPIPELINE", 46 | "520947aa07_8FCB044F58OPENPIPELINE", 47 | "2ef883f08d_F317F9C1DFOPENPIPELINE", 48 | "f971256246_MIKEINSPIRE", 49 | "2ef3a4994a_0CCD105428INSPIRE", 50 | "56e9e81013_C988C95F03INSPIRE", 51 | "888432f840_80E7FD39EBINSPIRE", 52 | "63430fa268_B4DE0FB544INSPIRE", 53 | "130a76ebe1_68B40B480AOPENPIPELINE", 54 | "d02ce7cb10_6DC1FE1DDCOPENPIPELINE", 55 | "91ad290806_3CB2E8FC73INSPIRE", 56 | "11cdce7802_B6A62F8BE0INSPIRE", 57 | "6500c05298_B00063DE8EOPENPIPELINE", 58 | "803cd2c508_C988C95F03INSPIRE", 59 | "d5107a09cf_6ABE00F5A1INSPIRE", 60 | "3502e187b2_23071E4605OPENPIPELINE", 61 | "3452561694_E44D97430AOPENPIPELINE", 62 | "f9f43e5144_1DB9E6F68BINSPIRE", 63 | "2c36a93b10_793BC93268OPENPIPELINE", 64 | "53471726bc_B69D2F059FOPENPIPELINE", 65 | "afb793674b_4B44AF2928OPENPIPELINE", 66 | "807c0c243b_EA5BB57953OPENPIPELINE", 67 | "385393ca4b_E21EAB978AOPENPIPELINE", 68 | "6664b45691_D1F6B2028BOPENPIPELINE", 69 | "236da542ee_597D7FF2F9OPENPIPELINE", 70 | "7197260eb8_9549AC1A09INSPIRE", 71 | "7ed68b136e_C966B12B4EOPENPIPELINE", 72 | "1553627230_APIGENERATED", 73 | "ebffe540d0_7BA042D858OPENPIPELINE", 74 | "a4580732ce_2F98B8FC82INSPIRE", 75 | "c167ca6cb2_3CB2E8FC73INSPIRE", 76 | "e848b35eff_5EAE4DDF80INSPIRE", 77 | "d9161f7e18_C05BA1BC72OPENPIPELINE", 78 | "d45d74e584_2E8C142043OPENPIPELINE", 79 | "34fbf7c2bd_E8AD935CEDINSPIRE", 80 | "15efe45820_D95DF0B1F4INSPIRE", 81 | "2552eb56dd_2AABB46C86OPENPIPELINE", 82 | "628be3d244_A8CB55BF1FINSPIRE", 83 | "1553642501_APIGENERATED", 84 | "364d26fd40_9549AC1A09OPENPIPELINE", 85 | "1553541487_APIGENERATED", 86 | "ab4a9b813f_B75AF9044COPENPIPELINE", 87 | "748d0acb6d_18BE858545OPENPIPELINE", 88 | "686c48a300_9C340F2D92OPENPIPELINE", 89 | "5a5e4e491b_D7A795B2DEOPENPIPELINE", 90 | "a8789b3c97_1381767170OPENPIPELINE", 91 | "84410645db_8D20F02042OPENPIPELINE", 92 | "7c53bbf0da_EAFCA9B26AOPENPIPELINE", 93 | "9e7f0310a0_24E090DDB9INSPIRE", 94 | "b970fca868_883F63EBCCOPENPIPELINE", 95 | "d8786926c5_A6879692DAOPENPIPELINE", 96 | "d0dc53f9c7_9C194DD066INSPIRE", 97 | "2a8617d7d4_9464BAFE8AOPENPIPELINE", 98 | "f0747ed88d_E74C0DD8FDOPENPIPELINE", 99 | "c6d131e346_536DE05ED2OPENPIPELINE", 100 | "b61673f780_4413A67E91INSPIRE", 101 | "277f16713e_5E3246E306OPENPIPELINE", 102 | "7c59b1a217_B4DE0FB544INSPIRE", 103 | "74a2f5aaa9_B943F74EC9OPENPIPELINE", 104 | "1666d0369f_48FE7F729BOPENPIPELINE", 105 | 
"7c719dfcc0_310490364FINSPIRE", 106 | "f56b6b2232_2A62B67B52OPENPIPELINE", 107 | "c37dbfae2f_84B52814D2OPENPIPELINE", 108 | "c6890f580c_AFAC6311B8OPENPIPELINE", 109 | "399c1c010d_7A89E00BBDOPENPIPELINE", 110 | "87eeb1b9cc_B943F74EC9OPENPIPELINE", 111 | "7a14002b7b_B6E1859E4FINSPIRE", 112 | "5fa39d6378_DB9FF730D9OPENPIPELINE", 113 | "f4dd768188_NOLANOPENPIPELINE", 114 | "b771104de5_7E02A41EBEOPENPIPELINE", 115 | "981755057f_3BFBF39957OPENPIPELINE", 116 | "6958d7a8d5_9C194DD066OPENPIPELINE", 117 | "83d72f744d_48BF12F23COPENPIPELINE", 118 | "664e38b92b_6C7C9BE1D3INSPIRE", 119 | "1553541585_APIGENERATED", 120 | "c8eb574986_CC5FAE4CF9INSPIRE", 121 | "1d056881e8_29FEA32BC7INSPIRE", 122 | "fc5837dcf8_7CD52BE09EINSPIRE", 123 | "3a2200b6c0_2F98B8FC82OPENPIPELINE", 124 | ] 125 | 126 | val_ids = [ 127 | "ec09336a6f_06BA0AF311OPENPIPELINE", 128 | "679850f980_27920CBE78OPENPIPELINE", 129 | "c8a7031e5f_32156F5DC2INSPIRE", 130 | "6b82bcd67b_2EBB40A325OPENPIPELINE", 131 | "cc4b443c7d_A9CBEF2C97INSPIRE", 132 | "12c3372a95_7EF127EDCFINSPIRE", 133 | "941cb687d3_48FE7F729BINSPIRE", 134 | "42ab9f9e27_3CB2E8FC73INSPIRE", 135 | "264c36d368_C988C95F03INSPIRE", 136 | "954a8c814c_267994885AINSPIRE", 137 | "ea607f191d_582C2A2F47OPENPIPELINE", 138 | "600023a2df_F4A3C2E777INSPIRE", 139 | "57426ebe1e_84B52814D2OPENPIPELINE", 140 | "cd5a0d3ce4_2F98B8FC82INSPIRE", 141 | "3731e901b0_9464BAFE8AOPENPIPELINE", 142 | "f0c32df5a8_0406E6C238OPENPIPELINE", 143 | "1476907971_CHADGRISMOPENPIPELINE", 144 | "97c4dd388d_4C51642B86OPENPIPELINE", 145 | "f78c4e5748_3572E1D9BBOPENPIPELINE", 146 | "a11d963a7d_EF73EE9CCDOPENPIPELINE", 147 | "aef48b9aca_0226FDD487OPENPIPELINE", 148 | "9170479165_625EDFBAB6OPENPIPELINE", 149 | "3bb457cde8_D336A13367INSPIRE", 150 | "a1199a489f_6ABE00F5A1OPENPIPELINE", 151 | "137f4dfb89_C966B12B4EOPENPIPELINE", 152 | "551063e3c5_8FCB044F58INSPIRE", 153 | "37cf2e5706_74D898C7C3OPENPIPELINE", 154 | "74d7796531_EB81FE6E2BOPENPIPELINE", 155 | "46b27f92c2_06BA0AF311OPENPIPELINE", 156 | "32052d9b97_9ABAFDAA93OPENPIPELINE", 157 | ] 158 | 159 | test_ids = [ 160 | "12fa5e614f_53197F206FOPENPIPELINE", 161 | "feb7a50f10_JAREDINSPIRE", 162 | "c2e8370ca3_3340CAC7AEOPENPIPELINE", 163 | "55ca10d9f1_E8C8441957INSPIRE", 164 | "5ab849ec40_2F98B8FC82INSPIRE", 165 | "9254c82db0_9C194DD066OPENPIPELINE", 166 | "168ac179d9_31328BCCC4OPENPIPELINE", 167 | "6f93b9026b_F1BFB8B17DOPENPIPELINE", 168 | "8b0ac1fc28_6688905E16OPENPIPELINE", 169 | "1553539551_APIGENERATED", 170 | "7310356a1b_7EAE3AC26AOPENPIPELINE", 171 | "632de91030_9ABAFDAA93OPENPIPELINE", 172 | "2f7aabb6e5_0C2B5F6CABOPENPIPELINE", 173 | "18072ccb69_B2AE5C54EBOPENPIPELINE", 174 | "8710b98ea0_06E6522D6DINSPIRE", 175 | "fb74c54103_6ABE00F5A1INSPIRE", 176 | "25f1c24f30_EB81FE6E2BOPENPIPELINE", 177 | "39e77bedd0_729FB913CDOPENPIPELINE", 178 | "e87da4ebdb_29FEA32BC7INSPIRE", 179 | "546f85625a_39E021DC32INSPIRE", 180 | "e1d3e6f6ba_B4DE0FB544INSPIRE", 181 | "eee7d707d4_6DC1FE1DDCOPENPIPELINE", 182 | "3ff76e84d5_0DD77DFCD7OPENPIPELINE", 183 | "a0a6f46099_F93BAE5403OPENPIPELINE", 184 | "420d6b69b8_84B52814D2OPENPIPELINE", 185 | "d06b2c67d2_2A62B67B52OPENPIPELINE", 186 | "107f24d6e9_F1BE1D4184INSPIRE", 187 | "36d5956a21_8F4CE60B77OPENPIPELINE", 188 | "1726eb08ef_60693DB04DINSPIRE", 189 | "dabec5e872_E8AD935CEDINSPIRE", 190 | ] 191 | -------------------------------------------------------------------------------- /libs/datasets.py: -------------------------------------------------------------------------------- 1 | import libs.images2chips 2 | import sys 3 | import os 4 | 5 | URLS = { 
6 | 'dataset-sample' : 'https://dl.dropboxusercontent.com/s/h8a8kev0rktf4kq/dataset-sample.tar.gz?dl=0', 7 | 'dataset-medium' : 'https://dl.dropboxusercontent.com/s/r0dj9mhyv4bgbme/dataset-medium.tar.gz?dl=0', 8 | } 9 | 10 | def download_dataset(dataset): 11 | """ Download a dataset, extract it and create the tiles """ 12 | 13 | if dataset not in URLS: 14 | print(f"unknown dataset {dataset}") 15 | sys.exit(1) # non-zero exit status so callers can detect the failure 16 | 17 | filename = f'{dataset}.tar.gz' 18 | url = URLS[dataset] 19 | 20 | if not os.path.exists(filename): 21 | print(f'downloading dataset "{dataset}"') 22 | os.system(f'curl "{url}" -o {filename}') 23 | else: 24 | print(f'zipfile "{filename}" already exists, remove it if you want to re-download.') 25 | 26 | if not os.path.exists(dataset): 27 | print(f'extracting "{filename}"') 28 | os.system(f'tar -xvf {filename}') 29 | else: 30 | print(f'folder "{dataset}" already exists, remove it if you want to re-create.') 31 | 32 | image_chips = f'{dataset}/image-chips' 33 | label_chips = f'{dataset}/label-chips' 34 | if not os.path.exists(image_chips) and not os.path.exists(label_chips): 35 | print("creating chips") 36 | libs.images2chips.run(dataset) 37 | else: 38 | print(f'chip folders "{image_chips}" and "{label_chips}" already exist, remove them to recreate chips.') 39 | -------------------------------------------------------------------------------- /libs/datasets_fastai.py: -------------------------------------------------------------------------------- 1 | from fastai.vision import * 2 | from fastai.callbacks.hooks import * 3 | from pathlib import PosixPath 4 | 5 | import numpy as np 6 | from libs.config import LABELS 7 | 8 | def load_dataset(dataset, training_chip_size, bs): 9 | """ Load a dataset, create batches and augmentation """ 10 | 11 | path = PosixPath(dataset) 12 | label_path = path/'label-chips' 13 | image_path = path/'image-chips' 14 | image_files = get_image_files(image_path) 15 | label_files = get_image_files(label_path) 16 | get_y_fn = lambda x: label_path/f'{x.stem}{x.suffix}' 17 | codes = np.array(LABELS) 18 | src = SegmentationItemList.from_folder(image_path).split_by_fname_file('../valid.txt').label_from_func(get_y_fn, classes=codes) 19 | # some data augmentation here 20 | data = src.transform(get_transforms(flip_vert=True, max_warp=0., max_zoom=0., max_rotate=180.), size=training_chip_size, tfm_y=True).databunch(bs=bs) 21 | return data 22 | -------------------------------------------------------------------------------- /libs/datasets_keras.py: -------------------------------------------------------------------------------- 1 | from keras.preprocessing.image import ImageDataGenerator 2 | from keras.utils import Sequence, to_categorical 3 | from PIL import Image 4 | 5 | import numpy as np 6 | import random 7 | 8 | def load_dataset(dataset, bs, aug={'horizontal_flip': True, 'vertical_flip': True, 'rotation_range': 180}): 9 | train_files = [f'{dataset}/image-chips/{fname}' for fname in load_lines(f'{dataset}/train.txt')] 10 | valid_files = [f'{dataset}/image-chips/{fname}' for fname in load_lines(f'{dataset}/valid.txt')] 11 | 12 | train_seq = SegmentationSequence( 13 | dataset, 14 | train_files, 15 | ImageDataGenerator(**aug), 16 | bs 17 | ) 18 | 19 | valid_seq = SegmentationSequence( 20 | dataset, 21 | valid_files, 22 | ImageDataGenerator(), # don't augment validation set 23 | bs 24 | ) 25 | 26 | return train_seq, valid_seq 27 | 28 | def load_lines(fname): 29 | with open(fname, 'r') as f: 30 | return [l.strip() for l in f.readlines()] 31 | 32 | def load_img(fname): 33 | return np.array(Image.open(fname)) 34 |
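# Label chips store the class index (0-5) in every channel; `mask_to_classes` below
# one-hot encodes the first channel into the 6-class representation that
# `categorical_crossentropy` expects.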
35 | def mask_to_classes(mask): 36 | return to_categorical(mask[:,:,0], 6) 37 | 38 | class SegmentationSequence(Sequence): 39 | def __init__(self, dataset, image_files, datagen, bs): 40 | self.label_path = f'{dataset}/label-chips' 41 | self.image_path = f'{dataset}/image-chips' 42 | self.image_files = image_files 43 | random.shuffle(self.image_files) 44 | 45 | self.datagen = datagen 46 | self.bs = bs 47 | 48 | def __len__(self): 49 | return int(np.ceil(len(self.image_files) / float(self.bs))) 50 | 51 | def __getitem__(self, idx): 52 | image_files = self.image_files[idx*self.bs:(idx+1)*self.bs] 53 | label_files = [fname.replace(self.image_path, self.label_path) for fname in image_files] 54 | 55 | images = [load_img(fname) for fname in image_files] 56 | labels = [mask_to_classes(load_img(fname)) for fname in label_files] 57 | 58 | ts = [self.datagen.get_random_transform(im.shape) for im in images] # one random transform per chip, shared by image and mask so they stay aligned 59 | images = [self.datagen.apply_transform(im, t) for im, t in zip(images, ts)] 60 | labels = [self.datagen.apply_transform(lb, t) for lb, t in zip(labels, ts)] 61 | 62 | return np.array(images), np.array(labels) 63 | 64 | def on_epoch_end(self): 65 | random.shuffle(self.image_files) 66 | -------------------------------------------------------------------------------- /libs/images2chips.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import os 3 | import numpy as np 4 | 5 | from libs.config import train_ids, val_ids, test_ids, LABELMAP, INV_LABELMAP 6 | 7 | size = 300 8 | stride = 300 9 | 10 | def color2class(orthochip, img): 11 | ret = np.zeros((img.shape[0], img.shape[1]), dtype='uint8') 12 | ret = np.dstack([ret, ret, ret]) 13 | colors = np.unique(img.reshape(-1, img.shape[2]), axis=0) 14 | 15 | # Skip any chips that would contain magenta (IGNORE) pixels 16 | seen_colors = set( [tuple(color) for color in colors] ) 17 | IGNORE_COLOR = LABELMAP[0] 18 | if IGNORE_COLOR in seen_colors: 19 | return None, None 20 | 21 | for color in colors: 22 | locs = np.where( (img[:, :, 0] == color[0]) & (img[:, :, 1] == color[1]) & (img[:, :, 2] == color[2]) ) 23 | ret[ locs[0], locs[1], : ] = INV_LABELMAP[ tuple(color) ] - 1 # shift down one: with IGNORE dropped, classes become 0-5 24 | 25 | return orthochip, ret 26 | 27 | def image2tile(prefix, scene, dataset, orthofile, elevafile, labelfile, windowx=size, windowy=size, stridex=stride, stridey=stride): 28 | 29 | ortho = cv2.imread(orthofile) 30 | label = cv2.imread(labelfile) 31 | 32 | # Not using elevation in the sample - but useful to incorporate it ;) 33 | eleva = cv2.imread(elevafile, -1) 34 | 35 | assert(ortho.shape[0] == label.shape[0]) 36 | assert(ortho.shape[1] == label.shape[1]) 37 | 38 | shape = ortho.shape 39 | 40 | xsize = shape[1] 41 | ysize = shape[0] 42 | print(f"converting {dataset} image {orthofile} {xsize}x{ysize} to chips ...") 43 | 44 | counter = 0 45 | 46 | for xi in range(0, shape[1] - windowx, stridex): # note: trailing edge pixels that don't fill a whole window are skipped 47 | for yi in range(0, shape[0] - windowy, stridey): 48 | 49 | orthochip = ortho[yi:yi+windowy, xi:xi+windowx, :] 50 | labelchip = label[yi:yi+windowy, xi:xi+windowx, :] 51 | 52 | orthochip, classchip = color2class(orthochip, labelchip) 53 | 54 | if classchip is None: 55 | continue 56 | 57 | orthochip_filename = os.path.join(prefix, 'image-chips', scene + '-' + str(counter).zfill(6) + '.png') 58 | labelchip_filename = os.path.join(prefix, 'label-chips', scene + '-' + str(counter).zfill(6) + '.png') 59 | 60 | with open(f"{prefix}/{dataset}", mode='a') as fd: 61 | fd.write(scene + '-' + str(counter).zfill(6) + '.png\n')
62 | 63 | cv2.imwrite(orthochip_filename, orthochip) 64 | cv2.imwrite(labelchip_filename, classchip) 65 | counter += 1 66 | 67 | 68 | def get_split(scene): 69 | if scene in train_ids: 70 | return "train.txt" 71 | if scene in val_ids: 72 | return 'valid.txt' 73 | if scene in test_ids: 74 | return 'test.txt' 75 | 76 | def run(prefix): 77 | 78 | open(prefix + '/train.txt', mode='w').close() 79 | open(prefix + '/valid.txt', mode='w').close() 80 | open(prefix + '/test.txt', mode='w').close() 81 | 82 | if not os.path.exists( os.path.join(prefix, 'image-chips') ): 83 | os.mkdir(os.path.join(prefix, 'image-chips')) 84 | 85 | if not os.path.exists( os.path.join(prefix, 'label-chips') ): 86 | os.mkdir(os.path.join(prefix, 'label-chips')) 87 | 88 | 89 | lines = [ line for line in open(f'{prefix}/index.csv') ] 90 | num_images = len(lines) - 1 91 | print(f"converting {num_images} images to chips - this may take a few minutes but only needs to be done once.") 92 | 93 | for lineno, line in enumerate(lines): 94 | 95 | line = line.strip().split(' ') 96 | scene = line[1] 97 | dataset = get_split(scene) 98 | 99 | if dataset == 'test.txt': 100 | print(f"not converting test image {scene} to chips, it will be used for inference.") 101 | continue 102 | 103 | orthofile = os.path.join(prefix, 'images', scene + '-ortho.tif') 104 | elevafile = os.path.join(prefix, 'elevations', scene + '-elev.tif') 105 | labelfile = os.path.join(prefix, 'labels', scene + '-label.png') 106 | 107 | if os.path.exists(orthofile) and os.path.exists(labelfile): 108 | image2tile(prefix, scene, dataset, orthofile, elevafile, labelfile) 109 | -------------------------------------------------------------------------------- /libs/inference.py: -------------------------------------------------------------------------------- 1 | """ 2 | inference.py - Sample implementation of inference with a Dynamic Unet using FastAI 3 | 2019 - Nicholas Pilkington, DroneDeploy 4 | """ 5 | 6 | import os 7 | import cv2 8 | import sys 9 | import torch 10 | from libs.scoring import score_masks 11 | from fastai.vision import * 12 | from fastai.callbacks.hooks import * 13 | from fastai.utils import * 14 | from libs.config import train_ids, test_ids, val_ids, LABELMAP 15 | 16 | def category2mask(img): 17 | """ Convert a category image to color mask """ 18 | if len(img.shape) == 3: # accept both (H, W) and (H, W, C) inputs 19 | if img.shape[2] == 3: 20 | img = img[:, :, 0] 21 | 22 | mask = np.zeros(img.shape[:2] + (3, ), dtype='uint8') 23 | 24 | for category, mask_color in LABELMAP.items(): 25 | locs = np.where(img == category) 26 | mask[locs] = mask_color 27 | 28 | return mask 29 | 30 | 31 | def chip_iterator(image, size=256): 32 | """ Generator that yields chips of size `size`x`size` from `image` """ 33 | 34 | img = cv2.imread(image) 35 | shape = img.shape 36 | 37 | chip_count = math.ceil(shape[1] / size) * math.ceil(shape[0] / size) 38 | 39 | for xi, x in enumerate(range(0, shape[1], size)): 40 | for yi, y in enumerate(range(0, shape[0], size)): 41 | chip = img[y:y+size, x:x+size, :] 42 | # Padding right and bottom out to `size` with black pixels 43 | chip = cv2.copyMakeBorder(chip, top=0, bottom=size - chip.shape[0], left=0, right=size - chip.shape[1], borderType= cv2.BORDER_CONSTANT, value=[0, 0, 0] ) 44 | yield (chip, xi, yi, chip_count) 45 | 46 | def image_size(filename): 47 | img = cv2.imread(filename) 48 | return img.shape 49 |
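# The two helpers below convert between the OpenCV image representation (HWC,
# uint8, values 0-255) and the fastai/pytorch one (CHW, float32, values 0-1).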
50 | def tensor2numpy(tensor): 51 | """ Convert a pytorch tensor image representation to a numpy OpenCV representation """ 52 | 53 | ret = tensor.px.numpy() 54 | ret = ret * 255. 55 | ret = ret.astype('uint8') 56 | ret = np.transpose(ret, (1, 2, 0)) 57 | return ret 58 | 59 | def numpy2tensor(chip): 60 | tensorchip = np.transpose(chip, (2, 0, 1)) 61 | tensorchip = tensorchip.astype('float32') 62 | tensorchip = tensorchip / 255. 63 | tensorchip = torch.from_numpy(tensorchip) 64 | tensorchip = Image(tensorchip) 65 | return tensorchip 66 | 67 | class Inference(object): 68 | 69 | def __init__(self, modelpath, modelfile, size=1200): 70 | print("loading model", modelfile, "...") 71 | self.learn = load_learner(modelpath, modelfile) 72 | self.learn.data.single_ds.tfmargs['size'] = size 73 | self.learn.data.single_ds.tfmargs_y['size'] = size 74 | 75 | def predict(self, imagefile, predsfile, size=1200): 76 | 77 | shape = image_size(imagefile) 78 | print('loading input image', shape) 79 | 80 | assert(shape[2] == 3) 81 | 82 | prediction = np.zeros(shape[:2], dtype='uint8') 83 | 84 | iter = chip_iterator(imagefile, size=size) 85 | 86 | for counter, (imagechip, x, y, total_chips) in enumerate(iter): 87 | 88 | print(f"running inference on chip {counter} of {total_chips}") 89 | 90 | if imagechip.sum() == 0: 91 | continue 92 | 93 | tensorchip = numpy2tensor(imagechip) 94 | preds = self.learn.predict(tensorchip)[2] 95 | # add one because we don't predict the ignore class 96 | category_chip = preds.data.argmax(0).numpy() + 1 97 | section = prediction[y*size:y*size+size, x*size:x*size+size].shape 98 | prediction[y*size:y*size+size, x*size:x*size+size] = category_chip[:section[0], :section[1]] 99 | 100 | mask = category2mask(prediction) 101 | cv2.imwrite(predsfile, mask) 102 | 103 | def run_inference(dataset, model_name='baseline_model', basedir="predictions"): 104 | if not os.path.isdir(basedir): 105 | os.mkdir(basedir) 106 | 107 | size = 1200 108 | modelpath = 'models' 109 | 110 | if not os.path.exists(os.path.join(modelpath, model_name)): 111 | print(f"model {model_name} not found in {modelpath}") 112 | sys.exit(1) 113 | 114 | inf = Inference(modelpath, model_name, size=size) 115 | 116 | for scene in train_ids + val_ids + test_ids: 117 | #for scene in test_ids: 118 | 119 | imagefile = f'{dataset}/images/{scene}-ortho.tif' 120 | labelfile = f'{dataset}/labels/{scene}-label.png' 121 | predsfile = f"{basedir}/{scene}-prediction.png" 122 | 123 | if not os.path.exists(imagefile): 124 | #print(f"image {imagefile} not found, skipping.") 125 | continue 126 | 127 | print(f"running inference on image {imagefile}.") 128 | inf.predict(imagefile, predsfile, size=size) 129 | 130 | 131 | if __name__ == '__main__': 132 | run_inference('dataset-sample') 133 | -------------------------------------------------------------------------------- /libs/inference_keras.py: -------------------------------------------------------------------------------- 1 | from PIL import Image 2 | import numpy as np 3 | import math 4 | from keras import models 5 | import os 6 | 7 | from libs.config import train_ids, test_ids, val_ids, LABELMAP_RGB 8 | 9 | def category2mask(img): 10 | """ Convert a category image to color mask """ 11 | if len(img.shape) == 3: # accept both (H, W) and (H, W, C) inputs 12 | if img.shape[2] == 3: 13 | img = img[:, :, 0] 14 | 15 | mask = np.zeros(img.shape[:2] + (3, ), dtype='uint8') 16 | 17 | for category, mask_color in LABELMAP_RGB.items(): 18 | locs = np.where(img == category) 19 | mask[locs] = mask_color 20 | 21 | return mask 22 |
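# `chips_from_image` below slices an ortho into fixed-size tiles, zero-padding the
# right and bottom edges so every tile has the same shape and the whole list can be
# batched through a single model.predict call.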
23 | def chips_from_image(img, size=300): 24 | shape = img.shape 25 | 26 | chip_count = math.ceil(shape[1] / size) * math.ceil(shape[0] / size) 27 | 28 | chips = [] 29 | for x in range(0, shape[1], size): 30 | for y in range(0, shape[0], size): 31 | chip = img[y:y+size, x:x+size, :] 32 | y_pad = size - chip.shape[0] 33 | x_pad = size - chip.shape[1] 34 | chip = np.pad(chip, [(0, y_pad), (0, x_pad), (0, 0)], mode='constant') 35 | chips.append((chip, x, y)) 36 | return chips 37 | 38 | def run_inference_on_file(imagefile, predsfile, model, size=300): 39 | with Image.open(imagefile).convert('RGB') as img: 40 | nimg = np.array(img) # reuse the already-opened image rather than reading the file twice 41 | shape = nimg.shape 42 | chips = chips_from_image(nimg) 43 | 44 | chips = [(chip, xi, yi) for chip, xi, yi in chips if chip.sum() > 0] 45 | prediction = np.zeros(shape[:2], dtype='uint8') 46 | chip_preds = model.predict(np.array([chip for chip, _, _ in chips]), verbose=True) 47 | 48 | for (chip, x, y), pred in zip(chips, chip_preds): 49 | category_chip = np.argmax(pred, axis=-1) + 1 # add one because the ignore class is not predicted 50 | section = prediction[y:y+size, x:x+size].shape 51 | prediction[y:y+size, x:x+size] = category_chip[:section[0], :section[1]] 52 | 53 | mask = category2mask(prediction) 54 | Image.fromarray(mask).save(predsfile) 55 | 56 | def run_inference(dataset, model=None, model_path=None, basedir='predictions'): 57 | if not os.path.isdir(basedir): 58 | os.mkdir(basedir) 59 | if model is None and model_path is None: 60 | raise Exception("model or model_path required") 61 | 62 | if model is None: 63 | model = models.load_model(model_path) 64 | 65 | for scene in train_ids + val_ids + test_ids: 66 | imagefile = f'{dataset}/images/{scene}-ortho.tif' 67 | predsfile = os.path.join(basedir, f'{scene}-prediction.png') 68 | 69 | if not os.path.exists(imagefile): 70 | continue 71 | 72 | print(f'running inference on image {imagefile}.') 73 | run_inference_on_file(imagefile, predsfile, model) 74 | -------------------------------------------------------------------------------- /libs/models_keras.py: -------------------------------------------------------------------------------- 1 | from keras import layers, models 2 | import numpy as np 3 | import tensorflow as tf 4 | 5 | def build_unet(size=300, basef=64, maxf=512, encoder='resnet50', pretrained=True): 6 | input = layers.Input((size, size, 3)) 7 | 8 | encoder_model = make_encoder(input, name=encoder, pretrained=pretrained) 9 | 10 | crosses = [] 11 | 12 | for layer in encoder_model.layers: # collect the deepest encoder layer at each spatial scale as U-Net skip connections 13 | # don't end on padding layers 14 | if type(layer) == layers.ZeroPadding2D: 15 | continue 16 | idx = get_scale_index(size, layer.output_shape[1]) 17 | if idx is None: 18 | continue 19 | if idx >= len(crosses): 20 | crosses.append(layer) 21 | else: 22 | crosses[idx] = layer 23 | 24 | x = crosses[-1].output 25 | for scale in range(len(crosses)-2, -1, -1): 26 | nf = min(basef * 2**scale, maxf) 27 | x = upscale(x, nf) 28 | x = act(x) 29 | x = layers.Concatenate()([ 30 | pad_to_scale(x, scale, size=size), 31 | pad_to_scale(crosses[scale].output, scale, size=size) 32 | ]) 33 | x = conv(x, nf) 34 | x = act(x) 35 | 36 | x = conv(x, 6) 37 | x = layers.Activation('softmax')(x) 38 | 39 | return models.Model(input, x) 40 | 41 | def make_encoder(input, name='resnet50', pretrained=True): 42 | if name == 'resnet18': 43 | from classification_models.keras import Classifiers 44 | ResNet18, _ = Classifiers.get('resnet18') 45 | model = ResNet18( 46 | weights='imagenet' if pretrained else None, 47 | input_tensor=input, 48 | include_top=False 49 | ) 50 | elif name == 'resnet50': 51 | from keras.applications.resnet import ResNet50 52 | model = ResNet50( 53 | weights='imagenet' if pretrained else None, 54 | input_tensor=input, 55 |
include_top=False 56 | ) 57 | elif name == 'resnet101': 58 | from keras.applications.resnet import ResNet101 59 | model = ResNet101( 60 | weights='imagenet' if pretrained else None, 61 | input_tensor=input, 62 | include_top=False 63 | ) 64 | elif name == 'resnet152': 65 | from keras.applications.resnet import ResNet152 66 | model = ResNet152( 67 | weights='imagenet' if pretrained else None, 68 | input_tensor=input, 69 | include_top=False 70 | ) 71 | elif name == 'vgg16': 72 | from keras.applications.vgg16 import VGG16 73 | model = VGG16( 74 | weights='imagenet' if pretrained else None, 75 | input_tensor=input, 76 | include_top=False 77 | ) 78 | elif name == 'vgg19': 79 | from keras.applications.vgg19 import VGG19 80 | model = VGG19( 81 | weights='imagenet' if pretrained else None, 82 | input_tensor=input, 83 | include_top=False 84 | ) 85 | else: 86 | raise Exception(f'unknown encoder {name}') 87 | 88 | return model 89 | 90 | def get_scale_index(in_size, l_size): 91 | for i in range(8): 92 | s_size = in_size // (2 ** i) 93 | if abs(l_size - s_size) <= 4: 94 | return i 95 | return None 96 | 97 | def pad_to_scale(x, scale, size=300): 98 | expected = int(np.ceil(size / (2. ** scale))) 99 | diff = expected - int(x.shape[1]) 100 | if diff > 0: 101 | left = diff // 2 102 | right = diff - left 103 | x = reflectpad(x, (left, right)) 104 | elif diff < 0: 105 | left = -diff // 2 106 | right = -diff - left 107 | x = layers.Cropping2D(((left, right), (left, right)))(x) 108 | return x 109 | 110 | def reflectpad(x, pad): 111 | return layers.Lambda(lambda x: tf.pad(x, [(0, 0), pad, pad, (0, 0)], 'REFLECT'))(x) 112 | 113 | def upscale(x, nf): 114 | x = layers.UpSampling2D((2, 2))(x) 115 | x = conv(x, nf, kernel_size=(1, 1)) 116 | return x 117 | 118 | def act(x): 119 | x = layers.BatchNormalization()(x) 120 | x = layers.LeakyReLU(0.2)(x) 121 | return x 122 | 123 | def conv(x, nf, kernel_size=(3, 3), **kwargs): 124 | padleft = (kernel_size[0] - 1) // 2 125 | padright = kernel_size[0] - 1 - padleft 126 | if padleft > 0 or padright > 0: 127 | x = reflectpad(x, (padleft, padright)) 128 | return layers.Conv2D(nf, kernel_size=kernel_size, padding='valid', **kwargs)(x) 129 | -------------------------------------------------------------------------------- /libs/scoring.py: -------------------------------------------------------------------------------- 1 | import os 2 | import cv2 3 | 4 | from libs.config import LABELS, INV_LABELMAP, test_ids 5 | 6 | from sklearn.metrics import confusion_matrix 7 | from sklearn.utils.multiclass import unique_labels 8 | from sklearn.metrics import f1_score, precision_score, recall_score 9 | 10 | import matplotlib.pyplot as plt 11 | import numpy as np 12 | 13 | 14 | def wherecolor(img, color, negate = False): 15 | 16 | k1 = (img[:, :, 0] == color[0]) 17 | k2 = (img[:, :, 1] == color[1]) 18 | k3 = (img[:, :, 2] == color[2]) 19 | 20 | if negate: 21 | return np.where( ~(k1 & k2 & k3) ) # elementwise negation; Python's `not` would raise on arrays 22 | else: 23 | return np.where( k1 & k2 & k3 ) 24 | 25 | def plot_confusion_matrix(y_true, y_pred, classes, 26 | normalize=True, 27 | title=None, 28 | cmap=plt.cm.Blues, 29 | savedir="predictions"): 30 | """ 31 | This function prints and plots the confusion matrix. 32 | Normalization can be applied by setting `normalize=True`.
33 | """ 34 | if not title: 35 | if normalize: 36 | title = 'Normalized confusion matrix' 37 | else: 38 | title = 'Confusion matrix, without normalization' 39 | 40 | # Compute confusion matrix 41 | cm = confusion_matrix(y_true, y_pred) 42 | 43 | # Only use the labels that appear in the data 44 | labels_used = unique_labels(y_true, y_pred) 45 | classes = classes[labels_used] 46 | 47 | # Normalization will generate NaN rows where a class has no ground truth labels but does have predictions (x/0) 48 | if normalize: 49 | cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis] 50 | 51 | fig, ax = plt.subplots() 52 | im = ax.imshow(cm, interpolation='nearest', cmap=cmap) 53 | ax.figure.colorbar(im, ax=ax) 54 | 55 | base, fname = os.path.split(title) 56 | ax.set(xticks=np.arange(cm.shape[1]), 57 | yticks=np.arange(cm.shape[0]), 58 | xticklabels=classes, yticklabels=classes, 59 | title=fname, 60 | ylabel='True label', 61 | xlabel='Predicted label') 62 | 63 | # Rotate the tick labels and set their alignment. 64 | plt.setp(ax.get_xticklabels(), rotation=45, ha="right", rotation_mode="anchor") 65 | 66 | # Loop over data dimensions and create text annotations. 67 | fmt = '.2f' if normalize else 'd' 68 | thresh = cm.max() / 2. 69 | for i in range(cm.shape[0]): 70 | for j in range(cm.shape[1]): 71 | ax.text(j, i, format(cm[i, j], fmt), 72 | ha="center", va="center", 73 | color="white" if cm[i, j] > thresh else "black") 74 | 75 | plt.xlim([-0.5, cm.shape[1] - 0.5]) 76 | plt.ylim([-0.5, cm.shape[0] - 0.5]) 77 | 78 | fig.tight_layout() 79 | # save to directory 80 | if not os.path.isdir(savedir): 81 | os.mkdir(savedir) 82 | savefile = title 83 | plt.savefig(savefile) 84 | return savefile, cm 85 | 86 | def score_masks(labelfile, predictionfile): 87 | 88 | label = cv2.imread(labelfile) 89 | prediction = cv2.imread(predictionfile) 90 | 91 | shape = label.shape[:2] 92 | 93 | label_class = np.zeros(shape, dtype='uint8') 94 | pred_class = np.zeros(shape, dtype='uint8') 95 | 96 | for color, category in INV_LABELMAP.items(): 97 | locs = wherecolor(label, color) 98 | label_class[locs] = category 99 | 100 | for color, category in INV_LABELMAP.items(): 101 | locs = wherecolor(prediction, color) 102 | pred_class[locs] = category 103 | 104 | label_class = label_class.reshape((label_class.shape[0] * label_class.shape[1])) 105 | pred_class = pred_class.reshape((pred_class.shape[0] * pred_class.shape[1])) 106 | 107 | # Remove all predictions where there is an IGNORE (magenta) pixel in the ground truth label, then shift labels down 1 index 108 | not_ignore_locs = np.where(label_class != 0) 109 | label_class = label_class[not_ignore_locs] - 1 110 | pred_class = pred_class[not_ignore_locs] - 1 111 | 112 | precision = precision_score(label_class, pred_class, average='weighted') 113 | recall = recall_score(label_class, pred_class, average='weighted') 114 | f1 = f1_score(label_class, pred_class, average='weighted') 115 | print(f'precision={precision} recall={recall} f1={f1}') 116 | 117 | savefile, cm = plot_confusion_matrix(label_class, pred_class, np.array(LABELS), title=predictionfile) 118 | 119 | return precision, recall, f1, savefile 120 | 121 | def score_predictions(dataset, basedir='predictions'): 122 | 123 | scores = [] 124 | 125 | precision = [] 126 | recall = [] 127 | f1 = [] 128 | 129 | predictions = [] 130 | confusions = [] 131 | 132 | #for scene in train_ids + val_ids + test_ids: 133 | for scene in test_ids: 134 | 135 | labelfile = f'{dataset}/labels/{scene}-label.png' 136 | predsfile = os.path.join(basedir,
f"{scene}-prediction.png") 137 | 138 | if not os.path.exists(labelfile): 139 | continue 140 | 141 | if not os.path.exists(predsfile): 142 | continue 143 | 144 | a, b, c, savefile = score_masks(labelfile, predsfile) 145 | 146 | precision.append(a) 147 | recall.append(b) 148 | f1.append(c) 149 | 150 | predictions.append(predsfile) 151 | confusions.append(savefile) 152 | 153 | # Compute test set scores 154 | scores = { 155 | 'f1_mean' : np.mean(f1), 156 | 'f1_std' : np.std(f1), 157 | 'pr_mean' : np.mean(precision), 158 | 'pr_std' : np.std(precision), 159 | 're_mean' : np.mean(recall), 160 | 're_std' : np.std(recall), 161 | } 162 | 163 | return scores, zip(predictions, confusions) 164 | 165 | 166 | if __name__ == '__main__': 167 | score_predictions('dataset-sample') 168 | -------------------------------------------------------------------------------- /libs/training.py: -------------------------------------------------------------------------------- 1 | """ 2 | train.py - Sample implementation of a Dynamic Unet using FastAI 3 | 2019 - Nicholas Pilkington, DroneDeploy 4 | """ 5 | 6 | from fastai.vision import * 7 | from fastai.callbacks.hooks import * 8 | from libs import inference 9 | from libs import scoring 10 | from libs.util import MySaveModelCallback, ExportCallback, MyCSVLogger, Precision, Recall, FBeta 11 | from libs import datasets_fastai 12 | 13 | import wandb 14 | from wandb.fastai import WandbCallback 15 | 16 | 17 | def train_model(dataset): 18 | """ Trains a DynamicUnet on the dataset """ 19 | 20 | epochs = 15 21 | lr = 1e-4 22 | size = 300 23 | wd = 1e-2 24 | bs = 8 # reduce this if you are running out of GPU memory 25 | pretrained = True 26 | 27 | config = { 28 | 'epochs' : epochs, 29 | 'lr' : lr, 30 | 'size' : size, 31 | 'wd' : wd, 32 | 'bs' : bs, 33 | 'pretrained' : pretrained, 34 | } 35 | 36 | wandb.config.update(config) 37 | 38 | metrics = [ 39 | Precision(average='weighted', clas_idx=1), 40 | Recall(average='weighted', clas_idx=1), 41 | FBeta(average='weighted', beta=1, clas_idx=1), 42 | ] 43 | 44 | data = datasets_fastai.load_dataset(dataset, size, bs) 45 | encoder_model = models.resnet18 46 | learn = unet_learner(data, encoder_model, path='models', metrics=metrics, wd=wd, bottle=True, pretrained=pretrained) 47 | 48 | callbacks = [ 49 | WandbCallback(learn, log=None, input_type="images"), 50 | MyCSVLogger(learn, filename='baseline_model'), 51 | ExportCallback(learn, "baseline_model", monitor='f_beta'), 52 | MySaveModelCallback(learn, every='epoch', monitor='f_beta') 53 | ] 54 | 55 | learn.unfreeze() 56 | learn.fit_one_cycle(epochs, lr, callbacks=callbacks) 57 | -------------------------------------------------------------------------------- /libs/training_keras.py: -------------------------------------------------------------------------------- 1 | from keras import optimizers, metrics 2 | from libs import datasets_keras 3 | from libs.config import LABELMAP 4 | from libs.util_keras import FBeta 5 | import numpy as np 6 | 7 | import wandb 8 | from wandb.keras import WandbCallback 9 | 10 | def train_model(dataset, model): 11 | epochs = 15 12 | # epochs = 0 13 | lr = 1e-4 14 | size = 300 15 | wd = 1e-2 16 | bs = 8 # reduce this if you are running out of GPU memory 17 | pretrained = True 18 | 19 | config = { 20 | 'epochs' : epochs, 21 | 'lr' : lr, 22 | 'size' : size, 23 | 'wd' : wd, 24 | 'bs' : bs, 25 | 'pretrained' : pretrained, 26 | } 27 | 28 | wandb.config.update(config) 29 | 30 | model.compile( 31 | optimizer=optimizers.Adam(lr=lr), 32 | loss='categorical_crossentropy', 33 
40 | train_data, valid_data = datasets_keras.load_dataset(dataset, bs) 41 | _, ex_data = datasets_keras.load_dataset(dataset, 10) 42 | model.fit_generator( 43 | train_data, 44 | validation_data=valid_data, 45 | epochs=epochs, 46 | callbacks=[ 47 | WandbCallback( 48 | input_type='image', 49 | output_type='segmentation_mask', 50 | validation_data=ex_data[0] 51 | ) 52 | ] 53 | ) 54 | -------------------------------------------------------------------------------- /libs/util.py: -------------------------------------------------------------------------------- 1 | from typing import Any 2 | from fastai.callbacks import CSVLogger, SaveModelCallback, TrackerCallback 3 | from fastai.callback import Callback 4 | from fastai.metrics import add_metrics 5 | from fastai.torch_core import dataclass, torch, Tensor, Optional, warn 6 | from fastai.basic_train import Learner 7 | 8 | 9 | class ExportCallback(TrackerCallback): 10 | """Exports the model when monitored quantity is best. 11 | 12 | The exported model is the one used for inference. 13 | """ 14 | def __init__(self, learn:Learner, model_path:str, monitor:str='valid_loss', mode:str='auto'): 15 | self.model_path = model_path 16 | super().__init__(learn, monitor=monitor, mode=mode) 17 | 18 | def on_epoch_end(self, epoch:int, **kwargs:Any)->None: 19 | current = self.get_monitor_value() 20 | 21 | if (epoch == 0 or (current is not None and self.operator(current, self.best))): 22 | print(f'Better model found at epoch {epoch} with {self.monitor} value: {current} - exporting {self.model_path}') 23 | self.best = current 24 | self.learn.export(self.model_path) 25 | 26 | # saving under a fixed name means each save overwrites the previous checkpoint 27 | class MySaveModelCallback(SaveModelCallback): 28 | """Saves the model after each epoch to potentially resume training. 29 | 30 | Modified from the fastai version to save under a fixed name, so each save 31 | overwrites the previous checkpoint instead of wasting disk space. 32 | """ 33 | def on_epoch_end(self, epoch:int, **kwargs:Any)->None: 34 | "Compare the value monitored to its best score and maybe save the model." 35 | current = self.get_monitor_value() 36 | if current is not None and self.operator(current, self.best): 37 | self.best = current 38 | self.learn.save(f'{self.name}') 39 | 40 | 41 | class MyCSVLogger(CSVLogger): 42 | """Logs metrics to a CSV file after each epoch. 43 | 44 | Modified from fastai version to: 45 | - flush after each epoch 46 | - append to log if already exists 47 | """ 48 | def __init__(self, learn, filename='history'): 49 | super().__init__(learn, filename) 50 | 51 | def on_train_begin(self, **kwargs): 52 | if self.path.exists(): 53 | # open the existing log in append mode so history from a previous run is kept 54 | self.file = self.path.open('a') 55 | else: 56 | super().on_train_begin(**kwargs) 57 | 58 | def on_epoch_end(self, epoch, smooth_loss, last_metrics, **kwargs): 59 | out = super().on_epoch_end( 60 | epoch, smooth_loss, last_metrics, **kwargs) 61 | self.file.flush() 62 | return out 63 | 64 | # The following are a set of metric callbacks that have been modified from the 65 | # original version in fastai to support semantic segmentation, which doesn't 66 | # have the class dimension in position -1. It also adds an ignore_idx 67 | # which is used to ignore pixels with class equal to ignore_idx.
These 68 | # would be good to contribute back upstream to fastai -- however we should 69 | # wait for their upcoming refactor of the callback architecture. 70 | 71 | @dataclass 72 | class ConfusionMatrix(Callback): 73 | "Computes the confusion matrix." 74 | # The index of the dimension in the output and target arrays which ranges 75 | # over the different classes. This is -1 (the last index) for 76 | # classification, but is 1 for semantic segmentation. 77 | clas_idx:int=-1 78 | 79 | def on_train_begin(self, **kwargs): 80 | self.n_classes = 0 81 | 82 | 83 | def on_epoch_begin(self, **kwargs): 84 | self.cm = None 85 | 86 | def on_batch_end(self, last_output:Tensor, last_target:Tensor, **kwargs): 87 | 88 | 89 | preds = last_output.argmax(self.clas_idx).view(-1).cpu() 90 | targs = last_target.view(-1).cpu() 91 | if self.n_classes == 0: 92 | self.n_classes = last_output.shape[self.clas_idx] 93 | self.x = torch.arange(0, self.n_classes) 94 | cm = ((preds==self.x[:, None]) & (targs==self.x[:, None, None])).sum(dim=2, dtype=torch.float32) # broadcast preds/targs against the class range to accumulate the confusion matrix in one shot 95 | if self.cm is None: self.cm = cm 96 | else: self.cm += cm 97 | 98 | def on_epoch_end(self, **kwargs): 99 | self.metric = self.cm 100 | 101 | @dataclass 102 | class CMScores(ConfusionMatrix): 103 | "Base class for metrics which rely on the calculation of the precision and/or recall score." 104 | average:Optional[str]="binary" # `binary`, `micro`, `macro`, `weighted` or None 105 | pos_label:int=1 # 0 or 1 106 | eps:float=1e-9 107 | # If the ground truth label is equal to ignore_idx, it should be ignored 108 | # for the sake of evaluation. 109 | ignore_idx:int=None 110 | 111 | def _recall(self): 112 | rec = torch.diag(self.cm) / self.cm.sum(dim=1) 113 | rec[rec != rec] = 0 # removing potential "nan"s 114 | if self.average is None: return rec 115 | else: 116 | if self.average == "micro": weights = self._weights(avg="weighted") 117 | else: weights = self._weights(avg=self.average) 118 | return (rec * weights).sum() 119 | 120 | def _precision(self): 121 | prec = torch.diag(self.cm) / self.cm.sum(dim=0) 122 | prec[prec != prec] = 0 # removing potential "nan"s 123 | if self.average is None: return prec 124 | else: 125 | weights = self._weights(avg=self.average) 126 | return (prec * weights).sum() 127 | 128 | def _weights(self, avg:str): 129 | if self.n_classes != 2 and avg == "binary": 130 | avg = self.average = "macro" 131 | warn("average=`binary` was selected for a non binary case. Value for average has now been set to `macro` instead.") 132 | if avg == "binary": 133 | if self.pos_label not in (0, 1): 134 | self.pos_label = 1 135 | warn("Invalid value for pos_label. It has now been set to 1.") 136 | if self.pos_label == 1: return Tensor([0,1]) 137 | else: return Tensor([1,0]) 138 | else: 139 | if avg == "micro": weights = self.cm.sum(dim=0) / self.cm.sum() 140 | if avg == "macro": weights = torch.ones((self.n_classes,)) / self.n_classes 141 | if avg == "weighted": weights = self.cm.sum(dim=1) / self.cm.sum() 142 | if self.ignore_idx is not None and avg in ["macro", "weighted"]: 143 | weights[self.ignore_idx] = 0 144 | weights /= weights.sum() 145 | return weights 146 | 147 | class Recall(CMScores): 148 | "Compute the Recall." 149 | def on_epoch_end(self, last_metrics, **kwargs): 150 | return add_metrics(last_metrics, self._recall()) 151 | 152 | class Precision(CMScores): 153 | "Compute the Precision." 154 | def on_epoch_end(self, last_metrics, **kwargs): 155 | return add_metrics(last_metrics, self._precision()) 156 |
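# For reference: F_beta = (1 + beta^2) * precision * recall / (beta^2 * precision + recall).
# beta > 1 weights recall more heavily; beta = 1 reduces to the familiar F1 score
# (the baseline passes beta=1 from libs/training.py).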
157 | @dataclass 158 | class FBeta(CMScores): 159 | "Compute the F`beta` score." 160 | beta:float=2 161 | 162 | def on_train_begin(self, **kwargs): 163 | self.n_classes = 0 164 | self.beta2 = self.beta ** 2 165 | self.avg = self.average 166 | if self.average != "micro": self.average = None 167 | 168 | def on_epoch_end(self, last_metrics, **kwargs): 169 | prec = self._precision() 170 | rec = self._recall() 171 | metric = (1 + self.beta2) * prec * rec / (prec * self.beta2 + rec + self.eps) 172 | metric[metric != metric] = 0 # removing potential "nan"s 173 | if self.avg: metric = (self._weights(avg=self.avg) * metric).sum() 174 | return add_metrics(last_metrics, metric) 175 | 176 | def on_train_end(self, **kwargs): self.average = self.avg 177 | -------------------------------------------------------------------------------- /libs/util_keras.py: -------------------------------------------------------------------------------- 1 | from keras.metrics import Metric 2 | from keras import backend as K 3 | from keras.utils import metrics_utils 4 | import numpy as np 5 | 6 | # adapted from keras.metrics.Precision 7 | class FBeta(Metric): 8 | def __init__(self, 9 | beta=1, 10 | name=None, 11 | dtype=None): 12 | super(FBeta, self).__init__(name=name, dtype=dtype) 13 | self.beta, self.beta2 = beta, beta * beta # keep beta so get_config() can round-trip the metric 14 | self.true_positives = self.add_weight( 15 | 'true_positives', 16 | shape=(1,), 17 | initializer='zeros') 18 | self.false_positives = self.add_weight( 19 | 'false_positives', 20 | shape=(1,), 21 | initializer='zeros') 22 | self.false_negatives = self.add_weight( 23 | 'false_negatives', 24 | shape=(1,), 25 | initializer='zeros') 26 | 27 | def update_state(self, y_true, y_pred, sample_weight=None): 28 | return metrics_utils.update_confusion_matrix_variables( 29 | { 30 | metrics_utils.ConfusionMatrix.TRUE_POSITIVES: self.true_positives, 31 | metrics_utils.ConfusionMatrix.FALSE_POSITIVES: self.false_positives, 32 | metrics_utils.ConfusionMatrix.FALSE_NEGATIVES: self.false_negatives, 33 | }, 34 | y_true, 35 | y_pred, 36 | thresholds=[metrics_utils.NEG_INF], 37 | top_k=1, 38 | class_id=None, 39 | sample_weight=sample_weight) 40 | 41 | def _precision(self): 42 | denom = (self.true_positives + self.false_positives) 43 | result = K.switch( 44 | K.greater(denom, 0), 45 | self.true_positives / denom, 46 | K.zeros_like(self.true_positives)) 47 | return result[0] 48 | 49 | def _recall(self): 50 | denom = (self.true_positives + self.false_negatives) 51 | result = K.switch( 52 | K.greater(denom, 0), 53 | self.true_positives / denom, 54 | K.zeros_like(self.true_positives)) 55 | return result[0] 56 | 57 | def result(self): 58 | precision, recall = self._precision(), self._recall() 59 | denom = self.beta2 * precision + recall 60 | result = K.switch( 61 | K.greater(denom, 0), 62 | (1 + self.beta2) * precision * recall / denom, 63 | 0.)
64 | return result 65 | 66 | def reset_states(self): 67 | K.batch_set_value( 68 | [(v, np.zeros((1,))) for v in self.weights]) 69 | 70 | def get_config(self): 71 | config = { 72 | 'beta': self.beta # matches the __init__ signature so the metric can be re-created from its config 73 | } 74 | base_config = super(FBeta, self).get_config() 75 | return dict(list(base_config.items()) + list(config.items())) -------------------------------------------------------------------------------- /main_fastai.py: -------------------------------------------------------------------------------- 1 | from libs import inference 2 | from libs import scoring 3 | from libs import training 4 | from libs import datasets 5 | 6 | import wandb 7 | 8 | if __name__ == '__main__': 9 | 10 | dataset = 'dataset-sample' # 0.5 GB download 11 | # dataset = 'dataset-medium' # 9.0 GB download 12 | 13 | config = { 14 | 'name' : 'baseline-fastai', 15 | 'dataset' : dataset, 16 | } 17 | 18 | wandb.init(config=config) 19 | 20 | datasets.download_dataset(dataset) 21 | 22 | # train the baseline model and save it in the models folder 23 | training.train_model(dataset) 24 | 25 | # use the trained model to run inference on all test scenes 26 | inference.run_inference(dataset) 27 | 28 | # score all the test images against the ground truth labels, then 29 | # send the scores (f1, precision, recall) and prediction images to wandb 30 | score, predictions = scoring.score_predictions(dataset) 31 | print(score) 32 | wandb.log(score) 33 | 34 | for f1, f2 in predictions: 35 | wandb.save( f1 ) 36 | wandb.save( f2 ) 37 | -------------------------------------------------------------------------------- /main_keras.py: -------------------------------------------------------------------------------- 1 | from libs import training_keras 2 | from libs import datasets 3 | from libs import models_keras 4 | from libs import inference_keras 5 | from libs import scoring 6 | 7 | import wandb 8 | 9 | if __name__ == '__main__': 10 | dataset = 'dataset-sample' # 0.5 GB download 11 | #dataset = 'dataset-medium' # 9.0 GB download 12 | 13 | config = { 14 | 'name' : 'baseline-keras', 15 | 'dataset' : dataset, 16 | } 17 | 18 | wandb.init(config=config) 19 | 20 | datasets.download_dataset(dataset) 21 | 22 | # train the model 23 | model = models_keras.build_unet(encoder='resnet18') 24 | training_keras.train_model(dataset, model) 25 | 26 | # use the trained model to run inference on all test scenes 27 | inference_keras.run_inference(dataset, model=model, basedir=wandb.run.dir) 28 | 29 | # score all the test images against the ground truth labels, then 30 | # send the scores (f1, precision, recall) and prediction images to wandb 31 | score, _ = scoring.score_predictions(dataset, basedir=wandb.run.dir) 32 | print(score) 33 | wandb.log(score) 34 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | fastai 2 | image-classifiers 3 | keras 4 | numpy==1.22.0 5 | opencv_python==3.4.3.18 6 | sklearn 7 | tensorflow-gpu 8 | torch 9 | typing==3.6.6 10 | wandb 11 | --------------------------------------------------------------------------------