├── .github
│   ├── FUNDING.yml
│   ├── ISSUE_TEMPLATE
│   │   └── bug_report.md
│   └── workflows
│       └── stale.yml
├── .gitignore
├── MaskRCNN Microcontroller Detection.ipynb
├── MaskRCNN Microcontroller Segmentation.ipynb
├── MaskRCNN Using pretrained model.ipynb
├── README.md
├── doc
│   ├── detection_example.png
│   ├── microcontroller_detection.png
│   ├── microcontroller_segmentation.png
│   └── visualize_masks.PNG
└── video_detection.py

--------------------------------------------------------------------------------
/.github/FUNDING.yml:
--------------------------------------------------------------------------------
# These are supported funding model platforms

github: TannerGilbert
patreon: gilberttanner
open_collective: # Replace with a single Open Collective username
ko_fi: # Replace with a single Ko-fi username
tidelift: # Replace with a single Tidelift platform-name/package-name e.g., npm/babel
community_bridge: # Replace with a single Community Bridge project-name e.g., cloud-foundry
liberapay: # Replace with a single Liberapay username
issuehunt: # Replace with a single IssueHunt username
otechie: # Replace with a single Otechie username
custom: # Replace with up to 4 custom sponsorship URLs e.g., ['link1', 'link2']

--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE/bug_report.md:
--------------------------------------------------------------------------------
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''

---

**Describe the bug**
A clear and concise description of what the bug is.

**To Reproduce**
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error

**Expected behavior**
A clear and concise description of what you expected to happen.

**Screenshots**
If applicable, add screenshots to help explain your problem.

**Desktop (please complete the following information):**
 - OS: [e.g. iOS]
 - Browser [e.g. chrome, safari]
 - Version [e.g. 22]

**Smartphone (please complete the following information):**
 - Device: [e.g. iPhone6]
 - OS: [e.g. iOS8.1]
 - Browser [e.g. stock browser, safari]
 - Version [e.g. 22]

**Additional context**
Add any other context about the problem here.

--------------------------------------------------------------------------------
/.github/workflows/stale.yml:
--------------------------------------------------------------------------------
name: Mark stale issues and pull requests

on:
  schedule:
  - cron: "0 0 * * *"

jobs:
  stale:

    runs-on: ubuntu-latest

    steps:
    - uses: actions/stale@v1
      with:
        repo-token: ${{ secrets.GITHUB_TOKEN }}
        stale-pr-message: 'Stale pull request message'
        stale-issue-label: 'no-issue-activity'
        stale-pr-label: 'no-pr-activity'
        stale-issue-message: 'This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days'
        days-before-stale: 30
        days-before-close: 5

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TannerGilbert/MaskRCNN-Object-Detection-and-Segmentation/7e24fa66f5a7ab8861448d0c582e8a10645bdbc9/.gitignore
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# MaskRCNN Object Detection and Segmentation
![MaskRCNN Inference example](doc/detection_example.png)

This repository shows you how to do object detection and instance segmentation with [MaskRCNN in Keras](https://github.com/matterport/Mask_RCNN).

## Installation

1. Clone the repository
```git clone https://github.com/matterport/Mask_RCNN```

2. Install dependencies
```bash
cd Mask_RCNN
pip3 install -r requirements.txt
```

3. Run setup
```bash
python3 setup.py install
```

4. (Optional) To train or test on MS COCO, install pycocotools from one of the repos below. They are forks of the original pycocotools with fixes for Python3 and Windows (the official repo doesn't seem to be active anymore).
    * Linux: https://github.com/waleedka/coco
    * Windows: https://github.com/philferriere/cocoapi. You must have the Visual C++ 2015 build tools on your path (see the repo for additional details)

## Running a pre-trained model

To use a pre-trained model for inference, we need to download the weights, create an inference config, and create a MaskRCNN object.
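In outline, that looks like the following — a minimal sketch modeled on this repository's own `video_detection.py`, where `ROOT_DIR` and the image path are placeholders you'd adapt:

```python
import os
import sys
import skimage.io

# Path to the cloned Mask_RCNN repository (placeholder)
ROOT_DIR = os.path.abspath("Mask_RCNN")
sys.path.append(ROOT_DIR)
sys.path.append(os.path.join(ROOT_DIR, "samples/coco/"))

from mrcnn import utils
import mrcnn.model as modellib
import coco

# Batch size = GPU_COUNT * IMAGES_PER_GPU, so this runs one image at a time
class InferenceConfig(coco.CocoConfig):
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

# Download the COCO-trained weights on first use
COCO_MODEL_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")
if not os.path.exists(COCO_MODEL_PATH):
    utils.download_trained_weights(COCO_MODEL_PATH)

model = modellib.MaskRCNN(mode="inference", model_dir=os.path.join(ROOT_DIR, "logs"),
                          config=InferenceConfig())
model.load_weights(COCO_MODEL_PATH, by_name=True)

image = skimage.io.imread("your_image.jpg")  # any RGB image
r = model.detect([image], verbose=1)[0]
# r is a dict with 'rois' (boxes), 'masks', 'class_ids', and 'scores'
```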
For a complete example of how to run a pre-trained model on an image, take a look at ["MaskRCNN Using pretrained model.ipynb"](MaskRCNN%20Using%20pretrained%20model.ipynb).

![](doc/detection_example.png)

I also created [a Python script](video_detection.py) that allows you to run MaskRCNN models on videos or a webcam stream.
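For example (the file names here are just placeholders):

```bash
# Run on the default webcam stream; press q to quit
python3 video_detection.py

# Run on a video file and save the annotated output
python3 video_detection.py -v input.mp4 -sp output.avi
```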
[![pedestrian detection](https://img.youtube.com/vi/g1-TRixHhls/0.jpg)](https://youtu.be/g1-TRixHhls)

## Training a custom object detection model

MaskRCNN also allows you to train your own custom object detection and instance segmentation models. To train a model, you'll need to create a class that loads your data, as well as a training config that defines properties for training. You can find the complete code inside the [MaskRCNN Microcontroller Detection.ipynb](MaskRCNN%20Microcontroller%20Detection.ipynb) file.

### Creating the dataloader class

As an example I'll use my [Microcontroller Detection dataset](https://www.kaggle.com/tannergi/microcontroller-detection), which was labeled with [labelImg](https://github.com/tzutalin/labelImg).

The annotation files are in Pascal VOC format, so every annotation file looks as follows:
```xml
<annotation>
	<folder>object_detection</folder>
	<filename>IMG_20181228_101826.jpg</filename>
	<path>object_detection/IMG_20181228_101826.jpg</path>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>800</width>
		<height>600</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<name>Arduino_Nano</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>317</xmin>
			<ymin>265</ymin>
			<xmax>556</xmax>
			<ymax>342</ymax>
		</bndbox>
	</object>
</annotation>
```

The dataloader class has three methods we need to implement:
* load_dataset()
* load_mask()
* image_reference()

```python
class MicrocontrollerDataset(utils.Dataset):
    def load_dataset(self, dataset_dir):
        pass

    def load_mask(self, image_id):
        pass

    def image_reference(self, image_id):
        pass
```

The ```load_dataset``` method defines all the classes and registers all the images using the ```add_image``` method. The ```load_mask``` method loads the masks for a given image, and the ```image_reference``` method returns the path to an image given its id.

For the Microcontroller dataset the dataloader class looks as follows:

```python
import os
import xml.etree.ElementTree as ET
import numpy as np
from mrcnn import utils

class MicrocontrollerDataset(utils.Dataset):
    def load_dataset(self, dataset_dir):
        self.add_class('dataset', 1, 'Raspberry_Pi_3')
        self.add_class('dataset', 2, 'Arduino_Nano')
        self.add_class('dataset', 3, 'ESP8266')
        self.add_class('dataset', 4, 'Heltec_ESP32_Lora')

        # find all images
        for i, filename in enumerate(os.listdir(dataset_dir)):
            if '.jpg' in filename:
                self.add_image('dataset',
                               image_id=i,
                               path=os.path.join(dataset_dir, filename),
                               annotation=os.path.join(dataset_dir, filename.replace('.jpg', '.xml')))

    # extract bounding boxes from an annotation file
    def extract_boxes(self, filename):
        # load and parse the file
        tree = ET.parse(filename)
        # get the root of the document
        root = tree.getroot()
        # extract each bounding box
        boxes = []
        classes = []
        for member in root.findall('object'):
            # bndbox is the 5th child of each object element: xmin, ymin, xmax, ymax
            xmin = int(member[4][0].text)
            ymin = int(member[4][1].text)
            xmax = int(member[4][2].text)
            ymax = int(member[4][3].text)
            boxes.append([xmin, ymin, xmax, ymax])
            classes.append(self.class_names.index(member[0].text))
        # extract image dimensions
        width = int(root.find('size')[0].text)
        height = int(root.find('size')[1].text)
        return boxes, classes, width, height

    # load the masks for an image
    def load_mask(self, image_id):
        # get details of image
        info = self.image_info[image_id]
        # path to the annotation file
        path = info['annotation']
        # parse the XML annotation
        boxes, classes, w, h = self.extract_boxes(path)
        # create one array for all masks, each on a different channel
        masks = np.zeros([h, w, len(boxes)], dtype='uint8')
        # create masks (rectangular, filled in from the bounding boxes)
        for i in range(len(boxes)):
            box = boxes[i]
            row_s, row_e = box[1], box[3]
            col_s, col_e = box[0], box[2]
            masks[row_s:row_e, col_s:col_e, i] = 1
        return masks, np.asarray(classes, dtype='int32')

    def image_reference(self, image_id):
        info = self.image_info[image_id]
        return info['path']
```

Now that we have the dataloader class, we can load both the training and test sets and visualize a few random images and their masks.

```python
from mrcnn import visualize

# Create training and validation set
# train set
dataset_train = MicrocontrollerDataset()
dataset_train.load_dataset('Microcontroller Detection/train')
dataset_train.prepare()
print('Train: %d' % len(dataset_train.image_ids))

# test/val set
dataset_val = MicrocontrollerDataset()
dataset_val.load_dataset('Microcontroller Detection/test')
dataset_val.prepare()
print('Test: %d' % len(dataset_val.image_ids))

# Load and display random samples
image_ids = np.random.choice(dataset_train.image_ids, 4)
for image_id in image_ids:
    image = dataset_train.load_image(image_id)
    mask, class_ids = dataset_train.load_mask(image_id)
    visualize.display_top_masks(image, mask, class_ids, dataset_train.class_names)
```

![](doc/visualize_masks.PNG)

### Creating the config object

MaskRCNN has a Config class that defines properties for both training and prediction, including the number of classes, the GPU count, and the learning rate.

You can take a look at the default config using the following code:
```python
from mrcnn.config import Config
config = Config()
config.display()
```

```
Configurations:
BACKBONE                       resnet101
BACKBONE_STRIDES               [4, 8, 16, 32, 64]
BATCH_SIZE                     2
BBOX_STD_DEV                   [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE         None
DETECTION_MAX_INSTANCES        100
DETECTION_MIN_CONFIDENCE       0.7
DETECTION_NMS_THRESHOLD        0.3
FPN_CLASSIF_FC_LAYERS_SIZE     1024
GPU_COUNT                      1
GRADIENT_CLIP_NORM             5.0
IMAGES_PER_GPU                 2
IMAGE_CHANNEL_COUNT            3
IMAGE_MAX_DIM                  1024
IMAGE_META_SIZE                13
IMAGE_MIN_DIM                  800
IMAGE_MIN_SCALE                0
IMAGE_RESIZE_MODE              square
IMAGE_SHAPE                    [1024 1024    3]
LEARNING_MOMENTUM              0.9
LEARNING_RATE                  0.001
LOSS_WEIGHTS                   {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE                 14
MASK_SHAPE                     [28, 28]
MAX_GT_INSTANCES               100
MEAN_PIXEL                     [123.7 116.8 103.9]
MINI_MASK_SHAPE                (56, 56)
NAME                           None
NUM_CLASSES                    1
POOL_SIZE                      7
POST_NMS_ROIS_INFERENCE        1000
POST_NMS_ROIS_TRAINING         2000
PRE_NMS_LIMIT                  6000
ROI_POSITIVE_RATIO             0.33
RPN_ANCHOR_RATIOS              [0.5, 1, 2]
RPN_ANCHOR_SCALES              (32, 64, 128, 256, 512)
RPN_ANCHOR_STRIDE              1
RPN_BBOX_STD_DEV               [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD              0.7
RPN_TRAIN_ANCHORS_PER_IMAGE    256
STEPS_PER_EPOCH                1000
TOP_DOWN_PYRAMID_SIZE          256
TRAIN_BN                       False
TRAIN_ROIS_PER_IMAGE           200
USE_MINI_MASK                  True
USE_RPN_ROIS                   True
VALIDATION_STEPS               50
WEIGHT_DECAY                   0.0001
```

For training, we need to change at least two properties: NAME and NUM_CLASSES.

```python
class MicrocontrollerConfig(Config):
    # Give the configuration a recognizable name
    NAME = "microcontroller_detection"

    NUM_CLASSES = 1 + 4  # background + 4 microcontroller classes

    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

config = MicrocontrollerConfig()
config.display()
```
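For a dataset this small you may also want to shorten the epochs — the defaults (STEPS_PER_EPOCH of 1000, VALIDATION_STEPS of 50) assume a much larger dataset. The values below are illustrative suggestions, not taken from the original notebook:

```python
class MicrocontrollerConfig(Config):
    NAME = "microcontroller_detection"
    NUM_CLASSES = 1 + 4

    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

    # Illustrative overrides for a small dataset; tune to taste
    STEPS_PER_EPOCH = 100
    VALIDATION_STEPS = 10
```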
### Creating and training the model

Now that we have both the config and dataset classes, we can create and train a model using the following code:

```python
# Create model in training mode
model = modellib.MaskRCNN(mode="training", config=config,
                          model_dir=MODEL_DIR)

# Which weights to start with?
init_with = "coco"  # imagenet, coco, or last

if init_with == "imagenet":
    model.load_weights(model.get_imagenet_weights(), by_name=True)
elif init_with == "coco":
    # Load weights trained on MS COCO, but skip layers that
    # are different due to the different number of classes
    # See README for instructions to download the COCO weights
    model.load_weights(COCO_MODEL_PATH, by_name=True,
                       exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                                "mrcnn_bbox", "mrcnn_mask"])
elif init_with == "last":
    # Load the last model you trained and continue training
    model.load_weights(model.find_last(), by_name=True)

# Train the head branches
# Passing layers="heads" freezes all layers except the head
# layers. You can also pass a regular expression to select
# which layers to train by name pattern.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=5,
            layers='heads')

# Fine tune all layers
# Passing layers="all" trains all layers. You can also
# pass a regular expression to select which layers to
# train by name pattern.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE / 10,
            epochs=10,
            layers="all")
```
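Once training has finished, you can create a second model in inference mode, point it at the last checkpoint, and test it on a validation image. A minimal sketch (not part of the original notebook; `find_last()` returns the most recent checkpoint in `MODEL_DIR`):

```python
# Create the model in inference mode and load the trained weights
model = modellib.MaskRCNN(mode="inference", config=config, model_dir=MODEL_DIR)
model.load_weights(model.find_last(), by_name=True)

# Run detection on a random validation image and display the result
image_id = np.random.choice(dataset_val.image_ids)
image = dataset_val.load_image(image_id)
r = model.detect([image], verbose=0)[0]
visualize.display_instances(image, r['rois'], r['masks'], r['class_ids'],
                            dataset_val.class_names, r['scores'])
```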
![](doc/microcontroller_detection.png)

## Training a custom instance segmentation model

For instance segmentation I'll make use of [my Microcontroller instance segmentation dataset](https://github.com/TannerGilbert/Detectron2-Train-a-Instance-Segmentation-Model/blob/master/microcontroller_segmentation_data.zip), which I labeled with [labelme](https://github.com/wkentaro/labelme).

Instead of XML files we now have JSON files with the following format:
```json
{
  "version": "4.2.9",
  "flags": {},
  "shapes": [
    {
      "label": "Arduino_Nano",
      "points": [
        [
          318.9368770764119,
          307.30897009966776
        ],
        [
          328.4053156146179,
          307.1428571428571
        ],
        [
          323.75415282392026,
          293.0232558139535
        ],
        [
          530.2839116719243,
          269.4006309148265
        ],
        [
          549.2146596858638,
          315.9685863874345
        ],
        [
          339.79057591623035,
          341.0994764397906
        ],
        [
          336.1256544502618,
          327.7486910994764
        ],
        [
          326.1780104712042,
          328.5340314136125
        ]
      ],
      "group_id": null,
      "shape_type": "polygon",
      "flags": {}
    }
  ],
  "imagePath": "IMG_20181228_101826.jpg",
  "imageData": "...",
  "imageHeight": 600,
  "imageWidth": 800
}
```

To load the polygon annotations, we replace ```extract_boxes``` with an ```extract_masks``` method and adapt ```load_mask``` accordingly:

```python
import json
import cv2

class MicrocontrollerDataset(utils.Dataset):
    def load_dataset(self, dataset_dir):
        self.add_class('dataset', 1, 'Raspberry_Pi_3')
        self.add_class('dataset', 2, 'Arduino_Nano')
        self.add_class('dataset', 3, 'ESP8266')
        self.add_class('dataset', 4, 'Heltec_ESP32_Lora')

        # find all images
        for i, filename in enumerate(os.listdir(dataset_dir)):
            if '.jpg' in filename:
                self.add_image('dataset',
                               image_id=i,
                               path=os.path.join(dataset_dir, filename),
                               annotation=os.path.join(dataset_dir, filename.replace('.jpg', '.json')))

    def extract_masks(self, filename):
        with open(filename) as f:
            img_anns = json.load(f)

        # use the image size stored in the annotation file
        height, width = img_anns['imageHeight'], img_anns['imageWidth']
        masks = np.zeros([height, width, len(img_anns['shapes'])], dtype='uint8')
        classes = []
        for i, anno in enumerate(img_anns['shapes']):
            # draw each labeled polygon as a filled binary mask
            mask = np.zeros([height, width], dtype=np.uint8)
            cv2.fillPoly(mask, np.array([anno['points']], dtype=np.int32), 1)
            masks[:, :, i] = mask
            classes.append(self.class_names.index(anno['label']))
        return masks, classes

    # load the masks for an image
    def load_mask(self, image_id):
        # get details of image
        info = self.image_info[image_id]
        # path to the annotation file
        path = info['annotation']
        # load the JSON annotation
        masks, classes = self.extract_masks(path)
        return masks, np.asarray(classes, dtype='int32')

    def image_reference(self, image_id):
        info = self.image_info[image_id]
        return info['path']
```

Everything else stays the same as in the object detection example. You can find the full code in the [MaskRCNN Microcontroller Segmentation notebook](MaskRCNN%20Microcontroller%20Segmentation.ipynb).

![](doc/microcontroller_segmentation.png)

--------------------------------------------------------------------------------
/doc/detection_example.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TannerGilbert/MaskRCNN-Object-Detection-and-Segmentation/7e24fa66f5a7ab8861448d0c582e8a10645bdbc9/doc/detection_example.png

--------------------------------------------------------------------------------
/doc/microcontroller_detection.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TannerGilbert/MaskRCNN-Object-Detection-and-Segmentation/7e24fa66f5a7ab8861448d0c582e8a10645bdbc9/doc/microcontroller_detection.png

--------------------------------------------------------------------------------
/doc/microcontroller_segmentation.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TannerGilbert/MaskRCNN-Object-Detection-and-Segmentation/7e24fa66f5a7ab8861448d0c582e8a10645bdbc9/doc/microcontroller_segmentation.png

--------------------------------------------------------------------------------
/doc/visualize_masks.PNG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TannerGilbert/MaskRCNN-Object-Detection-and-Segmentation/7e24fa66f5a7ab8861448d0c582e8a10645bdbc9/doc/visualize_masks.PNG

--------------------------------------------------------------------------------
/video_detection.py:
--------------------------------------------------------------------------------
import os
import sys
import numpy as np
np.random.seed(0)  # deterministic instance colors
import matplotlib.pyplot as plt
import cv2
import argparse

# Root directory of the project
ROOT_DIR = os.path.abspath("../")

# Import Mask RCNN
sys.path.append(ROOT_DIR)  # To find local version of the library
from mrcnn.config import Config
from mrcnn import utils
import mrcnn.model as modellib
from mrcnn import visualize
from mrcnn.model import log

sys.path.append(os.path.join(ROOT_DIR, "samples/coco/"))
import coco


def apply_mask(image, mask, color, alpha=0.5):
    """Apply the given mask to the image."""
    for n, c in enumerate(color):
        image[:, :, n] = np.where(
            mask == 1,
            image[:, :, n] * (1 - alpha) + alpha * c,
            image[:, :, n]
        )
    return image


# based on https://github.com/matterport/Mask_RCNN/blob/master/mrcnn/visualize.py
# and https://github.com/markjay4k/Mask-RCNN-series/blob/887404d990695a7bf7f180e3ffaee939fbd9a1cf/visualize_cv.py
def display_instances(image, boxes, masks, class_ids, class_names, scores=None):
    assert boxes.shape[0] == masks.shape[-1] == class_ids.shape[0]

    N = boxes.shape[0]

    # one random color per detected instance
    colors = [tuple(255 * np.random.rand(3)) for _ in range(N)]

    for i, c in enumerate(colors):
        # skip instances with an empty bounding box
        if not np.any(boxes[i]):
            continue

        y1, x1, y2, x2 = boxes[i]
        label = class_names[class_ids[i]]
        score = scores[i] if scores is not None else None
        caption = "{} {:.3f}".format(label, score) if score else label

        # Mask
        mask = masks[:, :, i]
        image = apply_mask(image, mask, c)
        image = cv2.rectangle(image, (x1, y1), (x2, y2), c, 2)
        image = cv2.putText(image, caption, (x1, y1), cv2.FONT_HERSHEY_COMPLEX, 0.7, c, 2)
    return image


class InferenceConfig(coco.CocoConfig):
    # Set batch size to 1 since we'll be running inference on
    # one image at a time. Batch size = GPU_COUNT * IMAGES_PER_GPU
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='MaskRCNN Video Object Detection/Instance Segmentation')
    parser.add_argument('-v', '--video_path', type=str, default='', help='Path to video. If None camera will be used')
    parser.add_argument('-sp', '--save_path', type=str, default='', help='Path to save the output. If None output won\'t be saved')
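    # --show uses store_false: the preview window is shown by default, and passing -s hides it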
    parser.add_argument('-s', '--show', default=True, action="store_false", help='Pass to hide the output window')
    args = parser.parse_args()

    # Directory to save logs and trained model
    MODEL_DIR = os.path.join(ROOT_DIR, "logs")

    # Local path to trained weights file
    COCO_MODEL_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")
    # Download COCO trained weights from Releases if needed
    if not os.path.exists(COCO_MODEL_PATH):
        utils.download_trained_weights(COCO_MODEL_PATH)

    class_names = ['BG', 'person', 'bicycle', 'car', 'motorcycle', 'airplane',
                   'bus', 'train', 'truck', 'boat', 'traffic light',
                   'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird',
                   'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear',
                   'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie',
                   'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball',
                   'kite', 'baseball bat', 'baseball glove', 'skateboard',
                   'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup',
                   'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
                   'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza',
                   'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed',
                   'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote',
                   'keyboard', 'cell phone', 'microwave', 'oven', 'toaster',
                   'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors',
                   'teddy bear', 'hair drier', 'toothbrush']

    config = InferenceConfig()

    # Create model object in inference mode.
    model = modellib.MaskRCNN(mode="inference", model_dir=MODEL_DIR, config=config)

    # Load weights trained on MS-COCO
    model.load_weights(COCO_MODEL_PATH, by_name=True)

    if args.video_path != '':
        cap = cv2.VideoCapture(args.video_path)
    else:
        cap = cv2.VideoCapture(0)

    if args.save_path:
        width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
        height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
        fps = cap.get(cv2.CAP_PROP_FPS)
        out = cv2.VideoWriter(args.save_path, cv2.VideoWriter_fourcc('M','J','P','G'), fps, (width, height))

    while cap.isOpened():
        ret, image = cap.read()
        # stop when the stream ends or a frame can't be read
        if not ret:
            break
        results = model.detect([image], verbose=1)
        r = results[0]
        image = display_instances(image, r['rois'], r['masks'], r['class_ids'], class_names, r['scores'])
        if args.show:
            cv2.imshow('MaskRCNN Object Detection/Instance Segmentation', image)
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
        if args.save_path:
            out.write(image)
    cap.release()
    if args.save_path:
        out.release()
    cv2.destroyAllWindows()
--------------------------------------------------------------------------------