├── README.md
├── data_preprocessing
│   ├── Preprocess_Data.ipynb
│   ├── README.md
│   ├── data
│   │   ├── label_map.pbtxt
│   │   ├── predefined_classes.txt
│   │   ├── test.record
│   │   ├── test_labels.csv
│   │   ├── train.record
│   │   └── train_labels.csv
│   ├── generate_tfrecord.py
│   ├── labelImg.exe
│   ├── transform_resolution.py
│   └── xml_to_csv.py
├── inference_webcam.py
└── object_detection_training.ipynb
/README.md:
--------------------------------------------------------------------------------
 1 | # Tensorflow Custom Object Detection Template
 2 | 
 3 | #### Easy-to-use notebooks for preprocessing data locally and training on the cloud with Colab
 4 | 
 5 | This repository is an easy-to-use template for using the Tensorflow Object Detection API on custom datasets. Preprocessing is done locally with the provided notebook, and training can then be done easily on Colab.
 6 | 
 7 | ## Required Libraries
 8 | 
 9 | - Tensorflow 1.15
10 | - object_detection
11 | - opencv
12 | 
13 | It is best to create a separate Python/conda environment.
14 | 
15 | ## Usage
16 | 
17 | 1. `git clone https://github.com/theneuralbeing/object_detection_template.git`
18 | 2. [Read this](data_preprocessing/README.md) for gathering and annotating data
19 | 3. Run the [Preprocess_Data.ipynb](data_preprocessing/Preprocess_Data.ipynb) notebook on your computer
20 | 4. After your data is ready, you can directly start [training on this colab notebook](https://colab.research.google.com/github/theneuralbeing/object_detection_template/blob/master/object_detection_training.ipynb). The colab notebook contains all the further steps.
21 | 5. After training, the trained inference graph will be downloaded to your computer and you can run [`inference_webcam.py`](inference_webcam.py) (the steps for inference are also mentioned in the colab notebook).
22 | 
23 | ## References
24 | * [Tensorflow Object Detection API Documentation by Lyudmil Vladimirov](https://tensorflow-object-detection-api-tutorial.readthedocs.io/)
25 | * [This Medium Post by Alaa Sinjab](https://towardsdatascience.com/detailed-tutorial-build-your-custom-real-time-object-detector-5ade1017fd2d)
26 | * [Raccoon Detector by datitran](https://github.com/datitran/raccoon_dataset)
27 | --------------------------------------------------------------------------------
/data_preprocessing/Preprocess_Data.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Custom Object Detection\n",
8 | "## Data Preprocessor\n",
9 | "\n",
10 | "This notebook is for preprocessing your data into TFRecord files and creating a label_map\n",
11 | "\n",
12 | "Project structure of the collected data\n",
13 | "```\n",
14 | "data_preprocessing/ \n",
15 | " |\n",
16 | " |---data/\n",
17 | " | |---predefined_classes.txt\n",
18 | " | |___train/\n",
19 | " | | |---all train images files\n",
20 | " | | |---...\n",
21 | " | | |---xml files of respective train images\n",
22 | " | | |---...\n",
23 | " | |\n",
24 | " | |___test/\n",
25 | " | |---all test images files\n",
26 | " | |---...\n",
27 | " | |---xml files of respective test images\n",
28 | " | |---....\n",
29 | " | \n",
30 | " |---labelImg.exe \n",
31 | " |---transform_resolution.py\n",
32 | " |---xml_to_csv.py\n",
33 | " |---generate_tfrecord.py\n",
34 | " |---Preprocess_Data.ipynb\n",
35 | "```"
36 | ]
37 | },
38 | {
39 | "cell_type": "markdown",
40 | "metadata": {},
41 | "source": [
42 | "## Transform Resolution (optional)\n",
43 | "If you manually shot the images using a 
camera or phone, you may want to reduce the resolution"
44 | ]
45 | },
46 | {
47 | "cell_type": "code",
48 | "execution_count": null,
49 | "metadata": {},
50 | "outputs": [],
51 | "source": [
52 | "# resize all images in train and test dir to 800x600\n",
53 | "!python transform_resolution.py -d data/train/ -s 800 600\n",
54 | "\n",
55 | "!python transform_resolution.py -d data/test/ -s 800 600"
56 | ]
57 | },
58 | {
59 | "cell_type": "markdown",
60 | "metadata": {},
61 | "source": [
62 | "## Convert xml to csv files"
63 | ]
64 | },
65 | {
66 | "cell_type": "code",
67 | "execution_count": null,
68 | "metadata": {},
69 | "outputs": [],
70 | "source": [
71 | "!python xml_to_csv.py -i data/train -o data/train_labels.csv\n",
72 | "\n",
73 | "!python xml_to_csv.py -i data/test -o data/test_labels.csv"
74 | ]
75 | },
76 | {
77 | "cell_type": "markdown",
78 | "metadata": {},
79 | "source": [
80 | "## Generate tf.record files from csv\n",
81 | "Edit `generate_tfrecord.py` and update your labels in the `class_text_to_int()` function at line 31. Add if statements as per your labels.\n",
82 | "\n",
83 | "For example, if we have a dog and cat detector, we would modify the function as\n",
84 | "```python\n",
85 | "def class_text_to_int(row_label):\n",
86 | " if row_label == 'cat':\n",
87 | " return 1\n",
88 | " elif row_label == 'dog':\n",
89 | " return 2\n",
90 | " else:\n",
91 | " return None\n",
92 | "```"
93 | ]
94 | },
95 | {
96 | "cell_type": "code",
97 | "execution_count": null,
98 | "metadata": {},
99 | "outputs": [],
100 | "source": [
101 | "%%writefile generate_tfrecord.py\n",
102 | "\"\"\"\n",
103 | "Usage:\n",
104 | " # From tensorflow/models/\n",
105 | " # Create train data:\n",
106 | " python generate_tfrecord.py --csv_input=data/train_labels.csv --output_path=train.record\n",
107 | "\n",
108 | " # Create test data:\n",
109 | " python generate_tfrecord.py --csv_input=data/test_labels.csv --output_path=test.record\n",
110 | "\"\"\"\n",
111 | "from __future__ import division\n",
112 | "from __future__ import print_function\n",
113 | "from __future__ import absolute_import\n",
114 | "\n",
115 | "import os\n",
116 | "import io\n",
117 | "import pandas as pd\n",
118 | "import tensorflow as tf\n",
119 | "\n",
120 | "from PIL import Image\n",
121 | "from object_detection.utils import dataset_util\n",
122 | "from collections import namedtuple, OrderedDict\n",
123 | "\n",
124 | "flags = tf.app.flags\n",
125 | "flags.DEFINE_string('csv_input', '', 'Path to the CSV input')\n",
126 | "flags.DEFINE_string('output_path', '', 'Path to output TFRecord')\n",
127 | "flags.DEFINE_string('image_dir', '', 'Path to images')\n",
128 | "FLAGS = flags.FLAGS\n",
129 | "\n",
130 | "\n",
131 | "# REPLACE/ADD IF STATEMENTS AS PER YOUR LABELS\n",
132 | "def class_text_to_int(row_label):\n",
133 | " if row_label == 'product':\n",
134 | " return 1\n",
135 | " else:\n",
136 | " return None\n",
137 | "\n",
138 | "\n",
139 | "def split(df, group):\n",
140 | " data = namedtuple('data', ['filename', 'object'])\n",
141 | " gb = df.groupby(group)\n",
142 | " return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]\n",
143 | "\n",
144 | "\n",
145 | "def create_tf_example(group, path):\n",
146 | " with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:\n",
147 | " encoded_jpg = fid.read()\n",
148 | " encoded_jpg_io = io.BytesIO(encoded_jpg)\n",
149 | " image = Image.open(encoded_jpg_io)\n",
150 | " width, height = image.size\n",
151 | "\n",
152 | " filename = 
group.filename.encode('utf8')\n", 153 | " image_format = b'jpg'\n", 154 | " xmins = []\n", 155 | " xmaxs = []\n", 156 | " ymins = []\n", 157 | " ymaxs = []\n", 158 | " classes_text = []\n", 159 | " classes = []\n", 160 | "\n", 161 | " for index, row in group.object.iterrows():\n", 162 | " xmins.append(row['xmin'] / width)\n", 163 | " xmaxs.append(row['xmax'] / width)\n", 164 | " ymins.append(row['ymin'] / height)\n", 165 | " ymaxs.append(row['ymax'] / height)\n", 166 | " classes_text.append(row['class'].encode('utf8'))\n", 167 | " classes.append(class_text_to_int(row['class']))\n", 168 | "\n", 169 | " tf_example = tf.train.Example(features=tf.train.Features(feature={\n", 170 | " 'image/height': dataset_util.int64_feature(height),\n", 171 | " 'image/width': dataset_util.int64_feature(width),\n", 172 | " 'image/filename': dataset_util.bytes_feature(filename),\n", 173 | " 'image/source_id': dataset_util.bytes_feature(filename),\n", 174 | " 'image/encoded': dataset_util.bytes_feature(encoded_jpg),\n", 175 | " 'image/format': dataset_util.bytes_feature(image_format),\n", 176 | " 'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),\n", 177 | " 'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),\n", 178 | " 'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),\n", 179 | " 'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),\n", 180 | " 'image/object/class/text': dataset_util.bytes_list_feature(classes_text),\n", 181 | " 'image/object/class/label': dataset_util.int64_list_feature(classes),\n", 182 | " }))\n", 183 | " return tf_example\n", 184 | "\n", 185 | "\n", 186 | "def main(_):\n", 187 | " writer = tf.python_io.TFRecordWriter(FLAGS.output_path)\n", 188 | " path = os.path.join(FLAGS.image_dir)\n", 189 | " examples = pd.read_csv(FLAGS.csv_input)\n", 190 | " grouped = split(examples, 'filename')\n", 191 | " #print(grouped)\n", 192 | " for group in grouped:\n", 193 | " tf_example = create_tf_example(group, path)\n", 194 | " writer.write(tf_example.SerializeToString())\n", 195 | "\n", 196 | " writer.close()\n", 197 | " output_path = os.path.join(os.getcwd(), FLAGS.output_path)\n", 198 | " print('Successfully created the TFRecords: {}'.format(output_path))\n", 199 | "\n", 200 | "\n", 201 | "if __name__ == '__main__':\n", 202 | " tf.app.run()\n" 203 | ] 204 | }, 205 | { 206 | "cell_type": "markdown", 207 | "metadata": {}, 208 | "source": [ 209 | "Once you are done with updating the file with your labels, run the script with appropriate directories of images and csv files" 210 | ] 211 | }, 212 | { 213 | "cell_type": "code", 214 | "execution_count": null, 215 | "metadata": {}, 216 | "outputs": [], 217 | "source": [ 218 | "!python generate_tfrecord.py --csv_input=data/train_labels.csv --output_path=data/train.record --image_dir=data/train\n", 219 | "\n", 220 | "!python generate_tfrecord.py --csv_input=data/test_labels.csv --output_path=data/test.record --image_dir=data/test" 221 | ] 222 | }, 223 | { 224 | "cell_type": "markdown", 225 | "metadata": {}, 226 | "source": [ 227 | "## Creating the Label Map\n", 228 | "In the below cell edit the label id and name and add item objects based on your labels\n", 229 | "\n", 230 | "For example,\n", 231 | "```\n", 232 | "item {\n", 233 | " id: 1\n", 234 | " name: 'cat'\n", 235 | "}\n", 236 | "item {\n", 237 | " id: 2\n", 238 | " name: 'dog'\n", 239 | "}\n", 240 | "```" 241 | ] 242 | }, 243 | { 244 | "cell_type": "code", 245 | "execution_count": null, 246 | "metadata": {}, 247 | "outputs": [], 248 | "source": [ 249 
| "%%writefile data/label_map.pbtxt\n", 250 | "\n", 251 | "item {\n", 252 | " id: 1\n", 253 | " name: 'product'\n", 254 | "}" 255 | ] 256 | }, 257 | { 258 | "cell_type": "markdown", 259 | "metadata": {}, 260 | "source": [ 261 | "## Archiving the processed data files\n", 262 | "Now that we have generated the required files for training like the `label_map.pbtxt`, `train.record` and `test.record`, we will archive it into a zip file so that we can easily upload it to our [training notebook on colab]()." 263 | ] 264 | }, 265 | { 266 | "cell_type": "code", 267 | "execution_count": null, 268 | "metadata": {}, 269 | "outputs": [], 270 | "source": [ 271 | "import os\n", 272 | "import shutil" 273 | ] 274 | }, 275 | { 276 | "cell_type": "code", 277 | "execution_count": null, 278 | "metadata": {}, 279 | "outputs": [], 280 | "source": [ 281 | "os.mkdir('data/tmp/')\n", 282 | "\n", 283 | "shutil.copy(\"data/label_map.pbtxt\", \"data/tmp/label_map.pbtxt\")\n", 284 | "shutil.copy(\"data/train.record\", \"data/tmp/train.record\")\n", 285 | "shutil.copy(\"data/test.record\", \"data/tmp/test.record\")\n", 286 | "\n", 287 | "shutil.make_archive('data.zip', 'zip', 'data/tmp')\n", 288 | "\n", 289 | "shutil.rmtree('data/tmp/')" 290 | ] 291 | }, 292 | { 293 | "cell_type": "markdown", 294 | "metadata": {}, 295 | "source": [ 296 | "#### Now you can move on to the [colab training notebook](https://colab.research.google.com/github/theneuralbeing/object_detection_template/blob/master/object_detection_training.ipynb) for further instructions" 297 | ] 298 | }, 299 | { 300 | "cell_type": "code", 301 | "execution_count": null, 302 | "metadata": {}, 303 | "outputs": [], 304 | "source": [] 305 | } 306 | ], 307 | "metadata": { 308 | "kernelspec": { 309 | "display_name": "Python 3", 310 | "language": "python", 311 | "name": "python3" 312 | }, 313 | "language_info": { 314 | "codemirror_mode": { 315 | "name": "ipython", 316 | "version": 3 317 | }, 318 | "file_extension": ".py", 319 | "mimetype": "text/x-python", 320 | "name": "python", 321 | "nbconvert_exporter": "python", 322 | "pygments_lexer": "ipython3", 323 | "version": "3.6.9" 324 | } 325 | }, 326 | "nbformat": 4, 327 | "nbformat_minor": 2 328 | } 329 | -------------------------------------------------------------------------------- /data_preprocessing/README.md: -------------------------------------------------------------------------------- 1 | # Data Gathering and Processing 2 | 3 | To train a custom object detection model from scratch on your own dataset, follow these steps on your local computer 4 | 5 | 1. Gather all the images and put a split the images into the `data/train` and `data/test` directories. Around 50 images per class is enough or more than 50 if you have only 1 class or so. 6 | 2. Run labelImg.exe application and start annotating the images. Then edit `data/predefined_classes.txt` file with your class names all separated by a new line. In labelImg, click on Open Dir and navigate to the train and test folders. Draw bounding boxes and label them and save them to the same directories as the images. Note: You will have to save for each image. To speed up workflow, use `w` key to draw a box, `d` key to move on to next image and `ctrl + s` to save. If you can't figure out how to use labelImg, you can just google it and you will find tons of resources. 7 | 3. Once you are done with annotating, open the [Preprocess_Data.ipynb](Preprocess_Data.ipynb) notebook for further instructions. 
8 | -------------------------------------------------------------------------------- /data_preprocessing/data/label_map.pbtxt: -------------------------------------------------------------------------------- 1 | 2 | item { 3 | id: 1 4 | name: 'product' 5 | } 6 | -------------------------------------------------------------------------------- /data_preprocessing/data/predefined_classes.txt: -------------------------------------------------------------------------------- 1 | class 1 2 | class 2 3 | class n -------------------------------------------------------------------------------- /data_preprocessing/data/test.record: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nerdimite/object_detection_template/c0d8090fcbeb2c30be500992b55e62eee03d72c4/data_preprocessing/data/test.record -------------------------------------------------------------------------------- /data_preprocessing/data/test_labels.csv: -------------------------------------------------------------------------------- 1 | filename,width,height,class,xmin,ymin,xmax,ymax 2 | IMG20191227143316.jpg,800,600,product,366,122,552,322 3 | IMG20191227143323.jpg,800,600,product,103,206,492,405 4 | IMG20191227143356.jpg,800,600,product,208,24,592,520 5 | IMG20191227143404.jpg,800,600,product,249,121,584,416 6 | IMG20191227143411.jpg,800,600,product,60,40,445,521 7 | IMG20191227143421.jpg,800,600,product,4,86,466,492 8 | IMG20191227143429.jpg,800,600,product,42,270,496,396 9 | IMG20191227143438.jpg,800,600,product,43,86,505,600 10 | IMG20191227143448.jpg,800,600,product,156,4,543,462 11 | IMG20191227143509.jpg,800,600,product,97,189,774,478 12 | IMG20191227143511.jpg,800,600,product,117,217,745,406 13 | IMG20191227143532.jpg,800,600,product,464,126,611,218 14 | IMG20191227143549.jpg,800,600,product,60,5,586,153 15 | IMG20191227143553.jpg,800,600,product,91,295,543,564 16 | IMG20191227143708.jpg,800,600,product,141,95,685,560 17 | IMG20191227143709.jpg,800,600,product,138,205,760,430 18 | IMG20191227143746.jpg,800,600,product,193,108,648,424 19 | IMG20191227143752.jpg,800,600,product,221,233,720,589 20 | IMG20191227143849.jpg,800,600,product,180,30,743,582 21 | IMG20191227143855.jpg,800,600,product,12,1,664,596 22 | IMG20191227143901.jpg,800,600,product,81,142,685,551 23 | IMG20191227143913.jpg,800,600,product,201,39,798,462 24 | IMG20191227143924.jpg,800,600,product,595,25,797,183 25 | IMG20191227144022.jpg,800,600,product,406,168,760,480 26 | IMG20191227144022.jpg,800,600,product,126,164,334,417 27 | IMG20191227144022.jpg,800,600,product,215,510,387,600 28 | IMG20191227144028.jpg,800,600,product,315,394,795,519 29 | IMG20191227144028.jpg,800,600,product,560,91,800,273 30 | IMG20191227144028.jpg,800,600,product,1,184,411,348 31 | IMG20191227144028.jpg,800,600,product,223,1,731,280 32 | IMG20191227144047.jpg,800,600,product,178,95,447,311 33 | IMG20191227144047.jpg,800,600,product,151,350,384,502 34 | IMG20191227144047.jpg,800,600,product,341,294,740,487 35 | IMG20191227144047.jpg,800,600,product,450,124,750,321 36 | IMG20191227144054.jpg,800,600,product,345,63,696,332 37 | IMG20191227144131.jpg,800,600,product,119,235,442,516 38 | IMG20191227144131.jpg,800,600,product,431,281,727,474 39 | IMG20191227144134.jpg,800,600,product,206,327,554,555 40 | IMG20191227144147.jpg,800,600,product,148,242,800,507 41 | IMG20191227144147.jpg,800,600,product,261,70,560,311 42 | -------------------------------------------------------------------------------- 
/data_preprocessing/data/train.record: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/nerdimite/object_detection_template/c0d8090fcbeb2c30be500992b55e62eee03d72c4/data_preprocessing/data/train.record -------------------------------------------------------------------------------- /data_preprocessing/data/train_labels.csv: -------------------------------------------------------------------------------- 1 | filename,width,height,class,xmin,ymin,xmax,ymax 2 | IMG20191227143251.jpg,800,600,product,219,29,532,494 3 | IMG20191227143253.jpg,800,600,product,214,68,508,487 4 | IMG20191227143311.jpg,800,600,product,522,83,658,254 5 | IMG20191227143318.jpg,800,600,product,278,282,544,473 6 | IMG20191227143321.jpg,800,600,product,408,124,800,488 7 | IMG20191227143325.jpg,800,600,product,194,230,521,409 8 | IMG20191227143327.jpg,800,600,product,137,206,797,600 9 | IMG20191227143329.jpg,800,600,product,323,194,584,571 10 | IMG20191227143331.jpg,800,600,product,307,133,592,570 11 | IMG20191227143401.jpg,800,600,product,194,84,537,430 12 | IMG20191227143407.jpg,800,600,product,194,209,584,475 13 | IMG20191227143413.jpg,800,600,product,183,1,577,224 14 | IMG20191227143416.jpg,800,600,product,101,461,400,600 15 | IMG20191227143425.jpg,800,600,product,231,240,485,520 16 | IMG20191227143432.jpg,800,600,product,154,288,566,478 17 | IMG20191227143434.jpg,800,600,product,331,296,596,549 18 | IMG20191227143441.jpg,800,600,product,173,247,439,542 19 | IMG20191227143450.jpg,800,600,product,163,4,551,376 20 | IMG20191227143453.jpg,800,600,product,242,19,592,518 21 | IMG20191227143455.jpg,800,600,product,295,251,516,543 22 | IMG20191227143458.jpg,800,600,product,144,313,393,485 23 | IMG20191227143507.jpg,800,600,product,169,161,649,499 24 | IMG20191227143512.jpg,800,600,product,262,127,628,518 25 | IMG20191227143514.jpg,800,600,product,16,252,571,417 26 | IMG20191227143517.jpg,800,600,product,299,47,609,282 27 | IMG20191227143520.jpg,800,600,product,36,185,667,510 28 | IMG20191227143523.jpg,800,600,product,66,116,605,441 29 | IMG20191227143526.jpg,800,600,product,61,31,371,300 30 | IMG20191227143526.jpg,800,600,product,145,325,423,486 31 | IMG20191227143542.jpg,800,600,product,303,116,533,265 32 | IMG20191227143552.jpg,800,600,product,56,32,523,383 33 | IMG20191227143555.jpg,800,600,product,170,413,485,577 34 | IMG20191227143600.jpg,800,600,product,191,230,456,419 35 | IMG20191227143604.jpg,800,600,product,308,274,528,411 36 | IMG20191227143606.jpg,800,600,product,249,366,614,529 37 | IMG20191227143608.jpg,800,600,product,56,130,578,473 38 | IMG20191227143611.jpg,800,600,product,40,63,739,452 39 | IMG20191227143618.jpg,800,600,product,132,266,377,429 40 | IMG20191227143618.jpg,800,600,product,396,271,642,459 41 | IMG20191227143618.jpg,800,600,product,236,106,619,295 42 | IMG20191227143621.jpg,800,600,product,23,216,292,362 43 | IMG20191227143621.jpg,800,600,product,160,28,606,265 44 | IMG20191227143621.jpg,800,600,product,198,262,681,516 45 | IMG20191227143625.jpg,800,600,product,17,131,665,481 46 | IMG20191227143625.jpg,800,600,product,285,446,627,600 47 | IMG20191227143630.jpg,800,600,product,118,182,532,502 48 | IMG20191227143632.jpg,800,600,product,171,41,628,389 49 | IMG20191227143634.jpg,800,600,product,167,63,626,378 50 | IMG20191227143637.jpg,800,600,product,17,40,356,430 51 | IMG20191227143646.jpg,800,600,product,248,216,497,421 52 | IMG20191227143648.jpg,800,600,product,321,268,625,427 53 | 
IMG20191227143656.jpg,800,600,product,273,236,491,357 54 | IMG20191227143656.jpg,800,600,product,200,354,684,493 55 | IMG20191227143703.jpg,800,600,product,5,89,691,600 56 | IMG20191227143706.jpg,800,600,product,94,106,626,497 57 | IMG20191227143712.jpg,800,600,product,154,154,508,519 58 | IMG20191227143716.jpg,800,600,product,211,105,575,365 59 | IMG20191227143721.jpg,800,600,product,128,48,644,503 60 | IMG20191227143725.jpg,800,600,product,394,341,649,542 61 | IMG20191227143731.jpg,800,600,product,154,143,365,307 62 | IMG20191227143735.jpg,800,600,product,40,70,359,197 63 | IMG20191227143739.jpg,800,600,product,209,302,655,600 64 | IMG20191227143742.jpg,800,600,product,158,92,556,432 65 | IMG20191227143750.jpg,800,600,product,119,31,800,528 66 | IMG20191227143755.jpg,800,600,product,192,335,686,600 67 | IMG20191227143759.jpg,800,600,product,316,214,558,352 68 | IMG20191227143759.jpg,800,600,product,421,343,675,484 69 | IMG20191227143759.jpg,800,600,product,136,309,662,565 70 | IMG20191227143803.jpg,800,600,product,1,129,640,595 71 | IMG20191227143806.jpg,800,600,product,317,380,547,508 72 | IMG20191227143806.jpg,800,600,product,258,496,645,600 73 | IMG20191227143806.jpg,800,600,product,5,7,800,361 74 | IMG20191227143809.jpg,800,600,product,28,315,604,600 75 | IMG20191227143814.jpg,800,600,product,232,200,573,454 76 | IMG20191227143814.jpg,800,600,product,119,394,639,592 77 | IMG20191227143819.jpg,800,600,product,19,9,781,589 78 | IMG20191227143823.jpg,800,600,product,23,8,800,585 79 | IMG20191227143827.jpg,800,600,product,7,108,543,593 80 | IMG20191227143830.jpg,800,600,product,140,22,645,516 81 | IMG20191227143833.jpg,800,600,product,165,62,664,503 82 | IMG20191227143835.jpg,800,600,product,158,44,753,583 83 | IMG20191227143837.jpg,800,600,product,31,25,752,504 84 | IMG20191227143839.jpg,800,600,product,110,21,800,435 85 | IMG20191227143846.jpg,800,600,product,44,109,771,519 86 | IMG20191227143848.jpg,800,600,product,34,3,775,600 87 | IMG20191227143851.jpg,800,600,product,12,5,676,600 88 | IMG20191227143853.jpg,800,600,product,156,174,744,505 89 | IMG20191227143856.jpg,800,600,product,133,99,800,599 90 | IMG20191227143859.jpg,800,600,product,15,88,704,480 91 | IMG20191227143900.jpg,800,600,product,12,68,634,368 92 | IMG20191227143904.jpg,800,600,product,15,236,723,566 93 | IMG20191227143907.jpg,800,600,product,5,209,504,554 94 | IMG20191227143911.jpg,800,600,product,92,50,777,491 95 | IMG20191227143917.jpg,800,600,product,134,143,632,435 96 | IMG20191227143920.jpg,800,600,product,238,198,741,477 97 | IMG20191227143920.jpg,800,600,product,321,24,663,146 98 | IMG20191227143923.jpg,800,600,product,567,14,669,158 99 | IMG20191227143926.jpg,800,600,product,555,5,753,125 100 | IMG20191227143931.jpg,800,600,product,309,171,800,582 101 | IMG20191227143933.jpg,800,600,product,127,232,800,598 102 | IMG20191227143938.jpg,800,600,product,76,65,656,433 103 | IMG20191227143940.jpg,800,600,product,160,157,726,582 104 | IMG20191227143944.jpg,800,600,product,8,5,794,421 105 | IMG20191227143947.jpg,800,600,product,10,308,620,600 106 | IMG20191227143947.jpg,800,600,product,437,265,684,410 107 | IMG20191227143947.jpg,800,600,product,597,483,670,584 108 | IMG20191227143949.jpg,800,600,product,101,226,577,531 109 | IMG20191227143952.jpg,800,600,product,149,205,596,495 110 | IMG20191227143952.jpg,800,600,product,160,400,570,584 111 | IMG20191227143955.jpg,800,600,product,328,265,665,494 112 | IMG20191227143955.jpg,800,600,product,301,184,570,268 113 | IMG20191227143955.jpg,800,600,product,336,415,800,539 114 
| IMG20191227143959.jpg,800,600,product,294,305,648,555 115 | IMG20191227143959.jpg,800,600,product,523,476,705,572 116 | IMG20191227144002.jpg,800,600,product,53,21,661,452 117 | IMG20191227144002.jpg,800,600,product,369,358,588,519 118 | IMG20191227144002.jpg,800,600,product,345,514,703,600 119 | IMG20191227144016.jpg,800,600,product,147,129,289,317 120 | IMG20191227144016.jpg,800,600,product,241,342,364,468 121 | IMG20191227144016.jpg,800,600,product,329,117,506,318 122 | IMG20191227144016.jpg,800,600,product,422,342,625,509 123 | IMG20191227144020.jpg,800,600,product,309,285,649,600 124 | IMG20191227144020.jpg,800,600,product,40,97,282,321 125 | IMG20191227144020.jpg,800,600,product,345,2,747,131 126 | IMG20191227144025.jpg,800,600,product,76,24,506,254 127 | IMG20191227144025.jpg,800,600,product,185,347,569,475 128 | IMG20191227144025.jpg,800,600,product,447,189,725,375 129 | IMG20191227144025.jpg,800,600,product,696,231,800,377 130 | IMG20191227144031.jpg,800,600,product,128,370,674,532 131 | IMG20191227144031.jpg,800,600,product,480,19,800,255 132 | IMG20191227144031.jpg,800,600,product,1,85,311,278 133 | IMG20191227144036.jpg,800,600,product,53,277,451,432 134 | IMG20191227144036.jpg,800,600,product,74,43,518,257 135 | IMG20191227144036.jpg,800,600,product,566,248,800,407 136 | IMG20191227144036.jpg,800,600,product,303,484,729,600 137 | IMG20191227144040.jpg,800,600,product,189,385,563,514 138 | IMG20191227144040.jpg,800,600,product,123,180,423,385 139 | IMG20191227144040.jpg,800,600,product,561,178,800,337 140 | IMG20191227144040.jpg,800,600,product,123,5,481,159 141 | IMG20191227144042.jpg,800,600,product,274,348,544,574 142 | IMG20191227144042.jpg,800,600,product,108,219,397,420 143 | IMG20191227144042.jpg,800,600,product,129,7,493,210 144 | IMG20191227144042.jpg,800,600,product,559,233,800,389 145 | IMG20191227144045.jpg,800,600,product,355,137,658,313 146 | IMG20191227144045.jpg,800,600,product,62,97,356,314 147 | IMG20191227144045.jpg,800,600,product,100,347,343,495 148 | IMG20191227144045.jpg,800,600,product,530,327,800,492 149 | IMG20191227144048.jpg,800,600,product,161,351,418,528 150 | IMG20191227144048.jpg,800,600,product,361,271,773,536 151 | IMG20191227144048.jpg,800,600,product,164,87,477,317 152 | IMG20191227144051.jpg,800,600,product,195,140,465,346 153 | IMG20191227144051.jpg,800,600,product,362,305,782,568 154 | IMG20191227144056.jpg,800,600,product,249,291,507,472 155 | IMG20191227144056.jpg,800,600,product,438,18,609,180 156 | IMG20191227144056.jpg,800,600,product,52,1,170,73 157 | IMG20191227144058.jpg,800,600,product,28,125,205,278 158 | IMG20191227144058.jpg,800,600,product,447,244,587,392 159 | IMG20191227144058.jpg,800,600,product,4,281,377,504 160 | IMG20191227144058.jpg,800,600,product,265,494,477,600 161 | IMG20191227144105.jpg,800,600,product,442,383,582,519 162 | IMG20191227144105.jpg,800,600,product,4,251,367,418 163 | IMG20191227144105.jpg,800,600,product,272,41,495,169 164 | IMG20191227144109.jpg,800,600,product,142,221,443,495 165 | IMG20191227144109.jpg,800,600,product,477,497,650,600 166 | IMG20191227144109.jpg,800,600,product,396,1,603,115 167 | IMG20191227144128.jpg,800,600,product,110,4,587,368 168 | IMG20191227144128.jpg,800,600,product,528,106,759,470 169 | IMG20191227144136.jpg,800,600,product,94,86,782,425 170 | IMG20191227144136.jpg,800,600,product,233,354,586,568 171 | IMG20191227144139.jpg,800,600,product,175,215,472,430 172 | IMG20191227144139.jpg,800,600,product,338,340,641,540 173 | 
--------------------------------------------------------------------------------
/data_preprocessing/generate_tfrecord.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Usage:
 3 |   # From tensorflow/models/
 4 |   # Create train data:
 5 |   python generate_tfrecord.py --csv_input=data/train_labels.csv --output_path=train.record
 6 | 
 7 |   # Create test data:
 8 |   python generate_tfrecord.py --csv_input=data/test_labels.csv --output_path=test.record
 9 | """
10 | from __future__ import division
11 | from __future__ import print_function
12 | from __future__ import absolute_import
13 | 
14 | import os
15 | import io
16 | import pandas as pd
17 | import tensorflow as tf
18 | 
19 | from PIL import Image
20 | from object_detection.utils import dataset_util
21 | from collections import namedtuple, OrderedDict
22 | 
23 | flags = tf.app.flags
24 | flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
25 | flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
26 | flags.DEFINE_string('image_dir', '', 'Path to images')
27 | FLAGS = flags.FLAGS
28 | 
29 | 
30 | # REPLACE/ADD IF STATEMENTS AS PER YOUR LABELS
31 | def class_text_to_int(row_label):
32 |     if row_label == 'LABEL_1':
33 |         return 1
34 |     elif row_label == 'LABEL_2':
35 |         return 2
36 |     else:
37 |         return None
38 | 
39 | 
40 | def split(df, group):
41 |     data = namedtuple('data', ['filename', 'object'])
42 |     gb = df.groupby(group)
43 |     return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]
44 | 
45 | 
46 | def create_tf_example(group, path):
47 |     with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
48 |         encoded_jpg = fid.read()
49 |     encoded_jpg_io = io.BytesIO(encoded_jpg)
50 |     image = Image.open(encoded_jpg_io)
51 |     width, height = image.size
52 | 
53 |     filename = group.filename.encode('utf8')
54 |     image_format = b'jpg'
55 |     xmins = []
56 |     xmaxs = []
57 |     ymins = []
58 |     ymaxs = []
59 |     classes_text = []
60 |     classes = []
61 | 
62 |     for index, row in group.object.iterrows():
63 |         xmins.append(row['xmin'] / width)
64 |         xmaxs.append(row['xmax'] / width)
65 |         ymins.append(row['ymin'] / height)
66 |         ymaxs.append(row['ymax'] / height)
67 |         classes_text.append(row['class'].encode('utf8'))
68 |         classes.append(class_text_to_int(row['class']))
69 | 
70 |     tf_example = tf.train.Example(features=tf.train.Features(feature={
71 |         'image/height': dataset_util.int64_feature(height),
72 |         'image/width': dataset_util.int64_feature(width),
73 |         'image/filename': dataset_util.bytes_feature(filename),
74 |         'image/source_id': dataset_util.bytes_feature(filename),
75 |         'image/encoded': dataset_util.bytes_feature(encoded_jpg),
76 |         'image/format': dataset_util.bytes_feature(image_format),
77 |         'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
78 |         'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
79 |         'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
80 |         'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
81 |         'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
82 |         'image/object/class/label': dataset_util.int64_list_feature(classes),
83 |     }))
84 |     return tf_example
85 | 
86 | 
87 | def main(_):
88 |     writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
89 |     path = os.path.join(FLAGS.image_dir)
90 |     examples = pd.read_csv(FLAGS.csv_input)
91 |     grouped = split(examples, 'filename')
92 |     # print(grouped)
93 |     for group in grouped:
94 |         tf_example = create_tf_example(group, path)
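95 |         # serialize the example and append it to the output TFRecord file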
96 |         writer.write(tf_example.SerializeToString())
97 | 
98 |     writer.close()
99 |     output_path = os.path.join(os.getcwd(), FLAGS.output_path)
100 |     print('Successfully created the TFRecords: {}'.format(output_path))
101 | 
102 | 
103 | if __name__ == '__main__':
104 |     tf.app.run()
105 | --------------------------------------------------------------------------------
/data_preprocessing/labelImg.exe:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nerdimite/object_detection_template/c0d8090fcbeb2c30be500992b55e62eee03d72c4/data_preprocessing/labelImg.exe
--------------------------------------------------------------------------------
/data_preprocessing/transform_resolution.py:
--------------------------------------------------------------------------------
 1 | from PIL import Image
 2 | import os
 3 | import argparse
 4 | 
 5 | def rescale_images(directory, size):
 6 |     # resizes every image in `directory` in place to the given (width, height)
 7 |     for img in os.listdir(directory):
 8 |         im = Image.open(os.path.join(directory, img))
 9 |         im_resized = im.resize(tuple(size), Image.ANTIALIAS)
10 |         im_resized.save(os.path.join(directory, img))
11 | 
12 | if __name__ == '__main__':
13 |     parser = argparse.ArgumentParser(description="Rescale images")
14 |     parser.add_argument('-d', '--directory', type=str, required=True, help='Directory containing the images')
15 |     parser.add_argument('-s', '--size', type=int, nargs=2, required=True, metavar=('width', 'height'), help='Image size')
16 |     args = parser.parse_args()
17 |     rescale_images(args.directory, args.size)
18 | --------------------------------------------------------------------------------
/data_preprocessing/xml_to_csv.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Usage:
 3 | # Create train data:
 4 | python xml_to_csv.py -i [PATH_TO_IMAGES_FOLDER]/train -o [PATH_TO_ANNOTATIONS_FOLDER]/train_labels.csv
 5 | 
 6 | # Create test data:
 7 | python xml_to_csv.py -i [PATH_TO_IMAGES_FOLDER]/test -o [PATH_TO_ANNOTATIONS_FOLDER]/test_labels.csv
 8 | """
 9 | 
10 | import os
11 | import glob
12 | import pandas as pd
13 | import argparse
14 | import xml.etree.ElementTree as ET
15 | 
16 | 
17 | def xml_to_csv(path):
18 |     """Iterates through all .xml files (generated by labelImg) in a given directory and combines them in a single Pandas dataframe. 
19 | 
20 |     Parameters:
21 |     ----------
22 |     path : {str}
23 |         The path containing the .xml files
24 |     Returns
25 |     -------
26 |     Pandas DataFrame
27 |         The produced dataframe
28 |     """
29 | 
30 |     xml_list = []
31 |     for xml_file in glob.glob(path + '/*.xml'):
32 |         tree = ET.parse(xml_file)
33 |         root = tree.getroot()
34 |         for member in root.findall('object'):
35 |             value = (root.find('filename').text,
36 |                      int(root.find('size')[0].text),
37 |                      int(root.find('size')[1].text),
38 |                      member[0].text,
39 |                      int(member[4][0].text),
40 |                      int(member[4][1].text),
41 |                      int(member[4][2].text),
42 |                      int(member[4][3].text)
43 |                      )
44 |             xml_list.append(value)
45 |     column_name = ['filename', 'width', 'height',
46 |                    'class', 'xmin', 'ymin', 'xmax', 'ymax']
47 |     xml_df = pd.DataFrame(xml_list, columns=column_name)
48 |     return xml_df
49 | 
50 | 
51 | def main():
52 |     # Initiate argument parser
53 |     parser = argparse.ArgumentParser(
54 |         description="Sample TensorFlow XML-to-CSV converter")
55 |     parser.add_argument("-i",
56 |                         "--inputDir",
57 |                         help="Path to the folder where the input .xml files are stored",
58 |                         type=str)
59 |     parser.add_argument("-o",
60 |                         "--outputFile",
61 |                         help="Name of output .csv file (including path)", type=str)
62 |     args = parser.parse_args()
63 | 
64 |     if(args.inputDir is None):
65 |         args.inputDir = os.getcwd()
66 |     if(args.outputFile is None):
67 |         args.outputFile = args.inputDir + "/labels.csv"
68 | 
69 |     assert(os.path.isdir(args.inputDir))
70 | 
71 |     xml_df = xml_to_csv(args.inputDir)
72 |     xml_df.to_csv(
73 |         args.outputFile, index=None)
74 |     print('Successfully converted xml to csv.')
75 | 
76 | 
77 | if __name__ == '__main__':
78 |     main()
79 | --------------------------------------------------------------------------------
/inference_webcam.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import os
 3 | import tensorflow as tf
 4 | import cv2
 5 | from object_detection.utils import visualization_utils as vis_util
 6 | import sys
 7 | 
 8 | # ADD PATH TO FROZEN GRAPH
 9 | PATH_TO_FROZEN_GRAPH = '[path to your inference graph]/frozen_inference_graph.pb'
10 | 
11 | cap = cv2.VideoCapture(0)
12 | 
13 | # reads the frozen graph
14 | detection_graph = tf.Graph()
15 | with detection_graph.as_default():
16 |     od_graph_def = tf.GraphDef()
17 |     with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
18 |         serialized_graph = fid.read()
19 |         od_graph_def.ParseFromString(serialized_graph)
20 |         tf.import_graph_def(od_graph_def, name='')
21 | 
22 | # EDIT THE BELOW DICTIONARY ACCORDING TO YOUR LABEL MAP
23 | # For Example,
24 | # if your label_map.pbtxt was like:
25 | # item {
26 | #     id: 1
27 | #     name: 'cat'
28 | # }
29 | # item {
30 | #     id: 2
31 | #     name: 'dog'
32 | # }
33 | # then your dictionary would look like this:
34 | # category_index = {1: {'id': 1, 'name': 'cat'}, 2: {'id': 2, 'name': 'dog'}}
35 | category_index = {1: {'id': 1, 'name': 'LABEL_1'}, 2: {'id': 2, 'name': 'LABEL_2'}}
36 | # Alternatively (untested sketch): if your object_detection install ships the
37 | # label_map_util helper, the same dict can be built straight from the label map file:
38 | # from object_detection.utils import label_map_util
39 | # category_index = label_map_util.create_category_index_from_labelmap('[path to]/label_map.pbtxt', use_display_name=True)
40 | 
41 | 
42 | # Detection
43 | with detection_graph.as_default():
44 |     with tf.Session(graph=detection_graph) as sess:
45 |         while True:
46 |             # Read frame from camera
47 |             ret, image_np = cap.read()
48 |             # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
49 |             image_np_expanded = np.expand_dims(image_np, axis=0)
50 |             # Extract image tensor
51 |             image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
52 |             # Extract detection boxes
53 |             boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
54 |             # Extract detection scores
55 |             scores = detection_graph.get_tensor_by_name('detection_scores:0')
56 |             # Extract detection classes
57 |             classes = detection_graph.get_tensor_by_name('detection_classes:0')
58 |             # Extract number of detections
59 |             num_detections = detection_graph.get_tensor_by_name(
60 |                 'num_detections:0')
61 |             # Actual detection.
62 |             (boxes, scores, classes, num_detections) = sess.run(
63 |                 [boxes, scores, classes, num_detections],
64 |                 feed_dict={image_tensor: image_np_expanded})
65 | 
66 |             # Visualization of the results of a detection.
67 |             vis_util.visualize_boxes_and_labels_on_image_array(
68 |                 image_np,
69 |                 np.squeeze(boxes),
70 |                 np.squeeze(classes).astype(np.int32),
71 |                 np.squeeze(scores),
72 |                 category_index,
73 |                 use_normalized_coordinates=True,
74 |                 line_thickness=3,
75 |             )
76 |             # Display output
77 |             cv2.imshow('Detection', cv2.resize(image_np, (800, 600)))
78 |             if cv2.waitKey(25) & 0xFF == ord('q'):
79 |                 break
80 | 
81 | cv2.destroyAllWindows()
82 | cap.release()
83 | sys.exit()
84 | --------------------------------------------------------------------------------
/object_detection_training.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {
6 | "colab_type": "text",
7 | "id": "TUZXYXeFDFOI"
8 | },
9 | "source": [
10 | "# Custom Object Detection\n",
11 | "## Colab Trainer\n",
12 | "\n",
13 | "This notebook is for training your custom model after running [this](https://github.com/theneuralbeing/object_detection_template/blob/master/data_preprocessing/Preprocess_Data.ipynb) notebook locally on your computer, which generates the TFRecord files for your custom dataset. Run the cells in order and follow the steps written as comments and markdown cells along the way."
14 | ]
15 | },
16 | {
17 | "cell_type": "markdown",
18 | "metadata": {
19 | "colab_type": "text",
20 | "id": "TJEMVvKgHYMo"
21 | },
22 | "source": [
23 | "Colab VM Project Structure\n",
24 | "\n",
25 | "```\n",
26 | "content/\n",
27 | " ├─ data/\n",
28 | " │ ├── label_map.pbtxt\n",
29 | " │ ├── test.record\n",
30 | " │ └── train.record\n",
31 | " └─ models/\n",
32 | " ├─ research/\n",
33 | " │ ├── fine_tuned_model/\n",
34 | " │ │ ├── frozen_inference_graph.pb\n",
35 | " │ │ └── ...\n",
36 | " │ │ \n",
37 | " │ ├── pretrained_model/\n",
38 | " │ │ ├── frozen_inference_graph.pb\n",
39 | " │ │ └── ...\n",
40 | " │ │ \n",
41 | " │ ├── object_detection/\n",
42 | " │ │ ├── utils/\n",
43 | " │ │ ├── samples/\n",
44 | " │ │ │ ├── configs/ \n",
45 | " │ │ │ │ ├── ssd_mobilenet_v2_coco.config\n",
46 | " │ │ │ │ ├── ssd_mobilenet_v1_coco.config\n",
47 | " │ │ │ │ ├── rfcn_resnet101_pets.config\n",
48 | " │ │ │ │ └── ...\n",
49 | " │ │ │ └── ... \n",
50 | " │ │ │ \n",
51 | " │ │ ├── export_inference_graph.py\n",
52 | " │ │ ├── model_main.py\n",
53 | " │ │ └── ...\n",
54 | " │ │ \n",
55 | " │ ├── training/\n",
56 | " │ │ ├── events.out.tfevents.xxxxx\n",
57 | " │ │ └── ... 
\n", 58 | " │ └── ...\n", 59 | " └── ...\n", 60 | "\n", 61 | "\n" 62 | ] 63 | }, 64 | { 65 | "cell_type": "markdown", 66 | "metadata": { 67 | "colab_type": "text", 68 | "id": "AhOmXekWIGON" 69 | }, 70 | "source": [ 71 | "## Installing Prerequisites and Importing the Libraries" 72 | ] 73 | }, 74 | { 75 | "cell_type": "code", 76 | "execution_count": 0, 77 | "metadata": { 78 | "colab": {}, 79 | "colab_type": "code", 80 | "id": "h5uF6UG8IIo0" 81 | }, 82 | "outputs": [], 83 | "source": [ 84 | "!apt-get install -qq protobuf-compiler python-pil python-lxml python-tk\n", 85 | "\n", 86 | "!pip install -qq Cython contextlib2 pillow lxml matplotlib\n", 87 | "\n", 88 | "!pip install -qq pycocotools" 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 0, 94 | "metadata": { 95 | "colab": {}, 96 | "colab_type": "code", 97 | "id": "SUmCIgQuINJY" 98 | }, 99 | "outputs": [], 100 | "source": [ 101 | "from __future__ import division, print_function, absolute_import\n", 102 | "\n", 103 | "import pandas as pd\n", 104 | "import numpy as np\n", 105 | "import csv\n", 106 | "import re\n", 107 | "import cv2 \n", 108 | "import os\n", 109 | "import glob\n", 110 | "import xml.etree.ElementTree as ET\n", 111 | "\n", 112 | "import io\n", 113 | "import tensorflow.compat.v1 as tf\n", 114 | "\n", 115 | "from PIL import Image\n", 116 | "from collections import namedtuple, OrderedDict\n", 117 | "\n", 118 | "import shutil\n", 119 | "import urllib.request\n", 120 | "import tarfile\n", 121 | "\n", 122 | "from google.colab import files" 123 | ] 124 | }, 125 | { 126 | "cell_type": "code", 127 | "execution_count": 0, 128 | "metadata": { 129 | "colab": {}, 130 | "colab_type": "code", 131 | "id": "NZMq86mFISey" 132 | }, 133 | "outputs": [], 134 | "source": [ 135 | "# Object Detection API works in Tensorflow v 1.15 (it is remove in v 2.0)\n", 136 | "print(tf.__version__)" 137 | ] 138 | }, 139 | { 140 | "cell_type": "markdown", 141 | "metadata": { 142 | "colab_type": "text", 143 | "id": "4y_01gbpI2MV" 144 | }, 145 | "source": [ 146 | "## Getting preprocess data" 147 | ] 148 | }, 149 | { 150 | "cell_type": "code", 151 | "execution_count": 0, 152 | "metadata": { 153 | "colab": {}, 154 | "colab_type": "code", 155 | "id": "Lo8RedKFJGDI" 156 | }, 157 | "outputs": [], 158 | "source": [ 159 | "# Upload your data.zip file which contains the record files and label map\n", 160 | "files.upload()" 161 | ] 162 | }, 163 | { 164 | "cell_type": "code", 165 | "execution_count": 0, 166 | "metadata": { 167 | "colab": {}, 168 | "colab_type": "code", 169 | "id": "NHKyMnWWIo3u" 170 | }, 171 | "outputs": [], 172 | "source": [ 173 | "!mkdir data\n", 174 | "!unzip data.zip -d data/" 175 | ] 176 | }, 177 | { 178 | "cell_type": "markdown", 179 | "metadata": { 180 | "colab_type": "text", 181 | "id": "bF25eVB2NFJQ" 182 | }, 183 | "source": [ 184 | "## Downloading and Installing Tensorflow Object Detection API" 185 | ] 186 | }, 187 | { 188 | "cell_type": "code", 189 | "execution_count": 0, 190 | "metadata": { 191 | "colab": {}, 192 | "colab_type": "code", 193 | "id": "zj2WAYyyNEU8" 194 | }, 195 | "outputs": [], 196 | "source": [ 197 | "# Downlaods Tensorflow Object Detection API\n", 198 | "%cd /content\n", 199 | "!git clone --q https://github.com/tensorflow/models.git" 200 | ] 201 | }, 202 | { 203 | "cell_type": "code", 204 | "execution_count": 0, 205 | "metadata": { 206 | "colab": {}, 207 | "colab_type": "code", 208 | "id": "XpzvxKQgNSsd" 209 | }, 210 | "outputs": [], 211 | "source": [ 212 | "%cd /content/models/research\n", 213 | "#compiling 
the proto buffers (not important to understand for this project but you can learn more about them here: https://developers.google.com/protocol-buffers/)\n",
214 | "!protoc object_detection/protos/*.proto --python_out=.\n",
215 | "\n",
216 | "# exports the PYTHONPATH environment variable with the research and slim folders' paths\n",
217 | "os.environ['PYTHONPATH'] += ':/content/models/research/:/content/models/research/slim/'"
218 | ]
219 | },
220 | {
221 | "cell_type": "code",
222 | "execution_count": 0,
223 | "metadata": {
224 | "colab": {},
225 | "colab_type": "code",
226 | "id": "gg9wxBvjNVKH"
227 | },
228 | "outputs": [],
229 | "source": [
230 | "# testing the model builder\n",
231 | "!python3 object_detection/builders/model_builder_test.py"
232 | ]
233 | },
234 | {
235 | "cell_type": "markdown",
236 | "metadata": {
237 | "colab_type": "text",
238 | "id": "pR2350wlMo5E"
239 | },
240 | "source": [
241 | "## Getting the pretrained model\n",
242 | "\n",
243 | "Check other models from [here](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md#coco-trained-models)."
244 | ]
245 | },
246 | {
247 | "cell_type": "code",
248 | "execution_count": 0,
249 | "metadata": {
250 | "colab": {},
251 | "colab_type": "code",
252 | "id": "5k8_31DiMoo3"
253 | },
254 | "outputs": [],
255 | "source": [
256 | "# Some models to train on\n",
257 | "MODELS_CONFIG = {\n",
258 | " 'ssd_mobilenet_v2': {\n",
259 | " 'model_name': 'ssd_mobilenet_v2_coco_2018_03_29',\n",
260 | " 'pipeline_file': 'ssd_mobilenet_v2_coco.config',\n",
261 | " }\n",
262 | "}\n",
263 | "\n",
264 | "# Select a model in `MODELS_CONFIG`.\n",
265 | "selected_model = 'ssd_mobilenet_v2'\n"
266 | ]
267 | },
268 | {
269 | "cell_type": "code",
270 | "execution_count": 0,
271 | "metadata": {
272 | "colab": {},
273 | "colab_type": "code",
274 | "id": "tJTbelkjLebq"
275 | },
276 | "outputs": [],
277 | "source": [
278 | "%cd /content/models/research\n",
279 | "\n",
280 | "# Name of the object detection model to use.\n",
281 | "MODEL = MODELS_CONFIG[selected_model]['model_name']\n",
282 | "\n",
283 | "# Name of the pipeline file in tensorflow object detection API.\n",
284 | "pipeline_file = MODELS_CONFIG[selected_model]['pipeline_file']\n",
285 | "\n",
286 | "# selecting the model\n",
287 | "MODEL_FILE = MODEL + '.tar.gz'\n",
288 | "\n",
289 | "# creating the download link for the model selected\n",
290 | "DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'\n",
291 | "\n",
292 | "# the destination folder where the model will be saved\n",
293 | "fine_tune_dir = '/content/models/research/pretrained_model'\n",
294 | "\n",
295 | "# checks if the model has already been downloaded\n",
296 | "if not (os.path.exists(MODEL_FILE)):\n",
297 | " urllib.request.urlretrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)\n",
298 | "\n",
299 | "# unzipping the file and extracting its content\n",
300 | "tar = tarfile.open(MODEL_FILE)\n",
301 | "tar.extractall()\n",
302 | "tar.close()\n",
303 | "\n",
304 | "# removing the downloaded tar file and moving the extracted model to fine_tune_dir\n",
305 | "os.remove(MODEL_FILE)\n",
306 | "if (os.path.exists(fine_tune_dir)):\n",
307 | " shutil.rmtree(fine_tune_dir)\n",
308 | "os.rename(MODEL, fine_tune_dir)"
309 | ]
310 | },
311 | {
312 | "cell_type": "code",
313 | "execution_count": 0,
314 | "metadata": {
315 | "colab": {},
316 | "colab_type": "code",
317 | "id": "E7mB0qhVTD3C"
318 | },
319 | "outputs": [],
320 | "source": [
321 | "# checking the content of the pretrained model.\n",
322 | "# 
this is the directory of the \"fine_tune_checkpoint\" that is used in the config file.\n",
323 | "!echo {fine_tune_dir}\n",
324 | "!ls -alh {fine_tune_dir}"
325 | ]
326 | },
327 | {
328 | "cell_type": "markdown",
329 | "metadata": {
330 | "colab_type": "text",
331 | "id": "7WgOmv1bTJBB"
332 | },
333 | "source": [
334 | "## Configuring Training Pipeline\n",
335 | "\n",
336 | "Editing the configuration file to add the paths for the TFRecord files and the label map (pbtxt), plus the batch_size, num_steps and num_classes.\n",
337 | "\n",
338 | "Any image augmentation or hyperparameter tuning (dropout, batch normalization, etc.) can also be edited here in the configuration file"
339 | ]
340 | },
341 | {
342 | "cell_type": "code",
343 | "execution_count": 0,
344 | "metadata": {
345 | "colab": {},
346 | "colab_type": "code",
347 | "id": "vIv1-_XWVaB1"
348 | },
349 | "outputs": [],
350 | "source": [
351 | "# the path to the folder containing all the sample config files\n",
352 | "CONFIG_BASE = \"/content/models/research/object_detection/samples/configs/\"\n",
353 | "\n",
354 | "# path to the specified model's config file\n",
355 | "model_pipeline = os.path.join(CONFIG_BASE, pipeline_file)\n",
356 | "model_pipeline"
357 | ]
358 | },
359 | {
360 | "cell_type": "code",
361 | "execution_count": 0,
362 | "metadata": {
363 | "colab": {},
364 | "colab_type": "code",
365 | "id": "xW0g3H3pTY1z"
366 | },
367 | "outputs": [],
368 | "source": [
369 | "%%writefile {model_pipeline}\n",
370 | "\n",
371 | "model {\n",
372 | " ssd {\n",
373 | " num_classes: 1 # number of classes to be detected\n",
374 | " box_coder {\n",
375 | " faster_rcnn_box_coder {\n",
376 | " y_scale: 10.0\n",
377 | " x_scale: 10.0\n",
378 | " height_scale: 5.0\n",
379 | " width_scale: 5.0\n",
380 | " }\n",
381 | " }\n",
382 | " matcher {\n",
383 | " argmax_matcher {\n",
384 | " matched_threshold: 0.5\n",
385 | " unmatched_threshold: 0.5\n",
386 | " ignore_thresholds: false\n",
387 | " negatives_lower_than_unmatched: true\n",
388 | " force_match_for_each_row: true\n",
389 | " }\n",
390 | " }\n",
391 | " similarity_calculator {\n",
392 | " iou_similarity {\n",
393 | " }\n",
394 | " }\n",
395 | " anchor_generator {\n",
396 | " ssd_anchor_generator {\n",
397 | " num_layers: 6\n",
398 | " min_scale: 0.2\n",
399 | " max_scale: 0.95\n",
400 | " aspect_ratios: 1.0\n",
401 | " aspect_ratios: 2.0\n",
402 | " aspect_ratios: 0.5\n",
403 | " aspect_ratios: 3.0\n",
404 | " aspect_ratios: 0.3333\n",
405 | " }\n",
406 | " }\n",
407 | " # all images will be resized to the below W x H.\n",
408 | " image_resizer { \n",
409 | " fixed_shape_resizer {\n",
410 | " height: 300\n",
411 | " width: 300\n",
412 | " }\n",
413 | " }\n",
414 | " box_predictor {\n",
415 | " convolutional_box_predictor {\n",
416 | " min_depth: 0\n",
417 | " max_depth: 0\n",
418 | " num_layers_before_predictor: 0\n",
419 | " use_dropout: true # to counter overfitting; 
you can also try tweaking its probability below\n",
420 | " dropout_keep_probability: 0.8\n",
421 | " kernel_size: 1\n",
422 | " box_code_size: 4\n",
423 | " apply_sigmoid_to_scores: false\n",
424 | " conv_hyperparams {\n",
425 | " activation: RELU_6,\n",
426 | " regularizer {\n",
427 | " l2_regularizer {\n",
428 | " # weight: 0.00004\n",
429 | " weight: 0.001 # higher regularization to counter overfitting\n",
430 | " }\n",
431 | " }\n",
432 | " initializer {\n",
433 | " truncated_normal_initializer {\n",
434 | " stddev: 0.03\n",
435 | " mean: 0.0\n",
436 | " }\n",
437 | " }\n",
438 | " batch_norm {\n",
439 | " train: true,\n",
440 | " scale: true,\n",
441 | " center: true,\n",
442 | " decay: 0.9997,\n",
443 | " epsilon: 0.001,\n",
444 | " }\n",
445 | " }\n",
446 | " }\n",
447 | " }\n",
448 | " feature_extractor {\n",
449 | " type: 'ssd_mobilenet_v2'\n",
450 | " min_depth: 16\n",
451 | " depth_multiplier: 1.0\n",
452 | " conv_hyperparams {\n",
453 | " activation: RELU_6,\n",
454 | " regularizer {\n",
455 | " l2_regularizer {\n",
456 | " # weight: 0.00004\n",
457 | " weight: 0.001 # higher regularization to counter overfitting\n",
458 | " }\n",
459 | " }\n",
460 | " initializer {\n",
461 | " truncated_normal_initializer {\n",
462 | " stddev: 0.03\n",
463 | " mean: 0.0\n",
464 | " }\n",
465 | " }\n",
466 | " batch_norm {\n",
467 | " train: true,\n",
468 | " scale: true,\n",
469 | " center: true,\n",
470 | " decay: 0.9997,\n",
471 | " epsilon: 0.001,\n",
472 | " }\n",
473 | " }\n",
474 | " }\n",
475 | " loss {\n",
476 | " classification_loss {\n",
477 | " weighted_sigmoid {\n",
478 | " }\n",
479 | " }\n",
480 | " localization_loss {\n",
481 | " weighted_smooth_l1 {\n",
482 | " }\n",
483 | " }\n",
484 | " hard_example_miner {\n",
485 | " num_hard_examples: 3000 \n",
486 | " iou_threshold: 0.95\n",
487 | " loss_type: CLASSIFICATION\n",
488 | " max_negatives_per_positive: 3\n",
489 | " min_negatives_per_image: 3\n",
490 | " }\n",
491 | " classification_weight: 1.0\n",
492 | " localization_weight: 1.0\n",
493 | " }\n",
494 | " normalize_loss_by_num_matches: true\n",
495 | " post_processing {\n",
496 | " batch_non_max_suppression {\n",
497 | " score_threshold: 1e-8\n",
498 | " iou_threshold: 0.6\n",
499 | " \n",
500 | " # adjust this to the max number of objects per class.\n",
501 | " max_detections_per_class: 4\n",
502 | " # max number of detections among all classes\n",
503 | " max_total_detections: 4\n",
504 | " }\n",
505 | " score_converter: SIGMOID\n",
506 | " }\n",
507 | " }\n",
508 | "}\n",
509 | "\n",
510 | "train_config: {\n",
511 | " batch_size: 16 # training batch size\n",
512 | " optimizer {\n",
513 | " rms_prop_optimizer: {\n",
514 | " learning_rate: {\n",
515 | " exponential_decay_learning_rate {\n",
516 | " initial_learning_rate: 0.003\n",
517 | " decay_steps: 800720\n",
518 | " decay_factor: 0.95\n",
519 | " }\n",
520 | " }\n",
521 | " momentum_optimizer_value: 0.9\n",
522 | " decay: 0.9\n",
523 | " epsilon: 1.0\n",
524 | " }\n",
525 | " }\n",
526 | "\n",
527 | " # the path to the pretrained model. \n",
528 | " fine_tune_checkpoint: \"/content/models/research/pretrained_model/model.ckpt\"\n",
529 | " fine_tune_checkpoint_type: \"detection\"\n",
530 | " # edit num_steps to the amount of steps you need. Setting num_steps\n",
531 | " # will stop the learning rate from decaying. 
You can also remove the line to \n",
532 | " # train indefinitely\n",
533 | " # num_steps: 10000\n",
534 | " \n",
535 | "\n",
536 | " # data augmentation is done here, you can remove or add more.\n",
537 | " # They will help the model generalize but the training time will increase greatly by using more data augmentation.\n",
538 | " # Check this link to add more image augmentation: https://github.com/tensorflow/models/blob/master/research/object_detection/protos/preprocessor.proto\n",
539 | " \n",
540 | " data_augmentation_options {\n",
541 | " random_horizontal_flip {\n",
542 | " }\n",
543 | " }\n",
544 | " data_augmentation_options {\n",
545 | " random_adjust_contrast {\n",
546 | " }\n",
547 | " }\n",
548 | " data_augmentation_options {\n",
549 | " ssd_random_crop {\n",
550 | " }\n",
551 | " }\n",
552 | "}\n",
553 | "\n",
554 | "train_input_reader: {\n",
555 | " tf_record_input_reader {\n",
556 | " # path to the training TFRecord\n",
557 | " input_path: \"/content/data/train.record\"\n",
558 | " }\n",
559 | " # path to the label map \n",
560 | " label_map_path: \"/content/data/label_map.pbtxt\"\n",
561 | "}\n",
562 | "\n",
563 | "eval_config: {\n",
564 | " # the number of images in your \"testing\" data\n",
565 | " num_examples: 30\n",
566 | " # the number of images to display in Tensorboard while training\n",
567 | " num_visualizations: 5\n",
568 | "\n",
569 | " # Note: The below line limits the evaluation process to 10 evaluations.\n",
570 | " # Remove the below line to evaluate indefinitely.\n",
571 | " # max_evals: 10\n",
572 | "}\n",
573 | "\n",
574 | "eval_input_reader: {\n",
575 | " tf_record_input_reader {\n",
576 | " \n",
577 | " # path to the testing TFRecord\n",
578 | " input_path: \"/content/data/test.record\"\n",
579 | " }\n",
580 | " # path to the label map \n",
581 | " label_map_path: \"/content/data/label_map.pbtxt\"\n",
582 | " shuffle: false\n",
583 | " num_readers: 1\n",
584 | "}"
585 | ]
586 | },
587 | {
588 | "cell_type": "code",
589 | "execution_count": 0,
590 | "metadata": {
591 | "colab": {},
592 | "colab_type": "code",
593 | "id": "gqbb6FqLVLLq"
594 | },
595 | "outputs": [],
596 | "source": [
597 | "# where the model will be saved at each checkpoint while training \n",
598 | "model_dir = 'training/'\n",
599 | "\n",
600 | "# Optionally: remove content in output model directory for a fresh start.\n",
601 | "!rm -rf {model_dir}\n",
602 | "os.makedirs(model_dir, exist_ok=True)"
603 | ]
604 | },
605 | {
606 | "cell_type": "markdown",
607 | "metadata": {
608 | "colab_type": "text",
609 | "id": "IkcVnqEsVj0I"
610 | },
611 | "source": [
612 | "## Tensorboard"
613 | ]
614 | },
615 | {
616 | "cell_type": "code",
617 | "execution_count": 0,
618 | "metadata": {
619 | "colab": {},
620 | "colab_type": "code",
621 | "id": "bdPufKfQVPnr"
622 | },
623 | "outputs": [],
624 | "source": [
625 | "# downloading ngrok to be able to access tensorboard on google colab\n",
626 | "!wget https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip\n",
627 | "!unzip -o ngrok-stable-linux-amd64.zip"
628 | ]
629 | },
630 | {
631 | "cell_type": "code",
632 | "execution_count": 0,
633 | "metadata": {
634 | "colab": {},
635 | "colab_type": "code",
636 | "id": "I2E--FZmVrYb"
637 | },
638 | "outputs": [],
639 | "source": [
640 | "# the logs that are created while training \n",
641 | "LOG_DIR = model_dir\n",
642 | "get_ipython().system_raw(\n",
643 | " 'tensorboard --logdir {} --host 0.0.0.0 --port 6006 &'\n",
644 | " .format(LOG_DIR)\n",
645 | ")\n",
646 | 
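"# open an ngrok tunnel to port 6006 so the TensorBoard instance launched above is reachable from outside the Colab VM\n",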
"get_ipython().system_raw('./ngrok http 6006 &')" 647 | ] 648 | }, 649 | { 650 | "cell_type": "code", 651 | "execution_count": 0, 652 | "metadata": { 653 | "colab": {}, 654 | "colab_type": "code", 655 | "id": "NincnSEEVt7-" 656 | }, 657 | "outputs": [], 658 | "source": [ 659 | "# The link to tensorboard.\n", 660 | "# works after the training starts.\n", 661 | "\n", 662 | "### note: if you didnt get a link as output, rerun this cell and the one above\n", 663 | "!curl -s http://localhost:4040/api/tunnels | python3 -c \\\n", 664 | " \"import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])\" " 665 | ] 666 | }, 667 | { 668 | "cell_type": "markdown", 669 | "metadata": { 670 | "colab_type": "text", 671 | "id": "P-vkHn73V1UD" 672 | }, 673 | "source": [ 674 | "## Training" 675 | ] 676 | }, 677 | { 678 | "cell_type": "code", 679 | "execution_count": 0, 680 | "metadata": { 681 | "colab": {}, 682 | "colab_type": "code", 683 | "id": "sbIypMf3VxUv" 684 | }, 685 | "outputs": [], 686 | "source": [ 687 | "!python3 /content/models/research/object_detection/model_main.py \\\n", 688 | " --pipeline_config_path={model_pipline}\\\n", 689 | " --model_dir={model_dir} \\\n", 690 | " --alsologtostderr \\" 691 | ] 692 | }, 693 | { 694 | "cell_type": "markdown", 695 | "metadata": { 696 | "colab_type": "text", 697 | "id": "Jj82A3xeV-PA" 698 | }, 699 | "source": [ 700 | "## Exporting the Trained Model as a Graph" 701 | ] 702 | }, 703 | { 704 | "cell_type": "code", 705 | "execution_count": 0, 706 | "metadata": { 707 | "colab": {}, 708 | "colab_type": "code", 709 | "id": "BrvjRg3WWAWk" 710 | }, 711 | "outputs": [], 712 | "source": [ 713 | "# the location where the exported model will be saved in.\n", 714 | "output_directory = '/content/models/research/fine_tuned_model'\n", 715 | "\n", 716 | "# goes through the model is the training/ dir and gets the last one.\n", 717 | "# you could choose a specfic one instead of the last\n", 718 | "lst = os.listdir(model_dir)\n", 719 | "lst = [l for l in lst if 'model.ckpt-' in l and '.meta' in l]\n", 720 | "steps=np.array([int(re.findall('\\d+', l)[0]) for l in lst])\n", 721 | "last_model = lst[steps.argmax()].replace('.meta', '')\n", 722 | "last_model_path = os.path.join(model_dir, last_model)\n", 723 | "print(last_model_path)\n", 724 | "\n", 725 | "#exports the model specifed and inference graph\n", 726 | "!python /content/models/research/object_detection/export_inference_graph.py \\\n", 727 | " --input_type=image_tensor \\\n", 728 | " --pipeline_config_path={model_pipline} \\\n", 729 | " --output_directory={output_directory} \\\n", 730 | " --trained_checkpoint_prefix={last_model_path}" 731 | ] 732 | }, 733 | { 734 | "cell_type": "code", 735 | "execution_count": 0, 736 | "metadata": { 737 | "colab": {}, 738 | "colab_type": "code", 739 | "id": "ZStl1-iAWGJF" 740 | }, 741 | "outputs": [], 742 | "source": [ 743 | "# download the frozen model that is needed for inference\n", 744 | "files.download(output_directory + '/frozen_inference_graph.pb')" 745 | ] 746 | }, 747 | { 748 | "cell_type": "markdown", 749 | "metadata": { 750 | "colab_type": "text", 751 | "id": "rxqBLqckWNBu" 752 | }, 753 | "source": [ 754 | "## Performing Inference\n", 755 | "To perform inference locally on your computer, follow these steps\n", 756 | "\n", 757 | "1. Ideally create a separate environment with Tensorflow version 1.15 installed and then `pip install object_detection`\n", 758 | "2. Open the [`inference_webcam.py`]()\n", 759 | "3. Edit the path to the frozen graph in the file\n", 760 | "4. 
If you remember our label_map.pbtxt was of the format\n", 761 | "```\n", 762 | "item {\n", 763 | " id: 1\n", 764 | " name: 'LABEL_1'\n", 765 | "}\n", 766 | "item {\n", 767 | " id: 2\n", 768 | " name: 'LABEL_2'\n", 769 | "}\n", 770 | "```\n", 771 | "Now edit the `category_index` dictionary in the inference file in the following format\n", 772 | "```\n", 773 | "category_index = {1: {'id': 1, 'name': 'LABEL_1'}, 2: {'id': 2, 'name': 'LABEL_2'}}\n", 774 | "```\n", 775 | "5. Finally run `inference_webcam.py` in the terminal and your custom object detection model is ready." 776 | ] 777 | } 778 | ], 779 | "metadata": { 780 | "accelerator": "GPU", 781 | "colab": { 782 | "collapsed_sections": [], 783 | "name": "object_detection_training.ipynb", 784 | "private_outputs": true, 785 | "provenance": [] 786 | }, 787 | "kernelspec": { 788 | "display_name": "Python 3", 789 | "language": "python", 790 | "name": "python3" 791 | }, 792 | "language_info": { 793 | "codemirror_mode": { 794 | "name": "ipython", 795 | "version": 3 796 | }, 797 | "file_extension": ".py", 798 | "mimetype": "text/x-python", 799 | "name": "python", 800 | "nbconvert_exporter": "python", 801 | "pygments_lexer": "ipython3", 802 | "version": "3.7.4" 803 | } 804 | }, 805 | "nbformat": 4, 806 | "nbformat_minor": 1 807 | } 808 | --------------------------------------------------------------------------------