├── README.md
├── Train_custom_chess_piece_detector_with_TFLite_Model_Maker.ipynb
├── Train_custom_object_detector_with_TFLite_Model_Maker.ipynb
├── doc
├── android_application.PNG
├── label_image.png
└── prediction_example.png
└── model.tflite
/README.md:
--------------------------------------------------------------------------------
1 | # TFLite Object Detection with TFLite Model Maker
2 |
3 | ![Prediction example](doc/prediction_example.png)
4 |
5 | The TensorFlow Lite Model Maker library is a high-level library that simplifies the process of training a TensorFlow Lite model using a custom dataset. It uses transfer learning to reduce the amount of training data required and shorten the training time. This guide walks you through creating a custom object detector and deploying it on Android. The guide is heavily based on the [Object Detection with TensorFlow Lite Model Maker page](https://www.tensorflow.org/lite/tutorials/model_maker_object_detection) from the Tensorflow Lite documentation.
6 |
7 | ## Prerequisites
8 |
9 | ### Install required packages
10 |
11 | ```
12 | !sudo apt -y install libportaudio2
13 | !pip install -q --use-deprecated=legacy-resolver tflite-model-maker
14 | !pip install -q pycocotools
15 | !pip install -q opencv-python-headless==4.1.2.30
16 | !pip uninstall -y tensorflow && pip install -q tensorflow==2.8.0
17 | ```
18 |
19 | Import the required packages.
20 |
21 | ```python
22 | import numpy as np
23 | import os
24 |
25 | from tflite_model_maker.config import ExportFormat
26 | from tflite_model_maker import model_spec
27 | from tflite_model_maker import object_detector
28 |
29 | import tensorflow as tf
30 | assert tf.__version__.startswith('2')
31 |
32 | tf.get_logger().setLevel('ERROR')
33 | from absl import logging
34 | logging.set_verbosity(logging.ERROR)
35 | ```
36 |
37 | ### Gathering and labeling data
38 |
39 | Before you can start creating your own custom object detector, you'll have to prepare a dataset. The Tensorflow Lite Model Maker supports two data formats - [CSV](https://cloud.google.com/vision/automl/object-detection/docs/csv-format) and [PASCAL VOC](https://towardsdatascience.com/coco-data-format-for-object-detection-a4c5eaf518c5#:~:text=Pascal%20VOC%20is%20an%20XML,for%20training%2C%20testing%20and%20validation). Data in CSV format can be loaded with [`object_detector.DataLoader.from_csv`](https://www.tensorflow.org/lite/api_docs/python/tflite_model_maker/object_detector/DataLoader#from_csv) and data in PASCAL VOC format can be loaded using the [`object_detector.DataLoader.from_pascal_voc`](https://www.tensorflow.org/lite/api_docs/python/tflite_model_maker/object_detector/DataLoader#from_pascal_voc) method.
40 |
41 | To create a dataset in PASCAL VOC format, I recommend using [labelImg](https://github.com/tzutalin/labelImg), an open-source graphical image annotation tool.
42 |
43 | ![labelImg example](doc/label_image.png)
44 |
45 | If you don't want to create your own dataset, you can find plenty of datasets on sites like [Kaggle](https://www.kaggle.com/datasets?tags=13207-Computer+Vision) or [Roboflow](https://public.roboflow.com/).
46 |
47 | ## Train custom object detection model
48 |
49 | ### Step 1. Choose an object detection model architecture.
50 |
51 | Tensorflow Lite Model Maker currently supports 5 different object detection models (EfficientDet-Lite[0-4]). All of them are derived from the [EfficientDet](https://arxiv.org/abs/1911.09070) architecture. The main differences between the models are their size and latency.
52 |
53 | | Model architecture | Size(MB)* | Latency(ms)** | Average Precision*** |
54 | |--------------------|-----------|---------------|----------------------|
55 | | EfficientDet-Lite0 | 4.4 | 37 | 25.69% |
56 | | EfficientDet-Lite1 | 5.8 | 49 | 30.55% |
57 | | EfficientDet-Lite2 | 7.2 | 69 | 33.97% |
58 | | EfficientDet-Lite3 | 11.4 | 116 | 37.70% |
59 | | EfficientDet-Lite4 | 19.9 | 260 | 41.96% |
60 |
61 | \* Size of the integer quantized models. <br/>
62 | \*\* Latency measured on Pixel 4 using 4 threads on CPU. <br/>
63 | \*\*\* Average Precision is the mAP (mean Average Precision) on the COCO 2017 validation dataset.
64 |
65 |
66 | ```python
67 | spec = model_spec.get('efficientdet_lite0')
68 | ```
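The rest of this guide sticks with EfficientDet-Lite0, but switching to one of the larger variants from the table is just a matter of requesting a different spec name (`efficientdet_lite1` through `efficientdet_lite4` follow the same pattern):

```python
# Example only: trade latency for accuracy by picking a larger variant
# (see the size/latency/mAP table above). The rest of the workflow is unchanged.
spec = model_spec.get('efficientdet_lite2')
```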
69 |
70 | ### Step 2. Load the dataset.
71 |
72 | If your dataset is in CSV format, use the [`object_detector.DataLoader.from_csv`](https://www.tensorflow.org/lite/api_docs/python/tflite_model_maker/object_detector/DataLoader#from_csv) method to load the data and to split it into training, validation, and test sets.
73 |
74 | ```python
75 | train_data, validation_data, test_data = object_detector.DataLoader.from_csv('path/to/annotations.csv')  # replace with the path to your CSV file
76 | ```
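The CSV file follows the AutoML object-detection layout: one row per bounding box containing the split name (e.g. `TRAINING`, `VALIDATION`, `TEST`; see the linked CSV format docs for the exact set of values), the image path, the label, and the top-left and bottom-right box corners normalized to [0, 1] (the empty columns are placeholders for the remaining box vertices). The paths and labels below are purely illustrative:

```
TRAINING,gs://my-bucket/images/IMG_0001.jpg,Arduino_Nano,0.12,0.30,,,0.45,0.72,,
VALIDATION,gs://my-bucket/images/IMG_0002.jpg,Raspberry_Pi_3,0.05,0.10,,,0.60,0.80,,
TEST,gs://my-bucket/images/IMG_0003.jpg,ESP8266,0.20,0.15,,,0.70,0.65,,
```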
77 |
78 | If you labeled your data in Pascal VOC format, use the [`object_detector.DataLoader.from_pascal_voc`](https://www.tensorflow.org/lite/api_docs/python/tflite_model_maker/object_detector/DataLoader#from_pascal_voc) method to load the data. You need to pass the method the `image_dir`, `annotations_dir`, and `label_map`. For more information, check out [the documentation](https://www.tensorflow.org/lite/api_docs/python/tflite_model_maker/object_detector/DataLoader#from_pascal_voc).
79 |
80 | ```python
81 | dataloader = object_detector.DataLoader.from_pascal_voc(image_dir, annotations_dir, label_map={1: "person", 2: "notperson"})
82 | ```
83 |
84 | The `from_pascal_voc` method doesn't automatically split the data into training, validation, and test sets. For this, `tflite_model_maker.object_detector.DataLoader` provides a `split` method that divides a dataset into two sub-datasets with a given fraction. However, this method didn't work for me, and I have already [reported the error](https://discuss.tensorflow.org/t/attributeerror-nonetype-object-has-no-attribute-take/2039). Therefore, until the error is resolved, I recommend splitting the data by hand, as shown below and in the [Chess Piece detection](Train_custom_chess_piece_detector_with_TFLite_Model_Maker.ipynb) example.
85 |
86 | ```python
87 | # split data into training and testing set
88 | import os, random, shutil
89 |
90 | os.mkdir('chess-detection/train')
91 | os.mkdir('chess-detection/test')
92 |
93 | image_paths = os.listdir('chess-detection/images')
94 | random.shuffle(image_paths)
95 |
96 | for i, image_path in enumerate(image_paths):
97 |     if i < int(len(image_paths) * 0.8):
98 |         shutil.copy(f'chess-detection/images/{image_path}', 'chess-detection/train')
99 |         shutil.copy(f'chess-detection/annotations/{image_path.replace("JPG", "xml")}', 'chess-detection/train')
100 |     else:
101 |         shutil.copy(f'chess-detection/images/{image_path}', 'chess-detection/test')
102 |         shutil.copy(f'chess-detection/annotations/{image_path.replace("JPG", "xml")}', 'chess-detection/test')
103 | ```
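With the images and their XML annotations copied side by side, each split can then be loaded with `from_pascal_voc`. The label names below are only an assumption for a chess dataset; use whatever classes actually appear in your annotations:

```python
# Load the hand-made splits (images and XML files live in the same folders here).
# Hypothetical label map for a chess dataset - adjust to your own classes.
label_map = {1: 'king', 2: 'queen', 3: 'rook', 4: 'bishop', 5: 'knight', 6: 'pawn'}

train_data = object_detector.DataLoader.from_pascal_voc(
    'chess-detection/train', 'chess-detection/train', label_map=label_map)
test_data = object_detector.DataLoader.from_pascal_voc(
    'chess-detection/test', 'chess-detection/test', label_map=label_map)
```

If you didn't create a separate validation folder, the test split can also be passed as `validation_data` in the training step below.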
104 |
105 | ### Step 3. Train the TensorFlow model with the training data.
106 |
107 | After loading the data, the TensorFlow model can be trained using the `object_detector.create` method. The `create` method is the driver function that the Model Maker library uses to create models. It:
108 | 1. Creates the object detection model according to `model_spec`.
109 | 2. Trains the model. By default, the hyperparameters defined in the `model_spec` are used, but they can be overridden by passing them as arguments to `create`, as done with `epochs` and `batch_size` below.
110 |
111 | ```python
112 | model = object_detector.create(train_data, model_spec=spec, epochs=50, batch_size=8, train_whole_model=True, validation_data=validation_data)
113 | ```
114 |
115 | Example output:
116 |
117 | ```
118 | Epoch 1/50
119 | 21/21 [==============================] - 110s 2s/step - det_loss: 1.7648 - cls_loss: 1.1449 - box_loss: 0.0124 - reg_l2_loss: 0.0764 - loss: 1.8412 - learning_rate: 0.0090 - gradient_norm: 0.7164 - val_det_loss: 1.6857 - val_cls_loss: 1.1173 - val_box_loss: 0.0114 - val_reg_l2_loss: 0.0764 - val_loss: 1.7621
120 | Epoch 2/50
121 | 21/21 [==============================] - 29s 1s/step - det_loss: 1.6056 - cls_loss: 1.0826 - box_loss: 0.0105 - reg_l2_loss: 0.0764 - loss: 1.6820 - learning_rate: 0.0100 - gradient_norm: 0.9471 - val_det_loss: 1.5332 - val_cls_loss: 1.0065 - val_box_loss: 0.0105 - val_reg_l2_loss: 0.0764 - val_loss: 1.6095
122 | Epoch 3/50
123 | 21/21 [==============================] - 33s 2s/step - det_loss: 1.3830 - cls_loss: 0.9211 - box_loss: 0.0092 - reg_l2_loss: 0.0764 - loss: 1.4594 - learning_rate: 0.0099 - gradient_norm: 1.9618 - val_det_loss: 1.3218 - val_cls_loss: 0.8212 - val_box_loss: 0.0100 - val_reg_l2_loss: 0.0764 - val_loss: 1.3982
124 | Epoch 4/50
125 | 21/21 [==============================] - 31s 2s/step - det_loss: 1.1782 - cls_loss: 0.7901 - box_loss: 0.0078 - reg_l2_loss: 0.0764 - loss: 1.2546 - learning_rate: 0.0099 - gradient_norm: 2.1614 - val_det_loss: 1.1834 - val_cls_loss: 0.7156 - val_box_loss: 0.0094 - val_reg_l2_loss: 0.0764 - val_loss: 1.2598
126 | Epoch 5/50
127 | 21/21 [==============================] - 33s 2s/step - det_loss: 1.0756 - cls_loss: 0.7167 - box_loss: 0.0072 - reg_l2_loss: 0.0764 - loss: 1.1520 - learning_rate: 0.0098 - gradient_norm: 2.1485 - val_det_loss: 1.1105 - val_cls_loss: 0.6764 - val_box_loss: 0.0087 - val_reg_l2_loss: 0.0764 - val_loss: 1.1869
128 | Epoch 6/50
129 | 21/21 [==============================] - 30s 1s/step - det_loss: 1.0091 - cls_loss: 0.6841 - box_loss: 0.0065 - reg_l2_loss: 0.0764 - loss: 1.0856 - learning_rate: 0.0097 - gradient_norm: 2.1970 - val_det_loss: 1.0964 - val_cls_loss: 0.6617 - val_box_loss: 0.0087 - val_reg_l2_loss: 0.0764 - val_loss: 1.1729
130 | Epoch 7/50
131 | 21/21 [==============================] - 33s 2s/step - det_loss: 0.9230 - cls_loss: 0.6264 - box_loss: 0.0059 - reg_l2_loss: 0.0764 - loss: 0.9995 - learning_rate: 0.0096 - gradient_norm: 2.2962 - val_det_loss: 0.9999 - val_cls_loss: 0.6122 - val_box_loss: 0.0078 - val_reg_l2_loss: 0.0765 - val_loss: 1.0763
132 | Epoch 8/50
133 | 21/21 [==============================] - 31s 1s/step - det_loss: 0.9043 - cls_loss: 0.6087 - box_loss: 0.0059 - reg_l2_loss: 0.0765 - loss: 0.9807 - learning_rate: 0.0094 - gradient_norm: 2.2009 - val_det_loss: 0.9992 - val_cls_loss: 0.6201 - val_box_loss: 0.0076 - val_reg_l2_loss: 0.0765 - val_loss: 1.0756
134 | Epoch 9/50
135 | 21/21 [==============================] - 32s 2s/step - det_loss: 0.8622 - cls_loss: 0.5827 - box_loss: 0.0056 - reg_l2_loss: 0.0765 - loss: 0.9386 - learning_rate: 0.0093 - gradient_norm: 2.3275 - val_det_loss: 0.9385 - val_cls_loss: 0.5811 - val_box_loss: 0.0071 - val_reg_l2_loss: 0.0765 - val_loss: 1.0150
136 | Epoch 10/50
137 | 21/21 [==============================] - 31s 1s/step - det_loss: 0.8461 - cls_loss: 0.5696 - box_loss: 0.0055 - reg_l2_loss: 0.0765 - loss: 0.9226 - learning_rate: 0.0091 - gradient_norm: 2.3217 - val_det_loss: 0.9469 - val_cls_loss: 0.5861 - val_box_loss: 0.0072 - val_reg_l2_loss: 0.0765 - val_loss: 1.0234
138 | ...
139 | ```
140 |
141 | ### Step 4. Evaluate the model with the test data.
142 |
143 | After training the object detection model using the images in the training dataset, the model can be evaluated on the validation or test data.
144 |
145 | ```python
146 | model.evaluate(test_data)
147 | ```
148 |
149 | or
150 |
151 | ```python
152 | model.evaluate(validation_data)
153 | ```
154 |
155 | Output example:
156 |
157 | ```
158 | {'AP': 0.9129471,
159 | 'AP50': 1.0,
160 | 'AP75': 1.0,
161 | 'AP_/Arduino_Nano': 0.80178857,
162 | 'AP_/ESP8266': 1.0,
163 | 'AP_/Heltec_ESP32_Lora': 0.85,
164 | 'AP_/Raspberry_Pi_3': 1.0,
165 | 'APl': 0.9130256,
166 | 'APm': -1.0,
167 | 'APs': -1.0,
168 | 'ARl': 0.9375,
169 | 'ARm': -1.0,
170 | 'ARmax1': 0.9125,
171 | 'ARmax10': 0.925,
172 | 'ARmax100': 0.9375,
173 | 'ARs': -1.0}
174 | ```
175 |
176 | ### Step 5. Export as a TensorFlow Lite model.
177 |
178 | Currently, the Tensorflow Lite Model Maker allows you to export the object detection model in TFLITE and [SAVED_MODEL](https://www.tensorflow.org/guide/saved_model) format. By default, the `export` method exports the model to the Tensorflow Lite format and performs full integer quantization on it (`model.export(export_dir='.')`), but you can also choose to [export the model in another format](https://www.tensorflow.org/lite/tutorials/model_maker_object_detection#export_to_different_formats) or [change the quantization type](https://www.tensorflow.org/lite/tutorials/model_maker_object_detection#customize_post-training_quantization_on_the_tensorflow_lite_model).
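For reference, the calls look roughly like this; the `model_fp16.tflite` file name is just an example, and the `QuantizationConfig` import is only needed if you want something other than the default full-integer quantization:

```python
from tflite_model_maker.config import QuantizationConfig

# Default export: full-integer quantized TFLite model written to ./model.tflite
model.export(export_dir='.')

# Additionally export a SavedModel next to the TFLite file
model.export(export_dir='.', export_format=[ExportFormat.TFLITE, ExportFormat.SAVED_MODEL])

# Customize post-training quantization, e.g. float16 instead of full integer
config = QuantizationConfig.for_float16()
model.export(export_dir='.', tflite_filename='model_fp16.tflite', quantization_config=config)
```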
179 |
180 | ### Step 6. Evaluate the TensorFlow Lite model.
181 |
182 | After converting the model to TensorFlow Lite, it's useful to re-evaluate it, since several factors during export can affect the accuracy, including:
183 | * Quantization
184 | * TFLite using global [non-max suppression (NMS)](https://www.coursera.org/lecture/convolutional-neural-networks/non-max-suppression-dvrjH) instead of the per-class NMS used by the original model.
185 |
186 | ```python
187 | model.evaluate_tflite('model.tflite', test_data)
188 | ```
189 |
190 | Output example:
191 | ```
192 | {'AP': 0.8691832,
193 | 'AP50': 1.0,
194 | 'AP75': 1.0,
195 | 'AP_/Arduino_Nano': 0.8009901,
196 | 'AP_/ESP8266': 0.92524755,
197 | 'AP_/Heltec_ESP32_Lora': 0.85049504,
198 | 'AP_/Raspberry_Pi_3': 0.9,
199 | 'APl': 0.8691832,
200 | 'APm': -1.0,
201 | 'APs': -1.0,
202 | 'ARl': 0.8875,
203 | 'ARm': -1.0,
204 | 'ARmax1': 0.8875,
205 | 'ARmax10': 0.8875,
206 | 'ARmax100': 0.8875,
207 | 'ARs': -1.0}
208 | ```
209 |
210 | ## (Optional) Compile for the Edge TPU
211 |
212 | ### Step 1. Install the EdgeTPU Compiler
213 |
214 | ```
215 | curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
216 |
217 | echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
218 |
219 | sudo apt-get update
220 |
221 | sudo apt-get install edgetpu-compiler
222 | ```
223 |
224 | ### Step 2. Select the number of Edge TPUs and compile
225 |
226 | The Edge TPU has 8 MB of SRAM for caching model parameters ([more info](https://coral.ai/docs/edgetpu/compiler/#parameter-data-caching)). This means that for models larger than 8 MB, inference time increases because the model parameters have to be streamed in from external memory. One way to avoid this is [model pipelining](https://coral.ai/docs/edgetpu/pipeline/): splitting the model into segments that each run on a dedicated Edge TPU. This can significantly improve latency.
227 |
228 | The table below can be used as a reference for the number of Edge TPUs to use; the larger models will not compile for a single TPU because their intermediate tensors can't fit in on-chip memory.
229 |
230 | | Model architecture | Minimum TPUs | Recommended TPUs |
231 | | :---: | :---: | :---: |
232 | | EfficientDet-Lite0 | 1 | 1 |
233 | | EfficientDet-Lite1 | 1 | 1 |
234 | | EfficientDet-Lite2 | 1 | 2 |
235 | | EfficientDet-Lite3 | 2 | 2 |
236 | | EfficientDet-Lite4 | 2 | 3 |
237 |
238 | ```
239 | edgetpu_compiler model.tflite --num_segments=1
240 | ```
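The compiler writes an `_edgetpu.tflite` version of the model (for a single segment, `model_edgetpu.tflite`). On a Coral device it then has to be loaded with the Edge TPU delegate; a minimal sketch using the `tflite_runtime` package, assuming the `libedgetpu` runtime is installed on the device, might look like this:

```python
import tflite_runtime.interpreter as tflite

# Load the Edge TPU-compiled model with the Edge TPU delegate
# ('libedgetpu.so.1' is the delegate library name on Linux).
interpreter = tflite.Interpreter(
    model_path='model_edgetpu.tflite',
    experimental_delegates=[tflite.load_delegate('libedgetpu.so.1')])
interpreter.allocate_tensors()
```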
241 |
242 | ## (Optional) Test the TFLite model on your image
243 |
244 | ### Load the trained TFLite model and define some visualization functions:
245 |
246 |
248 |
249 | ```python
250 | import cv2
251 |
252 | from PIL import Image
253 |
254 | model_path = 'model.tflite'
255 |
256 | # Load the labels into a list
257 | classes = ['???'] * model.model_spec.config.num_classes
258 | label_map = model.model_spec.config.label_map
259 | for label_id, label_name in label_map.as_dict().items():
260 |     classes[label_id-1] = label_name
261 |
262 | # Define a list of colors for visualization
263 | COLORS = np.random.randint(0, 255, size=(len(classes), 3), dtype=np.uint8)
264 |
265 | def preprocess_image(image_path, input_size):
266 |     """Preprocess the input image to feed to the TFLite model"""
267 |     img = tf.io.read_file(image_path)
268 |     img = tf.io.decode_image(img, channels=3)
269 |     img = tf.image.convert_image_dtype(img, tf.uint8)
270 |     original_image = img
271 |     resized_img = tf.image.resize(img, input_size)
272 |     resized_img = resized_img[tf.newaxis, :]
273 |     return resized_img, original_image
274 |
275 |
276 | def set_input_tensor(interpreter, image):
277 |     """Set the input tensor."""
278 |     tensor_index = interpreter.get_input_details()[0]['index']
279 |     input_tensor = interpreter.tensor(tensor_index)()[0]
280 |     input_tensor[:, :] = image
281 |
282 |
283 | def get_output_tensor(interpreter, index):
284 |     """Return the output tensor at the given index."""
285 |     output_details = interpreter.get_output_details()[index]
286 |     tensor = np.squeeze(interpreter.get_tensor(output_details['index']))
287 |     return tensor
288 |
289 |
290 | def detect_objects(interpreter, image, threshold):
291 |     """Returns a list of detection results, each a dictionary of object info."""
292 |     # Feed the input image to the model
293 |     set_input_tensor(interpreter, image)
294 |     interpreter.invoke()
295 |
296 |     # Get all outputs from the model
297 |     scores = get_output_tensor(interpreter, 0)
298 |     boxes = get_output_tensor(interpreter, 1)
299 |     count = int(get_output_tensor(interpreter, 2))
300 |     classes = get_output_tensor(interpreter, 3)
301 |
302 |     results = []
303 |     for i in range(count):
304 |         if scores[i] >= threshold:
305 |             result = {
306 |                 'bounding_box': boxes[i],
307 |                 'class_id': classes[i],
308 |                 'score': scores[i]
309 |             }
310 |             results.append(result)
311 |     return results
312 |
313 |
314 | def run_odt_and_draw_results(image_path, interpreter, threshold=0.5):
315 |     """Run object detection on the input image and draw the detection results"""
316 |     # Load the input shape required by the model
317 |     _, input_height, input_width, _ = interpreter.get_input_details()[0]['shape']
318 |
319 |     # Load the input image and preprocess it
320 |     preprocessed_image, original_image = preprocess_image(
321 |         image_path,
322 |         (input_height, input_width)
323 |     )
324 |
325 |     # Run object detection on the input image
326 |     results = detect_objects(interpreter, preprocessed_image, threshold=threshold)
327 |
328 |     # Plot the detection results on the input image
329 |     original_image_np = original_image.numpy().astype(np.uint8)
330 |     for obj in results:
331 |         # Convert the object bounding box from relative coordinates to absolute
332 |         # coordinates based on the original image resolution
333 |         ymin, xmin, ymax, xmax = obj['bounding_box']
334 |         xmin = int(xmin * original_image_np.shape[1])
335 |         xmax = int(xmax * original_image_np.shape[1])
336 |         ymin = int(ymin * original_image_np.shape[0])
337 |         ymax = int(ymax * original_image_np.shape[0])
338 |
339 |         # Find the class index of the current object
340 |         class_id = int(obj['class_id'])
341 |
342 |         # Draw the bounding box and label on the image
343 |         color = [int(c) for c in COLORS[class_id]]
344 |         cv2.rectangle(original_image_np, (xmin, ymin), (xmax, ymax), color, 2)
345 |         # Make adjustments to make the label visible for all objects
346 |         y = ymin - 15 if ymin - 15 > 15 else ymin + 15
347 |         label = "{}: {:.0f}%".format(classes[class_id], obj['score'] * 100)
348 |         cv2.putText(original_image_np, label, (xmin, y),
349 |                     cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)
350 |
351 |     # Return the final image
352 |     original_uint8 = original_image_np.astype(np.uint8)
353 |     return original_uint8
354 | ```
355 |
356 |
357 |
358 | ### Run object detection and show the detection results
359 |
360 | ```python
361 | INPUT_IMAGE_URL = "/content/microcontroller-detection/test/IMG_20181228_102641.jpg"
362 | DETECTION_THRESHOLD = 0.5
363 |
364 | # Load the TFLite model
365 | interpreter = tf.lite.Interpreter(model_path=model_path)
366 | interpreter.allocate_tensors()
367 |
368 | # Run inference and draw detection result on the local copy of the original file
369 | detection_result_image = run_odt_and_draw_results(
370 |     INPUT_IMAGE_URL,
371 |     interpreter,
372 |     threshold=DETECTION_THRESHOLD
373 | )
374 |
375 | # Show the detection result
376 | Image.fromarray(detection_result_image)
377 | ```
378 |
379 | ## Deploy model on Android
380 |
381 | This model can be integrated into an Android or an iOS app using the [ObjectDetector API](https://www.tensorflow.org/lite/inference_with_metadata/task_library/object_detector) of the [TensorFlow Lite Task Library](https://www.tensorflow.org/lite/inference_with_metadata/task_library/overview). The ["Build and deploy a custom object detection model with TensorFlow Lite (Android)" Codelab](https://codelabs.developers.google.com/tflite-object-detection-android) provides an example Android application written in Kotlin. You can download it by cloning the [odml-pathways repository](https://github.com/googlecodelabs/odml-pathways).
382 |
383 | ```
384 | git clone https://github.com/googlecodelabs/odml-pathways
385 | ```
386 |
387 | After cloning the repository, open the `odml-pathways/object-detection/codelab2/android/final/` folder in Android Studio. To integrate your own custom model, copy the `.tflite` file into the `assets` folder, open `MainActivity.kt`, and change the path to the model as described [here](https://codelabs.developers.google.com/tflite-object-detection-android#7).
388 |
389 | ![Android application](doc/android_application.PNG)
--------------------------------------------------------------------------------
/doc/android_application.PNG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TannerGilbert/TFLite-Object-Detection-with-TFLite-Model-Maker/df5f2672018237ca7ad73ca75b2a19e8f0c335a2/doc/android_application.PNG
--------------------------------------------------------------------------------
/doc/label_image.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TannerGilbert/TFLite-Object-Detection-with-TFLite-Model-Maker/df5f2672018237ca7ad73ca75b2a19e8f0c335a2/doc/label_image.png
--------------------------------------------------------------------------------
/doc/prediction_example.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TannerGilbert/TFLite-Object-Detection-with-TFLite-Model-Maker/df5f2672018237ca7ad73ca75b2a19e8f0c335a2/doc/prediction_example.png
--------------------------------------------------------------------------------
/model.tflite:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/TannerGilbert/TFLite-Object-Detection-with-TFLite-Model-Maker/df5f2672018237ca7ad73ca75b2a19e8f0c335a2/model.tflite
--------------------------------------------------------------------------------