├── README.md
├── font
│   ├── FiraMono-Medium.otf
│   └── SIL Open Font License.txt
├── model_data
│   ├── convert_model.py
│   └── yolo_anchors.txt
├── model_keras.py
├── post_process.py
├── tensorrt_util
│   ├── __init__.py
│   ├── __pycache__
│   │   ├── __init__.cpython-35.pyc
│   │   ├── tensorrt_common.cpython-35.pyc
│   │   └── yolo_calibrator.cpython-35.pyc
│   ├── tensorrt_common.py
│   └── yolo_calibrator.py
├── utils.py
├── yolo_keras.py
├── yolo_tensorrt.py
└── yolo_test.py
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Yolov3 on TensorRT 7.0 and TensorFlow 2.0
2 |
3 | This repository contains Yolov3 inference code for TensorRT 7.0 and TensorFlow 2.0.
4 | The model is based on [darknet](https://pjreddie.com/darknet/yolo/), adapted to the
5 | Keras and TensorRT platforms.
6 |
7 | ### Test environments
8 | - Ubuntu 16.04
9 | - TensorFlow 1.14, 1.15 and 2.0
10 | - TensorRT 7.0
11 | - Nvidia driver 410.78
12 | - CUDA 10.0
13 | - cuDNN 7.6.5
14 | - Python 3.5.2
15 |
16 | Optional:
17 |
18 | - keras2onnx 1.6
19 | - onnx 1.6
20 |
21 | ### Models
22 | - Keras model
23 | The Keras model is borrowed from [keras-yolo3](https://github.com/qqwweee/keras-yolo3), which
24 | contains a detailed description of how to generate the .h5 model.
25 | - TensorRT engine
26 | The TensorRT engine is generated from the Keras model: the Keras model is first
27 | converted to ONNX format, and the ONNX model is then used to build the TensorRT engine.
28 |
29 | The Keras model can be converted to ONNX with:
30 |
31 |     python3 model_data/convert_model.py --model_path=your_dir/model.h5 --output_path=output_dir --type=onnx
32 |
33 | Then specify the model path in yolo_tensorrt.py or yolo_test.py; the TensorRT engine will be
34 | generated when yolo_test.py is run.
35 |
36 | For an int8 engine, calibration images must be prepared and their path specified in yolo_tensorrt.py.
37 |
38 | - Download engine
39 | Alternatively, download the engine directly (waiting for upload).
40 |
41 | ### Run test
42 |
43 |     python3 yolo_test.py --model_path=model_data/yolo_int8.engine --live --platform=tensorrt
44 |
45 | For more details, refer to:
46 |
47 |     python3 yolo_test.py --help
48 |
49 | ### Evaluate result
50 | The int8 calibration set is 1000 images selected from val2014.
51 |
52 | Model | mode | dataset | mAP | mAP (0.5) | mAP (0.75)
53 | ---- | ---- | --- | --- | --- | ---
54 | Yolov3-416 | raw | COCO val2014 | 0.315 | 0.561 | 0.319
55 | Yolov3-416 | fp32 | COCO val2014 | 0.315 | 0.561 | 0.319
56 | Yolov3-416 | int8 | COCO val2014 | 0.304 | 0.551 | 0.295
57 |
58 | As shown above, the fp32 model produced exactly the same results as the raw model in both runs.
59 | In my earlier TensorRT 6.0.1 experiments, the fp32 model had slightly lower mAP than the raw model,
60 | but that model was converted through UFF.
--------------------------------------------------------------------------------
/font/FiraMono-Medium.otf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mMikaa00/Yolov3-TensorRT-py/90c34d5eee4dc6ef5c853df64f3ef11cb32ac020/font/FiraMono-Medium.otf
--------------------------------------------------------------------------------
/font/SIL Open Font License.txt:
--------------------------------------------------------------------------------
1 | Copyright (c) 2014, Mozilla Foundation https://mozilla.org/ with Reserved Font Name Fira Mono.
2 |
3 | Copyright (c) 2014, Telefonica S.A.
4 |
5 | This Font Software is licensed under the SIL Open Font License, Version 1.1.
6 | This license is copied below, and is also available with a FAQ at: http://scripts.sil.org/OFL 7 | 8 | ----------------------------------------------------------- 9 | SIL OPEN FONT LICENSE Version 1.1 - 26 February 2007 10 | ----------------------------------------------------------- 11 | 12 | PREAMBLE 13 | The goals of the Open Font License (OFL) are to stimulate worldwide development of collaborative font projects, to support the font creation efforts of academic and linguistic communities, and to provide a free and open framework in which fonts may be shared and improved in partnership with others. 14 | 15 | The OFL allows the licensed fonts to be used, studied, modified and redistributed freely as long as they are not sold by themselves. The fonts, including any derivative works, can be bundled, embedded, redistributed and/or sold with any software provided that any reserved names are not used by derivative works. The fonts and derivatives, however, cannot be released under any other type of license. The requirement for fonts to remain under this license does not apply to any document created using the fonts or their derivatives. 16 | 17 | DEFINITIONS 18 | "Font Software" refers to the set of files released by the Copyright Holder(s) under this license and clearly marked as such. This may include source files, build scripts and documentation. 19 | 20 | "Reserved Font Name" refers to any names specified as such after the copyright statement(s). 21 | 22 | "Original Version" refers to the collection of Font Software components as distributed by the Copyright Holder(s). 23 | 24 | "Modified Version" refers to any derivative made by adding to, deleting, or substituting -- in part or in whole -- any of the components of the Original Version, by changing formats or by porting the Font Software to a new environment. 25 | 26 | "Author" refers to any designer, engineer, programmer, technical writer or other person who contributed to the Font Software. 27 | 28 | PERMISSION & CONDITIONS 29 | Permission is hereby granted, free of charge, to any person obtaining a copy of the Font Software, to use, study, copy, merge, embed, modify, redistribute, and sell modified and unmodified copies of the Font Software, subject to the following conditions: 30 | 31 | 1) Neither the Font Software nor any of its individual components, in Original or Modified Versions, may be sold by itself. 32 | 33 | 2) Original or Modified Versions of the Font Software may be bundled, redistributed and/or sold with any software, provided that each copy contains the above copyright notice and this license. These can be included either as stand-alone text files, human-readable headers or in the appropriate machine-readable metadata fields within text or binary files as long as those fields can be easily viewed by the user. 34 | 35 | 3) No Modified Version of the Font Software may use the Reserved Font Name(s) unless explicit written permission is granted by the corresponding Copyright Holder. This restriction only applies to the primary font name as presented to the users. 36 | 37 | 4) The name(s) of the Copyright Holder(s) or the Author(s) of the Font Software shall not be used to promote, endorse or advertise any Modified Version, except to acknowledge the contribution(s) of the Copyright Holder(s) and the Author(s) or with their explicit written permission. 
38 | 39 | 5) The Font Software, modified or unmodified, in part or in whole, must be distributed entirely under this license, and must not be distributed under any other license. The requirement for fonts to remain under this license does not apply to any document created using the Font Software. 40 | 41 | TERMINATION 42 | This license becomes null and void if any of the above conditions are not met. 43 | 44 | DISCLAIMER 45 | THE FONT SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF COPYRIGHT, PATENT, TRADEMARK, OR OTHER RIGHT. IN NO EVENT SHALL THE COPYRIGHT HOLDER BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, INCLUDING ANY GENERAL, SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF THE USE OR INABILITY TO USE THE FONT SOFTWARE OR FROM OTHER DEALINGS IN THE FONT SOFTWARE. -------------------------------------------------------------------------------- /model_data/convert_model.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import sys 3 | import os 4 | import tensorflow as tf 5 | 6 | 7 | def optimize_h5_model(model_path, output_path): 8 | # tf.enable_eager_execution() 9 | from tensorflow.python.compiler.tensorrt import trt_convert as trt 10 | model_path = os.path.expanduser(model_path) 11 | output_path = os.path.expanduser(output_path) 12 | model = tf.keras.models.load_model(model_path) 13 | name = os.path.basename(model_path) 14 | name = os.path.splitext(name)[0] 15 | temp_path = os.path.join('/tmp', name) 16 | print(temp_path) 17 | model.save(temp_path, save_format='tf') 18 | # tf.compat.v1.saved_model.save(model, temp_path) 19 | 20 | converter = trt.TrtGraphConverterV2(input_saved_model_dir=temp_path) 21 | converter.convert() 22 | converter.save(output_path) 23 | 24 | 25 | def freeze_keras_model(model_path, output_path, keep_var_names=None): 26 | # First freeze the graph and remove training nodes. 
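    # The steps below: run a TF1-compat session in inference mode, record the
    # model's input/output tensor names, fold variables into constants with
    # convert_variables_to_constants, strip training-only nodes, and write the
    # resulting GraphDef to frozen.pb.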
27 |     model_path = os.path.expanduser(model_path)
28 |     output_path = os.path.expanduser(output_path)
29 |     sess = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(device_count={'GPU': 0}))  # CPU only
30 |     tf.compat.v1.keras.backend.set_session(sess)
31 |     tf.compat.v1.keras.backend.set_learning_phase(0)
32 |     model = tf.compat.v1.keras.models.load_model(model_path)
33 |     tf.compat.v1.keras.backend.set_learning_phase(0)
34 |     if isinstance(model.input, list):
35 |         input_names = [input.name for input in model.input]
36 |     elif isinstance(model.input, tf.Tensor):
37 |         input_names = [model.input.op.name]
38 |     else:
39 |         raise Exception('No input')
40 |     output_names = [output.op.name for output in model.output]
41 |     freeze_var_names = list(set(v.op.name for v in tf.compat.v1.global_variables()).difference(keep_var_names or []))
42 |     # print(freeze_var_names)
43 |     print(input_names)
44 |     print(output_names)
45 |
46 |     tf.compat.v1.train.write_graph(sess.graph.as_graph_def(), '.', os.path.join(output_path, 'graph.pbtxt'),
47 |                                    as_text=True)
48 |     frozen_graph = tf.compat.v1.graph_util.convert_variables_to_constants(sess, sess.graph.as_graph_def(), output_names,
49 |                                                                           freeze_var_names)
50 |     frozen_graph = tf.compat.v1.graph_util.remove_training_nodes(frozen_graph)
51 |     # Save the model
52 |     output_path = os.path.join(output_path, 'frozen.pb')
53 |     with open(output_path, "wb") as ofile:
54 |         ofile.write(frozen_graph.SerializeToString())
55 |
56 |
57 | def convert_keras2onnx(model_path, output_path):
58 |     import keras2onnx
59 |     import onnx
60 |     # load keras model
61 |     model = tf.compat.v1.keras.models.load_model(model_path)
62 |
63 |     # convert to onnx model
64 |     print(onnx.defs.onnx_opset_version())
65 |     onnx_model = keras2onnx.convert_keras(model, model.name, target_opset=10)
66 |     onnx_model.graph.input[0].type.tensor_type.shape.dim[0].dim_value = 1
67 |     # # runtime prediction
68 |     output_path = os.path.join(output_path, 'converted.onnx')
69 |     onnx.save_model(onnx_model, output_path)
70 |
71 |
72 | def main(args):
73 |     if args.type == 'onnx':
74 |         convert_keras2onnx(args.model_path, args.output_path)
75 |     elif args.type == 'pb':
76 |         freeze_keras_model(args.model_path, args.output_path)
77 |     elif args.type == 'trt':
78 |         optimize_h5_model(args.model_path, args.output_path)
79 |     else:
80 |         print("Type error")
81 |
82 |
83 | def parse_arguments():
84 |     parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
85 |
86 |     parser.add_argument('--model_path', type=str,
87 |                         help='Path of the model to be converted.', default='')
88 |     parser.add_argument('--output_path', type=str,
89 |                         help='Path where the converted model is stored.', default='./optimized_model')
90 |     parser.add_argument('--type', type=str,
91 |                         help='Convert model to type: onnx, pb or trt', default='pb')
92 |     return parser.parse_args()
93 |
94 |
95 | if __name__ == '__main__':
96 |     main(parse_arguments())
97 |
--------------------------------------------------------------------------------
/model_data/yolo_anchors.txt:
--------------------------------------------------------------------------------
1 | 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
2 |
--------------------------------------------------------------------------------
/model_keras.py:
--------------------------------------------------------------------------------
1 | """YOLO_v3 Model Defined in Keras."""
2 |
3 | from functools import wraps
4 |
5 | import numpy as np
6 | import tensorflow as tf
7 | from tensorflow.keras.layers import Conv2D, Add, ZeroPadding2D, UpSampling2D, Concatenate, MaxPooling2D
8 | from tensorflow.keras.layers import LeakyReLU
9 | from tensorflow.keras.layers import BatchNormalization
10 | from tensorflow.keras.models import Model
11 | from tensorflow.keras.regularizers import l2
12 |
13 | from yolo3.utils import compose  # NOTE: needs the yolo3 package from keras-yolo3, which is not shipped in this repo
14 |
15 |
16 | @wraps(Conv2D)
17 | def DarknetConv2D(*args, **kwargs):
18 |     """Wrapper to set Darknet parameters for Convolution2D."""
19 |     darknet_conv_kwargs = {'kernel_regularizer': l2(5e-4)}
20 |     darknet_conv_kwargs['padding'] = 'valid' if kwargs.get('strides') == (2, 2) else 'same'
21 |     darknet_conv_kwargs.update(kwargs)
22 |     return Conv2D(*args, **darknet_conv_kwargs)
23 |
24 |
25 | def DarknetConv2D_BN_Leaky(*args, **kwargs):
26 |     """Darknet Convolution2D followed by BatchNormalization and LeakyReLU."""
27 |     no_bias_kwargs = {'use_bias': False}
28 |     no_bias_kwargs.update(kwargs)
29 |     return compose(
30 |         DarknetConv2D(*args, **no_bias_kwargs),
31 |         BatchNormalization(),
32 |         LeakyReLU(alpha=0.1))
33 |
34 |
35 | def resblock_body(x, num_filters, num_blocks):
36 |     '''A series of resblocks starting with a downsampling Convolution2D'''
37 |     # Darknet uses left and top padding instead of 'same' mode
38 |     x = ZeroPadding2D(((1, 0), (1, 0)))(x)
39 |     x = DarknetConv2D_BN_Leaky(num_filters, (3, 3), strides=(2, 2))(x)
40 |     for i in range(num_blocks):
41 |         y = compose(
42 |             DarknetConv2D_BN_Leaky(num_filters // 2, (1, 1)),
43 |             DarknetConv2D_BN_Leaky(num_filters, (3, 3)))(x)
44 |         x = Add()([x, y])
45 |     return x
46 |
47 |
48 | def darknet_body(x):
49 |     '''Darknet body having 52 Convolution2D layers'''
50 |     x = DarknetConv2D_BN_Leaky(32, (3, 3))(x)
51 |     x = resblock_body(x, 64, 1)
52 |     x = resblock_body(x, 128, 2)
53 |     x = resblock_body(x, 256, 8)
54 |     x = resblock_body(x, 512, 8)
55 |     x = resblock_body(x, 1024, 4)
56 |     return x
57 |
58 |
59 | def make_last_layers(x, num_filters, out_filters):
60 |     '''6 Conv2D_BN_Leaky layers followed by a Conv2D_linear layer'''
61 |     x = compose(
62 |         DarknetConv2D_BN_Leaky(num_filters, (1, 1)),
63 |         DarknetConv2D_BN_Leaky(num_filters * 2, (3, 3)),
64 |         DarknetConv2D_BN_Leaky(num_filters, (1, 1)),
65 |         DarknetConv2D_BN_Leaky(num_filters * 2, (3, 3)),
66 |         DarknetConv2D_BN_Leaky(num_filters, (1, 1)))(x)
67 |     y = compose(
68 |         DarknetConv2D_BN_Leaky(num_filters * 2, (3, 3)),
69 |         DarknetConv2D(out_filters, (1, 1)))(x)
70 |     return x, y
71 |
72 |
73 | def yolo_body(inputs, num_anchors, num_classes):
74 |     """Create YOLO_V3 model CNN body in Keras."""
75 |     darknet = Model(inputs, darknet_body(inputs))
76 |     x, y1 = make_last_layers(darknet.output, 512, num_anchors * (num_classes + 5))
77 |
78 |     x = compose(
79 |         DarknetConv2D_BN_Leaky(256, (1, 1)),
80 |         UpSampling2D(2))(x)
81 |     x = Concatenate()([x, darknet.layers[152].output])
82 |     x, y2 = make_last_layers(x, 256, num_anchors * (num_classes + 5))
83 |
84 |     x = compose(
85 |         DarknetConv2D_BN_Leaky(128, (1, 1)),
86 |         UpSampling2D(2))(x)
87 |     x = Concatenate()([x, darknet.layers[92].output])
88 |     x, y3 = make_last_layers(x, 128, num_anchors * (num_classes + 5))
89 |
90 |     return Model(inputs, [y1, y2, y3])
91 |
92 |
93 | def tiny_yolo_body(inputs, num_anchors, num_classes):
94 |     '''Create Tiny YOLO_v3 model CNN body in keras.'''
95 |     x1 = compose(
96 |         DarknetConv2D_BN_Leaky(16, (3, 3)),
97 |         MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'),
98 |         DarknetConv2D_BN_Leaky(32, (3, 3)),
99 |         MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'),
100 |         DarknetConv2D_BN_Leaky(64, (3, 3)),
101 |         MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'),
102 |         DarknetConv2D_BN_Leaky(128, (3, 3)),
103 |         MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'),
104 |         DarknetConv2D_BN_Leaky(256, (3, 3)))(inputs)
105 |     x2 = compose(
106 |         MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'),
107 |         DarknetConv2D_BN_Leaky(512, (3, 3)),
108 |         MaxPooling2D(pool_size=(2, 2), strides=(1, 1), padding='same'),
109 |         DarknetConv2D_BN_Leaky(1024, (3, 3)),
110 |         DarknetConv2D_BN_Leaky(256, (1, 1)))(x1)
111 |     y1 = compose(
112 |         DarknetConv2D_BN_Leaky(512, (3, 3)),
113 |         DarknetConv2D(num_anchors * (num_classes + 5), (1, 1)))(x2)
114 |
115 |     x2 = compose(
116 |         DarknetConv2D_BN_Leaky(128, (1, 1)),
117 |         UpSampling2D(2))(x2)
118 |     y2 = compose(
119 |         Concatenate(),
120 |         DarknetConv2D_BN_Leaky(256, (3, 3)),
121 |         DarknetConv2D(num_anchors * (num_classes + 5), (1, 1)))([x2, x1])
122 |
123 |     return Model(inputs, [y1, y2])
124 |
125 |
126 | def yolo_head(feats, anchors, num_classes, input_shape, calc_loss=False):
127 |     """Convert final layer features to bounding box parameters."""
128 |     num_anchors = len(anchors)
129 |     # Reshape to batch, height, width, num_anchors, box_params.
130 |     anchors_tensor = tf.reshape(tf.constant(anchors), [1, 1, 1, num_anchors, 2])
131 |
132 |     grid_shape = tf.shape(feats)[1:3]  # height, width
133 |     grid_y = tf.tile(tf.reshape(tf.range(0, limit=grid_shape[0]), [-1, 1, 1, 1]),
134 |                      [1, grid_shape[1], 1, 1])
135 |     grid_x = tf.tile(tf.reshape(tf.range(0, limit=grid_shape[1]), [1, -1, 1, 1]),
136 |                      [grid_shape[0], 1, 1, 1])
137 |     grid = tf.concat([grid_x, grid_y], -1)
138 |     grid = tf.cast(grid, feats.dtype)
139 |
140 |     feats = tf.reshape(
141 |         feats, [-1, grid_shape[0], grid_shape[1], num_anchors, num_classes + 5])
142 |
143 |     # Adjust predictions to each spatial grid point and anchor size.
144 |     box_xy = (tf.sigmoid(feats[..., :2]) + grid) / tf.cast(grid_shape[::-1], feats.dtype)
145 |     box_wh = tf.exp(feats[..., 2:4]) * tf.cast(anchors_tensor, feats.dtype) / tf.cast(input_shape[::-1], feats.dtype)
146 |     box_confidence = tf.sigmoid(feats[..., 4:5])
147 |     box_class_probs = tf.sigmoid(feats[..., 5:])
148 |
149 |     if calc_loss:
150 |         return grid, feats, box_xy, box_wh
151 |     return box_xy, box_wh, box_confidence, box_class_probs
152 |
153 |
154 | def yolo_correct_boxes(box_xy, box_wh, input_shape, image_shape):
155 |     '''Get corrected boxes'''
156 |     box_yx = box_xy[..., ::-1]
157 |     box_hw = box_wh[..., ::-1]
158 |     input_shape = tf.cast(input_shape, box_yx.dtype)
159 |     image_shape = tf.cast(image_shape, box_yx.dtype)
160 |     new_shape = tf.round(image_shape * tf.reduce_min(input_shape / image_shape))
161 |     offset = (input_shape - new_shape) / 2. / input_shape
162 |     scale = input_shape / new_shape
163 |     box_yx = (box_yx - offset) * scale
164 |     box_hw *= scale
165 |
166 |     box_mins = box_yx - (box_hw / 2.)
167 |     box_maxes = box_yx + (box_hw / 2.)
168 |     boxes = tf.concat([
169 |         box_mins[..., 0:1],  # y_min
170 |         box_mins[..., 1:2],  # x_min
171 |         box_maxes[..., 0:1],  # y_max
172 |         box_maxes[..., 1:2]  # x_max
173 |     ], -1)
174 |     # Scale boxes back to original image shape.
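    # boxes is still normalized to [0, 1]; multiplying by (h, w, h, w) below yields pixel coordinates.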
175 | boxes *= tf.concat([image_shape, image_shape], -1) 176 | return boxes 177 | 178 | 179 | def yolo_boxes_and_scores(feats, anchors, num_classes, input_shape, image_shape): 180 | '''Process Conv layer output''' 181 | box_xy, box_wh, box_confidence, box_class_probs = yolo_head(feats, 182 | anchors, num_classes, input_shape) 183 | boxes = yolo_correct_boxes(box_xy, box_wh, input_shape, image_shape) 184 | boxes = tf.reshape(boxes, [tf.shape(boxes)[0], -1, 4]) 185 | box_scores = box_confidence * box_class_probs 186 | box_scores = tf.reshape(box_scores, [tf.shape(box_scores)[0], -1, num_classes]) 187 | return boxes, box_scores 188 | 189 | 190 | def yolo_eval(yolo_outputs, 191 | anchors, 192 | num_classes, 193 | image_shape, 194 | max_boxes=20, 195 | score_threshold=.6, 196 | iou_threshold=.5): 197 | """Evaluate YOLO model on given input and return filtered boxes.""" 198 | num_layers = len(yolo_outputs) 199 | anchor_mask = [[6, 7, 8], [3, 4, 5], [0, 1, 2]] if num_layers == 3 else [[3, 4, 5], [1, 2, 3]] # default setting 200 | input_shape = tf.shape(yolo_outputs[0])[1:3] * 32 201 | boxes = [] 202 | box_scores = [] 203 | for l in range(num_layers): 204 | _boxes, _box_scores = yolo_boxes_and_scores(yolo_outputs[l], 205 | anchors[anchor_mask[l]], num_classes, input_shape, image_shape) 206 | boxes.append(_boxes) 207 | box_scores.append(_box_scores) 208 | # boxes = tf.concat(boxes, axis=1) 209 | # box_scores = tf.concat(box_scores, axis=1) 210 | boxes = np.concatenate(boxes, axis=1) 211 | box_scores = np.concatenate(box_scores, axis=1) 212 | # print(box_scores.shape) 213 | # print(boxes.shape) 214 | 215 | boxes_ = [] 216 | scores_ = [] 217 | classes_ = [] 218 | for single_boxes, single_box_scores in zip(boxes, box_scores): 219 | mask = single_box_scores >= score_threshold 220 | max_boxes_tensor = tf.constant(max_boxes, dtype='int32') 221 | single_boxes_ = [] 222 | single_scores_ = [] 223 | single_classes_ = [] 224 | for c in range(num_classes): 225 | pass 226 | # TODO: use keras backend instead of tf. 
227 | # class_boxes = tf.boolean_mask(single_boxes, mask[..., c]) 228 | # class_box_scores = tf.boolean_mask(single_box_scores[..., c], mask[..., c]) 229 | class_boxes = single_boxes[mask[..., c]] 230 | class_box_scores = single_box_scores[..., c][mask[..., c]] 231 | nms_index = tf.image.non_max_suppression( 232 | class_boxes, class_box_scores, max_boxes_tensor, iou_threshold=iou_threshold) 233 | class_boxes = tf.gather(class_boxes, nms_index) 234 | class_box_scores = tf.gather(class_box_scores, nms_index) 235 | classes = tf.ones_like(class_box_scores, 'int32') * c 236 | single_boxes_.append(class_boxes) 237 | single_scores_.append(class_box_scores) 238 | single_classes_.append(classes) 239 | single_boxes_ = tf.concat(single_boxes_, axis=0) 240 | single_scores_ = tf.concat(single_scores_, axis=0) 241 | single_classes_ = tf.concat(single_classes_, axis=0) 242 | boxes_.append(single_boxes_) 243 | scores_.append(single_scores_) 244 | classes_.append(single_classes_) 245 | 246 | # boxes_ = tf.reshape(boxes_, [tf.shape(box_scores)[0], -1, 4]) 247 | # scores_ = tf.reshape(scores_, [tf.shape(box_scores)[0], -1]) 248 | # classes_ = tf.reshape(classes_, [tf.shape(box_scores)[0], -1]) 249 | 250 | return boxes_, scores_, classes_ 251 | 252 | 253 | def preprocess_true_boxes(true_boxes, input_shape, anchors, num_classes): 254 | '''Preprocess true boxes to training input format 255 | 256 | Parameters 257 | ---------- 258 | true_boxes: array, shape=(m, T, 5) 259 | Absolute x_min, y_min, x_max, y_max, class_id relative to input_shape. 260 | input_shape: array-like, hw, multiples of 32 261 | anchors: array, shape=(N, 2), wh 262 | num_classes: integer 263 | 264 | Returns 265 | ------- 266 | y_true: list of array, shape like yolo_outputs, xywh are reletive value 267 | 268 | ''' 269 | assert (true_boxes[..., 4] < num_classes).all(), 'class id must be less than num_classes' 270 | num_layers = len(anchors) // 3 # default setting 271 | anchor_mask = [[6, 7, 8], [3, 4, 5], [0, 1, 2]] if num_layers == 3 else [[3, 4, 5], [1, 2, 3]] 272 | 273 | true_boxes = np.array(true_boxes, dtype='float32') 274 | input_shape = np.array(input_shape, dtype='int32') 275 | boxes_xy = (true_boxes[..., 0:2] + true_boxes[..., 2:4]) // 2 276 | boxes_wh = true_boxes[..., 2:4] - true_boxes[..., 0:2] 277 | true_boxes[..., 0:2] = boxes_xy / input_shape[::-1] 278 | true_boxes[..., 2:4] = boxes_wh / input_shape[::-1] 279 | 280 | m = true_boxes.shape[0] 281 | grid_shapes = [input_shape // {0: 32, 1: 16, 2: 8}[l] for l in range(num_layers)] 282 | y_true = [np.zeros((m, grid_shapes[l][0], grid_shapes[l][1], len(anchor_mask[l]), 5 + num_classes), 283 | dtype='float32') for l in range(num_layers)] 284 | 285 | # Expand dim to apply broadcasting. 286 | anchors = np.expand_dims(anchors, 0) 287 | anchor_maxes = anchors / 2. 288 | anchor_mins = -anchor_maxes 289 | valid_mask = boxes_wh[..., 0] > 0 290 | 291 | for b in range(m): 292 | # Discard zero rows. 293 | wh = boxes_wh[b, valid_mask[b]] 294 | if len(wh) == 0: continue 295 | # Expand dim to apply broadcasting. 296 | wh = np.expand_dims(wh, -2) 297 | box_maxes = wh / 2. 298 | box_mins = -box_maxes 299 | 300 | intersect_mins = np.maximum(box_mins, anchor_mins) 301 | intersect_maxes = np.minimum(box_maxes, anchor_maxes) 302 | intersect_wh = np.maximum(intersect_maxes - intersect_mins, 0.) 
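        # IoU of each ground-truth box against every anchor, with both centered at the origin.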
303 | intersect_area = intersect_wh[..., 0] * intersect_wh[..., 1] 304 | box_area = wh[..., 0] * wh[..., 1] 305 | anchor_area = anchors[..., 0] * anchors[..., 1] 306 | iou = intersect_area / (box_area + anchor_area - intersect_area) 307 | 308 | # Find best anchor for each true box 309 | best_anchor = np.argmax(iou, axis=-1) 310 | 311 | for t, n in enumerate(best_anchor): 312 | for l in range(num_layers): 313 | if n in anchor_mask[l]: 314 | i = np.floor(true_boxes[b, t, 0] * grid_shapes[l][1]).astype('int32') 315 | j = np.floor(true_boxes[b, t, 1] * grid_shapes[l][0]).astype('int32') 316 | k = anchor_mask[l].index(n) 317 | c = true_boxes[b, t, 4].astype('int32') 318 | y_true[l][b, j, i, k, 0:4] = true_boxes[b, t, 0:4] 319 | y_true[l][b, j, i, k, 4] = 1 320 | y_true[l][b, j, i, k, 5 + c] = 1 321 | 322 | return y_true 323 | 324 | 325 | def box_iou(b1, b2): 326 | '''Return iou tensor 327 | 328 | Parameters 329 | ---------- 330 | b1: tensor, shape=(i1,...,iN, 4), xywh 331 | b2: tensor, shape=(j, 4), xywh 332 | 333 | Returns 334 | ------- 335 | iou: tensor, shape=(i1,...,iN, j) 336 | 337 | ''' 338 | 339 | # Expand dim to apply broadcasting. 340 | b1 = tf.expand_dims(b1, -2) 341 | b1_xy = b1[..., :2] 342 | b1_wh = b1[..., 2:4] 343 | b1_wh_half = b1_wh / 2. 344 | b1_mins = b1_xy - b1_wh_half 345 | b1_maxes = b1_xy + b1_wh_half 346 | 347 | # Expand dim to apply broadcasting. 348 | b2 = tf.expand_dims(b2, 0) 349 | b2_xy = b2[..., :2] 350 | b2_wh = b2[..., 2:4] 351 | b2_wh_half = b2_wh / 2. 352 | b2_mins = b2_xy - b2_wh_half 353 | b2_maxes = b2_xy + b2_wh_half 354 | 355 | intersect_mins = tf.maximum(b1_mins, b2_mins) 356 | intersect_maxes = tf.minimum(b1_maxes, b2_maxes) 357 | intersect_wh = tf.maximum(intersect_maxes - intersect_mins, 0.) 358 | intersect_area = intersect_wh[..., 0] * intersect_wh[..., 1] 359 | b1_area = b1_wh[..., 0] * b1_wh[..., 1] 360 | b2_area = b2_wh[..., 0] * b2_wh[..., 1] 361 | iou = intersect_area / (b1_area + b2_area - intersect_area) 362 | 363 | return iou 364 | 365 | 366 | def yolo_loss(args, anchors, num_classes, ignore_thresh=.5, print_loss=False): 367 | '''Return yolo_loss tensor 368 | 369 | Parameters 370 | ---------- 371 | yolo_outputs: list of tensor, the output of yolo_body or tiny_yolo_body 372 | y_true: list of array, the output of preprocess_true_boxes 373 | anchors: array, shape=(N, 2), wh 374 | num_classes: integer 375 | ignore_thresh: float, the iou threshold whether to ignore object confidence loss 376 | 377 | Returns 378 | ------- 379 | loss: tensor, shape=(1,) 380 | 381 | ''' 382 | num_layers = len(anchors) // 3 # default setting 383 | yolo_outputs = args[:num_layers] 384 | y_true = args[num_layers:] 385 | anchor_mask = [[6, 7, 8], [3, 4, 5], [0, 1, 2]] if num_layers == 3 else [[3, 4, 5], [1, 2, 3]] 386 | input_shape = tf.cast(tf.shape(yolo_outputs[0])[1:3] * 32, y_true[0].dtype) 387 | grid_shapes = [tf.cast(tf.shape(yolo_outputs[l])[1:3], y_true[0].dtype) for l in range(num_layers)] 388 | loss = 0 389 | m = tf.shape(yolo_outputs[0])[0] # batch size, tensor 390 | mf = tf.cast(m, yolo_outputs[0].dtype) 391 | 392 | for l in range(num_layers): 393 | object_mask = y_true[l][..., 4:5] 394 | true_class_probs = y_true[l][..., 5:] 395 | 396 | grid, raw_pred, pred_xy, pred_wh = yolo_head(yolo_outputs[l], 397 | anchors[anchor_mask[l]], num_classes, input_shape, calc_loss=True) 398 | pred_box = tf.concat([pred_xy, pred_wh], -1) 399 | 400 | # Darknet raw box to calculate loss. 
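        # The targets below live in the network's raw output space (cell-relative
        # offsets for xy, log ratios against the anchors for wh), i.e. the inverse
        # of the yolo_head decoding.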
401 |         raw_true_xy = y_true[l][..., :2] * grid_shapes[l][::-1] - grid
402 |         raw_true_wh = tf.math.log(y_true[l][..., 2:4] / anchors[anchor_mask[l]] * input_shape[::-1])
403 |         raw_true_wh = tf.keras.backend.switch(object_mask, raw_true_wh, tf.zeros_like(raw_true_wh))  # avoid log(0)=-inf
404 |         box_loss_scale = 2 - y_true[l][..., 2:3] * y_true[l][..., 3:4]
405 |
406 |         # Find ignore mask, iterate over each of batch.
407 |         ignore_mask = tf.TensorArray(y_true[0].dtype, size=1, dynamic_size=True)
408 |         object_mask_bool = tf.cast(object_mask, 'bool')
409 |
410 |         def loop_body(b, ignore_mask):
411 |             true_box = tf.boolean_mask(y_true[l][b, ..., 0:4], object_mask_bool[b, ..., 0])
412 |             iou = box_iou(pred_box[b], true_box)
413 |             best_iou = tf.reduce_max(iou, axis=-1)
414 |             ignore_mask = ignore_mask.write(b, tf.cast(best_iou < ignore_thresh, true_box.dtype))
415 |             return b + 1, ignore_mask
416 |
417 |         _, ignore_mask = tf.while_loop(lambda b, *args: b < m, loop_body, [0, ignore_mask])
418 |         ignore_mask = ignore_mask.stack()
419 |         ignore_mask = tf.expand_dims(ignore_mask, -1)
420 |
421 |         # binary_crossentropy with from_logits=True is helpful to avoid exp overflow.
422 |         xy_loss = object_mask * box_loss_scale * tf.keras.backend.binary_crossentropy(raw_true_xy, raw_pred[..., 0:2],
423 |                                                                                       from_logits=True)
424 |         wh_loss = object_mask * box_loss_scale * 0.5 * tf.square(raw_true_wh - raw_pred[..., 2:4])
425 |         confidence_loss = object_mask * tf.keras.backend.binary_crossentropy(object_mask, raw_pred[..., 4:5], from_logits=True) + \
426 |                           (1 - object_mask) * tf.keras.backend.binary_crossentropy(object_mask, raw_pred[..., 4:5],
427 |                                                                                    from_logits=True) * ignore_mask
428 |         class_loss = object_mask * tf.keras.backend.binary_crossentropy(true_class_probs, raw_pred[..., 5:], from_logits=True)
429 |
430 |         xy_loss = tf.reduce_sum(xy_loss) / mf
431 |         wh_loss = tf.reduce_sum(wh_loss) / mf
432 |         confidence_loss = tf.reduce_sum(confidence_loss) / mf
433 |         class_loss = tf.reduce_sum(class_loss) / mf
434 |         loss += xy_loss + wh_loss + confidence_loss + class_loss
435 |         if print_loss:
436 |             loss = tf.compat.v1.Print(loss, [loss, xy_loss, wh_loss, confidence_loss, class_loss, tf.reduce_sum(ignore_mask)],
437 |                                       message='loss: ')
438 |     return loss
439 |
--------------------------------------------------------------------------------
/post_process.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import tensorflow as tf
3 | from PIL import Image
4 |
5 |
6 | def letterbox_image(image, size):
7 |     '''resize image with unchanged aspect ratio using padding'''
8 |     iw, ih = image.size
9 |     w, h = size
10 |     scale = min(w / iw, h / ih)
11 |     nw = int(iw * scale)
12 |     nh = int(ih * scale)
13 |
14 |     image = image.resize((nw, nh), Image.BICUBIC)
15 |     new_image = Image.new('RGB', size, (128, 128, 128))
16 |     new_image.paste(image, ((w - nw) // 2, (h - nh) // 2))
17 |     return new_image
18 |
19 |
20 | def yolo_head(feats, anchors, num_classes, input_shape, calc_loss=False):
21 |     """Convert final layer features to bounding box parameters."""
22 |     num_anchors = len(anchors)
23 |     # Reshape to batch, height, width, num_anchors, box_params.
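    # anchors_tensor gets shape (1, 1, 1, num_anchors, 2) so it broadcasts over the feature-map grid.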
24 |     anchors_tensor = tf.reshape(tf.constant(anchors), [1, 1, 1, num_anchors, 2])
25 |
26 |     grid_shape = tf.shape(feats)[1:3]  # height, width
27 |     grid_y = tf.tile(tf.reshape(tf.range(0, limit=grid_shape[0]), [-1, 1, 1, 1]),
28 |                      [1, grid_shape[1], 1, 1])
29 |     grid_x = tf.tile(tf.reshape(tf.range(0, limit=grid_shape[1]), [1, -1, 1, 1]),
30 |                      [grid_shape[0], 1, 1, 1])
31 |     grid = tf.concat([grid_x, grid_y], -1)
32 |     grid = tf.cast(grid, feats.dtype)
33 |
34 |     feats = tf.reshape(
35 |         feats, [-1, grid_shape[0], grid_shape[1], num_anchors, num_classes + 5])
36 |
37 |     # Adjust predictions to each spatial grid point and anchor size.
38 |     box_xy = (tf.sigmoid(feats[..., :2]) + grid) / tf.cast(grid_shape[::-1], feats.dtype)
39 |     box_wh = tf.exp(feats[..., 2:4]) * tf.cast(anchors_tensor, feats.dtype) / tf.cast(input_shape[::-1], feats.dtype)
40 |     box_confidence = tf.sigmoid(feats[..., 4:5])
41 |     box_class_probs = tf.sigmoid(feats[..., 5:])
42 |
43 |     if calc_loss:
44 |         return grid, feats, box_xy, box_wh
45 |     return box_xy, box_wh, box_confidence, box_class_probs
46 |
47 |
48 | def yolo_correct_boxes(box_xy, box_wh, input_shape, image_shape):
49 |     '''Get corrected boxes'''
50 |     box_yx = box_xy[..., ::-1]
51 |     box_hw = box_wh[..., ::-1]
52 |     input_shape = tf.cast(input_shape, box_yx.dtype)
53 |     image_shape = tf.cast(image_shape, box_yx.dtype)
54 |     new_shape = tf.round(image_shape * tf.reduce_min(input_shape / image_shape))
55 |     offset = (input_shape - new_shape) / 2. / input_shape
56 |     scale = input_shape / new_shape
57 |     box_yx = (box_yx - offset) * scale
58 |     box_hw *= scale
59 |
60 |     box_mins = box_yx - (box_hw / 2.)
61 |     box_maxes = box_yx + (box_hw / 2.)
62 |     boxes = tf.concat([
63 |         box_mins[..., 0:1],  # y_min
64 |         box_mins[..., 1:2],  # x_min
65 |         box_maxes[..., 0:1],  # y_max
66 |         box_maxes[..., 1:2]  # x_max
67 |     ], -1)
68 |     # Scale boxes back to original image shape.
69 | boxes *= tf.concat([image_shape, image_shape], -1) 70 | return boxes 71 | 72 | 73 | def yolo_boxes_and_scores(feats, anchors, num_classes, input_shape, image_shape): 74 | '''Process Conv layer output''' 75 | box_xy, box_wh, box_confidence, box_class_probs = yolo_head(feats, 76 | anchors, num_classes, input_shape) 77 | boxes = yolo_correct_boxes(box_xy, box_wh, input_shape, image_shape) 78 | boxes = tf.reshape(boxes, [tf.shape(boxes)[0], -1, 4]) 79 | box_scores = box_confidence * box_class_probs 80 | box_scores = tf.reshape(box_scores, [tf.shape(box_scores)[0], -1, num_classes]) 81 | return boxes, box_scores 82 | 83 | 84 | def yolo_post_process(yolo_outputs, 85 | anchors, 86 | num_classes, 87 | image_shape, 88 | max_boxes=20, 89 | score_threshold=.6, 90 | iou_threshold=.5): 91 | """Evaluate YOLO model on given input and return filtered boxes.""" 92 | num_layers = len(yolo_outputs) 93 | anchor_mask = [[6, 7, 8], [3, 4, 5], [0, 1, 2]] if num_layers == 3 else [[3, 4, 5], [1, 2, 3]] # default setting 94 | input_shape = tf.shape(yolo_outputs[0])[1:3] * 32 95 | boxes = [] 96 | box_scores = [] 97 | for l in range(num_layers): 98 | _boxes, _box_scores = yolo_boxes_and_scores(yolo_outputs[l], 99 | anchors[anchor_mask[l]], num_classes, input_shape, image_shape) 100 | boxes.append(_boxes) 101 | box_scores.append(_box_scores) 102 | # boxes = tf.concat(boxes, axis=1) 103 | # box_scores = tf.concat(box_scores, axis=1) 104 | boxes = np.concatenate(boxes, axis=1) 105 | box_scores = np.concatenate(box_scores, axis=1) 106 | # print(box_scores.shape) 107 | # print(boxes.shape) 108 | 109 | boxes_ = [] 110 | scores_ = [] 111 | classes_ = [] 112 | for single_boxes, single_box_scores in zip(boxes, box_scores): 113 | mask = single_box_scores >= score_threshold 114 | max_boxes_tensor = tf.constant(max_boxes, dtype='int32') 115 | single_boxes_ = [] 116 | single_scores_ = [] 117 | single_classes_ = [] 118 | for c in range(num_classes): 119 | pass 120 | # TODO: use keras backend instead of tf. 
121 | # class_boxes = tf.boolean_mask(single_boxes, mask[..., c]) 122 | # class_box_scores = tf.boolean_mask(single_box_scores[..., c], mask[..., c]) 123 | class_boxes = single_boxes[mask[..., c]] 124 | class_box_scores = single_box_scores[..., c][mask[..., c]] 125 | nms_index = tf.image.non_max_suppression( 126 | class_boxes, class_box_scores, max_boxes_tensor, iou_threshold=iou_threshold) 127 | class_boxes = tf.gather(class_boxes, nms_index) 128 | class_box_scores = tf.gather(class_box_scores, nms_index) 129 | classes = tf.ones_like(class_box_scores, 'int32') * c 130 | single_boxes_.append(class_boxes) 131 | single_scores_.append(class_box_scores) 132 | single_classes_.append(classes) 133 | single_boxes_ = tf.concat(single_boxes_, axis=0) 134 | single_scores_ = tf.concat(single_scores_, axis=0) 135 | single_classes_ = tf.concat(single_classes_, axis=0) 136 | boxes_.append(single_boxes_) 137 | scores_.append(single_scores_) 138 | classes_.append(single_classes_) 139 | 140 | # boxes_ = tf.reshape(boxes_, [tf.shape(box_scores)[0], -1, 4]) 141 | # scores_ = tf.reshape(scores_, [tf.shape(box_scores)[0], -1]) 142 | # classes_ = tf.reshape(classes_, [tf.shape(box_scores)[0], -1]) 143 | 144 | return boxes_, scores_, classes_ 145 | -------------------------------------------------------------------------------- /tensorrt_util/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mMikaa00/Yolov3-TensorRT-py/90c34d5eee4dc6ef5c853df64f3ef11cb32ac020/tensorrt_util/__init__.py -------------------------------------------------------------------------------- /tensorrt_util/__pycache__/__init__.cpython-35.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mMikaa00/Yolov3-TensorRT-py/90c34d5eee4dc6ef5c853df64f3ef11cb32ac020/tensorrt_util/__pycache__/__init__.cpython-35.pyc -------------------------------------------------------------------------------- /tensorrt_util/__pycache__/tensorrt_common.cpython-35.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mMikaa00/Yolov3-TensorRT-py/90c34d5eee4dc6ef5c853df64f3ef11cb32ac020/tensorrt_util/__pycache__/tensorrt_common.cpython-35.pyc -------------------------------------------------------------------------------- /tensorrt_util/__pycache__/yolo_calibrator.cpython-35.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mMikaa00/Yolov3-TensorRT-py/90c34d5eee4dc6ef5c853df64f3ef11cb32ac020/tensorrt_util/__pycache__/yolo_calibrator.cpython-35.pyc -------------------------------------------------------------------------------- /tensorrt_util/tensorrt_common.py: -------------------------------------------------------------------------------- 1 | import pycuda.driver as cuda 2 | import pycuda.autoinit 3 | import tensorrt as trt 4 | import uff 5 | import os 6 | 7 | 8 | class HostDeviceMem(object): 9 | def __init__(self, host_mem, device_mem): 10 | self.host = host_mem 11 | self.device = device_mem 12 | 13 | def __str__(self): 14 | return "Host:\n" + str(self.host) + "\nDevice:\n" + str(self.device) 15 | 16 | def __repr__(self): 17 | return self.__str__() 18 | 19 | 20 | def allocate_buffers(engine): 21 | inputs = [] 22 | outputs = [] 23 | bindings = [] 24 | stream = cuda.Stream() 25 | for binding in engine: 26 | size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size 27 | 
dtype = trt.nptype(engine.get_binding_dtype(binding))
28 |         # Allocate host and device buffers
29 |         host_mem = cuda.pagelocked_empty(size, dtype)
30 |         device_mem = cuda.mem_alloc(host_mem.nbytes)
31 |         # Append the device buffer to device bindings.
32 |         bindings.append(int(device_mem))
33 |         # Append to the appropriate list.
34 |         if engine.binding_is_input(binding):
35 |             inputs.append(HostDeviceMem(host_mem, device_mem))
36 |         else:
37 |             outputs.append(HostDeviceMem(host_mem, device_mem))
38 |     return inputs, outputs, bindings, stream
39 |
40 |
41 | def do_inference(context, bindings, inputs, outputs, stream, batch_size=1):
42 |     # Transfer input data to the GPU.
43 |     [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
44 |     # Run inference.
45 |     context.execute_async(batch_size=batch_size, bindings=bindings, stream_handle=stream.handle)
46 |     # Transfer predictions back from the GPU.
47 |     [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]
48 |     # Synchronize the stream
49 |     stream.synchronize()
50 |     # Return only the host outputs.
51 |     return [out.host for out in outputs]
52 |
53 |
54 | # Transforms model path to uff path (e.g. /a/b/c/d.pb -> /a/b/c/d.uff)
55 | def model_path_to_uff_path(model_path):
56 |     uff_path = os.path.splitext(model_path)[0] + ".uff"
57 |     return uff_path
58 |
59 |
60 | # Transforms model path to onnx path (e.g. /a/b/c/d.pb -> /a/b/c/d.onnx)
61 | def model_path_to_onnx_path(model_path):
62 |     onnx_path = os.path.splitext(model_path)[0] + ".onnx"
63 |     return onnx_path
64 |
65 |
66 | # Transforms model path to engine path (e.g. /a/b/c/d.pb -> /a/b/c/d_fp32.engine)
67 | def model_path_to_engine_path(model_path, build_type='fp32'):
68 |     if os.path.splitext(model_path)[1] == '.engine':
69 |         return model_path
70 |     engine_path = os.path.splitext(model_path)[0] + '_' + build_type + ".engine"
71 |     return engine_path
72 |
73 |
74 | # Converts the TensorFlow frozen graphdef to UFF format using the UFF converter
75 | def model_to_uff(model_path, output_names, plugin_map={}):
76 |     # Transform graph using graphsurgeon to map unsupported TensorFlow
77 |     # operations to appropriate TensorRT custom layer plugins
78 |     import graphsurgeon as gs
79 |     dynamic_graph = gs.DynamicGraph(model_path)
80 |     dynamic_graph.collapse_namespaces(plugin_map)
81 |     # Save resulting graph to UFF file
82 |     output_uff_path = model_path_to_uff_path(model_path)
83 |     uff.from_tensorflow(
84 |         dynamic_graph.as_graph_def(),
85 |         output_names,
86 |         output_filename=output_uff_path,
87 |         text=True
88 |     )
89 |     return output_uff_path
90 |
--------------------------------------------------------------------------------
/tensorrt_util/yolo_calibrator.py:
--------------------------------------------------------------------------------
1 | import tensorrt as trt
2 | import os
3 |
4 | import pycuda.driver as cuda
5 | import pycuda.autoinit
6 | from PIL import Image
7 | from post_process import letterbox_image
8 | import numpy as np
9 |
10 |
11 | def data_generator(annotation_lines, batch_size, input_shape):
12 |     '''Data generator that yields batches of calibration images'''
13 |     n = len(annotation_lines)
14 |     count = 0
15 |     i = 0
16 |     while True:
17 |         image_data = []
18 |         for b in range(batch_size):
19 |             count += 1
20 |             if count > 400:  # cap the total number of calibration images
21 |                 return None
22 |             line = annotation_lines[i].split()
23 |             image = Image.open(line[0])
24 |             boxed_image = letterbox_image(image, input_shape)
25 |             image_np = np.array(boxed_image, dtype='float32', order='C')/255.
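            # Pixel values are scaled to [0, 1], matching the preprocessing used at inference time.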
26 | # image_np = np.transpose(image_np, [2, 0, 1]) 27 | image_data.append(image_np) 28 | i = (i+1) % n 29 | print("Calib count: ", count) 30 | yield np.ascontiguousarray(image_data) 31 | 32 | 33 | class YOLOEntropyCalibrator(trt.IInt8EntropyCalibrator2): 34 | def __init__(self, data_path, cache_file, input_shape, batch_size=64): 35 | # Whenever you specify a custom constructor for a TensorRT class, 36 | # you MUST call the constructor of the parent explicitly. 37 | trt.IInt8EntropyCalibrator2.__init__(self) 38 | 39 | self.cache_file = cache_file 40 | 41 | # Every time get_batch is called, the next batch of size batch_size will be copied to the device and returned. 42 | 43 | self.batch_size = batch_size 44 | self.current_index = 0 45 | self.input_shape = input_shape 46 | 47 | # Allocate enough memory for a whole batch. 48 | self.device_input = cuda.mem_alloc(input_shape[0] * input_shape[1] * 3 * 4 * self.batch_size) 49 | with open(data_path) as f: 50 | self.lines = f.readlines() 51 | self.batches = data_generator(self.lines, self.batch_size, self.input_shape) 52 | 53 | def get_batch_size(self): 54 | return self.batch_size 55 | 56 | # TensorRT passes along the names of the engine bindings to the get_batch function. 57 | # You don't necessarily have to use them, but they can be useful to understand the order of 58 | # the inputs. The bindings list is expected to have the same ordering as 'names'. 59 | def get_batch(self, names): 60 | try: 61 | # Assume self.batches is a generator that provides batch data. 62 | data = next(self.batches) 63 | # Assume that self.device_input is a device buffer allocated by the constructor. 64 | cuda.memcpy_htod(self.device_input, data) 65 | return [int(self.device_input)] 66 | except StopIteration: 67 | # When we're out of batches, we return either [] or None. 68 | # This signals to TensorRT that there is no calibration data remaining. 69 | return None 70 | 71 | def read_calibration_cache(self): 72 | # If there is a cache, use it instead of calibrating again. Otherwise, implicitly return None. 
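        # Note: delete the cache file to force recalibration, e.g. after changing the calibration set.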
73 | if os.path.exists(self.cache_file): 74 | with open(self.cache_file, "rb") as f: 75 | return f.read() 76 | 77 | def write_calibration_cache(self, cache): 78 | with open(self.cache_file, "wb") as f: 79 | f.write(cache) 80 | -------------------------------------------------------------------------------- /utils.py: -------------------------------------------------------------------------------- 1 | import os 2 | import colorsys 3 | import numpy as np 4 | from matplotlib.colors import rgb_to_hsv, hsv_to_rgb 5 | from PIL import Image, ImageFont, ImageDraw 6 | from timeit import default_timer as timer 7 | 8 | 9 | def rand(a=0, b=1): 10 | return np.random.rand() * (b - a) + a 11 | 12 | 13 | def get_random_data(annotation_line, input_shape, random=True, max_boxes=20, jitter=.3, hue=.1, sat=1.5, val=1.5, 14 | proc_img=True): 15 | '''random preprocessing for real-time data augmentation''' 16 | line = annotation_line.split() 17 | image = Image.open(line[0]) 18 | iw, ih = image.size 19 | h, w = input_shape 20 | box = np.array([np.array(list(map(int, box.split(',')))) for box in line[1:]]) 21 | 22 | if not random: 23 | # resize image 24 | scale = min(w / iw, h / ih) 25 | nw = int(iw * scale) 26 | nh = int(ih * scale) 27 | dx = (w - nw) // 2 28 | dy = (h - nh) // 2 29 | image_data = 0 30 | if proc_img: 31 | image = image.resize((nw, nh), Image.BICUBIC) 32 | new_image = Image.new('RGB', (w, h), (128, 128, 128)) 33 | new_image.paste(image, (dx, dy)) 34 | image_data = np.array(new_image) / 255. 35 | 36 | # correct boxes 37 | box_data = np.zeros((max_boxes, 5)) 38 | if len(box) > 0: 39 | np.random.shuffle(box) 40 | if len(box) > max_boxes: box = box[:max_boxes] 41 | box[:, [0, 2]] = box[:, [0, 2]] * scale + dx 42 | box[:, [1, 3]] = box[:, [1, 3]] * scale + dy 43 | box_data[:len(box)] = box 44 | 45 | return image_data, box_data 46 | 47 | # resize image 48 | new_ar = w / h * rand(1 - jitter, 1 + jitter) / rand(1 - jitter, 1 + jitter) 49 | scale = rand(.25, 2) 50 | if new_ar < 1: 51 | nh = int(scale * h) 52 | nw = int(nh * new_ar) 53 | else: 54 | nw = int(scale * w) 55 | nh = int(nw / new_ar) 56 | image = image.resize((nw, nh), Image.BICUBIC) 57 | 58 | # place image 59 | dx = int(rand(0, w - nw)) 60 | dy = int(rand(0, h - nh)) 61 | new_image = Image.new('RGB', (w, h), (128, 128, 128)) 62 | new_image.paste(image, (dx, dy)) 63 | image = new_image 64 | 65 | # flip image or not 66 | flip = rand() < .5 67 | if flip: image = image.transpose(Image.FLIP_LEFT_RIGHT) 68 | 69 | # distort image 70 | hue = rand(-hue, hue) 71 | sat = rand(1, sat) if rand() < .5 else 1 / rand(1, sat) 72 | val = rand(1, val) if rand() < .5 else 1 / rand(1, val) 73 | x = rgb_to_hsv(np.array(image) / 255.) 
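    # Hue is cyclic, so shifted values are wrapped back into [0, 1] below; saturation and value are clipped.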
74 | x[..., 0] += hue 75 | x[..., 0][x[..., 0] > 1] -= 1 76 | x[..., 0][x[..., 0] < 0] += 1 77 | x[..., 1] *= sat 78 | x[..., 2] *= val 79 | x[x > 1] = 1 80 | x[x < 0] = 0 81 | image_data = hsv_to_rgb(x) # numpy array, 0 to 1 82 | 83 | # correct boxes 84 | box_data = np.zeros((max_boxes, 5)) 85 | if len(box) > 0: 86 | np.random.shuffle(box) 87 | box[:, [0, 2]] = box[:, [0, 2]] * nw / iw + dx 88 | box[:, [1, 3]] = box[:, [1, 3]] * nh / ih + dy 89 | if flip: box[:, [0, 2]] = w - box[:, [2, 0]] 90 | box[:, 0:2][box[:, 0:2] < 0] = 0 91 | box[:, 2][box[:, 2] > w] = w 92 | box[:, 3][box[:, 3] > h] = h 93 | box_w = box[:, 2] - box[:, 0] 94 | box_h = box[:, 3] - box[:, 1] 95 | box = box[np.logical_and(box_w > 1, box_h > 1)] # discard invalid box 96 | if len(box) > max_boxes: box = box[:max_boxes] 97 | box_data[:len(box)] = box 98 | 99 | return image_data, box_data 100 | 101 | 102 | class Drawer(object): 103 | def __init__(self): 104 | self.class_names = [ 105 | "person", "bicycle", "car", "motorbike", 106 | "aeroplane", "bus", "train", "truck", 107 | "boat", "traffic light", "fire hydrant", "stop sign", 108 | "parking meter", "bench", "bird", "cat", 109 | "dog", "horse", "sheep", "cow", 110 | "elephant", "bear", "zebra", "giraffe", 111 | "backpack", "umbrella", "handbag", "tie", 112 | "suitcase", "frisbee", "skis", "snowboard", 113 | "sports ball", "kite", "baseball bat", "baseball glove", 114 | "skateboard", "surfboard", "tennis racket", "bottle", 115 | "wine glass", "cup", "fork", "knife", 116 | "spoon", "bowl", "banana", "apple", 117 | "sandwich", "orange", "broccoli", "carrot", 118 | "hot dog", "pizza", "donut", "cake", 119 | "chair", "sofa", "pottedplant", "bed", 120 | "diningtable", "toilet", "tvmonitor", "laptop", 121 | "mouse", "remote", "keyboard", "cell phone", 122 | "microwave", "oven", "toaster", "sink", 123 | "refrigerator", "book", "clock", "vase", 124 | "scissors", "teddy bear", "hair drier", "toothbrush", 125 | ] 126 | 127 | # Generate colors for drawing bounding boxes. 128 | self.hsv_tuples = [(x / len(self.class_names), 1., 1.) 129 | for x in range(len(self.class_names))] 130 | self.colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), self.hsv_tuples)) 131 | self.colors = list( 132 | map(lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)), 133 | self.colors)) 134 | np.random.seed(10101) # Fixed seed for consistent colors across runs. 135 | np.random.shuffle(self.colors) # Shuffle colors to decorrelate adjacent classes. 136 | np.random.seed(None) # Reset seed to default. 
137 |
138 |     def draw_boxes(self, image, out_boxes, out_scores, out_classes):
139 |         font = ImageFont.truetype(font='font/FiraMono-Medium.otf',
140 |                                   size=np.floor(3e-2 * image.size[1] + 0.5).astype('int32'))
141 |         thickness = (image.size[0] + image.size[1]) // 300
142 |
143 |         for i, c in reversed(list(enumerate(out_classes))):
144 |             predicted_class = self.class_names[c]
145 |             box = out_boxes[i]
146 |             score = out_scores[i]
147 |
148 |             label = '{} {:.2f}'.format(predicted_class, score)
149 |             draw = ImageDraw.Draw(image)
150 |             label_size = draw.textsize(label, font)
151 |
152 |             top, left, bottom, right = box
153 |             top = max(0, np.floor(top + 0.5).astype('int32'))
154 |             left = max(0, np.floor(left + 0.5).astype('int32'))
155 |             bottom = min(image.size[1], np.floor(bottom + 0.5).astype('int32'))
156 |             right = min(image.size[0], np.floor(right + 0.5).astype('int32'))
157 |             print(label, (left, top), (right, bottom))
158 |
159 |             if top - label_size[1] >= 0:
160 |                 text_origin = np.array([left, top - label_size[1]])
161 |             else:
162 |                 text_origin = np.array([left, top + 1])
163 |
164 |             # My kingdom for a good redistributable image drawing library.
165 |             for i in range(thickness):
166 |                 draw.rectangle(
167 |                     [left + i, top + i, right - i, bottom - i],
168 |                     outline=self.colors[c])
169 |             draw.rectangle(
170 |                 [tuple(text_origin), tuple(text_origin + label_size)],
171 |                 fill=self.colors[c])
172 |             draw.text(text_origin, label, fill=(0, 0, 0), font=font)
173 |             del draw
174 |         return image
175 |
176 |
177 | def detect_image(yolo, image_path, output_path=""):
178 |     import cv2
179 |     image_path = os.path.abspath(image_path)
180 |     if os.path.isdir(image_path):
181 |         files = os.listdir(image_path)
182 |         images = [os.path.join(image_path, file) for file in files]
183 |     elif os.path.isfile(image_path):
184 |         images = [image_path]  # wrap in a list so the loop iterates over paths, not characters
185 |     else:
186 |         print('image_path error.')
187 |         return
188 |     f = open('./time.txt', 'w')
189 |     for image_file in images:
190 |         image = Image.open(image_file)
191 |         start = timer()
192 |         out_boxes, out_scores, out_classes = yolo.detect_image(image)
193 |         end = timer()
194 |         print('inference time: {:.3f}'.format(end - start))
195 |         f.write('{:.3f}\n'.format(end - start))
196 |         drawer = Drawer()
197 |         image = drawer.draw_boxes(image, out_boxes, out_scores, out_classes)
198 |         result = np.asarray(image)
199 |         cv2.namedWindow("result", cv2.WINDOW_NORMAL)
200 |         cv2.imshow("result", cv2.cvtColor(result, cv2.COLOR_RGB2BGR))
201 |         if output_path != "":
202 |             image.save(output_path)
203 |         if cv2.waitKey(1) & 0xFF == ord('q'):
204 |             break
205 |     f.close()
206 |     yolo.close_session()
207 |
208 |
209 | def detect_video(yolo, video_path, output_path=""):
210 |     import cv2
211 |     vid = cv2.VideoCapture(video_path)
212 |     if not vid.isOpened():
213 |         raise IOError("Couldn't open webcam or video")
214 |     video_FourCC = int(vid.get(cv2.CAP_PROP_FOURCC))
215 |     video_fps = vid.get(cv2.CAP_PROP_FPS)
216 |     video_size = (int(vid.get(cv2.CAP_PROP_FRAME_WIDTH)),
217 |                   int(vid.get(cv2.CAP_PROP_FRAME_HEIGHT)))
218 |     vid.set(cv2.CAP_PROP_BUFFERSIZE, 1)
219 |     isOutput = True if output_path != "" else False
220 |     if isOutput:
221 |         print("!!! TYPE:", type(output_path), type(video_FourCC), type(video_fps), type(video_size))
222 |         out = cv2.VideoWriter(output_path, video_FourCC, video_fps, video_size)
223 |     accum_time = 0
224 |     curr_fps = 0
225 |     fps = "FPS: ??"
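    # The loop below estimates FPS by counting frames rendered during each accumulated second of wall time.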
226 | prev_time = timer() 227 | drawer = Drawer() 228 | while True: 229 | return_value, frame = vid.read() 230 | image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)) 231 | # image = Image.fromarray(frame) 232 | start = timer() 233 | out_boxes, out_scores, out_classes = yolo.detect_image(image) 234 | end = timer() 235 | print('inference time: {}'.format(end - start)) 236 | 237 | image = drawer.draw_boxes(image, out_boxes, out_scores, out_classes) 238 | result = np.asarray(image) 239 | curr_time = timer() 240 | exec_time = curr_time - prev_time 241 | prev_time = curr_time 242 | accum_time = accum_time + exec_time 243 | curr_fps = curr_fps + 1 244 | if accum_time > 1: 245 | accum_time = accum_time - 1 246 | fps = "FPS: " + str(curr_fps) 247 | curr_fps = 0 248 | cv2.putText(result, text=fps, org=(3, 15), fontFace=cv2.FONT_HERSHEY_SIMPLEX, 249 | fontScale=0.50, color=(255, 0, 0), thickness=2) 250 | cv2.namedWindow("result", cv2.WINDOW_NORMAL) 251 | cv2.imshow("result", cv2.cvtColor(result, cv2.COLOR_RGB2BGR)) 252 | if isOutput: 253 | out.write(result) 254 | if cv2.waitKey(1) & 0xFF == ord('q'): 255 | break 256 | yolo.close_session() 257 | -------------------------------------------------------------------------------- /yolo_keras.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | import numpy as np 3 | import tensorflow as tf 4 | from tensorflow import keras 5 | from tensorflow.keras.models import load_model 6 | 7 | import os 8 | from post_process import yolo_post_process, letterbox_image 9 | from tensorflow.keras.utils import multi_gpu_model 10 | 11 | # fix a memory allocation bug, refer to https://www.tensorflow.org/guide/gpu?hl=zh-CN 12 | tf.enable_eager_execution() 13 | gpus = tf.config.experimental.list_physical_devices('GPU') 14 | if gpus: 15 | try: 16 | # Currently, memory growth needs to be the same across GPUs 17 | for gpu in gpus: 18 | tf.config.experimental.set_memory_growth(gpu, True) 19 | # tf.config.experimental.set_virtual_device_configuration( 20 | # gpu, 21 | # [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)]) 22 | 23 | logical_gpus = tf.config.experimental.list_logical_devices('GPU') 24 | print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs") 25 | except RuntimeError as e: 26 | # Memory growth must be set before GPUs have been initialized 27 | print(e) 28 | 29 | 30 | # class OutputLayer(keras.layers.Layer): 31 | # def __init__(self, anchors, class_names, score, iou, input_image_shape): 32 | # super(OutputLayer, self).__init__() 33 | # self.anchors = anchors 34 | # self.class_names = class_names 35 | # self.score = score 36 | # self.iou = iou 37 | # self.input_image_shape = input_image_shape 38 | # 39 | # def call(self, inputs): 40 | # return yolo_eval(inputs, self.anchors, 41 | # len(self.class_names), self.input_image_shape, 42 | # score_threshold=self.score, iou_threshold=self.iou) 43 | 44 | 45 | class YOLO(object): 46 | _defaults = { 47 | "model_path": 'model_data/yolo.h5', 48 | "anchors_path": 'model_data/yolo_anchors.txt', 49 | "classes_path": 'model_data/coco_classes.txt', 50 | "classes_num": 80, 51 | "score": 0.6, 52 | "iou": 0.45, 53 | "model_image_size": (416, 416), 54 | "gpu_num": 1, 55 | } 56 | 57 | @classmethod 58 | def get_defaults(cls, n): 59 | if n in cls._defaults: 60 | return cls._defaults[n] 61 | else: 62 | return "Unrecognized attribute name '" + n + "'" 63 | 64 | def __init__(self, **kwargs): 65 | self.__dict__.update(self._defaults) # set up 
default values
66 |         self.__dict__.update(kwargs)  # and update with user overrides
67 |         self.anchors = self._get_anchors()
68 |         self.generate()
69 |
70 |     def _get_anchors(self):
71 |         anchors_path = os.path.expanduser(self.anchors_path)
72 |         with open(anchors_path) as f:
73 |             anchors = f.readline()
74 |         anchors = [float(x) for x in anchors.split(',')]
75 |         return np.array(anchors).reshape(-1, 2)
76 |
77 |     def generate(self):
78 |         model_path = os.path.expanduser(self.model_path)
79 |         assert model_path.endswith('.h5'), 'Keras model or weights must be a .h5 file.'
80 |         # loaded = tf.saved_model.load(model_path)
81 |         # self.inference_func = loaded.signatures["serving_default"]
82 |
83 |         # Load model
84 |         try:
85 |             raw_model = load_model(model_path)
86 |         except Exception:
87 |             print('load model failed.')
88 |             raise  # re-raise instead of continuing with an undefined model
89 |
90 |         print('{} model loaded.'.format(model_path))
91 |
92 |         if self.gpu_num >= 2:
93 |             raw_model = multi_gpu_model(raw_model, gpus=self.gpu_num)
94 |         # boxes, scores, classes = OutputLayer(self.anchors, self.class_names, self.score, self.iou, input_image_shape)(
95 |         #     raw_model.output)
96 |         # self.yolo_model = keras.Model(inputs=[raw_model.input, input_image_shape], outputs=[boxes, scores, classes])
97 |         self.yolo_model = raw_model
98 |
99 |     def detect_image(self, image):
100 |         if self.model_image_size != (None, None):
101 |             assert self.model_image_size[0] % 32 == 0, 'Multiples of 32 required'
102 |             assert self.model_image_size[1] % 32 == 0, 'Multiples of 32 required'
103 |             boxed_image = letterbox_image(image, tuple(reversed(self.model_image_size)))
104 |         else:
105 |             new_image_size = (image.width - (image.width % 32),
106 |                               image.height - (image.height % 32))
107 |             boxed_image = letterbox_image(image, new_image_size)
108 |         image_data = np.array(boxed_image, dtype='float32')
109 |
110 |         image_data /= 255.
111 |         image_data = np.expand_dims(image_data, 0)  # Add batch dimension.
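        # input_image_size is the original image's (height, width); post-processing
        # uses it to map letterboxed detections back onto the source image.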
--------------------------------------------------------------------------------
/yolo_tensorrt.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | import numpy as np
3 | import tensorflow as tf
4 | import tensorrt as trt
5 | from tensorrt_util import tensorrt_common as common
6 | from tensorrt_util.yolo_calibrator import YOLOEntropyCalibrator
7 | 
8 | from post_process import yolo_post_process, letterbox_image
9 | import os
10 | 
11 | # stop TensorFlow from grabbing all GPU memory up front, refer to https://www.tensorflow.org/guide/gpu?hl=zh-CN
12 | if hasattr(tf, 'enable_eager_execution'): tf.enable_eager_execution()  # TF 1.x only; eager is the default on TF 2.x
13 | gpus = tf.config.experimental.list_physical_devices('GPU')
14 | if gpus:
15 |     try:
16 |         # Currently, memory growth needs to be the same across GPUs
17 |         for gpu in gpus:
18 |             # tf.config.experimental.set_virtual_device_configuration(
19 |             #     gpu,
20 |             #     [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=512)])
21 |             tf.config.experimental.set_memory_growth(gpu, True)
22 |         logical_gpus = tf.config.experimental.list_logical_devices('GPU')
23 |         print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
24 |     except RuntimeError as e:
25 |         # Memory growth must be set before GPUs have been initialized
26 |         print(e)
27 | 
28 | 
29 | # tf.compat.v1.disable_v2_behavior()
30 | 
31 | 
32 | class YOLO(object):
33 |     _defaults = {
34 |         "model_path": 'model_data/tensorrt_model/yolo_int8.engine',
35 |         "anchors_path": 'model_data/yolo_anchors.txt',
36 |         "calib_img_path": "./dataset/2012_train.txt",  # list of image paths used for int8 calibration
37 |         "classes_num": 80,
38 |         "score": 0.6,
39 |         "iou": 0.45,
40 |         "model_image_size": (416, 416),
41 |         "gpu_num": 1,
42 |         "infer_mode": "int8",
43 |     }
44 | 
45 |     @classmethod
46 |     def get_defaults(cls, n):
47 |         if n in cls._defaults:
48 |             return cls._defaults[n]
49 |         else:
50 |             return "Unrecognized attribute name '" + n + "'"
51 | 
52 |     def __init__(self, **kwargs):
53 |         self.__dict__.update(self._defaults)  # set up default values
54 |         self.__dict__.update(kwargs)  # and update with user overrides
55 |         self.anchors = self._get_anchors()
56 |         self.engine = self._build_engine()
57 |         self.context = self.engine.create_execution_context()
58 |         # self.context.active_optimization_profile = 0
59 |         # self.context.set_binding_shape(0, (1, 416, 416, 3))
60 |         self.inputs, self.outputs, self.bindings, self.stream = common.allocate_buffers(self.engine)
61 | 
62 |     def _get_anchors(self):
63 |         anchors_path = os.path.expanduser(self.anchors_path)
64 |         with open(anchors_path) as f:
65 |             anchors = f.readline()
66 |         anchors = [float(x) for x in anchors.split(',')]
67 |         return np.array(anchors).reshape(-1, 2)
68 | 
69 |     def _build_engine(self):
70 |         model_path = os.path.expanduser(self.model_path)
71 | 
72 |         TRT_LOGGER = trt.Logger(trt.Logger.INFO)
73 |         # trt.init_libnvinfer_plugins(TRT_LOGGER, '')
74 |         # CLIP_PLUGIN_LIBRARY = '/usr/lib/x86_64-linux-gnu/libnvinfer_plugin.so.6.0.1'
75 |         # if not os.path.isfile(self.plugin_path):
76 |         #     raise IOError("\n{}\n{}\n{}\n".format(
77 |         #         "Failed to load library ({}).".format(self.plugin_path),
78 |         #         "Please build the Clip sample plugin.",
79 |         #         "For more information, see the included README.md"
80 |         #     ))
81 | 
82 |         # dll = ctypes.CDLL(self.plugin_path)
83 |         # init = dll.initLibNvInferPlugins
84 |         # init = dll.initLibYoloInferPlugins
85 |         # init(None, b'')
86 | 
87 |         # plugins = trt.get_plugin_registry().plugin_creator_list
88 |         # for plugin_creator in plugins:
89 |         #     print(plugin_creator.name)
90 | 
91 |         engine_path = common.model_path_to_engine_path(model_path, self.infer_mode)
92 |         if os.path.isfile(engine_path):  # reuse a previously serialized engine
93 |             with open(engine_path, 'rb') as f, trt.Runtime(TRT_LOGGER) as runtime:
94 |                 return runtime.deserialize_cuda_engine(f.read())
95 | 
96 |         EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
97 |         with trt.Builder(TRT_LOGGER) as builder, builder.create_network(EXPLICIT_BATCH) as network, trt.OnnxParser(
98 |                 network, TRT_LOGGER) as parser:
99 |             builder.max_batch_size = 1
100 |             builder.max_workspace_size = 1 << 28
101 |             if self.infer_mode == 'int8':
102 |                 calibration_cache = "model_data/tensorrt_model/yolo_calibration.cache"
103 |                 calib = YOLOEntropyCalibrator(self.calib_img_path, calibration_cache, self.model_image_size,
104 |                                               batch_size=8)
105 |                 builder.int8_mode = True
106 |                 builder.int8_calibrator = calib
107 | 
108 |             # profile = builder.create_optimization_profile()
109 |             # profile.set_shape('input_1', (1, 416, 416, 3), (1, 416, 416, 3), (1, 416, 416, 3))
110 |             # config.add_optimization_profile(profile)
111 | 
112 |             onnx_path = common.model_path_to_onnx_path(model_path)
113 |             with open(onnx_path, 'rb') as model:
114 |                 print(onnx_path)
115 |                 if not parser.parse(model.read()):
116 |                     raise RuntimeError("Parser parse failed.")
117 |             # network.get_input(0).shape = (1, 416, 416, 3)
118 |             # print(network.get_output(0).shape)
119 |             # print(network.get_output(1).shape)
120 |             # print(network.get_output(2).shape)
121 | 
122 |             engine = builder.build_cuda_engine(network)
123 |             if not engine:  # check before touching the engine object
124 |                 raise RuntimeError("Build engine failed.")
125 |             print(engine.get_binding_name(1))  # sanity check: name of the first output binding
126 | 
127 |             with open(engine_path, 'wb') as f:
128 |                 f.write(engine.serialize())
129 |             return engine
130 | 
131 |         # with trt.Builder(TRT_LOGGER) as builder, builder.create_network() as network, trt.UffParser() as parser:
132 |         #     builder.max_batch_size = 1
133 |         #     builder.max_workspace_size = 1 << 29
134 |         #     calibration_cache = "model_data/tensorrt_model/yolo_calibration.cache"
135 |         #     if self.infer_mode == 'int8':
136 |         #         calib = YOLOEntropyCalibrator(self.calib_img_path, calibration_cache, self.model_image_size,
137 |         #                                       batch_size=8)
138 |         #         builder.int8_mode = True
139 |         #         builder.int8_calibrator = calib
140 |         #     output_names = ['conv2d_59/BiasAdd',
141 |         #                     'conv2d_67/BiasAdd',
142 |         #                     'conv2d_75/BiasAdd']
143 |         #     import graphsurgeon as gs
144 |         #     plugin_map = {
145 |         #         # "input_1": gs.create_plugin_node(name="input_1", op="Placeholder", shape=(-1, 416, 416, 3),
146 |         #         #                                  dtype=tf.float32),
147 |         #         # "up_sampling2d_2/ResizeNearestNeighbor": gs.create_plugin_node(
148 |         #         #     name="trt_upsampled2d_2/ResizeNearest_TRT",
149 |         #         #     op="ResizeNearest_TRT",
150 |         #         #     scale=2.0),
151 |         #         # "up_sampling2d_1/ResizeNearestNeighbor": gs.create_plugin_node(
152 |         #         #     name="trt_upsampled2d_1/ResizeNearest_TRT",
153 |         #         #     op="ResizeNearest_TRT",
154 |         #         #     scale=2.0)
155 |         #     }
156 |         #     uff_path = common.model_path_to_uff_path(model_path)
157 |         #     if not os.path.isfile(uff_path):
158 |         #         uff_path = common.model_to_uff(model_path, output_names, plugin_map=plugin_map)
159 |         #     parser.register_input("input_1", (3, 416, 416))
160 |         #     parser.register_output("conv2d_59/BiasAdd")
161 |         #     parser.register_output("conv2d_67/BiasAdd")
162 |         #     parser.register_output("conv2d_75/BiasAdd")
163 |         #     parser.parse(uff_path, network)
164 |         #     engine = builder.build_cuda_engine(network)
165 |         #     if not engine:
166 |         #         raise TypeError("Build engine failed.")
167 |         #     with open(engine_path, 'wb') as f:
168 |         #         f.write(engine.serialize())
169 |         #     return engine
170 | 
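171 |     # _build_engine looks for an .onnx file derived from model_path (via
172 |     # common.model_path_to_onnx_path); model_data/convert_model.py produces it.
173 |     # A minimal sketch of that conversion, assuming the keras2onnx 1.6 route from
174 |     # the test environment (paths illustrative; convert_model.py may differ):
175 |     #   import keras2onnx
176 |     #   from tensorflow.keras.models import load_model
177 |     #   model = load_model('model_data/yolo.h5')
178 |     #   onnx_model = keras2onnx.convert_keras(model, model.name)
179 |     #   keras2onnx.save_model(onnx_model, 'model_data/yolo.onnx')
180 | 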
181 |     def detect_image(self, image):
182 |         if self.model_image_size != (None, None):
183 |             assert self.model_image_size[0] % 32 == 0, 'Multiples of 32 required'
184 |             assert self.model_image_size[1] % 32 == 0, 'Multiples of 32 required'
185 |             boxed_image = letterbox_image(image, tuple(reversed(self.model_image_size)))
186 |         else:
187 |             new_image_size = (image.width - (image.width % 32),
188 |                               image.height - (image.height % 32))
189 |             boxed_image = letterbox_image(image, new_image_size)
190 |         image_data = np.array(boxed_image, dtype='float32', order='C')
191 | 
192 |         image_data /= 255.
193 |         # image_data = np.ascontiguousarray(np.transpose(image_data, [2, 0, 1]))
194 |         image_data = np.expand_dims(image_data, 0)  # Add batch dimension.
195 |         input_image_size = np.array([image.size[1], image.size[0]])  # (height, width) of the original image
196 |         input_image_size = np.expand_dims(input_image_size, 0)
197 | 
198 |         # np.copyto(inputs[0].host, image_data.transpose(0, 3, 1, 2).ravel())
199 |         self.inputs[0].host = image_data
200 |         # common.do_inference returns a list of flat output arrays; YOLOv3 has three
201 |         # output scales here, reshaped below to their NHWC grid shapes.
202 |         yolo_output = common.do_inference(self.context, bindings=self.bindings, inputs=self.inputs,
203 |                                           outputs=self.outputs,
204 |                                           stream=self.stream)
205 |         yolo_output[0] = yolo_output[0].reshape(1, 13, 13, 255)  # 255 = 3 anchors * (80 classes + 5)
206 |         yolo_output[1] = yolo_output[1].reshape(1, 26, 26, 255)
207 |         yolo_output[2] = yolo_output[2].reshape(1, 52, 52, 255)
208 | 
209 |         out_boxes, out_scores, out_classes = yolo_post_process(yolo_output, self.anchors,
210 |                                                                self.classes_num, input_image_size,
211 |                                                                score_threshold=self.score, iou_threshold=self.iou)
212 | 
213 |         out_boxes = out_boxes[0]
214 |         out_scores = out_scores[0]
215 |         out_classes = out_classes[0]
216 |         print('Found {} boxes for {}'.format(len(out_boxes), 'img'))
217 |         return out_boxes, out_scores, out_classes
218 | 
219 |     def close_session(self):
220 |         pass
221 | 
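222 | # Usage sketch: the engine is deserialized from disk if present, otherwise built
223 | # from the .onnx file and cached (image path illustrative):
224 | #   from PIL import Image
225 | #   yolo = YOLO(model_path='model_data/tensorrt_model/yolo_int8.engine', infer_mode='int8')
226 | #   boxes, scores, classes = yolo.detect_image(Image.open('test.jpg'))
227 | 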
--------------------------------------------------------------------------------
/yolo_test.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | from utils import detect_video, detect_image, Drawer
3 | from PIL import Image
4 | 
5 | 
6 | # test
7 | def detect_img(yolo):
8 |     import cv2
9 |     import numpy as np
10 |     # while True:
11 |     img = input('Input image filename:')
12 |     if not img:
13 |         return
14 |     # dataset = open("dataset/2012_val.txt")
15 |     # for line in dataset.readlines():
16 |     #     img = line.split()[0]
17 |     try:
18 |         image = Image.open(img)
19 |         img_origin = image.copy()
20 |     except IOError:
21 |         print('Open Error! Try again!')
22 |         exit()
23 |     else:
24 |         out_boxes, out_scores, out_classes = yolo.detect_image(image)
25 |         r_image = Drawer().draw_boxes(image, out_boxes, out_scores, out_classes)  # Drawer, as used in utils.detect_video
26 |         img_output = np.concatenate([np.asarray(img_origin), np.asarray(r_image)], axis=1)
27 |         cv2.imshow("test", cv2.cvtColor(img_output, cv2.COLOR_RGB2BGR))
28 |         if cv2.waitKey() & 0xFF == ord('q'):
29 |             exit()
30 | 
31 | 
32 | # FLAGS = None
33 | 
34 | if __name__ == '__main__':
35 |     # class YOLO defines the default values, so suppress any default here
36 |     parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
37 |     '''
38 |     Command line options
39 |     '''
40 |     parser.add_argument(
41 |         '--model_path', type=str,
42 |         help='path to model weight file'
43 |     )
44 | 
45 |     parser.add_argument(
46 |         '--anchors_path', type=str,
47 |         help='path to anchor definitions'
48 |     )
49 | 
50 |     parser.add_argument(
51 |         '--classes_path', type=str,
52 |         help='path to class definitions'
53 |     )
54 | 
55 |     parser.add_argument(
56 |         '--gpu_num', type=int,
57 |         help='number of GPUs to use'
58 |     )
59 | 
60 |     parser.add_argument(
61 |         '--platform', type=str, required=False, default='tensorrt',
62 |         help='Inference platform: tensorflow or tensorrt'
63 |     )
64 |     '''
65 |     image detection mode
66 |     '''
67 |     parser.add_argument(
68 |         "--image_path", type=str, required=False,
69 |         help="Image input path"
70 |     )
71 | 
72 |     parser.add_argument(
73 |         "--image_output_path", default='', type=str, required=False,
74 |         help="Image output path"
75 |     )
76 |     '''
77 |     video detection mode
78 |     '''
79 | 
80 |     parser.add_argument(
81 |         "--video_path", type=str, required=False,
82 |         help="Video input path"
83 |     )
84 | 
85 |     parser.add_argument(
86 |         "--video_output_path", default='', type=str, required=False,
87 |         help="Video output path"
88 |     )
89 | 
90 |     parser.add_argument(
91 |         "--live", nargs='?', required=False,
92 |         help="Live mode"
93 |     )
94 | 
95 |     FLAGS = parser.parse_args()
96 |     if FLAGS.platform == 'tensorrt':
97 |         from yolo_tensorrt import YOLO
98 |     else:
99 |         from yolo_keras import YOLO
100 | 
101 |     if "image_path" in FLAGS:
102 |         """
103 |         Image detection mode, disregard any remaining command line arguments
104 |         """
105 |         print("Image detection mode")
106 |         detect_image(YOLO(**vars(FLAGS)), FLAGS.image_path, FLAGS.image_output_path)
107 |     elif "video_path" in FLAGS:
108 |         print("local video mode")
109 |         detect_video(YOLO(**vars(FLAGS)), FLAGS.video_path, FLAGS.video_output_path)
110 |     elif "live" in FLAGS:
111 |         print("live mode")
112 |         detect_video(YOLO(**vars(FLAGS)), 0)
113 |     else:
114 |         print("Must specify at least --image_path, --video_path or --live. See usage with --help.")
115 | 
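116 | # Example invocations (paths are illustrative; defaults live in the YOLO classes):
117 | #   python3 yolo_test.py --platform=tensorrt --model_path=model_data/tensorrt_model/yolo_int8.engine --live
118 | #   python3 yolo_test.py --platform=tensorflow --model_path=model_data/yolo.h5 --image_path=test.jpg
119 | 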
--------------------------------------------------------------------------------