├── README.md
├── font
    ├── FiraMono-Medium.otf
    └── SIL Open Font License.txt
├── model_data
    ├── convert_model.py
    └── yolo_anchors.txt
├── model_keras.py
├── post_process.py
├── tensorrt_util
    ├── __init__.py
    ├── __pycache__
    │   ├── __init__.cpython-35.pyc
    │   ├── tensorrt_common.cpython-35.pyc
    │   └── yolo_calibrator.cpython-35.pyc
    ├── tensorrt_common.py
    └── yolo_calibrator.py
├── utils.py
├── yolo_keras.py
├── yolo_tensorrt.py
└── yolo_test.py


/README.md:
--------------------------------------------------------------------------------
 1 | # Yolov3 on TensorRT7.0 and Tensorflow2.0
 2 | 
 3 | This repository contains Yolov3 inference on tensorrt7.0 and tensorflow 2.0.
 4 | Model is based on [darknet](https://pjreddie.com/darknet/yolo/), adapt it to 
 5 | keras and tensorrt platform.
 6 | 
 7 | ### Test environments
 8 |     ubuntu 16.04
 9 |     Tensorflow 1.4,1.5 and 2.0
10 |     TensorRT 7.0
11 |     Nvidia-driver 410.78
12 |     cuda 10.0
13 |     cudnn 7.6.5
14 |     python 3.5.2
15 |     
16 | optional:
17 | 
18 |     keras2onnx 1.6
19 |     onnx 1.6
20 |     
21 | ### Models
22 | - Keras model  
23 | Keras model is borrowed from [keras-yolo3](https://github.com/qqwweee/keras-yolo3), which 
24 | contains a detailed description about how to generate .h5 model. 
25 | - TensorRT engine  
26 | TensorRT engine are generated based on Keras model. In this case, Keras model is 
27 | converted to onnx format and then used to generate TensorRT engine.  
28 | 
29 | Keras model can be converted to onnx as:
30 | 
31 |     python3 model_data/convert_model.py --model_path=your_dir/model.h5 --output_path=output_dir --type=onnx
32 | 
33 | then specify model path in yolo_tensorrt.py or yolo_test.py, TensorRT engine will be
34 | generated after running yolo_test.py.
35 | 
36 | For int8 engine, calibrate images should be prepared and specify images path in yolo_tensorrt.py
37 | 
38 | - Download engine  
39 | Or Download engine directly(waiting for upload).
40 | 
41 | ### Run test
42 | 
43 |     python3 yolo_test.py --model_path=model_data/yolo_int8.engine --live --platform=tensorrt
44 | 
45 | more detail refer to :
46 | 
47 |     python3 yolo_test.py --help
48 |     
49 | ### Evaluate result
50 | int8 calibration images are 1000 pics selected in val2014
51 | 
52 | Model   | mode | dataset | MAP | MAP (0.5) | MAP(0.75)  
53 | ----  | ----  | --- | --- | --- | ---
54 | Yolov3-416  | raw  | COCOval2014 | 0.315 | 0.561 | 0.319
55 | Yolov3-416  | fp32 | COCOval2014 | 0.315 | 0.561 | 0.319
56 | Yolov3-416  | int8 | COCOval2014 | 0.304 | 0.551 | 0.295
57 |  
58 | As shown above, fp32 model has completely same result as raw model in twice tests.
59 | In my previous tensorrt 6.0.1 practice, the fp32 model has little less mAP than raw model, but
60 | the model is converted through uff.


--------------------------------------------------------------------------------
/font/FiraMono-Medium.otf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mMikaa00/Yolov3-TensorRT-py/90c34d5eee4dc6ef5c853df64f3ef11cb32ac020/font/FiraMono-Medium.otf


--------------------------------------------------------------------------------
/font/SIL Open Font License.txt:
--------------------------------------------------------------------------------
 1 | Copyright (c) 2014, Mozilla Foundation https://mozilla.org/ with Reserved Font Name Fira Mono.
 2 | 
 3 | Copyright (c) 2014, Telefonica S.A.
 4 | 
 5 | This Font Software is licensed under the SIL Open Font License, Version 1.1.
 6 | This license is copied below, and is also available with a FAQ at: http://scripts.sil.org/OFL
 7 | 
 8 | -----------------------------------------------------------
 9 | SIL OPEN FONT LICENSE Version 1.1 - 26 February 2007
10 | -----------------------------------------------------------
11 | 
12 | PREAMBLE
13 | The goals of the Open Font License (OFL) are to stimulate worldwide development of collaborative font projects, to support the font creation efforts of academic and linguistic communities, and to provide a free and open framework in which fonts may be shared and improved in partnership with others.
14 | 
15 | The OFL allows the licensed fonts to be used, studied, modified and redistributed freely as long as they are not sold by themselves. The fonts, including any derivative works, can be bundled, embedded, redistributed and/or sold with any software provided that any reserved names are not used by derivative works. The fonts and derivatives, however, cannot be released under any other type of license. The requirement for fonts to remain under this license does not apply to any document created using the fonts or their derivatives.
16 | 
17 | DEFINITIONS
18 | "Font Software" refers to the set of files released by the Copyright Holder(s) under this license and clearly marked as such. This may include source files, build scripts and documentation.
19 | 
20 | "Reserved Font Name" refers to any names specified as such after the copyright statement(s).
21 | 
22 | "Original Version" refers to the collection of Font Software components as distributed by the Copyright Holder(s).
23 | 
24 | "Modified Version" refers to any derivative made by adding to, deleting, or substituting -- in part or in whole -- any of the components of the Original Version, by changing formats or by porting the Font Software to a new environment.
25 | 
26 | "Author" refers to any designer, engineer, programmer, technical writer or other person who contributed to the Font Software.
27 | 
28 | PERMISSION & CONDITIONS
29 | Permission is hereby granted, free of charge, to any person obtaining a copy of the Font Software, to use, study, copy, merge, embed, modify, redistribute, and sell modified and unmodified copies of the Font Software, subject to the following conditions:
30 | 
31 | 1) Neither the Font Software nor any of its individual components, in Original or Modified Versions, may be sold by itself.
32 | 
33 | 2) Original or Modified Versions of the Font Software may be bundled, redistributed and/or sold with any software, provided that each copy contains the above copyright notice and this license. These can be included either as stand-alone text files, human-readable headers or in the appropriate machine-readable metadata fields within text or binary files as long as those fields can be easily viewed by the user.
34 | 
35 | 3) No Modified Version of the Font Software may use the Reserved Font Name(s) unless explicit written permission is granted by the corresponding Copyright Holder. This restriction only applies to the primary font name as presented to the users.
36 | 
37 | 4) The name(s) of the Copyright Holder(s) or the Author(s) of the Font Software shall not be used to promote, endorse or advertise any Modified Version, except to acknowledge the contribution(s) of the Copyright Holder(s) and the Author(s) or with their explicit written permission.
38 | 
39 | 5) The Font Software, modified or unmodified, in part or in whole, must be distributed entirely under this license, and must not be distributed under any other license. The requirement for fonts to remain under this license does not apply to any document created using the Font Software.
40 | 
41 | TERMINATION
42 | This license becomes null and void if any of the above conditions are not met.
43 | 
44 | DISCLAIMER
45 | THE FONT SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF COPYRIGHT, PATENT, TRADEMARK, OR OTHER RIGHT. IN NO EVENT SHALL THE COPYRIGHT HOLDER BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, INCLUDING ANY GENERAL, SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF THE USE OR INABILITY TO USE THE FONT SOFTWARE OR FROM OTHER DEALINGS IN THE FONT SOFTWARE.


--------------------------------------------------------------------------------
/model_data/convert_model.py:
--------------------------------------------------------------------------------
 1 | import argparse
 2 | import sys
 3 | import os
 4 | import tensorflow as tf
 5 | 
 6 | 
 7 | def optimize_h5_model(model_path, output_path):
 8 |     # tf.enable_eager_execution()
 9 |     from tensorflow.python.compiler.tensorrt import trt_convert as trt
10 |     model_path = os.path.expanduser(model_path)
11 |     output_path = os.path.expanduser(output_path)
12 |     model = tf.keras.models.load_model(model_path)
13 |     name = os.path.basename(model_path)
14 |     name = os.path.splitext(name)[0]
15 |     temp_path = os.path.join('/tmp', name)
16 |     print(temp_path)
17 |     model.save(temp_path, save_format='tf')
18 |     # tf.compat.v1.saved_model.save(model, temp_path)
19 | 
20 |     converter = trt.TrtGraphConverterV2(input_saved_model_dir=temp_path)
21 |     converter.convert()
22 |     converter.save(output_path)
23 | 
24 | 
25 | def freeze_keras_model(model_path, output_path, keep_var_names=None):
26 |     # First freeze the graph and remove training nodes.
27 |     model_path = os.path.expanduser(model_path)
28 |     output_path = os.path.expanduser(output_path)
29 |     sess = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(device_count={'GPU': 0}))  # CPU only
30 |     tf.compat.v1.keras.backend.set_session(sess)
31 |     tf.compat.v1.keras.backend.set_learning_phase(0)
32 |     model = tf.compat.v1.keras.models.load_model(model_path)
33 |     tf.compat.v1.keras.backend.set_learning_phase(0)
34 |     if isinstance(model.input, list):
35 |         input_names = [input.name for input in model.input]
36 |     elif isinstance(model.input, tf.Tensor):
37 |         input_names = [model.input.op.name]
38 |     else:
39 |         raise Exception('No input')
40 |     output_names = [output.op.name for output in model.output]
41 |     freeze_var_names = list(set(v.op.name for v in tf.compat.v1.global_variables()).difference(keep_var_names or []))
42 |     # print(freeze_var_names)
43 |     print(input_names)
44 |     print(output_names)
45 | 
46 |     tf.compat.v1.train.write_graph(sess.graph.as_graph_def(), '.', os.path.join(output_path, 'graph.pbtxt'),
47 |                                    as_text=True)
48 |     frozen_graph = tf.compat.v1.graph_util.convert_variables_to_constants(sess, sess.graph.as_graph_def(), output_names,
49 |                                                                           freeze_var_names)
50 |     frozen_graph = tf.compat.v1.graph_util.remove_training_nodes(frozen_graph)
51 |     # Save the model
52 |     output_path = os.path.join(output_path, 'frozen.pb')
53 |     with open(output_path, "wb") as ofile:
54 |         ofile.write(frozen_graph.SerializeToString())
55 | 
56 | 
57 | def convert_keras2onnx(model_path, output_path):
58 |     import keras2onnx
59 |     import onnx
60 |     # load keras model
61 |     model = tf.compat.v1.keras.models.load_model(model_path)
62 | 
63 |     # convert to onnx model
64 |     print(onnx.defs.onnx_opset_version())
65 |     onnx_model = keras2onnx.convert_keras(model, model.name, target_opset=10)
66 |     onnx_model.graph.input[0].type.tensor_type.shape.dim[0].dim_value = 1
67 |     # # runtime prediction
68 |     output_path = os.path.join(output_path, 'converted.onnx')
69 |     onnx.save_model(onnx_model, output_path)
70 | 
71 | 
72 | def main(args):
73 |     if args.type == 'onnx':
74 |         convert_keras2onnx(args.model_path, args.output_path)
75 |     elif args.type == 'pb':
76 |         freeze_keras_model(args.model_path, args.output_path)
77 |     elif args.type == 'trt':
78 |         optimize_h5_model(args.model_path, args.output_path)
79 |     else:
80 |         print("Type error")
81 | 
82 | 
83 | def parse_arguments():
84 |     parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
85 | 
86 |     parser.add_argument('--model_path', type=str,
87 |                         help='Path of the model to be converted .', default='')
88 |     parser.add_argument('--output_path', type=str,
89 |                         help='Path of the model to be stored .', default='./optimized_model')
90 |     parser.add_argument('--type', type=str,
91 |                         help='Convert model to tyoe: onnx, pb or trt', default='pb')
92 |     return parser.parse_args()
93 | 
94 | 
95 | if __name__ == '__main__':
96 |     main(parse_arguments())
97 | 


--------------------------------------------------------------------------------
/model_data/yolo_anchors.txt:
--------------------------------------------------------------------------------
1 | 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
2 | 


--------------------------------------------------------------------------------
/model_keras.py:
--------------------------------------------------------------------------------
  1 | """YOLO_v3 Model Defined in Keras."""
  2 | 
  3 | from functools import wraps
  4 | 
  5 | import numpy as np
  6 | import tensorflow as tf
  7 | from tensorflow.keras.layers import Conv2D, Add, ZeroPadding2D, UpSampling2D, Concatenate, MaxPooling2D
  8 | from tensorflow.keras.layers import LeakyReLU
  9 | from tensorflow.keras.layers import BatchNormalization
 10 | from tensorflow.keras.models import Model
 11 | from tensorflow.keras.regularizers import l2
 12 | 
 13 | from yolo3.utils import compose
 14 | 
 15 | 
 16 | @wraps(Conv2D)
 17 | def DarknetConv2D(*args, **kwargs):
 18 |     """Wrapper to set Darknet parameters for Convolution2D."""
 19 |     darknet_conv_kwargs = {'kernel_regularizer': l2(5e-4)}
 20 |     darknet_conv_kwargs['padding'] = 'valid' if kwargs.get('strides') == (2, 2) else 'same'
 21 |     darknet_conv_kwargs.update(kwargs)
 22 |     return Conv2D(*args, **darknet_conv_kwargs)
 23 | 
 24 | 
 25 | def DarknetConv2D_BN_Leaky(*args, **kwargs):
 26 |     """Darknet Convolution2D followed by BatchNormalization and LeakyReLU."""
 27 |     no_bias_kwargs = {'use_bias': False}
 28 |     no_bias_kwargs.update(kwargs)
 29 |     return compose(
 30 |         DarknetConv2D(*args, **no_bias_kwargs),
 31 |         BatchNormalization(),
 32 |         LeakyReLU(alpha=0.1))
 33 | 
 34 | 
 35 | def resblock_body(x, num_filters, num_blocks):
 36 |     '''A series of resblocks starting with a downsampling Convolution2D'''
 37 |     # Darknet uses left and top padding instead of 'same' mode
 38 |     x = ZeroPadding2D(((1, 0), (1, 0)))(x)
 39 |     x = DarknetConv2D_BN_Leaky(num_filters, (3, 3), strides=(2, 2))(x)
 40 |     for i in range(num_blocks):
 41 |         y = compose(
 42 |             DarknetConv2D_BN_Leaky(num_filters // 2, (1, 1)),
 43 |             DarknetConv2D_BN_Leaky(num_filters, (3, 3)))(x)
 44 |         x = Add()([x, y])
 45 |     return x
 46 | 
 47 | 
 48 | def darknet_body(x):
 49 |     '''Darknent body having 52 Convolution2D layers'''
 50 |     x = DarknetConv2D_BN_Leaky(32, (3, 3))(x)
 51 |     x = resblock_body(x, 64, 1)
 52 |     x = resblock_body(x, 128, 2)
 53 |     x = resblock_body(x, 256, 8)
 54 |     x = resblock_body(x, 512, 8)
 55 |     x = resblock_body(x, 1024, 4)
 56 |     return x
 57 | 
 58 | 
 59 | def make_last_layers(x, num_filters, out_filters):
 60 |     '''6 Conv2D_BN_Leaky layers followed by a Conv2D_linear layer'''
 61 |     x = compose(
 62 |         DarknetConv2D_BN_Leaky(num_filters, (1, 1)),
 63 |         DarknetConv2D_BN_Leaky(num_filters * 2, (3, 3)),
 64 |         DarknetConv2D_BN_Leaky(num_filters, (1, 1)),
 65 |         DarknetConv2D_BN_Leaky(num_filters * 2, (3, 3)),
 66 |         DarknetConv2D_BN_Leaky(num_filters, (1, 1)))(x)
 67 |     y = compose(
 68 |         DarknetConv2D_BN_Leaky(num_filters * 2, (3, 3)),
 69 |         DarknetConv2D(out_filters, (1, 1)))(x)
 70 |     return x, y
 71 | 
 72 | 
 73 | def yolo_body(inputs, num_anchors, num_classes):
 74 |     """Create YOLO_V3 model CNN body in Keras."""
 75 |     darknet = Model(inputs, darknet_body(inputs))
 76 |     x, y1 = make_last_layers(darknet.output, 512, num_anchors * (num_classes + 5))
 77 | 
 78 |     x = compose(
 79 |         DarknetConv2D_BN_Leaky(256, (1, 1)),
 80 |         UpSampling2D(2))(x)
 81 |     x = Concatenate()([x, darknet.layers[152].output])
 82 |     x, y2 = make_last_layers(x, 256, num_anchors * (num_classes + 5))
 83 | 
 84 |     x = compose(
 85 |         DarknetConv2D_BN_Leaky(128, (1, 1)),
 86 |         UpSampling2D(2))(x)
 87 |     x = Concatenate()([x, darknet.layers[92].output])
 88 |     x, y3 = make_last_layers(x, 128, num_anchors * (num_classes + 5))
 89 | 
 90 |     return Model(inputs, [y1, y2, y3])
 91 | 
 92 | 
 93 | def tiny_yolo_body(inputs, num_anchors, num_classes):
 94 |     '''Create Tiny YOLO_v3 model CNN body in keras.'''
 95 |     x1 = compose(
 96 |         DarknetConv2D_BN_Leaky(16, (3, 3)),
 97 |         MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'),
 98 |         DarknetConv2D_BN_Leaky(32, (3, 3)),
 99 |         MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'),
100 |         DarknetConv2D_BN_Leaky(64, (3, 3)),
101 |         MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'),
102 |         DarknetConv2D_BN_Leaky(128, (3, 3)),
103 |         MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'),
104 |         DarknetConv2D_BN_Leaky(256, (3, 3)))(inputs)
105 |     x2 = compose(
106 |         MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'),
107 |         DarknetConv2D_BN_Leaky(512, (3, 3)),
108 |         MaxPooling2D(pool_size=(2, 2), strides=(1, 1), padding='same'),
109 |         DarknetConv2D_BN_Leaky(1024, (3, 3)),
110 |         DarknetConv2D_BN_Leaky(256, (1, 1)))(x1)
111 |     y1 = compose(
112 |         DarknetConv2D_BN_Leaky(512, (3, 3)),
113 |         DarknetConv2D(num_anchors * (num_classes + 5), (1, 1)))(x2)
114 | 
115 |     x2 = compose(
116 |         DarknetConv2D_BN_Leaky(128, (1, 1)),
117 |         UpSampling2D(2))(x2)
118 |     y2 = compose(
119 |         Concatenate(),
120 |         DarknetConv2D_BN_Leaky(256, (3, 3)),
121 |         DarknetConv2D(num_anchors * (num_classes + 5), (1, 1)))([x2, x1])
122 | 
123 |     return Model(inputs, [y1, y2])
124 | 
125 | 
126 | def yolo_head(feats, anchors, num_classes, input_shape, calc_loss=False):
127 |     """Convert final layer features to bounding box parameters."""
128 |     num_anchors = len(anchors)
129 |     # Reshape to batch, height, width, num_anchors, box_params.
130 |     anchors_tensor = tf.reshape(tf.constant(anchors), [1, 1, 1, num_anchors, 2])
131 | 
132 |     grid_shape = tf.shape(feats)[1:3]  # height, width
133 |     grid_y = tf.tile(tf.reshape(tf.range(0, limit=grid_shape[0]), [-1, 1, 1, 1]),
134 |                      [1, grid_shape[1], 1, 1])
135 |     grid_x = tf.tile(tf.reshape(tf.range(0, limit=grid_shape[1]), [1, -1, 1, 1]),
136 |                      [grid_shape[0], 1, 1, 1])
137 |     grid = tf.concat([grid_x, grid_y], -1)
138 |     grid = tf.cast(grid, feats.dtype)
139 | 
140 |     feats = tf.reshape(
141 |         feats, [-1, grid_shape[0], grid_shape[1], num_anchors, num_classes + 5])
142 | 
143 |     # Adjust preditions to each spatial grid point and anchor size.
144 |     box_xy = (tf.sigmoid(feats[..., :2]) + grid) / tf.cast(grid_shape[::-1], feats.dtype)
145 |     box_wh = tf.exp(feats[..., 2:4]) * tf.cast(anchors_tensor, feats.dtype) / tf.cast(input_shape[::-1], feats.dtype)
146 |     box_confidence = tf.sigmoid(feats[..., 4:5])
147 |     box_class_probs = tf.sigmoid(feats[..., 5:])
148 | 
149 |     if calc_loss == True:
150 |         return grid, feats, box_xy, box_wh
151 |     return box_xy, box_wh, box_confidence, box_class_probs
152 | 
153 | 
154 | def yolo_correct_boxes(box_xy, box_wh, input_shape, image_shape):
155 |     '''Get corrected boxes'''
156 |     box_yx = box_xy[..., ::-1]
157 |     box_hw = box_wh[..., ::-1]
158 |     input_shape = tf.cast(input_shape, box_yx.dtype)
159 |     image_shape = tf.cast(image_shape, box_yx.dtype)
160 |     new_shape = tf.round(image_shape * tf.reduce_min(input_shape / image_shape))
161 |     offset = (input_shape - new_shape) / 2. / input_shape
162 |     scale = input_shape / new_shape
163 |     box_yx = (box_yx - offset) * scale
164 |     box_hw *= scale
165 | 
166 |     box_mins = box_yx - (box_hw / 2.)
167 |     box_maxes = box_yx + (box_hw / 2.)
168 |     boxes = tf.concat([
169 |         box_mins[..., 0:1],  # y_min
170 |         box_mins[..., 1:2],  # x_min
171 |         box_maxes[..., 0:1],  # y_max
172 |         box_maxes[..., 1:2]  # x_max
173 |     ], -1)
174 |     # Scale boxes back to original image shape.
175 |     boxes *= tf.concat([image_shape, image_shape], -1)
176 |     return boxes
177 | 
178 | 
179 | def yolo_boxes_and_scores(feats, anchors, num_classes, input_shape, image_shape):
180 |     '''Process Conv layer output'''
181 |     box_xy, box_wh, box_confidence, box_class_probs = yolo_head(feats,
182 |                                                                 anchors, num_classes, input_shape)
183 |     boxes = yolo_correct_boxes(box_xy, box_wh, input_shape, image_shape)
184 |     boxes = tf.reshape(boxes, [tf.shape(boxes)[0], -1, 4])
185 |     box_scores = box_confidence * box_class_probs
186 |     box_scores = tf.reshape(box_scores, [tf.shape(box_scores)[0], -1, num_classes])
187 |     return boxes, box_scores
188 | 
189 | 
190 | def yolo_eval(yolo_outputs,
191 |               anchors,
192 |               num_classes,
193 |               image_shape,
194 |               max_boxes=20,
195 |               score_threshold=.6,
196 |               iou_threshold=.5):
197 |     """Evaluate YOLO model on given input and return filtered boxes."""
198 |     num_layers = len(yolo_outputs)
199 |     anchor_mask = [[6, 7, 8], [3, 4, 5], [0, 1, 2]] if num_layers == 3 else [[3, 4, 5], [1, 2, 3]]  # default setting
200 |     input_shape = tf.shape(yolo_outputs[0])[1:3] * 32
201 |     boxes = []
202 |     box_scores = []
203 |     for l in range(num_layers):
204 |         _boxes, _box_scores = yolo_boxes_and_scores(yolo_outputs[l],
205 |                                                     anchors[anchor_mask[l]], num_classes, input_shape, image_shape)
206 |         boxes.append(_boxes)
207 |         box_scores.append(_box_scores)
208 |     # boxes = tf.concat(boxes, axis=1)
209 |     # box_scores = tf.concat(box_scores, axis=1)
210 |     boxes = np.concatenate(boxes, axis=1)
211 |     box_scores = np.concatenate(box_scores, axis=1)
212 |     # print(box_scores.shape)
213 |     # print(boxes.shape)
214 | 
215 |     boxes_ = []
216 |     scores_ = []
217 |     classes_ = []
218 |     for single_boxes, single_box_scores in zip(boxes, box_scores):
219 |         mask = single_box_scores >= score_threshold
220 |         max_boxes_tensor = tf.constant(max_boxes, dtype='int32')
221 |         single_boxes_ = []
222 |         single_scores_ = []
223 |         single_classes_ = []
224 |         for c in range(num_classes):
225 |             pass
226 |             # TODO: use keras backend instead of tf.
227 |             # class_boxes = tf.boolean_mask(single_boxes, mask[..., c])
228 |             # class_box_scores = tf.boolean_mask(single_box_scores[..., c], mask[..., c])
229 |             class_boxes = single_boxes[mask[..., c]]
230 |             class_box_scores = single_box_scores[..., c][mask[..., c]]
231 |             nms_index = tf.image.non_max_suppression(
232 |                 class_boxes, class_box_scores, max_boxes_tensor, iou_threshold=iou_threshold)
233 |             class_boxes = tf.gather(class_boxes, nms_index)
234 |             class_box_scores = tf.gather(class_box_scores, nms_index)
235 |             classes = tf.ones_like(class_box_scores, 'int32') * c
236 |             single_boxes_.append(class_boxes)
237 |             single_scores_.append(class_box_scores)
238 |             single_classes_.append(classes)
239 |         single_boxes_ = tf.concat(single_boxes_, axis=0)
240 |         single_scores_ = tf.concat(single_scores_, axis=0)
241 |         single_classes_ = tf.concat(single_classes_, axis=0)
242 |         boxes_.append(single_boxes_)
243 |         scores_.append(single_scores_)
244 |         classes_.append(single_classes_)
245 | 
246 |     # boxes_ = tf.reshape(boxes_, [tf.shape(box_scores)[0], -1, 4])
247 |     # scores_ = tf.reshape(scores_, [tf.shape(box_scores)[0], -1])
248 |     # classes_ = tf.reshape(classes_, [tf.shape(box_scores)[0], -1])
249 | 
250 |     return boxes_, scores_, classes_
251 | 
252 | 
253 | def preprocess_true_boxes(true_boxes, input_shape, anchors, num_classes):
254 |     '''Preprocess true boxes to training input format
255 | 
256 |     Parameters
257 |     ----------
258 |     true_boxes: array, shape=(m, T, 5)
259 |         Absolute x_min, y_min, x_max, y_max, class_id relative to input_shape.
260 |     input_shape: array-like, hw, multiples of 32
261 |     anchors: array, shape=(N, 2), wh
262 |     num_classes: integer
263 | 
264 |     Returns
265 |     -------
266 |     y_true: list of array, shape like yolo_outputs, xywh are reletive value
267 | 
268 |     '''
269 |     assert (true_boxes[..., 4] < num_classes).all(), 'class id must be less than num_classes'
270 |     num_layers = len(anchors) // 3  # default setting
271 |     anchor_mask = [[6, 7, 8], [3, 4, 5], [0, 1, 2]] if num_layers == 3 else [[3, 4, 5], [1, 2, 3]]
272 | 
273 |     true_boxes = np.array(true_boxes, dtype='float32')
274 |     input_shape = np.array(input_shape, dtype='int32')
275 |     boxes_xy = (true_boxes[..., 0:2] + true_boxes[..., 2:4]) // 2
276 |     boxes_wh = true_boxes[..., 2:4] - true_boxes[..., 0:2]
277 |     true_boxes[..., 0:2] = boxes_xy / input_shape[::-1]
278 |     true_boxes[..., 2:4] = boxes_wh / input_shape[::-1]
279 | 
280 |     m = true_boxes.shape[0]
281 |     grid_shapes = [input_shape // {0: 32, 1: 16, 2: 8}[l] for l in range(num_layers)]
282 |     y_true = [np.zeros((m, grid_shapes[l][0], grid_shapes[l][1], len(anchor_mask[l]), 5 + num_classes),
283 |                        dtype='float32') for l in range(num_layers)]
284 | 
285 |     # Expand dim to apply broadcasting.
286 |     anchors = np.expand_dims(anchors, 0)
287 |     anchor_maxes = anchors / 2.
288 |     anchor_mins = -anchor_maxes
289 |     valid_mask = boxes_wh[..., 0] > 0
290 | 
291 |     for b in range(m):
292 |         # Discard zero rows.
293 |         wh = boxes_wh[b, valid_mask[b]]
294 |         if len(wh) == 0: continue
295 |         # Expand dim to apply broadcasting.
296 |         wh = np.expand_dims(wh, -2)
297 |         box_maxes = wh / 2.
298 |         box_mins = -box_maxes
299 | 
300 |         intersect_mins = np.maximum(box_mins, anchor_mins)
301 |         intersect_maxes = np.minimum(box_maxes, anchor_maxes)
302 |         intersect_wh = np.maximum(intersect_maxes - intersect_mins, 0.)
303 |         intersect_area = intersect_wh[..., 0] * intersect_wh[..., 1]
304 |         box_area = wh[..., 0] * wh[..., 1]
305 |         anchor_area = anchors[..., 0] * anchors[..., 1]
306 |         iou = intersect_area / (box_area + anchor_area - intersect_area)
307 | 
308 |         # Find best anchor for each true box
309 |         best_anchor = np.argmax(iou, axis=-1)
310 | 
311 |         for t, n in enumerate(best_anchor):
312 |             for l in range(num_layers):
313 |                 if n in anchor_mask[l]:
314 |                     i = np.floor(true_boxes[b, t, 0] * grid_shapes[l][1]).astype('int32')
315 |                     j = np.floor(true_boxes[b, t, 1] * grid_shapes[l][0]).astype('int32')
316 |                     k = anchor_mask[l].index(n)
317 |                     c = true_boxes[b, t, 4].astype('int32')
318 |                     y_true[l][b, j, i, k, 0:4] = true_boxes[b, t, 0:4]
319 |                     y_true[l][b, j, i, k, 4] = 1
320 |                     y_true[l][b, j, i, k, 5 + c] = 1
321 | 
322 |     return y_true
323 | 
324 | 
325 | def box_iou(b1, b2):
326 |     '''Return iou tensor
327 | 
328 |     Parameters
329 |     ----------
330 |     b1: tensor, shape=(i1,...,iN, 4), xywh
331 |     b2: tensor, shape=(j, 4), xywh
332 | 
333 |     Returns
334 |     -------
335 |     iou: tensor, shape=(i1,...,iN, j)
336 | 
337 |     '''
338 | 
339 |     # Expand dim to apply broadcasting.
340 |     b1 = tf.expand_dims(b1, -2)
341 |     b1_xy = b1[..., :2]
342 |     b1_wh = b1[..., 2:4]
343 |     b1_wh_half = b1_wh / 2.
344 |     b1_mins = b1_xy - b1_wh_half
345 |     b1_maxes = b1_xy + b1_wh_half
346 | 
347 |     # Expand dim to apply broadcasting.
348 |     b2 = tf.expand_dims(b2, 0)
349 |     b2_xy = b2[..., :2]
350 |     b2_wh = b2[..., 2:4]
351 |     b2_wh_half = b2_wh / 2.
352 |     b2_mins = b2_xy - b2_wh_half
353 |     b2_maxes = b2_xy + b2_wh_half
354 | 
355 |     intersect_mins = tf.maximum(b1_mins, b2_mins)
356 |     intersect_maxes = tf.minimum(b1_maxes, b2_maxes)
357 |     intersect_wh = tf.maximum(intersect_maxes - intersect_mins, 0.)
358 |     intersect_area = intersect_wh[..., 0] * intersect_wh[..., 1]
359 |     b1_area = b1_wh[..., 0] * b1_wh[..., 1]
360 |     b2_area = b2_wh[..., 0] * b2_wh[..., 1]
361 |     iou = intersect_area / (b1_area + b2_area - intersect_area)
362 | 
363 |     return iou
364 | 
365 | 
366 | def yolo_loss(args, anchors, num_classes, ignore_thresh=.5, print_loss=False):
367 |     '''Return yolo_loss tensor
368 | 
369 |     Parameters
370 |     ----------
371 |     yolo_outputs: list of tensor, the output of yolo_body or tiny_yolo_body
372 |     y_true: list of array, the output of preprocess_true_boxes
373 |     anchors: array, shape=(N, 2), wh
374 |     num_classes: integer
375 |     ignore_thresh: float, the iou threshold whether to ignore object confidence loss
376 | 
377 |     Returns
378 |     -------
379 |     loss: tensor, shape=(1,)
380 | 
381 |     '''
382 |     num_layers = len(anchors) // 3  # default setting
383 |     yolo_outputs = args[:num_layers]
384 |     y_true = args[num_layers:]
385 |     anchor_mask = [[6, 7, 8], [3, 4, 5], [0, 1, 2]] if num_layers == 3 else [[3, 4, 5], [1, 2, 3]]
386 |     input_shape = tf.cast(tf.shape(yolo_outputs[0])[1:3] * 32, y_true[0].dtype)
387 |     grid_shapes = [tf.cast(tf.shape(yolo_outputs[l])[1:3], y_true[0].dtype) for l in range(num_layers)]
388 |     loss = 0
389 |     m = tf.shape(yolo_outputs[0])[0]  # batch size, tensor
390 |     mf = tf.cast(m, yolo_outputs[0].dtype)
391 | 
392 |     for l in range(num_layers):
393 |         object_mask = y_true[l][..., 4:5]
394 |         true_class_probs = y_true[l][..., 5:]
395 | 
396 |         grid, raw_pred, pred_xy, pred_wh = yolo_head(yolo_outputs[l],
397 |                                                      anchors[anchor_mask[l]], num_classes, input_shape, calc_loss=True)
398 |         pred_box = tf.concat([pred_xy, pred_wh], -1)
399 | 
400 |         # Darknet raw box to calculate loss.
401 |         raw_true_xy = y_true[l][..., :2] * grid_shapes[l][::-1] - grid
402 |         raw_true_wh = tf.log(y_true[l][..., 2:4] / anchors[anchor_mask[l]] * input_shape[::-1])
403 |         raw_true_wh = tf.switch(object_mask, raw_true_wh, tf.zeros_like(raw_true_wh))  # avoid log(0)=-inf
404 |         box_loss_scale = 2 - y_true[l][..., 2:3] * y_true[l][..., 3:4]
405 | 
406 |         # Find ignore mask, iterate over each of batch.
407 |         ignore_mask = tf.TensorArray(tf.dtype(y_true[0]), size=1, dynamic_size=True)
408 |         object_mask_bool = tf.cast(object_mask, 'bool')
409 | 
410 |         def loop_body(b, ignore_mask):
411 |             true_box = tf.boolean_mask(y_true[l][b, ..., 0:4], object_mask_bool[b, ..., 0])
412 |             iou = box_iou(pred_box[b], true_box)
413 |             best_iou = tf.max(iou, axis=-1)
414 |             ignore_mask = ignore_mask.write(b, tf.cast(best_iou < ignore_thresh, true_box.dtype))
415 |             return b + 1, ignore_mask
416 | 
417 |         _, ignore_mask = tf.control_flow_ops.while_loop(lambda b, *args: b < m, loop_body, [0, ignore_mask])
418 |         ignore_mask = ignore_mask.stack()
419 |         ignore_mask = tf.expand_dims(ignore_mask, -1)
420 | 
421 |         # tf.binary_crossentropy is helpful to avoid exp overflow.
422 |         xy_loss = object_mask * box_loss_scale * tf.binary_crossentropy(raw_true_xy, raw_pred[..., 0:2],
423 |                                                                         from_logits=True)
424 |         wh_loss = object_mask * box_loss_scale * 0.5 * tf.square(raw_true_wh - raw_pred[..., 2:4])
425 |         confidence_loss = object_mask * tf.binary_crossentropy(object_mask, raw_pred[..., 4:5], from_logits=True) + \
426 |                           (1 - object_mask) * tf.binary_crossentropy(object_mask, raw_pred[..., 4:5],
427 |                                                                      from_logits=True) * ignore_mask
428 |         class_loss = object_mask * tf.binary_crossentropy(true_class_probs, raw_pred[..., 5:], from_logits=True)
429 | 
430 |         xy_loss = tf.sum(xy_loss) / mf
431 |         wh_loss = tf.sum(wh_loss) / mf
432 |         confidence_loss = tf.sum(confidence_loss) / mf
433 |         class_loss = tf.sum(class_loss) / mf
434 |         loss += xy_loss + wh_loss + confidence_loss + class_loss
435 |         if print_loss:
436 |             loss = tf.Print(loss, [loss, xy_loss, wh_loss, confidence_loss, class_loss, tf.sum(ignore_mask)],
437 |                             message='loss: ')
438 |     return loss
439 | 


--------------------------------------------------------------------------------
/post_process.py:
--------------------------------------------------------------------------------
  1 | import numpy as np
  2 | import tensorflow as tf
  3 | from PIL import Image
  4 | 
  5 | 
  6 | def letterbox_image(image, size):
  7 |     '''resize image with unchanged aspect ratio using padding'''
  8 |     iw, ih = image.size
  9 |     w, h = size
 10 |     scale = min(w / iw, h / ih)
 11 |     nw = int(iw * scale)
 12 |     nh = int(ih * scale)
 13 | 
 14 |     image = image.resize((nw, nh), Image.BICUBIC)
 15 |     new_image = Image.new('RGB', size, (128, 128, 128))
 16 |     new_image.paste(image, ((w - nw) // 2, (h - nh) // 2))
 17 |     return new_image
 18 | 
 19 | 
 20 | def yolo_head(feats, anchors, num_classes, input_shape, calc_loss=False):
 21 |     """Convert final layer features to bounding box parameters."""
 22 |     num_anchors = len(anchors)
 23 |     # Reshape to batch, height, width, num_anchors, box_params.
 24 |     anchors_tensor = tf.reshape(tf.constant(anchors), [1, 1, 1, num_anchors, 2])
 25 | 
 26 |     grid_shape = tf.shape(feats)[1:3]  # height, width
 27 |     grid_y = tf.tile(tf.reshape(tf.range(0, limit=grid_shape[0]), [-1, 1, 1, 1]),
 28 |                      [1, grid_shape[1], 1, 1])
 29 |     grid_x = tf.tile(tf.reshape(tf.range(0, limit=grid_shape[1]), [1, -1, 1, 1]),
 30 |                      [grid_shape[0], 1, 1, 1])
 31 |     grid = tf.concat([grid_x, grid_y], -1)
 32 |     grid = tf.cast(grid, feats.dtype)
 33 | 
 34 |     feats = tf.reshape(
 35 |         feats, [-1, grid_shape[0], grid_shape[1], num_anchors, num_classes + 5])
 36 | 
 37 |     # Adjust preditions to each spatial grid point and anchor size.
 38 |     box_xy = (tf.sigmoid(feats[..., :2]) + grid) / tf.cast(grid_shape[::-1], feats.dtype)
 39 |     box_wh = tf.exp(feats[..., 2:4]) * tf.cast(anchors_tensor, feats.dtype) / tf.cast(input_shape[::-1], feats.dtype)
 40 |     box_confidence = tf.sigmoid(feats[..., 4:5])
 41 |     box_class_probs = tf.sigmoid(feats[..., 5:])
 42 | 
 43 |     if calc_loss == True:
 44 |         return grid, feats, box_xy, box_wh
 45 |     return box_xy, box_wh, box_confidence, box_class_probs
 46 | 
 47 | 
 48 | def yolo_correct_boxes(box_xy, box_wh, input_shape, image_shape):
 49 |     '''Get corrected boxes'''
 50 |     box_yx = box_xy[..., ::-1]
 51 |     box_hw = box_wh[..., ::-1]
 52 |     input_shape = tf.cast(input_shape, box_yx.dtype)
 53 |     image_shape = tf.cast(image_shape, box_yx.dtype)
 54 |     new_shape = tf.round(image_shape * tf.reduce_min(input_shape / image_shape))
 55 |     offset = (input_shape - new_shape) / 2. / input_shape
 56 |     scale = input_shape / new_shape
 57 |     box_yx = (box_yx - offset) * scale
 58 |     box_hw *= scale
 59 | 
 60 |     box_mins = box_yx - (box_hw / 2.)
 61 |     box_maxes = box_yx + (box_hw / 2.)
 62 |     boxes = tf.concat([
 63 |         box_mins[..., 0:1],  # y_min
 64 |         box_mins[..., 1:2],  # x_min
 65 |         box_maxes[..., 0:1],  # y_max
 66 |         box_maxes[..., 1:2]  # x_max
 67 |     ], -1)
 68 |     # Scale boxes back to original image shape.
 69 |     boxes *= tf.concat([image_shape, image_shape], -1)
 70 |     return boxes
 71 | 
 72 | 
 73 | def yolo_boxes_and_scores(feats, anchors, num_classes, input_shape, image_shape):
 74 |     '''Process Conv layer output'''
 75 |     box_xy, box_wh, box_confidence, box_class_probs = yolo_head(feats,
 76 |                                                                 anchors, num_classes, input_shape)
 77 |     boxes = yolo_correct_boxes(box_xy, box_wh, input_shape, image_shape)
 78 |     boxes = tf.reshape(boxes, [tf.shape(boxes)[0], -1, 4])
 79 |     box_scores = box_confidence * box_class_probs
 80 |     box_scores = tf.reshape(box_scores, [tf.shape(box_scores)[0], -1, num_classes])
 81 |     return boxes, box_scores
 82 | 
 83 | 
 84 | def yolo_post_process(yolo_outputs,
 85 |                       anchors,
 86 |                       num_classes,
 87 |                       image_shape,
 88 |                       max_boxes=20,
 89 |                       score_threshold=.6,
 90 |                       iou_threshold=.5):
 91 |     """Evaluate YOLO model on given input and return filtered boxes."""
 92 |     num_layers = len(yolo_outputs)
 93 |     anchor_mask = [[6, 7, 8], [3, 4, 5], [0, 1, 2]] if num_layers == 3 else [[3, 4, 5], [1, 2, 3]]  # default setting
 94 |     input_shape = tf.shape(yolo_outputs[0])[1:3] * 32
 95 |     boxes = []
 96 |     box_scores = []
 97 |     for l in range(num_layers):
 98 |         _boxes, _box_scores = yolo_boxes_and_scores(yolo_outputs[l],
 99 |                                                     anchors[anchor_mask[l]], num_classes, input_shape, image_shape)
100 |         boxes.append(_boxes)
101 |         box_scores.append(_box_scores)
102 |     # boxes = tf.concat(boxes, axis=1)
103 |     # box_scores = tf.concat(box_scores, axis=1)
104 |     boxes = np.concatenate(boxes, axis=1)
105 |     box_scores = np.concatenate(box_scores, axis=1)
106 |     # print(box_scores.shape)
107 |     # print(boxes.shape)
108 | 
109 |     boxes_ = []
110 |     scores_ = []
111 |     classes_ = []
112 |     for single_boxes, single_box_scores in zip(boxes, box_scores):
113 |         mask = single_box_scores >= score_threshold
114 |         max_boxes_tensor = tf.constant(max_boxes, dtype='int32')
115 |         single_boxes_ = []
116 |         single_scores_ = []
117 |         single_classes_ = []
118 |         for c in range(num_classes):
119 |             pass
120 |             # TODO: use keras backend instead of tf.
121 |             # class_boxes = tf.boolean_mask(single_boxes, mask[..., c])
122 |             # class_box_scores = tf.boolean_mask(single_box_scores[..., c], mask[..., c])
123 |             class_boxes = single_boxes[mask[..., c]]
124 |             class_box_scores = single_box_scores[..., c][mask[..., c]]
125 |             nms_index = tf.image.non_max_suppression(
126 |                 class_boxes, class_box_scores, max_boxes_tensor, iou_threshold=iou_threshold)
127 |             class_boxes = tf.gather(class_boxes, nms_index)
128 |             class_box_scores = tf.gather(class_box_scores, nms_index)
129 |             classes = tf.ones_like(class_box_scores, 'int32') * c
130 |             single_boxes_.append(class_boxes)
131 |             single_scores_.append(class_box_scores)
132 |             single_classes_.append(classes)
133 |         single_boxes_ = tf.concat(single_boxes_, axis=0)
134 |         single_scores_ = tf.concat(single_scores_, axis=0)
135 |         single_classes_ = tf.concat(single_classes_, axis=0)
136 |         boxes_.append(single_boxes_)
137 |         scores_.append(single_scores_)
138 |         classes_.append(single_classes_)
139 | 
140 |     # boxes_ = tf.reshape(boxes_, [tf.shape(box_scores)[0], -1, 4])
141 |     # scores_ = tf.reshape(scores_, [tf.shape(box_scores)[0], -1])
142 |     # classes_ = tf.reshape(classes_, [tf.shape(box_scores)[0], -1])
143 | 
144 |     return boxes_, scores_, classes_
145 | 


--------------------------------------------------------------------------------
/tensorrt_util/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mMikaa00/Yolov3-TensorRT-py/90c34d5eee4dc6ef5c853df64f3ef11cb32ac020/tensorrt_util/__init__.py


--------------------------------------------------------------------------------
/tensorrt_util/__pycache__/__init__.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mMikaa00/Yolov3-TensorRT-py/90c34d5eee4dc6ef5c853df64f3ef11cb32ac020/tensorrt_util/__pycache__/__init__.cpython-35.pyc


--------------------------------------------------------------------------------
/tensorrt_util/__pycache__/tensorrt_common.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mMikaa00/Yolov3-TensorRT-py/90c34d5eee4dc6ef5c853df64f3ef11cb32ac020/tensorrt_util/__pycache__/tensorrt_common.cpython-35.pyc


--------------------------------------------------------------------------------
/tensorrt_util/__pycache__/yolo_calibrator.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mMikaa00/Yolov3-TensorRT-py/90c34d5eee4dc6ef5c853df64f3ef11cb32ac020/tensorrt_util/__pycache__/yolo_calibrator.cpython-35.pyc


--------------------------------------------------------------------------------
/tensorrt_util/tensorrt_common.py:
--------------------------------------------------------------------------------
 1 | import pycuda.driver as cuda
 2 | import pycuda.autoinit
 3 | import tensorrt as trt
 4 | import uff
 5 | import os
 6 | 
 7 | 
 8 | class HostDeviceMem(object):
 9 |     def __init__(self, host_mem, device_mem):
10 |         self.host = host_mem
11 |         self.device = device_mem
12 | 
13 |     def __str__(self):
14 |         return "Host:\n" + str(self.host) + "\nDevice:\n" + str(self.device)
15 | 
16 |     def __repr__(self):
17 |         return self.__str__()
18 | 
19 | 
20 | def allocate_buffers(engine):
21 |     inputs = []
22 |     outputs = []
23 |     bindings = []
24 |     stream = cuda.Stream()
25 |     for binding in engine:
26 |         size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size
27 |         dtype = trt.nptype(engine.get_binding_dtype(binding))
28 |         # Allocate host and device buffers
29 |         host_mem = cuda.pagelocked_empty(size, dtype)
30 |         device_mem = cuda.mem_alloc(host_mem.nbytes)
31 |         # Append the device buffer to device bindings.
32 |         bindings.append(int(device_mem))
33 |         # Append to the appropriate list.
34 |         if engine.binding_is_input(binding):
35 |             inputs.append(HostDeviceMem(host_mem, device_mem))
36 |         else:
37 |             outputs.append(HostDeviceMem(host_mem, device_mem))
38 |     return inputs, outputs, bindings, stream
39 | 
40 | 
41 | def do_inference(context, bindings, inputs, outputs, stream, batch_size=1):
42 |     # Transfer input data to the GPU.
43 |     [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
44 |     # Run inference.
45 |     context.execute_async(batch_size=batch_size, bindings=bindings, stream_handle=stream.handle)
46 |     # Transfer predictions back from the GPU.
47 |     [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]
48 |     # Synchronize the stream
49 |     stream.synchronize()
50 |     # Return only the host outputs.
51 |     return [out.host for out in outputs]
52 | 
53 | 
54 | # Transforms model path to uff path (e.g. /a/b/c/d.pb -> /a/b/c/d.uff)
55 | def model_path_to_uff_path(model_path):
56 |     uff_path = os.path.splitext(model_path)[0] + ".uff"
57 |     return uff_path
58 | 
59 | 
60 | # Transforms model path to onnx path (e.g. /a/b/c/d.pb -> /a/b/c/d.uff)
61 | def model_path_to_onnx_path(model_path):
62 |     onnx_path = os.path.splitext(model_path)[0] + ".onnx"
63 |     return onnx_path
64 | 
65 | 
66 | # Transforms model path to engine path (e.g. /a/b/c/d.pb -> /a/b/c/d.uff)
67 | def model_path_to_engine_path(model_path, build_type='fp32'):
68 |     if os.path.splitext(model_path)[1] == '.engine':
69 |         return model_path
70 |     engine_path = os.path.splitext(model_path)[0] + '_' + build_type + ".engine"
71 |     return engine_path
72 | 
73 | 
74 | # Converts the TensorFlow frozen graphdef to UFF format using the UFF converter
75 | def model_to_uff(model_path, output_names, plugin_map={}):
76 |     # Transform graph using graphsurgeon to map unsupported TensorFlow
77 |     # operations to appropriate TensorRT custom layer plugins
78 |     import graphsurgeon as gs
79 |     dynamic_graph = gs.DynamicGraph(model_path)
80 |     dynamic_graph.collapse_namespaces(plugin_map)
81 |     # Save resulting graph to UFF file
82 |     output_uff_path = model_path_to_uff_path(model_path)
83 |     uff.from_tensorflow(
84 |         dynamic_graph.as_graph_def(),
85 |         output_names,
86 |         output_filename=output_uff_path,
87 |         text=True
88 |     )
89 |     return output_uff_path
90 | 


--------------------------------------------------------------------------------
/tensorrt_util/yolo_calibrator.py:
--------------------------------------------------------------------------------
 1 | import tensorrt as trt
 2 | import os
 3 | 
 4 | import pycuda.driver as cuda
 5 | import pycuda.autoinit
 6 | from PIL import Image
 7 | from post_process import letterbox_image
 8 | import numpy as np
 9 | 
10 | 
11 | def data_generator(annotation_lines, batch_size, input_shape):
12 |     '''data generator for fit_generator'''
13 |     n = len(annotation_lines)
14 |     count = 0
15 |     i = 0
16 |     while True:
17 |         image_data = []
18 |         for b in range(batch_size):
19 |             count += 1
20 |             if count > 400:
21 |                 return None
22 |             line = annotation_lines[i].split()
23 |             image = Image.open(line[0])
24 |             boxed_image = letterbox_image(image, input_shape)
25 |             image_np = np.array(boxed_image, dtype='float32', order='C')/255.
26 |             # image_np = np.transpose(image_np, [2, 0, 1])
27 |             image_data.append(image_np)
28 |             i = (i+1) % n
29 |         print("Calib count: ", count)
30 |         yield np.ascontiguousarray(image_data)
31 | 
32 | 
33 | class YOLOEntropyCalibrator(trt.IInt8EntropyCalibrator2):
34 |     def __init__(self, data_path, cache_file, input_shape, batch_size=64):
35 |         # Whenever you specify a custom constructor for a TensorRT class,
36 |         # you MUST call the constructor of the parent explicitly.
37 |         trt.IInt8EntropyCalibrator2.__init__(self)
38 | 
39 |         self.cache_file = cache_file
40 | 
41 |         # Every time get_batch is called, the next batch of size batch_size will be copied to the device and returned.
42 | 
43 |         self.batch_size = batch_size
44 |         self.current_index = 0
45 |         self.input_shape = input_shape
46 | 
47 |         # Allocate enough memory for a whole batch.
48 |         self.device_input = cuda.mem_alloc(input_shape[0] * input_shape[1] * 3 * 4 * self.batch_size)
49 |         with open(data_path) as f:
50 |             self.lines = f.readlines()
51 |         self.batches = data_generator(self.lines, self.batch_size, self.input_shape)
52 | 
53 |     def get_batch_size(self):
54 |         return self.batch_size
55 | 
56 |     # TensorRT passes along the names of the engine bindings to the get_batch function.
57 |     # You don't necessarily have to use them, but they can be useful to understand the order of
58 |     # the inputs. The bindings list is expected to have the same ordering as 'names'.
59 |     def get_batch(self, names):
60 |         try:
61 |             # Assume self.batches is a generator that provides batch data.
62 |             data = next(self.batches)
63 |             # Assume that self.device_input is a device buffer allocated by the constructor.
64 |             cuda.memcpy_htod(self.device_input, data)
65 |             return [int(self.device_input)]
66 |         except StopIteration:
67 |             # When we're out of batches, we return either [] or None.
68 |             # This signals to TensorRT that there is no calibration data remaining.
69 |             return None
70 | 
71 |     def read_calibration_cache(self):
72 |         # If there is a cache, use it instead of calibrating again. Otherwise, implicitly return None.
73 |         if os.path.exists(self.cache_file):
74 |             with open(self.cache_file, "rb") as f:
75 |                 return f.read()
76 | 
77 |     def write_calibration_cache(self, cache):
78 |         with open(self.cache_file, "wb") as f:
79 |             f.write(cache)
80 | 


--------------------------------------------------------------------------------
/utils.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | import colorsys
  3 | import numpy as np
  4 | from matplotlib.colors import rgb_to_hsv, hsv_to_rgb
  5 | from PIL import Image, ImageFont, ImageDraw
  6 | from timeit import default_timer as timer
  7 | 
  8 | 
  9 | def rand(a=0, b=1):
 10 |     return np.random.rand() * (b - a) + a
 11 | 
 12 | 
 13 | def get_random_data(annotation_line, input_shape, random=True, max_boxes=20, jitter=.3, hue=.1, sat=1.5, val=1.5,
 14 |                     proc_img=True):
 15 |     '''random preprocessing for real-time data augmentation'''
 16 |     line = annotation_line.split()
 17 |     image = Image.open(line[0])
 18 |     iw, ih = image.size
 19 |     h, w = input_shape
 20 |     box = np.array([np.array(list(map(int, box.split(',')))) for box in line[1:]])
 21 | 
 22 |     if not random:
 23 |         # resize image
 24 |         scale = min(w / iw, h / ih)
 25 |         nw = int(iw * scale)
 26 |         nh = int(ih * scale)
 27 |         dx = (w - nw) // 2
 28 |         dy = (h - nh) // 2
 29 |         image_data = 0
 30 |         if proc_img:
 31 |             image = image.resize((nw, nh), Image.BICUBIC)
 32 |             new_image = Image.new('RGB', (w, h), (128, 128, 128))
 33 |             new_image.paste(image, (dx, dy))
 34 |             image_data = np.array(new_image) / 255.
 35 | 
 36 |         # correct boxes
 37 |         box_data = np.zeros((max_boxes, 5))
 38 |         if len(box) > 0:
 39 |             np.random.shuffle(box)
 40 |             if len(box) > max_boxes: box = box[:max_boxes]
 41 |             box[:, [0, 2]] = box[:, [0, 2]] * scale + dx
 42 |             box[:, [1, 3]] = box[:, [1, 3]] * scale + dy
 43 |             box_data[:len(box)] = box
 44 | 
 45 |         return image_data, box_data
 46 | 
 47 |     # resize image
 48 |     new_ar = w / h * rand(1 - jitter, 1 + jitter) / rand(1 - jitter, 1 + jitter)
 49 |     scale = rand(.25, 2)
 50 |     if new_ar < 1:
 51 |         nh = int(scale * h)
 52 |         nw = int(nh * new_ar)
 53 |     else:
 54 |         nw = int(scale * w)
 55 |         nh = int(nw / new_ar)
 56 |     image = image.resize((nw, nh), Image.BICUBIC)
 57 | 
 58 |     # place image
 59 |     dx = int(rand(0, w - nw))
 60 |     dy = int(rand(0, h - nh))
 61 |     new_image = Image.new('RGB', (w, h), (128, 128, 128))
 62 |     new_image.paste(image, (dx, dy))
 63 |     image = new_image
 64 | 
 65 |     # flip image or not
 66 |     flip = rand() < .5
 67 |     if flip: image = image.transpose(Image.FLIP_LEFT_RIGHT)
 68 | 
 69 |     # distort image
 70 |     hue = rand(-hue, hue)
 71 |     sat = rand(1, sat) if rand() < .5 else 1 / rand(1, sat)
 72 |     val = rand(1, val) if rand() < .5 else 1 / rand(1, val)
 73 |     x = rgb_to_hsv(np.array(image) / 255.)
 74 |     x[..., 0] += hue
 75 |     x[..., 0][x[..., 0] > 1] -= 1
 76 |     x[..., 0][x[..., 0] < 0] += 1
 77 |     x[..., 1] *= sat
 78 |     x[..., 2] *= val
 79 |     x[x > 1] = 1
 80 |     x[x < 0] = 0
 81 |     image_data = hsv_to_rgb(x)  # numpy array, 0 to 1
 82 | 
 83 |     # correct boxes
 84 |     box_data = np.zeros((max_boxes, 5))
 85 |     if len(box) > 0:
 86 |         np.random.shuffle(box)
 87 |         box[:, [0, 2]] = box[:, [0, 2]] * nw / iw + dx
 88 |         box[:, [1, 3]] = box[:, [1, 3]] * nh / ih + dy
 89 |         if flip: box[:, [0, 2]] = w - box[:, [2, 0]]
 90 |         box[:, 0:2][box[:, 0:2] < 0] = 0
 91 |         box[:, 2][box[:, 2] > w] = w
 92 |         box[:, 3][box[:, 3] > h] = h
 93 |         box_w = box[:, 2] - box[:, 0]
 94 |         box_h = box[:, 3] - box[:, 1]
 95 |         box = box[np.logical_and(box_w > 1, box_h > 1)]  # discard invalid box
 96 |         if len(box) > max_boxes: box = box[:max_boxes]
 97 |         box_data[:len(box)] = box
 98 | 
 99 |     return image_data, box_data
100 | 
101 | 
102 | class Drawer(object):
103 |     def __init__(self):
104 |         self.class_names = [
105 |             "person", "bicycle", "car", "motorbike",
106 |             "aeroplane", "bus", "train", "truck",
107 |             "boat", "traffic light", "fire hydrant", "stop sign",
108 |             "parking meter", "bench", "bird", "cat",
109 |             "dog", "horse", "sheep", "cow",
110 |             "elephant", "bear", "zebra", "giraffe",
111 |             "backpack", "umbrella", "handbag", "tie",
112 |             "suitcase", "frisbee", "skis", "snowboard",
113 |             "sports ball", "kite", "baseball bat", "baseball glove",
114 |             "skateboard", "surfboard", "tennis racket", "bottle",
115 |             "wine glass", "cup", "fork", "knife",
116 |             "spoon", "bowl", "banana", "apple",
117 |             "sandwich", "orange", "broccoli", "carrot",
118 |             "hot dog", "pizza", "donut", "cake",
119 |             "chair", "sofa", "pottedplant", "bed",
120 |             "diningtable", "toilet", "tvmonitor", "laptop",
121 |             "mouse", "remote", "keyboard", "cell phone",
122 |             "microwave", "oven", "toaster", "sink",
123 |             "refrigerator", "book", "clock", "vase",
124 |             "scissors", "teddy bear", "hair drier", "toothbrush",
125 |         ]
126 | 
127 |         # Generate colors for drawing bounding boxes.
128 |         self.hsv_tuples = [(x / len(self.class_names), 1., 1.)
129 |                            for x in range(len(self.class_names))]
130 |         self.colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), self.hsv_tuples))
131 |         self.colors = list(
132 |             map(lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)),
133 |                 self.colors))
134 |         np.random.seed(10101)  # Fixed seed for consistent colors across runs.
135 |         np.random.shuffle(self.colors)  # Shuffle colors to decorrelate adjacent classes.
136 |         np.random.seed(None)  # Reset seed to default.
137 | 
138 |     def draw_boxes(self, image, out_boxes, out_scores, out_classes):
139 |         font = ImageFont.truetype(font='font/FiraMono-Medium.otf',
140 |                                   size=np.floor(3e-2 * image.size[1] + 0.5).astype('int32'))
141 |         thickness = (image.size[0] + image.size[1]) // 300
142 | 
143 |         for i, c in reversed(list(enumerate(out_classes))):
144 |             predicted_class = self.class_names[c]
145 |             box = out_boxes[i]
146 |             score = out_scores[i]
147 | 
148 |             label = '{} {:.2f}'.format(predicted_class, score)
149 |             draw = ImageDraw.Draw(image)
150 |             label_size = draw.textsize(label, font)
151 | 
152 |             top, left, bottom, right = box
153 |             top = max(0, np.floor(top + 0.5).astype('int32'))
154 |             left = max(0, np.floor(left + 0.5).astype('int32'))
155 |             bottom = min(image.size[1], np.floor(bottom + 0.5).astype('int32'))
156 |             right = min(image.size[0], np.floor(right + 0.5).astype('int32'))
157 |             print(label, (left, top), (right, bottom))
158 | 
159 |             if top - label_size[1] >= 0:
160 |                 text_origin = np.array([left, top - label_size[1]])
161 |             else:
162 |                 text_origin = np.array([left, top + 1])
163 | 
164 |             # My kingdom for a good redistributable image drawing library.
165 |             for i in range(thickness):
166 |                 draw.rectangle(
167 |                     [left + i, top + i, right - i, bottom - i],
168 |                     outline=self.colors[c])
169 |             draw.rectangle(
170 |                 [tuple(text_origin), tuple(text_origin + label_size)],
171 |                 fill=self.colors[c])
172 |             draw.text(text_origin, label, fill=(0, 0, 0), font=font)
173 |             del draw
174 |         return image
175 | 
176 | 
177 | def detect_image(yolo, image_path, output_path=""):
178 |     import cv2
179 |     image_path = os.path.abspath(image_path)
180 |     if os.path.isdir(image_path):
181 |         files = os.listdir(image_path)
182 |         images = [os.path.join(image_path, file) for file in files]
183 |     elif os.path.isfile(image_path):
184 |         images = image_path
185 |     else:
186 |         print('image_path error.')
187 |         return
188 |     f = open('./time.txt', 'w')
189 |     for image_file in images:
190 |         image = Image.open(image_file)
191 |         start = timer()
192 |         out_boxes, out_scores, out_classes = yolo.detect_image(image)
193 |         end = timer()
194 |         print('inference time: {:.3f}'.format(end - start))
195 |         f.write('{:.3f}\n'.format(end - start))
196 |         drawer = Drawer()
197 |         image = drawer.draw_boxes(image, out_boxes, out_scores, out_classes)
198 |         result = np.asarray(image)
199 |         cv2.namedWindow("result", cv2.WINDOW_NORMAL)
200 |         cv2.imshow("result", cv2.cvtColor(result, cv2.COLOR_RGB2BGR))
201 |         if output_path != "":
202 |             image.save(output_path)
203 |         if cv2.waitKey(1) & 0xFF == ord('q'):
204 |             break
205 |     f.close()
206 |     yolo.close_session()
207 | 
208 | 
209 | def detect_video(yolo, video_path, output_path=""):
210 |     import cv2
211 |     vid = cv2.VideoCapture(video_path)
212 |     if not vid.isOpened():
213 |         raise IOError("Couldn't open webcam or video")
214 |     video_FourCC = int(vid.get(cv2.CAP_PROP_FOURCC))
215 |     video_fps = vid.get(cv2.CAP_PROP_FPS)
216 |     video_size = (int(vid.get(cv2.CAP_PROP_FRAME_WIDTH)),
217 |                   int(vid.get(cv2.CAP_PROP_FRAME_HEIGHT)))
218 |     vid.set(cv2.CAP_PROP_BUFFERSIZE, 1)
219 |     isOutput = True if output_path != "" else False
220 |     if isOutput:
221 |         print("!!! TYPE:", type(output_path), type(video_FourCC), type(video_fps), type(video_size))
222 |         out = cv2.VideoWriter(output_path, video_FourCC, video_fps, video_size)
223 |     accum_time = 0
224 |     curr_fps = 0
225 |     fps = "FPS: ??"
226 |     prev_time = timer()
227 |     drawer = Drawer()
228 |     while True:
229 |         return_value, frame = vid.read()
230 |         image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
231 |         # image = Image.fromarray(frame)
232 |         start = timer()
233 |         out_boxes, out_scores, out_classes = yolo.detect_image(image)
234 |         end = timer()
235 |         print('inference time: {}'.format(end - start))
236 | 
237 |         image = drawer.draw_boxes(image, out_boxes, out_scores, out_classes)
238 |         result = np.asarray(image)
239 |         curr_time = timer()
240 |         exec_time = curr_time - prev_time
241 |         prev_time = curr_time
242 |         accum_time = accum_time + exec_time
243 |         curr_fps = curr_fps + 1
244 |         if accum_time > 1:
245 |             accum_time = accum_time - 1
246 |             fps = "FPS: " + str(curr_fps)
247 |             curr_fps = 0
248 |         cv2.putText(result, text=fps, org=(3, 15), fontFace=cv2.FONT_HERSHEY_SIMPLEX,
249 |                     fontScale=0.50, color=(255, 0, 0), thickness=2)
250 |         cv2.namedWindow("result", cv2.WINDOW_NORMAL)
251 |         cv2.imshow("result", cv2.cvtColor(result, cv2.COLOR_RGB2BGR))
252 |         if isOutput:
253 |             out.write(result)
254 |         if cv2.waitKey(1) & 0xFF == ord('q'):
255 |             break
256 |     yolo.close_session()
257 | 


--------------------------------------------------------------------------------
/yolo_keras.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | import numpy as np
  3 | import tensorflow as tf
  4 | from tensorflow import keras
  5 | from tensorflow.keras.models import load_model
  6 | 
  7 | import os
  8 | from post_process import yolo_post_process, letterbox_image
  9 | from tensorflow.keras.utils import multi_gpu_model
 10 | 
 11 | # fix a memory allocation bug, refer to https://www.tensorflow.org/guide/gpu?hl=zh-CN
 12 | tf.enable_eager_execution()
 13 | gpus = tf.config.experimental.list_physical_devices('GPU')
 14 | if gpus:
 15 |     try:
 16 |         # Currently, memory growth needs to be the same across GPUs
 17 |         for gpu in gpus:
 18 |             tf.config.experimental.set_memory_growth(gpu, True)
 19 |             # tf.config.experimental.set_virtual_device_configuration(
 20 |             #     gpu,
 21 |             #     [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
 22 | 
 23 |         logical_gpus = tf.config.experimental.list_logical_devices('GPU')
 24 |         print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
 25 |     except RuntimeError as e:
 26 |         # Memory growth must be set before GPUs have been initialized
 27 |         print(e)
 28 | 
 29 | 
 30 | # class OutputLayer(keras.layers.Layer):
 31 | #     def __init__(self, anchors, class_names, score, iou, input_image_shape):
 32 | #         super(OutputLayer, self).__init__()
 33 | #         self.anchors = anchors
 34 | #         self.class_names = class_names
 35 | #         self.score = score
 36 | #         self.iou = iou
 37 | #         self.input_image_shape = input_image_shape
 38 | #
 39 | #     def call(self, inputs):
 40 | #         return yolo_eval(inputs, self.anchors,
 41 | #                          len(self.class_names), self.input_image_shape,
 42 | #                          score_threshold=self.score, iou_threshold=self.iou)
 43 | 
 44 | 
 45 | class YOLO(object):
 46 |     _defaults = {
 47 |         "model_path": 'model_data/yolo.h5',
 48 |         "anchors_path": 'model_data/yolo_anchors.txt',
 49 |         "classes_path": 'model_data/coco_classes.txt',
 50 |         "classes_num": 80,
 51 |         "score": 0.6,
 52 |         "iou": 0.45,
 53 |         "model_image_size": (416, 416),
 54 |         "gpu_num": 1,
 55 |     }
 56 | 
 57 |     @classmethod
 58 |     def get_defaults(cls, n):
 59 |         if n in cls._defaults:
 60 |             return cls._defaults[n]
 61 |         else:
 62 |             return "Unrecognized attribute name '" + n + "'"
 63 | 
 64 |     def __init__(self, **kwargs):
 65 |         self.__dict__.update(self._defaults)  # set up default values
 66 |         self.__dict__.update(kwargs)  # and update with user overrides
 67 |         self.anchors = self._get_anchors()
 68 |         self.generate()
 69 | 
 70 |     def _get_anchors(self):
 71 |         anchors_path = os.path.expanduser(self.anchors_path)
 72 |         with open(anchors_path) as f:
 73 |             anchors = f.readline()
 74 |         anchors = [float(x) for x in anchors.split(',')]
 75 |         return np.array(anchors).reshape(-1, 2)
 76 | 
 77 |     def generate(self):
 78 |         model_path = os.path.expanduser(self.model_path)
 79 |         assert model_path.endswith('.h5'), 'Keras model or weights must be a .h5 file.'
 80 |         # loaded = tf.saved_model.load(model_path)
 81 |         # self.inference_func = loaded.signatures["serving_default"]
 82 | 
 83 |         # Load model
 84 |         try:
 85 |             raw_model = load_model(model_path)
 86 |         except:
 87 |             print('load model failed.')
 88 | 
 89 |         print('{} model loaded.'.format(model_path))
 90 | 
 91 |         if self.gpu_num >= 2:
 92 |             raw_model = multi_gpu_model(raw_model, gpus=self.gpu_num)
 93 |         # boxes, scores, classes = OutputLayer(self.anchors, self.class_names, self.score, self.iou, input_image_shape)(
 94 |         #     raw_model.output)
 95 |         # self.yolo_model = keras.Model(inputs=[raw_model.input, input_image_shape], outputs=[boxes, scores, classes])
 96 |         self.yolo_model = raw_model
 97 | 
 98 |     def detect_image(self, image):
 99 |         if self.model_image_size != (None, None):
100 |             assert self.model_image_size[0] % 32 == 0, 'Multiples of 32 required'
101 |             assert self.model_image_size[1] % 32 == 0, 'Multiples of 32 required'
102 |             boxed_image = letterbox_image(image, tuple(reversed(self.model_image_size)))
103 |         else:
104 |             new_image_size = (image.width - (image.width % 32),
105 |                               image.height - (image.height % 32))
106 |             boxed_image = letterbox_image(image, new_image_size)
107 |         image_data = np.array(boxed_image, dtype='float32')
108 | 
109 |         image_data /= 255.
110 |         image_data = np.expand_dims(image_data, 0)  # Add batch dimension.
111 |         input_image_size = np.array([image.size[1], image.size[0]])
112 |         input_image_size = np.expand_dims(input_image_size, 0)
113 | 
114 |         yolo_output = self.yolo_model.predict([image_data], verbose=1)
115 |         # with tf.device("/gpu:0"):
116 |         out_boxes, out_scores, out_classes = yolo_post_process(yolo_output, self.anchors,
117 |                                                                self.classes_num, input_image_size,
118 |                                                                score_threshold=self.score, iou_threshold=self.iou)
119 | 
120 |         out_boxes = out_boxes[0]
121 |         out_scores = out_scores[0]
122 |         out_classes = out_classes[0]
123 |         print('Found {} boxes for {}'.format(len(out_boxes), 'img'))
124 |         return out_boxes, out_scores, out_classes
125 | 
126 |     def close_session(self):
127 |         pass
128 | 


--------------------------------------------------------------------------------
/yolo_tensorrt.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | import numpy as np
  3 | import tensorflow as tf
  4 | import tensorrt as trt
  5 | from tensorrt_util import tensorrt_common as common
  6 | from tensorrt_util.yolo_calibrator import YOLOEntropyCalibrator
  7 | 
  8 | from post_process import yolo_post_process, letterbox_image
  9 | import os
 10 | 
 11 | # fix a memory allocation bug, refer to https://www.tensorflow.org/guide/gpu?hl=zh-CN
 12 | tf.enable_eager_execution()
 13 | gpus = tf.config.experimental.list_physical_devices('GPU')
 14 | if gpus:
 15 |     try:
 16 |         # Currently, memory growth needs to be the same across GPUs
 17 |         for gpu in gpus:
 18 |             # tf.config.experimental.set_virtual_device_configuration(
 19 |             #     gpu,
 20 |             #     [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=512)])
 21 |             tf.config.experimental.set_memory_growth(gpu, True)
 22 |         logical_gpus = tf.config.experimental.list_logical_devices('GPU')
 23 |         print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
 24 |     except RuntimeError as e:
 25 |         # Memory growth must be set before GPUs have been initialized
 26 |         print(e)
 27 | 
 28 | 
 29 | # tf.compat.v1.disable_v2_behavior()
 30 | 
 31 | 
 32 | class YOLO(object):
 33 |     _defaults = {
 34 |         "model_path": 'model_data/tensorrt_model/yolo_int8.engine',
 35 |         "anchors_path": 'model_data/yolo_anchors.txt',
 36 |         "calib_img_path": "./dataset/2012_train.txt",
 37 |         "classes_num": 80,
 38 |         "score": 0.6,
 39 |         "iou": 0.45,
 40 |         "model_image_size": (416, 416),
 41 |         "gpu_num": 1,
 42 |         "infer_mode": "int8",
 43 |     }
 44 | 
 45 |     @classmethod
 46 |     def get_defaults(cls, n):
 47 |         if n in cls._defaults:
 48 |             return cls._defaults[n]
 49 |         else:
 50 |             return "Unrecognized attribute name '" + n + "'"
 51 | 
 52 |     def __init__(self, **kwargs):
 53 |         self.__dict__.update(self._defaults)  # set up default values
 54 |         self.__dict__.update(kwargs)  # and update with user overrides
 55 |         self.anchors = self._get_anchors()
 56 |         self.engine = self._build_engine()
 57 |         self.context = self.engine.create_execution_context()
 58 |         # self.context.active_optimization_profile = 0
 59 |         # self.context.set_binding_shape(0, (1, 416, 416, 3))
 60 |         self.inputs, self.outputs, self.bindings, self.stream = common.allocate_buffers(self.engine)
 61 | 
 62 |     def _get_anchors(self):
 63 |         anchors_path = os.path.expanduser(self.anchors_path)
 64 |         with open(anchors_path) as f:
 65 |             anchors = f.readline()
 66 |         anchors = [float(x) for x in anchors.split(',')]
 67 |         return np.array(anchors).reshape(-1, 2)
 68 | 
 69 |     def _build_engine(self):
 70 |         model_path = os.path.expanduser(self.model_path)
 71 | 
 72 |         TRT_LOGGER = trt.Logger(trt.Logger.INFO)
 73 |         # trt.init_libnvinfer_plugins(TRT_LOGGER, '')
 74 |         # CLIP_PLUGIN_LIBRARY = '/usr/lib/x86_64-linux-gnu/libnvinfer_plugin.so.6.0.1'
 75 |         # if not os.path.isfile(self.plugin_path):
 76 |         #     raise IOError("\n{}\n{}\n{}\n".format(
 77 |         #         "Failed to load library ({}).".format(self.plugin_path),
 78 |         #         "Please build the Clip sample plugin.",
 79 |         #         "For more information, see the included README.md"
 80 |         #     ))
 81 | 
 82 |         # dll = ctypes.CDLL(self.plugin_path)
 83 |         # init = dll.initLibNvInferPlugins
 84 |         # init = dll.initLibYoloInferPlugins
 85 |         # init(None, b'')
 86 | 
 87 |         # plugins = trt.get_plugin_registry().plugin_creator_list
 88 |         # for plugin_creator in plugins:
 89 |         #     print(plugin_creator.name)
 90 | 
 91 |         engine_path = common.model_path_to_engine_path(model_path, self.infer_mode)
 92 |         if os.path.isfile(engine_path):
 93 |             with open(engine_path, 'rb') as f, trt.Runtime(TRT_LOGGER) as runtime:
 94 |                 return runtime.deserialize_cuda_engine(f.read())
 95 | 
 96 |         EXPLICIT_BATCH = 1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
 97 |         with trt.Builder(TRT_LOGGER) as builder, builder.create_network(EXPLICIT_BATCH) as network, trt.OnnxParser(
 98 |                 network, TRT_LOGGER) as parser:
 99 |             builder.max_batch_size = 1
100 |             builder.max_workspace_size = 1 << 28
101 |             if self.infer_mode == 'int8':
102 |                 calibration_cache = "model_data/tensorrt_model/yolo_calibration.cache"
103 |                 calib = YOLOEntropyCalibrator(self.calib_img_path, calibration_cache, self.model_image_size,
104 |                                               batch_size=8)
105 |                 builder.int8_mode = True
106 |                 builder.int8_calibrator = calib
107 | 
108 |             # profile = builder.create_optimization_profile()
109 |             # profile.set_shape('input_1', (1, 416, 416, 3), (1, 416, 416, 3), (1, 416, 416, 3))
110 |             # config.add_optimization_profile(profile)
111 | 
112 |             onnx_path = common.model_path_to_onnx_path(model_path)
113 |             with open(onnx_path, 'rb') as model:
114 |                 print(onnx_path)
115 |                 if not parser.parse(model.read()):
116 |                     raise TypeError("Parser parse failed.")
117 |             # network.get_input(0).shape = (1, 416, 416, 3)
118 |             # print(network.get_output(0).shape)
119 |             # print(network.get_output(1).shape)
120 |             # print(network.get_output(2).shape)
121 | 
122 |             engine = builder.build_cuda_engine(network)
123 |             print(engine.get_binding_name(1))
124 | 
125 |             if not engine:
126 |                 raise TypeError("Build engine failed.")
127 |             with open(engine_path, 'wb') as f:
128 |                 f.write(engine.serialize())
129 |             return engine
130 | 
131 |         # with trt.Builder(TRT_LOGGER) as builder, builder.create_network() as network, trt.UffParser() as parser:
132 |         #     builder.max_batch_size = 1
133 |         #     builder.max_workspace_size = 1 << 29
134 |         #     calibration_cache = "model_data/tensorrt_model/yolo_calibration.cache"
135 |         #     if self.infer_mode == 'int8':
136 |         #         calib = YOLOEntropyCalibrator(self.calib_img_path, calibration_cache, self.model_image_size,
137 |         #                                       batch_size=8)
138 |         #         builder.int8_mode = True
139 |         #         builder.int8_calibrator = calib
140 |         #     output_names = ['conv2d_59/BiasAdd',
141 |         #                     'conv2d_67/BiasAdd',
142 |         #                     'conv2d_75/BiasAdd']
143 |         #     import graphsurgeon as gs
144 |         #     plugin_map = {
145 |         #         # "input_1": gs.create_plugin_node(name="input_1", op="Placeholder", shape=(-1, 416, 416, 3),
146 |         #         #                                  dtype=tf.float32),
147 |         #         # "up_sampling2d_2/ResizeNearestNeighbor": gs.create_plugin_node(
148 |         #         #     name="trt_upsampled2d_2/ResizeNearest_TRT",
149 |         #         #     op="ResizeNearest_TRT",
150 |         #         #     scale=2.0),
151 |         #         # "up_sampling2d_1/ResizeNearestNeighbor": gs.create_plugin_node(
152 |         #         #     name="trt_upsampled2d_1/ResizeNearest_TRT",
153 |         #         #     op="ResizeNearest_TRT",
154 |         #         #     scale=2.0)
155 |         #     }
156 |         #     uff_path = common.model_path_to_uff_path(model_path)
157 |         #     if not os.path.isfile(uff_path):
158 |         #         uff_path = common.model_to_uff(model_path, output_names, plugin_map=plugin_map)
159 |         #     parser.register_input("input_1", (3, 416, 416))
160 |         #     parser.register_output("conv2d_59/BiasAdd")
161 |         #     parser.register_output("conv2d_67/BiasAdd")
162 |         #     parser.register_output("conv2d_75/BiasAdd")
163 |         #     parser.parse(uff_path, network)
164 |         #     engine = builder.build_cuda_engine(network)
165 |         #     if not engine:
166 |         #         raise TypeError("Build engine failed.")
167 |         #     with open(engine_path, 'wb') as f:
168 |         #         f.write(engine.serialize())
169 |         #     return engine
170 | 
171 |     def detect_image(self, image):
172 |         if self.model_image_size != (None, None):
173 |             assert self.model_image_size[0] % 32 == 0, 'Multiples of 32 required'
174 |             assert self.model_image_size[1] % 32 == 0, 'Multiples of 32 required'
175 |             boxed_image = letterbox_image(image, tuple(reversed(self.model_image_size)))
176 |         else:
177 |             new_image_size = (image.width - (image.width % 32),
178 |                               image.height - (image.height % 32))
179 |             boxed_image = letterbox_image(image, new_image_size)
180 |         image_data = np.array(boxed_image, dtype='float32', order='C')
181 | 
182 |         image_data /= 255.
183 |         # image_data = np.ascontiguousarray(np.transpose(image_data, [2, 0, 1]))
184 |         image_data = np.expand_dims(image_data, 0)  # Add batch dimension.
185 |         input_image_size = np.array([image.size[1], image.size[0]])
186 |         input_image_size = np.expand_dims(input_image_size, 0)
187 | 
188 |         # np.copyto(inputs[0].host, image_data.transpose(0, 3, 1, 2).ravel())
189 |         self.inputs[0].host = image_data
190 |         # For more information on performing inference, refer to the introductory samples.
191 |         # The common.do_inference function will return a list of outputs - we only have one in this case.
192 |         yolo_output = common.do_inference(self.context, bindings=self.bindings, inputs=self.inputs,
193 |                                           outputs=self.outputs,
194 |                                           stream=self.stream)
195 |         yolo_output[0] = yolo_output[0].reshape(1, 13, 13, 255)
196 |         yolo_output[1] = yolo_output[1].reshape(1, 26, 26, 255)
197 |         yolo_output[2] = yolo_output[2].reshape(1, 52, 52, 255)
198 | 
199 |         out_boxes, out_scores, out_classes = yolo_post_process(yolo_output, self.anchors,
200 |                                                                self.classes_num, input_image_size,
201 |                                                                score_threshold=self.score, iou_threshold=self.iou)
202 | 
203 |         out_boxes = out_boxes[0]
204 |         out_scores = out_scores[0]
205 |         out_classes = out_classes[0]
206 |         print('Found {} boxes for {}'.format(len(out_boxes), 'img'))
207 |         return out_boxes, out_scores, out_classes
208 | 
209 |     def close_session(self):
210 |         pass
211 | 


--------------------------------------------------------------------------------
/yolo_test.py:
--------------------------------------------------------------------------------
  1 | import argparse
  2 | from utils import detect_video, detect_image
  3 | from PIL import Image
  4 | 
  5 | 
  6 | # test
  7 | def detect_img(yolo):
  8 |     import cv2
  9 |     import numpy as np
 10 |     # while True:
 11 |     img = input('Input image filename:')
 12 |     if not img:
 13 |         return
 14 |     # dataset = open("dataset/2012_val.txt")
 15 |     # for line in dataset.readlines():
 16 |     #     img = line.split()[0]
 17 |     try:
 18 |         image = Image.open(img)
 19 |         img_origin = image.copy()
 20 |     except:
 21 |         print('Open Error! Try again!')
 22 |         exit()
 23 |     else:
 24 |         r_image = yolo.detect_image(image)
 25 |         img_output = np.asarray(r_image)
 26 |         img_output = np.concatenate([np.asarray(img_origin), np.asarray(r_image)], axis=1)
 27 |         cv2.imshow("test", cv2.cvtColor(img_output, cv2.COLOR_RGB2BGR))
 28 |         if cv2.waitKey() & 0xFF == ord('q'):
 29 |             exit()
 30 | 
 31 | 
 32 | # FLAGS = None
 33 | 
 34 | if __name__ == '__main__':
 35 |     # class YOLO defines the default value, so suppress any default here
 36 |     parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
 37 |     '''
 38 |     Command line options
 39 |     '''
 40 |     parser.add_argument(
 41 |         '--model_path', type=str,
 42 |         help='path to model weight file'
 43 |     )
 44 | 
 45 |     parser.add_argument(
 46 |         '--anchors_path', type=str,
 47 |         help='path to anchor definitions'
 48 |     )
 49 | 
 50 |     parser.add_argument(
 51 |         '--classes_path', type=str,
 52 |         help='path to class definitions'
 53 |     )
 54 | 
 55 |     parser.add_argument(
 56 |         '--gpu_num', type=int,
 57 |         help='Number of GPU to use'
 58 |     )
 59 | 
 60 |     parser.add_argument(
 61 |         '--platform', type=str, required=False, default='tensorrt',
 62 |         help='Inference platform: tensorflow or tensorrt'
 63 |     )
 64 |     '''
 65 |     image detection mode
 66 |     '''
 67 |     parser.add_argument(
 68 |         "--image_path", type=str, required=False,
 69 |         help="Image input path"
 70 |     )
 71 | 
 72 |     parser.add_argument(
 73 |         "--image_output_path", default='', type=str, required=False,
 74 |         help="Image output path"
 75 |     )
 76 |     '''
 77 |     video detection mode
 78 |     '''
 79 | 
 80 |     parser.add_argument(
 81 |         "--video_path", type=str, required=False,
 82 |         help="Video input path"
 83 |     )
 84 | 
 85 |     parser.add_argument(
 86 |         "--video_output_path", default='', type=str, required=False,
 87 |         help="Video output path"
 88 |     )
 89 | 
 90 |     parser.add_argument(
 91 |         "--live", nargs='?', required=False,
 92 |         help="Live mode"
 93 |     )
 94 | 
 95 |     FLAGS = parser.parse_args()
 96 |     if FLAGS.platform == 'tensorrt':
 97 |         from yolo_tensorrt import YOLO
 98 |     else:
 99 |         from yolo_keras import YOLO
100 | 
101 |     if "image_path" in FLAGS:
102 |         """
103 |         Image detection mode, disregard any remaining command line arguments
104 |         """
105 |         print("Image detection mode")
106 |         detect_image(YOLO(**vars(FLAGS)), FLAGS.image_path, FLAGS.image_output_path)
107 |     elif "video_path" in FLAGS:
108 |         print("local video mode")
109 |         detect_video(YOLO(**vars(FLAGS)), FLAGS.video_path, FLAGS.video_output_path)
110 |     elif "live" in FLAGS:
111 |         print("live mode")
112 |         detect_video(YOLO(**vars(FLAGS)), 0)
113 |     else:
114 |         print("Must specify at least video_input_path , image_input_path or live.  See usage with --help.")
115 | 


--------------------------------------------------------------------------------