├── nets
    ├── __init__.py
    ├── arcface.py
    ├── arcface_training.py
    ├── mobilefacenet.py
    ├── mobilenet.py
    ├── iresnet.py
    ├── mobilenetv3.py
    └── mobilenetv2.py
├── utils
    ├── __init__.py
    ├── callbacks.py
    ├── dataloader.py
    ├── utils_metrics.py
    └── utils.py
├── datasets
    └── README.md
├── lfw
    └── README.md
├── logs
    └── README.md
├── img
    ├── 1_001.jpg
    ├── 1_002.jpg
    └── 2_001.jpg
├── model_data
    ├── roc.png
    └── arcface_mobilefacenet.h5
├── txt_annotation.py
├── summary.py
├── LICENSE
├── eval_LFW.py
├── predict.py
├── .gitignore
├── README.md
├── arcface.py
├── train.py
└── 常见问题汇总.md


/nets/__init__.py:
--------------------------------------------------------------------------------
1 | #


--------------------------------------------------------------------------------
/utils/__init__.py:
--------------------------------------------------------------------------------
1 | #


--------------------------------------------------------------------------------
/datasets/README.md:
--------------------------------------------------------------------------------
1 | 存放数据集


--------------------------------------------------------------------------------
/lfw/README.md:
--------------------------------------------------------------------------------
1 | 存放lfw数据集


--------------------------------------------------------------------------------
/logs/README.md:
--------------------------------------------------------------------------------
1 | 用于存放训练好的文件


--------------------------------------------------------------------------------
/img/1_001.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bubbliiiing/arcface-keras/HEAD/img/1_001.jpg


--------------------------------------------------------------------------------
/img/1_002.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bubbliiiing/arcface-keras/HEAD/img/1_002.jpg


--------------------------------------------------------------------------------
/img/2_001.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bubbliiiing/arcface-keras/HEAD/img/2_001.jpg


--------------------------------------------------------------------------------
/model_data/roc.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bubbliiiing/arcface-keras/HEAD/model_data/roc.png


--------------------------------------------------------------------------------
/model_data/arcface_mobilefacenet.h5:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bubbliiiing/arcface-keras/HEAD/model_data/arcface_mobilefacenet.h5


--------------------------------------------------------------------------------
/txt_annotation.py:
--------------------------------------------------------------------------------
 1 | #------------------------------------------------#
 2 | #   进行训练前需要利用这个文件生成cls_train.txt
 3 | #------------------------------------------------#
 4 | import os
 5 | 
 6 | if __name__ == "__main__":
 7 |     #---------------------#
 8 |     #   训练集所在的路径
 9 |     #---------------------#
10 |     datasets_path   = "datasets"
11 | 
12 |     types_name      = os.listdir(datasets_path)
13 |     types_name      = sorted(types_name)
14 | 
15 |     list_file = open('cls_train.txt', 'w')
16 |     for cls_id, type_name in enumerate(types_name):
17 |         photos_path = os.path.join(datasets_path, type_name)
18 |         if not os.path.isdir(photos_path):
19 |             continue
20 |         photos_name = os.listdir(photos_path)
21 | 
22 |         for photo_name in photos_name:
23 |             list_file.write(str(cls_id) + ";" + '%s'%(os.path.join(os.path.abspath(datasets_path), type_name, photo_name)))
24 |             list_file.write('\n')
25 |     list_file.close()
26 | 


--------------------------------------------------------------------------------
/summary.py:
--------------------------------------------------------------------------------
 1 | #--------------------------------------------#
 2 | #   该部分代码只用于看网络结构，并非测试代码
 3 | #--------------------------------------------#
 4 | from nets.arcface import arcface
 5 | from utils.utils import net_flops
 6 | 
 7 | if __name__ == "__main__":
 8 |     input_shape = [112, 112]
 9 |     backbone    = "mobilefacenet"
10 |     model       = arcface([input_shape[0], input_shape[1], 3], 10575, backbone=backbone, mode="predict")
11 |     #--------------------------------------------#
12 |     #   查看网络结构网络结构
13 |     #--------------------------------------------#
14 |     model.summary()
15 |     #------------------------------------------#
16 |     #   计算网络的FLOPS
17 |     #--------------------------------------------#
18 |     net_flops(model, table=False)
19 |     
20 |     #--------------------------------------------#
21 |     #   获得网络每个层的名称与序号
22 |     #--------------------------------------------#
23 |     # for i,layer in enumerate(model.layers):
24 |     #     print(i,layer.name)
25 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2022 Bubbliiiing
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/eval_LFW.py:
--------------------------------------------------------------------------------
 1 | from nets.arcface import arcface
 2 | from utils.dataloader import LFWDataset
 3 | from utils.utils_metrics import test
 4 | 
 5 | if __name__ == "__main__":
 6 |     #--------------------------------------#
 7 |     #   主干特征提取网络的选择
 8 |     #   mobilefacenet
 9 |     #   mobilenetv1
10 |     #   mobilenetv2
11 |     #   mobilenetv3
12 |     #   iresnet50
13 |     #--------------------------------------#
14 |     backbone        = "mobilefacenet"
15 |     #--------------------------------------#
16 |     #   输入图像大小
17 |     #--------------------------------------#
18 |     input_shape     = [112, 112, 3]
19 |     #--------------------------------------#
20 |     #   训练好的权值文件
21 |     #--------------------------------------#
22 |     model_path      = "model_data/arcface_mobilefacenet.h5"
23 |     #--------------------------------------#
24 |     #   LFW评估数据集的文件路径
25 |     #   以及对应的txt文件
26 |     #--------------------------------------#
27 |     lfw_dir_path    = "lfw"
28 |     lfw_pairs_path  = "model_data/lfw_pair.txt"
29 |     #--------------------------------------#
30 |     #   评估的批次大小和记录间隔
31 |     #--------------------------------------#
32 |     batch_size      = 256
33 |     log_interval    = 1
34 |     #--------------------------------------#
35 |     #   ROC图的保存路径
36 |     #--------------------------------------#
37 |     png_save_path   = "model_data/roc_test.png"
38 | 
39 |     test_loader     = LFWDataset(dir=lfw_dir_path,pairs_path=lfw_pairs_path, batch_size=batch_size, input_shape=input_shape)
40 | 
41 |     model           = arcface(input_shape, None, backbone=backbone, mode="predict")
42 |     model.load_weights(model_path, by_name=True)
43 | 
44 |     test(test_loader, model, png_save_path, log_interval, batch_size)
45 | 


--------------------------------------------------------------------------------
/predict.py:
--------------------------------------------------------------------------------
 1 | from PIL import Image
 2 | 
 3 | from arcface import Arcface
 4 | 
 5 | if __name__ == "__main__":
 6 |     model = Arcface()
 7 |         
 8 |     #----------------------------------------------------------------------------------------------------------#
 9 |     #   mode用于指定测试的模式：
10 |     #   'predict'   表示单张图片预测，如果想对预测过程进行修改，如保存图片，截取对象等，可以先看下方详细的注释
11 |     #   'fps'       表示测试fps，使用的图片是img里面的street.jpg，详情查看下方注释。
12 |     #----------------------------------------------------------------------------------------------------------#
13 |     mode            = "predict"
14 |     #-------------------------------------------------------------------------#
15 |     #   test_interval   用于指定测量fps的时候，图片检测的次数
16 |     #                   理论上test_interval越大，fps越准确。
17 |     #   fps_test_image  fps测试图片
18 |     #-------------------------------------------------------------------------#
19 |     test_interval   = 100
20 |     fps_test_image  = 'img/1_001.jpg'
21 |     
22 |     if mode == "predict":
23 |         while True:
24 |             image_1 = input('Input image_1 filename:')
25 |             try:
26 |                 image_1 = Image.open(image_1)
27 |             except:
28 |                 print('Image_1 Open Error! Try again!')
29 |                 continue
30 | 
31 |             image_2 = input('Input image_2 filename:')
32 |             try:
33 |                 image_2 = Image.open(image_2)
34 |             except:
35 |                 print('Image_2 Open Error! Try again!')
36 |                 continue
37 |             
38 |             probability = model.detect_image(image_1,image_2)
39 |             print(probability)
40 | 
41 |     elif mode == "fps":
42 |         img = Image.open(fps_test_image)
43 |         tact_time = model.get_FPS(img, test_interval)
44 |         print(str(tact_time) + ' seconds, ' + str(1/tact_time) + 'FPS, @batch_size 1')


--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
  1 | # ignore map, miou, datasets
  2 | map_out/
  3 | miou_out/
  4 | VOCdevkit/
  5 | datasets/
  6 | Medical_Datasets/
  7 | lfw/
  8 | logs/
  9 | model_data/
 10 | 
 11 | # Byte-compiled / optimized / DLL files
 12 | __pycache__/
 13 | *.py[cod]
 14 | *$py.class
 15 | 
 16 | # C extensions
 17 | *.so
 18 | 
 19 | # Distribution / packaging
 20 | .Python
 21 | build/
 22 | develop-eggs/
 23 | dist/
 24 | downloads/
 25 | eggs/
 26 | .eggs/
 27 | lib/
 28 | lib64/
 29 | parts/
 30 | sdist/
 31 | var/
 32 | wheels/
 33 | pip-wheel-metadata/
 34 | share/python-wheels/
 35 | *.egg-info/
 36 | .installed.cfg
 37 | *.egg
 38 | MANIFEST
 39 | 
 40 | # PyInstaller
 41 | #  Usually these files are written by a python script from a template
 42 | #  before PyInstaller builds the exe, so as to inject date/other infos into it.
 43 | *.manifest
 44 | *.spec
 45 | 
 46 | # Installer logs
 47 | pip-log.txt
 48 | pip-delete-this-directory.txt
 49 | 
 50 | # Unit test / coverage reports
 51 | htmlcov/
 52 | .tox/
 53 | .nox/
 54 | .coverage
 55 | .coverage.*
 56 | .cache
 57 | nosetests.xml
 58 | coverage.xml
 59 | *.cover
 60 | *.py,cover
 61 | .hypothesis/
 62 | .pytest_cache/
 63 | 
 64 | # Translations
 65 | *.mo
 66 | *.pot
 67 | 
 68 | # Django stuff:
 69 | *.log
 70 | local_settings.py
 71 | db.sqlite3
 72 | db.sqlite3-journal
 73 | 
 74 | # Flask stuff:
 75 | instance/
 76 | .webassets-cache
 77 | 
 78 | # Scrapy stuff:
 79 | .scrapy
 80 | 
 81 | # Sphinx documentation
 82 | docs/_build/
 83 | 
 84 | # PyBuilder
 85 | target/
 86 | 
 87 | # Jupyter Notebook
 88 | .ipynb_checkpoints
 89 | 
 90 | # IPython
 91 | profile_default/
 92 | ipython_config.py
 93 | 
 94 | # pyenv
 95 | .python-version
 96 | 
 97 | # pipenv
 98 | #   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
 99 | #   However, in case of collaboration, if having platform-specific dependencies or dependencies
100 | #   having no cross-platform support, pipenv may install dependencies that don't work, or not
101 | #   install all needed dependencies.
102 | #Pipfile.lock
103 | 
104 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow
105 | __pypackages__/
106 | 
107 | # Celery stuff
108 | celerybeat-schedule
109 | celerybeat.pid
110 | 
111 | # SageMath parsed files
112 | *.sage.py
113 | 
114 | # Environments
115 | .env
116 | .venv
117 | env/
118 | venv/
119 | ENV/
120 | env.bak/
121 | venv.bak/
122 | 
123 | # Spyder project settings
124 | .spyderproject
125 | .spyproject
126 | 
127 | # Rope project settings
128 | .ropeproject
129 | 
130 | # mkdocs documentation
131 | /site
132 | 
133 | # mypy
134 | .mypy_cache/
135 | .dmypy.json
136 | dmypy.json
137 | 
138 | # Pyre type checker
139 | .pyre/
140 | 


--------------------------------------------------------------------------------
/nets/arcface.py:
--------------------------------------------------------------------------------
 1 | import keras.backend as K
 2 | import tensorflow as tf
 3 | from keras import initializers
 4 | from keras.layers import Input, Lambda, Layer
 5 | from keras.models import Model
 6 | from keras.regularizers import l2
 7 | 
 8 | from nets.iresnet import iResNet50
 9 | from nets.mobilefacenet import mobilefacenet
10 | from nets.mobilenet import MobilenetV1
11 | from nets.mobilenetv2 import MobilenetV2
12 | from nets.mobilenetv3 import MobileNetV3_Large, MobilenetV3_small
13 | 
14 | 
15 | class ArcMarginProduct(Layer) :
16 |     def __init__(self, n_classes=1000, **kwargs) :
17 |         self.n_classes = n_classes
18 |         super(ArcMarginProduct, self).__init__(**kwargs)
19 | 
20 |     def build(self, input_shape) :
21 |         self.W = self.add_weight(name='W',
22 |                                 shape=(input_shape[-1], self.n_classes),
23 |                                 initializer=initializers.glorot_uniform(),
24 |                                 trainable=True,
25 |                                 regularizer=l2(5e-4))
26 |         super(ArcMarginProduct, self).build(input_shape)
27 |         
28 |     def call(self, input) :
29 |         W       = tf.nn.l2_normalize(self.W, axis=0)
30 |         logits  = input @ W
31 |         return K.clip(logits, -1 + K.epsilon(), 1 - K.epsilon())
32 | 
33 |     def compute_output_shape(self, input_shape) :
34 |         return (None, self.n_classes)
35 | 
36 | def arcface(input_shape, num_classes=None, backbone="mobilefacenet", mode="train"):
37 |     inputs = Input(shape=input_shape)
38 | 
39 |     if backbone=="mobilefacenet":
40 |         embedding_size  = 128
41 |         x = mobilefacenet(inputs, embedding_size)
42 |     elif backbone=="mobilenetv1":
43 |         embedding_size  = 512
44 |         x = MobilenetV1(inputs, embedding_size, dropout_keep_prob=0.5)
45 |     elif backbone=="mobilenetv2":
46 |         embedding_size  = 512
47 |         x = MobilenetV2(inputs, embedding_size, dropout_keep_prob=0.5)
48 |     elif backbone=="mobilenetv3":
49 |         embedding_size  = 512
50 |         x = MobileNetV3_Large(inputs, embedding_size, dropout_keep_prob=0.5)
51 |     elif backbone=="iresnet50":
52 |         embedding_size  = 512
53 |         x = iResNet50(inputs, embedding_size, dropout_keep_prob=0.5)
54 |     else:
55 |         raise ValueError('Unsupported backbone - `{}`, Use mobilefacenet, mobilenetv1, mobilenetv2, mobilenetv3, iresnet50.'.format(mode))
56 | 
57 |     if mode == "train":
58 |         predict = Lambda(lambda  x: K.l2_normalize(x, axis=1), name="l2_normalize")(x)
59 |         x       = ArcMarginProduct(num_classes, name="ArcMargin")(predict)
60 |         model   = Model(inputs, [x, predict])
61 |         return model
62 |     elif mode == "predict":
63 |         x       = Lambda(lambda  x: K.l2_normalize(x, axis=1))(x)
64 |         model   = Model(inputs, x)
65 |         return model
66 |     else:
67 |         raise ValueError('Unsupported mode - `{}`, Use train, predict.'.format(mode))
68 | 


--------------------------------------------------------------------------------
/nets/arcface_training.py:
--------------------------------------------------------------------------------
 1 | import math
 2 | from functools import partial
 3 | 
 4 | import keras.backend as K
 5 | import tensorflow as tf
 6 | 
 7 | 
 8 | class ArcFaceLoss() :
 9 |     def __init__(self, s=32.0, m=0.5) :
10 |         self.s = s
11 | 
12 |         self.cos_m      = math.cos(m)
13 |         self.sin_m      = math.sin(m)
14 |         
15 |         self.th         = math.cos(math.pi - m)
16 |         self.mm         = math.sin(math.pi - m) * m
17 | 
18 |     def __call__(self, y_true, y_pred, sample_weight=None):
19 |         labels = tf.cast(y_true, tf.float32)
20 |         cosine = tf.cast(y_pred, tf.float32)
21 |         #----------------------------------------------------#
22 |         #   batch_size, 10575 -> batch_size, 10575
23 |         #----------------------------------------------------#
24 |         sine    = tf.sqrt(1 - tf.square(cosine))
25 |         phi     = cosine * self.cos_m - sine * self.sin_m
26 |         phi     = tf.where(cosine > self.th, phi, cosine - self.mm)
27 | 
28 |         output = (labels * phi) + ((1.0 - labels) * cosine)
29 |         output *= self.s
30 |         
31 |         losses = K.categorical_crossentropy(y_true, output, from_logits=True)
32 |         # losses = tf.Print(losses,[tf.shape(losses),tf.shape(y_true),tf.shape(output)])
33 |         return losses
34 | 
35 | def get_lr_scheduler(lr_decay_type, lr, min_lr, total_iters, warmup_iters_ratio = 0.1, warmup_lr_ratio = 0.1, no_aug_iter_ratio = 0.3, step_num = 10):
36 |     def yolox_warm_cos_lr(lr, min_lr, total_iters, warmup_total_iters, warmup_lr_start, no_aug_iter, iters):
37 |         if iters <= warmup_total_iters:
38 |             # lr = (lr - warmup_lr_start) * iters / float(warmup_total_iters) + warmup_lr_start
39 |             lr = (lr - warmup_lr_start) * pow(iters / float(warmup_total_iters), 2
40 |             ) + warmup_lr_start
41 |         elif iters >= total_iters - no_aug_iter:
42 |             lr = min_lr
43 |         else:
44 |             lr = min_lr + 0.5 * (lr - min_lr) * (
45 |                 1.0
46 |                 + math.cos(
47 |                     math.pi
48 |                     * (iters - warmup_total_iters)
49 |                     / (total_iters - warmup_total_iters - no_aug_iter)
50 |                 )
51 |             )
52 |         return lr
53 | 
54 |     def step_lr(lr, decay_rate, step_size, iters):
55 |         if step_size < 1:
56 |             raise ValueError("step_size must above 1.")
57 |         n       = iters // step_size
58 |         out_lr  = lr * decay_rate ** n
59 |         return out_lr
60 | 
61 |     if lr_decay_type == "cos":
62 |         warmup_total_iters  = min(max(warmup_iters_ratio * total_iters, 1), 3)
63 |         warmup_lr_start     = max(warmup_lr_ratio * lr, 1e-6)
64 |         no_aug_iter         = min(max(no_aug_iter_ratio * total_iters, 1), 15)
65 |         func = partial(yolox_warm_cos_lr ,lr, min_lr, total_iters, warmup_total_iters, warmup_lr_start, no_aug_iter)
66 |     else:
67 |         decay_rate  = (min_lr / lr) ** (1 / (step_num - 1))
68 |         step_size   = total_iters / step_num
69 |         func = partial(step_lr, lr, decay_rate, step_size)
70 | 
71 |     return func
72 | 
73 | 


--------------------------------------------------------------------------------
/nets/mobilefacenet.py:
--------------------------------------------------------------------------------
 1 | import keras
 2 | from keras import backend as K
 3 | from keras import initializers
 4 | from keras.layers import (BatchNormalization, Conv2D, DepthwiseConv2D, PReLU, Flatten,
 5 |                           add)
 6 | 
 7 | 
 8 | def conv_block(inputs, filters, kernel_size, strides, padding):
 9 |     x = Conv2D(filters, kernel_size, strides=strides, padding=padding, use_bias=False, 
10 |                kernel_initializer=initializers.random_normal(stddev=0.1),
11 |                bias_initializer='zeros')(inputs)
12 |     x = BatchNormalization(axis=-1, epsilon=1e-5)(x)
13 |     x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x)
14 |     return x
15 | 
16 | def depthwise_conv_block(inputs, filters, kernel_size, strides):
17 |     x = DepthwiseConv2D(kernel_size, strides=strides, padding="same", use_bias=False,
18 |                         depthwise_initializer=initializers.random_normal(stddev=0.1),
19 |                         bias_initializer='zeros')(inputs)
20 |     x = BatchNormalization(axis=-1, epsilon=1e-5)(x)
21 |     x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x)
22 |     return x
23 | 
24 | def bottleneck(inputs, filters, kernel, t, strides, r=False):
25 |     tchannel = K.int_shape(inputs)[-1] * t
26 |     x = conv_block(inputs, tchannel, 1, 1, "same")
27 | 
28 |     x = DepthwiseConv2D(kernel, strides=strides, padding="same", depth_multiplier=1, use_bias=False,
29 |                         depthwise_initializer=initializers.random_normal(stddev=0.1),
30 |                         bias_initializer='zeros')(x)
31 |     x = BatchNormalization(axis=-1, epsilon=1e-5)(x)
32 |     x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x)
33 |     
34 |     x = Conv2D(filters, 1, strides=1, padding="same", use_bias=False, 
35 |                kernel_initializer=initializers.random_normal(stddev=0.1),
36 |                bias_initializer='zeros')(x)
37 |     x = BatchNormalization(axis=-1, epsilon=1e-5)(x)
38 |     if r:
39 |         x = add([x, inputs])
40 |     return x
41 | 
42 | def inverted_residual_block(inputs, filters, kernel, t, n):
43 |     x = inputs
44 |     for _ in range(n):
45 |         x = bottleneck(x, filters, kernel, t, 1, True)
46 |     return x
47 | 
48 | def mobilefacenet(inputs, embedding_size):
49 |     x = conv_block(inputs, 64, 3, 2, "same")  # Output Shape: (56, 56, 64)
50 |     x = depthwise_conv_block(x, 64, 3, 1)  # (56, 56, 64)
51 | 
52 |     x = bottleneck(x, 64, 3, t=2, strides=2)
53 |     x = inverted_residual_block(x, 64, 3, t=2, n=4)  # (28, 28, 64)
54 | 
55 |     x = bottleneck(x, 128, 3, t=4, strides=2)  # (14, 14, 128)
56 |     x = inverted_residual_block(x, 128, 3, t=2, n=6)  # (14, 14, 128)
57 |     
58 |     x = bottleneck(x, 128, 3, t=4, strides=2)  # (14, 14, 128)
59 |     x = inverted_residual_block(x, 128, 3, t=2, n=2)  # (7, 7, 128)
60 |     
61 |     x = Conv2D(512, 1, use_bias=False, name="conv2d",
62 |                kernel_initializer=initializers.random_normal(stddev=0.1),
63 |                bias_initializer='zeros')(x)
64 |     x = BatchNormalization(epsilon=1e-5)(x)
65 |     x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x)
66 |     
67 |     x = DepthwiseConv2D(int(x.shape[1]), depth_multiplier=1, use_bias=False,
68 |                         depthwise_initializer=initializers.random_normal(stddev=0.1),
69 |                         bias_initializer='zeros')(x)
70 |     x = BatchNormalization(epsilon=1e-5)(x)
71 | 
72 |     x = Conv2D(embedding_size, 1, use_bias=False,
73 |                kernel_initializer=initializers.random_normal(stddev=0.1),
74 |                bias_initializer='zeros')(x)
75 |     x = BatchNormalization(name="embedding", epsilon=1e-5)(x)
76 |     x = Flatten()(x)
77 | 
78 |     return x
79 | 


--------------------------------------------------------------------------------
/nets/mobilenet.py:
--------------------------------------------------------------------------------
 1 | 
 2 | from keras import backend as K
 3 | from keras import initializers
 4 | from keras.layers import (Activation, BatchNormalization, Conv2D, Dense,
 5 |                           DepthwiseConv2D, Dropout, Flatten, PReLU)
 6 | 
 7 | 
 8 | def _conv_block(inputs, filters, kernel=(3, 3), strides=(1, 1)):
 9 |     x = Conv2D(filters, kernel,
10 |                padding='same',
11 |                use_bias=False,
12 |                strides=strides,
13 |                name='conv1',
14 |                kernel_initializer=initializers.random_normal(stddev=0.1),
15 |                bias_initializer='zeros')(inputs)
16 |     x = BatchNormalization(name='conv1_bn', epsilon=1e-5)(x)
17 |     return Activation(relu6, name='conv1_relu')(x)
18 | 
19 | 
20 | def _depthwise_conv_block(inputs, pointwise_conv_filters,
21 |                           depth_multiplier=1, strides=(1, 1), block_id=1):
22 |     x = DepthwiseConv2D((3, 3),
23 |                         padding='same',
24 |                         depth_multiplier=depth_multiplier,
25 |                         strides=strides,
26 |                         use_bias=False,
27 |                         name='conv_dw_%d' % block_id,
28 |                         depthwise_initializer=initializers.random_normal(stddev=0.1),
29 |                         bias_initializer='zeros')(inputs)
30 | 
31 |     x = BatchNormalization(name='conv_dw_%d_bn' % block_id, epsilon=1e-5)(x)
32 |     x = Activation(relu6, name='conv_dw_%d_relu' % block_id)(x)
33 | 
34 |     x = Conv2D(pointwise_conv_filters, (1, 1),
35 |                padding='same',
36 |                use_bias=False,
37 |                strides=(1, 1),
38 |                name='conv_pw_%d' % block_id,
39 |                kernel_initializer=initializers.random_normal(stddev=0.1),
40 |                bias_initializer='zeros')(x)
41 |     x = BatchNormalization(name='conv_pw_%d_bn' % block_id, epsilon=1e-5)(x)
42 |     return Activation(relu6, name='conv_pw_%d_relu' % block_id)(x)
43 | 
44 | def relu6(x):
45 |     return K.relu(x, max_value=6)
46 | 
47 | def MobilenetV1(inputs, embedding_size, dropout_keep_prob=0.5, depth_multiplier=1):
48 |     x = _conv_block(inputs, 32, strides=(1, 1))
49 |     x = _depthwise_conv_block(x, 64, depth_multiplier, block_id=1)
50 | 
51 |     x = _depthwise_conv_block(x, 128, depth_multiplier, strides=(2, 2), block_id=2)
52 |     x = _depthwise_conv_block(x, 128, depth_multiplier, block_id=3)
53 | 
54 |     x = _depthwise_conv_block(x, 256, depth_multiplier, strides=(2, 2), block_id=4)
55 |     x = _depthwise_conv_block(x, 256, depth_multiplier, block_id=5)
56 | 
57 |     x = _depthwise_conv_block(x, 512, depth_multiplier, strides=(2, 2), block_id=6)
58 |     x = _depthwise_conv_block(x, 512, depth_multiplier, block_id=7)
59 |     x = _depthwise_conv_block(x, 512, depth_multiplier, block_id=8)
60 |     x = _depthwise_conv_block(x, 512, depth_multiplier, block_id=9)
61 |     x = _depthwise_conv_block(x, 512, depth_multiplier, block_id=10)
62 |     x = _depthwise_conv_block(x, 512, depth_multiplier, block_id=11)
63 | 
64 |     x = _depthwise_conv_block(x, 1024, depth_multiplier, strides=(2, 2), block_id=12)
65 |     x = _depthwise_conv_block(x, 1024, depth_multiplier, block_id=13)
66 | 
67 |     x = Conv2D(512, kernel_size=1, use_bias=False, name='sep',
68 |                kernel_initializer=initializers.random_normal(stddev=0.1),
69 |                bias_initializer='zeros')(x)
70 |     x = BatchNormalization(name='sep_bn', epsilon=1e-5)(x)
71 |     x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x)
72 | 
73 |     x = BatchNormalization(name='bn2', epsilon=1e-5)(x)
74 |     x = Dropout(p=dropout_keep_prob)(x)
75 |     x = Flatten()(x)
76 |     x = Dense(embedding_size, name='linear',
77 |             kernel_initializer=initializers.random_normal(stddev=0.1),
78 |             bias_initializer='zeros')(x)
79 |     x = BatchNormalization(name='features', epsilon=1e-5)(x)
80 |     return x
81 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | ## Arcface：人脸识别模型在Keras当中的实现
  2 | ---
  3 | 
  4 | ## 目录
  5 | 1. [仓库更新 Top News](#仓库更新)
  6 | 2. [相关仓库 Related code](#相关仓库)
  7 | 3. [性能情况 Performance](#性能情况)
  8 | 4. [所需环境 Environment](#所需环境)
  9 | 5. [注意事项 Attention](#注意事项)
 10 | 6. [文件下载 Download](#文件下载)
 11 | 7. [预测步骤 How2predict](#预测步骤)
 12 | 8. [训练步骤 How2train](#训练步骤)
 13 | 9. [参考资料 Reference](#Reference)
 14 | 
 15 | ## Top News
 16 | **`2022-03`**:**创建仓库，支持不同模型训练，支持大量可调整参数，支持step、cos学习率下降法、支持adam、sgd优化器选择、支持学习率根据batch_size自适应调整、新增图片裁剪。**  
 17 | 
 18 | ## 相关仓库
 19 | | 模型 | 路径 |
 20 | | :----- | :----- |
 21 | facenet | https://github.com/bubbliiiing/facenet-keras
 22 | arcface | https://github.com/bubbliiiing/arcface-keras
 23 | retinaface | https://github.com/bubbliiiing/retinaface-keras
 24 | facenet + retinaface | https://github.com/bubbliiiing/facenet-retinaface-keras
 25 | 
 26 | ## 性能情况
 27 | | 训练数据集 | 权值文件名称 | 测试数据集 | 输入图片大小 | accuracy | Validation rate |
 28 | | :-----: | :-----: | :------: | :------: | :------: | :------: |
 29 | | CASIA-WebFace | [arcface_mobilenet.h5](https://github.com/bubbliiiing/arcface-keras/releases/download/v1.0/arcface_mobilenet.h5) | LFW | 112x112 | 99.00% |  0.95200+-0.02237 @ FAR=0.00100 |
 30 | | CASIA-WebFace | [arcface_mobilefacenet.h5](https://github.com/bubbliiiing/arcface-keras/releases/download/v1.0/arcface_mobilefacenet.h5) | LFW | 112x112 | 99.02% | 0.96500+-0.01344 @ FAR=0.00133 |
 31 | | CASIA-WebFace | [arcface_iresnet50.h5](https://github.com/bubbliiiing/arcface-keras/releases/download/v1.0/arcface_iresnet50.h5) | LFW | 112x112 | 98.98% | 0.92967+-0.01935 @ FAR=0.00133 |
 32 | 
 33 | ## 所需环境
 34 | tensorflow==1.13.2
 35 | keras=2.1.5
 36 | 
 37 | ## 文件下载
 38 | 已经训练好的权值可以在百度网盘下载。    
 39 | 链接：https://pan.baidu.com/s/1P3-T6_PoXGTMYa_VuiwXmw 提取码: 114e
 40 | 
 41 | 训练用的CASIA-WebFaces数据集以及评估用的LFW数据集可以在百度网盘下载。    
 42 | 链接: https://pan.baidu.com/s/1qMxFR8H_ih0xmY-rKgRejw 提取码: bcrq   
 43 | 
 44 | ## 预测步骤
 45 | ### a、使用预训练权重
 46 | 1. 下载完库后解压，可直接运行predict.py输入：
 47 | ```python
 48 | img\1_001.jpg
 49 | img\1_002.jpg
 50 | ```  
 51 | 2. 也可以在百度网盘下载权值，放入model_data，修改arcface.py文件的model_path后，输入：
 52 | ```python
 53 | img\1_001.jpg
 54 | img\1_002.jpg
 55 | ```  
 56 | ### b、使用自己训练的权重
 57 | 1. 按照训练步骤训练。  
 58 | 2. 在arcface.py文件里面，在如下部分修改model_path和backbone使其对应训练好的文件；**model_path对应logs文件夹下面的权值文件，backbone对应主干特征提取网络**。  
 59 | ```python
 60 | _defaults = {
 61 |     #--------------------------------------------------------------------------#
 62 |     #   使用自己训练好的模型进行预测要修改model_path，指向logs文件夹下的权值文件
 63 |     #   训练好后logs文件夹下存在多个权值文件，选择验证集损失较低的即可。
 64 |     #   验证集损失较低不代表准确度较高，仅代表该权值在验证集上泛化性能较好。
 65 |     #--------------------------------------------------------------------------#
 66 |     "model_path"        : "model_data/arcface_mobilefacenet.h5",
 67 |     #-------------------------------------------#
 68 |     #   输入图片的大小。
 69 |     #-------------------------------------------#
 70 |     "input_shape"       : [112, 112, 3],
 71 |     #-------------------------------------------#
 72 |     #   所使用到的主干特征提取网络，与训练的相同
 73 |     #   mobilefacenet
 74 |     #   mobilenetv1
 75 |     #   iresnet50
 76 |     #-------------------------------------------#
 77 |     "backbone"          : "mobilefacenet",
 78 |     #-------------------------------------------#
 79 |     #   是否进行不失真的resize
 80 |     #-------------------------------------------#
 81 |     "letterbox_image"   : True,
 82 | }
 83 | ```
 84 | 3. 运行predict.py，输入  
 85 | ```python
 86 | img\1_001.jpg
 87 | img\1_002.jpg
 88 | ```  
 89 | 
 90 | ## 训练步骤
 91 | 1. 本文使用如下格式进行训练。
 92 | ```
 93 | |-datasets
 94 |     |-people0
 95 |         |-123.jpg
 96 |         |-234.jpg
 97 |     |-people1
 98 |         |-345.jpg
 99 |         |-456.jpg
100 |     |-...
101 | ```  
102 | 2. 下载好数据集，将训练用的CASIA-WebFaces数据集以及评估用的LFW数据集，解压后放在根目录。
103 | 3. 在训练前利用txt_annotation.py文件生成对应的cls_train.txt。  
104 | 4. 利用train.py训练模型，训练前，根据自己的需要选择backbone，model_path和backbone一定要对应。
105 | 5. 运行train.py即可开始训练。
106 | 
107 | ## 评估步骤
108 | 1. 下载好评估数据集，将评估用的LFW数据集，解压后放在根目录
109 | 2. 在eval_LFW.py设置使用的主干特征提取网络和网络权值。
110 | 3. 运行eval_LFW.py来进行模型准确率评估。
111 | 
112 | ## Reference
113 | https://github.com/deepinsight/insightface  
114 | https://github.com/timesler/facenet-pytorch   
115 | 
116 | 


--------------------------------------------------------------------------------
/utils/callbacks.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | 
  3 | import keras
  4 | import matplotlib
  5 | import numpy as np
  6 | matplotlib.use('Agg')
  7 | from matplotlib import pyplot as plt
  8 | import scipy.signal
  9 | from keras import backend as K
 10 | 
 11 | from utils.utils_metrics import evaluate
 12 | 
 13 | 
 14 | class LossHistory(keras.callbacks.Callback):
 15 |     def __init__(self, log_dir):
 16 |         self.log_dir    = log_dir
 17 |         self.losses     = []
 18 |         self.val_loss   = []
 19 |         
 20 |         os.makedirs(self.log_dir)
 21 | 
 22 |     def on_epoch_end(self, epoch, logs={}):
 23 |         if not os.path.exists(self.log_dir):
 24 |             os.makedirs(self.save_path)
 25 | 
 26 |         self.losses.append(logs.get('loss'))
 27 |         self.val_loss.append(logs.get('val_loss'))
 28 |         
 29 |         with open(os.path.join(self.log_dir, "epoch_loss.txt"), 'a') as f:
 30 |             f.write(str(logs.get('loss')))
 31 |             f.write("\n")
 32 |         with open(os.path.join(self.log_dir, "epoch_val_loss.txt"), 'a') as f:
 33 |             f.write(str(logs.get('val_loss')))
 34 |             f.write("\n")
 35 |         self.loss_plot()
 36 | 
 37 |     def loss_plot(self):
 38 |         iters = range(len(self.losses))
 39 | 
 40 |         plt.figure()
 41 |         plt.plot(iters, self.losses, 'red', linewidth = 2, label='train loss')
 42 |         plt.plot(iters, self.val_loss, 'coral', linewidth = 2, label='val loss')
 43 |         try:
 44 |             if len(self.losses) < 25:
 45 |                 num = 5
 46 |             else:
 47 |                 num = 15
 48 |             
 49 |             plt.plot(iters, scipy.signal.savgol_filter(self.losses, num, 3), 'green', linestyle = '--', linewidth = 2, label='smooth train loss')
 50 |             plt.plot(iters, scipy.signal.savgol_filter(self.val_loss, num, 3), '#8B4513', linestyle = '--', linewidth = 2, label='smooth val loss')
 51 |         except:
 52 |             pass
 53 | 
 54 |         plt.grid(True)
 55 |         plt.xlabel('Epoch')
 56 |         plt.ylabel('Loss')
 57 |         plt.title('A Loss Curve')
 58 |         plt.legend(loc="upper right")
 59 | 
 60 |         plt.savefig(os.path.join(self.log_dir, "epoch_loss.png"))
 61 | 
 62 |         plt.cla()
 63 |         plt.close("all")
 64 | 
 65 | 
 66 | class ExponentDecayScheduler(keras.callbacks.Callback):
 67 |     def __init__(self,
 68 |                  decay_rate,
 69 |                  verbose=0):
 70 |         super(ExponentDecayScheduler, self).__init__()
 71 |         self.decay_rate         = decay_rate
 72 |         self.verbose            = verbose
 73 |         self.learning_rates     = []
 74 | 
 75 |     def on_epoch_end(self, batch, logs=None):
 76 |         learning_rate = K.get_value(self.model.optimizer.lr) * self.decay_rate
 77 |         K.set_value(self.model.optimizer.lr, learning_rate)
 78 |         if self.verbose > 0:
 79 |             print('Setting learning rate to %s.' % (learning_rate))
 80 | 
 81 | class LFW_callback(keras.callbacks.Callback):
 82 |     def __init__(self, test_loader):
 83 |         self.test_loader    = test_loader
 84 | 
 85 |     def on_train_begin(self, logs={}):
 86 |         return
 87 |  
 88 |     def on_train_end(self, logs={}):
 89 |         return
 90 |  
 91 |     def on_epoch_begin(self, epoch, logs={}):
 92 |         return
 93 |  
 94 |     def on_epoch_end(self, epoch, logs={}):        
 95 |         labels, distances = [], []
 96 |         print("正在进行LFW数据集测试")
 97 | 
 98 |         for _, (data_a, data_p, label) in enumerate(self.test_loader.generate()):
 99 |             out_a, out_p    = self.model.predict(data_a)[1], self.model.predict(data_p)[1]
100 |             dists           = np.linalg.norm(out_a - out_p, axis=1)
101 | 
102 |             distances.append(dists)
103 |             labels.append(label)
104 | 
105 |         labels      = np.array([sublabel for label in labels for sublabel in label])
106 |         distances   = np.array([subdist for dist in distances for subdist in dist])
107 |         _, _, accuracy, _, _, _, _ = evaluate(distances,labels)
108 |         print('Accuracy: %2.5f+-%2.5f' % (np.mean(accuracy), np.std(accuracy)))
109 |  
110 |     def on_batch_begin(self, batch, logs={}):
111 |         return
112 |  
113 |     def on_batch_end(self, batch, logs={}):
114 |         return   
115 |     
116 | class ParallelModelCheckpoint(keras.callbacks.ModelCheckpoint):
117 |     def __init__(self, model, filepath, monitor='val_loss', verbose=0,
118 |                  save_best_only=False, save_weights_only=False,
119 |                  mode='auto', period=1):
120 |         self.single_model = model
121 |         super(ParallelModelCheckpoint,self).__init__(filepath, monitor, verbose,save_best_only, save_weights_only,mode, period)
122 | 
123 |     def set_model(self, model):
124 |         super(ParallelModelCheckpoint,self).set_model(self.single_model)


--------------------------------------------------------------------------------
/utils/dataloader.py:
--------------------------------------------------------------------------------
  1 | import math
  2 | import os
  3 | 
  4 | import keras
  5 | import numpy as np
  6 | from keras.utils import np_utils
  7 | from PIL import Image
  8 | 
  9 | from .utils import cvtColor, preprocess_input, resize_image
 10 | 
 11 | 
 12 | #------------------------------------#
 13 | #   数据加载器
 14 | #------------------------------------#
 15 | class FacenetDataset(keras.utils.Sequence):
 16 |     def __init__(self, input_shape, lines, batch_size, num_classes, random):
 17 |         self.input_shape    = input_shape
 18 |         self.lines          = lines
 19 |         self.length         = len(lines)
 20 |         self.batch_size     = batch_size
 21 |         self.num_classes    = num_classes
 22 |         self.random         = random
 23 |         
 24 |     def __len__(self):
 25 |         return math.ceil(self.length / float(self.batch_size))
 26 | 
 27 |     def __getitem__(self, index):
 28 |         images  = []
 29 |         labels  = []
 30 |         for i in range(index * self.batch_size, (index + 1) * self.batch_size):  
 31 |             i           = i % self.length
 32 | 
 33 |             annotation_path = self.lines[i].split(';')[1].split()[0]
 34 |             y               = int(self.lines[i].split(';')[0])
 35 | 
 36 |             image = cvtColor(Image.open(annotation_path))
 37 |             #------------------------------------------#
 38 |             #   翻转图像
 39 |             #------------------------------------------#
 40 |             if self.rand()<.5 and self.random: 
 41 |                 image = image.transpose(Image.FLIP_LEFT_RIGHT)
 42 |             image = resize_image(image, [self.input_shape[1], self.input_shape[0]], letterbox_image = True)
 43 |             image = preprocess_input(np.array(image, dtype='float32'))
 44 | 
 45 |             images.append(image)
 46 |             labels.append(y)
 47 | 
 48 |         labels = np_utils.to_categorical(np.array(labels), num_classes=self.num_classes)  
 49 |         return np.array(images, np.float32), np.array(labels, np.float32)
 50 | 
 51 |     def rand(self, a=0, b=1):
 52 |         return np.random.rand()*(b-a) + a
 53 | 
 54 | class LFWDataset():
 55 |     def __init__(self, dir, pairs_path, input_shape, batch_size):
 56 |         super(LFWDataset, self).__init__()
 57 |         self.input_shape        = input_shape
 58 |         self.pairs_path         = pairs_path
 59 |         self.batch_size         = batch_size
 60 |         self.validation_images  = self.get_lfw_paths(dir)
 61 | 
 62 |     def generate(self):
 63 |         images1 = []
 64 |         images2 = []
 65 |         issames = []
 66 |         for annotation_line in self.validation_images:  
 67 |             (path_1, path_2, issame)    = annotation_line
 68 |             image1, image2              = Image.open(path_1), Image.open(path_2)
 69 |             image1 = resize_image(image1, [self.input_shape[1], self.input_shape[0]], letterbox_image = True)
 70 |             image2 = resize_image(image2, [self.input_shape[1], self.input_shape[0]], letterbox_image = True)
 71 |                 
 72 |             image1, image2 = preprocess_input(np.array(image1, np.float32)), preprocess_input(np.array(image2, np.float32))
 73 | 
 74 |             images1.append(image1)
 75 |             images2.append(image2)
 76 |             issames.append(issame)
 77 |             if len(images1) == self.batch_size:
 78 |                 yield np.array(images1), np.array(images2), np.array(issames)
 79 |                 images1     = []
 80 |                 images2     = []
 81 |                 issames     = []
 82 |                 
 83 |         yield np.array(images1), np.array(images2), np.array(issames)
 84 | 
 85 |     def read_lfw_pairs(self,pairs_filename):
 86 |         pairs = []
 87 |         with open(pairs_filename, 'r') as f:
 88 |             for line in f.readlines()[1:]:
 89 |                 pair = line.strip().split()
 90 |                 pairs.append(pair)
 91 |         return np.array(pairs)
 92 | 
 93 |     def get_lfw_paths(self,lfw_dir,file_ext="jpg"):
 94 |         pairs = self.read_lfw_pairs(self.pairs_path)
 95 | 
 96 |         nrof_skipped_pairs = 0
 97 |         path_list = []
 98 |         issame_list = []
 99 | 
100 |         for i in range(len(pairs)):
101 |         #for pair in pairs:
102 |             pair = pairs[i]
103 |             if len(pair) == 3:
104 |                 path0 = os.path.join(lfw_dir, pair[0], pair[0] + '_' + '%04d' % int(pair[1])+'.'+file_ext)
105 |                 path1 = os.path.join(lfw_dir, pair[0], pair[0] + '_' + '%04d' % int(pair[2])+'.'+file_ext)
106 |                 issame = True
107 |             elif len(pair) == 4:
108 |                 path0 = os.path.join(lfw_dir, pair[0], pair[0] + '_' + '%04d' % int(pair[1])+'.'+file_ext)
109 |                 path1 = os.path.join(lfw_dir, pair[2], pair[2] + '_' + '%04d' % int(pair[3])+'.'+file_ext)
110 |                 issame = False
111 |             if os.path.exists(path0) and os.path.exists(path1):    # Only add the pair if both paths exist
112 |                 path_list.append((path0,path1,issame))
113 |                 issame_list.append(issame)
114 |             else:
115 |                 nrof_skipped_pairs += 1
116 |         if nrof_skipped_pairs>0:
117 |             print('Skipped %d image pairs' % nrof_skipped_pairs)
118 | 
119 |         return path_list


--------------------------------------------------------------------------------
/arcface.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | 
  3 | import matplotlib.pyplot as plt
  4 | import numpy as np
  5 | 
  6 | from nets.arcface import arcface
  7 | from utils.utils import preprocess_input, resize_image, show_config
  8 | 
  9 | 
 10 | class Arcface(object):
 11 |     _defaults = {
 12 |         #--------------------------------------------------------------------------#
 13 |         #   使用自己训练好的模型进行预测要修改model_path，指向logs文件夹下的权值文件
 14 |         #   训练好后logs文件夹下存在多个权值文件，选择验证集损失较低的即可。
 15 |         #   验证集损失较低不代表准确度较高，仅代表该权值在验证集上泛化性能较好。
 16 |         #--------------------------------------------------------------------------#
 17 |         "model_path"        : "model_data/arcface_mobilefacenet.h5",
 18 |         #-------------------------------------------#
 19 |         #   输入图片的大小。
 20 |         #-------------------------------------------#
 21 |         "input_shape"       : [112, 112, 3],
 22 |         #-------------------------------------------#
 23 |         #   所使用到的主干特征提取网络，与训练的相同
 24 |         #   mobilefacenet
 25 |         #   mobilenetv1
 26 |         #   mobilenetv2
 27 |         #   mobilenetv3
 28 |         #   iresnet50
 29 |         #-------------------------------------------#
 30 |         "backbone"          : "mobilefacenet",
 31 |         #-------------------------------------------#
 32 |         #   是否进行不失真的resize
 33 |         #-------------------------------------------#
 34 |         "letterbox_image"   : True,
 35 |     }
 36 | 
 37 |     @classmethod
 38 |     def get_defaults(cls, n):
 39 |         if n in cls._defaults:
 40 |             return cls._defaults[n]
 41 |         else:
 42 |             return "Unrecognized attribute name '" + n + "'"
 43 | 
 44 |     #---------------------------------------------------#
 45 |     #   初始化Arcface
 46 |     #---------------------------------------------------#
 47 |     def __init__(self, **kwargs):
 48 |         self.__dict__.update(self._defaults)
 49 |         for name, value in kwargs.items():
 50 |             setattr(self, name, value)
 51 |         self.generate()
 52 |         
 53 |         show_config(**self._defaults)
 54 |         
 55 |     def generate(self):
 56 |         #---------------------------------------------------#
 57 |         #   载入模型与权值
 58 |         #---------------------------------------------------#
 59 |         model_path = os.path.expanduser(self.model_path)
 60 |         assert model_path.endswith('.h5'), 'Keras model or weights must be a .h5 file.'
 61 |         self.model = arcface(self.input_shape, backbone=self.backbone, mode="predict")
 62 | 
 63 |         print('Loading weights into state dict...')
 64 |         self.model.load_weights(self.model_path, by_name=True)
 65 |         print('{} model loaded.'.format(self.model_path))
 66 |     
 67 |     #---------------------------------------------------#
 68 |     #   检测图片
 69 |     #---------------------------------------------------#
 70 |     def detect_image(self, image_1, image_2):
 71 |         image_1 = resize_image(image_1, [self.input_shape[1], self.input_shape[0]], self.letterbox_image)
 72 |         image_2 = resize_image(image_2, [self.input_shape[1], self.input_shape[0]], self.letterbox_image)
 73 |         
 74 |         photo_1 = np.expand_dims(preprocess_input(np.array(image_1, np.float32)), 0)
 75 |         photo_2 = np.expand_dims(preprocess_input(np.array(image_2, np.float32)), 0)
 76 | 
 77 |         #---------------------------------------------------#
 78 |         #   图片传入网络进行预测
 79 |         #---------------------------------------------------#
 80 |         output1 = self.model.predict(photo_1)
 81 |         output2 = self.model.predict(photo_2)
 82 |     
 83 |         #---------------------------------------------------#
 84 |         #   计算二者之间的距离
 85 |         #---------------------------------------------------#
 86 |         l1 = np.linalg.norm(output1-output2, axis=1)
 87 |         # l1 = np.sum(np.square(output1 - output2), axis=-1)
 88 | 
 89 |         plt.subplot(1, 2, 1)
 90 |         plt.imshow(np.array(image_1))
 91 | 
 92 |         plt.subplot(1, 2, 2)
 93 |         plt.imshow(np.array(image_2))
 94 |         plt.text(-12, -12, 'Distance:%.3f' % l1, ha='center', va= 'bottom',fontsize=11)
 95 |         plt.show()
 96 |         return l1
 97 | 
 98 |     def get_FPS(self, image, test_interval):
 99 |         #---------------------------------------------------#
100 |         #   对图片进行不失真的resize
101 |         #---------------------------------------------------#
102 |         image_data  = resize_image(image, [self.input_shape[1], self.input_shape[0]], self.letterbox_image)
103 |         #---------------------------------------------------------#
104 |         #   归一化+添加上batch_size维度
105 |         #---------------------------------------------------------#
106 |         image_data  = np.expand_dims(preprocess_input(np.array(image_data, np.float32)), 0)
107 |         
108 |         #---------------------------------------------------#
109 |         #   图片传入网络进行预测
110 |         #---------------------------------------------------#
111 |         preds       = self.model.predict(image_data)[0]
112 |         import time
113 |         t1 = time.time()
114 |         for _ in range(test_interval):
115 |             #---------------------------------------------------#
116 |             #   图片传入网络进行预测
117 |             #---------------------------------------------------#
118 |             preds       = self.model.predict(image_data)[0]
119 |         t2 = time.time()
120 |         tact_time = (t2 - t1) / test_interval
121 |         return tact_time


--------------------------------------------------------------------------------
/nets/iresnet.py:
--------------------------------------------------------------------------------
  1 | from keras import initializers, layers
  2 | from keras.layers import (BatchNormalization, Conv2D, Dense, Dropout, Flatten,
  3 |                           PReLU, ZeroPadding2D)
  4 | from keras.models import Model
  5 | 
  6 | 
  7 | def identity_block(input_tensor, kernel_size, filters, stage, block):
  8 |     filters1, filters2 = filters
  9 | 
 10 |     conv_name_base = 'res' + str(stage) + block + '_branch'
 11 |     bn_name_base = 'bn' + str(stage) + block + '_branch'
 12 | 
 13 |     x = BatchNormalization(name=bn_name_base + '2', epsilon=1e-5)(input_tensor)
 14 | 
 15 |     #----------------------------#
 16 |     #   减少通道数
 17 |     #----------------------------#
 18 |     x = Conv2D(filters1, kernel_size, padding='same', use_bias=False, name=conv_name_base + '2a',
 19 |                kernel_initializer=initializers.random_normal(stddev=0.1),
 20 |                bias_initializer='zeros')(x)
 21 |     x = BatchNormalization(name=bn_name_base + '2a', epsilon=1e-5)(x)
 22 |     x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x)
 23 | 
 24 |     #----------------------------#
 25 |     #   3x3卷积
 26 |     #----------------------------#
 27 |     x = Conv2D(filters2, kernel_size, padding='same', use_bias=False, name=conv_name_base + '2b',
 28 |                kernel_initializer=initializers.random_normal(stddev=0.1),
 29 |                bias_initializer='zeros')(x)
 30 |     x = BatchNormalization(name=bn_name_base + '2b', epsilon=1e-5)(x)
 31 |     
 32 |     x = layers.add([x, input_tensor])
 33 |     return x
 34 | 
 35 | 
 36 | def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
 37 |     filters1, filters2 = filters
 38 | 
 39 |     conv_name_base = 'res' + str(stage) + block + '_branch'
 40 |     bn_name_base = 'bn' + str(stage) + block + '_branch'
 41 | 
 42 |     x = BatchNormalization(name=bn_name_base + '2', epsilon=1e-5)(input_tensor)
 43 |     
 44 |     #----------------------------#
 45 |     #   减少通道数
 46 |     #----------------------------#
 47 |     x = Conv2D(filters1, kernel_size, padding='same', use_bias=False, name=conv_name_base + '2a',
 48 |                kernel_initializer=initializers.random_normal(stddev=0.1),
 49 |                bias_initializer='zeros')(x)
 50 |     x = BatchNormalization(name=bn_name_base + '2a', epsilon=1e-5)(x)
 51 |     x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x)
 52 | 
 53 |     #----------------------------#
 54 |     #   3x3卷积
 55 |     #----------------------------#
 56 |     x = Conv2D(filters2, kernel_size, padding='same', use_bias=False, strides=strides, name=conv_name_base + '2b',
 57 |                kernel_initializer=initializers.random_normal(stddev=0.1),
 58 |                bias_initializer='zeros')(x)
 59 |     x = BatchNormalization(name=bn_name_base + '2b', epsilon=1e-5)(x)
 60 |     
 61 |     #----------------------------#
 62 |     #   残差边
 63 |     #----------------------------#
 64 |     shortcut = Conv2D(filters2, (1, 1), strides=strides, use_bias=False, name=conv_name_base + '1',
 65 |                kernel_initializer=initializers.random_normal(stddev=0.1),
 66 |                bias_initializer='zeros')(input_tensor)
 67 |     shortcut = BatchNormalization(name=bn_name_base + '1', epsilon=1e-5)(shortcut)
 68 | 
 69 |     x = layers.add([x, shortcut])
 70 |     return x
 71 | 
 72 | def iResNet50(inputs, embedding_size, dropout_keep_prob=0.5):
 73 |     x = ZeroPadding2D((1, 1))(inputs)
 74 |     x = Conv2D(64, (3, 3), strides=(1, 1), name='conv1', use_bias=False,
 75 |                kernel_initializer=initializers.random_normal(stddev=0.1),
 76 |                bias_initializer='zeros')(x)
 77 |     x = BatchNormalization(name='bn_conv1', epsilon=1e-5)(x)
 78 |     x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x)
 79 | 
 80 |     x = conv_block(x, 3, [64, 64], stage=2, block='a')
 81 |     x = identity_block(x, 3, [64, 64], stage=2, block='b')
 82 |     x = identity_block(x, 3, [64, 64], stage=2, block='c')
 83 | 
 84 |     x = conv_block(x, 3, [128, 128], stage=3, block='a')
 85 |     x = identity_block(x, 3, [128, 128], stage=3, block='b')
 86 |     x = identity_block(x, 3, [128, 128], stage=3, block='c')
 87 |     x = identity_block(x, 3, [128, 128], stage=3, block='d')
 88 | 
 89 |     x = conv_block(x, 3, [256, 256], stage=4, block='a')
 90 |     x = identity_block(x, 3, [256, 256], stage=4, block='b')
 91 |     x = identity_block(x, 3, [256, 256], stage=4, block='c')
 92 |     x = identity_block(x, 3, [256, 256], stage=4, block='d')
 93 |     x = identity_block(x, 3, [256, 256], stage=4, block='e')
 94 |     x = identity_block(x, 3, [256, 256], stage=4, block='f')
 95 | 
 96 |     x = identity_block(x, 3, [256, 256], stage=4, block='g')
 97 |     x = identity_block(x, 3, [256, 256], stage=4, block='h')
 98 |     x = identity_block(x, 3, [256, 256], stage=4, block='i')
 99 |     x = identity_block(x, 3, [256, 256], stage=4, block='j')
100 |     x = identity_block(x, 3, [256, 256], stage=4, block='k')
101 | 
102 |     x = identity_block(x, 3, [256, 256], stage=4, block='l')
103 |     x = identity_block(x, 3, [256, 256], stage=4, block='m')
104 |     x = identity_block(x, 3, [256, 256], stage=4, block='n')
105 | 
106 |     x = conv_block(x, 3, [512, 512], stage=5, block='a')
107 |     x = identity_block(x, 3, [512, 512], stage=5, block='b')
108 |     x = identity_block(x, 3, [512, 512], stage=5, block='c')
109 |     
110 |     x = BatchNormalization(name='bn2', epsilon=1e-5)(x)
111 |     x = Dropout(p=dropout_keep_prob)(x)
112 |     x = Flatten()(x)
113 |     x = Dense(embedding_size, name='linear',
114 |             kernel_initializer=initializers.random_normal(stddev=0.1),
115 |             bias_initializer='zeros')(x)
116 |     x = BatchNormalization(name='features', epsilon=1e-5,)(x)
117 | 
118 |     return x
119 | 


--------------------------------------------------------------------------------
/utils/utils_metrics.py:
--------------------------------------------------------------------------------
  1 | import numpy as np
  2 | from scipy import interpolate
  3 | from sklearn.model_selection import KFold
  4 | from tqdm import tqdm
  5 | 
  6 | def evaluate(distances, labels, nrof_folds=10):
  7 |     # Calculate evaluation metrics
  8 |     thresholds = np.arange(0, 4, 0.01)
  9 |     tpr, fpr, accuracy, best_thresholds = calculate_roc(thresholds, distances,
 10 |         labels, nrof_folds=nrof_folds)
 11 |     thresholds = np.arange(0, 4, 0.001)
 12 |     val, val_std, far = calculate_val(thresholds, distances,
 13 |         labels, 1e-3, nrof_folds=nrof_folds)
 14 |     return tpr, fpr, accuracy, val, val_std, far, best_thresholds 
 15 | 
 16 | def calculate_roc(thresholds, distances, labels, nrof_folds=10):
 17 | 
 18 |     nrof_pairs = min(len(labels), len(distances))
 19 |     nrof_thresholds = len(thresholds)
 20 |     k_fold = KFold(n_splits=nrof_folds, shuffle=False)
 21 | 
 22 |     tprs = np.zeros((nrof_folds,nrof_thresholds))
 23 |     fprs = np.zeros((nrof_folds,nrof_thresholds))
 24 |     accuracy = np.zeros((nrof_folds))
 25 | 
 26 |     indices = np.arange(nrof_pairs)
 27 | 
 28 |     for fold_idx, (train_set, test_set) in enumerate(k_fold.split(indices)):
 29 | 
 30 |         # Find the best threshold for the fold
 31 |         acc_train = np.zeros((nrof_thresholds))
 32 |         for threshold_idx, threshold in enumerate(thresholds):
 33 |             _, _, acc_train[threshold_idx] = calculate_accuracy(threshold, distances[train_set], labels[train_set])
 34 | 
 35 |         best_threshold_index = np.argmax(acc_train)
 36 |         for threshold_idx, threshold in enumerate(thresholds):
 37 |             tprs[fold_idx,threshold_idx], fprs[fold_idx,threshold_idx], _ = calculate_accuracy(threshold, distances[test_set], labels[test_set])
 38 |         _, _, accuracy[fold_idx] = calculate_accuracy(thresholds[best_threshold_index], distances[test_set], labels[test_set])
 39 |         tpr = np.mean(tprs,0)
 40 |         fpr = np.mean(fprs,0)
 41 |     return tpr, fpr, accuracy, thresholds[best_threshold_index]
 42 | 
 43 | def calculate_accuracy(threshold, dist, actual_issame):
 44 |     predict_issame = np.less(dist, threshold)
 45 |     tp = np.sum(np.logical_and(predict_issame, actual_issame))
 46 |     fp = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame)))
 47 |     tn = np.sum(np.logical_and(np.logical_not(predict_issame), np.logical_not(actual_issame)))
 48 |     fn = np.sum(np.logical_and(np.logical_not(predict_issame), actual_issame))
 49 | 
 50 |     tpr = 0 if (tp+fn==0) else float(tp) / float(tp+fn)
 51 |     fpr = 0 if (fp+tn==0) else float(fp) / float(fp+tn)
 52 |     acc = float(tp+tn)/dist.size
 53 |     return tpr, fpr, acc
 54 | 
 55 | def calculate_val(thresholds, distances, labels, far_target=1e-3, nrof_folds=10):
 56 |     nrof_pairs = min(len(labels), len(distances))
 57 |     nrof_thresholds = len(thresholds)
 58 |     k_fold = KFold(n_splits=nrof_folds, shuffle=False)
 59 | 
 60 |     val = np.zeros(nrof_folds)
 61 |     far = np.zeros(nrof_folds)
 62 | 
 63 |     indices = np.arange(nrof_pairs)
 64 | 
 65 |     for fold_idx, (train_set, test_set) in enumerate(k_fold.split(indices)):
 66 |         # Find the threshold that gives FAR = far_target
 67 |         far_train = np.zeros(nrof_thresholds)
 68 |         for threshold_idx, threshold in enumerate(thresholds):
 69 |             _, far_train[threshold_idx] = calculate_val_far(threshold, distances[train_set], labels[train_set])
 70 |         if np.max(far_train)>=far_target:
 71 |             f = interpolate.interp1d(far_train, thresholds, kind='slinear')
 72 |             threshold = f(far_target)
 73 |         else:
 74 |             threshold = 0.0
 75 | 
 76 |         val[fold_idx], far[fold_idx] = calculate_val_far(threshold, distances[test_set], labels[test_set])
 77 | 
 78 |     val_mean = np.mean(val)
 79 |     far_mean = np.mean(far)
 80 |     val_std = np.std(val)
 81 |     return val_mean, val_std, far_mean
 82 | 
 83 | def calculate_val_far(threshold, dist, actual_issame):
 84 |     predict_issame = np.less(dist, threshold)
 85 |     true_accept = np.sum(np.logical_and(predict_issame, actual_issame))
 86 |     false_accept = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame)))
 87 |     n_same = np.sum(actual_issame)
 88 |     n_diff = np.sum(np.logical_not(actual_issame))
 89 |     if n_diff == 0:
 90 |         n_diff = 1
 91 |     if n_same == 0:
 92 |         return 0,0
 93 |     val = float(true_accept) / float(n_same)
 94 |     far = float(false_accept) / float(n_diff)
 95 |     return val, far
 96 | 
 97 | def test(test_loader, model, png_save_path, log_interval, batch_size):
 98 |     labels, distances = [], []
 99 |     pbar = tqdm(enumerate(test_loader.generate()))
100 |     for batch_idx, (data_a, data_p, label) in pbar:
101 |         out_a, out_p = model.predict(data_a), model.predict(data_p)
102 |         dists = np.linalg.norm(out_a - out_p, axis=1)
103 | 
104 |         #--------------------------------------#
105 |         #   将结果添加进列表中
106 |         #--------------------------------------#
107 |         distances.append(dists)
108 |         labels.append(label)
109 |         
110 |         #--------------------------------------#
111 |         #   打印
112 |         #--------------------------------------#
113 |         if batch_idx % log_interval == 0:
114 |             pbar.set_description('Test Epoch: [{}/{} ({:.0f}%)]'.format(
115 |                 batch_idx * batch_size, len(test_loader.validation_images),
116 |                 100. * batch_idx / len(test_loader.validation_images)))
117 | 
118 |     #--------------------------------------#
119 |     #   转换成numpy
120 |     #--------------------------------------#
121 |     labels      = np.array([sublabel for label in labels for sublabel in label])
122 |     distances   = np.array([subdist for dist in distances for subdist in dist])
123 |     
124 |     tpr, fpr, accuracy, val, val_std, far, best_thresholds = evaluate(distances,labels)
125 |     print('Accuracy: %2.5f+-%2.5f' % (np.mean(accuracy), np.std(accuracy)))
126 |     print('Best_thresholds: %2.5f' % best_thresholds)
127 |     print('Validation rate: %2.5f+-%2.5f @ FAR=%2.5f' % (val, val_std, far))
128 |     plot_roc(fpr, tpr, figure_name = png_save_path)
129 | 
130 | def plot_roc(fpr, tpr, figure_name = "roc.png"):
131 |     import matplotlib.pyplot as plt
132 |     from sklearn.metrics import auc, roc_curve
133 |     roc_auc = auc(fpr, tpr)
134 |     fig = plt.figure()
135 |     lw = 2
136 |     plt.plot(fpr, tpr, color='darkorange',
137 |              lw=lw, label='ROC curve (area = %0.2f)' % roc_auc)
138 |     plt.plot([0, 1], [0, 1], color='navy', lw=lw, linestyle='--')
139 |     plt.xlim([0.0, 1.0])
140 |     plt.ylim([0.0, 1.05])
141 |     plt.xlabel('False Positive Rate')
142 |     plt.ylabel('True Positive Rate')
143 |     plt.title('Receiver operating characteristic')
144 |     plt.legend(loc="lower right")
145 |     fig.savefig(figure_name, dpi=fig.dpi)
146 | 


--------------------------------------------------------------------------------
/nets/mobilenetv3.py:
--------------------------------------------------------------------------------
  1 | from keras import backend, initializers
  2 | from keras.layers import (Activation, Add, BatchNormalization, Conv2D, Dense,
  3 |                           DepthwiseConv2D, Dropout, Flatten,
  4 |                           GlobalAveragePooling2D, Multiply, PReLU, Reshape)
  5 | 
  6 | 
  7 | def _make_divisible(v, divisor=8, min_value=None):
  8 |     if min_value is None:
  9 |         min_value = divisor
 10 |     new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
 11 |     if new_v < 0.9 * v:
 12 |         new_v += divisor
 13 |     return new_v
 14 | 
 15 | def _activation(x, name='relu'):
 16 |     if name == 'relu':
 17 |         return Activation('relu')(x)
 18 |     elif name == 'hardswish':
 19 |         return hard_swish(x)
 20 | 
 21 | def hard_sigmoid(x):
 22 |     return backend.relu(x + 3.0, max_value=6.0) / 6.0
 23 | 
 24 | def hard_swish(x):
 25 |     return Multiply()([Activation(hard_sigmoid)(x), x])
 26 | 
 27 | def _bneck(inputs, expansion, out_ch, alpha, kernel_size, stride, se_ratio, activation, block_id, rate=1):
 28 |     in_channels     = backend.int_shape(inputs)[-1]
 29 |     exp_size        = _make_divisible(in_channels * expansion, 8)
 30 |     out_channels    = _make_divisible(out_ch * alpha, 8)
 31 |     
 32 |     x           = inputs
 33 |     prefix      = 'expanded_conv/'
 34 |     if block_id:
 35 |         # Expand
 36 |         prefix = 'expanded_conv_{}/'.format(block_id)
 37 |         x = Conv2D(exp_size, 1, padding='same', use_bias=False, name=prefix + 'expand',
 38 |                       kernel_initializer=initializers.random_normal(stddev=0.1))(x)
 39 |         x = BatchNormalization(axis=-1, name=prefix + 'expand/BatchNorm')(x)
 40 |         x = _activation(x, activation)
 41 | 
 42 |     x = DepthwiseConv2D(kernel_size, strides=stride, padding='same', dilation_rate=(rate, rate), use_bias=False, depthwise_initializer=initializers.random_normal(stddev=0.1), name=prefix + 'depthwise')(x)
 43 |     x = BatchNormalization(axis=-1, name=prefix + 'depthwise/BatchNorm')(x)
 44 |     x = _activation(x, activation)
 45 | 
 46 |     if se_ratio:
 47 |         reduced_ch = _make_divisible(exp_size * se_ratio, 8)
 48 |         y = GlobalAveragePooling2D(name=prefix + 'squeeze_excite/AvgPool')(x)
 49 |         y = Reshape([1, 1, exp_size], name=prefix + 'reshape')(y)
 50 |         
 51 |         y = Conv2D(reduced_ch, 1, padding='same', use_bias=True, name=prefix + 'squeeze_excite/Conv',
 52 |                       kernel_initializer=initializers.random_normal(stddev=0.1))(y)
 53 |         y = Activation("relu", name=prefix + 'squeeze_excite/Relu')(y)
 54 |         
 55 |         y = Conv2D(exp_size, 1, padding='same', use_bias=True, name=prefix + 'squeeze_excite/Conv_1',
 56 |                       kernel_initializer=initializers.random_normal(stddev=0.1))(y)
 57 |         x = Multiply(name=prefix + 'squeeze_excite/Mul')([Activation(hard_sigmoid)(y), x])
 58 | 
 59 |     x = Conv2D(out_channels, 1, padding='same', use_bias=False, name=prefix + 'project', 
 60 |                       kernel_initializer=initializers.random_normal(stddev=0.1))(x)
 61 |     x = BatchNormalization(axis=-1, name=prefix + 'project/BatchNorm')(x)
 62 | 
 63 |     if in_channels == out_channels and stride == 1:
 64 |         x = Add(name=prefix + 'Add')([inputs, x])
 65 |     return x
 66 | 
 67 | def MobilenetV3_small(inputs, embedding_size, dropout_keep_prob=0.5, alpha=1.0, kernel=5, se_ratio=0.25):
 68 |     x = Conv2D(16, 3, strides=(1, 1), padding='same', use_bias=False, name='Conv', 
 69 |                       kernel_initializer=initializers.random_normal(stddev=0.1))(inputs)
 70 |     x = BatchNormalization(axis=-1, name='Conv/BatchNorm')(x)
 71 |     x = Activation(hard_swish)(x)
 72 | 
 73 |     x = _bneck(x, 1, 16, alpha, 3, 2, se_ratio, 'relu', 0)
 74 |     
 75 |     x = _bneck(x, 4.5, 24, alpha, 3, 2, None, 'relu', 1)
 76 |     x = _bneck(x, 3.66, 24, alpha, 3, 1, None, 'relu', 2)
 77 |     
 78 |     x = _bneck(x, 4, 40, alpha, kernel, 2, se_ratio, 'hardswish', 3)
 79 |     x = _bneck(x, 6, 40, alpha, kernel, 1, se_ratio, 'hardswish', 4)
 80 |     x = _bneck(x, 6, 40, alpha, kernel, 1, se_ratio, 'hardswish', 5)
 81 |     x = _bneck(x, 3, 48, alpha, kernel, 1, se_ratio, 'hardswish', 6)
 82 |     x = _bneck(x, 3, 48, alpha, kernel, 1, se_ratio, 'hardswish', 7)
 83 |     
 84 |     x = _bneck(x, 6, 96, alpha, kernel, 2, se_ratio, 'hardswish', 8)
 85 |     x = _bneck(x, 6, 96, alpha, kernel, 1, se_ratio, 'hardswish', 9)
 86 |     x = _bneck(x, 6, 96, alpha, kernel, 1, se_ratio, 'hardswish', 10)
 87 | 
 88 |     x = Conv2D(512, kernel_size=1, use_bias=False, name='sep',
 89 |                kernel_initializer=initializers.random_normal(stddev=0.1),
 90 |                bias_initializer='zeros')(x)
 91 |     x = BatchNormalization(name='sep_bn', epsilon=1e-5)(x)
 92 |     x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x)
 93 | 
 94 |     x = BatchNormalization(name='bn2', epsilon=1e-5)(x)
 95 |     x = Dropout(p=dropout_keep_prob)(x)
 96 |     x = Flatten()(x)
 97 |     x = Dense(embedding_size, name='linear',
 98 |             kernel_initializer=initializers.random_normal(stddev=0.1),
 99 |             bias_initializer='zeros')(x)
100 |     x = BatchNormalization(name='features', epsilon=1e-5)(x)
101 |     return x
102 | 
103 | def MobileNetV3_Large(inputs, embedding_size, dropout_keep_prob=0.5, alpha=1.0, kernel=5, se_ratio=0.25):
104 |     x = Conv2D(16, 3, strides=(1, 1), padding='same', use_bias=False, name='Conv', 
105 |                       kernel_initializer=initializers.random_normal(stddev=0.1))(inputs)
106 |     x = BatchNormalization(axis=-1, name='Conv/BatchNorm')(x)
107 |     x = Activation(hard_swish)(x)
108 | 
109 |     x = _bneck(x, 1, 16, alpha, 3, 1, None, 'relu', 0)
110 | 
111 |     x = _bneck(x, 4, 24, alpha, 3, 2, None, 'relu', 1)
112 |     x = _bneck(x, 3, 24, alpha, 3, 1, None, 'relu', 2)
113 |     
114 |     x = _bneck(x, 3, 40, alpha, kernel, 2, se_ratio, 'relu', 3)
115 |     x = _bneck(x, 3, 40, alpha, kernel, 1, se_ratio, 'relu', 4)
116 |     x = _bneck(x, 3, 40, alpha, kernel, 1, se_ratio, 'relu', 5)
117 |     
118 |     x = _bneck(x, 6, 80, alpha, 3, 2, None, 'hardswish', 6)
119 |     x = _bneck(x, 2.5, 80, alpha, 3, 1, None, 'hardswish', 7)
120 |     x = _bneck(x, 2.3, 80, alpha, 3, 1, None, 'hardswish', 8)
121 |     x = _bneck(x, 2.3, 80, alpha, 3, 1, None, 'hardswish', 9)
122 |     x = _bneck(x, 6, 112, alpha, 3, 1, se_ratio, 'hardswish', 10)
123 |     x = _bneck(x, 6, 112, alpha, 3, 1, se_ratio, 'hardswish', 11)
124 |     
125 |     x = _bneck(x, 6, 160, alpha, kernel, 2, se_ratio, 'hardswish', 12)
126 |     x = _bneck(x, 6, 160, alpha, kernel, 1, se_ratio, 'hardswish', 13)
127 |     x = _bneck(x, 6, 160, alpha, kernel, 1, se_ratio, 'hardswish', 14)
128 |     
129 |     x = Conv2D(512, kernel_size=1, use_bias=False, name='sep',
130 |                kernel_initializer=initializers.random_normal(stddev=0.1),
131 |                bias_initializer='zeros')(x)
132 |     x = BatchNormalization(name='sep_bn', epsilon=1e-5)(x)
133 |     x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x)
134 | 
135 |     x = BatchNormalization(name='bn2', epsilon=1e-5)(x)
136 |     x = Dropout(p=dropout_keep_prob)(x)
137 |     x = Flatten()(x)
138 |     x = Dense(embedding_size, name='linear',
139 |             kernel_initializer=initializers.random_normal(stddev=0.1),
140 |             bias_initializer='zeros')(x)
141 |     x = BatchNormalization(name='features', epsilon=1e-5)(x)
142 |     return x
143 | 


--------------------------------------------------------------------------------
/nets/mobilenetv2.py:
--------------------------------------------------------------------------------
  1 | #-------------------------------------------------------------#
  2 | #   MobileNetV2的网络部分
  3 | #-------------------------------------------------------------#
  4 | 
  5 | from keras import backend
  6 | from keras import initializers
  7 | from keras.layers import (Activation, Add, BatchNormalization, Conv2D, Dense,
  8 |                           DepthwiseConv2D, Dropout, Flatten,
  9 |                           GlobalAveragePooling2D, Input, PReLU, Reshape,
 10 |                           ZeroPadding2D)
 11 | from keras.layers.normalization import BatchNormalization
 12 | 
 13 | 
 14 | def relu6(x):
 15 |     return backend.relu(x, max_value=6)
 16 |     
 17 | def correct_pad(inputs, kernel_size):
 18 |     img_dim = 1
 19 |     input_size = backend.int_shape(inputs)[img_dim:(img_dim + 2)]
 20 | 
 21 |     if isinstance(kernel_size, int):
 22 |         kernel_size = (kernel_size, kernel_size)
 23 | 
 24 |     if input_size[0] is None:
 25 |         adjust = (1, 1)
 26 |     else:
 27 |         adjust = (1 - input_size[0] % 2, 1 - input_size[1] % 2)
 28 | 
 29 |     correct = (kernel_size[0] // 2, kernel_size[1] // 2)
 30 | 
 31 |     return ((correct[0] - adjust[0], correct[0]),
 32 |             (correct[1] - adjust[1], correct[1]))
 33 | 
 34 | def _make_divisible(v, divisor, min_value=None):
 35 |     if min_value is None:
 36 |         min_value = divisor
 37 |     new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
 38 |     if new_v < 0.9 * v:
 39 |         new_v += divisor
 40 |     return new_v
 41 | 
 42 | def _inverted_res_block(inputs, expansion, stride, alpha, filters, block_id):
 43 |     in_channels = backend.int_shape(inputs)[-1]
 44 |     pointwise_conv_filters = int(filters * alpha)
 45 |     pointwise_filters = _make_divisible(pointwise_conv_filters, 8)
 46 | 
 47 |     x = inputs
 48 |     prefix = 'block_{}_'.format(block_id)
 49 |     #---------------------------------------------#
 50 |     #   part1 数据扩张
 51 |     #---------------------------------------------#
 52 |     if block_id:
 53 |         # Expand
 54 |         x = Conv2D(expansion * in_channels,
 55 |                           kernel_size=1,
 56 |                           padding='same',
 57 |                           use_bias=False,
 58 |                           kernel_initializer=initializers.random_normal(stddev=0.1),
 59 |                           activation=None,
 60 |                           name=prefix + 'expand')(x)
 61 |         x = BatchNormalization(epsilon=1e-3,
 62 |                                       momentum=0.999,
 63 |                                       name=prefix + 'expand_BN')(x)
 64 |         x = Activation(relu6, name=prefix + 'expand_relu')(x)
 65 |     else:
 66 |         prefix = 'expanded_conv_'
 67 | 
 68 |     if stride == 2:
 69 |         x = ZeroPadding2D(padding=correct_pad(x, 3),
 70 |                                  name=prefix + 'pad')(x)
 71 |     
 72 |     #---------------------------------------------#
 73 |     #   part2 可分离卷积
 74 |     #---------------------------------------------#
 75 |     x = DepthwiseConv2D(kernel_size=3,
 76 |                                strides=stride,
 77 |                                activation=None,
 78 |                                use_bias=False,
 79 |                                depthwise_initializer=initializers.random_normal(stddev=0.1),
 80 |                                padding='same' if stride == 1 else 'valid',
 81 |                                name=prefix + 'depthwise')(x)
 82 |     x = BatchNormalization(epsilon=1e-3,
 83 |                                   momentum=0.999,
 84 |                                   name=prefix + 'depthwise_BN')(x)
 85 | 
 86 |     x = Activation(relu6, name=prefix + 'depthwise_relu')(x)
 87 | 
 88 |     #---------------------------------------------#
 89 |     #   part3压缩特征，而且不使用relu函数，保证特征不被破坏
 90 |     #---------------------------------------------#
 91 |     x = Conv2D(pointwise_filters,
 92 |                       kernel_size=1,
 93 |                       padding='same',
 94 |                       use_bias=False,
 95 |                       kernel_initializer=initializers.random_normal(stddev=0.1),
 96 |                       activation=None,
 97 |                       name=prefix + 'project')(x)
 98 | 
 99 |     x = BatchNormalization(epsilon=1e-3, momentum=0.999, name=prefix + 'project_BN')(x)
100 | 
101 |     if in_channels == pointwise_filters and stride == 1:
102 |         return Add(name=prefix + 'add')([inputs, x])
103 |     return x
104 | 
105 | def MobilenetV2(inputs, embedding_size, dropout_keep_prob=0.5, alpha=1.0):
106 |     #---------------------------------------------#
107 |     #   stem部分
108 |     #---------------------------------------------#
109 |     first_block_filters = _make_divisible(32 * alpha, 8)
110 |     x = ZeroPadding2D(padding=correct_pad(inputs, 3),
111 |                              name='Conv1_pad')(inputs)
112 |     
113 |     x = Conv2D(first_block_filters,
114 |                       kernel_size=3,
115 |                       strides=(1, 1),
116 |                       padding='valid',
117 |                       use_bias=False,
118 |                       kernel_initializer=initializers.random_normal(stddev=0.1),
119 |                       name='Conv1')(x)
120 |     x = BatchNormalization(epsilon=1e-3,
121 |                                   momentum=0.999,
122 |                                   name='bn_Conv1')(x)
123 |     x = Activation(relu6, name='Conv1_relu')(x)
124 | 
125 |     x = _inverted_res_block(x, filters=16, alpha=alpha, stride=1,
126 |                             expansion=1, block_id=0)
127 |     x = _inverted_res_block(x, filters=24, alpha=alpha, stride=2,
128 |                             expansion=6, block_id=1)
129 |     x = _inverted_res_block(x, filters=24, alpha=alpha, stride=1,
130 |                             expansion=6, block_id=2)
131 | 
132 | 
133 |     x = _inverted_res_block(x, filters=32, alpha=alpha, stride=2,
134 |                             expansion=6, block_id=3)
135 |     x = _inverted_res_block(x, filters=32, alpha=alpha, stride=1,
136 |                             expansion=6, block_id=4)
137 |     x = _inverted_res_block(x, filters=32, alpha=alpha, stride=1,
138 |                             expansion=6, block_id=5)
139 | 
140 | 
141 |     x = _inverted_res_block(x, filters=64, alpha=alpha, stride=2,
142 |                             expansion=6, block_id=6)
143 |     x = _inverted_res_block(x, filters=64, alpha=alpha, stride=1,
144 |                             expansion=6, block_id=7)
145 |     x = _inverted_res_block(x, filters=64, alpha=alpha, stride=1,
146 |                             expansion=6, block_id=8)
147 |     x = _inverted_res_block(x, filters=64, alpha=alpha, stride=1,
148 |                             expansion=6, block_id=9)
149 |     x = _inverted_res_block(x, filters=96, alpha=alpha, stride=1,
150 |                             expansion=6, block_id=10)
151 |     x = _inverted_res_block(x, filters=96, alpha=alpha, stride=1,
152 |                             expansion=6, block_id=11)
153 |     x = _inverted_res_block(x, filters=96, alpha=alpha, stride=1,
154 |                             expansion=6, block_id=12)
155 | 
156 | 
157 |     x = _inverted_res_block(x, filters=160, alpha=alpha, stride=2,
158 |                             expansion=6, block_id=13)
159 |     x = _inverted_res_block(x, filters=160, alpha=alpha, stride=1,
160 |                             expansion=6, block_id=14)
161 |     x = _inverted_res_block(x, filters=160, alpha=alpha, stride=1,
162 |                             expansion=6, block_id=15)
163 |     x = _inverted_res_block(x, filters=320, alpha=alpha, stride=1,
164 |                             expansion=6, block_id=16)
165 |     
166 |     x = Conv2D(512, kernel_size=1, use_bias=False, name='sep',
167 |                kernel_initializer=initializers.random_normal(stddev=0.1),
168 |                bias_initializer='zeros')(x)
169 |     x = BatchNormalization(name='sep_bn', epsilon=1e-5)(x)
170 |     x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x)
171 | 
172 |     x = BatchNormalization(name='bn2', epsilon=1e-5)(x)
173 |     x = Dropout(p=dropout_keep_prob)(x)
174 |     x = Flatten()(x)
175 |     x = Dense(embedding_size, name='linear',
176 |             kernel_initializer=initializers.random_normal(stddev=0.1),
177 |             bias_initializer='zeros')(x)
178 |     x = BatchNormalization(name='features', epsilon=1e-5)(x)
179 |     return x
180 | 


--------------------------------------------------------------------------------
/utils/utils.py:
--------------------------------------------------------------------------------
  1 | import math
  2 | 
  3 | import numpy as np
  4 | import tensorflow as tf
  5 | from PIL import Image
  6 | 
  7 | 
  8 | #---------------------------------------------------------#
  9 | #   将图像转换成RGB图像，防止灰度图在预测时报错。
 10 | #   代码仅仅支持RGB图像的预测，所有其它类型的图像都会转化成RGB
 11 | #---------------------------------------------------------#
 12 | def cvtColor(image):
 13 |     if len(np.shape(image)) == 3 and np.shape(image)[2] == 3:
 14 |         return image 
 15 |     else:
 16 |         image = image.convert('RGB')
 17 |         return image 
 18 | 
 19 | #---------------------------------------------------#
 20 | #   对输入图像进行resize
 21 | #---------------------------------------------------#
 22 | def resize_image(image, size, letterbox_image):
 23 |     iw, ih  = image.size
 24 |     w, h    = size
 25 |     if letterbox_image:
 26 |         scale   = min(w/iw, h/ih)
 27 |         nw      = int(iw*scale)
 28 |         nh      = int(ih*scale)
 29 | 
 30 |         image   = image.resize((nw,nh), Image.BICUBIC)
 31 |         new_image = Image.new('RGB', size, (128,128,128))
 32 |         new_image.paste(image, ((w-nw)//2, (h-nh)//2))
 33 |     else:
 34 |         new_image = image.resize((w, h), Image.BICUBIC)
 35 |     return new_image
 36 | 
 37 | def get_num_classes(annotation_path):
 38 |     with open(annotation_path) as f:
 39 |         dataset_path = f.readlines()
 40 | 
 41 |     labels = []
 42 |     for path in dataset_path:
 43 |         path_split = path.split(";")
 44 |         labels.append(int(path_split[0]))
 45 |     num_classes = np.max(labels) + 1
 46 |     return num_classes
 47 | 
 48 | def preprocess_input(image):
 49 |     image /= 255.0 
 50 |     image -= 0.5
 51 |     image /= 0.5
 52 |     return image
 53 | 
 54 | def get_acc(s=32.0, m=0.5):
 55 |     cos_m      = math.cos(m)
 56 |     sin_m      = math.sin(m)
 57 |     th         = math.cos(math.pi - m)
 58 |     mm         = math.sin(math.pi - m) * m
 59 |     def acc(y_true, y_pred):
 60 |         cosine = tf.cast(y_pred, tf.float32)
 61 |         labels = tf.cast(y_true, tf.float32)
 62 | 
 63 |         sine = tf.sqrt(1 - tf.square(cosine))
 64 |         phi = cosine * cos_m - sine * sin_m
 65 |         phi = tf.where(cosine > th, phi, cosine - mm)
 66 | 
 67 |         output = (labels * phi) + ((1.0 - labels) * cosine)
 68 |         output *= s
 69 | 
 70 |         accuracy = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(tf.math.softmax(output, -1), -1), tf.argmax(y_true, -1)), tf.float32))
 71 |         return accuracy
 72 |     return acc
 73 | 
 74 | def show_config(**kwargs):
 75 |     print('Configurations:')
 76 |     print('-' * 70)
 77 |     print('|%25s | %40s|' % ('keys', 'values'))
 78 |     print('-' * 70)
 79 |     for key, value in kwargs.items():
 80 |         print('|%25s | %40s|' % (str(key), str(value)))
 81 |     print('-' * 70)
 82 | 
 83 | #-------------------------------------------------------------------------------------------------------------------------------#
 84 | #   From https://github.com/ckyrkou/Keras_FLOP_Estimator 
 85 | #   Fix lots of bugs
 86 | #-------------------------------------------------------------------------------------------------------------------------------#
 87 | def net_flops(model, table=False, print_result=True):
 88 |     if (table == True):
 89 |         print("\n")
 90 |         print('%25s | %16s | %16s | %16s | %16s | %6s | %6s' % (
 91 |             'Layer Name', 'Input Shape', 'Output Shape', 'Kernel Size', 'Filters', 'Strides', 'FLOPS'))
 92 |         print('=' * 120)
 93 |         
 94 |     #---------------------------------------------------#
 95 |     #   总的FLOPs
 96 |     #---------------------------------------------------#
 97 |     t_flops = 0
 98 |     factor  = 1e9
 99 | 
100 |     for l in model.layers:
101 |         try:
102 |             #--------------------------------------#
103 |             #   所需参数的初始化定义
104 |             #--------------------------------------#
105 |             o_shape, i_shape, strides, ks, filters = ('', '', ''), ('', '', ''), (1, 1), (0, 0), 0
106 |             flops   = 0
107 |             #--------------------------------------#
108 |             #   获得层的名字
109 |             #--------------------------------------#
110 |             name    = l.name
111 |             
112 |             if ('InputLayer' in str(l)):
113 |                 i_shape = l.get_input_shape_at(0)[1:4]
114 |                 o_shape = l.get_output_shape_at(0)[1:4]
115 |                 
116 |             #--------------------------------------#
117 |             #   Reshape层
118 |             #--------------------------------------#
119 |             elif ('Reshape' in str(l)):
120 |                 i_shape = l.get_input_shape_at(0)[1:4]
121 |                 o_shape = l.get_output_shape_at(0)[1:4]
122 | 
123 |             #--------------------------------------#
124 |             #   填充层
125 |             #--------------------------------------#
126 |             elif ('Padding' in str(l)):
127 |                 i_shape = l.get_input_shape_at(0)[1:4]
128 |                 o_shape = l.get_output_shape_at(0)[1:4]
129 | 
130 |             #--------------------------------------#
131 |             #   平铺层
132 |             #--------------------------------------#
133 |             elif ('Flatten' in str(l)):
134 |                 i_shape = l.get_input_shape_at(0)[1:4]
135 |                 o_shape = l.get_output_shape_at(0)[1:4]
136 |                 
137 |             #--------------------------------------#
138 |             #   激活函数层
139 |             #--------------------------------------#
140 |             elif 'Activation' in str(l):
141 |                 i_shape = l.get_input_shape_at(0)[1:4]
142 |                 o_shape = l.get_output_shape_at(0)[1:4]
143 |                 
144 |             #--------------------------------------#
145 |             #   LeakyReLU
146 |             #--------------------------------------#
147 |             elif 'LeakyReLU' in str(l):
148 |                 for i in range(len(l._inbound_nodes)):
149 |                     i_shape = l.get_input_shape_at(i)[1:4]
150 |                     o_shape = l.get_output_shape_at(i)[1:4]
151 |                     
152 |                     flops   += i_shape[0] * i_shape[1] * i_shape[2]
153 |                     
154 |             #--------------------------------------#
155 |             #   池化层
156 |             #--------------------------------------#
157 |             elif 'MaxPooling' in str(l):
158 |                 i_shape = l.get_input_shape_at(0)[1:4]
159 |                 o_shape = l.get_output_shape_at(0)[1:4]
160 |                     
161 |             #--------------------------------------#
162 |             #   池化层
163 |             #--------------------------------------#
164 |             elif ('AveragePooling' in str(l) and 'Global' not in str(l)):
165 |                 strides = l.strides
166 |                 ks      = l.pool_size
167 |                 
168 |                 for i in range(len(l._inbound_nodes)):
169 |                     i_shape = l.get_input_shape_at(i)[1:4]
170 |                     o_shape = l.get_output_shape_at(i)[1:4]
171 |                     
172 |                     flops   += o_shape[0] * o_shape[1] * o_shape[2]
173 | 
174 |             #--------------------------------------#
175 |             #   全局池化层
176 |             #--------------------------------------#
177 |             elif ('AveragePooling' in str(l) and 'Global' in str(l)):
178 |                 for i in range(len(l._inbound_nodes)):
179 |                     i_shape = l.get_input_shape_at(i)[1:4]
180 |                     o_shape = l.get_output_shape_at(i)[1:4]
181 |                     
182 |                     flops += (i_shape[0] * i_shape[1] + 1) * i_shape[2]
183 |                 
184 |             #--------------------------------------#
185 |             #   标准化层
186 |             #--------------------------------------#
187 |             elif ('BatchNormalization' in str(l)):
188 |                 for i in range(len(l._inbound_nodes)):
189 |                     i_shape = l.get_input_shape_at(i)[1:4]
190 |                     o_shape = l.get_output_shape_at(i)[1:4]
191 | 
192 |                     temp_flops = 1
193 |                     for i in range(len(i_shape)):
194 |                         temp_flops *= i_shape[i]
195 |                     temp_flops *= 2
196 |                     
197 |                     flops += temp_flops
198 |                 
199 |             #--------------------------------------#
200 |             #   全连接层
201 |             #--------------------------------------#
202 |             elif ('Dense' in str(l)):
203 |                 for i in range(len(l._inbound_nodes)):
204 |                     i_shape = l.get_input_shape_at(i)[1:4]
205 |                     o_shape = l.get_output_shape_at(i)[1:4]
206 |                 
207 |                     temp_flops = 1
208 |                     for i in range(len(o_shape)):
209 |                         temp_flops *= o_shape[i]
210 |                         
211 |                     if (i_shape[-1] == None):
212 |                         temp_flops = temp_flops * o_shape[-1]
213 |                     else:
214 |                         temp_flops = temp_flops * i_shape[-1]
215 |                     flops += temp_flops
216 | 
217 |             #--------------------------------------#
218 |             #   普通卷积层
219 |             #--------------------------------------#
220 |             elif ('Conv2D' in str(l) and 'DepthwiseConv2D' not in str(l) and 'SeparableConv2D' not in str(l)):
221 |                 strides = l.strides
222 |                 ks      = l.kernel_size
223 |                 filters = l.filters
224 |                 bias    = 1 if l.use_bias else 0
225 |                 
226 |                 for i in range(len(l._inbound_nodes)):
227 |                     i_shape = l.get_input_shape_at(i)[1:4]
228 |                     o_shape = l.get_output_shape_at(i)[1:4]
229 |                     
230 |                     if (filters == None):
231 |                         filters = i_shape[2]
232 |                     flops += filters * o_shape[0] * o_shape[1] * (ks[0] * ks[1] * i_shape[2] + bias)
233 | 
234 |             #--------------------------------------#
235 |             #   逐层卷积层
236 |             #--------------------------------------#
237 |             elif ('Conv2D' in str(l) and 'DepthwiseConv2D' in str(l) and 'SeparableConv2D' not in str(l)):
238 |                 strides = l.strides
239 |                 ks      = l.kernel_size
240 |                 filters = l.filters
241 |                 bias    = 1 if l.use_bias else 0
242 |             
243 |                 for i in range(len(l._inbound_nodes)):
244 |                     i_shape = l.get_input_shape_at(i)[1:4]
245 |                     o_shape = l.get_output_shape_at(i)[1:4]
246 |                     
247 |                     if (filters == None):
248 |                         filters = i_shape[2]
249 |                     flops += filters * o_shape[0] * o_shape[1] * (ks[0] * ks[1] + bias)
250 |                 
251 |             #--------------------------------------#
252 |             #   深度可分离卷积层
253 |             #--------------------------------------#
254 |             elif ('Conv2D' in str(l) and 'DepthwiseConv2D' not in str(l) and 'SeparableConv2D' in str(l)):
255 |                 strides = l.strides
256 |                 ks      = l.kernel_size
257 |                 filters = l.filters
258 |                 
259 |                 for i in range(len(l._inbound_nodes)):
260 |                     i_shape = l.get_input_shape_at(i)[1:4]
261 |                     o_shape = l.get_output_shape_at(i)[1:4]
262 |                     
263 |                     if (filters == None):
264 |                         filters = i_shape[2]
265 |                     flops += i_shape[2] * o_shape[0] * o_shape[1] * (ks[0] * ks[1] + bias) + \
266 |                              filters * o_shape[0] * o_shape[1] * (1 * 1 * i_shape[2] + bias)
267 |             #--------------------------------------#
268 |             #   模型中有模型时
269 |             #--------------------------------------#
270 |             elif 'Model' in str(l):
271 |                 flops = net_flops(l, print_result=False)
272 |                 
273 |             t_flops += flops
274 | 
275 |             if (table == True):
276 |                 print('%25s | %16s | %16s | %16s | %16s | %6s | %5.4f' % (
277 |                     name[:25], str(i_shape), str(o_shape), str(ks), str(filters), str(strides), flops))
278 |                 
279 |         except:
280 |             pass
281 |     
282 |     t_flops = t_flops * 2
283 |     if print_result:
284 |         show_flops = t_flops / factor
285 |         print('Total GFLOPs: %.3fG' % (show_flops))
286 |     return t_flops


--------------------------------------------------------------------------------
/train.py:
--------------------------------------------------------------------------------
  1 | import datetime
  2 | import os
  3 | 
  4 | import numpy as np
  5 | import tensorflow as tf
  6 | from keras.callbacks import LearningRateScheduler, ModelCheckpoint, TensorBoard
  7 | from keras.layers import Conv2D, Dense, DepthwiseConv2D, PReLU
  8 | from keras.optimizers import SGD, Adam
  9 | from keras.regularizers import l2
 10 | from keras.utils.multi_gpu_utils import multi_gpu_model
 11 | 
 12 | from nets.arcface import arcface
 13 | from nets.arcface_training import ArcFaceLoss, get_lr_scheduler
 14 | from utils.callbacks import (ExponentDecayScheduler, LFW_callback, LossHistory,
 15 |                              ParallelModelCheckpoint)
 16 | from utils.dataloader import FacenetDataset, LFWDataset
 17 | from utils.utils import get_acc, get_num_classes, show_config
 18 | 
 19 | tf.logging.set_verbosity(tf.logging.ERROR)
 20 | 
 21 | if __name__ == "__main__":
 22 |     #---------------------------------------------------------------------#
 23 |     #   train_gpu   训练用到的GPU
 24 |     #               默认为第一张卡、双卡为[0, 1]、三卡为[0, 1, 2]
 25 |     #               在使用多GPU时，每个卡上的batch为总batch除以卡的数量。
 26 |     #---------------------------------------------------------------------#
 27 |     train_gpu       = [0,]
 28 |     #--------------------------------------------------------#
 29 |     #   指向根目录下的cls_train.txt，读取人脸路径与标签
 30 |     #--------------------------------------------------------#
 31 |     annotation_path = "cls_train.txt"
 32 |     #--------------------------------------------------------#
 33 |     #   输入图像大小
 34 |     #--------------------------------------------------------#
 35 |     input_shape     = [112, 112, 3]
 36 |     #--------------------------------------------------------#
 37 |     #   主干特征提取网络的选择
 38 |     #   mobilefacenet
 39 |     #   mobilenetv1
 40 |     #   mobilenetv2
 41 |     #   mobilenetv3
 42 |     #   iresnet50
 43 |     #
 44 |     #   除了mobilenetv1外，其它的backbone均可从0开始训练。
 45 |     #--------------------------------------------------------#
 46 |     backbone        = "mobilefacenet"
 47 |     #----------------------------------------------------------------------------------------------------------------------------#
 48 |     #   如果训练过程中存在中断训练的操作，可以将model_path设置成logs文件夹下的权值文件，将已经训练了一部分的权值再次载入。
 49 |     #   同时修改下方的训练的参数，来保证模型epoch的连续性。
 50 |     #   
 51 |     #   当model_path = ''的时候不加载整个模型的权值。
 52 |     #
 53 |     #   此处使用的是整个模型的权重，因此是在train.py进行加载的，pretrain不影响此处的权值加载。
 54 |     #   如果想要让模型从主干的预训练权值开始训练，则设置model_path = 主干的权值。
 55 |     #   如果想要让模型从0开始训练，则设置model_path = ''，此时从0开始训练。
 56 |     #----------------------------------------------------------------------------------------------------------------------------#  
 57 |     model_path      = ""
 58 | 
 59 |     #----------------------------------------------------------------------------------------------------------------------------#
 60 |     #   显存不足与数据集大小无关，提示显存不足请调小batch_size。
 61 |     #   受到BatchNorm层影响，不能为1。
 62 |     #
 63 |     #   在此提供若干参数设置建议，各位训练者根据自己的需求进行灵活调整：
 64 |     #   （一）从预训练权重开始训练：
 65 |     #       Adam：
 66 |     #           Init_Epoch = 0，Epoch = 100，optimizer_type = 'adam'，Init_lr = 1e-3，weight_decay = 0。
 67 |     #       SGD：
 68 |     #           Init_Epoch = 0，Epoch = 100，optimizer_type = 'sgd'，Init_lr = 1e-2，weight_decay = 5e-4。
 69 |     #       其中：UnFreeze_Epoch可以在100-300之间调整。
 70 |     #   （二）batch_size的设置：
 71 |     #       在显卡能够接受的范围内，以大为好。显存不足与数据集大小无关，提示显存不足（OOM或者CUDA out of memory）请调小batch_size。
 72 |     #       受到BatchNorm层影响，batch_size最小为2，不能为1。
 73 |     #----------------------------------------------------------------------------------------------------------------------------#
 74 |     #------------------------------------------------------#
 75 |     #   训练参数
 76 |     #   Init_Epoch      模型当前开始的训练世代
 77 |     #   Epoch           模型总共训练的epoch
 78 |     #   batch_size      每次输入的图片数量
 79 |     #------------------------------------------------------#
 80 |     Init_Epoch      = 0
 81 |     Epoch           = 100
 82 |     batch_size      = 64
 83 | 
 84 |     #------------------------------------------------------------------#
 85 |     #   其它训练参数：学习率、优化器、学习率下降有关
 86 |     #------------------------------------------------------------------#
 87 |     #------------------------------------------------------------------#
 88 |     #   Init_lr         模型的最大学习率
 89 |     #   Min_lr          模型的最小学习率，默认为最大学习率的0.01
 90 |     #------------------------------------------------------------------#
 91 |     Init_lr             = 1e-2
 92 |     Min_lr              = Init_lr * 0.01
 93 |     #------------------------------------------------------------------#
 94 |     #   optimizer_type  使用到的优化器种类，可选的有adam、sgd
 95 |     #                   当使用Adam优化器时建议设置  Init_lr=1e-3
 96 |     #                   当使用SGD优化器时建议设置   Init_lr=1e-2
 97 |     #   momentum        优化器内部使用到的momentum参数
 98 |     #   weight_decay    权值衰减，可防止过拟合
 99 |     #                   adam会导致weight_decay错误，使用adam时建议设置为0。
100 |     #------------------------------------------------------------------#
101 |     optimizer_type      = "sgd"
102 |     momentum            = 0.9
103 |     weight_decay        = 5e-4
104 |     #------------------------------------------------------------------#
105 |     #   lr_decay_type   使用到的学习率下降方式，可选的有step、cos
106 |     #------------------------------------------------------------------#
107 |     lr_decay_type       = "cos"
108 |     #------------------------------------------------------------------#
109 |     #   save_period     多少个epoch保存一次权值，默认每个世代都保存
110 |     #------------------------------------------------------------------#
111 |     save_period         = 1
112 |     #------------------------------------------------------------------#
113 |     #   save_dir        权值与日志文件保存的文件夹
114 |     #------------------------------------------------------------------#
115 |     save_dir            = 'logs'
116 |     #------------------------------------------------------------------#
117 |     #   用于设置是否使用多线程读取数据
118 |     #   开启后会加快数据读取速度，但是会占用更多内存
119 |     #   内存较小的电脑可以设置为2或者1 
120 |     #------------------------------------------------------------------#
121 |     num_workers     = 1
122 |     #------------------------------------------------------------------#
123 |     #   是否开启LFW评估
124 |     #------------------------------------------------------------------#
125 |     lfw_eval_flag   = True
126 |     #------------------------------------------------------------------#
127 |     #   LFW评估数据集的文件路径和对应的txt文件
128 |     #------------------------------------------------------------------#
129 |     lfw_dir_path    = "lfw"
130 |     lfw_pairs_path  = "model_data/lfw_pair.txt"
131 | 
132 |     #------------------------------------------------------#
133 |     #   设置用到的显卡
134 |     #------------------------------------------------------#
135 |     os.environ["CUDA_VISIBLE_DEVICES"]  = ','.join(str(x) for x in train_gpu)
136 |     ngpus_per_node                      = len(train_gpu)
137 |     print('Number of devices: {}'.format(ngpus_per_node))
138 | 
139 |     num_classes = get_num_classes(annotation_path)
140 |     #-------------------------------------------#
141 |     #   建立模型
142 |     #-------------------------------------------#
143 |     model_body = arcface(input_shape, num_classes, backbone=backbone, mode="train")
144 |     if model_path != '':
145 |         #------------------------------------------------------#
146 |         #   载入预训练权重
147 |         #------------------------------------------------------#
148 |         print('Load weights {}.'.format(model_path))
149 |         model_body.load_weights(model_path, by_name=True, skip_mismatch=True)
150 |         
151 |     if ngpus_per_node > 1:
152 |         model   = multi_gpu_model(model_body, gpus=ngpus_per_node)
153 |     else:
154 |         model   = model_body
155 |     #-------------------------------------------------------#
156 |     #   0.01用于验证，0.99用于训练
157 |     #-------------------------------------------------------#
158 |     val_split = 0.01
159 |     with open(annotation_path,"r") as f:
160 |         lines = f.readlines()
161 |     np.random.seed(10101)
162 |     np.random.shuffle(lines)
163 |     np.random.seed(None)
164 |     num_val = int(len(lines)*val_split)
165 |     num_train = len(lines) - num_val
166 |     
167 |     show_config(
168 |         num_classes = num_classes, backbone = backbone, model_path = model_path, input_shape = input_shape, \
169 |         Init_Epoch = Init_Epoch, Epoch = Epoch, batch_size = batch_size, \
170 |         Init_lr = Init_lr, Min_lr = Min_lr, optimizer_type = optimizer_type, momentum = momentum, lr_decay_type = lr_decay_type, \
171 |         save_period = save_period, save_dir = save_dir, num_workers = num_workers, num_train = num_train, num_val = num_val
172 |     )
173 | 
174 |     for layer in model.layers:
175 |         if isinstance(layer, DepthwiseConv2D):
176 |             layer.add_loss(l2(weight_decay)(layer.depthwise_kernel))
177 |         elif isinstance(layer, Conv2D) or isinstance(layer, Dense):
178 |             layer.add_loss(l2(weight_decay)(layer.kernel))
179 |         elif isinstance(layer, PReLU):
180 |             layer.add_loss(l2(weight_decay)(layer.alpha))
181 | 
182 |     if True:
183 |         #-------------------------------------------------------------------#
184 |         #   判断当前batch_size，自适应调整学习率
185 |         #-------------------------------------------------------------------#
186 |         nbs             = 64
187 |         lr_limit_max    = 1e-3 if optimizer_type == 'adam' else 1e-1
188 |         lr_limit_min    = 3e-4 if optimizer_type == 'adam' else 5e-4
189 |         Init_lr_fit     = min(max(batch_size / nbs * Init_lr, lr_limit_min), lr_limit_max)
190 |         Min_lr_fit      = min(max(batch_size / nbs * Min_lr, lr_limit_min * 1e-2), lr_limit_max * 1e-2)
191 | 
192 |         optimizer = {
193 |             'adam'  : Adam(lr = Init_lr_fit, beta_1 = momentum),
194 |             'sgd'   : SGD(lr = Init_lr_fit, momentum = momentum, nesterov=True)
195 |         }[optimizer_type]
196 |         m = 0.5
197 |         s = 32 if backbone == "mobilefacenet" else 64
198 |         model.compile(optimizer = optimizer, loss={'ArcMargin': ArcFaceLoss(s = s, m = m)}, metrics={'ArcMargin': get_acc()})
199 |     
200 |         #---------------------------------------#
201 |         #   获得学习率下降的公式
202 |         #---------------------------------------#
203 |         lr_scheduler_func = get_lr_scheduler(lr_decay_type, Init_lr_fit, Min_lr_fit, Epoch)
204 | 
205 |         epoch_step          = num_train // batch_size
206 |         epoch_step_val      = num_val // batch_size
207 | 
208 |         if epoch_step == 0 or epoch_step_val == 0:
209 |             raise ValueError('数据集过小，无法进行训练，请扩充数据集。')
210 | 
211 |         train_dataset   = FacenetDataset(input_shape, lines[:num_train], batch_size, num_classes, random = True)
212 |         val_dataset     = FacenetDataset(input_shape, lines[num_train:], batch_size, num_classes, random = False)
213 | 
214 |         #-------------------------------------------------------------------------------#
215 |         #   训练参数的设置
216 |         #   logging         用于设置tensorboard的保存地址
217 |         #   checkpoint      用于设置权值保存的细节，period用于修改多少epoch保存一次
218 |         #   lr_scheduler       用于设置学习率下降的方式
219 |         #   early_stopping  用于设定早停，val_loss多次不下降自动结束训练，表示模型基本收敛
220 |         #-------------------------------------------------------------------------------#
221 |         time_str        = datetime.datetime.strftime(datetime.datetime.now(),'%Y_%m_%d_%H_%M_%S')
222 |         log_dir         = os.path.join(save_dir, "loss_" + str(time_str))
223 |         logging         = TensorBoard(log_dir)
224 |         loss_history    = LossHistory(log_dir)
225 |         if ngpus_per_node > 1:
226 |             checkpoint      = ParallelModelCheckpoint(model_body, os.path.join(save_dir, "ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5"), 
227 |                                     monitor = 'val_loss', save_weights_only = True, save_best_only = False, period = save_period)
228 |         else:
229 |             checkpoint      = ModelCheckpoint(os.path.join(save_dir, "ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5"), 
230 |                                     monitor = 'val_loss', save_weights_only = True, save_best_only = False, period = save_period)
231 |         lr_scheduler    = LearningRateScheduler(lr_scheduler_func, verbose = 1)
232 |         #---------------------------------#
233 |         #   LFW估计
234 |         #---------------------------------#
235 |         if lfw_eval_flag:
236 |             lfw_callback    = LFW_callback(LFWDataset(dir=lfw_dir_path, pairs_path=lfw_pairs_path, batch_size=32, input_shape=input_shape))
237 |             callbacks       = [logging, loss_history, checkpoint, lr_scheduler, lfw_callback]
238 |         else:
239 |             callbacks       = [logging, loss_history, checkpoint, lr_scheduler]
240 | 
241 |         print('Train on {} samples, val on {} samples, with batch size {}.'.format(num_train, num_val, batch_size))
242 |         model.fit_generator(
243 |             generator           = train_dataset,
244 |             steps_per_epoch     = epoch_step,
245 |             validation_data     = val_dataset,
246 |             validation_steps    = epoch_step_val,
247 |             epochs              = Epoch,
248 |             initial_epoch       = Init_Epoch,
249 |             use_multiprocessing = True if num_workers > 1 else False,
250 |             workers             = num_workers,
251 |             callbacks           = callbacks
252 |         )
253 | 


--------------------------------------------------------------------------------
/常见问题汇总.md:
--------------------------------------------------------------------------------
  1 | 问题汇总的博客地址为[https://blog.csdn.net/weixin_44791964/article/details/107517428](https://blog.csdn.net/weixin_44791964/article/details/107517428)。
  2 | 
  3 | # 问题汇总
  4 | ## 1、下载问题
  5 | ### a、代码下载
  6 | **问：up主，可以给我发一份代码吗，代码在哪里下载啊？ 
  7 | 答：Github上的地址就在视频简介里。复制一下就能进去下载了。**
  8 | 
  9 | **问：up主，为什么我下载的代码提示压缩包损坏？
 10 | 答：重新去Github下载。**
 11 | 
 12 | **问：up主，为什么我下载的代码和你在视频以及博客上的代码不一样？
 13 | 答：我常常会对代码进行更新，最终以实际的代码为准。**
 14 | 
 15 | ### b、 权值下载
 16 | **问：up主，为什么我下载的代码里面，model_data下面没有.pth或者.h5文件？ 
 17 | 答：我一般会把权值上传到Github和百度网盘，在GITHUB的README里面就能找到。**
 18 | 
 19 | ### c、 数据集下载
 20 | **问：up主，XXXX数据集在哪里下载啊？
 21 | 答：一般数据集的下载地址我会放在README里面，基本上都有，没有的话请及时联系我添加，直接发github的issue即可**。
 22 | 
 23 | ## 2、环境配置问题
 24 | ### a、现在库中所用的环境
 25 | **pytorch代码对应的pytorch版本为1.2，博客地址对应**[https://blog.csdn.net/weixin_44791964/article/details/106037141](https://blog.csdn.net/weixin_44791964/article/details/106037141)。
 26 | 
 27 | **keras代码对应的tensorflow版本为1.13.2，keras版本是2.1.5，博客地址对应**[https://blog.csdn.net/weixin_44791964/article/details/104702142](https://blog.csdn.net/weixin_44791964/article/details/104702142)。
 28 | 
 29 | **tf2代码对应的tensorflow版本为2.2.0，无需安装keras，博客地址对应**[https://blog.csdn.net/weixin_44791964/article/details/109161493](https://blog.csdn.net/weixin_44791964/article/details/109161493)。
 30 | 
 31 | **问：你的代码某某某版本的tensorflow和pytorch能用嘛？
 32 | 答：最好按照我推荐的配置，配置教程也有！其它版本的我没有试过！可能出现问题但是一般问题不大。仅需要改少量代码即可。**
 33 | 
 34 | ### b、30系列显卡环境配置
 35 | 30系显卡由于框架更新不可使用上述环境配置教程。
 36 | 当前我已经测试的可以用的30显卡配置如下：
 37 | **pytorch代码对应的pytorch版本为1.7.0，cuda为11.0，cudnn为8.0.5**。
 38 | 
 39 | **keras代码无法在win10下配置cuda11，在ubuntu下可以百度查询一下，配置tensorflow版本为1.15.4，keras版本是2.1.5或者2.3.1（少量函数接口不同，代码可能还需要少量调整。）**
 40 | 
 41 | **tf2代码对应的tensorflow版本为2.4.0，cuda为11.0，cudnn为8.0.5**。
 42 | 
 43 | ### c、GPU利用问题与环境使用问题
 44 | **问：为什么我安装了tensorflow-gpu但是却没用利用GPU进行训练呢？
 45 | 答：确认tensorflow-gpu已经装好，利用pip list查看tensorflow版本，然后查看任务管理器或者利用nvidia命令看看是否使用了gpu进行训练，任务管理器的话要看显存使用情况。**
 46 | 
 47 | **问：up主，我好像没有在用gpu进行训练啊，怎么看是不是用了GPU进行训练？
 48 | 答：查看是否使用GPU进行训练一般使用NVIDIA在命令行的查看命令，如果要看任务管理器的话，请看性能部分GPU的显存是否利用，或者查看任务管理器的Cuda，而非Copy。**
 49 | ![在这里插入图片描述](https://img-blog.csdnimg.cn/20201013234241524.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl80NDc5MTk2NA==,size_16,color_FFFFFF,t_70#pic_center)
 50 | 
 51 | **问：up主，为什么我按照你的环境配置后还是不能使用？
 52 | 答：请把你的GPU、CUDA、CUDNN、TF版本以及PYTORCH版本B站私聊告诉我。**
 53 | 
 54 | **问：出现如下错误**
 55 | ```python
 56 | Traceback (most recent call last):
 57 |   File "C:\Users\focus\Anaconda3\ana\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 58, in <module>
 58 |  from tensorflow.python.pywrap_tensorflow_internal import *
 59 | File "C:\Users\focus\Anaconda3\ana\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in <module>
 60 | pywrap_tensorflow_internal = swig_import_helper()
 61 |   File "C:\Users\focus\Anaconda3\ana\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 24, in swig_import_helper
 62 |     _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
 63 | File "C:\Users\focus\Anaconda3\ana\envs\tensorflow-gpu\lib\imp.py", line 243, in load_modulereturn load_dynamic(name, filename, file)
 64 | File "C:\Users\focus\Anaconda3\ana\envs\tensorflow-gpu\lib\imp.py", line 343, in load_dynamic
 65 |     return _load(spec)
 66 | ImportError: DLL load failed: 找不到指定的模块。
 67 | ```
 68 | **答：如果没重启过就重启一下，否则重新按照步骤安装，还无法解决则把你的GPU、CUDA、CUDNN、TF版本以及PYTORCH版本私聊告诉我。**
 69 | 
 70 | ### d、no module问题
 71 | **问：为什么提示说no module name utils.utils（no module name nets.yolo、no module name nets.ssd等一系列问题）啊？
 72 | 答：utils并不需要用pip装，它就在我上传的仓库的根目录，出现这个问题的原因是根目录不对，查查相对目录和根目录的概念。查了基本上就明白了。**
 73 | 
 74 | **问：为什么提示说no module name matplotlib（no module name PIL，no module name cv2等等）？
 75 | 答：这个库没安装打开命令行安装就好。pip install matplotlib**
 76 | 
 77 | **问：为什么我已经用pip装了opencv（pillow、matplotlib等），还是提示no module name cv2？
 78 | 答：没有激活环境装，要激活对应的conda环境进行安装才可以正常使用**
 79 | 
 80 | **问：为什么提示说No module named 'torch' ？
 81 | 答：其实我也真的很想知道为什么会有这个问题……这个pytorch没装是什么情况？一般就俩情况，一个是真的没装，还有一个是装到其它环境了，当前激活的环境不是自己装的环境。**
 82 | 
 83 | **问：为什么提示说No module named 'tensorflow' ？
 84 | 答：同上。**
 85 | 
 86 | ### e、cuda安装失败问题
 87 | 一般cuda安装前需要安装Visual Studio，装个2017版本即可。
 88 | 
 89 | ### f、Ubuntu系统问题
 90 | **所有代码在Ubuntu下可以使用，我两个系统都试过。**
 91 | 
 92 | ### g、VSCODE提示错误的问题
 93 | **问：为什么在VSCODE里面提示一大堆的错误啊？
 94 | 答：我也提示一大堆的错误，但是不影响，是VSCODE的问题，如果不想看错误的话就装Pycharm。**
 95 | 
 96 | ### h、使用cpu进行训练与预测的问题
 97 | **对于keras和tf2的代码而言，如果想用cpu进行训练和预测，直接装cpu版本的tensorflow就可以了。**
 98 | 
 99 | **对于pytorch的代码而言，如果想用cpu进行训练和预测，需要将cuda=True修改成cuda=False。**
100 | 
101 | ### i、tqdm没有pos参数问题
102 | **问：运行代码提示'tqdm' object has no attribute 'pos'。
103 | 答：重装tqdm，换个版本就可以了。**
104 | 
105 | ### j、提示decode(“utf-8”)的问题
106 | **由于h5py库的更新，安装过程中会自动安装h5py=3.0.0以上的版本，会导致decode("utf-8")的错误！
107 | 各位一定要在安装完tensorflow后利用命令装h5py=2.10.0！**
108 | ```
109 | pip install h5py==2.10.0
110 | ```
111 | 
112 | ### k、提示TypeError: __array__() takes 1 positional argument but 2 were given错误
113 | 可以修改pillow版本解决。
114 | ```
115 | pip install pillow==8.2.0
116 | ```
117 | 
118 | ### l、其它问题
119 | **问：为什么提示TypeError: cat() got an unexpected keyword argument 'axis'，Traceback (most recent call last)，AttributeError: 'Tensor' object has no attribute 'bool'？
120 | 答：这是版本问题，建议使用torch1.2以上版本**
121 | **其它有很多稀奇古怪的问题，很多是版本问题，建议按照我的视频教程安装Keras和tensorflow。比如装的是tensorflow2，就不用问我说为什么我没法运行Keras-yolo啥的。那是必然不行的。**
122 | 
123 | ## 3、目标检测库问题汇总（人脸检测和分类库也可参考）
124 | ### a、shape不匹配问题
125 | #### 1）、训练时shape不匹配问题
126 | **问：up主，为什么运行train.py会提示shape不匹配啊？
127 | 答：在keras环境中，因为你训练的种类和原始的种类不同，网络结构会变化，所以最尾部的shape会有少量不匹配。**
128 | 
129 | #### 2）、预测时shape不匹配问题
130 | **问：为什么我运行predict.py会提示我说shape不匹配呀。
131 | 在Pytorch里面是这样的：**
132 | ![在这里插入图片描述](https://img-blog.csdnimg.cn/20200722171631901.png)
133 | 在Keras里面是这样的：
134 | ![在这里插入图片描述](https://img-blog.csdnimg.cn/20200722171523380.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl80NDc5MTk2NA==,size_16,color_FFFFFF,t_70)
135 | **答：原因主要有仨：
136 | 1、在ssd、FasterRCNN里面，可能是train.py里面的num_classes没改。
137 | 2、model_path没改。
138 | 3、classes_path没改。
139 | 请检查清楚了！确定自己所用的model_path和classes_path是对应的！训练的时候用到的num_classes或者classes_path也需要检查！**
140 | 
141 | ### b、显存不足问题
142 | **问：为什么我运行train.py下面的命令行闪的贼快，还提示OOM啥的？ 
143 | 答：这是在keras中出现的，爆显存了，可以改小batch_size，SSD的显存占用率是最小的，建议用SSD；
144 | 2G显存：SSD、YOLOV4-TINY
145 | 4G显存：YOLOV3
146 | 6G显存：YOLOV4、Retinanet、M2det、Efficientdet、Faster RCNN等
147 | 8G+显存：随便选吧。**
148 | **需要注意的是，受到BatchNorm2d影响，batch_size不可为1，至少为2。**
149 | 
150 | **问：为什么提示 RuntimeError: CUDA out of memory. Tried to allocate 52.00 MiB (GPU 0; 15.90 GiB total capacity; 14.85 GiB already allocated; 51.88 MiB free; 15.07 GiB reserved in total by PyTorch)？ 
151 | 答：这是pytorch中出现的，爆显存了，同上。**
152 | 
153 | **问：为什么我显存都没利用，就直接爆显存了？ 
154 | 答：都爆显存了，自然就不利用了，模型没有开始训练。**
155 | ### c、训练问题（冻结训练，LOSS问题、训练效果问题等）
156 | **问：为什么要冻结训练和解冻训练呀？
157 | 答：这是迁移学习的思想，因为神经网络主干特征提取部分所提取到的特征是通用的，我们冻结起来训练可以加快训练效率，也可以防止权值被破坏。**
158 | 在冻结阶段，模型的主干被冻结了，特征提取网络不发生改变。占用的显存较小，仅对网络进行微调。
159 | 在解冻阶段，模型的主干不被冻结了，特征提取网络会发生改变。占用的显存较大，网络所有的参数都会发生改变。
160 | 
161 | **问：为什么我的网络不收敛啊，LOSS是XXXX。
162 | 答：不同网络的LOSS不同，LOSS只是一个参考指标，用于查看网络是否收敛，而非评价网络好坏，我的yolo代码都没有归一化，所以LOSS值看起来比较高，LOSS的值不重要，重要的是是否在变小，预测是否有效果。**
163 | 
164 | **问：为什么我的训练效果不好？预测了没有框（框不准）。
165 | 答：**
166 | 
167 | 考虑几个问题：
168 | 1、目标信息问题，查看2007_train.txt文件是否有目标信息，没有的话请修改voc_annotation.py。
169 | 2、数据集问题，小于500的自行考虑增加数据集，同时测试不同的模型，确认数据集是好的。
170 | 3、是否解冻训练，如果数据集分布与常规画面差距过大需要进一步解冻训练，调整主干，加强特征提取能力。
171 | 4、网络问题，比如SSD不适合小目标，因为先验框固定了。
172 | 5、训练时长问题，有些同学只训练了几代表示没有效果，按默认参数训练完。
173 | 6、确认自己是否按照步骤去做了，如果比如voc_annotation.py里面的classes是否修改了等。
174 | 7、不同网络的LOSS不同，LOSS只是一个参考指标，用于查看网络是否收敛，而非评价网络好坏，LOSS的值不重要，重要的是是否收敛。
175 | 
176 | **问：我怎么出现了gbk什么的编码错误啊：**
177 | ```python
178 | UnicodeDecodeError: 'gbk' codec can't decode byte 0xa6 in position 446: illegal multibyte sequence
179 | ```
180 | **答：标签和路径不要使用中文，如果一定要使用中文，请注意处理的时候编码的问题，改成打开文件的encoding方式改为utf-8。**
181 | 
182 | **问：我的图片是xxx*xxx的分辨率的，可以用吗！**
183 | **答：可以用，代码里面会自动进行resize或者数据增强。**
184 | 
185 | **问：怎么进行多GPU训练？
186 | 答：pytorch的大多数代码可以直接使用gpu训练，keras的话直接百度就好了，实现并不复杂，我没有多卡没法详细测试，还需要各位同学自己努力了。**
187 | ### d、灰度图问题
188 | **问：能不能训练灰度图（预测灰度图）啊？
189 | 答：我的大多数库会将灰度图转化成RGB进行训练和预测，如果遇到代码不能训练或者预测灰度图的情况，可以尝试一下在get_random_data里面将Image.open后的结果转换成RGB，预测的时候也这样试试。（仅供参考）**
190 | 
191 | ### e、断点续练问题
192 | **问：我已经训练过几个世代了，能不能从这个基础上继续开始训练
193 | 答：可以，你在训练前，和载入预训练权重一样载入训练过的权重就行了。一般训练好的权重会保存在logs文件夹里面，将model_path修改成你要开始的权值的路径即可。**
194 | 
195 | ### f、预训练权重的问题
196 | **问：如果我要训练其它的数据集，预训练权重要怎么办啊？**
197 | **答：数据的预训练权重对不同数据集是通用的，因为特征是通用的，预训练权重对于99%的情况都必须要用，不用的话权值太过随机，特征提取效果不明显，网络训练的结果也不会好。**
198 | 
199 | **问：up，我修改了网络，预训练权重还能用吗？
200 | 答：修改了主干的话，如果不是用的现有的网络，基本上预训练权重是不能用的，要么就自己判断权值里卷积核的shape然后自己匹配，要么只能自己预训练去了；修改了后半部分的话，前半部分的主干部分的预训练权重还是可以用的，如果是pytorch代码的话，需要自己修改一下载入权值的方式，判断shape后载入，如果是keras代码，直接by_name=True,skip_mismatch=True即可。**
201 | 权值匹配的方式可以参考如下：
202 | ```python
203 | # 加快模型训练的效率
204 | print('Loading weights into state dict...')
205 | device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
206 | model_dict = model.state_dict()
207 | pretrained_dict = torch.load(model_path, map_location=device)
208 | a = {}
209 | for k, v in pretrained_dict.items():
210 |     try:    
211 |         if np.shape(model_dict[k]) ==  np.shape(v):
212 |             a[k]=v
213 |     except:
214 |         pass
215 | model_dict.update(a)
216 | model.load_state_dict(model_dict)
217 | print('Finished!')
218 | ```
219 | 
220 | **问：我要怎么不使用预训练权重啊？
221 | 答：把载入预训练权重的代码注释了就行。**
222 | 
223 | **问：为什么我不使用预训练权重效果这么差啊？
224 | 答：因为随机初始化的权值不好，提取的特征不好，也就导致了模型训练的效果不好，voc07+12、coco+voc07+12效果都不一样，预训练权重还是非常重要的。**
225 | 
226 | ### g、视频检测问题与摄像头检测问题
227 | **问：怎么用摄像头检测呀？
228 | 答：predict.py修改参数可以进行摄像头检测，也有视频详细解释了摄像头检测的思路。**
229 | 
230 | **问：怎么用视频检测呀？
231 | 答：同上**
232 | ### h、从0开始训练问题
233 | **问：怎么在模型上从0开始训练？
234 | 答：在算力不足与调参能力不足的情况下从0开始训练毫无意义。模型特征提取能力在随机初始化参数的情况下非常差。没有好的参数调节能力和算力，无法使得网络正常收敛。**
235 | 如果一定要从0开始，那么训练的时候请注意几点：
236 |  - 不载入预训练权重。 
237 |  - 不要进行冻结训练，注释冻结模型的代码。
238 | 
239 | **问：为什么我不使用预训练权重效果这么差啊？
240 | 答：因为随机初始化的权值不好，提取的特征不好，也就导致了模型训练的效果不好，voc07+12、coco+voc07+12效果都不一样，预训练权重还是非常重要的。**
241 | 
242 | ### i、保存问题
243 | **问：检测完的图片怎么保存？
244 | 答：一般目标检测用的是Image，所以查询一下PIL库的Image如何进行保存。详细看看predict.py文件的注释。**
245 | 
246 | **问：怎么用视频保存呀？
247 | 答：详细看看predict.py文件的注释。**
248 | 
249 | ### j、遍历问题
250 | **问：如何对一个文件夹的图片进行遍历？
251 | 答：一般使用os.listdir先找出文件夹里面的所有图片，然后根据predict.py文件里面的执行思路检测图片就行了，详细看看predict.py文件的注释。**
252 | 
253 | **问：如何对一个文件夹的图片进行遍历？并且保存。
254 | 答：遍历的话一般使用os.listdir先找出文件夹里面的所有图片，然后根据predict.py文件里面的执行思路检测图片就行了。保存的话一般目标检测用的是Image，所以查询一下PIL库的Image如何进行保存。如果有些库用的是cv2，那就是查一下cv2怎么保存图片。详细看看predict.py文件的注释。**
255 | 
256 | ### k、路径问题（No such file or directory）
257 | **问：我怎么出现了这样的错误呀：**
258 | ```python
259 | FileNotFoundError: 【Errno 2】 No such file or directory
260 | ……………………………………
261 | ……………………………………
262 | ```
263 | **答：去检查一下文件夹路径，查看是否有对应文件；并且检查一下2007_train.txt，其中文件路径是否有错。**
264 | 关于路径有几个重要的点：
265 | **文件夹名称中一定不要有空格。
266 | 注意相对路径和绝对路径。
267 | 多百度路径相关的知识。**
268 | 
269 | **所有的路径问题基本上都是根目录问题，好好查一下相对目录的概念！**
270 | ### l、和原版比较问题
271 | **问：你这个代码和原版比怎么样，可以达到原版的效果么？
272 | 答：基本上可以达到，我都用voc数据测过，我没有好显卡，没有能力在coco上测试与训练。**
273 | 
274 | **问：你有没有实现yolov4所有的tricks，和原版差距多少？
275 | 答：并没有实现全部的改进部分，由于YOLOV4使用的改进实在太多了，很难完全实现与列出来，这里只列出来了一些我比较感兴趣，而且非常有效的改进。论文中提到的SAM（注意力机制模块），作者自己的源码也没有使用。还有其它很多的tricks，不是所有的tricks都有提升，我也没法实现全部的tricks。至于和原版的比较，我没有能力训练coco数据集，根据使用过的同学反应差距不大。**
276 | 
277 | ### m、FPS问题（检测速度问题）
278 | **问：你这个FPS可以到达多少，可以到 XX FPS么？
279 | 答：FPS和机子的配置有关，配置高就快，配置低就慢。**
280 | 
281 | **问：为什么我用服务器去测试yolov4（or others）的FPS只有十几？
282 | 答：检查是否正确安装了tensorflow-gpu或者pytorch的gpu版本，如果已经正确安装，可以去利用time.time()的方法查看detect_image里面，哪一段代码耗时更长（不仅只有网络耗时长，其它处理部分也会耗时，如绘图等）。**
283 | 
284 | **问：为什么论文中说速度可以达到XX，但是这里却没有？
285 | 答：检查是否正确安装了tensorflow-gpu或者pytorch的gpu版本，如果已经正确安装，可以去利用time.time()的方法查看detect_image里面，哪一段代码耗时更长（不仅只有网络耗时长，其它处理部分也会耗时，如绘图等）。有些论文还会使用多batch进行预测，我并没有去实现这个部分。**
286 | 
287 | ### n、预测图片不显示问题
288 | **问：为什么你的代码在预测完成后不显示图片？只是在命令行告诉我有什么目标。
289 | 答：给系统安装一个图片查看器就行了。**
290 | 
291 | ### o、算法评价问题（目标检测的map、PR曲线、Recall、Precision等）
292 | **问：怎么计算map？
293 | 答：看map视频，都一个流程。**
294 | 
295 | **问：计算map的时候，get_map.py里面有一个MINOVERLAP是什么用的，是iou吗？
296 | 答：是iou，它的作用是判断预测框和真实框的重合成度，如果重合程度大于MINOVERLAP，则预测正确。**
297 | 
298 | **问：为什么get_map.py里面的self.confidence（self.score）要设置的那么小？
299 | 答：看一下map的视频的原理部分，要知道所有的结果然后再进行pr曲线的绘制。**
300 | 
301 | **问：能不能说说怎么绘制PR曲线啥的呀。
302 | 答：可以看mAP视频，结果里面有PR曲线。**
303 | 
304 | **问：怎么计算Recall、Precision指标。
305 | 答：这俩指标应该是相对于特定的置信度的，计算map的时候也会获得。**
306 | 
307 | ### p、coco数据集训练问题
308 | **问：目标检测怎么训练COCO数据集啊？。
309 | 答：coco数据训练所需要的txt文件可以参考qqwweee的yolo3的库，格式都是一样的。**
310 | 
311 | ### q、模型优化（模型修改）问题
312 | **问：up，YOLO系列使用Focal LOSS的代码你有吗，有提升吗？
313 | 答：很多人试过，提升效果也不大（甚至变的更Low），它自己有自己的正负样本的平衡方式。**
314 | 
315 | **问：up，我修改了网络，预训练权重还能用吗？
316 | 答：修改了主干的话，如果不是用的现有的网络，基本上预训练权重是不能用的，要么就自己判断权值里卷积核的shape然后自己匹配，要么只能自己预训练去了；修改了后半部分的话，前半部分的主干部分的预训练权重还是可以用的，如果是pytorch代码的话，需要自己修改一下载入权值的方式，判断shape后载入，如果是keras代码，直接by_name=True,skip_mismatch=True即可。**
317 | 权值匹配的方式可以参考如下：
318 | ```python
319 | # 加快模型训练的效率
320 | print('Loading weights into state dict...')
321 | device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
322 | model_dict = model.state_dict()
323 | pretrained_dict = torch.load(model_path, map_location=device)
324 | a = {}
325 | for k, v in pretrained_dict.items():
326 |     try:    
327 |         if np.shape(model_dict[k]) ==  np.shape(v):
328 |             a[k]=v
329 |     except:
330 |         pass
331 | model_dict.update(a)
332 | model.load_state_dict(model_dict)
333 | print('Finished!')
334 | ```
335 | 
336 | **问：up，怎么修改模型啊，我想发个小论文！
337 | 答：建议看看yolov3和yolov4的区别，然后看看yolov4的论文，作为一个大型调参现场非常有参考意义，使用了很多tricks。我能给的建议就是多看一些经典模型，然后拆解里面的亮点结构并使用。**
338 | 
339 | ### r、部署问题
340 | 我没有具体部署到手机等设备上过，所以很多部署问题我并不了解……
341 | 
342 | ## 4、语义分割库问题汇总
343 | ### a、shape不匹配问题
344 | #### 1）、训练时shape不匹配问题
345 | **问：up主，为什么运行train.py会提示shape不匹配啊？
346 | 答：在keras环境中，因为你训练的种类和原始的种类不同，网络结构会变化，所以最尾部的shape会有少量不匹配。**
347 | 
348 | #### 2）、预测时shape不匹配问题
349 | **问：为什么我运行predict.py会提示我说shape不匹配呀。
350 | 在Pytorch里面是这样的：**
351 | ![在这里插入图片描述](https://img-blog.csdnimg.cn/20200722171631901.png)
352 | 在Keras里面是这样的：
353 | ![在这里插入图片描述](https://img-blog.csdnimg.cn/20200722171523380.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl80NDc5MTk2NA==,size_16,color_FFFFFF,t_70)
354 | **答：原因主要有二：
355 | 1、train.py里面的num_classes没改。
356 | 2、预测时num_classes没改。
357 | 请检查清楚！训练和预测的时候用到的num_classes都需要检查！**
358 | 
359 | ### b、显存不足问题
360 | **问：为什么我运行train.py下面的命令行闪的贼快，还提示OOM啥的？ 
361 | 答：这是在keras中出现的，爆显存了，可以改小batch_size。**
362 | 
363 | **需要注意的是，受到BatchNorm2d影响，batch_size不可为1，至少为2。**
364 | 
365 | **问：为什么提示 RuntimeError: CUDA out of memory. Tried to allocate 52.00 MiB (GPU 0; 15.90 GiB total capacity; 14.85 GiB already allocated; 51.88 MiB free; 15.07 GiB reserved in total by PyTorch)？ 
366 | 答：这是pytorch中出现的，爆显存了，同上。**
367 | 
368 | **问：为什么我显存都没利用，就直接爆显存了？ 
369 | 答：都爆显存了，自然就不利用了，模型没有开始训练。**
370 | 
371 | ### c、训练问题（冻结训练，LOSS问题、训练效果问题等）
372 | **问：为什么要冻结训练和解冻训练呀？
373 | 答：这是迁移学习的思想，因为神经网络主干特征提取部分所提取到的特征是通用的，我们冻结起来训练可以加快训练效率，也可以防止权值被破坏。**
374 | **在冻结阶段，模型的主干被冻结了，特征提取网络不发生改变。占用的显存较小，仅对网络进行微调。**
375 | **在解冻阶段，模型的主干不被冻结了，特征提取网络会发生改变。占用的显存较大，网络所有的参数都会发生改变。**
376 | 
377 | **问：为什么我的网络不收敛啊，LOSS是XXXX。
378 | 答：不同网络的LOSS不同，LOSS只是一个参考指标，用于查看网络是否收敛，而非评价网络好坏，我的yolo代码都没有归一化，所以LOSS值看起来比较高，LOSS的值不重要，重要的是是否在变小，预测是否有效果。**
379 | 
380 | **问：为什么我的训练效果不好？预测了没有目标，结果是一片黑。
381 | 答：**
382 | **考虑几个问题：
383 | 1、数据集问题，这是最重要的问题。小于500的自行考虑增加数据集；一定要检查数据集的标签，视频中详细解析了VOC数据集的格式，但并不是有输入图片有输出标签即可，还需要确认标签的每一个像素值是否为它对应的种类。很多同学的标签格式不对，最常见的错误格式就是标签的背景为黑，目标为白，此时目标的像素点值为255，无法正常训练，目标需要为1才行。
384 | 2、是否解冻训练，如果数据集分布与常规画面差距过大需要进一步解冻训练，调整主干，加强特征提取能力。
385 | 3、网络问题，可以尝试不同的网络。
386 | 4、训练时长问题，有些同学只训练了几代表示没有效果，按默认参数训练完。
387 | 5、确认自己是否按照步骤去做了。
388 | 6、不同网络的LOSS不同，LOSS只是一个参考指标，用于查看网络是否收敛，而非评价网络好坏，LOSS的值不重要，重要的是是否收敛。**
389 | 
390 | 
391 | 
392 | **问：为什么我的训练效果不好？对小目标预测不准确。
393 | 答：对于deeplab和pspnet而言，可以修改一下downsample_factor，当downsample_factor为16的时候下采样倍数过多，效果不太好，可以修改为8。**
394 | 
395 | **问：我怎么出现了gbk什么的编码错误啊：**
396 | ```python
397 | UnicodeDecodeError: 'gbk' codec can't decode byte 0xa6 in position 446: illegal multibyte sequence
398 | ```
399 | **答：标签和路径不要使用中文，如果一定要使用中文，请注意处理的时候编码的问题，改成打开文件的encoding方式改为utf-8。**
400 | 
401 | **问：我的图片是xxx*xxx的分辨率的，可以用吗！**
402 | **答：可以用，代码里面会自动进行resize或者数据增强。**
403 | 
404 | **问：怎么进行多GPU训练？
405 | 答：pytorch的大多数代码可以直接使用gpu训练，keras的话直接百度就好了，实现并不复杂，我没有多卡没法详细测试，还需要各位同学自己努力了。**
406 | 
407 | ### d、灰度图问题
408 | **问：能不能训练灰度图（预测灰度图）啊？
409 | 答：我的大多数库会将灰度图转化成RGB进行训练和预测，如果遇到代码不能训练或者预测灰度图的情况，可以尝试一下在get_random_data里面将Image.open后的结果转换成RGB，预测的时候也这样试试。（仅供参考）**
410 | 
411 | ### e、断点续练问题
412 | **问：我已经训练过几个世代了，能不能从这个基础上继续开始训练
413 | 答：可以，你在训练前，和载入预训练权重一样载入训练过的权重就行了。一般训练好的权重会保存在logs文件夹里面，将model_path修改成你要开始的权值的路径即可。**
414 | 
415 | ### f、预训练权重的问题
416 | 
417 | **问：如果我要训练其它的数据集，预训练权重要怎么办啊？**
418 | **答：数据的预训练权重对不同数据集是通用的，因为特征是通用的，预训练权重对于99%的情况都必须要用，不用的话权值太过随机，特征提取效果不明显，网络训练的结果也不会好。**
419 | 
420 | **问：up，我修改了网络，预训练权重还能用吗？
421 | 答：修改了主干的话，如果不是用的现有的网络，基本上预训练权重是不能用的，要么就自己判断权值里卷积核的shape然后自己匹配，要么只能自己预训练去了；修改了后半部分的话，前半部分的主干部分的预训练权重还是可以用的，如果是pytorch代码的话，需要自己修改一下载入权值的方式，判断shape后载入，如果是keras代码，直接by_name=True,skip_mismatch=True即可。**
422 | 权值匹配的方式可以参考如下：
423 | 
424 | ```python
425 | # 加快模型训练的效率
426 | print('Loading weights into state dict...')
427 | device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
428 | model_dict = model.state_dict()
429 | pretrained_dict = torch.load(model_path, map_location=device)
430 | a = {}
431 | for k, v in pretrained_dict.items():
432 |     try:    
433 |         if np.shape(model_dict[k]) ==  np.shape(v):
434 |             a[k]=v
435 |     except:
436 |         pass
437 | model_dict.update(a)
438 | model.load_state_dict(model_dict)
439 | print('Finished!')
440 | ```
441 | 
442 | **问：我要怎么不使用预训练权重啊？
443 | 答：把载入预训练权重的代码注释了就行。**
444 | 
445 | **问：为什么我不使用预训练权重效果这么差啊？
446 | 答：因为随机初始化的权值不好，提取的特征不好，也就导致了模型训练的效果不好，预训练权重还是非常重要的。**
447 | 
448 | ### g、视频检测问题与摄像头检测问题
449 | **问：怎么用摄像头检测呀？
450 | 答：predict.py修改参数可以进行摄像头检测，也有视频详细解释了摄像头检测的思路。**
451 | 
452 | **问：怎么用视频检测呀？
453 | 答：同上**
454 | 
455 | ### h、从0开始训练问题
456 | **问：怎么在模型上从0开始训练？
457 | 答：在算力不足与调参能力不足的情况下从0开始训练毫无意义。模型特征提取能力在随机初始化参数的情况下非常差。没有好的参数调节能力和算力，无法使得网络正常收敛。**
458 | 如果一定要从0开始，那么训练的时候请注意几点：
459 |  - 不载入预训练权重。
460 |  - 不要进行冻结训练，注释冻结模型的代码。
461 | 
462 | **问：为什么我不使用预训练权重效果这么差啊？
463 | 答：因为随机初始化的权值不好，提取的特征不好，也就导致了模型训练的效果不好，预训练权重还是非常重要的。**
464 | 
465 | ### i、保存问题
466 | **问：检测完的图片怎么保存？
467 | 答：一般目标检测用的是Image，所以查询一下PIL库的Image如何进行保存。详细看看predict.py文件的注释。**
468 | 
469 | **问：怎么用视频保存呀？
470 | 答：详细看看predict.py文件的注释。**
471 | 
472 | ### j、遍历问题
473 | **问：如何对一个文件夹的图片进行遍历？
474 | 答：一般使用os.listdir先找出文件夹里面的所有图片，然后根据predict.py文件里面的执行思路检测图片就行了，详细看看predict.py文件的注释。**
475 | 
476 | **问：如何对一个文件夹的图片进行遍历？并且保存。
477 | 答：遍历的话一般使用os.listdir先找出文件夹里面的所有图片，然后根据predict.py文件里面的执行思路检测图片就行了。保存的话一般目标检测用的是Image，所以查询一下PIL库的Image如何进行保存。如果有些库用的是cv2，那就是查一下cv2怎么保存图片。详细看看predict.py文件的注释。**
478 | 
479 | ### k、路径问题（No such file or directory）
480 | **问：我怎么出现了这样的错误呀：**
481 | ```python
482 | FileNotFoundError: 【Errno 2】 No such file or directory
483 | ……………………………………
484 | ……………………………………
485 | ```
486 | 
487 | **答：去检查一下文件夹路径，查看是否有对应文件；并且检查一下2007_train.txt，其中文件路径是否有错。**
488 | 关于路径有几个重要的点：
489 | **文件夹名称中一定不要有空格。
490 | 注意相对路径和绝对路径。
491 | 多百度路径相关的知识。**
492 | 
493 | **所有的路径问题基本上都是根目录问题，好好查一下相对目录的概念！**
494 | 
495 | ### l、FPS问题（检测速度问题）
496 | **问：你这个FPS可以到达多少，可以到 XX FPS么？
497 | 答：FPS和机子的配置有关，配置高就快，配置低就慢。**
498 | 
499 | **问：为什么论文中说速度可以达到XX，但是这里却没有？
500 | 答：检查是否正确安装了tensorflow-gpu或者pytorch的gpu版本，如果已经正确安装，可以去利用time.time()的方法查看detect_image里面，哪一段代码耗时更长（不仅只有网络耗时长，其它处理部分也会耗时，如绘图等）。有些论文还会使用多batch进行预测，我并没有去实现这个部分。**
501 | 
502 | ### m、预测图片不显示问题
503 | **问：为什么你的代码在预测完成后不显示图片？只是在命令行告诉我有什么目标。
504 | 答：给系统安装一个图片查看器就行了。**
505 | 
506 | ### n、算法评价问题（miou）
507 | **问：怎么计算miou？
508 | 答：参考视频里的miou测量部分。**
509 | 
510 | **问：怎么计算Recall、Precision指标。
511 | 答：现有的代码还无法获得，需要各位同学理解一下混淆矩阵的概念，然后自行计算一下。**
512 | 
513 | ### o、模型优化（模型修改）问题
514 | **问：up，我修改了网络，预训练权重还能用吗？
515 | 答：修改了主干的话，如果不是用的现有的网络，基本上预训练权重是不能用的，要么就自己判断权值里卷积核的shape然后自己匹配，要么只能自己预训练去了；修改了后半部分的话，前半部分的主干部分的预训练权重还是可以用的，如果是pytorch代码的话，需要自己修改一下载入权值的方式，判断shape后载入，如果是keras代码，直接by_name=True,skip_mismatch=True即可。**
516 | 权值匹配的方式可以参考如下：
517 | 
518 | ```python
519 | # 加快模型训练的效率
520 | print('Loading weights into state dict...')
521 | device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
522 | model_dict = model.state_dict()
523 | pretrained_dict = torch.load(model_path, map_location=device)
524 | a = {}
525 | for k, v in pretrained_dict.items():
526 |     try:    
527 |         if np.shape(model_dict[k]) ==  np.shape(v):
528 |             a[k]=v
529 |     except:
530 |         pass
531 | model_dict.update(a)
532 | model.load_state_dict(model_dict)
533 | print('Finished!')
534 | ```
535 | 
536 | 
537 | 
538 | **问：up，怎么修改模型啊，我想发个小论文！
539 | 答：建议看看目标检测中yolov4的论文，作为一个大型调参现场非常有参考意义，使用了很多tricks。我能给的建议就是多看一些经典模型，然后拆解里面的亮点结构并使用。常用的tricks如注意力机制什么的，可以试试。**
540 | 
541 | ### p、部署问题
542 | 我没有具体部署到手机等设备上过，所以很多部署问题我并不了解……
543 | 
544 | ## 5、交流群问题
545 | **问：up，有没有QQ群啥的呢？
546 | 答：没有没有，我没有时间管理QQ群……**
547 | 
548 | ## 6、怎么学习的问题
549 | **问：up，你的学习路线怎么样的？我是个小白我要怎么学？
550 | 答：这里有几点需要注意哈
551 | 1、我不是高手，很多东西我也不会，我的学习路线也不一定适用所有人。
552 | 2、我实验室不做深度学习，所以我很多东西都是自学，自己摸索，正确与否我也不知道。
553 | 3、我个人觉得学习更靠自学**
554 | 学习路线的话，我是先学习了莫烦的python教程，从tensorflow、keras、pytorch入门，入门完之后学的SSD，YOLO，然后了解了很多经典的卷积网，后面就开始学很多不同的代码了，我的学习方法就是一行一行的看，了解整个代码的执行流程，特征层的shape变化等，花了很多时间也没有什么捷径，就是要花时间吧。


--------------------------------------------------------------------------------