├── nets ├── __init__.py ├── arcface.py ├── arcface_training.py ├── mobilefacenet.py ├── mobilenet.py ├── iresnet.py ├── mobilenetv3.py └── mobilenetv2.py ├── utils ├── __init__.py ├── callbacks.py ├── dataloader.py ├── utils_metrics.py └── utils.py ├── datasets └── README.md ├── lfw └── README.md ├── logs └── README.md ├── img ├── 1_001.jpg ├── 1_002.jpg └── 2_001.jpg ├── model_data ├── roc.png └── arcface_mobilefacenet.h5 ├── txt_annotation.py ├── summary.py ├── LICENSE ├── eval_LFW.py ├── predict.py ├── .gitignore ├── README.md ├── arcface.py ├── train.py └── 常见问题汇总.md /nets/__init__.py: -------------------------------------------------------------------------------- 1 | # -------------------------------------------------------------------------------- /utils/__init__.py: -------------------------------------------------------------------------------- 1 | # -------------------------------------------------------------------------------- /datasets/README.md: -------------------------------------------------------------------------------- 1 | 存放数据集 -------------------------------------------------------------------------------- /lfw/README.md: -------------------------------------------------------------------------------- 1 | 存放lfw数据集 -------------------------------------------------------------------------------- /logs/README.md: -------------------------------------------------------------------------------- 1 | 用于存放训练好的文件 -------------------------------------------------------------------------------- /img/1_001.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bubbliiiing/arcface-keras/HEAD/img/1_001.jpg -------------------------------------------------------------------------------- /img/1_002.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bubbliiiing/arcface-keras/HEAD/img/1_002.jpg -------------------------------------------------------------------------------- /img/2_001.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bubbliiiing/arcface-keras/HEAD/img/2_001.jpg -------------------------------------------------------------------------------- /model_data/roc.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bubbliiiing/arcface-keras/HEAD/model_data/roc.png -------------------------------------------------------------------------------- /model_data/arcface_mobilefacenet.h5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bubbliiiing/arcface-keras/HEAD/model_data/arcface_mobilefacenet.h5 -------------------------------------------------------------------------------- /txt_annotation.py: -------------------------------------------------------------------------------- 1 | #------------------------------------------------# 2 | # 进行训练前需要利用这个文件生成cls_train.txt 3 | #------------------------------------------------# 4 | import os 5 | 6 | if __name__ == "__main__": 7 | #---------------------# 8 | # 训练集所在的路径 9 | #---------------------# 10 | datasets_path = "datasets" 11 | 12 | types_name = os.listdir(datasets_path) 13 | types_name = sorted(types_name) 14 | 15 | list_file = open('cls_train.txt', 'w') 16 | for cls_id, type_name in enumerate(types_name): 17 | photos_path = os.path.join(datasets_path, type_name) 18 | if not os.path.isdir(photos_path): 19 | continue 20 | photos_name = os.listdir(photos_path) 21 | 22 | for photo_name in photos_name: 23 | list_file.write(str(cls_id) + ";" + '%s'%(os.path.join(os.path.abspath(datasets_path), type_name, photo_name))) 24 | list_file.write('\n') 25 | list_file.close() 26 | -------------------------------------------------------------------------------- /summary.py: -------------------------------------------------------------------------------- 1 | #--------------------------------------------# 2 | # 该部分代码只用于看网络结构,并非测试代码 3 | #--------------------------------------------# 4 | from nets.arcface import arcface 5 | from utils.utils import net_flops 6 | 7 | if __name__ == "__main__": 8 | input_shape = [112, 112] 9 | backbone = "mobilefacenet" 10 | model = arcface([input_shape[0], input_shape[1], 3], 10575, backbone=backbone, mode="predict") 11 | #--------------------------------------------# 12 | # 查看网络结构网络结构 13 | #--------------------------------------------# 14 | model.summary() 15 | #------------------------------------------# 16 | # 计算网络的FLOPS 17 | #--------------------------------------------# 18 | net_flops(model, table=False) 19 | 20 | #--------------------------------------------# 21 | # 获得网络每个层的名称与序号 22 | #--------------------------------------------# 23 | # for i,layer in enumerate(model.layers): 24 | # print(i,layer.name) 25 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2022 Bubbliiiing 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /eval_LFW.py: -------------------------------------------------------------------------------- 1 | from nets.arcface import arcface 2 | from utils.dataloader import LFWDataset 3 | from utils.utils_metrics import test 4 | 5 | if __name__ == "__main__": 6 | #--------------------------------------# 7 | # 主干特征提取网络的选择 8 | # mobilefacenet 9 | # mobilenetv1 10 | # mobilenetv2 11 | # mobilenetv3 12 | # iresnet50 13 | #--------------------------------------# 14 | backbone = "mobilefacenet" 15 | #--------------------------------------# 16 | # 输入图像大小 17 | #--------------------------------------# 18 | input_shape = [112, 112, 3] 19 | #--------------------------------------# 20 | # 训练好的权值文件 21 | #--------------------------------------# 22 | model_path = "model_data/arcface_mobilefacenet.h5" 23 | #--------------------------------------# 24 | # LFW评估数据集的文件路径 25 | # 以及对应的txt文件 26 | #--------------------------------------# 27 | lfw_dir_path = "lfw" 28 | lfw_pairs_path = "model_data/lfw_pair.txt" 29 | #--------------------------------------# 30 | # 评估的批次大小和记录间隔 31 | #--------------------------------------# 32 | batch_size = 256 33 | log_interval = 1 34 | #--------------------------------------# 35 | # ROC图的保存路径 36 | #--------------------------------------# 37 | png_save_path = "model_data/roc_test.png" 38 | 39 | test_loader = LFWDataset(dir=lfw_dir_path,pairs_path=lfw_pairs_path, batch_size=batch_size, input_shape=input_shape) 40 | 41 | model = arcface(input_shape, None, backbone=backbone, mode="predict") 42 | model.load_weights(model_path, by_name=True) 43 | 44 | test(test_loader, model, png_save_path, log_interval, batch_size) 45 | -------------------------------------------------------------------------------- /predict.py: -------------------------------------------------------------------------------- 1 | from PIL import Image 2 | 3 | from arcface import Arcface 4 | 5 | if __name__ == "__main__": 6 | model = Arcface() 7 | 8 | #----------------------------------------------------------------------------------------------------------# 9 | # mode用于指定测试的模式: 10 | # 'predict' 表示单张图片预测,如果想对预测过程进行修改,如保存图片,截取对象等,可以先看下方详细的注释 11 | # 'fps' 表示测试fps,使用的图片是img里面的street.jpg,详情查看下方注释。 12 | #----------------------------------------------------------------------------------------------------------# 13 | mode = "predict" 14 | #-------------------------------------------------------------------------# 15 | # test_interval 用于指定测量fps的时候,图片检测的次数 16 | # 理论上test_interval越大,fps越准确。 17 | # fps_test_image fps测试图片 18 | #-------------------------------------------------------------------------# 19 | test_interval = 100 20 | fps_test_image = 'img/1_001.jpg' 21 | 22 | if mode == "predict": 23 | while True: 24 | image_1 = input('Input image_1 filename:') 25 | try: 26 | image_1 = Image.open(image_1) 27 | except: 28 | print('Image_1 Open Error! Try again!') 29 | continue 30 | 31 | image_2 = input('Input image_2 filename:') 32 | try: 33 | image_2 = Image.open(image_2) 34 | except: 35 | print('Image_2 Open Error! Try again!') 36 | continue 37 | 38 | probability = model.detect_image(image_1,image_2) 39 | print(probability) 40 | 41 | elif mode == "fps": 42 | img = Image.open(fps_test_image) 43 | tact_time = model.get_FPS(img, test_interval) 44 | print(str(tact_time) + ' seconds, ' + str(1/tact_time) + 'FPS, @batch_size 1') -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # ignore map, miou, datasets 2 | map_out/ 3 | miou_out/ 4 | VOCdevkit/ 5 | datasets/ 6 | Medical_Datasets/ 7 | lfw/ 8 | logs/ 9 | model_data/ 10 | 11 | # Byte-compiled / optimized / DLL files 12 | __pycache__/ 13 | *.py[cod] 14 | *$py.class 15 | 16 | # C extensions 17 | *.so 18 | 19 | # Distribution / packaging 20 | .Python 21 | build/ 22 | develop-eggs/ 23 | dist/ 24 | downloads/ 25 | eggs/ 26 | .eggs/ 27 | lib/ 28 | lib64/ 29 | parts/ 30 | sdist/ 31 | var/ 32 | wheels/ 33 | pip-wheel-metadata/ 34 | share/python-wheels/ 35 | *.egg-info/ 36 | .installed.cfg 37 | *.egg 38 | MANIFEST 39 | 40 | # PyInstaller 41 | # Usually these files are written by a python script from a template 42 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 43 | *.manifest 44 | *.spec 45 | 46 | # Installer logs 47 | pip-log.txt 48 | pip-delete-this-directory.txt 49 | 50 | # Unit test / coverage reports 51 | htmlcov/ 52 | .tox/ 53 | .nox/ 54 | .coverage 55 | .coverage.* 56 | .cache 57 | nosetests.xml 58 | coverage.xml 59 | *.cover 60 | *.py,cover 61 | .hypothesis/ 62 | .pytest_cache/ 63 | 64 | # Translations 65 | *.mo 66 | *.pot 67 | 68 | # Django stuff: 69 | *.log 70 | local_settings.py 71 | db.sqlite3 72 | db.sqlite3-journal 73 | 74 | # Flask stuff: 75 | instance/ 76 | .webassets-cache 77 | 78 | # Scrapy stuff: 79 | .scrapy 80 | 81 | # Sphinx documentation 82 | docs/_build/ 83 | 84 | # PyBuilder 85 | target/ 86 | 87 | # Jupyter Notebook 88 | .ipynb_checkpoints 89 | 90 | # IPython 91 | profile_default/ 92 | ipython_config.py 93 | 94 | # pyenv 95 | .python-version 96 | 97 | # pipenv 98 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 99 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 100 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 101 | # install all needed dependencies. 102 | #Pipfile.lock 103 | 104 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow 105 | __pypackages__/ 106 | 107 | # Celery stuff 108 | celerybeat-schedule 109 | celerybeat.pid 110 | 111 | # SageMath parsed files 112 | *.sage.py 113 | 114 | # Environments 115 | .env 116 | .venv 117 | env/ 118 | venv/ 119 | ENV/ 120 | env.bak/ 121 | venv.bak/ 122 | 123 | # Spyder project settings 124 | .spyderproject 125 | .spyproject 126 | 127 | # Rope project settings 128 | .ropeproject 129 | 130 | # mkdocs documentation 131 | /site 132 | 133 | # mypy 134 | .mypy_cache/ 135 | .dmypy.json 136 | dmypy.json 137 | 138 | # Pyre type checker 139 | .pyre/ 140 | -------------------------------------------------------------------------------- /nets/arcface.py: -------------------------------------------------------------------------------- 1 | import keras.backend as K 2 | import tensorflow as tf 3 | from keras import initializers 4 | from keras.layers import Input, Lambda, Layer 5 | from keras.models import Model 6 | from keras.regularizers import l2 7 | 8 | from nets.iresnet import iResNet50 9 | from nets.mobilefacenet import mobilefacenet 10 | from nets.mobilenet import MobilenetV1 11 | from nets.mobilenetv2 import MobilenetV2 12 | from nets.mobilenetv3 import MobileNetV3_Large, MobilenetV3_small 13 | 14 | 15 | class ArcMarginProduct(Layer) : 16 | def __init__(self, n_classes=1000, **kwargs) : 17 | self.n_classes = n_classes 18 | super(ArcMarginProduct, self).__init__(**kwargs) 19 | 20 | def build(self, input_shape) : 21 | self.W = self.add_weight(name='W', 22 | shape=(input_shape[-1], self.n_classes), 23 | initializer=initializers.glorot_uniform(), 24 | trainable=True, 25 | regularizer=l2(5e-4)) 26 | super(ArcMarginProduct, self).build(input_shape) 27 | 28 | def call(self, input) : 29 | W = tf.nn.l2_normalize(self.W, axis=0) 30 | logits = input @ W 31 | return K.clip(logits, -1 + K.epsilon(), 1 - K.epsilon()) 32 | 33 | def compute_output_shape(self, input_shape) : 34 | return (None, self.n_classes) 35 | 36 | def arcface(input_shape, num_classes=None, backbone="mobilefacenet", mode="train"): 37 | inputs = Input(shape=input_shape) 38 | 39 | if backbone=="mobilefacenet": 40 | embedding_size = 128 41 | x = mobilefacenet(inputs, embedding_size) 42 | elif backbone=="mobilenetv1": 43 | embedding_size = 512 44 | x = MobilenetV1(inputs, embedding_size, dropout_keep_prob=0.5) 45 | elif backbone=="mobilenetv2": 46 | embedding_size = 512 47 | x = MobilenetV2(inputs, embedding_size, dropout_keep_prob=0.5) 48 | elif backbone=="mobilenetv3": 49 | embedding_size = 512 50 | x = MobileNetV3_Large(inputs, embedding_size, dropout_keep_prob=0.5) 51 | elif backbone=="iresnet50": 52 | embedding_size = 512 53 | x = iResNet50(inputs, embedding_size, dropout_keep_prob=0.5) 54 | else: 55 | raise ValueError('Unsupported backbone - `{}`, Use mobilefacenet, mobilenetv1, mobilenetv2, mobilenetv3, iresnet50.'.format(mode)) 56 | 57 | if mode == "train": 58 | predict = Lambda(lambda x: K.l2_normalize(x, axis=1), name="l2_normalize")(x) 59 | x = ArcMarginProduct(num_classes, name="ArcMargin")(predict) 60 | model = Model(inputs, [x, predict]) 61 | return model 62 | elif mode == "predict": 63 | x = Lambda(lambda x: K.l2_normalize(x, axis=1))(x) 64 | model = Model(inputs, x) 65 | return model 66 | else: 67 | raise ValueError('Unsupported mode - `{}`, Use train, predict.'.format(mode)) 68 | -------------------------------------------------------------------------------- /nets/arcface_training.py: -------------------------------------------------------------------------------- 1 | import math 2 | from functools import partial 3 | 4 | import keras.backend as K 5 | import tensorflow as tf 6 | 7 | 8 | class ArcFaceLoss() : 9 | def __init__(self, s=32.0, m=0.5) : 10 | self.s = s 11 | 12 | self.cos_m = math.cos(m) 13 | self.sin_m = math.sin(m) 14 | 15 | self.th = math.cos(math.pi - m) 16 | self.mm = math.sin(math.pi - m) * m 17 | 18 | def __call__(self, y_true, y_pred, sample_weight=None): 19 | labels = tf.cast(y_true, tf.float32) 20 | cosine = tf.cast(y_pred, tf.float32) 21 | #----------------------------------------------------# 22 | # batch_size, 10575 -> batch_size, 10575 23 | #----------------------------------------------------# 24 | sine = tf.sqrt(1 - tf.square(cosine)) 25 | phi = cosine * self.cos_m - sine * self.sin_m 26 | phi = tf.where(cosine > self.th, phi, cosine - self.mm) 27 | 28 | output = (labels * phi) + ((1.0 - labels) * cosine) 29 | output *= self.s 30 | 31 | losses = K.categorical_crossentropy(y_true, output, from_logits=True) 32 | # losses = tf.Print(losses,[tf.shape(losses),tf.shape(y_true),tf.shape(output)]) 33 | return losses 34 | 35 | def get_lr_scheduler(lr_decay_type, lr, min_lr, total_iters, warmup_iters_ratio = 0.1, warmup_lr_ratio = 0.1, no_aug_iter_ratio = 0.3, step_num = 10): 36 | def yolox_warm_cos_lr(lr, min_lr, total_iters, warmup_total_iters, warmup_lr_start, no_aug_iter, iters): 37 | if iters <= warmup_total_iters: 38 | # lr = (lr - warmup_lr_start) * iters / float(warmup_total_iters) + warmup_lr_start 39 | lr = (lr - warmup_lr_start) * pow(iters / float(warmup_total_iters), 2 40 | ) + warmup_lr_start 41 | elif iters >= total_iters - no_aug_iter: 42 | lr = min_lr 43 | else: 44 | lr = min_lr + 0.5 * (lr - min_lr) * ( 45 | 1.0 46 | + math.cos( 47 | math.pi 48 | * (iters - warmup_total_iters) 49 | / (total_iters - warmup_total_iters - no_aug_iter) 50 | ) 51 | ) 52 | return lr 53 | 54 | def step_lr(lr, decay_rate, step_size, iters): 55 | if step_size < 1: 56 | raise ValueError("step_size must above 1.") 57 | n = iters // step_size 58 | out_lr = lr * decay_rate ** n 59 | return out_lr 60 | 61 | if lr_decay_type == "cos": 62 | warmup_total_iters = min(max(warmup_iters_ratio * total_iters, 1), 3) 63 | warmup_lr_start = max(warmup_lr_ratio * lr, 1e-6) 64 | no_aug_iter = min(max(no_aug_iter_ratio * total_iters, 1), 15) 65 | func = partial(yolox_warm_cos_lr ,lr, min_lr, total_iters, warmup_total_iters, warmup_lr_start, no_aug_iter) 66 | else: 67 | decay_rate = (min_lr / lr) ** (1 / (step_num - 1)) 68 | step_size = total_iters / step_num 69 | func = partial(step_lr, lr, decay_rate, step_size) 70 | 71 | return func 72 | 73 | -------------------------------------------------------------------------------- /nets/mobilefacenet.py: -------------------------------------------------------------------------------- 1 | import keras 2 | from keras import backend as K 3 | from keras import initializers 4 | from keras.layers import (BatchNormalization, Conv2D, DepthwiseConv2D, PReLU, Flatten, 5 | add) 6 | 7 | 8 | def conv_block(inputs, filters, kernel_size, strides, padding): 9 | x = Conv2D(filters, kernel_size, strides=strides, padding=padding, use_bias=False, 10 | kernel_initializer=initializers.random_normal(stddev=0.1), 11 | bias_initializer='zeros')(inputs) 12 | x = BatchNormalization(axis=-1, epsilon=1e-5)(x) 13 | x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x) 14 | return x 15 | 16 | def depthwise_conv_block(inputs, filters, kernel_size, strides): 17 | x = DepthwiseConv2D(kernel_size, strides=strides, padding="same", use_bias=False, 18 | depthwise_initializer=initializers.random_normal(stddev=0.1), 19 | bias_initializer='zeros')(inputs) 20 | x = BatchNormalization(axis=-1, epsilon=1e-5)(x) 21 | x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x) 22 | return x 23 | 24 | def bottleneck(inputs, filters, kernel, t, strides, r=False): 25 | tchannel = K.int_shape(inputs)[-1] * t 26 | x = conv_block(inputs, tchannel, 1, 1, "same") 27 | 28 | x = DepthwiseConv2D(kernel, strides=strides, padding="same", depth_multiplier=1, use_bias=False, 29 | depthwise_initializer=initializers.random_normal(stddev=0.1), 30 | bias_initializer='zeros')(x) 31 | x = BatchNormalization(axis=-1, epsilon=1e-5)(x) 32 | x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x) 33 | 34 | x = Conv2D(filters, 1, strides=1, padding="same", use_bias=False, 35 | kernel_initializer=initializers.random_normal(stddev=0.1), 36 | bias_initializer='zeros')(x) 37 | x = BatchNormalization(axis=-1, epsilon=1e-5)(x) 38 | if r: 39 | x = add([x, inputs]) 40 | return x 41 | 42 | def inverted_residual_block(inputs, filters, kernel, t, n): 43 | x = inputs 44 | for _ in range(n): 45 | x = bottleneck(x, filters, kernel, t, 1, True) 46 | return x 47 | 48 | def mobilefacenet(inputs, embedding_size): 49 | x = conv_block(inputs, 64, 3, 2, "same") # Output Shape: (56, 56, 64) 50 | x = depthwise_conv_block(x, 64, 3, 1) # (56, 56, 64) 51 | 52 | x = bottleneck(x, 64, 3, t=2, strides=2) 53 | x = inverted_residual_block(x, 64, 3, t=2, n=4) # (28, 28, 64) 54 | 55 | x = bottleneck(x, 128, 3, t=4, strides=2) # (14, 14, 128) 56 | x = inverted_residual_block(x, 128, 3, t=2, n=6) # (14, 14, 128) 57 | 58 | x = bottleneck(x, 128, 3, t=4, strides=2) # (14, 14, 128) 59 | x = inverted_residual_block(x, 128, 3, t=2, n=2) # (7, 7, 128) 60 | 61 | x = Conv2D(512, 1, use_bias=False, name="conv2d", 62 | kernel_initializer=initializers.random_normal(stddev=0.1), 63 | bias_initializer='zeros')(x) 64 | x = BatchNormalization(epsilon=1e-5)(x) 65 | x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x) 66 | 67 | x = DepthwiseConv2D(int(x.shape[1]), depth_multiplier=1, use_bias=False, 68 | depthwise_initializer=initializers.random_normal(stddev=0.1), 69 | bias_initializer='zeros')(x) 70 | x = BatchNormalization(epsilon=1e-5)(x) 71 | 72 | x = Conv2D(embedding_size, 1, use_bias=False, 73 | kernel_initializer=initializers.random_normal(stddev=0.1), 74 | bias_initializer='zeros')(x) 75 | x = BatchNormalization(name="embedding", epsilon=1e-5)(x) 76 | x = Flatten()(x) 77 | 78 | return x 79 | -------------------------------------------------------------------------------- /nets/mobilenet.py: -------------------------------------------------------------------------------- 1 | 2 | from keras import backend as K 3 | from keras import initializers 4 | from keras.layers import (Activation, BatchNormalization, Conv2D, Dense, 5 | DepthwiseConv2D, Dropout, Flatten, PReLU) 6 | 7 | 8 | def _conv_block(inputs, filters, kernel=(3, 3), strides=(1, 1)): 9 | x = Conv2D(filters, kernel, 10 | padding='same', 11 | use_bias=False, 12 | strides=strides, 13 | name='conv1', 14 | kernel_initializer=initializers.random_normal(stddev=0.1), 15 | bias_initializer='zeros')(inputs) 16 | x = BatchNormalization(name='conv1_bn', epsilon=1e-5)(x) 17 | return Activation(relu6, name='conv1_relu')(x) 18 | 19 | 20 | def _depthwise_conv_block(inputs, pointwise_conv_filters, 21 | depth_multiplier=1, strides=(1, 1), block_id=1): 22 | x = DepthwiseConv2D((3, 3), 23 | padding='same', 24 | depth_multiplier=depth_multiplier, 25 | strides=strides, 26 | use_bias=False, 27 | name='conv_dw_%d' % block_id, 28 | depthwise_initializer=initializers.random_normal(stddev=0.1), 29 | bias_initializer='zeros')(inputs) 30 | 31 | x = BatchNormalization(name='conv_dw_%d_bn' % block_id, epsilon=1e-5)(x) 32 | x = Activation(relu6, name='conv_dw_%d_relu' % block_id)(x) 33 | 34 | x = Conv2D(pointwise_conv_filters, (1, 1), 35 | padding='same', 36 | use_bias=False, 37 | strides=(1, 1), 38 | name='conv_pw_%d' % block_id, 39 | kernel_initializer=initializers.random_normal(stddev=0.1), 40 | bias_initializer='zeros')(x) 41 | x = BatchNormalization(name='conv_pw_%d_bn' % block_id, epsilon=1e-5)(x) 42 | return Activation(relu6, name='conv_pw_%d_relu' % block_id)(x) 43 | 44 | def relu6(x): 45 | return K.relu(x, max_value=6) 46 | 47 | def MobilenetV1(inputs, embedding_size, dropout_keep_prob=0.5, depth_multiplier=1): 48 | x = _conv_block(inputs, 32, strides=(1, 1)) 49 | x = _depthwise_conv_block(x, 64, depth_multiplier, block_id=1) 50 | 51 | x = _depthwise_conv_block(x, 128, depth_multiplier, strides=(2, 2), block_id=2) 52 | x = _depthwise_conv_block(x, 128, depth_multiplier, block_id=3) 53 | 54 | x = _depthwise_conv_block(x, 256, depth_multiplier, strides=(2, 2), block_id=4) 55 | x = _depthwise_conv_block(x, 256, depth_multiplier, block_id=5) 56 | 57 | x = _depthwise_conv_block(x, 512, depth_multiplier, strides=(2, 2), block_id=6) 58 | x = _depthwise_conv_block(x, 512, depth_multiplier, block_id=7) 59 | x = _depthwise_conv_block(x, 512, depth_multiplier, block_id=8) 60 | x = _depthwise_conv_block(x, 512, depth_multiplier, block_id=9) 61 | x = _depthwise_conv_block(x, 512, depth_multiplier, block_id=10) 62 | x = _depthwise_conv_block(x, 512, depth_multiplier, block_id=11) 63 | 64 | x = _depthwise_conv_block(x, 1024, depth_multiplier, strides=(2, 2), block_id=12) 65 | x = _depthwise_conv_block(x, 1024, depth_multiplier, block_id=13) 66 | 67 | x = Conv2D(512, kernel_size=1, use_bias=False, name='sep', 68 | kernel_initializer=initializers.random_normal(stddev=0.1), 69 | bias_initializer='zeros')(x) 70 | x = BatchNormalization(name='sep_bn', epsilon=1e-5)(x) 71 | x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x) 72 | 73 | x = BatchNormalization(name='bn2', epsilon=1e-5)(x) 74 | x = Dropout(p=dropout_keep_prob)(x) 75 | x = Flatten()(x) 76 | x = Dense(embedding_size, name='linear', 77 | kernel_initializer=initializers.random_normal(stddev=0.1), 78 | bias_initializer='zeros')(x) 79 | x = BatchNormalization(name='features', epsilon=1e-5)(x) 80 | return x 81 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ## Arcface:人脸识别模型在Keras当中的实现 2 | --- 3 | 4 | ## 目录 5 | 1. [仓库更新 Top News](#仓库更新) 6 | 2. [相关仓库 Related code](#相关仓库) 7 | 3. [性能情况 Performance](#性能情况) 8 | 4. [所需环境 Environment](#所需环境) 9 | 5. [注意事项 Attention](#注意事项) 10 | 6. [文件下载 Download](#文件下载) 11 | 7. [预测步骤 How2predict](#预测步骤) 12 | 8. [训练步骤 How2train](#训练步骤) 13 | 9. [参考资料 Reference](#Reference) 14 | 15 | ## Top News 16 | **`2022-03`**:**创建仓库,支持不同模型训练,支持大量可调整参数,支持step、cos学习率下降法、支持adam、sgd优化器选择、支持学习率根据batch_size自适应调整、新增图片裁剪。** 17 | 18 | ## 相关仓库 19 | | 模型 | 路径 | 20 | | :----- | :----- | 21 | facenet | https://github.com/bubbliiiing/facenet-keras 22 | arcface | https://github.com/bubbliiiing/arcface-keras 23 | retinaface | https://github.com/bubbliiiing/retinaface-keras 24 | facenet + retinaface | https://github.com/bubbliiiing/facenet-retinaface-keras 25 | 26 | ## 性能情况 27 | | 训练数据集 | 权值文件名称 | 测试数据集 | 输入图片大小 | accuracy | Validation rate | 28 | | :-----: | :-----: | :------: | :------: | :------: | :------: | 29 | | CASIA-WebFace | [arcface_mobilenet.h5](https://github.com/bubbliiiing/arcface-keras/releases/download/v1.0/arcface_mobilenet.h5) | LFW | 112x112 | 99.00% | 0.95200+-0.02237 @ FAR=0.00100 | 30 | | CASIA-WebFace | [arcface_mobilefacenet.h5](https://github.com/bubbliiiing/arcface-keras/releases/download/v1.0/arcface_mobilefacenet.h5) | LFW | 112x112 | 99.02% | 0.96500+-0.01344 @ FAR=0.00133 | 31 | | CASIA-WebFace | [arcface_iresnet50.h5](https://github.com/bubbliiiing/arcface-keras/releases/download/v1.0/arcface_iresnet50.h5) | LFW | 112x112 | 98.98% | 0.92967+-0.01935 @ FAR=0.00133 | 32 | 33 | ## 所需环境 34 | tensorflow==1.13.2 35 | keras=2.1.5 36 | 37 | ## 文件下载 38 | 已经训练好的权值可以在百度网盘下载。 39 | 链接:https://pan.baidu.com/s/1P3-T6_PoXGTMYa_VuiwXmw 提取码: 114e 40 | 41 | 训练用的CASIA-WebFaces数据集以及评估用的LFW数据集可以在百度网盘下载。 42 | 链接: https://pan.baidu.com/s/1qMxFR8H_ih0xmY-rKgRejw 提取码: bcrq 43 | 44 | ## 预测步骤 45 | ### a、使用预训练权重 46 | 1. 下载完库后解压,可直接运行predict.py输入: 47 | ```python 48 | img\1_001.jpg 49 | img\1_002.jpg 50 | ``` 51 | 2. 也可以在百度网盘下载权值,放入model_data,修改arcface.py文件的model_path后,输入: 52 | ```python 53 | img\1_001.jpg 54 | img\1_002.jpg 55 | ``` 56 | ### b、使用自己训练的权重 57 | 1. 按照训练步骤训练。 58 | 2. 在arcface.py文件里面,在如下部分修改model_path和backbone使其对应训练好的文件;**model_path对应logs文件夹下面的权值文件,backbone对应主干特征提取网络**。 59 | ```python 60 | _defaults = { 61 | #--------------------------------------------------------------------------# 62 | # 使用自己训练好的模型进行预测要修改model_path,指向logs文件夹下的权值文件 63 | # 训练好后logs文件夹下存在多个权值文件,选择验证集损失较低的即可。 64 | # 验证集损失较低不代表准确度较高,仅代表该权值在验证集上泛化性能较好。 65 | #--------------------------------------------------------------------------# 66 | "model_path" : "model_data/arcface_mobilefacenet.h5", 67 | #-------------------------------------------# 68 | # 输入图片的大小。 69 | #-------------------------------------------# 70 | "input_shape" : [112, 112, 3], 71 | #-------------------------------------------# 72 | # 所使用到的主干特征提取网络,与训练的相同 73 | # mobilefacenet 74 | # mobilenetv1 75 | # iresnet50 76 | #-------------------------------------------# 77 | "backbone" : "mobilefacenet", 78 | #-------------------------------------------# 79 | # 是否进行不失真的resize 80 | #-------------------------------------------# 81 | "letterbox_image" : True, 82 | } 83 | ``` 84 | 3. 运行predict.py,输入 85 | ```python 86 | img\1_001.jpg 87 | img\1_002.jpg 88 | ``` 89 | 90 | ## 训练步骤 91 | 1. 本文使用如下格式进行训练。 92 | ``` 93 | |-datasets 94 | |-people0 95 | |-123.jpg 96 | |-234.jpg 97 | |-people1 98 | |-345.jpg 99 | |-456.jpg 100 | |-... 101 | ``` 102 | 2. 下载好数据集,将训练用的CASIA-WebFaces数据集以及评估用的LFW数据集,解压后放在根目录。 103 | 3. 在训练前利用txt_annotation.py文件生成对应的cls_train.txt。 104 | 4. 利用train.py训练模型,训练前,根据自己的需要选择backbone,model_path和backbone一定要对应。 105 | 5. 运行train.py即可开始训练。 106 | 107 | ## 评估步骤 108 | 1. 下载好评估数据集,将评估用的LFW数据集,解压后放在根目录 109 | 2. 在eval_LFW.py设置使用的主干特征提取网络和网络权值。 110 | 3. 运行eval_LFW.py来进行模型准确率评估。 111 | 112 | ## Reference 113 | https://github.com/deepinsight/insightface 114 | https://github.com/timesler/facenet-pytorch 115 | 116 | -------------------------------------------------------------------------------- /utils/callbacks.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | import keras 4 | import matplotlib 5 | import numpy as np 6 | matplotlib.use('Agg') 7 | from matplotlib import pyplot as plt 8 | import scipy.signal 9 | from keras import backend as K 10 | 11 | from utils.utils_metrics import evaluate 12 | 13 | 14 | class LossHistory(keras.callbacks.Callback): 15 | def __init__(self, log_dir): 16 | self.log_dir = log_dir 17 | self.losses = [] 18 | self.val_loss = [] 19 | 20 | os.makedirs(self.log_dir) 21 | 22 | def on_epoch_end(self, epoch, logs={}): 23 | if not os.path.exists(self.log_dir): 24 | os.makedirs(self.save_path) 25 | 26 | self.losses.append(logs.get('loss')) 27 | self.val_loss.append(logs.get('val_loss')) 28 | 29 | with open(os.path.join(self.log_dir, "epoch_loss.txt"), 'a') as f: 30 | f.write(str(logs.get('loss'))) 31 | f.write("\n") 32 | with open(os.path.join(self.log_dir, "epoch_val_loss.txt"), 'a') as f: 33 | f.write(str(logs.get('val_loss'))) 34 | f.write("\n") 35 | self.loss_plot() 36 | 37 | def loss_plot(self): 38 | iters = range(len(self.losses)) 39 | 40 | plt.figure() 41 | plt.plot(iters, self.losses, 'red', linewidth = 2, label='train loss') 42 | plt.plot(iters, self.val_loss, 'coral', linewidth = 2, label='val loss') 43 | try: 44 | if len(self.losses) < 25: 45 | num = 5 46 | else: 47 | num = 15 48 | 49 | plt.plot(iters, scipy.signal.savgol_filter(self.losses, num, 3), 'green', linestyle = '--', linewidth = 2, label='smooth train loss') 50 | plt.plot(iters, scipy.signal.savgol_filter(self.val_loss, num, 3), '#8B4513', linestyle = '--', linewidth = 2, label='smooth val loss') 51 | except: 52 | pass 53 | 54 | plt.grid(True) 55 | plt.xlabel('Epoch') 56 | plt.ylabel('Loss') 57 | plt.title('A Loss Curve') 58 | plt.legend(loc="upper right") 59 | 60 | plt.savefig(os.path.join(self.log_dir, "epoch_loss.png")) 61 | 62 | plt.cla() 63 | plt.close("all") 64 | 65 | 66 | class ExponentDecayScheduler(keras.callbacks.Callback): 67 | def __init__(self, 68 | decay_rate, 69 | verbose=0): 70 | super(ExponentDecayScheduler, self).__init__() 71 | self.decay_rate = decay_rate 72 | self.verbose = verbose 73 | self.learning_rates = [] 74 | 75 | def on_epoch_end(self, batch, logs=None): 76 | learning_rate = K.get_value(self.model.optimizer.lr) * self.decay_rate 77 | K.set_value(self.model.optimizer.lr, learning_rate) 78 | if self.verbose > 0: 79 | print('Setting learning rate to %s.' % (learning_rate)) 80 | 81 | class LFW_callback(keras.callbacks.Callback): 82 | def __init__(self, test_loader): 83 | self.test_loader = test_loader 84 | 85 | def on_train_begin(self, logs={}): 86 | return 87 | 88 | def on_train_end(self, logs={}): 89 | return 90 | 91 | def on_epoch_begin(self, epoch, logs={}): 92 | return 93 | 94 | def on_epoch_end(self, epoch, logs={}): 95 | labels, distances = [], [] 96 | print("正在进行LFW数据集测试") 97 | 98 | for _, (data_a, data_p, label) in enumerate(self.test_loader.generate()): 99 | out_a, out_p = self.model.predict(data_a)[1], self.model.predict(data_p)[1] 100 | dists = np.linalg.norm(out_a - out_p, axis=1) 101 | 102 | distances.append(dists) 103 | labels.append(label) 104 | 105 | labels = np.array([sublabel for label in labels for sublabel in label]) 106 | distances = np.array([subdist for dist in distances for subdist in dist]) 107 | _, _, accuracy, _, _, _, _ = evaluate(distances,labels) 108 | print('Accuracy: %2.5f+-%2.5f' % (np.mean(accuracy), np.std(accuracy))) 109 | 110 | def on_batch_begin(self, batch, logs={}): 111 | return 112 | 113 | def on_batch_end(self, batch, logs={}): 114 | return 115 | 116 | class ParallelModelCheckpoint(keras.callbacks.ModelCheckpoint): 117 | def __init__(self, model, filepath, monitor='val_loss', verbose=0, 118 | save_best_only=False, save_weights_only=False, 119 | mode='auto', period=1): 120 | self.single_model = model 121 | super(ParallelModelCheckpoint,self).__init__(filepath, monitor, verbose,save_best_only, save_weights_only,mode, period) 122 | 123 | def set_model(self, model): 124 | super(ParallelModelCheckpoint,self).set_model(self.single_model) -------------------------------------------------------------------------------- /utils/dataloader.py: -------------------------------------------------------------------------------- 1 | import math 2 | import os 3 | 4 | import keras 5 | import numpy as np 6 | from keras.utils import np_utils 7 | from PIL import Image 8 | 9 | from .utils import cvtColor, preprocess_input, resize_image 10 | 11 | 12 | #------------------------------------# 13 | # 数据加载器 14 | #------------------------------------# 15 | class FacenetDataset(keras.utils.Sequence): 16 | def __init__(self, input_shape, lines, batch_size, num_classes, random): 17 | self.input_shape = input_shape 18 | self.lines = lines 19 | self.length = len(lines) 20 | self.batch_size = batch_size 21 | self.num_classes = num_classes 22 | self.random = random 23 | 24 | def __len__(self): 25 | return math.ceil(self.length / float(self.batch_size)) 26 | 27 | def __getitem__(self, index): 28 | images = [] 29 | labels = [] 30 | for i in range(index * self.batch_size, (index + 1) * self.batch_size): 31 | i = i % self.length 32 | 33 | annotation_path = self.lines[i].split(';')[1].split()[0] 34 | y = int(self.lines[i].split(';')[0]) 35 | 36 | image = cvtColor(Image.open(annotation_path)) 37 | #------------------------------------------# 38 | # 翻转图像 39 | #------------------------------------------# 40 | if self.rand()<.5 and self.random: 41 | image = image.transpose(Image.FLIP_LEFT_RIGHT) 42 | image = resize_image(image, [self.input_shape[1], self.input_shape[0]], letterbox_image = True) 43 | image = preprocess_input(np.array(image, dtype='float32')) 44 | 45 | images.append(image) 46 | labels.append(y) 47 | 48 | labels = np_utils.to_categorical(np.array(labels), num_classes=self.num_classes) 49 | return np.array(images, np.float32), np.array(labels, np.float32) 50 | 51 | def rand(self, a=0, b=1): 52 | return np.random.rand()*(b-a) + a 53 | 54 | class LFWDataset(): 55 | def __init__(self, dir, pairs_path, input_shape, batch_size): 56 | super(LFWDataset, self).__init__() 57 | self.input_shape = input_shape 58 | self.pairs_path = pairs_path 59 | self.batch_size = batch_size 60 | self.validation_images = self.get_lfw_paths(dir) 61 | 62 | def generate(self): 63 | images1 = [] 64 | images2 = [] 65 | issames = [] 66 | for annotation_line in self.validation_images: 67 | (path_1, path_2, issame) = annotation_line 68 | image1, image2 = Image.open(path_1), Image.open(path_2) 69 | image1 = resize_image(image1, [self.input_shape[1], self.input_shape[0]], letterbox_image = True) 70 | image2 = resize_image(image2, [self.input_shape[1], self.input_shape[0]], letterbox_image = True) 71 | 72 | image1, image2 = preprocess_input(np.array(image1, np.float32)), preprocess_input(np.array(image2, np.float32)) 73 | 74 | images1.append(image1) 75 | images2.append(image2) 76 | issames.append(issame) 77 | if len(images1) == self.batch_size: 78 | yield np.array(images1), np.array(images2), np.array(issames) 79 | images1 = [] 80 | images2 = [] 81 | issames = [] 82 | 83 | yield np.array(images1), np.array(images2), np.array(issames) 84 | 85 | def read_lfw_pairs(self,pairs_filename): 86 | pairs = [] 87 | with open(pairs_filename, 'r') as f: 88 | for line in f.readlines()[1:]: 89 | pair = line.strip().split() 90 | pairs.append(pair) 91 | return np.array(pairs) 92 | 93 | def get_lfw_paths(self,lfw_dir,file_ext="jpg"): 94 | pairs = self.read_lfw_pairs(self.pairs_path) 95 | 96 | nrof_skipped_pairs = 0 97 | path_list = [] 98 | issame_list = [] 99 | 100 | for i in range(len(pairs)): 101 | #for pair in pairs: 102 | pair = pairs[i] 103 | if len(pair) == 3: 104 | path0 = os.path.join(lfw_dir, pair[0], pair[0] + '_' + '%04d' % int(pair[1])+'.'+file_ext) 105 | path1 = os.path.join(lfw_dir, pair[0], pair[0] + '_' + '%04d' % int(pair[2])+'.'+file_ext) 106 | issame = True 107 | elif len(pair) == 4: 108 | path0 = os.path.join(lfw_dir, pair[0], pair[0] + '_' + '%04d' % int(pair[1])+'.'+file_ext) 109 | path1 = os.path.join(lfw_dir, pair[2], pair[2] + '_' + '%04d' % int(pair[3])+'.'+file_ext) 110 | issame = False 111 | if os.path.exists(path0) and os.path.exists(path1): # Only add the pair if both paths exist 112 | path_list.append((path0,path1,issame)) 113 | issame_list.append(issame) 114 | else: 115 | nrof_skipped_pairs += 1 116 | if nrof_skipped_pairs>0: 117 | print('Skipped %d image pairs' % nrof_skipped_pairs) 118 | 119 | return path_list -------------------------------------------------------------------------------- /arcface.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | import matplotlib.pyplot as plt 4 | import numpy as np 5 | 6 | from nets.arcface import arcface 7 | from utils.utils import preprocess_input, resize_image, show_config 8 | 9 | 10 | class Arcface(object): 11 | _defaults = { 12 | #--------------------------------------------------------------------------# 13 | # 使用自己训练好的模型进行预测要修改model_path,指向logs文件夹下的权值文件 14 | # 训练好后logs文件夹下存在多个权值文件,选择验证集损失较低的即可。 15 | # 验证集损失较低不代表准确度较高,仅代表该权值在验证集上泛化性能较好。 16 | #--------------------------------------------------------------------------# 17 | "model_path" : "model_data/arcface_mobilefacenet.h5", 18 | #-------------------------------------------# 19 | # 输入图片的大小。 20 | #-------------------------------------------# 21 | "input_shape" : [112, 112, 3], 22 | #-------------------------------------------# 23 | # 所使用到的主干特征提取网络,与训练的相同 24 | # mobilefacenet 25 | # mobilenetv1 26 | # mobilenetv2 27 | # mobilenetv3 28 | # iresnet50 29 | #-------------------------------------------# 30 | "backbone" : "mobilefacenet", 31 | #-------------------------------------------# 32 | # 是否进行不失真的resize 33 | #-------------------------------------------# 34 | "letterbox_image" : True, 35 | } 36 | 37 | @classmethod 38 | def get_defaults(cls, n): 39 | if n in cls._defaults: 40 | return cls._defaults[n] 41 | else: 42 | return "Unrecognized attribute name '" + n + "'" 43 | 44 | #---------------------------------------------------# 45 | # 初始化Arcface 46 | #---------------------------------------------------# 47 | def __init__(self, **kwargs): 48 | self.__dict__.update(self._defaults) 49 | for name, value in kwargs.items(): 50 | setattr(self, name, value) 51 | self.generate() 52 | 53 | show_config(**self._defaults) 54 | 55 | def generate(self): 56 | #---------------------------------------------------# 57 | # 载入模型与权值 58 | #---------------------------------------------------# 59 | model_path = os.path.expanduser(self.model_path) 60 | assert model_path.endswith('.h5'), 'Keras model or weights must be a .h5 file.' 61 | self.model = arcface(self.input_shape, backbone=self.backbone, mode="predict") 62 | 63 | print('Loading weights into state dict...') 64 | self.model.load_weights(self.model_path, by_name=True) 65 | print('{} model loaded.'.format(self.model_path)) 66 | 67 | #---------------------------------------------------# 68 | # 检测图片 69 | #---------------------------------------------------# 70 | def detect_image(self, image_1, image_2): 71 | image_1 = resize_image(image_1, [self.input_shape[1], self.input_shape[0]], self.letterbox_image) 72 | image_2 = resize_image(image_2, [self.input_shape[1], self.input_shape[0]], self.letterbox_image) 73 | 74 | photo_1 = np.expand_dims(preprocess_input(np.array(image_1, np.float32)), 0) 75 | photo_2 = np.expand_dims(preprocess_input(np.array(image_2, np.float32)), 0) 76 | 77 | #---------------------------------------------------# 78 | # 图片传入网络进行预测 79 | #---------------------------------------------------# 80 | output1 = self.model.predict(photo_1) 81 | output2 = self.model.predict(photo_2) 82 | 83 | #---------------------------------------------------# 84 | # 计算二者之间的距离 85 | #---------------------------------------------------# 86 | l1 = np.linalg.norm(output1-output2, axis=1) 87 | # l1 = np.sum(np.square(output1 - output2), axis=-1) 88 | 89 | plt.subplot(1, 2, 1) 90 | plt.imshow(np.array(image_1)) 91 | 92 | plt.subplot(1, 2, 2) 93 | plt.imshow(np.array(image_2)) 94 | plt.text(-12, -12, 'Distance:%.3f' % l1, ha='center', va= 'bottom',fontsize=11) 95 | plt.show() 96 | return l1 97 | 98 | def get_FPS(self, image, test_interval): 99 | #---------------------------------------------------# 100 | # 对图片进行不失真的resize 101 | #---------------------------------------------------# 102 | image_data = resize_image(image, [self.input_shape[1], self.input_shape[0]], self.letterbox_image) 103 | #---------------------------------------------------------# 104 | # 归一化+添加上batch_size维度 105 | #---------------------------------------------------------# 106 | image_data = np.expand_dims(preprocess_input(np.array(image_data, np.float32)), 0) 107 | 108 | #---------------------------------------------------# 109 | # 图片传入网络进行预测 110 | #---------------------------------------------------# 111 | preds = self.model.predict(image_data)[0] 112 | import time 113 | t1 = time.time() 114 | for _ in range(test_interval): 115 | #---------------------------------------------------# 116 | # 图片传入网络进行预测 117 | #---------------------------------------------------# 118 | preds = self.model.predict(image_data)[0] 119 | t2 = time.time() 120 | tact_time = (t2 - t1) / test_interval 121 | return tact_time -------------------------------------------------------------------------------- /nets/iresnet.py: -------------------------------------------------------------------------------- 1 | from keras import initializers, layers 2 | from keras.layers import (BatchNormalization, Conv2D, Dense, Dropout, Flatten, 3 | PReLU, ZeroPadding2D) 4 | from keras.models import Model 5 | 6 | 7 | def identity_block(input_tensor, kernel_size, filters, stage, block): 8 | filters1, filters2 = filters 9 | 10 | conv_name_base = 'res' + str(stage) + block + '_branch' 11 | bn_name_base = 'bn' + str(stage) + block + '_branch' 12 | 13 | x = BatchNormalization(name=bn_name_base + '2', epsilon=1e-5)(input_tensor) 14 | 15 | #----------------------------# 16 | # 减少通道数 17 | #----------------------------# 18 | x = Conv2D(filters1, kernel_size, padding='same', use_bias=False, name=conv_name_base + '2a', 19 | kernel_initializer=initializers.random_normal(stddev=0.1), 20 | bias_initializer='zeros')(x) 21 | x = BatchNormalization(name=bn_name_base + '2a', epsilon=1e-5)(x) 22 | x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x) 23 | 24 | #----------------------------# 25 | # 3x3卷积 26 | #----------------------------# 27 | x = Conv2D(filters2, kernel_size, padding='same', use_bias=False, name=conv_name_base + '2b', 28 | kernel_initializer=initializers.random_normal(stddev=0.1), 29 | bias_initializer='zeros')(x) 30 | x = BatchNormalization(name=bn_name_base + '2b', epsilon=1e-5)(x) 31 | 32 | x = layers.add([x, input_tensor]) 33 | return x 34 | 35 | 36 | def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)): 37 | filters1, filters2 = filters 38 | 39 | conv_name_base = 'res' + str(stage) + block + '_branch' 40 | bn_name_base = 'bn' + str(stage) + block + '_branch' 41 | 42 | x = BatchNormalization(name=bn_name_base + '2', epsilon=1e-5)(input_tensor) 43 | 44 | #----------------------------# 45 | # 减少通道数 46 | #----------------------------# 47 | x = Conv2D(filters1, kernel_size, padding='same', use_bias=False, name=conv_name_base + '2a', 48 | kernel_initializer=initializers.random_normal(stddev=0.1), 49 | bias_initializer='zeros')(x) 50 | x = BatchNormalization(name=bn_name_base + '2a', epsilon=1e-5)(x) 51 | x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x) 52 | 53 | #----------------------------# 54 | # 3x3卷积 55 | #----------------------------# 56 | x = Conv2D(filters2, kernel_size, padding='same', use_bias=False, strides=strides, name=conv_name_base + '2b', 57 | kernel_initializer=initializers.random_normal(stddev=0.1), 58 | bias_initializer='zeros')(x) 59 | x = BatchNormalization(name=bn_name_base + '2b', epsilon=1e-5)(x) 60 | 61 | #----------------------------# 62 | # 残差边 63 | #----------------------------# 64 | shortcut = Conv2D(filters2, (1, 1), strides=strides, use_bias=False, name=conv_name_base + '1', 65 | kernel_initializer=initializers.random_normal(stddev=0.1), 66 | bias_initializer='zeros')(input_tensor) 67 | shortcut = BatchNormalization(name=bn_name_base + '1', epsilon=1e-5)(shortcut) 68 | 69 | x = layers.add([x, shortcut]) 70 | return x 71 | 72 | def iResNet50(inputs, embedding_size, dropout_keep_prob=0.5): 73 | x = ZeroPadding2D((1, 1))(inputs) 74 | x = Conv2D(64, (3, 3), strides=(1, 1), name='conv1', use_bias=False, 75 | kernel_initializer=initializers.random_normal(stddev=0.1), 76 | bias_initializer='zeros')(x) 77 | x = BatchNormalization(name='bn_conv1', epsilon=1e-5)(x) 78 | x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x) 79 | 80 | x = conv_block(x, 3, [64, 64], stage=2, block='a') 81 | x = identity_block(x, 3, [64, 64], stage=2, block='b') 82 | x = identity_block(x, 3, [64, 64], stage=2, block='c') 83 | 84 | x = conv_block(x, 3, [128, 128], stage=3, block='a') 85 | x = identity_block(x, 3, [128, 128], stage=3, block='b') 86 | x = identity_block(x, 3, [128, 128], stage=3, block='c') 87 | x = identity_block(x, 3, [128, 128], stage=3, block='d') 88 | 89 | x = conv_block(x, 3, [256, 256], stage=4, block='a') 90 | x = identity_block(x, 3, [256, 256], stage=4, block='b') 91 | x = identity_block(x, 3, [256, 256], stage=4, block='c') 92 | x = identity_block(x, 3, [256, 256], stage=4, block='d') 93 | x = identity_block(x, 3, [256, 256], stage=4, block='e') 94 | x = identity_block(x, 3, [256, 256], stage=4, block='f') 95 | 96 | x = identity_block(x, 3, [256, 256], stage=4, block='g') 97 | x = identity_block(x, 3, [256, 256], stage=4, block='h') 98 | x = identity_block(x, 3, [256, 256], stage=4, block='i') 99 | x = identity_block(x, 3, [256, 256], stage=4, block='j') 100 | x = identity_block(x, 3, [256, 256], stage=4, block='k') 101 | 102 | x = identity_block(x, 3, [256, 256], stage=4, block='l') 103 | x = identity_block(x, 3, [256, 256], stage=4, block='m') 104 | x = identity_block(x, 3, [256, 256], stage=4, block='n') 105 | 106 | x = conv_block(x, 3, [512, 512], stage=5, block='a') 107 | x = identity_block(x, 3, [512, 512], stage=5, block='b') 108 | x = identity_block(x, 3, [512, 512], stage=5, block='c') 109 | 110 | x = BatchNormalization(name='bn2', epsilon=1e-5)(x) 111 | x = Dropout(p=dropout_keep_prob)(x) 112 | x = Flatten()(x) 113 | x = Dense(embedding_size, name='linear', 114 | kernel_initializer=initializers.random_normal(stddev=0.1), 115 | bias_initializer='zeros')(x) 116 | x = BatchNormalization(name='features', epsilon=1e-5,)(x) 117 | 118 | return x 119 | -------------------------------------------------------------------------------- /utils/utils_metrics.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from scipy import interpolate 3 | from sklearn.model_selection import KFold 4 | from tqdm import tqdm 5 | 6 | def evaluate(distances, labels, nrof_folds=10): 7 | # Calculate evaluation metrics 8 | thresholds = np.arange(0, 4, 0.01) 9 | tpr, fpr, accuracy, best_thresholds = calculate_roc(thresholds, distances, 10 | labels, nrof_folds=nrof_folds) 11 | thresholds = np.arange(0, 4, 0.001) 12 | val, val_std, far = calculate_val(thresholds, distances, 13 | labels, 1e-3, nrof_folds=nrof_folds) 14 | return tpr, fpr, accuracy, val, val_std, far, best_thresholds 15 | 16 | def calculate_roc(thresholds, distances, labels, nrof_folds=10): 17 | 18 | nrof_pairs = min(len(labels), len(distances)) 19 | nrof_thresholds = len(thresholds) 20 | k_fold = KFold(n_splits=nrof_folds, shuffle=False) 21 | 22 | tprs = np.zeros((nrof_folds,nrof_thresholds)) 23 | fprs = np.zeros((nrof_folds,nrof_thresholds)) 24 | accuracy = np.zeros((nrof_folds)) 25 | 26 | indices = np.arange(nrof_pairs) 27 | 28 | for fold_idx, (train_set, test_set) in enumerate(k_fold.split(indices)): 29 | 30 | # Find the best threshold for the fold 31 | acc_train = np.zeros((nrof_thresholds)) 32 | for threshold_idx, threshold in enumerate(thresholds): 33 | _, _, acc_train[threshold_idx] = calculate_accuracy(threshold, distances[train_set], labels[train_set]) 34 | 35 | best_threshold_index = np.argmax(acc_train) 36 | for threshold_idx, threshold in enumerate(thresholds): 37 | tprs[fold_idx,threshold_idx], fprs[fold_idx,threshold_idx], _ = calculate_accuracy(threshold, distances[test_set], labels[test_set]) 38 | _, _, accuracy[fold_idx] = calculate_accuracy(thresholds[best_threshold_index], distances[test_set], labels[test_set]) 39 | tpr = np.mean(tprs,0) 40 | fpr = np.mean(fprs,0) 41 | return tpr, fpr, accuracy, thresholds[best_threshold_index] 42 | 43 | def calculate_accuracy(threshold, dist, actual_issame): 44 | predict_issame = np.less(dist, threshold) 45 | tp = np.sum(np.logical_and(predict_issame, actual_issame)) 46 | fp = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame))) 47 | tn = np.sum(np.logical_and(np.logical_not(predict_issame), np.logical_not(actual_issame))) 48 | fn = np.sum(np.logical_and(np.logical_not(predict_issame), actual_issame)) 49 | 50 | tpr = 0 if (tp+fn==0) else float(tp) / float(tp+fn) 51 | fpr = 0 if (fp+tn==0) else float(fp) / float(fp+tn) 52 | acc = float(tp+tn)/dist.size 53 | return tpr, fpr, acc 54 | 55 | def calculate_val(thresholds, distances, labels, far_target=1e-3, nrof_folds=10): 56 | nrof_pairs = min(len(labels), len(distances)) 57 | nrof_thresholds = len(thresholds) 58 | k_fold = KFold(n_splits=nrof_folds, shuffle=False) 59 | 60 | val = np.zeros(nrof_folds) 61 | far = np.zeros(nrof_folds) 62 | 63 | indices = np.arange(nrof_pairs) 64 | 65 | for fold_idx, (train_set, test_set) in enumerate(k_fold.split(indices)): 66 | # Find the threshold that gives FAR = far_target 67 | far_train = np.zeros(nrof_thresholds) 68 | for threshold_idx, threshold in enumerate(thresholds): 69 | _, far_train[threshold_idx] = calculate_val_far(threshold, distances[train_set], labels[train_set]) 70 | if np.max(far_train)>=far_target: 71 | f = interpolate.interp1d(far_train, thresholds, kind='slinear') 72 | threshold = f(far_target) 73 | else: 74 | threshold = 0.0 75 | 76 | val[fold_idx], far[fold_idx] = calculate_val_far(threshold, distances[test_set], labels[test_set]) 77 | 78 | val_mean = np.mean(val) 79 | far_mean = np.mean(far) 80 | val_std = np.std(val) 81 | return val_mean, val_std, far_mean 82 | 83 | def calculate_val_far(threshold, dist, actual_issame): 84 | predict_issame = np.less(dist, threshold) 85 | true_accept = np.sum(np.logical_and(predict_issame, actual_issame)) 86 | false_accept = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame))) 87 | n_same = np.sum(actual_issame) 88 | n_diff = np.sum(np.logical_not(actual_issame)) 89 | if n_diff == 0: 90 | n_diff = 1 91 | if n_same == 0: 92 | return 0,0 93 | val = float(true_accept) / float(n_same) 94 | far = float(false_accept) / float(n_diff) 95 | return val, far 96 | 97 | def test(test_loader, model, png_save_path, log_interval, batch_size): 98 | labels, distances = [], [] 99 | pbar = tqdm(enumerate(test_loader.generate())) 100 | for batch_idx, (data_a, data_p, label) in pbar: 101 | out_a, out_p = model.predict(data_a), model.predict(data_p) 102 | dists = np.linalg.norm(out_a - out_p, axis=1) 103 | 104 | #--------------------------------------# 105 | # 将结果添加进列表中 106 | #--------------------------------------# 107 | distances.append(dists) 108 | labels.append(label) 109 | 110 | #--------------------------------------# 111 | # 打印 112 | #--------------------------------------# 113 | if batch_idx % log_interval == 0: 114 | pbar.set_description('Test Epoch: [{}/{} ({:.0f}%)]'.format( 115 | batch_idx * batch_size, len(test_loader.validation_images), 116 | 100. * batch_idx / len(test_loader.validation_images))) 117 | 118 | #--------------------------------------# 119 | # 转换成numpy 120 | #--------------------------------------# 121 | labels = np.array([sublabel for label in labels for sublabel in label]) 122 | distances = np.array([subdist for dist in distances for subdist in dist]) 123 | 124 | tpr, fpr, accuracy, val, val_std, far, best_thresholds = evaluate(distances,labels) 125 | print('Accuracy: %2.5f+-%2.5f' % (np.mean(accuracy), np.std(accuracy))) 126 | print('Best_thresholds: %2.5f' % best_thresholds) 127 | print('Validation rate: %2.5f+-%2.5f @ FAR=%2.5f' % (val, val_std, far)) 128 | plot_roc(fpr, tpr, figure_name = png_save_path) 129 | 130 | def plot_roc(fpr, tpr, figure_name = "roc.png"): 131 | import matplotlib.pyplot as plt 132 | from sklearn.metrics import auc, roc_curve 133 | roc_auc = auc(fpr, tpr) 134 | fig = plt.figure() 135 | lw = 2 136 | plt.plot(fpr, tpr, color='darkorange', 137 | lw=lw, label='ROC curve (area = %0.2f)' % roc_auc) 138 | plt.plot([0, 1], [0, 1], color='navy', lw=lw, linestyle='--') 139 | plt.xlim([0.0, 1.0]) 140 | plt.ylim([0.0, 1.05]) 141 | plt.xlabel('False Positive Rate') 142 | plt.ylabel('True Positive Rate') 143 | plt.title('Receiver operating characteristic') 144 | plt.legend(loc="lower right") 145 | fig.savefig(figure_name, dpi=fig.dpi) 146 | -------------------------------------------------------------------------------- /nets/mobilenetv3.py: -------------------------------------------------------------------------------- 1 | from keras import backend, initializers 2 | from keras.layers import (Activation, Add, BatchNormalization, Conv2D, Dense, 3 | DepthwiseConv2D, Dropout, Flatten, 4 | GlobalAveragePooling2D, Multiply, PReLU, Reshape) 5 | 6 | 7 | def _make_divisible(v, divisor=8, min_value=None): 8 | if min_value is None: 9 | min_value = divisor 10 | new_v = max(min_value, int(v + divisor / 2) // divisor * divisor) 11 | if new_v < 0.9 * v: 12 | new_v += divisor 13 | return new_v 14 | 15 | def _activation(x, name='relu'): 16 | if name == 'relu': 17 | return Activation('relu')(x) 18 | elif name == 'hardswish': 19 | return hard_swish(x) 20 | 21 | def hard_sigmoid(x): 22 | return backend.relu(x + 3.0, max_value=6.0) / 6.0 23 | 24 | def hard_swish(x): 25 | return Multiply()([Activation(hard_sigmoid)(x), x]) 26 | 27 | def _bneck(inputs, expansion, out_ch, alpha, kernel_size, stride, se_ratio, activation, block_id, rate=1): 28 | in_channels = backend.int_shape(inputs)[-1] 29 | exp_size = _make_divisible(in_channels * expansion, 8) 30 | out_channels = _make_divisible(out_ch * alpha, 8) 31 | 32 | x = inputs 33 | prefix = 'expanded_conv/' 34 | if block_id: 35 | # Expand 36 | prefix = 'expanded_conv_{}/'.format(block_id) 37 | x = Conv2D(exp_size, 1, padding='same', use_bias=False, name=prefix + 'expand', 38 | kernel_initializer=initializers.random_normal(stddev=0.1))(x) 39 | x = BatchNormalization(axis=-1, name=prefix + 'expand/BatchNorm')(x) 40 | x = _activation(x, activation) 41 | 42 | x = DepthwiseConv2D(kernel_size, strides=stride, padding='same', dilation_rate=(rate, rate), use_bias=False, depthwise_initializer=initializers.random_normal(stddev=0.1), name=prefix + 'depthwise')(x) 43 | x = BatchNormalization(axis=-1, name=prefix + 'depthwise/BatchNorm')(x) 44 | x = _activation(x, activation) 45 | 46 | if se_ratio: 47 | reduced_ch = _make_divisible(exp_size * se_ratio, 8) 48 | y = GlobalAveragePooling2D(name=prefix + 'squeeze_excite/AvgPool')(x) 49 | y = Reshape([1, 1, exp_size], name=prefix + 'reshape')(y) 50 | 51 | y = Conv2D(reduced_ch, 1, padding='same', use_bias=True, name=prefix + 'squeeze_excite/Conv', 52 | kernel_initializer=initializers.random_normal(stddev=0.1))(y) 53 | y = Activation("relu", name=prefix + 'squeeze_excite/Relu')(y) 54 | 55 | y = Conv2D(exp_size, 1, padding='same', use_bias=True, name=prefix + 'squeeze_excite/Conv_1', 56 | kernel_initializer=initializers.random_normal(stddev=0.1))(y) 57 | x = Multiply(name=prefix + 'squeeze_excite/Mul')([Activation(hard_sigmoid)(y), x]) 58 | 59 | x = Conv2D(out_channels, 1, padding='same', use_bias=False, name=prefix + 'project', 60 | kernel_initializer=initializers.random_normal(stddev=0.1))(x) 61 | x = BatchNormalization(axis=-1, name=prefix + 'project/BatchNorm')(x) 62 | 63 | if in_channels == out_channels and stride == 1: 64 | x = Add(name=prefix + 'Add')([inputs, x]) 65 | return x 66 | 67 | def MobilenetV3_small(inputs, embedding_size, dropout_keep_prob=0.5, alpha=1.0, kernel=5, se_ratio=0.25): 68 | x = Conv2D(16, 3, strides=(1, 1), padding='same', use_bias=False, name='Conv', 69 | kernel_initializer=initializers.random_normal(stddev=0.1))(inputs) 70 | x = BatchNormalization(axis=-1, name='Conv/BatchNorm')(x) 71 | x = Activation(hard_swish)(x) 72 | 73 | x = _bneck(x, 1, 16, alpha, 3, 2, se_ratio, 'relu', 0) 74 | 75 | x = _bneck(x, 4.5, 24, alpha, 3, 2, None, 'relu', 1) 76 | x = _bneck(x, 3.66, 24, alpha, 3, 1, None, 'relu', 2) 77 | 78 | x = _bneck(x, 4, 40, alpha, kernel, 2, se_ratio, 'hardswish', 3) 79 | x = _bneck(x, 6, 40, alpha, kernel, 1, se_ratio, 'hardswish', 4) 80 | x = _bneck(x, 6, 40, alpha, kernel, 1, se_ratio, 'hardswish', 5) 81 | x = _bneck(x, 3, 48, alpha, kernel, 1, se_ratio, 'hardswish', 6) 82 | x = _bneck(x, 3, 48, alpha, kernel, 1, se_ratio, 'hardswish', 7) 83 | 84 | x = _bneck(x, 6, 96, alpha, kernel, 2, se_ratio, 'hardswish', 8) 85 | x = _bneck(x, 6, 96, alpha, kernel, 1, se_ratio, 'hardswish', 9) 86 | x = _bneck(x, 6, 96, alpha, kernel, 1, se_ratio, 'hardswish', 10) 87 | 88 | x = Conv2D(512, kernel_size=1, use_bias=False, name='sep', 89 | kernel_initializer=initializers.random_normal(stddev=0.1), 90 | bias_initializer='zeros')(x) 91 | x = BatchNormalization(name='sep_bn', epsilon=1e-5)(x) 92 | x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x) 93 | 94 | x = BatchNormalization(name='bn2', epsilon=1e-5)(x) 95 | x = Dropout(p=dropout_keep_prob)(x) 96 | x = Flatten()(x) 97 | x = Dense(embedding_size, name='linear', 98 | kernel_initializer=initializers.random_normal(stddev=0.1), 99 | bias_initializer='zeros')(x) 100 | x = BatchNormalization(name='features', epsilon=1e-5)(x) 101 | return x 102 | 103 | def MobileNetV3_Large(inputs, embedding_size, dropout_keep_prob=0.5, alpha=1.0, kernel=5, se_ratio=0.25): 104 | x = Conv2D(16, 3, strides=(1, 1), padding='same', use_bias=False, name='Conv', 105 | kernel_initializer=initializers.random_normal(stddev=0.1))(inputs) 106 | x = BatchNormalization(axis=-1, name='Conv/BatchNorm')(x) 107 | x = Activation(hard_swish)(x) 108 | 109 | x = _bneck(x, 1, 16, alpha, 3, 1, None, 'relu', 0) 110 | 111 | x = _bneck(x, 4, 24, alpha, 3, 2, None, 'relu', 1) 112 | x = _bneck(x, 3, 24, alpha, 3, 1, None, 'relu', 2) 113 | 114 | x = _bneck(x, 3, 40, alpha, kernel, 2, se_ratio, 'relu', 3) 115 | x = _bneck(x, 3, 40, alpha, kernel, 1, se_ratio, 'relu', 4) 116 | x = _bneck(x, 3, 40, alpha, kernel, 1, se_ratio, 'relu', 5) 117 | 118 | x = _bneck(x, 6, 80, alpha, 3, 2, None, 'hardswish', 6) 119 | x = _bneck(x, 2.5, 80, alpha, 3, 1, None, 'hardswish', 7) 120 | x = _bneck(x, 2.3, 80, alpha, 3, 1, None, 'hardswish', 8) 121 | x = _bneck(x, 2.3, 80, alpha, 3, 1, None, 'hardswish', 9) 122 | x = _bneck(x, 6, 112, alpha, 3, 1, se_ratio, 'hardswish', 10) 123 | x = _bneck(x, 6, 112, alpha, 3, 1, se_ratio, 'hardswish', 11) 124 | 125 | x = _bneck(x, 6, 160, alpha, kernel, 2, se_ratio, 'hardswish', 12) 126 | x = _bneck(x, 6, 160, alpha, kernel, 1, se_ratio, 'hardswish', 13) 127 | x = _bneck(x, 6, 160, alpha, kernel, 1, se_ratio, 'hardswish', 14) 128 | 129 | x = Conv2D(512, kernel_size=1, use_bias=False, name='sep', 130 | kernel_initializer=initializers.random_normal(stddev=0.1), 131 | bias_initializer='zeros')(x) 132 | x = BatchNormalization(name='sep_bn', epsilon=1e-5)(x) 133 | x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x) 134 | 135 | x = BatchNormalization(name='bn2', epsilon=1e-5)(x) 136 | x = Dropout(p=dropout_keep_prob)(x) 137 | x = Flatten()(x) 138 | x = Dense(embedding_size, name='linear', 139 | kernel_initializer=initializers.random_normal(stddev=0.1), 140 | bias_initializer='zeros')(x) 141 | x = BatchNormalization(name='features', epsilon=1e-5)(x) 142 | return x 143 | -------------------------------------------------------------------------------- /nets/mobilenetv2.py: -------------------------------------------------------------------------------- 1 | #-------------------------------------------------------------# 2 | # MobileNetV2的网络部分 3 | #-------------------------------------------------------------# 4 | 5 | from keras import backend 6 | from keras import initializers 7 | from keras.layers import (Activation, Add, BatchNormalization, Conv2D, Dense, 8 | DepthwiseConv2D, Dropout, Flatten, 9 | GlobalAveragePooling2D, Input, PReLU, Reshape, 10 | ZeroPadding2D) 11 | from keras.layers.normalization import BatchNormalization 12 | 13 | 14 | def relu6(x): 15 | return backend.relu(x, max_value=6) 16 | 17 | def correct_pad(inputs, kernel_size): 18 | img_dim = 1 19 | input_size = backend.int_shape(inputs)[img_dim:(img_dim + 2)] 20 | 21 | if isinstance(kernel_size, int): 22 | kernel_size = (kernel_size, kernel_size) 23 | 24 | if input_size[0] is None: 25 | adjust = (1, 1) 26 | else: 27 | adjust = (1 - input_size[0] % 2, 1 - input_size[1] % 2) 28 | 29 | correct = (kernel_size[0] // 2, kernel_size[1] // 2) 30 | 31 | return ((correct[0] - adjust[0], correct[0]), 32 | (correct[1] - adjust[1], correct[1])) 33 | 34 | def _make_divisible(v, divisor, min_value=None): 35 | if min_value is None: 36 | min_value = divisor 37 | new_v = max(min_value, int(v + divisor / 2) // divisor * divisor) 38 | if new_v < 0.9 * v: 39 | new_v += divisor 40 | return new_v 41 | 42 | def _inverted_res_block(inputs, expansion, stride, alpha, filters, block_id): 43 | in_channels = backend.int_shape(inputs)[-1] 44 | pointwise_conv_filters = int(filters * alpha) 45 | pointwise_filters = _make_divisible(pointwise_conv_filters, 8) 46 | 47 | x = inputs 48 | prefix = 'block_{}_'.format(block_id) 49 | #---------------------------------------------# 50 | # part1 数据扩张 51 | #---------------------------------------------# 52 | if block_id: 53 | # Expand 54 | x = Conv2D(expansion * in_channels, 55 | kernel_size=1, 56 | padding='same', 57 | use_bias=False, 58 | kernel_initializer=initializers.random_normal(stddev=0.1), 59 | activation=None, 60 | name=prefix + 'expand')(x) 61 | x = BatchNormalization(epsilon=1e-3, 62 | momentum=0.999, 63 | name=prefix + 'expand_BN')(x) 64 | x = Activation(relu6, name=prefix + 'expand_relu')(x) 65 | else: 66 | prefix = 'expanded_conv_' 67 | 68 | if stride == 2: 69 | x = ZeroPadding2D(padding=correct_pad(x, 3), 70 | name=prefix + 'pad')(x) 71 | 72 | #---------------------------------------------# 73 | # part2 可分离卷积 74 | #---------------------------------------------# 75 | x = DepthwiseConv2D(kernel_size=3, 76 | strides=stride, 77 | activation=None, 78 | use_bias=False, 79 | depthwise_initializer=initializers.random_normal(stddev=0.1), 80 | padding='same' if stride == 1 else 'valid', 81 | name=prefix + 'depthwise')(x) 82 | x = BatchNormalization(epsilon=1e-3, 83 | momentum=0.999, 84 | name=prefix + 'depthwise_BN')(x) 85 | 86 | x = Activation(relu6, name=prefix + 'depthwise_relu')(x) 87 | 88 | #---------------------------------------------# 89 | # part3压缩特征,而且不使用relu函数,保证特征不被破坏 90 | #---------------------------------------------# 91 | x = Conv2D(pointwise_filters, 92 | kernel_size=1, 93 | padding='same', 94 | use_bias=False, 95 | kernel_initializer=initializers.random_normal(stddev=0.1), 96 | activation=None, 97 | name=prefix + 'project')(x) 98 | 99 | x = BatchNormalization(epsilon=1e-3, momentum=0.999, name=prefix + 'project_BN')(x) 100 | 101 | if in_channels == pointwise_filters and stride == 1: 102 | return Add(name=prefix + 'add')([inputs, x]) 103 | return x 104 | 105 | def MobilenetV2(inputs, embedding_size, dropout_keep_prob=0.5, alpha=1.0): 106 | #---------------------------------------------# 107 | # stem部分 108 | #---------------------------------------------# 109 | first_block_filters = _make_divisible(32 * alpha, 8) 110 | x = ZeroPadding2D(padding=correct_pad(inputs, 3), 111 | name='Conv1_pad')(inputs) 112 | 113 | x = Conv2D(first_block_filters, 114 | kernel_size=3, 115 | strides=(1, 1), 116 | padding='valid', 117 | use_bias=False, 118 | kernel_initializer=initializers.random_normal(stddev=0.1), 119 | name='Conv1')(x) 120 | x = BatchNormalization(epsilon=1e-3, 121 | momentum=0.999, 122 | name='bn_Conv1')(x) 123 | x = Activation(relu6, name='Conv1_relu')(x) 124 | 125 | x = _inverted_res_block(x, filters=16, alpha=alpha, stride=1, 126 | expansion=1, block_id=0) 127 | x = _inverted_res_block(x, filters=24, alpha=alpha, stride=2, 128 | expansion=6, block_id=1) 129 | x = _inverted_res_block(x, filters=24, alpha=alpha, stride=1, 130 | expansion=6, block_id=2) 131 | 132 | 133 | x = _inverted_res_block(x, filters=32, alpha=alpha, stride=2, 134 | expansion=6, block_id=3) 135 | x = _inverted_res_block(x, filters=32, alpha=alpha, stride=1, 136 | expansion=6, block_id=4) 137 | x = _inverted_res_block(x, filters=32, alpha=alpha, stride=1, 138 | expansion=6, block_id=5) 139 | 140 | 141 | x = _inverted_res_block(x, filters=64, alpha=alpha, stride=2, 142 | expansion=6, block_id=6) 143 | x = _inverted_res_block(x, filters=64, alpha=alpha, stride=1, 144 | expansion=6, block_id=7) 145 | x = _inverted_res_block(x, filters=64, alpha=alpha, stride=1, 146 | expansion=6, block_id=8) 147 | x = _inverted_res_block(x, filters=64, alpha=alpha, stride=1, 148 | expansion=6, block_id=9) 149 | x = _inverted_res_block(x, filters=96, alpha=alpha, stride=1, 150 | expansion=6, block_id=10) 151 | x = _inverted_res_block(x, filters=96, alpha=alpha, stride=1, 152 | expansion=6, block_id=11) 153 | x = _inverted_res_block(x, filters=96, alpha=alpha, stride=1, 154 | expansion=6, block_id=12) 155 | 156 | 157 | x = _inverted_res_block(x, filters=160, alpha=alpha, stride=2, 158 | expansion=6, block_id=13) 159 | x = _inverted_res_block(x, filters=160, alpha=alpha, stride=1, 160 | expansion=6, block_id=14) 161 | x = _inverted_res_block(x, filters=160, alpha=alpha, stride=1, 162 | expansion=6, block_id=15) 163 | x = _inverted_res_block(x, filters=320, alpha=alpha, stride=1, 164 | expansion=6, block_id=16) 165 | 166 | x = Conv2D(512, kernel_size=1, use_bias=False, name='sep', 167 | kernel_initializer=initializers.random_normal(stddev=0.1), 168 | bias_initializer='zeros')(x) 169 | x = BatchNormalization(name='sep_bn', epsilon=1e-5)(x) 170 | x = PReLU(alpha_initializer=initializers.constant(0.25), shared_axes=[1, 2])(x) 171 | 172 | x = BatchNormalization(name='bn2', epsilon=1e-5)(x) 173 | x = Dropout(p=dropout_keep_prob)(x) 174 | x = Flatten()(x) 175 | x = Dense(embedding_size, name='linear', 176 | kernel_initializer=initializers.random_normal(stddev=0.1), 177 | bias_initializer='zeros')(x) 178 | x = BatchNormalization(name='features', epsilon=1e-5)(x) 179 | return x 180 | -------------------------------------------------------------------------------- /utils/utils.py: -------------------------------------------------------------------------------- 1 | import math 2 | 3 | import numpy as np 4 | import tensorflow as tf 5 | from PIL import Image 6 | 7 | 8 | #---------------------------------------------------------# 9 | # 将图像转换成RGB图像,防止灰度图在预测时报错。 10 | # 代码仅仅支持RGB图像的预测,所有其它类型的图像都会转化成RGB 11 | #---------------------------------------------------------# 12 | def cvtColor(image): 13 | if len(np.shape(image)) == 3 and np.shape(image)[2] == 3: 14 | return image 15 | else: 16 | image = image.convert('RGB') 17 | return image 18 | 19 | #---------------------------------------------------# 20 | # 对输入图像进行resize 21 | #---------------------------------------------------# 22 | def resize_image(image, size, letterbox_image): 23 | iw, ih = image.size 24 | w, h = size 25 | if letterbox_image: 26 | scale = min(w/iw, h/ih) 27 | nw = int(iw*scale) 28 | nh = int(ih*scale) 29 | 30 | image = image.resize((nw,nh), Image.BICUBIC) 31 | new_image = Image.new('RGB', size, (128,128,128)) 32 | new_image.paste(image, ((w-nw)//2, (h-nh)//2)) 33 | else: 34 | new_image = image.resize((w, h), Image.BICUBIC) 35 | return new_image 36 | 37 | def get_num_classes(annotation_path): 38 | with open(annotation_path) as f: 39 | dataset_path = f.readlines() 40 | 41 | labels = [] 42 | for path in dataset_path: 43 | path_split = path.split(";") 44 | labels.append(int(path_split[0])) 45 | num_classes = np.max(labels) + 1 46 | return num_classes 47 | 48 | def preprocess_input(image): 49 | image /= 255.0 50 | image -= 0.5 51 | image /= 0.5 52 | return image 53 | 54 | def get_acc(s=32.0, m=0.5): 55 | cos_m = math.cos(m) 56 | sin_m = math.sin(m) 57 | th = math.cos(math.pi - m) 58 | mm = math.sin(math.pi - m) * m 59 | def acc(y_true, y_pred): 60 | cosine = tf.cast(y_pred, tf.float32) 61 | labels = tf.cast(y_true, tf.float32) 62 | 63 | sine = tf.sqrt(1 - tf.square(cosine)) 64 | phi = cosine * cos_m - sine * sin_m 65 | phi = tf.where(cosine > th, phi, cosine - mm) 66 | 67 | output = (labels * phi) + ((1.0 - labels) * cosine) 68 | output *= s 69 | 70 | accuracy = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(tf.math.softmax(output, -1), -1), tf.argmax(y_true, -1)), tf.float32)) 71 | return accuracy 72 | return acc 73 | 74 | def show_config(**kwargs): 75 | print('Configurations:') 76 | print('-' * 70) 77 | print('|%25s | %40s|' % ('keys', 'values')) 78 | print('-' * 70) 79 | for key, value in kwargs.items(): 80 | print('|%25s | %40s|' % (str(key), str(value))) 81 | print('-' * 70) 82 | 83 | #-------------------------------------------------------------------------------------------------------------------------------# 84 | # From https://github.com/ckyrkou/Keras_FLOP_Estimator 85 | # Fix lots of bugs 86 | #-------------------------------------------------------------------------------------------------------------------------------# 87 | def net_flops(model, table=False, print_result=True): 88 | if (table == True): 89 | print("\n") 90 | print('%25s | %16s | %16s | %16s | %16s | %6s | %6s' % ( 91 | 'Layer Name', 'Input Shape', 'Output Shape', 'Kernel Size', 'Filters', 'Strides', 'FLOPS')) 92 | print('=' * 120) 93 | 94 | #---------------------------------------------------# 95 | # 总的FLOPs 96 | #---------------------------------------------------# 97 | t_flops = 0 98 | factor = 1e9 99 | 100 | for l in model.layers: 101 | try: 102 | #--------------------------------------# 103 | # 所需参数的初始化定义 104 | #--------------------------------------# 105 | o_shape, i_shape, strides, ks, filters = ('', '', ''), ('', '', ''), (1, 1), (0, 0), 0 106 | flops = 0 107 | #--------------------------------------# 108 | # 获得层的名字 109 | #--------------------------------------# 110 | name = l.name 111 | 112 | if ('InputLayer' in str(l)): 113 | i_shape = l.get_input_shape_at(0)[1:4] 114 | o_shape = l.get_output_shape_at(0)[1:4] 115 | 116 | #--------------------------------------# 117 | # Reshape层 118 | #--------------------------------------# 119 | elif ('Reshape' in str(l)): 120 | i_shape = l.get_input_shape_at(0)[1:4] 121 | o_shape = l.get_output_shape_at(0)[1:4] 122 | 123 | #--------------------------------------# 124 | # 填充层 125 | #--------------------------------------# 126 | elif ('Padding' in str(l)): 127 | i_shape = l.get_input_shape_at(0)[1:4] 128 | o_shape = l.get_output_shape_at(0)[1:4] 129 | 130 | #--------------------------------------# 131 | # 平铺层 132 | #--------------------------------------# 133 | elif ('Flatten' in str(l)): 134 | i_shape = l.get_input_shape_at(0)[1:4] 135 | o_shape = l.get_output_shape_at(0)[1:4] 136 | 137 | #--------------------------------------# 138 | # 激活函数层 139 | #--------------------------------------# 140 | elif 'Activation' in str(l): 141 | i_shape = l.get_input_shape_at(0)[1:4] 142 | o_shape = l.get_output_shape_at(0)[1:4] 143 | 144 | #--------------------------------------# 145 | # LeakyReLU 146 | #--------------------------------------# 147 | elif 'LeakyReLU' in str(l): 148 | for i in range(len(l._inbound_nodes)): 149 | i_shape = l.get_input_shape_at(i)[1:4] 150 | o_shape = l.get_output_shape_at(i)[1:4] 151 | 152 | flops += i_shape[0] * i_shape[1] * i_shape[2] 153 | 154 | #--------------------------------------# 155 | # 池化层 156 | #--------------------------------------# 157 | elif 'MaxPooling' in str(l): 158 | i_shape = l.get_input_shape_at(0)[1:4] 159 | o_shape = l.get_output_shape_at(0)[1:4] 160 | 161 | #--------------------------------------# 162 | # 池化层 163 | #--------------------------------------# 164 | elif ('AveragePooling' in str(l) and 'Global' not in str(l)): 165 | strides = l.strides 166 | ks = l.pool_size 167 | 168 | for i in range(len(l._inbound_nodes)): 169 | i_shape = l.get_input_shape_at(i)[1:4] 170 | o_shape = l.get_output_shape_at(i)[1:4] 171 | 172 | flops += o_shape[0] * o_shape[1] * o_shape[2] 173 | 174 | #--------------------------------------# 175 | # 全局池化层 176 | #--------------------------------------# 177 | elif ('AveragePooling' in str(l) and 'Global' in str(l)): 178 | for i in range(len(l._inbound_nodes)): 179 | i_shape = l.get_input_shape_at(i)[1:4] 180 | o_shape = l.get_output_shape_at(i)[1:4] 181 | 182 | flops += (i_shape[0] * i_shape[1] + 1) * i_shape[2] 183 | 184 | #--------------------------------------# 185 | # 标准化层 186 | #--------------------------------------# 187 | elif ('BatchNormalization' in str(l)): 188 | for i in range(len(l._inbound_nodes)): 189 | i_shape = l.get_input_shape_at(i)[1:4] 190 | o_shape = l.get_output_shape_at(i)[1:4] 191 | 192 | temp_flops = 1 193 | for i in range(len(i_shape)): 194 | temp_flops *= i_shape[i] 195 | temp_flops *= 2 196 | 197 | flops += temp_flops 198 | 199 | #--------------------------------------# 200 | # 全连接层 201 | #--------------------------------------# 202 | elif ('Dense' in str(l)): 203 | for i in range(len(l._inbound_nodes)): 204 | i_shape = l.get_input_shape_at(i)[1:4] 205 | o_shape = l.get_output_shape_at(i)[1:4] 206 | 207 | temp_flops = 1 208 | for i in range(len(o_shape)): 209 | temp_flops *= o_shape[i] 210 | 211 | if (i_shape[-1] == None): 212 | temp_flops = temp_flops * o_shape[-1] 213 | else: 214 | temp_flops = temp_flops * i_shape[-1] 215 | flops += temp_flops 216 | 217 | #--------------------------------------# 218 | # 普通卷积层 219 | #--------------------------------------# 220 | elif ('Conv2D' in str(l) and 'DepthwiseConv2D' not in str(l) and 'SeparableConv2D' not in str(l)): 221 | strides = l.strides 222 | ks = l.kernel_size 223 | filters = l.filters 224 | bias = 1 if l.use_bias else 0 225 | 226 | for i in range(len(l._inbound_nodes)): 227 | i_shape = l.get_input_shape_at(i)[1:4] 228 | o_shape = l.get_output_shape_at(i)[1:4] 229 | 230 | if (filters == None): 231 | filters = i_shape[2] 232 | flops += filters * o_shape[0] * o_shape[1] * (ks[0] * ks[1] * i_shape[2] + bias) 233 | 234 | #--------------------------------------# 235 | # 逐层卷积层 236 | #--------------------------------------# 237 | elif ('Conv2D' in str(l) and 'DepthwiseConv2D' in str(l) and 'SeparableConv2D' not in str(l)): 238 | strides = l.strides 239 | ks = l.kernel_size 240 | filters = l.filters 241 | bias = 1 if l.use_bias else 0 242 | 243 | for i in range(len(l._inbound_nodes)): 244 | i_shape = l.get_input_shape_at(i)[1:4] 245 | o_shape = l.get_output_shape_at(i)[1:4] 246 | 247 | if (filters == None): 248 | filters = i_shape[2] 249 | flops += filters * o_shape[0] * o_shape[1] * (ks[0] * ks[1] + bias) 250 | 251 | #--------------------------------------# 252 | # 深度可分离卷积层 253 | #--------------------------------------# 254 | elif ('Conv2D' in str(l) and 'DepthwiseConv2D' not in str(l) and 'SeparableConv2D' in str(l)): 255 | strides = l.strides 256 | ks = l.kernel_size 257 | filters = l.filters 258 | 259 | for i in range(len(l._inbound_nodes)): 260 | i_shape = l.get_input_shape_at(i)[1:4] 261 | o_shape = l.get_output_shape_at(i)[1:4] 262 | 263 | if (filters == None): 264 | filters = i_shape[2] 265 | flops += i_shape[2] * o_shape[0] * o_shape[1] * (ks[0] * ks[1] + bias) + \ 266 | filters * o_shape[0] * o_shape[1] * (1 * 1 * i_shape[2] + bias) 267 | #--------------------------------------# 268 | # 模型中有模型时 269 | #--------------------------------------# 270 | elif 'Model' in str(l): 271 | flops = net_flops(l, print_result=False) 272 | 273 | t_flops += flops 274 | 275 | if (table == True): 276 | print('%25s | %16s | %16s | %16s | %16s | %6s | %5.4f' % ( 277 | name[:25], str(i_shape), str(o_shape), str(ks), str(filters), str(strides), flops)) 278 | 279 | except: 280 | pass 281 | 282 | t_flops = t_flops * 2 283 | if print_result: 284 | show_flops = t_flops / factor 285 | print('Total GFLOPs: %.3fG' % (show_flops)) 286 | return t_flops -------------------------------------------------------------------------------- /train.py: -------------------------------------------------------------------------------- 1 | import datetime 2 | import os 3 | 4 | import numpy as np 5 | import tensorflow as tf 6 | from keras.callbacks import LearningRateScheduler, ModelCheckpoint, TensorBoard 7 | from keras.layers import Conv2D, Dense, DepthwiseConv2D, PReLU 8 | from keras.optimizers import SGD, Adam 9 | from keras.regularizers import l2 10 | from keras.utils.multi_gpu_utils import multi_gpu_model 11 | 12 | from nets.arcface import arcface 13 | from nets.arcface_training import ArcFaceLoss, get_lr_scheduler 14 | from utils.callbacks import (ExponentDecayScheduler, LFW_callback, LossHistory, 15 | ParallelModelCheckpoint) 16 | from utils.dataloader import FacenetDataset, LFWDataset 17 | from utils.utils import get_acc, get_num_classes, show_config 18 | 19 | tf.logging.set_verbosity(tf.logging.ERROR) 20 | 21 | if __name__ == "__main__": 22 | #---------------------------------------------------------------------# 23 | # train_gpu 训练用到的GPU 24 | # 默认为第一张卡、双卡为[0, 1]、三卡为[0, 1, 2] 25 | # 在使用多GPU时,每个卡上的batch为总batch除以卡的数量。 26 | #---------------------------------------------------------------------# 27 | train_gpu = [0,] 28 | #--------------------------------------------------------# 29 | # 指向根目录下的cls_train.txt,读取人脸路径与标签 30 | #--------------------------------------------------------# 31 | annotation_path = "cls_train.txt" 32 | #--------------------------------------------------------# 33 | # 输入图像大小 34 | #--------------------------------------------------------# 35 | input_shape = [112, 112, 3] 36 | #--------------------------------------------------------# 37 | # 主干特征提取网络的选择 38 | # mobilefacenet 39 | # mobilenetv1 40 | # mobilenetv2 41 | # mobilenetv3 42 | # iresnet50 43 | # 44 | # 除了mobilenetv1外,其它的backbone均可从0开始训练。 45 | #--------------------------------------------------------# 46 | backbone = "mobilefacenet" 47 | #----------------------------------------------------------------------------------------------------------------------------# 48 | # 如果训练过程中存在中断训练的操作,可以将model_path设置成logs文件夹下的权值文件,将已经训练了一部分的权值再次载入。 49 | # 同时修改下方的训练的参数,来保证模型epoch的连续性。 50 | # 51 | # 当model_path = ''的时候不加载整个模型的权值。 52 | # 53 | # 此处使用的是整个模型的权重,因此是在train.py进行加载的,pretrain不影响此处的权值加载。 54 | # 如果想要让模型从主干的预训练权值开始训练,则设置model_path = 主干的权值。 55 | # 如果想要让模型从0开始训练,则设置model_path = '',此时从0开始训练。 56 | #----------------------------------------------------------------------------------------------------------------------------# 57 | model_path = "" 58 | 59 | #----------------------------------------------------------------------------------------------------------------------------# 60 | # 显存不足与数据集大小无关,提示显存不足请调小batch_size。 61 | # 受到BatchNorm层影响,不能为1。 62 | # 63 | # 在此提供若干参数设置建议,各位训练者根据自己的需求进行灵活调整: 64 | # (一)从预训练权重开始训练: 65 | # Adam: 66 | # Init_Epoch = 0,Epoch = 100,optimizer_type = 'adam',Init_lr = 1e-3,weight_decay = 0。 67 | # SGD: 68 | # Init_Epoch = 0,Epoch = 100,optimizer_type = 'sgd',Init_lr = 1e-2,weight_decay = 5e-4。 69 | # 其中:UnFreeze_Epoch可以在100-300之间调整。 70 | # (二)batch_size的设置: 71 | # 在显卡能够接受的范围内,以大为好。显存不足与数据集大小无关,提示显存不足(OOM或者CUDA out of memory)请调小batch_size。 72 | # 受到BatchNorm层影响,batch_size最小为2,不能为1。 73 | #----------------------------------------------------------------------------------------------------------------------------# 74 | #------------------------------------------------------# 75 | # 训练参数 76 | # Init_Epoch 模型当前开始的训练世代 77 | # Epoch 模型总共训练的epoch 78 | # batch_size 每次输入的图片数量 79 | #------------------------------------------------------# 80 | Init_Epoch = 0 81 | Epoch = 100 82 | batch_size = 64 83 | 84 | #------------------------------------------------------------------# 85 | # 其它训练参数:学习率、优化器、学习率下降有关 86 | #------------------------------------------------------------------# 87 | #------------------------------------------------------------------# 88 | # Init_lr 模型的最大学习率 89 | # Min_lr 模型的最小学习率,默认为最大学习率的0.01 90 | #------------------------------------------------------------------# 91 | Init_lr = 1e-2 92 | Min_lr = Init_lr * 0.01 93 | #------------------------------------------------------------------# 94 | # optimizer_type 使用到的优化器种类,可选的有adam、sgd 95 | # 当使用Adam优化器时建议设置 Init_lr=1e-3 96 | # 当使用SGD优化器时建议设置 Init_lr=1e-2 97 | # momentum 优化器内部使用到的momentum参数 98 | # weight_decay 权值衰减,可防止过拟合 99 | # adam会导致weight_decay错误,使用adam时建议设置为0。 100 | #------------------------------------------------------------------# 101 | optimizer_type = "sgd" 102 | momentum = 0.9 103 | weight_decay = 5e-4 104 | #------------------------------------------------------------------# 105 | # lr_decay_type 使用到的学习率下降方式,可选的有step、cos 106 | #------------------------------------------------------------------# 107 | lr_decay_type = "cos" 108 | #------------------------------------------------------------------# 109 | # save_period 多少个epoch保存一次权值,默认每个世代都保存 110 | #------------------------------------------------------------------# 111 | save_period = 1 112 | #------------------------------------------------------------------# 113 | # save_dir 权值与日志文件保存的文件夹 114 | #------------------------------------------------------------------# 115 | save_dir = 'logs' 116 | #------------------------------------------------------------------# 117 | # 用于设置是否使用多线程读取数据 118 | # 开启后会加快数据读取速度,但是会占用更多内存 119 | # 内存较小的电脑可以设置为2或者1 120 | #------------------------------------------------------------------# 121 | num_workers = 1 122 | #------------------------------------------------------------------# 123 | # 是否开启LFW评估 124 | #------------------------------------------------------------------# 125 | lfw_eval_flag = True 126 | #------------------------------------------------------------------# 127 | # LFW评估数据集的文件路径和对应的txt文件 128 | #------------------------------------------------------------------# 129 | lfw_dir_path = "lfw" 130 | lfw_pairs_path = "model_data/lfw_pair.txt" 131 | 132 | #------------------------------------------------------# 133 | # 设置用到的显卡 134 | #------------------------------------------------------# 135 | os.environ["CUDA_VISIBLE_DEVICES"] = ','.join(str(x) for x in train_gpu) 136 | ngpus_per_node = len(train_gpu) 137 | print('Number of devices: {}'.format(ngpus_per_node)) 138 | 139 | num_classes = get_num_classes(annotation_path) 140 | #-------------------------------------------# 141 | # 建立模型 142 | #-------------------------------------------# 143 | model_body = arcface(input_shape, num_classes, backbone=backbone, mode="train") 144 | if model_path != '': 145 | #------------------------------------------------------# 146 | # 载入预训练权重 147 | #------------------------------------------------------# 148 | print('Load weights {}.'.format(model_path)) 149 | model_body.load_weights(model_path, by_name=True, skip_mismatch=True) 150 | 151 | if ngpus_per_node > 1: 152 | model = multi_gpu_model(model_body, gpus=ngpus_per_node) 153 | else: 154 | model = model_body 155 | #-------------------------------------------------------# 156 | # 0.01用于验证,0.99用于训练 157 | #-------------------------------------------------------# 158 | val_split = 0.01 159 | with open(annotation_path,"r") as f: 160 | lines = f.readlines() 161 | np.random.seed(10101) 162 | np.random.shuffle(lines) 163 | np.random.seed(None) 164 | num_val = int(len(lines)*val_split) 165 | num_train = len(lines) - num_val 166 | 167 | show_config( 168 | num_classes = num_classes, backbone = backbone, model_path = model_path, input_shape = input_shape, \ 169 | Init_Epoch = Init_Epoch, Epoch = Epoch, batch_size = batch_size, \ 170 | Init_lr = Init_lr, Min_lr = Min_lr, optimizer_type = optimizer_type, momentum = momentum, lr_decay_type = lr_decay_type, \ 171 | save_period = save_period, save_dir = save_dir, num_workers = num_workers, num_train = num_train, num_val = num_val 172 | ) 173 | 174 | for layer in model.layers: 175 | if isinstance(layer, DepthwiseConv2D): 176 | layer.add_loss(l2(weight_decay)(layer.depthwise_kernel)) 177 | elif isinstance(layer, Conv2D) or isinstance(layer, Dense): 178 | layer.add_loss(l2(weight_decay)(layer.kernel)) 179 | elif isinstance(layer, PReLU): 180 | layer.add_loss(l2(weight_decay)(layer.alpha)) 181 | 182 | if True: 183 | #-------------------------------------------------------------------# 184 | # 判断当前batch_size,自适应调整学习率 185 | #-------------------------------------------------------------------# 186 | nbs = 64 187 | lr_limit_max = 1e-3 if optimizer_type == 'adam' else 1e-1 188 | lr_limit_min = 3e-4 if optimizer_type == 'adam' else 5e-4 189 | Init_lr_fit = min(max(batch_size / nbs * Init_lr, lr_limit_min), lr_limit_max) 190 | Min_lr_fit = min(max(batch_size / nbs * Min_lr, lr_limit_min * 1e-2), lr_limit_max * 1e-2) 191 | 192 | optimizer = { 193 | 'adam' : Adam(lr = Init_lr_fit, beta_1 = momentum), 194 | 'sgd' : SGD(lr = Init_lr_fit, momentum = momentum, nesterov=True) 195 | }[optimizer_type] 196 | m = 0.5 197 | s = 32 if backbone == "mobilefacenet" else 64 198 | model.compile(optimizer = optimizer, loss={'ArcMargin': ArcFaceLoss(s = s, m = m)}, metrics={'ArcMargin': get_acc()}) 199 | 200 | #---------------------------------------# 201 | # 获得学习率下降的公式 202 | #---------------------------------------# 203 | lr_scheduler_func = get_lr_scheduler(lr_decay_type, Init_lr_fit, Min_lr_fit, Epoch) 204 | 205 | epoch_step = num_train // batch_size 206 | epoch_step_val = num_val // batch_size 207 | 208 | if epoch_step == 0 or epoch_step_val == 0: 209 | raise ValueError('数据集过小,无法进行训练,请扩充数据集。') 210 | 211 | train_dataset = FacenetDataset(input_shape, lines[:num_train], batch_size, num_classes, random = True) 212 | val_dataset = FacenetDataset(input_shape, lines[num_train:], batch_size, num_classes, random = False) 213 | 214 | #-------------------------------------------------------------------------------# 215 | # 训练参数的设置 216 | # logging 用于设置tensorboard的保存地址 217 | # checkpoint 用于设置权值保存的细节,period用于修改多少epoch保存一次 218 | # lr_scheduler 用于设置学习率下降的方式 219 | # early_stopping 用于设定早停,val_loss多次不下降自动结束训练,表示模型基本收敛 220 | #-------------------------------------------------------------------------------# 221 | time_str = datetime.datetime.strftime(datetime.datetime.now(),'%Y_%m_%d_%H_%M_%S') 222 | log_dir = os.path.join(save_dir, "loss_" + str(time_str)) 223 | logging = TensorBoard(log_dir) 224 | loss_history = LossHistory(log_dir) 225 | if ngpus_per_node > 1: 226 | checkpoint = ParallelModelCheckpoint(model_body, os.path.join(save_dir, "ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5"), 227 | monitor = 'val_loss', save_weights_only = True, save_best_only = False, period = save_period) 228 | else: 229 | checkpoint = ModelCheckpoint(os.path.join(save_dir, "ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5"), 230 | monitor = 'val_loss', save_weights_only = True, save_best_only = False, period = save_period) 231 | lr_scheduler = LearningRateScheduler(lr_scheduler_func, verbose = 1) 232 | #---------------------------------# 233 | # LFW估计 234 | #---------------------------------# 235 | if lfw_eval_flag: 236 | lfw_callback = LFW_callback(LFWDataset(dir=lfw_dir_path, pairs_path=lfw_pairs_path, batch_size=32, input_shape=input_shape)) 237 | callbacks = [logging, loss_history, checkpoint, lr_scheduler, lfw_callback] 238 | else: 239 | callbacks = [logging, loss_history, checkpoint, lr_scheduler] 240 | 241 | print('Train on {} samples, val on {} samples, with batch size {}.'.format(num_train, num_val, batch_size)) 242 | model.fit_generator( 243 | generator = train_dataset, 244 | steps_per_epoch = epoch_step, 245 | validation_data = val_dataset, 246 | validation_steps = epoch_step_val, 247 | epochs = Epoch, 248 | initial_epoch = Init_Epoch, 249 | use_multiprocessing = True if num_workers > 1 else False, 250 | workers = num_workers, 251 | callbacks = callbacks 252 | ) 253 | -------------------------------------------------------------------------------- /常见问题汇总.md: -------------------------------------------------------------------------------- 1 | 问题汇总的博客地址为[https://blog.csdn.net/weixin_44791964/article/details/107517428](https://blog.csdn.net/weixin_44791964/article/details/107517428)。 2 | 3 | # 问题汇总 4 | ## 1、下载问题 5 | ### a、代码下载 6 | **问:up主,可以给我发一份代码吗,代码在哪里下载啊? 7 | 答:Github上的地址就在视频简介里。复制一下就能进去下载了。** 8 | 9 | **问:up主,为什么我下载的代码提示压缩包损坏? 10 | 答:重新去Github下载。** 11 | 12 | **问:up主,为什么我下载的代码和你在视频以及博客上的代码不一样? 13 | 答:我常常会对代码进行更新,最终以实际的代码为准。** 14 | 15 | ### b、 权值下载 16 | **问:up主,为什么我下载的代码里面,model_data下面没有.pth或者.h5文件? 17 | 答:我一般会把权值上传到Github和百度网盘,在GITHUB的README里面就能找到。** 18 | 19 | ### c、 数据集下载 20 | **问:up主,XXXX数据集在哪里下载啊? 21 | 答:一般数据集的下载地址我会放在README里面,基本上都有,没有的话请及时联系我添加,直接发github的issue即可**。 22 | 23 | ## 2、环境配置问题 24 | ### a、现在库中所用的环境 25 | **pytorch代码对应的pytorch版本为1.2,博客地址对应**[https://blog.csdn.net/weixin_44791964/article/details/106037141](https://blog.csdn.net/weixin_44791964/article/details/106037141)。 26 | 27 | **keras代码对应的tensorflow版本为1.13.2,keras版本是2.1.5,博客地址对应**[https://blog.csdn.net/weixin_44791964/article/details/104702142](https://blog.csdn.net/weixin_44791964/article/details/104702142)。 28 | 29 | **tf2代码对应的tensorflow版本为2.2.0,无需安装keras,博客地址对应**[https://blog.csdn.net/weixin_44791964/article/details/109161493](https://blog.csdn.net/weixin_44791964/article/details/109161493)。 30 | 31 | **问:你的代码某某某版本的tensorflow和pytorch能用嘛? 32 | 答:最好按照我推荐的配置,配置教程也有!其它版本的我没有试过!可能出现问题但是一般问题不大。仅需要改少量代码即可。** 33 | 34 | ### b、30系列显卡环境配置 35 | 30系显卡由于框架更新不可使用上述环境配置教程。 36 | 当前我已经测试的可以用的30显卡配置如下: 37 | **pytorch代码对应的pytorch版本为1.7.0,cuda为11.0,cudnn为8.0.5**。 38 | 39 | **keras代码无法在win10下配置cuda11,在ubuntu下可以百度查询一下,配置tensorflow版本为1.15.4,keras版本是2.1.5或者2.3.1(少量函数接口不同,代码可能还需要少量调整。)** 40 | 41 | **tf2代码对应的tensorflow版本为2.4.0,cuda为11.0,cudnn为8.0.5**。 42 | 43 | ### c、GPU利用问题与环境使用问题 44 | **问:为什么我安装了tensorflow-gpu但是却没用利用GPU进行训练呢? 45 | 答:确认tensorflow-gpu已经装好,利用pip list查看tensorflow版本,然后查看任务管理器或者利用nvidia命令看看是否使用了gpu进行训练,任务管理器的话要看显存使用情况。** 46 | 47 | **问:up主,我好像没有在用gpu进行训练啊,怎么看是不是用了GPU进行训练? 48 | 答:查看是否使用GPU进行训练一般使用NVIDIA在命令行的查看命令,如果要看任务管理器的话,请看性能部分GPU的显存是否利用,或者查看任务管理器的Cuda,而非Copy。** 49 | ![在这里插入图片描述](https://img-blog.csdnimg.cn/20201013234241524.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl80NDc5MTk2NA==,size_16,color_FFFFFF,t_70#pic_center) 50 | 51 | **问:up主,为什么我按照你的环境配置后还是不能使用? 52 | 答:请把你的GPU、CUDA、CUDNN、TF版本以及PYTORCH版本B站私聊告诉我。** 53 | 54 | **问:出现如下错误** 55 | ```python 56 | Traceback (most recent call last): 57 | File "C:\Users\focus\Anaconda3\ana\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 58, in 58 | from tensorflow.python.pywrap_tensorflow_internal import * 59 | File "C:\Users\focus\Anaconda3\ana\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in 60 | pywrap_tensorflow_internal = swig_import_helper() 61 | File "C:\Users\focus\Anaconda3\ana\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 24, in swig_import_helper 62 | _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description) 63 | File "C:\Users\focus\Anaconda3\ana\envs\tensorflow-gpu\lib\imp.py", line 243, in load_modulereturn load_dynamic(name, filename, file) 64 | File "C:\Users\focus\Anaconda3\ana\envs\tensorflow-gpu\lib\imp.py", line 343, in load_dynamic 65 | return _load(spec) 66 | ImportError: DLL load failed: 找不到指定的模块。 67 | ``` 68 | **答:如果没重启过就重启一下,否则重新按照步骤安装,还无法解决则把你的GPU、CUDA、CUDNN、TF版本以及PYTORCH版本私聊告诉我。** 69 | 70 | ### d、no module问题 71 | **问:为什么提示说no module name utils.utils(no module name nets.yolo、no module name nets.ssd等一系列问题)啊? 72 | 答:utils并不需要用pip装,它就在我上传的仓库的根目录,出现这个问题的原因是根目录不对,查查相对目录和根目录的概念。查了基本上就明白了。** 73 | 74 | **问:为什么提示说no module name matplotlib(no module name PIL,no module name cv2等等)? 75 | 答:这个库没安装打开命令行安装就好。pip install matplotlib** 76 | 77 | **问:为什么我已经用pip装了opencv(pillow、matplotlib等),还是提示no module name cv2? 78 | 答:没有激活环境装,要激活对应的conda环境进行安装才可以正常使用** 79 | 80 | **问:为什么提示说No module named 'torch' ? 81 | 答:其实我也真的很想知道为什么会有这个问题……这个pytorch没装是什么情况?一般就俩情况,一个是真的没装,还有一个是装到其它环境了,当前激活的环境不是自己装的环境。** 82 | 83 | **问:为什么提示说No module named 'tensorflow' ? 84 | 答:同上。** 85 | 86 | ### e、cuda安装失败问题 87 | 一般cuda安装前需要安装Visual Studio,装个2017版本即可。 88 | 89 | ### f、Ubuntu系统问题 90 | **所有代码在Ubuntu下可以使用,我两个系统都试过。** 91 | 92 | ### g、VSCODE提示错误的问题 93 | **问:为什么在VSCODE里面提示一大堆的错误啊? 94 | 答:我也提示一大堆的错误,但是不影响,是VSCODE的问题,如果不想看错误的话就装Pycharm。** 95 | 96 | ### h、使用cpu进行训练与预测的问题 97 | **对于keras和tf2的代码而言,如果想用cpu进行训练和预测,直接装cpu版本的tensorflow就可以了。** 98 | 99 | **对于pytorch的代码而言,如果想用cpu进行训练和预测,需要将cuda=True修改成cuda=False。** 100 | 101 | ### i、tqdm没有pos参数问题 102 | **问:运行代码提示'tqdm' object has no attribute 'pos'。 103 | 答:重装tqdm,换个版本就可以了。** 104 | 105 | ### j、提示decode(“utf-8”)的问题 106 | **由于h5py库的更新,安装过程中会自动安装h5py=3.0.0以上的版本,会导致decode("utf-8")的错误! 107 | 各位一定要在安装完tensorflow后利用命令装h5py=2.10.0!** 108 | ``` 109 | pip install h5py==2.10.0 110 | ``` 111 | 112 | ### k、提示TypeError: __array__() takes 1 positional argument but 2 were given错误 113 | 可以修改pillow版本解决。 114 | ``` 115 | pip install pillow==8.2.0 116 | ``` 117 | 118 | ### l、其它问题 119 | **问:为什么提示TypeError: cat() got an unexpected keyword argument 'axis',Traceback (most recent call last),AttributeError: 'Tensor' object has no attribute 'bool'? 120 | 答:这是版本问题,建议使用torch1.2以上版本** 121 | **其它有很多稀奇古怪的问题,很多是版本问题,建议按照我的视频教程安装Keras和tensorflow。比如装的是tensorflow2,就不用问我说为什么我没法运行Keras-yolo啥的。那是必然不行的。** 122 | 123 | ## 3、目标检测库问题汇总(人脸检测和分类库也可参考) 124 | ### a、shape不匹配问题 125 | #### 1)、训练时shape不匹配问题 126 | **问:up主,为什么运行train.py会提示shape不匹配啊? 127 | 答:在keras环境中,因为你训练的种类和原始的种类不同,网络结构会变化,所以最尾部的shape会有少量不匹配。** 128 | 129 | #### 2)、预测时shape不匹配问题 130 | **问:为什么我运行predict.py会提示我说shape不匹配呀。 131 | 在Pytorch里面是这样的:** 132 | ![在这里插入图片描述](https://img-blog.csdnimg.cn/20200722171631901.png) 133 | 在Keras里面是这样的: 134 | ![在这里插入图片描述](https://img-blog.csdnimg.cn/20200722171523380.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl80NDc5MTk2NA==,size_16,color_FFFFFF,t_70) 135 | **答:原因主要有仨: 136 | 1、在ssd、FasterRCNN里面,可能是train.py里面的num_classes没改。 137 | 2、model_path没改。 138 | 3、classes_path没改。 139 | 请检查清楚了!确定自己所用的model_path和classes_path是对应的!训练的时候用到的num_classes或者classes_path也需要检查!** 140 | 141 | ### b、显存不足问题 142 | **问:为什么我运行train.py下面的命令行闪的贼快,还提示OOM啥的? 143 | 答:这是在keras中出现的,爆显存了,可以改小batch_size,SSD的显存占用率是最小的,建议用SSD; 144 | 2G显存:SSD、YOLOV4-TINY 145 | 4G显存:YOLOV3 146 | 6G显存:YOLOV4、Retinanet、M2det、Efficientdet、Faster RCNN等 147 | 8G+显存:随便选吧。** 148 | **需要注意的是,受到BatchNorm2d影响,batch_size不可为1,至少为2。** 149 | 150 | **问:为什么提示 RuntimeError: CUDA out of memory. Tried to allocate 52.00 MiB (GPU 0; 15.90 GiB total capacity; 14.85 GiB already allocated; 51.88 MiB free; 15.07 GiB reserved in total by PyTorch)? 151 | 答:这是pytorch中出现的,爆显存了,同上。** 152 | 153 | **问:为什么我显存都没利用,就直接爆显存了? 154 | 答:都爆显存了,自然就不利用了,模型没有开始训练。** 155 | ### c、训练问题(冻结训练,LOSS问题、训练效果问题等) 156 | **问:为什么要冻结训练和解冻训练呀? 157 | 答:这是迁移学习的思想,因为神经网络主干特征提取部分所提取到的特征是通用的,我们冻结起来训练可以加快训练效率,也可以防止权值被破坏。** 158 | 在冻结阶段,模型的主干被冻结了,特征提取网络不发生改变。占用的显存较小,仅对网络进行微调。 159 | 在解冻阶段,模型的主干不被冻结了,特征提取网络会发生改变。占用的显存较大,网络所有的参数都会发生改变。 160 | 161 | **问:为什么我的网络不收敛啊,LOSS是XXXX。 162 | 答:不同网络的LOSS不同,LOSS只是一个参考指标,用于查看网络是否收敛,而非评价网络好坏,我的yolo代码都没有归一化,所以LOSS值看起来比较高,LOSS的值不重要,重要的是是否在变小,预测是否有效果。** 163 | 164 | **问:为什么我的训练效果不好?预测了没有框(框不准)。 165 | 答:** 166 | 167 | 考虑几个问题: 168 | 1、目标信息问题,查看2007_train.txt文件是否有目标信息,没有的话请修改voc_annotation.py。 169 | 2、数据集问题,小于500的自行考虑增加数据集,同时测试不同的模型,确认数据集是好的。 170 | 3、是否解冻训练,如果数据集分布与常规画面差距过大需要进一步解冻训练,调整主干,加强特征提取能力。 171 | 4、网络问题,比如SSD不适合小目标,因为先验框固定了。 172 | 5、训练时长问题,有些同学只训练了几代表示没有效果,按默认参数训练完。 173 | 6、确认自己是否按照步骤去做了,如果比如voc_annotation.py里面的classes是否修改了等。 174 | 7、不同网络的LOSS不同,LOSS只是一个参考指标,用于查看网络是否收敛,而非评价网络好坏,LOSS的值不重要,重要的是是否收敛。 175 | 176 | **问:我怎么出现了gbk什么的编码错误啊:** 177 | ```python 178 | UnicodeDecodeError: 'gbk' codec can't decode byte 0xa6 in position 446: illegal multibyte sequence 179 | ``` 180 | **答:标签和路径不要使用中文,如果一定要使用中文,请注意处理的时候编码的问题,改成打开文件的encoding方式改为utf-8。** 181 | 182 | **问:我的图片是xxx*xxx的分辨率的,可以用吗!** 183 | **答:可以用,代码里面会自动进行resize或者数据增强。** 184 | 185 | **问:怎么进行多GPU训练? 186 | 答:pytorch的大多数代码可以直接使用gpu训练,keras的话直接百度就好了,实现并不复杂,我没有多卡没法详细测试,还需要各位同学自己努力了。** 187 | ### d、灰度图问题 188 | **问:能不能训练灰度图(预测灰度图)啊? 189 | 答:我的大多数库会将灰度图转化成RGB进行训练和预测,如果遇到代码不能训练或者预测灰度图的情况,可以尝试一下在get_random_data里面将Image.open后的结果转换成RGB,预测的时候也这样试试。(仅供参考)** 190 | 191 | ### e、断点续练问题 192 | **问:我已经训练过几个世代了,能不能从这个基础上继续开始训练 193 | 答:可以,你在训练前,和载入预训练权重一样载入训练过的权重就行了。一般训练好的权重会保存在logs文件夹里面,将model_path修改成你要开始的权值的路径即可。** 194 | 195 | ### f、预训练权重的问题 196 | **问:如果我要训练其它的数据集,预训练权重要怎么办啊?** 197 | **答:数据的预训练权重对不同数据集是通用的,因为特征是通用的,预训练权重对于99%的情况都必须要用,不用的话权值太过随机,特征提取效果不明显,网络训练的结果也不会好。** 198 | 199 | **问:up,我修改了网络,预训练权重还能用吗? 200 | 答:修改了主干的话,如果不是用的现有的网络,基本上预训练权重是不能用的,要么就自己判断权值里卷积核的shape然后自己匹配,要么只能自己预训练去了;修改了后半部分的话,前半部分的主干部分的预训练权重还是可以用的,如果是pytorch代码的话,需要自己修改一下载入权值的方式,判断shape后载入,如果是keras代码,直接by_name=True,skip_mismatch=True即可。** 201 | 权值匹配的方式可以参考如下: 202 | ```python 203 | # 加快模型训练的效率 204 | print('Loading weights into state dict...') 205 | device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') 206 | model_dict = model.state_dict() 207 | pretrained_dict = torch.load(model_path, map_location=device) 208 | a = {} 209 | for k, v in pretrained_dict.items(): 210 | try: 211 | if np.shape(model_dict[k]) == np.shape(v): 212 | a[k]=v 213 | except: 214 | pass 215 | model_dict.update(a) 216 | model.load_state_dict(model_dict) 217 | print('Finished!') 218 | ``` 219 | 220 | **问:我要怎么不使用预训练权重啊? 221 | 答:把载入预训练权重的代码注释了就行。** 222 | 223 | **问:为什么我不使用预训练权重效果这么差啊? 224 | 答:因为随机初始化的权值不好,提取的特征不好,也就导致了模型训练的效果不好,voc07+12、coco+voc07+12效果都不一样,预训练权重还是非常重要的。** 225 | 226 | ### g、视频检测问题与摄像头检测问题 227 | **问:怎么用摄像头检测呀? 228 | 答:predict.py修改参数可以进行摄像头检测,也有视频详细解释了摄像头检测的思路。** 229 | 230 | **问:怎么用视频检测呀? 231 | 答:同上** 232 | ### h、从0开始训练问题 233 | **问:怎么在模型上从0开始训练? 234 | 答:在算力不足与调参能力不足的情况下从0开始训练毫无意义。模型特征提取能力在随机初始化参数的情况下非常差。没有好的参数调节能力和算力,无法使得网络正常收敛。** 235 | 如果一定要从0开始,那么训练的时候请注意几点: 236 | - 不载入预训练权重。 237 | - 不要进行冻结训练,注释冻结模型的代码。 238 | 239 | **问:为什么我不使用预训练权重效果这么差啊? 240 | 答:因为随机初始化的权值不好,提取的特征不好,也就导致了模型训练的效果不好,voc07+12、coco+voc07+12效果都不一样,预训练权重还是非常重要的。** 241 | 242 | ### i、保存问题 243 | **问:检测完的图片怎么保存? 244 | 答:一般目标检测用的是Image,所以查询一下PIL库的Image如何进行保存。详细看看predict.py文件的注释。** 245 | 246 | **问:怎么用视频保存呀? 247 | 答:详细看看predict.py文件的注释。** 248 | 249 | ### j、遍历问题 250 | **问:如何对一个文件夹的图片进行遍历? 251 | 答:一般使用os.listdir先找出文件夹里面的所有图片,然后根据predict.py文件里面的执行思路检测图片就行了,详细看看predict.py文件的注释。** 252 | 253 | **问:如何对一个文件夹的图片进行遍历?并且保存。 254 | 答:遍历的话一般使用os.listdir先找出文件夹里面的所有图片,然后根据predict.py文件里面的执行思路检测图片就行了。保存的话一般目标检测用的是Image,所以查询一下PIL库的Image如何进行保存。如果有些库用的是cv2,那就是查一下cv2怎么保存图片。详细看看predict.py文件的注释。** 255 | 256 | ### k、路径问题(No such file or directory) 257 | **问:我怎么出现了这样的错误呀:** 258 | ```python 259 | FileNotFoundError: 【Errno 2】 No such file or directory 260 | …………………………………… 261 | …………………………………… 262 | ``` 263 | **答:去检查一下文件夹路径,查看是否有对应文件;并且检查一下2007_train.txt,其中文件路径是否有错。** 264 | 关于路径有几个重要的点: 265 | **文件夹名称中一定不要有空格。 266 | 注意相对路径和绝对路径。 267 | 多百度路径相关的知识。** 268 | 269 | **所有的路径问题基本上都是根目录问题,好好查一下相对目录的概念!** 270 | ### l、和原版比较问题 271 | **问:你这个代码和原版比怎么样,可以达到原版的效果么? 272 | 答:基本上可以达到,我都用voc数据测过,我没有好显卡,没有能力在coco上测试与训练。** 273 | 274 | **问:你有没有实现yolov4所有的tricks,和原版差距多少? 275 | 答:并没有实现全部的改进部分,由于YOLOV4使用的改进实在太多了,很难完全实现与列出来,这里只列出来了一些我比较感兴趣,而且非常有效的改进。论文中提到的SAM(注意力机制模块),作者自己的源码也没有使用。还有其它很多的tricks,不是所有的tricks都有提升,我也没法实现全部的tricks。至于和原版的比较,我没有能力训练coco数据集,根据使用过的同学反应差距不大。** 276 | 277 | ### m、FPS问题(检测速度问题) 278 | **问:你这个FPS可以到达多少,可以到 XX FPS么? 279 | 答:FPS和机子的配置有关,配置高就快,配置低就慢。** 280 | 281 | **问:为什么我用服务器去测试yolov4(or others)的FPS只有十几? 282 | 答:检查是否正确安装了tensorflow-gpu或者pytorch的gpu版本,如果已经正确安装,可以去利用time.time()的方法查看detect_image里面,哪一段代码耗时更长(不仅只有网络耗时长,其它处理部分也会耗时,如绘图等)。** 283 | 284 | **问:为什么论文中说速度可以达到XX,但是这里却没有? 285 | 答:检查是否正确安装了tensorflow-gpu或者pytorch的gpu版本,如果已经正确安装,可以去利用time.time()的方法查看detect_image里面,哪一段代码耗时更长(不仅只有网络耗时长,其它处理部分也会耗时,如绘图等)。有些论文还会使用多batch进行预测,我并没有去实现这个部分。** 286 | 287 | ### n、预测图片不显示问题 288 | **问:为什么你的代码在预测完成后不显示图片?只是在命令行告诉我有什么目标。 289 | 答:给系统安装一个图片查看器就行了。** 290 | 291 | ### o、算法评价问题(目标检测的map、PR曲线、Recall、Precision等) 292 | **问:怎么计算map? 293 | 答:看map视频,都一个流程。** 294 | 295 | **问:计算map的时候,get_map.py里面有一个MINOVERLAP是什么用的,是iou吗? 296 | 答:是iou,它的作用是判断预测框和真实框的重合成度,如果重合程度大于MINOVERLAP,则预测正确。** 297 | 298 | **问:为什么get_map.py里面的self.confidence(self.score)要设置的那么小? 299 | 答:看一下map的视频的原理部分,要知道所有的结果然后再进行pr曲线的绘制。** 300 | 301 | **问:能不能说说怎么绘制PR曲线啥的呀。 302 | 答:可以看mAP视频,结果里面有PR曲线。** 303 | 304 | **问:怎么计算Recall、Precision指标。 305 | 答:这俩指标应该是相对于特定的置信度的,计算map的时候也会获得。** 306 | 307 | ### p、coco数据集训练问题 308 | **问:目标检测怎么训练COCO数据集啊?。 309 | 答:coco数据训练所需要的txt文件可以参考qqwweee的yolo3的库,格式都是一样的。** 310 | 311 | ### q、模型优化(模型修改)问题 312 | **问:up,YOLO系列使用Focal LOSS的代码你有吗,有提升吗? 313 | 答:很多人试过,提升效果也不大(甚至变的更Low),它自己有自己的正负样本的平衡方式。** 314 | 315 | **问:up,我修改了网络,预训练权重还能用吗? 316 | 答:修改了主干的话,如果不是用的现有的网络,基本上预训练权重是不能用的,要么就自己判断权值里卷积核的shape然后自己匹配,要么只能自己预训练去了;修改了后半部分的话,前半部分的主干部分的预训练权重还是可以用的,如果是pytorch代码的话,需要自己修改一下载入权值的方式,判断shape后载入,如果是keras代码,直接by_name=True,skip_mismatch=True即可。** 317 | 权值匹配的方式可以参考如下: 318 | ```python 319 | # 加快模型训练的效率 320 | print('Loading weights into state dict...') 321 | device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') 322 | model_dict = model.state_dict() 323 | pretrained_dict = torch.load(model_path, map_location=device) 324 | a = {} 325 | for k, v in pretrained_dict.items(): 326 | try: 327 | if np.shape(model_dict[k]) == np.shape(v): 328 | a[k]=v 329 | except: 330 | pass 331 | model_dict.update(a) 332 | model.load_state_dict(model_dict) 333 | print('Finished!') 334 | ``` 335 | 336 | **问:up,怎么修改模型啊,我想发个小论文! 337 | 答:建议看看yolov3和yolov4的区别,然后看看yolov4的论文,作为一个大型调参现场非常有参考意义,使用了很多tricks。我能给的建议就是多看一些经典模型,然后拆解里面的亮点结构并使用。** 338 | 339 | ### r、部署问题 340 | 我没有具体部署到手机等设备上过,所以很多部署问题我并不了解…… 341 | 342 | ## 4、语义分割库问题汇总 343 | ### a、shape不匹配问题 344 | #### 1)、训练时shape不匹配问题 345 | **问:up主,为什么运行train.py会提示shape不匹配啊? 346 | 答:在keras环境中,因为你训练的种类和原始的种类不同,网络结构会变化,所以最尾部的shape会有少量不匹配。** 347 | 348 | #### 2)、预测时shape不匹配问题 349 | **问:为什么我运行predict.py会提示我说shape不匹配呀。 350 | 在Pytorch里面是这样的:** 351 | ![在这里插入图片描述](https://img-blog.csdnimg.cn/20200722171631901.png) 352 | 在Keras里面是这样的: 353 | ![在这里插入图片描述](https://img-blog.csdnimg.cn/20200722171523380.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl80NDc5MTk2NA==,size_16,color_FFFFFF,t_70) 354 | **答:原因主要有二: 355 | 1、train.py里面的num_classes没改。 356 | 2、预测时num_classes没改。 357 | 请检查清楚!训练和预测的时候用到的num_classes都需要检查!** 358 | 359 | ### b、显存不足问题 360 | **问:为什么我运行train.py下面的命令行闪的贼快,还提示OOM啥的? 361 | 答:这是在keras中出现的,爆显存了,可以改小batch_size。** 362 | 363 | **需要注意的是,受到BatchNorm2d影响,batch_size不可为1,至少为2。** 364 | 365 | **问:为什么提示 RuntimeError: CUDA out of memory. Tried to allocate 52.00 MiB (GPU 0; 15.90 GiB total capacity; 14.85 GiB already allocated; 51.88 MiB free; 15.07 GiB reserved in total by PyTorch)? 366 | 答:这是pytorch中出现的,爆显存了,同上。** 367 | 368 | **问:为什么我显存都没利用,就直接爆显存了? 369 | 答:都爆显存了,自然就不利用了,模型没有开始训练。** 370 | 371 | ### c、训练问题(冻结训练,LOSS问题、训练效果问题等) 372 | **问:为什么要冻结训练和解冻训练呀? 373 | 答:这是迁移学习的思想,因为神经网络主干特征提取部分所提取到的特征是通用的,我们冻结起来训练可以加快训练效率,也可以防止权值被破坏。** 374 | **在冻结阶段,模型的主干被冻结了,特征提取网络不发生改变。占用的显存较小,仅对网络进行微调。** 375 | **在解冻阶段,模型的主干不被冻结了,特征提取网络会发生改变。占用的显存较大,网络所有的参数都会发生改变。** 376 | 377 | **问:为什么我的网络不收敛啊,LOSS是XXXX。 378 | 答:不同网络的LOSS不同,LOSS只是一个参考指标,用于查看网络是否收敛,而非评价网络好坏,我的yolo代码都没有归一化,所以LOSS值看起来比较高,LOSS的值不重要,重要的是是否在变小,预测是否有效果。** 379 | 380 | **问:为什么我的训练效果不好?预测了没有目标,结果是一片黑。 381 | 答:** 382 | **考虑几个问题: 383 | 1、数据集问题,这是最重要的问题。小于500的自行考虑增加数据集;一定要检查数据集的标签,视频中详细解析了VOC数据集的格式,但并不是有输入图片有输出标签即可,还需要确认标签的每一个像素值是否为它对应的种类。很多同学的标签格式不对,最常见的错误格式就是标签的背景为黑,目标为白,此时目标的像素点值为255,无法正常训练,目标需要为1才行。 384 | 2、是否解冻训练,如果数据集分布与常规画面差距过大需要进一步解冻训练,调整主干,加强特征提取能力。 385 | 3、网络问题,可以尝试不同的网络。 386 | 4、训练时长问题,有些同学只训练了几代表示没有效果,按默认参数训练完。 387 | 5、确认自己是否按照步骤去做了。 388 | 6、不同网络的LOSS不同,LOSS只是一个参考指标,用于查看网络是否收敛,而非评价网络好坏,LOSS的值不重要,重要的是是否收敛。** 389 | 390 | 391 | 392 | **问:为什么我的训练效果不好?对小目标预测不准确。 393 | 答:对于deeplab和pspnet而言,可以修改一下downsample_factor,当downsample_factor为16的时候下采样倍数过多,效果不太好,可以修改为8。** 394 | 395 | **问:我怎么出现了gbk什么的编码错误啊:** 396 | ```python 397 | UnicodeDecodeError: 'gbk' codec can't decode byte 0xa6 in position 446: illegal multibyte sequence 398 | ``` 399 | **答:标签和路径不要使用中文,如果一定要使用中文,请注意处理的时候编码的问题,改成打开文件的encoding方式改为utf-8。** 400 | 401 | **问:我的图片是xxx*xxx的分辨率的,可以用吗!** 402 | **答:可以用,代码里面会自动进行resize或者数据增强。** 403 | 404 | **问:怎么进行多GPU训练? 405 | 答:pytorch的大多数代码可以直接使用gpu训练,keras的话直接百度就好了,实现并不复杂,我没有多卡没法详细测试,还需要各位同学自己努力了。** 406 | 407 | ### d、灰度图问题 408 | **问:能不能训练灰度图(预测灰度图)啊? 409 | 答:我的大多数库会将灰度图转化成RGB进行训练和预测,如果遇到代码不能训练或者预测灰度图的情况,可以尝试一下在get_random_data里面将Image.open后的结果转换成RGB,预测的时候也这样试试。(仅供参考)** 410 | 411 | ### e、断点续练问题 412 | **问:我已经训练过几个世代了,能不能从这个基础上继续开始训练 413 | 答:可以,你在训练前,和载入预训练权重一样载入训练过的权重就行了。一般训练好的权重会保存在logs文件夹里面,将model_path修改成你要开始的权值的路径即可。** 414 | 415 | ### f、预训练权重的问题 416 | 417 | **问:如果我要训练其它的数据集,预训练权重要怎么办啊?** 418 | **答:数据的预训练权重对不同数据集是通用的,因为特征是通用的,预训练权重对于99%的情况都必须要用,不用的话权值太过随机,特征提取效果不明显,网络训练的结果也不会好。** 419 | 420 | **问:up,我修改了网络,预训练权重还能用吗? 421 | 答:修改了主干的话,如果不是用的现有的网络,基本上预训练权重是不能用的,要么就自己判断权值里卷积核的shape然后自己匹配,要么只能自己预训练去了;修改了后半部分的话,前半部分的主干部分的预训练权重还是可以用的,如果是pytorch代码的话,需要自己修改一下载入权值的方式,判断shape后载入,如果是keras代码,直接by_name=True,skip_mismatch=True即可。** 422 | 权值匹配的方式可以参考如下: 423 | 424 | ```python 425 | # 加快模型训练的效率 426 | print('Loading weights into state dict...') 427 | device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') 428 | model_dict = model.state_dict() 429 | pretrained_dict = torch.load(model_path, map_location=device) 430 | a = {} 431 | for k, v in pretrained_dict.items(): 432 | try: 433 | if np.shape(model_dict[k]) == np.shape(v): 434 | a[k]=v 435 | except: 436 | pass 437 | model_dict.update(a) 438 | model.load_state_dict(model_dict) 439 | print('Finished!') 440 | ``` 441 | 442 | **问:我要怎么不使用预训练权重啊? 443 | 答:把载入预训练权重的代码注释了就行。** 444 | 445 | **问:为什么我不使用预训练权重效果这么差啊? 446 | 答:因为随机初始化的权值不好,提取的特征不好,也就导致了模型训练的效果不好,预训练权重还是非常重要的。** 447 | 448 | ### g、视频检测问题与摄像头检测问题 449 | **问:怎么用摄像头检测呀? 450 | 答:predict.py修改参数可以进行摄像头检测,也有视频详细解释了摄像头检测的思路。** 451 | 452 | **问:怎么用视频检测呀? 453 | 答:同上** 454 | 455 | ### h、从0开始训练问题 456 | **问:怎么在模型上从0开始训练? 457 | 答:在算力不足与调参能力不足的情况下从0开始训练毫无意义。模型特征提取能力在随机初始化参数的情况下非常差。没有好的参数调节能力和算力,无法使得网络正常收敛。** 458 | 如果一定要从0开始,那么训练的时候请注意几点: 459 | - 不载入预训练权重。 460 | - 不要进行冻结训练,注释冻结模型的代码。 461 | 462 | **问:为什么我不使用预训练权重效果这么差啊? 463 | 答:因为随机初始化的权值不好,提取的特征不好,也就导致了模型训练的效果不好,预训练权重还是非常重要的。** 464 | 465 | ### i、保存问题 466 | **问:检测完的图片怎么保存? 467 | 答:一般目标检测用的是Image,所以查询一下PIL库的Image如何进行保存。详细看看predict.py文件的注释。** 468 | 469 | **问:怎么用视频保存呀? 470 | 答:详细看看predict.py文件的注释。** 471 | 472 | ### j、遍历问题 473 | **问:如何对一个文件夹的图片进行遍历? 474 | 答:一般使用os.listdir先找出文件夹里面的所有图片,然后根据predict.py文件里面的执行思路检测图片就行了,详细看看predict.py文件的注释。** 475 | 476 | **问:如何对一个文件夹的图片进行遍历?并且保存。 477 | 答:遍历的话一般使用os.listdir先找出文件夹里面的所有图片,然后根据predict.py文件里面的执行思路检测图片就行了。保存的话一般目标检测用的是Image,所以查询一下PIL库的Image如何进行保存。如果有些库用的是cv2,那就是查一下cv2怎么保存图片。详细看看predict.py文件的注释。** 478 | 479 | ### k、路径问题(No such file or directory) 480 | **问:我怎么出现了这样的错误呀:** 481 | ```python 482 | FileNotFoundError: 【Errno 2】 No such file or directory 483 | …………………………………… 484 | …………………………………… 485 | ``` 486 | 487 | **答:去检查一下文件夹路径,查看是否有对应文件;并且检查一下2007_train.txt,其中文件路径是否有错。** 488 | 关于路径有几个重要的点: 489 | **文件夹名称中一定不要有空格。 490 | 注意相对路径和绝对路径。 491 | 多百度路径相关的知识。** 492 | 493 | **所有的路径问题基本上都是根目录问题,好好查一下相对目录的概念!** 494 | 495 | ### l、FPS问题(检测速度问题) 496 | **问:你这个FPS可以到达多少,可以到 XX FPS么? 497 | 答:FPS和机子的配置有关,配置高就快,配置低就慢。** 498 | 499 | **问:为什么论文中说速度可以达到XX,但是这里却没有? 500 | 答:检查是否正确安装了tensorflow-gpu或者pytorch的gpu版本,如果已经正确安装,可以去利用time.time()的方法查看detect_image里面,哪一段代码耗时更长(不仅只有网络耗时长,其它处理部分也会耗时,如绘图等)。有些论文还会使用多batch进行预测,我并没有去实现这个部分。** 501 | 502 | ### m、预测图片不显示问题 503 | **问:为什么你的代码在预测完成后不显示图片?只是在命令行告诉我有什么目标。 504 | 答:给系统安装一个图片查看器就行了。** 505 | 506 | ### n、算法评价问题(miou) 507 | **问:怎么计算miou? 508 | 答:参考视频里的miou测量部分。** 509 | 510 | **问:怎么计算Recall、Precision指标。 511 | 答:现有的代码还无法获得,需要各位同学理解一下混淆矩阵的概念,然后自行计算一下。** 512 | 513 | ### o、模型优化(模型修改)问题 514 | **问:up,我修改了网络,预训练权重还能用吗? 515 | 答:修改了主干的话,如果不是用的现有的网络,基本上预训练权重是不能用的,要么就自己判断权值里卷积核的shape然后自己匹配,要么只能自己预训练去了;修改了后半部分的话,前半部分的主干部分的预训练权重还是可以用的,如果是pytorch代码的话,需要自己修改一下载入权值的方式,判断shape后载入,如果是keras代码,直接by_name=True,skip_mismatch=True即可。** 516 | 权值匹配的方式可以参考如下: 517 | 518 | ```python 519 | # 加快模型训练的效率 520 | print('Loading weights into state dict...') 521 | device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') 522 | model_dict = model.state_dict() 523 | pretrained_dict = torch.load(model_path, map_location=device) 524 | a = {} 525 | for k, v in pretrained_dict.items(): 526 | try: 527 | if np.shape(model_dict[k]) == np.shape(v): 528 | a[k]=v 529 | except: 530 | pass 531 | model_dict.update(a) 532 | model.load_state_dict(model_dict) 533 | print('Finished!') 534 | ``` 535 | 536 | 537 | 538 | **问:up,怎么修改模型啊,我想发个小论文! 539 | 答:建议看看目标检测中yolov4的论文,作为一个大型调参现场非常有参考意义,使用了很多tricks。我能给的建议就是多看一些经典模型,然后拆解里面的亮点结构并使用。常用的tricks如注意力机制什么的,可以试试。** 540 | 541 | ### p、部署问题 542 | 我没有具体部署到手机等设备上过,所以很多部署问题我并不了解…… 543 | 544 | ## 5、交流群问题 545 | **问:up,有没有QQ群啥的呢? 546 | 答:没有没有,我没有时间管理QQ群……** 547 | 548 | ## 6、怎么学习的问题 549 | **问:up,你的学习路线怎么样的?我是个小白我要怎么学? 550 | 答:这里有几点需要注意哈 551 | 1、我不是高手,很多东西我也不会,我的学习路线也不一定适用所有人。 552 | 2、我实验室不做深度学习,所以我很多东西都是自学,自己摸索,正确与否我也不知道。 553 | 3、我个人觉得学习更靠自学** 554 | 学习路线的话,我是先学习了莫烦的python教程,从tensorflow、keras、pytorch入门,入门完之后学的SSD,YOLO,然后了解了很多经典的卷积网,后面就开始学很多不同的代码了,我的学习方法就是一行一行的看,了解整个代码的执行流程,特征层的shape变化等,花了很多时间也没有什么捷径,就是要花时间吧。 --------------------------------------------------------------------------------